CA2271774A1 - Helicobacter polypeptides and corresponding polynucleotide molecules
- Publication number
- CA2271774A1
- Authority
- CA
- Canada
- Prior art keywords
- leu
- lys
- ghpo
- seq
- ser
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Concepts
- 108090000765 processed proteins & peptides Proteins 0.000 title claims abstract description 203
- 102000004196 processed proteins & peptides Human genes 0.000 title claims abstract description 191
- 229920001184 polypeptide Polymers 0.000 title claims abstract description 189
- 102000040430 polynucleotide Human genes 0.000 title claims abstract description 90
- 108091033319 polynucleotide Proteins 0.000 title claims abstract description 90
- 239000002157 polynucleotide Substances 0.000 title claims abstract description 90
- 241000589989 Helicobacter Species 0.000 title claims abstract description 57
- 238000000034 method Methods 0.000 claims abstract description 89
- 206010019375 Helicobacter infections Diseases 0.000 claims abstract description 17
- 108020004414 DNA Proteins 0.000 claims description 70
- 239000013598 vector Substances 0.000 claims description 54
- 239000000203 mixture Substances 0.000 claims description 50
- 210000004027 cell Anatomy 0.000 claims description 36
- 239000013612 plasmid Substances 0.000 claims description 36
- 239000002671 adjuvant Substances 0.000 claims description 30
- 241000124008 Mammalia Species 0.000 claims description 25
- 108010046334 Urease Proteins 0.000 claims description 20
- 230000001580 bacterial effect Effects 0.000 claims description 16
- 150000001875 compounds Chemical class 0.000 claims description 16
- 102000053602 DNA Human genes 0.000 claims description 13
- 239000003085 diluting agent Substances 0.000 claims description 12
- -1 metronidazole Chemical compound 0.000 claims description 10
- 210000004962 mammalian cell Anatomy 0.000 claims description 9
- ULGZDMOVFRHVEP-RWJQBGPGSA-N Erythromycin Chemical compound O([C@@H]1[C@@H](C)C(=O)O[C@@H]([C@@]([C@H](O)[C@@H](C)C(=O)[C@H](C)C[C@@](C)(O)[C@H](O[C@H]2[C@@H]([C@H](C[C@@H](C)O2)N(C)C)O)[C@H]1C)(C)O)CC)[C@H]1C[C@@](C)(OC)[C@@H](O)[C@H](C)O1 ULGZDMOVFRHVEP-RWJQBGPGSA-N 0.000 claims description 8
- 230000008569 process Effects 0.000 claims description 8
- 239000013603 viral vector Substances 0.000 claims description 7
- 239000004098 Tetracycline Substances 0.000 claims description 6
- 235000019364 tetracycline Nutrition 0.000 claims description 6
- 150000003522 tetracyclines Chemical class 0.000 claims description 6
- 229960002180 tetracycline Drugs 0.000 claims description 5
- 229930101283 tetracycline Natural products 0.000 claims description 5
- 241000607626 Vibrio cholerae Species 0.000 claims description 4
- 239000003242 anti bacterial agent Substances 0.000 claims description 4
- 150000001621 bismuth Chemical class 0.000 claims description 4
- ZQUAVILLCXTKTF-UHFFFAOYSA-H bismuth;tripotassium;2-hydroxypropane-1,2,3-tricarboxylate Chemical compound [K+].[K+].[K+].[Bi+3].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O.[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O ZQUAVILLCXTKTF-UHFFFAOYSA-H 0.000 claims description 4
- 229960003276 erythromycin Drugs 0.000 claims description 4
- 239000002731 stomach secretion inhibitor Substances 0.000 claims description 4
- BVPWJMCABCPUQY-UHFFFAOYSA-N 4-amino-5-chloro-2-methoxy-N-[1-(phenylmethyl)-4-piperidinyl]benzamide Chemical compound COC1=CC(N)=C(Cl)C=C1C(=O)NC1CCN(CC=2C=CC=CC=2)CC1 BVPWJMCABCPUQY-UHFFFAOYSA-N 0.000 claims description 3
- 241000607142 Salmonella Species 0.000 claims description 3
- 230000003115 biocidal effect Effects 0.000 claims description 3
- 238000004113 cell culture Methods 0.000 claims description 3
- 150000003180 prostaglandins Chemical class 0.000 claims description 3
- 229940126409 proton pump inhibitor Drugs 0.000 claims description 3
- 239000000612 proton pump inhibitor Substances 0.000 claims description 3
- 229960000620 ranitidine Drugs 0.000 claims description 3
- VMXUWOKSQNHOCA-LCYFTJDESA-N ranitidine Chemical compound [O-][N+](=O)/C=C(/NC)NCCSCC1=CC=C(CN(C)C)O1 VMXUWOKSQNHOCA-LCYFTJDESA-N 0.000 claims description 3
- 229940118696 vibrio cholerae Drugs 0.000 claims description 3
- SUBDBMMJDZJVOS-UHFFFAOYSA-N 5-methoxy-2-{[(4-methoxy-3,5-dimethylpyridin-2-yl)methyl]sulfinyl}-1H-benzimidazole Chemical compound N=1C2=CC(OC)=CC=C2NC=1S(=O)CC1=NC=C(C)C(OC)=C1C SUBDBMMJDZJVOS-UHFFFAOYSA-N 0.000 claims description 2
- 241000186660 Lactobacillus Species 0.000 claims description 2
- IQPSEEYGBUAQFF-UHFFFAOYSA-N Pantoprazole Chemical compound COC1=CC=NC(CS(=O)C=2NC3=CC=C(OC(F)F)C=C3N=2)=C1OC IQPSEEYGBUAQFF-UHFFFAOYSA-N 0.000 claims description 2
- SMTZFNFIKUPEJC-UHFFFAOYSA-N Roxane Chemical compound CC(=O)OCC(=O)NCCCOC1=CC=CC(CN2CCCCC2)=C1 SMTZFNFIKUPEJC-UHFFFAOYSA-N 0.000 claims description 2
- 241000607768 Shigella Species 0.000 claims description 2
- 241000194017 Streptococcus Species 0.000 claims description 2
- 229960003022 amoxicillin Drugs 0.000 claims description 2
- LSQZJLSUYDQPKJ-NJBDSQKTSA-N amoxicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=C(O)C=C1 LSQZJLSUYDQPKJ-NJBDSQKTSA-N 0.000 claims description 2
- 229960004645 bismuth subcitrate Drugs 0.000 claims description 2
- ZREIPSZUJIFJNP-UHFFFAOYSA-K bismuth subsalicylate Chemical compound C1=CC=C2O[Bi](O)OC(=O)C2=C1 ZREIPSZUJIFJNP-UHFFFAOYSA-K 0.000 claims description 2
- 229960000782 bismuth subsalicylate Drugs 0.000 claims description 2
- 229960001380 cimetidine Drugs 0.000 claims description 2
- CCGSUNCLSOWKJO-UHFFFAOYSA-N cimetidine Chemical compound N#CNC(=N/C)\NCCSCC1=NC=N[C]1C CCGSUNCLSOWKJO-UHFFFAOYSA-N 0.000 claims description 2
- 229960002626 clarithromycin Drugs 0.000 claims description 2
- AGOYDEPGAOXOCK-KCBOHYOISA-N clarithromycin Chemical compound O([C@@H]1[C@@H](C)C(=O)O[C@@H]([C@@]([C@H](O)[C@@H](C)C(=O)[C@H](C)C[C@](C)([C@H](O[C@H]2[C@@H]([C@H](C[C@@H](C)O2)N(C)C)O)[C@H]1C)OC)(C)O)CC)[C@H]1C[C@@](C)(OC)[C@@H](O)[C@H](C)O1 AGOYDEPGAOXOCK-KCBOHYOISA-N 0.000 claims description 2
- 238000012258 culturing Methods 0.000 claims description 2
- 229960003559 enprostil Drugs 0.000 claims description 2
- 229960001596 famotidine Drugs 0.000 claims description 2
- XUFQPHANEAPEMJ-UHFFFAOYSA-N famotidine Chemical compound NC(N)=NC1=NC(CSCCC(N)=NS(N)(=O)=O)=CS1 XUFQPHANEAPEMJ-UHFFFAOYSA-N 0.000 claims description 2
- 229940039696 lactobacillus Drugs 0.000 claims description 2
- 229960003174 lansoprazole Drugs 0.000 claims description 2
- MJIHNNLFOKEZEW-UHFFFAOYSA-N lansoprazole Chemical compound CC1=C(OCC(F)(F)F)C=CN=C1CS(=O)C1=NC2=CC=CC=C2N1 MJIHNNLFOKEZEW-UHFFFAOYSA-N 0.000 claims description 2
- OJLOPKGSLYJEMD-URPKTTJQSA-N methyl 7-[(1r,2r,3r)-3-hydroxy-2-[(1e)-4-hydroxy-4-methyloct-1-en-1-yl]-5-oxocyclopentyl]heptanoate Chemical compound CCCCC(C)(O)C\C=C\[C@H]1[C@H](O)CC(=O)[C@@H]1CCCCCCC(=O)OC OJLOPKGSLYJEMD-URPKTTJQSA-N 0.000 claims description 2
- PTOJVMZPWPAXER-VFJVYMGBSA-N methyl 7-[(1r,2r,3r)-3-hydroxy-2-[(e,3r)-3-hydroxy-4-phenoxybut-1-enyl]-5-oxocyclopentyl]hepta-4,5-dienoate Chemical compound O[C@@H]1CC(=O)[C@H](CC=C=CCCC(=O)OC)[C@H]1\C=C\[C@@H](O)COC1=CC=CC=C1 PTOJVMZPWPAXER-VFJVYMGBSA-N 0.000 claims description 2
- 229960005249 misoprostol Drugs 0.000 claims description 2
- 229960004872 nizatidine Drugs 0.000 claims description 2
- SGXXNSQHWDMGGP-IZZDOVSWSA-N nizatidine Chemical compound [O-][N+](=O)\C=C(/NC)NCCSCC1=CSC(CN(C)C)=N1 SGXXNSQHWDMGGP-IZZDOVSWSA-N 0.000 claims description 2
- 229960000381 omeprazole Drugs 0.000 claims description 2
- LSQZJLSUYDQPKJ-UHFFFAOYSA-N p-Hydroxyampicillin Natural products O=C1N2C(C(O)=O)C(C)(C)SC2C1NC(=O)C(N)C1=CC=C(O)C=C1 LSQZJLSUYDQPKJ-UHFFFAOYSA-N 0.000 claims description 2
- 229960005019 pantoprazole Drugs 0.000 claims description 2
- 229960003320 roxatidine Drugs 0.000 claims description 2
- 125000003275 alpha amino acid group Chemical group 0.000 claims 7
- 239000003485 histamine H2 receptor antagonist Substances 0.000 claims 2
- 238000002255 vaccination Methods 0.000 abstract description 4
- BYXHQQCXAJARLQ-ZLUOBGJFSA-N Ala-Ala-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O BYXHQQCXAJARLQ-ZLUOBGJFSA-N 0.000 description 137
- 108090000623 proteins and genes Proteins 0.000 description 137
- 150000001413 amino acids Chemical group 0.000 description 86
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 79
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 77
- 102000004169 proteins and genes Human genes 0.000 description 72
- 235000018102 proteins Nutrition 0.000 description 67
- 235000001014 amino acid Nutrition 0.000 description 62
- 239000013615 primer Substances 0.000 description 61
- 229940024606 amino acid Drugs 0.000 description 59
- 229960005486 vaccine Drugs 0.000 description 48
- 108091007433 antigens Proteins 0.000 description 47
- 102000036639 antigens Human genes 0.000 description 47
- 239000000427 antigen Substances 0.000 description 46
- 239000012634 fragment Substances 0.000 description 42
- 239000004202 carbamide Substances 0.000 description 37
- 108010050848 glycylleucine Proteins 0.000 description 36
- 239000000523 sample Substances 0.000 description 36
- 239000000872 buffer Substances 0.000 description 30
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 30
- 241000880493 Leptailurus serval Species 0.000 description 29
- 238000003752 polymerase chain reaction Methods 0.000 description 28
- 108010076504 Protein Sorting Signals Proteins 0.000 description 27
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 25
- 108091026890 Coding region Proteins 0.000 description 25
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 24
- 230000004927 fusion Effects 0.000 description 22
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 22
- 102000039446 nucleic acids Human genes 0.000 description 21
- 108020004707 nucleic acids Proteins 0.000 description 21
- 150000007523 nucleic acids Chemical class 0.000 description 21
- 239000002773 nucleotide Substances 0.000 description 21
- 125000003729 nucleotide group Chemical group 0.000 description 21
- 239000008188 pellet Substances 0.000 description 21
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 20
- 239000000047 product Substances 0.000 description 20
- 238000010367 cloning Methods 0.000 description 19
- 108010009298 lysylglutamic acid Proteins 0.000 description 19
- 241000588724 Escherichia coli Species 0.000 description 18
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 17
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 17
- 208000015181 infectious disease Diseases 0.000 description 17
- 108010073969 valyllysine Proteins 0.000 description 17
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 16
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 16
- 108010047495 alanylglycine Proteins 0.000 description 16
- 230000003321 amplification Effects 0.000 description 16
- 210000004899 c-terminal region Anatomy 0.000 description 16
- 108020001507 fusion proteins Proteins 0.000 description 16
- 102000037865 fusion proteins Human genes 0.000 description 16
- 108010034529 leucyl-lysine Proteins 0.000 description 16
- 238000003199 nucleic acid amplification method Methods 0.000 description 16
- 230000001225 therapeutic effect Effects 0.000 description 16
- CKSXSQUVEYCDIW-AVGNSLFASA-N Lys-Arg-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N CKSXSQUVEYCDIW-AVGNSLFASA-N 0.000 description 15
- 108010077245 asparaginyl-proline Proteins 0.000 description 15
- 108010038633 aspartylglutamate Proteins 0.000 description 15
- 230000028993 immune response Effects 0.000 description 15
- 238000000746 purification Methods 0.000 description 15
- 108010005233 alanylglutamic acid Proteins 0.000 description 14
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 14
- 108010054155 lysyllysine Proteins 0.000 description 14
- 230000009466 transformation Effects 0.000 description 14
- 108010051110 tyrosyl-lysine Proteins 0.000 description 14
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 13
- 238000005119 centrifugation Methods 0.000 description 13
- 239000003153 chemical reaction reagent Substances 0.000 description 13
- 239000000499 gel Substances 0.000 description 13
- 108010078144 glutaminyl-glycine Proteins 0.000 description 13
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 13
- 108010064235 lysylglycine Proteins 0.000 description 13
- 239000012528 membrane Substances 0.000 description 13
- 238000002360 preparation method Methods 0.000 description 13
- 239000000243 solution Substances 0.000 description 13
- 239000004475 Arginine Substances 0.000 description 12
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 12
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 12
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 12
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 12
- 229960003121 arginine Drugs 0.000 description 12
- 238000006243 chemical reaction Methods 0.000 description 12
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 12
- 108010003700 lysyl aspartic acid Proteins 0.000 description 12
- 108010051242 phenylalanylserine Proteins 0.000 description 12
- 241000699670 Mus sp. Species 0.000 description 11
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 11
- 238000012217 deletion Methods 0.000 description 11
- 230000037430 deletion Effects 0.000 description 11
- 230000000694 effects Effects 0.000 description 11
- 238000009396 hybridization Methods 0.000 description 11
- 108010017391 lysylvaline Proteins 0.000 description 11
- 210000004379 membrane Anatomy 0.000 description 11
- HEGSGKPQLMEBJL-RKQHYHRCSA-N octyl beta-D-glucopyranoside Chemical compound CCCCCCCCO[C@@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1O HEGSGKPQLMEBJL-RKQHYHRCSA-N 0.000 description 11
- 230000000069 prophylactic effect Effects 0.000 description 11
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 10
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 10
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 10
- 108010079364 N-glycylalanine Proteins 0.000 description 10
- 108700026244 Open Reading Frames Proteins 0.000 description 10
- 239000012472 biological sample Substances 0.000 description 10
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 9
- 102000009016 Cholera Toxin Human genes 0.000 description 9
- 108010049048 Cholera Toxin Proteins 0.000 description 9
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 9
- MJWVXZABPOKJJF-ACRUOGEOSA-N Leu-Phe-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MJWVXZABPOKJJF-ACRUOGEOSA-N 0.000 description 9
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 9
- 108010044940 alanylglutamine Proteins 0.000 description 9
- 239000003795 chemical substances by application Substances 0.000 description 9
- 238000003776 cleavage reaction Methods 0.000 description 9
- 238000002405 diagnostic procedure Methods 0.000 description 9
- 108010081551 glycylphenylalanine Proteins 0.000 description 9
- 238000003780 insertion Methods 0.000 description 9
- 230000037431 insertion Effects 0.000 description 9
- 239000002502 liposome Substances 0.000 description 9
- 239000008194 pharmaceutical composition Substances 0.000 description 9
- 108010071207 serylmethionine Proteins 0.000 description 9
- 238000010561 standard procedure Methods 0.000 description 9
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 8
- BCYGDJXHAGZNPQ-DCAQKATOSA-N Glu-Lys-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O BCYGDJXHAGZNPQ-DCAQKATOSA-N 0.000 description 8
- QJXHMYMRGDOHRU-NHCYSSNCSA-N Leu-Ile-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O QJXHMYMRGDOHRU-NHCYSSNCSA-N 0.000 description 8
- LUTDBHBIHHREDC-IHRRRGAJSA-N Lys-Pro-Lys Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O LUTDBHBIHHREDC-IHRRRGAJSA-N 0.000 description 8
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 8
- 108010011559 alanylphenylalanine Proteins 0.000 description 8
- 108010092854 aspartyllysine Proteins 0.000 description 8
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 8
- 239000002243 precursor Substances 0.000 description 8
- 108010090894 prolylleucine Proteins 0.000 description 8
- UPALZCBCKAMGIY-PEFMBERDSA-N Asn-Gln-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UPALZCBCKAMGIY-PEFMBERDSA-N 0.000 description 7
- OQXDUSZKISQQSS-GUBZILKMSA-N Glu-Lys-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OQXDUSZKISQQSS-GUBZILKMSA-N 0.000 description 7
- CCBIBMKQNXHNIN-ZETCQYMHSA-N Gly-Leu-Gly Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CCBIBMKQNXHNIN-ZETCQYMHSA-N 0.000 description 7
- 241000590017 Helicobacter felis Species 0.000 description 7
- 241000590006 Helicobacter mustelae Species 0.000 description 7
- 108010087924 alanylproline Proteins 0.000 description 7
- 125000002091 cationic group Chemical group 0.000 description 7
- 210000000349 chromosome Anatomy 0.000 description 7
- 238000010790 dilution Methods 0.000 description 7
- 239000012895 dilution Substances 0.000 description 7
- 230000002496 gastric effect Effects 0.000 description 7
- 108010049041 glutamylalanine Proteins 0.000 description 7
- 108010079547 glutamylmethionine Proteins 0.000 description 7
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 7
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 7
- 230000003053 immunization Effects 0.000 description 7
- 238000002649 immunization Methods 0.000 description 7
- 230000001939 inductive effect Effects 0.000 description 7
- 108010038320 lysylphenylalanine Proteins 0.000 description 7
- 239000002609 medium Substances 0.000 description 7
- 230000035772 mutation Effects 0.000 description 7
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 7
- 230000001681 protective effect Effects 0.000 description 7
- 230000007017 scission Effects 0.000 description 7
- 239000011780 sodium chloride Substances 0.000 description 7
- 239000006228 supernatant Substances 0.000 description 7
- 238000011282 treatment Methods 0.000 description 7
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 6
- 102100021935 C-C motif chemokine 26 Human genes 0.000 description 6
- 108020004705 Codon Proteins 0.000 description 6
- LURQDGKYBFWWJA-MNXVOIDGSA-N Gln-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)N)N LURQDGKYBFWWJA-MNXVOIDGSA-N 0.000 description 6
- CELXWPDNIGWCJN-WDCWCFNPSA-N Gln-Lys-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CELXWPDNIGWCJN-WDCWCFNPSA-N 0.000 description 6
- ZLCLYFGMKFCDCN-XPUUQOCRSA-N Gly-Ser-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CO)NC(=O)CN)C(O)=O ZLCLYFGMKFCDCN-XPUUQOCRSA-N 0.000 description 6
- 101000897493 Homo sapiens C-C motif chemokine 26 Proteins 0.000 description 6
- 108010065920 Insulin Lispro Proteins 0.000 description 6
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 6
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 6
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 6
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 6
- HKCCVDWHHTVVPN-CIUDSAMLSA-N Lys-Asp-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O HKCCVDWHHTVVPN-CIUDSAMLSA-N 0.000 description 6
- KPJJOZUXFOLGMQ-CIUDSAMLSA-N Lys-Asp-Asn Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N KPJJOZUXFOLGMQ-CIUDSAMLSA-N 0.000 description 6
- HAQLBBVZAGMESV-IHRRRGAJSA-N Met-Lys-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O HAQLBBVZAGMESV-IHRRRGAJSA-N 0.000 description 6
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 6
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 6
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 6
- IZFVRRYRMQFVGX-NRPADANISA-N Val-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N IZFVRRYRMQFVGX-NRPADANISA-N 0.000 description 6
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 6
- AEMPCGRFEZTWIF-IHRRRGAJSA-N Val-Leu-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O AEMPCGRFEZTWIF-IHRRRGAJSA-N 0.000 description 6
- 230000000890 antigenic effect Effects 0.000 description 6
- 230000008827 biological function Effects 0.000 description 6
- 239000012620 biological material Substances 0.000 description 6
- 229960005091 chloramphenicol Drugs 0.000 description 6
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 6
- 238000010276 construction Methods 0.000 description 6
- 238000013461 design Methods 0.000 description 6
- 230000002068 genetic effect Effects 0.000 description 6
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 6
- 108010020688 glycylhistidine Proteins 0.000 description 6
- 108010015792 glycyllysine Proteins 0.000 description 6
- 108010092114 histidylphenylalanine Proteins 0.000 description 6
- 238000007918 intramuscular administration Methods 0.000 description 6
- 108010076756 leucyl-alanyl-phenylalanine Proteins 0.000 description 6
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 6
- 238000004519 manufacturing process Methods 0.000 description 6
- 239000000463 material Substances 0.000 description 6
- 108010005942 methionylglycine Proteins 0.000 description 6
- 108010015796 prolylisoleucine Proteins 0.000 description 6
- 108010026333 seryl-proline Proteins 0.000 description 6
- 241000894007 species Species 0.000 description 6
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 6
- 238000007920 subcutaneous administration Methods 0.000 description 6
- 238000006467 substitution reaction Methods 0.000 description 6
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 5
- MNZHHDPWDWQJCQ-YUMQZZPRSA-N Ala-Leu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O MNZHHDPWDWQJCQ-YUMQZZPRSA-N 0.000 description 5
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 5
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 5
- HPASIOLTWSNMFB-OLHMAJIHSA-N Asn-Thr-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O HPASIOLTWSNMFB-OLHMAJIHSA-N 0.000 description 5
- VPPXTHJNTYDNFJ-CIUDSAMLSA-N Asp-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N VPPXTHJNTYDNFJ-CIUDSAMLSA-N 0.000 description 5
- 241000894006 Bacteria Species 0.000 description 5
- 208000007882 Gastritis Diseases 0.000 description 5
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 5
- FXLVSYVJDPCIHH-STQMWFEESA-N Gly-Phe-Arg Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FXLVSYVJDPCIHH-STQMWFEESA-N 0.000 description 5
- 239000004471 Glycine Substances 0.000 description 5
- QICVAHODWHIWIS-HTFCKZLJSA-N Ile-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N QICVAHODWHIWIS-HTFCKZLJSA-N 0.000 description 5
- MTFVYKQRLXYAQN-LAEOZQHASA-N Ile-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O MTFVYKQRLXYAQN-LAEOZQHASA-N 0.000 description 5
- MQFGXJNSUJTXDT-QSFUFRPTSA-N Ile-Gly-Ile Chemical compound N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)O MQFGXJNSUJTXDT-QSFUFRPTSA-N 0.000 description 5
- WIZPFZKOFZXDQG-HTFCKZLJSA-N Ile-Ile-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O WIZPFZKOFZXDQG-HTFCKZLJSA-N 0.000 description 5
- 102000000588 Interleukin-2 Human genes 0.000 description 5
- 108010002350 Interleukin-2 Proteins 0.000 description 5
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 5
- HBJZFCIVFIBNSV-DCAQKATOSA-N Leu-Arg-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(O)=O HBJZFCIVFIBNSV-DCAQKATOSA-N 0.000 description 5
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 5
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 5
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 5
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 5
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 5
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 5
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 5
- FKQPWMZLIIATBA-AJNGGQMLSA-N Leu-Lys-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FKQPWMZLIIATBA-AJNGGQMLSA-N 0.000 description 5
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 5
- LZHJZLHSRGWBBE-IHRRRGAJSA-N Leu-Lys-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LZHJZLHSRGWBBE-IHRRRGAJSA-N 0.000 description 5
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 5
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 5
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 5
- YXPJCVNIDDKGOE-MELADBBJSA-N Lys-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N)C(=O)O YXPJCVNIDDKGOE-MELADBBJSA-N 0.000 description 5
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 5
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 5
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 5
- 108091028043 Nucleic acid sequence Proteins 0.000 description 5
- 241000283973 Oryctolagus cuniculus Species 0.000 description 5
- KLYYKKGCPOGDPE-OEAJRASXSA-N Phe-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O KLYYKKGCPOGDPE-OEAJRASXSA-N 0.000 description 5
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 5
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 5
- XVAUJOAYHWWNQF-ZLUOBGJFSA-N Ser-Asn-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O XVAUJOAYHWWNQF-ZLUOBGJFSA-N 0.000 description 5
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 5
- FHXGMDRKJHKLKW-QWRGUYRKSA-N Ser-Tyr-Gly Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 FHXGMDRKJHKLKW-QWRGUYRKSA-N 0.000 description 5
- 239000007983 Tris buffer Substances 0.000 description 5
- XGEUYEOEZYFHRL-KKXDTOCCSA-N Tyr-Ala-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 XGEUYEOEZYFHRL-KKXDTOCCSA-N 0.000 description 5
- VHIZXDZMTDVFGX-DCAQKATOSA-N Val-Ser-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N VHIZXDZMTDVFGX-DCAQKATOSA-N 0.000 description 5
- QPJSIBAOZBVELU-BPNCWPANSA-N Val-Tyr-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N QPJSIBAOZBVELU-BPNCWPANSA-N 0.000 description 5
- LLJLBRRXKZTTRD-GUBZILKMSA-N Val-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N LLJLBRRXKZTTRD-GUBZILKMSA-N 0.000 description 5
- 238000007792 addition Methods 0.000 description 5
- 238000004458 analytical method Methods 0.000 description 5
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 5
- 238000004587 chromatography analysis Methods 0.000 description 5
- 108010089804 glycyl-threonine Proteins 0.000 description 5
- 108010010147 glycylglutamine Proteins 0.000 description 5
- 108010087823 glycyltyrosine Proteins 0.000 description 5
- 108010037850 glycylvaline Proteins 0.000 description 5
- 108010040030 histidinoalanine Proteins 0.000 description 5
- 108010025306 histidylleucine Proteins 0.000 description 5
- 108010085325 histidylproline Proteins 0.000 description 5
- 230000002209 hydrophobic effect Effects 0.000 description 5
- 238000002347 injection Methods 0.000 description 5
- 239000007924 injection Substances 0.000 description 5
- 229930027917 kanamycin Natural products 0.000 description 5
- 229960000318 kanamycin Drugs 0.000 description 5
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 5
- 229930182823 kanamycin A Natural products 0.000 description 5
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 5
- 108010057821 leucylproline Proteins 0.000 description 5
- 108010068488 methionylphenylalanine Proteins 0.000 description 5
- 239000011859 microparticle Substances 0.000 description 5
- 230000004899 motility Effects 0.000 description 5
- 108010018625 phenylalanylarginine Proteins 0.000 description 5
- 108010012581 phenylalanylglutamate Proteins 0.000 description 5
- 108010073101 phenylalanylleucine Proteins 0.000 description 5
- 239000008363 phosphate buffer Substances 0.000 description 5
- 108091008146 restriction endonucleases Proteins 0.000 description 5
- 238000001179 sorption measurement Methods 0.000 description 5
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 5
- 239000003053 toxin Substances 0.000 description 5
- 231100000765 toxin Toxicity 0.000 description 5
- 108700012359 toxins Proteins 0.000 description 5
- 238000013518 transcription Methods 0.000 description 5
- 230000035897 transcription Effects 0.000 description 5
- 230000002103 transcriptional effect Effects 0.000 description 5
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 5
- 108010077037 tyrosyl-tyrosyl-phenylalanine Proteins 0.000 description 5
- 108010003137 tyrosyltyrosine Proteins 0.000 description 5
- 108010009962 valyltyrosine Proteins 0.000 description 5
- LSLIRHLIUDVNBN-CIUDSAMLSA-N Ala-Asp-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LSLIRHLIUDVNBN-CIUDSAMLSA-N 0.000 description 4
- RZZMZYZXNJRPOJ-BJDJZHNGSA-N Ala-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C)N RZZMZYZXNJRPOJ-BJDJZHNGSA-N 0.000 description 4
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 4
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 4
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 4
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 4
- PEEYDECOOVQKRZ-DLOVCJGASA-N Ala-Ser-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PEEYDECOOVQKRZ-DLOVCJGASA-N 0.000 description 4
- WNHNMKOFKCHKKD-BFHQHQDPSA-N Ala-Thr-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O WNHNMKOFKCHKKD-BFHQHQDPSA-N 0.000 description 4
- 108010006591 Apoenzymes Proteins 0.000 description 4
- XYOVHPDDWCEUDY-CIUDSAMLSA-N Asn-Ala-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O XYOVHPDDWCEUDY-CIUDSAMLSA-N 0.000 description 4
- SLKLLQWZQHXYSV-CIUDSAMLSA-N Asn-Ala-Lys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O SLKLLQWZQHXYSV-CIUDSAMLSA-N 0.000 description 4
- SEKBHZJLARBNPB-GHCJXIJMSA-N Asn-Ile-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O SEKBHZJLARBNPB-GHCJXIJMSA-N 0.000 description 4
- JEEFEQCRXKPQHC-KKUMJFAQSA-N Asn-Leu-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JEEFEQCRXKPQHC-KKUMJFAQSA-N 0.000 description 4
- BEHQTVDBCLSCBY-CFMVVWHZSA-N Asn-Tyr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BEHQTVDBCLSCBY-CFMVVWHZSA-N 0.000 description 4
- ZAESWDKAMDVHLL-RCOVLWMOSA-N Asn-Val-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O ZAESWDKAMDVHLL-RCOVLWMOSA-N 0.000 description 4
- HXVILZUZXFLVEN-DCAQKATOSA-N Asp-Met-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O HXVILZUZXFLVEN-DCAQKATOSA-N 0.000 description 4
- RAGIABZNLPZBGS-FXQIFTODSA-N Cys-Pro-Cys Chemical compound N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(O)=O RAGIABZNLPZBGS-FXQIFTODSA-N 0.000 description 4
- 108090000695 Cytokines Proteins 0.000 description 4
- 102000004127 Cytokines Human genes 0.000 description 4
- 101710112752 Cytotoxin Proteins 0.000 description 4
- 108010090461 DFG peptide Proteins 0.000 description 4
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 4
- 241000282326 Felis catus Species 0.000 description 4
- KZKBJEUWNMQTLV-XDTLVQLUSA-N Gln-Ala-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KZKBJEUWNMQTLV-XDTLVQLUSA-N 0.000 description 4
- YXQCLIVLWCKCRS-RYUDHWBXSA-N Gln-Gly-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N)O YXQCLIVLWCKCRS-RYUDHWBXSA-N 0.000 description 4
- RJONUNZIMUXUOI-GUBZILKMSA-N Glu-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N RJONUNZIMUXUOI-GUBZILKMSA-N 0.000 description 4
- XHUCVVHRLNPZSZ-CIUDSAMLSA-N Glu-Gln-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XHUCVVHRLNPZSZ-CIUDSAMLSA-N 0.000 description 4
- ZWABFSSWTSAMQN-KBIXCLLPSA-N Glu-Ile-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O ZWABFSSWTSAMQN-KBIXCLLPSA-N 0.000 description 4
- XTZDZAXYPDISRR-MNXVOIDGSA-N Glu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XTZDZAXYPDISRR-MNXVOIDGSA-N 0.000 description 4
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 4
- MXXXVOYFNVJHMA-IUCAKERBSA-N Gly-Arg-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)CN MXXXVOYFNVJHMA-IUCAKERBSA-N 0.000 description 4
- GGEJHJIXRBTJPD-BYPYZUCNSA-N Gly-Asn-Gly Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GGEJHJIXRBTJPD-BYPYZUCNSA-N 0.000 description 4
- XBWMTPAIUQIWKA-BYULHYEWSA-N Gly-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN XBWMTPAIUQIWKA-BYULHYEWSA-N 0.000 description 4
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 4
- LRQXRHGQEVWGPV-NHCYSSNCSA-N Gly-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN LRQXRHGQEVWGPV-NHCYSSNCSA-N 0.000 description 4
- TVUWMSBGMVAHSJ-KBPBESRZSA-N Gly-Leu-Phe Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TVUWMSBGMVAHSJ-KBPBESRZSA-N 0.000 description 4
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 4
- JQFILXICXLDTRR-FBCQKBJTSA-N Gly-Thr-Gly Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)NCC(O)=O JQFILXICXLDTRR-FBCQKBJTSA-N 0.000 description 4
- ZZWUYQXMIFTIIY-WEDXCCLWSA-N Gly-Thr-Leu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O ZZWUYQXMIFTIIY-WEDXCCLWSA-N 0.000 description 4
- IZVICCORZOSGPT-JSGCOSHPSA-N Gly-Val-Tyr Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IZVICCORZOSGPT-JSGCOSHPSA-N 0.000 description 4
- ZRALSGWEFCBTJO-UHFFFAOYSA-N Guanidine Chemical compound NC(N)=N ZRALSGWEFCBTJO-UHFFFAOYSA-N 0.000 description 4
- WSWAUVHXQREQQG-JYJNAYRXSA-N His-Tyr-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O WSWAUVHXQREQQG-JYJNAYRXSA-N 0.000 description 4
- 241000282412 Homo Species 0.000 description 4
- RWIKBYVJQAJYDP-BJDJZHNGSA-N Ile-Ala-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RWIKBYVJQAJYDP-BJDJZHNGSA-N 0.000 description 4
- OVPYIUNCVSOVNF-ZPFDUUQYSA-N Ile-Gln-Pro Natural products CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(O)=O OVPYIUNCVSOVNF-ZPFDUUQYSA-N 0.000 description 4
- PHIXPNQDGGILMP-YVNDNENWSA-N Ile-Glu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PHIXPNQDGGILMP-YVNDNENWSA-N 0.000 description 4
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 4
- XLXPYSDGMXTTNQ-UHFFFAOYSA-N Ile-Phe-Leu Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(CC(C)C)C(O)=O)CC1=CC=CC=C1 XLXPYSDGMXTTNQ-UHFFFAOYSA-N 0.000 description 4
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 4
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 4
- MMEDVBWCMGRKKC-GARJFASQSA-N Leu-Asp-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N MMEDVBWCMGRKKC-GARJFASQSA-N 0.000 description 4
- LLBQJYDYOLIQAI-JYJNAYRXSA-N Leu-Glu-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LLBQJYDYOLIQAI-JYJNAYRXSA-N 0.000 description 4
- UCDHVOALNXENLC-KBPBESRZSA-N Leu-Gly-Tyr Chemical compound CC(C)C[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 UCDHVOALNXENLC-KBPBESRZSA-N 0.000 description 4
- JKSIBWITFMQTOA-XUXIUFHCSA-N Leu-Ile-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O JKSIBWITFMQTOA-XUXIUFHCSA-N 0.000 description 4
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 4
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 4
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 4
- BRTVHXHCUSXYRI-CIUDSAMLSA-N Leu-Ser-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O BRTVHXHCUSXYRI-CIUDSAMLSA-N 0.000 description 4
- GJJQCBVRWDGLMQ-GUBZILKMSA-N Lys-Glu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O GJJQCBVRWDGLMQ-GUBZILKMSA-N 0.000 description 4
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 4
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 4
- ZJWIXBZTAAJERF-IHRRRGAJSA-N Lys-Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZJWIXBZTAAJERF-IHRRRGAJSA-N 0.000 description 4
- UQRZFMQQXXJTTF-AVGNSLFASA-N Lys-Lys-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O UQRZFMQQXXJTTF-AVGNSLFASA-N 0.000 description 4
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 4
- HKXSZKJMDBHOTG-CIUDSAMLSA-N Lys-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN HKXSZKJMDBHOTG-CIUDSAMLSA-N 0.000 description 4
- DNWBUCHHMRQWCZ-GUBZILKMSA-N Lys-Ser-Gln Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O DNWBUCHHMRQWCZ-GUBZILKMSA-N 0.000 description 4
- PLOUVAYOMTYJRG-JXUBOQSCSA-N Lys-Thr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PLOUVAYOMTYJRG-JXUBOQSCSA-N 0.000 description 4
- BDFHWFUAQLIMJO-KXNHARMFSA-N Lys-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N)O BDFHWFUAQLIMJO-KXNHARMFSA-N 0.000 description 4
- VVURYEVJJTXWNE-ULQDDVLXSA-N Lys-Tyr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O VVURYEVJJTXWNE-ULQDDVLXSA-N 0.000 description 4
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 4
- BIYWZVCPZIFGPY-QWRGUYRKSA-N Phe-Gly-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CO)C(O)=O BIYWZVCPZIFGPY-QWRGUYRKSA-N 0.000 description 4
- KZRQONDKKJCAOL-DKIMLUQUSA-N Phe-Leu-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZRQONDKKJCAOL-DKIMLUQUSA-N 0.000 description 4
- LRBSWBVUCLLRLU-BZSNNMDCSA-N Phe-Leu-Lys Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(O)=O LRBSWBVUCLLRLU-BZSNNMDCSA-N 0.000 description 4
- RUDOLGWDSKQQFF-DCAQKATOSA-N Pro-Leu-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O RUDOLGWDSKQQFF-DCAQKATOSA-N 0.000 description 4
- 241000293869 Salmonella enterica subsp. enterica serovar Typhimurium Species 0.000 description 4
- RIAKPZVSNBBNRE-BJDJZHNGSA-N Ser-Ile-Leu Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O RIAKPZVSNBBNRE-BJDJZHNGSA-N 0.000 description 4
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 4
- VZQRNAYURWAEFE-KKUMJFAQSA-N Ser-Leu-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VZQRNAYURWAEFE-KKUMJFAQSA-N 0.000 description 4
- LRWBCWGEUCKDTN-BJDJZHNGSA-N Ser-Lys-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LRWBCWGEUCKDTN-BJDJZHNGSA-N 0.000 description 4
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 4
- HAUVENOGHPECML-BPUTZDHNSA-N Ser-Trp-Val Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CO)=CNC2=C1 HAUVENOGHPECML-BPUTZDHNSA-N 0.000 description 4
- 238000002105 Southern blotting Methods 0.000 description 4
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 4
- JBHMLZSKIXMVFS-XVSYOHENSA-N Thr-Asn-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JBHMLZSKIXMVFS-XVSYOHENSA-N 0.000 description 4
- UHBPFYOQQPFKQR-JHEQGTHGSA-N Thr-Gln-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O UHBPFYOQQPFKQR-JHEQGTHGSA-N 0.000 description 4
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 4
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 4
- DBOXBUDEAJVKRE-LSJOCFKGSA-N Val-Asn-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DBOXBUDEAJVKRE-LSJOCFKGSA-N 0.000 description 4
- OTJMMKPMLUNTQT-AVGNSLFASA-N Val-Leu-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N OTJMMKPMLUNTQT-AVGNSLFASA-N 0.000 description 4
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 4
- XXWBHOWRARMUOC-NHCYSSNCSA-N Val-Lys-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N XXWBHOWRARMUOC-NHCYSSNCSA-N 0.000 description 4
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 4
- JAIZPWVHPQRYOU-ZJDVBMNYSA-N Val-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O JAIZPWVHPQRYOU-ZJDVBMNYSA-N 0.000 description 4
- GUIYPEKUEMQBIK-JSGCOSHPSA-N Val-Tyr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NCC(O)=O GUIYPEKUEMQBIK-JSGCOSHPSA-N 0.000 description 4
- NLNCNKIVJPEFBC-DLOVCJGASA-N Val-Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O NLNCNKIVJPEFBC-DLOVCJGASA-N 0.000 description 4
- 238000002835 absorbance Methods 0.000 description 4
- 125000000129 anionic group Chemical group 0.000 description 4
- 238000013459 approach Methods 0.000 description 4
- 108010013835 arginine glutamate Proteins 0.000 description 4
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 4
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 4
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 4
- 108010062796 arginyllysine Proteins 0.000 description 4
- 108010093581 aspartyl-proline Proteins 0.000 description 4
- 108010047857 aspartylglycine Proteins 0.000 description 4
- 231100000599 cytotoxic agent Toxicity 0.000 description 4
- 239000002619 cytotoxin Substances 0.000 description 4
- 238000001514 detection method Methods 0.000 description 4
- 108010054812 diprotin A Proteins 0.000 description 4
- 108010054813 diprotin B Proteins 0.000 description 4
- 239000003814 drug Substances 0.000 description 4
- 239000013604 expression vector Substances 0.000 description 4
- 238000009472 formulation Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 238000001476 gene delivery Methods 0.000 description 4
- 108010073628 glutamyl-valyl-phenylalanine Proteins 0.000 description 4
- 108010090037 glycyl-alanyl-isoleucine Proteins 0.000 description 4
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 4
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 4
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 4
- 108010059898 glycyl-tyrosyl-lysine Proteins 0.000 description 4
- 229960000789 guanidine hydrochloride Drugs 0.000 description 4
- PJJJBBJSCAKJQF-UHFFFAOYSA-N guanidinium chloride Chemical compound [Cl-].NC(N)=[NH2+] PJJJBBJSCAKJQF-UHFFFAOYSA-N 0.000 description 4
- 230000000521 hyperimmunizing effect Effects 0.000 description 4
- 238000011534 incubation Methods 0.000 description 4
- 238000001990 intravenous administration Methods 0.000 description 4
- 108010078274 isoleucylvaline Proteins 0.000 description 4
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 4
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 4
- 150000002632 lipids Chemical class 0.000 description 4
- 108700023046 methionyl-leucyl-phenylalanine Proteins 0.000 description 4
- 108010056582 methionylglutamic acid Proteins 0.000 description 4
- 230000007935 neutral effect Effects 0.000 description 4
- PXHVJJICTQNCMI-UHFFFAOYSA-N nickel Substances [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 4
- 244000052769 pathogen Species 0.000 description 4
- 108010024607 phenylalanylalanine Proteins 0.000 description 4
- 230000037452 priming Effects 0.000 description 4
- 230000002829 reductive effect Effects 0.000 description 4
- 230000002441 reversible effect Effects 0.000 description 4
- 108010015840 seryl-prolyl-lysyl-lysine Proteins 0.000 description 4
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 4
- 239000012064 sodium phosphate buffer Substances 0.000 description 4
- 238000011144 upstream manufacturing Methods 0.000 description 4
- 239000003981 vehicle Substances 0.000 description 4
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 3
- BUANFPRKJKJSRR-ACZMJKKPSA-N Ala-Ala-Gln Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CCC(N)=O BUANFPRKJKJSRR-ACZMJKKPSA-N 0.000 description 3
- PXKLCFFSVLKOJM-ACZMJKKPSA-N Ala-Asn-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PXKLCFFSVLKOJM-ACZMJKKPSA-N 0.000 description 3
- GORKKVHIBWAQHM-GCJQMDKQSA-N Ala-Asn-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GORKKVHIBWAQHM-GCJQMDKQSA-N 0.000 description 3
- ZIBWKCRKNFYTPT-ZKWXMUAHSA-N Ala-Asn-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O ZIBWKCRKNFYTPT-ZKWXMUAHSA-N 0.000 description 3
- LGFCAXJBAZESCF-ACZMJKKPSA-N Ala-Gln-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O LGFCAXJBAZESCF-ACZMJKKPSA-N 0.000 description 3
- FVSOUJZKYWEFOB-KBIXCLLPSA-N Ala-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)N FVSOUJZKYWEFOB-KBIXCLLPSA-N 0.000 description 3
- VGPWRRFOPXVGOH-BYPYZUCNSA-N Ala-Gly-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)NCC(O)=O VGPWRRFOPXVGOH-BYPYZUCNSA-N 0.000 description 3
- LXAARTARZJJCMB-CIQUZCHMSA-N Ala-Ile-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LXAARTARZJJCMB-CIQUZCHMSA-N 0.000 description 3
- PIXQDIGKDNNOOV-GUBZILKMSA-N Ala-Lys-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O PIXQDIGKDNNOOV-GUBZILKMSA-N 0.000 description 3
- MFMDKJIPHSWSBM-GUBZILKMSA-N Ala-Lys-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFMDKJIPHSWSBM-GUBZILKMSA-N 0.000 description 3
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 3
- BTRULDJUUVGRNE-DCAQKATOSA-N Ala-Pro-Lys Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O BTRULDJUUVGRNE-DCAQKATOSA-N 0.000 description 3
- VJVQKGYHIZPSNS-FXQIFTODSA-N Ala-Ser-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N VJVQKGYHIZPSNS-FXQIFTODSA-N 0.000 description 3
- KLALXKYLOMZDQT-ZLUOBGJFSA-N Ala-Ser-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(N)=O KLALXKYLOMZDQT-ZLUOBGJFSA-N 0.000 description 3
- WVNFNPGXYADPPO-BQBZGAKWSA-N Arg-Gly-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O WVNFNPGXYADPPO-BQBZGAKWSA-N 0.000 description 3
- AGVNTAUPLWIQEN-ZPFDUUQYSA-N Arg-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AGVNTAUPLWIQEN-ZPFDUUQYSA-N 0.000 description 3
- NPAVRDPEFVKELR-DCAQKATOSA-N Arg-Lys-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NPAVRDPEFVKELR-DCAQKATOSA-N 0.000 description 3
- VRTWYUYCJGNFES-CIUDSAMLSA-N Arg-Ser-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O VRTWYUYCJGNFES-CIUDSAMLSA-N 0.000 description 3
- NUHQMYUWLUSRJX-BIIVOSGPSA-N Asn-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N NUHQMYUWLUSRJX-BIIVOSGPSA-N 0.000 description 3
- IARGXWMWRFOQPG-GCJQMDKQSA-N Asn-Ala-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IARGXWMWRFOQPG-GCJQMDKQSA-N 0.000 description 3
- ZZXMOQIUIJJOKZ-ZLUOBGJFSA-N Asn-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(N)=O ZZXMOQIUIJJOKZ-ZLUOBGJFSA-N 0.000 description 3
- RCENDENBBJFJHZ-ACZMJKKPSA-N Asn-Asn-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RCENDENBBJFJHZ-ACZMJKKPSA-N 0.000 description 3
- XVAPVJNJGLWGCS-ACZMJKKPSA-N Asn-Glu-Asn Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVAPVJNJGLWGCS-ACZMJKKPSA-N 0.000 description 3
- GQRDIVQPSMPQME-ZPFDUUQYSA-N Asn-Ile-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O GQRDIVQPSMPQME-ZPFDUUQYSA-N 0.000 description 3
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 3
- JTXVXGXTRXMOFJ-FXQIFTODSA-N Asn-Pro-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O JTXVXGXTRXMOFJ-FXQIFTODSA-N 0.000 description 3
- BYLSYQASFJJBCL-DCAQKATOSA-N Asn-Pro-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BYLSYQASFJJBCL-DCAQKATOSA-N 0.000 description 3
- HNXWVVHIGTZTBO-LKXGYXEUSA-N Asn-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O HNXWVVHIGTZTBO-LKXGYXEUSA-N 0.000 description 3
- HCZQKHSRYHCPSD-IUKAMOBKSA-N Asn-Thr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HCZQKHSRYHCPSD-IUKAMOBKSA-N 0.000 description 3
- PUUPMDXIHCOPJU-HJGDQZAQSA-N Asn-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O PUUPMDXIHCOPJU-HJGDQZAQSA-N 0.000 description 3
- KSZHWTRZPOTIGY-AVGNSLFASA-N Asn-Tyr-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O KSZHWTRZPOTIGY-AVGNSLFASA-N 0.000 description 3
- XEDQMTWEYFBOIK-ACZMJKKPSA-N Asp-Ala-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XEDQMTWEYFBOIK-ACZMJKKPSA-N 0.000 description 3
- CXBOKJPLEYUPGB-FXQIFTODSA-N Asp-Ala-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)O)N CXBOKJPLEYUPGB-FXQIFTODSA-N 0.000 description 3
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 3
- OVPHVTCDVYYTHN-AVGNSLFASA-N Asp-Glu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OVPHVTCDVYYTHN-AVGNSLFASA-N 0.000 description 3
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 3
- PCJOFZYFFMBZKC-PCBIJLKTSA-N Asp-Phe-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PCJOFZYFFMBZKC-PCBIJLKTSA-N 0.000 description 3
- ZQFRDAZBTSFGGW-SRVKXCTJSA-N Asp-Ser-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZQFRDAZBTSFGGW-SRVKXCTJSA-N 0.000 description 3
- QOJJMJKTMKNFEF-ZKWXMUAHSA-N Asp-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O QOJJMJKTMKNFEF-ZKWXMUAHSA-N 0.000 description 3
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 3
- CIVXDCMSSFGWAL-YUMQZZPRSA-N Cys-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CS)N CIVXDCMSSFGWAL-YUMQZZPRSA-N 0.000 description 3
- 238000002965 ELISA Methods 0.000 description 3
- RZSLYUUFFVHFRQ-FXQIFTODSA-N Gln-Ala-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O RZSLYUUFFVHFRQ-FXQIFTODSA-N 0.000 description 3
- MAGNEQBFSBREJL-DCAQKATOSA-N Gln-Glu-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N MAGNEQBFSBREJL-DCAQKATOSA-N 0.000 description 3
- HXOLDXKNWKLDMM-YVNDNENWSA-N Gln-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HXOLDXKNWKLDMM-YVNDNENWSA-N 0.000 description 3
- HPCOBEHVEHWREJ-DCAQKATOSA-N Gln-Lys-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HPCOBEHVEHWREJ-DCAQKATOSA-N 0.000 description 3
- FKXCBKCOSVIGCT-AVGNSLFASA-N Gln-Lys-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FKXCBKCOSVIGCT-AVGNSLFASA-N 0.000 description 3
- QBEWLBKBGXVVPD-RYUDHWBXSA-N Gln-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N QBEWLBKBGXVVPD-RYUDHWBXSA-N 0.000 description 3
- MKRDNSWGJWTBKZ-GVXVVHGQSA-N Gln-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MKRDNSWGJWTBKZ-GVXVVHGQSA-N 0.000 description 3
- GJBUAAAIZSRCDC-GVXVVHGQSA-N Glu-Leu-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O GJBUAAAIZSRCDC-GVXVVHGQSA-N 0.000 description 3
- SWRVAQHFBRZVNX-GUBZILKMSA-N Glu-Lys-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SWRVAQHFBRZVNX-GUBZILKMSA-N 0.000 description 3
- YKBUCXNNBYZYAY-MNXVOIDGSA-N Glu-Lys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YKBUCXNNBYZYAY-MNXVOIDGSA-N 0.000 description 3
- YTRBQAQSUDSIQE-FHWLQOOXSA-N Glu-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 YTRBQAQSUDSIQE-FHWLQOOXSA-N 0.000 description 3
- JVYNYWXHZWVJEF-NUMRIWBASA-N Glu-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O JVYNYWXHZWVJEF-NUMRIWBASA-N 0.000 description 3
- VIPDPMHGICREIS-GVXVVHGQSA-N Glu-Val-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VIPDPMHGICREIS-GVXVVHGQSA-N 0.000 description 3
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 3
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 description 3
- ZQIMMEYPEXIYBB-IUCAKERBSA-N Gly-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN ZQIMMEYPEXIYBB-IUCAKERBSA-N 0.000 description 3
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 3
- XMPXVJIDADUOQB-RCOVLWMOSA-N Gly-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C([O-])=O)NC(=O)CNC(=O)C[NH3+] XMPXVJIDADUOQB-RCOVLWMOSA-N 0.000 description 3
- OLPPXYMMIARYAL-QMMMGPOBSA-N Gly-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)CN OLPPXYMMIARYAL-QMMMGPOBSA-N 0.000 description 3
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 3
- FCKPEGOCSVZPNC-WHOFXGATSA-N Gly-Ile-Phe Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FCKPEGOCSVZPNC-WHOFXGATSA-N 0.000 description 3
- IUZGUFAJDBHQQV-YUMQZZPRSA-N Gly-Leu-Asn Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IUZGUFAJDBHQQV-YUMQZZPRSA-N 0.000 description 3
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 3
- MIIVFRCYJABHTQ-ONGXEEELSA-N Gly-Leu-Val Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O MIIVFRCYJABHTQ-ONGXEEELSA-N 0.000 description 3
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 3
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 3
- VDCRBJACQKOSMS-JSGCOSHPSA-N Gly-Phe-Val Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O VDCRBJACQKOSMS-JSGCOSHPSA-N 0.000 description 3
- SOEGEPHNZOISMT-BYPYZUCNSA-N Gly-Ser-Gly Chemical compound NCC(=O)N[C@@H](CO)C(=O)NCC(O)=O SOEGEPHNZOISMT-BYPYZUCNSA-N 0.000 description 3
- LLWQVJNHMYBLLK-CDMKHQONSA-N Gly-Thr-Phe Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LLWQVJNHMYBLLK-CDMKHQONSA-N 0.000 description 3
- HQSKKSLNLSTONK-JTQLQIEISA-N Gly-Tyr-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 HQSKKSLNLSTONK-JTQLQIEISA-N 0.000 description 3
- ZVXMEWXHFBYJPI-LSJOCFKGSA-N Gly-Val-Ile Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZVXMEWXHFBYJPI-LSJOCFKGSA-N 0.000 description 3
- SBVMXEZQJVUARN-XPUUQOCRSA-N Gly-Val-Ser Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O SBVMXEZQJVUARN-XPUUQOCRSA-N 0.000 description 3
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 3
- 241000590002 Helicobacter pylori Species 0.000 description 3
- GGXUJBKENKVYNV-ULQDDVLXSA-N His-Val-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N GGXUJBKENKVYNV-ULQDDVLXSA-N 0.000 description 3
- VSZALHITQINTGC-GHCJXIJMSA-N Ile-Ala-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VSZALHITQINTGC-GHCJXIJMSA-N 0.000 description 3
- HDOYNXLPTRQLAD-JBDRJPRFSA-N Ile-Ala-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)O)N HDOYNXLPTRQLAD-JBDRJPRFSA-N 0.000 description 3
- DFJJAVZIHDFOGQ-MNXVOIDGSA-N Ile-Glu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DFJJAVZIHDFOGQ-MNXVOIDGSA-N 0.000 description 3
- FUOYNOXRWPJPAN-QEWYBTABSA-N Ile-Glu-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N FUOYNOXRWPJPAN-QEWYBTABSA-N 0.000 description 3
- CDGLBYSAZFIIJO-RCOVLWMOSA-N Ile-Gly-Gly Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O CDGLBYSAZFIIJO-RCOVLWMOSA-N 0.000 description 3
- SVBAHOMTJRFSIC-SXTJYALSSA-N Ile-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SVBAHOMTJRFSIC-SXTJYALSSA-N 0.000 description 3
- SJLVSMMIFYTSGY-GRLWGSQLSA-N Ile-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SJLVSMMIFYTSGY-GRLWGSQLSA-N 0.000 description 3
- YNMQUIVKEFRCPH-QSFUFRPTSA-N Ile-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)O)N YNMQUIVKEFRCPH-QSFUFRPTSA-N 0.000 description 3
- DMSVBUWGDLYNLC-IAVJCBSLSA-N Ile-Ile-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DMSVBUWGDLYNLC-IAVJCBSLSA-N 0.000 description 3
- FZWVCYCYWCLQDH-NHCYSSNCSA-N Ile-Leu-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N FZWVCYCYWCLQDH-NHCYSSNCSA-N 0.000 description 3
- IOVUXUSIGXCREV-DKIMLUQUSA-N Ile-Leu-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IOVUXUSIGXCREV-DKIMLUQUSA-N 0.000 description 3
- UIEZQYNXCYHMQS-BJDJZHNGSA-N Ile-Lys-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)O)N UIEZQYNXCYHMQS-BJDJZHNGSA-N 0.000 description 3
- JODPUDMBQBIWCK-GHCJXIJMSA-N Ile-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O JODPUDMBQBIWCK-GHCJXIJMSA-N 0.000 description 3
- KBDIBHQICWDGDL-PPCPHDFISA-N Ile-Thr-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N KBDIBHQICWDGDL-PPCPHDFISA-N 0.000 description 3
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 3
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 3
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 3
- HXWALXSAVBLTPK-NUTKFTJISA-N Leu-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(C)C)N HXWALXSAVBLTPK-NUTKFTJISA-N 0.000 description 3
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 3
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 3
- RFUBXQQFJFGJFV-GUBZILKMSA-N Leu-Asn-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RFUBXQQFJFGJFV-GUBZILKMSA-N 0.000 description 3
- KKXDHFKZWKLYGB-GUBZILKMSA-N Leu-Asn-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKXDHFKZWKLYGB-GUBZILKMSA-N 0.000 description 3
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 3
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 3
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 3
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 3
- PBGDOSARRIJMEV-DLOVCJGASA-N Leu-His-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O PBGDOSARRIJMEV-DLOVCJGASA-N 0.000 description 3
- DBSLVQBXKVKDKJ-BJDJZHNGSA-N Leu-Ile-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O DBSLVQBXKVKDKJ-BJDJZHNGSA-N 0.000 description 3
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 3
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 3
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 3
- OVZLLFONXILPDZ-VOAKCMCISA-N Leu-Lys-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OVZLLFONXILPDZ-VOAKCMCISA-N 0.000 description 3
- POMXSEDNUXYPGK-IHRRRGAJSA-N Leu-Met-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N POMXSEDNUXYPGK-IHRRRGAJSA-N 0.000 description 3
- IBSGMIPRBMPMHE-IHRRRGAJSA-N Leu-Met-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O IBSGMIPRBMPMHE-IHRRRGAJSA-N 0.000 description 3
- NJMXCOOEFLMZSR-AVGNSLFASA-N Leu-Met-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O NJMXCOOEFLMZSR-AVGNSLFASA-N 0.000 description 3
- SBANPBVRHYIMRR-GARJFASQSA-N Leu-Ser-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N SBANPBVRHYIMRR-GARJFASQSA-N 0.000 description 3
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 3
- ARNIBBOXIAWUOP-MGHWNKPDSA-N Leu-Tyr-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ARNIBBOXIAWUOP-MGHWNKPDSA-N 0.000 description 3
- MVJRBCJCRYGCKV-GVXVVHGQSA-N Leu-Val-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MVJRBCJCRYGCKV-GVXVVHGQSA-N 0.000 description 3
- 239000006142 Luria-Bertani Agar Substances 0.000 description 3
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 3
- ALSRJRIWBNENFY-DCAQKATOSA-N Lys-Arg-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O ALSRJRIWBNENFY-DCAQKATOSA-N 0.000 description 3
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 3
- WALVCOOOKULCQM-ULQDDVLXSA-N Lys-Arg-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WALVCOOOKULCQM-ULQDDVLXSA-N 0.000 description 3
- ABHIXYDMILIUKV-CIUDSAMLSA-N Lys-Asn-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ABHIXYDMILIUKV-CIUDSAMLSA-N 0.000 description 3
- ZQCVMVCVPFYXHZ-SRVKXCTJSA-N Lys-Asn-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN ZQCVMVCVPFYXHZ-SRVKXCTJSA-N 0.000 description 3
- DGWXCIORNLWGGG-CIUDSAMLSA-N Lys-Asn-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O DGWXCIORNLWGGG-CIUDSAMLSA-N 0.000 description 3
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 3
- NTBFKPBULZGXQL-KKUMJFAQSA-N Lys-Asp-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTBFKPBULZGXQL-KKUMJFAQSA-N 0.000 description 3
- DUTMKEAPLLUGNO-JYJNAYRXSA-N Lys-Glu-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DUTMKEAPLLUGNO-JYJNAYRXSA-N 0.000 description 3
- PGLGNCVOWIORQE-SRVKXCTJSA-N Lys-His-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O PGLGNCVOWIORQE-SRVKXCTJSA-N 0.000 description 3
- KYNNSEJZFVCDIV-ZPFDUUQYSA-N Lys-Ile-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O KYNNSEJZFVCDIV-ZPFDUUQYSA-N 0.000 description 3
- GAHJXEMYXKLZRQ-AJNGGQMLSA-N Lys-Lys-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GAHJXEMYXKLZRQ-AJNGGQMLSA-N 0.000 description 3
- WKUXWMWQTOYTFI-SRVKXCTJSA-N Lys-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N WKUXWMWQTOYTFI-SRVKXCTJSA-N 0.000 description 3
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 3
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 3
- UGCIQUYEJIEHKX-GVXVVHGQSA-N Lys-Val-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O UGCIQUYEJIEHKX-GVXVVHGQSA-N 0.000 description 3
- TXTZMVNJIRZABH-ULQDDVLXSA-N Lys-Val-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TXTZMVNJIRZABH-ULQDDVLXSA-N 0.000 description 3
- AETNZPKUUYYYEK-CIUDSAMLSA-N Met-Glu-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O AETNZPKUUYYYEK-CIUDSAMLSA-N 0.000 description 3
- IUYCGMNKIZDRQI-BQBZGAKWSA-N Met-Gly-Ala Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O IUYCGMNKIZDRQI-BQBZGAKWSA-N 0.000 description 3
- QGRJTULYDZUBAY-ZPFDUUQYSA-N Met-Ile-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O QGRJTULYDZUBAY-ZPFDUUQYSA-N 0.000 description 3
- LXCSZPUQKMTXNW-BQBZGAKWSA-N Met-Ser-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O LXCSZPUQKMTXNW-BQBZGAKWSA-N 0.000 description 3
- SOAYQFDWEIWPPR-IHRRRGAJSA-N Met-Ser-Tyr Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O SOAYQFDWEIWPPR-IHRRRGAJSA-N 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- 241001529936 Murinae Species 0.000 description 3
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 3
- 108010066427 N-valyltryptophan Proteins 0.000 description 3
- 239000000020 Nitrocellulose Substances 0.000 description 3
- 208000008469 Peptic Ulcer Diseases 0.000 description 3
- 102000003992 Peroxidases Human genes 0.000 description 3
- UHRNIXJAGGLKHP-DLOVCJGASA-N Phe-Ala-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O UHRNIXJAGGLKHP-DLOVCJGASA-N 0.000 description 3
- WYPVCIACUMJRIB-JYJNAYRXSA-N Phe-Gln-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N WYPVCIACUMJRIB-JYJNAYRXSA-N 0.000 description 3
- WKTSCAXSYITIJJ-PCBIJLKTSA-N Phe-Ile-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O WKTSCAXSYITIJJ-PCBIJLKTSA-N 0.000 description 3
- GXDPQJUBLBZKDY-IAVJCBSLSA-N Phe-Ile-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GXDPQJUBLBZKDY-IAVJCBSLSA-N 0.000 description 3
- SMFGCTXUBWEPKM-KBPBESRZSA-N Phe-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 SMFGCTXUBWEPKM-KBPBESRZSA-N 0.000 description 3
- RMKGXGPQIPLTFC-KKUMJFAQSA-N Phe-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RMKGXGPQIPLTFC-KKUMJFAQSA-N 0.000 description 3
- OQTDZEJJWWAGJT-KKUMJFAQSA-N Phe-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O OQTDZEJJWWAGJT-KKUMJFAQSA-N 0.000 description 3
- DSXPMZMSJHOKKK-HJOGWXRNSA-N Phe-Phe-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O DSXPMZMSJHOKKK-HJOGWXRNSA-N 0.000 description 3
- IAOZOFPONWDXNT-IXOXFDKPSA-N Phe-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IAOZOFPONWDXNT-IXOXFDKPSA-N 0.000 description 3
- KIQUCMUULDXTAZ-HJOGWXRNSA-N Phe-Tyr-Tyr Chemical compound N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O KIQUCMUULDXTAZ-HJOGWXRNSA-N 0.000 description 3
- RGMLUHANLDVMPB-ULQDDVLXSA-N Phe-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RGMLUHANLDVMPB-ULQDDVLXSA-N 0.000 description 3
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 3
- 229920005654 Sephadex Polymers 0.000 description 3
- 239000012507 Sephadex™ Substances 0.000 description 3
- 229920002684 Sepharose Polymers 0.000 description 3
- PZZJMBYSYAKYPK-UWJYBYFXSA-N Ser-Ala-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PZZJMBYSYAKYPK-UWJYBYFXSA-N 0.000 description 3
- UBRXAVQWXOWRSJ-ZLUOBGJFSA-N Ser-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)C(=O)N UBRXAVQWXOWRSJ-ZLUOBGJFSA-N 0.000 description 3
- YMEXHZTVKDAKIY-GHCJXIJMSA-N Ser-Asn-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO)C(O)=O YMEXHZTVKDAKIY-GHCJXIJMSA-N 0.000 description 3
- VMVNCJDKFOQOHM-GUBZILKMSA-N Ser-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CO)N VMVNCJDKFOQOHM-GUBZILKMSA-N 0.000 description 3
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 3
- JFWDJFULOLKQFY-QWRGUYRKSA-N Ser-Gly-Phe Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JFWDJFULOLKQFY-QWRGUYRKSA-N 0.000 description 3
- FYUIFUJFNCLUIX-XVYDVKMFSA-N Ser-His-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O FYUIFUJFNCLUIX-XVYDVKMFSA-N 0.000 description 3
- ZIFYDQAFEMIZII-GUBZILKMSA-N Ser-Leu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZIFYDQAFEMIZII-GUBZILKMSA-N 0.000 description 3
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 3
- HEUVHBXOVZONPU-BJDJZHNGSA-N Ser-Leu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HEUVHBXOVZONPU-BJDJZHNGSA-N 0.000 description 3
- KCGIREHVWRXNDH-GARJFASQSA-N Ser-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N KCGIREHVWRXNDH-GARJFASQSA-N 0.000 description 3
- JWOBLHJRDADHLN-KKUMJFAQSA-N Ser-Leu-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JWOBLHJRDADHLN-KKUMJFAQSA-N 0.000 description 3
- QMCDMHWAKMUGJE-IHRRRGAJSA-N Ser-Phe-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O QMCDMHWAKMUGJE-IHRRRGAJSA-N 0.000 description 3
- NMZXJDSKEGFDLJ-DCAQKATOSA-N Ser-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CO)N)C(=O)N[C@@H](CCCCN)C(=O)O NMZXJDSKEGFDLJ-DCAQKATOSA-N 0.000 description 3
- JCLAFVNDBJMLBC-JBDRJPRFSA-N Ser-Ser-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JCLAFVNDBJMLBC-JBDRJPRFSA-N 0.000 description 3
- CUXJENOFJXOSOZ-BIIVOSGPSA-N Ser-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CO)N)C(=O)O CUXJENOFJXOSOZ-BIIVOSGPSA-N 0.000 description 3
- PIQRHJQWEPWFJG-UWJYBYFXSA-N Ser-Tyr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PIQRHJQWEPWFJG-UWJYBYFXSA-N 0.000 description 3
- HAYADTTXNZFUDM-IHRRRGAJSA-N Ser-Tyr-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O HAYADTTXNZFUDM-IHRRRGAJSA-N 0.000 description 3
- PCMZJFMUYWIERL-ZKWXMUAHSA-N Ser-Val-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PCMZJFMUYWIERL-ZKWXMUAHSA-N 0.000 description 3
- JZRYFUGREMECBH-XPUUQOCRSA-N Ser-Val-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O JZRYFUGREMECBH-XPUUQOCRSA-N 0.000 description 3
- MFQMZDPAZRZAPV-NAKRPEOUSA-N Ser-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CO)N MFQMZDPAZRZAPV-NAKRPEOUSA-N 0.000 description 3
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 3
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 3
- HJOSVGCWOTYJFG-WDCWCFNPSA-N Thr-Glu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O HJOSVGCWOTYJFG-WDCWCFNPSA-N 0.000 description 3
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 3
- JKGGPMOUIAAJAA-YEPSODPASA-N Thr-Gly-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O JKGGPMOUIAAJAA-YEPSODPASA-N 0.000 description 3
- AHOLTQCAVBSUDP-PPCPHDFISA-N Thr-Ile-Lys Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)[C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O AHOLTQCAVBSUDP-PPCPHDFISA-N 0.000 description 3
- PRNGXSILMXSWQQ-OEAJRASXSA-N Thr-Leu-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PRNGXSILMXSWQQ-OEAJRASXSA-N 0.000 description 3
- UUSQVWOVUYMLJA-PPCPHDFISA-N Thr-Lys-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UUSQVWOVUYMLJA-PPCPHDFISA-N 0.000 description 3
- SPVHQURZJCUDQC-VOAKCMCISA-N Thr-Lys-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O SPVHQURZJCUDQC-VOAKCMCISA-N 0.000 description 3
- MGJLBZFUXUGMML-VOAKCMCISA-N Thr-Lys-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MGJLBZFUXUGMML-VOAKCMCISA-N 0.000 description 3
- GFRIEEKFXOVPIR-RHYQMDGZSA-N Thr-Pro-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O GFRIEEKFXOVPIR-RHYQMDGZSA-N 0.000 description 3
- CRWOSTCODDFEKZ-HRCADAONSA-N Tyr-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O CRWOSTCODDFEKZ-HRCADAONSA-N 0.000 description 3
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 3
- WSFXJLFSJSXGMQ-MGHWNKPDSA-N Tyr-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N WSFXJLFSJSXGMQ-MGHWNKPDSA-N 0.000 description 3
- NVZVJIUDICCMHZ-BZSNNMDCSA-N Tyr-Phe-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O NVZVJIUDICCMHZ-BZSNNMDCSA-N 0.000 description 3
- COYSIHFOCOMGCF-WPRPVWTQSA-N Val-Arg-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-WPRPVWTQSA-N 0.000 description 3
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 3
- PMXBARDFIAPBGK-DZKIICNBSA-N Val-Glu-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PMXBARDFIAPBGK-DZKIICNBSA-N 0.000 description 3
- URIRWLJVWHYLET-ONGXEEELSA-N Val-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C URIRWLJVWHYLET-ONGXEEELSA-N 0.000 description 3
- OVBMCNDKCWAXMZ-NAKRPEOUSA-N Val-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N OVBMCNDKCWAXMZ-NAKRPEOUSA-N 0.000 description 3
- IJGPOONOTBNTFS-GVXVVHGQSA-N Val-Lys-Glu Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O IJGPOONOTBNTFS-GVXVVHGQSA-N 0.000 description 3
- DIOSYUIWOQCXNR-ONGXEEELSA-N Val-Lys-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O DIOSYUIWOQCXNR-ONGXEEELSA-N 0.000 description 3
- XPKCFQZDQGVJCX-RHYQMDGZSA-N Val-Lys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N)O XPKCFQZDQGVJCX-RHYQMDGZSA-N 0.000 description 3
- JVGHIFMSFBZDHH-WPRPVWTQSA-N Val-Met-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)NCC(=O)O)N JVGHIFMSFBZDHH-WPRPVWTQSA-N 0.000 description 3
- MJOUSKQHAIARKI-JYJNAYRXSA-N Val-Phe-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 MJOUSKQHAIARKI-JYJNAYRXSA-N 0.000 description 3
- IECQJCJNPJVUSB-IHRRRGAJSA-N Val-Tyr-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(O)=O IECQJCJNPJVUSB-IHRRRGAJSA-N 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- 238000001042 affinity chromatography Methods 0.000 description 3
- 108010070944 alanylhistidine Proteins 0.000 description 3
- 108010070783 alanyltyrosine Proteins 0.000 description 3
- 101150073130 ampR gene Proteins 0.000 description 3
- 108010008355 arginyl-glutamine Proteins 0.000 description 3
- 108010068265 aspartyltyrosine Proteins 0.000 description 3
- 210000004369 blood Anatomy 0.000 description 3
- 239000008280 blood Substances 0.000 description 3
- 229940098773 bovine serum albumin Drugs 0.000 description 3
- 239000007853 buffer solution Substances 0.000 description 3
- 230000001413 cellular effect Effects 0.000 description 3
- 239000013611 chromosomal DNA Substances 0.000 description 3
- 230000000295 complement effect Effects 0.000 description 3
- 108091036078 conserved sequence Proteins 0.000 description 3
- 101150028842 ctxA gene Proteins 0.000 description 3
- 238000000502 dialysis Methods 0.000 description 3
- 230000029087 digestion Effects 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 210000003495 flagella Anatomy 0.000 description 3
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 3
- 108010078326 glycyl-glycyl-valine Proteins 0.000 description 3
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 3
- 239000010931 gold Substances 0.000 description 3
- 229910052737 gold Inorganic materials 0.000 description 3
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 3
- 108010028295 histidylhistidine Proteins 0.000 description 3
- 230000002163 immunogen Effects 0.000 description 3
- 239000012535 impurity Substances 0.000 description 3
- 239000004615 ingredient Substances 0.000 description 3
- 230000000968 intestinal effect Effects 0.000 description 3
- 238000007912 intraperitoneal administration Methods 0.000 description 3
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 3
- 108010091871 leucylmethionine Proteins 0.000 description 3
- 230000029226 lipidation Effects 0.000 description 3
- 108010034507 methionyltryptophan Proteins 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 238000002703 mutagenesis Methods 0.000 description 3
- 231100000350 mutagenesis Toxicity 0.000 description 3
- 229920001220 nitrocellulos Polymers 0.000 description 3
- 238000007911 parenteral administration Methods 0.000 description 3
- 108040007629 peroxidase activity proteins Proteins 0.000 description 3
- 108010070409 phenylalanyl-glycyl-glycine Proteins 0.000 description 3
- 108010025488 pinealon Proteins 0.000 description 3
- 229920000136 polysorbate Polymers 0.000 description 3
- 108010031719 prolyl-serine Proteins 0.000 description 3
- 230000002685 pulmonary effect Effects 0.000 description 3
- 210000002966 serum Anatomy 0.000 description 3
- 239000007787 solid Substances 0.000 description 3
- PFNFFQXMRSDOHW-UHFFFAOYSA-N spermine Chemical compound NCCCNCCCCNCCCN PFNFFQXMRSDOHW-UHFFFAOYSA-N 0.000 description 3
- 210000002784 stomach Anatomy 0.000 description 3
- 229960005322 streptomycin Drugs 0.000 description 3
- 239000000758 substrate Substances 0.000 description 3
- 239000000725 suspension Substances 0.000 description 3
- 108010061238 threonyl-glycine Proteins 0.000 description 3
- 210000001519 tissue Anatomy 0.000 description 3
- 238000013519 translation Methods 0.000 description 3
- 108010029384 tryptophyl-histidine Proteins 0.000 description 3
- 108010038745 tryptophylglycine Proteins 0.000 description 3
- 108010078580 tyrosylleucine Proteins 0.000 description 3
- 241000701161 unidentified adenovirus Species 0.000 description 3
- 210000001635 urinary tract Anatomy 0.000 description 3
- 230000003612 virological effect Effects 0.000 description 3
- 238000001262 western blot Methods 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 2
- PAHHYDSPOXDASW-VGWMRTNUSA-N (2s)-6-amino-2-[[(2s)-6-amino-2-[[(2s)-1-[(2s)-2-amino-3-hydroxypropanoyl]pyrrolidine-2-carbonyl]amino]hexanoyl]amino]hexanoic acid Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO PAHHYDSPOXDASW-VGWMRTNUSA-N 0.000 description 2
- LEBVLXFERQHONN-UHFFFAOYSA-N 1-butyl-N-(2,6-dimethylphenyl)piperidine-2-carboxamide Chemical compound CCCCN1CCCCC1C(=O)NC1=C(C)C=CC=C1C LEBVLXFERQHONN-UHFFFAOYSA-N 0.000 description 2
- LDGWQMRUWMSZIU-LQDDAWAPSA-M 2,3-bis[(z)-octadec-9-enoxy]propyl-trimethylazanium;chloride Chemical compound [Cl-].CCCCCCCC\C=C/CCCCCCCCOCC(C[N+](C)(C)C)OCCCCCCCC\C=C/CCCCCCCC LDGWQMRUWMSZIU-LQDDAWAPSA-M 0.000 description 2
- HZAXFHJVJLSVMW-UHFFFAOYSA-N 2-Aminoethan-1-ol Chemical compound NCCO HZAXFHJVJLSVMW-UHFFFAOYSA-N 0.000 description 2
- HXUVTXPOZRFMOY-NSHDSACASA-N 2-[[(2s)-2-[[2-[(2-aminoacetyl)amino]acetyl]amino]-3-phenylpropanoyl]amino]acetic acid Chemical compound NCC(=O)NCC(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 HXUVTXPOZRFMOY-NSHDSACASA-N 0.000 description 2
- OTEWWRBKGONZBW-UHFFFAOYSA-N 2-[[2-[[2-[(2-azaniumylacetyl)amino]-4-methylpentanoyl]amino]acetyl]amino]acetate Chemical compound NCC(=O)NC(CC(C)C)C(=O)NCC(=O)NCC(O)=O OTEWWRBKGONZBW-UHFFFAOYSA-N 0.000 description 2
- DQPMXYDFWRYWQV-UHFFFAOYSA-N 2-[[6-amino-2-[[2-[(2-amino-3-methylbutanoyl)amino]-3-hydroxybutanoyl]amino]hexanoyl]amino]acetic acid Chemical compound CC(C)C(N)C(=O)NC(C(C)O)C(=O)NC(CCCCN)C(=O)NCC(O)=O DQPMXYDFWRYWQV-UHFFFAOYSA-N 0.000 description 2
- 229920001817 Agar Polymers 0.000 description 2
- UWQJHXKARZWDIJ-ZLUOBGJFSA-N Ala-Ala-Cys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(O)=O UWQJHXKARZWDIJ-ZLUOBGJFSA-N 0.000 description 2
- HGRBNYQIMKTUNT-XVYDVKMFSA-N Ala-Asn-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HGRBNYQIMKTUNT-XVYDVKMFSA-N 0.000 description 2
- STACJSVFHSEZJV-GHCJXIJMSA-N Ala-Asn-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STACJSVFHSEZJV-GHCJXIJMSA-N 0.000 description 2
- XQGIRPGAVLFKBJ-CIUDSAMLSA-N Ala-Asn-Lys Chemical compound N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)O XQGIRPGAVLFKBJ-CIUDSAMLSA-N 0.000 description 2
- SHYYAQLDNVHPFT-DLOVCJGASA-N Ala-Asn-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SHYYAQLDNVHPFT-DLOVCJGASA-N 0.000 description 2
- FUSPCLTUKXQREV-ACZMJKKPSA-N Ala-Glu-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O FUSPCLTUKXQREV-ACZMJKKPSA-N 0.000 description 2
- NWVVKQZOVSTDBQ-CIUDSAMLSA-N Ala-Glu-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NWVVKQZOVSTDBQ-CIUDSAMLSA-N 0.000 description 2
- HXNNRBHASOSVPG-GUBZILKMSA-N Ala-Glu-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HXNNRBHASOSVPG-GUBZILKMSA-N 0.000 description 2
- KMGOBAQSCKTBGD-DLOVCJGASA-N Ala-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CN=CN1 KMGOBAQSCKTBGD-DLOVCJGASA-N 0.000 description 2
- PNALXAODQKTNLV-JBDRJPRFSA-N Ala-Ile-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O PNALXAODQKTNLV-JBDRJPRFSA-N 0.000 description 2
- IFKQPMZRDQZSHI-GHCJXIJMSA-N Ala-Ile-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O IFKQPMZRDQZSHI-GHCJXIJMSA-N 0.000 description 2
- HQJKCXHQNUCKMY-GHCJXIJMSA-N Ala-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C)N HQJKCXHQNUCKMY-GHCJXIJMSA-N 0.000 description 2
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 2
- LNNSWWRRYJLGNI-NAKRPEOUSA-N Ala-Ile-Val Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O LNNSWWRRYJLGNI-NAKRPEOUSA-N 0.000 description 2
- WUHJHHGYVVJMQE-BJDJZHNGSA-N Ala-Leu-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WUHJHHGYVVJMQE-BJDJZHNGSA-N 0.000 description 2
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 2
- UWIQWPWWZUHBAO-ZLIFDBKOSA-N Ala-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)CC(C)C)C(O)=O)=CNC2=C1 UWIQWPWWZUHBAO-ZLIFDBKOSA-N 0.000 description 2
- RGQCNKIDEQJEBT-CQDKDKBSSA-N Ala-Leu-Tyr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 RGQCNKIDEQJEBT-CQDKDKBSSA-N 0.000 description 2
- OQWQTGBOFPJOIF-DLOVCJGASA-N Ala-Lys-His Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N OQWQTGBOFPJOIF-DLOVCJGASA-N 0.000 description 2
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 2
- GFEDXKNBZMPEDM-KZVJFYERSA-N Ala-Met-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GFEDXKNBZMPEDM-KZVJFYERSA-N 0.000 description 2
- CJQAEJMHBAOQHA-DLOVCJGASA-N Ala-Phe-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CJQAEJMHBAOQHA-DLOVCJGASA-N 0.000 description 2
- RMAWDDRDTRSZIR-ZLUOBGJFSA-N Ala-Ser-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RMAWDDRDTRSZIR-ZLUOBGJFSA-N 0.000 description 2
- YYAVDNKUWLAFCV-ACZMJKKPSA-N Ala-Ser-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYAVDNKUWLAFCV-ACZMJKKPSA-N 0.000 description 2
- DYXOFPBJBAHWFY-JBDRJPRFSA-N Ala-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N DYXOFPBJBAHWFY-JBDRJPRFSA-N 0.000 description 2
- ARHJJAAWNWOACN-FXQIFTODSA-N Ala-Ser-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O ARHJJAAWNWOACN-FXQIFTODSA-N 0.000 description 2
- HCBKAOZYACJUEF-XQXXSGGOSA-N Ala-Thr-Gln Chemical compound N[C@@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCC(N)=O)C(=O)O HCBKAOZYACJUEF-XQXXSGGOSA-N 0.000 description 2
- LTTLSZVJTDSACD-OWLDWWDNSA-N Ala-Thr-Trp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O LTTLSZVJTDSACD-OWLDWWDNSA-N 0.000 description 2
- MTDDMSUUXNQMKK-BPNCWPANSA-N Ala-Tyr-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N MTDDMSUUXNQMKK-BPNCWPANSA-N 0.000 description 2
- BHFOJPDOQPWJRN-XDTLVQLUSA-N Ala-Tyr-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCC(N)=O)C(O)=O BHFOJPDOQPWJRN-XDTLVQLUSA-N 0.000 description 2
- VYMJAWXRWHJIMS-LKTVYLICSA-N Ala-Tyr-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N VYMJAWXRWHJIMS-LKTVYLICSA-N 0.000 description 2
- ZJLORAAXDAJLDC-CQDKDKBSSA-N Ala-Tyr-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O ZJLORAAXDAJLDC-CQDKDKBSSA-N 0.000 description 2
- MUGAESARFRGOTQ-IGNZVWTISA-N Ala-Tyr-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N MUGAESARFRGOTQ-IGNZVWTISA-N 0.000 description 2
- LYILPUNCKACNGF-NAKRPEOUSA-N Ala-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N LYILPUNCKACNGF-NAKRPEOUSA-N 0.000 description 2
- GIVATXIGCXFQQA-FXQIFTODSA-N Arg-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N GIVATXIGCXFQQA-FXQIFTODSA-N 0.000 description 2
- UXJCMQFPDWCHKX-DCAQKATOSA-N Arg-Arg-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O UXJCMQFPDWCHKX-DCAQKATOSA-N 0.000 description 2
- IASNWHAGGYTEKX-IUCAKERBSA-N Arg-Arg-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(O)=O IASNWHAGGYTEKX-IUCAKERBSA-N 0.000 description 2
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 2
- BVBKBQRPOJFCQM-DCAQKATOSA-N Arg-Asn-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BVBKBQRPOJFCQM-DCAQKATOSA-N 0.000 description 2
- RWCLSUOSKWTXLA-FXQIFTODSA-N Arg-Asp-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RWCLSUOSKWTXLA-FXQIFTODSA-N 0.000 description 2
- KMSHNDWHPWXPEC-BQBZGAKWSA-N Arg-Asp-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KMSHNDWHPWXPEC-BQBZGAKWSA-N 0.000 description 2
- FEZJJKXNPSEYEV-CIUDSAMLSA-N Arg-Gln-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O FEZJJKXNPSEYEV-CIUDSAMLSA-N 0.000 description 2
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 2
- RFXXUWGNVRJTNQ-QXEWZRGKSA-N Arg-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N RFXXUWGNVRJTNQ-QXEWZRGKSA-N 0.000 description 2
- ZJEDSBGPBXVBMP-PYJNHQTQSA-N Arg-His-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZJEDSBGPBXVBMP-PYJNHQTQSA-N 0.000 description 2
- UBCPNBUIQNMDNH-NAKRPEOUSA-N Arg-Ile-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O UBCPNBUIQNMDNH-NAKRPEOUSA-N 0.000 description 2
- FRMQITGHXMUNDF-GMOBBJLQSA-N Arg-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FRMQITGHXMUNDF-GMOBBJLQSA-N 0.000 description 2
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 2
- XUGATJVGQUGQKY-ULQDDVLXSA-N Arg-Lys-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XUGATJVGQUGQKY-ULQDDVLXSA-N 0.000 description 2
- QBQVKUNBCAFXSV-ULQDDVLXSA-N Arg-Lys-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QBQVKUNBCAFXSV-ULQDDVLXSA-N 0.000 description 2
- CZUHPNLXLWMYMG-UBHSHLNASA-N Arg-Phe-Ala Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 CZUHPNLXLWMYMG-UBHSHLNASA-N 0.000 description 2
- KZXPVYVSHUJCEO-ULQDDVLXSA-N Arg-Phe-Lys Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 KZXPVYVSHUJCEO-ULQDDVLXSA-N 0.000 description 2
- XFXZKCRBBOVJKS-BVSLBCMMSA-N Arg-Phe-Trp Chemical compound C([C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 XFXZKCRBBOVJKS-BVSLBCMMSA-N 0.000 description 2
- KSHJMDSNSKDJPU-QTKMDUPCSA-N Arg-Thr-His Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KSHJMDSNSKDJPU-QTKMDUPCSA-N 0.000 description 2
- XOZYYXMHMIEJET-XIRDDKMYSA-N Arg-Trp-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(O)=O XOZYYXMHMIEJET-XIRDDKMYSA-N 0.000 description 2
- UTSMXMABBPFVJP-SZMVWBNQSA-N Arg-Val-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UTSMXMABBPFVJP-SZMVWBNQSA-N 0.000 description 2
- 206010003445 Ascites Diseases 0.000 description 2
- SWLOHUMCUDRTCL-ZLUOBGJFSA-N Asn-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N SWLOHUMCUDRTCL-ZLUOBGJFSA-N 0.000 description 2
- PDQBXRSOSCTGKY-ACZMJKKPSA-N Asn-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N PDQBXRSOSCTGKY-ACZMJKKPSA-N 0.000 description 2
- QQEWINYJRFBLNN-DLOVCJGASA-N Asn-Ala-Phe Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QQEWINYJRFBLNN-DLOVCJGASA-N 0.000 description 2
- XHFXZQHTLJVZBN-FXQIFTODSA-N Asn-Arg-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N XHFXZQHTLJVZBN-FXQIFTODSA-N 0.000 description 2
- IOTKDTZEEBZNCM-UGYAYLCHSA-N Asn-Asn-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOTKDTZEEBZNCM-UGYAYLCHSA-N 0.000 description 2
- APHUDFFMXFYRKP-CIUDSAMLSA-N Asn-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N APHUDFFMXFYRKP-CIUDSAMLSA-N 0.000 description 2
- NLCDVZJDEXIDDL-BIIVOSGPSA-N Asn-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N)C(=O)O NLCDVZJDEXIDDL-BIIVOSGPSA-N 0.000 description 2
- VKCOHFFSTKCXEQ-OLHMAJIHSA-N Asn-Asn-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VKCOHFFSTKCXEQ-OLHMAJIHSA-N 0.000 description 2
- ZDOQDYFZNGASEY-BIIVOSGPSA-N Asn-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N)C(=O)O ZDOQDYFZNGASEY-BIIVOSGPSA-N 0.000 description 2
- VWJFQGXPYOPXJH-ZLUOBGJFSA-N Asn-Cys-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)C(=O)N VWJFQGXPYOPXJH-ZLUOBGJFSA-N 0.000 description 2
- PQAIOUVVZCOLJK-FXQIFTODSA-N Asn-Gln-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N PQAIOUVVZCOLJK-FXQIFTODSA-N 0.000 description 2
- ULRPXVNMIIYDDJ-ACZMJKKPSA-N Asn-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N ULRPXVNMIIYDDJ-ACZMJKKPSA-N 0.000 description 2
- OGMDXNFGPOPZTK-GUBZILKMSA-N Asn-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N OGMDXNFGPOPZTK-GUBZILKMSA-N 0.000 description 2
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 2
- ODBSSLHUFPJRED-CIUDSAMLSA-N Asn-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N ODBSSLHUFPJRED-CIUDSAMLSA-N 0.000 description 2
- YGHCVNQOZZMHRZ-DJFWLOJKSA-N Asn-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)N)N YGHCVNQOZZMHRZ-DJFWLOJKSA-N 0.000 description 2
- WQLJRNRLHWJIRW-KKUMJFAQSA-N Asn-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N)O WQLJRNRLHWJIRW-KKUMJFAQSA-N 0.000 description 2
- XVBDDUPJVQXDSI-PEFMBERDSA-N Asn-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVBDDUPJVQXDSI-PEFMBERDSA-N 0.000 description 2
- XLZCLJRGGMBKLR-PCBIJLKTSA-N Asn-Ile-Phe Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XLZCLJRGGMBKLR-PCBIJLKTSA-N 0.000 description 2
- LTZIRYMWOJHRCH-GUDRVLHUSA-N Asn-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N LTZIRYMWOJHRCH-GUDRVLHUSA-N 0.000 description 2
- BXUHCIXDSWRSBS-CIUDSAMLSA-N Asn-Leu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BXUHCIXDSWRSBS-CIUDSAMLSA-N 0.000 description 2
- YVXRYLVELQYAEQ-SRVKXCTJSA-N Asn-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N YVXRYLVELQYAEQ-SRVKXCTJSA-N 0.000 description 2
- FTSAJSADJCMDHH-CIUDSAMLSA-N Asn-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N FTSAJSADJCMDHH-CIUDSAMLSA-N 0.000 description 2
- FBODFHMLALOPHP-GUBZILKMSA-N Asn-Lys-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O FBODFHMLALOPHP-GUBZILKMSA-N 0.000 description 2
- GIQCDTKOIPUDSG-GARJFASQSA-N Asn-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N)C(=O)O GIQCDTKOIPUDSG-GARJFASQSA-N 0.000 description 2
- NLDNNZKUSLAYFW-NHCYSSNCSA-N Asn-Lys-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O NLDNNZKUSLAYFW-NHCYSSNCSA-N 0.000 description 2
- FTNRWCPWDWRPAV-BZSNNMDCSA-N Asn-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CC(N)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FTNRWCPWDWRPAV-BZSNNMDCSA-N 0.000 description 2
- AWXDRZJQCVHCIT-DCAQKATOSA-N Asn-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O AWXDRZJQCVHCIT-DCAQKATOSA-N 0.000 description 2
- XTMZYFMTYJNABC-ZLUOBGJFSA-N Asn-Ser-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N XTMZYFMTYJNABC-ZLUOBGJFSA-N 0.000 description 2
- DOURAOODTFJRIC-CIUDSAMLSA-N Asn-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N DOURAOODTFJRIC-CIUDSAMLSA-N 0.000 description 2
- QUMKPKWYDVMGNT-NUMRIWBASA-N Asn-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QUMKPKWYDVMGNT-NUMRIWBASA-N 0.000 description 2
- PIABYSIYPGLLDQ-XVSYOHENSA-N Asn-Thr-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PIABYSIYPGLLDQ-XVSYOHENSA-N 0.000 description 2
- KZYSHAMXEBPJBD-JRQIVUDYSA-N Asn-Thr-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KZYSHAMXEBPJBD-JRQIVUDYSA-N 0.000 description 2
- IPAQILGYEQFCFO-NYVOZVTQSA-N Asn-Trp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)O)NC(=O)[C@H](CC(=O)N)N IPAQILGYEQFCFO-NYVOZVTQSA-N 0.000 description 2
- XEGZSHSPQNDNRH-JRQIVUDYSA-N Asn-Tyr-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XEGZSHSPQNDNRH-JRQIVUDYSA-N 0.000 description 2
- XLDMSQYOYXINSZ-QXEWZRGKSA-N Asn-Val-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N XLDMSQYOYXINSZ-QXEWZRGKSA-N 0.000 description 2
- CBHVAFXKOYAHOY-NHCYSSNCSA-N Asn-Val-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O CBHVAFXKOYAHOY-NHCYSSNCSA-N 0.000 description 2
- PQKSVQSMTHPRIB-ZKWXMUAHSA-N Asn-Val-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O PQKSVQSMTHPRIB-ZKWXMUAHSA-N 0.000 description 2
- XOQYDFCQPWAMSA-KKHAAJSZSA-N Asn-Val-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOQYDFCQPWAMSA-KKHAAJSZSA-N 0.000 description 2
- VTYQAQFKMQTKQD-ACZMJKKPSA-N Asp-Ala-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O VTYQAQFKMQTKQD-ACZMJKKPSA-N 0.000 description 2
- HMQDRBKQMLRCCG-GMOBBJLQSA-N Asp-Arg-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HMQDRBKQMLRCCG-GMOBBJLQSA-N 0.000 description 2
- DBWYWXNMZZYIRY-LPEHRKFASA-N Asp-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N)C(=O)O DBWYWXNMZZYIRY-LPEHRKFASA-N 0.000 description 2
- FAEIQWHBRBWUBN-FXQIFTODSA-N Asp-Arg-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N)CN=C(N)N FAEIQWHBRBWUBN-FXQIFTODSA-N 0.000 description 2
- YNQIDCRRTWGHJD-ZLUOBGJFSA-N Asp-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(O)=O YNQIDCRRTWGHJD-ZLUOBGJFSA-N 0.000 description 2
- VBVKSAFJPVXMFJ-CIUDSAMLSA-N Asp-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N VBVKSAFJPVXMFJ-CIUDSAMLSA-N 0.000 description 2
- WCFCYFDBMNFSPA-ACZMJKKPSA-N Asp-Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O WCFCYFDBMNFSPA-ACZMJKKPSA-N 0.000 description 2
- SVFOIXMRMLROHO-SRVKXCTJSA-N Asp-Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SVFOIXMRMLROHO-SRVKXCTJSA-N 0.000 description 2
- ZEDBMCPXPIYJLW-XHNCKOQMSA-N Asp-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O ZEDBMCPXPIYJLW-XHNCKOQMSA-N 0.000 description 2
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 2
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 2
- UZFHNLYQWMGUHU-DCAQKATOSA-N Asp-Lys-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UZFHNLYQWMGUHU-DCAQKATOSA-N 0.000 description 2
- CTWCFPWFIGRAEP-CIUDSAMLSA-N Asp-Lys-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O CTWCFPWFIGRAEP-CIUDSAMLSA-N 0.000 description 2
- NVFSJIXJZCDICF-SRVKXCTJSA-N Asp-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N NVFSJIXJZCDICF-SRVKXCTJSA-N 0.000 description 2
- DPNWSMBUYCLEDG-CIUDSAMLSA-N Asp-Lys-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O DPNWSMBUYCLEDG-CIUDSAMLSA-N 0.000 description 2
- VMVUDJUXJKDGNR-FXQIFTODSA-N Asp-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N VMVUDJUXJKDGNR-FXQIFTODSA-N 0.000 description 2
- QJHOOKBAHRJPPX-QWRGUYRKSA-N Asp-Phe-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 QJHOOKBAHRJPPX-QWRGUYRKSA-N 0.000 description 2
- NONWUQAWAANERO-BZSNNMDCSA-N Asp-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 NONWUQAWAANERO-BZSNNMDCSA-N 0.000 description 2
- ZVGRHIRJLWBWGJ-ACZMJKKPSA-N Asp-Ser-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZVGRHIRJLWBWGJ-ACZMJKKPSA-N 0.000 description 2
- DRCOAZZDQRCGGP-GHCJXIJMSA-N Asp-Ser-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DRCOAZZDQRCGGP-GHCJXIJMSA-N 0.000 description 2
- OFYVKOXTTDCUIL-FXQIFTODSA-N Asp-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N OFYVKOXTTDCUIL-FXQIFTODSA-N 0.000 description 2
- YIDFBWRHIYOYAA-LKXGYXEUSA-N Asp-Ser-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YIDFBWRHIYOYAA-LKXGYXEUSA-N 0.000 description 2
- NAAAPCLFJPURAM-HJGDQZAQSA-N Asp-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O NAAAPCLFJPURAM-HJGDQZAQSA-N 0.000 description 2
- NWAHPBGBDIFUFD-KKUMJFAQSA-N Asp-Tyr-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O NWAHPBGBDIFUFD-KKUMJFAQSA-N 0.000 description 2
- 101000583086 Bunodosoma granuliferum Delta-actitoxin-Bgr2b Proteins 0.000 description 2
- 241000178270 Canarypox virus Species 0.000 description 2
- 241000283707 Capra Species 0.000 description 2
- 108091033380 Coding strand Proteins 0.000 description 2
- QDFBJJABJKOLTD-FXQIFTODSA-N Cys-Asn-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QDFBJJABJKOLTD-FXQIFTODSA-N 0.000 description 2
- UPJGYXRAPJWIHD-CIUDSAMLSA-N Cys-Asn-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UPJGYXRAPJWIHD-CIUDSAMLSA-N 0.000 description 2
- MBILEVLLOHJZMG-FXQIFTODSA-N Cys-Gln-Glu Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CS)N MBILEVLLOHJZMG-FXQIFTODSA-N 0.000 description 2
- HHABWQIFXZPZCK-ACZMJKKPSA-N Cys-Gln-Ser Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N HHABWQIFXZPZCK-ACZMJKKPSA-N 0.000 description 2
- LYSHSHHDBVKJRN-JBDRJPRFSA-N Cys-Ile-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CS)N LYSHSHHDBVKJRN-JBDRJPRFSA-N 0.000 description 2
- KGIHMGPYGXBYJJ-SRVKXCTJSA-N Cys-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CS KGIHMGPYGXBYJJ-SRVKXCTJSA-N 0.000 description 2
- NLDWTJBJFVWBDQ-KKUMJFAQSA-N Cys-Lys-Phe Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CS)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 NLDWTJBJFVWBDQ-KKUMJFAQSA-N 0.000 description 2
- HJXSYJVCMUOUNY-SRVKXCTJSA-N Cys-Ser-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N HJXSYJVCMUOUNY-SRVKXCTJSA-N 0.000 description 2
- 241000701022 Cytomegalovirus Species 0.000 description 2
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 2
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 2
- 102000004190 Enzymes Human genes 0.000 description 2
- 108090000790 Enzymes Proteins 0.000 description 2
- 108060002716 Exonuclease Proteins 0.000 description 2
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 2
- DTCCMDYODDPHBG-ACZMJKKPSA-N Gln-Ala-Cys Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(O)=O DTCCMDYODDPHBG-ACZMJKKPSA-N 0.000 description 2
- HHWQMFIGMMOVFK-WDSKDSINSA-N Gln-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O HHWQMFIGMMOVFK-WDSKDSINSA-N 0.000 description 2
- KVYVOGYEMPEXBT-GUBZILKMSA-N Gln-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O KVYVOGYEMPEXBT-GUBZILKMSA-N 0.000 description 2
- MLZRSFQRBDNJON-GUBZILKMSA-N Gln-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MLZRSFQRBDNJON-GUBZILKMSA-N 0.000 description 2
- SHERTACNJPYHAR-ACZMJKKPSA-N Gln-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O SHERTACNJPYHAR-ACZMJKKPSA-N 0.000 description 2
- DLOHWQXXGMEZDW-CIUDSAMLSA-N Gln-Arg-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O DLOHWQXXGMEZDW-CIUDSAMLSA-N 0.000 description 2
- INFBPLSHYFALDE-ACZMJKKPSA-N Gln-Asn-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O INFBPLSHYFALDE-ACZMJKKPSA-N 0.000 description 2
- OETQLUYCMBARHJ-CIUDSAMLSA-N Gln-Asn-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OETQLUYCMBARHJ-CIUDSAMLSA-N 0.000 description 2
- TWHDOEYLXXQYOZ-FXQIFTODSA-N Gln-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N TWHDOEYLXXQYOZ-FXQIFTODSA-N 0.000 description 2
- WMOMPXKOKASNBK-PEFMBERDSA-N Gln-Asn-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WMOMPXKOKASNBK-PEFMBERDSA-N 0.000 description 2
- CYTSBCIIEHUPDU-ACZMJKKPSA-N Gln-Asp-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O CYTSBCIIEHUPDU-ACZMJKKPSA-N 0.000 description 2
- BLOXULLYFRGYKZ-GUBZILKMSA-N Gln-Glu-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O BLOXULLYFRGYKZ-GUBZILKMSA-N 0.000 description 2
- ZQPOVSJFBBETHQ-CIUDSAMLSA-N Gln-Glu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZQPOVSJFBBETHQ-CIUDSAMLSA-N 0.000 description 2
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 2
- SMLDOQHTOAAFJQ-WDSKDSINSA-N Gln-Gly-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SMLDOQHTOAAFJQ-WDSKDSINSA-N 0.000 description 2
- GXMBDEGTXHQBAO-NKIYYHGXSA-N Gln-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)N)N)O GXMBDEGTXHQBAO-NKIYYHGXSA-N 0.000 description 2
- TWTWUBHEWQPMQW-ZPFDUUQYSA-N Gln-Ile-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWTWUBHEWQPMQW-ZPFDUUQYSA-N 0.000 description 2
- YRWWJCDWLVXTHN-LAEOZQHASA-N Gln-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N YRWWJCDWLVXTHN-LAEOZQHASA-N 0.000 description 2
- RGAOLBZBLOJUTP-GRLWGSQLSA-N Gln-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N RGAOLBZBLOJUTP-GRLWGSQLSA-N 0.000 description 2
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 2
- HYPVLWGNBIYTNA-GUBZILKMSA-N Gln-Leu-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HYPVLWGNBIYTNA-GUBZILKMSA-N 0.000 description 2
- LGIKBBLQVSWUGK-DCAQKATOSA-N Gln-Leu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGIKBBLQVSWUGK-DCAQKATOSA-N 0.000 description 2
- HSHCEAUPUPJPTE-JYJNAYRXSA-N Gln-Leu-Tyr Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HSHCEAUPUPJPTE-JYJNAYRXSA-N 0.000 description 2
- JRHPEMVLTRADLJ-AVGNSLFASA-N Gln-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JRHPEMVLTRADLJ-AVGNSLFASA-N 0.000 description 2
- KLKYKPXITJBSNI-CIUDSAMLSA-N Gln-Met-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O KLKYKPXITJBSNI-CIUDSAMLSA-N 0.000 description 2
- FTTHLXOMDMLKKW-FHWLQOOXSA-N Gln-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(N)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FTTHLXOMDMLKKW-FHWLQOOXSA-N 0.000 description 2
- DCWNCMRZIZSZBL-KKUMJFAQSA-N Gln-Pro-Tyr Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)N)N)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O DCWNCMRZIZSZBL-KKUMJFAQSA-N 0.000 description 2
- LGWNISYVKDNJRP-FXQIFTODSA-N Gln-Ser-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGWNISYVKDNJRP-FXQIFTODSA-N 0.000 description 2
- LPIKVBWNNVFHCQ-GUBZILKMSA-N Gln-Ser-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LPIKVBWNNVFHCQ-GUBZILKMSA-N 0.000 description 2
- OKQLXOYFUPVEHI-CIUDSAMLSA-N Gln-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N OKQLXOYFUPVEHI-CIUDSAMLSA-N 0.000 description 2
- JILRMFFFCHUUTJ-ACZMJKKPSA-N Gln-Ser-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O JILRMFFFCHUUTJ-ACZMJKKPSA-N 0.000 description 2
- DUGYCMAIAKAQPB-GLLZPBPUSA-N Gln-Thr-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DUGYCMAIAKAQPB-GLLZPBPUSA-N 0.000 description 2
- UEILCTONAMOGBR-RWRJDSDZSA-N Gln-Thr-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UEILCTONAMOGBR-RWRJDSDZSA-N 0.000 description 2
- RGNMNWULPAYDAH-JSGCOSHPSA-N Gln-Trp-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N RGNMNWULPAYDAH-JSGCOSHPSA-N 0.000 description 2
- ITYRYNUZHPNCIK-GUBZILKMSA-N Glu-Ala-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O ITYRYNUZHPNCIK-GUBZILKMSA-N 0.000 description 2
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 2
- CVPXINNKRTZBMO-CIUDSAMLSA-N Glu-Arg-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)CN=C(N)N CVPXINNKRTZBMO-CIUDSAMLSA-N 0.000 description 2
- OJGLIOXAKGFFDW-SRVKXCTJSA-N Glu-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)O)N OJGLIOXAKGFFDW-SRVKXCTJSA-N 0.000 description 2
- ALCAUWPAMLVUDB-FXQIFTODSA-N Glu-Gln-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ALCAUWPAMLVUDB-FXQIFTODSA-N 0.000 description 2
- UMIRPYLZFKOEOH-YVNDNENWSA-N Glu-Gln-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UMIRPYLZFKOEOH-YVNDNENWSA-N 0.000 description 2
- QQLBPVKLJBAXBS-FXQIFTODSA-N Glu-Glu-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QQLBPVKLJBAXBS-FXQIFTODSA-N 0.000 description 2
- BUZMZDDKFCSKOT-CIUDSAMLSA-N Glu-Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BUZMZDDKFCSKOT-CIUDSAMLSA-N 0.000 description 2
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 2
- QYPKJXSMLMREKF-BPUTZDHNSA-N Glu-Glu-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)N QYPKJXSMLMREKF-BPUTZDHNSA-N 0.000 description 2
- OAGVHWYIBZMWLA-YFKPBYRVSA-N Glu-Gly-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)NCC(O)=O OAGVHWYIBZMWLA-YFKPBYRVSA-N 0.000 description 2
- RAUDKMVXNOWDLS-WDSKDSINSA-N Glu-Gly-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O RAUDKMVXNOWDLS-WDSKDSINSA-N 0.000 description 2
- QXDXIXFSFHUYAX-MNXVOIDGSA-N Glu-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O QXDXIXFSFHUYAX-MNXVOIDGSA-N 0.000 description 2
- ZHNHJYYFCGUZNQ-KBIXCLLPSA-N Glu-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O ZHNHJYYFCGUZNQ-KBIXCLLPSA-N 0.000 description 2
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 2
- VSRCAOIHMGCIJK-SRVKXCTJSA-N Glu-Leu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VSRCAOIHMGCIJK-SRVKXCTJSA-N 0.000 description 2
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 2
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 2
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 2
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 2
- QDMVXRNLOPTPIE-WDCWCFNPSA-N Glu-Lys-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QDMVXRNLOPTPIE-WDCWCFNPSA-N 0.000 description 2
- SUIAHERNFYRBDZ-GVXVVHGQSA-N Glu-Lys-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O SUIAHERNFYRBDZ-GVXVVHGQSA-N 0.000 description 2
- ZQYZDDXTNQXUJH-CIUDSAMLSA-N Glu-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(=O)O)N ZQYZDDXTNQXUJH-CIUDSAMLSA-N 0.000 description 2
- AOCARQDSFTWWFT-DCAQKATOSA-N Glu-Met-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AOCARQDSFTWWFT-DCAQKATOSA-N 0.000 description 2
- LGWUJBCIFGVBSJ-CIUDSAMLSA-N Glu-Met-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N LGWUJBCIFGVBSJ-CIUDSAMLSA-N 0.000 description 2
- LHIPZASLKPYDPI-AVGNSLFASA-N Glu-Phe-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LHIPZASLKPYDPI-AVGNSLFASA-N 0.000 description 2
- MIIGESVJEBDJMP-FHWLQOOXSA-N Glu-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 MIIGESVJEBDJMP-FHWLQOOXSA-N 0.000 description 2
- MRWYPDWDZSLWJM-ACZMJKKPSA-N Glu-Ser-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O MRWYPDWDZSLWJM-ACZMJKKPSA-N 0.000 description 2
- SYAYROHMAIHWFB-KBIXCLLPSA-N Glu-Ser-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYAYROHMAIHWFB-KBIXCLLPSA-N 0.000 description 2
- GUOWMVFLAJNPDY-CIUDSAMLSA-N Glu-Ser-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O GUOWMVFLAJNPDY-CIUDSAMLSA-N 0.000 description 2
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 2
- UMZHHILWZBFPGL-LOKLDPHHSA-N Glu-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O UMZHHILWZBFPGL-LOKLDPHHSA-N 0.000 description 2
- XOEKMEAOMXMURD-JYJNAYRXSA-N Glu-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O XOEKMEAOMXMURD-JYJNAYRXSA-N 0.000 description 2
- KXRORHJIRAOQPG-SOUVJXGZSA-N Glu-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O KXRORHJIRAOQPG-SOUVJXGZSA-N 0.000 description 2
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 2
- LZEUDRYSAZAJIO-AUTRQRHGSA-N Glu-Val-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LZEUDRYSAZAJIO-AUTRQRHGSA-N 0.000 description 2
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 2
- BRFJMRSRMOMIMU-WHFBIAKZSA-N Gly-Ala-Asn Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O BRFJMRSRMOMIMU-WHFBIAKZSA-N 0.000 description 2
- UGVQELHRNUDMAA-BYPYZUCNSA-N Gly-Ala-Gly Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)NCC([O-])=O UGVQELHRNUDMAA-BYPYZUCNSA-N 0.000 description 2
- PHONXOACARQMPM-BQBZGAKWSA-N Gly-Ala-Met Chemical compound [H]NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O PHONXOACARQMPM-BQBZGAKWSA-N 0.000 description 2
- QIZJOTQTCAGKPU-KWQFWETISA-N Gly-Ala-Tyr Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 QIZJOTQTCAGKPU-KWQFWETISA-N 0.000 description 2
- RJIVPOXLQFJRTG-LURJTMIESA-N Gly-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N RJIVPOXLQFJRTG-LURJTMIESA-N 0.000 description 2
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 2
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 2
- RPLLQZBOVIVGMX-QWRGUYRKSA-N Gly-Asp-Phe Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RPLLQZBOVIVGMX-QWRGUYRKSA-N 0.000 description 2
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 2
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 2
- HQRHFUYMGCHHJS-LURJTMIESA-N Gly-Gly-Arg Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N HQRHFUYMGCHHJS-LURJTMIESA-N 0.000 description 2
- KMSGYZQRXPUKGI-BYPYZUCNSA-N Gly-Gly-Asn Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O KMSGYZQRXPUKGI-BYPYZUCNSA-N 0.000 description 2
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 2
- QITBQGJOXQYMOA-ZETCQYMHSA-N Gly-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QITBQGJOXQYMOA-ZETCQYMHSA-N 0.000 description 2
- KAJAOGBVWCYGHZ-JTQLQIEISA-N Gly-Gly-Phe Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 2
- INLIXXRWNUKVCF-JTQLQIEISA-N Gly-Gly-Tyr Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 INLIXXRWNUKVCF-JTQLQIEISA-N 0.000 description 2
- FSPVILZGHUJOHS-QWRGUYRKSA-N Gly-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CNC=N1 FSPVILZGHUJOHS-QWRGUYRKSA-N 0.000 description 2
- YNIMVVJTPWCUJH-KBPBESRZSA-N Gly-His-Tyr Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YNIMVVJTPWCUJH-KBPBESRZSA-N 0.000 description 2
- SWQALSGKVLYKDT-ZKWXMUAHSA-N Gly-Ile-Ala Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SWQALSGKVLYKDT-ZKWXMUAHSA-N 0.000 description 2
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 2
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 2
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 2
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 2
- BXICSAQLIHFDDL-YUMQZZPRSA-N Gly-Lys-Asn Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O BXICSAQLIHFDDL-YUMQZZPRSA-N 0.000 description 2
- LOEANKRDMMVOGZ-YUMQZZPRSA-N Gly-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O LOEANKRDMMVOGZ-YUMQZZPRSA-N 0.000 description 2
- PTIIBFKSLCYQBO-NHCYSSNCSA-N Gly-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)CN PTIIBFKSLCYQBO-NHCYSSNCSA-N 0.000 description 2
- VEPBEGNDJYANCF-QWRGUYRKSA-N Gly-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN VEPBEGNDJYANCF-QWRGUYRKSA-N 0.000 description 2
- YHYDTTUSJXGTQK-UWVGGRQHSA-N Gly-Met-Leu Chemical compound CSCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(C)C)C(O)=O YHYDTTUSJXGTQK-UWVGGRQHSA-N 0.000 description 2
- LXTRSHQLGYINON-DTWKUNHWSA-N Gly-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN LXTRSHQLGYINON-DTWKUNHWSA-N 0.000 description 2
- WMGHDYWNHNLGBV-ONGXEEELSA-N Gly-Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 WMGHDYWNHNLGBV-ONGXEEELSA-N 0.000 description 2
- WZSHYFGOLPXPLL-RYUDHWBXSA-N Gly-Phe-Glu Chemical compound NCC(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CCC(O)=O)C(O)=O WZSHYFGOLPXPLL-RYUDHWBXSA-N 0.000 description 2
- IBYOLNARKHMLBG-WHOFXGATSA-N Gly-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IBYOLNARKHMLBG-WHOFXGATSA-N 0.000 description 2
- JPVGHHQGKPQYIL-KBPBESRZSA-N Gly-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 JPVGHHQGKPQYIL-KBPBESRZSA-N 0.000 description 2
- YLEIWGJJBFBFHC-KBPBESRZSA-N Gly-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 YLEIWGJJBFBFHC-KBPBESRZSA-N 0.000 description 2
- FEUPVVCGQLNXNP-IRXDYDNUSA-N Gly-Phe-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FEUPVVCGQLNXNP-IRXDYDNUSA-N 0.000 description 2
- QAMMIGULQSIRCD-IRXDYDNUSA-N Gly-Phe-Tyr Chemical compound C([C@H](NC(=O)C[NH3+])C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C([O-])=O)C1=CC=CC=C1 QAMMIGULQSIRCD-IRXDYDNUSA-N 0.000 description 2
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 2
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 2
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 2
- HUFUVTYGPOUCBN-MBLNEYKQSA-N Gly-Thr-Ile Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HUFUVTYGPOUCBN-MBLNEYKQSA-N 0.000 description 2
- NIOPEYHPOBWLQO-KBPBESRZSA-N Gly-Trp-Glu Chemical compound NCC(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CCC(O)=O)C(O)=O NIOPEYHPOBWLQO-KBPBESRZSA-N 0.000 description 2
- YJDALMUYJIENAG-QWRGUYRKSA-N Gly-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN)O YJDALMUYJIENAG-QWRGUYRKSA-N 0.000 description 2
- KOYUSMBPJOVSOO-XEGUGMAKSA-N Gly-Tyr-Ile Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KOYUSMBPJOVSOO-XEGUGMAKSA-N 0.000 description 2
- PNUFMLXHOLFRLD-KBPBESRZSA-N Gly-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 PNUFMLXHOLFRLD-KBPBESRZSA-N 0.000 description 2
- GJHWILMUOANXTG-WPRPVWTQSA-N Gly-Val-Arg Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GJHWILMUOANXTG-WPRPVWTQSA-N 0.000 description 2
- RYAOJUMWLWUGNW-QMMMGPOBSA-N Gly-Val-Gly Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O RYAOJUMWLWUGNW-QMMMGPOBSA-N 0.000 description 2
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 2
- FNXSYBOHALPRHV-ONGXEEELSA-N Gly-Val-Lys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN FNXSYBOHALPRHV-ONGXEEELSA-N 0.000 description 2
- XINDHUAGVGCNSF-QSFUFRPTSA-N His-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XINDHUAGVGCNSF-QSFUFRPTSA-N 0.000 description 2
- MBSSHYPAEHPSGY-LSJOCFKGSA-N His-Ala-Met Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O MBSSHYPAEHPSGY-LSJOCFKGSA-N 0.000 description 2
- HDXNWVLQSQFJOX-SRVKXCTJSA-N His-Arg-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HDXNWVLQSQFJOX-SRVKXCTJSA-N 0.000 description 2
- ZIMTWPHIKZEHSE-UWVGGRQHSA-N His-Arg-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O ZIMTWPHIKZEHSE-UWVGGRQHSA-N 0.000 description 2
- OMNVOTCFQQLEQU-CIUDSAMLSA-N His-Asn-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N OMNVOTCFQQLEQU-CIUDSAMLSA-N 0.000 description 2
- WMKXFMUJRCEGRP-SRVKXCTJSA-N His-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N WMKXFMUJRCEGRP-SRVKXCTJSA-N 0.000 description 2
- LSQHWKPPOFDHHZ-YUMQZZPRSA-N His-Asp-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N LSQHWKPPOFDHHZ-YUMQZZPRSA-N 0.000 description 2
- HVCRQRQPIIRNLY-IUCAKERBSA-N His-Gln-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N HVCRQRQPIIRNLY-IUCAKERBSA-N 0.000 description 2
- JCOSMKPAOYDKRO-AVGNSLFASA-N His-Glu-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N JCOSMKPAOYDKRO-AVGNSLFASA-N 0.000 description 2
- PGTISAJTWZPFGN-PEXQALLHSA-N His-Gly-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O PGTISAJTWZPFGN-PEXQALLHSA-N 0.000 description 2
- FSOXZQBMPBQKGJ-QSFUFRPTSA-N His-Ile-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]([NH3+])CC1=CN=CN1 FSOXZQBMPBQKGJ-QSFUFRPTSA-N 0.000 description 2
- VJJSDSNFXCWCEJ-DJFWLOJKSA-N His-Ile-Asn Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O VJJSDSNFXCWCEJ-DJFWLOJKSA-N 0.000 description 2
- UROVZOUMHNXPLZ-AVGNSLFASA-N His-Leu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 UROVZOUMHNXPLZ-AVGNSLFASA-N 0.000 description 2
- BXOLYFJYQQRQDJ-MXAVVETBSA-N His-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CN=CN1)N BXOLYFJYQQRQDJ-MXAVVETBSA-N 0.000 description 2
- KHUFDBQXGLEIHC-BZSNNMDCSA-N His-Leu-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 KHUFDBQXGLEIHC-BZSNNMDCSA-N 0.000 description 2
- RLAOTFTXBFQJDV-KKUMJFAQSA-N His-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CN=CN1 RLAOTFTXBFQJDV-KKUMJFAQSA-N 0.000 description 2
- YAEKRYQASVCDLK-JYJNAYRXSA-N His-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N YAEKRYQASVCDLK-JYJNAYRXSA-N 0.000 description 2
- ZFDKSLBEWYCOCS-BZSNNMDCSA-N His-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1NC=NC=1)C1=CC=CC=C1 ZFDKSLBEWYCOCS-BZSNNMDCSA-N 0.000 description 2
- IAYPZSHNZQHQNO-KKUMJFAQSA-N His-Ser-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC2=CN=CN2)N IAYPZSHNZQHQNO-KKUMJFAQSA-N 0.000 description 2
- FFKJUTZARGRVTH-KKUMJFAQSA-N His-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O FFKJUTZARGRVTH-KKUMJFAQSA-N 0.000 description 2
- UWSMZKRTOZEGDD-CUJWVEQBSA-N His-Thr-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O UWSMZKRTOZEGDD-CUJWVEQBSA-N 0.000 description 2
- VEXZGXHMUGYJMC-UHFFFAOYSA-N Hydrochloric acid Chemical compound Cl VEXZGXHMUGYJMC-UHFFFAOYSA-N 0.000 description 2
- YKRYHWJRQUSTKG-KBIXCLLPSA-N Ile-Ala-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YKRYHWJRQUSTKG-KBIXCLLPSA-N 0.000 description 2
- CYHYBSGMHMHKOA-CIQUZCHMSA-N Ile-Ala-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CYHYBSGMHMHKOA-CIQUZCHMSA-N 0.000 description 2
- TZCGZYWNIDZZMR-UHFFFAOYSA-N Ile-Arg-Ala Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(C)C(O)=O)CCCN=C(N)N TZCGZYWNIDZZMR-UHFFFAOYSA-N 0.000 description 2
- WECYRWOMWSCWNX-XUXIUFHCSA-N Ile-Arg-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O WECYRWOMWSCWNX-XUXIUFHCSA-N 0.000 description 2
- QYZYJFXHXYUZMZ-UGYAYLCHSA-N Ile-Asn-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N QYZYJFXHXYUZMZ-UGYAYLCHSA-N 0.000 description 2
- LEDRIAHEWDJRMF-CFMVVWHZSA-N Ile-Asn-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LEDRIAHEWDJRMF-CFMVVWHZSA-N 0.000 description 2
- RPZFUIQVAPZLRH-GHCJXIJMSA-N Ile-Asp-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)O)N RPZFUIQVAPZLRH-GHCJXIJMSA-N 0.000 description 2
- NBJAAWYRLGCJOF-UGYAYLCHSA-N Ile-Asp-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NBJAAWYRLGCJOF-UGYAYLCHSA-N 0.000 description 2
- GYAFMRQGWHXMII-IUKAMOBKSA-N Ile-Asp-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N GYAFMRQGWHXMII-IUKAMOBKSA-N 0.000 description 2
- CYHJCEKUMCNDFG-LAEOZQHASA-N Ile-Gln-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N CYHJCEKUMCNDFG-LAEOZQHASA-N 0.000 description 2
- OVPYIUNCVSOVNF-KQXIARHKSA-N Ile-Gln-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N OVPYIUNCVSOVNF-KQXIARHKSA-N 0.000 description 2
- KIMHKBDJQQYLHU-PEFMBERDSA-N Ile-Glu-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KIMHKBDJQQYLHU-PEFMBERDSA-N 0.000 description 2
- SPQWWEZBHXHUJN-KBIXCLLPSA-N Ile-Glu-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O SPQWWEZBHXHUJN-KBIXCLLPSA-N 0.000 description 2
- ODPKZZLRDNXTJZ-WHOFXGATSA-N Ile-Gly-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N ODPKZZLRDNXTJZ-WHOFXGATSA-N 0.000 description 2
- LBRCLQMZAHRTLV-ZKWXMUAHSA-N Ile-Gly-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LBRCLQMZAHRTLV-ZKWXMUAHSA-N 0.000 description 2
- VOBYAKCXGQQFLR-LSJOCFKGSA-N Ile-Gly-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O VOBYAKCXGQQFLR-LSJOCFKGSA-N 0.000 description 2
- GAZGFPOZOLEYAJ-YTFOTSKYSA-N Ile-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N GAZGFPOZOLEYAJ-YTFOTSKYSA-N 0.000 description 2
- HPCFRQWLTRDGHT-AJNGGQMLSA-N Ile-Leu-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O HPCFRQWLTRDGHT-AJNGGQMLSA-N 0.000 description 2
- NZGTYCMLUGYMCV-XUXIUFHCSA-N Ile-Lys-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N NZGTYCMLUGYMCV-XUXIUFHCSA-N 0.000 description 2
- RMNMUUCYTMLWNA-ZPFDUUQYSA-N Ile-Lys-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N RMNMUUCYTMLWNA-ZPFDUUQYSA-N 0.000 description 2
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 2
- PARSHQDZROHERM-NHCYSSNCSA-N Ile-Lys-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)O)N PARSHQDZROHERM-NHCYSSNCSA-N 0.000 description 2
- WSSGUVAKYCQSCT-XUXIUFHCSA-N Ile-Met-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)O)N WSSGUVAKYCQSCT-XUXIUFHCSA-N 0.000 description 2
- HQEPKOFULQTSFV-JURCDPSOSA-N Ile-Phe-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)O)N HQEPKOFULQTSFV-JURCDPSOSA-N 0.000 description 2
- IIWQTXMUALXGOV-PCBIJLKTSA-N Ile-Phe-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IIWQTXMUALXGOV-PCBIJLKTSA-N 0.000 description 2
- UAELWXJFLZBKQS-WHOFXGATSA-N Ile-Phe-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O UAELWXJFLZBKQS-WHOFXGATSA-N 0.000 description 2
- RENBRDSDKPSRIH-HJWJTTGWSA-N Ile-Phe-Met Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)O RENBRDSDKPSRIH-HJWJTTGWSA-N 0.000 description 2
- XQLGNKLSPYCRMZ-HJWJTTGWSA-N Ile-Phe-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)O)N XQLGNKLSPYCRMZ-HJWJTTGWSA-N 0.000 description 2
- NLZVTPYXYXMCIP-XUXIUFHCSA-N Ile-Pro-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O NLZVTPYXYXMCIP-XUXIUFHCSA-N 0.000 description 2
- CAHCWMVNBZJVAW-NAKRPEOUSA-N Ile-Pro-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)O)N CAHCWMVNBZJVAW-NAKRPEOUSA-N 0.000 description 2
- AGGIYSLVUKVOPT-HTFCKZLJSA-N Ile-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N AGGIYSLVUKVOPT-HTFCKZLJSA-N 0.000 description 2
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 2
- SAEWJTCJQVZQNZ-IUKAMOBKSA-N Ile-Thr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SAEWJTCJQVZQNZ-IUKAMOBKSA-N 0.000 description 2
- NURNJECQNNCRBK-FLBSBUHZSA-N Ile-Thr-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NURNJECQNNCRBK-FLBSBUHZSA-N 0.000 description 2
- RTSQPLLOYSGMKM-DSYPUSFNSA-N Ile-Trp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(C)C)C(=O)O)N RTSQPLLOYSGMKM-DSYPUSFNSA-N 0.000 description 2
- KXUKTDGKLAOCQK-LSJOCFKGSA-N Ile-Val-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O KXUKTDGKLAOCQK-LSJOCFKGSA-N 0.000 description 2
- NJGXXYLPDMMFJB-XUXIUFHCSA-N Ile-Val-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N NJGXXYLPDMMFJB-XUXIUFHCSA-N 0.000 description 2
- RQZFWBLDTBDEOF-RNJOBUHISA-N Ile-Val-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N RQZFWBLDTBDEOF-RNJOBUHISA-N 0.000 description 2
- JZBVBOKASHNXAD-NAKRPEOUSA-N Ile-Val-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N JZBVBOKASHNXAD-NAKRPEOUSA-N 0.000 description 2
- YHFPHRUWZMEOIX-CYDGBPFRSA-N Ile-Val-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(=O)O)N YHFPHRUWZMEOIX-CYDGBPFRSA-N 0.000 description 2
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 2
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 2
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 2
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 2
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 2
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 2
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N Leu-Ala-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 2
- QPRQGENIBFLVEB-BJDJZHNGSA-N Leu-Ala-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O QPRQGENIBFLVEB-BJDJZHNGSA-N 0.000 description 2
- XIRYQRLFHWWWTC-QEJZJMRPSA-N Leu-Ala-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XIRYQRLFHWWWTC-QEJZJMRPSA-N 0.000 description 2
- SUPVSFFZWVOEOI-UHFFFAOYSA-N Leu-Ala-Tyr Natural products CC(C)CC(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 SUPVSFFZWVOEOI-UHFFFAOYSA-N 0.000 description 2
- VKOAHIRLIUESLU-ULQDDVLXSA-N Leu-Arg-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VKOAHIRLIUESLU-ULQDDVLXSA-N 0.000 description 2
- UCOCBWDBHCUPQP-DCAQKATOSA-N Leu-Arg-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O UCOCBWDBHCUPQP-DCAQKATOSA-N 0.000 description 2
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 2
- OXKYZSRZKBTVEY-ZPFDUUQYSA-N Leu-Asn-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OXKYZSRZKBTVEY-ZPFDUUQYSA-N 0.000 description 2
- POJPZSMTTMLSTG-SRVKXCTJSA-N Leu-Asn-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N POJPZSMTTMLSTG-SRVKXCTJSA-N 0.000 description 2
- RIMMMMYKGIBOSN-DCAQKATOSA-N Leu-Asn-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O RIMMMMYKGIBOSN-DCAQKATOSA-N 0.000 description 2
- MDVZJYGNAGLPGJ-KKUMJFAQSA-N Leu-Asn-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MDVZJYGNAGLPGJ-KKUMJFAQSA-N 0.000 description 2
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 2
- FIJMQLGQLBLBOL-HJGDQZAQSA-N Leu-Asn-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FIJMQLGQLBLBOL-HJGDQZAQSA-N 0.000 description 2
- TWQIYNGNYNJUFM-NHCYSSNCSA-N Leu-Asn-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TWQIYNGNYNJUFM-NHCYSSNCSA-N 0.000 description 2
- DLFAACQHIRSQGG-CIUDSAMLSA-N Leu-Asp-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O DLFAACQHIRSQGG-CIUDSAMLSA-N 0.000 description 2
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 2
- YORLGJINWYYIMX-KKUMJFAQSA-N Leu-Cys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YORLGJINWYYIMX-KKUMJFAQSA-N 0.000 description 2
- WCTCIIAGNMFYAO-DCAQKATOSA-N Leu-Cys-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O WCTCIIAGNMFYAO-DCAQKATOSA-N 0.000 description 2
- DPWGZWUMUUJQDT-IUCAKERBSA-N Leu-Gln-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O DPWGZWUMUUJQDT-IUCAKERBSA-N 0.000 description 2
- FQZPTCNSNPWHLJ-AVGNSLFASA-N Leu-Gln-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O FQZPTCNSNPWHLJ-AVGNSLFASA-N 0.000 description 2
- GLBNEGIOFRVRHO-JYJNAYRXSA-N Leu-Gln-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLBNEGIOFRVRHO-JYJNAYRXSA-N 0.000 description 2
- CQGSYZCULZMEDE-UHFFFAOYSA-N Leu-Gln-Pro Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)N1CCCC1C(O)=O CQGSYZCULZMEDE-UHFFFAOYSA-N 0.000 description 2
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 2
- RVVBWTWPNFDYBE-SRVKXCTJSA-N Leu-Glu-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RVVBWTWPNFDYBE-SRVKXCTJSA-N 0.000 description 2
- NEEOBPIXKWSBRF-IUCAKERBSA-N Leu-Glu-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O NEEOBPIXKWSBRF-IUCAKERBSA-N 0.000 description 2
- HQUXQAMSWFIRET-AVGNSLFASA-N Leu-Glu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HQUXQAMSWFIRET-AVGNSLFASA-N 0.000 description 2
- LAGPXKYZCCTSGQ-JYJNAYRXSA-N Leu-Glu-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LAGPXKYZCCTSGQ-JYJNAYRXSA-N 0.000 description 2
- WQWSMEOYXJTFRU-GUBZILKMSA-N Leu-Glu-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O WQWSMEOYXJTFRU-GUBZILKMSA-N 0.000 description 2
- FMEICTQWUKNAGC-YUMQZZPRSA-N Leu-Gly-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O FMEICTQWUKNAGC-YUMQZZPRSA-N 0.000 description 2
- CSFVADKICPDRRF-KKUMJFAQSA-N Leu-His-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CN=CN1 CSFVADKICPDRRF-KKUMJFAQSA-N 0.000 description 2
- KOSWSHVQIVTVQF-ZPFDUUQYSA-N Leu-Ile-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KOSWSHVQIVTVQF-ZPFDUUQYSA-N 0.000 description 2
- HGFGEMSVBMCFKK-MNXVOIDGSA-N Leu-Ile-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HGFGEMSVBMCFKK-MNXVOIDGSA-N 0.000 description 2
- SEMUSFOBZGKBGW-YTFOTSKYSA-N Leu-Ile-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SEMUSFOBZGKBGW-YTFOTSKYSA-N 0.000 description 2
- QLDHBYRUNQZIJQ-DKIMLUQUSA-N Leu-Ile-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QLDHBYRUNQZIJQ-DKIMLUQUSA-N 0.000 description 2
- HRTRLSRYZZKPCO-BJDJZHNGSA-N Leu-Ile-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HRTRLSRYZZKPCO-BJDJZHNGSA-N 0.000 description 2
- FAELBUXXFQLUAX-AJNGGQMLSA-N Leu-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C FAELBUXXFQLUAX-AJNGGQMLSA-N 0.000 description 2
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 2
- IEWBEPKLKUXQBU-VOAKCMCISA-N Leu-Leu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IEWBEPKLKUXQBU-VOAKCMCISA-N 0.000 description 2
- UCNNZELZXFXXJQ-BZSNNMDCSA-N Leu-Leu-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UCNNZELZXFXXJQ-BZSNNMDCSA-N 0.000 description 2
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 2
- ZGUMORRUBUCXEH-AVGNSLFASA-N Leu-Lys-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZGUMORRUBUCXEH-AVGNSLFASA-N 0.000 description 2
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 2
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 2
- BJWKOATWNQJPSK-SRVKXCTJSA-N Leu-Met-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N BJWKOATWNQJPSK-SRVKXCTJSA-N 0.000 description 2
- FLNPJLDPGMLWAU-UWVGGRQHSA-N Leu-Met-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(C)C FLNPJLDPGMLWAU-UWVGGRQHSA-N 0.000 description 2
- GCXGCIYIHXSKAY-ULQDDVLXSA-N Leu-Phe-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GCXGCIYIHXSKAY-ULQDDVLXSA-N 0.000 description 2
- FYPWFNKQVVEELI-ULQDDVLXSA-N Leu-Phe-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 FYPWFNKQVVEELI-ULQDDVLXSA-N 0.000 description 2
- MUCIDQMDOYQYBR-IHRRRGAJSA-N Leu-Pro-His Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N MUCIDQMDOYQYBR-IHRRRGAJSA-N 0.000 description 2
- JDBQSGMJBMPNFT-AVGNSLFASA-N Leu-Pro-Val Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O JDBQSGMJBMPNFT-AVGNSLFASA-N 0.000 description 2
- KIZIOFNVSOSKJI-CIUDSAMLSA-N Leu-Ser-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N KIZIOFNVSOSKJI-CIUDSAMLSA-N 0.000 description 2
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 2
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 2
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 2
- AEDWWMMHUGYIFD-HJGDQZAQSA-N Leu-Thr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O AEDWWMMHUGYIFD-HJGDQZAQSA-N 0.000 description 2
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 2
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 2
- HOMFINRJHIIZNJ-HOCLYGCPSA-N Leu-Trp-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O HOMFINRJHIIZNJ-HOCLYGCPSA-N 0.000 description 2
- FPFOYSCDUWTZBF-IHPCNDPISA-N Leu-Trp-Leu Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H]([NH3+])CC(C)C)C(=O)N[C@@H](CC(C)C)C([O-])=O)=CNC2=C1 FPFOYSCDUWTZBF-IHPCNDPISA-N 0.000 description 2
- WUHBLPVELFTPQK-KKUMJFAQSA-N Leu-Tyr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O WUHBLPVELFTPQK-KKUMJFAQSA-N 0.000 description 2
- WFCKERTZVCQXKH-KBPBESRZSA-N Leu-Tyr-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O WFCKERTZVCQXKH-KBPBESRZSA-N 0.000 description 2
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 2
- YIRIDPUGZKHMHT-ACRUOGEOSA-N Leu-Tyr-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YIRIDPUGZKHMHT-ACRUOGEOSA-N 0.000 description 2
- CGHXMODRYJISSK-NHCYSSNCSA-N Leu-Val-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O CGHXMODRYJISSK-NHCYSSNCSA-N 0.000 description 2
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 2
- AAKRWBIIGKPOKQ-ONGXEEELSA-N Leu-Val-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 2
- FMFNIDICDKEMOE-XUXIUFHCSA-N Leu-Val-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FMFNIDICDKEMOE-XUXIUFHCSA-N 0.000 description 2
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 2
- XOEDPXDZJHBQIX-ULQDDVLXSA-N Leu-Val-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XOEDPXDZJHBQIX-ULQDDVLXSA-N 0.000 description 2
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 2
- NTXYXFDMIHXTHE-WDSOQIARSA-N Leu-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 NTXYXFDMIHXTHE-WDSOQIARSA-N 0.000 description 2
- 108090001030 Lipoproteins Proteins 0.000 description 2
- 102000004895 Lipoproteins Human genes 0.000 description 2
- 239000006137 Luria-Bertani broth Substances 0.000 description 2
- MPOHDJKRBLVGCT-CIUDSAMLSA-N Lys-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N MPOHDJKRBLVGCT-CIUDSAMLSA-N 0.000 description 2
- JCFYLFOCALSNLQ-GUBZILKMSA-N Lys-Ala-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JCFYLFOCALSNLQ-GUBZILKMSA-N 0.000 description 2
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 2
- WSXTWLJHTLRFLW-SRVKXCTJSA-N Lys-Ala-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O WSXTWLJHTLRFLW-SRVKXCTJSA-N 0.000 description 2
- CLBGMWIYPYAZPR-AVGNSLFASA-N Lys-Arg-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O CLBGMWIYPYAZPR-AVGNSLFASA-N 0.000 description 2
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 2
- VHNOAIFVYUQOOY-XUXIUFHCSA-N Lys-Arg-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VHNOAIFVYUQOOY-XUXIUFHCSA-N 0.000 description 2
- DGAAQRAUOFHBFJ-CIUDSAMLSA-N Lys-Asn-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O DGAAQRAUOFHBFJ-CIUDSAMLSA-N 0.000 description 2
- WLCYCADOWRMSAJ-CIUDSAMLSA-N Lys-Asn-Cys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(O)=O WLCYCADOWRMSAJ-CIUDSAMLSA-N 0.000 description 2
- DEFGUIIUYAUEDU-ZPFDUUQYSA-N Lys-Asn-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DEFGUIIUYAUEDU-ZPFDUUQYSA-N 0.000 description 2
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 2
- KWUKZRFFKPLUPE-HJGDQZAQSA-N Lys-Asp-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KWUKZRFFKPLUPE-HJGDQZAQSA-N 0.000 description 2
- DFXQCCBKGUNYGG-GUBZILKMSA-N Lys-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN DFXQCCBKGUNYGG-GUBZILKMSA-N 0.000 description 2
- HWMZUBUEOYAQSC-DCAQKATOSA-N Lys-Gln-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O HWMZUBUEOYAQSC-DCAQKATOSA-N 0.000 description 2
- QQUJSUFWEDZQQY-AVGNSLFASA-N Lys-Gln-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN QQUJSUFWEDZQQY-AVGNSLFASA-N 0.000 description 2
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 2
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 2
- WGLAORUKDGRINI-WDCWCFNPSA-N Lys-Glu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGLAORUKDGRINI-WDCWCFNPSA-N 0.000 description 2
- ITWQLSZTLBKWJM-YUMQZZPRSA-N Lys-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCCN ITWQLSZTLBKWJM-YUMQZZPRSA-N 0.000 description 2
- FHIAJWBDZVHLAH-YUMQZZPRSA-N Lys-Gly-Ser Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FHIAJWBDZVHLAH-YUMQZZPRSA-N 0.000 description 2
- NNKLKUUGESXCBS-KBPBESRZSA-N Lys-Gly-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NNKLKUUGESXCBS-KBPBESRZSA-N 0.000 description 2
- SLQJJFAVWSZLBL-BJDJZHNGSA-N Lys-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN SLQJJFAVWSZLBL-BJDJZHNGSA-N 0.000 description 2
- MXMDJEJWERYPMO-XUXIUFHCSA-N Lys-Ile-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MXMDJEJWERYPMO-XUXIUFHCSA-N 0.000 description 2
- QBEPTBMRQALPEV-MNXVOIDGSA-N Lys-Ile-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN QBEPTBMRQALPEV-MNXVOIDGSA-N 0.000 description 2
- ZXFRGTAIIZHNHG-AJNGGQMLSA-N Lys-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N ZXFRGTAIIZHNHG-AJNGGQMLSA-N 0.000 description 2
- KEPWSUPUFAPBRF-DKIMLUQUSA-N Lys-Ile-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KEPWSUPUFAPBRF-DKIMLUQUSA-N 0.000 description 2
- IZJGPPIGYTVXLB-FQUUOJAGSA-N Lys-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IZJGPPIGYTVXLB-FQUUOJAGSA-N 0.000 description 2
- PRSBSVAVOQOAMI-BJDJZHNGSA-N Lys-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN PRSBSVAVOQOAMI-BJDJZHNGSA-N 0.000 description 2
- XREQQOATSMMAJP-MGHWNKPDSA-N Lys-Ile-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XREQQOATSMMAJP-MGHWNKPDSA-N 0.000 description 2
- MYZMQWHPDAYKIE-SRVKXCTJSA-N Lys-Leu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O MYZMQWHPDAYKIE-SRVKXCTJSA-N 0.000 description 2
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 2
- VMTYLUGCXIEDMV-QWRGUYRKSA-N Lys-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN VMTYLUGCXIEDMV-QWRGUYRKSA-N 0.000 description 2
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 2
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 2
- AHFOKDZWPPGJAZ-SRVKXCTJSA-N Lys-Lys-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)O)N AHFOKDZWPPGJAZ-SRVKXCTJSA-N 0.000 description 2
- PYFNONMJYNJENN-AVGNSLFASA-N Lys-Lys-Gln Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PYFNONMJYNJENN-AVGNSLFASA-N 0.000 description 2
- QQPSCXKFDSORFT-IHRRRGAJSA-N Lys-Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN QQPSCXKFDSORFT-IHRRRGAJSA-N 0.000 description 2
- SPNKGZFASINBMR-IHRRRGAJSA-N Lys-Met-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCCN)N SPNKGZFASINBMR-IHRRRGAJSA-N 0.000 description 2
- ALEVUGKHINJNIF-QEJZJMRPSA-N Lys-Phe-Ala Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 ALEVUGKHINJNIF-QEJZJMRPSA-N 0.000 description 2
- LNMKRJJLEFASGA-BZSNNMDCSA-N Lys-Phe-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LNMKRJJLEFASGA-BZSNNMDCSA-N 0.000 description 2
- WLXGMVVHTIUPHE-ULQDDVLXSA-N Lys-Phe-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O WLXGMVVHTIUPHE-ULQDDVLXSA-N 0.000 description 2
- JOSAKOKSPXROGQ-BJDJZHNGSA-N Lys-Ser-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JOSAKOKSPXROGQ-BJDJZHNGSA-N 0.000 description 2
- MEQLGHAMAUPOSJ-DCAQKATOSA-N Lys-Ser-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O MEQLGHAMAUPOSJ-DCAQKATOSA-N 0.000 description 2
- RPWTZTBIFGENIA-VOAKCMCISA-N Lys-Thr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RPWTZTBIFGENIA-VOAKCMCISA-N 0.000 description 2
- YKBSXQFZWFXFIB-VOAKCMCISA-N Lys-Thr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCCN)C(O)=O YKBSXQFZWFXFIB-VOAKCMCISA-N 0.000 description 2
- WAAZECNCPVGPIV-RHYQMDGZSA-N Lys-Thr-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O WAAZECNCPVGPIV-RHYQMDGZSA-N 0.000 description 2
- ZJSXCIMWLPSTMG-HSCHXYMDSA-N Lys-Trp-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZJSXCIMWLPSTMG-HSCHXYMDSA-N 0.000 description 2
- PELXPRPDQRFBGQ-KKUMJFAQSA-N Lys-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O PELXPRPDQRFBGQ-KKUMJFAQSA-N 0.000 description 2
- MIMXMVDLMDMOJD-BZSNNMDCSA-N Lys-Tyr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O MIMXMVDLMDMOJD-BZSNNMDCSA-N 0.000 description 2
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 2
- FPQMQEOVSKMVMA-ACRUOGEOSA-N Lys-Tyr-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)NC(=O)[C@H](CCCCN)N)O FPQMQEOVSKMVMA-ACRUOGEOSA-N 0.000 description 2
- XABXVVSWUVCZST-GVXVVHGQSA-N Lys-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN XABXVVSWUVCZST-GVXVVHGQSA-N 0.000 description 2
- VKCPHIOZDWUFSW-ONGXEEELSA-N Lys-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN VKCPHIOZDWUFSW-ONGXEEELSA-N 0.000 description 2
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 2
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 2
- FRWZTWWOORIIBA-FXQIFTODSA-N Met-Asn-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N FRWZTWWOORIIBA-FXQIFTODSA-N 0.000 description 2
- IVCPHARVJUYDPA-FXQIFTODSA-N Met-Asn-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IVCPHARVJUYDPA-FXQIFTODSA-N 0.000 description 2
- TUSOIZOVPJCMFC-FXQIFTODSA-N Met-Asp-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O TUSOIZOVPJCMFC-FXQIFTODSA-N 0.000 description 2
- XMMWDTUFTZMQFD-GMOBBJLQSA-N Met-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCSC XMMWDTUFTZMQFD-GMOBBJLQSA-N 0.000 description 2
- MYKLINMAGAIRPJ-CIUDSAMLSA-N Met-Gln-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O MYKLINMAGAIRPJ-CIUDSAMLSA-N 0.000 description 2
- FWTBMGAKKPSTBT-GUBZILKMSA-N Met-Gln-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FWTBMGAKKPSTBT-GUBZILKMSA-N 0.000 description 2
- SJDQOYTYNGZZJX-SRVKXCTJSA-N Met-Glu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SJDQOYTYNGZZJX-SRVKXCTJSA-N 0.000 description 2
- HLQWFLJOJRFXHO-CIUDSAMLSA-N Met-Glu-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O HLQWFLJOJRFXHO-CIUDSAMLSA-N 0.000 description 2
- JPCHYAUKOUGOIB-HJGDQZAQSA-N Met-Glu-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPCHYAUKOUGOIB-HJGDQZAQSA-N 0.000 description 2
- FYRUJIJAUPHUNB-IUCAKERBSA-N Met-Gly-Arg Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N FYRUJIJAUPHUNB-IUCAKERBSA-N 0.000 description 2
- MYAPQOBHGWJZOM-UWVGGRQHSA-N Met-Gly-Leu Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C MYAPQOBHGWJZOM-UWVGGRQHSA-N 0.000 description 2
- JACAKCWAOHKQBV-UWVGGRQHSA-N Met-Gly-Lys Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN JACAKCWAOHKQBV-UWVGGRQHSA-N 0.000 description 2
- RVYDCISQIGHAFC-ZPFDUUQYSA-N Met-Ile-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O RVYDCISQIGHAFC-ZPFDUUQYSA-N 0.000 description 2
- GETCJHFFECHWHI-QXEWZRGKSA-N Met-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCSC)N GETCJHFFECHWHI-QXEWZRGKSA-N 0.000 description 2
- AFVOKRHYSSFPHC-STECZYCISA-N Met-Ile-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFVOKRHYSSFPHC-STECZYCISA-N 0.000 description 2
- RBGLBUDVQVPTEG-DCAQKATOSA-N Met-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCSC)N RBGLBUDVQVPTEG-DCAQKATOSA-N 0.000 description 2
- CHDYFPCQVUOJEB-ULQDDVLXSA-N Met-Leu-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 CHDYFPCQVUOJEB-ULQDDVLXSA-N 0.000 description 2
- DBXMFHGGHMXYHY-DCAQKATOSA-N Met-Leu-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O DBXMFHGGHMXYHY-DCAQKATOSA-N 0.000 description 2
- WPTHAGXMYDRPFD-SRVKXCTJSA-N Met-Lys-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O WPTHAGXMYDRPFD-SRVKXCTJSA-N 0.000 description 2
- ZRACLHJYVRBJFC-ULQDDVLXSA-N Met-Lys-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZRACLHJYVRBJFC-ULQDDVLXSA-N 0.000 description 2
- WTHGNAAQXISJHP-AVGNSLFASA-N Met-Lys-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WTHGNAAQXISJHP-AVGNSLFASA-N 0.000 description 2
- ILKCLLLOGPDNIP-RCWTZXSCSA-N Met-Met-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ILKCLLLOGPDNIP-RCWTZXSCSA-N 0.000 description 2
- WYDFQSJOARJAMM-GUBZILKMSA-N Met-Pro-Asp Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WYDFQSJOARJAMM-GUBZILKMSA-N 0.000 description 2
- WSPQHZOMTFFWGH-XGEHTFHBSA-N Met-Thr-Cys Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(O)=O WSPQHZOMTFFWGH-XGEHTFHBSA-N 0.000 description 2
- WXJLBSXNUHIGSS-OSUNSFLBSA-N Met-Thr-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WXJLBSXNUHIGSS-OSUNSFLBSA-N 0.000 description 2
- RKRFGIBULDYDPF-XIRDDKMYSA-N Met-Trp-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RKRFGIBULDYDPF-XIRDDKMYSA-N 0.000 description 2
- 241000699666 Mus <mouse, genus> Species 0.000 description 2
- CHJJGSNFBQVOTG-UHFFFAOYSA-N N-methyl-guanidine Natural products CNC(N)=N CHJJGSNFBQVOTG-UHFFFAOYSA-N 0.000 description 2
- 108010034522 NNQQ peptide Proteins 0.000 description 2
- 108010065395 Neuropep-1 Proteins 0.000 description 2
- 102000007079 Peptide Fragments Human genes 0.000 description 2
- 108010033276 Peptide Fragments Proteins 0.000 description 2
- LSXGADJXBDFXQU-DLOVCJGASA-N Phe-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 LSXGADJXBDFXQU-DLOVCJGASA-N 0.000 description 2
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 2
- BKWJQWJPZMUWEG-LFSVMHDDSA-N Phe-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BKWJQWJPZMUWEG-LFSVMHDDSA-N 0.000 description 2
- AGYXCMYVTBYGCT-ULQDDVLXSA-N Phe-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O AGYXCMYVTBYGCT-ULQDDVLXSA-N 0.000 description 2
- IWRZUGHCHFZYQZ-UFYCRDLUSA-N Phe-Arg-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 IWRZUGHCHFZYQZ-UFYCRDLUSA-N 0.000 description 2
- HTTYNOXBBOWZTB-SRVKXCTJSA-N Phe-Asn-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HTTYNOXBBOWZTB-SRVKXCTJSA-N 0.000 description 2
- DDYIRGBOZVKRFR-AVGNSLFASA-N Phe-Asp-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N DDYIRGBOZVKRFR-AVGNSLFASA-N 0.000 description 2
- CSYVXYQDIVCQNU-QWRGUYRKSA-N Phe-Asp-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O CSYVXYQDIVCQNU-QWRGUYRKSA-N 0.000 description 2
- OJUMUUXGSXUZJZ-SRVKXCTJSA-N Phe-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O OJUMUUXGSXUZJZ-SRVKXCTJSA-N 0.000 description 2
- WFDAEEUZPZSMOG-SRVKXCTJSA-N Phe-Cys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O WFDAEEUZPZSMOG-SRVKXCTJSA-N 0.000 description 2
- HOYQLNNGMHXZDW-KKUMJFAQSA-N Phe-Glu-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O HOYQLNNGMHXZDW-KKUMJFAQSA-N 0.000 description 2
- APJPXSFJBMMOLW-KBPBESRZSA-N Phe-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 APJPXSFJBMMOLW-KBPBESRZSA-N 0.000 description 2
- GYEPCBNTTRORKW-PCBIJLKTSA-N Phe-Ile-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O GYEPCBNTTRORKW-PCBIJLKTSA-N 0.000 description 2
- YTILBRIUASDGBL-BZSNNMDCSA-N Phe-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 YTILBRIUASDGBL-BZSNNMDCSA-N 0.000 description 2
- OSBADCBXAMSPQD-YESZJQIVSA-N Phe-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N OSBADCBXAMSPQD-YESZJQIVSA-N 0.000 description 2
- JKJSIYKSGIDHPM-WBAXXEDZSA-N Phe-Phe-Ala Chemical compound C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)Cc1ccccc1)C(O)=O JKJSIYKSGIDHPM-WBAXXEDZSA-N 0.000 description 2
- OWSLLRKCHLTUND-BZSNNMDCSA-N Phe-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OWSLLRKCHLTUND-BZSNNMDCSA-N 0.000 description 2
- KAJLHCWRWDSROH-BZSNNMDCSA-N Phe-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 KAJLHCWRWDSROH-BZSNNMDCSA-N 0.000 description 2
- FENSZYFJQOFSQR-FIRPJDEBSA-N Phe-Phe-Ile Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FENSZYFJQOFSQR-FIRPJDEBSA-N 0.000 description 2
- JDMKQHSHKJHAHR-UHFFFAOYSA-N Phe-Phe-Leu-Tyr Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)CC1=CC=CC=C1 JDMKQHSHKJHAHR-UHFFFAOYSA-N 0.000 description 2
- DEZCWWXTRAKZKJ-UFYCRDLUSA-N Phe-Phe-Met Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O DEZCWWXTRAKZKJ-UFYCRDLUSA-N 0.000 description 2
- WWPAHTZOWURIMR-ULQDDVLXSA-N Phe-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 WWPAHTZOWURIMR-ULQDDVLXSA-N 0.000 description 2
- AFNJAQVMTIQTCB-DLOVCJGASA-N Phe-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 AFNJAQVMTIQTCB-DLOVCJGASA-N 0.000 description 2
- XDMMOISUAHXXFD-SRVKXCTJSA-N Phe-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O XDMMOISUAHXXFD-SRVKXCTJSA-N 0.000 description 2
- UNBFGVQVQGXXCK-KKUMJFAQSA-N Phe-Ser-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O UNBFGVQVQGXXCK-KKUMJFAQSA-N 0.000 description 2
- GKRCCTYAGQPMMP-IHRRRGAJSA-N Phe-Ser-Met Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O GKRCCTYAGQPMMP-IHRRRGAJSA-N 0.000 description 2
- PTDAGKJHZBGDKD-OEAJRASXSA-N Phe-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O PTDAGKJHZBGDKD-OEAJRASXSA-N 0.000 description 2
- MJOJSHOTYWABPR-WIRXVTQYSA-N Phe-Trp-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 MJOJSHOTYWABPR-WIRXVTQYSA-N 0.000 description 2
- QTDBZORPVYTRJU-KKXDTOCCSA-N Phe-Tyr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O QTDBZORPVYTRJU-KKXDTOCCSA-N 0.000 description 2
- BAONJAHBAUDJKA-BZSNNMDCSA-N Phe-Tyr-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 BAONJAHBAUDJKA-BZSNNMDCSA-N 0.000 description 2
- MHNBYYFXWDUGBW-RPTUDFQQSA-N Phe-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O MHNBYYFXWDUGBW-RPTUDFQQSA-N 0.000 description 2
- GOUWCZRDTWTODO-YDHLFZDLSA-N Phe-Val-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O GOUWCZRDTWTODO-YDHLFZDLSA-N 0.000 description 2
- XALFIVXGQUEGKV-JSGCOSHPSA-N Phe-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 XALFIVXGQUEGKV-JSGCOSHPSA-N 0.000 description 2
- JTKGCYOOJLUETJ-ULQDDVLXSA-N Phe-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JTKGCYOOJLUETJ-ULQDDVLXSA-N 0.000 description 2
- IEIFEYBAYFSRBQ-IHRRRGAJSA-N Phe-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N IEIFEYBAYFSRBQ-IHRRRGAJSA-N 0.000 description 2
- 102000017033 Porins Human genes 0.000 description 2
- 108010013381 Porins Proteins 0.000 description 2
- FYQSMXKJYTZYRP-DCAQKATOSA-N Pro-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 FYQSMXKJYTZYRP-DCAQKATOSA-N 0.000 description 2
- WWAQEUOYCYMGHB-FXQIFTODSA-N Pro-Asn-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 WWAQEUOYCYMGHB-FXQIFTODSA-N 0.000 description 2
- XWYXZPHPYKRYPA-GMOBBJLQSA-N Pro-Asn-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XWYXZPHPYKRYPA-GMOBBJLQSA-N 0.000 description 2
- FUVBEZJCRMHWEM-FXQIFTODSA-N Pro-Asn-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FUVBEZJCRMHWEM-FXQIFTODSA-N 0.000 description 2
- SWXSLPHTJVAWDF-VEVYYDQMSA-N Pro-Asn-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWXSLPHTJVAWDF-VEVYYDQMSA-N 0.000 description 2
- SGCZFWSQERRKBD-BQBZGAKWSA-N Pro-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 SGCZFWSQERRKBD-BQBZGAKWSA-N 0.000 description 2
- KPDRZQUWJKTMBP-DCAQKATOSA-N Pro-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 KPDRZQUWJKTMBP-DCAQKATOSA-N 0.000 description 2
- LXVLKXPFIDDHJG-CIUDSAMLSA-N Pro-Glu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O LXVLKXPFIDDHJG-CIUDSAMLSA-N 0.000 description 2
- UEHYFUCOGHWASA-HJGDQZAQSA-N Pro-Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 UEHYFUCOGHWASA-HJGDQZAQSA-N 0.000 description 2
- FEPSEIDIPBMIOS-QXEWZRGKSA-N Pro-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEPSEIDIPBMIOS-QXEWZRGKSA-N 0.000 description 2
- XQSREVQDGCPFRJ-STQMWFEESA-N Pro-Gly-Phe Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XQSREVQDGCPFRJ-STQMWFEESA-N 0.000 description 2
- AQSMZTIEJMZQEC-DCAQKATOSA-N Pro-His-Ser Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CO)C(=O)O AQSMZTIEJMZQEC-DCAQKATOSA-N 0.000 description 2
- IBGCFJDLCYTKPW-NAKRPEOUSA-N Pro-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 IBGCFJDLCYTKPW-NAKRPEOUSA-N 0.000 description 2
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 2
- KWMUAKQOVYCQJQ-ZPFDUUQYSA-N Pro-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@@H]1CCCN1 KWMUAKQOVYCQJQ-ZPFDUUQYSA-N 0.000 description 2
- FKVNLUZHSFCNGY-RVMXOQNASA-N Pro-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 FKVNLUZHSFCNGY-RVMXOQNASA-N 0.000 description 2
- MCWHYUWXVNRXFV-RWMBFGLXSA-N Pro-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 MCWHYUWXVNRXFV-RWMBFGLXSA-N 0.000 description 2
- VTFXTWDFPTWNJY-RHYQMDGZSA-N Pro-Leu-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VTFXTWDFPTWNJY-RHYQMDGZSA-N 0.000 description 2
- SUENWIFTSTWUKD-AVGNSLFASA-N Pro-Leu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SUENWIFTSTWUKD-AVGNSLFASA-N 0.000 description 2
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 2
- ZLXKLMHAMDENIO-DCAQKATOSA-N Pro-Lys-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O ZLXKLMHAMDENIO-DCAQKATOSA-N 0.000 description 2
- RMODQFBNDDENCP-IHRRRGAJSA-N Pro-Lys-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O RMODQFBNDDENCP-IHRRRGAJSA-N 0.000 description 2
- DWGFLKQSGRUQTI-IHRRRGAJSA-N Pro-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 DWGFLKQSGRUQTI-IHRRRGAJSA-N 0.000 description 2
- KLOQCCRTPHPIFN-DCAQKATOSA-N Pro-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 KLOQCCRTPHPIFN-DCAQKATOSA-N 0.000 description 2
- WIPAMEKBSHNFQE-IUCAKERBSA-N Pro-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@@H]1CCCN1 WIPAMEKBSHNFQE-IUCAKERBSA-N 0.000 description 2
- BUEIYHBJHCDAMI-UFYCRDLUSA-N Pro-Phe-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BUEIYHBJHCDAMI-UFYCRDLUSA-N 0.000 description 2
- HOTVCUAVDQHUDB-UFYCRDLUSA-N Pro-Phe-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 HOTVCUAVDQHUDB-UFYCRDLUSA-N 0.000 description 2
- GFHOSBYCLACKEK-GUBZILKMSA-N Pro-Pro-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O GFHOSBYCLACKEK-GUBZILKMSA-N 0.000 description 2
- POQFNPILEQEODH-FXQIFTODSA-N Pro-Ser-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O POQFNPILEQEODH-FXQIFTODSA-N 0.000 description 2
- GOMUXSCOIWIJFP-GUBZILKMSA-N Pro-Ser-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GOMUXSCOIWIJFP-GUBZILKMSA-N 0.000 description 2
- ITUDDXVFGFEKPD-NAKRPEOUSA-N Pro-Ser-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ITUDDXVFGFEKPD-NAKRPEOUSA-N 0.000 description 2
- PRKWBYCXBBSLSK-GUBZILKMSA-N Pro-Ser-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O PRKWBYCXBBSLSK-GUBZILKMSA-N 0.000 description 2
- MDAWMJUZHBQTBO-XGEHTFHBSA-N Pro-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@@H]1CCCN1)O MDAWMJUZHBQTBO-XGEHTFHBSA-N 0.000 description 2
- QDDJNKWPTJHROJ-UFYCRDLUSA-N Pro-Tyr-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 QDDJNKWPTJHROJ-UFYCRDLUSA-N 0.000 description 2
- PGSWNLRYYONGPE-JYJNAYRXSA-N Pro-Val-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PGSWNLRYYONGPE-JYJNAYRXSA-N 0.000 description 2
- 108010019653 Pwo polymerase Proteins 0.000 description 2
- 108010025216 RVF peptide Proteins 0.000 description 2
- 108091028664 Ribonucleotide Proteins 0.000 description 2
- BKOKTRCZXRIQPX-ZLUOBGJFSA-N Ser-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N BKOKTRCZXRIQPX-ZLUOBGJFSA-N 0.000 description 2
- SRTCFKGBYBZRHA-ACZMJKKPSA-N Ser-Ala-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SRTCFKGBYBZRHA-ACZMJKKPSA-N 0.000 description 2
- IDQFQFVEWMWRQQ-DLOVCJGASA-N Ser-Ala-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IDQFQFVEWMWRQQ-DLOVCJGASA-N 0.000 description 2
- YQHZVYJAGWMHES-ZLUOBGJFSA-N Ser-Ala-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YQHZVYJAGWMHES-ZLUOBGJFSA-N 0.000 description 2
- QVOGDCQNGLBNCR-FXQIFTODSA-N Ser-Arg-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O QVOGDCQNGLBNCR-FXQIFTODSA-N 0.000 description 2
- BCKYYTVFBXHPOG-ACZMJKKPSA-N Ser-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N BCKYYTVFBXHPOG-ACZMJKKPSA-N 0.000 description 2
- VGNYHOBZJKWRGI-CIUDSAMLSA-N Ser-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO VGNYHOBZJKWRGI-CIUDSAMLSA-N 0.000 description 2
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 2
- TYYBJUYSTWJHGO-ZKWXMUAHSA-N Ser-Asn-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TYYBJUYSTWJHGO-ZKWXMUAHSA-N 0.000 description 2
- DBIDZNUXSLXVRG-FXQIFTODSA-N Ser-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N DBIDZNUXSLXVRG-FXQIFTODSA-N 0.000 description 2
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 2
- MMAPOBOTRUVNKJ-ZLUOBGJFSA-N Ser-Asp-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CO)N)C(=O)O MMAPOBOTRUVNKJ-ZLUOBGJFSA-N 0.000 description 2
- KCFKKAQKRZBWJB-ZLUOBGJFSA-N Ser-Cys-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O KCFKKAQKRZBWJB-ZLUOBGJFSA-N 0.000 description 2
- FMDHKPRACUXATF-ACZMJKKPSA-N Ser-Gln-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O FMDHKPRACUXATF-ACZMJKKPSA-N 0.000 description 2
- KJMOINFQVCCSDX-XKBZYTNZSA-N Ser-Gln-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KJMOINFQVCCSDX-XKBZYTNZSA-N 0.000 description 2
- SQBLRDDJTUJDMV-ACZMJKKPSA-N Ser-Glu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQBLRDDJTUJDMV-ACZMJKKPSA-N 0.000 description 2
- YRBGKVIWMNEVCZ-WDSKDSINSA-N Ser-Glu-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YRBGKVIWMNEVCZ-WDSKDSINSA-N 0.000 description 2
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 2
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 2
- IFPBAGJBHSNYPR-ZKWXMUAHSA-N Ser-Ile-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O IFPBAGJBHSNYPR-ZKWXMUAHSA-N 0.000 description 2
- LWMQRHDTXHQQOV-MXAVVETBSA-N Ser-Ile-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LWMQRHDTXHQQOV-MXAVVETBSA-N 0.000 description 2
- QYSFWUIXDFJUDW-DCAQKATOSA-N Ser-Leu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYSFWUIXDFJUDW-DCAQKATOSA-N 0.000 description 2
- KCNSGAMPBPYUAI-CIUDSAMLSA-N Ser-Leu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O KCNSGAMPBPYUAI-CIUDSAMLSA-N 0.000 description 2
- GVIGVIOEYBOTCB-XIRDDKMYSA-N Ser-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC(C)C)C(O)=O)=CNC2=C1 GVIGVIOEYBOTCB-XIRDDKMYSA-N 0.000 description 2
- IXZHZUGGKLRHJD-DCAQKATOSA-N Ser-Leu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IXZHZUGGKLRHJD-DCAQKATOSA-N 0.000 description 2
- GZSZPKSBVAOGIE-CIUDSAMLSA-N Ser-Lys-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O GZSZPKSBVAOGIE-CIUDSAMLSA-N 0.000 description 2
- HDBOEVPDIDDEPC-CIUDSAMLSA-N Ser-Lys-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O HDBOEVPDIDDEPC-CIUDSAMLSA-N 0.000 description 2
- GVMUJUPXFQFBBZ-GUBZILKMSA-N Ser-Lys-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GVMUJUPXFQFBBZ-GUBZILKMSA-N 0.000 description 2
- WGDYNRCOQRERLZ-KKUMJFAQSA-N Ser-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CO)N WGDYNRCOQRERLZ-KKUMJFAQSA-N 0.000 description 2
- ZGFRMNZZTOVBOU-CIUDSAMLSA-N Ser-Met-Gln Chemical compound N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)O ZGFRMNZZTOVBOU-CIUDSAMLSA-N 0.000 description 2
- ASGYVPAVFNDZMA-GUBZILKMSA-N Ser-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CO)N ASGYVPAVFNDZMA-GUBZILKMSA-N 0.000 description 2
- XVWDJUROVRQKAE-KKUMJFAQSA-N Ser-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CC=CC=C1 XVWDJUROVRQKAE-KKUMJFAQSA-N 0.000 description 2
- MHVXPTAMDHLTHB-IHPCNDPISA-N Ser-Phe-Trp Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 MHVXPTAMDHLTHB-IHPCNDPISA-N 0.000 description 2
- RHAPJNVNWDBFQI-BQBZGAKWSA-N Ser-Pro-Gly Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O RHAPJNVNWDBFQI-BQBZGAKWSA-N 0.000 description 2
- FKYWFUYPVKLJLP-DCAQKATOSA-N Ser-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FKYWFUYPVKLJLP-DCAQKATOSA-N 0.000 description 2
- QUGRFWPMPVIAPW-IHRRRGAJSA-N Ser-Pro-Phe Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QUGRFWPMPVIAPW-IHRRRGAJSA-N 0.000 description 2
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 2
- KQNDIKOYWZTZIX-FXQIFTODSA-N Ser-Ser-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQNDIKOYWZTZIX-FXQIFTODSA-N 0.000 description 2
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 2
- OLKICIBQRVSQMA-SRVKXCTJSA-N Ser-Ser-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OLKICIBQRVSQMA-SRVKXCTJSA-N 0.000 description 2
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 2
- DYEGLQRVMBWQLD-IXOXFDKPSA-N Ser-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CO)N)O DYEGLQRVMBWQLD-IXOXFDKPSA-N 0.000 description 2
- SNXUIBACCONSOH-BWBBJGPYSA-N Ser-Thr-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CO)C(O)=O SNXUIBACCONSOH-BWBBJGPYSA-N 0.000 description 2
- AXKJPUBALUNJEO-UBHSHLNASA-N Ser-Trp-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O AXKJPUBALUNJEO-UBHSHLNASA-N 0.000 description 2
- ATEQEHCGZKBEMU-GQGQLFGLSA-N Ser-Trp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CO)N ATEQEHCGZKBEMU-GQGQLFGLSA-N 0.000 description 2
- UBTNVMGPMYDYIU-HJPIBITLSA-N Ser-Tyr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UBTNVMGPMYDYIU-HJPIBITLSA-N 0.000 description 2
- KIEIJCFVGZCUAS-MELADBBJSA-N Ser-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N)C(=O)O KIEIJCFVGZCUAS-MELADBBJSA-N 0.000 description 2
- SGZVZUCRAVSPKQ-FXQIFTODSA-N Ser-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N SGZVZUCRAVSPKQ-FXQIFTODSA-N 0.000 description 2
- BEBVVQPDSHHWQL-NRPADANISA-N Ser-Val-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BEBVVQPDSHHWQL-NRPADANISA-N 0.000 description 2
- RCOUFINCYASMDN-GUBZILKMSA-N Ser-Val-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCSC)C(O)=O RCOUFINCYASMDN-GUBZILKMSA-N 0.000 description 2
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 2
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 2
- 241000607762 Shigella flexneri Species 0.000 description 2
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 2
- 241000256251 Spodoptera frugiperda Species 0.000 description 2
- 108091081024 Start codon Proteins 0.000 description 2
- 229930006000 Sucrose Natural products 0.000 description 2
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 2
- YRNBANYVJJBGDI-VZFHVOOUSA-N Thr-Ala-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(=O)O)N)O YRNBANYVJJBGDI-VZFHVOOUSA-N 0.000 description 2
- DDPVJPIGACCMEH-XQXXSGGOSA-N Thr-Ala-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DDPVJPIGACCMEH-XQXXSGGOSA-N 0.000 description 2
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 2
- CAGTXGDOIFXLPC-KZVJFYERSA-N Thr-Arg-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CCCN=C(N)N CAGTXGDOIFXLPC-KZVJFYERSA-N 0.000 description 2
- DCCGCVLVVSAJFK-NUMRIWBASA-N Thr-Asp-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O DCCGCVLVVSAJFK-NUMRIWBASA-N 0.000 description 2
- GNHRVXYZKWSJTF-HJGDQZAQSA-N Thr-Asp-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O GNHRVXYZKWSJTF-HJGDQZAQSA-N 0.000 description 2
- KRPKYGOFYUNIGM-XVSYOHENSA-N Thr-Asp-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O KRPKYGOFYUNIGM-XVSYOHENSA-N 0.000 description 2
- QILPDQCTQZDHFM-HJGDQZAQSA-N Thr-Gln-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QILPDQCTQZDHFM-HJGDQZAQSA-N 0.000 description 2
- LAFLAXHTDVNVEL-WDCWCFNPSA-N Thr-Gln-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O LAFLAXHTDVNVEL-WDCWCFNPSA-N 0.000 description 2
- FHDLKMFZKRUQCE-HJGDQZAQSA-N Thr-Glu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHDLKMFZKRUQCE-HJGDQZAQSA-N 0.000 description 2
- GKWNLDNXMMLRMC-GLLZPBPUSA-N Thr-Glu-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O GKWNLDNXMMLRMC-GLLZPBPUSA-N 0.000 description 2
- JMGJDTNUMAZNLX-RWRJDSDZSA-N Thr-Glu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JMGJDTNUMAZNLX-RWRJDSDZSA-N 0.000 description 2
- BIENEHRYNODTLP-HJGDQZAQSA-N Thr-Glu-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N)O BIENEHRYNODTLP-HJGDQZAQSA-N 0.000 description 2
- ONNSECRQFSTMCC-XKBZYTNZSA-N Thr-Glu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ONNSECRQFSTMCC-XKBZYTNZSA-N 0.000 description 2
- IMULJHHGAUZZFE-MBLNEYKQSA-N Thr-Gly-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IMULJHHGAUZZFE-MBLNEYKQSA-N 0.000 description 2
- QQWNRERCGGZOKG-WEDXCCLWSA-N Thr-Gly-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O QQWNRERCGGZOKG-WEDXCCLWSA-N 0.000 description 2
- ZBKDBZUTTXINIX-RWRJDSDZSA-N Thr-Ile-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZBKDBZUTTXINIX-RWRJDSDZSA-N 0.000 description 2
- IMDMLDSVUSMAEJ-HJGDQZAQSA-N Thr-Leu-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IMDMLDSVUSMAEJ-HJGDQZAQSA-N 0.000 description 2
- RRRRCRYTLZVCEN-HJGDQZAQSA-N Thr-Leu-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O RRRRCRYTLZVCEN-HJGDQZAQSA-N 0.000 description 2
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 2
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 2
- ZSPQUTWLWGWTPS-HJGDQZAQSA-N Thr-Lys-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O ZSPQUTWLWGWTPS-HJGDQZAQSA-N 0.000 description 2
- WRUWXBBEFUTJOU-XGEHTFHBSA-N Thr-Met-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N)O WRUWXBBEFUTJOU-XGEHTFHBSA-N 0.000 description 2
- NZRUWPIYECBYRK-HTUGSXCWSA-N Thr-Phe-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O NZRUWPIYECBYRK-HTUGSXCWSA-N 0.000 description 2
- MXNAOGFNFNKUPD-JHYOHUSXSA-N Thr-Phe-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MXNAOGFNFNKUPD-JHYOHUSXSA-N 0.000 description 2
- MEBDIIKMUUNBSB-RPTUDFQQSA-N Thr-Phe-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MEBDIIKMUUNBSB-RPTUDFQQSA-N 0.000 description 2
- DEGCBBCMYWNJNA-RHYQMDGZSA-N Thr-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O DEGCBBCMYWNJNA-RHYQMDGZSA-N 0.000 description 2
- YGCDFAJJCRVQKU-RCWTZXSCSA-N Thr-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O YGCDFAJJCRVQKU-RCWTZXSCSA-N 0.000 description 2
- WKGAAMOJPMBBMC-IXOXFDKPSA-N Thr-Ser-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WKGAAMOJPMBBMC-IXOXFDKPSA-N 0.000 description 2
- UMFLBPIPAJMNIM-LYARXQMPSA-N Thr-Trp-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CC=CC=C3)C(=O)O)N)O UMFLBPIPAJMNIM-LYARXQMPSA-N 0.000 description 2
- XGFYGMKZKFRGAI-RCWTZXSCSA-N Thr-Val-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XGFYGMKZKFRGAI-RCWTZXSCSA-N 0.000 description 2
- DXDMNBJJEXYMLA-UBHSHLNASA-N Trp-Asn-Asp Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O)=CNC2=C1 DXDMNBJJEXYMLA-UBHSHLNASA-N 0.000 description 2
- IQGJAHMZWBTRIF-UBHSHLNASA-N Trp-Asp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N IQGJAHMZWBTRIF-UBHSHLNASA-N 0.000 description 2
- DQDXHYIEITXNJY-BPUTZDHNSA-N Trp-Gln-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N DQDXHYIEITXNJY-BPUTZDHNSA-N 0.000 description 2
- YTCNLMSUXPCFBW-SXNHZJKMSA-N Trp-Ile-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O YTCNLMSUXPCFBW-SXNHZJKMSA-N 0.000 description 2
- SAKLWFSRZTZQAJ-GQGQLFGLSA-N Trp-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N SAKLWFSRZTZQAJ-GQGQLFGLSA-N 0.000 description 2
- YPBYQWFZAAQMGW-XIRDDKMYSA-N Trp-Lys-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N YPBYQWFZAAQMGW-XIRDDKMYSA-N 0.000 description 2
- HJXOFWKCWLHYIJ-SZMVWBNQSA-N Trp-Lys-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HJXOFWKCWLHYIJ-SZMVWBNQSA-N 0.000 description 2
- ABRICLFKFRFDKS-IHPCNDPISA-N Trp-Ser-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=C(O)C=C1 ABRICLFKFRFDKS-IHPCNDPISA-N 0.000 description 2
- HHPSUFUXXBOFQY-AQZXSJQPSA-N Trp-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O HHPSUFUXXBOFQY-AQZXSJQPSA-N 0.000 description 2
- KRCAKIVDAFTTGJ-ARVREXMNSA-N Trp-Trp-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)N)C(O)=O)=CNC2=C1 KRCAKIVDAFTTGJ-ARVREXMNSA-N 0.000 description 2
- SGQSAIFDESQBRA-IHPCNDPISA-N Trp-Tyr-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SGQSAIFDESQBRA-IHPCNDPISA-N 0.000 description 2
- NSOMQRHZMJMZIE-GVARAGBVSA-N Tyr-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NSOMQRHZMJMZIE-GVARAGBVSA-N 0.000 description 2
- KDGFPPHLXCEQRN-STECZYCISA-N Tyr-Arg-Ile Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDGFPPHLXCEQRN-STECZYCISA-N 0.000 description 2
- DKKHULUSOSWGHS-UWJYBYFXSA-N Tyr-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N DKKHULUSOSWGHS-UWJYBYFXSA-N 0.000 description 2
- CKKFTIQYURNSEI-IHRRRGAJSA-N Tyr-Asn-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CKKFTIQYURNSEI-IHRRRGAJSA-N 0.000 description 2
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 2
- XMNDQSYABVWZRK-BZSNNMDCSA-N Tyr-Asn-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XMNDQSYABVWZRK-BZSNNMDCSA-N 0.000 description 2
- VFJIWSJKZJTQII-SRVKXCTJSA-N Tyr-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O VFJIWSJKZJTQII-SRVKXCTJSA-N 0.000 description 2
- MNMYOSZWCKYEDI-JRQIVUDYSA-N Tyr-Asp-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MNMYOSZWCKYEDI-JRQIVUDYSA-N 0.000 description 2
- BVDHHLMIZFCAAU-BZSNNMDCSA-N Tyr-Cys-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BVDHHLMIZFCAAU-BZSNNMDCSA-N 0.000 description 2
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 2
- FNWGDMZVYBVAGJ-XEGUGMAKSA-N Tyr-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CC=C(C=C1)O)N FNWGDMZVYBVAGJ-XEGUGMAKSA-N 0.000 description 2
- NOOMDULIORCDNF-IRXDYDNUSA-N Tyr-Gly-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NOOMDULIORCDNF-IRXDYDNUSA-N 0.000 description 2
- AZGZDDNKFFUDEH-QWRGUYRKSA-N Tyr-Gly-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AZGZDDNKFFUDEH-QWRGUYRKSA-N 0.000 description 2
- ULHJJQYGMWONTD-HKUYNNGSSA-N Tyr-Gly-Trp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O ULHJJQYGMWONTD-HKUYNNGSSA-N 0.000 description 2
- NMKJPMCEKQHRPD-IRXDYDNUSA-N Tyr-Gly-Tyr Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 NMKJPMCEKQHRPD-IRXDYDNUSA-N 0.000 description 2
- FBHBVXUBTYVCRU-BZSNNMDCSA-N Tyr-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CN=CN1 FBHBVXUBTYVCRU-BZSNNMDCSA-N 0.000 description 2
- USYGMBIIUDLYHJ-GVARAGBVSA-N Tyr-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 USYGMBIIUDLYHJ-GVARAGBVSA-N 0.000 description 2
- NXRGXTBPMOGFID-CFMVVWHZSA-N Tyr-Ile-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O NXRGXTBPMOGFID-CFMVVWHZSA-N 0.000 description 2
- ILTXFANLDMJWPR-SIUGBPQLSA-N Tyr-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N ILTXFANLDMJWPR-SIUGBPQLSA-N 0.000 description 2
- BSCBBPKDVOZICB-KKUMJFAQSA-N Tyr-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BSCBBPKDVOZICB-KKUMJFAQSA-N 0.000 description 2
- CDKZJGMPZHPAJC-ULQDDVLXSA-N Tyr-Leu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDKZJGMPZHPAJC-ULQDDVLXSA-N 0.000 description 2
- GITNQBVCEQBDQC-KKUMJFAQSA-N Tyr-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O GITNQBVCEQBDQC-KKUMJFAQSA-N 0.000 description 2
- UBKKNELWDCBNCF-STQMWFEESA-N Tyr-Met-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UBKKNELWDCBNCF-STQMWFEESA-N 0.000 description 2
- HNERGSKJJZQGEA-JYJNAYRXSA-N Tyr-Met-Met Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N HNERGSKJJZQGEA-JYJNAYRXSA-N 0.000 description 2
- OGPKMBOPMDTEDM-IHRRRGAJSA-N Tyr-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N OGPKMBOPMDTEDM-IHRRRGAJSA-N 0.000 description 2
- WTTRJMAZPDHPGS-KKXDTOCCSA-N Tyr-Phe-Ala Chemical compound C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(O)=O WTTRJMAZPDHPGS-KKXDTOCCSA-N 0.000 description 2
- LRHBBGDMBLFYGL-FHWLQOOXSA-N Tyr-Phe-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LRHBBGDMBLFYGL-FHWLQOOXSA-N 0.000 description 2
- OKDNSNWJEXAMSU-IRXDYDNUSA-N Tyr-Phe-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)NCC(O)=O)C1=CC=C(O)C=C1 OKDNSNWJEXAMSU-IRXDYDNUSA-N 0.000 description 2
- PSALWJCUIAQKFW-ACRUOGEOSA-N Tyr-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N PSALWJCUIAQKFW-ACRUOGEOSA-N 0.000 description 2
- PYJKETPLFITNKS-IHRRRGAJSA-N Tyr-Pro-Asn Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O PYJKETPLFITNKS-IHRRRGAJSA-N 0.000 description 2
- YYLHVUCSTXXKBS-IHRRRGAJSA-N Tyr-Pro-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YYLHVUCSTXXKBS-IHRRRGAJSA-N 0.000 description 2
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 2
- IEWKKXZRJLTIOV-AVGNSLFASA-N Tyr-Ser-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O IEWKKXZRJLTIOV-AVGNSLFASA-N 0.000 description 2
- QPOUERMDWKKZEG-HJPIBITLSA-N Tyr-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QPOUERMDWKKZEG-HJPIBITLSA-N 0.000 description 2
- LUMQYLVYUIRHHU-YJRXYDGGSA-N Tyr-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LUMQYLVYUIRHHU-YJRXYDGGSA-N 0.000 description 2
- MDXLPNRXCFOBTL-BZSNNMDCSA-N Tyr-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MDXLPNRXCFOBTL-BZSNNMDCSA-N 0.000 description 2
- TYFLVOUZHQUBGM-IHRRRGAJSA-N Tyr-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 TYFLVOUZHQUBGM-IHRRRGAJSA-N 0.000 description 2
- RIVVDNTUSRVTQT-IRIUXVKKSA-N Tyr-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O RIVVDNTUSRVTQT-IRIUXVKKSA-N 0.000 description 2
- ZZDYJFVIKVSUFA-WLTAIBSBSA-N Tyr-Thr-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O ZZDYJFVIKVSUFA-WLTAIBSBSA-N 0.000 description 2
- VSYROIRKNBCULO-BWAGICSOSA-N Tyr-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)O VSYROIRKNBCULO-BWAGICSOSA-N 0.000 description 2
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 2
- AGDDLOQMXUQPDY-BZSNNMDCSA-N Tyr-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O AGDDLOQMXUQPDY-BZSNNMDCSA-N 0.000 description 2
- 108010064997 VPY tripeptide Proteins 0.000 description 2
- 241000700618 Vaccinia virus Species 0.000 description 2
- YFOCMOVJBQDBCE-NRPADANISA-N Val-Ala-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N YFOCMOVJBQDBCE-NRPADANISA-N 0.000 description 2
- LABUITCFCAABSV-BPNCWPANSA-N Val-Ala-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LABUITCFCAABSV-BPNCWPANSA-N 0.000 description 2
- LABUITCFCAABSV-UHFFFAOYSA-N Val-Ala-Tyr Natural products CC(C)C(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LABUITCFCAABSV-UHFFFAOYSA-N 0.000 description 2
- DNOOLPROHJWCSQ-RCWTZXSCSA-N Val-Arg-Thr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DNOOLPROHJWCSQ-RCWTZXSCSA-N 0.000 description 2
- UDLYXGYWTVOIKU-QXEWZRGKSA-N Val-Asn-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UDLYXGYWTVOIKU-QXEWZRGKSA-N 0.000 description 2
- OGNMURQZFMHFFD-NHCYSSNCSA-N Val-Asn-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N OGNMURQZFMHFFD-NHCYSSNCSA-N 0.000 description 2
- JLFKWDAZBRYCGX-ZKWXMUAHSA-N Val-Asn-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N JLFKWDAZBRYCGX-ZKWXMUAHSA-N 0.000 description 2
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 2
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 2
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 2
- CJDZKZFMAXGUOJ-IHRRRGAJSA-N Val-Cys-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N CJDZKZFMAXGUOJ-IHRRRGAJSA-N 0.000 description 2
- YDPFWRVQHFWBKI-GVXVVHGQSA-N Val-Glu-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N YDPFWRVQHFWBKI-GVXVVHGQSA-N 0.000 description 2
- ROLGIBMFNMZANA-GVXVVHGQSA-N Val-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N ROLGIBMFNMZANA-GVXVVHGQSA-N 0.000 description 2
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 2
- PMDOQZFYGWZSTK-LSJOCFKGSA-N Val-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C PMDOQZFYGWZSTK-LSJOCFKGSA-N 0.000 description 2
- SYOMXKPPFZRELL-ONGXEEELSA-N Val-Gly-Lys Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N SYOMXKPPFZRELL-ONGXEEELSA-N 0.000 description 2
- CPGJELLYDQEDRK-NAKRPEOUSA-N Val-Ile-Ala Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](C)C(O)=O CPGJELLYDQEDRK-NAKRPEOUSA-N 0.000 description 2
- UKEVLVBHRKWECS-LSJOCFKGSA-N Val-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](C(C)C)N UKEVLVBHRKWECS-LSJOCFKGSA-N 0.000 description 2
- FTKXYXACXYOHND-XUXIUFHCSA-N Val-Ile-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O FTKXYXACXYOHND-XUXIUFHCSA-N 0.000 description 2
- SDUBQHUJJWQTEU-XUXIUFHCSA-N Val-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C(C)C)N SDUBQHUJJWQTEU-XUXIUFHCSA-N 0.000 description 2
- DJQIUOKSNRBTSV-CYDGBPFRSA-N Val-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](C(C)C)N DJQIUOKSNRBTSV-CYDGBPFRSA-N 0.000 description 2
- FEXILLGKGGTLRI-NHCYSSNCSA-N Val-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N FEXILLGKGGTLRI-NHCYSSNCSA-N 0.000 description 2
- DAVNYIUELQBTAP-XUXIUFHCSA-N Val-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N DAVNYIUELQBTAP-XUXIUFHCSA-N 0.000 description 2
- ZZGPVSZDZQRJQY-ULQDDVLXSA-N Val-Leu-Phe Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](Cc1ccccc1)C(O)=O ZZGPVSZDZQRJQY-ULQDDVLXSA-N 0.000 description 2
- KTEZUXISLQTDDQ-NHCYSSNCSA-N Val-Lys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KTEZUXISLQTDDQ-NHCYSSNCSA-N 0.000 description 2
- HJSLDXZAZGFPDK-ULQDDVLXSA-N Val-Phe-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C(C)C)N HJSLDXZAZGFPDK-ULQDDVLXSA-N 0.000 description 2
- KISFXYYRKKNLOP-IHRRRGAJSA-N Val-Phe-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N KISFXYYRKKNLOP-IHRRRGAJSA-N 0.000 description 2
- AIWLHFZYOUUJGB-UFYCRDLUSA-N Val-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 AIWLHFZYOUUJGB-UFYCRDLUSA-N 0.000 description 2
- NHXZRXLFOBFMDM-AVGNSLFASA-N Val-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C NHXZRXLFOBFMDM-AVGNSLFASA-N 0.000 description 2
- QWCZXKIFPWPQHR-JYJNAYRXSA-N Val-Pro-Tyr Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QWCZXKIFPWPQHR-JYJNAYRXSA-N 0.000 description 2
- DEGUERSKQBRZMZ-FXQIFTODSA-N Val-Ser-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DEGUERSKQBRZMZ-FXQIFTODSA-N 0.000 description 2
- GBIUHAYJGWVNLN-AEJSXWLSSA-N Val-Ser-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N GBIUHAYJGWVNLN-AEJSXWLSSA-N 0.000 description 2
- PZTZYZUTCPZWJH-FXQIFTODSA-N Val-Ser-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PZTZYZUTCPZWJH-FXQIFTODSA-N 0.000 description 2
- HWNYVQMOLCYHEA-IHRRRGAJSA-N Val-Ser-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N HWNYVQMOLCYHEA-IHRRRGAJSA-N 0.000 description 2
- NZYNRRGJJVSSTJ-GUBZILKMSA-N Val-Ser-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O NZYNRRGJJVSSTJ-GUBZILKMSA-N 0.000 description 2
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 2
- LCHZBEUVGAVMKS-RHYQMDGZSA-N Val-Thr-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)[C@@H](C)O)C(O)=O LCHZBEUVGAVMKS-RHYQMDGZSA-N 0.000 description 2
- JXWGBRRVTRAZQA-ULQDDVLXSA-N Val-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N JXWGBRRVTRAZQA-ULQDDVLXSA-N 0.000 description 2
- BGTDGENDNWGMDQ-KJEVXHAQSA-N Val-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N)O BGTDGENDNWGMDQ-KJEVXHAQSA-N 0.000 description 2
- OWFGFHQMSBTKLX-UFYCRDLUSA-N Val-Tyr-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N OWFGFHQMSBTKLX-UFYCRDLUSA-N 0.000 description 2
- ZNGPROMGGGFOAA-JYJNAYRXSA-N Val-Tyr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 ZNGPROMGGGFOAA-JYJNAYRXSA-N 0.000 description 2
- AOILQMZPNLUXCM-AVGNSLFASA-N Val-Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN AOILQMZPNLUXCM-AVGNSLFASA-N 0.000 description 2
- JSOXWWFKRJKTMT-WOPDTQHZSA-N Val-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N JSOXWWFKRJKTMT-WOPDTQHZSA-N 0.000 description 2
- YKZVPMUGEJXEOR-JYJNAYRXSA-N Val-Val-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N YKZVPMUGEJXEOR-JYJNAYRXSA-N 0.000 description 2
- 238000001261 affinity purification Methods 0.000 description 2
- 239000008272 agar Substances 0.000 description 2
- 238000000246 agarose gel electrophoresis Methods 0.000 description 2
- 235000004279 alanine Nutrition 0.000 description 2
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 2
- 108010031014 alanyl-histidyl-leucyl-leucine Proteins 0.000 description 2
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 2
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 2
- 108010045350 alanyl-tyrosyl-alanine Proteins 0.000 description 2
- 108010041407 alanylaspartic acid Proteins 0.000 description 2
- 125000000217 alkyl group Chemical group 0.000 description 2
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 2
- 229910052782 aluminium Inorganic materials 0.000 description 2
- 229960000723 ampicillin Drugs 0.000 description 2
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 2
- 238000000137 annealing Methods 0.000 description 2
- 230000000692 anti-sense effect Effects 0.000 description 2
- 229940088710 antibiotic agent Drugs 0.000 description 2
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 2
- 108010089975 arginyl-glycyl-aspartyl-serine Proteins 0.000 description 2
- 108010038850 arginyl-isoleucyl-tyrosine Proteins 0.000 description 2
- 108010068380 arginylarginine Proteins 0.000 description 2
- 108010036533 arginylvaline Proteins 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 230000002238 attenuated effect Effects 0.000 description 2
- 238000001574 biopsy Methods 0.000 description 2
- 229960003150 bupivacaine Drugs 0.000 description 2
- 208000023652 chronic gastritis Diseases 0.000 description 2
- 238000004737 colorimetric analysis Methods 0.000 description 2
- 239000002299 complementary DNA Substances 0.000 description 2
- 238000012790 confirmation Methods 0.000 description 2
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 2
- 235000018417 cysteine Nutrition 0.000 description 2
- 108010004073 cysteinylcysteine Proteins 0.000 description 2
- 108010060199 cysteinylproline Proteins 0.000 description 2
- 239000003398 denaturant Substances 0.000 description 2
- PSLWZOIUBRXAQW-UHFFFAOYSA-M dimethyl(dioctadecyl)azanium;bromide Chemical compound [Br-].CCCCCCCCCCCCCCCCCC[N+](C)(C)CCCCCCCCCCCCCCCCCC PSLWZOIUBRXAQW-UHFFFAOYSA-M 0.000 description 2
- SWSQBOPZIKWTGO-UHFFFAOYSA-N dimethylaminoamidine Natural products CN(C)C(N)=N SWSQBOPZIKWTGO-UHFFFAOYSA-N 0.000 description 2
- 238000010828 elution Methods 0.000 description 2
- 229940088598 enzyme Drugs 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 102000013165 exonuclease Human genes 0.000 description 2
- 239000012530 fluid Substances 0.000 description 2
- 206010017758 gastric cancer Diseases 0.000 description 2
- 208000010749 gastric carcinoma Diseases 0.000 description 2
- 210000001156 gastric mucosa Anatomy 0.000 description 2
- 108010037389 glutamyl-cysteinyl-lysine Proteins 0.000 description 2
- 108010008237 glutamyl-valyl-glycine Proteins 0.000 description 2
- 108010075431 glycyl-alanyl-phenylalanine Proteins 0.000 description 2
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 2
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 2
- 108010010096 glycyl-glycyl-tyrosine Proteins 0.000 description 2
- 108010054666 glycyl-leucyl-glycyl-glycine Proteins 0.000 description 2
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 2
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 2
- 108010045126 glycyl-tyrosyl-glycine Proteins 0.000 description 2
- 229960004198 guanidine Drugs 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 229940037467 helicobacter pylori Drugs 0.000 description 2
- 125000000487 histidyl group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C([H])=N1 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- 108010018006 histidylserine Proteins 0.000 description 2
- 230000036039 immunity Effects 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 108010031424 isoleucyl-prolyl-proline Proteins 0.000 description 2
- 108010027338 isoleucylcysteine Proteins 0.000 description 2
- 108010053037 kyotorphin Proteins 0.000 description 2
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 2
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 2
- 238000011068 loading method Methods 0.000 description 2
- 108010057952 lysyl-phenylalanyl-lysine Proteins 0.000 description 2
- 230000013011 mating Effects 0.000 description 2
- 239000004005 microsphere Substances 0.000 description 2
- 229940035032 monophosphoryl lipid a Drugs 0.000 description 2
- 238000010172 mouse model Methods 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 229940126578 oral vaccine Drugs 0.000 description 2
- 230000036961 partial effect Effects 0.000 description 2
- 208000011906 peptic ulcer disease Diseases 0.000 description 2
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 2
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 2
- 239000002987 primer (paints) Substances 0.000 description 2
- 108010020755 prolyl-glycyl-glycine Proteins 0.000 description 2
- 108010079317 prolyl-tyrosine Proteins 0.000 description 2
- 108010004914 prolylarginine Proteins 0.000 description 2
- 239000011541 reaction mixture Substances 0.000 description 2
- 238000003259 recombinant expression Methods 0.000 description 2
- 238000004153 renaturation Methods 0.000 description 2
- 230000010076 replication Effects 0.000 description 2
- 239000011347 resin Substances 0.000 description 2
- 229920005989 resin Polymers 0.000 description 2
- 239000002336 ribonucleotide Substances 0.000 description 2
- 125000002652 ribonucleotide group Chemical group 0.000 description 2
- 150000003839 salts Chemical class 0.000 description 2
- 229920006395 saturated elastomer Polymers 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 230000028327 secretion Effects 0.000 description 2
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 2
- 238000010186 staining Methods 0.000 description 2
- 201000000498 stomach carcinoma Diseases 0.000 description 2
- 239000005720 sucrose Substances 0.000 description 2
- 239000001648 tannin Substances 0.000 description 2
- 102000055501 telomere Human genes 0.000 description 2
- 108091035539 telomere Proteins 0.000 description 2
- 108010001055 thymocartin Proteins 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 108700004896 tripeptide FEG Proteins 0.000 description 2
- WFKWXMTUELFFGS-UHFFFAOYSA-N tungsten Chemical compound [W] WFKWXMTUELFFGS-UHFFFAOYSA-N 0.000 description 2
- 229910052721 tungsten Inorganic materials 0.000 description 2
- 239000010937 tungsten Substances 0.000 description 2
- 108010005834 tyrosyl-alanyl-glycine Proteins 0.000 description 2
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 2
- 108010035534 tyrosyl-leucyl-alanine Proteins 0.000 description 2
- 101150080234 vacA gene Proteins 0.000 description 2
- JNTMAZFVYNDPLB-PEDHHIEDSA-N (2S,3S)-2-[[[(2S)-1-[(2S,3S)-2-amino-3-methyl-1-oxopentyl]-2-pyrrolidinyl]-oxomethyl]amino]-3-methylpentanoic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JNTMAZFVYNDPLB-PEDHHIEDSA-N 0.000 description 1
- CWFMWBHMIMNZLN-NAKRPEOUSA-N (2s)-1-[(2s)-2-[[(2s,3s)-2-amino-3-methylpentanoyl]amino]propanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CWFMWBHMIMNZLN-NAKRPEOUSA-N 0.000 description 1
- NTUPOKHATNSWCY-PMPSAXMXSA-N (2s)-2-[[(2s)-1-[(2r)-2-amino-3-phenylpropanoyl]pyrrolidine-2-carbonyl]amino]-5-(diaminomethylideneamino)pentanoic acid Chemical compound C([C@@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=CC=C1 NTUPOKHATNSWCY-PMPSAXMXSA-N 0.000 description 1
- UUDAMDVQRQNNHZ-UHFFFAOYSA-N (S)-AMPA Chemical compound CC=1ONC(=O)C=1CC(N)C(O)=O UUDAMDVQRQNNHZ-UHFFFAOYSA-N 0.000 description 1
- RKDVKSZUMVYZHH-UHFFFAOYSA-N 1,4-dioxane-2,5-dione Chemical compound O=C1COC(=O)CO1 RKDVKSZUMVYZHH-UHFFFAOYSA-N 0.000 description 1
- VSWPGAIWKHPTKX-UHFFFAOYSA-N 1-methyl-10-[2-(4-methyl-1-piperazinyl)-1-oxoethyl]-5H-thieno[3,4-b][1,5]benzodiazepin-4-one Chemical compound C1CN(C)CCN1CC(=O)N1C2=CC=CC=C2NC(=O)C2=CSC(C)=C21 VSWPGAIWKHPTKX-UHFFFAOYSA-N 0.000 description 1
- MXHRCPNRJAMMIM-SHYZEUOFSA-N 2'-deoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-SHYZEUOFSA-N 0.000 description 1
- RYOFERRMXDATKG-YEUCEMRASA-N 2,3-bis[(z)-octadec-9-enoxy]propyl-trimethylazanium Chemical compound CCCCCCCC\C=C/CCCCCCCCOCC(C[N+](C)(C)C)OCCCCCCCC\C=C/CCCCCCCC RYOFERRMXDATKG-YEUCEMRASA-N 0.000 description 1
- KSXTUUUQYQYKCR-LQDDAWAPSA-M 2,3-bis[[(z)-octadec-9-enoyl]oxy]propyl-trimethylazanium;chloride Chemical compound [Cl-].CCCCCCCC\C=C/CCCCCCCC(=O)OCC(C[N+](C)(C)C)OC(=O)CCCCCCC\C=C/CCCCCCCC KSXTUUUQYQYKCR-LQDDAWAPSA-M 0.000 description 1
- DQVAZKGVGKHQDS-UHFFFAOYSA-N 2-[[1-[2-[(2-amino-4-methylpentanoyl)amino]-4-methylpentanoyl]pyrrolidine-2-carbonyl]amino]-4-methylpentanoic acid Chemical compound CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(=O)NC(CC(C)C)C(O)=O DQVAZKGVGKHQDS-UHFFFAOYSA-N 0.000 description 1
- UQYCFWDXGAGNGW-UHFFFAOYSA-N 2-[[2-[[2-[(2-amino-3-methylpentanoyl)amino]-3-methylpentanoyl]amino]acetyl]amino]-3-phenylpropanoic acid Chemical compound CCC(C)C(N)C(=O)NC(C(C)CC)C(=O)NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 UQYCFWDXGAGNGW-UHFFFAOYSA-N 0.000 description 1
- QMOQBVOBWVNSNO-UHFFFAOYSA-N 2-[[2-[[2-[(2-azaniumylacetyl)amino]acetyl]amino]acetyl]amino]acetate Chemical compound NCC(=O)NCC(=O)NCC(=O)NCC(O)=O QMOQBVOBWVNSNO-UHFFFAOYSA-N 0.000 description 1
- ASJSAQIRZKANQN-CRCLSJGQSA-N 2-deoxy-D-ribose Chemical group OC[C@@H](O)[C@@H](O)CC=O ASJSAQIRZKANQN-CRCLSJGQSA-N 0.000 description 1
- 108010036211 5-HT-moduline Proteins 0.000 description 1
- WRDABNWSWOHGMS-UHFFFAOYSA-N AEBSF hydrochloride Chemical compound Cl.NCCC1=CC=C(S(F)(=O)=O)C=C1 WRDABNWSWOHGMS-UHFFFAOYSA-N 0.000 description 1
- 102100039819 Actin, alpha cardiac muscle 1 Human genes 0.000 description 1
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 1
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 1
- PJNSIUPOXFBHDM-GUBZILKMSA-N Ala-Arg-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O PJNSIUPOXFBHDM-GUBZILKMSA-N 0.000 description 1
- FXKNPWNXPQZLES-ZLUOBGJFSA-N Ala-Asn-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FXKNPWNXPQZLES-ZLUOBGJFSA-N 0.000 description 1
- XQJAFSDFQZPYCU-UWJYBYFXSA-N Ala-Asn-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N XQJAFSDFQZPYCU-UWJYBYFXSA-N 0.000 description 1
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 1
- KUDREHRZRIVKHS-UWJYBYFXSA-N Ala-Asp-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KUDREHRZRIVKHS-UWJYBYFXSA-N 0.000 description 1
- CSAHOYQKNHGDHX-ACZMJKKPSA-N Ala-Gln-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CSAHOYQKNHGDHX-ACZMJKKPSA-N 0.000 description 1
- IFTVANMRTIHKML-WDSKDSINSA-N Ala-Gln-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O IFTVANMRTIHKML-WDSKDSINSA-N 0.000 description 1
- CRWFEKLFPVRPBV-CIUDSAMLSA-N Ala-Gln-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O CRWFEKLFPVRPBV-CIUDSAMLSA-N 0.000 description 1
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 1
- IXTPACPAXIOCRG-ACZMJKKPSA-N Ala-Glu-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N IXTPACPAXIOCRG-ACZMJKKPSA-N 0.000 description 1
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 1
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 1
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 1
- JEPNLGMEZMCFEX-QSFUFRPTSA-N Ala-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C)N JEPNLGMEZMCFEX-QSFUFRPTSA-N 0.000 description 1
- SHKGHIFSEAGTNL-DLOVCJGASA-N Ala-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CN=CN1 SHKGHIFSEAGTNL-DLOVCJGASA-N 0.000 description 1
- CKLDHDOIYBVUNP-KBIXCLLPSA-N Ala-Ile-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O CKLDHDOIYBVUNP-KBIXCLLPSA-N 0.000 description 1
- LBYMZCVBOKYZNS-CIUDSAMLSA-N Ala-Leu-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O LBYMZCVBOKYZNS-CIUDSAMLSA-N 0.000 description 1
- SUMYEVXWCAYLLJ-GUBZILKMSA-N Ala-Leu-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O SUMYEVXWCAYLLJ-GUBZILKMSA-N 0.000 description 1
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 1
- OPZJWMJPCNNZNT-DCAQKATOSA-N Ala-Leu-Met Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)O)N OPZJWMJPCNNZNT-DCAQKATOSA-N 0.000 description 1
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 1
- QUIGLPSHIFPEOV-CIUDSAMLSA-N Ala-Lys-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O QUIGLPSHIFPEOV-CIUDSAMLSA-N 0.000 description 1
- SDZRIBWEVVRDQI-CIUDSAMLSA-N Ala-Lys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O SDZRIBWEVVRDQI-CIUDSAMLSA-N 0.000 description 1
- XHNLCGXYBXNRIS-BJDJZHNGSA-N Ala-Lys-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XHNLCGXYBXNRIS-BJDJZHNGSA-N 0.000 description 1
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 1
- FUKFQILQFQKHLE-DCAQKATOSA-N Ala-Lys-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O FUKFQILQFQKHLE-DCAQKATOSA-N 0.000 description 1
- BLTRAARCJYVJKV-QEJZJMRPSA-N Ala-Lys-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1ccccc1)C(O)=O BLTRAARCJYVJKV-QEJZJMRPSA-N 0.000 description 1
- CHFFHQUVXHEGBY-GARJFASQSA-N Ala-Lys-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N CHFFHQUVXHEGBY-GARJFASQSA-N 0.000 description 1
- MDNAVFBZPROEHO-DCAQKATOSA-N Ala-Lys-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MDNAVFBZPROEHO-DCAQKATOSA-N 0.000 description 1
- VHEVVUZDDUCAKU-FXQIFTODSA-N Ala-Met-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O VHEVVUZDDUCAKU-FXQIFTODSA-N 0.000 description 1
- PVQLRJRPUTXFFX-CIUDSAMLSA-N Ala-Met-Gln Chemical compound CSCC[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O PVQLRJRPUTXFFX-CIUDSAMLSA-N 0.000 description 1
- DEWWPUNXRNGMQN-LPEHRKFASA-N Ala-Met-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N DEWWPUNXRNGMQN-LPEHRKFASA-N 0.000 description 1
- BFMIRJBURUXDRG-DLOVCJGASA-N Ala-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 BFMIRJBURUXDRG-DLOVCJGASA-N 0.000 description 1
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 1
- JAQNUEWEJWBVAY-WBAXXEDZSA-N Ala-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 JAQNUEWEJWBVAY-WBAXXEDZSA-N 0.000 description 1
- IHMCQESUJVZTKW-UBHSHLNASA-N Ala-Phe-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 IHMCQESUJVZTKW-UBHSHLNASA-N 0.000 description 1
- XAXHGSOBFPIRFG-LSJOCFKGSA-N Ala-Pro-His Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O XAXHGSOBFPIRFG-LSJOCFKGSA-N 0.000 description 1
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 1
- MSWSRLGNLKHDEI-ACZMJKKPSA-N Ala-Ser-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O MSWSRLGNLKHDEI-ACZMJKKPSA-N 0.000 description 1
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- MMLHRUJLOUSRJX-CIUDSAMLSA-N Ala-Ser-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN MMLHRUJLOUSRJX-CIUDSAMLSA-N 0.000 description 1
- NCQMBSJGJMYKCK-ZLUOBGJFSA-N Ala-Ser-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O NCQMBSJGJMYKCK-ZLUOBGJFSA-N 0.000 description 1
- JJHBEVZAZXZREW-LFSVMHDDSA-N Ala-Thr-Phe Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O JJHBEVZAZXZREW-LFSVMHDDSA-N 0.000 description 1
- VQBULXOHAZSTQY-GKCIPKSASA-N Ala-Trp-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VQBULXOHAZSTQY-GKCIPKSASA-N 0.000 description 1
- GCTANJIJJROSLH-GVARAGBVSA-N Ala-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C)N GCTANJIJJROSLH-GVARAGBVSA-N 0.000 description 1
- XAXMJQUMRJAFCH-CQDKDKBSSA-N Ala-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 XAXMJQUMRJAFCH-CQDKDKBSSA-N 0.000 description 1
- XSLGWYYNOSUMRM-ZKWXMUAHSA-N Ala-Val-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XSLGWYYNOSUMRM-ZKWXMUAHSA-N 0.000 description 1
- DHONNEYAZPNGSG-UBHSHLNASA-N Ala-Val-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DHONNEYAZPNGSG-UBHSHLNASA-N 0.000 description 1
- REWSWYIDQIELBE-FXQIFTODSA-N Ala-Val-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O REWSWYIDQIELBE-FXQIFTODSA-N 0.000 description 1
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 1
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- 101710145634 Antigen 1 Proteins 0.000 description 1
- DFCIPNHFKOQAME-FXQIFTODSA-N Arg-Ala-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFCIPNHFKOQAME-FXQIFTODSA-N 0.000 description 1
- MCYJBCKCAPERSE-FXQIFTODSA-N Arg-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N MCYJBCKCAPERSE-FXQIFTODSA-N 0.000 description 1
- SBVJJNJLFWSJOV-UBHSHLNASA-N Arg-Ala-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SBVJJNJLFWSJOV-UBHSHLNASA-N 0.000 description 1
- VWVPYNGMOCSSGK-GUBZILKMSA-N Arg-Arg-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O VWVPYNGMOCSSGK-GUBZILKMSA-N 0.000 description 1
- NONSEUUPKITYQT-BQBZGAKWSA-N Arg-Asn-Gly Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N)CN=C(N)N NONSEUUPKITYQT-BQBZGAKWSA-N 0.000 description 1
- SQKPKIJVWHAWNF-DCAQKATOSA-N Arg-Asp-Lys Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(O)=O SQKPKIJVWHAWNF-DCAQKATOSA-N 0.000 description 1
- FBLMOFHNVQBKRR-IHRRRGAJSA-N Arg-Asp-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FBLMOFHNVQBKRR-IHRRRGAJSA-N 0.000 description 1
- PMGDADKJMCOXHX-BQBZGAKWSA-N Arg-Gln Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O PMGDADKJMCOXHX-BQBZGAKWSA-N 0.000 description 1
- SNBHMYQRNCJSOJ-CIUDSAMLSA-N Arg-Gln-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SNBHMYQRNCJSOJ-CIUDSAMLSA-N 0.000 description 1
- OBFTYSPXDRROQO-SRVKXCTJSA-N Arg-Gln-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCN=C(N)N OBFTYSPXDRROQO-SRVKXCTJSA-N 0.000 description 1
- LMPKCSXZJSXBBL-NHCYSSNCSA-N Arg-Gln-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O LMPKCSXZJSXBBL-NHCYSSNCSA-N 0.000 description 1
- HPKSHFSEXICTLI-CIUDSAMLSA-N Arg-Glu-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O HPKSHFSEXICTLI-CIUDSAMLSA-N 0.000 description 1
- QAODJPUKWNNNRP-DCAQKATOSA-N Arg-Glu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QAODJPUKWNNNRP-DCAQKATOSA-N 0.000 description 1
- QAXCZGMLVICQKS-SRVKXCTJSA-N Arg-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QAXCZGMLVICQKS-SRVKXCTJSA-N 0.000 description 1
- OHYQKYUTLIPFOX-ZPFDUUQYSA-N Arg-Glu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OHYQKYUTLIPFOX-ZPFDUUQYSA-N 0.000 description 1
- UFBURHXMKFQVLM-CIUDSAMLSA-N Arg-Glu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UFBURHXMKFQVLM-CIUDSAMLSA-N 0.000 description 1
- NXDXECQFKHXHAM-HJGDQZAQSA-N Arg-Glu-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NXDXECQFKHXHAM-HJGDQZAQSA-N 0.000 description 1
- SLNCSSWAIDUUGF-LSJOCFKGSA-N Arg-His-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O SLNCSSWAIDUUGF-LSJOCFKGSA-N 0.000 description 1
- CRCCTGPNZUCAHE-DCAQKATOSA-N Arg-His-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 CRCCTGPNZUCAHE-DCAQKATOSA-N 0.000 description 1
- YQGZIRIYGHNSQO-ZPFDUUQYSA-N Arg-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YQGZIRIYGHNSQO-ZPFDUUQYSA-N 0.000 description 1
- FFEUXEAKYRCACT-PEDHHIEDSA-N Arg-Ile-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCNC(N)=N)[C@@H](C)CC)C(O)=O FFEUXEAKYRCACT-PEDHHIEDSA-N 0.000 description 1
- GXXWTNKNFFKTJB-NAKRPEOUSA-N Arg-Ile-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O GXXWTNKNFFKTJB-NAKRPEOUSA-N 0.000 description 1
- HJDNZFIYILEIKR-OSUNSFLBSA-N Arg-Ile-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HJDNZFIYILEIKR-OSUNSFLBSA-N 0.000 description 1
- FNXCAFKDGBROCU-STECZYCISA-N Arg-Ile-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FNXCAFKDGBROCU-STECZYCISA-N 0.000 description 1
- YKZJPIPFKGYHKY-DCAQKATOSA-N Arg-Leu-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YKZJPIPFKGYHKY-DCAQKATOSA-N 0.000 description 1
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 1
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 1
- DTBPLQNKYCYUOM-JYJNAYRXSA-N Arg-Met-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DTBPLQNKYCYUOM-JYJNAYRXSA-N 0.000 description 1
- AOHKLEBWKMKITA-IHRRRGAJSA-N Arg-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AOHKLEBWKMKITA-IHRRRGAJSA-N 0.000 description 1
- FVBZXNSRIDVYJS-AVGNSLFASA-N Arg-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N FVBZXNSRIDVYJS-AVGNSLFASA-N 0.000 description 1
- ATABBWFGOHKROJ-GUBZILKMSA-N Arg-Pro-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O ATABBWFGOHKROJ-GUBZILKMSA-N 0.000 description 1
- URAUIUGLHBRPMF-NAKRPEOUSA-N Arg-Ser-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O URAUIUGLHBRPMF-NAKRPEOUSA-N 0.000 description 1
- JOTRDIXZHNQYGP-DCAQKATOSA-N Arg-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N JOTRDIXZHNQYGP-DCAQKATOSA-N 0.000 description 1
- UZSQXCMNUPKLCC-FJXKBIBVSA-N Arg-Thr-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UZSQXCMNUPKLCC-FJXKBIBVSA-N 0.000 description 1
- RYQSYXFGFOTJDJ-RHYQMDGZSA-N Arg-Thr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RYQSYXFGFOTJDJ-RHYQMDGZSA-N 0.000 description 1
- INOIAEUXVVNJKA-XGEHTFHBSA-N Arg-Thr-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O INOIAEUXVVNJKA-XGEHTFHBSA-N 0.000 description 1
- XRLOBFSLPCHYLQ-ULQDDVLXSA-N Arg-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O XRLOBFSLPCHYLQ-ULQDDVLXSA-N 0.000 description 1
- PSUXEQYPYZLNER-QXEWZRGKSA-N Arg-Val-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PSUXEQYPYZLNER-QXEWZRGKSA-N 0.000 description 1
- LLQIAIUAKGNOSE-NHCYSSNCSA-N Arg-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N LLQIAIUAKGNOSE-NHCYSSNCSA-N 0.000 description 1
- WOZDCBHUGJVJPL-AVGNSLFASA-N Arg-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WOZDCBHUGJVJPL-AVGNSLFASA-N 0.000 description 1
- FXGMURPOWCKNAZ-JYJNAYRXSA-N Arg-Val-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FXGMURPOWCKNAZ-JYJNAYRXSA-N 0.000 description 1
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 1
- 241000238421 Arthropoda Species 0.000 description 1
- BRCVLJZIIFBSPF-ZLUOBGJFSA-N Asn-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)N)N BRCVLJZIIFBSPF-ZLUOBGJFSA-N 0.000 description 1
- NXVGBGZQQFDUTM-XVYDVKMFSA-N Asn-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(=O)N)N NXVGBGZQQFDUTM-XVYDVKMFSA-N 0.000 description 1
- CMLGVVWQQHUXOZ-GHCJXIJMSA-N Asn-Ala-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CMLGVVWQQHUXOZ-GHCJXIJMSA-N 0.000 description 1
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 1
- HOIFSHOLNKQCSA-FXQIFTODSA-N Asn-Arg-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O HOIFSHOLNKQCSA-FXQIFTODSA-N 0.000 description 1
- GXMSVVBIAMWMKO-BQBZGAKWSA-N Asn-Arg-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N GXMSVVBIAMWMKO-BQBZGAKWSA-N 0.000 description 1
- LXTGAOAXPSJWOU-DCAQKATOSA-N Asn-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N LXTGAOAXPSJWOU-DCAQKATOSA-N 0.000 description 1
- MFFOYNGMOYFPBD-DCAQKATOSA-N Asn-Arg-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O MFFOYNGMOYFPBD-DCAQKATOSA-N 0.000 description 1
- HUZGPXBILPMCHM-IHRRRGAJSA-N Asn-Arg-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HUZGPXBILPMCHM-IHRRRGAJSA-N 0.000 description 1
- KSBHCUSPLWRVEK-ZLUOBGJFSA-N Asn-Asn-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KSBHCUSPLWRVEK-ZLUOBGJFSA-N 0.000 description 1
- LDSFSKFATNBTBV-UHFFFAOYSA-N Asn-Asn-Gly-His Chemical compound NC(=O)CC(N)C(=O)NC(CC(N)=O)C(=O)NCC(=O)NC(C(O)=O)CC1=CN=CN1 LDSFSKFATNBTBV-UHFFFAOYSA-N 0.000 description 1
- BVLIJXXSXBUGEC-SRVKXCTJSA-N Asn-Asn-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BVLIJXXSXBUGEC-SRVKXCTJSA-N 0.000 description 1
- XVVOVPFMILMHPX-ZLUOBGJFSA-N Asn-Asp-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XVVOVPFMILMHPX-ZLUOBGJFSA-N 0.000 description 1
- CUQUEHYSSFETRD-ACZMJKKPSA-N Asn-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N CUQUEHYSSFETRD-ACZMJKKPSA-N 0.000 description 1
- XSGBIBGAMKTHMY-WHFBIAKZSA-N Asn-Asp-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O XSGBIBGAMKTHMY-WHFBIAKZSA-N 0.000 description 1
- XWFPGQVLOVGSLU-CIUDSAMLSA-N Asn-Gln-Arg Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XWFPGQVLOVGSLU-CIUDSAMLSA-N 0.000 description 1
- AYKKKGFJXIDYLX-ACZMJKKPSA-N Asn-Gln-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O AYKKKGFJXIDYLX-ACZMJKKPSA-N 0.000 description 1
- OKZOABJQOMAYEC-NUMRIWBASA-N Asn-Gln-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OKZOABJQOMAYEC-NUMRIWBASA-N 0.000 description 1
- SRUUBQBAVNQZGJ-LAEOZQHASA-N Asn-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N SRUUBQBAVNQZGJ-LAEOZQHASA-N 0.000 description 1
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 1
- GNKVBRYFXYWXAB-WDSKDSINSA-N Asn-Glu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O GNKVBRYFXYWXAB-WDSKDSINSA-N 0.000 description 1
- JZDZLBJVYWIIQU-AVGNSLFASA-N Asn-Glu-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JZDZLBJVYWIIQU-AVGNSLFASA-N 0.000 description 1
- JQSWHKKUZMTOIH-QWRGUYRKSA-N Asn-Gly-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N JQSWHKKUZMTOIH-QWRGUYRKSA-N 0.000 description 1
- FTCGGKNCJZOPNB-WHFBIAKZSA-N Asn-Gly-Ser Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FTCGGKNCJZOPNB-WHFBIAKZSA-N 0.000 description 1
- UDSVWSUXKYXSTR-QWRGUYRKSA-N Asn-Gly-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UDSVWSUXKYXSTR-QWRGUYRKSA-N 0.000 description 1
- OOWSBIOUKIUWLO-RCOVLWMOSA-N Asn-Gly-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O OOWSBIOUKIUWLO-RCOVLWMOSA-N 0.000 description 1
- OLISTMZJGQUOGS-GMOBBJLQSA-N Asn-Ile-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N OLISTMZJGQUOGS-GMOBBJLQSA-N 0.000 description 1
- YYSYDIYQTUPNQQ-SXTJYALSSA-N Asn-Ile-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YYSYDIYQTUPNQQ-SXTJYALSSA-N 0.000 description 1
- ACKNRKFVYUVWAC-ZPFDUUQYSA-N Asn-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ACKNRKFVYUVWAC-ZPFDUUQYSA-N 0.000 description 1
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 1
- NLRJGXZWTKXRHP-DCAQKATOSA-N Asn-Leu-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NLRJGXZWTKXRHP-DCAQKATOSA-N 0.000 description 1
- HDHZCEDPLTVHFZ-GUBZILKMSA-N Asn-Leu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O HDHZCEDPLTVHFZ-GUBZILKMSA-N 0.000 description 1
- JLNFZLNDHONLND-GARJFASQSA-N Asn-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N JLNFZLNDHONLND-GARJFASQSA-N 0.000 description 1
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 1
- TZFQICWZWFNIKU-KKUMJFAQSA-N Asn-Leu-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 TZFQICWZWFNIKU-KKUMJFAQSA-N 0.000 description 1
- NCFJQJRLQJEECD-NHCYSSNCSA-N Asn-Leu-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O NCFJQJRLQJEECD-NHCYSSNCSA-N 0.000 description 1
- KHCNTVRVAYCPQE-CIUDSAMLSA-N Asn-Lys-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O KHCNTVRVAYCPQE-CIUDSAMLSA-N 0.000 description 1
- LZLCLRQMUQWUHJ-GUBZILKMSA-N Asn-Lys-Gln Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N LZLCLRQMUQWUHJ-GUBZILKMSA-N 0.000 description 1
- ORJQQZIXTOYGGH-SRVKXCTJSA-N Asn-Lys-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ORJQQZIXTOYGGH-SRVKXCTJSA-N 0.000 description 1
- ZYPWIUFLYMQZBS-SRVKXCTJSA-N Asn-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ZYPWIUFLYMQZBS-SRVKXCTJSA-N 0.000 description 1
- KEUNWIXNKVWCFL-FXQIFTODSA-N Asn-Met-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O KEUNWIXNKVWCFL-FXQIFTODSA-N 0.000 description 1
- OROMFUQQTSWUTI-IHRRRGAJSA-N Asn-Phe-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N OROMFUQQTSWUTI-IHRRRGAJSA-N 0.000 description 1
- PPCORQFLAZWUNO-QWRGUYRKSA-N Asn-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N PPCORQFLAZWUNO-QWRGUYRKSA-N 0.000 description 1
- HZZIFFOVHLWGCS-KKUMJFAQSA-N Asn-Phe-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O HZZIFFOVHLWGCS-KKUMJFAQSA-N 0.000 description 1
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 1
- YXVAESUIQFDBHN-SRVKXCTJSA-N Asn-Phe-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O YXVAESUIQFDBHN-SRVKXCTJSA-N 0.000 description 1
- QXOPPIDJKPEKCW-GUBZILKMSA-N Asn-Pro-Arg Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)N)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O QXOPPIDJKPEKCW-GUBZILKMSA-N 0.000 description 1
- SZNGQSBRHFMZLT-IHRRRGAJSA-N Asn-Pro-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SZNGQSBRHFMZLT-IHRRRGAJSA-N 0.000 description 1
- VHQSGALUSWIYOD-QXEWZRGKSA-N Asn-Pro-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O VHQSGALUSWIYOD-QXEWZRGKSA-N 0.000 description 1
- VWADICJNCPFKJS-ZLUOBGJFSA-N Asn-Ser-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O VWADICJNCPFKJS-ZLUOBGJFSA-N 0.000 description 1
- SNYCNNPOFYBCEK-ZLUOBGJFSA-N Asn-Ser-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O SNYCNNPOFYBCEK-ZLUOBGJFSA-N 0.000 description 1
- XHTUGJCAEYOZOR-UBHSHLNASA-N Asn-Ser-Trp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O XHTUGJCAEYOZOR-UBHSHLNASA-N 0.000 description 1
- NCXTYSVDWLAQGZ-ZKWXMUAHSA-N Asn-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O NCXTYSVDWLAQGZ-ZKWXMUAHSA-N 0.000 description 1
- WLVLIYYBPPONRJ-GCJQMDKQSA-N Asn-Thr-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O WLVLIYYBPPONRJ-GCJQMDKQSA-N 0.000 description 1
- JXMREEPBRANWBY-VEVYYDQMSA-N Asn-Thr-Arg Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JXMREEPBRANWBY-VEVYYDQMSA-N 0.000 description 1
- ZUFPUBYQYWCMDB-NUMRIWBASA-N Asn-Thr-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZUFPUBYQYWCMDB-NUMRIWBASA-N 0.000 description 1
- YHXNKGKUDJCAHB-PBCZWWQYSA-N Asn-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O YHXNKGKUDJCAHB-PBCZWWQYSA-N 0.000 description 1
- JBDLMLZNDRLDIX-HJGDQZAQSA-N Asn-Thr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O JBDLMLZNDRLDIX-HJGDQZAQSA-N 0.000 description 1
- AMGQTNHANMRPOE-LKXGYXEUSA-N Asn-Thr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O AMGQTNHANMRPOE-LKXGYXEUSA-N 0.000 description 1
- BCADFFUQHIMQAA-KKHAAJSZSA-N Asn-Thr-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BCADFFUQHIMQAA-KKHAAJSZSA-N 0.000 description 1
- MLJZMGIXXMTEPO-UBHSHLNASA-N Asn-Trp-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O MLJZMGIXXMTEPO-UBHSHLNASA-N 0.000 description 1
- RTFXPCYMDYBZNQ-SRVKXCTJSA-N Asn-Tyr-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O RTFXPCYMDYBZNQ-SRVKXCTJSA-N 0.000 description 1
- DATSKXOXPUAOLK-KKUMJFAQSA-N Asn-Tyr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DATSKXOXPUAOLK-KKUMJFAQSA-N 0.000 description 1
- LRCIOEVFVGXZKB-BZSNNMDCSA-N Asn-Tyr-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LRCIOEVFVGXZKB-BZSNNMDCSA-N 0.000 description 1
- MJIJBEYEHBKTIM-BYULHYEWSA-N Asn-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N MJIJBEYEHBKTIM-BYULHYEWSA-N 0.000 description 1
- JNCRAQVYJZGIOW-QSFUFRPTSA-N Asn-Val-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JNCRAQVYJZGIOW-QSFUFRPTSA-N 0.000 description 1
- LMIWYCWRJVMAIQ-NHCYSSNCSA-N Asn-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N LMIWYCWRJVMAIQ-NHCYSSNCSA-N 0.000 description 1
- KDFQZBWWPYQBEN-ZLUOBGJFSA-N Asp-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N KDFQZBWWPYQBEN-ZLUOBGJFSA-N 0.000 description 1
- PBVLJOIPOGUQQP-CIUDSAMLSA-N Asp-Ala-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O PBVLJOIPOGUQQP-CIUDSAMLSA-N 0.000 description 1
- NECWUSYTYSIFNC-DLOVCJGASA-N Asp-Ala-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 NECWUSYTYSIFNC-DLOVCJGASA-N 0.000 description 1
- RGKKALNPOYURGE-ZKWXMUAHSA-N Asp-Ala-Val Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O RGKKALNPOYURGE-ZKWXMUAHSA-N 0.000 description 1
- IXIWEFWRKIUMQX-DCAQKATOSA-N Asp-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(O)=O IXIWEFWRKIUMQX-DCAQKATOSA-N 0.000 description 1
- XYBJLTKSGFBLCS-QXEWZRGKSA-N Asp-Arg-Val Chemical compound NC(N)=NCCC[C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CC(O)=O XYBJLTKSGFBLCS-QXEWZRGKSA-N 0.000 description 1
- CASGONAXMZPHCK-FXQIFTODSA-N Asp-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)CN=C(N)N CASGONAXMZPHCK-FXQIFTODSA-N 0.000 description 1
- ZELQAFZSJOBEQS-ACZMJKKPSA-N Asp-Asn-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZELQAFZSJOBEQS-ACZMJKKPSA-N 0.000 description 1
- HOQGTAIGQSDCHR-SRVKXCTJSA-N Asp-Asn-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HOQGTAIGQSDCHR-SRVKXCTJSA-N 0.000 description 1
- KNMRXHIAVXHCLW-ZLUOBGJFSA-N Asp-Asn-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)C(=O)O KNMRXHIAVXHCLW-ZLUOBGJFSA-N 0.000 description 1
- BKXPJCBEHWFSTF-ACZMJKKPSA-N Asp-Gln-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O BKXPJCBEHWFSTF-ACZMJKKPSA-N 0.000 description 1
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 1
- KHBLRHKVXICFMY-GUBZILKMSA-N Asp-Glu-Lys Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O KHBLRHKVXICFMY-GUBZILKMSA-N 0.000 description 1
- RRKCPMGSRIDLNC-AVGNSLFASA-N Asp-Glu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RRKCPMGSRIDLNC-AVGNSLFASA-N 0.000 description 1
- VIRHEUMYXXLCBF-WDSKDSINSA-N Asp-Gly-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O VIRHEUMYXXLCBF-WDSKDSINSA-N 0.000 description 1
- TZOZNVLBTAFJRW-UGYAYLCHSA-N Asp-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N TZOZNVLBTAFJRW-UGYAYLCHSA-N 0.000 description 1
- YFSLJHLQOALGSY-ZPFDUUQYSA-N Asp-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N YFSLJHLQOALGSY-ZPFDUUQYSA-N 0.000 description 1
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 1
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 1
- KFAFUJMGHVVYRC-DCAQKATOSA-N Asp-Leu-Met Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O KFAFUJMGHVVYRC-DCAQKATOSA-N 0.000 description 1
- LIVXPXUVXFRWNY-CIUDSAMLSA-N Asp-Lys-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O LIVXPXUVXFRWNY-CIUDSAMLSA-N 0.000 description 1
- XWSIYTYNLKCLJB-CIUDSAMLSA-N Asp-Lys-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O XWSIYTYNLKCLJB-CIUDSAMLSA-N 0.000 description 1
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 1
- AKKUDRZKFZWPBH-SRVKXCTJSA-N Asp-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N AKKUDRZKFZWPBH-SRVKXCTJSA-N 0.000 description 1
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 1
- YWLDTBBUHZJQHW-KKUMJFAQSA-N Asp-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N YWLDTBBUHZJQHW-KKUMJFAQSA-N 0.000 description 1
- NZWDWXSWUQCNMG-GARJFASQSA-N Asp-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N)C(=O)O NZWDWXSWUQCNMG-GARJFASQSA-N 0.000 description 1
- VWWAFGHMPWBKEP-GMOBBJLQSA-N Asp-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(=O)O)N VWWAFGHMPWBKEP-GMOBBJLQSA-N 0.000 description 1
- DJCAHYVLMSRBFR-QXEWZRGKSA-N Asp-Met-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(O)=O DJCAHYVLMSRBFR-QXEWZRGKSA-N 0.000 description 1
- YZQCXOFQZKCETR-UWVGGRQHSA-N Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YZQCXOFQZKCETR-UWVGGRQHSA-N 0.000 description 1
- IDDMGSKZQDEDGA-SRVKXCTJSA-N Asp-Phe-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 IDDMGSKZQDEDGA-SRVKXCTJSA-N 0.000 description 1
- YRZIYQGXTSBRLT-AVGNSLFASA-N Asp-Phe-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YRZIYQGXTSBRLT-AVGNSLFASA-N 0.000 description 1
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 1
- USNJAPJZSGTTPX-XVSYOHENSA-N Asp-Phe-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O USNJAPJZSGTTPX-XVSYOHENSA-N 0.000 description 1
- KOWYNSKRPUWSFG-IHPCNDPISA-N Asp-Phe-Trp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)NC(=O)[C@H](CC(=O)O)N KOWYNSKRPUWSFG-IHPCNDPISA-N 0.000 description 1
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 1
- MVRGBQGZSDJBSM-GMOBBJLQSA-N Asp-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)O)N MVRGBQGZSDJBSM-GMOBBJLQSA-N 0.000 description 1
- YFGUZQQCSDZRBN-DCAQKATOSA-N Asp-Pro-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O YFGUZQQCSDZRBN-DCAQKATOSA-N 0.000 description 1
- FAUPLTGRUBTXNU-FXQIFTODSA-N Asp-Pro-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O FAUPLTGRUBTXNU-FXQIFTODSA-N 0.000 description 1
- QSFHZPQUAAQHAQ-CIUDSAMLSA-N Asp-Ser-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O QSFHZPQUAAQHAQ-CIUDSAMLSA-N 0.000 description 1
- MNQMTYSEKZHIDF-GCJQMDKQSA-N Asp-Thr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O MNQMTYSEKZHIDF-GCJQMDKQSA-N 0.000 description 1
- GWWSUMLEWKQHLR-NUMRIWBASA-N Asp-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O GWWSUMLEWKQHLR-NUMRIWBASA-N 0.000 description 1
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 1
- KNDCWFXCFKSEBM-AVGNSLFASA-N Asp-Tyr-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O KNDCWFXCFKSEBM-AVGNSLFASA-N 0.000 description 1
- AWPWHMVCSISSQK-QWRGUYRKSA-N Asp-Tyr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O AWPWHMVCSISSQK-QWRGUYRKSA-N 0.000 description 1
- ZQFZEBRNAMXXJV-KKUMJFAQSA-N Asp-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O ZQFZEBRNAMXXJV-KKUMJFAQSA-N 0.000 description 1
- BJDHEININLSZOT-KKUMJFAQSA-N Asp-Tyr-Lys Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(O)=O BJDHEININLSZOT-KKUMJFAQSA-N 0.000 description 1
- ALMIMUZAWTUNIO-BZSNNMDCSA-N Asp-Tyr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ALMIMUZAWTUNIO-BZSNNMDCSA-N 0.000 description 1
- BYLPQJAWXJWUCJ-YDHLFZDLSA-N Asp-Tyr-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O BYLPQJAWXJWUCJ-YDHLFZDLSA-N 0.000 description 1
- UXRVDHVARNBOIO-QSFUFRPTSA-N Asp-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(=O)O)N UXRVDHVARNBOIO-QSFUFRPTSA-N 0.000 description 1
- SFJUYBCDQBAYAJ-YDHLFZDLSA-N Asp-Val-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SFJUYBCDQBAYAJ-YDHLFZDLSA-N 0.000 description 1
- JGLWFWXGOINXEA-YDHLFZDLSA-N Asp-Val-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 JGLWFWXGOINXEA-YDHLFZDLSA-N 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 102100037293 Atrial natriuretic peptide-converting enzyme Human genes 0.000 description 1
- 208000004300 Atrophic Gastritis Diseases 0.000 description 1
- 241000271566 Aves Species 0.000 description 1
- 108020000946 Bacterial DNA Proteins 0.000 description 1
- 231100000699 Bacterial toxin Toxicity 0.000 description 1
- 241001288393 Belgica Species 0.000 description 1
- 241000537222 Betabaculovirus Species 0.000 description 1
- BVKZGUZCCUSVTD-UHFFFAOYSA-M Bicarbonate Chemical compound OC([O-])=O BVKZGUZCCUSVTD-UHFFFAOYSA-M 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- 241000589562 Brucella Species 0.000 description 1
- VTYYLEPIZMXCLO-UHFFFAOYSA-L Calcium carbonate Chemical class [Ca+2].[O-]C([O-])=O VTYYLEPIZMXCLO-UHFFFAOYSA-L 0.000 description 1
- 241000701931 Canine parvovirus Species 0.000 description 1
- 241000282472 Canis lupus familiaris Species 0.000 description 1
- BQENDLAVTKRQMS-SBBGFIFASA-L Carbenoxolone sodium Chemical compound [Na+].[Na+].C([C@H]1C2=CC(=O)[C@H]34)[C@@](C)(C([O-])=O)CC[C@]1(C)CC[C@@]2(C)[C@]4(C)CC[C@@H]1[C@]3(C)CC[C@H](OC(=O)CCC([O-])=O)C1(C)C BQENDLAVTKRQMS-SBBGFIFASA-L 0.000 description 1
- 102100035882 Catalase Human genes 0.000 description 1
- 108010053835 Catalase Proteins 0.000 description 1
- 206010061041 Chlamydial infection Diseases 0.000 description 1
- VEXZGXHMUGYJMC-UHFFFAOYSA-M Chloride anion Chemical compound [Cl-] VEXZGXHMUGYJMC-UHFFFAOYSA-M 0.000 description 1
- 241000193163 Clostridioides difficile Species 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- 102100032202 Cornulin Human genes 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- UCMIKRLLIOVDRJ-XKBZYTNZSA-N Cys-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CS)N)O UCMIKRLLIOVDRJ-XKBZYTNZSA-N 0.000 description 1
- SSNJZBGOMNLSLA-CIUDSAMLSA-N Cys-Leu-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O SSNJZBGOMNLSLA-CIUDSAMLSA-N 0.000 description 1
- HKALUUKHYNEDRS-GUBZILKMSA-N Cys-Leu-Gln Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HKALUUKHYNEDRS-GUBZILKMSA-N 0.000 description 1
- MFMDKTLJCUBQIC-MXAVVETBSA-N Cys-Phe-Ile Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MFMDKTLJCUBQIC-MXAVVETBSA-N 0.000 description 1
- YNJBLTDKTMKEET-ZLUOBGJFSA-N Cys-Ser-Ser Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O YNJBLTDKTMKEET-ZLUOBGJFSA-N 0.000 description 1
- DQGIAOGALAQBGK-BWBBJGPYSA-N Cys-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N)O DQGIAOGALAQBGK-BWBBJGPYSA-N 0.000 description 1
- XWTGTTNUCCEFJI-UBHSHLNASA-N Cys-Ser-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N XWTGTTNUCCEFJI-UBHSHLNASA-N 0.000 description 1
- MHYHLWUGWUBUHF-GUBZILKMSA-N Cys-Val-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CS)N MHYHLWUGWUBUHF-GUBZILKMSA-N 0.000 description 1
- KZZYVYWSXMFYEC-DCAQKATOSA-N Cys-Val-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KZZYVYWSXMFYEC-DCAQKATOSA-N 0.000 description 1
- ALTQTAKGRFLRLR-GUBZILKMSA-N Cys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CS)N ALTQTAKGRFLRLR-GUBZILKMSA-N 0.000 description 1
- 238000007399 DNA isolation Methods 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 238000011238 DNA vaccination Methods 0.000 description 1
- 102100036912 Desmin Human genes 0.000 description 1
- 108010044052 Desmin Proteins 0.000 description 1
- 238000012286 ELISA Assay Methods 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 101000686777 Escherichia phage T7 T7 RNA polymerase Proteins 0.000 description 1
- 108010040721 Flagellin Proteins 0.000 description 1
- 208000036495 Gastritis atrophic Diseases 0.000 description 1
- YJIUYQKQBBQYHZ-ACZMJKKPSA-N Gln-Ala-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YJIUYQKQBBQYHZ-ACZMJKKPSA-N 0.000 description 1
- WUAYFMZULZDSLB-ACZMJKKPSA-N Gln-Ala-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O WUAYFMZULZDSLB-ACZMJKKPSA-N 0.000 description 1
- LKUWAWGNJYJODH-KBIXCLLPSA-N Gln-Ala-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LKUWAWGNJYJODH-KBIXCLLPSA-N 0.000 description 1
- YNNXQZDEOCYJJL-CIUDSAMLSA-N Gln-Arg-Asp Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)CN=C(N)N YNNXQZDEOCYJJL-CIUDSAMLSA-N 0.000 description 1
- PRBLYKYHAJEABA-SRVKXCTJSA-N Gln-Arg-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O PRBLYKYHAJEABA-SRVKXCTJSA-N 0.000 description 1
- SOBBAYVQSNXYPQ-ACZMJKKPSA-N Gln-Asn-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SOBBAYVQSNXYPQ-ACZMJKKPSA-N 0.000 description 1
- LLVXTGUTDYMJLY-GUBZILKMSA-N Gln-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N LLVXTGUTDYMJLY-GUBZILKMSA-N 0.000 description 1
- ODBLJLZVLAWVMS-GUBZILKMSA-N Gln-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N ODBLJLZVLAWVMS-GUBZILKMSA-N 0.000 description 1
- PONUFVLSGMQFAI-AVGNSLFASA-N Gln-Asn-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PONUFVLSGMQFAI-AVGNSLFASA-N 0.000 description 1
- GMGKDVVBSVVKCT-NUMRIWBASA-N Gln-Asn-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GMGKDVVBSVVKCT-NUMRIWBASA-N 0.000 description 1
- LMPBBFWHCRURJD-LAEOZQHASA-N Gln-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N LMPBBFWHCRURJD-LAEOZQHASA-N 0.000 description 1
- BTSPOOHJBYJRKO-CIUDSAMLSA-N Gln-Asp-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O BTSPOOHJBYJRKO-CIUDSAMLSA-N 0.000 description 1
- IKDOHQHEFPPGJG-FXQIFTODSA-N Gln-Asp-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O IKDOHQHEFPPGJG-FXQIFTODSA-N 0.000 description 1
- MFLMFRZBAJSGHK-ACZMJKKPSA-N Gln-Cys-Ser Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)O)N MFLMFRZBAJSGHK-ACZMJKKPSA-N 0.000 description 1
- LOJYQMFIIJVETK-WDSKDSINSA-N Gln-Gln Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LOJYQMFIIJVETK-WDSKDSINSA-N 0.000 description 1
- NVEASDQHBRZPSU-BQBZGAKWSA-N Gln-Gln-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O NVEASDQHBRZPSU-BQBZGAKWSA-N 0.000 description 1
- AJDMYLOISOCHHC-YVNDNENWSA-N Gln-Gln-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O AJDMYLOISOCHHC-YVNDNENWSA-N 0.000 description 1
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 1
- PXAFHUATEHLECW-GUBZILKMSA-N Gln-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N PXAFHUATEHLECW-GUBZILKMSA-N 0.000 description 1
- VOLVNCMGXWDDQY-LPEHRKFASA-N Gln-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)C(=O)O VOLVNCMGXWDDQY-LPEHRKFASA-N 0.000 description 1
- XJKAKYXMFHUIHT-AUTRQRHGSA-N Gln-Glu-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N XJKAKYXMFHUIHT-AUTRQRHGSA-N 0.000 description 1
- FGYPOQPQTUNESW-IUCAKERBSA-N Gln-Gly-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N FGYPOQPQTUNESW-IUCAKERBSA-N 0.000 description 1
- VGTDBGYFVWOQTI-RYUDHWBXSA-N Gln-Gly-Phe Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VGTDBGYFVWOQTI-RYUDHWBXSA-N 0.000 description 1
- ORYMMTRPKVTGSJ-XVKPBYJWSA-N Gln-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O ORYMMTRPKVTGSJ-XVKPBYJWSA-N 0.000 description 1
- IWUFOVSLWADEJC-AVGNSLFASA-N Gln-His-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O IWUFOVSLWADEJC-AVGNSLFASA-N 0.000 description 1
- JXBZEDIQFFCHPZ-PEFMBERDSA-N Gln-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JXBZEDIQFFCHPZ-PEFMBERDSA-N 0.000 description 1
- KKCJHBXMYYVWMX-KQXIARHKSA-N Gln-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N KKCJHBXMYYVWMX-KQXIARHKSA-N 0.000 description 1
- VZRAXPGTUNDIDK-GUBZILKMSA-N Gln-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N VZRAXPGTUNDIDK-GUBZILKMSA-N 0.000 description 1
- CAXXTYYGFYTBPV-IUCAKERBSA-N Gln-Leu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CAXXTYYGFYTBPV-IUCAKERBSA-N 0.000 description 1
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 1
- CLSDNFWKGFJIBZ-YUMQZZPRSA-N Gln-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(N)=O CLSDNFWKGFJIBZ-YUMQZZPRSA-N 0.000 description 1
- TWIAMTNJOMRDAK-GUBZILKMSA-N Gln-Lys-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O TWIAMTNJOMRDAK-GUBZILKMSA-N 0.000 description 1
- ZEEPYMXTJWIMSN-GUBZILKMSA-N Gln-Lys-Ser Chemical compound NCCCC[C@@H](C(=O)N[C@@H](CO)C(O)=O)NC(=O)[C@@H](N)CCC(N)=O ZEEPYMXTJWIMSN-GUBZILKMSA-N 0.000 description 1
- DQLVHRFFBQOWFL-JYJNAYRXSA-N Gln-Lys-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)N)N)O DQLVHRFFBQOWFL-JYJNAYRXSA-N 0.000 description 1
- QMVCEWKHIUHTSD-GUBZILKMSA-N Gln-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N QMVCEWKHIUHTSD-GUBZILKMSA-N 0.000 description 1
- BJPPYOMRAVLXBY-YUMQZZPRSA-N Gln-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N BJPPYOMRAVLXBY-YUMQZZPRSA-N 0.000 description 1
- CULXMOZETKLBDI-XIRDDKMYSA-N Gln-Met-Trp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCC(=O)N)N CULXMOZETKLBDI-XIRDDKMYSA-N 0.000 description 1
- RWCBJYUPAUTWJD-NHCYSSNCSA-N Gln-Met-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O RWCBJYUPAUTWJD-NHCYSSNCSA-N 0.000 description 1
- UESYBOXFJWJVSB-AVGNSLFASA-N Gln-Phe-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O UESYBOXFJWJVSB-AVGNSLFASA-N 0.000 description 1
- SXFPZRRVWSUYII-KBIXCLLPSA-N Gln-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N SXFPZRRVWSUYII-KBIXCLLPSA-N 0.000 description 1
- UXXIVIQGOODKQC-NUMRIWBASA-N Gln-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O UXXIVIQGOODKQC-NUMRIWBASA-N 0.000 description 1
- VOUSELYGTNGEPB-NUMRIWBASA-N Gln-Thr-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O VOUSELYGTNGEPB-NUMRIWBASA-N 0.000 description 1
- NHMRJKKAVMENKJ-WDCWCFNPSA-N Gln-Thr-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NHMRJKKAVMENKJ-WDCWCFNPSA-N 0.000 description 1
- OUBUHIODTNUUTC-WDCWCFNPSA-N Gln-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O OUBUHIODTNUUTC-WDCWCFNPSA-N 0.000 description 1
- RONJIBWTGKVKFY-HTUGSXCWSA-N Gln-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O RONJIBWTGKVKFY-HTUGSXCWSA-N 0.000 description 1
- VLOLPWWCNKWRNB-LOKLDPHHSA-N Gln-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O VLOLPWWCNKWRNB-LOKLDPHHSA-N 0.000 description 1
- GTBXHETZPUURJE-KKUMJFAQSA-N Gln-Tyr-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GTBXHETZPUURJE-KKUMJFAQSA-N 0.000 description 1
- JKDBRTNMYXYLHO-JYJNAYRXSA-N Gln-Tyr-Leu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=C(O)C=C1 JKDBRTNMYXYLHO-JYJNAYRXSA-N 0.000 description 1
- ICRKQMRFXYDYMK-LAEOZQHASA-N Gln-Val-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ICRKQMRFXYDYMK-LAEOZQHASA-N 0.000 description 1
- QGWXAMDECCKGRU-XVKPBYJWSA-N Gln-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(N)=O)C(=O)NCC(O)=O QGWXAMDECCKGRU-XVKPBYJWSA-N 0.000 description 1
- LKDIBBOKUAASNP-FXQIFTODSA-N Glu-Ala-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LKDIBBOKUAASNP-FXQIFTODSA-N 0.000 description 1
- UTKUTMJSWKKHEM-WDSKDSINSA-N Glu-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O UTKUTMJSWKKHEM-WDSKDSINSA-N 0.000 description 1
- RLZBLVSJDFHDBL-KBIXCLLPSA-N Glu-Ala-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RLZBLVSJDFHDBL-KBIXCLLPSA-N 0.000 description 1
- MXOODARRORARSU-ACZMJKKPSA-N Glu-Ala-Ser Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)O)N MXOODARRORARSU-ACZMJKKPSA-N 0.000 description 1
- WOMUDRVDJMHTCV-DCAQKATOSA-N Glu-Arg-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WOMUDRVDJMHTCV-DCAQKATOSA-N 0.000 description 1
- RCCDHXSRMWCOOY-GUBZILKMSA-N Glu-Arg-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O RCCDHXSRMWCOOY-GUBZILKMSA-N 0.000 description 1
- CGYDXNKRIMJMLV-GUBZILKMSA-N Glu-Arg-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O CGYDXNKRIMJMLV-GUBZILKMSA-N 0.000 description 1
- VTTSANCGJWLPNC-ZPFDUUQYSA-N Glu-Arg-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VTTSANCGJWLPNC-ZPFDUUQYSA-N 0.000 description 1
- KEBACWCLVOXFNC-DCAQKATOSA-N Glu-Arg-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O KEBACWCLVOXFNC-DCAQKATOSA-N 0.000 description 1
- AKJRHDMTEJXTPV-ACZMJKKPSA-N Glu-Asn-Ala Chemical compound C[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O AKJRHDMTEJXTPV-ACZMJKKPSA-N 0.000 description 1
- GLWXKFRTOHKGIT-ACZMJKKPSA-N Glu-Asn-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O GLWXKFRTOHKGIT-ACZMJKKPSA-N 0.000 description 1
- MLCPTRRNICEKIS-FXQIFTODSA-N Glu-Asn-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MLCPTRRNICEKIS-FXQIFTODSA-N 0.000 description 1
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 1
- ZOXBSICWUDAOHX-GUBZILKMSA-N Glu-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O ZOXBSICWUDAOHX-GUBZILKMSA-N 0.000 description 1
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 1
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 1
- QPRZKNOOOBWXSU-CIUDSAMLSA-N Glu-Asp-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N QPRZKNOOOBWXSU-CIUDSAMLSA-N 0.000 description 1
- NADWTMLCUDMDQI-ACZMJKKPSA-N Glu-Asp-Cys Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N NADWTMLCUDMDQI-ACZMJKKPSA-N 0.000 description 1
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 1
- ISXJHXGYMJKXOI-GUBZILKMSA-N Glu-Cys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CCC(O)=O ISXJHXGYMJKXOI-GUBZILKMSA-N 0.000 description 1
- LVCHEMOPBORRLB-DCAQKATOSA-N Glu-Gln-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O LVCHEMOPBORRLB-DCAQKATOSA-N 0.000 description 1
- HTTSBEBKVNEDFE-AUTRQRHGSA-N Glu-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)O)N HTTSBEBKVNEDFE-AUTRQRHGSA-N 0.000 description 1
- NKLRYVLERDYDBI-FXQIFTODSA-N Glu-Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKLRYVLERDYDBI-FXQIFTODSA-N 0.000 description 1
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 1
- AUTNXSQEVVHSJK-YVNDNENWSA-N Glu-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O AUTNXSQEVVHSJK-YVNDNENWSA-N 0.000 description 1
- PHONAZGUEGIOEM-GLLZPBPUSA-N Glu-Glu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PHONAZGUEGIOEM-GLLZPBPUSA-N 0.000 description 1
- QJCKNLPMTPXXEM-AUTRQRHGSA-N Glu-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O QJCKNLPMTPXXEM-AUTRQRHGSA-N 0.000 description 1
- UHVIQGKBMXEVGN-WDSKDSINSA-N Glu-Gly-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UHVIQGKBMXEVGN-WDSKDSINSA-N 0.000 description 1
- OGNJZUXUTPQVBR-BQBZGAKWSA-N Glu-Gly-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OGNJZUXUTPQVBR-BQBZGAKWSA-N 0.000 description 1
- CUXJIASLBRJOFV-LAEOZQHASA-N Glu-Gly-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUXJIASLBRJOFV-LAEOZQHASA-N 0.000 description 1
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 1
- ZWQVYZXPYSYPJD-RYUDHWBXSA-N Glu-Gly-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZWQVYZXPYSYPJD-RYUDHWBXSA-N 0.000 description 1
- HILMIYALTUQTRC-XVKPBYJWSA-N Glu-Gly-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HILMIYALTUQTRC-XVKPBYJWSA-N 0.000 description 1
- VXQOONWNIWFOCS-HGNGGELXSA-N Glu-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N VXQOONWNIWFOCS-HGNGGELXSA-N 0.000 description 1
- DVLZZEPUNFEUBW-AVGNSLFASA-N Glu-His-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N DVLZZEPUNFEUBW-AVGNSLFASA-N 0.000 description 1
- LGYCLOCORAEQSZ-PEFMBERDSA-N Glu-Ile-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O LGYCLOCORAEQSZ-PEFMBERDSA-N 0.000 description 1
- ZCOJVESMNGBGLF-GRLWGSQLSA-N Glu-Ile-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZCOJVESMNGBGLF-GRLWGSQLSA-N 0.000 description 1
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 1
- VGBSZQSKQRMLHD-MNXVOIDGSA-N Glu-Leu-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VGBSZQSKQRMLHD-MNXVOIDGSA-N 0.000 description 1
- UGSVSNXPJJDJKL-SDDRHHMPSA-N Glu-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N UGSVSNXPJJDJKL-SDDRHHMPSA-N 0.000 description 1
- FBEJIDRSQCGFJI-GUBZILKMSA-N Glu-Leu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FBEJIDRSQCGFJI-GUBZILKMSA-N 0.000 description 1
- SJJHXJDSNQJMMW-SRVKXCTJSA-N Glu-Lys-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O SJJHXJDSNQJMMW-SRVKXCTJSA-N 0.000 description 1
- CUPSDFQZTVVTSK-GUBZILKMSA-N Glu-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O CUPSDFQZTVVTSK-GUBZILKMSA-N 0.000 description 1
- UJMNFCAHLYKWOZ-DCAQKATOSA-N Glu-Lys-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UJMNFCAHLYKWOZ-DCAQKATOSA-N 0.000 description 1
- OCJRHJZKGGSPRW-IUCAKERBSA-N Glu-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O OCJRHJZKGGSPRW-IUCAKERBSA-N 0.000 description 1
- FMBWLLMUPXTXFC-SDDRHHMPSA-N Glu-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N)C(=O)O FMBWLLMUPXTXFC-SDDRHHMPSA-N 0.000 description 1
- RBXSZQRSEGYDFG-GUBZILKMSA-N Glu-Lys-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O RBXSZQRSEGYDFG-GUBZILKMSA-N 0.000 description 1
- AQNYKMCFCCZEEL-JYJNAYRXSA-N Glu-Lys-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AQNYKMCFCCZEEL-JYJNAYRXSA-N 0.000 description 1
- XNOWYPDMSLSRKP-GUBZILKMSA-N Glu-Met-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(O)=O XNOWYPDMSLSRKP-GUBZILKMSA-N 0.000 description 1
- JHSRJMUJOGLIHK-GUBZILKMSA-N Glu-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N JHSRJMUJOGLIHK-GUBZILKMSA-N 0.000 description 1
- QMOSCLNJVKSHHU-YUMQZZPRSA-N Glu-Met-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O QMOSCLNJVKSHHU-YUMQZZPRSA-N 0.000 description 1
- LKOAAMXDJGEYMS-ZPFDUUQYSA-N Glu-Met-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LKOAAMXDJGEYMS-ZPFDUUQYSA-N 0.000 description 1
- SOEPMWQCTJITPZ-SRVKXCTJSA-N Glu-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N SOEPMWQCTJITPZ-SRVKXCTJSA-N 0.000 description 1
- QNJNPKSWAHPYGI-JYJNAYRXSA-N Glu-Phe-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 QNJNPKSWAHPYGI-JYJNAYRXSA-N 0.000 description 1
- JYXKPJVDCAWMDG-ZPFDUUQYSA-N Glu-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)O)N JYXKPJVDCAWMDG-ZPFDUUQYSA-N 0.000 description 1
- SYWCGQOIIARSIX-SRVKXCTJSA-N Glu-Pro-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O SYWCGQOIIARSIX-SRVKXCTJSA-N 0.000 description 1
- ALMBZBOCGSVSAI-ACZMJKKPSA-N Glu-Ser-Asn Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N ALMBZBOCGSVSAI-ACZMJKKPSA-N 0.000 description 1
- GTFYQOVVVJASOA-ACZMJKKPSA-N Glu-Ser-Cys Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N GTFYQOVVVJASOA-ACZMJKKPSA-N 0.000 description 1
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 1
- QOXDAWODGSIDDI-GUBZILKMSA-N Glu-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N QOXDAWODGSIDDI-GUBZILKMSA-N 0.000 description 1
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 1
- WXONSNSSBYQGNN-AVGNSLFASA-N Glu-Ser-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WXONSNSSBYQGNN-AVGNSLFASA-N 0.000 description 1
- DDXZHOHEABQXSE-NKIYYHGXSA-N Glu-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O DDXZHOHEABQXSE-NKIYYHGXSA-N 0.000 description 1
- DTLLNDVORUEOTM-WDCWCFNPSA-N Glu-Thr-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DTLLNDVORUEOTM-WDCWCFNPSA-N 0.000 description 1
- CQGBSALYGOXQPE-HTUGSXCWSA-N Glu-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O CQGBSALYGOXQPE-HTUGSXCWSA-N 0.000 description 1
- CAQXJMUDOLSBPF-SUSMZKCASA-N Glu-Thr-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAQXJMUDOLSBPF-SUSMZKCASA-N 0.000 description 1
- HVKAAUOFFTUSAA-XDTLVQLUSA-N Glu-Tyr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O HVKAAUOFFTUSAA-XDTLVQLUSA-N 0.000 description 1
- HHSKZJZWQFPSKN-AVGNSLFASA-N Glu-Tyr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O HHSKZJZWQFPSKN-AVGNSLFASA-N 0.000 description 1
- VXEFAWJTFAUDJK-AVGNSLFASA-N Glu-Tyr-Ser Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O VXEFAWJTFAUDJK-AVGNSLFASA-N 0.000 description 1
- HQTDNEZTGZUWSY-XVKPBYJWSA-N Glu-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)NCC(O)=O HQTDNEZTGZUWSY-XVKPBYJWSA-N 0.000 description 1
- FVGOGEGGQLNZGH-DZKIICNBSA-N Glu-Val-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FVGOGEGGQLNZGH-DZKIICNBSA-N 0.000 description 1
- QXUPRMQJDWJDFR-NRPADANISA-N Glu-Val-Ser Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXUPRMQJDWJDFR-NRPADANISA-N 0.000 description 1
- 108010070675 Glutathione transferase Proteins 0.000 description 1
- YMUFWNJHVPQNQD-ZKWXMUAHSA-N Gly-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN YMUFWNJHVPQNQD-ZKWXMUAHSA-N 0.000 description 1
- JBRBACJPBZNFMF-YUMQZZPRSA-N Gly-Ala-Lys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN JBRBACJPBZNFMF-YUMQZZPRSA-N 0.000 description 1
- JXYMPBCYRKWJEE-BQBZGAKWSA-N Gly-Arg-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JXYMPBCYRKWJEE-BQBZGAKWSA-N 0.000 description 1
- XUDLUKYPXQDCRX-BQBZGAKWSA-N Gly-Arg-Asn Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O XUDLUKYPXQDCRX-BQBZGAKWSA-N 0.000 description 1
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 1
- WKJKBELXHCTHIJ-WPRPVWTQSA-N Gly-Arg-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N WKJKBELXHCTHIJ-WPRPVWTQSA-N 0.000 description 1
- UXJHNZODTMHWRD-WHFBIAKZSA-N Gly-Asn-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O UXJHNZODTMHWRD-WHFBIAKZSA-N 0.000 description 1
- CIMULJZTTOBOPN-WHFBIAKZSA-N Gly-Asn-Asn Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CIMULJZTTOBOPN-WHFBIAKZSA-N 0.000 description 1
- DJTXYXZNNDDEOU-WHFBIAKZSA-N Gly-Asn-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)C(=O)N DJTXYXZNNDDEOU-WHFBIAKZSA-N 0.000 description 1
- WJZLEENECIOOSA-WDSKDSINSA-N Gly-Asn-Gln Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)O WJZLEENECIOOSA-WDSKDSINSA-N 0.000 description 1
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 1
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 1
- OCDLPQDYTJPWNG-YUMQZZPRSA-N Gly-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN OCDLPQDYTJPWNG-YUMQZZPRSA-N 0.000 description 1
- IWAXHBCACVWNHT-BQBZGAKWSA-N Gly-Asp-Arg Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IWAXHBCACVWNHT-BQBZGAKWSA-N 0.000 description 1
- QSTLUOIOYLYLLF-WDSKDSINSA-N Gly-Asp-Glu Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QSTLUOIOYLYLLF-WDSKDSINSA-N 0.000 description 1
- TZOVVRJYUDETQG-RCOVLWMOSA-N Gly-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN TZOVVRJYUDETQG-RCOVLWMOSA-N 0.000 description 1
- BYYNJRSNDARRBX-YFKPBYRVSA-N Gly-Gln-Gly Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O BYYNJRSNDARRBX-YFKPBYRVSA-N 0.000 description 1
- LXXANCRPFBSSKS-IUCAKERBSA-N Gly-Gln-Leu Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LXXANCRPFBSSKS-IUCAKERBSA-N 0.000 description 1
- QPDUVFSVVAOUHE-XVKPBYJWSA-N Gly-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)CN)C(O)=O QPDUVFSVVAOUHE-XVKPBYJWSA-N 0.000 description 1
- SOEATRRYCIPEHA-BQBZGAKWSA-N Gly-Glu-Glu Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SOEATRRYCIPEHA-BQBZGAKWSA-N 0.000 description 1
- XTQFHTHIAKKCTM-YFKPBYRVSA-N Gly-Glu-Gly Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O XTQFHTHIAKKCTM-YFKPBYRVSA-N 0.000 description 1
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 1
- BEQGFMIBZFNROK-JGVFFNPUSA-N Gly-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)CN)C(=O)O BEQGFMIBZFNROK-JGVFFNPUSA-N 0.000 description 1
- GDOZQTNZPCUARW-YFKPBYRVSA-N Gly-Gly-Glu Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O GDOZQTNZPCUARW-YFKPBYRVSA-N 0.000 description 1
- HPAIKDPJURGQLN-KBPBESRZSA-N Gly-His-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CNC=N1 HPAIKDPJURGQLN-KBPBESRZSA-N 0.000 description 1
- LUJVWKKYHSLULQ-ZKWXMUAHSA-N Gly-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN LUJVWKKYHSLULQ-ZKWXMUAHSA-N 0.000 description 1
- VIIBEIQMLJEUJG-LAEOZQHASA-N Gly-Ile-Gln Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O VIIBEIQMLJEUJG-LAEOZQHASA-N 0.000 description 1
- HKSNHPVETYYJBK-LAEOZQHASA-N Gly-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)CN HKSNHPVETYYJBK-LAEOZQHASA-N 0.000 description 1
- UYPPAMNTTMJHJW-KCTSRDHCSA-N Gly-Ile-Trp Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O UYPPAMNTTMJHJW-KCTSRDHCSA-N 0.000 description 1
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 1
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 1
- YIFUFYZELCMPJP-YUMQZZPRSA-N Gly-Leu-Cys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(O)=O YIFUFYZELCMPJP-YUMQZZPRSA-N 0.000 description 1
- UHPAZODVFFYEEL-QWRGUYRKSA-N Gly-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN UHPAZODVFFYEEL-QWRGUYRKSA-N 0.000 description 1
- CLNSYANKYVMZNM-UWVGGRQHSA-N Gly-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N CLNSYANKYVMZNM-UWVGGRQHSA-N 0.000 description 1
- PDUHNKAFQXQNLH-ZETCQYMHSA-N Gly-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)NCC(O)=O PDUHNKAFQXQNLH-ZETCQYMHSA-N 0.000 description 1
- MHZXESQPPXOING-KBPBESRZSA-N Gly-Lys-Phe Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MHZXESQPPXOING-KBPBESRZSA-N 0.000 description 1
- WDEHMRNSGHVNOH-VHSXEESVSA-N Gly-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)CN)C(=O)O WDEHMRNSGHVNOH-VHSXEESVSA-N 0.000 description 1
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 1
- LPHQAFLNEHWKFF-QXEWZRGKSA-N Gly-Met-Ile Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LPHQAFLNEHWKFF-QXEWZRGKSA-N 0.000 description 1
- OCPPBNKYGYSLOE-IUCAKERBSA-N Gly-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN OCPPBNKYGYSLOE-IUCAKERBSA-N 0.000 description 1
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 1
- YOBGUCWZPXJHTN-BQBZGAKWSA-N Gly-Ser-Arg Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YOBGUCWZPXJHTN-BQBZGAKWSA-N 0.000 description 1
- FGPLUIQCSKGLTI-WDSKDSINSA-N Gly-Ser-Glu Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O FGPLUIQCSKGLTI-WDSKDSINSA-N 0.000 description 1
- MKIAPEZXQDILRR-YUMQZZPRSA-N Gly-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)CN MKIAPEZXQDILRR-YUMQZZPRSA-N 0.000 description 1
- JSLVAHYTAJJEQH-QWRGUYRKSA-N Gly-Ser-Phe Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JSLVAHYTAJJEQH-QWRGUYRKSA-N 0.000 description 1
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 1
- FKYQEVBRZSFAMJ-QWRGUYRKSA-N Gly-Ser-Tyr Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FKYQEVBRZSFAMJ-QWRGUYRKSA-N 0.000 description 1
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 1
- GNNJKUYDWFIBTK-QWRGUYRKSA-N Gly-Tyr-Asp Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O GNNJKUYDWFIBTK-QWRGUYRKSA-N 0.000 description 1
- NWOSHVVPKDQKKT-RYUDHWBXSA-N Gly-Tyr-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O NWOSHVVPKDQKKT-RYUDHWBXSA-N 0.000 description 1
- DNAZKGFYFRGZIH-QWRGUYRKSA-N Gly-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 DNAZKGFYFRGZIH-QWRGUYRKSA-N 0.000 description 1
- GBYYQVBXFVDJPJ-WLTAIBSBSA-N Gly-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)CN)O GBYYQVBXFVDJPJ-WLTAIBSBSA-N 0.000 description 1
- SYOJVRNQCXYEOV-XVKPBYJWSA-N Gly-Val-Glu Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SYOJVRNQCXYEOV-XVKPBYJWSA-N 0.000 description 1
- BNMRSWQOHIQTFL-JSGCOSHPSA-N Gly-Val-Phe Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 BNMRSWQOHIQTFL-JSGCOSHPSA-N 0.000 description 1
- JZNWSCPGTDBMEW-UHFFFAOYSA-N Glycerophosphorylethanolamin Natural products NCCOP(O)(=O)OCC(O)CO JZNWSCPGTDBMEW-UHFFFAOYSA-N 0.000 description 1
- 241001175058 Helicobacter pylori P12 Species 0.000 description 1
- 208000028861 Helicobacter pylori infectious disease Diseases 0.000 description 1
- 102100029100 Hematopoietic prostaglandin D synthase Human genes 0.000 description 1
- DCRODRAURLJOFY-XPUUQOCRSA-N His-Ala-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)NCC(O)=O DCRODRAURLJOFY-XPUUQOCRSA-N 0.000 description 1
- VCDNHBNNPCDBKV-DLOVCJGASA-N His-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N VCDNHBNNPCDBKV-DLOVCJGASA-N 0.000 description 1
- ZNPRMNDAFQKATM-LKTVYLICSA-N His-Ala-Tyr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZNPRMNDAFQKATM-LKTVYLICSA-N 0.000 description 1
- NOQPTNXSGNPJNS-YUMQZZPRSA-N His-Asn-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O NOQPTNXSGNPJNS-YUMQZZPRSA-N 0.000 description 1
- WZOGEMJIZBNFBK-CIUDSAMLSA-N His-Asp-Asn Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O WZOGEMJIZBNFBK-CIUDSAMLSA-N 0.000 description 1
- FDQYIRHBVVUTJF-ZETCQYMHSA-N His-Gly-Gly Chemical compound [O-]C(=O)CNC(=O)CNC(=O)[C@@H]([NH3+])CC1=CN=CN1 FDQYIRHBVVUTJF-ZETCQYMHSA-N 0.000 description 1
- FYTCLUIYTYFGPT-YUMQZZPRSA-N His-Gly-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FYTCLUIYTYFGPT-YUMQZZPRSA-N 0.000 description 1
- SYIPVNMWBZXKMU-HJPIBITLSA-N His-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CN=CN2)N SYIPVNMWBZXKMU-HJPIBITLSA-N 0.000 description 1
- MPXGJGBXCRQQJE-MXAVVETBSA-N His-Ile-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O MPXGJGBXCRQQJE-MXAVVETBSA-N 0.000 description 1
- QMUHTRISZMFKAY-MXAVVETBSA-N His-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N QMUHTRISZMFKAY-MXAVVETBSA-N 0.000 description 1
- SKYULSWNBYAQMG-IHRRRGAJSA-N His-Leu-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SKYULSWNBYAQMG-IHRRRGAJSA-N 0.000 description 1
- OQDLKDUVMTUPPG-AVGNSLFASA-N His-Leu-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OQDLKDUVMTUPPG-AVGNSLFASA-N 0.000 description 1
- GJMHMDKCJPQJOI-IHRRRGAJSA-N His-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CN=CN1 GJMHMDKCJPQJOI-IHRRRGAJSA-N 0.000 description 1
- FBCURAVMSXNOLP-JYJNAYRXSA-N His-Phe-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N FBCURAVMSXNOLP-JYJNAYRXSA-N 0.000 description 1
- SVVULKPWDBIPCO-BZSNNMDCSA-N His-Phe-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O SVVULKPWDBIPCO-BZSNNMDCSA-N 0.000 description 1
- SGLXGEDPYJPGIQ-ACRUOGEOSA-N His-Phe-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N SGLXGEDPYJPGIQ-ACRUOGEOSA-N 0.000 description 1
- ULRFSEJGSHYLQI-YESZJQIVSA-N His-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC3=CN=CN3)N)C(=O)O ULRFSEJGSHYLQI-YESZJQIVSA-N 0.000 description 1
- PBVQWNDMFFCPIZ-ULQDDVLXSA-N His-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 PBVQWNDMFFCPIZ-ULQDDVLXSA-N 0.000 description 1
- UPJODPVSKKWGDQ-KLHWPWHYSA-N His-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)O UPJODPVSKKWGDQ-KLHWPWHYSA-N 0.000 description 1
- RNVUQLOKVIPNEM-BZSNNMDCSA-N His-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)O RNVUQLOKVIPNEM-BZSNNMDCSA-N 0.000 description 1
- XGBVLRJLHUVCNK-DCAQKATOSA-N His-Val-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O XGBVLRJLHUVCNK-DCAQKATOSA-N 0.000 description 1
- 101710169678 Histidine-rich protein Proteins 0.000 description 1
- 108010025076 Holoenzymes Proteins 0.000 description 1
- 101000959247 Homo sapiens Actin, alpha cardiac muscle 1 Proteins 0.000 description 1
- 101000952934 Homo sapiens Atrial natriuretic peptide-converting enzyme Proteins 0.000 description 1
- 101000920981 Homo sapiens Cornulin Proteins 0.000 description 1
- 241000257303 Hymenoptera Species 0.000 description 1
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 1
- DPTBVFUDCPINIP-JURCDPSOSA-N Ile-Ala-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DPTBVFUDCPINIP-JURCDPSOSA-N 0.000 description 1
- HERITAGIPLEJMT-GVARAGBVSA-N Ile-Ala-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HERITAGIPLEJMT-GVARAGBVSA-N 0.000 description 1
- MKWSZEHGHSLNPF-NAKRPEOUSA-N Ile-Ala-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O)N MKWSZEHGHSLNPF-NAKRPEOUSA-N 0.000 description 1
- TZCGZYWNIDZZMR-NAKRPEOUSA-N Ile-Arg-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](C)C(=O)O)N TZCGZYWNIDZZMR-NAKRPEOUSA-N 0.000 description 1
- HLYBGMZJVDHJEO-CYDGBPFRSA-N Ile-Arg-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HLYBGMZJVDHJEO-CYDGBPFRSA-N 0.000 description 1
- QLRMMMQNCWBNPQ-QXEWZRGKSA-N Ile-Arg-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)O)N QLRMMMQNCWBNPQ-QXEWZRGKSA-N 0.000 description 1
- ATXGFMOBVKSOMK-PEDHHIEDSA-N Ile-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N ATXGFMOBVKSOMK-PEDHHIEDSA-N 0.000 description 1
- AZEYWPUCOYXFOE-CYDGBPFRSA-N Ile-Arg-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](C(C)C)C(=O)O)N AZEYWPUCOYXFOE-CYDGBPFRSA-N 0.000 description 1
- YKRIXHPEIZUDDY-GMOBBJLQSA-N Ile-Asn-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKRIXHPEIZUDDY-GMOBBJLQSA-N 0.000 description 1
- XENGULNPUDGALZ-ZPFDUUQYSA-N Ile-Asn-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N XENGULNPUDGALZ-ZPFDUUQYSA-N 0.000 description 1
- FJWYJQRCVNGEAQ-ZPFDUUQYSA-N Ile-Asn-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N FJWYJQRCVNGEAQ-ZPFDUUQYSA-N 0.000 description 1
- HDODQNPMSHDXJT-GHCJXIJMSA-N Ile-Asn-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O HDODQNPMSHDXJT-GHCJXIJMSA-N 0.000 description 1
- IDAHFEPYTJJZFD-PEFMBERDSA-N Ile-Asp-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N IDAHFEPYTJJZFD-PEFMBERDSA-N 0.000 description 1
- NPROWIBAWYMPAZ-GUDRVLHUSA-N Ile-Asp-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N NPROWIBAWYMPAZ-GUDRVLHUSA-N 0.000 description 1
- PFTFEWHJSAXGED-ZKWXMUAHSA-N Ile-Cys-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)NCC(=O)O)N PFTFEWHJSAXGED-ZKWXMUAHSA-N 0.000 description 1
- GECLQMBTZCPAFY-PEFMBERDSA-N Ile-Gln-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GECLQMBTZCPAFY-PEFMBERDSA-N 0.000 description 1
- DMZOUKXXHJQPTL-GRLWGSQLSA-N Ile-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N DMZOUKXXHJQPTL-GRLWGSQLSA-N 0.000 description 1
- WNQKUUQIVDDAFA-ZPFDUUQYSA-N Ile-Gln-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCSC)C(=O)O)N WNQKUUQIVDDAFA-ZPFDUUQYSA-N 0.000 description 1
- JDAWAWXGAUZPNJ-ZPFDUUQYSA-N Ile-Glu-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N JDAWAWXGAUZPNJ-ZPFDUUQYSA-N 0.000 description 1
- TVSPLSZTKTUYLV-ZPFDUUQYSA-N Ile-Glu-Met Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O TVSPLSZTKTUYLV-ZPFDUUQYSA-N 0.000 description 1
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 1
- NZOCIWKZUVUNDW-ZKWXMUAHSA-N Ile-Gly-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O NZOCIWKZUVUNDW-ZKWXMUAHSA-N 0.000 description 1
- SLQVFYWBGNNOTK-BYULHYEWSA-N Ile-Gly-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N SLQVFYWBGNNOTK-BYULHYEWSA-N 0.000 description 1
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 1
- PDTMWFVVNZYWTR-NHCYSSNCSA-N Ile-Gly-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O PDTMWFVVNZYWTR-NHCYSSNCSA-N 0.000 description 1
- DFFTXLCCDFYRKD-MBLNEYKQSA-N Ile-Gly-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N DFFTXLCCDFYRKD-MBLNEYKQSA-N 0.000 description 1
- JLWLMGADIQFKRD-QSFUFRPTSA-N Ile-His-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CN=CN1 JLWLMGADIQFKRD-QSFUFRPTSA-N 0.000 description 1
- YBGTWSFIGHUWQE-MXAVVETBSA-N Ile-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CN=CN1 YBGTWSFIGHUWQE-MXAVVETBSA-N 0.000 description 1
- URWXDJAEEGBADB-TUBUOCAGSA-N Ile-His-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N URWXDJAEEGBADB-TUBUOCAGSA-N 0.000 description 1
- KYLIZSDYWQQTFM-PEDHHIEDSA-N Ile-Ile-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N KYLIZSDYWQQTFM-PEDHHIEDSA-N 0.000 description 1
- PFPUFNLHBXKPHY-HTFCKZLJSA-N Ile-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)O)N PFPUFNLHBXKPHY-HTFCKZLJSA-N 0.000 description 1
- AXNGDPAKKCEKGY-QPHKQPEJSA-N Ile-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N AXNGDPAKKCEKGY-QPHKQPEJSA-N 0.000 description 1
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 1
- DSDPLOODKXISDT-XUXIUFHCSA-N Ile-Leu-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O DSDPLOODKXISDT-XUXIUFHCSA-N 0.000 description 1
- CKRFDMPBSWYOBT-PPCPHDFISA-N Ile-Lys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CKRFDMPBSWYOBT-PPCPHDFISA-N 0.000 description 1
- UDBPXJNOEWDBDF-XUXIUFHCSA-N Ile-Lys-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)O)N UDBPXJNOEWDBDF-XUXIUFHCSA-N 0.000 description 1
- RVNOXPZHMUWCLW-GMOBBJLQSA-N Ile-Met-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N RVNOXPZHMUWCLW-GMOBBJLQSA-N 0.000 description 1
- VOCZPDONPURUHV-QEWYBTABSA-N Ile-Phe-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VOCZPDONPURUHV-QEWYBTABSA-N 0.000 description 1
- XLXPYSDGMXTTNQ-DKIMLUQUSA-N Ile-Phe-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CC(C)C)C(O)=O XLXPYSDGMXTTNQ-DKIMLUQUSA-N 0.000 description 1
- LRAUKBMYHHNADU-DKIMLUQUSA-N Ile-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 LRAUKBMYHHNADU-DKIMLUQUSA-N 0.000 description 1
- FGBRXCZYVRFNKQ-MXAVVETBSA-N Ile-Phe-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N FGBRXCZYVRFNKQ-MXAVVETBSA-N 0.000 description 1
- BATWGBRIZANGPN-ZPFDUUQYSA-N Ile-Pro-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(=O)N)C(=O)O)N BATWGBRIZANGPN-ZPFDUUQYSA-N 0.000 description 1
- IVXJIMGDOYRLQU-XUXIUFHCSA-N Ile-Pro-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O IVXJIMGDOYRLQU-XUXIUFHCSA-N 0.000 description 1
- CIJLNXXMDUOFPH-HJWJTTGWSA-N Ile-Pro-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 CIJLNXXMDUOFPH-HJWJTTGWSA-N 0.000 description 1
- JHNJNTMTZHEDLJ-NAKRPEOUSA-N Ile-Ser-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O JHNJNTMTZHEDLJ-NAKRPEOUSA-N 0.000 description 1
- JZNVOBUNTWNZPW-GHCJXIJMSA-N Ile-Ser-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N JZNVOBUNTWNZPW-GHCJXIJMSA-N 0.000 description 1
- XMYURPUVJSKTMC-KBIXCLLPSA-N Ile-Ser-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N XMYURPUVJSKTMC-KBIXCLLPSA-N 0.000 description 1
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 1
- PELCGFMHLZXWBQ-BJDJZHNGSA-N Ile-Ser-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)O)N PELCGFMHLZXWBQ-BJDJZHNGSA-N 0.000 description 1
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 1
- NAFIFZNBSPWYOO-RWRJDSDZSA-N Ile-Thr-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NAFIFZNBSPWYOO-RWRJDSDZSA-N 0.000 description 1
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 1
- WCNWGAUZWWSYDG-SVSWQMSJSA-N Ile-Thr-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)O)N WCNWGAUZWWSYDG-SVSWQMSJSA-N 0.000 description 1
- DGTOKVBDZXJHNZ-WZLNRYEVSA-N Ile-Thr-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N DGTOKVBDZXJHNZ-WZLNRYEVSA-N 0.000 description 1
- HZVRQFKRALAMQS-SLBDDTMCSA-N Ile-Trp-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZVRQFKRALAMQS-SLBDDTMCSA-N 0.000 description 1
- HQLSBZFLOUHQJK-STECZYCISA-N Ile-Tyr-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HQLSBZFLOUHQJK-STECZYCISA-N 0.000 description 1
- FXJLRZFMKGHYJP-CFMVVWHZSA-N Ile-Tyr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N FXJLRZFMKGHYJP-CFMVVWHZSA-N 0.000 description 1
- GVEODXUBBFDBPW-MGHWNKPDSA-N Ile-Tyr-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=C(O)C=C1 GVEODXUBBFDBPW-MGHWNKPDSA-N 0.000 description 1
- ZUWSVOYKBCHLRR-MGHWNKPDSA-N Ile-Tyr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZUWSVOYKBCHLRR-MGHWNKPDSA-N 0.000 description 1
- 102000009786 Immunoglobulin Constant Regions Human genes 0.000 description 1
- 108010009817 Immunoglobulin Constant Regions Proteins 0.000 description 1
- 102000001706 Immunoglobulin Fab Fragments Human genes 0.000 description 1
- 108010054477 Immunoglobulin Fab Fragments Proteins 0.000 description 1
- 229930010555 Inosine Natural products 0.000 description 1
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 1
- 102000013462 Interleukin-12 Human genes 0.000 description 1
- 108010065805 Interleukin-12 Proteins 0.000 description 1
- 102100024319 Intestinal-type alkaline phosphatase Human genes 0.000 description 1
- 101710184243 Intestinal-type alkaline phosphatase Proteins 0.000 description 1
- 241000235058 Komagataella pastoris Species 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-N L-arginine Chemical compound OC(=O)[C@@H](N)CCCN=C(N)N ODKSFYDXXFIFQN-BYPYZUCNSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 1
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 1
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 108010054278 Lac Repressors Proteins 0.000 description 1
- MJOZZTKJZQFKDK-GUBZILKMSA-N Leu-Ala-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(N)=O MJOZZTKJZQFKDK-GUBZILKMSA-N 0.000 description 1
- CQQGCWPXDHTTNF-GUBZILKMSA-N Leu-Ala-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O CQQGCWPXDHTTNF-GUBZILKMSA-N 0.000 description 1
- KWTVLKBOQATPHJ-SRVKXCTJSA-N Leu-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N KWTVLKBOQATPHJ-SRVKXCTJSA-N 0.000 description 1
- HASRFYOMVPJRPU-SRVKXCTJSA-N Leu-Arg-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HASRFYOMVPJRPU-SRVKXCTJSA-N 0.000 description 1
- KSZCCRIGNVSHFH-UWVGGRQHSA-N Leu-Arg-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O KSZCCRIGNVSHFH-UWVGGRQHSA-N 0.000 description 1
- IBMVEYRWAWIOTN-RWMBFGLXSA-N Leu-Arg-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(O)=O IBMVEYRWAWIOTN-RWMBFGLXSA-N 0.000 description 1
- VIWUBXKCYJGNCL-SRVKXCTJSA-N Leu-Asn-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 VIWUBXKCYJGNCL-SRVKXCTJSA-N 0.000 description 1
- WXHFZJFZWNCDNB-KKUMJFAQSA-N Leu-Asn-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WXHFZJFZWNCDNB-KKUMJFAQSA-N 0.000 description 1
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 1
- DLCOFDAHNMMQPP-SRVKXCTJSA-N Leu-Asp-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DLCOFDAHNMMQPP-SRVKXCTJSA-N 0.000 description 1
- JQSXWJXBASFONF-KKUMJFAQSA-N Leu-Asp-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JQSXWJXBASFONF-KKUMJFAQSA-N 0.000 description 1
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 1
- NHHKSOGJYNQENP-SRVKXCTJSA-N Leu-Cys-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N NHHKSOGJYNQENP-SRVKXCTJSA-N 0.000 description 1
- VQPPIMUZCZCOIL-GUBZILKMSA-N Leu-Gln-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O VQPPIMUZCZCOIL-GUBZILKMSA-N 0.000 description 1
- VPKIQULSKFVCSM-SRVKXCTJSA-N Leu-Gln-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VPKIQULSKFVCSM-SRVKXCTJSA-N 0.000 description 1
- LOLUPZNNADDTAA-AVGNSLFASA-N Leu-Gln-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LOLUPZNNADDTAA-AVGNSLFASA-N 0.000 description 1
- CQGSYZCULZMEDE-SRVKXCTJSA-N Leu-Gln-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(O)=O CQGSYZCULZMEDE-SRVKXCTJSA-N 0.000 description 1
- GPICTNQYKHHHTH-GUBZILKMSA-N Leu-Gln-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O GPICTNQYKHHHTH-GUBZILKMSA-N 0.000 description 1
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 1
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 1
- PRZVBIAOPFGAQF-SRVKXCTJSA-N Leu-Glu-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O PRZVBIAOPFGAQF-SRVKXCTJSA-N 0.000 description 1
- HVJVUYQWFYMGJS-GVXVVHGQSA-N Leu-Glu-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVJVUYQWFYMGJS-GVXVVHGQSA-N 0.000 description 1
- BABSVXFGKFLIGW-UWVGGRQHSA-N Leu-Gly-Arg Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N BABSVXFGKFLIGW-UWVGGRQHSA-N 0.000 description 1
- FIYMBBHGYNQFOP-IUCAKERBSA-N Leu-Gly-Gln Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N FIYMBBHGYNQFOP-IUCAKERBSA-N 0.000 description 1
- KGCLIYGPQXUNLO-IUCAKERBSA-N Leu-Gly-Glu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O KGCLIYGPQXUNLO-IUCAKERBSA-N 0.000 description 1
- CCQLQKZTXZBXTN-NHCYSSNCSA-N Leu-Gly-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CCQLQKZTXZBXTN-NHCYSSNCSA-N 0.000 description 1
- QPXBPQUGXHURGP-UWVGGRQHSA-N Leu-Gly-Met Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CCSC)C(=O)O)N QPXBPQUGXHURGP-UWVGGRQHSA-N 0.000 description 1
- KEVYYIMVELOXCT-KBPBESRZSA-N Leu-Gly-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KEVYYIMVELOXCT-KBPBESRZSA-N 0.000 description 1
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 1
- HYMLKESRWLZDBR-WEDXCCLWSA-N Leu-Gly-Thr Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HYMLKESRWLZDBR-WEDXCCLWSA-N 0.000 description 1
- WRLPVDVHNWSSCL-MELADBBJSA-N Leu-His-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N2CCC[C@@H]2C(=O)O)N WRLPVDVHNWSSCL-MELADBBJSA-N 0.000 description 1
- HMDDEJADNKQTBR-BZSNNMDCSA-N Leu-His-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMDDEJADNKQTBR-BZSNNMDCSA-N 0.000 description 1
- SGIIOQQGLUUMDQ-IHRRRGAJSA-N Leu-His-Val Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C(C)C)C(=O)O)N SGIIOQQGLUUMDQ-IHRRRGAJSA-N 0.000 description 1
- JFSGIJSCJFQGSZ-MXAVVETBSA-N Leu-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N JFSGIJSCJFQGSZ-MXAVVETBSA-N 0.000 description 1
- KUIDCYNIEJBZBU-AJNGGQMLSA-N Leu-Ile-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O KUIDCYNIEJBZBU-AJNGGQMLSA-N 0.000 description 1
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 1
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 1
- KYIIALJHAOIAHF-KKUMJFAQSA-N Leu-Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KYIIALJHAOIAHF-KKUMJFAQSA-N 0.000 description 1
- VVQJGYPTIYOFBR-IHRRRGAJSA-N Leu-Lys-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N VVQJGYPTIYOFBR-IHRRRGAJSA-N 0.000 description 1
- RTIRBWJPYJYTLO-MELADBBJSA-N Leu-Lys-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N RTIRBWJPYJYTLO-MELADBBJSA-N 0.000 description 1
- WXZOHBVPVKABQN-DCAQKATOSA-N Leu-Met-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)O)C(=O)O)N WXZOHBVPVKABQN-DCAQKATOSA-N 0.000 description 1
- DDVHDMSBLRAKNV-IHRRRGAJSA-N Leu-Met-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O DDVHDMSBLRAKNV-IHRRRGAJSA-N 0.000 description 1
- LQUIENKUVKPNIC-ULQDDVLXSA-N Leu-Met-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LQUIENKUVKPNIC-ULQDDVLXSA-N 0.000 description 1
- INCJJHQRZGQLFC-KBPBESRZSA-N Leu-Phe-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O INCJJHQRZGQLFC-KBPBESRZSA-N 0.000 description 1
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 1
- WXDRGWBQZIMJDE-ULQDDVLXSA-N Leu-Phe-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O WXDRGWBQZIMJDE-ULQDDVLXSA-N 0.000 description 1
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 1
- RRVCZCNFXIFGRA-DCAQKATOSA-N Leu-Pro-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O RRVCZCNFXIFGRA-DCAQKATOSA-N 0.000 description 1
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 1
- YUTNOGOMBNYPFH-XUXIUFHCSA-N Leu-Pro-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YUTNOGOMBNYPFH-XUXIUFHCSA-N 0.000 description 1
- KWLWZYMNUZJKMZ-IHRRRGAJSA-N Leu-Pro-Leu Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O KWLWZYMNUZJKMZ-IHRRRGAJSA-N 0.000 description 1
- YRRCOJOXAJNSAX-IHRRRGAJSA-N Leu-Pro-Lys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N YRRCOJOXAJNSAX-IHRRRGAJSA-N 0.000 description 1
- UCXQIIIFOOGYEM-ULQDDVLXSA-N Leu-Pro-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UCXQIIIFOOGYEM-ULQDDVLXSA-N 0.000 description 1
- XGDCYUQSFDQISZ-BQBZGAKWSA-N Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O XGDCYUQSFDQISZ-BQBZGAKWSA-N 0.000 description 1
- IRMLZWSRWSGTOP-CIUDSAMLSA-N Leu-Ser-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O IRMLZWSRWSGTOP-CIUDSAMLSA-N 0.000 description 1
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 1
- MVHXGBZUJLWZOH-BJDJZHNGSA-N Leu-Ser-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVHXGBZUJLWZOH-BJDJZHNGSA-N 0.000 description 1
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 1
- SVBJIZVVYJYGLA-DCAQKATOSA-N Leu-Ser-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O SVBJIZVVYJYGLA-DCAQKATOSA-N 0.000 description 1
- VDIARPPNADFEAV-WEDXCCLWSA-N Leu-Thr-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O VDIARPPNADFEAV-WEDXCCLWSA-N 0.000 description 1
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 1
- CNWDWAMPKVYJJB-NUTKFTJISA-N Leu-Trp-Ala Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](C)C(O)=O)=CNC2=C1 CNWDWAMPKVYJJB-NUTKFTJISA-N 0.000 description 1
- LSLUTXRANSUGFY-XIRDDKMYSA-N Leu-Trp-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O LSLUTXRANSUGFY-XIRDDKMYSA-N 0.000 description 1
- VJGQRELPQWNURN-JYJNAYRXSA-N Leu-Tyr-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJGQRELPQWNURN-JYJNAYRXSA-N 0.000 description 1
- SEOXPEFQEOYURL-PMVMPFDFSA-N Leu-Tyr-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O SEOXPEFQEOYURL-PMVMPFDFSA-N 0.000 description 1
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 1
- FPPCCQGECVKLDY-IHRRRGAJSA-N Leu-Val-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C FPPCCQGECVKLDY-IHRRRGAJSA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- PNPYKQFJGRFYJE-GUBZILKMSA-N Lys-Ala-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PNPYKQFJGRFYJE-GUBZILKMSA-N 0.000 description 1
- IXHKPDJKKCUKHS-GARJFASQSA-N Lys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IXHKPDJKKCUKHS-GARJFASQSA-N 0.000 description 1
- UWKNTTJNVSYXPC-CIUDSAMLSA-N Lys-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN UWKNTTJNVSYXPC-CIUDSAMLSA-N 0.000 description 1
- GQUDMNDPQTXZRV-DCAQKATOSA-N Lys-Arg-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O GQUDMNDPQTXZRV-DCAQKATOSA-N 0.000 description 1
- SJNZALDHDUYDBU-IHRRRGAJSA-N Lys-Arg-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(O)=O SJNZALDHDUYDBU-IHRRRGAJSA-N 0.000 description 1
- NLOZZWJNIKKYSC-WDSOQIARSA-N Lys-Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 NLOZZWJNIKKYSC-WDSOQIARSA-N 0.000 description 1
- NTSPQIONFJUMJV-AVGNSLFASA-N Lys-Arg-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O NTSPQIONFJUMJV-AVGNSLFASA-N 0.000 description 1
- YKIRNDPUWONXQN-GUBZILKMSA-N Lys-Asn-Gln Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YKIRNDPUWONXQN-GUBZILKMSA-N 0.000 description 1
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 1
- QUCDKEKDPYISNX-HJGDQZAQSA-N Lys-Asn-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QUCDKEKDPYISNX-HJGDQZAQSA-N 0.000 description 1
- HIIZIQUUHIXUJY-GUBZILKMSA-N Lys-Asp-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HIIZIQUUHIXUJY-GUBZILKMSA-N 0.000 description 1
- AAORVPFVUIHEAB-YUMQZZPRSA-N Lys-Asp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O AAORVPFVUIHEAB-YUMQZZPRSA-N 0.000 description 1
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 1
- SSJBMGCZZXCGJJ-DCAQKATOSA-N Lys-Asp-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O SSJBMGCZZXCGJJ-DCAQKATOSA-N 0.000 description 1
- VSRXPEHZMHSFKU-IUCAKERBSA-N Lys-Gln-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VSRXPEHZMHSFKU-IUCAKERBSA-N 0.000 description 1
- MQMIRLVJXQNTRJ-SDDRHHMPSA-N Lys-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O MQMIRLVJXQNTRJ-SDDRHHMPSA-N 0.000 description 1
- HEWWNLVEWBJBKA-WDCWCFNPSA-N Lys-Gln-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN HEWWNLVEWBJBKA-WDCWCFNPSA-N 0.000 description 1
- NDORZBUHCOJQDO-GVXVVHGQSA-N Lys-Gln-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O NDORZBUHCOJQDO-GVXVVHGQSA-N 0.000 description 1
- DRCILAJNUJKAHC-SRVKXCTJSA-N Lys-Glu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DRCILAJNUJKAHC-SRVKXCTJSA-N 0.000 description 1
- GRADYHMSAUIKPS-DCAQKATOSA-N Lys-Glu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRADYHMSAUIKPS-DCAQKATOSA-N 0.000 description 1
- GPJGFSFYBJGYRX-YUMQZZPRSA-N Lys-Gly-Asp Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O GPJGFSFYBJGYRX-YUMQZZPRSA-N 0.000 description 1
- XNKDCYABMBBEKN-IUCAKERBSA-N Lys-Gly-Gln Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O XNKDCYABMBBEKN-IUCAKERBSA-N 0.000 description 1
- DTUZCYRNEJDKSR-NHCYSSNCSA-N Lys-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN DTUZCYRNEJDKSR-NHCYSSNCSA-N 0.000 description 1
- PRCHKVGXZVTALR-KKUMJFAQSA-N Lys-His-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCCN)N PRCHKVGXZVTALR-KKUMJFAQSA-N 0.000 description 1
- OWRUUFUVXFREBD-KKUMJFAQSA-N Lys-His-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O OWRUUFUVXFREBD-KKUMJFAQSA-N 0.000 description 1
- ZMMDPRTXLAEMOD-BZSNNMDCSA-N Lys-His-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZMMDPRTXLAEMOD-BZSNNMDCSA-N 0.000 description 1
- GNLJXWBNLAIPEP-MELADBBJSA-N Lys-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCCN)N)C(=O)O GNLJXWBNLAIPEP-MELADBBJSA-N 0.000 description 1
- IVFUVMSKSFSFBT-NHCYSSNCSA-N Lys-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN IVFUVMSKSFSFBT-NHCYSSNCSA-N 0.000 description 1
- YWJQHDDBFAXNIR-MXAVVETBSA-N Lys-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCCN)N YWJQHDDBFAXNIR-MXAVVETBSA-N 0.000 description 1
- JYXBNQOKPRQNQS-YTFOTSKYSA-N Lys-Ile-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JYXBNQOKPRQNQS-YTFOTSKYSA-N 0.000 description 1
- WAIHHELKYSFIQN-XUXIUFHCSA-N Lys-Ile-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O WAIHHELKYSFIQN-XUXIUFHCSA-N 0.000 description 1
- OVAOHZIOUBEQCJ-IHRRRGAJSA-N Lys-Leu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OVAOHZIOUBEQCJ-IHRRRGAJSA-N 0.000 description 1
- MUXNCRWTWBMNHX-SRVKXCTJSA-N Lys-Leu-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O MUXNCRWTWBMNHX-SRVKXCTJSA-N 0.000 description 1
- WVJNGSFKBKOKRV-AJNGGQMLSA-N Lys-Leu-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVJNGSFKBKOKRV-AJNGGQMLSA-N 0.000 description 1
- ORVFEGYUJITPGI-IHRRRGAJSA-N Lys-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN ORVFEGYUJITPGI-IHRRRGAJSA-N 0.000 description 1
- WRODMZBHNNPRLN-SRVKXCTJSA-N Lys-Leu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O WRODMZBHNNPRLN-SRVKXCTJSA-N 0.000 description 1
- RIJCHEVHFWMDKD-SRVKXCTJSA-N Lys-Lys-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RIJCHEVHFWMDKD-SRVKXCTJSA-N 0.000 description 1
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 1
- JQSIGLHQNSZZRL-KKUMJFAQSA-N Lys-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N JQSIGLHQNSZZRL-KKUMJFAQSA-N 0.000 description 1
- KJIXWRWPOCKYLD-IHRRRGAJSA-N Lys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N KJIXWRWPOCKYLD-IHRRRGAJSA-N 0.000 description 1
- VSTNAUBHKQPVJX-IHRRRGAJSA-N Lys-Met-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O VSTNAUBHKQPVJX-IHRRRGAJSA-N 0.000 description 1
- KVNLHIXLLZBAFQ-RWMBFGLXSA-N Lys-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N KVNLHIXLLZBAFQ-RWMBFGLXSA-N 0.000 description 1
- MSSJJDVQTFTLIF-KBPBESRZSA-N Lys-Phe-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O MSSJJDVQTFTLIF-KBPBESRZSA-N 0.000 description 1
- AZOFEHCPMBRNFD-BZSNNMDCSA-N Lys-Phe-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 AZOFEHCPMBRNFD-BZSNNMDCSA-N 0.000 description 1
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 1
- OBZHNHBAAVEWKI-DCAQKATOSA-N Lys-Pro-Asn Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O OBZHNHBAAVEWKI-DCAQKATOSA-N 0.000 description 1
- WGILOYIKJVQUPT-DCAQKATOSA-N Lys-Pro-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WGILOYIKJVQUPT-DCAQKATOSA-N 0.000 description 1
- CNGOEHJCLVCJHN-SRVKXCTJSA-N Lys-Pro-Glu Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O CNGOEHJCLVCJHN-SRVKXCTJSA-N 0.000 description 1
- LECIJRIRMVOFMH-ULQDDVLXSA-N Lys-Pro-Phe Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 LECIJRIRMVOFMH-ULQDDVLXSA-N 0.000 description 1
- YTJFXEDRUOQGSP-DCAQKATOSA-N Lys-Pro-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YTJFXEDRUOQGSP-DCAQKATOSA-N 0.000 description 1
- WQDKIVRHTQYJSN-DCAQKATOSA-N Lys-Ser-Arg Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N WQDKIVRHTQYJSN-DCAQKATOSA-N 0.000 description 1
- MGKFCQFVPKOWOL-CIUDSAMLSA-N Lys-Ser-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N MGKFCQFVPKOWOL-CIUDSAMLSA-N 0.000 description 1
- DIBZLYZXTSVGLN-CIUDSAMLSA-N Lys-Ser-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O DIBZLYZXTSVGLN-CIUDSAMLSA-N 0.000 description 1
- TVHCDSBMFQYPNA-RHYQMDGZSA-N Lys-Thr-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TVHCDSBMFQYPNA-RHYQMDGZSA-N 0.000 description 1
- CUHGAUZONORRIC-HJGDQZAQSA-N Lys-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O CUHGAUZONORRIC-HJGDQZAQSA-N 0.000 description 1
- SUZVLFWOCKHWET-CQDKDKBSSA-N Lys-Tyr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O SUZVLFWOCKHWET-CQDKDKBSSA-N 0.000 description 1
- XATKLFSXFINPSB-JYJNAYRXSA-N Lys-Tyr-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O XATKLFSXFINPSB-JYJNAYRXSA-N 0.000 description 1
- XYLSGAWRCZECIQ-JYJNAYRXSA-N Lys-Tyr-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 XYLSGAWRCZECIQ-JYJNAYRXSA-N 0.000 description 1
- RQILLQOQXLZTCK-KBPBESRZSA-N Lys-Tyr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O RQILLQOQXLZTCK-KBPBESRZSA-N 0.000 description 1
- OHXUUQDOBQKSNB-AVGNSLFASA-N Lys-Val-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OHXUUQDOBQKSNB-AVGNSLFASA-N 0.000 description 1
- RPWQJSBMXJSCPD-XUXIUFHCSA-N Lys-Val-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)C(C)C)C(O)=O RPWQJSBMXJSCPD-XUXIUFHCSA-N 0.000 description 1
- OZVXDDFYCQOPFD-XQQFMLRXSA-N Lys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N OZVXDDFYCQOPFD-XQQFMLRXSA-N 0.000 description 1
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 1
- 101710117393 Membrane-associated lipoprotein Proteins 0.000 description 1
- QAHFGYLFLVGBNW-DCAQKATOSA-N Met-Ala-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN QAHFGYLFLVGBNW-DCAQKATOSA-N 0.000 description 1
- WYEXWKAWMNJKPN-UBHSHLNASA-N Met-Ala-Phe Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCSC)N WYEXWKAWMNJKPN-UBHSHLNASA-N 0.000 description 1
- CTVJSFRHUOSCQQ-DCAQKATOSA-N Met-Arg-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O CTVJSFRHUOSCQQ-DCAQKATOSA-N 0.000 description 1
- WDTLNWHPIPCMMP-AVGNSLFASA-N Met-Arg-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O WDTLNWHPIPCMMP-AVGNSLFASA-N 0.000 description 1
- AHZNUGRZHMZGFL-GUBZILKMSA-N Met-Arg-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CCCNC(N)=N AHZNUGRZHMZGFL-GUBZILKMSA-N 0.000 description 1
- UZVWDRPUTHXQAM-FXQIFTODSA-N Met-Asp-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O UZVWDRPUTHXQAM-FXQIFTODSA-N 0.000 description 1
- ZMYHJISLFYTQGK-FXQIFTODSA-N Met-Asp-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZMYHJISLFYTQGK-FXQIFTODSA-N 0.000 description 1
- OXHSZBRPUGNMKW-DCAQKATOSA-N Met-Gln-Arg Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OXHSZBRPUGNMKW-DCAQKATOSA-N 0.000 description 1
- VOOINLQYUZOREH-SRVKXCTJSA-N Met-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCSC)N VOOINLQYUZOREH-SRVKXCTJSA-N 0.000 description 1
- MTBVQFFQMXHCPC-CIUDSAMLSA-N Met-Glu-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MTBVQFFQMXHCPC-CIUDSAMLSA-N 0.000 description 1
- VZBXCMCHIHEPBL-SRVKXCTJSA-N Met-Glu-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN VZBXCMCHIHEPBL-SRVKXCTJSA-N 0.000 description 1
- QXOHLNCNYLGICT-YFKPBYRVSA-N Met-Gly Chemical compound CSCC[C@H](N)C(=O)NCC(O)=O QXOHLNCNYLGICT-YFKPBYRVSA-N 0.000 description 1
- MHQXIBRPDKXDGZ-ZFWWWQNUSA-N Met-Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CNC(=O)[C@@H](N)CCSC)C(O)=O)=CNC2=C1 MHQXIBRPDKXDGZ-ZFWWWQNUSA-N 0.000 description 1
- BCRQJDMZQUHQSV-STQMWFEESA-N Met-Gly-Tyr Chemical compound [H]N[C@@H](CCSC)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BCRQJDMZQUHQSV-STQMWFEESA-N 0.000 description 1
- TZHFJXDKXGZHEN-IHRRRGAJSA-N Met-His-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O TZHFJXDKXGZHEN-IHRRRGAJSA-N 0.000 description 1
- XPCLRYNQMZOOFB-ULQDDVLXSA-N Met-His-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N XPCLRYNQMZOOFB-ULQDDVLXSA-N 0.000 description 1
- MXEASDMFHUKOGE-ULQDDVLXSA-N Met-His-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N MXEASDMFHUKOGE-ULQDDVLXSA-N 0.000 description 1
- MVMNUCOHQGYYKB-PEDHHIEDSA-N Met-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CCSC)N MVMNUCOHQGYYKB-PEDHHIEDSA-N 0.000 description 1
- WPTDJKDGICUFCP-XUXIUFHCSA-N Met-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CCSC)N WPTDJKDGICUFCP-XUXIUFHCSA-N 0.000 description 1
- RRIHXWPHQSXHAQ-XUXIUFHCSA-N Met-Ile-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O RRIHXWPHQSXHAQ-XUXIUFHCSA-N 0.000 description 1
- FTQOFRPGLYXRFM-CYDGBPFRSA-N Met-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCSC)N FTQOFRPGLYXRFM-CYDGBPFRSA-N 0.000 description 1
- QZPXMHVKPHJNTR-DCAQKATOSA-N Met-Leu-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O QZPXMHVKPHJNTR-DCAQKATOSA-N 0.000 description 1
- HGAJNEWOUHDUMZ-SRVKXCTJSA-N Met-Leu-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O HGAJNEWOUHDUMZ-SRVKXCTJSA-N 0.000 description 1
- LBNFTWKGISQVEE-AVGNSLFASA-N Met-Leu-Met Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCSC LBNFTWKGISQVEE-AVGNSLFASA-N 0.000 description 1
- JCMMNFZUKMMECJ-DCAQKATOSA-N Met-Lys-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JCMMNFZUKMMECJ-DCAQKATOSA-N 0.000 description 1
- YYEIFXZOBZVDPH-DCAQKATOSA-N Met-Lys-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O YYEIFXZOBZVDPH-DCAQKATOSA-N 0.000 description 1
- HUURTRNKPBHHKZ-JYJNAYRXSA-N Met-Phe-Val Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 HUURTRNKPBHHKZ-JYJNAYRXSA-N 0.000 description 1
- VQILILSLEFDECU-GUBZILKMSA-N Met-Pro-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O VQILILSLEFDECU-GUBZILKMSA-N 0.000 description 1
- PHKBGZKVOJCIMZ-SRVKXCTJSA-N Met-Pro-Arg Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PHKBGZKVOJCIMZ-SRVKXCTJSA-N 0.000 description 1
- QEDGNYFHLXXIDC-DCAQKATOSA-N Met-Pro-Gln Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O QEDGNYFHLXXIDC-DCAQKATOSA-N 0.000 description 1
- YLDSJJOGQNEQJK-AVGNSLFASA-N Met-Pro-Leu Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O YLDSJJOGQNEQJK-AVGNSLFASA-N 0.000 description 1
- XPVCDCMPKCERFT-GUBZILKMSA-N Met-Ser-Arg Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XPVCDCMPKCERFT-GUBZILKMSA-N 0.000 description 1
- KSIPKXNIQOWMIC-RCWTZXSCSA-N Met-Thr-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KSIPKXNIQOWMIC-RCWTZXSCSA-N 0.000 description 1
- CIIJWIAORKTXAH-FJXKBIBVSA-N Met-Thr-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O CIIJWIAORKTXAH-FJXKBIBVSA-N 0.000 description 1
- QYIGOFGUOVTAHK-ZJDVBMNYSA-N Met-Thr-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QYIGOFGUOVTAHK-ZJDVBMNYSA-N 0.000 description 1
- KLGIQJRMFHIGCQ-ZFWWWQNUSA-N Met-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCSC)C(=O)NCC(O)=O)=CNC2=C1 KLGIQJRMFHIGCQ-ZFWWWQNUSA-N 0.000 description 1
- UXJHNUBJSQQIOC-SZMVWBNQSA-N Met-Trp-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C(C)C)C(O)=O UXJHNUBJSQQIOC-SZMVWBNQSA-N 0.000 description 1
- FZDOBWIKRQORAC-ULQDDVLXSA-N Met-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCSC)N FZDOBWIKRQORAC-ULQDDVLXSA-N 0.000 description 1
- CQRGINSEMFBACV-WPRPVWTQSA-N Met-Val-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O CQRGINSEMFBACV-WPRPVWTQSA-N 0.000 description 1
- LPNWWHBFXPNHJG-AVGNSLFASA-N Met-Val-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN LPNWWHBFXPNHJG-AVGNSLFASA-N 0.000 description 1
- VYDLZDRMOFYOGV-TUAOUCFPSA-N Met-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N VYDLZDRMOFYOGV-TUAOUCFPSA-N 0.000 description 1
- 102000016943 Muramidase Human genes 0.000 description 1
- 108010014251 Muramidase Proteins 0.000 description 1
- 101100476480 Mus musculus S100a8 gene Proteins 0.000 description 1
- 229940121948 Muscarinic receptor antagonist Drugs 0.000 description 1
- 241000282339 Mustela Species 0.000 description 1
- 235000009421 Myristica fragrans Nutrition 0.000 description 1
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 1
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 1
- 101100401106 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) met-7 gene Proteins 0.000 description 1
- VEQPNABPJHWNSG-UHFFFAOYSA-N Nickel(2+) Chemical compound [Ni+2] VEQPNABPJHWNSG-UHFFFAOYSA-N 0.000 description 1
- 238000000636 Northern blotting Methods 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- 241000283977 Oryctolagus Species 0.000 description 1
- 108700006640 OspA Proteins 0.000 description 1
- 101710116435 Outer membrane protein Proteins 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 229930182555 Penicillin Natural products 0.000 description 1
- BJEYSVHMGIJORT-NHCYSSNCSA-N Phe-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BJEYSVHMGIJORT-NHCYSSNCSA-N 0.000 description 1
- DFEVBOYEUQJGER-JURCDPSOSA-N Phe-Ala-Ile Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O DFEVBOYEUQJGER-JURCDPSOSA-N 0.000 description 1
- BBDSZDHUCPSYAC-QEJZJMRPSA-N Phe-Ala-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BBDSZDHUCPSYAC-QEJZJMRPSA-N 0.000 description 1
- NEHSHYOUIWBYSA-DCPHZVHLSA-N Phe-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=CC=C3)N NEHSHYOUIWBYSA-DCPHZVHLSA-N 0.000 description 1
- AYPMIIKUMNADSU-IHRRRGAJSA-N Phe-Arg-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O AYPMIIKUMNADSU-IHRRRGAJSA-N 0.000 description 1
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 1
- KIEPQOIQHFKQLK-PCBIJLKTSA-N Phe-Asn-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KIEPQOIQHFKQLK-PCBIJLKTSA-N 0.000 description 1
- MECSIDWUTYRHRJ-KKUMJFAQSA-N Phe-Asn-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O MECSIDWUTYRHRJ-KKUMJFAQSA-N 0.000 description 1
- JOXIIFVCSATTDH-IHPCNDPISA-N Phe-Asn-Trp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N JOXIIFVCSATTDH-IHPCNDPISA-N 0.000 description 1
- LXVFHIBXOWJTKZ-BZSNNMDCSA-N Phe-Asn-Tyr Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O LXVFHIBXOWJTKZ-BZSNNMDCSA-N 0.000 description 1
- JIYJYFIXQTYDNF-YDHLFZDLSA-N Phe-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N JIYJYFIXQTYDNF-YDHLFZDLSA-N 0.000 description 1
- LDSOBEJVGGVWGD-DLOVCJGASA-N Phe-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 LDSOBEJVGGVWGD-DLOVCJGASA-N 0.000 description 1
- RIYZXJVARWJLKS-KKUMJFAQSA-N Phe-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 RIYZXJVARWJLKS-KKUMJFAQSA-N 0.000 description 1
- WIVCOAKLPICYGY-KKUMJFAQSA-N Phe-Asp-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N WIVCOAKLPICYGY-KKUMJFAQSA-N 0.000 description 1
- DJPXNKUDJKGQEE-BZSNNMDCSA-N Phe-Asp-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DJPXNKUDJKGQEE-BZSNNMDCSA-N 0.000 description 1
- FRPVPGRXUKFEQE-YDHLFZDLSA-N Phe-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O FRPVPGRXUKFEQE-YDHLFZDLSA-N 0.000 description 1
- UMKYAYXCMYYNHI-AVGNSLFASA-N Phe-Gln-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N UMKYAYXCMYYNHI-AVGNSLFASA-N 0.000 description 1
- MFQXSDWKUXTOPZ-DZKIICNBSA-N Phe-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N MFQXSDWKUXTOPZ-DZKIICNBSA-N 0.000 description 1
- KJJROSNFBRWPHS-JYJNAYRXSA-N Phe-Glu-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KJJROSNFBRWPHS-JYJNAYRXSA-N 0.000 description 1
- XXAOSEUPEMQJOF-KKUMJFAQSA-N Phe-Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 XXAOSEUPEMQJOF-KKUMJFAQSA-N 0.000 description 1
- RFEXGCASCQGGHZ-STQMWFEESA-N Phe-Gly-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O RFEXGCASCQGGHZ-STQMWFEESA-N 0.000 description 1
- WPTYDQPGBMDUBI-QWRGUYRKSA-N Phe-Gly-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O WPTYDQPGBMDUBI-QWRGUYRKSA-N 0.000 description 1
- JEBWZLWTRPZQRX-QWRGUYRKSA-N Phe-Gly-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O JEBWZLWTRPZQRX-QWRGUYRKSA-N 0.000 description 1
- NAXPHWZXEXNDIW-JTQLQIEISA-N Phe-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 NAXPHWZXEXNDIW-JTQLQIEISA-N 0.000 description 1
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 1
- NHCKESBLOMHIIE-IRXDYDNUSA-N Phe-Gly-Phe Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NHCKESBLOMHIIE-IRXDYDNUSA-N 0.000 description 1
- VZFPYFRVHMSSNA-JURCDPSOSA-N Phe-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=CC=C1 VZFPYFRVHMSSNA-JURCDPSOSA-N 0.000 description 1
- WEMYTDDMDBLPMI-DKIMLUQUSA-N Phe-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N WEMYTDDMDBLPMI-DKIMLUQUSA-N 0.000 description 1
- ONORAGIFHNAADN-LLLHUVSDSA-N Phe-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N ONORAGIFHNAADN-LLLHUVSDSA-N 0.000 description 1
- KXUZHWXENMYOHC-QEJZJMRPSA-N Phe-Leu-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O KXUZHWXENMYOHC-QEJZJMRPSA-N 0.000 description 1
- KBVJZCVLQWCJQN-KKUMJFAQSA-N Phe-Leu-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O KBVJZCVLQWCJQN-KKUMJFAQSA-N 0.000 description 1
- METZZBCMDXHFMK-BZSNNMDCSA-N Phe-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N METZZBCMDXHFMK-BZSNNMDCSA-N 0.000 description 1
- MSHZERMPZKCODG-ACRUOGEOSA-N Phe-Leu-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 MSHZERMPZKCODG-ACRUOGEOSA-N 0.000 description 1
- KNYPNEYICHHLQL-ACRUOGEOSA-N Phe-Leu-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 KNYPNEYICHHLQL-ACRUOGEOSA-N 0.000 description 1
- INHMISZWLJZQGH-ULQDDVLXSA-N Phe-Leu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 INHMISZWLJZQGH-ULQDDVLXSA-N 0.000 description 1
- DMEYUTSDVRCWRS-ULQDDVLXSA-N Phe-Lys-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 DMEYUTSDVRCWRS-ULQDDVLXSA-N 0.000 description 1
- ZUQACJLOHYRVPJ-DKIMLUQUSA-N Phe-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZUQACJLOHYRVPJ-DKIMLUQUSA-N 0.000 description 1
- DOXQMJCSSYZSNM-BZSNNMDCSA-N Phe-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O DOXQMJCSSYZSNM-BZSNNMDCSA-N 0.000 description 1
- PEFJUUYFEGBXFA-BZSNNMDCSA-N Phe-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 PEFJUUYFEGBXFA-BZSNNMDCSA-N 0.000 description 1
- SCKXGHWQPPURGT-KKUMJFAQSA-N Phe-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O SCKXGHWQPPURGT-KKUMJFAQSA-N 0.000 description 1
- GPSMLZQVIIYLDK-ULQDDVLXSA-N Phe-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O GPSMLZQVIIYLDK-ULQDDVLXSA-N 0.000 description 1
- SZYBZVANEAOIPE-UBHSHLNASA-N Phe-Met-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O SZYBZVANEAOIPE-UBHSHLNASA-N 0.000 description 1
- RTUWVJVJSMOGPL-KKUMJFAQSA-N Phe-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RTUWVJVJSMOGPL-KKUMJFAQSA-N 0.000 description 1
- UXQFHEKRGHYJRA-STQMWFEESA-N Phe-Met-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O UXQFHEKRGHYJRA-STQMWFEESA-N 0.000 description 1
- FQUUYTNBMIBOHS-IHRRRGAJSA-N Phe-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N FQUUYTNBMIBOHS-IHRRRGAJSA-N 0.000 description 1
- RYQWALWYQWBUKN-FHWLQOOXSA-N Phe-Phe-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RYQWALWYQWBUKN-FHWLQOOXSA-N 0.000 description 1
- GPLWGAYGROGDEN-BZSNNMDCSA-N Phe-Phe-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O GPLWGAYGROGDEN-BZSNNMDCSA-N 0.000 description 1
- RVEVENLSADZUMS-IHRRRGAJSA-N Phe-Pro-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O RVEVENLSADZUMS-IHRRRGAJSA-N 0.000 description 1
- XOHJOMKCRLHGCY-UNQGMJICSA-N Phe-Pro-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOHJOMKCRLHGCY-UNQGMJICSA-N 0.000 description 1
- BPCLGWHVPVTTFM-QWRGUYRKSA-N Phe-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O BPCLGWHVPVTTFM-QWRGUYRKSA-N 0.000 description 1
- GLJZDMZJHFXJQG-BZSNNMDCSA-N Phe-Ser-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLJZDMZJHFXJQG-BZSNNMDCSA-N 0.000 description 1
- QSWKNJAPHQDAAS-MELADBBJSA-N Phe-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O QSWKNJAPHQDAAS-MELADBBJSA-N 0.000 description 1
- JHSRGEODDALISP-XVSYOHENSA-N Phe-Thr-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O JHSRGEODDALISP-XVSYOHENSA-N 0.000 description 1
- VGTJSEYTVMAASM-RPTUDFQQSA-N Phe-Thr-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VGTJSEYTVMAASM-RPTUDFQQSA-N 0.000 description 1
- JLDZQPPLTJTJLE-IHPCNDPISA-N Phe-Trp-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CC(=O)O)C(=O)O)N JLDZQPPLTJTJLE-IHPCNDPISA-N 0.000 description 1
- GTMSCDVFQLNEOY-BZSNNMDCSA-N Phe-Tyr-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N GTMSCDVFQLNEOY-BZSNNMDCSA-N 0.000 description 1
- AGTHXWTYCLLYMC-FHWLQOOXSA-N Phe-Tyr-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=CC=C1 AGTHXWTYCLLYMC-FHWLQOOXSA-N 0.000 description 1
- MMPBPRXOFJNCCN-ZEWNOJEFSA-N Phe-Tyr-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MMPBPRXOFJNCCN-ZEWNOJEFSA-N 0.000 description 1
- ZYNBEWGJFXTBDU-ACRUOGEOSA-N Phe-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N ZYNBEWGJFXTBDU-ACRUOGEOSA-N 0.000 description 1
- DBNGDEAQXGFGRA-ACRUOGEOSA-N Phe-Tyr-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DBNGDEAQXGFGRA-ACRUOGEOSA-N 0.000 description 1
- FXEKNHAJIMHRFJ-ULQDDVLXSA-N Phe-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N FXEKNHAJIMHRFJ-ULQDDVLXSA-N 0.000 description 1
- GNZCMRRSXOBHLC-JYJNAYRXSA-N Phe-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N GNZCMRRSXOBHLC-JYJNAYRXSA-N 0.000 description 1
- VDTYRPWRWRCROL-UFYCRDLUSA-N Phe-Val-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 VDTYRPWRWRCROL-UFYCRDLUSA-N 0.000 description 1
- GAMLAXHLYGLQBJ-UFYCRDLUSA-N Phe-Val-Tyr Chemical compound N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)O)CC1=CC=C(C=C1)O)C(C)C)CC1=CC=CC=C1 GAMLAXHLYGLQBJ-UFYCRDLUSA-N 0.000 description 1
- APZNYJFGVAGFCF-JYJNAYRXSA-N Phe-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)Cc1ccccc1)C(C)C)C(O)=O APZNYJFGVAGFCF-JYJNAYRXSA-N 0.000 description 1
- 102100038124 Plasminogen Human genes 0.000 description 1
- 108010051456 Plasminogen Proteins 0.000 description 1
- 239000004952 Polyamide Substances 0.000 description 1
- 229920001213 Polysorbate 20 Polymers 0.000 description 1
- XQLBWXHVZVBNJM-FXQIFTODSA-N Pro-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 XQLBWXHVZVBNJM-FXQIFTODSA-N 0.000 description 1
- HFZNNDWPHBRNPV-KZVJFYERSA-N Pro-Ala-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HFZNNDWPHBRNPV-KZVJFYERSA-N 0.000 description 1
- OCSACVPBMIYNJE-GUBZILKMSA-N Pro-Arg-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O OCSACVPBMIYNJE-GUBZILKMSA-N 0.000 description 1
- HPXVFFIIGOAQRV-DCAQKATOSA-N Pro-Arg-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O HPXVFFIIGOAQRV-DCAQKATOSA-N 0.000 description 1
- GRIRJQGZZJVANI-CYDGBPFRSA-N Pro-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H]1CCCN1 GRIRJQGZZJVANI-CYDGBPFRSA-N 0.000 description 1
- OBVCYFIHIIYIQF-CIUDSAMLSA-N Pro-Asn-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OBVCYFIHIIYIQF-CIUDSAMLSA-N 0.000 description 1
- AMBLXEMWFARNNQ-DCAQKATOSA-N Pro-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@@H]1CCCN1 AMBLXEMWFARNNQ-DCAQKATOSA-N 0.000 description 1
- KQCCDMFIALWGTL-GUBZILKMSA-N Pro-Asn-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 KQCCDMFIALWGTL-GUBZILKMSA-N 0.000 description 1
- MLQVJYMFASXBGZ-IHRRRGAJSA-N Pro-Asn-Tyr Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O MLQVJYMFASXBGZ-IHRRRGAJSA-N 0.000 description 1
- ODPIUQVTULPQEP-CIUDSAMLSA-N Pro-Gln-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ODPIUQVTULPQEP-CIUDSAMLSA-N 0.000 description 1
- WGAQWMRJUFQXMF-ZPFDUUQYSA-N Pro-Gln-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WGAQWMRJUFQXMF-ZPFDUUQYSA-N 0.000 description 1
- WVOXLKUUVCCCSU-ZPFDUUQYSA-N Pro-Glu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVOXLKUUVCCCSU-ZPFDUUQYSA-N 0.000 description 1
- CLNJSLSHKJECME-BQBZGAKWSA-N Pro-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H]1CCCN1 CLNJSLSHKJECME-BQBZGAKWSA-N 0.000 description 1
- FEVDNIBDCRKMER-IUCAKERBSA-N Pro-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEVDNIBDCRKMER-IUCAKERBSA-N 0.000 description 1
- FDINZVJXLPILKV-DCAQKATOSA-N Pro-His-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O FDINZVJXLPILKV-DCAQKATOSA-N 0.000 description 1
- AJCRQOHDLCBHFA-SRVKXCTJSA-N Pro-His-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AJCRQOHDLCBHFA-SRVKXCTJSA-N 0.000 description 1
- VWXGFAIZUQBBBG-UWVGGRQHSA-N Pro-His-Gly Chemical compound C([C@@H](C(=O)NCC(=O)[O-])NC(=O)[C@H]1[NH2+]CCC1)C1=CN=CN1 VWXGFAIZUQBBBG-UWVGGRQHSA-N 0.000 description 1
- STASJMBVVHNWCG-IHRRRGAJSA-N Pro-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)NC(=O)[C@H]1[NH2+]CCC1)C1=CN=CN1 STASJMBVVHNWCG-IHRRRGAJSA-N 0.000 description 1
- BWCZJGJKOFUUCN-ZPFDUUQYSA-N Pro-Ile-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O BWCZJGJKOFUUCN-ZPFDUUQYSA-N 0.000 description 1
- CFVRJNZJQHDQPP-CYDGBPFRSA-N Pro-Ile-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 CFVRJNZJQHDQPP-CYDGBPFRSA-N 0.000 description 1
- KLSOMAFWRISSNI-OSUNSFLBSA-N Pro-Ile-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 KLSOMAFWRISSNI-OSUNSFLBSA-N 0.000 description 1
- FMLRRBDLBJLJIK-DCAQKATOSA-N Pro-Leu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 FMLRRBDLBJLJIK-DCAQKATOSA-N 0.000 description 1
- CLJLVCYFABNTHP-DCAQKATOSA-N Pro-Leu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O CLJLVCYFABNTHP-DCAQKATOSA-N 0.000 description 1
- GURGCNUWVSDYTP-SRVKXCTJSA-N Pro-Leu-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GURGCNUWVSDYTP-SRVKXCTJSA-N 0.000 description 1
- HFNPOYOKIPGAEI-SRVKXCTJSA-N Pro-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 HFNPOYOKIPGAEI-SRVKXCTJSA-N 0.000 description 1
- BRJGUPWVFXKBQI-XUXIUFHCSA-N Pro-Leu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRJGUPWVFXKBQI-XUXIUFHCSA-N 0.000 description 1
- CPRLKHJUFAXVTD-ULQDDVLXSA-N Pro-Leu-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CPRLKHJUFAXVTD-ULQDDVLXSA-N 0.000 description 1
- JUJCUYWRJMFJJF-AVGNSLFASA-N Pro-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 JUJCUYWRJMFJJF-AVGNSLFASA-N 0.000 description 1
- INDVYIOKMXFQFM-SRVKXCTJSA-N Pro-Lys-Gln Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O INDVYIOKMXFQFM-SRVKXCTJSA-N 0.000 description 1
- CDGABSWLRMECHC-IHRRRGAJSA-N Pro-Lys-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O CDGABSWLRMECHC-IHRRRGAJSA-N 0.000 description 1
- ULWBBFKQBDNGOY-RWMBFGLXSA-N Pro-Lys-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N2CCC[C@@H]2C(=O)O ULWBBFKQBDNGOY-RWMBFGLXSA-N 0.000 description 1
- WFIVLLFYUZZWOD-RHYQMDGZSA-N Pro-Lys-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WFIVLLFYUZZWOD-RHYQMDGZSA-N 0.000 description 1
- HBBBLSVBQGZKOZ-GUBZILKMSA-N Pro-Met-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O HBBBLSVBQGZKOZ-GUBZILKMSA-N 0.000 description 1
- QCMYJBKTMIWZAP-AVGNSLFASA-N Pro-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 QCMYJBKTMIWZAP-AVGNSLFASA-N 0.000 description 1
- JFBJPBZSTMXGKL-JYJNAYRXSA-N Pro-Met-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JFBJPBZSTMXGKL-JYJNAYRXSA-N 0.000 description 1
- VGVCNKSUVSZEIE-IHRRRGAJSA-N Pro-Phe-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O VGVCNKSUVSZEIE-IHRRRGAJSA-N 0.000 description 1
- SWRNSCMUXRLHCR-ULQDDVLXSA-N Pro-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 SWRNSCMUXRLHCR-ULQDDVLXSA-N 0.000 description 1
- SVXXJYJCRNKDDE-AVGNSLFASA-N Pro-Pro-His Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1N(CCC1)C(=O)[C@H]1NCCC1)C1=CN=CN1 SVXXJYJCRNKDDE-AVGNSLFASA-N 0.000 description 1
- LNICFEXCAHIJOR-DCAQKATOSA-N Pro-Ser-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LNICFEXCAHIJOR-DCAQKATOSA-N 0.000 description 1
- BJCXXMGGPHRSHV-GUBZILKMSA-N Pro-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BJCXXMGGPHRSHV-GUBZILKMSA-N 0.000 description 1
- WVXQQUWOKUZIEG-VEVYYDQMSA-N Pro-Thr-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O WVXQQUWOKUZIEG-VEVYYDQMSA-N 0.000 description 1
- UIUWGMRJTWHIJZ-ULQDDVLXSA-N Pro-Tyr-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCCN)C(=O)O UIUWGMRJTWHIJZ-ULQDDVLXSA-N 0.000 description 1
- IIRBTQHFVNGPMQ-AVGNSLFASA-N Pro-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 IIRBTQHFVNGPMQ-AVGNSLFASA-N 0.000 description 1
- VDHGTOHMHHQSKG-JYJNAYRXSA-N Pro-Val-Phe Chemical compound CC(C)[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O VDHGTOHMHHQSKG-JYJNAYRXSA-N 0.000 description 1
- WQUURFHRUAZQHU-VGWMRTNUSA-N Pro-Val-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 WQUURFHRUAZQHU-VGWMRTNUSA-N 0.000 description 1
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 1
- 101100321932 Rattus norvegicus Prkaa2 gene Proteins 0.000 description 1
- 108010091086 Recombinases Proteins 0.000 description 1
- 241000606651 Rickettsiales Species 0.000 description 1
- 241000714474 Rous sarcoma virus Species 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 1
- 208000006268 Sarcoma 180 Diseases 0.000 description 1
- 241000710961 Semliki Forest virus Species 0.000 description 1
- ZUGXSSFMTXKHJS-ZLUOBGJFSA-N Ser-Ala-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O ZUGXSSFMTXKHJS-ZLUOBGJFSA-N 0.000 description 1
- MWMKFWJYRRGXOR-ZLUOBGJFSA-N Ser-Ala-Asn Chemical compound N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)O)CC(N)=O)C)CO MWMKFWJYRRGXOR-ZLUOBGJFSA-N 0.000 description 1
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 1
- QFBNNYNWKYKVJO-DCAQKATOSA-N Ser-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N QFBNNYNWKYKVJO-DCAQKATOSA-N 0.000 description 1
- OYEDZGNMSBZCIM-XGEHTFHBSA-N Ser-Arg-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OYEDZGNMSBZCIM-XGEHTFHBSA-N 0.000 description 1
- KNZQGAUEYZJUSQ-ZLUOBGJFSA-N Ser-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N KNZQGAUEYZJUSQ-ZLUOBGJFSA-N 0.000 description 1
- MESDJCNHLZBMEP-ZLUOBGJFSA-N Ser-Asp-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MESDJCNHLZBMEP-ZLUOBGJFSA-N 0.000 description 1
- QPFJSHSJFIYDJZ-GHCJXIJMSA-N Ser-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO QPFJSHSJFIYDJZ-GHCJXIJMSA-N 0.000 description 1
- YPUSXTWURJANKF-KBIXCLLPSA-N Ser-Gln-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YPUSXTWURJANKF-KBIXCLLPSA-N 0.000 description 1
- YMAWDPHQVABADW-CIUDSAMLSA-N Ser-Gln-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O YMAWDPHQVABADW-CIUDSAMLSA-N 0.000 description 1
- PVDTYLHUWAEYGY-CIUDSAMLSA-N Ser-Glu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PVDTYLHUWAEYGY-CIUDSAMLSA-N 0.000 description 1
- YQQKYAZABFEYAF-FXQIFTODSA-N Ser-Glu-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQQKYAZABFEYAF-FXQIFTODSA-N 0.000 description 1
- BRGQQXQKPUCUJQ-KBIXCLLPSA-N Ser-Glu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRGQQXQKPUCUJQ-KBIXCLLPSA-N 0.000 description 1
- AEGUWTFAQQWVLC-BQBZGAKWSA-N Ser-Gly-Arg Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEGUWTFAQQWVLC-BQBZGAKWSA-N 0.000 description 1
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 1
- UAJAYRMZGNQILN-BQBZGAKWSA-N Ser-Gly-Met Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O UAJAYRMZGNQILN-BQBZGAKWSA-N 0.000 description 1
- SFTZWNJFZYOLBD-ZDLURKLDSA-N Ser-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO SFTZWNJFZYOLBD-ZDLURKLDSA-N 0.000 description 1
- MOQDPPUMFSMYOM-KKUMJFAQSA-N Ser-His-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CO)N MOQDPPUMFSMYOM-KKUMJFAQSA-N 0.000 description 1
- JEHPKECJCALLRW-CUJWVEQBSA-N Ser-His-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JEHPKECJCALLRW-CUJWVEQBSA-N 0.000 description 1
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 1
- YIUWWXVTYLANCJ-NAKRPEOUSA-N Ser-Ile-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O YIUWWXVTYLANCJ-NAKRPEOUSA-N 0.000 description 1
- LQESNKGTTNHZPZ-GHCJXIJMSA-N Ser-Ile-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O LQESNKGTTNHZPZ-GHCJXIJMSA-N 0.000 description 1
- BKZYBLLIBOBOOW-GHCJXIJMSA-N Ser-Ile-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O BKZYBLLIBOBOOW-GHCJXIJMSA-N 0.000 description 1
- UIPXCLNLUUAMJU-JBDRJPRFSA-N Ser-Ile-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UIPXCLNLUUAMJU-JBDRJPRFSA-N 0.000 description 1
- MQQBBLVOUUJKLH-HJPIBITLSA-N Ser-Ile-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MQQBBLVOUUJKLH-HJPIBITLSA-N 0.000 description 1
- NLOAIFSWUUFQFR-CIUDSAMLSA-N Ser-Leu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O NLOAIFSWUUFQFR-CIUDSAMLSA-N 0.000 description 1
- GJFYFGOEWLDQGW-GUBZILKMSA-N Ser-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GJFYFGOEWLDQGW-GUBZILKMSA-N 0.000 description 1
- XXNYYSXNXCJYKX-DCAQKATOSA-N Ser-Leu-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O XXNYYSXNXCJYKX-DCAQKATOSA-N 0.000 description 1
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 1
- NNFMANHDYSVNIO-DCAQKATOSA-N Ser-Lys-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NNFMANHDYSVNIO-DCAQKATOSA-N 0.000 description 1
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 1
- BYCVMHKULKRVPV-GUBZILKMSA-N Ser-Lys-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O BYCVMHKULKRVPV-GUBZILKMSA-N 0.000 description 1
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 1
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 1
- PTWIYDNFWPXQSD-GARJFASQSA-N Ser-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CO)N)C(=O)O PTWIYDNFWPXQSD-GARJFASQSA-N 0.000 description 1
- PMCMLDNPAZUYGI-DCAQKATOSA-N Ser-Lys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMCMLDNPAZUYGI-DCAQKATOSA-N 0.000 description 1
- JUTGONBTALQWMK-NAKRPEOUSA-N Ser-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CO)N JUTGONBTALQWMK-NAKRPEOUSA-N 0.000 description 1
- VIIJCAQMJBHSJH-FXQIFTODSA-N Ser-Met-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O VIIJCAQMJBHSJH-FXQIFTODSA-N 0.000 description 1
- HJAXVYLCKDPPDF-SRVKXCTJSA-N Ser-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N HJAXVYLCKDPPDF-SRVKXCTJSA-N 0.000 description 1
- XKFJENWJGHMDLI-QWRGUYRKSA-N Ser-Phe-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O XKFJENWJGHMDLI-QWRGUYRKSA-N 0.000 description 1
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 1
- RRVFEDGUXSYWOW-BZSNNMDCSA-N Ser-Phe-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RRVFEDGUXSYWOW-BZSNNMDCSA-N 0.000 description 1
- FBLNYDYPCLFTSP-IXOXFDKPSA-N Ser-Phe-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FBLNYDYPCLFTSP-IXOXFDKPSA-N 0.000 description 1
- NUEHQDHDLDXCRU-GUBZILKMSA-N Ser-Pro-Arg Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NUEHQDHDLDXCRU-GUBZILKMSA-N 0.000 description 1
- JLKWJWPDXPKKHI-FXQIFTODSA-N Ser-Pro-Asn Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CO)N)C(=O)N[C@@H](CC(=O)N)C(=O)O JLKWJWPDXPKKHI-FXQIFTODSA-N 0.000 description 1
- BSXKBOUZDAZXHE-CIUDSAMLSA-N Ser-Pro-Glu Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O BSXKBOUZDAZXHE-CIUDSAMLSA-N 0.000 description 1
- BVLGVLWFIZFEAH-BPUTZDHNSA-N Ser-Pro-Trp Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O BVLGVLWFIZFEAH-BPUTZDHNSA-N 0.000 description 1
- VFWQQZMRKFOGLE-ZLUOBGJFSA-N Ser-Ser-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)O VFWQQZMRKFOGLE-ZLUOBGJFSA-N 0.000 description 1
- FZXOPYUEQGDGMS-ACZMJKKPSA-N Ser-Ser-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O FZXOPYUEQGDGMS-ACZMJKKPSA-N 0.000 description 1
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 1
- PYTKULIABVRXSC-BWBBJGPYSA-N Ser-Ser-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PYTKULIABVRXSC-BWBBJGPYSA-N 0.000 description 1
- SQHKXWODKJDZRC-LKXGYXEUSA-N Ser-Thr-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQHKXWODKJDZRC-LKXGYXEUSA-N 0.000 description 1
- QNBVFKZSSRYNFX-CUJWVEQBSA-N Ser-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N)O QNBVFKZSSRYNFX-CUJWVEQBSA-N 0.000 description 1
- FLMYSKVSDVHLEW-SVSWQMSJSA-N Ser-Thr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLMYSKVSDVHLEW-SVSWQMSJSA-N 0.000 description 1
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 1
- RTXKJFWHEBTABY-IHPCNDPISA-N Ser-Trp-Tyr Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)NC(=O)[C@H](CO)N RTXKJFWHEBTABY-IHPCNDPISA-N 0.000 description 1
- PQEQXWRVHQAAKS-SRVKXCTJSA-N Ser-Tyr-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CO)N)CC1=CC=C(O)C=C1 PQEQXWRVHQAAKS-SRVKXCTJSA-N 0.000 description 1
- GSCVDSBEYVGMJQ-SRVKXCTJSA-N Ser-Tyr-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)O GSCVDSBEYVGMJQ-SRVKXCTJSA-N 0.000 description 1
- PZHJLTWGMYERRJ-SRVKXCTJSA-N Ser-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N)O PZHJLTWGMYERRJ-SRVKXCTJSA-N 0.000 description 1
- VEVYMLNYMULSMS-AVGNSLFASA-N Ser-Tyr-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O VEVYMLNYMULSMS-AVGNSLFASA-N 0.000 description 1
- VVKVHAOOUGNDPJ-SRVKXCTJSA-N Ser-Tyr-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O VVKVHAOOUGNDPJ-SRVKXCTJSA-N 0.000 description 1
- OSFZCEQJLWCIBG-BZSNNMDCSA-N Ser-Tyr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OSFZCEQJLWCIBG-BZSNNMDCSA-N 0.000 description 1
- SYCFMSYTIFXWAJ-DCAQKATOSA-N Ser-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N SYCFMSYTIFXWAJ-DCAQKATOSA-N 0.000 description 1
- YEDSOSIKVUMIJE-DCAQKATOSA-N Ser-Val-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O YEDSOSIKVUMIJE-DCAQKATOSA-N 0.000 description 1
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 1
- ODRUTDLAONAVDV-IHRRRGAJSA-N Ser-Val-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ODRUTDLAONAVDV-IHRRRGAJSA-N 0.000 description 1
- 108010034546 Serratia marcescens nuclease Proteins 0.000 description 1
- 101710084578 Short neurotoxin 1 Proteins 0.000 description 1
- UIIMBOGNXHQVGW-DEQYMQKBSA-M Sodium bicarbonate-14C Chemical compound [Na+].O[14C]([O-])=O UIIMBOGNXHQVGW-DEQYMQKBSA-M 0.000 description 1
- 208000007107 Stomach Ulcer Diseases 0.000 description 1
- 241000194026 Streptococcus gordonii Species 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- SSDZRWBPFCFZGB-UHFFFAOYSA-N TCA-ethadyl Chemical compound ClC(Cl)(Cl)C(=O)OCCOC(=O)C(Cl)(Cl)Cl SSDZRWBPFCFZGB-UHFFFAOYSA-N 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 101000865057 Thermococcus litoralis DNA polymerase Proteins 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical class OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 1
- DGDCHPCRMWEOJR-FQPOAREZSA-N Thr-Ala-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DGDCHPCRMWEOJR-FQPOAREZSA-N 0.000 description 1
- TWLMXDWFVNEFFK-FJXKBIBVSA-N Thr-Arg-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O TWLMXDWFVNEFFK-FJXKBIBVSA-N 0.000 description 1
- WFUAUEQXPVNAEF-ZJDVBMNYSA-N Thr-Arg-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CCCN=C(N)N WFUAUEQXPVNAEF-ZJDVBMNYSA-N 0.000 description 1
- UNURFMVMXLENAZ-KJEVXHAQSA-N Thr-Arg-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UNURFMVMXLENAZ-KJEVXHAQSA-N 0.000 description 1
- TZKPNGDGUVREEB-FOHZUACHSA-N Thr-Asn-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O TZKPNGDGUVREEB-FOHZUACHSA-N 0.000 description 1
- PZVGOVRNGKEFCB-KKHAAJSZSA-N Thr-Asn-Val Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N)O PZVGOVRNGKEFCB-KKHAAJSZSA-N 0.000 description 1
- YOSLMIPKOUAHKI-OLHMAJIHSA-N Thr-Asp-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O YOSLMIPKOUAHKI-OLHMAJIHSA-N 0.000 description 1
- OYTNZCBFDXGQGE-XQXXSGGOSA-N Thr-Gln-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C)C(=O)O)N)O OYTNZCBFDXGQGE-XQXXSGGOSA-N 0.000 description 1
- KCRQEJSKXAIULJ-FJXKBIBVSA-N Thr-Gly-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O KCRQEJSKXAIULJ-FJXKBIBVSA-N 0.000 description 1
- SIMKLINEDYOTKL-MBLNEYKQSA-N Thr-His-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C)C(=O)O)N)O SIMKLINEDYOTKL-MBLNEYKQSA-N 0.000 description 1
- IGGFFPOIFHZYKC-PBCZWWQYSA-N Thr-His-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O IGGFFPOIFHZYKC-PBCZWWQYSA-N 0.000 description 1
- FDALPRWYVKJCLL-PMVVWTBXSA-N Thr-His-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)NCC(O)=O FDALPRWYVKJCLL-PMVVWTBXSA-N 0.000 description 1
- XTCNBOBTROGWMW-RWRJDSDZSA-N Thr-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N XTCNBOBTROGWMW-RWRJDSDZSA-N 0.000 description 1
- UYTYTDMCDBPDSC-URLPEUOOSA-N Thr-Ile-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N UYTYTDMCDBPDSC-URLPEUOOSA-N 0.000 description 1
- FQPDRTDDEZXCEC-SVSWQMSJSA-N Thr-Ile-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O FQPDRTDDEZXCEC-SVSWQMSJSA-N 0.000 description 1
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 1
- HOVLHEKTGVIKAP-WDCWCFNPSA-N Thr-Leu-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HOVLHEKTGVIKAP-WDCWCFNPSA-N 0.000 description 1
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 1
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 1
- YOOAQCZYZHGUAZ-KATARQTJSA-N Thr-Leu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YOOAQCZYZHGUAZ-KATARQTJSA-N 0.000 description 1
- TZJSEJOXAIWOST-RHYQMDGZSA-N Thr-Lys-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N TZJSEJOXAIWOST-RHYQMDGZSA-N 0.000 description 1
- CJXURNZYNHCYFD-WDCWCFNPSA-N Thr-Lys-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O CJXURNZYNHCYFD-WDCWCFNPSA-N 0.000 description 1
- SCSVNSNWUTYSFO-WDCWCFNPSA-N Thr-Lys-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O SCSVNSNWUTYSFO-WDCWCFNPSA-N 0.000 description 1
- JLNMFGCJODTXDH-WEDXCCLWSA-N Thr-Lys-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O JLNMFGCJODTXDH-WEDXCCLWSA-N 0.000 description 1
- XSEPSRUDSPHMPX-KATARQTJSA-N Thr-Lys-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O XSEPSRUDSPHMPX-KATARQTJSA-N 0.000 description 1
- LHNNQVXITHUCAB-QTKMDUPCSA-N Thr-Met-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O LHNNQVXITHUCAB-QTKMDUPCSA-N 0.000 description 1
- SIEZEMFJLYRUMK-YTWAJWBKSA-N Thr-Met-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N)O SIEZEMFJLYRUMK-YTWAJWBKSA-N 0.000 description 1
- KZURUCDWKDEAFZ-XVSYOHENSA-N Thr-Phe-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O KZURUCDWKDEAFZ-XVSYOHENSA-N 0.000 description 1
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 1
- BIBYEFRASCNLAA-CDMKHQONSA-N Thr-Phe-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 BIBYEFRASCNLAA-CDMKHQONSA-N 0.000 description 1
- GYUUYCIXELGTJS-MEYUZBJRSA-N Thr-Phe-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O GYUUYCIXELGTJS-MEYUZBJRSA-N 0.000 description 1
- VGYVVSQFSSKZRJ-OEAJRASXSA-N Thr-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=CC=C1 VGYVVSQFSSKZRJ-OEAJRASXSA-N 0.000 description 1
- VEIKMWOMUYMMMK-FCLVOEFKSA-N Thr-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 VEIKMWOMUYMMMK-FCLVOEFKSA-N 0.000 description 1
- WTMPKZWHRCMMMT-KZVJFYERSA-N Thr-Pro-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WTMPKZWHRCMMMT-KZVJFYERSA-N 0.000 description 1
- VTMGKRABARCZAX-OSUNSFLBSA-N Thr-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O VTMGKRABARCZAX-OSUNSFLBSA-N 0.000 description 1
- OLFOOYQTTQSSRK-UNQGMJICSA-N Thr-Pro-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLFOOYQTTQSSRK-UNQGMJICSA-N 0.000 description 1
- KERCOYANYUPLHJ-XGEHTFHBSA-N Thr-Pro-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O KERCOYANYUPLHJ-XGEHTFHBSA-N 0.000 description 1
- GVMXJJAJLIEASL-ZJDVBMNYSA-N Thr-Pro-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O GVMXJJAJLIEASL-ZJDVBMNYSA-N 0.000 description 1
- FWTFAZKJORVTIR-VZFHVOOUSA-N Thr-Ser-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O FWTFAZKJORVTIR-VZFHVOOUSA-N 0.000 description 1
- STUAPCLEDMKXKL-LKXGYXEUSA-N Thr-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O STUAPCLEDMKXKL-LKXGYXEUSA-N 0.000 description 1
- IVDFVBVIVLJJHR-LKXGYXEUSA-N Thr-Ser-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IVDFVBVIVLJJHR-LKXGYXEUSA-N 0.000 description 1
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 1
- SGAOHNPSEPVAFP-ZDLURKLDSA-N Thr-Ser-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SGAOHNPSEPVAFP-ZDLURKLDSA-N 0.000 description 1
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 1
- HUPLKEHTTQBXSC-YJRXYDGGSA-N Thr-Ser-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HUPLKEHTTQBXSC-YJRXYDGGSA-N 0.000 description 1
- QYDKSNXSBXZPFK-ZJDVBMNYSA-N Thr-Thr-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYDKSNXSBXZPFK-ZJDVBMNYSA-N 0.000 description 1
- NHQVWACSJZJCGJ-FLBSBUHZSA-N Thr-Thr-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NHQVWACSJZJCGJ-FLBSBUHZSA-N 0.000 description 1
- QJIODPFLAASXJC-JHYOHUSXSA-N Thr-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O QJIODPFLAASXJC-JHYOHUSXSA-N 0.000 description 1
- BZTSQFWJNJYZSX-JRQIVUDYSA-N Thr-Tyr-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O BZTSQFWJNJYZSX-JRQIVUDYSA-N 0.000 description 1
- JAWUQFCGNVEDRN-MEYUZBJRSA-N Thr-Tyr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O JAWUQFCGNVEDRN-MEYUZBJRSA-N 0.000 description 1
- CJEHCEOXPLASCK-MEYUZBJRSA-N Thr-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=C(O)C=C1 CJEHCEOXPLASCK-MEYUZBJRSA-N 0.000 description 1
- VMSSYINFMOFLJM-KJEVXHAQSA-N Thr-Tyr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCSC)C(=O)O)N)O VMSSYINFMOFLJM-KJEVXHAQSA-N 0.000 description 1
- FYBFTPLPAXZBOY-KKHAAJSZSA-N Thr-Val-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O FYBFTPLPAXZBOY-KKHAAJSZSA-N 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 101710182532 Toxin a Proteins 0.000 description 1
- 102000008579 Transposases Human genes 0.000 description 1
- 108010020764 Transposases Proteins 0.000 description 1
- HYVLNORXQGKONN-NUTKFTJISA-N Trp-Ala-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 HYVLNORXQGKONN-NUTKFTJISA-N 0.000 description 1
- YEGMNOHLZNGOCG-UBHSHLNASA-N Trp-Asn-Asn Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YEGMNOHLZNGOCG-UBHSHLNASA-N 0.000 description 1
- UTQBQJNSNXJNIH-IHPCNDPISA-N Trp-Asn-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N UTQBQJNSNXJNIH-IHPCNDPISA-N 0.000 description 1
- GKUROEIXVURAAO-BPUTZDHNSA-N Trp-Asp-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GKUROEIXVURAAO-BPUTZDHNSA-N 0.000 description 1
- NKUIXQOJUAEIET-AQZXSJQPSA-N Trp-Asp-Thr Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@H](O)C)C(O)=O)=CNC2=C1 NKUIXQOJUAEIET-AQZXSJQPSA-N 0.000 description 1
- PKUJMYZNJMRHEZ-XIRDDKMYSA-N Trp-Glu-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PKUJMYZNJMRHEZ-XIRDDKMYSA-N 0.000 description 1
- FNOQJVHFVLVMOS-AAEUAGOBSA-N Trp-Gly-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N FNOQJVHFVLVMOS-AAEUAGOBSA-N 0.000 description 1
- QEJHHFFFCUDPDV-WDSOQIARSA-N Trp-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N QEJHHFFFCUDPDV-WDSOQIARSA-N 0.000 description 1
- WKCFCVBOFKEVKY-HSCHXYMDSA-N Trp-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N WKCFCVBOFKEVKY-HSCHXYMDSA-N 0.000 description 1
- MEZCXKYMMQJRDE-PMVMPFDFSA-N Trp-Leu-Tyr Chemical compound C([C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)CC(C)C)C(O)=O)C1=CC=C(O)C=C1 MEZCXKYMMQJRDE-PMVMPFDFSA-N 0.000 description 1
- FJHXNRKNOXEIIO-OYDLWJJNSA-N Trp-Met-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CC=3C4=CC=CC=C4NC=3)CCSC)C(O)=O)=CNC2=C1 FJHXNRKNOXEIIO-OYDLWJJNSA-N 0.000 description 1
- UQHPXCFAHVTWFU-BVSLBCMMSA-N Trp-Phe-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O UQHPXCFAHVTWFU-BVSLBCMMSA-N 0.000 description 1
- GEGYPBOPIGNZIF-CWRNSKLLSA-N Trp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O GEGYPBOPIGNZIF-CWRNSKLLSA-N 0.000 description 1
- ZJPSMXCFEKMZFE-IHPCNDPISA-N Trp-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O ZJPSMXCFEKMZFE-IHPCNDPISA-N 0.000 description 1
- BABINGWMZBWXIX-BPUTZDHNSA-N Trp-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N BABINGWMZBWXIX-BPUTZDHNSA-N 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- BURPTJBFWIOHEY-UWJYBYFXSA-N Tyr-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BURPTJBFWIOHEY-UWJYBYFXSA-N 0.000 description 1
- SDNVRAKIJVKAGS-LKTVYLICSA-N Tyr-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N SDNVRAKIJVKAGS-LKTVYLICSA-N 0.000 description 1
- AKXBNSZMYAOGLS-STQMWFEESA-N Tyr-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AKXBNSZMYAOGLS-STQMWFEESA-N 0.000 description 1
- ZNFPUOSTMUMUDR-JRQIVUDYSA-N Tyr-Asn-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZNFPUOSTMUMUDR-JRQIVUDYSA-N 0.000 description 1
- DANHCMVVXDXOHN-SRVKXCTJSA-N Tyr-Asp-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DANHCMVVXDXOHN-SRVKXCTJSA-N 0.000 description 1
- JWHOIHCOHMZSAR-QWRGUYRKSA-N Tyr-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JWHOIHCOHMZSAR-QWRGUYRKSA-N 0.000 description 1
- WPVGRKLNHJJCEN-BZSNNMDCSA-N Tyr-Asp-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 WPVGRKLNHJJCEN-BZSNNMDCSA-N 0.000 description 1
- NRFTYDWKWGJLAR-MELADBBJSA-N Tyr-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O NRFTYDWKWGJLAR-MELADBBJSA-N 0.000 description 1
- UABYBEBXFFNCIR-YDHLFZDLSA-N Tyr-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UABYBEBXFFNCIR-YDHLFZDLSA-N 0.000 description 1
- QHEGAOPHISYNDF-XDTLVQLUSA-N Tyr-Gln-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QHEGAOPHISYNDF-XDTLVQLUSA-N 0.000 description 1
- TWAVEIJGFCBWCG-JYJNAYRXSA-N Tyr-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N TWAVEIJGFCBWCG-JYJNAYRXSA-N 0.000 description 1
- WZQZUVWEPMGIMM-JYJNAYRXSA-N Tyr-Gln-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O WZQZUVWEPMGIMM-JYJNAYRXSA-N 0.000 description 1
- FXYOYUMPUJONGW-FHWLQOOXSA-N Tyr-Gln-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 FXYOYUMPUJONGW-FHWLQOOXSA-N 0.000 description 1
- FJKXUIJOMUWCDD-FHWLQOOXSA-N Tyr-Gln-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N)O FJKXUIJOMUWCDD-FHWLQOOXSA-N 0.000 description 1
- WVRUKYLYMFGKAN-IHRRRGAJSA-N Tyr-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 WVRUKYLYMFGKAN-IHRRRGAJSA-N 0.000 description 1
- HDSKHCBAVVWPCQ-FHWLQOOXSA-N Tyr-Glu-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HDSKHCBAVVWPCQ-FHWLQOOXSA-N 0.000 description 1
- ZRPLVTZTKPPSBT-AVGNSLFASA-N Tyr-Glu-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZRPLVTZTKPPSBT-AVGNSLFASA-N 0.000 description 1
- CNLKDWSAORJEMW-KWQFWETISA-N Tyr-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O CNLKDWSAORJEMW-KWQFWETISA-N 0.000 description 1
- CDHQEOXPWBDFPL-QWRGUYRKSA-N Tyr-Gly-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDHQEOXPWBDFPL-QWRGUYRKSA-N 0.000 description 1
- HIINQLBHPIQYHN-JTQLQIEISA-N Tyr-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HIINQLBHPIQYHN-JTQLQIEISA-N 0.000 description 1
- QAYSODICXVZUIA-WLTAIBSBSA-N Tyr-Gly-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O QAYSODICXVZUIA-WLTAIBSBSA-N 0.000 description 1
- CTDPLKMBVALCGN-JSGCOSHPSA-N Tyr-Gly-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O CTDPLKMBVALCGN-JSGCOSHPSA-N 0.000 description 1
- FIRUOPRJKCBLST-KKUMJFAQSA-N Tyr-His-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O FIRUOPRJKCBLST-KKUMJFAQSA-N 0.000 description 1
- LFCQXIXJQXWZJI-BZSNNMDCSA-N Tyr-His-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N)O LFCQXIXJQXWZJI-BZSNNMDCSA-N 0.000 description 1
- AXWBYOVVDRBOGU-SIUGBPQLSA-N Tyr-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N AXWBYOVVDRBOGU-SIUGBPQLSA-N 0.000 description 1
- DZKFGCNKEVMXFA-JUKXBJQTSA-N Tyr-Ile-His Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O DZKFGCNKEVMXFA-JUKXBJQTSA-N 0.000 description 1
- HVPPEXXUDXAPOM-MGHWNKPDSA-N Tyr-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HVPPEXXUDXAPOM-MGHWNKPDSA-N 0.000 description 1
- FJBCEFPCVPHPPM-STECZYCISA-N Tyr-Ile-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O FJBCEFPCVPHPPM-STECZYCISA-N 0.000 description 1
- KSCVLGXNQXKUAR-JYJNAYRXSA-N Tyr-Leu-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O KSCVLGXNQXKUAR-JYJNAYRXSA-N 0.000 description 1
- YKCXQOBTISTQJD-BZSNNMDCSA-N Tyr-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N YKCXQOBTISTQJD-BZSNNMDCSA-N 0.000 description 1
- WDGDKHLSDIOXQC-ACRUOGEOSA-N Tyr-Leu-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 WDGDKHLSDIOXQC-ACRUOGEOSA-N 0.000 description 1
- JAGGEZACYAAMIL-CQDKDKBSSA-N Tyr-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=C(C=C1)O)N JAGGEZACYAAMIL-CQDKDKBSSA-N 0.000 description 1
- WOAQYWUEUYMVGK-ULQDDVLXSA-N Tyr-Lys-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WOAQYWUEUYMVGK-ULQDDVLXSA-N 0.000 description 1
- HSBZWINKRYZCSQ-KKUMJFAQSA-N Tyr-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O HSBZWINKRYZCSQ-KKUMJFAQSA-N 0.000 description 1
- CNNVVEPJTFOGHI-ACRUOGEOSA-N Tyr-Lys-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CNNVVEPJTFOGHI-ACRUOGEOSA-N 0.000 description 1
- SCZJKZLFSSPJDP-ACRUOGEOSA-N Tyr-Phe-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O SCZJKZLFSSPJDP-ACRUOGEOSA-N 0.000 description 1
- GQVZBMROTPEPIF-SRVKXCTJSA-N Tyr-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GQVZBMROTPEPIF-SRVKXCTJSA-N 0.000 description 1
- HRHYJNLMIJWGLF-BZSNNMDCSA-N Tyr-Ser-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 HRHYJNLMIJWGLF-BZSNNMDCSA-N 0.000 description 1
- SYFHQHYTNCQCCN-MELADBBJSA-N Tyr-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O SYFHQHYTNCQCCN-MELADBBJSA-N 0.000 description 1
- NZBSVMQZQMEUHI-WZLNRYEVSA-N Tyr-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NZBSVMQZQMEUHI-WZLNRYEVSA-N 0.000 description 1
- PWKMJDQXKCENMF-MEYUZBJRSA-N Tyr-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O PWKMJDQXKCENMF-MEYUZBJRSA-N 0.000 description 1
- CLEGSEJVGBYZBJ-MEYUZBJRSA-N Tyr-Thr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CLEGSEJVGBYZBJ-MEYUZBJRSA-N 0.000 description 1
- GAKBTSMAPGLQFA-JNPHEJMOSA-N Tyr-Thr-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 GAKBTSMAPGLQFA-JNPHEJMOSA-N 0.000 description 1
- LVILBTSHPTWDGE-PMVMPFDFSA-N Tyr-Trp-Lys Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCCCN)C(O)=O)C1=CC=C(O)C=C1 LVILBTSHPTWDGE-PMVMPFDFSA-N 0.000 description 1
- GPLTZEMVOCZVAV-UFYCRDLUSA-N Tyr-Tyr-Arg Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=C(O)C=C1 GPLTZEMVOCZVAV-UFYCRDLUSA-N 0.000 description 1
- WYOBRXPIZVKNMF-IRXDYDNUSA-N Tyr-Tyr-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)NCC(O)=O)C1=CC=C(O)C=C1 WYOBRXPIZVKNMF-IRXDYDNUSA-N 0.000 description 1
- TYGHOWWWMTWVKM-HJOGWXRNSA-N Tyr-Tyr-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 TYGHOWWWMTWVKM-HJOGWXRNSA-N 0.000 description 1
- UUJHRSTVQCFDPA-UFYCRDLUSA-N Tyr-Tyr-Val Chemical compound C([C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 UUJHRSTVQCFDPA-UFYCRDLUSA-N 0.000 description 1
- 206010046865 Vaccinia virus infection Diseases 0.000 description 1
- UEOOXDLMQZBPFR-ZKWXMUAHSA-N Val-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N UEOOXDLMQZBPFR-ZKWXMUAHSA-N 0.000 description 1
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 1
- JFAWZADYPRMRCO-UBHSHLNASA-N Val-Ala-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JFAWZADYPRMRCO-UBHSHLNASA-N 0.000 description 1
- CWOSXNKDOACNJN-BZSNNMDCSA-N Val-Arg-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N CWOSXNKDOACNJN-BZSNNMDCSA-N 0.000 description 1
- WKWJJQZZZBBWKV-JYJNAYRXSA-N Val-Arg-Tyr Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WKWJJQZZZBBWKV-JYJNAYRXSA-N 0.000 description 1
- BYOHPUZJVXWHAE-BYULHYEWSA-N Val-Asn-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N BYOHPUZJVXWHAE-BYULHYEWSA-N 0.000 description 1
- AUMNPAUHKUNHHN-BYULHYEWSA-N Val-Asn-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N AUMNPAUHKUNHHN-BYULHYEWSA-N 0.000 description 1
- QGFPYRPIUXBYGR-YDHLFZDLSA-N Val-Asn-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N QGFPYRPIUXBYGR-YDHLFZDLSA-N 0.000 description 1
- IQQYYFPCWKWUHW-YDHLFZDLSA-N Val-Asn-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N IQQYYFPCWKWUHW-YDHLFZDLSA-N 0.000 description 1
- HHSILIQTHXABKM-YDHLFZDLSA-N Val-Asp-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](Cc1ccccc1)C(O)=O HHSILIQTHXABKM-YDHLFZDLSA-N 0.000 description 1
- CFSSLXZJEMERJY-NRPADANISA-N Val-Gln-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CFSSLXZJEMERJY-NRPADANISA-N 0.000 description 1
- YCMXFKWYJFZFKS-LAEOZQHASA-N Val-Gln-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCMXFKWYJFZFKS-LAEOZQHASA-N 0.000 description 1
- QHFQQRKNGCXTHL-AUTRQRHGSA-N Val-Gln-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QHFQQRKNGCXTHL-AUTRQRHGSA-N 0.000 description 1
- ZEVNVXYRZRIRCH-GVXVVHGQSA-N Val-Gln-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N ZEVNVXYRZRIRCH-GVXVVHGQSA-N 0.000 description 1
- XGJLNBNZNMVJRS-NRPADANISA-N Val-Glu-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O XGJLNBNZNMVJRS-NRPADANISA-N 0.000 description 1
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 1
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 1
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 1
- NXRAUQGGHPCJIB-RCOVLWMOSA-N Val-Gly-Asn Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O NXRAUQGGHPCJIB-RCOVLWMOSA-N 0.000 description 1
- DJEVQCWNMQOABE-RCOVLWMOSA-N Val-Gly-Asp Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N DJEVQCWNMQOABE-RCOVLWMOSA-N 0.000 description 1
- MDYSKHBSPXUOPV-JSGCOSHPSA-N Val-Gly-Phe Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N MDYSKHBSPXUOPV-JSGCOSHPSA-N 0.000 description 1
- BVWPHWLFGRCECJ-JSGCOSHPSA-N Val-Gly-Tyr Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N BVWPHWLFGRCECJ-JSGCOSHPSA-N 0.000 description 1
- FEFZWCSXEMVSPO-LSJOCFKGSA-N Val-His-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](C)C(O)=O FEFZWCSXEMVSPO-LSJOCFKGSA-N 0.000 description 1
- VHRLUTIMTDOVCG-PEDHHIEDSA-N Val-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](C(C)C)N VHRLUTIMTDOVCG-PEDHHIEDSA-N 0.000 description 1
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 1
- BZWUSZGQOILYEU-STECZYCISA-N Val-Ile-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 BZWUSZGQOILYEU-STECZYCISA-N 0.000 description 1
- HGJRMXOWUWVUOA-GVXVVHGQSA-N Val-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N HGJRMXOWUWVUOA-GVXVVHGQSA-N 0.000 description 1
- BTWMICVCQLKKNR-DCAQKATOSA-N Val-Leu-Ser Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C([O-])=O BTWMICVCQLKKNR-DCAQKATOSA-N 0.000 description 1
- GVJUTBOZZBTBIG-AVGNSLFASA-N Val-Lys-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N GVJUTBOZZBTBIG-AVGNSLFASA-N 0.000 description 1
- ZRSZTKTVPNSUNA-IHRRRGAJSA-N Val-Lys-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)C(C)C)C(O)=O ZRSZTKTVPNSUNA-IHRRRGAJSA-N 0.000 description 1
- IEBGHUMBJXIXHM-AVGNSLFASA-N Val-Lys-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N IEBGHUMBJXIXHM-AVGNSLFASA-N 0.000 description 1
- HPANGHISDXDUQY-ULQDDVLXSA-N Val-Lys-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HPANGHISDXDUQY-ULQDDVLXSA-N 0.000 description 1
- SBJCTAZFSZXWSR-AVGNSLFASA-N Val-Met-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N SBJCTAZFSZXWSR-AVGNSLFASA-N 0.000 description 1
- VNGKMNPAENRGDC-JYJNAYRXSA-N Val-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 VNGKMNPAENRGDC-JYJNAYRXSA-N 0.000 description 1
- NZGOVKLVQNOEKP-YDHLFZDLSA-N Val-Phe-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NZGOVKLVQNOEKP-YDHLFZDLSA-N 0.000 description 1
- WMRWZYSRQUORHJ-YDHLFZDLSA-N Val-Phe-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N WMRWZYSRQUORHJ-YDHLFZDLSA-N 0.000 description 1
- FMQGYTMERWBMSI-HJWJTTGWSA-N Val-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C(C)C)N FMQGYTMERWBMSI-HJWJTTGWSA-N 0.000 description 1
- JMCOXFSCTGKLLB-FKBYEOEOSA-N Val-Phe-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N JMCOXFSCTGKLLB-FKBYEOEOSA-N 0.000 description 1
- XBJKAZATRJBDCU-GUBZILKMSA-N Val-Pro-Ala Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O XBJKAZATRJBDCU-GUBZILKMSA-N 0.000 description 1
- KSFXWENSJABBFI-ZKWXMUAHSA-N Val-Ser-Asn Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KSFXWENSJABBFI-ZKWXMUAHSA-N 0.000 description 1
- RYHUIHUOYRNNIE-NRPADANISA-N Val-Ser-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RYHUIHUOYRNNIE-NRPADANISA-N 0.000 description 1
- VIKZGAUAKQZDOF-NRPADANISA-N Val-Ser-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O VIKZGAUAKQZDOF-NRPADANISA-N 0.000 description 1
- KRAHMIJVUPUOTQ-DCAQKATOSA-N Val-Ser-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N KRAHMIJVUPUOTQ-DCAQKATOSA-N 0.000 description 1
- QTPQHINADBYBNA-DCAQKATOSA-N Val-Ser-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN QTPQHINADBYBNA-DCAQKATOSA-N 0.000 description 1
- ZLMFVXMJFIWIRE-FHWLQOOXSA-N Val-Trp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](C(C)C)N ZLMFVXMJFIWIRE-FHWLQOOXSA-N 0.000 description 1
- PDASTHRLDFOZMG-JYJNAYRXSA-N Val-Tyr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 PDASTHRLDFOZMG-JYJNAYRXSA-N 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 1
- PTTGRYBBCYZPSL-UHFFFAOYSA-H [Al+3].[Al+3].OOP([O-])([O-])=O.OOP([O-])([O-])=O.OOP([O-])([O-])=O Chemical compound [Al+3].[Al+3].OOP([O-])([O-])=O.OOP([O-])([O-])=O.OOP([O-])([O-])=O PTTGRYBBCYZPSL-UHFFFAOYSA-H 0.000 description 1
- 238000010521 absorption reaction Methods 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 239000013543 active substance Substances 0.000 description 1
- 230000001154 acute effect Effects 0.000 description 1
- 230000000240 adjuvant effect Effects 0.000 description 1
- 238000012387 aerosolization Methods 0.000 description 1
- 238000003450 affinity purification method Methods 0.000 description 1
- AZDRQVAHHNSJOQ-UHFFFAOYSA-N alumane Chemical class [AlH3] AZDRQVAHHNSJOQ-UHFFFAOYSA-N 0.000 description 1
- WNROFYMDJYEPJX-UHFFFAOYSA-K aluminium hydroxide Chemical compound [OH-].[OH-].[OH-].[Al+3] WNROFYMDJYEPJX-UHFFFAOYSA-K 0.000 description 1
- ILRRQNADMUWWFW-UHFFFAOYSA-K aluminium phosphate Chemical compound O1[Al]2OP1(=O)O2 ILRRQNADMUWWFW-UHFFFAOYSA-K 0.000 description 1
- 229940047712 aluminum hydroxyphosphate Drugs 0.000 description 1
- 229940126575 aminoglycoside Drugs 0.000 description 1
- 229940069428 antacid Drugs 0.000 description 1
- 239000003159 antacid agent Substances 0.000 description 1
- 230000001458 anti-acid effect Effects 0.000 description 1
- PYMYPHUHKUWMLA-WDCZJNDASA-N arabinose Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)C=O PYMYPHUHKUWMLA-WDCZJNDASA-N 0.000 description 1
- PYMYPHUHKUWMLA-UHFFFAOYSA-N arabinose Natural products OCC(O)C(O)C(O)C=O PYMYPHUHKUWMLA-UHFFFAOYSA-N 0.000 description 1
- 229960003589 arginine hydrochloride Drugs 0.000 description 1
- 108010089442 arginyl-leucyl-alanyl-arginine Proteins 0.000 description 1
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 230000000621 autoagglutination Effects 0.000 description 1
- 239000000688 bacterial toxin Substances 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 108010036170 bicitropeptide Proteins 0.000 description 1
- 239000003833 bile salt Substances 0.000 description 1
- 229940093761 bile salts Drugs 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 239000007975 buffered saline Substances 0.000 description 1
- 238000010804 cDNA synthesis Methods 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 244000309466 calf Species 0.000 description 1
- 229960000530 carbenoxolone Drugs 0.000 description 1
- BVKZGUZCCUSVTD-UHFFFAOYSA-N carbonic acid Chemical class OC(O)=O BVKZGUZCCUSVTD-UHFFFAOYSA-N 0.000 description 1
- 101150055766 cat gene Proteins 0.000 description 1
- 229960001668 cefuroxime Drugs 0.000 description 1
- JFPVXVDWJQMJEE-IZRZKJBUSA-N cefuroxime Chemical compound N([C@@H]1C(N2C(=C(COC(N)=O)CS[C@@H]21)C(O)=O)=O)C(=O)\C(=N/OC)C1=CC=CO1 JFPVXVDWJQMJEE-IZRZKJBUSA-N 0.000 description 1
- 239000013592 cell lysate Substances 0.000 description 1
- 230000004700 cellular uptake Effects 0.000 description 1
- 230000003196 chaotropic effect Effects 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000013043 chemical agent Substances 0.000 description 1
- 150000001841 cholesterols Chemical class 0.000 description 1
- 239000000812 cholinergic antagonist Substances 0.000 description 1
- 208000016644 chronic atrophic gastritis Diseases 0.000 description 1
- 230000001684 chronic effect Effects 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000009260 cross reactivity Effects 0.000 description 1
- ATDGTVJJHBUTRL-UHFFFAOYSA-N cyanogen bromide Chemical compound BrC#N ATDGTVJJHBUTRL-UHFFFAOYSA-N 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 210000005045 desmin Anatomy 0.000 description 1
- MXHRCPNRJAMMIM-UHFFFAOYSA-N desoxyuridine Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-UHFFFAOYSA-N 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 239000001177 diphosphate Substances 0.000 description 1
- XPPKVPWEQAFLFU-UHFFFAOYSA-J diphosphate(4-) Chemical compound [O-]P([O-])(=O)OP([O-])([O-])=O XPPKVPWEQAFLFU-UHFFFAOYSA-J 0.000 description 1
- 235000011180 diphosphates Nutrition 0.000 description 1
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 239000003937 drug carrier Substances 0.000 description 1
- 238000009510 drug design Methods 0.000 description 1
- 208000000718 duodenal ulcer Diseases 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 239000012149 elution buffer Substances 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000006872 enzymatic polymerization reaction Methods 0.000 description 1
- 210000002919 epithelial cell Anatomy 0.000 description 1
- 125000004185 ester group Chemical group 0.000 description 1
- 230000029142 excretion Effects 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 230000037433 frameshift Effects 0.000 description 1
- 210000001035 gastrointestinal tract Anatomy 0.000 description 1
- 238000001415 gene therapy Methods 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- HPAIKDPJURGQLN-UHFFFAOYSA-N glycyl-L-histidyl-L-phenylalanine Natural products C=1C=CC=CC=1CC(C(O)=O)NC(=O)C(NC(=O)CN)CC1=CN=CN1 HPAIKDPJURGQLN-UHFFFAOYSA-N 0.000 description 1
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 1
- ZWCXYZRRTRDGQE-SORVKSEFSA-N gramicidina Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](CC(C)C)NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)NC(=O)[C@@H](CC(C)C)NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)NC(=O)[C@@H](CC(C)C)NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](C(C)C)NC(=O)[C@H](C)NC(=O)[C@H](NC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](NC=O)C(C)C)CC(C)C)C(=O)NCCO)=CNC2=C1 ZWCXYZRRTRDGQE-SORVKSEFSA-N 0.000 description 1
- 238000003306 harvesting Methods 0.000 description 1
- 229940077716 histamine h2 receptor antagonists for peptic ulcer and gord Drugs 0.000 description 1
- 238000002744 homologous recombination Methods 0.000 description 1
- 230000006801 homologous recombination Effects 0.000 description 1
- 230000003100 immobilizing effect Effects 0.000 description 1
- 238000003119 immunoblot Methods 0.000 description 1
- 238000010324 immunological assay Methods 0.000 description 1
- 230000001771 impaired effect Effects 0.000 description 1
- 210000003000 inclusion body Anatomy 0.000 description 1
- 229960003786 inosine Drugs 0.000 description 1
- 238000002743 insertional mutagenesis Methods 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 229940117681 interleukin-12 Drugs 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 238000007852 inverse PCR Methods 0.000 description 1
- 238000005342 ion exchange Methods 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 101150027374 irgA gene Proteins 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 108010071397 lactoferrin receptors Proteins 0.000 description 1
- 108010012058 leucyltyrosine Proteins 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 229960000274 lysozyme Drugs 0.000 description 1
- 235000010335 lysozyme Nutrition 0.000 description 1
- 239000004325 lysozyme Substances 0.000 description 1
- 239000001115 mace Substances 0.000 description 1
- 239000003120 macrolide antibiotic agent Substances 0.000 description 1
- 229940041033 macrolides Drugs 0.000 description 1
- 229910001629 magnesium chloride Inorganic materials 0.000 description 1
- 238000003760 magnetic stirring Methods 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 238000002844 melting Methods 0.000 description 1
- 230000008018 melting Effects 0.000 description 1
- 229910052751 metal Inorganic materials 0.000 description 1
- 239000002184 metal Substances 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 210000004877 mucosa Anatomy 0.000 description 1
- 230000016379 mucosal immune response Effects 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 210000000663 muscle cell Anatomy 0.000 description 1
- 229910052759 nickel Inorganic materials 0.000 description 1
- 229910001453 nickel ion Inorganic materials 0.000 description 1
- 231100000252 nontoxic Toxicity 0.000 description 1
- 230000003000 nontoxic effect Effects 0.000 description 1
- 210000000633 nuclear envelope Anatomy 0.000 description 1
- 239000006174 pH buffer Substances 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 230000008506 pathogenesis Effects 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 150000002960 penicillins Chemical class 0.000 description 1
- 229940101070 pepto-bismol Drugs 0.000 description 1
- 210000001322 periplasm Anatomy 0.000 description 1
- 230000035699 permeability Effects 0.000 description 1
- 239000008024 pharmaceutical diluent Substances 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 239000002953 phosphate buffered saline Substances 0.000 description 1
- 150000008104 phosphatidylethanolamines Chemical class 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- RMHMFHUVIITRHF-UHFFFAOYSA-N pirenzepine Chemical compound C1CN(C)CCN1CC(=O)N1C2=NC=CC=C2NC(=O)C2=CC=CC=C21 RMHMFHUVIITRHF-UHFFFAOYSA-N 0.000 description 1
- 229960004633 pirenzepine Drugs 0.000 description 1
- 229920000747 poly(lactic acid) Polymers 0.000 description 1
- 229920001606 poly(lactic acid-co-glycolic acid) Polymers 0.000 description 1
- 229920002627 poly(phosphazenes) Polymers 0.000 description 1
- 238000002264 polyacrylamide gel electrophoresis Methods 0.000 description 1
- 229920002647 polyamide Polymers 0.000 description 1
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 1
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 1
- 239000011148 porous material Substances 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 230000001376 precipitating effect Effects 0.000 description 1
- 229960003857 proglumide Drugs 0.000 description 1
- 108010070643 prolylglutamic acid Proteins 0.000 description 1
- 108010029020 prolylglycine Proteins 0.000 description 1
- 108010053725 prolylvaline Proteins 0.000 description 1
- 230000001915 proofreading effect Effects 0.000 description 1
- 125000001436 propyl group Chemical group [H]C([*])([H])C([H])([H])C([H])([H])[H] 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 230000005180 public health Effects 0.000 description 1
- 239000012264 purified product Substances 0.000 description 1
- 150000007660 quinolones Chemical class 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 230000009257 reactivity Effects 0.000 description 1
- 108020003175 receptors Proteins 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 229960001225 rifampicin Drugs 0.000 description 1
- JQXXHWHPUNPDRT-WLSIYKJHSA-N rifampicin Chemical compound O([C@](C1=O)(C)O/C=C/[C@@H]([C@H]([C@@H](OC(C)=O)[C@H](C)[C@H](O)[C@H](C)[C@@H](O)[C@@H](C)\C=C\C=C(C)/C(=O)NC=2C(O)=C3C([O-])=C4C)C)OC)C4=C1C3=C(O)C=2\C=N\N1CC[NH+](C)CC1 JQXXHWHPUNPDRT-WLSIYKJHSA-N 0.000 description 1
- 229930182490 saponin Natural products 0.000 description 1
- 150000007949 saponins Chemical class 0.000 description 1
- 235000017709 saponins Nutrition 0.000 description 1
- 230000009962 secretion pathway Effects 0.000 description 1
- 108010048818 seryl-histidine Proteins 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 238000004513 sizing Methods 0.000 description 1
- 235000020183 skimmed milk Nutrition 0.000 description 1
- 239000001509 sodium citrate Substances 0.000 description 1
- NLJMYIDDQXHKNR-UHFFFAOYSA-K sodium citrate Chemical compound O.O.[Na+].[Na+].[Na+].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O NLJMYIDDQXHKNR-UHFFFAOYSA-K 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- 238000005063 solubilization Methods 0.000 description 1
- 230000007928 solubilization Effects 0.000 description 1
- 125000006850 spacer group Chemical group 0.000 description 1
- 238000011895 specific detection Methods 0.000 description 1
- 229940063675 spermine Drugs 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 229960004291 sucralfate Drugs 0.000 description 1
- MNQYNQBOVCBZIQ-JQOFMKNESA-A sucralfate Chemical compound O[Al](O)OS(=O)(=O)O[C@@H]1[C@@H](OS(=O)(=O)O[Al](O)O)[C@H](OS(=O)(=O)O[Al](O)O)[C@@H](COS(=O)(=O)O[Al](O)O)O[C@H]1O[C@@]1(COS(=O)(=O)O[Al](O)O)[C@@H](OS(=O)(=O)O[Al](O)O)[C@H](OS(=O)(=O)O[Al](O)O)[C@@H](OS(=O)(=O)O[Al](O)O)O1 MNQYNQBOVCBZIQ-JQOFMKNESA-A 0.000 description 1
- FRGKKTITADJNOE-UHFFFAOYSA-N sulfanyloxyethane Chemical compound CCOS FRGKKTITADJNOE-UHFFFAOYSA-N 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 238000007910 systemic administration Methods 0.000 description 1
- 230000009885 systemic effect Effects 0.000 description 1
- 229950004351 telenzepine Drugs 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 229940040944 tetracyclines Drugs 0.000 description 1
- RTKIYNMVFMVABJ-UHFFFAOYSA-L thimerosal Chemical compound [Na+].CC[Hg]SC1=CC=CC=C1C([O-])=O RTKIYNMVFMVABJ-UHFFFAOYSA-L 0.000 description 1
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 1
- 101150097091 tnpA gene Proteins 0.000 description 1
- 231100000419 toxicity Toxicity 0.000 description 1
- 230000001988 toxicity Effects 0.000 description 1
- 238000000844 transformation Methods 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 230000032258 transport Effects 0.000 description 1
- 230000017105 transposition Effects 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- 108010045269 tryptophyltryptophan Proteins 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 229940125575 vaccine candidate Drugs 0.000 description 1
- 208000007089 vaccinia Diseases 0.000 description 1
- 230000002477 vacuolizing effect Effects 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 1
- 229940126580 vector vaccine Drugs 0.000 description 1
- 229940023147 viral vector vaccine Drugs 0.000 description 1
- 239000011782 vitamin Substances 0.000 description 1
- 235000013343 vitamin Nutrition 0.000 description 1
- 229940088594 vitamin Drugs 0.000 description 1
- 229930003231 vitamin Natural products 0.000 description 1
- 150000003722 vitamin derivatives Chemical class 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- 230000003442 weekly effect Effects 0.000 description 1
- 210000005253 yeast cell Anatomy 0.000 description 1
- DGVVWUTYPXICAM-UHFFFAOYSA-N β‐Mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/12—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria
- C07K16/1203—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria from Gram-negative bacteria
- C07K16/121—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria from Gram-negative bacteria from Helicobacter (Campylobacter) (G)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/205—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Campylobacter (G)
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Medicinal Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Immunology (AREA)
- Gastroenterology & Hepatology (AREA)
- Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Saccharide Compounds (AREA)
- Peptides Or Proteins (AREA)
Abstract
The invention provides Helicobacter polypeptides that can be used in vaccination methods for preventing or treating Helicobacter infection, and polynucleotides that encode these polypeptides.
Description
HELICOBACTER POLYPEPTIDES AND CORRESPONDING
POLYNUCLEOTIDE MOLECULES
The invention relates to Helicobacter antigens and corresponding polynucleotide molecules that can be used in methods to prevent or treat Helicobacter infection in mammals, such as humans.
Background of the Invention
Helicobacter is a genus of spiral, gram-negative bacteria that colonize the gastrointestinal tracts of mammals. Several species colonize the stomach, most notably H. pylori, H. heilmannii, H. felis, and H. mustelae. Although H. pylori is the species most commonly associated with human infection, H. heilmannii and H. felis have also been isolated from humans, but at lower frequencies than H. pylori. Helicobacter infects over 50% of adult populations in developed countries and nearly 100% in developing countries and some Pacific rim countries, making it one of the most prevalent infections worldwide.
Helicobacter is routinely recovered from gastric biopsies of humans with histological evidence of gastritis and peptic ulceration. Indeed, H. pylori is now recognized as an important pathogen of humans, in that the chronic gastritis it causes is a risk factor for the development of peptic ulcer diseases and gastric carcinoma. It is thus highly desirable to develop safe and effective vaccines for preventing and treating Helicobacter infection.
A number of Helicobacter antigens have been characterized or isolated. These include urease, which is composed of two structural subunits of approximately 30 and 67 kDa (Hu et al., Infect. Immun. 58:992, 1990; Dunn et al., J. Biol. Chem. 265:9464, 1990; Evans et al., Microbial Pathogenesis 10:15, 1991; Labigne et al., J. Bact. 173:1920, 1991); the 87 kDa vacuolar cytotoxin (VacA) (Cover et al., J. Biol. Chem. 267:10570, 1992; Phadnis et al., Infect. Immun. 62:1557, 1994; WO 93/18150); a 128 kDa immunodominant antigen associated with the cytotoxin (CagA, also called TagA; WO 93/18150; U.S. Patent No. 5,403,924); 13 and 58 kDa heat shock proteins HspA and HspB (Suerbaum et al., Mol. Microbiol. 14:959, 1994; WO 93/18150); a 54 kDa catalase (Hazell et al., J. Gen. Microbiol. 137:57, 1991); a 15 kDa histidine-rich protein (Hpn) (Gilbert et al., Infect. Immun. 63:2682, 1995); a 20 kDa membrane-associated lipoprotein (Kostrzynska et al., J. Bact. 176:5938, 1994); a 30 kDa outer membrane protein (Bolin et al., J. Clin. Microbiol. 33:381, 1995); a lactoferrin receptor (FR 2,724,936); and several porins, designated HopA, HopB, HopC, HopD, and HopE, which have molecular weights of 48-67 kDa (Exner et al., Infect. Immun. 63:1567, 1995; Doig et al., J. Bact. 177:5447, 1995). Some of these proteins have been proposed as potential vaccine antigens. In particular, urease is believed to be a vaccine candidate (WO 94/9823; WO 95/22987; WO 95/3824; Michetti et al., Gastroenterology 107:1002, 1994). Nevertheless, it is thought that several antigens may ultimately be necessary in a vaccine.
Summary of the Invention
The invention provides polynucleotide molecules that encode Helicobacter polypeptides, designated GHPO 13, GHPO 73, GHPO 90, GHPO 107, GHPO 136, GHPO 191, GHPO 213, GHPO 240, GHPO 408, GHPO 411, GHPO 419, GHPO 431, GHPO 474, GHPO 591, GHPO 596, GHPO 699, GHPO 724, GHPO 730, GHPO 761, GHPO 804, GHPO 805, GHPO 812, GHPO 879, GHPO 888, GHPO 986, GHPO 1056, GHPO 1081, GHPO 1100, GHPO 1140, GHPO 1148, GHPO 1200, GHPO 1212, GHPO 1258, GHPO 1263, GHPO 1273, GHPO 1284, GHPO 1299, GHPO 1327, GHPO 1346, GHPO 1378, GHPO 1412, GHPO 1443, GHPO 1466, GHPO 1476, GHPO 1536, GHPO 1559, GHPO 427, GHPO 1045, GHPO 1262, GHPO 1688, GHPO 1538, GHPO 346, GHPO 1012, GHPO 470, GHPO 1398, GHPO 1550, GHPO 276, GHPO 1501, GHPO 706, GHPO 1001, GHPO 732, GHPO 329, GHPO 574, GHPO 1190, GHPO 1374, GHPO 1620, GHPO 956, HPO 98, GHPO 689, GHPO 208, GHPO 296, GHPO 726, GHPO 1026, GHPO 1301, GHPO 1536, GHPO 166, GHPO 253, GHPO 297, GHPO 615, GHPO 1278, GHPO 1282, GHPO 1420, GHPO 1484, GHPO 1719, and GHPO 1252, which can be used, e.g., in methods to prevent, treat, or diagnose Helicobacter infection. The polypeptides of the invention include those having the amino acid sequences shown in SEQ ID NOs:2-170 (even numbers), as well as mature forms of proteins having sequences shown in SEQ ID NOs:2-170 in their unprocessed forms, and fragments thereof. Those skilled in the art will understand that the invention also includes polynucleotide molecules that encode mutants and derivatives of these polypeptides, which can result from the addition, deletion, or substitution of non-essential amino acids, as is described further below.
In addition to the polynucleotide molecules described above, the invention includes the corresponding polypeptides (i.e., polypeptides encoded by the polynucleotide molecules of the invention, or fragments thereof), and monospecific antibodies that specifically bind to these polypeptides.
The present invention has many applications and includes expression cassettes, vectors, and cells transformed or transfected with the polynucleotides of the invention. Accordingly, the present invention provides (i) methods for producing polypeptides of the invention in recombinant host systems and related expression cassettes, vectors, and transformed or transfected cells;
(ii) live vaccine vectors, such as pox virus, Salmonella typhimurium, and Vibrio cholerae vectors, that contain polynucleotides of the invention (such vaccine vectors being useful in, e.g., methods for preventing or treating Helicobacter infection) in combination with a diluent or carrier, and related pharmaceutical compositions and associated therapeutic and/or prophylactic methods; (iii) therapeutic and/or prophylactic methods involving administration of polynucleotide molecules, either in a naked form or formulated with a delivery vehicle, polypeptides or mixtures of polypeptides, or monospecific antibodies of the invention, and related pharmaceutical compositions; (iv) methods for detecting the presence of Helicobacter in biological samples, which can involve the use of polynucleotide molecules, monospecific antibodies, or polypeptides of the invention; and (v) methods for purifying polypeptides of the invention by antibody-based affinity chromatography.
Brief Description of the Drawings
Fig. 1A is a diagrammatic representation of transposon TnMax9, which is a derivative of the TnMax transposon system (Haas et al., Gene 130:23-31, 1993). The mini-transposon carries the blaM gene, which is the β-lactamase gene lacking a promoter and a signal sequence, next to the inverted repeats (IR) and the M13 forward (M13-FP) and reverse (M13-RP1) primer binding sites. The resolution site (res) and an origin of replication (orifd) are located between the blaM gene and the constitutive catGC resistance gene. The transposase tnpA and resolvase tnpR genes are located outside of the mini-transposon and are under the control of the inducible Ptrc promoter. The lacIq gene encodes the Lac repressor.
WO 98/21225 PCT/US97/21353
Fig. 1B is a diagrammatic representation of plasmid pMin2. pMin2 contains a multiple cloning site, the tetracycline resistance gene (tet), an origin of transfer (oriT), an origin of replication (oriColE1), a transcriptional terminator (tfd), and a weak, constitutive promoter (Piga). H. pylori chromosome fragments were introduced into the BglII and ClaI sites of pMin2.
Detailed Description
Open reading frames (ORFs) encoding new, full length polypeptides, designated GHPO 13, GHPO 73, GHPO 90, GHPO 107, GHPO 136, GHPO 191, GHPO 213, GHPO 240, GHPO 408, GHPO 411, GHPO 419, GHPO 431, GHPO 474, GHPO 591, GHPO 596, GHPO 699, GHPO 724, GHPO 730, GHPO 761, GHPO 804, GHPO 805, GHPO 812, GHPO 879, GHPO 888, GHPO 986, GHPO 1056, GHPO 1081, GHPO 1100, GHPO 1140, GHPO 1148, GHPO 1200, GHPO 1212, GHPO 1258, GHPO 1263, GHPO 1273, GHPO 1284, GHPO 1299, GHPO 1327, GHPO 1346, GHPO 1378, GHPO 1412, GHPO 1443, GHPO 1466, GHPO 1476, GHPO 1536, GHPO 1559, GHPO 427, GHPO 1045, GHPO 1262, GHPO 1688, GHPO 1538, GHPO 346, GHPO 1012, GHPO 470, GHPO 1398, GHPO 1550, GHPO 276, GHPO 1501, GHPO 706, GHPO 1001, GHPO 732, GHPO 329, GHPO 574, GHPO 1190, GHPO 1374, GHPO 1620, GHPO 956, HPO 98, GHPO 689, GHPO 208, GHPO 296, GHPO 726, GHPO 1026, GHPO 1301, GHPO 1536, GHPO 166, GHPO 253, GHPO 297, GHPO 615, GHPO 1278, GHPO 1282, GHPO 1420, GHPO 1484, GHPO 1719, and GHPO 1252, have been identified in the H. pylori genome. These polypeptides can be used, for example, in vaccination methods for preventing or treating Helicobacter infection. Some of the new polypeptides are secreted polypeptides that can be produced in their mature forms (i.e., as polypeptides that have been exported through class II or class III
secretion pathways) or as precursors that include signal peptides, which can be removed in the course of excretion/secretion by cleavage at the N-terminal end of the mature form. (The cleavage site is located at the C-terminal end of the signal peptide, adjacent to the mature form.) According to a first aspect of the invention, there are provided isolated polynucleotides that encode the precursor and mature forms of Helicobacter GHPO 13, GHPO 73, GHPO 90, GHPO 107, GHPO 136, GHPO
19I, GHPO 213, GHPO 240, GHPO 408, GHPO 411, GHPO 419, GHPO 431, GHPO 474, GHPO 591, GHPO 59G, GHPO G99, GHPO 724, GHPO 730, GHPO 7G1, GHPO 804, GHPO 805, GHPO 812, GHPO 879, GHPO 888, - GHPO 98G, GHPO 105G, GHPO 1081, GHPO 1 l00, GHPO 1140, GHPO
1 l48, GHPO I200, GHPO 1212, GHPO 12S8, GHPO 1263, GHPO 1273, GHPO 1284, GHPO 1299, GHPO 1327, GHPO 134G, GHPO 1378, GHPO
1412, GHPO 1443, GHPO 1466, GHPO 147G, GHPO 153G, GHPO 1559, GHPO 427, GHPO l045, GHPO l262, GHPO 1688, GHPO l538, GHPO 346, GHPO 1012, GHPO 470, GHPO l398, GHPO 1 S50, GHPO 27G, GHPO 1501, GHPO 70G, GHPO 1001, GHPO 732, GHPO 329, GHPO 574, GHPO 1190, GHPO 1374, GHPO 1G20, GHPO 956, HPO 98, GHPO 689, GHPO 208, GHPO 296, GHPO 726, GHPO 1026, GHPO 130I, GHPO 153G, GHPO 166, GHPO 253, GHPO 297, GHPO G15, GHPO l278, GHPO 1282, GHPO 1420, GHPO l484, GHPO 17l9, and GHPO l252.
An isolated polynucleotide of the invention encodes:
(i) a polypeptide having an amino acid sequence that is homologous to a Helicobacter amino acid sequence of a polypeptide, the Helicobacter amino acid sequence being selected from the group consisting of the amino acid sequences shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO 73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO 136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO 474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO 699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO 888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO 1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO 1212), SEQ ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO 1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO 1378), SEQ ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO 1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO 1045), SEQ ID NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID NO:102 (GHPO 1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO 1012), SEQ ID NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID NO:112 (GHPO 1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO 1501), SEQ ID NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 1001), SEQ ID NO:122 (GHPO 732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ ID NO:134 (GHPO 956), SEQ ID NO:136 (HPO 98), SEQ ID NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ ID NO:154 (GHPO 253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO 615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID NO:164 (GHPO 1420), SEQ ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO 1719), and SEQ ID NO:170 (GHPO 1252); or
(ii) a derivative of the polypeptide.
In addition to the full-length polypeptides encoded by the polynucleotides of the invention, as set forth above, polynucleotides included in the invention can also encode polypeptides that lack signal sequences, as well as other polypeptide or peptide fragments of the full-length polypeptides.
The term "isolated polynucleotide" is defined as a polynucleotide that is removed from the environment in which it naturally occurs. For example, a naturally occurring DNA molecule present in the genome of a living bacterium or as part of a gene bank is not isolated, but the same molecule, separated from the remaining part of the bacterial genome as a result of, e.g., a cloning event (amplification), is "isolated." Typically, an isolated DNA
molecule is free from DNA regions (e.g., coding regions) with which it is immediately contiguous, at the 5' or 3' ends, in the naturally occurring genome.
Such isolated polynucleotides can be part of a vector or a composition and still be isolated, as such a vector or composition is not part of its natural environment.
A polynucleotide of the invention can consist of RNA or DNA (e.g., cDNA, genomic DNA, or synthetic DNA), or modifications or combinations of RNA or DNA. The polynucleotide can be double-stranded or single-stranded and, if single-stranded, can be the coding (sense) strand or the non-coding (anti-sense) strand. The sequences that encode polypeptides of the invention, as shown in any of SEQ ID NOs:2-170 (even numbers), can be (a) the coding sequence as shown in any of SEQ ID NOs:1-169 (odd numbers); (b) a ribonucleotide sequence derived by transcription of (a); or (c) a different coding sequence that, as a result of the redundancy or degeneracy of the genetic code, encodes the same polypeptides as the polynucleotide molecules having the sequences illustrated in any of SEQ ID NOs:1-169 (odd numbers). The polypeptide can be one that is naturally secreted or excreted by, e.g., H. felis, H. mustelae, H. heilmanii, or H. pylori.
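As an illustration of options (b) and (c) above, the following minimal Python sketch transcribes a DNA coding sequence and shows that two different coding sequences can encode the same peptide because of the degeneracy of the genetic code. The short sequences are invented for the example and are not taken from any SEQ ID NO; the function names are likewise hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: standard code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def transcribe(dna):
    """Return the ribonucleotide sequence corresponding to a coding-strand DNA."""
    return dna.upper().replace("T", "U")


def translate(dna):
    """Translate a coding-strand DNA sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two made-up coding sequences that differ at the nucleotide level...
seq_a = "ATGCTGAAA"
seq_b = "ATGTTAAAG"
# ...but encode the same peptide (Met-Leu-Lys).
print(translate(seq_a), translate(seq_b), transcribe(seq_a))
```

The sketch only demonstrates the principle; it does not handle ambiguous bases or alternative translation tables.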
By "polypeptide" or "protein" is meant any chain of amino acids, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation). Both terms are used interchangeably in the present application.
By "homologous amino acid sequence" is meant an amino acid sequence that differs from an amino acid sequence shown in any of SEQ ID NOs:2-170 (even numbers), or an amino acid sequence encoded by the nucleotide sequence of any of SEQ ID NOs:1-169 (odd numbers), by one or more non-conservative amino acid substitutions, deletions, or additions located at positions at which they do not destroy the specific antigenicity of the polypeptide. Preferably, such a sequence is at least 75%, more preferably at least 80%, and most preferably at least 90% identical to an amino acid sequence shown in any of SEQ ID NOs:2-170 (even numbers).
Homologous amino acid sequences include sequences that are identical or substantially identical to an amino acid sequence as shown in any of SEQ ID NOs:2-170 (even numbers). By "amino acid sequence that is substantially identical" is meant a sequence that is at least 90%, preferably at least 95%, more preferably at least 97%, and most preferably at least 99%
identical to an amino acid sequence of reference and that differs from the sequence of reference, if at all, by a majority of conservative amino acid substitutions.
Conservative amino acid substitutions typically include substitutions among amino acids of the same class. These classes include, for example, amino acids having uncharged polar side chains, such as asparagine, glutamine, serine, threonine, and tyrosine; amino acids having basic side chains, such as lysine, arginine, and histidine; amino acids having acidic side chains, such as aspartic acid and glutamic acid; and amino acids having nonpolar side chains, such as glycine, alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan, and cysteine.
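The classes listed above can be written down directly as a lookup table. The following Python sketch, offered only as an illustration, treats a substitution as conservative when both residues fall in the same class; the class membership is copied from the paragraph above, while the function and variable names are hypothetical.

```python
# Amino acid classes as listed in the text (one-letter codes).
CLASSES = {
    "uncharged polar": "NQSTY",   # Asn, Gln, Ser, Thr, Tyr
    "basic": "KRH",               # Lys, Arg, His
    "acidic": "DE",               # Asp, Glu
    "nonpolar": "GAVLIPFMWC",     # Gly, Ala, Val, Leu, Ile, Pro, Phe, Met, Trp, Cys
}
CLASS_OF = {aa: name for name, members in CLASSES.items() for aa in members}


def is_conservative(original, replacement):
    """True if the two residues differ but belong to the same class."""
    a, b = original.upper(), replacement.upper()
    return a != b and CLASS_OF[a] == CLASS_OF[b]


print(is_conservative("K", "R"))  # both basic
print(is_conservative("D", "K"))  # acidic vs. basic
```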
Homology can be measured using sequence analysis software (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, WI 53705). Similar amino acid sequences are aligned to obtain the maximum degree of homology (i.e., identity). To this end, it may be necessary to artificially introduce gaps into the sequence. Once the optimal alignment has been set up, the degree of homology (i.e., identity) is established by recording all of the positions in which the amino acids of both sequences are identical, relative to the total number of positions.
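The identity calculation described above can be sketched in a few lines of Python. The sketch assumes that the two sequences have already been aligned (for example, by a sequence analysis package) and that gaps are written as "-"; it takes "the total number of positions" to mean the number of alignment columns, which is an interpretation rather than something the text states. The example sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    # A column counts as identical only if the residues match and are not gaps.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


# Made-up alignment: 5 identical columns out of 7.
identity = percent_identity("MKT-LLA", "MKTALIA")
print(round(identity, 1))
print(identity >= 75.0)  # compare against a chosen homology threshold
```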
Homologous polynucleotide sequences are defined in a similar way.
Preferably, a homologous sequence is one that is at least 45%, more preferably at least 60%, and most preferably at least 85% identical to a coding sequence of any of SEQ ID NOs:1-169 (odd numbers).
Polypeptides having a sequence homologous to any one of the sequences shown in SEQ ID NOs:2-170 (even numbers) include naturally occurring allelic variants, as well as mutants or any other non-naturally occurring variants that are analogous, in terms of antigenicity, to a polypeptide having a sequence as shown in any one of SEQ ID NOs:2-170 (even numbers).
As is known in the art, an allelic variant is an alternate form of a polypeptide that is characterized as having a substitution, deletion, or addition of one or more amino acids that does not alter the biological function of the polypeptide. By "biological function" is meant a function of the polypeptide in the cells in which it naturally occurs, even if the function is not necessary for the growth or survival of the cells. For example, the biological function of a porin is to allow the entry into cells of compounds present in the extracellular medium. The biological function is distinct from the antigenic function. A polypeptide can have more than one biological function.
Allelic variants are very common in nature. For example, a bacterial species, e.g., H. pylori, is usually represented by a variety of strains that differ from each other by minor allelic variations. Indeed, a polypeptide that fulfills the same biological function in different strains can have an amino acid sequence that is not identical in each of the strains. Such an allelic variation can be equally reflected at the polynucleotide level.
Support for the use of allelic variants of polypeptide antigens comes from, e.g., studies of the Helicobacter urease antigen. The amino acid sequence of Helicobacter urease varies widely from species to species, yet cross-species protection occurs, indicating that the urease molecule, when used as an immunogen, is highly tolerant of amino acid variations. Even among different strains of the single species H. pylori, there are amino acid sequence variations.
For example, although the amino acid sequences of the UreA and UreB subunits of H. pylori and H. felis ureases differ from one another by 26.5% and 11.8%, respectively (Ferrero et al., Molecular Microbiology 9(2):323-333, 1993), it has been shown that H. pylori urease protects mice from H. felis infection (Michetti et al., Gastroenterology 107:1002, 1994). In addition, it has been shown that the individual structural subunits of urease, UreA and UreB, which contain distinct amino acid sequences, are both protective antigens against Helicobacter infection (Michetti et al., supra).
Similarly, Cuenca et al. (Gastroenterology 110:1770, 1996) showed that therapeutic immunization of H. mustelae-infected ferrets with H. pylori urease was effective at eradicating H. mustelae infection. Further, several urease variants have been reported to be effective vaccine antigens, including, e.g., recombinant UreA + UreB apoenzyme expressed from pORV142 (UreA and UreB sequences derived from H. pylori strain CPM630; Lee et al., J. Infect. Dis. 172:161, 1995); recombinant UreA + UreB apoenzyme expressed from pORV214 (UreA and UreB sequences differ from H. pylori strain CPM630 by one and two amino acid changes, respectively; Lee et al., supra, 1995); a UreA-glutathione-S-transferase fusion protein (UreA sequence from H. pylori strain ATCC 43504; Thomas et al., Acta Gastro-Enterologica Belgica 56:54, 1993); UreA + UreB holoenzyme purified from H. pylori strain NCTC11637 (Marchetti et al., Science 267:1655, 1995); a UreA-MBP fusion protein (UreA from H. pylori strain 85P; Ferrero et al., Infection and Immunity 62:4981, 1994); a UreB-MBP fusion protein (UreB from H. pylori strain 85P; Ferrero et al., supra); a UreA-MBP fusion protein (UreA from H. felis strain ATCC 49179; Ferrero et al., supra); a UreB-MBP fusion protein (UreB from H. felis strain ATCC 49179; Ferrero et al., supra); and a 37 kDa fragment of UreB containing amino acids 220-569 (Dore-Davin et al., "A 37 kD fragment of UreB is sufficient to confer protection against Helicobacter felis infection in mice"). Finally, Thomas et al. (supra) showed that oral immunization of mice with crude sonicates of H. pylori protected mice from subsequent challenge with H. felis.
Polynucleotides, e.g., DNA molecules, encoding allelic variants can easily be obtained by polymerase chain reaction (PCR) amplification of genomic bacterial DNA extracted by conventional methods. This involves the use of synthetic oligonucleotide primers matching sequences that are upstream and downstream of the 5' and 3' ends of the coding region. Suitable primers can be designed based on the nucleotide sequence information provided in any of SEQ ID NOs:1-169 (odd numbers). Typically, a primer consists of 10 to 40, preferably 15 to 25 nucleotides. It can also be advantageous to select primers containing C and G nucleotides in proportions sufficient to ensure efficient hybridization, e.g., an amount of C and G nucleotides of at least 40%, preferably 50%, of the total nucleotide amount. Those skilled in the art can readily design primers that can be used to isolate the polynucleotides of the invention from different Helicobacter strains. Experimental conditions for carrying out PCR can readily be determined by one skilled in the art and an illustration of carrying out PCR is provided in the Examples below. As is well known in the art, restriction endonuclease recognition sites that contain, typically, 4 to 6 nucleotides (for example, the sequences 5'-GGATCC-3' (BamHI) or 5'-CTCGAG-3' (XhoI)), can be included on the 5' ends of the primers. Restriction sites can be selected by those skilled in the art so that the amplified DNA can be conveniently cloned into an appropriately digested vector, such as a plasmid.
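The primer guidelines above (10 to 40 nucleotides, preferably 15 to 25; at least 40% G+C; an optional restriction site on the 5' end) lend themselves to a simple check. The Python sketch below is an illustration only: the candidate primer is invented, the function names are hypothetical, and applying the length and G+C criteria to the annealing portion of the primer (before the restriction site is added) is an assumption.

```python
# Recognition sequences quoted in the text.
RESTRICTION_SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}


def gc_percent(seq):
    """Percentage of G and C nucleotides in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def check_primer(seq, min_len=10, max_len=40, min_gc=40.0):
    """Evaluate a primer against the length and G+C guidelines given in the text."""
    return {
        "length_ok": min_len <= len(seq) <= max_len,
        "preferred_length": 15 <= len(seq) <= 25,
        "gc_ok": gc_percent(seq) >= min_gc,
    }


def add_restriction_site(primer, enzyme):
    """Prepend the recognition site of the named enzyme to the 5' end of a primer."""
    return RESTRICTION_SITES[enzyme] + primer.upper()


candidate = "ATGAAACTGGCTAAAGCG"  # made-up 18-mer
print(check_primer(candidate))
print(add_restriction_site(candidate, "BamHI"))
```

In practice a few extra nucleotides are often placed 5' of the restriction site to aid digestion; that refinement is not described in the text and is left out of the sketch.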
Useful homologs that do not occur naturally can be designed using known methods for identifying regions of an antigen that are likely to be tolerant of amino acid sequence changes and/or deletions. For example, sequences of the antigen from different species can be compared to identify conserved sequences.
Polypeptide derivatives that are encoded by polynucleotides of the invention include, e.g., fragments, polypeptides having large internal deletions derived from full-length polypeptides, and fusion proteins. Polypeptide fragments of the invention can be derived from a polypeptide having a sequence homologous to any of the sequences of SEQ ID NOs:2-170 (even numbers), to the extent that the fragments retain the substantial antigenicity of the parent polypeptide (specific antigenicity). Polypeptide derivatives can also be constructed by large internal deletions that remove a substantial part of the parent polypeptide, while retaining specific antigenicity. Generally, polypeptide derivatives should be at least about 12 amino acids in length to maintain antigenicity. Advantageously, they can be at least 20 amino acids, preferably at least 50 amino acids, more preferably at least 75 amino acids, and most preferably at least 100 amino acids in length.
Useful polypeptide derivatives, e.g., polypeptide fragments, can be designed using computer-assisted analysis of amino acid sequences in order to identify sites in protein antigens having potential as surface-exposed, antigenic regions (Hughes et al., Infect. Immun. 60(9):3497, 1992). For example, the Laser Gene Program from DNA Star can be used to obtain hydrophilicity, antigenic index, and intensity index plots for the polypeptides of the invention.
This program can also be used to obtain information about homologies of the polypeptides with known protein motifs. One skilled in the art can readily use the information provided in such plots to select peptide fragments for use as vaccine antigens. For example, fragments spanning regions of the plots in which the antigenic index is relatively high can be selected. One can also select fragments spanning regions in which both the antigenic index and the intensity plots are relatively high. Fragments containing conserved sequences, particularly hydrophilic conserved sequences, can also be selected.
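To give a sense of how a hydrophilicity plot can guide fragment selection, the following Python sketch computes a sliding-window hydrophilicity profile and reports the most hydrophilic window. It is not the Laser Gene algorithm and does not compute an antigenic index; it assumes the Hopp-Woods hydrophilicity scale as commonly tabulated (values should be verified before any real use), a six-residue window, and an invented test sequence. All function names are hypothetical.

```python
# Hopp-Woods hydrophilicity values as commonly tabulated (assumption; verify before use).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "H": -0.5, "A": -0.5, "C": -1.0, "M": -1.3,
    "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(seq, window=6):
    """Mean hydrophilicity of every window of the given size along the sequence."""
    scores = [HOPP_WOODS[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(seq, window=6):
    """Return (start index, fragment, mean score) of the highest-scoring window."""
    profile = hydrophilicity_profile(seq, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window], profile[start]


# Made-up sequence with a charged stretch flanked by leucines.
print(most_hydrophilic_window("LLLLLLKDEKRDLLLLLL"))
```

A window with a high mean score marks a candidate surface-exposed region; whether a fragment from that region is antigenic would still have to be established experimentally.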
Polypeptide fragments and polypeptides having large internal deletions can be used for revealing epitopes that are otherwise masked in the parent polypeptide and that may be of importance for inducing a protective T
cell-dependent immune response. Deletions can also remove immunodominant regions of high variability among strains.
It is an accepted practice in the field of immunology to use fragments and variants of protein immunogens as vaccines, as all that is required to induce an immune response to a protein is a small (e.g., 8 to 10 amino acids) immunogenic region of the protein. This has been done for a number of vaccines against pathogens other than Helicobacter. For example, short synthetic peptides corresponding to surface-exposed antigens of pathogens such as murine mammary tumor virus (peptide containing 11 amino acids; Dion et al., Virology 179:474-477, 1990), Semliki Forest virus (peptide containing 16 amino acids; Snijders et al., J. Gen. Virol. 72:557-565, 1991), and canine parvovirus (2 overlapping peptides, each containing 15 amino acids; Langeveld et al., Vaccine 12(15):1473-1480, 1994) have been shown to be effective vaccine antigens against their respective pathogens.
Polynucleotides encoding polypeptide fragments and polypeptides having large internal deletions can be constructed using standard methods (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons Inc., 1994), for example, by PCR, including inverse PCR, by restriction enzyme treatment of the cloned DNA molecules, or by the method of Kunkel et al. (Proc. Natl. Acad. Sci. USA 82:448, 1985; biological material available at Stratagene).
A polypeptide derivative can also be produced as a fusion polypeptide that contains a polypeptide or a polypeptide derivative of the invention fused, e.g., at the N- or C-terminal end, to any other polypeptide (hereinafter referred to as a peptide tail). Such a product can be easily obtained by translation of a genetic fusion, i.e., a hybrid gene. Vectors for expressing fusion polypeptides are commercially available, and include the pMal-c2 or pMal-p2 systems of New England Biolabs, in which the peptide tail is a maltose binding protein, the glutathione-S-transferase system of Pharmacia, or the His-Tag system available from Novagen. These and other expression systems provide convenient means for further purification of polypeptides and derivatives of the invention.
Another particular example of fusion polypeptides included in the invention is a polypeptide or polypeptide derivative of the invention fused to a polypeptide having adjuvant activity, such as, e.g., subunit B of either cholera toxin or E. coli heat-labile toxin. Several possibilities can be used for producing such fusion proteins. First, the polypeptide of the invention can be fused to the N-terminal end or, preferably, to the C-terminal end of the polypeptide having adjuvant activity. Second, a polypeptide fragment of the invention can be fused within the amino acid sequence of the polypeptide having adjuvant activity.
Spacer sequences can also be included, if desired.
As stated above, the polynucleotides of the invention encode Helicobacter polypeptides in precursor or mature form. They can also encode hybrid precursors containing heterologous signal peptides, which can mature into polypeptides of the invention. By "heterologous signal peptide" is meant a signal peptide that is not found in the naturally occurring precursor of a polypeptide of the invention.
A polynucleotide of the invention hybridizes, preferably under stringent conditions, to a polynucleotide having a sequence as shown in any of SEQ ID NOs:1-169 (odd numbers). Hybridization procedures are, e.g., described by Ausubel et al. (supra); Silhavy et al. (Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1984); and Davis et al. (A Manual for Genetic Engineering: Advanced Bacterial Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1980). Important parameters that can be considered for optimizing hybridization conditions are reflected in the following formula, which facilitates calculation of the melting temperature (Tm), which is the temperature above which two complementary DNA strands separate from one another (Casey et al., Nucl. Acid Res. 4:1539, 1977): Tm = 81.5 + 0.5 x (% G+C) + 1.6 log (positive ion concentration) - 0.6 x (% formamide). Under appropriate stringency conditions, hybridization temperature (Th) is approximately 20 to 40°C, 20 to 25°C, or, preferably, 30 to 40°C below the calculated Tm. Those skilled in the art will understand that optimal temperature and salt conditions can be readily determined empirically in preliminary experiments using conventional procedures. For example, stringent conditions can be achieved, both for pre-hybridizing and hybridizing incubations, (i) within 4-16 hours at 42°C, in 6 x SSC containing 50% formamide or (ii) within 4-16 hours at 6~~ ~C in an aqueous 6 x SSC solution (1 M NaCl, 0.1 M sodium citrate (pH 7.0)). For polynucleotides containing 30 to 600 nucleotides, the above formula is used and then is corrected by subtracting (600/polynucleotide size in base pairs). Stringency conditions are defined by a Th that is 5 to 10°C below Tm.
Hybridization conditions with oligonucleotides shorter than 20-30 bases do not precisely follow the rules set forth above. In such cases, the formula for calculating the Tm is as follows: Tm = 4 x (G+C) + 2 (A+T). For example, an 18 nucleotide fragment of 50% G+C would have an approximate Tm of 54°C.
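The two melting temperature formulas above can be checked numerically with a short Python sketch. It uses the coefficients exactly as they appear in this text (other references give different coefficients for the long-probe formula), assumes that "log" means the base-10 logarithm and that the ion concentration is in mol/L, and applies the 600/size correction only for lengths of 30 to 600 base pairs. The function names and the example oligonucleotide are hypothetical.

```python
import math


def tm_long(gc_percent, ion_molar, formamide_percent=0.0, length_bp=None):
    """Tm (deg C) from the long-probe formula as written in the text."""
    tm = (81.5 + 0.5 * gc_percent
          + 1.6 * math.log10(ion_molar)      # assumption: base-10 logarithm
          - 0.6 * formamide_percent)
    if length_bp is not None and 30 <= length_bp <= 600:
        tm -= 600.0 / length_bp              # correction for 30 to 600 nucleotides
    return tm


def tm_oligo(seq):
    """Tm (deg C) for short oligonucleotides: 4 x (G+C) + 2 x (A+T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


def stringent_th_range(tm, low=5.0, high=10.0):
    """Hybridization temperature range lying 5 to 10 deg C below Tm."""
    return tm - high, tm - low


# The worked example from the text: an 18-mer with 50% G+C.
oligo = "G" * 9 + "A" * 9
print(tm_oligo(oligo))                       # 4*9 + 2*9 = 54
print(stringent_th_range(tm_oligo(oligo)))
print(tm_long(50, 1.0, length_bp=300))
```

The first printed value reproduces the 54°C example given in the paragraph above.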
A polynucleotide molecule of the invention, containing RNA, DNA, or modifications or combinations thereof, can have various applications. For example, a polynucleotide molecule can be used (i) in a process for producing the encoded polypeptide in a recombinant host system, (ii) in the construction of vaccine vectors, such as poxviruses, which are further used in methods and compositions for preventing and/or treating Helicobacter infection, (iii) as a vaccine agent, in a naked form or formulated with a delivery vehicle, and (iv) in the construction of attenuated Helicobacter strains that can over-express a polynucleotide of the invention or express it in a non-toxic, mutated form.
According to a second aspect of the invention, there is therefore provided (i) an expression cassette containing a polynucleotide molecule of the invention placed under the control of elements (e.g., a promoter) required for expression; (ii) an expression vector containing an expression cassette of the invention; (iii) a procaryotic or eucaryotic cell transformed or transfected with an expression cassette and/or vector of the invention, as well as (iv) a process for producing a polypeptide or polypeptide derivative encoded by a polynucleotide of the invention, which involves culturing a procaryotic or eucaryotic cell transformed or transfected with an expression cassette and/or vector of the invention, under conditions that allow expression of the polynucleotide molecule of the invention, and recovering the encoded polypeptide or polypeptide derivative from the cell culture.
A recombinant expression system can be selected from procaryotic and eucaryotic hosts. Eucaryotic hosts include, for example, yeast cells (e.g., Saccharomyces cerevisiae or Pichia pastoris), mammalian cells (e.g., COS 1, NIH3T3, or JEG3 cells), arthropod cells (e.g., Spodoptera frugiperda (SF9) cells), and plant cells. Preferably, a procaryotic host such as E. coli is used.
Bacterial and eucaryotic cells are available from a number of different sources that are known to those skilled in the art, e.g., the American Type Culture Collection (ATCC; Rockville, Maryland).
The choice of the expression cassette will depend on the host system selected, as well as the features desired for the expressed polypeptide. For example, it may be useful to produce a polypeptide of the invention in a particular lipidated form or any other form. Typically, an expression cassette includes a constitutive or inducible promoter that is functional in the selected host system; a ribosome binding site; a start codon (ATG); if necessary, a region encoding a signal peptide, e.g., a lipidation signal peptide; a polynucleotide molecule of the invention; a stop codon; and, optionally, a 3' terminal region (translation and/or transcription terminator). The signal peptide-encoding region is adjacent to the polynucleotide of the invention and is placed in the proper reading frame. The signal peptide-encoding region can be homologous or heterologous to the polynucleotide molecule encoding the mature polypeptide and it can be specific to the secretion apparatus of the host used for expression. The open reading frame constituted by the polynucleotide molecule of the invention, alone or together with the signal peptide, is placed under the control of the promoter so that transcription and translation occur in the host system. Promoters and signal peptide-encoding regions are widely known and available to those skilled in the art and include, for example, the promoter of Salmonella typhimurium (and derivatives) that is inducible by arabinose (promoter araB) and is functional in Gram-negative bacteria such as E. coli (U.S. Patent No. 5,028,530; Cagnon et al., Protein Engineering 4(7):843, 1991); the promoter of the bacteriophage T7 RNA polymerase gene, which is functional in a number of E. coli strains expressing T7 polymerase (U.S. Patent No. 4,952,496); the OspA lipidation signal peptide; and the RlpB lipidation signal peptide (Takase et al., J. Bact. 169:4692, 1987).
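The requirement that the signal peptide-encoding region be placed "in the proper reading frame" with the polynucleotide encoding the mature polypeptide can be expressed as a simple consistency check. The Python sketch below is illustrative only: it covers just the coding part of a cassette (start codon through stop codon), the sequences are invented, and the function name and its checks are assumptions rather than a procedure taken from the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_orf(mature_region, signal_region="", stop_codon="TAA"):
    """Join an optional signal peptide-encoding region and a mature coding region
    into one open reading frame, checking start codon, frame, and stop codons."""
    signal_region, mature_region = signal_region.upper(), mature_region.upper()
    stop_codon = stop_codon.upper()
    coding = signal_region + mature_region
    if not coding.startswith("ATG"):
        raise ValueError("open reading frame must begin with the start codon ATG")
    if len(signal_region) % 3 or len(mature_region) % 3:
        raise ValueError("each region must be a whole number of codons to stay in frame")
    codons = [coding[i:i + 3] for i in range(0, len(coding), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("premature stop codon inside the coding region")
    if stop_codon not in STOP_CODONS:
        raise ValueError("not a stop codon: " + stop_codon)
    return coding + stop_codon


# Made-up regions: a 3-codon "signal" followed by a 3-codon "mature" segment.
print(assemble_orf("GCTGCTGCT", signal_region="ATGAAAAAA"))
```

A frame shift introduced by a signal region whose length is not a multiple of three is rejected, which is the situation the "proper reading frame" language guards against.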
The expression cassette is typically part of an expression vector, which is selected for its ability to replicate in the chosen expression system.
Expression vectors (e.g. plasmids or viral vectors) can be chosen from, for example, those described in Pouwels et al. (Cloning Vectors: A Laboratory Manual, 1985, Supp. 1987), and can be purchased from various commercial sources. Methods for transforming or transfecting host cells with expression vectors are well known in the art and will depend on the host system selected, as described in Ausubel et al. (supra).
Upon expression, a recombinant polypeptide of the invention (or a polypeptide derivative) is produced and remains in the intracellular compartment, is secreted/excreted in the extracellular medium or in the periplasmic space, or is embedded in the cellular membrane. The polypeptide can then be recovered in a substantially purified form from the cell extract or from the supernatant after centrifugation of the cell culture. Typically, the recombinant polypeptide can be purified by antibody-based affinity purification or by any other method known in the art, such as by genetic fusion to a small affinity-binding domain. Antibody-based affinity purification methods are also available for purifying a polypeptide of the invention extracted from a Helicobacter strain. Antibodies useful for immunoaffinity purification of the polypeptides of the invention can be obtained using methods described below.
Polynucleotides of the invention can also be used in DNA
vaccination methods, using either a viral or bacterial host as gene delivery vehicle (live vaccine vector) or administering the gene in a free form, e.g., inserted into a plasmid. Therapeutic or prophylactic efficacy of a polynucleotide of the invention can be evaluated as is described below.
Accordingly, in a third aspect of the invention, there is provided (i) a vaccine vector such as a poxvirus, containing a polynucleotide molecule of the invention placed under the control of elements required for expression; (ii) a composition of matter containing a vaccine vector of the invention, together with a diluent or carrier; (iii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a vaccine vector of the invention; (iv) a method for inducing an immune response against Helicobacter in a mammal (e.g., a human; alternatively, the method can be used in veterinary applications for treating or preventing Helicobacter infection of animals, e.g., cats or birds), which involves administering to the mammal an immunogenically effective amount of a vaccine vector of the invention to elicit an immune response, e.g., a protective or therapeutic immune response to Helicobacter; and (v) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, which involves administering a prophylactic or therapeutic amount of a vaccine vector of the invention to an individual in need. Additionally, the third aspect of the invention encompasses the use of a vaccine vector of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection.
A vaccine vector of the invention can express one or several polypeptides or derivatives of the invention, as well as at least one additional Helicobacter antigen such as a urease apoenzyme or a subunit, fragment, homolog, mutant, or derivative thereof. In addition, it can express a cytokine, such as interleukin-2 (IL-2) or interleukin-12 (IL-12), that enhances the immune response. Thus, a vaccine vector can include an additional polynucleotide molecule encoding, e.g., urease subunit A, B, or both, or a cytokine, placed under the control of elements required for expression in a mammalian cell.
Alternatively, a composition of the invention can include several vaccine vectors, each of which is capable of expressing a polypeptide or derivative of the invention. A composition can also contain a vaccine vector capable of expressing an additional Helicobacter antigen, such as urease apoenzyme, a subunit, fragment, homolog, mutant, or derivative thereof, or a cytokine such as IL-2 or IL-12.
In vaccination methods for treating or preventing infection in a mammal, a vaccine vector of the invention can be administered by any conventional route in use in the vaccine field, for example, to a mucosal (e.g., ocular, intranasal, oral, gastric, pulmonary, intestinal, rectal, vaginal, or urinary tract) surface or via a parenteral (e.g., subcutaneous, intradermal, intramuscular, intravenous, or intraperitoneal) route, or a combination thereof.
Preferred routes depend upon the choice of the vaccine vector. The administration can be achieved in a single dose or repeated at intervals. The appropriate dosage depends on various parameters that are understood by those skilled in the art, such as the nature of the vaccine vector itself, the route of administration, and the condition of the mammal to be vaccinated (e.g., the weight, age, and general health of the mammal).
Live vaccine vectors that can be used in the invention include viral vectors, such as adenoviruses and poxviruses, as well as bacterial vectors, e.g., Shigella, Salmonella, Vibrio cholerae, Lactobacillus, Bacille bilié de Calmette-Guérin (BCG), and Streptococcus. An example of an adenovirus vector, as well as a method for constructing an adenovirus vector capable of expressing a polynucleotide molecule of the invention, is described in U.S. Patent No.
4,920,209. Poxvirus vectors that can be used in the invention include, e.g., vaccinia and canary pox viruses, which are described in U.S. Patent No.
4,722,848 and U.S. Patent No. 5,364,773, respectively (also see, e.g., Tartaglia et al., Virology 188:217, 1992, for a description of a vaccinia virus vector, and Taylor et al., Vaccine 13:539, 1995, for a description of a canary poxvirus vector). Poxvirus vectors capable of expressing a polynucleotide of the invention can be obtained by homologous recombination, as described in Kieny et al. (Nature 312:163, 1984), so that the polynucleotide of the invention is inserted into the viral genome under appropriate conditions for expression in mammalian cells. Generally, the dose of viral vector vaccine, for therapeutic or prophylactic use, can be from about 1 x 10⁴ to about 1 x 10¹¹, advantageously from about 1 x 10⁷ to about 1 x 10¹⁰, or, preferably, from about 1 x 10⁷ to about 1 x 10⁹ plaque-forming units per kilogram. Preferably, viral vectors are administered parenterally, for example, in 3 doses that are 4 weeks apart.
Those skilled in the art will recognize that it is preferable to avoid adding a chemical adjuvant to a composition containing a viral vector of the invention, thereby minimizing the immune response to the viral vector itself.
Non-toxicogenic Vibrio cholerae mutant strains that can be used in live oral vaccines are described by Mekalanos et al. (Nature 306:551, 1983) and in U.S. Patent No. 4,882,278 (strain in which a substantial amount of the coding sequence of each of the two ctxA alleles has been deleted so that no functional cholera toxin is produced); WO 92/11354 (strain in which the irgA locus is inactivated by mutation; this mutation can be combined in a single strain with ctxA mutations); and WO 94/1533 (deletion mutant lacking functional ctxA and attRS1 DNA sequences). These strains can be genetically engineered to express heterologous antigens, as described in WO 94/19482.
An effective vaccine dose of a V. cholerae strain capable of expressing a polypeptide or polypeptide derivative encoded by a polynucleotide molecule of the invention can contain, e.g., about 1 x 10⁵ to about 1 x 10⁹, preferably about 1 x 10⁶ to about 1 x 10⁸, viable bacteria in an appropriate volume for the selected route of administration. Preferred routes of administration include all mucosal routes, but, most preferably, these vectors are administered intranasally or orally.
Attenuated Salmonella typhimurium strains, genetically engineered for recombinant expression of heterologous antigens, and their use as oral vaccines, are described by Nakayama et al. (Bio/Technology 6:693, 1988) and in WO 92/11361. Preferred routes of administration for these vectors include all mucosal routes. Most preferably, the vectors are administered intranasally or orally.
Other bacterial strains useful as vaccine vectors are described by High et al. (EMBO 11:1991, 1992) and Sizemore et al. (Science 270:299, 1995; Shigella flexneri); Medaglini et al. (Proc. Natl. Acad. Sci. USA 92:6868, 1995; Streptococcus gordonii); Flynn (Cell. Mol. Biol. 40 (suppl. I):31, 1994), and in WO 88/6626, WO 90/0594, WO 91/13157, WO 92/1796, and WO 92/21376 (Bacille Calmette Guerin). In bacterial vectors, a polynucleotide of the invention can be inserted into the bacterial genome or it can remain in a free state, for example, carried on a plasmid.
An adjuvant can also be added to a composition containing a bacterial vector vaccine. A number of adjuvants that can be used are known to those skilled in the art. For example, preferred adjuvants can be selected from the list provided below.
According to a fourth aspect of the invention, there is also provided (i) a composition of matter containing a polynucleotide of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a polynucleotide of the invention; (iii) a method for inducing an immune response against Helicobacter in a mammal by administering to the mammal an immunogenically effective amount of a polynucleotide of the invention to elicit an immune response, e.g., a protective immune response to Helicobacter; and (iv) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, by administering a prophylactic or therapeutic amount of a polynucleotide of the invention to an individual in need of such treatment. Additionally, the fourth aspect of the invention encompasses the use of a polynucleotide of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection. The fourth aspect of the invention preferably includes the use of a polynucleotide molecule placed under conditions for expression in a mammalian cell, e.g., in a plasmid that is unable to replicate in mammalian cells and to substantially integrate into a mammalian genome.
Polynucleotides (for example, DNA or RNA molecules) of the invention can also be administered as such to a mammal as a vaccine. When a DNA molecule of the invention is used, it can be in the form of a plasmid that is unable to replicate in a mammalian cell and unable to integrate into the mammalian genome. Typically, a DNA molecule is placed under the control of a promoter suitable for expression in a mammalian cell. The promoter can function ubiquitously or tissue-specifically. Examples of non-tissue specific promoters include the early Cytomegalovirus (CMV) promoter (U.S. Patent No. 4,168,062) and the Rous Sarcoma Virus promoter (Norton et al., Molec. Cell Biol. 5:281, 1985). The desmin promoter (Li et al., Gene 78:243, 1989; Li et al., J. Biol. Chem. 266:6562, 1991; Li et al., J. Biol. Chem. 268:10403, 1993) is tissue-specific and drives expression in muscle cells. More generally, useful promoters and vectors are described, e.g., in WO 94/21797 and by Hartikka et al. (Human Gene Therapy 7:1205, 1996).
For DNA/RNA vaccination, the polynucleotide of the invention can encode a precursor or a mature form of a polypeptide of the invention. When it encodes a precursor form, the precursor sequence can be homologous or heterologous. In the latter case, a eucaryotic leader sequence can be used, such as the leader sequence of the tissue-type plasminogen activator (tPA).
A composition of the invention can contain one or several polynucleotides of the invention. It can also contain at least one additional polynucleotide encoding another Helicobacter antigen, such as urease subunit A, B, or both, or a fragment, derivative, mutant, or analog thereof. A polynucleotide encoding a cytokine, such as interleukin-2 (IL-2) or interleukin-12 (IL-12), can also be added to the composition so that the immune response is enhanced. These additional polynucleotides are placed under appropriate control for expression. Advantageously, DNA molecules of the invention and/or additional DNA molecules to be included in the same composition are carried in the same plasmid.
Standard methods can be used in the preparation of therapeutic polynucleotides of the invention. For example, a polynucleotide can be used in a naked form, free of any delivery vehicles, such as anionic liposomes, cationic lipids, microparticles, e.g., gold microparticles, precipitating agents, e.g., calcium phosphate, or any other transfection-facilitating agent. In this case, the polynucleotide can be simply diluted in a physiologically acceptable solution, such as sterile saline or sterile buffered saline, with or without a carrier.
When present, the carrier preferably is isotonic, hypotonic, or weakly hypertonic, and has a relatively low ionic strength, such as provided by a sucrose solution, e.g., a solution containing 20% sucrose.
Alternatively, a polynucleotide can be associated with agents that assist in cellular uptake. It can be, e.g., (i) complemented with a chemical agent that modifies cellular permeability, such as bupivacaine (see, e.g., 7), (ii) encapsulated into liposomes, or (iii) associated with cationic lipids or silica, gold, or tungsten microparticles.
Anionic and neutral liposomes are well-known in the art (see, e.g., Liposomes: A Practical Approach, RPC New Ed, IRL Press, 1990, for a detailed description of methods for making liposomes) and are useful for delivering a large range of products, including polynucleotides.
Cationic lipids can also be used for gene delivery. Such lipids include, for example, Lipofectin™, which is also known as DOTMA (N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride), DOTAP (1,2-bis(oleyloxy)-3-(trimethylammonio)propane), DDAB (dimethyldioctadecylammonium bromide), DOGS (dioctadecylamidoglycyl spermine), and cholesterol derivatives. A description of these cationic lipids can be found in EP 187,702, WO 90/11092, U.S. Patent No. 5,283,185, WO 91/15501, WO 95/26356, and U.S. Patent No. 5,527,928. Cationic lipids for gene delivery are preferably used in association with a neutral lipid, such as DOPE (dioleyl phosphatidylethanolamine; WO 90/11092). Other transfection-facilitating compounds can be added to a formulation containing cationic liposomes. A number of them are described in, e.g., WO 93/18759, WO 93/19768, WO 94/25608, and WO 95/2397. They include, e.g., spermine derivatives useful for facilitating the transport of DNA through the nuclear membrane (see, for example, WO 93/18759) and membrane-permeabilizing compounds such as GALA, Gramicidine S, and cationic bile salts (see, for example, WO 93/19768).
Gold or tungsten microparticles can also be used for gene delivery, as described in WO 91/359, WO 93/17706, and by Tang et al. (Nature 356:152, 1992). In this case, the microparticle-coated polynucleotides can be injected via intradermal or intraepidermal routes using a needleless injection device ("gene gun"), such as those described in U.S. Patent No. 4,945,050, U.S. Patent No. 5,015,580, and WO 94/24263.
The amount of DNA to be used in a vaccine depends, e.g., on the strength of the promoter used in the DNA construct, the immunogenicity of the expressed gene product, the condition of the mammal intended for administration (e.g., the weight, age, and general health of the mammal), the mode of administration, and the type of formulation. In general, a therapeutically or prophylactically effective dose from about 1 µg to about 1 mg, preferably from about 10 µg to about 800 µg, and, more preferably, from about 25 µg to about 250 µg, can be administered to a human adult. The administration can be achieved in a single dose or repeated at intervals.
The route of administration can be any conventional route used in the vaccine field. As general guidance, a polynucleotide of the invention can be administered via a mucosal surface, e.g., an ocular, intranasal, pulmonary, oral, intestinal, rectal, vaginal, or urinary tract surface, or via a parenteral route, e.g., by an intravenous, subcutaneous, intraperitoneal, intradermal, intraepidermal, or intramuscular route. The choice of administration route will depend on, e.g., the formulation that is selected. A polynucleotide formulated in association with bupivacaine is advantageously administered into muscle. When a neutral or anionic liposome or a cationic lipid, such as DOTMA, is used, the formulation can be advantageously administered via intravenous, intranasal (for example, by aerosolization), intramuscular, intradermal, and subcutaneous routes. A polynucleotide in a naked form can advantageously be administered via the intramuscular, intradermal, or subcutaneous routes. Although not absolutely required, such a composition can also contain an adjuvant. A
systemic adjuvant that does not require concomitant administration in order to exhibit an adjuvant effect is preferable.
The sequence information provided in the present application enables the design of specific nucleotide probes and primers that can be used in diagnostic methods. Accordingly, in a fifth aspect of the invention, there is provided a nucleotide probe or primer having a sequence found in, or derived by degeneracy of the genetic code from, a sequence shown in any of SEQ ID NOs:1-169 (odd numbers).
The term "probe" as used in the present application refers to a DNA (preferably single stranded) or RNA molecule (or modifications or combinations thereof) that hybridizes under the stringent conditions, as defined above, to a polynucleotide molecule having a sequence homologous to any of those shown in SEQ ID NOs:1-169 (odd numbers), or to a complementary or anti-sense sequence of any of those shown in SEQ ID NOs:1-169 (odd numbers). Generally, probes are significantly shorter than the full-length sequences shown in SEQ ID NOs:1-169 (odd numbers). For example, they can contain from about 5 to about 100, preferably from about 10 to about 80, nucleotides. In particular, probes have sequences that are at least 75%, preferably at least 85%, more preferably 95% homologous to a portion of a sequence as shown in any of SEQ ID NOs:1-169 (odd numbers) or a sequence complementary to any of such sequences.
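By way of illustration only, the length and homology criteria stated above for probes can be sketched in a few lines of Python. The sketch assumes that "homologous" is read as ungapped percent identity over the best-matching window of the target, which is a simplification not stated in the text; the function names and the example sequences used to exercise them are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(probe: str, target_region: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(probe) != len(target_region):
        raise ValueError("sequences must have equal length")
    if not probe:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(probe.upper(), target_region.upper()))
    return 100.0 * matches / len(probe)


def best_window_identity(probe: str, target: str) -> float:
    """Highest ungapped identity of the probe against any window of the target."""
    n = len(probe)
    if n == 0 or n > len(target):
        raise ValueError("probe must be non-empty and no longer than the target")
    return max(
        percent_identity(probe, target[i:i + n])
        for i in range(len(target) - n + 1)
    )


def meets_probe_criteria(probe: str, target: str,
                         min_len: int = 5, max_len: int = 100,
                         min_identity: float = 75.0) -> bool:
    """Check the length range (about 5 to about 100 nt) and identity floor (75%)."""
    if not (min_len <= len(probe) <= max_len):
        return False
    return best_window_identity(probe, target) >= min_identity
```

Raising `min_identity` to 85.0 or 95.0 corresponds to the preferred and more preferred thresholds. An actual probe would also have to be tested for hybridization under the stringent conditions defined in the description, which this sketch does not model.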
Probes can contain modified bases, such as inosine, methyl-5-deoxycytidine, deoxyuridine, dimethylamino-5-deoxyuridine, or diamino-2,6-purine. Sugar or phosphate residues can also be modified or substituted. For example, a deoxyribose residue can be replaced by a polyamide (Nielsen et al., Science 254:1497, 1991) and phosphate residues can be replaced by ester groups, such as diphosphate, alkyl, arylphosphonate, and phosphorothioate esters. In addition, the 2'-hydroxyl group on ribonucleotides can be modified by addition of, e.g., alkyl groups.
Probes of the invention can be used in diagnostic tests or as capture or detection probes. Such capture probes can be immobilized on solid supports, directly or indirectly, by covalent means or by passive adsorption. A detection probe can be labeled by a detectable label, for example, a label selected from radioactive isotopes; enzymes, such as peroxidase and alkaline phosphatase; enzymes that are able to hydrolyze a chromogenic, fluorogenic, or luminescent substrate; compounds that are chromogenic, fluorogenic, or luminescent; nucleotide base analogs; and biotin.
Probes of the invention can be used in any conventional hybridization method, such as in dot blot methods (Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1982), Southern blot methods (Southern, J. Mol. Biol. 98:503, 1975), northern blot methods (identical to Southern blot methods, with the exception that RNA is used as a target), or a sandwich method (Dunn et al., Cell 12:23, 1977). As is known in the art, the latter technique involves the use of a specific capture probe and a specific detection probe that have nucleotide sequences that are at least partially different from each other.
Primers used in the invention usually contain about 10 to 40 nucleotides and are used to initiate enzymatic polymerization of DNA in an amplification process (e.g., PCR), an elongation process, or a reverse transcription method. In a diagnostic method involving PCR, the primers can be labeled.
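As a rough illustration of how a primer pair of the stated length delimits an amplified product, the following Python sketch checks primer length against the 10 to 40 nucleotide range and predicts the template segment bounded by the two primers. It assumes exact primer matches, a single-stranded template given 5' to 3', and a reverse primer written 5' to 3' on the opposite strand; the function names and any sequences passed to them are hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_length_ok(primer: str, min_len: int = 10, max_len: int = 40) -> bool:
    """Check the usual primer length range of about 10 to 40 nucleotides."""
    return min_len <= len(primer) <= max_len


def predicted_amplicon(template: str, forward: str, reverse: str):
    """Return the template segment bounded by an exact-match primer pair, or None.

    The forward primer must occur in the template, and the reverse complement
    of the reverse primer must occur downstream of it.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]
```

Real primer design also weighs melting temperature, self-complementarity, and mismatch tolerance, none of which is modeled here.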
Thus, the invention also encompasses (i) a reagent containing a probe of the invention for detecting and/or identifying the presence of Helicobacter in a biological material; (ii) a method for detecting and/or identifying the presence of Helicobacter in a biological material, in which (a) a sample is recovered or derived from the biological material, (b) DNA or RNA is extracted from the material and denatured, and (c) the sample is exposed to a probe of the invention, for example, a capture probe, a detection probe, or both, under stringent hybridization conditions, so that hybridization is detected; and (iii) a method for detecting and/or identifying the presence of Helicobacter in a biological material, in which (a) a sample is recovered or derived from the biological material, (b) DNA is extracted therefrom, (c) the extracted DNA is contacted with at least one, or, preferably two, primers of the invention, and amplified by the polymerase chain reaction, and (d) an amplified DNA molecule is produced.
As mentioned above, polypeptides that can be produced by expression of the polynucleotides of the invention can be used as vaccine antigens. Accordingly, a sixth aspect of the invention features a substantially purified polypeptide or polypeptide derivative having an amino acid sequence encoded by a polynucleotide of the invention.
A "substantially purified polypeptide" is defined as a polypeptide that is separated from the environment in which it naturally occurs and/or a polypeptide that is free of most of the other polypeptides that are present in the environment in which it was synthesized. The polypeptides of the invention can be purified from a natural source, such as a Helicobacter strain, or can be produced using recombinant methods.
Homologous polypeptides or polypeptide derivatives encoded by polynucleotides of the invention can be screened for specific antigenicity by testing cross-reactivity with an antiserum raised against a polypeptide having an amino acid sequence as shown in any of SEQ ID NOs:2-170 (even numbers). Briefly, a monospecific hyperimmune antiserum can be raised against a purified reference polypeptide as such or as a fusion polypeptide, for example, an expression product of MBP, GST, or His-tag systems, or a synthetic peptide predicted to be antigenic. The homologous polypeptide or derivative that is screened for specific antigenicity can be produced as such or as a fusion polypeptide. In the latter case, and if the antiserum is also raised against a fusion polypeptide, two different fusion systems are employed.
Specific antigenicity can be determined using a number of methods, including Western blot (Towbin et al., Proc. Natl. Acad. Sci. USA 76:4350, 1979), dot blot, and ELISA methods, as described below.
In a Western blot assay, the product to be screened, either as a purified preparation or a total E. coli extract, is fractionated by SDS-PAGE, as described, for example, by Laemmli (Nature 227:680, 1970). After being transferred to a filter, such as a nitrocellulose membrane, the material is incubated with the monospecific hyperimmune antiserum, which is diluted in a range of dilutions from about 1:50 to about 1:5,000, preferably from about 1:100 to about 1:500. Specific antigenicity is shown once a band corresponding to the product exhibits reactivity at any of the dilutions in the range.
In an ELISA assay, the product to be screened can be used as the coating antigen. A purified preparation is preferred, but a whole cell extract can also be used. Briefly, about 100 µl of a preparation of about 10 µg protein/ml is distributed into wells of a 96-well ELISA plate. The plate is incubated for about 2 hours at 37°C, then overnight at 4°C. The plate is washed with phosphate-buffered saline (PBS) containing 0.05% Tween 20 (PBS/Tween buffer) and the wells are saturated with 250 µl PBS containing 1% bovine serum albumin (BSA), to prevent non-specific antibody binding. After 1 hour of incubation at 37°C, the plate is washed with PBS/Tween buffer. The antiserum is serially diluted in PBS/Tween buffer containing 0.5% BSA, and 100 µl dilutions are added to each well. The plate is incubated for 90 minutes at 37°C, washed, and evaluated using standard methods. For example, a goat anti-rabbit peroxidase conjugate can be added to the wells when the specific antibodies used were raised in rabbits. Incubation is carried out for about 90 minutes at 37°C and the plate is washed. The reaction is developed with the appropriate substrate and the reaction is measured by colorimetry (absorbance measured spectrophotometrically). Under these experimental conditions, a positive reaction is shown once an O.D. value of 1.0 is detected with a dilution of at least about 1:50, preferably of at least about 1:500.
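The positivity criterion stated for the ELISA assay can be expressed as a short Python sketch. It assumes that the criterion is read as "O.D. of at least 1.0 at a dilution of 1:50 or higher", and that readings are stored as a mapping from the reciprocal dilution (50 for 1:50) to the measured O.D.; both the data layout and the function names are illustrative assumptions rather than part of the described assay.

```python
def elisa_positive(readings: dict, od_threshold: float = 1.0,
                   min_dilution: int = 50) -> bool:
    """Return True if any reading at or beyond min_dilution reaches the O.D. threshold.

    readings maps the reciprocal antiserum dilution (e.g. 500 for 1:500)
    to the absorbance measured in that well.
    """
    return any(
        od >= od_threshold and dilution >= min_dilution
        for dilution, od in readings.items()
    )


def endpoint_dilution(readings: dict, od_threshold: float = 1.0):
    """Return the highest reciprocal dilution still reaching the threshold, or None."""
    positives = [d for d, od in readings.items() if od >= od_threshold]
    return max(positives) if positives else None
```

Calling `elisa_positive(readings, min_dilution=500)` applies the preferred criterion of at least about 1:500.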
In a dot blot assay, a purified product is preferred, although a whole cell extract can be used. Briefly, a solution of the product at a concentration of about 100 µg/ml is serially diluted two-fold with 50 mM Tris-HCl (pH 7.5). One hundred µl of each dilution is applied to a filter, such as a 0.45 µm nitrocellulose membrane, set in a 96-well dot blot apparatus (Biorad). The buffer is removed by applying vacuum to the system. Wells are washed by addition of 50 mM Tris-HCl (pH 7.5) and the membrane is air-dried. The membrane is saturated in blocking buffer (50 mM Tris-HCl (pH 7.5), 0.15 M NaCl, 10 g/l skim milk) and incubated with an antiserum diluted from about 1:50 to about 1:5000, preferably about 1:500. The reaction is detected using standard methods. For example, a goat anti-rabbit peroxidase conjugate can be added to the wells when rabbit antibodies are used. Incubation is carried out for about 90 minutes at 37°C and the blot is washed. The reaction is developed with the appropriate substrate and stopped. The reaction is then measured visually by the appearance of a colored spot, e.g., by colorimetry. Under these experimental conditions, a positive reaction is associated with detection of a colored spot for reactions carried out with a dilution of at least about 1:50, preferably, of at least about 1:500. Therapeutic or prophylactic efficacy of a polypeptide or polypeptide derivative of the invention can be evaluated as described below.
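The two-fold dilution series used in the dot blot assay above lends itself to a small worked calculation. The Python sketch below lists the concentration at each step, starting from about 100 µg/ml, and the amount of product applied per dot when 100 µl is loaded. The number of dilution steps is not given in the text, so the default of eight is an assumption, as are the function names.

```python
def twofold_series(start_ug_per_ml: float = 100.0, steps: int = 8) -> list:
    """Concentrations (µg/ml) of a two-fold serial dilution, starting undiluted."""
    return [start_ug_per_ml / 2 ** i for i in range(steps)]


def protein_per_dot_ug(conc_ug_per_ml: float, volume_ul: float = 100.0) -> float:
    """Amount of product (µg) in the volume applied to one dot."""
    return conc_ug_per_ml * volume_ul / 1000.0


if __name__ == "__main__":
    # Print concentration and loaded amount for each dilution step.
    for conc in twofold_series():
        print(f"{conc:.3f} µg/ml -> {protein_per_dot_ug(conc):.4f} µg per dot")
```

Under these assumptions the undiluted dot carries 10 µg of product, and each subsequent dot carries half the amount of the preceding one.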
According to a seventh aspect of the invention, there is provided (i) a composition of matter containing a polypeptide of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a polypeptide of the invention; (iii) a method for inducing an immune response against Helicobacter in a mammal by administering to the mammal an immunogenically effective amount of a polypeptide of the invention to elicit an immune response, e.g., a protective immune response to Helicobacter; and (iv) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, by administering a prophylactic or therapeutic amount of a polypeptide of the invention to an individual in need of such treatment.
Additionally, this aspect of the invention includes the use of a polypeptide of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection.
The immunogenic compositions of the invention can be administered by any conventional route in use in the vaccine field, for example, to a mucosal (e.g., ocular, intranasal, pulmonary, oral, gastric, intestinal, rectal, vaginal, or urinary tract) surface or via a parenteral (e.g., subcutaneous, intradermal, intramuscular, intravenous, or intraperitoneal) route. The choice of the administration route depends upon a number of parameters, such as the adjuvant used. For example, if a mucosal adjuvant is used, the intranasal or oral route will be preferred, and if a lipid formulation or an aluminum compound is used, a parenteral route will be preferred. In the latter case, the subcutaneous or intramuscular route is most preferred. The choice of administration route can also depend upon the nature of the vaccine agent. For example, a polypeptide of the invention fused to CTB or to LTB will be best administered to a mucosal surface.
A composition of the invention can contain one or several polypeptides or derivatives of the invention. It can also contain at least one additional Helicobacter antigen, such as the urease apoenzyme, or a subunit, fragment, homolog, mutant, or derivative thereof.
For use in a composition of the invention, a polypeptide or polypeptide derivative can be formulated into or with liposomes, such as neutral or anionic liposomes, microspheres, ISCOMS, or virus-like particles (VLPs), to facilitate delivery and/or enhance the immune response. These compounds are readily available to those skilled in the art; for example, see Liposomes: A Practical Approach (supra). Adjuvants other than liposomes can also be used in the invention and are well known in the art (see, for example, the list provided below).
Administration can be achieved in a single dose or repeated as necessary at appropriate intervals that can be determined by those skilled in the art. For example, a priming dose can be followed by three booster doses at weekly or monthly intervals. An appropriate dose depends on various parameters, including the nature of the recipient (e.g., whether the recipient is an adult or an infant), the particular vaccine antigen, the route and frequency of administration, the presence/absence or type of adjuvant, and the desired effect (e.g., protection and/or treatment), and can be readily determined by one skilled in the art. In general, a vaccine antigen of the invention can be administered mucosally in an amount ranging from about 10 µg to about 500 mg, preferably from about 1 mg to about 200 mg. For a parenteral route of administration, the dose usually should not exceed about 1 mg, and is, preferably, about 100 µg.
When used as components of a vaccine, the polynucleotides and polypeptides of the invention can be used sequentially as part of a multi-step immunization process. For example, a mammal can be initially primed with a vaccine vector of the invention, such as a pox virus, e.g., via a parenteral route, and then boosted twice with a polypeptide encoded by the vaccine vector, e.g., via the mucosal route. In another example, liposomes associated with a polypeptide or polypeptide derivative of the invention can be used for priming, with boosting being carried out mucosally using a soluble polypeptide or polypeptide derivative of the invention, in combination with a mucosal adjuvant (e.g., LT).
Polypeptides and polypeptide derivatives of the invention can also be used as diagnostic reagents for detecting the presence of anti-Helicobacter antibodies, e.g., in blood samples. Such polypeptides can be about 5 to about 80, preferably, about 10 to about 50, amino acids in length and can be labeled or unlabeled, depending upon the diagnostic method. Diagnostic methods involving such a reagent are described below.
Upon expression of a polynucleotide molecule of the invention, a polypeptide or polypeptide derivative is produced and can be purified using known methods. For example, the polypeptide or polypeptide derivative can be produced as a fusion protein containing a fused tail that facilitates purification.
The fusion product can be used to immunize a small mammal, e.g., a mouse or a rabbit, in order to raise monospecific antibodies against the polypeptide or polypeptide derivative. The eighth aspect of the invention thus provides a monospecific antibody that binds to a polypeptide or polypeptide derivative of the invention.
By "monospecific antibody" is meant an antibody that is capable of reacting with a unique, naturally-occurring Helicobacter polypeptide. An antibody of the invention can be polyclonal or monoclonal. Monospecific antibodies can be recombinant, e.g., chimeric (e.g., consisting of a variable region of murine origin and a human constant region), humanized (e.g., a human immunoglobulin constant region and a variable region of animal, e.g., murine, origin), and/or single chain. Both polyclonal and monospecific antibodies can also be in the form of immunoglobulin fragments, e.g., F(ab')2 or Fab fragments. The antibodies of the invention can be of any isotype, e.g., IgG or IgA, and polyclonal antibodies can be of a single isotype or can contain a mixture of isotypes.
The antibodies of the invention, which can be raised against a polypeptide or polypeptide derivative of the invention, can be produced and identified using standard immunological assays, e.g., Western blot assays, dot blot assays, or ELISA (see, e.g., Coligan et al., Current Protocols in Immunology, John Wiley & Sons, Inc., New York, NY, 1994). The antibodies can be used in diagnostic methods to detect the presence of Helicobacter antigens in a sample, such as a biological sample. The antibodies can also be used in affinity chromatography methods for purifying a polypeptide or polypeptide derivative of the invention. As is discussed further below, the antibodies can also be used in prophylactic and therapeutic passive immunization methods.
Accordingly, a ninth aspect of the invention provides (i) a reagent for detecting the presence of Helicobacter in a biological sample that contains an antibody, polypeptide, or polypeptide derivative of the invention; and (ii) a diagnostic method for detecting the presence of Helicobacter in a biological sample, by contacting the biological sample with an antibody, a polypeptide, or a polypeptide derivative of the invention, so that an immune complex is formed, and detecting the complex as an indication of the presence of Helicobacter in the sample or the organism from which the sample was derived. The immune complex is formed between a component of the sample and the antibody, polypeptide, or polypeptide derivative, and any unbound material can be removed prior to detecting the complex. A polypeptide reagent can be used for detecting the presence of anti-Helicobacter antibodies in a sample, e.g., a blood sample, while an antibody of the invention can be used for screening a sample, such as a gastric extract or biopsy sample, for the presence of Helicobacter polypeptides.
For use in diagnostic methods, the reagent (e.g., the antibody, polypeptide, or polypeptide derivative of the invention) can be in a free state or can be immobilized on a solid support, such as, for example, on the interior surface of a tube or on the surface, or within pores, of a bead.
Immobilization can be achieved using direct or indirect means. Direct means include passive adsorption (i.e., non-covalent binding) or covalent binding between the support and the reagent. By "indirect means" is meant that an anti-reagent compound that interacts with the reagent is first attached to the solid support. For example, if a polypeptide reagent is used, an antibody that binds to it can serve as an anti-reagent, provided that it binds to an epitope that is not involved in recognition of antibodies in biological samples. Indirect means can also employ a ligand-receptor system; for example, a molecule, such as a vitamin, can be grafted onto the polypeptide reagent and the corresponding receptor can be immobilized on the solid phase. This concept is illustrated by the well known biotin-streptavidin system. Alternatively, indirect means can be used, e.g., by adding to the reagent a peptide tail, chemically or by genetic engineering, and immobilizing the grafted or fused product by passive adsorption or covalent linkage of the peptide tail.
According to a tenth aspect of the invention, there is provided a process for purifying from a biological sample a polypeptide or polypeptide derivative of the invention, which involves carrying out antibody-based affinity chromatography with the biological sample, wherein the antibody is a monospecific antibody of the invention.
For use in a purification process of the invention, the antibody can be polyclonal or monospecific, and preferably is of the IgG type. Purified IgGs can be prepared from an antiserum using standard methods (see, e.g., Coligan et al., supra). Conventional chromatography supports, as well as standard methods for grafting antibodies, are described, for example, by Harlow et al. (Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1988).
Briefly, a biological sample, such as an H. pylori extract, preferably in a buffer solution, is applied to a chromatography material, which is, preferably, equilibrated with the buffer used to dilute the biological sample, so that the polypeptide or polypeptide derivative of the invention (i.e., the antigen) is allowed to adsorb onto the material. The chromatography material, such as a gel or a resin coupled to an antibody of the invention, can be in batch form or in a column. The unbound components are washed off and the antigen is eluted with an appropriate elution buffer, such as a glycine buffer, a buffer containing a chaotropic agent, e.g., guanidine HCl, or a buffer having high salt concentration (e.g., 3 M MgCl2). Eluted fractions are recovered and the presence of the antigen is detected, e.g., by measuring the absorbance at 280 nm.
An antibody of the invention can be screened for therapeutic efficacy as follows. According to an eleventh aspect of the invention, there is provided (i) a composition of matter containing a monospecific antibody of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a monospecific antibody of the invention, and (iii) a method for treating or preventing Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, by administering a therapeutic or prophylactic amount of a monospecific antibody of the invention to an individual in need of such treatment. In addition, the eleventh aspect of the invention includes the use of a monospecific antibody of the invention in the preparation of a medicament for treating or preventing Helicobacter infection.
The monospecific antibody can be polyclonal or monoclonal, and is, preferably, predominantly of the IgA isotype. In passive immunization methods, the antibody is administered to a mucosal surface of a mammal, e.g., the gastric mucosa, e.g., orally or intragastrically, optionally, in the presence of a bicarbonate buffer. Alternatively, systemic administration, not requiring a bicarbonate buffer, can be carried out. A monospecific antibody of the invention can be administered as a single active agent or as a mixture with at least one additional monospecific antibody specific for a different Helicobacter polypeptide. The amount of antibody and the particular regimen used can be readily determined by one skilled in the art. For example, daily administration of about 100 to 1,000 mg of antibody over one week, or three doses per day of about 100 to 1,000 mg of antibody over two or three days, can be effective regimens for most purposes.
Therapeutic or prophylactic efficacy can be evaluated using standard methods in the art, e.g., by measuring induction of a mucosal immune response or induction of protective and/or therapeutic immunity using, e.g., the H. felis mouse model and the procedures described by Lee et al. (Eur. J. Gastroenterology & Hepatology 7:303, 1995) or Lee et al. (J. Infect. Dis. 172:161, 1995). Those skilled in the art will recognize that the H. felis strain of the model can be replaced with another Helicobacter strain. For example, the efficacy of polynucleotide molecules and polypeptides from H. pylori is, preferably, evaluated in a mouse model using an H. pylori strain. Protection can be determined by comparing the degree of Helicobacter infection in the gastric tissue assessed by, for example, urease activity, bacterial counts, or gastritis, to that of a control group. Protection is shown when infection is reduced by comparison to the control group. Such an evaluation can be made for polynucleotides, vaccine vectors, polypeptides, and polypeptide derivatives, as well as for antibodies of the invention.
For example, various doses of an antibody of the invention can be administered to the gastric mucosa of mice previously challenged with an H. pylori strain as described, e.g., by Lee et al. (supra). Then, after an appropriate period of time, the bacterial load of the mucosa can be estimated by assessing urease activity, as compared to a control. Reduced urease activity indicates that the antibody is therapeutically effective.
Adjuvants that can be used in any of the vaccine compositions described above are described as follows. Adjuvants for parenteral administration include, for example, aluminum compounds, such as aluminum hydroxide, aluminum phosphate, and aluminum hydroxy phosphate. The antigen can be precipitated with, or adsorbed onto, the aluminum compound using standard methods. Other adjuvants, such as RIBI (ImmunoChem, Hamilton, MT), can also be used in parenteral administration.
Adjuvants that can be used for mucosal administration include, for example, bacterial toxins, e.g., the cholera toxin (CT), the E. coli heat-labile toxin (LT), the Clostridium difficile toxin A, the pertussis toxin (PT), and combinations, subunits, toxoids, or mutants thereof. For example, a purified preparation of native cholera toxin subunit B (CTB) can be used. Fragments, homologs, derivatives, and fusions to any of these toxins can also be used, provided that they retain adjuvant activity. Preferably, a mutant having reduced toxicity is used. Suitable mutants are described, e.g., in WO 95/17211 (Arg-7-Lys CT mutant), WO 96/6627 (Arg-192-Gly LT mutant), and WO 95/34323 (Arg-9-Lys and Glu-129-Gly PT mutant). Additional LT mutants that can be used in the methods and compositions of the invention include, e.g., Ser-63-Lys, Ala-69-Gly, Glu-110-Asp, and Glu-112-Asp mutants. Other adjuvants, such as the bacterial monophosphoryl lipid A (MPLA) of, e.g., E. coli, Salmonella minnesota, Salmonella typhimurium, or Shigella flexneri; saponins; and polylactide glycolide (PLGA) microspheres, can also be used in mucosal administration. Adjuvants useful for both mucosal and parenteral administration, such as polyphosphazene (WO 95/2415), can also be used.
Any pharmaceutical composition of the invention, containing a polynucleotide, polypeptide, polypeptide derivative, or antibody of the invention, can be manufactured using standard methods. It can be formulated with a pharmaceutically acceptable diluent or carrier, e.g., water or a saline solution, such as phosphate buffered saline, optionally, including a bicarbonate salt, such as sodium bicarbonate, e.g., 0.1 to 0.5 M. Bicarbonate can advantageously be added to compositions intended for oral or intragastric administration. In general, a diluent or carrier can be selected on the basis of the mode and route of administration, and standard pharmaceutical practice.
Suitable pharmaceutical carriers and diluents, as well as pharmaceutical necessities for their use in pharmaceutical formulations, are described in Remington's Pharmaceutical Sciences, a standard reference text in this field and in the USP/NF.
The invention also includes methods in which gastroduodenal infections, such as Helicobacter infection, are treated by oral administration of a Helicobacter polypeptide of the invention and a mucosal adjuvant, in combination with an antibiotic, an antisecretory agent, a bismuth salt, an antacid, sucralfate, or a combination thereof. Examples of such compounds that can be administered with the vaccine antigen and an adjuvant are antibiotics, including, e.g., macrolides, tetracyclines, β-lactams, aminoglycosides, quinolones, penicillins, and derivatives thereof (specific examples of antibiotics that can be used in the invention include, e.g., amoxicillin, clarithromycin, tetracycline, metronidazole, erythromycin, and cefuroxime); antisecretory agents, including, e.g., H2-receptor antagonists (e.g., cimetidine, ranitidine, famotidine, nizatidine, and roxatidine), proton pump inhibitors (e.g., omeprazole, lansoprazole, and pantoprazole), prostaglandin analogs (e.g., misoprostol and enprostil), and anticholinergic agents (e.g., pirenzepine, telenzepine, carbenoxolone, and proglumide); and bismuth salts, including colloidal bismuth subcitrate, tripotassium dicitrate bismuthate, bismuth subsalicylate, bicitropeptide, and Pepto-Bismol (see, e.g., Goodwin et al., Helicobacter pylori, Biology and Clinical Practice, CRC Press, Boca Raton, FL, pp 366-395, 1993; Physicians' Desk Reference, 49th edn., Medical Economics Data Production Company, Montvale, New Jersey, 1995). In addition, compounds containing more than one of the above-listed components coupled together, e.g., ranitidine coupled to bismuth subcitrate, can be used. The invention also includes compositions for carrying out these methods, i.e., compositions containing a Helicobacter antigen (or antigens) of the invention, an adjuvant, and one or more of the above-listed compounds, in a pharmaceutically acceptable carrier or diluent.
Amounts of the above-listed compounds used in the methods and compositions of the invention can readily be determined by one skilled in the art. In addition, one skilled in the art can readily design treatment/immunization schedules. For example, the non-vaccine components can be administered on days 1-14, and the vaccine antigen + adjuvant can be administered on days 7, 14, 21, and 28.
Methods and pharmaceutical compositions of the invention can be used to treat or to prevent Helicobacter infections and, accordingly, gastroduodenal diseases associated with these infections, including acute, chronic, and atrophic gastritis, and peptic ulcer diseases, e.g., gastric and duodenal ulcers.
The clones of the invention were originally isolated by a transposon shuttle mutagenesis method. Briefly, in this method, a TnMax9 mini-blaM transposon was used for insertional mutagenesis of an H. pylori gene library established in E. coli. 192 E. coli clones expressing active β-lactamase fusion proteins were obtained, indicating that the corresponding target plasmids carry H. pylori genes encoding extracytoplasmic proteins. Individual mutants were transferred onto the chromosome of H. pylori P1 or P12 by natural transformation, resulting in 135 distinct H. pylori mutants. This method is described in further detail, as follows.
The transposon TnMax9 (Kahrs et al., Gene 167:53, 1995) was used to generate mutations in an H. pylori library in E. coli. As illustrated in Fig. 1A, TnMax9 contains, in addition to a catGC-resistance gene close to the inverted repeat (IR), an unexpressed open reading frame encoding β-lactamase without a promoter or signal sequence (mature β-lactamase, blaM; Kahrs et al., supra). For production of extracytoplasmic BlaM fusion proteins resulting in ampicillin-resistant (ampR) clones, expression of the cloned H. pylori genes in E. coli is obligatory. The minimal vector pMin2 (Kahrs et al., supra; see Fig. 1B), containing a weak constitutive promoter (Piga) upstream of the multiple cloning site, was used for construction of the H. pylori library to ensure expression of H. pylori genes in E. coli.
In construction of the library, H. pylori DNA was partially digested with Sau3A and HpaII, size fractionated by preparative agarose gel electrophoresis, and 3-6 kilobase fragments were ligated into the BglII and ClaI sites of pMin2. The library was introduced into E. coli strain E181 (pTnMax9), which is a derivative of HB101 containing the TnMax9 transposon, by electroporation. This generated approximately 2,400 independent transformants. More than 95% of the plasmids contained an insert of between 3 and 6 kilobases, showing that the 1.7 megabase H. pylori chromosome was statistically covered. Since not every plasmid could be expected to contain a target gene carrying an export signal, the library was partitioned into a total of 198 pools (24 pools of 20 clones and 174 pools of 11 clones). Using a cotton swab, either eleven or twenty individual colonies were inoculated in 0.5 ml LB medium in eppendorf tubes, vortexed, and 100 µl of the suspension was spread on LB agar plates supplemented with tetracycline and chloramphenicol to select for maintenance of both plasmids. Insertion of TnMax9 into the target plasmids was induced with 100 mM isopropyl-β-D-thiogalactoside (IPTG) separately for each pool (Haas et al., Gene 130:23-31, 1993). Plasmids were transferred into E145 by triparental mating, in which 25 µl of the donor strain (E181), 25 µl of the mobilisator (HB101 (pRK2013)), and 50 µl of the recipient strain (E145) were mixed from corresponding bacterial suspensions (OD550 = 10). The matings were performed for 2-3 hours at 37°C on nitrocellulose filters, which were placed on LB plates. Bacteria were suspended in 1 ml LB and aliquots were spread on LB plates containing chloramphenicol, tetracycline, and rifampicin. Each pool gave rise to chloramphenicol-resistant transconjugates in E145, demonstrating that both transposition and conjugation were successful. Generally, several thousand chloramphenicol-resistant transconjugates were obtained, but the number of ampR colonies varied in different pools, ranging from one to several hundred colonies. Two ampR colonies from each positive pool were isolated, plasmid DNA was extracted, and the DNA was characterized by further restriction analysis. Only those TnMax9 insertions of a single pool that mapped in obviously different plasmid clones, or in markedly different regions of the same clone, were used further.
From 158 of the 198 pools, ampicillin-resistant E145 transconjugates were obtained (80%), showing that in several pools, TnMax9 inserted into expressed genes, resulting in production of extracytoplasmic BlaM fusion proteins. Thus, a total of 192 ampR E145 clones could be isolated by conjugal transfer of plasmids from 198 pools.
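By way of illustration only, the pool arithmetic and the statement that the chromosome was statistically covered can be checked with a short calculation. The following sketch is not part of the described method. It assumes a mean insert size of 4.5 kilobases (the midpoint of the stated 3-6 kilobase range) and applies the commonly used Clarke-Carbon estimate, neither of which is stated in the text; all function and variable names are hypothetical.

```python
def clones_in_pools(pools):
    """Total clones given (number_of_pools, clones_per_pool) pairs."""
    return sum(n_pools * per_pool for n_pools, per_pool in pools)


def coverage_probability(n_clones, insert_kb, genome_kb):
    """Clarke-Carbon estimate of the chance that a given locus is in the library."""
    fraction = insert_kb / genome_kb
    return 1.0 - (1.0 - fraction) ** n_clones


# Figures taken from the text: 24 pools of 20 clones and 174 pools of 11 clones.
pools = [(24, 20), (174, 11)]
total_pools = sum(n for n, _ in pools)    # 198
total_clones = clones_in_pools(pools)     # 2394, i.e. approximately 2,400

# Assumption (not in the text): mean insert of 4.5 kb, midpoint of 3-6 kb.
p_locus = coverage_probability(total_clones, insert_kb=4.5, genome_kb=1700.0)

# Fraction of pools that yielded ampicillin-resistant transconjugates.
positive_fraction = 158 / total_pools

print(total_pools, total_clones)
print(f"P(locus represented) = {p_locus:.4f}")
print(f"positive pools = {positive_fraction:.0%}")
```

Under these assumptions, the probability that any given locus is represented exceeds 99%, which is consistent with the coverage statement above.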
To analyze the mutant library, it was determined whether defined gene sequences inactivated by TnMax9 were represented once or several times in the whole library. Five transposon-containing plasmids conferring an ampR phenotype to E145 (pMu7, pMu13, pMu75, pMu94, and pMu110) were randomly selected and DNA fragments flanking the TnMax9 insert were isolated and used as probes in Southern hybridization of 120 ampR clones. The hybridization probes isolated from clones pMu7, pMu75, and pMu94 were between 0.9 and 1.1 kilobases in size, and hybridized exclusively with the inserts of the homologous plasmids. In contrast, the TnMax9 flanking regions of clones pMu13 and pMu110 were 4.0 and 5.5 kilobases, respectively. They each hybridized with the homologous plasmids, and with one additional clone of the library. Such a result was expected, since the chance of a probe to find a homologous sequence in the library should be higher, the longer the hybridization probes.
In order to verify the insertion of the transposon into distinct ORFs encoding putative exported proteins, the TnMax9-flanking DNA of five representative ampR mutant clones (pMu7, pMu12, pMu18, pMu20, and pMu26) was sequenced, taking advantage of the M13 forward and reverse primers on TnMax9 (Fig. 1A). This analysis revealed that the mini-transposon was inserted into different sequences in each plasmid, thereby interrupting ORFs encoding putative proteins. For two clones, the sequences located upstream of the blaM gene revealed a putative ribosome-binding site and a potential translational start codon (ATG). Other clones either revealed an ORF spanning the complete sequence (approximately 400 base pairs upstream and downstream of the TnMax9 insertion) or terminating shortly after the site of TnMax9 insertion. The partial protein sequences from different ORFs were used for database searches, but no significant homologies with known proteins were found.
In a further approach, it was determined whether a known gene, like vacA, encoding the extracellular vacuolating cytotoxin of H. pylori, could be identified using this method and how often such a mutation would be represented in the mutant library. Total cell lysates of the 135 mutants were tested in an immunoblot using the H. pylori cytotoxin-specific rabbit antiserum AK197 (Schmitt et al., Mol. Microbiol. 12:307-319, 1994). Two mutants were identified that no longer produced the cytotoxin antigen (mutants P1-26 and P1-47) and partial DNA sequencing of the insertion sites revealed that TnMax9 was inserted at distinct positions in the vacA gene, 56 and 53 codons downstream of the ATG start codon.
Thus, the characterization of the mutant collection confirmed that a representative gene library was constructed in E. coli, in which target genes encoding exported H. pylori proteins were efficiently tagged by TnMax9.
In order to establish a collection of mutants lacking distinct exported proteins, the mutations had to be transferred back into the H. pylori chromosome. By means of natural transformation, 86 plasmids could be transformed into the original strain P1. H. pylori strains P1 or P12, which were naturally competent for DNA transformation, were transformed with circular plasmid DNA (0.2-0.5 mg/transformation). Transformations to streptomycin resistance were performed with chromosomal DNA (1 mg/transformation), isolated from a streptomycin-resistant NCTC 11637 H. pylori mutant according to the procedure described in Haas et al. (Mol. Microbiol. 8:753-760). Selection was performed on serum plates containing 4 mg/ml chloramphenicol or 500 mg/ml streptomycin. The transformation frequency for a given mutant was calculated as the number of chloramphenicol-, streptomycin-, or erythromycin-resistant colonies per cfu (average of three experiments). The blaM gene was deleted by NotI digestion, and the plasmid religated, in those plasmids that did not transform strain P1 directly. This procedure, which resulted in a twenty- to thirty-fold higher frequency of transformation, as compared to the same plasmid containing blaM, resulted in 36 additional mutant P1 strains. The blaM deletion plasmids that still did not transform strain P1 were used to transform the heterologous H. pylori strain P12, possessing an approximately 10-fold higher transformation frequency compared to P1. This resulted in thirteen further mutants.
Thus, from the 192 ampR plasmids, a total of 135 H. pylori mutants (122 mutants in P1 and 13 mutants in P12) were finally obtained by selection for chloramphenicol resistance (70%). The transformation frequency varied between different plasmids in the range of 1 x 10-5 - 1 x 10-'. The remaining plasmids did not result in any transformants. The collection was frozen as individual mutants in stock cultures at -70°C. To verify the correct insertion of the mini-transposon into the H. pylori chromosome, ten representative mutants were tested by Southern hybridization of chromosomal DNA using catGC DNA and the vector pMin2 as probes. Consistent with our previous experience concerning TnMax9-based shuttle mutagenesis of H. pylori, the mini-transposon was, in all cases, inserted into the chromosome without integration of the vector DNA, which probably means by a double cross-over, rather than by a single cross-over event. As judged from the hybridization pattern obtained with the cat gene as a probe, it appears that TnMax9 is located in different regions of the chromosome, showing that distinct target genes have been interrupted in individual mutants.
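For illustration, the tally of mutants and the definition of transformation frequency given above (resistant colonies per cfu, averaged over three experiments) can be expressed as a short calculation. This is a sketch only; the colony counts in the example are hypothetical and are not data from the described experiments, and the function names are illustrative.

```python
def transformation_frequency(resistant_colonies, total_cfu):
    """Resistant colonies per viable cell (cfu) for one experiment."""
    if total_cfu <= 0:
        raise ValueError("total_cfu must be positive")
    return resistant_colonies / total_cfu


def mean_frequency(experiments):
    """Average frequency over (resistant_colonies, total_cfu) pairs."""
    freqs = [transformation_frequency(c, n) for c, n in experiments]
    return sum(freqs) / len(freqs)


# Counts reported in the text.
direct_p1 = 86             # plasmids that transformed strain P1 directly
after_blam_deletion = 36   # additional P1 mutants after deleting blaM
in_p12 = 13                # mutants obtained in strain P12
total_mutants = direct_p1 + after_blam_deletion + in_p12
yield_fraction = total_mutants / 192   # 192 ampR plasmids

# Hypothetical colony counts for three experiments (illustration only).
example = [(12, 1_000_000), (8, 1_000_000), (10, 1_000_000)]

print(total_mutants, f"{yield_fraction:.0%}")
print(mean_frequency(example))
```

The three reported counts sum to the 135 mutants stated above, 122 of which are in strain P1.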
The mutants were analyzed for motility, transformation competence, and adherence to KatoIII cells. Screening of the H. pylori mutant collection allowed identification of mutants impaired in motility, natural transformation competence, and adherence to gastric epithelial cell lines. Motility mutants could be grouped into distinct classes: (i) mutants lacking the major flagellin subunit FlaA and intact flagella; (ii) mutants with apparently normal flagella, but reduced motility; and (iii) mutants with obviously normal flagella, but completely abolished motility. Two independent mutations, which exhibited defects in natural competence for genetic transformation, mapped to different genetic loci. In addition, two independent mutants were isolated by their failure to bind to the human gastric carcinoma cell line KatoIII. Both mutants carried a transposon in the same gene, approximately 0.8 kilobases apart, and showed decreased autoagglutination, when compared to the wild type strain.
Sequences of clones obtained using the above-described transposon shuttle mutagenesis method were used to identify intact genes, lacking inserted transposons, in the H. pylori genome, as is described below in Example 5.
The invention is further illustrated by the following examples.
Example 1 describes identification of genes, such as genes that encode the polypeptides of the invention, in the Helicobacter genome, as well as identification of signal sequences and primer design for amplification of genes lacking signal sequences. Example 2 describes cloning of DNA encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13 into a vector that provides a histidine tag, and production and purification of the resulting his-tagged fusion proteins. Example 3 describes methods for cloning DNA encoding the polypeptides of the invention so that they can be produced without his-tags, and Example 4 describes methods for purifying recombinantly produced polypeptides of the invention. Example 5 describes methods for obtaining the nucleic acids of the invention from the deposited clones. Example 6 describes purification of recombinant H. pylori antigen GHPO 1190.
EXAMPLE 1: Identification of genes in the H. pylori genome, identification of signal sequences, and primer design for amplification of genes lacking signal sequences
1.A. Creating H. pylori genomic databases
The H. pylori genome was provided as a text file containing a single contiguous string of nucleotides that had been determined to be 1.76 megabases in length. The complete genome was split into 17 separate files using the program SPLIT (Creativity in Action), giving rise to 16 contigs, each containing 100,000 nucleotides, and a 17th contig containing the remaining 76,000 nucleotides. A header was added to each of the 17 files using the format: >hpg0.txt (representing contig 1), >hpg1.txt (representing contig 2), etc. The resulting 17 files, named hpg0 through hpg16, were then copied together to form one file that represented the plus strand of the complete H. pylori genome.
The constructed database was given the designation "H." A negative strand database of the H. pylori genome was created similarly by first creating a reverse complement of the positive strand using the program SeqPup (D.G. Gilbert, Indiana University Biology Department) and then performing the same procedure as described above for the plus strand. This database was given the designation "N."
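The construction of the plus-strand and minus-strand databases described above can be sketched in a few lines of code. The sketch below is illustrative only and is not the SPLIT or SeqPup program; it assumes a genome held in memory as a single string, uses a short toy sequence in place of the real genome, and the header prefix "hpgN" for the minus strand is a hypothetical choice that does not appear in the text.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


def split_into_contigs(genome, size=100_000, prefix="hpg"):
    """Cut a sequence into consecutive chunks, each with a FASTA-style header."""
    records = []
    for index, start in enumerate(range(0, len(genome), size)):
        header = f">{prefix}{index}.txt"
        records.append((header, genome[start:start + size]))
    return records


def to_fasta(records, width=60):
    """Join (header, sequence) records into one multi-record text database."""
    lines = []
    for header, seq in records:
        lines.append(header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy stand-in for the genome; a real run would read the sequence from a file.
toy_genome = "ATGCGTTAAC" * 25    # 250 nucleotides

# Plus-strand database ("H") and minus-strand database ("N").
plus_db = to_fasta(split_into_contigs(toy_genome, size=100))
minus_db = to_fasta(
    split_into_contigs(reverse_complement(toy_genome), size=100, prefix="hpgN")
)

print(plus_db.splitlines()[0])
print(len(split_into_contigs(toy_genome, size=100)))
```

With the default chunk size of 100,000, a sequence made up of 16 full contigs and a remainder of 76,000 nucleotides yields 17 records, the last of which holds the remainder.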
The regions predicted to encode open reading frames (ORFs) were defined for the complete H. pylori genome using the program GENEMARK™ (Borodovsky et al., Comp. Chem. 17:123, 1993). A database was created from a text file containing an annotated version of all ORFs predicted to be encoded by the H. pylori genome for both the plus and minus strands, and was given the designation "O." Each ORF was assigned a number indicating its location on the genome and its position relative to other genes. No manipulation of the text file was required.
1.B. Searching the H. pylori databases
The databases constructed as is described above were searched using the program FASTA (Pearson et al., Proc. Natl. Acad. Sci. USA 85:2444-2448, 1988). FASTA was used for searching either a DNA sequence against either of the gene databases ("H" and/or "N"), or a peptide sequence against the ORF library ("O"). TFASTX was used to search a peptide sequence against all possible reading frames of a DNA database ("H" and/or "N" libraries). Potential frameshifts also being resolved, FASTX was used for searching the translated reading frames of a DNA sequence against either a DNA database, or a peptide sequence against the protein database.
1.C. Isolation of DNA sequences from the H. pylori genome
The FASTA searches against the constructed DNA databases identified exact nucleotide coordinates on one or more of the isolated contigs, and therefore the location of the target DNA. Once the exact location of the target sequence was known, the contig identified to carry the gene was exported into the software package MapDraw (DNAStar, Inc.) and the gene was isolated. Gene sequences with flanking DNA were then excised and copied into the EditSeq software package (DNAStar, Inc.) for further analysis.
1.D. Identification of signal sequences
The deduced protein encoded by a target gene sequence was analyzed using the PROTEAN software package (DNAStar, Inc.). This analysis predicts those areas of the protein that are hydrophobic by using the Kyte-Doolittle algorithm, and identifies any potential polar residues preceding the hydrophobic core region, which is typical for many signal sequences. For confirmation, the target protein was then searched against a PROSITE database (DNAStar, Inc.) consisting of motifs and signatures. Characteristic of many signal sequences and hydrophobic regions in general, is the identification of predicted prokaryotic lipid attachment sites. Where confirmation between the two approaches is apparent at the N-terminus of any protein, putative cleavage sites were sought. Specifically, this includes the presence of either an Alanine (A), Serine (S), or Glycine (G) residue immediately after the core hydrophobic region. In the case of lipoproteins, a Cysteine (C) residue would be identified as the +1 residue, post-cleavage.
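The hydrophobicity analysis described above can be illustrated with a simplified sliding-window calculation. The sketch below is not the PROTEAN implementation and does not perform the PROSITE search; it uses the published Kyte-Doolittle hydropathy values, while the window length (9), the score threshold (1.6), the N-terminal search limit (40 residues), the function names, and the test peptide are all assumptions made for illustration. Lipoprotein (+1 cysteine) sites are not handled.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Mean Kyte-Doolittle score for every window of the given length."""
    scores = [KD[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_cleavage_sites(protein, window=9, threshold=1.6, search=40):
    """0-based positions of A, S or G directly after a hydrophobic window
    that ends within the first `search` residues."""
    sites = []
    for start, score in enumerate(hydropathy_profile(protein, window)):
        end = start + window    # index of the residue after the window
        if end >= len(protein) or end >= search:
            break
        if score >= threshold and protein[end].upper() in "ASG":
            sites.append(end)
    return sites


# Invented peptide: polar start, hydrophobic core, then alanine.
toy = "MKK" + "L" * 9 + "A" + "DEDEDEKKKK"
print(candidate_cleavage_sites(toy))   # prints [12]
```

In this toy peptide the only reported position is the alanine at index 12, which directly follows the nine-residue hydrophobic core.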
1.E. Rational design of PCR primers based on the identification of signal sequences
To clone gene sequences as N-terminal translational fusions for the generation of recombinant proteins with N-terminal Histidine tags, the gene sequence that specifies the signal sequence is omitted. The 5'-end of the gene-specific portion of the N-terminal primer is designed to start at the first codon beyond the cleavage site. In the case of lipoproteins, the 5'-end of the N-terminal primer begins at the second codon, immediately after the modifiable residue at position +1 post-cleavage. The omission of the signal sequence from the recombinant allows for one-step purification, and potential problems associated with insertion of signal sequences in the membrane of the host strain carrying the hybrid construct are avoided.
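The primer design rule described above can be sketched as follows. The example is illustrative only: the clamp (GCC) and the BamHI and XhoI recognition sequences are those used in the primers of Example 2, whereas the length of the gene-specific portion (21 nucleotides), the function names, and the toy open reading frame are assumptions. Whether the C-terminal primer spans the stop codon depends on the sequence that is passed in.

```python
CLAMP = "GCC"       # 5' clamp used in the primers of Example 2
BAMHI = "GGATCC"    # site placed in the N-terminal primer
XHOI = "CTCGAG"     # site placed in the C-terminal primer


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def n_terminal_primer(orf, signal_codons, gene_len=21, lipoprotein=False):
    """Forward primer starting at the first codon after the cleavage site.

    For lipoproteins the +1 cysteine codon is skipped as well.
    """
    skip = signal_codons + (1 if lipoprotein else 0)
    start = 3 * skip
    return CLAMP + BAMHI + orf[start:start + gene_len]


def c_terminal_primer(orf, gene_len=21, site=XHOI):
    """Reverse primer built from the reverse complement of the 3' end."""
    return CLAMP + site + reverse_complement(orf[-gene_len:])


# Toy ORF (hypothetical): six signal-sequence codons, then the mature part.
signal = "ATGAAAAAACTGCTGCTG"
mature = "GACTTAGAACATTTTAACACGCTCAAAGGTTAA"
orf = signal + mature
orf_lipo = signal + "TGC" + mature    # same gene with a +1 cysteine codon

print(n_terminal_primer(orf, signal_codons=6))
print(n_terminal_primer(orf_lipo, signal_codons=6, lipoprotein=True))
print(c_terminal_primer(orf))
```

Both calls to n_terminal_primer return the same primer, because the additional cysteine codon of the lipoprotein variant is skipped.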
EXAMPLE 2: Preparation of isolated DNA encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13, and production of these polypeptides as histidine-tagged fusion proteins
2.A. Preparation of genomic DNA from Helicobacter pylori
Helicobacter pylori strain ORV2001, stored in LB medium containing 50% glycerol at -70°C, is grown on Columbia agar containing 7% sheep blood for 48 hours under microaerophilic conditions (8-10% CO2, 5-7% O2, 85-87% N2). Cells are harvested, washed with phosphate buffered saline (PBS) (pH 7.2), and DNA is then extracted from the cells using the Rapid Prep Genomic DNA Isolation kit (Pharmacia Biotech).
2.B. PCR amplification
DNA molecules encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13 are amplified from genomic DNA, as can be prepared as is described above, by the Polymerase Chain Reaction (PCR) using the following primers:
GHPO 732 (HPO 64):
N-terminal primer: 5'-GCCGGATCCATGACTTATGGGTATGGGGAA-3' (SEQ ID NO:171); and
C-terminal primer: 5'-GCCCTCGAGACTTTTATTGATTCACCATTTCATT-3' (SEQ ID NO:172).
GHPO 419 (HPO 54):
N-terminal primer: 5'-GCCGGATCCATCGCTGAAGAAAATGGGGCG-3' (SEQ ID NO:173); and
C-terminal primer: 5'-GCCCGGCCGCCCTAAAAACTATAAACATAACTC-3' (SEQ ID NO:174).
GHPO 1398 (HPO 15):
N-terminal primer: 5'-GCCGGATCCGGTATTAGGAAGCTTATACCATC-3' (SEQ ID NO:175); and
C-terminal primer: 5'-GCCCTCGAGAAGTTCTATTTTTAATTCCTTGAGAG-3' (SEQ ID NO:176).
GHPO 706 (HPO 50):
N-terminal primer: 5'-GCCGGATCCTCTGATAGCCATAAAGAAAAAAAGGAC-3' (SEQ ID NO:177); and
C-terminal primer: 5'-GCCCTCGAGATCTTTAGAAATCAACCCCCAAAGC-3' (SEQ ID NO:178).
GHPO 1190 (HPO 76):
N-terminal primer: 5'-GCCGGATCCGACTTAGAACATTTTAACACGCTC-3' (SEQ ID NO:179); and
C-terminal primer: 5'-GCCCTCGAGTCATTTTAAACGACTCAAAACAAA-3' (SEQ ID NO:180).
GHPO 986:
N-terminal primer: 5'-GCCGGATCCGGCCAAAGCGTGCGCACTTATTGG-3' (SEQ ID NO:181); and
C-terminal primer: 5'-GCCCTCGAGTTATTGTTCCAACCCCCACGCATC-3' (SEQ ID NO:182).
GHPO 1420:
N-terminal primer: 5'-GCCGGATCCAAGAGCAATGCTGATGACAAACC-3' (SEQ ID NO:183); and
C-terminal primer: 5'-GCCCTCGAGTTATGAGTTAAAGCCC(~TTGTCC-3' (SEQ ID NO:184).
GHPO 1299:
N-terminal primer: 5'-GCCGGATCCGAATCAGTAAAAACAGGAAAAAC-3' (SEQ ID NO:185); and
C-terminal primer: 5'-GCCCTCGAGCGGCTCTTTGGAGTTTTATTG-3' (SEQ ID NO:186).
GHPO 13:
N-terminal primer: 5'-GCCGGATCCATCATTCCCTCTCGCTCTATGG-3' (SEQ ID NO:187); and
C-terminal primer: 5'-GCCCTCGAGACCTTAATGCGTTGCGTTTTCTTT-3' (SEQ ID NO:188).
The N-terminal and C-terminal primers for each clone both include a 5' clamp and a restriction enzyme recognition sequence for cloning purposes (BamHI (GGATCC) and XhoI (CTCGAG) or NotI (CGGCCG) recognition sequences). The N-terminal primer is designed so that the amplified product does not encode the signal sequence and the potential cleavage site.
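The primer layout described above (5' clamp, recognition sequence, gene-specific portion) can be checked mechanically. The following sketch is illustrative only; it assumes a three-nucleotide clamp, takes the recognition sequences exactly as they are given in the text, and uses four of the primers listed above as input. The function name is hypothetical.

```python
# Recognition sequences as given in the text.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG", "NotI": "CGGCCG"}


def parse_primer(primer, clamp_len=3):
    """Split a primer into (clamp, site name, gene-specific portion)."""
    clamp, rest = primer[:clamp_len], primer[clamp_len:]
    for name, site in SITES.items():
        if rest.startswith(site):
            return clamp, name, rest[len(site):]
    raise ValueError(f"no known site after the clamp in {primer}")


primers = {
    "SEQ ID NO:171": "GCCGGATCCATGACTTATGGGTATGGGGAA",
    "SEQ ID NO:172": "GCCCTCGAGACTTTTATTGATTCACCATTTCATT",
    "SEQ ID NO:173": "GCCGGATCCATCGCTGAAGAAAATGGGGCG",
    "SEQ ID NO:174": "GCCCGGCCGCCCTAAAAACTATAAACATAACTC",
}

for name, seq in primers.items():
    print(name, parse_primer(seq))
```

A primer that lacks one of the three recognition sequences directly after the clamp raises an error, which makes transcription mistakes in a primer list easy to detect.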
Amplification of gene-specific DNA is carried out using Pwo DNA Polymerase (Boehringer Mannheim), which is a proofreading polymerase, according to general guidance provided by the manufacturer. Because of the exonuclease activity of the polymerase, two reaction mixtures (mixtures 1 and 2) are first prepared separately and combined just prior to amplification. These mixtures are as follows:
Ingredient (final conc.)      Mixture 1 (µl)    Mixture 2 (µl)
distilled H2O                 160               79
dNTPs (200 µM each)           40                ---
10x PCR buffer                ---               20
primers (100 nM each)         1                 ---
DNA template (200 ng),        2                 ---
as obtained in 2.A.
(10x PCR buffer contains 100 mM Tris-HCl (pH 8.85), 250 mM KCl, 50 mM (NH4)2SO4, 20 mM MgSO4)
Amplification is carried out as follows:
Cycling conditions            Temp. (°C)    Time (min.)    Number of cycles
Initial denaturing step       96            4              1
Denaturing step               94            0.5            20
Annealing step                50            1              20
Extension step                72            1              20
Final extension step          72            5              1
2.C. Transformation and selection of transformants
A single PCR product is thus amplified and is then digested at 37°C for 2 hours with BamHI and XhoI or NotI concurrently in a 20 µl reaction volume. The digested product is ligated to similarly cleaved pET28a (Novagen) that is dephosphorylated prior to the ligation by treatment with Calf Intestinal Alkaline Phosphatase (CIP). The gene fusion constructed in this manner allows one-step affinity purification of the resulting fusion protein because of the presence of histidine residues at the N-terminus of the fusion protein, which are encoded by the vector.
The ligation reaction (20 µl) is carried out at 14°C overnight and then is used to transform 100 µl fresh E. coli XL1-blue competent cells (Novagen). The cells are incubated on ice for 2 hours, heat-shocked at 42°C for 30 seconds, and returned to ice for 90 seconds. The samples are then added to 1 ml LB broth in the absence of selection and grown at 37°C for 2 hours. The cells are plated out on LB agar containing kanamycin (50 µg/ml) at a 10x and neat dilution and incubated overnight at 37°C. The following day, 50 colonies are picked onto secondary plates and incubated at 37°C overnight. Five colonies are picked into 3 ml LB broth supplemented with kanamycin (100 µg/ml) and are grown overnight at 37°C. Plasmid DNA is extracted using the Qiagen mini-prep method and is quantitated by agarose gel electrophoresis.
PCR is performed with the gene-specific primers under the conditions set forth above and transformant DNA is confirmed to contain the desired insert. If PCR-positive, one of the five plasmid DNA samples (500 ng) extracted from the E. coli XL1-blue cells is used to transform BL21 (λDE3) E. coli competent cells (Novagen; as described previously). Transformants (10) are picked onto selective kanamycin (50 µg/ml) containing LB agar plates and stored as a research stock in LB containing 50% glycerol.
2.D. Purification of recombinant proteins
One ml of frozen glycerol stock prepared as described in 2.C. is used to inoculate 50 ml of LB medium containing 25 µg/ml of kanamycin in a 250 ml Erlenmeyer flask. The flask is incubated at 37°C for 2 hours or until the absorbance at 600 nm (OD600) reaches 0.4-1.0. Growth is stopped by placing the flask at 4°C overnight. The following day, 10 ml of the overnight culture are used to inoculate 240 ml of LB medium containing kanamycin (25 µg/ml), with the initial OD600 about 0.02-0.04. Four flasks are inoculated for each ORF. The cells are grown to an OD600 of 1.0 (about 2 hours at 37°C), a 1 ml sample is harvested by centrifugation, and the sample is analyzed by SDS-PAGE to detect any leaky expression. The remaining culture is induced with 1 mM IPTG and the induced cultures are grown for an additional 2 hours at 37°C.
The final OD600 is taken and the cells are harvested by centrifugation at 5,000 x g for 15 minutes at 4°C. The supernatant is discarded and the pellets are resuspended in 50 mM Tris-HCl (pH 8.0), 2 mM EDTA. Two hundred and fifty ml of buffer are used for 1 liter of culture and the cells are recovered by centrifugation at 12,000 x g for 20 minutes. The supernatant is discarded and the pellets are stored at -45°C.
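The inoculation volumes given above amount to a 25-fold dilution of the overnight culture. The following is a minimal sketch of that arithmetic, assuming that OD600 scales linearly with dilution (an approximation that is reasonable only for dilute cultures). The function name is illustrative and is not part of the protocol.

```python
def diluted_od(stock_od: float, inoculum_ml: float, medium_ml: float) -> float:
    """Estimate the OD600 after adding an inoculum to fresh medium.

    Assumes OD600 is proportional to cell density (linear approximation).
    """
    total_ml = inoculum_ml + medium_ml
    return stock_od * inoculum_ml / total_ml


# 10 ml of overnight culture added to 240 ml of LB is a 25-fold dilution.
for overnight_od in (0.5, 1.0):
    start_od = diluted_od(overnight_od, inoculum_ml=10, medium_ml=240)
    print(f"overnight OD600 {overnight_od:.1f} -> starting OD600 {start_od:.2f}")
```

Under this assumption, a starting OD600 of 0.02-0.04 corresponds to an overnight culture with an OD600 of roughly 0.5-1.0, which lies within the 0.4-1.0 range at which the first culture is chilled.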
2.E. Protein purification
Pellets obtained from 2.D. are thawed and resuspended in 95 ml of 50 mM Tris-HCl (pH 8.0). Pefabloc and lysozyme are added to final concentrations of 100 µM and 100 µg/ml, respectively. The mixture is homogenized with magnetic stirring at 5°C for 30 minutes. Benzonase (Merck) is added at a 1 U/ml final concentration, in the presence of 10 mM MgCl2, to ensure total digestion of the DNA. The suspension is sonicated (Branson Sonifier 450) for 3 cycles of 2 minutes each at maximum output. The homogenate is centrifuged at 19,000 x g for 15 minutes and both the supernatant and the pellet are analyzed by SDS-PAGE to determine whether the target protein is located in the soluble or the insoluble fraction, as is described further below.
2.E.1. Soluble fraction
If the target protein is produced in a soluble form (i.e., in the supernatant obtained in 2.E.), NaCl and imidazole are added to the supernatant to final concentrations of 50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, and 10 mM imidazole (buffer A). The mixture is filtered through a 0.45 µm membrane and loaded onto an IMAC column (Pharmacia HiTrap chelating Sepharose; 1 ml), which has been charged with nickel ions according to the manufacturer's recommendations. After loading, the column is washed with 50 column volumes of buffer A and the recombinant target protein is eluted with 5 ml of buffer B (50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 500 mM imidazole).
The elution profile is monitored by measuring the absorbance of the fractions at 280 nm. Fractions corresponding to the protein peak are pooled, dialyzed against PBS containing 0.5 M arginine, filtered through a 0.22 µm membrane, and stored at -45°C.
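As an illustration of the buffer compositions given above, the following sketch converts the stated final concentrations of buffers A and B into grams of solute per volume of buffer. The molar masses are standard values and Tris is assumed to be weighed as Tris base with the pH adjusted separately with HCl. The dictionary and function names are hypothetical and do not come from the protocol.

```python
# Molar masses in g/mol (standard values; Tris is taken as Tris base).
MOLAR_MASS = {"Tris": 121.14, "NaCl": 58.44, "imidazole": 68.08}

# Final concentrations in mol/l, as stated for buffers A and B.
BUFFER_A = {"Tris": 0.050, "NaCl": 0.5, "imidazole": 0.010}
BUFFER_B = {"Tris": 0.050, "NaCl": 0.5, "imidazole": 0.500}


def grams_needed(recipe: dict, volume_l: float) -> dict:
    """Return the mass in grams of each solute for the given buffer volume."""
    return {
        name: round(conc * MOLAR_MASS[name] * volume_l, 2)
        for name, conc in recipe.items()
    }


for label, recipe in (("A", BUFFER_A), ("B", BUFFER_B)):
    print(f"buffer {label}, 1 liter: {grams_needed(recipe, 1.0)}")
```

The two buffers differ only in imidazole, which rises 50-fold between the loading/wash buffer and the elution buffer.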
2.E.2. Insoluble fraction
If the target protein is expressed in the insoluble fraction (pellets obtained from 2.E.), purification is conducted under denaturing conditions. NaCl, imidazole, and urea are added to the resuspended pellet to final concentrations of 50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 10 mM imidazole, and 6 M urea (buffer C). After complete solubilization, the mixture is filtered through a 0.45 µm membrane and loaded onto an IMAC column.
The purification procedures on the IMAC column are the same as described in 2.E.1., except that 6 M urea is included in all buffers used and column volumes of buffer C are used to wash the column after protein loading, instead of 50 column volumes.
The protein fractions eluted from the IMAC column with buffer D (buffer C containing 500 mM imidazole) are pooled. Arginine is added to the solution to a final concentration of 0.5 M and the mixture is dialyzed against PBS containing 0.5 M arginine and various concentrations of urea (4 M, 3 M, 2 M, 1 M, and 0.5 M) to progressively decrease the concentration of urea. The final dialysate is filtered through a 0.22 µm membrane and stored at -45°C.
Alternatively, when the above purification process is not as efficient as it should be, two other processes may be used as follows. A first alternative involves the use of a mild denaturant, N-octyl glucoside (NOG). Briefly, a pellet obtained in 2.E. is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi and is clarified by centrifugation at 4,000-5,000 x g. The pellet is recovered, resuspended in 50 mM NaPO4 (pH 7.5) containing 1-2% weight/volume NOG, and homogenized. The NOG-soluble impurities are removed by centrifugation. The pellet is extracted once more by repeating the preceding extraction step. The pellet is dissolved in 8 M urea, 50 mM Tris (pH 8.0). The urea-solubilized protein is diluted with an equal volume of 2 M arginine, 50 mM Tris (pH 8.0), and is dialyzed against 1 M arginine for 24-48 hours to remove the urea. The final dialysate is filtered through a 0.22 µm membrane and stored at -45°C.
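The equal-volume dilution step above can be checked with a short mixing calculation. The sketch below assumes ideal mixing (volumes are additive); the `mix` helper is illustrative and is not part of the described method.

```python
def mix(solutions):
    """Combine solutions given as (volume, {component: molar concentration}).

    Returns the concentration of every component after ideal mixing.
    """
    total_volume = sum(volume for volume, _ in solutions)
    moles = {}
    for volume, contents in solutions:
        for name, conc in contents.items():
            moles[name] = moles.get(name, 0.0) + conc * volume
    return {name: amount / total_volume for name, amount in moles.items()}


# One volume of urea-solubilized protein plus one volume of arginine buffer.
urea_solution = (1.0, {"urea": 8.0, "Tris": 0.050})
arginine_solution = (1.0, {"arginine": 2.0, "Tris": 0.050})

print(mix([urea_solution, arginine_solution]))
```

Under this assumption the mixture contains 4 M urea and 1 M arginine in 50 mM Tris, so the arginine concentration already matches the 1 M arginine dialysis buffer and only the urea is removed during dialysis.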
A second alternative involves the use of a strong denaturant, such as guanidine hydrochloride. Briefly, a pellet obtained in 2.E. is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi and clarified by centrifugation at 4,000-5,000 x g. The pellet is recovered, resuspended in 6 M guanidine hydrochloride, and passed through an IMAC column charged with Ni2+. The bound antigen is eluted with 8 M urea (pH 8.5). Beta-mercaptoethanol is added to the eluted protein to a final concentration of 1 mM, then the eluted protein is passed through a Sephadex G-25 column equilibrated in 0.1 M acetic acid. Protein eluted from the column is slowly added to 4 volumes of 50 mM phosphate buffer (pH 7.0). The protein remains in solution.
2.F. Evaluation of the protective activity of the purified protein
Groups of 10 Swiss Webster mice (Taconic Labs) are immunized rectally with 25 µg of the purified recombinant protein, admixed with 1 µg of cholera toxin (Berna) in physiological buffer. Mice are immunized on days 0, 7, 14, and 21. Fourteen days after the last immunization, the mice are challenged with H. pylori strain ORV2001 grown in liquid media (the cells are grown on agar plates, as described in 2.A., and, after harvest, the cells are resuspended in Brucella broth; the flasks are then incubated overnight at 37°C).
Fourteen days after challenge, the mice are sacrificed and their stomachs are removed. The amount of H. pylori is determined by measuring the urease activity in the stomach and by culture.
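The timeline implied by the schedule above can be laid out as follows. This is only a restatement of the intervals given in the text; the variable names are illustrative.

```python
IMMUNIZATION_DAYS = [0, 7, 14, 21]
CHALLENGE_DELAY = 14   # days between the last immunization and challenge
SACRIFICE_DELAY = 14   # days between challenge and sacrifice

challenge_day = IMMUNIZATION_DAYS[-1] + CHALLENGE_DELAY
sacrifice_day = challenge_day + SACRIFICE_DELAY

print(f"immunizations on days {IMMUNIZATION_DAYS}")
print(f"challenge on day {challenge_day}, sacrifice on day {sacrifice_day}")
```

Counting from the first immunization, challenge therefore falls on day 35 and the stomachs are collected on day 49.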
2.G. Production of monospecific polyclonal antibodies
2.G.1. Hyperimmune rabbit antiserum
New Zealand rabbits are injected both subcutaneously and intramuscularly with 100 µg of a purified fusion polypeptide, as obtained in 2.E.1. or 2.E.2., in the presence of Freund's complete adjuvant and in a total volume of approximately 2 ml. Twenty-one and 42 days after the initial injection, booster doses, which are identical to the priming doses, except that Freund's incomplete adjuvant is used, are administered in the same way. Fifteen days after the last injection, animal serum is recovered, decomplemented, and filtered through a 0.45 µm membrane.
2.G.2. Mouse hyperimmune ascites fluid
Ten mice are injected subcutaneously with 10-50 µg of a purified fusion polypeptide as obtained in 2.E.1. or 2.E.2., in the presence of Freund's complete adjuvant and in a volume of approximately 200 µl. Seven and 14 days after the initial injection, booster doses, which are identical to the priming doses, except that Freund's incomplete adjuvant is used, are administered in the same way. Twenty-one and 28 days after the initial injection, mice receive 50 µg of the antigen alone intraperitoneally. On day 21, mice are also injected intraperitoneally with sarcoma 180/TG cells CM26684 (Lennette et al., Diagnostic Procedures for Viral, Rickettsial, and Chlamydial Infections, 5th Ed. Washington DC, American Public Health Association, 1979). Ascites fluid is collected 10-13 days after the last injection.
EXAMPLE 3: Methods for producing transcriptional fusions lacking His-tags
Methods for amplification and cloning of DNA encoding the polypeptides of the invention as transcriptional fusions lacking His-tags are described as follows. Two PCR primers for each clone are designed based upon the sequences of the polynucleotides that encode them (SEQ ID NOs: 1-169 (odd numbers)). These primers can be used to amplify DNA encoding the polypeptides of the invention from any Helicobacter pylori strain, including, for example, ORV2001 and the strain deposited as ATCC deposit number 43579, as well as from other Helicobacter species.
The N-terminal primers are designed to include the ribosome binding site of the target gene, the ATG start site, and any signal sequence and cleavage site. The N-terminal primers can include a 5' clamp and a restriction endonuclease recognition site, such as that for BamHI (GGATCC), which facilitates subsequent cloning. Similarly, the C-terminal primers can include a restriction endonuclease recognition site, such as that for XhoI (CTCGAG), which can be used in subsequent cloning, and a TAA stop codon.
Amplification of genes encoding the polypeptides of the invention is carried out using Thermalase DNA Polymerase under the conditions described above in Example 2. Alternatively, Vent DNA polymerase (New England Biolabs), Pwo DNA polymerase (Boehringer Mannheim), or Taq DNA polymerase (Appligene) can be used, according to instructions provided by the manufacturers.
A single PCR product for each clone is amplified and cloned into appropriately cleaved pET 24 (e.g., BamHI-XhoI cleaved pET 24), resulting in construction of a transcriptional fusion that permits expression of the proteins without His-tags. The expressed products can be purified as denatured proteins that are refolded by dialysis into 1 M arginine.
Cloning into pET 24 allows transcription of the genes from the T7 promoter, which is supplied by the vector, but relies upon binding of the RNA-specific DNA polymerase to the intrinsic ribosome binding sites of the genes, and thereby expression of the complete ORF. The amplification, digestion, and cloning protocols are as described above for constructing translational fusions.
Amplification of clone GHPO 1190 DNA
Design of PCR primers for cloning
Two PCR primers are designed based on the complete gene sequence (see Table 1). The N-terminal primer (FC1) is designed to include the ribosome binding site of the target gene, the ATG start site, and the signal sequence (with cleavage site). It includes a clamp (GCC) at the 5' most end, and a SacI recognition sequence (GAGCTC) for cloning purposes.
The C-terminal primer (RN2) includes an XhoI recognition sequence for cloning purposes, and the natural TAA stop codon.
N-terminal primer (FC1):
5'-GCCGAGCTCCAAGCAAAAAAATGTCAATTAAAAGGG-3' (SEQ ID NO:189)
C-terminal primer (RN2):
5'-GCCCTCGAGGTCTAAATTAGAATAAGTGTTGTT-3' (SEQ ID NO:190)
Amplification of each specified gene can be achieved by employing FC1/RN2 primers for any of the genes described (see Table 1).
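The layout of the two primers above (5' clamp, restriction site, gene-specific sequence) can be sketched as a simple concatenation. The gene-specific segments are those listed for this clone in the primer table. The melting-temperature estimate uses the 2(A+T) + 4(G+C) rule of thumb, which is an assumption made here for illustration: the document does not state how its Tm values were calculated. The function names are hypothetical.

```python
CLAMP = "GCC"
SITES = {"SacI": "GAGCTC", "XhoI": "CTCGAG", "BamHI": "GGATCC"}


def build_primer(site_name: str, gene_specific: str) -> str:
    """Assemble clamp + restriction site + gene-specific sequence (5' to 3')."""
    return CLAMP + SITES[site_name] + gene_specific


def rule_of_thumb_tm(seq: str) -> int:
    """Estimate Tm as 2 degrees per A/T plus 4 degrees per G/C (assumed rule)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


FC1_GENE = "CAAGCAAAAAAATGTCAATTAAAAGGG"
RN2_GENE = "GTCTAAATTAGAATAAGTGTTGTT"

fc1 = build_primer("SacI", FC1_GENE)
rn2 = build_primer("XhoI", RN2_GENE)

print(fc1, len(FC1_GENE), rule_of_thumb_tm(FC1_GENE))
print(rn2, len(RN2_GENE), rule_of_thumb_tm(RN2_GENE))
```

Applied to the gene-specific segments only, the estimate gives 70 for FC1 and 60 for RN2, which agrees with the Tm entries listed for this clone in the primer table; this suggests, but does not prove, that a rule of this kind was used.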
PCR conditions
Amplification of gene-specific DNA is carried out using Pwo DNA Polymerase (Boehringer Mannheim) under the following conditions. Due to the exonuclease activity of the polymerase, two reaction mixtures are prepared separately and combined just prior to amplification.

Reaction ingredients:
Ingredient (final conc.)   Mixture 1 (µl)   Mixture 2 (µl)
distilled H2O              160              79
dNTPs (200 µM each)        40               -
10X buffer                 -                20
primer 1 (100 nM)          1                -
primer 2 (100 nM)          1                -
Template (200 ng)          2                0

Cycling conditions        Temp. (°C)   Time (min)   Number of cycles
Initial denaturing step   96           4            1
Denaturing step           94           0.5          20
Annealing step            50           1            20
Extension step            72           1            20
Final extension step      72           1            1

A single PCR product of 624 basepairs is amplified and cloned into SacI-XhoI cleaved pET 24, allowing construction of a transcriptional fusion and expression of GHPO 1190 antigen in the absence of a His-tag. In this instance, expressed product can be purified as a denatured protein that is re-folded by dialysis into 1 M arginine.
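The cycling program tabulated above can be written down as data and its total programmed hold time computed. This is a minimal sketch: the `Step` class and function name are illustrative, and the figure excludes the time the thermocycler spends ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    temp_c: int
    minutes: float
    cycles: int


# Cycling conditions as tabulated for the Pwo amplification.
PROGRAM = [
    Step("initial denaturing", 96, 4, 1),
    Step("denaturing", 94, 0.5, 20),
    Step("annealing", 50, 1, 20),
    Step("extension", 72, 1, 20),
    Step("final extension", 72, 1, 1),
]


def hold_minutes(program) -> float:
    """Sum the programmed hold times over all cycles (ramp times excluded)."""
    return sum(step.minutes * step.cycles for step in program)


print(f"programmed hold time: {hold_minutes(PROGRAM):.0f} min")
```

The 20 three-step cycles account for 50 minutes of the total, so the run is dominated by cycling rather than by the initial and final steps.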
Cloning into pET 24 allows transcription from the T7 promoter, supplied by the vector, but relies upon binding of the RNA-specific DNA polymerase to the intrinsic ribosome binding site for GHPO 1190, and thereby expression of the complete ORF. The amplification, restriction, and cloning protocols are as previously described for constructing translational fusions.
EXAMPLE 4: Purification of the polypeptides of the invention by immunoaffinity
4.A. Purification of specific IgGs
An immune serum, as prepared in section 2.G., is applied to a protein A Sepharose Fast Flow column (Pharmacia) equilibrated in 100 mM Tris-HCl (pH 8.0). The resin is washed by applying 10 column volumes of 100 mM Tris-HCl and 10 volumes of 10 mM Tris-HCl (pH 8.0) to the column. IgG antibodies are eluted with 0.1 M glycine buffer (pH 3.0) and are collected as ml fractions to which is added 0.25 ml 1 M Tris-HCl (pH 8.0). The optical density of the eluate is measured at 280 nm and the fractions containing the IgG antibodies are pooled, dialyzed against 50 mM Tris-HCl (pH 8.0), and, if necessary, stored frozen at -70°C.
4.B. Preparation of the column
An appropriate amount of CNBr-activated Sepharose 4B gel (1 g of dried gel provides for approximately 3.5 ml of hydrated gel; gel capacity is from 5 to 10 mg coupled IgG/ml of gel) manufactured by Pharmacia (17-0430-01) is suspended in 1 mM HCl buffer and washed on a Buchner funnel by adding small quantities of 1 mM HCl buffer. The total volume of buffer is 200 ml per gram of gel.
Purified IgG antibodies are dialyzed for 4 hours at 20±5°C against 50 volumes of 500 mM sodium phosphate buffer (pH 7.5). The antibodies are then diluted in 500 mM phosphate buffer (pH 7.5) to a final concentration of 3 mg/ml.
IgG antibodies are mixed with the gel overnight at 5±3°C. The gel is packed into a chromatography column and is washed with 2 column volumes of 500 mM phosphate buffer (pH 7.5), and 1 column volume of 50 mM sodium phosphate buffer, containing 500 mM NaCl (pH 7.5). The gel is then transferred to a tube, mixed with 100 mM ethanolamine (pH 7.5) for 4 hours at room temperature, and washed twice with 2 column volumes of PBS. The gel is then stored in 1/10,000 PBS/merthiolate. The amount of IgG antibodies coupled to the gel is determined by measuring the optical density (OD) at 280 nm of the IgG solution and the direct eluate, plus washings.
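The swelling factor and capacity figures quoted above can be used to estimate how much dried gel "an appropriate amount" is for a given quantity of IgG. The sketch below uses only the numbers stated in the text (3.5 ml hydrated gel per gram, 5 to 10 mg IgG per ml of gel, IgG at 3 mg/ml); the 10 ml example volume and the function name are illustrative assumptions.

```python
HYDRATED_ML_PER_G = 3.5          # ml of hydrated gel per gram of dried gel
CAPACITY_MG_PER_ML = (5.0, 10.0)  # mg of coupled IgG per ml of gel (low, high)
IGG_MG_PER_ML = 3.0              # concentration of the diluted IgG solution


def dried_gel_range_g(igg_mg: float) -> tuple:
    """Return (minimum, maximum) grams of dried gel needed to couple igg_mg."""
    low_capacity, high_capacity = CAPACITY_MG_PER_ML
    minimum = igg_mg / (HYDRATED_ML_PER_G * high_capacity)
    maximum = igg_mg / (HYDRATED_ML_PER_G * low_capacity)
    return minimum, maximum


# Example: 10 ml of IgG solution at 3 mg/ml (the volume is hypothetical).
igg_mg = 10 * IGG_MG_PER_ML
low, high = dried_gel_range_g(igg_mg)
print(f"{igg_mg:.0f} mg IgG needs about {low:.2f}-{high:.2f} g of dried gel")
```

Sizing for the lower capacity figure is the more conservative choice, since it leaves spare coupling sites rather than uncoupled antibody.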
4.C. Adsorption and elution of the antigen
An antigen solution in 50 mM Tris-HCl (pH 8.0), 2 mM EDTA, for example, the supernatant obtained in 3.E. or the solubilized pellet obtained in 3.E., after centrifugation and filtration through a 0.45 µm membrane, is applied to a column equilibrated with 50 mM Tris-HCl (pH 8.0), 2 mM EDTA, at a flow rate of about 10 ml/hour. The column is then washed with 20 volumes of 50 mM Tris-HCl (pH 8.0), 2 mM EDTA. Alternatively, adsorption can be achieved by mixing overnight at 5±3°C.
The adsorbed gel is washed with 2 to 6 volumes of 10 mM sodium phosphate buffer (pH 6.8) and the antigen is eluted with 100 mM glycine buffer (pH 2.5). The eluate is recovered in 3 ml fractions, to each of which is added 150 µl of 1 M sodium phosphate buffer (pH 8.0). Absorption is measured at 280 nm for each fraction; those fractions containing the antigen are pooled and stored at -20°C.
EXAMPLE 5: Preparation of isolated DNA encoding the polypeptides of the invention from the deposited clones.
As mentioned above, E. coli strains including plasmids containing nucleic acids encoding GHPO 1190 (formerly HPO76, ATCC# 98197), GHPO 1212 (formerly HPO18, ATCC# 98210), GHPO 1012 (formerly HPO121, ATCC# 98201), GHPO 1501 (formerly HPO45, ATCC# 98208), GHPO 1688 (formerly HPO101, ATCC# 98198), GHPO 346 (formerly HPO116, ATCC# 98200), GHPO 1200 (formerly HPO7, ATCC# 98211), GHPO 1538 (formerly HPO104, ATCC# 98199), GHPO 1398 (formerly HPO15, ATCC# 98214), GHPO 1001 (formerly HPO58, ATCC# 98206), GHPO 470 (formerly HPO132, ATCC# 98202), GHPO 689 (formerly HPO9, ATCC# 98203), GHPO 1550 (formerly HPO38, ATCC# 98204), GHPO 1620 (formerly HPO87, ATCC# 98205), GHPO 574 (formerly HPO71, ATCC# 98217), GHPO 329 (formerly HPO70, ATCC# 98219), GHPO 1374 (formerly HPO80, ATCC# 98215), GHPO 956 (formerly HPO95, ATCC# 98216), HPO 98 (ATCC# 98218), GHPO 1346 (formerly HPO57, ATCC# 98220), GHPO 706 (formerly HPO50, ATCC# 98207), GHPO 732 (formerly HPO64, ATCC# 98213), GHPO 419 (formerly HPO54, ATCC# 98212), and GHPO 276 (formerly HPO42, ATCC# 98209) were deposited in E. coli strain DH5α under the Budapest Treaty with the American Type Culture Collection (ATCC; Rockville, Maryland) on October 9, 1996 and were designated with accession numbers indicated in parentheses above. These plasmids each contain a genomic DNA BglII-ClaI insert from H. pylori strain P1 or P12 (referred to as 69-A and 888-0 in Haas et al., Mol. Microbiol. (1993) 8:753). Each of the inserts is disrupted by the presence of transposon TnMax9 (Kahrs et al., Gene (1995) 167:53). DNA molecules lacking the transposon can be amplified from the plasmids using standard PCR techniques, such as inverse and recombinant PCR (see, e.g., Innis et al., supra), so that a full-length H. pylori insert is reconstituted. For example, the H. pylori sequences flanking the transposon can each be amplified by PCR, and then ligated together to form the full-length H. pylori gene lacking the transposon. Primers that can be used in these methods for each of the twenty-four deposited clones of the invention are shown in Table 1. The locations of insertion of the transposon in each of the deposited clones are between the nucleotides indicated in parentheses after the name of each clone, as follows: HPO101 (497-498), GHPO 1538 (428-429), GHPO 346 (433-444), GHPO 1012 (463-464), GHPO 132 (408-409), GHPO 1212 (226-227), GHPO 1550 (347-348), GHPO 276 (372-373), GHPO 1501 (299-300), GHPO 706 (29-293), GHPO 419 (351-352), GHPO 1346 (266-267), GHPO 1001 (434-435), GHPO 732 (224-225), GHPO 329 (114-115), GHPO 574 (274-275), GHPO 1190 (412-413), GHPO 1200 (349-350), GHPO 1374 (105-106), GHPO 1620 (26-27), GHPO 956 (64-65), HPO 98 (43-44), and GHPO 689 (346-347).
EXAMPLE 6: Purification of recombinant H. pylori antigen from GHPO 1190.
A pellet of E. coli expressing GHPO 1190 is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi, and clarified by centrifugation at 4,000-5,000 x g.
Method 1
The pellet containing cloned protein is suspended in buffer containing 2% N-octyl glucoside (NOG) and is homogenized. The NOG-soluble protein is removed by centrifugation. The pellet is extracted one more time with 2% NOG. After centrifugation, the pellet is dissolved in 8 M urea. The urea-solubilized protein is diluted with an equal volume of 2 M arginine and dialyzed against 1 M arginine for 24-48 hours to remove urea. The cloned protein remains in solution. SDS-PAGE and Coomassie staining, followed by densitometric scanning, shows that the protein is 80-85% pure cloned antigen.
Method 2
The pellet containing cloned protein is solubilized in 6 M guanidine hydrochloride and is passed through an IMAC column charged with Ni2+. The bound antigen is eluted with 8 M urea (pH 8.5). β-Mercaptoethanol is added to the eluted protein to a final concentration of 1 mM, and the protein is then passed through a Sephadex G-25 column equilibrated in 0.1 M acetic acid. Protein eluted from the Sephadex G-25 column is slowly added to 4 volumes of 50 mM phosphate (pH 7.0). The protein remains in solution.
Purification of recombinant proteins
Recombinant proteins expressed as Histidine-tagged fusion proteins can be solubilized and purified by using a metal affinity column (nickel column). The bound protein can be eluted with imidazole buffer, with or without urea, or by using low pH buffers, with or without urea. Urea or guanidine hydrochloride-denatured proteins can then be renatured using appropriate renaturing buffers. With a number of recombinant H. pylori antigens (HpaA and clone GHPO 1190), renaturation conditions using arginine hydrochloride (0.25-1 M) have been determined.
Recombinant proteins without a His-tag can be solubilized and purified using immunoaffinity, ion-exchange, sizing, and/or hydrophobic chromatography. Proteins expressed as insoluble aggregates in inclusion bodies can be solubilized in denaturing agents, such as 8 M urea or 6 M guanidine hydrochloride. Appropriate folding and renaturation can readily be determined by one skilled in the art.
The above pellet containing cloned protein is suspended in 50 mM NaPO4 (pH 7.5) containing 1% weight/volume N-octyl glucoside (NOG) and mixed vigorously. The NOG-soluble impurities are removed by centrifugation. The remaining pellet is extracted one more time with the 1% NOG solution to further remove impurities. After centrifugation, the pellet is solubilized in 8 M urea, 50 mM Tris (pH 8.0). The urea-solubilized protein is diluted with an equal volume of 2 M arginine, 50 mM Tris (pH 8.0), and is dialyzed against 1 M arginine, 50 mM Tris, 50 mM NaCl (pH 8.0) for 24-48 hours to remove urea. The cloned protein remains in solution following dialysis. SDS-PAGE and Coomassie staining followed by densitometric scanning shows that the protein is 80-85% pure cloned antigen.
Other embodiments are within the following claims.
RE-CONSTRUCTION OF A COMPLETE ORF BY RECOMBINANT PCR

F denotes forward primer; R denotes reverse primer.
C denotes coding strand; N denotes non-coding strand.
All FC1 and RN2 primers have incorporated at their 5' end a clamp and a recognition sequence for cloning purposes.
GCC clamp present for amplification and cloning of entire gene sequence from chromosomal DNA.
[X] denotes any nucleotide sequence not present in the completed gene sequence.
( ) identifies the region of overlap between the two original PCR products, and is consistently 10 nucleotides long for each clone.

Clone  Primer  nt positions  Primer sequence (5' - 3')                     Length  Tm (°C)
76     FC1     304 -         GCC [X] CAAGCAAAAAAATGTCAATTAAAAGGG           27      70
                             (TATGGAACTTA)GAACATTTTAACACGCTCTATTA          33      60
       RN2     927 -         GCC [X] GTCTAAATTAGAATAAGTGTTGTT              24      60
18     FC1     101 -         GCC [X] AATATATGGGAACTTAATGAGAAT              24      60
       FC2     218 -         (AAATCTCGCA)GAAATCTTTCACAAGCGAGCAA            32      60
       RN2     922 -         GCC [X] ATGTCATGTCAAACTATGAAGC                22
121    FC1     141 -         GCC [X] TCACAATGGATAAAAACAACAACA              24      62
       RN1     451 -         GCCCTTTTGTTTAGGGGTTAG                         21
       FC2     455 - 485     (ACAAAAGGGC)TTTTTAGAGCATGTGAGCCATC            32      62
       RN2     814 -         GCC [X] CTGTCCAAATCAGCCACCC                   19      60
45     FC1     1 - 26        GCC [X] ATGAAAAGATTTGATTTGTTTf~ATC            26      62
       FC2     290 -         (AATACGGCTTTAAAGCTATAGAAAATTTAAACGC)          34      60
       RN2     603 -         GCC [X] TTAAATATCCCAATCCTGCCAC                22      62
101    FC1     308 -         GCC [X] GAAGGATTTATTATGATTAAAAGAA             25      60
       FC2     488 -         (AAATTAGGTT)TTGTAGGCTTTGCCAATAAATG            32      60
       RN2     893 -         GCC [X] AAGGAATAAATTAGAAAGTGAAGAA             25      62
116    FC1     236 -         GCC [X] CGCATTGATTTGATGAATAAACC               23      62
       FC2     425 -         (GTTATAGGCG)ATAAAGGTTTAACGCAGCTAAG            32      60
       RN2     812 -         GCC [X] CTCACTAAAAAGCAATTTTTGAG               23      60
7      FC1     195 -         GCC [X] TAAGGAATGAAGTTGATAAAATTTGT            26      64
       FC2     339 -         (ATGAAAATGC)ACGCCCAAATAATAAGGAAGTA            32      60
       RN2     738 -         GCC [X] GGATTTATTGAGCTTTCCCCTT                22      62
104    FC1     251 -         GCC [X] AAAGGGCGAAAATGAGCAAGA                 21      60
       RN1     429 -         TAAAATAACCAACAGAGTGATCA                       23      60
       FC2     420 -         (GGTTATTTTA)GTGGATATTTGGGTTTATAGCGA           33      62
       RN2     754 -         GCC [X] TTTTTTAAGAATCACTTTCTTCGG              24      62
58     FC1     118 - 143     GCC [X] ATAC_;('_AACAA(TC.ATt~ITTT1-fAAAAC;   26      66
       FC2     425 -         (CAAGACTTCA)AAAAAGAAGGAGCGGTTGCC              30      60
       RN2     650 -         GCC [X] CTGGCTTATTGCGTATCATC                  20      60
132    FC1     294 -         GCC [X] GGAAGAATAATGCTCGCTTCC                 25      62
       FC2     400 -         (ACACTCCAGT)AGATGCTTTCCCGGATATTrC             31      60
       RN2     761 -         GCC [X] CTATTCTCCAGGGATATGGCC                 21      64
9      FC1     211 -         GCC [X] GATGGATTTTTTATGGGGGTGAG               23      64
       FC2     338 -         (CGGCAGTGCC)TTTAGCCTATTATTTAGAAGCGA           33      60
       RN2     686 -         GCC [X] ATGGTATTTGTCTAAGACCCTC                22      62
38     FC1     220 -         GCC [X] AAAAGGGTTTTAAATAATGGCTG               23      60
       FC2     239 -         (TTATCCTTGT)TGCTGGCTTGGTTTTTTTTAATT           33      60
       RN2     597 -         GCC [X] AAGATTCTAAAAGGGCTTCAAAT               23      60
71     FC1     1 - 25        GCC [X] ATGTTGAAATTTAAATATGGTTTGA             25      60
       FC2     265 -         (AGTGGGGTTT)TTTTAGGGGGTGGGTATGCT              30      60
       RN2     - 524         GCC [X] GAGCCTACAGGTTGCTTGC                   20      60
70     FC1     1 - 23        GCC [X] ATGGTATTTGACAGAACAATCAG               23      62
       RN1     115 -         GAAAAGCCACCCCGCTTATT                          20      60
       FC2     106 -         (GTGGCTTTTC)AAAAAGAGTGGGTGCAACAATT            32      60
       RN2     495 -         GCC [X] TTAGGAATAGCATAACAAACAAACG             25      66
80     FC1     1 - 25        GCC [X] ATGTTAGAAAAATTGAITGAAAGAG             25      62
       FC2     97 -          (TATGTGTTCA)TGAAAGAGTTGTGGCACATGC             31      62
       RN2     435 -         GCC [X] TTATGCGATAGGGGGCGTATC                 21      66
95     FC1     1 - 27        GCC [X] ATGAAAAAATTTTITfCTCAATCTTT            27
       FC2     55 -          (CTACTGGCCA)TGGATGGCAATGGCGTTTnTtAG           34      68
       RN2     432 -         GCC [X] TTATTGATGAACATTAACCATTAAA
98     FC1     1 - 22        GCC [X] ATGAAAACCTTTAAAAACCTGC                22      58
       FC2     34 -          (CTGATCGCTA)TGAGTTGGCTCCAAGCGGA               29      60
       RN2     336 -         GCC [X] TTAAAACTCATAGCGTTTTTCAAT              24      60
42     FC1     18 - 51       GCC [X] GAGAGTAGTGGCAGAGTTTATGCTGATTCC        34      98
       RN1     380 - 351     (AACTTTTC)TCTATCCCAATTCGTTACGCTC              30      64
       FC2     366 - 396     (GGATAGA)GAAAAGTTTGGCGTCAAAAGTTGG             31      68
       RN2     822 - 801     GCC [X] GGCTTAAACTGGAACGGATTTC                22
50     FC1     140 - 170     GCC [X] TAAAGTTTGCTAAAAAGATGGTTTTAATT1        31      76
       RN1     297 - 270     (GACTTCTAAAG)CGTCCTT-TTITfCTTTA               28      56
       FC2     287 - 31      (CTTTA)GAAGTCATTAAACAAAGAGGGGT                29      64
       RN2     607 - 584     GCC [X] CCCATCTTTAGAAATCAACCCCCA              24      70
64     FC1     23 - 50       GCC [X] GAAATCAAGGAGTTTGTATGCAACAGCG          28      80
       RN1     225 - 149     (A)AGCTTTTCATTATCTTCCCCATAAGC                 27      74
       FC2     216 - 244     (fGAAAAGCT)TTTAGCGAAGCGATCAAGCC               20      60
       RN2     1039 - 1012   GCC [X] CCCAATACTTTTATTGATTCACCATTTC          28      74
54     FC1     21 - 48       GCC [X] CAATAAAACACCAAAATGAATGAGTTAC          28      68
       RN1     352 - 327     (A)GATTTTGTTTTGAGCGTTAGAAATG                  26
       FC2     345 - 376     (CAAAATC)TATAAACTCAATCAAGTCAAAAATG            32      62
       RN2     1280 - 1255   GCC [X] GCATTTACCCCCTAAAAACTATAAAC            26      70
15     FC1     14 - 35       GCC [X] CTGAAGGGTGTATGGTATTAGG                22      64
       RN1     157 - 132     (C)ACCATACATGTATCCTGCATTAATG                  26      68
       FC2     147 - 179     (CATGTATGGT)GTAGCAAAGAATTTTAAGGAGGC           33      64
       RN2     377 - 349     GCC [X] CGTTAAAACTAAAGTTCTATTTTTAATTC         29      70
57     FC1     13 - 39       GCC [X] GTAAGGAATGAGATGATAAAGAGTTGG           27      74
       RN1     267 - 244     (-~GGAATATTCTGATCCACGCCATC                    24
       FC2     258 - 294     (GAATATTCC)AAAAGCCGTTTTTTATTACAGAAGAC         37      76
       RN2     957 - 934     GCC [X] CTAAACTCTGGCTTATTGCGTATC              24      68
87     FC1     1 - 22        GCC [X] ATGCGTTTATTATTGTGGTGGG                22      62
       RN1     27 - 3        (C)AATACCCACCACAATAATAAACGCAT                 25
       FC2     18 - 50       (GTGGGTATT)GGTATTATCGCTCTTTTTAAATCC           33      64
       RN2     519 - 4       GCC [X] TTAAATTTTTAGGGAAAGGGTA                22      62

CONDITIONS FOR RECOMBINANT PCR
Two independent PCR reactions are carried out for FC1/RN1 and FC2/RN2 primers under the same conditions proposed for cloning genes for expression.
After 20 cycles, the product of each reaction is used as template for a further 20 cycles with FC1/RN2 only. The product will encompass the full-length gene minus the transposon.
The presence of restriction sites at the 5' ends of these primers allows for cloning/expression studies.
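The joining step described above relies on the two first-round products sharing a short identical stretch at their junction. The following sketch models that step on strings only: it joins two fragments at their longest shared end overlap of at least 10 nucleotides. The fragment sequences are invented toy sequences, not sequences from the deposited clones, and the function name is illustrative.

```python
def join_by_overlap(left: str, right: str, min_overlap: int = 10) -> str:
    """Join two fragments where the end of `left` matches the start of `right`.

    Tries the longest possible overlap first and raises ValueError if no
    overlap of at least `min_overlap` nucleotides is found.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("fragments do not share a sufficient overlap")


# Toy fragments standing in for the FC1/RN1 and FC2/RN2 products.
upstream = "ATGGCAGCATTCGGATCATTGC"     # ends with the 10-nt overlap
downstream = "GGATCATTGCTTAGACCGTAA"    # starts with the same 10 nt

full_length = join_by_overlap(upstream, downstream)
print(full_length, len(full_length))
```

In the actual procedure the joining is done by the polymerase during the second round of 20 cycles; the string model only shows why the overlap must be identical in both products for a single full-length product to form.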
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: ORAVAX, INC.
(ii) TITLE OF THE INVENTION: HELICOBACTER POLYPEPTIDES
AND CORRESPONDING POLYNUCLEOTIDE MOLECULES
(iii) NUMBER OF SEQUENCES: 190
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Clark & Elbing LLP
(B) STREET: 176 Federal Street
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: USA
(F) ZIP: 02110-2214
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: UNKNOWN
(B) FILING DATE: 14-NOV-1997
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/749,051
(B) FILING DATE: 14-NOV-1996
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/831,309
(B) FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/834,705
(B) FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/833,457
(B) FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/881,227
(B) FILING DATE: 24-JUN-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/902,615
(B) FILING DATE: 29-JUL-1997
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Clark, Paul T.
(B) REGISTRATION NUMBER: 30,175
(C) REFERENCE/DOCKET NUMBER: 06132/028W01
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 617-428-0200
(B) TELEFAX: 617-428-7045
(C) TELEX:
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 989 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 71...940
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Met Lys Phe Leu Arg Ser Val Tyr Ala Phe Cys Ser Ser
Trp Val Gly Thr Ile Val Ile Val Leu Leu Val Ile Phe Phe Ile Ala
Gln Ala Phe Ile Ile Pro Ser Arg Ser Met Val Gly Thr Leu Tyr Glu
Gly Asp Met Leu Phe Val Lys Lys Phe Ser Tyr Gly Ile Pro Ile Pro
Lys Ile Pro Trp Ile Glu Leu Pro Val Met Pro Asp Phe Lys Asn Asn
Gly His Leu Ile Glu Gly Asp Arg Pro Lys Arg Gly Glu Val Val Val
TTT ATC CCT CCC CAT GAA AAA AAG TCT TAC TAT GTT AAA AGG AAT TTT 397
Phe Ile Pro Pro His Glu Lys Lys Ser Tyr Tyr Val Lys Arg Asn Phe
95 100 105
Ala Ile Gly Gly Asp Glu Val Leu Phe Thr Asn Glu Gly Phe Tyr Leu
CAC CCT TTT GAG AGC GAC ACG GAC AAA AAT TAC ATC GCT AAA CAT TAC 493
His Pro Phe Glu Ser Asp Thr Asp Lys Asn Tyr Ile Ala Lys His Tyr
Pro Asn Ala Met Thr Lys Glu Phe Met Gly Lys Ile Phe Val Leu Asn
CCT TAT AAA AAT GAG CAT CCG GGT ATC CAT TAC CAA AAA GAC AAT GAA 589
Pro Tyr Lys Asn Glu His Pro Gly Ile His Tyr Gln Lys Asp Asn Glu
ACC TTC CAC TTA ATG GAG CAA TTA GCC ACT CAA GGC GCA GAA GCT AAT 637
Thr Phe His Leu Met Glu Gln Leu Ala Thr Gln Gly Ala Glu Ala Asn
Ile Ser Met Gln Leu Ile Gln Met Glu Gly Glu Lys Val Phe Tyr Lys
190 195 200 205
Lys Ile Asn Asp Asp Glu Phe Phe Met Ile Gly Asp Asn Arg Asp Asn
Ser Ser Asp Ser Arg Phe Trp Gly Ser Val Ala Tyr Lys Asn Ile Val
Gly Ser Pro Trp Phe Val Tyr Phe Ser Leu Ser Leu Lys Asn Ser Leu
Glu Met Asp Ala Glu Asn Asn Pro Lys Lys Arg Tyr Leu Val Arg Trp
GAA CGC ATG TTT AAA AGC GTT GGA GGC TTA GAA AAA ATC ATT AAA AAA 925
Glu Arg Met Phe Lys Ser Val Gly Gly Leu Glu Lys Ile Ile Lys Lys
Glu Asn Ala Thr His
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 290 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Lys Phe Leu Arg Ser Val Tyr Ala Phe Cys Ser Ser Trp Val Gly
Thr Ile Val Ile Val Leu Leu Val Ile Phe Phe Ile Ala Gln Ala Phe
Ile Ile Pro Ser Arg Ser Met Val Gly Thr Leu Tyr Glu Gly Asp Met
Leu Phe Val Lys Lys Phe Ser Tyr Gly Ile Pro Ile Pro Lys Ile Pro
Trp Ile Glu Leu Pro Val Met Pro Asp Phe Lys Asn Asn Gly His Leu
Ile Glu Gly Asp Arg Pro Lys Arg Gly Glu Val Val Val Phe Ile Pro
Pro His Glu Lys Lys Ser Tyr Tyr Val Lys Arg Asn Phe Ala Ile Gly
Gly Asp Glu Val Leu Phe Thr Asn Glu Gly Phe Tyr Leu His Pro Phe
115 120 125
Glu Ser Asp Thr Asp Lys Asn Tyr Ile Ala Lys His Tyr Pro Asn Ala
Met Thr Lys Glu Phe Met Gly Lys Ile Phe Val Leu Asn Pro Tyr Lys
145 150 155 160
Asn Glu His Pro Gly Ile His Tyr Gln Lys Asp Asn Glu Thr Phe His
165 170 175
Leu Met Glu Gln Leu Ala Thr Gln Gly Ala Glu Ala Asn Ile Ser Met
180 185 190
Gln Leu Ile Gln Met Glu Gly Glu Lys Val Phe Tyr Lys Lys Ile Asn
Asp Asp Glu Phe Phe Met Ile Gly Asp Asn Arg Asp Asn Ser Ser Asp
Ser Arg Phe Trp Gly Ser Val Ala Tyr Lys Asn Ile Val Gly Ser Pro
Trp Phe Val Tyr Phe Ser Leu Ser Leu Lys Asn Ser Leu Glu Met Asp
Ala Glu Asn Asn Pro Lys Lys Arg Tyr Leu Val Arg Trp Glu Arg Met
Phe Lys Ser Val Gly Gly Leu Glu Lys Ile Ile Lys Lys Glu Asn Ala
Thr His
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 514 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 112...471 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GGATTTTTTA GAGCTCTTAG TCAATGATAA TGTGGTAGAi'~ ACGATTGAAA AAGGCTTTGT 60 Met Gly GCA GTG GTT GTT TTA TTT TTA ACG CTG GTT TTi~ TTG TTT TTA GTT TTA 165 Ala Val Val Val Leu Phe Leu Thr Leu Val Leu Leu Phe Leu Val Leu Arg Asp Phe Gly Leu Ala Ser Pro Lys Gln Lys Ile Leu Ala Phe Leu ATC GTA GGG ATT ATA GGA GCG AGC ATC AGC GT'r TAT ACT TAC AAG CAA 261 Ile Val Gly Ile Ile Gly Ala Ser Ile Ser Val Tyr Thr Tyr Lys Gln AAC CAA CAA AAC CAA CAA GAG ATC GCT TTG CAe~ AGA GCG TTT TTA AGG 309 Asn Gln Gln Asn Gln Gln Glu Ile Ala Leu Gln Arg Ala Phe Leu Arg Gly Glu Thr Leu Leu Cys Lys Gly Ile Lys Val Asn Asn Gln Thr Phe Asn Leu Val Ser Gly Thr Leu Ser Phe Leu Gly Lys Lys Gln Thr Pro ATG AAA GAC GTT CTT GTG GAT TTG GAT TCT TG'P CAG ACG CTC CAA AAA 453 Met Lys Asp Val Leu Val Asp Leu Asp Ser Cys Gln Thr Leu Gln Lys GAT CCC TTA ATC CAA CCC TAATGATGAA TAATAATi'~AT ACCCCACCCA AACCCCTA 509 Asp Pro Leu Ile Gln Pro
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 120 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Gly Ala Val Val Val Leu Phe Leu Thr Leu Val Leu Leu Phe Leu Val Leu Arg Asp Phe Gly Leu Ala Ser Pro Lys Gln Lys Ile Leu Ala Phe Leu Ile Val Gly Ile Ile Gly Ala Ser Ile Ser Val Tyr Thr Tyr Lys Gln Asn Gln Gln Asn Gln Gln Glu Ile Ala Leu Gln Arg Ala Phe Leu Arg Gly Glu Thr Leu Leu Cys Lys Gly Ile Lys Val Asn Asn Gln Thr Phe Asn Leu Val Ser Gly Thr Leu Ser Phe Leu Gly Lys Lys Gln Thr Pro Met Lys Asp Val Leu Val Asp Leu Asp Ser Cys Gln Thr Leu 100 105 110 Gln Lys Asp Pro Leu Ile Gln Pro 115 120
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1233 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 135...1049 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Met Lys Lys Ala Leu Leu Leu Thr Leu Ser Leu Ser Phe Trp Leu His Ala Glu Arg Asn Gly Phe Tyr Leu Gly Leu Asn Phe -so-Leu Glu Gly Ser Tyr Ile Lys Gly Gln Gly Se:r Ile Gly Lys Lys Ala TCA GCA GAA AAC GCC TTA AAT GAA GCG ATC AA'C AAC GCA AAA AAT TCA 314 Ser Ala Glu Asn Ala Leu Asn Glu A1a Ile Asn Asn Ala Lys Asn Ser Leu Phe Pro Asn Thr Lys Ala Ile Arg Asp Ala Gln Asn Ala Leu Asn Ala Val Lys Asp Ser Asn Lys Ile Ala Ser Arc_3 Phe Ala Gly Asn Gly GGA TCG GGC GGT CTT TTT AAT GAG CTC AGC TT7.' GGG TAT AAA TAT TTT 458 Gly Ser Gly Gly Leu Phe Asn Glu Leu Ser Phee Gly Tyr Lys Tyr Phe TTG GGT AAA AAA AGG ATT ATA GGG TTT AGG CAC' TCT CTT TTT TTC GGT 506.
Leu Gly Lys Lys Arg Ile Ile Gly Phe Arg Hia Ser Leu Phe Phe Gly TAC CAA CTT GGT GGC GTT GGT TCT GTT CCT GG7.' AGC GGT TTA ATC GTT 554 Tyr Gln Leu Gly Gly Val Gly Ser Val Pro Gly Ser Gly Leu Ile Val TTT TTA CCC TAT GGT TTC AAT ACG GAT TTG CTC: ATT AAT TGG ACT AAC 602 Phe Leu Pro Tyr Gly Phe Asn Thr Asp Leu Leu Ile Asn Trp Thr Asn GAT AAG CGA GCG TCC CAA AAA TAT GTT GAA CGP, AGG GTA AAA GGG CTC 650 Asp Lys Arg Ala Ser Gln Lys Tyr Val Glu Arc_~ Arg Val Lys Gly Leu l60 165 170 Ser Ile Phe Tyr Lys Asp Met Thr Gly Arg Thr Leu Asp Ala Asn Thr TTA AAA AAA GCA TCA AGG CAT GTA TTT AGA AAA, TCT TCA GGG CTT GTG 746 Leu Lys Lys Ala Ser Arg His Val Phe Arg Lys Ser Ser Gly Leu Val Ile Gly Met Glu Leu Gly Gly Ser Thr Trp Phe Ala Ser Asn Asn Leu 205 2l0 215 220 ACC CCT TTC AAT CAA GTC AAG AGT CGC ACG ATT' TTT CAG TTG CAA GGA 842 Thr Pro Phe Asn Gln Val Lys Ser Arg Thr Ile Phe Gln Leu Gln Gly Lys Phe Gly Val Arg Trp Asn Asn Asp Glu Tyr Asp Ile Asp Arg Tyr -s1-Gly Asp Glu Ile Tyr Leu Gly Gly Ser Ser Val Glu Leu Gly Val Lys Val Pro Ala Phe Lys Val Asn Tyr Tyr Ser Asp Asp Tyr Gly Asp Lys Leu Asp Tyr Lys Arg Val Val Ser Val Tyr Leu Asn Tyr Thr Tyr Asn Phe Lys Asn Lys His AACCTTATTT TTTATTAGCT TGAAACTCTT CAAAGCCTTT TTTTCTCAAT TGGCATGCCG 1l50 (2) INFORMATION FOR SEQ ID N0:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Met Lys Lys Ala Leu Leu Leu Thr Leu Ser Leu Ser Phe Trp Leu His Ala Glu Arg Asn Gly Phe Tyr Leu Gly Leu Asn Phe Leu Glu Gly Ser Tyr Ile Lys Gly Gln Gly Ser Ile Gly Lys Lys Ala Ser Ala Glu Asn Ala Leu Asn Glu Ala Ile Asn Asn Ala Lys Asn Ser Leu Phe Pro Asn Thr Lys Ala Ile Arg Asp Ala Gln Asn Ala Leu Asn Ala Val Lys Asp Ser Asn Lys Ile Ala Ser Arg Phe Ala Gly Asn Gly Gly Ser Gly Gly Leu Phe Asn Glu Leu Ser Phe Gly Tyr Lys Tyr Phe Leu Gly Lys Lys Arg Ile Ile Gly Phe Arg His Ser Leu Phe Phe Gly Tyr Gln Leu Gly 115 120 125 Gly Val Gly Ser Val Pro Gly Ser Gly Leu Ile Val Phe Leu Pro Tyr Gly Phe Asn Thr Asp Leu Leu Ile Asn Trp Thr Asn Asp Lys Arg Ala Ser Gln Lys Tyr Val Glu Arg Arg Val Lys Gly Leu Ser Ile Phe Tyr 165 170 175 Lys Asp Met Thr Gly Arg Thr Leu Asp Ala Asn Thr Leu Lys Lys Ala Ser Arg His Val Phe Arg Lys Ser Ser Gly Leu Val Ile Gly Met Glu Leu Gly Gly Ser Thr Trp Phe Ala Ser Asn Asn Leu Thr Pro Phe Asn Gln Val Lys Ser Arg Thr Ile Phe Gln Leu Gln Gly Lys Phe Gly Val Arg Trp Asn Asn Asp Glu Tyr Asp Ile Asp Arg Tyr Gly Asp Glu Ile 245 250 255 Tyr Leu Gly Gly Ser Ser Val Glu Leu Gly Val Lys Val Pro Ala Phe Lys Val Asn Tyr Tyr Ser Asp Asp Tyr Gly Asp Lys Leu Asp Tyr Lys Arg Val Val Ser Val Tyr Leu Asn Tyr Thr Tyr Asn Phe Lys Asn Lys His
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3012 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 142...2682 (D) OTHER INFORMATION:
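The LOCATION span given above for SEQ ID NO:7 can be cross-checked against the amino acid LENGTH stated for the corresponding protein (SEQ ID NO:8). The following is a minimal sketch, assuming the span is 1-based and inclusive and covers sense codons only (no stop codon), which is what the figures in this listing appear to follow. The function name `cds_codon_count` and the `FEATURES` table are illustrative and are not part of the listing.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of complete codons in an inclusive, 1-based span."""
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"span {start}...{end} is not a whole number of codons")
    return span // 3


# (start, end, stated protein length) as they appear in this listing.
FEATURES = {
    "SEQ ID NO:7": (142, 2682, 847),
    "SEQ ID NO:11": (66, 980, 305),
    "SEQ ID NO:13": (77, 535, 153),
    "SEQ ID NO:15": (155, 1033, 293),
    "SEQ ID NO:19": (133, 879, 249),
}

for name, (start, end, stated_length) in FEATURES.items():
    codons = cds_codon_count(start, end)
    status = "consistent" if codons == stated_length else "MISMATCH"
    print(f"{name}: {codons} codons vs {stated_length} residues ({status})")
```

A check of this kind is also a quick way to catch digits misread during text extraction, since a single wrong digit in a LOCATION field usually breaks the divisibility by three or the match with the stated length.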
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Met Lys Val Lys Ser Ile Ser Tyr Val Gly Leu Ser Tyr Met Ser Asp Met Leu Ala Asn Glu Ile Val Lys Ile Arg ~ 15 20 25 Va1 Gly Asp Ile Val Asp Ser Lys Lys Ile Asp Thr Ala Val Leu Ala Leu Phe Asn Gln Gly Tyr Phe Lys Asp Val Tyr Ala Thr Phe Glu Gly Gly Ile Leu Glu Phe His Phe Asp Glu Lys Ala Arg Ile Ala Gly Val GAA ATC AAG GGT TAT GGG ACT GAA AAG GAA AAA GAC GGC TTA AAA TCC 4l1 Glu Ile Lys Gly Tyr Gly Thr Glu Lys Glu Lys Asp Gly Leu Lys Ser Gln Met Gly Ile Lys Lys Gly Asp Thr Phe Asp Glu Gln Lys Leu Glu 95 100 l05 His Ala Lys Thr Ala Leu Lys Thr Ala Leu Glu G1y Gln Gly Tyr Tyr Gly Ser Val Val Glu Val Arg Thr Glu Lys Val Ser Glu Gly Ala Leu 125 l30 l35 Leu Ile Val Phe Asp val Asn Arg Gly Asp Ser Ile Tyr Ile Lys Gln l40 145 l50 Ser Ile Tyr Glu Gly Ser Ala Lys Leu Lys Arg Arg Met Ile Glu Ser 155 160 165 l70 Leu Ser Ala Asn Lys Gln Arg Asp Phe Met Gly Trp Met Trp Gly Leu l75 l80 185 Asn Asp Gly Lys Leu Arg Leu Asp Gln Leu Glu Tyr Asp Ser Met Arg 190 l95 200 ATC CAA GAT GTG TAT ATG CGT AGG GGT TAC TTA GAC GCT CAT ATT'TCT 795 Ile Gln Asp Val Tyr Met Arg Arg Gly Tyr Leu Asp Ala His Ile Ser Ser Pro Phe Leu Lys Thr Asp Phe Ser Thr His Asp Ala Lys Leu His Tyr Lys Val Lys Glu Gly Ile Gln Tyr Arg Ile Ser Asp Ile Leu Ile Glu Ile Asp Asn Pro Val Val Pro Leu Lys Thr Leu Glu Lys Ala Leu Lys Val Lys Arg Lys Asp Val Phe Asn Ile Glu His Leu Arg Ala Asp GCG CAA ATT TTA AAA ACC GAA ATC GCC GAT .AAG GGT TAT GCG TTT GCG 1035 Ala Gln Ile Leu Lys Thr Glu Ile Ala Asp Lys C3ly Tyr Ala Phe Ala GTG GTG AAG CCA GAC TTG GAT AAA GAT GAA .AAA AAC GGG CTT GTG AAA 1b83 ~ Val Val Lys Pro Asp Leu Asp Lys Asp Glu :Lys Asn Gly Leu Val Lys 300 305 3l0 Val Ile Tyr Arg Ile Glu Val Gly Asp Met 'Jal Tyr Ile Asn Asp Val ATC ATT TCA GGG AAC CAG CGC ACG AGC GAT AGG ATC ATT AGA AGG GAG 1l79 Ile Ile Ser Gly Asn Gln Arg Thr Ser Asp Arg Ile Ile Arg Arg Glu TTA TTG TTA GGG CCT AAG GAT AAA TAC AAC 'rTG ACC AAA CTG AGA AAT 1227 Leu Leu Leu Gly Pro Lys Asp Lys Tyr Asn :~eu Thr Lys Leu Arg Asn TCC GAA AAT TCT TTA AGG CGT TTA 
GGA TTC 'CTC TCT AAA GTC AAA ATT 1275 Ser Glu Asn Ser Leu Arg Arg Leu Gly Phe 7?he Ser Lys Val Lys Ile Glu Glu Lys Arg Val Asn Ser Ser Leu Met Asp Leu Leu Val Ser Val GAA GAG GGG CGT ACT GGG CAG TTG CAA TTT (3GG TTA GGC TAT GGC TCT 1371 Glu Glu Gly Arg Thr Gly Gln Leu Gln Phe Gly Leu Gly Tyr Gly Ser 395 400 405 4l0 Tyr Gly Gly Leu Met Leu Asn Gly Ser Val Ser Glu Arg Asn Leu Phe GGC ACA GGG CAA AGC ATG AGC TTG TAT GCT AAC ATC GCT ACA GGG GGG l467 Gly Thr Gly Gln Ser Met Ser Leu Tyr Ala Asn Ile Ala Thr Gly Gly Gly Arg Ser Tyr Pro Gly Met Pro Lys Gly Ala Gly Arg Met Phe Ala GGG AAT TTG AGC TTG ACT AAT CCA AGG ATT 7.'TT GAC AGC TGG TAT AGC 1563 Gly Asn Leu Ser Leu Thr Asn Pro Arg Ile Phe Asp Ser Trp Tyr Ser Ser Thr Ile Asn Leu Tyr Ala Asp Tyr Arg 7.1e Ser Tyr Gln Tyr Ile 475 480 9a85 490 Gln Gln Gly Gly Gly Phe Gly Val Asn Val Gly Arg Met Leu Gly Asn WO 98/21225 PCTlUS97121353 -AGA ACC CAT GTG AGC TTA GGG TAT AAC TTG AAT GTT ACC AAA CTC CTT l707 Arg Thr His Val Ser Leu Gly Tyr Asn Leu Asn Val Thr Lys Leu Leu Gly Phe Ser Ser Pro Leu Tyr Asn Arg Tyr Tyr Ser Ser Val Asn Glu GTG GTT TCT CCA AGG CAA TGT TCT ACC CCC GCA TCG GTG ATT ATC AAT l803 Val Val Ser Pro Arg Gln Cys Ser Thr Pro Ala Ser Val Ile Ile Asn Arg Leu Ser Gly Gly Lys Thr Pro Leu Gln Pro Glu Ser Cys Ser Ser Pro Gly Ala Ile Thr Thr Sex Pro Glu Ile Arg Gly Ile Trp Asp Arg Asp Tyr His Thr Pro Ile Thr Ser Ser Phe Thr Leu Asp Val Ser Tyr Asp Asn Thr Asp Asp Tyr Tyr Phe Pro Arg Asn Gly Val Ile Phe Ser 605 6l0 615 Ser Tyr Ala Thr Met Ser Gly Leu Pro Ser Ser Gly Thr Leu Asn Ser Trp Asn Gly Leu Gly Gly Asn Val Arg Asn Thr Lys Val Tyr Gly Lys Phe Ala Ala Tyr His His Leu Gln Lys Tyr Leu Leu Ile Asp Leu Ile Ala Arg Phe Lys Thr Gln Gly Gly Tyr Ile Phe Arg Tyr Asn Thr Asp Asp Tyr Leu Pro Leu Asn Ser Thr Phe Tyr Met Gly Gly Val Thr Thr Val Arg Gly Phe Arg Asn Gly Ser Val Thr Pro Lys Asp Glu Phe Gly TTG TGG CTT GGA GGC GAT GGG ATT TTT ACC GCT TCT ACT GAA TTG AGC 233l Leu Trp Leu Gly Gly Asp Gly Ile Phe Thr Ala Ser Thr.Glu Leu 
Ser WO 98I21225 PCT/US97/21353 w _g6._ Tyr Gly Val Leu Lys Ala Ala Lys Met Arg Leu Ala Trp Phe Phe Asp Phe Gly Phe Leu Thr Phe Lys Thr Pro Thr Arg Gly Ser Phe Phe Tyr _ AAC GCT CCT GTT ACG ACA GCG AAT TTT AAA GAT TAT GGC GTT ATA GGG 247S
Asn Ala Pro Val Thr Thr Ala Asn Phe Lys Asp Tyr Gly Val Ile Gly Ala Gly Phe Glu Arg Ala Thr Trp Arg Ala Ser Thr Gly Leu Gln Ile Glu Trp Ile Ser Pro Met Gly Pro Leu Val Leu Ile Phe Pro Ile Ala TTT TTC AAC CAA TGG GGC GAT GGC AAT GGC AAG AAA TGT AAA GGG CTA 2.619 Phe Phe Asn Gln Trp Gly Asp Gly Asn Gly Lys Lys Cys Lys Gly Leu Cys Phe Asn Pro Asn Met Asp Asp Tyr Thr Gln His Phe Glu Phe Ser Met Gly Thr Arg Phe CTGAAAACTTGACGAC'TTTT ATTGTGGATAGGAATA.TCAATTACACCAATATTTGTTTTG2843 (2) INFORMATION FOR SEQ ID N0:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 847 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
Met Lys Val Lys Ser Ile Ser Tyr Val Gly Leu Ser Tyr Met Ser Asp Met Leu Ala Asn Glu Ile Val Lys Ile Arg Val Gly Asp Ile Val Asp Ser Lys Lys Ile Asp Thr Ala Val Leu Ala Leu Phe Asn Gln Gly Tyr Phe Lys Asp Val Tyr Ala Thr Phe Glu Gly Gly Ile Leu Glu Phe His Phe Asp Glu Lys Ala Arg Ile Ala Gly Val Glu Ile Lys Gly Tyr Gly Thr Glu Lys Glu Lys Asp Gly Leu Lys Ser Gln Met Gly Ile Lys Lys Gly Asp Thr Phe Asp Glu Gln Lys Leu Glu His Ala Lys Thr Ala Leu 100 105 110 Lys Thr Ala Leu Glu Gly Gln Gly Tyr Tyr Gly Ser Val Val Glu Val Arg Thr Glu Lys Val Ser Glu Gly Ala Leu Leu Ile Val Phe Asp Val Asn Arg Gly Asp Ser Ile Tyr Ile Lys Gln Ser Ile Tyr Glu Gly Ser 145 150 155 160 Ala Lys Leu Lys Arg Arg Met Ile Glu Ser Leu Ser Ala Asn Lys Gln Arg Asp Phe Met Gly Trp Met Trp Gly Leu Asn Asp Gly Lys Leu Arg 180 185 190 Leu Asp Gln Leu Glu Tyr Asp Ser Met Arg Ile Gln Asp Val Tyr Met Arg Arg Gly Tyr Leu Asp Ala His Ile Ser Ser Pro Phe Leu Lys Thr 210 215 220 Asp Phe Ser Thr His Asp Ala Lys Leu His Tyr Lys Val Lys Glu Gly Ile Gln Tyr Arg Ile Ser Asp Ile Leu Ile Glu Ile Asp Asn Pro Val Val Pro Leu Lys Thr Leu Glu Lys Ala Leu Lys Val Lys Arg Lys Asp Val Phe Asn Ile Glu His Leu Arg Ala Asp Ala Gln Ile Leu Lys Thr Glu Ile Ala Asp Lys Gly Tyr Ala Phe Ala Val Val Lys Pro Asp Leu Asp Lys Asp Glu Lys Asn Gly Leu Val Lys Val Ile Tyr Arg Ile Glu Val Gly Asp Met Val Tyr Ile Asn Asp Val Ile Ile Ser Gly Asn Gln Arg Thr Ser Asp Arg Ile Ile Arg Arg Glu Leu Leu Leu Gly Pro Lys Asp Lys Tyr Asn Leu Thr Lys Leu Arg Asn Ser Glu Asn Ser Leu Arg Arg Leu Gly Phe Phe Ser Lys Val Lys Ile Glu Glu Lys Arg Val Asn Ser Ser Leu Met Asp Leu Leu Val Ser Val Glu Glu Gly Arg Thr Gly Gln Leu Gln Phe Gly Leu Gly Tyr Gly Ser Tyr Gly Gly Leu Met Leu Asn Gly Ser Val Ser Glu Arg Asn Leu Phe Gly Thr Gly Gln Ser Met Ser Leu Tyr Ala Asn Ile Ala Thr Gly Gly Gly Arg Ser Tyr Pro Gly Met Pro Lys Gly Ala Gly Arg Met Phe Ala Gly Asn Leu Ser Leu Thr Asn Pro Arg Ile Phe Asp Ser Trp Tyr Ser Ser Thr Ile Asn Leu Tyr Ala Asp Tyr Arg Ile Ser Tyr Gln Tyr Ile Gln Gln Gly Gly Gly Phe Gly Val Asn Val Gly Arg Met Leu Gly Asn Arg Thr His Val Ser Leu Gly Tyr Asn Leu Asn Val Thr Lys Leu Leu Gly Phe Ser Ser Pro Leu Tyr Asn Arg Tyr Tyr Ser Ser Val Asn Glu Val Val Ser Pro Arg Gln Cys Ser Thr Pro Ala Ser Val Ile Ile Asn Arg Leu Ser Gly Gly Lys Thr Pro Leu Gln Pro Glu Ser Cys Ser Ser Pro Gly Ala Ile Thr Thr Ser Pro Glu Ile Arg Gly Ile Trp Asp Arg Asp Tyr His Thr Pro Ile Thr Ser Ser Phe Thr Leu Asp Val Ser Tyr Asp Asn Thr Asp Asp Tyr Tyr Phe Pro Arg Asn Gly Val Ile Phe Ser Ser Tyr Ala Thr Met Ser Gly Leu Pro Ser Ser Gly Thr Leu Asn Ser Trp Asn Gly Leu Gly Gly Asn Val Arg Asn Thr Lys Val Tyr Gly Lys Phe Ala Ala Tyr His His Leu Gln Lys Tyr Leu Leu Ile Asp Leu Ile Ala Arg Phe Lys Thr Gln Gly Gly Tyr Ile Phe Arg Tyr Asn Thr Asp Asp Tyr Leu Pro Leu Asn Ser Thr Phe Tyr Met Gly Gly Val Thr Thr Val Arg Gly Phe Arg Asn Gly Ser Val Thr Pro Lys Asp Glu Phe Gly Leu Trp Leu Gly Gly Asp Gly Ile Phe Thr Ala Ser Thr Glu Leu Ser Tyr Gly Val Leu Lys Ala Ala Lys Met Arg Leu Ala Trp Phe Phe Asp Phe Gly Phe Leu Thr Phe Lys Thr Pro Thr Arg Gly Ser Phe Phe Tyr Asn Ala Pro Val Thr Thr Ala Asn Phe Lys Asp Tyr Gly Val Ile Gly Ala Gly Phe Glu Arg Ala Thr Trp Arg Ala Ser Thr Gly Leu Gln Ile Glu Trp Ile Ser Pro Met Gly Pro Leu Val Leu Ile Phe Pro Ile Ala Phe Phe Asn Gln Trp Gly Asp Gly Asn Gly Lys Lys Cys Lys Gly Leu Cys Phe Asn Pro Asn Met Asp Asp Tyr Thr Gln His Phe Glu Phe Ser Met Gly Thr Arg Phe
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1032 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 149...913 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
Met Asp Ile Tyr Ala Leu Tyr Ile Ala Ile Gly Leu Phe Thr Gly Ile Leu Ser Gly Ile Phe Gly Ile Gly Gly Gly Leu Ile Ile Val Pro Ile Met Leu Ala Thr Gly His Ser Phe Glu Glu Ser Ile Gly Ile Ser Ile Leu Gln Met Ala Leu Ser Ser Phe Val Gly Ser Val Leu Asn Phe Lys Lys Lys Ser Leu Asp Phe Ser Leu Gly Leu Leu Ile Gly Ala Gly Gly Leu Ile Gly Ala Ser Phe Ser Gly Phe Val Leu Lys Ile Val Ser Ser Lys Ile Leu Met Val Ile Phe Ala Leu Leu Val Val Tyr Ser Met Ile Gln Phe Val Leu Lys Pro Lys Lys l05 110 115 120 Lys Asp Leu Ile Ala Asp Thr Lys Arg Tyr His Leu Gln Gly Leu Lys l25 l30 135 Leu Phe Leu Ile Gly Thr Leu Thr Gly Phe Phe Ala Ile Thr Leu Gly l40 145 150 Ile Gly Gly Gly Met Leu Met Val Pro Leu Met His Tyr Phe Leu Gly TAT GAT TCT AAA AAA TGC GTG GCT CTA GGG 'TTA TTT TTC ATC TTG TTT 700 Tyr Asp Ser Lys Lys Cys Val Ala Leu Gly :Leu Phe Phe Ile Leu Phe l70 175 1B0 TCT TCT ATT TCA GGA GCT TTT TCT TTA ATG 'TAT CAC CAC ATC ATC AAT 748 Ser Ser Ile Ser Gly Ala Phe Ser Leu Met 'Tyr His His Ile Ile Asn Lys Glu Val Leu Leu Ala Gly Ala Ile Val Gly Leu Gly Ser Val Met 205 2l0 215 Gly Val Ser Ile Gly Ile Lys Trp Ile Met Gly Leu Leu Asn Glu Lys Met His Lys Ala Leu Ile Leu Gly Val Tyr Gly Leu Ser Leu Leu Ile GTT TTA TAC AAA CTC TTT TTT TAATTGATGG T'TTTATACCA CTACTATTTT RAGA 9.47 Val Leu Tyr Lys Leu Phe Phe (2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Met Asp Ile Tyr Ala Leu Tyr Ile Ala Ile Gly Leu Phe Thr Gly Ile Leu Ser Gly Ile Phe Gly Ile Gly Gly Gly Leu Ile Ile Val Pro Ile Met Leu Ala Thr Gly His Ser Phe Glu Glu Ser Ile Gly Ile Ser Ile Leu Gln Met Ala Leu Ser Ser Phe Val Gly Ser Val Leu Asn Phe Lys Lys Lys Ser Leu Asp Phe Ser Leu Gly Leu Leu Ile Gly Ala Gly Gly 65 70 75 80 Leu Ile Gly Ala Ser Phe Ser Gly Phe Val Leu Lys Ile Val Ser Ser Lys Ile Leu Met Val Ile Phe Ala Leu Leu Val Val Tyr Ser Met Ile Gln Phe Val Leu Lys Pro Lys Lys Lys Asp Leu Ile Ala Asp Thr Lys 115 120 125 Arg Tyr His Leu Gln Gly Leu Lys Leu Phe Leu Ile Gly Thr Leu Thr 130 135 140 Gly Phe Phe Ala Ile Thr Leu Gly Ile Gly Gly Gly Met Leu Met Val Pro Leu Met His Tyr Phe Leu Gly Tyr Asp Ser Lys Lys Cys Val Ala Leu Gly Leu Phe Phe Ile Leu Phe Ser Ser Ile Ser Gly Ala Phe Ser 180 185 190 Leu Met Tyr His His Ile Ile Asn Lys Glu Val Leu Leu Ala Gly Ala Ile Val Gly Leu Gly Ser Val Met Gly Val Ser Ile Gly Ile Lys Trp Ile Met Gly Leu Leu Asn Glu Lys Met His Lys Ala Leu Ile Leu Gly Val Tyr Gly Leu Ser Leu Leu Ile Val Leu Tyr Lys Leu Phe Phe
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1057 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 66...980 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
Met Gly Arg Ile Glu Ser Lys Lys Arg Leu Lys Ala Leu Val Phe Leu Ala Ser Leu Gly Val Leu Trp Gly Asn Ser Ala Glu Lys Thr Pro Phe Phe Lys Thr Lys Asn His Ile Tyr Leu Gly Phe Arg Leu Gly Thr Gly Ala Asn Val His Thr Ser Met Trp Gln Gln Ala Tyr Lys Asp Asn Pro Thr Cys Pro Gly Ser Val Cys Tyr Gly Glu Lys Leu Glu Ala His Tyr Gln Gly Gly Lys Asn Leu Ser Tyr Thr Gly Gln Ile Gly Asp Glu ATA GCT TTT GAT AAA CAC CAT ATT TTA GGC 'TTA AGG GTG TGG GGG GAT 398 Ile Ala Phe Asp Lys His His Ile Leu Gly :Leu Arg Val Trp Gly Asp 100 105 1l0 GTA GAA TAC GCT AAA GCG CAA TTA GGT CAA ,AAA GTG GGG GGT AAT ACC 446 Val Glu Tyr Ala Lys Ala Gln Leu Gly Gln :Lys Val Gly Gly Asn Thr 115 l20 125 Leu Leu Ser Gln Ala Asn Tyr Asp Pro Asn ;11a Ile Lys Thr Tyr Asp TCT GCT TCA AAC ACT CAA GGC CCT TTA GTT 'rTG CAA AAA ACC CCA AGC 542 Ser Ala Ser Asn Thr Gln Gly Pro Leu Val :Leu Gln Lys Thr Pro Ser 145 l50 155 CCT CAA AAC TTC CTT TTC AAT AAC GGG CAT 'rTC ATG GCG TTT GGT TTG 590 Pro Gln Asn Phe Leu Phe Asn Asn Gly His :Phe Met Ala Phe Gly Leu Asn Val Asn Val Phe Val Asn Leu Pro Ile Asp Thr Leu Leu Lys Leu GCT TTA AAA ACA GAA AAA ATG CTG TTT TTT i~AA ATA GGC GTG TTT GGT 686 Ala Leu Lys Thr Glu Lys Met Leu Phe Phe l~ys Ile Gly Val Phe Gly GGG GGC GGG GTG GAA TAC GCA ATA TTA TGG i~GT CCT AAC TAT CAA AAT 734 Gly Gly Gly Val Glu Tyr Ala Ile Leu Trp tier Pro Asn Tyr Gln Asn 210 2l5 220 CAA AAC ACG AAA CAA GGC GAT AAA TTT TTT (3CA GCG GGT GGG GGG TTT 782 Gln Asn Thr Lys Gln Gly Asp Lys Phe Phe Ala Ala Gly Gly Gly Phe TTT GTG AAT TTT GGG GGT TCT TTG TAT ATA (3GC AAA CGC AAC CGC TTC 830 Phe Val Asn Phe Gly Gly Ser Leu Tyr Ile (31y Lys Arg Asn Arg Phe AAT GTG GGG TTA AAA ATC CCT TAC TAT AGC 'CTG AGC GCG CAA AGT TGG 87B
Asn Val Gly Leu Lys Ile Pro Tyr Tyr Ser Leu Ser Ala Gln Ser Trp Lys Asn Phe Gly Ser Ser Asn Val Trp Gln Gln Gln Thr Ile Arg Gln AAC TTC AGC GTT TTT AGG AAT AAA GAA GTT 'CTT GTC AGC TAC GCG TTC 974 Asn Phe Ser Val Phe Arg Asn Lys Glu Val I?he Val Ser Tyr Ala Phe Leu Phe (2) INFORMATION FOR SEQ ID N0:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Met Gly Arg Ile Glu Ser Lys Lys Arg Leu Lys Ala Leu Val Phe Leu Ala Ser Leu Gly Val Leu Trp Gly Asn Ser Ala Glu Lys Thr Pro Phe Phe Lys Thr Lys Asn His Ile Tyr Leu Gly Phe Arg Leu Gly Thr Gly Ala Asn Val His Thr Ser Met Trp Gln Gln Ala Tyr Lys Asp Asn Pro Thr Cys Pro Gly Ser Val Cys Tyr Gly Glu Lys Leu Glu Ala His Tyr Gln Gly Gly Lys Asn Leu Ser Tyr Thr Gly Gln Ile Gly Asp Glu Ile Ala Phe Asp Lys His His Ile Leu Gly Leu Arg Val Trp Gly Asp Val 100 105 110 Glu Tyr Ala Lys Ala Gln Leu Gly Gln Lys Val Gly Gly Asn Thr Leu 115 120 125 Leu Ser Gln Ala Asn Tyr Asp Pro Asn Ala Ile Lys Thr Tyr Asp Ser Ala Ser Asn Thr Gln Gly Pro Leu Val Leu Gln Lys Thr Pro Ser Pro 145 150 155 160 Gln Asn Phe Leu Phe Asn Asn Gly His Phe Met Ala Phe Gly Leu Asn Val Asn Val Phe Val Asn Leu Pro Ile Asp Thr Leu Leu Lys Leu Ala Leu Lys Thr Glu Lys Met Leu Phe Phe Lys Ile Gly Val Phe Gly Gly Gly Gly Val Glu Tyr Ala Ile Leu Trp Ser Pro Asn Tyr Gln Asn Gln Asn Thr Lys Gln Gly Asp Lys Phe Phe Ala Ala Gly Gly Gly Phe Phe Val Asn Phe Gly Gly Ser Leu Tyr Ile Gly Lys Arg Asn Arg Phe Asn Val Gly Leu Lys Ile Pro Tyr Tyr Ser Leu Ser Ala Gln Ser Trp Lys Asn Phe Gly Ser Ser Asn Val Trp Gln Gln Gln Thr Ile Arg Gln Asn Phe Ser Val Phe Arg Asn Lys Glu Val Phe Val Ser Tyr Ala Phe Leu Phe
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 624 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 77...535 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
Met Glu Asn Asn Glu Asn His Glu Lys Leu Asn Gly GTT TTG CGC AAG TTT TTA GGC GAT GCG TTC ACG CTT GAT GGG AAA GAA 160 Val Leu Arg Lys Phe Leu Gly Asp Ala Phe Thr Leu Asp Gly Lys Glu Gly Gly Leu Asn Met Glu Lys Leu Arg Glu Ala Ile Lys Lys Glu Lys CCA ATC ATG AAT ATT TTG CTC ATG GGA GCT ACT GGG GTG GGT AAA AGC 256 Pro Ile Met Asn Ile Leu Leu Met Gly Ala Thr Gly Val Gly Lys Ser Ser Leu Ile Asn Ala Leu Phe Gly Lys Glu Val Ala Lys Ala Gly Val GGA AAA CCC ATC ACT CAG CAT CTT GAA AAA TAT GTT GAT GAA GAA AAA 352 Gly Lys Pro Ile Thr Gln His Leu Glu Lys Tyr Val Asp Glu Glu Lys GGC TTG ATT TTA TGG GAC ACT AAA GGC ATT GAA GAT AAA GAT TAT GAA 400 Gly Leu Ile Leu Trp Asp Thr Lys Gly Ile Glu Asp Lys Asp Tyr Glu Asn Thr Leu Glu Ser Ile Lys Lys Glu Met Glu Asp Ser Phe Lys Thr Leu Asp Glu Lys Glu Ala Ile Asp Val Ala Tyr Leu Cys Val Lys Glu 125 130 135 140 Thr Ser Gly Arg Val Gln Glu Arg Glu Arg Glu Ser Tyr
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 153 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
Met Glu Asn Asn Glu Asn His Glu Lys Leu Asn Gly Val Leu Arg Lys Phe Leu Gly Asp Ala Phe Thr Leu Asp Gly Lys Glu Gly Gly Leu Asn Met Glu Lys Leu Arg Glu Ala Ile Lys Lys Glu Lys Pro Ile Met Asn Ile Leu Leu Met Gly Ala Thr Gly Val Gly Lys Ser Ser Leu Ile Asn Ala Leu Phe Gly Lys Glu Val Ala Lys Ala Gly Val Gly Lys Pro Ile Thr Gln His Leu Glu Lys Tyr Val Asp Glu Glu Lys Gly Leu Ile Leu Trp Asp Thr Lys Gly Ile Glu Asp Lys Asp Tyr Glu Asn Thr Leu Glu Ser Ile Lys Lys Glu Met Glu Asp Ser Phe Lys Thr Leu Asp Glu Lys Glu Ala Ile Asp Val Ala Tyr Leu Cys Val Lys Glu Thr Ser Gly Arg 130 135 140 Val Gln Glu Arg Glu Arg Glu Ser Tyr
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1083 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 155...1033 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
AACCAAACAG TGCAATTTCA GGTGTCAGTA TTGC ATCi CCT GCT ACG CCA TTA AAT 175 Met: Pro Ala Thr Pro Leu Asn Phe Phe Asp Asn Glu Glu Leu Leu Pro Leu F~sp Asn Val Leu Glu Phe CTC AAA ATC GCC ATT GAT GAG GGC GTT AAA F,AA ATT AGA ATC ACG GGT 271 Leu Lys Ile Ala Ile Asp Glu Gly Val Lys L~ys Ile Arg Ile Thr Gly Gly Glu Pro Leu Leu Arg Lys Gly Leu Asp C~lu Phe Ile Ala Lys Leu CAC GCT TAC AAT AAA GAA GTG GAG TTA GTT T'TA AGC ACT AAT GGT TTT 367 His Ala Tyr Asn Lys Glu Val Glu Leu Val Leu Ser Thr Asn Gly Phe Leu Leu Lys Lys Met Ala Lys Asp Leu Lys A.sn Ala Gly Leu Ala Gln Val Asn Val Ser Leu Asp Ser Leu Lys Ser Asp Arg Val Leu Lys Tle Ser Gln Lys Asp Ala Leu Lys Asn Thr Leu Glu Gly Ile Glu Glu Ser Leu Lys Val Gly Leu Lys Leu Lys Leu Asn Thr Val Val Ile Lys Ser 120 l25 l30 135 Val Asn Asp Asp Glu Ile Leu Glu Leu Leu Glu Tyr Ala Lys Asn Arg His Ile Gln Ile Arg Tyr Ile Glu Phe Met Glu Asn Thr His Ala Lys _97_ Ser Leu Val Lys Gly Leu Lys Glu Arg Glu Ile Leu Asp Leu Ile Ala CAA AAA TAT CAA ATC ATT GAG GCA GAA AAA CCC AAA CAA GGG TCT TCT 75l Gln Lys Tyr Gln Ile Ile Glu Ala Glu Lys Pro Lys Gln Gly Ser Ser l85 190 195 Lys Ile Tyr Thr Leu Glu Asn Gly Tyr Gln Phe Gly Ile Ile Ala Pro His Ser Asp Asp Phe Cys Gln Ser Cys Asn Arg Ile Arg Leu Ala Ser Asp Gly Lys Ile Cys Pro Cys Leu Tyr Tyr Gln Asp Ala Ile Asp Ala Lys Glu Ala Ile Ile Asn Lys Asp Thr Lys Asn Ile Lys Arg Leu Leu Lys Gln Ser Val Ile Asn Lys Pro Glu Lys Asn Met Trp Asn Asp Lys Asn Ser Glu Thr Pro Thr Arg Ala Phe Tyr Tyr Thr Gly Gly (2) INFORMATION FOR SEQ ID N0:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 293 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
Met Pro Ala Thr Pro Leu Asn Phe Phe Asp Asn Glu Glu Leu Leu Pro Leu Asp Asn Val Leu Glu Phe Leu Lys Ile Ala Ile Asp Glu Gly Val Lys Lys Ile Arg Ile Thr Gly Gly Glu Pro Leu Leu Arg Lys Gly Leu Asp Glu Phe Ile Ala Lys Leu His Ala Tyr Asn Lys Glu Val Glu Leu Val Leu Ser Thr Asn Gly Phe Leu Leu Lys Lys Met Ala Lys Asp Leu Lys Asn Ala Gly Leu Ala Gln Val Asn Val Ser Leu Asp Ser Leu Lys Ser Asp Arg Val Leu Lys Ile Ser Gln Lys Asp Ala Leu Lys Asn Thr 100 105 110 Leu Glu Gly Ile Glu Glu Ser Leu Lys Val Gly Leu Lys Leu Lys Leu 115 120 125 Asn Thr Val Val Ile Lys Ser Val Asn Asp Asp Glu Ile Leu Glu Leu 130 135 140 Leu Glu Tyr Ala Lys Asn Arg His Ile Gln Ile Arg Tyr Ile Glu Phe 145 150 155 160 Met Glu Asn Thr His Ala Lys Ser Leu Val Lys Gly Leu Lys Glu Arg Glu Ile Leu Asp Leu Ile Ala Gln Lys Tyr Gln Ile Ile Glu Ala Glu 180 185 190 Lys Pro Lys Gln Gly Ser Ser Lys Ile Tyr Thr Leu Glu Asn Gly Tyr 195 200 205 Gln Phe Gly Ile Ile Ala Pro His Ser Asp Asp Phe Cys Gln Ser Cys Asn Arg Ile Arg Leu Ala Ser Asp Gly Lys Ile Cys Pro Cys Leu Tyr Tyr Gln Asp Ala Ile Asp Ala Lys Glu Ala Ile Ile Asn Lys Asp Thr Lys Asn Ile Lys Arg Leu Leu Lys Gln Ser Val Ile Asn Lys Pro Glu Lys Asn Met Trp Asn Asp Lys Asn Ser Glu Thr Pro Thr Arg Ala Phe Tyr Tyr Thr Gly Gly
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1181 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 121...1137 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
TACGATTACA AAGATGTTTT TGGGTTTAAG GCGGGGC'GCT ATGAAGCGAA TATTGATTTC 120 " Met Ser Gly Ser Asn Gln Gly Trp Glu Val Tyr Tyr Gln Pro Tyr Lys Thr Glu Thr Gln Arg Leu Arg Phe Trp Trp Trp Ser Ser Phe Gly Arg Gly Leu Ala Phe Asn Ser Trp Ile Tyr Glu Phe Phe Ala Thr Val Pro Tyr Leu Lys Lys Gly Gly Asn Pro Asn Asn Ser Asn Asp Phe Ile Asn Tyr Gly Trp His Gly Ile Thr Thr Thr Tyr Ser Tyr Lys Gly Leu Asp Ala Gln Phe Phe Tyr Tyr Phe Ala Pro Lys Thr Tyr Asn Ala Pro Gly Phe Lys Leu Val Tyr Asp Thr Asn Arg Asn Phe Gln Asn Val Gly Phe l00 l05 110 Arg Ser Gln Ser Met Ile Met Thr Thr Phe Pro Leu Tyr Tyr Arg Gly 115 l20 l25 Trp Tyr Asn Pro Glu Thr Asn Thr Tyr Ser Leu Glu Asp Ser Thr Pro His Gly Ser Leu Leu Gly Arg Asn Gly Val Thr Leu Asn Ile Arg Gln Val Phe Trp Trp Asp Asn Phe Asn Trp Ser Ile Gly Phe Tyr Asn Thr l65 170 175 Phe Gly Asn Ser Asp Ala Phe Leu Gly Ser His Thr Met Pro Arg Gly Asn Asn Thr Ser Tyr Ile Gly Ser Glu Ile Ser Ile Thr Thr Arg His Ala Gly Met Ile Gly Tyr Asp Phe Trp Asp Asn Thr Ala Tyr Asp Gly 210 2l5 220 Leu Ala Asp Ala Ile Thr Asn Ala Asn Thr Phe Thr Phe Tyr Thr Ser -l00-GTT GGA GGG ATC CAT AAG CGT TTT GCA TGG (.AT GTT TTT GGG CGC GTC 888 Val Gly Gly Ile His Lys Arg Phe Ala Trp His Val Phe Gly Arg Val Ser His Ala Asn Lys Asn Ala Leu Gly Gln Val Gly Arg Ala Asn Glu Tyr Ser Leu Gln Phe Asn Ala Ser Tyr Ala F?he Thr Glu Ser Ile Leu Leu Asn Phe Arg Ile Thr Tyr Tyr Gly Ala Arg Ile Asn Lys Gly Tyr Gln Ala Gly Tyr Phe Gly Ala Pro Lys Phe Asn Asn Pro Asp Gly Asp 305 310 3l5 320 Phe Ser Ala Asn Tyr Gln Asp Arg Ser Tyr Met Met Thr Asn Leu Thr Leu Lys Phe (2) INFORMATION FOR SEQ ID NO:1F3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 339 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
Met Ser Gly Ser Asn Gln Gly Trp Glu Val Tyr Tyr Gln Pro Tyr Lys Thr Glu Thr Gln Arg Leu Arg Phe Trp Trp Trp Ser Ser Phe Gly Arg Gly Leu Ala Phe Asn Ser Trp Ile Tyr Glu Phe Phe Ala Thr Val Pro Tyr Leu Lys Lys Gly Gly Asn Pro Asn Asn Ser Asn Asp Phe Ile Asn Tyr Gly Trp His Gly Ile Thr Thr Thr Tyr Ser Tyr Lys Gly Leu Asp Ala Gln Phe Phe Tyr Tyr Phe Ala Pro Lys Thr Tyr Asn Ala Pro Gly Phe Lys Leu Val Tyr Asp Thr Asn Arg Asn Phe Gln Asn Val Gly Phe Arg Ser Gln Ser Met Ile Met Thr Thr Phe Pro Leu Tyr Tyr Arg Gly 115 120 125 Trp Tyr Asn Pro Glu Thr Asn Thr Tyr Ser Leu Glu Asp Ser Thr Pro 130 135 140 His Gly Ser Leu Leu Gly Arg Asn Gly Val Thr Leu Asn Ile Arg Gln 145 150 155 160 Val Phe Trp Trp Asp Asn Phe Asn Trp Ser Ile Gly Phe Tyr Asn Thr Phe Gly Asn Ser Asp Ala Phe Leu Gly Ser His Thr Met Pro Arg Gly 180 185 190 Asn Asn Thr Ser Tyr Ile Gly Ser Glu Ile Ser Ile Thr Thr Arg His Ala Gly Met Ile Gly Tyr Asp Phe Trp Asp Asn Thr Ala Tyr Asp Gly 210 215 220 Leu Ala Asp Ala Ile Thr Asn Ala Asn Thr Phe Thr Phe Tyr Thr Ser Val Gly Gly Ile His Lys Arg Phe Ala Trp His Val Phe Gly Arg Val Ser His Ala Asn Lys Asn Ala Leu Gly Gln Val Gly Arg Ala Asn Glu Tyr Ser Leu Gln Phe Asn Ala Ser Tyr Ala Phe Thr Glu Ser Ile Leu 275 280 285 Leu Asn Phe Arg Ile Thr Tyr Tyr Gly Ala Arg Ile Asn Lys Gly Tyr Gln Ala Gly Tyr Phe Gly Ala Pro Lys Phe Asn Asn Pro Asp Gly Asp 305 310 315 320 Phe Ser Ala Asn Tyr Gln Asp Arg Ser Tyr Met Met Thr Asn Leu Thr Leu Lys Phe
(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 959 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 133...879 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
AGGATTTTAA GA ATG AAT GAC AAG CGT TTT AGA AAA TAT TGT AGT TTT TCT 171 Met Asn Asp Lys Arg Phe Arg Lys Tyr Cys Ser Phe Ser ATT TTT TTG TCC TTA TTA GGA ACG TTT GAA TTA GAG GCT AAA GAA GAA 219 Ile Phe Leu Ser Leu Leu Gly Thr Phe Glu Leu Glu Ala Lys Glu Glu Glu Glu Lys Glu Glu Arg Lys Thr Glu Arg Lys Lys Glu Lys Asn Ala CAA CAC ACT CTA GGC AAG GTT ACC ACT CAA GCG GCT AAA ATC TTT AAC 315 Gln His Thr Leu Gly Lys Val Thr Thr Gln Ala Ala Lys Ile Phe Asn Tyr Asn Asn Gln Thr Thr Ile Ser Ser Lys Glu Leu Glu Arg Arg Gln Ala Asn Gln Ile Ser Asp Met Phe Arg Arg Asn Pro Asn Ile Asn Val Gly Gly Gly Ala Val Ile Ala Gln Lys Ile Tyr Val Arg Gly Ile Glu GAC AGA TTG GCT CGG GTT ACG GTG GAT GGG GCG GCG CAA ATG GGT GCA 507 Asp Arg Leu Ala Arg Val Thr Val Asp Gly Ala Ala Gln Met Gly Ala 110 115 120 125 AGC TAT GGG CAT CAA GGC AAT ACG ATC ATT GAC CCT GGA ATG CTT AAA 555 Ser Tyr Gly His Gln Gly Asn Thr Ile Ile Asp Pro Gly Met Leu Lys AGC GTG GTG GTT ACT AAA GGG GCG GCT CAA GCG AGC GCG GGG CCT ATG 603 Ser Val Val Val Thr Lys Gly Ala Ala Gln Ala Ser Ala Gly Pro Met Ala Leu Ile Gly Ala Ile Lys Met Glu Thr Lys Ser Ala Ser Asp Phe Ile Pro Lys Gly Lys Asp Tyr Ala Ile Ser Gly Ala Ala Thr Phe Leu 175 180 185 Thr Asn Phe Gly Asp Arg Glu Thr Val Met Gly Ala Tyr Arg His Asn His Phe Asp Ala Leu Leu Tyr Tyr Thr His Gln Asn Ile Phe Tyr Tyr CGT GAT GGG GAT AAT GCT ACA AAA GAT CTC TTT AGA CCT AAA GCG GAG 843 Arg Asp Gly Asp Asn Ala Thr Lys Asp Leu Phe Arg Pro Lys Ala Glu Asn Lys Val Thr Glu Val Leu Ala Ser Lys Thr Met
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 249 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
Met Asn Asp Lys Arg Phe Arg Lys Tyr Cys Ser Phe Ser Ile Phe Leu
Ser Leu Leu Gly Thr Phe Glu Leu Glu Ala Lys Glu Glu Glu Glu Lys
Glu Glu Arg Lys Thr Glu Arg Lys Lys Glu Lys Asn Ala Gln His Thr
Leu Gly Lys Val Thr Thr Gln Ala Ala Lys Ile Phe Asn Tyr Asn Asn
Gln Thr Thr Ile Ser Ser Lys Glu Leu Glu Arg Arg Gln Ala Asn Gln
Ile Ser Asp Met Phe Arg Arg Asn Pro Asn Ile Asn Val Gly Gly Gly
Ala Val Ile Ala Gln Lys Ile Tyr Val Arg Gly Ile Glu Asp Arg Leu
Ala Arg Val Thr Val Asp Gly Ala Ala Gln Met Gly Ala Ser Tyr Gly
His Gln Gly Asn Thr Ile Ile Asp Pro Gly Met Leu Lys Ser Val Val
Val Thr Lys Gly Ala Ala Gln Ala Ser Ala Gly Pro Met Ala Leu Ile
Gly Ala Ile Lys Met Glu Thr Lys Ser Ala Ser Asp Phe Ile Pro Lys
Gly Lys Asp Tyr Ala Ile Ser Gly Ala Ala Thr Phe Leu Thr Asn Phe
Gly Asp Arg Glu Thr Val Met Gly Ala Tyr Arg His Asn His Phe Asp
Ala Leu Leu Tyr Tyr Thr His Gln Asn Ile Phe Tyr Tyr Arg Asp Gly
Asp Asn Ala Thr Lys Asp Leu Phe Arg Pro Lys Ala Glu Asn Lys Val
Thr Glu Val Leu Ala Ser Lys Thr Met
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1306 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 40...1266 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 40...219 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 220...1266 (D) OTHER INFORMATION:
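The feature locations above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the locations are 1-based and inclusive and that the stop codon lies outside the stated coding range; the function name `codon_count` is hypothetical and not part of the listing.

```python
def codon_count(start, end):
    """Number of codons in a 1-based, inclusive nucleotide range (assumed convention)."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}...{end} is not a whole number of codons")
    return length // 3


# Feature locations as stated for SEQ ID NO:21.
features = {
    "Coding Sequence": (40, 1266),
    "sig_peptide": (40, 219),
    "mat_peptide": (220, 1266),
}

counts = {name: codon_count(start, end) for name, (start, end) in features.items()}
for name, n in counts.items():
    print(f"{name}: {n} codons")
```

Under these assumptions the coding sequence spans 409 codons, which agrees with the 409 amino acids stated for SEQ ID NO:22, and the signal and mature peptide features account for 60 and 349 of them respectively.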
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
Met Ser Tyr Thr Lys Lys Tyr Ser Thr Pro Pro Asn Arg Arg Lys Met Gln Asn Ile Ile Ala Ile Lys Arg Ser Ser Arg Val Asp Leu Gln Ala Cys Lys Leu Ala Phe Ala Ser Ser Arg Ser Pro Met Gln Phe Gln Lys Thr Leu Phe Pro Leu Pro Leu Leu Phe Leu Ser Cys Cys Ile Ala Glu Glu Asn Gly Ala Tyr Ala Ser Val Gly Phe Glu Tyr Ser Ile Ser His Ala Val Glu His Asn ' Asn Pro Phe Leu Asn Gln Glu Arg Ile Gln Ile Ile Ser Asn Ala Gln Asn Lys Ile Tyr Lys Leu Asn Gln Val Lys Asn Glu Ile Thr Ser Met Gln Asn Thr Phe Asn Tyr Ile Asn Asn Ala Leu Lys Asn Asn Ala Lys Leu Thr Pro Thr Glu Ile Gln Ala Glu Lys Tyr Tyr Leu Gln Ser Thr Leu Gln Asn Ile Glu Lys Ile Val Thr Leu Ser Gly Gly Val Ala Ser 90 95 l00 l05 Asn Pro Lys Leu Val Gln Ala Leu Glu Lys Met Gln Glu Pro Ile Thr 110 l15 120 Asn Pro Leu Glu Leu Ala Glu Asn Leu Arg Asn Leu Glu Leu Gln Phe l25 130 135 Ala Gln Ser Gln Asn Arg Met Leu Ser Ser Leu Ser Ser Gln Thr Ala Gl.n Ile Ser Asn Ser Leu Asn Ala Leu Asp Pro Ser Ser Tyr Ser Lys l55 160 165 Asn Ile Ser Ser Met Ser Gly Val Ser Leu Ser Val Gly Tyr Lys His Phe Phe Thr Lys Lys Lys Asn Gln Gly Phe Arg Tyr Tyr Leu Phe Tyr Asp Tyr Gly Tyr Thr Asn Phe Gly Phe Val Gly Asn Gly Phe Asp Gly Leu Gly Lys Met Asn Asn His Leu Tyr Gly Leu Gly Ile Asn Tyr Leu Tyr Asn Phe Ile Asp Asn Ala Gln Lys His Ser Ser Val Gly Phe Tyr Ala Gly Phe Ala Leu Ala Gly Asn Ser Trp Val Gly Asn Gly Leu Gly ATG TGG GTG AGC CAA ACG GAT TTT ATC AAC AAT TAC TTG ATG GGC TAT l062 Met Trp Val Ser Gln Thr Asp Phe Ile Asn Asn Tyr Leu Met Gly Tyr Gln Ala Lys Ile His Thr Asn Phe Phe Gln 7:1e Pro Leu Asn Phe Gly GTT CGT GTG AAT GTC AAT AGG CAT AAC GGA 7.'TT GAA ATG GGC CTA AAA 1158 Val Arg Val Asn Val Asn Arg His Asn Gly Phe Glu Met Gly Leu Lys Ile Pro Leu Ala Val Asn Ser Phe Tyr Glu Thr His Gly Lys Gly Leu Asn Thr Ser Leu Phe Phe Lys Arg Leu Val L'al Phe Asn Val Ser Tyr Val Tyr Ser Phe (2) INFORMATION FOR SEQ ID N0:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 409 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
Met Ser Tyr Thr Lys Lys Tyr Ser Thr Pro Pro Asn Arg Arg Lys Met
Gln Asn Ile Ile Ala Ile Lys Arg Ser Ser Arg Val Asp Leu Gln Ala
Cys Lys Leu Ala Phe Ala Ser Ser Arg Ser Pro Met Gln Phe Gln Lys
Thr Leu Phe Pro Leu Pro Leu Leu Phe Leu Ser Cys Cys Ile Ala Glu
Glu Asn Gly Ala Tyr Ala Ser Val Gly Phe Glu Tyr Ser Ile Ser His
Ala Val Glu His Asn Asn Pro Phe Leu Asn Gln Glu Arg Ile Gln Ile
Ile Ser Asn Ala Gln Asn Lys Ile Tyr Lys Leu Asn Gln Val Lys Asn
Glu Ile Thr Ser Met Gln Asn Thr Phe Asn Tyr Ile Asn Asn Ala Leu
Lys Asn Asn Ala Lys Leu Thr Pro Thr Glu Ile Gln Ala Glu Lys Tyr
Tyr Leu Gln Ser Thr Leu Gln Asn Ile Glu Lys Ile Val Thr Leu Ser
Gly Gly Val Ala Ser Asn Pro Lys Leu Val Gln Ala Leu Glu Lys Met
Gln Glu Pro Ile Thr Asn Pro Leu Glu Leu Ala Glu Asn Leu Arg Asn
Leu Glu Leu Gln Phe Ala Gln Ser Gln Asn Arg Met Leu Ser Ser Leu
Ser Ser Gln Thr Ala Gln Ile Ser Asn Ser Leu Asn Ala Leu Asp Pro
Ser Ser Tyr Ser Lys Asn Ile Ser Ser Met Ser Gly Val Ser Leu Ser
Val Gly Tyr Lys His Phe Phe Thr Lys Lys Lys Asn Gln Gly Phe Arg
Tyr Tyr Leu Phe Tyr Asp Tyr Gly Tyr Thr Asn Phe Gly Phe Val Gly
Asn Gly Phe Asp Gly Leu Gly Lys Met Asn Asn His Leu Tyr Gly Leu
Gly Ile Asn Tyr Leu Tyr Asn Phe Ile Asp Asn Ala Gln Lys His Ser
Ser Val Gly Phe Tyr Ala Gly Phe Ala Leu Ala Gly Asn Ser Trp Val
Gly Asn Gly Leu Gly Met Trp Val Ser Gln Thr Asp Phe Ile Asn Asn
Tyr Leu Met Gly Tyr Gln Ala Lys Ile His Thr Asn Phe Phe Gln Ile
Pro Leu Asn Phe Gly Val Arg Val Asn Val Asn Arg His Asn Gly Phe
Glu Met Gly Leu Lys Ile Pro Leu Ala Val Asn Ser Phe Tyr Glu Thr
His Gly Lys Gly Leu Asn Thr Ser Leu Phe Phe Lys Arg Leu Val Val
Phe Asn Val Ser Tyr Val Tyr Ser Phe
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1030 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 342...824 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
ACC ATC
AAA GTT
Met Thr Ile Lys Val Phe Ser Pro Lys Tyr Pro Thr Glu Leu Glu Glu Phe Tyr Ala Glu Arg Ile Ala Asp Asn Pro Leu Gly Phe Ile Gln ,Arg Leu Asp Leu Leu Pro Ser Ile Ser Gly Phe Val Gln Lys Leu Arg Glu His Gly Gly Glu Phe Phe Glu Met Arg Glu Gly Asn Lys Leu Ile Gly Ile Cys Gly Leu Asn CCT ATC AAT CAA ACA GAA GCC GAG CTG TGC .AAA TTC CAC ATA AAT AGT 596 Pro Ile Asn Gln Thr Glu Ala Glu Leu Cys Lys Phe His Ile Asn Ser Ala Tyr Gln Ser Gln Gly Leu Gly Gln Lys :Leu Tyr Glu Ser Val Glu AAA TAC GCT TTC ATT AAA GGC TAT ACT AAA .ATC TCT CTG CAT GTG AGC 692 Lys Tyr Ala Phe Ile Lys Gly Tyr Thr Lys Ile Ser Leu His Val Ser 105 l10 115 Lys Ser Gln Ile Lys Ala Cys Asn Leu Tyr Gln Lys Leu Gly Phe Val 120 l25 130 CAC ATC AAA GAA GAG GAT TGC GTG GTG GAG 'TTG GGC GAA GAG ACT TTG 788 His Ile Lys Glu Glu Asp Cys Val Val Glu :Leu Gly Glu Glu Thr Leu ATT TTC CCC ACT CTT TTT ATG GAA AAG ATT 'TTG TCT TGATTGGTGC ATCCAT 840 - Ile Phe Pro Thr Leu Phe Met Glu Lys Ile :Leu Ser l50 155 l60 TTGACACACG CCCAAGCGAC ATTCAAACTA TCAAACT'TTC ATTAACACAA CCCAATTAAC 900 GCTAAATAAA CCCTAAAACA AACACTCGTT GTTAAAA'TTT TGTTTTTCAA GCGCTTCGCA 960 AAGTTTTAGA AGCCCTATTT AGGGGTTAAC GCTAAAA'TAG GCTATCAAAA CTACTTTAAT 1020 ' GATTTTATAG 1030 (2) INFORMATION FOR SEQ ID N0:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 161 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
Met Thr Ile Lys Val Phe Ser Pro Lys Tyr Pro Thr Glu Leu Glu Glu
Phe Tyr Ala Glu Arg Ile Ala Asp Asn Pro Leu Gly Phe Ile Gln Arg
Leu Asp Leu Leu Pro Ser Ile Ser Gly Phe Val Gln Lys Leu Arg Glu
His Gly Gly Glu Phe Phe Glu Met Arg Glu Gly Asn Lys Leu Ile Gly
Ile Cys Gly Leu Asn Pro Ile Asn Gln Thr Glu Ala Glu Leu Cys Lys
Phe His Ile Asn Ser Ala Tyr Gln Ser Gln Gly Leu Gly Gln Lys Leu
Tyr Glu Ser Val Glu Lys Tyr Ala Phe Ile Lys Gly Tyr Thr Lys Ile
Ser Leu His Val Ser Lys Ser Gln Ile Lys Ala Cys Asn Leu Tyr Gln
Lys Leu Gly Phe Val His Ile Lys Glu Glu Asp Cys Val Val Glu Leu
Gly Glu Glu Thr Leu Ile Phe Pro Thr Leu Phe Met Glu Lys Ile Leu
Ser
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1477 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 374...1267 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
TTATTTCTTA ATACAAAAGGTAGGCGTTTT GAAACAT7.'TA CTCACACCAT240 ACCCCACTCA
CGTTACTAAA GTC ATG ACC AAT TGC GAC AF,T ATT TTT AAC 409 ATG AAA GAT
Met Met Thr Asn Cys Asp Asn Ile Phe Asn Lys Asp Ala Lys Gln Lys Glu Val Leu Lys Ala Ala Tyr Gln Phe Gly Ser Lys Glu Asn Leu Gly Tyr Glu Met Ala Gly Ile A.la Trp Lys Glu Ser Cys Ala Gly Val Tyr Lys Ile Asn Phe Ser Asp Pro Ser Ala Gly Val Tyr CAT TCT TAT ATC CCA AGC GTT CTA AAA AGC TAT GGG CAT AAT GAT AGC 60l His Ser Tyr Ile Pro Ser Val Leu Lys Ser Tyr Gly His Asn Asp Ser Pro Phe Leu Arg Asn Val Met Gly Glu Leu Leu Ile Lys Asp Asp Ala TTT GCT TCT GAA GTG GCT TTA AAA GAG TTG C'TC TAT TGG AAA ACA CGC 697 Phe Ala Ser Glu Val Ala Leu Lys Glu Leu Leu Tyr Trp Lys Thr Arg Tyr His Asp Asn Leu Lys Asp Met Ile Lys S~~_r Tyr Asn Lys Gly Ser Arg Trp Glu Arg Ser Glu Lys Ser Asn Ala Asp Ala Glu Lys Tyr Tyr l25 130 l35 140 Glu Glu Ile Gln Asp Arg Ile Arg Arg Leu Lys Glu Ser Lys Ile Phe Asp Ser Gln Ser Ser Asn Asp Gln Glu Leu Gln Lys Ser Ala Asn Ser 160 l65 170 AAC CTG GAT TTA GAC CCT ATC GGC AAC GCC A'CG CCC CAA GCC TTA ATT 937 Asn Leu Asp Leu Asp Pro Ile Gly Asn Ala Met Pro Gln Ala Leu Ile Ala Lys Glu Thr Lys Ile Glu Glu Thr Gln Ala Glu Lys Ser Gln Glu Met Lys Glu Thr Thr Ser Glu Gln Thr Lys Ser Lys Pro Glu Lys Ala 205 2l0 2l5 220 AAA GAT AAA CCC ATG TAT TTG GCG CAA ATC AAC AGC ACT GAT TTC ACA 108l Lys Asp Lys Pro Met Tyr Leu Ala Gln Ile Asn Ser Thr Asp Phe Thr Pro Val Lys Lys Ser Pro Lys Lys Pro Ala Lys Val Ser Gln Lys His TCC TTT AAG AAT AAC ATT AAA AAT AAT GTA AAA AAC AAC GCC AAA ACC 1177 _ Ser Phe Lys Asn Asn Ile Lys Asn Asn Val Lys Asn Asn Ala Lys Thr Ala Ser Lys Lys Gln Glu Met Cys Lys Asn Cys Ser Pro Gly Gln Arg Asn Ala Ile Leu Ala Asn His Ile Thr Leu Met Gln Glu Leu (2) INFORMATION FOR SEQ ID N0:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 298 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
Met Met Thr Asn Cys Asp Asn Ile Lys Asp Phe Asn Ala Lys Gln Lys
Glu Val Leu Lys Ala Ala Tyr Gln Phe Gly Ser Lys Glu Asn Leu Gly
Tyr Glu Met Ala Gly Ile Ala Trp Lys Glu Ser Cys Ala Gly Val Tyr
Lys Ile Asn Phe Ser Asp Pro Ser Ala Gly Val Tyr His Ser Tyr Ile
Pro Ser Val Leu Lys Ser Tyr Gly His Asn Asp Ser Pro Phe Leu Arg
Asn Val Met Gly Glu Leu Leu Ile Lys Asp Asp Ala Phe Ala Ser Glu
Val Ala Leu Lys Glu Leu Leu Tyr Trp Lys Thr Arg Tyr His Asp Asn
Leu Lys Asp Met Ile Lys Ser Tyr Asn Lys Gly Ser Arg Trp Glu Arg
Ser Glu Lys Ser Asn Ala Asp Ala Glu Lys Tyr Tyr Glu Glu Ile Gln
Asp Arg Ile Arg Arg Leu Lys Glu Ser Lys Ile Phe Asp Ser Gln Ser
Ser Asn Asp Gln Glu Leu Gln Lys Ser Ala Asn Ser Asn Leu Asp Leu
Asp Pro Ile Gly Asn Ala Met Pro Gln Ala Leu Ile Ala Lys Glu Thr
Lys Ile Glu Glu Thr Gln Ala Glu Lys Ser Gln Glu Met Lys Glu Thr
Thr Ser Glu Gln Thr Lys Ser Lys Pro Glu Lys Ala Lys Asp Lys Pro
Met Tyr Leu Ala Gln Ile Asn Ser Thr Asp Phe Thr Pro Val Lys Lys
Ser Pro Lys Lys Pro Ala Lys Val Ser Gln Lys His Ser Phe Lys Asn
Asn Ile Lys Asn Asn Val Lys Asn Asn Ala Lys Thr Ala Ser Lys Lys
Gln Glu Met Cys Lys Asn Cys Ser Pro Gly Gln Arg Asn Ala Ile Leu
Ala Asn His Ile Thr Leu Met Gln Glu Leu
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1515 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 141...1340 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
TTAGTGTTGA TTTTTTTATC GTTAGTGTTT GTGCGTCC'TT TAGAGGCTTT GAGCGTGTTT 60 Met Lys Glu Ser Phe Tyr Ile Glu Gly Met Thr Cys Thr Ala Cys Ser Ser Gly Ile Glu A.rg Ser Leu Gly Arg Lys AGT TTT GTG AAA AAA ATA GAA GTG AGC CTT T'TA AAT AAG AGC GCT AAC 266 Ser Phe Val Lys Lys Ile Glu Val Ser Leu Leu Asn Lys Ser Ala Asn I1e Glu Phe Asp Glu Asn Gln Thr Asn Leu Asp Glu Ile Phe Lys Leu Ile Glu Lys Leu Gly Tyr Ser Pro Lys Lys Ala Leu Thr Lys Glu Lys Lys Glu Phe Phe Ser Pro Asn Val Lys Leu Ala Leu Ala Val Ile Phe Thr Leu Phe Val Val Tyr Leu Ser Met Gly Ala Met Leu Ser Pro Ser 95 l00 10S
Leu Leu Pro Glu Ser Leu Leu Ala Ile Asp Asn His Ser Asn Phe Leu 110 l15 120 Asn Ala Cys Leu Gln Leu Ile Gly Ala Leu Ile Val Met His Leu Gly l25 l30 135 Arg Asp Phe Tyr Ile Gln Gly Phe Lys Ala Leu Trp His Arg Gln Pro l40 145 150 Asn Met Ser Ser Leu Ile Ala Ile Gly Thr Ser Ala Ala Leu Ile Ser l55 160 165 170 Ser Leu Trp Gln Leu Tyr Leu Val Tyr Thr Asn His Tyr Thr Asp Gln Trp Ser Tyr Gly His Tyr Tyr Phe Glu Ser Val Cys Val Ile Leu Met l90 19S 200 Phe Val Met Val Gly Lys Arg Ile Glu Asn Val Ser Lys Asp Lys Ala Leu Asp Ala Met Gln Ala Leu Met Lys Asn Ala Pro Lys Thr Ala Leu Lys Met Gln Asn Asn Gln Gln Ile Glu Val Leu Val Asp Ser Ile Val Val Gly Asp Ile Leu Lys Val Leu Pro Gly ~~er Ala Ile Ala Val Asp Gly Glu Ile Ile Glu Gly Glu Gly Glu Leu F,sp Glu Ser Met Leu Ser Gly Glu Ala Leu Pro Val Tyr Lys Lys Val Gly Asp Lys Val Phe Ser Gly Thr Phe Asn Ser His Thr Ser Phe Leu Met Lys Ala Thr Gln Asn Asn Lys Asn Ser Thr Leu Ser Gln Ile Ile Glu Met Ile Tyr Asn Ala CAA AGT TCA AAG GCA GAG ATT TCT CGC TTA GCG GAT AAG GTT TCA AGC 1l78 Gln Ser Ser Lys Ala Glu Ile Ser Arg Leu Ala Asp Lys Val Ser Ser Val Phe Val Pro Ser Val Ile Ala Ile Ser Ile Leu A1a Phe Val Val Trp Leu Ile Ile Ala Pro Lys Pro Asp Phe Trp Trp Asn Phe Gly Ile Ala Leu Glu Val Phe Val Ser Val Leu Val Ile Ser Cys Pro Cys Ala Leu Gly Leu Leu Arg Leu GGGTTATTTT TTAAAGACGC TAAAAGTTTA GAAAAAGC.~A GGCTAGTCAA TACGATCGTT 1438 (2) INFORMATION FOR SEQ ID N0:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 400 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
Met Lys Glu Ser Phe Tyr Ile Glu Gly Met Thr Cys Thr Ala Cys Ser
Ser Gly Ile Glu Arg Ser Leu Gly Arg Lys Ser Phe Val Lys Lys Ile
Glu Val Ser Leu Leu Asn Lys Ser Ala Asn Ile Glu Phe Asp Glu Asn
Gln Thr Asn Leu Asp Glu Ile Phe Lys Leu Ile Glu Lys Leu Gly Tyr
Ser Pro Lys Lys Ala Leu Thr Lys Glu Lys Lys Glu Phe Phe Ser Pro
Asn Val Lys Leu Ala Leu Ala Val Ile Phe Thr Leu Phe Val Val Tyr
Leu Ser Met Gly Ala Met Leu Ser Pro Ser Leu Leu Pro Glu Ser Leu
Leu Ala Ile Asp Asn His Ser Asn Phe Leu Asn Ala Cys Leu Gln Leu
Ile Gly Ala Leu Ile Val Met His Leu Gly Arg Asp Phe Tyr Ile Gln
Gly Phe Lys Ala Leu Trp His Arg Gln Pro Asn Met Ser Ser Leu Ile
Ala Ile Gly Thr Ser Ala Ala Leu Ile Ser Ser Leu Trp Gln Leu Tyr
Leu Val Tyr Thr Asn His Tyr Thr Asp Gln Trp Ser Tyr Gly His Tyr
Tyr Phe Glu Ser Val Cys Val Ile Leu Met Phe Val Met Val Gly Lys
Arg Ile Glu Asn Val Ser Lys Asp Lys Ala Leu Asp Ala Met Gln Ala
Leu Met Lys Asn Ala Pro Lys Thr Ala Leu Lys Met Gln Asn Asn Gln
Gln Ile Glu Val Leu Val Asp Ser Ile Val Val Gly Asp Ile Leu Lys
Val Leu Pro Gly Ser Ala Ile Ala Val Asp Gly Glu Ile Ile Glu Gly
Glu Gly Glu Leu Asp Glu Ser Met Leu Ser Gly Glu Ala Leu Pro Val
Tyr Lys Lys Val Gly Asp Lys Val Phe Ser Gly Thr Phe Asn Ser His
Thr Ser Phe Leu Met Lys Ala Thr Gln Asn Asn Lys Asn Ser Thr Leu
Ser Gln Ile Ile Glu Met Ile Tyr Asn Ala Gln Ser Ser Lys Ala Glu
Ile Ser Arg Leu Ala Asp Lys Val Ser Ser Val Phe Val Pro Ser Val
Ile Ala Ile Ser Ile Leu Ala Phe Val Val Trp Leu Ile Ile Ala Pro
Lys Pro Asp Phe Trp Trp Asn Phe Gly Ile Ala Leu Glu Val Phe Val
Ser Val Leu Val Ile Ser Cys Pro Cys Ala Leu Gly Leu Leu Arg Leu
(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1443 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 76...1389 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
Met Lys Lys Ile Trp Leu Leu Val Trp Gly Leu Cys TCT TGG GTG TTT TTG CAT GCG ATA GAG ATG ATA GAA AAA GCC CCT ACA l59 Ser Trp Val Phe Leu His Ala Ile Glu Met Ile Glu Lys Ala Pro Thr Asn Val Glu Asp Arg Asp Lys Ala Pro His Leu Leu Leu Leu Ala Gly Ile Gln Gly Asp Glu Pro Gly Gly Phe Asn Ala Thr Asn Leu Phe Leu Met His Tyr Ser Val Leu Lys Gly Leu Val Glu Val Val Pro Val Leu Asn Lys Pro Ser Met Leu Arg Asn His Arg Gly Leu Tyr Gly Asp Met Asn Arg Lys Phe Ala Ala Leu Asp Lys Asn Asp Pro Glu Tyr Pro Thr Ile Gln Glu Ile Lys Ser Leu Ile Ala Lys Pro Ser Ile Asp Ala Val 1l0 1l5 120 Leu His Leu His Asp Gly Gly Gly Tyr Tyr Arg Pro Val Tyr Val Asp Ala Met Leu Asn Pro Lys Arg Trp Gly Asn Cys Phe Ile Ile Asp Gln Asp Glu Val Lys Gly Ala Lys Phe Pro Asn Leu Leu Ala Phe Ala Asn 160 l65 170 Asn Thr Ile Glu Ser Ile Asn Ala His Leu Leu His Pro Ile Glu Glu Tyr His Leu Lys Asn Thr Arg Thr Ala Gln Gly Asp Thr Glu Met Gln 190 l95 200 Lys Ala Leu Thr Phe Tyr Ala Ile Asn Gln Lys Lys Ser Ala Phe Ala 205 2l0 2l5 220 Asn Glu Ala Ser Lys Glu Leu Pro Leu Ala Ser Arg Val Phe Tyr His CTG CAA GCC ATT GAG GGC TTA CTC AAT CAG CTC AAT ATC CCT TTT AAG 8.31 Leu Gln Ala Ile Glu Gly Leu Leu Asn Gln Leu Asn Ile Pro Phe Lys Arg Asp Phe Asp Leu Asn Pro Asn Ser Val His Ala Leu Ile Asn Asp AAA AAC TTG TGG GCA AAA ATC AGC TCT TTG CCT AAA ATG CCC CTT TTT 927 -.
Lys Asn Leu Trp Ala Lys Ile Ser Ser Leu Pro Lys Met Pro Leu Phe Asn Leu Arg Pro Lys Leu Asn His Phe Pro Leu Pro His Asn Thr Lys Ile Pro Gln Ile Pro Ile Glu Ser Asn Ala Tyr Ile Val Gly Leu Val Lys Asn Lys Gln Glu Val Phe Leu Lys Tyr Gly Asn Lys Leu Met Thr CGA TTA TCG CCT TTT TAC ATA GAG TTT GAT CCT TCT TTA GAA GAA GTG 11l9 Arg Leu Ser Pro Phe Tyr Ile Glu Phe Asp Pro Ser Leu Glu Glu Val Lys Met Gln Ile Asp Asn Lys Asp Gln Met Val Lys Ile Gly Ser Val Val Glu Val Lys Glu Ser Phe Tyr Ile His Ala Met Asp Asn Ile Arg Ala Asn Val Ile Gly Phe Ser Val Ser Asn Glu Asn Lys Pro Asn Glu Ala Gly Tyr Thr Ile Lys Phe Lys Asp Phe Gln Lys Arg Phe Ser Leu Asp Lys Gln Glu Arg Ile Tyr Arg Ile Glu Phe Tyr Lys Asn Asn Ala TTT AGC GGG ATG ATC TTA GTG AAA TTT GTG T.AGGAATGGA TAAATCTCAT TGC 1412 Phe Ser Gly Met Ile Leu Val Lys Phe Val (2} INFORMATION FOR SEQ ID N0:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
Met Lys Lys Ile Trp Leu Leu Val Trp Gly Leu Cys Ser Trp Val Phe
Leu His Ala Ile Glu Met Ile Glu Lys Ala Pro Thr Asn Val Glu Asp
Arg Asp Lys Ala Pro His Leu Leu Leu Leu Ala Gly Ile Gln Gly Asp
Glu Pro Gly Gly Phe Asn Ala Thr Asn Leu Phe Leu Met His Tyr Ser
Val Leu Lys Gly Leu Val Glu Val Val Pro Val Leu Asn Lys Pro Ser
Met Leu Arg Asn His Arg Gly Leu Tyr Gly Asp Met Asn Arg Lys Phe
Ala Ala Leu Asp Lys Asn Asp Pro Glu Tyr Pro Thr Ile Gln Glu Ile
Lys Ser Leu Ile Ala Lys Pro Ser Ile Asp Ala Val Leu His Leu His
Asp Gly Gly Gly Tyr Tyr Arg Pro Val Tyr Val Asp Ala Met Leu Asn
Pro Lys Arg Trp Gly Asn Cys Phe Ile Ile Asp Gln Asp Glu Val Lys
Gly Ala Lys Phe Pro Asn Leu Leu Ala Phe Ala Asn Asn Thr Ile Glu
Ser Ile Asn Ala His Leu Leu His Pro Ile Glu Glu Tyr His Leu Lys
Asn Thr Arg Thr Ala Gln Gly Asp Thr Glu Met Gln Lys Ala Leu Thr
Phe Tyr Ala Ile Asn Gln Lys Lys Ser Ala Phe Ala Asn Glu Ala Ser
Lys Glu Leu Pro Leu Ala Ser Arg Val Phe Tyr His Leu Gln Ala Ile
Glu Gly Leu Leu Asn Gln Leu Asn Ile Pro Phe Lys Arg Asp Phe Asp
Leu Asn Pro Asn Ser Val His Ala Leu Ile Asn Asp Lys Asn Leu Trp
Ala Lys Ile Ser Ser Leu Pro Lys Met Pro Leu Phe Asn Leu Arg Pro
Lys Leu Asn His Phe Pro Leu Pro His Asn Thr Lys Ile Pro Gln Ile
Pro Ile Glu Ser Asn Ala Tyr Ile Val Gly Leu Val Lys Asn Lys Gln
Glu Val Phe Leu Lys Tyr Gly Asn Lys Leu Met Thr Arg Leu Ser Pro
Phe Tyr Ile Glu Phe Asp Pro Ser Leu Glu Glu Val Lys Met Gln Ile
Asp Asn Lys Asp Gln Met Val Lys Ile Gly Ser Val Val Glu Val Lys
Glu Ser Phe Tyr Ile His Ala Met Asp Asn Ile Arg Ala Asn Val Ile
Gly Phe Ser Val Ser Asn Glu Asn Lys Pro Asn Glu Ala Gly Tyr Thr
Ile Lys Phe Lys Asp Phe Gln Lys Arg Phe Ser Leu Asp Lys Gln Glu
Arg Ile Tyr Arg Ile Glu Phe Tyr Lys Asn Asn Ala Phe Ser Gly Met
Ile Leu Val Lys Phe Val
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1280 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 66...1223 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
Met Leu Arg Lys Asn Ile Leu Ala Tyr Tyr Gly Ala Asn Phe Leu -1zo-TTA ATC ATC GCT CAA AGC TTA CCC CAT GCG .ATT TTA ACC CCC TTG TTG 158 Leu Ile Ile Ala Gln Ser Leu Pro His Ala Ile Leu Thr Pro Leu Leu CTT TCT AAA GGG CTT AGT TTG AGT GAA ATC 'rTG CTC GTG CAA ACC TTT 206 Leu Ser Lys Gly Leu Ser Leu Ser Glu Ile Leu Leu Val Gln Thr Phe TTT AGC TTT TGC GTG CTA GTG GCT GAA TAC ~~CA AGC GGC GTT TTA GCG 254 Phe Ser Phe Cys Val Leu Val Ala Glu Tyr Pro Ser Gly Val Leu Ala GAT TTG ATG AGC CGA AAA AAT TTA TTC CTG GTT TCT AAT GCC TTT TTA 302 _ Asp Leu Met Ser Arg Lys Asn Leu Phe Leu 'Jal Ser Asn Ala Phe Leu Ile Ala Ser Phe Ser Phe Val Leu Phe Phe ,Asp Ser Phe Ile Phe Met CTT TTA GCG TGG GGG TTG TAT GGT TTG TAT :4GC GCA TGC TCT AGC GGC 39B
Leu Leu Ala Trp Gly Leu Tyr Gly Leu Tyr Ser Ala Cys Ser Ser Gly ACG ATT GAA GCT TCA CTC ATC ACA GAC ATT i~AG GAA AAC AAA AAA GAT 446 Thr Tle Glu Ala Ser Leu Ile Thr Asp Ile Lys Glu Asn Lys Lys Asp l15 120 l25 Leu Ser Lys Phe Leu Ala Lys Asn Asn Gln :Cle Thr Tyr Leu Gly Met 130 135 l40 ATT ATA GGG AGT TCT TTG GGA TCG TTT TTG 'CAT CTC AAA GTC CAT GCG 542 Ile Ile Gly Ser Ser Leu Gly Ser Phe Leu 'Cyr Leu Lys Val His Ala Met Leu Tyr Ile Val Gly Ile Phe Leu Ile Met Leu Cys Val Leu Thr l60 165 :L70 175 Ile Ile Phe Tyr Phe Lys Glu Lys Glu Gly Asp Phe Lys Ser Gln Lys l80 185 190 AGC CTG AAA CTC CTT AAA GAG CAA GTC AAA CiGC AGT CTT AAA GAG CTT 686 Ser Leu Lys Leu Leu Lys Glu Gln Val Lys Gly Ser Leu Lys Glu Leu ' Lys Asp Asn Pro Lys Leu Lys Ile Leu Leu Val Gly His Leu Ile Thr CCC GTC TTT TTT ATG AGC CAT TTT CAA ATG 7.'GG CAA GCG TAT TTT TTA 782 Pro Val Phe Phe Met Ser His Phe Gln Met 7.'rp Gln Ala Tyr Phe Leu WO 98l21225 PCTlUS97l21353 -Lys Gln Gly Val Lys Glu Gln Tyr Leu Phe Val Phe Tyr Ile Ala Phe G1n Val Ile Ser Ile Leu Ile His Phe Leu Lys Ala Ser Ser Tyr Ser Gln Lys Ile Ala Leu Ser Ser Leu Val Val Leu Leu Gly Val Ser Pro Leu Leu Leu Ser Asn Ile Pro Tyr Cys Phe Ile Gly Val Tyr Ala Leu Met Val Ala Phe Phe Thr Tyr Met Ser Tyr Cys Leu Asn Tyr Gln Phe 305 3l0 315 Ser Lys Phe Val Ser Lys Asn Asn Ile Ser Ser Leu Ser Ser Leu Leu Ser Ser Cys Val Arg Val Val Ser Val Leu Ile Leu Ser Leu Ser Ser Leu Glu Leu Arg Tyr Phe Ser Pro Leu Thr Ile Ile Thr Met His Phe GCC TTG ACG CTT ATC ATC CTC TTT TTC TTT TTG TAT AAG GCT AAG CCG l214 Ala Leu Thr Leu Ile Ile Leu Phe Phe Phe Leu Tyr Lys Ala Lys Pro TTT GAT GAG TGAGCGGCTT TAAGAGTGCA ACCTTTTAGC GATTTCTATA GCAACATCA l272 Phe Asp Glu (2) INFORMATION FOR SEQ ID N0:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 386 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
Met Leu Arg Lys Asn Ile Leu Ala Tyr Tyr Gly Ala Asn Phe Leu Leu
Ile Ile Ala Gln Ser Leu Pro His Ala Ile Leu Thr Pro Leu Leu Leu
Ser Lys Gly Leu Ser Leu Ser Glu Ile Leu Leu Val Gln Thr Phe Phe
Ser Phe Cys Val Leu Val Ala Glu Tyr Pro Ser Gly Val Leu Ala Asp
Leu Met Ser Arg Lys Asn Leu Phe Leu Val Ser Asn Ala Phe Leu Ile
Ala Ser Phe Ser Phe Val Leu Phe Phe Asp Ser Phe Ile Phe Met Leu
Leu Ala Trp Gly Leu Tyr Gly Leu Tyr Ser Ala Cys Ser Ser Gly Thr
Ile Glu Ala Ser Leu Ile Thr Asp Ile Lys Glu Asn Lys Lys Asp Leu
Ser Lys Phe Leu Ala Lys Asn Asn Gln Ile Thr Tyr Leu Gly Met Ile
Ile Gly Ser Ser Leu Gly Ser Phe Leu Tyr Leu Lys Val His Ala Met
Leu Tyr Ile Val Gly Ile Phe Leu Ile Met Leu Cys Val Leu Thr Ile
Ile Phe Tyr Phe Lys Glu Lys Glu Gly Asp Phe Lys Ser Gln Lys Ser
Leu Lys Leu Leu Lys Glu Gln Val Lys Gly Ser Leu Lys Glu Leu Lys
Asp Asn Pro Lys Leu Lys Ile Leu Leu Val Gly His Leu Ile Thr Pro
Val Phe Phe Met Ser His Phe Gln Met Trp Gln Ala Tyr Phe Leu Lys
Gln Gly Val Lys Glu Gln Tyr Leu Phe Val Phe Tyr Ile Ala Phe Gln
Val Ile Ser Ile Leu Ile His Phe Leu Lys Ala Ser Ser Tyr Ser Gln
Lys Ile Ala Leu Ser Ser Leu Val Val Leu Leu Gly Val Ser Pro Leu
Leu Leu Ser Asn Ile Pro Tyr Cys Phe Ile Gly Val Tyr Ala Leu Met
Val Ala Phe Phe Thr Tyr Met Ser Tyr Cys Leu Asn Tyr Gln Phe Ser
Lys Phe Val Ser Lys Asn Asn Ile Ser Ser Leu Ser Ser Leu Leu Ser
Ser Cys Val Arg Val Val Ser Val Leu Ile Leu Ser Leu Ser Ser Leu
Glu Leu Arg Tyr Phe Ser Pro Leu Thr Ile Ile Thr Met His Phe Ala
Leu Thr Leu Ile Ile Leu Phe Phe Phe Leu Tyr Lys Ala Lys Pro Phe
Asp Glu
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1264 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...1205 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
ATTAAATATG ACTATATACA CTACAACAAT AAGATTTTGA AAGGTTGGTA ATG GAA 56 _ Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met GCT AAT ACA AAG GCA AAT AAA GAG GCT CAT TTT AAA CAA GCG AGC ACC 1.52 Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gln Ala Ser Thr ATT ACA AAT ATA ATC AGA TCA ATT CGT GGG ATT TTT ACA AAA ATT GCA 2d0 Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln Ile Glu Leu Glu 1l5 120 l25 130 Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Glu Gln Glu Xaa Gln Lys l65 170 175 ' Thr Glu Gln Glu Arg Gln Lys Thr Glu Gln C~lu Lys Gln Lys Thr Ile Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys Gln Glu 195 200 a!05 210 AAT CAT AAT CAA TTC TTT ATT GAA AAA GGA CiGA ATT AAG GCT GGT ATT 728 Asn His Asn Gln Phe Phe Ile Glu Lys Gly Cily Ile Lys Ala Gly Ile GGT ATA GAA GTA GAA GCT GAA TGC AAA ACC C'.CT AAA CCT GCA AAA ACC 776 Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr AAT CAA ACC CCT ATC CAG CCA AAA CAC CTC C:CA AAC TCT AAA CAA CCC 824 -- Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro CGC TCT CAA AGA GGA TCA AAA GCG CAA GAG C'.TT ATC GCT TAT TTG CAA 872 Arg Ser Gln Arg Gly Ser Lys A1a Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln GTG GAT TTT TAT AGA.CCA AGT TCT ATC GCT TAT TTA GAA CTA GAC CCT 968 Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro AGA GAT TTT AAT GTT ACA GAA GAA TGG CAA AAA GAA AAT TTA AAA ATA 10l6 Arg Asp Phe Asn Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile CGC TCT AAA GCT CAA GCT AAA ATG CTT GAA 
ATG AGG AGT TTA AAA CCA l064 Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Ser Leu Lys Pro GAC TCA CAA GCC CAC CTT TCA ACC TCT CAA AGC CTT TTG TTC GTT CAA 11l2 Asp Ser Gln Ala His Leu Ser Thr Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Lys Val Val Ala Asn Thr GAA AAG AAA GCA GAA AAA GCG GGT TAT GGT 'CAT AGT AAA AGG ATG TAGGC 1210 G1u Lys Lys Ala Glu Lys Ala Gly Tyr Gly '.t'yr Ser Lys Arg Met ' 375 3B0 385 WO 98/21225 PCT/US97/21353 ' ATAAGAAAAC ACCATAAAAT CGTTCTTAGC TTATTTATAG TATTTTAAAA ACTC l264 (2) INFORMATION FOR SEQ ID N0:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 385 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr
Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gln Ala
Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys
Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser
Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys
Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu
Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser
Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln Ile Glu
Leu Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln
Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn
Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Glu Gln Glu Xaa
Gln Lys Thr Glu Gln Glu Arg Gln Lys Thr Glu Gln Glu Lys Gln Lys
Thr Ile Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys
Gln Glu Asn His Asn Gln Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala
Gly Ile Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala
Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys
Gln Pro Arg Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr
Leu Gln Lys Glu Leu Glu Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala
Lys Gln Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu
Asp Pro Arg Asp Phe Asn Val Thr Glu Glu Trp Gln Lys Glu Asn Leu
Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Ser Leu
Lys Pro Asp Ser Gln Ala His Leu Ser Thr Ser Gln Ser Leu Leu Phe
Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Lys Val Val Ala
Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg
Met
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 410 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 62...340 (D) OTHER INFORMATION:
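As an illustration of how the codon rows of a sequence description relate to the three-letter residues printed beneath them, the sketch below translates one codon row taken from SEQ ID NO:35. It assumes the standard genetic code applies and that the stated location is 1-based and inclusive; the helper name `translate` and the table construction are illustrative only and are not part of the listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into three-letter residues."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# One codon row as printed in the sequence description of SEQ ID NO:35.
row = "AGT TCT GCA TAC GCT CAC AAA GAT AAA AAA GAC GCC AAA AAA CCT AAA"
residues = translate(row)

# Codons spanned by the stated coding location 62...340.
cds_codons = (340 - 62 + 1) // 3

print(residues)
print(cds_codons)
```

The translated row reads Ser Ser Ala Tyr Ala His Lys Asp Lys Lys Asp Ala Lys Lys Pro Lys, matching the residues printed under that row, and the coding location spans 93 codons, matching the length stated for SEQ ID NO:36.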
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
ATTCATTTAC TTTTGAGAAA TATAATTCTC TCGCTTT'iCAA GATCATCACA AGGAGTTTCG 60 Met Lys Lys Gln Ile Leu Thr Gly Val Leu Leu Ser Val Leu Ala Val AGT TCT GCA TAC GCT CAC AAA GAT AAA AAA GAC GCC AAA AAA CCT AAA l57 Ser Ser Ala Tyr Ala His Lys Asp Lys Lys Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys Asp Ala Lys AAA CCT AAA TTT AGC ACA GAA TTA GTC GTG CiCT CAA AAC GAC AAA AAA 253 Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys GAC GCT AAA AAA CCT AAA TTT AGC ACA GAA 7.'TA GTC GTG GCT CAA AAC 301 Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn 65 70 ',~5 80 _ GAC AAA AAA GAC GCT AAA AAA CCT AAA AAC 7.'CA GTG GTC TAATGGCTTT GA 352 Asp Lys Lys Asp Ala Lys Lys Pro Lys Asn Ser Val Val ' CTCTAAAAAA GCGTTTTTAA AAACGCTTTT TTGGATA7.'TA TCCTATAATT TCCTACCA 410 WO 9$/21225 PCT/US97/21353 (2) INFORMATION FOR SEQ ID N0:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
Met Lys Lys Gln Ile Leu Thr Gly Val Leu Leu Ser Val Leu Ala Val
Ser Ser Ala Tyr Ala His Lys Asp Lys Lys Asp Ala Lys Lys Pro Lys
Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys Asp Ala Lys
Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys
Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn
Asp Lys Lys Asp Ala Lys Lys Pro Lys Asn Ser Val Val
(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2097 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 67...2046 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
Met Ile Tyr Trp Leu Tyr Leu Ala Val Phe Phe Leu Leu Ser Ala Leu Asp Ala Lys Glu Ile Ala Met Gln Arg Phe Asp Lys Gln Asn His Lys Ile Phe Glu Ile Leu Ala Asp Lys Val Ser Ala Lys Asp Asn Val Ile Thr Ala Ser Gly Asn Ala Ile Leu Leu Asn Tyr Asp Val Tyr Ile Leu Ala Asp Lys Val Arg Tyr Asp Thr Lys Thr Lys Glu Ala Leu Leu Glu Gly Asn Ile Lys Val Tyr Arg Gly Glu Gly Leu Leu Val Lys Thr Asp Tyr Val Lys Leu Ser Leu Asn Glu Lys Tyr Glu Ile Ile Phe 95 l00 l05 110 Pro Phe Tyr Val Gln Asp Ser Val Ser Gly Ile Trp Val Ser Ala Asp l15 120 125 Ile Ala Ser Gly Lys Asp Gln Lys Tyr Lys Val Lys Asn Met Ser Thr l30 l35 l40 Ser Gly Cys Ser Ile Asp Asn Pro I1e Trp His Val Asn Ala Thr Ser 145 l50 155 Gly Ser Phe Asn Met Gln Lys Ser His Leu Ser Met Trp Asn Pro Lys l60 l65 170 Ile Tyr Val Gly Asp Ile Pro Val Leu Tyr Leu Pro Tyr Ile Phe Met l75 l80 1B5 190 Ser Thr Ser Asn Lys Arg Thr Thr Gly Phe Leu Tyr Pro Glu Phe Gly l95 200 205 Thr Ser Asn Leu Asp Gly Phe Ile Tyr Leu Gln Pro Phe Tyr Leu Ala Pro Lys Asn Ser Trp Asp Met Thr Phe Thr Pro Gln Ile Arg Tyr Lys Arg Gly Phe Gly Leu Asn Phe Glu Ala Arg Tyr Ile Asn Ser Lys Asn ' Asp Arg Phe Leu Phe Asn Ala Arg Tyr Phe Arg Asn Tyr Thr Gln Tyr 255 2fi0 265 270 Val Lys Arg Tyr Asp Leu Arg Asn Gln Asn Ile Tyr Gly Phe Glu Phe Leu Ser Ser Ser Arg Asp Thr Leu Gln Lys Tyr Phe His Leu Lys Ser AAT ATT GAC AAC GGG CAT TAC ATT GAC TTT TTA TAC ATG AAC GAT TTG l020 Asn Ile Asp Asn Gly His Tyr Ile Asp Phe Leu Tyr Met Asn Asp Leu 305 310 3l5 GAC TAT GTG CGT TTT GAA AAG GTT AAT AAG CGT ATC ACA GAC GCC ACG l068 Asp Tyr Val Arg Phe Glu Lys Val Asn Lys Arg Ile Thr Asp Ala Thr His Met Ser Arg Ala Asn Tyr Tyr Leu Gln Thr Glu Asn Asn Tyr Tyr Gly Leu Asn Ile Lys Tyr Phe Leu Asn Leu Asn Lys Ile Rsn Asn Asn CGC ACT TTC CAA TCT GTC CCT AAT TTG CAA TAC CAT AAA TAT TTA AAT l212 Arg Thr Phe Gln Ser Val Pro Asn Leu Gln Tyr His Lys Tyr Leu Asn TCT TTG TAT TTT AGA AAT TTG TTG TAT TCG GTG GAT TAT CAG TTT AGA l260 Ser Leu Tyr Phe Arg Asn Leu Leu Tyr Ser Val Asp Tyr Gln Phe Arg Asn Thr 
Ala Arg Glu Ile Gly Tyr Gly Tyr Val Gln Asn Ala Leu Asn Val Pro Val Gly Leu Gln Phe Ser Leu Phe Lys Lys Tyr Leu Ser Leu Gly Leu Trp Asn Asp Leu Gln Leu Ser Asn Val Ala Leu Met Gln Ser Lys Asn Ser Phe Val Pro Thr Ile Pro Asn Glu Ser Arg Glu Phe Gly AAT TTT GTG TCT TCA AAT TTT TCC ATG TAT GTC AAT ACG GAT TTG GCT 1500 Asn Phe Val Ser Ser Asn Phe Ser Met Tyr Val Asn Thr Asp Leu Ala Arg Glu Tyr Asn Lys Leu Phe His Thr Ile Gln Leu Glu Ala Ile Phe Asn Ile Pro Tyr Tyr Thr Phe Lys Asn Gly Leu Phe Ser Gln Asn Met 495 500 505 510 TAT GCT TTA AGC GCG CAA GCC TTA AAC AGC TAC ACT TCG CCT TTA TTG 1644 Tyr Ala Leu Ser Ala Gln Ala Leu Asn Ser Tyr Thr Ser Pro Leu Leu Arg Asp Tyr Asp Tyr Gln Gly Arg Leu Tyr Asp Ser Val Trp Asn Pro Ser Ser Ile Leu Pro Ser Asn Ala Ser Asn Lys Thr Val Asp Leu Thr Leu Thr Gln Tyr Leu Tyr Gly Leu Gly Gly Gln Glu Leu Leu Tyr Phe Lys Ile Ser Gln Leu Ile Asn Leu Asp Asp Lys Val Ser Pro Phe Arg Met Pro Leu Glu Ser Lys Ile Gly Phe Ser Pro Leu Thr Gly Leu Asn Ile Phe Gly Asn Val Phe Tyr Ser Phe Tyr Gln Asn Arg Leu Glu Glu Ile Ser Val Asn Ala Asn Tyr Gln Arg Lys Phe Leu Ser Phe Asn Leu Ser Tyr Phe Leu Lys Asn Asn Phe Ser Ser Gly Ile Asn Ser Ile Val Glu Asn Leu Arg Ile Ile
(2) INFORMATION FOR SEQ ID NO:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 660 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
Met Ile Tyr Trp Leu Tyr Leu Ala Val Phe Phe Leu Leu Ser Ala Leu Asp Ala Lys Glu Ile Ala Met Gln Arg Phe Asp Lys Gln Asn His Lys Ile Phe Glu Ile Leu Ala Asp Lys Val Ser Ala Lys Asp Asn Val Ile Thr Ala Ser Gly Asn Ala Ile Leu Leu Asn Tyr Asp Val Tyr Ile Leu Ala Asp Lys Val Arg Tyr Asp Thr Lys Thr Lys Glu Ala Leu Leu Glu Gly Asn Ile Lys Val Tyr Arg Gly Glu Gly Leu Leu Val Lys Thr Asp Tyr Val Lys Leu Ser Leu Asn Glu Lys Tyr Glu Ile Ile Phe Pro Phe Tyr Val Gln Asp Ser Val Ser Gly Ile Trp Val Ser Ala Asp Ile Ala Ser Gly Lys Asp Gln Lys Tyr Lys Val Lys Asn Met Ser Thr Ser Gly Cys Ser Ile Asp Asn Pro Ile Trp His Val Asn Ala Thr Ser Gly Ser 145 150 155 160 Phe Asn Met Gln Lys Ser His Leu Ser Met Trp Asn Pro Lys Ile Tyr 165 170 175 Val Gly Asp Ile Pro Val Leu Tyr Leu Pro Tyr Ile Phe Met Ser Thr Ser Asn Lys Arg Thr Thr Gly Phe Leu Tyr Pro Glu Phe Gly Thr Ser 195 200 205 Asn Leu Asp Gly Phe Ile Tyr Leu Gln Pro Phe Tyr Leu Ala Pro Lys Asn Ser Trp Asp Met Thr Phe Thr Pro Gln Ile Arg Tyr Lys Arg Gly Phe Gly Leu Asn Phe Glu Ala Arg Tyr Ile Asn Ser Lys Asn Asp Arg Phe Leu Phe Asn Ala Arg Tyr Phe Arg Asn Tyr Thr Gln Tyr Val Lys Arg Tyr Asp Leu Arg Asn Gln Asn Ile Tyr Gly Phe Glu Phe Leu Ser 275 280 285 Ser Ser Arg Asp Thr Leu Gln Lys Tyr Phe His Leu Lys Ser Asn Ile Asp Asn Gly His Tyr Ile Asp Phe Leu Tyr Met Asn Asp Leu Asp Tyr Val Arg Phe Glu Lys Val Asn Lys Arg Ile Thr Asp Ala Thr His Met Ser Arg Ala Asn Tyr Tyr Leu Gln Thr Glu Asn Asn Tyr Tyr Gly Leu Asn Ile Lys Tyr Phe Leu Asn Leu Asn Lys Ile Asn Asn Asn Arg Thr Phe Gln Ser Val Pro Asn Leu Gln Tyr His Lys Tyr Leu Asn Ser Leu Tyr Phe Arg Asn Leu Leu Tyr Ser Val Asp Tyr Gln Phe Arg Asn Thr Ala Arg Glu Ile Gly Tyr Gly Tyr Val Gln Asn Ala Leu Asn Val Pro Val Gly Leu Gln Phe Ser Leu Phe Lys Lys Tyr Leu Ser Leu Gly Leu Trp Asn Asp Leu Gln Leu Ser Asn Val Ala Leu Met Gln Ser Lys Asn 435 440 445 Ser Phe Val Pro Thr Ile Pro Asn Glu Ser Arg Glu Phe Gly Asn Phe Val Ser Ser Asn Phe Ser Met Tyr Val Asn Thr
Asp Leu Ala Arg Glu Tyr Asn Lys Leu Phe His Thr Ile Gln Leu Glu Ala Ile Phe Asn Ile Pro Tyr Tyr Thr Phe Lys Asn Gly Leu Phe Ser Gln Asn Met Tyr Ala Leu Ser Ala Gln Ala Leu Asn Ser Tyr Thr Ser Pro Leu Leu Arg Asp Tyr Asp Tyr Gln Gly Arg Leu Tyr Asp Ser Val Trp Asn Pro Ser Ser Ile Leu Pro Ser Asn Ala Ser Asn Lys Thr Val Asp Leu Thr Leu Thr Gln Tyr Leu Tyr Gly Leu Gly Gly Gln Glu Leu Leu Tyr Phe Lys Ile Ser Gln Leu Ile Asn Leu Asp Asp Lys Val Ser Pro Phe Arg Met Pro Leu Glu Ser Lys Ile Gly Phe Ser Pro Leu Thr Gly Leu Asn Ile Phe Gly Asn Val Phe Tyr Ser Phe Tyr Gln Asn Arg Leu Glu Glu Ile Ser Val Asn Ala Asn Tyr Gln Arg Lys Phe Leu Ser Phe Asn Leu Ser Tyr 625 630 635 640 Phe Leu Lys Asn Asn Phe Ser Ser Gly Ile Asn Ser Ile Val Glu Asn Leu Arg Ile Ile
(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 961 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 168...764
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
Met Thr Ser Ala Leu Leu Gly Leu Gln Ile Val Leu Ala Val Leu Ile Val Val Val Val Leu Leu Gln Lys Ser Ser Ser Ile Gly Leu Gly Ala Tyr Ser Gly Ser Asn Glu Ser Leu Phe Gly Ala Lys Gly Pro Ala Ser Phe Met Ala Lys Leu Thr Met Phe Leu Gly Leu Leu Phe Val Ile Asn Thr Ile Ala Leu Gly Tyr Phe Tyr Asn Lys Glu Tyr Gly Lys Ser Val Leu Asp Glu Thr Lys Thr Asn Lys Glu Leu Ser Pro Leu Val Pro Ala Thr Gly Thr Leu Asn Pro Ala Leu Asn Pro Thr Leu Asn Pro Thr Leu Asn Pro Leu 100 105 110 115 Glu Gln Ala Pro Thr Asn Pro Leu Met Pro Gln Gln Thr Pro Asn Glu Leu Pro Lys Glu Pro Ala Lys Thr Pro Ser Val Glu Ser Pro Lys Gln 135 140 145 Asn Glu Lys Asn Glu Lys Asn Asp Ala Lys Glu Asn Gly Ile Lys Gly 150 155 160 Val Glu Lys Thr Lys Glu Asn Ala Lys Thr Pro Pro Thr Thr His Gln Lys Pro Lys Thr His Ala Thr Gln Thr Asn Ala His Thr Asn Gln Lys 180 185 190 195 Lys Asp Glu Lys AAAGCATTCA AGCTTTAAAC AGGGATTTTT CCACTCTAAG GAGCGCGAAA GTTTCAGTCA 869 ATATTTTAGA TCACATCAAA GTGGATTATT ACGGCACGCC CACGGCATTA AATCAAGTCG 929
(2) INFORMATION FOR SEQ ID NO:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 199 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
Met Thr Ser Ala Leu Leu Gly Leu Gln Ile Val Leu Ala Val Leu Ile Val Val Val Val Leu Leu Gln Lys Ser Ser Ser Ile Gly Leu Gly Ala Tyr Ser Gly Ser Asn Glu Ser Leu Phe Gly Ala Lys Gly Pro Ala Ser Phe Met Ala Lys Leu Thr Met Phe Leu Gly Leu Leu Phe Val Ile Asn Thr Ile Ala Leu Gly Tyr Phe Tyr Asn Lys Glu Tyr Gly Lys Ser Val Leu Asp Glu Thr Lys Thr Asn Lys Glu Leu Ser Pro Leu Val Pro Ala Thr Gly Thr Leu Asn Pro Ala Leu Asn Pro Thr Leu Asn Pro Thr Leu Asn Pro Leu Glu Gln Ala Pro Thr Asn Pro Leu Met Pro Gln Gln Thr 115 120 125 Pro Asn Glu Leu Pro Lys Glu Pro Ala Lys Thr Pro Ser Val Glu Ser 130 135 140 Pro Lys Gln Asn Glu Lys Asn Glu Lys Asn Asp Ala Lys Glu Asn Gly Ile Lys Gly Val Glu Lys Thr Lys Glu Asn Ala Lys Thr Pro Pro Thr 165 170 175 Thr His Gln Lys Pro Lys Thr His Ala Thr Gln Thr Asn Ala His Thr 180 185 190 Asn Gln Lys Lys Asp Glu Lys
(2) INFORMATION FOR SEQ ID NO:41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1058 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 325...879
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
ACCCAACGCT
TAT AAC
Met Leu Gln Ala Ile Tyr Asn Glu Thr Lys Asp Leu Met Gln Lys Ser Ile Gln Ala Leu Asn Arg Asp Phe Ser Thr Leu Arg Ser Ala Lys Val Ser Val Asn Ile Leu Asp His Ile Lys Val Asp Tyr Tyr Gly Thr Pro Thr Ala Leu Asn Gln Val Gly Ser Val Met Ser Leu Asp Ala Thr Thr Leu Gln Ile Ser Pro Trp Glu Lys Asn Leu Leu Lys Glu Ile Glu Arg Ser Ile Gln Glu Ala Asn Ile Gly Val Asn Pro Asn Asn Asp Gly Glu Thr Ile Lys Leu Phe Phe Pro Pro Met Thr Ser Glu Gln Arg Lys Leu Ile Ala Lys Asp Ala Lys Ala Met Gly Glu Lys Ala Lys Val Ala Val Arg Asn Ile Arg Gln Asp Ala Asn Asn Gln Val Lys Lys Leu Glu Lys Asp Lys Glu Ile Ser Glu Asp Glu Ser 140 145 150 Lys Lys Ala Gln Glu Gln Ile Gln Lys Ile Thr Asp Glu Ala Ile Lys 155 160 165 AAA ATT GAT GAA AGC GTG AAA AAC AAA GAA GAC GCG ATC TTA AAG GTC T 880
Lys Ile Asp Glu Ser Val Lys Asn Lys Glu Asp Ala Ile Leu Lys Val 170 175 180 185 TGCTCAGCAG TGGGTTTCAT TCCAATTATT ATTTGCAATC CGCTAAAGTT TTAGAAGATC 1000 CCAAACTAGC CGAACAATTA GCGCTAGAAT TAGCCAAe~CA AATCCAAGAA GCTCATTT 1058
(2) INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
Met Leu Gln Ala Ile Tyr Asn Glu Thr Lys Asp Leu Met Gln Lys Ser Ile Gln Ala Leu Asn Arg Asp Phe Ser Thr Leu Arg Ser Ala Lys Val Ser Val Asn Ile Leu Asp His Ile Lys Val Asp Tyr Tyr Gly Thr Pro Thr Ala Leu Asn Gln Val Gly Ser Val Met Ser Leu Asp Ala Thr Thr Leu Gln Ile Ser Pro Trp Glu Lys Asn Leu Leu Lys Glu Ile Glu Arg 65 70 75 80 Ser Ile Gln Glu Ala Asn Ile Gly Val Asn Pro Asn Asn Asp Gly Glu Thr Ile Lys Leu Phe Phe Pro Pro Met Thr Ser Glu Gln Arg Lys Leu 100 105 110 Ile Ala Lys Asp Ala Lys Ala Met Gly Glu Lys Ala Lys Val Ala Val 115 120 125 Arg Asn Ile Arg Gln Asp Ala Asn Asn Gln Val Lys Lys Leu Glu Lys 130 135 140 Asp Lys Glu Ile Ser Glu Asp Glu Ser Lys Lys Ala Gln Glu Gln Ile Gln Lys Ile Thr Asp Glu Ala Ile Lys Lys Ile Asp Glu Ser Val Lys Asn Lys Glu Asp Ala Ile Leu Lys Val
(2) INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1669 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 163...1389
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
Met Ala Gln Asn Phe Thr Lys Leu Asn Pro Gln Phe Glu Asn Ile Ile Phe Glu His Asp Asp Asn Gln Met Ile Leu Asn Phe Gly Pro Gln His Pro Ser Ser His GGG CAA TTG CGC TTG ATT TTG GAA TTA GAG GGC GAA AAA ATC ACT AAG 318 Gly Gln Leu Arg Leu Ile Leu Glu Leu Glu Gly Glu Lys Ile Ile Lys Ala Thr Pro Glu Ile Gly Tyr Leu His Arg Gly Cys Glu Lys Leu Gly GAA AAC ATG ACC TAT AAC GAA TAC ATG CCC ACT ACT GAT AGA TTG GAT 414 Glu Asn Met Thr Tyr Asn Glu Tyr Met Pro Thr Thr Asp Arg Leu Asp Tyr Thr Ser Ser Thr Ser Asn Asn Tyr Ala Tyr Ala Tyr Ala Val Glu 85 90 95 100 Thr Leu Leu Asn Leu Glu Ile Pro Arg Arg Ala Gln Val Ile Arg Thr 105 110 115 Ile Leu Leu Glu Leu Asn Arg Met Ile Ser His Ile Phe Phe Ile Ser 120 125 130 Val His Ala Leu Asp Val Gly Ala Met Ser Val Phe Leu Tyr Ala Phe 135 140 145 Lys Thr Arg Glu Tyr Gly Leu Asp Leu Met Glu Asp Tyr Cys Gly Ala 150 155 160 Arg Leu Thr His Asn Ala Ile Arg Ile Gly Gly Val Pro Leu Asp Leu 165 170 175 180 CCC CCT AAT TGG TTA GAA GGC TTA AAA AAG TTT TTA GGC GAA ATG AGG 750 Pro Pro Asn Trp Leu Glu Gly Leu Lys Lys Phe Leu Gly Glu Met Arg GAA TGC AAA AAA CTC ATT CAA GGC TTA TTG GAT AAG AAT CGC ATT TGG 798 Glu Cys Lys Lys Leu Ile Gln Gly Leu Leu Asp Lys Asn Arg Ile Trp Arg Met Arg Leu Glu Asn Val Gly Val Val Thr Gln Lys Met Ala Gln Ser Trp Gly Met Ser Gly Ile Met Leu Arg Gly Thr Gly Ile Ala Tyr GAC ATC AGA AAA GAA GAG CCT TAT GAG CTT TAT AAA GAG CTT GAT TTT 942 Asp Ile Arg Lys Glu Glu Pro Tyr Glu Leu Tyr Lys Glu Leu Asp Phe GAT GTG CCG GTG GGC AAT TAT GGC GAT AGT TAT GAT AGG TAT TGT TTG 990 Asp Val Pro Val Gly Asn Tyr Gly Asp Ser Tyr Asp Arg Tyr Cys Leu TAT ATG TTA GAA ATT GAT GAA AGC GTT CGC ATC ATT GAA CAG CTC ATT 1038 Tyr Met Leu Glu Ile Asp Glu Ser Val Arg Ile Ile Glu Gln Leu Ile Pro Met Tyr Ala Lys Thr Asp Thr Pro Ile Met Ala Gln Asn Pro His Tyr Ile Ser Ala Pro Lys Glu Asp Ile Met Thr Gln Asn Tyr Ala Leu Met Gln His Phe Val Leu Val Ala Gln Gly Met Arg Pro Pro Val Gly GAA GTG TAT GCC CCC ACA GAA AGC CCT AAA GGG GAA TTA GGG TTT TTT 1230
Glu Val Tyr Ala Pro Thr Glu Ser Pro Lys Gly Glu Leu Gly Phe Phe 345 350 355 Ile His Ser Glu Gly Glu Pro Tyr Pro His Arg Leu Lys Ile Arg Ala CCT AGC TTT TAT CAC ATT GGG GCT TTG AGC GAC ATT TTA GTG GGG CAA 1326 Pro Ser Phe Tyr His Ile Gly Ala Leu Ser Asp Ile Leu Val Gly Gln Tyr Leu Ala Asp Ala Val Thr Val Ile Gly Ser Thr Asn Ala Val Phe Gly Glu Val Asp Arg
(2) INFORMATION FOR SEQ ID NO:44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 409 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
Met Ala Gln Asn Phe Thr Lys Leu Asn Pro Gln Phe Glu Asn Ile Ile Phe Glu His Asp Asp Asn Gln Met Ile Leu Asn Phe Gly Pro Gln His Pro Ser Ser His Gly Gln Leu Arg Leu Ile Leu Glu Leu Glu Gly Glu Lys Ile Ile Lys Ala Thr Pro Glu Ile Gly Tyr Leu His Arg Gly Cys Glu Lys Leu Gly Glu Asn Met Thr Tyr Asn Glu Tyr Met Pro Thr Thr Asp Arg Leu Asp Tyr Thr Ser Ser Thr Ser Asn Asn Tyr Ala Tyr Ala Tyr Ala Val Glu Thr Leu Leu Asn Leu Glu Ile Pro Arg Arg Ala Gln 100 105 110 Val Ile Arg Thr Ile Leu Leu Glu Leu Asn Arg Met Ile Ser His Ile Phe Phe Ile Ser Val His Ala Leu Asp Val Gly Ala Met Ser Val Phe Leu Tyr Ala Phe Lys Thr Arg Glu Tyr Gly Leu Asp Leu Met Glu Asp 145 150 155 160 Tyr Cys Gly Ala Arg Leu Thr His Asn Ala Ile Arg Ile Gly Gly Val Pro Leu Asp Leu Pro Pro Asn Trp Leu Glu Gly Leu Lys Lys Phe Leu Gly Glu Met Arg Glu Cys Lys Lys Leu Ile Gln Gly Leu Leu Asp Lys Asn Arg Ile Trp Arg Met Arg Leu Glu Asn Val Gly Val Val Thr Gln 210 215 220 Lys Met Ala Gln Ser Trp Gly Met Ser Gly Ile Met Leu Arg Gly Thr Gly Ile Ala Tyr Asp Ile Arg Lys Glu Glu Pro Tyr Glu Leu Tyr Lys Glu Leu Asp Phe Asp Val Pro Val Gly Asn Tyr Gly Asp Ser Tyr Asp Arg Tyr Cys Leu Tyr Met Leu Glu Ile Asp Glu Ser Val Arg Ile Ile Glu Gln Leu Ile Pro Met Tyr Ala Lys Thr Asp Thr Pro Ile Met Ala Gln Asn Pro His Tyr Ile Ser Ala Pro Lys Glu Asp Ile Met Thr Gln Asn Tyr Ala Leu Met Gln His Phe Val Leu Val Ala Gln Gly Met Arg Pro Pro Val Gly Glu Val Tyr Ala Pro Thr Glu Ser Pro Lys Gly Glu Leu Gly Phe Phe Ile His Ser Glu Gly Glu Pro Tyr Pro His Arg Leu Lys Ile Arg Ala Pro Ser Phe Tyr His Ile Gly Ala Leu Ser Asp Ile Leu Val Gly Gln Tyr Leu Ala Asp Ala Val Thr Val Ile Gly Ser Thr Asn Ala Val Phe Gly Glu Val Asp Arg
(2) INFORMATION FOR SEQ ID NO:45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 869 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 358...732
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
CCAATATTGG CTATCATTTA CTTTGATTTC GCCCATCGTG TCATGTTCAA TTCTAAATTG 180 GTTATTATCC GTTCGCAACA AGAATTTTCT TGTTATCTTA ATGTAAAGGT CAAAACG 360 ATG
Met TTT TTA GTG GTT ATG
GGT
Lys Lys Leu Ala Ala Leu Phe Leu Val Ser Val Leu Gly Val Met Gly Leu Asn Ala Trp Glu Gln Thr Leu Lys Ala Asn Asp Leu Glu Val Lys Ile Lys Ser Val Gly Asn Pro Ile Lys Gly Asp Asn Thr Phe Ile Leu Ser Pro Thr Leu Lys Gly Lys Ala Leu Glu Lys Ala Ile Val Arg Val Gln Phe Met Met Pro Glu Met Pro Gly Met Pro Ala Met Lys Glu Met Ala Gln Val Ser Glu Lys Asn Gly Leu Tyr Glu Ala Lys Thr Asn Leu Ser Met Asn Gly Thr Trp Gln Val Arg Val Asp Ile Lys Ser Lys Glu 100 105 110 Gly Gln Val Tyr Arg Ala Lys Thr Ser Leu Asp Leu
(2) INFORMATION FOR SEQ ID NO:46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 125 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
Met Lys Lys Leu Ala Ala Leu Phe Leu Val Ser Val Leu Gly Val Met Gly Leu Asn Ala Trp Glu Gln Thr Leu Lys Ala Asn Asp Leu Glu Val Lys Ile Lys Ser Val Gly Asn Pro Ile Lys Gly Asp Asn Thr Phe Ile Leu Ser Pro Thr Leu Lys Gly Lys Ala Leu Glu Lys Ala Ile Val Arg Val Gln Phe Met Met Pro Glu Met Pro Gly Met Pro Ala Met Lys Glu Met Ala Gln Val Ser Glu Lys Asn Gly Leu Tyr Glu Ala Lys Thr Asn Leu Ser Met Asn Gly Thr Trp Gln Val Arg Val Asp Ile Lys Ser Lys 100 105 110 Glu Gly Gln Val Tyr Arg Ala Lys Thr Ser Leu Asp Leu 115 120 125
(2) INFORMATION FOR SEQ ID NO:47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1217 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 73...1152
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
TCCATGCGTT TTGATGCGAT TTTAAAAAAT CTTTGGGTAT TTTAGCATGC CAATGGTTAA 60 Met Asn Gly Phe Cys Ala Arg Leu Arg Ala Ile Thr His AAT GAA AGA TTA AAA ATG AAA ATA GCG GTA TTA CTC AGT GGG GGG GTG 159 Asn Glu Arg Leu Lys Met Lys Ile Ala Val Leu Leu Ser Gly Gly Val Asp Ser Ser Tyr Ser Ala Tyr Ser Leu Lys Glu Gln Gly His Glu Leu Val Gly Ile Tyr Leu Lys Leu His Ala Ser Glu Lys Lys His Asp Leu TAC ATC AAA AAC GCT CAA AAA GCA TGC GAG TTT TTA GGC ATT CCT TTA 303 Tyr Ile Lys Asn Ala Gln Lys Ala Cys Glu Phe Leu Gly Ile Pro Leu Glu Val Leu Asp Phe Gln Lys Asp Phe Lys Ser Ala Val Tyr Asp Glu TTT ATC AAC GCC TAT GAA GAA GGG CAA ACC CCA AAC CCT TGT GCG TTG 399 Phe Ile Asn Ala Tyr Glu Glu Gly Gln Thr Pro Asn Pro Cys Ala Leu TGC AAC CCT TTA ATG AAG TTT GGG CTA GCT TTG GAT CAC GCT TTA AAA 447 Cys Asn Pro Leu Met Lys Phe Gly Leu Ala Leu Asp His Ala Leu Lys 110 115 120 125 Leu Gly Cys Glu Lys Ile Ala Thr Gly His Tyr Ala Arg Val Lys Glu 130 135 140 Ile Asp Lys Ile Ser Tyr Ile Gln Glu Ala Leu Asp Lys Thr Lys Asp Gln Ser Tyr Phe Leu Tyr Ala Leu Glu His Glu Val Ile Ala Lys Leu 160 165 170 Val Phe Pro Leu Gly Asp Leu Leu Lys Lys Asp Ile Lys Pro Leu Ala Leu Asn Ala Met Pro Phe Leu Gly Thr Leu Glu Thr Tyr Lys Glu Ser 190 195 200 205 Gln Glu Ile Cys Phe Val Glu Lys Ser Tyr Ile Asp Thr Leu Lys Lys His Val Glu Val Glu Lys Glu Gly Val Val Lys Asn Leu Gln Gly Glu Val Ile Gly Thr His Lys Gly Tyr Met Gln Tyr Thr Ile Gly Lys Arg Lys Gly Phe Ser Ile Lys Gly Ala Leu Glu Pro His Phe Val Val Gly Ile Asp Ala Lys Lys Asn Glu Leu Val Val Gly Lys Lys Glu Asp Leu Ala Thr His Ser Leu Lys Ala Lys Asn Lys Ser Leu Met Lys Asp Phe Lys Asp Gly Glu Tyr Phe Ile Lys Ala Arg Tyr Arg Ser Val Pro Ala AAA GCG CAT GTG AGT TTG AAA GAT GAG GTG ATT GAA GTG GGG TTT AAA 1071 Lys Ala His Val Ser Leu Lys Asp Glu Val Ile Glu Val Gly Phe Lys Glu Pro Phe Tyr Gly Val Ala Lys Gly Gln Ala Leu Val Val Tyr Lys Asp Asp Ile Leu Leu Gly Gly Gly Val Ile Val GATACGCCTT TTGGCAGTCT CTTAATGTTT TATTGAATAG
GCGTT 1217
(2) INFORMATION FOR SEQ ID NO:48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 360 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
Met Asn Gly Phe Cys Ala Arg Leu Arg Ala Ile Thr His Asn Glu Arg Leu Lys Met Lys Ile Ala Val Leu Leu Ser Gly Gly Val Asp Ser Ser Tyr Ser Ala Tyr Ser Leu Lys Glu Gln Gly His Glu Leu Val Gly Ile Tyr Leu Lys Leu His Ala Ser Glu Lys Lys His Asp Leu Tyr Ile Lys Asn Ala Gln Lys Ala Cys Glu Phe Leu Gly Ile Pro Leu Glu Val Leu Asp Phe Gln Lys Asp Phe Lys Ser Ala Val Tyr Asp Glu Phe Ile Asn Ala Tyr Glu Glu Gly Gln Thr Pro Asn Pro Cys Ala Leu Cys Asn Pro 100 105 110 Leu Met Lys Phe Gly Leu Ala Leu Asp His Ala Leu Lys Leu Gly Cys 115 120 125 Glu Lys Ile Ala Thr Gly His Tyr Ala Arg Val Lys Glu Ile Asp Lys 130 135 140 Ile Ser Tyr Ile Gln Glu Ala Leu Asp Lys Thr Lys Asp Gln Ser Tyr Phe Leu Tyr Ala Leu Glu His Glu Val Ile Ala Lys Leu Val Phe Pro 165 170 175 Leu Gly Asp Leu Leu Lys Lys Asp Ile Lys Pro Leu Ala Leu Asn Ala 180 185 190 Met Pro Phe Leu Gly Thr Leu Glu Thr Tyr Lys Glu Ser Gln Glu Ile 195 200 205 Cys Phe Val Glu Lys Ser Tyr Ile Asp Thr Leu Lys Lys His Val Glu 210 215 220 Val Glu Lys Glu Gly Val Val Lys Asn Leu Gln Gly Glu Val Ile Gly Thr His Lys Gly Tyr Met Gln Tyr Thr Ile Gly Lys Arg Lys Gly Phe Ser Ile Lys Gly Ala Leu Glu Pro His Phe Val Val Gly Ile Asp Ala Lys Lys Asn Glu Leu Val Val Gly Lys Lys Glu Asp Leu Ala Thr His Ser Leu Lys Ala Lys Asn Lys Ser Leu Met Lys Asp Phe Lys Asp Gly Glu Tyr Phe Ile Lys Ala Arg Tyr Arg Ser Val Pro Ala Lys Ala His Val Ser Leu Lys Asp Glu Val Ile Glu Val Gly Phe Lys Glu Pro Phe Tyr Gly Val Ala Lys Gly Gln Ala Leu Val Val Tyr Lys Asp Asp Ile Leu Leu Gly Gly Gly Val Ile Val
(2) INFORMATION FOR SEQ ID NO:49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 975 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 191...793
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
AACCCACTGA
GGGTTTTAGT TTTAGCATGT TAGCATTCAG CCACCACTCT TTTTAAGGAA TTTGTTTGAA 180 TTG TTA TCT
GCC ACT
CTT TTA
Met Ser Leu Leu Ala Thr Leu Leu Leu Ala Ser Cys Leu Pro Pro Lys Gly His His Ser Gly Leu Val Asn Leu Tyr Ile Ala His Gln Gly Gln Ser Val Arg Thr Tyr Trp Arg Lys Val Asp Arg Gly Val Ile Ala Lys His Asn Glu Ala Leu Lys Lys Asp Pro Lys Ala Lys Leu Lys Asp Pro Arg Gly Pro Leu Phe Met Leu Gly Ser Glu Arg Phe Met Leu Leu Trp Lys Asn Arg Tyr Ala Leu Ala Lys Pro Gln Ser Phe Arg Leu Glu Pro Gly Phe Tyr Tyr Leu Asp Ser Phe Ser Val Glu Thr Gln 95 100 105 Lys Gly Val Leu Gln Ser Ala Pro Gly Tyr Ser Tyr Thr Lys Asn Gly 110 115 120 125 Tyr Asp Phe Lys Asn Asn Arg Pro Phe Phe Leu Ala Phe Glu Val Lys 130 135 140 Pro Asp Gly Lys Thr Ile Leu Pro Ser Val Glu Leu Ser Leu Ile Lys 145 150 155 Thr Pro Arg Gly Phe Leu Gly Val Phe Leu Phe Asp Asn Asn Glu Lys 160 165 170 Gly Thr Asn Ala Lys Trp Ile Glu Gly Ser Leu Asn Leu Lys Leu Lys Asn Ala Ser Phe Lys Asp Ala Trp Gly Leu Glu Gln AGATTTTATT ACCCCTATTC AATTGGAACA AAGCCA'CTAA ATTTTTAAAA ACTTTTAAAA 929 ACGATAAACA TAATCCGCGC TCCAAGTAAC ATAGCT'CTCA AAAATG 975
(2) INFORMATION FOR SEQ ID NO:50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 201 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
Met Ser Leu Leu Ala Thr Leu Leu Leu Ala Ser Cys Leu Pro Pro Lys Gly His His Ser Gly Leu Val Asn Leu Tyr Ile Ala His Gln Gly Gln Ser Val Arg Thr Tyr Trp Arg Lys Val Asp Arg Gly Val Ile Ala Lys His Asn Glu Ala Leu Lys Lys Asp Pro Lys Ala Lys Leu Lys Asp Pro Arg Gly Pro Leu Phe Met Leu Gly Ser Glu Arg Phe Met Leu Leu Trp Lys Asn Arg Tyr Ala Leu Ala Lys Pro Gln Ser Phe Arg Leu Glu Pro Gly Phe Tyr Tyr Leu Asp Ser Phe Ser Val Glu Thr Gln Lys Gly Val Leu Gln Ser Ala Pro Gly Tyr Ser Tyr Thr Lys Asn Gly Tyr Asp Phe 115 120 125 Lys Asn Asn Arg Pro Phe Phe Leu Ala Phe Glu Val Lys Pro Asp Gly 130 135 140 Lys Thr Ile Leu Pro Ser Val Glu Leu Ser Leu Ile Lys Thr Pro Arg 145 150 155 160 Gly Phe Leu Gly Val Phe Leu Phe Asp Asn Asn Glu Lys Gly Thr Asn Ala Lys Trp Ile Glu Gly Ser Leu Asn Leu Lys Leu Lys Asn Ala Ser Phe Lys Asp Ala Trp Gly Leu Glu Gln
(2) INFORMATION FOR SEQ ID NO:51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 90...1076
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
Met Ser Asn Ser Met Leu Asp Lys Asn Lys Ala Ile Leu Thr Gly Gly Gly Ala Leu Leu Leu Gly Leu Ile Val Leu Phe Tyr Leu Ala Tyr Arg Pro Lys Ala Glu Val Leu Gln Gly Phe Leu Glu Ala Arg Glu Tyr Ser Val Ser Ser Lys Val Pro Gly Arg ATT GAA AAG GTG TTT GTL' AAA AAA GGC GAT CAC ATT AAA AAG GGC GAT 305 Ile Glu Lys Val Phe Val Lys Lys Gly Asp His Ile Lys Lys Gly Asp Leu Val Phe Ser Ile Ser Ser Pro Glu Leu Glu Ala Lys Leu Ala Gln Ala Glu Ala Gly His Lys Ala Ala Lys Ala Leu Ser Asp Glu Val Lys Arg Gly Ser Arg Asp Glu Thr Ile Asn Ser Ala Arg Asp Val Trp Gln 105 110 115 120 Ala Ala Lys Ser Gln Ala Thr Leu Ala Lys Glu Thr Tyr Lys Arg Val 125 130 135 Gln Asp Leu Tyr Asp Asn Gly Val Ala Ser Leu Gln Lys Arg Asp Glu 140 145 150 Ala Tyr Ala Ala Tyr Glu Ser Thr Lys Tyr Asn Glu Ser Ala Ala Tyr Gln Lys Tyr Lys Met Ala Leu Gly Gly Ala Ser Ser Glu Ser Lys Ile Ala Ala Lys Ala Lys Glu Ser Ala Ala Leu Gly Gln Val Asn Glu Val Glu Ser Tyr Leu Lys Asp Val Lys Ala Thr Ala Pro Ile Asp Gly Glu Val Ser Asn Val Leu Leu Ser Gly Gly Glu Leu Ser Pro Lys Gly Phe Pro Val Val Leu Met Ile Asp Leu Lys Asp Ser Trp Leu Lys Ile Ser Val Pro Glu Lys Tyr Leu Asn Glu Phe Lys Val Gly Lys Glu Phe Glu Gly Tyr Ile Pro Ala Leu Lys Lys Ser Thr Lys Phe Arg Val Lys Tyr 265 270 275 280 Leu Ser Val Met Gly Asp Phe Ala Thr Trp Lys Ala Thr Asn Asn Ser Asn Thr Tyr Asp Met Lys Ser Tyr Glu Val Glu Ala Ile Pro Leu Glu Glu Leu Glu Asn Phe Arg Val Gly Met Ser Val Leu Val Thr Ile Lys CCT TAAAAAGGAT TGTTTTGTTC AGATTGATAA GCGCATGGGT 1116 Pro
(2) INFORMATION FOR SEQ ID NO:52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
Met Ser Asn Ser Met Leu Asp Lys Asn Lys Ala Ile Leu Thr Gly Gly Gly Ala Leu Leu Leu Gly Leu Ile Val Leu Phe Tyr Leu Ala Tyr Arg Pro Lys Ala Glu Val Leu Gln Gly Phe Leu Glu Ala Arg Glu Tyr Ser Val Ser Ser Lys Val Pro Gly Arg Ile Glu Lys Val Phe Val Lys Lys Gly Asp His Ile Lys Lys Gly Asp Leu Val Phe Ser Ile Ser Ser Pro Glu Leu Glu Ala Lys Leu Ala Gln Ala Glu Ala Gly His Lys Ala Ala Lys Ala Leu Ser Asp Glu Val Lys Arg Gly Ser Arg Asp Glu Thr Ile Asn Ser Ala Arg Asp Val Trp Gln Ala Ala Lys Ser Gln Ala Thr Leu 115 120 125 Ala Lys Glu Thr Tyr Lys Arg Val Gln Asp Leu Tyr Asp Asn Gly Val 130 135 140 Ala Ser Leu Gln Lys Arg Asp Glu Ala Tyr Ala Ala Tyr Glu Ser Thr 145 150 155 160 Lys Tyr Asn Glu Ser Ala Ala Tyr Gln Lys Tyr Lys Met Ala Leu Gly Gly Ala Ser Ser Glu Ser Lys Ile Ala Ala Lys Ala Lys Glu Ser Ala 180 185 190 Ala Leu Gly Gln Val Asn Glu Val Glu Ser Tyr Leu Lys Asp Val Lys 195 200 205 Ala Thr Ala Pro Ile Asp Gly Glu Val Ser Asn Val Leu Leu Ser Gly 210 215 220 Gly Glu Leu Ser Pro Lys Gly Phe Pro Val Val Leu Met Ile Asp Leu Lys Asp Ser Trp Leu Lys Ile Ser Val Pro Glu Lys Tyr Leu Asn Glu Phe Lys Val Gly Lys Glu Phe Glu Gly Tyr Ile Pro Ala Leu Lys Lys Ser Thr Lys Phe Arg Val Lys Tyr Leu Ser Val Met Gly Asp Phe Ala Thr Trp Lys Ala Thr Asn Asn Ser Asn Thr Tyr Asp Met Lys Ser Tyr Glu Val Glu Ala Ile Pro Leu Glu Glu Leu Glu Asn Phe Arg Val Gly Met Ser Val Leu Val Thr Ile Lys Pro
(2) INFORMATION FOR SEQ ID NO:53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1514 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 94...1467
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
Met Leu Glu Thr Ser Ser His Phe Leu Lys Ser Phe Arg Leu Lys Arg Tyr Ile Gly Phe Leu Leu Ile TCT TTA GCG TTA TTA ATC ACG CCC TTT GTT CGC ATT GAT GGG GCG CAT 210 Ser Leu Ala Leu Leu Ile Thr Pro Phe Val Arg Ile Asp Gly Ala His Leu Phe Leu Ile Ser Phe Glu His Lys Gln Leu His Phe Leu Gly Lys Ile Phe Ser Ala Glu Glu Leu Gln Val Met Pro Phe Met Val Ile Leu Leu Phe Ile Gly Ile Phe Phe Ile Thr Thr Ser Leu Gly Arg Val Trp Cys Gly Trp Ala Cys Pro Gln Thr Phe Leu Arg Val Leu Tyr Arg Asp Val Ile Glu Thr Lys Ile Phe Lys Leu His Lys Lys Ile Ser Asn Lys 105 110 115 Gln Glu Ser Pro Lys Asn Thr Pro Ser Tyr Lys Ile Arg Lys Val Leu 120 125 130 135
Ser Val Leu Leu Phe Ala Pro Val Val Ala Gly Leu Met Met Leu Phe Phe Phe Tyr Phe Ile Ala Pro Glu Asp Phe Phe Met Tyr Leu Lys Asn 155 160 165 Pro Ser Asp His Pro Ile Ala Met Gly Phe Trp Leu Phe Ser Thr Ala Val Val Leu Phe Asp Ile Val Val Val Ala Glu Arg Phe Cys Ile Tyr 185 190 195 Leu Cys Pro Tyr Ala Arg Val Gln Ser Val Leu Tyr Asp Asn Asp Thr 200 205 210 215 Leu Asn Pro Ile Tyr Asp Glu Lys Arg Gly Gly Ala Leu Tyr Asn Asn Gln Gly His Leu Phe Pro Leu Pro Pro Lys Lys Arg Ser Pro Glu Asn Glu Cys Val Asn Cys Leu His Cys Val Gln Val Cys Pro Thr His Ile Asp Ile Arg Lys Gly Leu Gln Leu Glu Cys Ile Asn Cys Leu Glu Cys Val Asp Ala Cys Thr Ile Thr Met Ala Lys Phe Asn Arg Pro Ser Leu Ile Gln Trp Ser Ser Thr Asn Ala Ile Asn Thr Arg Gln Lys Val His Leu Val Arg Leu Lys Thr Ile Ala Tyr Met Gly Val Ile Ala Ile Val ATC GCT CTT TTA GCC ATC ACT TCG TTT AAA AAA GAA CGC ATG CTC TTA 1122 Ile Ala Leu Leu Ala Ile Thr Ser Phe Lys Lys Glu Arg Met Leu Leu GAC ATT AAC CGC AAC AGC GAT CTG TAT GAA TTG CGC TCT AGC GGG TAT 1170 Asp Ile Asn Arg Asn Ser Asp Leu Tyr Glu Leu Arg Ser Ser Gly Tyr GTG GAT AAC GAT TAC GTG TTT TTA TTC CAC AAC ACG GAC AAT AAA GAC 1218 Val Asp Asn Asp Tyr Val Phe Leu Phe His Asn Thr Asp Asn Lys Asp His Glu Phe Tyr Phe Lys Val Leu Gly Gln Lys Asp Ile Gln Ile Lys AAG CCT TTA AAT CCT ATC GCC ATT AAA GCC GGG CAA AAG ATT AAA GCG 1314 Lys Pro Leu Asn Pro Ile Ala Ile Lys Ala Gly Gln Lys Ile Lys Ala GTA GTG ATT TTA AGA AAA CCC CTA AAG AGT AAC GCC ACA GAA TAC AAG 1362 Val Val Ile Leu Arg Lys Pro Leu Lys Ser Asn Ala Thr Glu Tyr Lys 410 415 420 AAC GCT AAA GAC GCT CTA ATC CCC ATT ACC ATA CAA GCT TAT AGC GCG 1410 Asn Ala Lys Asp Ala Leu Ile Pro Ile Thr Ile Gln Ala Tyr Ser Ala Asp Asp Lys Asn Ile Thr Ile Glu Arg Glu Ser Val Phe Ile Ala Pro 440 445 450 455 AGT GAG GAT TGAAGCCTAA AACTAGCGTT CAATCACTTC ATAAGGCAAG CCTTGTT 1514 Ser Glu Asp
(2) INFORMATION FOR SEQ ID NO:54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 458 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
Met Leu Glu Thr Ser Ser His Phe Leu Lys Ser Phe Arg Leu Lys Arg Tyr Ile Gly Phe Leu Leu Ile Ser Leu Ala Leu Leu Ile Thr Pro Phe Val Arg Ile Asp Gly Ala His Leu Phe Leu Ile Ser Phe Glu His Lys Gln Leu His Phe Leu Gly Lys Ile Phe Ser Ala Glu Glu Leu Gln Val Met Pro Phe Met Val Ile Leu Leu Phe Ile Gly Ile Phe Phe Ile Thr Thr Ser Leu Gly Arg Val Trp Cys Gly Trp Ala Cys Pro Gln Thr Phe Leu Arg Val Leu Tyr Arg Asp Val Ile Glu Thr Lys Ile Phe Lys Leu His Lys Lys Ile Ser Asn Lys Gln Glu Ser Pro Lys Asn Thr Pro Ser 115 120 125 Tyr Lys Ile Arg Lys Val Leu Ser Val Leu Leu Phe Ala Pro Val Val 130 135 140 Ala Gly Leu Met Met Leu Phe Phe Phe Tyr Phe Ile Ala Pro Glu Asp Phe Phe Met Tyr Leu Lys Asn Pro Ser Asp His Pro Ile Ala Met Gly 165 170 175 Phe Trp Leu Phe Ser Thr Ala Val Val Leu Phe Asp Ile Val Val Val 180 185 190 Ala Glu Arg Phe Cys Ile Tyr Leu Cys Pro Tyr Ala Arg Val Gln Ser Val Leu Tyr Asp Asn Asp Thr Leu Asn Pro Ile Tyr Asp Glu Lys Arg 210 215 220 Gly Gly Ala Leu Tyr Asn Asn Gln Gly His Leu Phe Pro Leu Pro Pro Lys Lys Arg Ser Pro Glu Asn Glu Cys Val Asn Cys Leu His Cys Val Gln Val Cys Pro Thr His Ile Asp Ile Arg Lys Gly Leu Gln Leu Glu Cys Ile Asn Cys Leu Glu Cys Val Asp Ala Cys Thr Ile Thr Met Ala Lys Phe Asn Arg Pro Ser Leu Ile Gln Trp Ser Ser Thr Asn Ala Ile Asn Thr Arg Gln Lys Val His Leu Val Arg Leu Lys Thr Ile Ala Tyr Met Gly Val Ile Ala Ile Val Ile Ala Leu Leu Ala Ile Thr Ser Phe Lys Lys Glu Arg Met Leu Leu Asp Ile Asn Arg Asn Ser Asp Leu Tyr Glu Leu Arg Ser Ser Gly Tyr Val Asp Asn Asp Tyr Val Phe Leu Phe His Asn Thr Asp Asn Lys Asp His Glu Phe Tyr Phe Lys Val Leu Gly Gln Lys Asp Ile Gln Ile Lys Lys Pro Leu Asn Pro Ile Ala Ile Lys Ala Gly Gln Lys Ile Lys Ala Val Val Ile Leu Arg Lys Pro Leu Lys Ser Asn Ala Thr Glu Tyr Lys Asn Ala Lys Asp Ala Leu Ile Pro Ile Thr Ile Gln Ala Tyr Ser Ala Asp Asp Lys Asn Ile Thr Ile Glu Arg Glu Ser Val Phe Ile Ala Pro Ser Glu Asp
(2) INFORMATION FOR SEQ ID NO:55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 990 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 228...782
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:
ACGATTTGAT CAATAACGAA AATAAAATTG ATGAAATCAA TAATGAAGAA AACGCTGATC 60 CTTCGCAAAA AAGAACGAAC AACGTTTTGC AACGAGCCAC TAACCACCAA GACAATCTCA 120 ATTCCCCACT CAACAGGAAG TATTAAAGTG TGAAACTTTT TTCAAAGGAT TTATTTAAAA 180 AAGTAACCCC TTTATTTTTA AGCGTTTATT TTTTAAAC'CC CACCATT ATG CAA GCC 236
Met Gln Ala Lys Ser Arg Phe Tyr Val Ala Ser Gln Tyr Gln Val Gly Lys Met Ile ATG AAA AAA TAC AAC GAT CTC AAA CGC ACG ATT GAA GGG GCG AGC TTT 332 Met Lys Lys Tyr Asn Asp Leu Lys Arg Thr Ile Glu Gly Ala Ser Phe Ser Leu Gly Trp Glu Ile Asn Pro Thr Asn Tyr Trp Phe Tyr Ser Arg Tyr Tyr Phe Phe Met Asp Tyr Gly Asn Val Ile Leu Asn Lys Arg Thr Gly Ala Gln Ala Asn Met Phe Thr Tyr Gly Phe Gly Gly Asp Leu Ile Val Glu Tyr Asn Lys Asn Pro Leu Tyr Val Phe Ser Leu Phe Tyr Gly Met Gln Val Ala Glu Asn Thr Trp Thr Ile Ser Lys His Ser Ala Asn Phe Ile Ile Asp Asp Trp Arg Ser Ile Gln Gly Phe Ser Leu Lys Thr Ser Asn Phe Arg Met Leu Gly Leu Val Gly Phe Lys Phe Gln Thr Val Leu Phe His His Asp Ala Ser Ile Glu Val Gly Ile Lys Trp Pro Phe Ala Phe Glu Tyr Asp Ser Ala Phe Val Arg Leu Phe Ser Val Phe Ile 165 170 175 Ser His Thr Phe Tyr Leu 180 185
(2) INFORMATION FOR SEQ ID NO:56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
Met Gln Ala Lys Ser Arg Phe Tyr Val Ala Ser Gln Tyr Gln Val Gly Lys Met Ile Met Lys Lys Tyr Asn Asp Leu Lys Arg Thr Ile Glu Gly Ala Ser Phe Ser Leu Gly Trp Glu Ile Asn Pro Thr Asn Tyr Trp Phe Tyr Ser Arg Tyr Tyr Phe Phe Met Asp Tyr Gly Asn Val Ile Leu Asn Lys Arg Thr Gly Ala Gln Ala Asn Met Phe Thr Tyr Gly Phe Gly Gly Asp Leu Ile Val Glu Tyr Asn Lys Asn Pro Leu Tyr Val Phe Ser Leu Phe Tyr Gly Met Gln Val Ala Glu Asn Thr Trp Thr Ile Ser Lys His Ser Ala Asn Phe Ile Ile Asp Asp Trp Arg Ser Ile Gln Gly Phe Ser Leu Lys Thr Ser Asn Phe Arg Met Leu Gly Leu Val Gly Phe Lys Phe 130 135 140 Gln Thr Val Leu Phe His His Asp Ala Ser Ile Glu Val Gly Ile Lys 145 150 155 160 Trp Pro Phe Ala Phe Glu Tyr Asp Ser Ala Phe Val Arg Leu Phe Ser Val Phe Ile Ser His Thr Phe Tyr Leu 180 185
(2) INFORMATION FOR SEQ ID NO:57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1161 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 109...1113 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
Met Leu Val Thr Arg Phe Lys Lys Ala Phe Ile Ser Tyr Ser Leu Gly Val Leu Val Ala Ser Leu Trp Leu Asn Val Cys Asn Ala Ser Ala Gln Glu Val Lys GTC AAG GAT TAT TTC GGG GAG CAA ACC ATC AAG CTT CCT GTT TCT AAA 261 Val Lys Asp Tyr Phe Gly Glu Gln Thr Ile Lys Leu Pro Val Ser Lys Ile Ala Tyr Ile Gly Ser Tyr Val Glu Val Pro Ala Met Leu Asn Val Trp Asn Arg Val Val Gly Val Ser Asp Tyr Ala Phe Lys Asp Asp Ile Val Lys Ala Thr Leu Lys Gly Glu Asp Leu Lys Arg Val Lys His Met Ser Thr Asp His Thr Ala Ala Leu Asn Val Glu Leu Leu Lys Lys Leu Ser Pro Asp Leu Val Val Thr Phe Val Gly Asn Pro Lys Ala Val Glu His Ala Lys Lys Phe Gly Ile Ser Phe Leu Ser Phe Gln Glu Thr Thr Ile Ala Glu Ala Met Gln Ala Met Gln Ala Gln Ala Thr Val Leu Glu 150 155 160 Ile Asp Ala Ser Lys Lys Phe Ala Lys Met Gln Glu Thr Leu Asp Phe 165 170 175 Ile Ala Glu Arg Leu Lys Asn Val Lys Lys Lys Lys Gly Val Glu Leu 180 185 190 195 Phe His Lys Ala Asn Lys Ile Ser Gly His Gln Ala Ile Ser Ser Asp Ile Leu Glu Lys Gly Gly Ile Asp Asn Phe Gly Leu Lys Tyr Val Lys Phe Gly Arg Ala Asp Ile Ser Val Glu Lys Ile Val Lys Glu Asn Pro Glu Ile Ile Phe Ile Trp Trp Ile Ser Pro Leu Thr Pro Glu Asp Val Leu Asn Asn Pro Lys Phe Ala Thr Ile Lys Ala Ile Lys Asn Lys Gln Val Tyr Lys Leu Pro Thr Met Asp Ile Gly Gly Pro Arg Ala Pro Leu Ile Ser Leu Phe Ile Ala Leu Lys Ala His Pro Glu Ala Phe Lys Gly GTG GAT ATT AAT GCG ATG GTT AAA GAC TAC TAT AAA GTG GTT TTT GAT 1077 Val Asp Ile Asn Ala Met Val Lys Asp Tyr Tyr Lys Val Val Phe Asp Leu Asn Asp Ala Glu Val Glu Pro Phe Leu Trp His GTTGATGTTT TTAGCCTTTC GTGTATCGCG CT 1161
(2) INFORMATION FOR SEQ ID NO:58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 335 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
Met Leu Val Thr Arg Phe Lys Lys Ala Phe Ile Ser Tyr Ser Leu Gly Val Leu Val Ala Ser Leu Trp Leu Asn Val Cys Asn Ala Ser Ala Gln Glu Val Lys Val Lys Asp Tyr Phe Gly Glu Gln Thr Ile Lys Leu Pro Val Ser Lys Ile Ala Tyr Ile Gly Ser Tyr Val Glu Val Pro Ala Met Leu Asn Val Trp Asn Arg Val Val Gly Val Ser Asp Tyr Ala Phe Lys Asp Asp Ile Val Lys Ala Thr Leu Lys Gly Glu Asp Leu Lys Arg Val Lys His Met Ser Thr Asp His Thr Ala Ala Leu Asn Val Glu Leu Leu Lys Lys Leu Ser Pro Asp Leu Val Val Thr Phe Val Gly Asn Pro Lys Ala Val Glu His Ala Lys Lys Phe Gly Ile Ser Phe Leu Ser Phe Gln 130 135 140 Glu Thr Thr Ile Ala Glu Ala Met Gln Ala Met Gln Ala Gln Ala Thr 145 150 155 160 Val Leu Glu Ile Asp Ala Ser Lys Lys Phe Ala Lys Met Gln Glu Thr Leu Asp Phe Ile Ala Glu Arg Leu Lys Asn Val Lys Lys Lys Lys Gly Val Glu Leu Phe His Lys Ala Asn Lys Ile Ser Gly His Gln Ala Ile 195 200 205 Ser Ser Asp Ile Leu Glu Lys Gly Gly Ile Asp Asn Phe Gly Leu Lys 210 215 220 Tyr Val Lys Phe Gly Arg Ala Asp Ile Ser Val Glu Lys Ile Val Lys Glu Asn Pro Glu Ile Ile Phe Ile Trp Trp Ile Ser Pro Leu Thr Pro Glu Asp Val Leu Asn Asn Pro Lys Phe Ala Thr Ile Lys Ala Ile Lys Asn Lys Gln Val Tyr Lys Leu Pro Thr Met Asp Ile Gly Gly Pro Arg Ala Pro Leu Ile Ser Leu Phe Ile Ala Leu Lys Ala His Pro Glu Ala Phe Lys Gly Val Asp Ile Asn Ala Met Val Lys Asp Tyr Tyr Lys Val Val Phe Asp Leu Asn Asp Ala Glu Val Glu Pro Phe Leu Trp His
(2) INFORMATION FOR SEQ ID NO:59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 800 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 121...669 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
TGAAATCAAA CAAAGCCAAA AAGAAAAGAA AAAATTCCCC ACTTTCAAAG GAGGTTTTTA 120 Met Arg Trp Trp Cys Phe Leu Val Cys Cys Phe Gly Ile Leu Ser Val Met Asp Ala Lys Lys Leu Glu Asn Lys Asn Leu Lys Lys Glu Arg Glu Leu Leu Glu Ile Thr Gly Asn Gln Phe Val Ala Asn Asp Lys Thr Lys Thr Ala Val Ile Gln Gly Asn Val Gln Ile Lys Lys Gly Lys Asp Arg Leu Phe Ala Asp Lys Val Ser Val Phe Leu Asn Asp Lys Arg Lys Pro Glu Arg Tyr Glu Ala Thr Gly Asn Thr His Phe Asn Ile Phe Thr Glu GAC AAT CGT GAA ATC AGC GGG AGT GCT GAC AAG CTC ATT TAT AAC GCG 456 Asp Asn Arg Glu Ile Ser Gly Ser Ala Asp Lys Leu Ile Tyr Asn Ala Leu Asn Gly Glu Tyr Lys Leu Leu Gln Asn Ala Val Val Arg Glu Val GGG AAA TCC AAT GTC ATC ACC GGC GAT GAA ATC ATT TTA AAC AAA ACT 552 Gly Lys Ser Asn Val Ile Thr Gly Asp Glu Ile Ile Leu Asn Lys Thr 130 135 140 Lys Gly Tyr Ala Asp Val Leu Gly Ser Ala Lys Arg Pro Ala Lys Phe 145 150 155 160 Val Phe Asp Met Glu Asp Ile Asn Glu Glu Asn Arg Lys Ala Lys Leu 165 170 175 Lys Lys Lys Gly Glu Lys Pro
(2) INFORMATION FOR SEQ ID NO:60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 183 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
Met Arg Trp Trp Cys Phe Leu Val Cys Cys Phe Gly Ile Leu Ser Val Met Asp Ala Lys Lys Leu Glu Asn Lys Asn Leu Lys Lys Glu Arg Glu Leu Leu Glu Ile Thr Gly Asn Gln Phe Val Ala Asn Asp Lys Thr Lys Thr Ala Val Ile Gln Gly Asn Val Gln Ile Lys Lys Gly Lys Asp Arg Leu Phe Ala Asp Lys Val Ser Val Phe Leu Asn Asp Lys Arg Lys Pro Glu Arg Tyr Glu Ala Thr Gly Asn Thr His Phe Asn Ile Phe Thr Glu Asp Asn Arg Glu Ile Ser Gly Ser Ala Asp Lys Leu Ile Tyr Asn Ala 100 105 110 Leu Asn Gly Glu Tyr Lys Leu Leu Gln Asn Ala Val Val Arg Glu Val 115 120 125 Gly Lys Ser Asn Val Ile Thr Gly Asp Glu Ile Ile Leu Asn Lys Thr 130 135 140 Lys Gly Tyr Ala Asp Val Leu Gly Ser Ala Lys Arg Pro Ala Lys Phe 145 150 155 160 Val Phe Asp Met Glu Asp Ile Asn Glu Glu Asn Arg Lys Ala Lys Leu 165 170 175 Lys Lys Lys Gly Glu Lys Pro
(2) INFORMATION FOR SEQ ID NO:61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 724 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 88...618 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:
Met Lys Leu Ile Lys Phe Val Arg Asn Val Val Leu Phe Ile Leu Thr Ala Ile Phe Leu Ala Phe Met Leu Leu Val Ser Tyr Cys Met Pro His Tyr Ser Ala Ala Val Ile Ser Gly Val Glu Val Lys Arg Met Asn Glu Asn Glu Asn Thr Pro Asn Asn Lys Glu Val Lys Thr Leu Ala Arg Asp Val Tyr Phe Val Gln Thr Tyr Asp Pro Lys Asp Gln Lys Ser Val Thr Val Tyr Arg Asn Glu Asp Thr Arg Phe Ser Phe Pro Phe Tyr Phe Lys Phe Asn Ser Ala Asp Ile Ser Ala Leu Ala Gln Ser Leu Ile Asn Gln Gln Val Glu Val Lys Tyr Tyr Gly Trp Arg Ile Asn Leu Phe Asn Met Phe Pro Asn Val Ile Phe Leu Lys Pro Leu Lys Glu Ser Thr Asp Ile Ser Lys Pro Ile Phe Ser Trp Ile Leu Tyr Ala Leu Leu Leu Met Gly Phe Phe Ile Ser Ala Arg Ser Val Cys 155 160 165 ACT TTA TTT AAG AGC AAA GCT CAT TAAAACTTTT AGGCTTTGTT GGAAAATCAC 648 Thr Leu Phe Lys Ser Lys Ala His AATGGGGTTA TTGGAGCGTG TATTAAAAAG CTCAATAT.?~G GGCAAGCTGA TGCTGTGAAA 708
(2) INFORMATION FOR SEQ ID NO:62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 177 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
Met Lys Leu Ile Lys Phe Val Arg Asn Val Val Leu Phe Ile Leu Thr Ala Ile Phe Leu Ala Phe Met Leu Leu Val Ser Tyr Cys Met Pro His Tyr Ser Ala Ala Val Ile Ser Gly Val Glu Val Lys Arg Met Asn Glu Asn Glu Asn Thr Pro Asn Asn Lys Glu Val Lys Thr Leu Ala Arg Asp Val Tyr Phe Val Gln Thr Tyr Asp Pro Lys Asp Gln Lys Ser Val Thr Val Tyr Arg Asn Glu Asp Thr Arg Phe Ser Phe Pro Phe Tyr Phe Lys Phe Asn Ser Ala Asp Ile Ser Ala Leu Ala Gln Ser Leu Ile Asn Gln 100 105 110 Gln Val Glu Val Lys Tyr Tyr Gly Trp Arg Ile Asn Leu Phe Asn Met Phe Pro Asn Val Ile Phe Leu Lys Pro Leu Lys Glu Ser Thr Asp Ile Ser Lys Pro Ile Phe Ser Trp Ile Leu Tyr Ala Leu Leu Leu Met Gly 145 150 155 160 Phe Phe Ile Ser Ala Arg Ser Val Cys Thr Leu Phe Lys Ser Lys Ala His
(2) INFORMATION FOR SEQ ID NO:63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 982 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 117...911 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 117...167 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 168...911 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
TGGTTAAAAA GGACACAATA AACCCCAAAA ATGAAATTTA AATATATGGG AACTTA ATG l19 Met Arg Ile Phe Phe Val Ile Met Gly Leu Val Phe Phe Gly Cys Thr Ser Lys Val His Glu Met Lys Lys Ser Pro Cys Thr Leu Tyr Glu Asn Arg Leu Asn Leu Ala Glu Ile Phe His Lys Arg Ala Ile Asp Leu Phe Arg Glu Leu Leu Ser His Gln Glu Lys His Leu Glu Asn Lys Leu Ser Gly Phe SerValSer AspLeuAsp MetGlnSerValPhe ArgLeuGlu Arg Asn ArgLeuLys IleAlaTyr LysLeuLeuGlyLeu MetSerPhe Ile Ala LeuIleLeu AlaIleVal LeuI1eSerLeuLeu ProLeuGln Lys Thr GluHisHis PheValAsp PheLeuAsnGlnAsp LysHisTyr Val 100 105 1l0 Ile IleGlnArg AlaAspLys SerIleSerSerAsn GluAlaLeu Ala Arg SerLeuIle GlyAlaTyr ValLeuAsnArgGlu SerIleAsn Arg 130 135 l40 ATT GACGATAAA TCGCGCTAT GAATTGGTGC~~CTTG CAAAGCAGT TCT 647 Ile AspAspLys SerArgTyr GluLeuValArgLeu GlnSerSer Ser Lys ValTrpGln ArgPheGlu AspLeuIleL-ysThr GlnAsnSer Ile l65 170 175 TAT GTGCAAAGC CATTTGGAA AGAGAAGTCC.ATATC GTCAATATT GCG 743 Tyr ValGlnSer HisLeuGlu ArgGluValHisIle ValAsnIle Ala Ile TyrGlnGln AspAsnAsn ProIleAlaSe Val SerIleAla Ala r AAA CTTTTGAAT GAAAACAAG CTGGTGTATG:~AAAG CGTTATAAA ATC 839 Lys LeuLeuAsn GluAsnLys LeuValTyrG.LuLys ArgTyrLys Ile GTA TTGAGTTAT TTGTTTGAC ACCCCGATGAi'~TTCA AGCTTGCAA GCT 887 Val LeuSerTyr LeuPheAsp ThrProMetA;>nSer SerLeuGln Ala _ TGC AAGCTCTCA GGCTTCATA GTTTGACATGACi~ 941 TATAGATGAG
CTTTATGCGG
Cys LysLeuSer GlyPheIle Val ACAGAATGGC
TAACGCAGCA
GGCACCGAGT
(2) INFORMATION FOR SEQ ID NO:64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
Met Arg Ile Phe Phe Val Ile Met Gly Leu Val Phe Phe Gly Cys Thr Ser Lys Val His Glu Met Lys Lys Ser Pro Cys Thr Leu Tyr Glu Asn Arg Leu Asn Leu Ala Glu Ile Phe His Lys Arg Ala Ile Asp Leu Phe Arg Glu Leu Leu Ser His Gln Glu Lys His Leu Glu Asn Lys Leu Ser Gly Phe Ser Val Ser Asp Leu Asp Met Gln Ser Val Phe Arg Leu Glu Arg Asn Arg Leu Lys Ile Ala Tyr Lys Leu Leu Gly Leu Met Ser Phe Ile Ala Leu Ile Leu Ala Ile Val Leu Ile Ser Leu Leu Pro Leu Gln Lys Thr Glu His His Phe Val Asp Phe Leu Asn Gln Asp Lys His Tyr 100 105 110 Val Ile Ile Gln Arg Ala Asp Lys Ser Ile Ser Ser Asn Glu Ala Leu 115 120 125 Ala Arg Ser Leu Ile Gly Ala Tyr Val Leu Asn Arg Glu Ser Ile Asn Arg Ile Asp Asp Lys Ser Arg Tyr Glu Leu Val Arg Leu Gln Ser Ser Ser Lys Val Trp Gln Arg Phe Glu Asp Leu Ile Lys Thr Gln Asn Ser Ile Tyr Val Gln Ser His Leu Glu Arg Glu Val His Ile Val Asn Ile 180 185 190 Ala Ile Tyr Gln Gln Asp Asn Asn Pro Ile Ala Ser Val Ser Ile Ala Ala Lys Leu Leu Asn Glu Asn Lys Leu Val Tyr Glu Lys Arg Tyr Lys 210 215 220 Ile Val Leu Ser Tyr Leu Phe Asp Thr Pro Met Asn Ser Ser Leu Gln Ala Cys Lys Leu Ser Gly Phe Ile Val
(2) INFORMATION FOR SEQ ID NO:65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2059 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 183...1961 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
GTTTTTTAAA
TACTTTGGCT
AGTCATTTTGATTTCTAAAA ATAGTCTATA ATGCTCG('_AA TTAAGGTTAT180 GAGATATTTT
CA ATG GCT ATA AAA ATA CTT TTT ATA A'..~G TTA AAC 227 AAA ACA CTC AGT
Met Lys Ala Ile Lys Ile Leu Phe Ile Me't Leu Asn Thr Leu Ser GCT ATC AGC GTG AAT AGG GCG TTG TTT GAT 7:'TA AAA GAT TCG CAA TTA 275 Ala Ile Ser Val Asn Arg Ala Leu Phe Asp Leu Lys Asp Ser Gln Leu AAA GGG GAA TTA ACG CCA AAA ATA GTG AAT 7.'TT GGG GGT TAT AAA AGC 323 Lys Gly Glu Leu Thr Pro Lys Ile Val Asn F>he Gly Gly Tyr Lys Ser AGC ACT GAA GAG TGG GGG GCT ACG GCT TTA dIAC TAT ATC AAT GCG GCT 371 Ser Thr Glu Glu Trp Gly Ala Thr Ala Leu Asn Tyr Ile Asn Ala Ala AAT GGC GAT GCG AAA AAA TTC AGC ACT CTA CyTG GAA AAA ATG CGT TTT 419 Asn Gly Asp Ala Lys Lys Phe Ser Thr Leu Val Glu Lys Met Arg Phe AAC TCC GGT ATA TTG GGG AAT TTA AGA GTG C:AT GCA CGT TTG AGG CAA 467 Asn Ser G1y Ile Leu Gly Asn Leu Arg Val His Ala Arg Leu Arg Gln 80 85 ~)0 95 GCC CTA AAA TTG CAA AAG AAT TTG AAA TAT 7.'GC CTT AAA ATC ATC GCT 515 Ala Leu Lys Leu Gln Lys Asn Leu Lys Tyr C'ys Leu Lys Ile Ile Ala Arg Asp Ser Phe Tyr Ser Tyr Arg Thr Gly l:le Tyr Ile Pro Leu Gly 115 l20 125 Ile Ser Leu Lys Asp Gln Lys Thr Ala Gln Lys Met Leu Ala Asp Leu AGC GTG GTA GGG GCG TAT CTT AAA AAA CAA C:AA GAG AAT GAA AAG GCT 659 Ser Val Val Gly Ala Tyr Leu Lys Lys Gln C:ln Glu Asn Glu Lys Ala Gln Ser Pro Tyr Tyr Arg Asn Asn Asn Tyr Tyr Asn Ser Tyr Tyr Ser 160 165 l70 175 Pro Tyr Tyr Gly Met Tyr Gly Met Tyr Gly Met Gly Met Tyr Gly Met Tyr Gly Met Gly Met Tyr Asp Phe Tyr Asp Phe Tyr Asp Gly Met Tyr GGA TTC TAC CCT AAC ATG TTT TTC ATG ATG CAA GTT CAA GAT TAC TTG 85l Gly Phe Tyr Pro Asn Met Phe Phe Met Met Gln Val Gln Asp Tyr Leu 210 2l5 220 Met Leu Glu Asn Tyr Met Tyr Ala Leu Asp Gln Glu Glu Ile Leu Asp His Asp Ala 5er Thr Asp Gln Leu Asp Thr Pro Thr Asp Asp Asp Lys Asp Asp Lys Asp Asp Lys Ser Leu Gln Gln Ala Asn Leu Met Asn Phe Tyr Arg Asp Pro Lys Phe Ser Lys Gly Ile Gln Thr Asn Arg Leu Asn Ser Ala Leu Val Asn Leu Asp Asn 5er Arg Met Leu Lys Asp Asn Ser Leu Phe His Thr Lys Ala Met Pro Thr Lys Ser Val Asp Ala Ile Thr Ser Gln Ala Lys Glu Leu Asn His Leu Val Gly Gln Ile Lys Glu Met Lys Gln Asp Gly Ala Ser Pro Ser Lys Ile Asp Ser 
Val Val Asn Lys Ala Met Glu Val Arg Asp Lys Leu Asp Asn Asn Leu Asn Gln Leu Asp AAT GAC TTA AAA GAT CAA AAA GGG CTT TCA AGC GAG CAA CAA GCT CAA 133l Asn Asp Leu Lys Asp Gln Lys Gly Leu Ser Ser Glu Gln Gln Ala Gln GTG GATAAAGCC CTAGACAGCGTG CAACAA'TTAAGC CATAGCAGCGAT 1379 Val AspLysAla LeuAspSerVal GlnGln:LeuSer HisSerSerAsp GTG GTGGGGAAT TATTTAGACGGG AGTTTG.AAAATT GATGGCGATGAT 1427 Val ValGlyAsn TyrLeuAspGly SerLeu:LysI1e AspGlyAspAsp Arg AspAspLeu AsnAspAlaMet AsnAsnProMet GlnGlnProVal CAA CAAACGCCT ACTAGCAACATG GCCGAC.ACCCAT GCAAATGACAGC 1523 Gln GlnThrPro ThrSerAsnMet AlaAsp'rhrHis AlaAsnAspSer AAG GATCAAGGG AGTAACGCGCTC ATAAACCCTAAC AGCGCCACTAAC 157l Lys AspGlnGly SerAsnAlaLeu IleAsnProAsn SerAlaThrAsn GCC GACGACACT CACACTGACGAT ACTCAC.ACTGAC ACTAACACCACA 1619 Ala AspAspThr HisThrAspAsp ThrHis'ThrAsp ThrAsnThrThr Asn AspAlaSer ThrThrAspThr ProThr.AspAsp LysAspAlaSer GGC TTGAACAAT ACCGGCGATA'rGAATAAC.ACGGAT ACCGGCAACACG 1715 Gly LeuAsnAsn ThrGlyAspMet AsnAsn'rhrAsp ThrGlyAsnThr Asp ThrGlyAsn ThrAspThrGly AsnThr,AspAsp MetSerAsnMet AAC AACGGCAAC GATGATACGGGT AACGCT.AATGAC GACATGAGCAAC 1811 Asn AsnGlyAsn AspAspThrGly AsnAla.AsnAsp AspMetSerAsn Gly AsnAspMet GlyAspAspLeu AsnAsn.AlaAsn AspMetAsnAsp Asp MetGlyAsn GlyAsnAspAsp MetGly.AspMet GlyAspMetAsn Asp AspMetGly GlyAspMetGly AspMetGlyAsp MetGlyAspMet ' GGG AATTGAGATTAAC CCCAATATCA TAAGGAATAT
AAGAGTGATA TT
GCCAAAACTT
Gly Asn (2) INFORMATION FOR SEQ ID N0:66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 593 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
Met Lys Ala Ile Lys Ile Leu Phe Ile Met Thr Leu Ser Leu Asn Ala Ile Ser Val Asn Arg Ala Leu Phe Asp Leu Lys Asp Ser Gln Leu Lys Gly Glu Leu Thr Pro Lys Ile Val Asn Phe Gly Gly Tyr Lys Ser Ser Thr Glu Glu Trp Gly Ala Thr Ala Leu Asn Tyr Ile Asn Ala Ala Asn Gly Asp Ala Lys Lys Phe Ser Thr Leu Val Glu Lys Met Arg Phe Asn Ser Gly Ile Leu Gly Asn Leu Arg Val His Ala Arg Leu Arg Gln Ala Leu Lys Leu Gln Lys Asn Leu Lys Tyr Cys Leu Lys Ile Ile Ala Arg l00 105 110 Asp Ser Phe Tyr Ser Tyr Arg Thr Gly Ile Tyr Ile Pro Leu Gly Ile l15 120 l25 Ser Leu Lys Asp Gln Lys Thr Ala Gln Lys Met Leu Ala Asp Leu Ser Val Val Gly Ala Tyr Leu Lys Lys Gln Gln Glu Asn Glu Lys Ala Gln 145 150 155 l60 Ser Pro Tyr Tyr Arg Asn Asn Asn Tyr Tyr Asn Ser Tyr Tyr Ser Pro Tyr Tyr Gly Met Tyr Gly Met Tyr Gly Met Gly Met Tyr Gly Met Tyr l80 185 190 Gly Met Gly Met Tyr Asp Phe Tyr Asp Phe Tyr Asp Gly Met Tyr Gly Phe Tyr Pro Asn Met Phe Phe Met Met Gln Val Gln Asp Tyr Leu Met 210 2l5 220 Leu Glu Asn Tyr Met Tyr Ala Leu Asp Gln Glu Glu Ile Leu Asp His Asp Ala Ser Thr Asp Gln Leu Asp Thr Pro Thr Asp Asp Asp Lys Asp Asp Lys Asp Asp Lys Ser Leu Gln Gln Ala Asn Leu Met Asn Phe Tyr Arg Asp Pro Lys Phe Ser Lys Gly Ile Gln Thr Asn Arg Leu Asn Ser Ala Leu Val Asn Leu Asp Asn Ser Arg Met Leu Lys Asp Asn Ser Leu Phe His Thr Lys Ala Met Pro Thr Lys Ser Val Asp Ala Ile Thr Ser Gln Ala Lys Glu Leu Asn His Leu Val Gly Gln Ile Lys Glu Met Lys Gln Asp Gly Ala Ser Pro Ser Lys Ile Asp Ser Val Val Asn Lys Ala Met Glu Val Arg Asp Lys Leu Asp Asn Asn Leu Asn Gln Leu Asp Asn Asp Leu Lys Asp Gln Lys Gly Leu Ser Ser Glu Gln Gln Ala Gln Val Asp Lys Ala Leu Asp Ser Val Gln Gln Leu Ser His Ser Ser Asp Val Val Gly Asn Tyr Leu Asp Gly Ser Leu Lys Ile Asp Gly Asp Asp Arg 405 4l0 415 Asp Asp Leu Asn Asp Ala Met Asn Asn Pro Met Gln Gln Pro Val Gln Gln Thr Pro Thr Ser Asn Met Ala Asp Thr His Ala Asn Asp Ser Lys Asp Gln Gly Ser Asn Ala Leu Ile Asn Pro Asn Ser Ala Thr Asn Ala Asp Asp Thr His Thr Asp Asp Thr His Thr Asp Thr Asn Thr Thr Asn Asp 
Ala Ser Thr Thr Asp Thr Pro Thr Asp Asp Lys Asp Ala Ser Gly Leu Asn Asn Thr Gly Asp Met Asn Asn Thr Asp Thr Gly Asn Thr Asp 500 505 5l0 Thr Gly Asn Thr Asp Thr Gly Asn Thr Asp Asp Met Ser Asn Met Asn Asn Gly Asn Asp Asp Thr Gly Asn Ala Asn Asp Asp Met Ser Asn Gly Asn Asp Met Gly Asp Asp Leu Asn Asn Ala Asn Asp Met Asn Asp Asp Met Gly Asn Gly Asn Asp Asp Met Gly Asp Met Gly Asp Met Asn Asp Asp Met Gly Gly Asp Met Gly Asp Met Gly Asp Met Gly Asp Met Gly Asn (2) INFORMATION FOR SEQ ID N0:67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1527 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 112...1461 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
Met Ser Met Glu Phe Asp Ala Val Ile Ile Gly Gly Gly Val Ser Gly Cys Ala Thr Phe Tyr Thr Leu Ser Glu Tyr Ser Ser Leu Lys Arg Val Ala Ile Val Glu Lys Cys Ser Lys Leu Ala Gln Ile Ser Ser Ser Ala Lys Ala Asn Ser Gln Thr Ile His Asp Gly Ser Ile Glu Thr Asn Tyr Thr Pro Glu Lys Ala Lys Lys Val Arg Leu Ser Ala Tyr Lys Thr Arg Gln Tyr Ala Leu Asn Lys Gly Leu Gln Asn Glu Val Ile Phe Glu Thr Gln Lys Met Ala Ile Gly Val Gly Asp Glu Glu Cys Glu Phe Met Lys Lys Arg l00 10S 110 Tyr Glu Ser Phe Lys Glu Ile Phe Val Gly Leu Glu Glu Phe Asp Lys l15 120 125 130 Gln Lys Ile Lys Glu Leu Glu Pro Asn Val Ile Leu Gly Ala Asn Gly Ile Asp Arg His Glu Asn Ile Ile Gly His Gly Tyr Arg Lys Asp Trp Ser Thr Met Asn Phe Ala Lys Leu Ser Glu Asn Phe Val Glu Glu Ala 165 170 l75 Leu Lys Leu Lys Pro Asn Asn Gln Val Phe Leu Asn Phe Lys Val Lys 180 185 l90 Lys Ile Glu Lys Arg Asn Asp Thr Tyr Ala Val Ile Ser Glu Asp Ala GAA GAA GTG TAT GCT AAA TTC GTG CTG GTC .AAT GCC GGC TCT TAC GCT 7B9 Glu Glu Val Tyr Ala Lys Phe Val Leu Val .Asn Ala Gly Ser Tyr Ala TTG CCT TTG GCT CAG AGC ATG GGC TAT GGC ~~TA GAT TTA GGG TGC TTG 837 Leu Pro Leu Ala Gln Ser Met Gly Tyr Gly :Leu Asp Leu Gly Cys Leu Pro Val Ala Gly Ser Phe Tyr Phe Val Pro ,asp Leu Leu Arg Gly Lys GTT TAT ACC GTT CAA AAC CCC AAA CTC CCT 'TTT GCA GCC GTG CAT GGC 933 Val Tyr Thr Val Gln Asn Pro Lys Leu Pro :Phe Ala Ala Val His Gly Asp Pro Asp Ala Val Ile Lys Gly Lys Thr Arg Ile Gly Pro Thr Ala 275 280 285 ~ 290 TTA ACG ATG CCT AAA TTA GAA CGC AAC AAA 'TGT TGG CTT AAG GGC ATT 1029 Leu Thr Met Pro Lys Leu Glu Arg Asn Lys Cys Trp Leu Lys Gly Ile AGC TTG GAA TTG TTG AAA ATG GAT TTG AAT i~AA GAT GTG TTT AAA ATT 1077 Ser Leu Glu Leu Leu Lys Met Asp Leu Asn Lys Asp Val Phe Lys Ile Ala Phe Asp Leu Met Ser Asp Lys Glu Ile Arg Asn Tyr Val Phe Lys Asn Met Val Phe Glu Leu Pro Ile Ile Gly Lys Arg Lys Phe Leu Lys GAC GCT CAA AAA ATC ATC CCC TCT CTT AGC CTA GAA GAT CTA GAA TAC 122l Asp Ala Gln Lys Ile Ile Pro Ser Leu Ser Leu Glu Asp Leu Glu Tyr Ala His Gly Phe Gly Glu Val 
Arg Pro Gln Val Leu Asp Arg Thr Lys Arg Lys Leu Glu Leu Gly Glu Lys Lys Ile Cys Thr His Lys Gly Ile Thr Phe Asn Met Thr Pro Ser Pro Gly Ala Thr Ser Cys Leu Gln Asn Ala Leu Val Asp Ser Gln Glu Ile Ala Ala Tyr Leu Gly Glu Ser Phe Glu Leu Glu Arg Phe Tyr Lys Asp Leu Ser Pro Glu Glu Leu Glu Asn (2) INFORMATION FOR SEQ ID N0:68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 450 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
Met Ser Met Glu Phe Asp Ala Val Ile Ile Gly Gly Gly Val Ser Gly Cys Ala Thr Phe Tyr Thr Leu Ser Glu Tyr Ser Ser Leu Lys Arg Val Ala Ile Val Glu Lys Cys Ser Lys Leu Ala Gln Ile Ser Ser Ser Ala Lys Ala Asn Ser Gln Thr Ile His Asp Gly Ser Ile Glu Thr Asn Tyr Thr Pro Glu Lys Ala Lys Lys Val Arg Leu Ser Ala Tyr Lys Thr Arg Gln Tyr Ala Leu Asn Lys Gly Leu Gln Asn Glu Val Ile Phe Glu Thr Gln Lys Met Ala Ile Gly Val Gly Asp Glu Glu Cys Glu Phe Met Lys Lys Arg Tyr Glu Ser Phe Lys Glu Ile Phe Val Gly Leu Glu Glu Phe 115 120 125 Asp Lys Gln Lys Ile Lys Glu Leu Glu Pro Asn Val Ile Leu Gly Ala Asn Gly Ile Asp Arg His Glu Asn Ile Ile Gly His Gly Tyr Arg Lys Asp Trp Ser Thr Met Asn Phe Ala Lys Leu Ser Glu Asn Phe Val Glu Glu Ala Leu Lys Leu Lys Pro Asn Asn Gln Val Phe Leu Asn Phe Lys 180 185 190 Val Lys Lys Ile Glu Lys Arg Asn Asp Thr Tyr Ala Val Ile Ser Glu 195 200 205 Asp Ala Glu Glu Val Tyr Ala Lys Phe Val Leu Val Asn Ala Gly Ser 210 215 220 Tyr Ala Leu Pro Leu Ala Gln Ser Met Gly Tyr Gly Leu Asp Leu Gly Cys Leu Pro Val Ala Gly Ser Phe Tyr Phe Val Pro Asp Leu Leu Arg Gly Lys Val Tyr Thr Val Gln Asn Pro Lys Leu Pro Phe Ala Ala Val His Gly Asp Pro Asp Ala Val Ile Lys Gly Lys Thr Arg Ile Gly Pro Thr Ala Leu Thr Met Pro Lys Leu Glu Arg Asn Lys Cys Trp Leu Lys Gly Ile Ser Leu Glu Leu Leu Lys Met Asp Leu Asn Lys Asp Val Phe Lys Ile Ala Phe Asp Leu Met Ser Asp Lys Glu Ile Arg Asn Tyr Val Phe Lys Asn Met Val Phe Glu Leu Pro Ile Ile Gly Lys Arg Lys Phe Leu Lys Asp Ala Gln Lys Ile Ile Pro Ser Leu Ser Leu Glu Asp Leu Glu Tyr Ala His Gly Phe Gly Glu Val Arg Pro Gln Val Leu Asp Arg Thr Lys Arg Lys Leu Glu Leu Gly Glu Lys Lys Ile Cys Thr His Lys Gly Ile Thr Phe Asn Met Thr Pro Ser Pro Gly Ala Thr Ser Cys Leu 405 410 415 Gln Asn Ala Leu Val Asp Ser Gln Glu Ile Ala Ala Tyr Leu Gly Glu 420 425 430 Ser Phe Glu Leu Glu Arg Phe Tyr Lys Asp Leu Ser Pro Glu Glu Leu Glu Asn
(2) INFORMATION FOR SEQ ID NO:69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 653 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 63...590 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
Met Pro Lys Pro Lys Lys Asn Thr Leu Pro Cys Ser Leu Ser Val Lys Met Ser Tyr Phe Met Arg Phe Leu Ile Lys Trp Arg Thr Arg Ser Leu Ser His Lys Met Met Thr Leu Ile Gln Ile Leu Ser Ile Leu Ala TTA GCG AGC AAG GCC AGT GAA GAT TTA GAA GAG CAA CTC AAA AAA ATC 251 Leu Ala Ser Lys Ala Ser Glu Asp Leu Glu Glu Gln Leu Lys Lys Ile Lys Asp Tyr Ile Tyr Arg Thr Leu Asn Ala Lys Ile Ala Ser Asp Val Tyr Asn Arg Val Leu Ile Leu Val Asn Glu Tyr Cys Thr Asn Glu Glu 80 85 90 95 Leu Phe Asp Lys Glu Ser Val Lys Ile Ser Asp Leu Leu Ile Gln Asp Ile Gln Leu Tyr Ala Leu Val Asp Glu Met Leu Lys Glu Asp Lys Tyr Gln Val Gln His Thr Ile Leu Lys Gly Ile Ile Lys Arg Lys Tyr Asp 130 135 140 Glu Ala Tyr Ser Leu Asn Ser Glu Asp Arg Ile Leu Leu Glu Tyr Gln 145 150 155 Glu Arg Leu Leu Glu His Ser His Ala Ser Phe Ser Asn Lys Lys Phe 160 165 170 175 Lys
(2) INFORMATION FOR SEQ ID NO:70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 176 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:
Met Pro Lys Pro Lys Lys Asn Thr Leu Pro Cys Ser Leu Ser Val Lys Met Ser Tyr Phe Met Arg Phe Leu Ile Lys Trp Arg Thr Arg Ser Leu Ser His Lys Met Met Thr Leu Ile Gln Ile Leu Ser Ile Leu Ala Leu Ala Ser Lys Ala Ser Glu Asp Leu Glu Glu Gln Leu Lys Lys Ile Lys Asp Tyr Ile Tyr Arg Thr Leu Asn Ala Lys Ile Ala Ser Asp Val Tyr Asn Arg Val Leu Ile Leu Val Asn Glu Tyr Cys Thr Asn Glu Glu Leu Phe Asp Lys Glu Ser Val Lys Ile Ser Asp Leu Leu Ile Gln Asp Ile Gln Leu Tyr Ala Leu Val Asp Glu Met Leu Lys Glu Asp Lys Tyr Gln 115 120 125 Val Gln His Thr Ile Leu Lys Gly Ile Ile Lys Arg Lys Tyr Asp Glu Ala Tyr Ser Leu Asn Ser Glu Asp Arg Ile Leu Leu Glu Tyr Gln Glu 145 150 155 160 Arg Leu Leu Glu His Ser His Ala Ser Phe Ser Asn Lys Lys Phe Lys
(2) INFORMATION FOR SEQ ID NO:71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1883 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 91...1833 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
AAGCGTTAAA TTCCAATCAA AAACCATCGT ATCGGTG'CTA ATATTGTGTA AAAATTAATG 60 Met Ly:~ Lys Leu Val Leu Val Ile Phe Leu Thr Leu Ala Leu Ser Ile Ser Ala Lys Glu Val Lys Ile Val Phe Leu Glu Thr Ser Asp I1e His Gly Arg heu Phe Ser Tyr Asp Tyr GCG ATT GGC GAG CAA AAA CCC AAT AAC GGC '.CTG ACA AGG ATT GCG ACT 258 Ala Ile Gly Glu Gln Lys Pro Asn Asn Gly heu Thr Arg Ile Ala Thr Leu Ile Lys Lys Gln Arg Ala Glu Asn Lys Asn Val Val Leu Ile Asp Ser Gly Asp Leu Leu Gln Gly Asn Ser Ala Glu Leu Phe Asn Asp Glu Pro Ile His Pro Leu Val Arg Ala Glu Asn Asp Leu Lys Phe Asp Ile Arg Val Leu Gly Asn His Glu Phe Asn Phe Ser Lys Asp Phe Leu Glu 105 110 l15 120 Lys Asn Ile Lys Gly Phe Asn Gly Asp Val Met Asn Ala Asn Ile Ile _. - Lys Ile Ala Asp Asn Lys Pro Phe Val Lys Pro Tyr Ile Ile Lys Lys l40 145 150 Ile Asp Gly Val Arg Val Ala Val Val Gly Tyr Val Val Ala His Ile Pro Thr Trp Glu Ala Ser Thr Pro Glu His Phe Ala Gly Leu Lys Phe 170 l75 l80 Leu Asp Ala Glu Glu Ala Leu Lys Lys Thr Leu Lys Glu Leu Lys Gly Lys Tyr Asp Ile Leu Ile Gly Ala Phe His Leu Gly Arg Glu Asp Glu 205 210 2l5 Lys Gly Gly Asp Gly Ile Pro Asp Leu Ala Lys Lys Phe Pro Gln Phe Asp Ile Ile Phe Ala Gly His Glu His Ala Val Tyr Asn Thr Lys Val Gly Lys Val His Thr Ile Glu Pro Gly Ala Tyr Gly Ala Tyr Leu Ala Lys Gly Val Val Val Phe Asp Thr Lys Thr Lys Lys Lys Ile Ile Thr -17s-Thr Glu Asn Leu Pro Thr Lys Asp Val Pro Glu Asp Glu Glu Leu Ala ' Lys Lys Tyr Glu Tyr Val Asp Lys Lys Ser :Lys Glu Tyr Ala Asn Glu 300 305 3l0 Val Val Gly Glu Val Thr Lys Thr Phe Ile e~sp Arg Pro Asp Phe Ile Thr Gly Glu Glu Lys Ile Thr Thr Met Pro 'Phr Ala Ala Leu Gln G1u Thr Pro Val Ile Glu Leu Ile Asn Lys Val G1n Lys Tyr Tyr Ala Lys 345 350 :355 360 GCC GAT GTT TCA GCG GCA GCC TTA TTC AAT 'CTT GGG GCG AAT TTG AAA 1218 Ala Asp Val Ser Ala Ala Ala Leu Phe Asn 1?he Gly Ala Asn Leu Lys AAA GGG CCT TTC AAA AGA AAA GAT GTC ACT 'CAT ATT TAC AAG TTC GCT 1266 Lys Gly Pro Phe Lys Arg Lys Asp Val Thr 'Cyr Ile Tyr Lys Phe Ala AAT ACG CTC ATT GGA GTG CGT ATA ACG GGT (3AA AAT CTG 
TTG AAA TAC 1314 _ Asn Thr Leu Ile Gly Val Arg Ile Thr Gly Glu Asn Leu Leu Lys Tyr ATG GAA TGG TCA TAC CGA TTT TAC AAT CAG 'CTG CAA CCA GGA GAT TTG 1362 Met Glu Trp Ser Tyr Arg Phe Tyr Asn Gln Leu Gln Pro Gly Asp Leu ACG ATC AGT TTT AAT GAA AAC ATT CGC GGC '.CAT AAC TTT GAT ATG TTT 1410 Thr Ile Ser Phe Asn Glu Asn Ile Arg Gly '~yr Asn Phe Asp Met Phe Ser Gly Val Lys Tyr Gln Val Asp Val Thr Lys Pro Ala Gly Gln Arg Ile Ile Asn Pro Thr Ile Asn Asn Lys Pro Ile Asp Pro Lys Ala Ile Tyr Lys Leu Ala Ile Asn Asn Tyr Arg Phe G1y Thr Leu Sex Thr Thr Leu Asn Leu Val Thr Asp Ala Xaa Arg Tyr '.Cyr Asn Ser Tyr Asp Glu Leu Gln Asp Asn Gly Gln Ile Arg Asp Leu Ile Ile Lys Tyr Ile Thr 505 510 5l5 520 Glu Glu Lys Gly Gly Lys Val Thr Pro Glu Leu Glu Gly Asn Trp Glu Ile Ile Asn Tyr Asp Phe Lys Asn Pro Leu Leu Glu Lys Leu Arg Glu Lys Leu Lys Glu Gly Ser Ile Lys Ile Pro Thr Ser Lys Asp Gly Arg Thr Leu Asn Val Lys Ser Ile Lys Glu Ser Glu Val Lys (2) INFORMATION FOR SEQ ID N0:72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 581 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
Met Lys Lys Leu Val Leu Val Ile Phe Leu Thr Leu Ala Leu Ser Ile Ser Ala Lys Glu Val Lys Ile Val Phe Leu Glu Thr Ser Asp Ile His Gly Arg Leu Phe Ser Tyr Asp Tyr Ala Ile Gly Glu Gln Lys Pro Asn Asn Gly Leu Thr Arg Ile Ala Thr Leu Ile Lys Lys Gln Arg Ala Glu Asn Lys Asn Val Val Leu Ile Asp Ser Gly Asp Leu Leu Gln Gly Asn Ser Ala Glu Leu Phe Asn Asp Glu Pro Ile His Pro Leu Val Arg Ala Glu Asn Asp Leu Lys Phe Asp Ile Arg Val Leu Gly Asn His Glu Phe 100 105 110 Asn Phe Ser Lys Asp Phe Leu Glu Lys Asn Ile Lys Gly Phe Asn Gly 115 120 125 Asp Val Met Asn Ala Asn Ile Ile Lys Ile Ala Asp Asn Lys Pro Phe Val Lys Pro Tyr Ile Ile Lys Lys Ile Asp Gly Val Arg Val Ala Val Val Gly Tyr Val Val Ala His Ile Pro Thr Trp Glu Ala Ser Thr Pro Glu His Phe Ala Gly Leu Lys Phe Leu Asp Ala Glu Glu Ala Leu Lys 180 185 190 Lys Thr Leu Lys Glu Leu Lys Gly Lys Tyr Asp Ile Leu Ile Gly Ala Phe His Leu Gly Arg Glu Asp Glu Lys Gly Gly Asp Gly Ile Pro Asp 210 215 220 Leu Ala Lys Lys Phe Pro Gln Phe Asp Ile Ile Phe Ala Gly His Glu 225 230 235 240 His Ala Val Tyr Asn Thr Lys Val Gly Lys Val His Thr Ile Glu Pro Gly Ala Tyr Gly Ala Tyr Leu Ala Lys Gly Val Val Val Phe Asp Thr Lys Thr Lys Lys Lys Ile Ile Thr Thr Glu Asn Leu Pro Thr Lys Asp Val Pro Glu Asp Glu Glu Leu Ala Lys Lys Tyr Glu Tyr Val Asp Lys Lys Ser Lys Glu Tyr Ala Asn Glu Val Val Gly Glu Val Thr Lys Thr 305 310 315 320 Phe Ile Asp Arg Pro Asp Phe Ile Thr Gly Glu Glu Lys Ile Thr Thr Met Pro Thr Ala Ala Leu Gln Glu Thr Pro Val Ile Glu Leu Ile Asn Lys Val Gln Lys Tyr Tyr Ala Lys Ala Asp Val Ser Ala Ala Ala Leu Phe Asn Phe Gly Ala Asn Leu Lys Lys Gly Pro Phe Lys Arg Lys Asp Val Thr Tyr Ile Tyr Lys Phe Ala Asn Thr Leu Ile Gly Val Arg Ile Thr Gly Glu Asn Leu Leu Lys Tyr Met Glu Trp Ser Tyr Arg Phe Tyr Asn Gln Leu Gln Pro Gly Asp Leu Thr Ile Ser Phe Asn Glu Asn Ile Arg Gly Tyr Asn Phe Asp Met Phe Ser Gly Val Lys Tyr Gln Val Asp Val Thr Lys Pro Ala Gly Gln Arg Ile Ile Asn Pro Thr Ile Asn Asn Lys Pro Ile Asp Pro Lys Ala Ile Tyr Lys Leu Ala Ile Asn Asn Tyr 465 470 475 480 Arg Phe Gly Thr Leu Ser Thr Thr Leu Asn Leu Val Thr Asp Ala Xaa Arg Tyr Tyr Asn Ser Tyr Asp Glu Leu Gln Asp Asn Gly Gln Ile Arg Asp Leu Ile Ile Lys Tyr Ile Thr Glu Glu Lys Gly Gly Lys Val Thr Pro Glu Leu Glu Gly Asn Trp Glu Ile Ile Asn Tyr Asp Phe Lys Asn Pro Leu Leu Glu Lys Leu Arg Glu Lys Leu Lys Glu Gly Ser Ile Lys Ile Pro Thr Ser Lys Asp Gly Arg Thr Leu Asn Val Lys Ser Ile Lys Glu Ser Glu Val Lys
(2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1339 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 68...1252 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
GTTGGTA ATG GAA TCA GTA AAA ACA GGA AAA ACA AAT AAG GTT GGC AAG 109 Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gln Ala Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala 95 100 105 110 Thr Ser Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln 115 120 125 Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn 130 135 140 Asn Gln Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Ser Asn 160 165 170 Thr Gln Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln 175 180 185 190 Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln Lys Asp TTG GTT AAA ACA CAG AAA GAT TTC ATT AAA TAT GTA GAA CAA AAT TGC 733 Leu Val Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys Gln Glu Asn His Asn Gln Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala Gly Ile Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala AAA ACC AAT CAA ACC CCT ATC CAG CCA AAA CAC CTC CCA AAC TCT AAA 877 Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro Arg Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asn Pro CAA GCC CAC CTT TCA AAC TCT CAA AGC CTT TTG TTC GTT CAA AAA ATA 1165 Gln Ala His Leu Ser Asn Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met ACATTGCACC AAGTTTTTAA TTATCTGTCG GCTTTTGAAA ACATTTTTTA TGGTAGCGTT 1324
(2) INFORMATION FOR SEQ ID NO:74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 395 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gln Ala Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser 100 105 110 Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn 145 150 155 160 Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Ser Asn Thr Gln 165 170 175 Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln Lys Asp 180 185 190 Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val 195 200 205 Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys Gln Glu 210 215 220 Asn His Asn Gln Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala Gly Ile Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro Arg Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asn Pro Gln Ala His Leu Ser Asn Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met
(2) INFORMATION FOR SEQ ID NO:75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 904 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 70...864
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
TAATAACTCA ATCCCATTTG AATGGCATTT TTAAGCCAAA TTGCTACTAT CTTTGGCTAA 60 AGGTTAAAC ATG ATT AAA CAA ACC CTC ATC ATC CTT GCC CCT TTT TTT ATC 111 Met Ile Lys Gln Thr Leu Ile Ile Leu Ala Pro Phe Phe Ile GCA ACG CTG TTG TAT TTT TTA GGC GCA CCG GAT GGG TTA AGA CCT AAC 159 Ala Thr Leu Leu Tyr Phe Leu Gly Ala Pro Asp Gly Leu Arg Pro Asn Ala Trp Leu Tyr Phe Cys Ile Phe Met Gly Met Ile Ile Gly Leu Ile Leu Glu Pro Val Pro Ser Gly Leu Ile Ala Leu Ser Ala Leu Val Leu Cys Ile Ala Leu Lys Ile Gly Ala Ser Asp Lys Val Ala Ser Ala Asn Lys Ala Ile Ser Trp Gly Leu Ser Gly Tyr Ala Asn Lys Thr Val Trp Leu Val Phe Val Ala Phe Ile Leu Gly Leu Gly Tyr Glu Lys Ser Leu 95 100 105 110 Leu Gly Lys Arg Ile Ala Leu Leu Leu Ile Arg Phe Leu Gly Gln Thr 115 120 125 Pro Leu Gly Leu Gly Tyr Ala Ile Gly Leu Ser Glu Leu Cys Leu Ala Pro Phe Ile Pro Ser Asn Ser Ala Arg Ser Gly Gly Ile Leu Tyr Pro Ile Val Ser Ser Ile Pro Pro Leu Met Gly Ser Thr Pro Asn Asn Asn 160 165 170 Pro Asp Lys Ile Gly Ala Tyr Leu Met Trp Val Ala Leu Ala Ser Thr Cys Ile Thr Ser Ser Met Phe Leu Thr Ala Leu Ala Pro Asn Pro Leu 195 200 205 Ala Met Glu Ile Ala Ala Lys Met Gly Val Asn Glu Ile Ser Trp Phe Ser Trp Phe Leu Ala Phe Leu Pro Cys Gly Val Val Leu Ile Leu Leu Val Pro Leu Leu Ala Tyr Lys Thr Cys Lys Pro Thr Leu Lys Gly Ser Lys Glu Val Ser Leu Trp Ala Lys Lys Arg Asn
(2) INFORMATION FOR SEQ ID NO:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
Met Ile Lys Gln Thr Leu Ile Ile Leu Ala Pro Phe Phe Ile Ala Thr Leu Leu Tyr Phe Leu Gly Ala Pro Asp Gly Leu Arg Pro Asn Ala Trp Leu Tyr Phe Cys Ile Phe Met Gly Met Ile Ile Gly Leu Ile Leu Glu Pro Val Pro Ser Gly Leu Ile Ala Leu Ser Ala Leu Val Leu Cys Ile Ala Leu Lys Ile Gly Ala Ser Asp Lys Val Ala Ser Ala Asn Lys Ala Ile Ser Trp Gly Leu Ser Gly Tyr Ala Asn Lys Thr Val Trp Leu Val Phe Val Ala Phe Ile Leu Gly Leu Gly Tyr Glu Lys Ser Leu Leu Gly 100 105 110 Lys Arg Ile Ala Leu Leu Leu Ile Arg Phe Leu Gly Gln Thr Pro Leu Gly Leu Gly Tyr Ala Ile Gly Leu Ser Glu Leu Cys Leu Ala Pro Phe Ile Pro Ser Asn Ser Ala Arg Ser Gly Gly Ile Leu Tyr Pro Ile Val 145 150 155 160 Ser Ser Ile Pro Pro Leu Met Gly Ser Thr Pro Asn Asn Asn Pro Asp 165 170 175 Lys Ile Gly Ala Tyr Leu Met Trp Val Ala Leu Ala Ser Thr Cys Ile Thr Ser Ser Met Phe Leu Thr Ala Leu Ala Pro Asn Pro Leu Ala Met 195 200 205 Glu Ile Ala Ala Lys Met Gly Val Asn Glu Ile Ser Trp Phe Ser Trp Phe Leu Ala Phe Leu Pro Cys Gly Val Val Leu Ile Leu Leu Val Pro 225 230 235 240 Leu Leu Ala Tyr Lys Thr Cys Lys Pro Thr Leu Lys Gly Ser Lys Glu Val Ser Leu Trp Ala Lys Lys Arg Asn
(2) INFORMATION FOR SEQ ID NO:77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1194 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 152...1069
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
Met Ile Lys Ser Trp Thr Lys Lys Trp Phe Leu Ile Leu Phe Leu Met Ala Ser Cys Ser Ser Tyr Leu Val Ala Thr Thr Gly Glu Lys Tyr Phe Lys Met Ala Thr Gln Ala Phe AAG AGA GGG GAC TAC CAT AAA GCG GTG GCT TTT TAT AAG AGG AGC TGT 316 Lys Arg Gly Asp Tyr His Lys Ala Val Ala Phe Tyr Lys Arg Ser Cys Asn Leu Arg Val Gly Val Gly Cys Thr Ser Leu Gly Ser Met Tyr Glu Asp Gly Asp Gly Val Asp Gln Asn Ile Thr Lys Ala Val Phe Tyr Tyr Arg Arg Gly Cys Asn Leu Arg Asn His Leu Ala Cys Ala Ser Leu Gly Ser Met Tyr Glu Asp Gly Asp Gly Val Gln Lys Asn Leu Pro Lys Ala 105 110 115 Ile Tyr Tyr Tyr Arg Arg Gly Cys His Leu Lys Gly Gly Val Ser Cys 120 125 130 135 Gly Ser Leu Gly Phe Met Tyr Phe Asn Gly Thr Gly Val Lys Gln Asn 140 145 150 Tyr Ala Lys Ala Leu Phe Leu Ser Lys Tyr Ala Cys Ser Leu Asn Tyr 155 160 165 Gly Ile Ser Cys Asn Phe Val Gly Tyr Met Tyr Arg Asn Ala Lys Gly Val Gln Lys Asp Leu Lys Lys Ala Leu Ala Asn Phe Lys Arg Gly Cys His Leu Lys Asp Gly Ala Ser Cys Val Ser Leu Gly Tyr Met Tyr Glu GTC GGT ATG GAT GTC AAA CAA AAT GGA GAG CAA GCC TTG AAT CTT TAT 844 Val Gly Met Asp Val Lys Gln Asn Gly Glu Gln Ala Leu Asn Leu Tyr Lys Lys Gly Cys Tyr Leu Lys Arg Gly Ser Gly Cys His Asn Val Ala GTG ATG TAT TAC ACC GGT AAG GGC GTT CCA AAG GAT TTA GAT AAA GCC 940 Val Met Tyr Tyr Thr Gly Lys Gly Val Pro Lys Asp Leu Asp Lys Ala ATT TCG TAT TAT AAG AAA GGT TGC ACT CTA GGC TTT AGT GGT AGC TGT 988 Ile Ser Tyr Tyr Lys Lys Gly Cys Thr Leu Gly Phe Ser Gly Ser Cys AAA GTG TTA GAA GAA GTG ATT GGC AAG AAG TCT GAT GAT TTG CAA GAT 1036 Lys Val Leu Glu Glu Val Ile Gly Lys Lys Ser Asp Asp Leu Gln Asp Asp Ala Gln Asn Asp Thr Gln Asp Asp Met Gln AATGATTAAA ACTCATCTTA TAGAAATCTT TCTACTCTCT TGTTATCAAA TAGGGATTAA 1149
(2) INFORMATION FOR SEQ ID NO:78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
Met Ile Lys Ser Trp Thr Lys Lys Trp Phe Leu Ile Leu Phe Leu Met Ala Ser Cys Ser Ser Tyr Leu Val Ala Thr Thr Gly Glu Lys Tyr Phe Lys Met Ala Thr Gln Ala Phe Lys Arg Gly Asp Tyr His Lys Ala Val Ala Phe Tyr Lys Arg Ser Cys Asn Leu Arg Val Gly Val Gly Cys Thr Ser Leu Gly Ser Met Tyr Glu Asp Gly Asp Gly Val Asp Gln Asn Ile Thr Lys Ala Val Phe Tyr Tyr Arg Arg Gly Cys Asn Leu Arg Asn His Leu Ala Cys Ala Ser Leu Gly Ser Met Tyr Glu Asp Gly Asp Gly Val 100 105 110 Gln Lys Asn Leu Pro Lys Ala Ile Tyr Tyr Tyr Arg Arg Gly Cys His 115 120 125 Leu Lys Gly Gly Val Ser Cys Gly Ser Leu Gly Phe Met Tyr Phe Asn Gly Thr Gly Val Lys Gln Asn Tyr Ala Lys Ala Leu Phe Leu Ser Lys 145 150 155 160 Tyr Ala Cys Ser Leu Asn Tyr Gly Ile Ser Cys Asn Phe Val Gly Tyr Met Tyr Arg Asn Ala Lys Gly Val Gln Lys Asp Leu Lys Lys Ala Leu 180 185 190 Ala Asn Phe Lys Arg Gly Cys His Leu Lys Asp Gly Ala Ser Cys Val Ser Leu Gly Tyr Met Tyr Glu Val Gly Met Asp Val Lys Gln Asn Gly Glu Gln Ala Leu Asn Leu Tyr Lys Lys Gly Cys Tyr Leu Lys Arg Gly Ser Gly Cys His Asn Val Ala Val Met Tyr Tyr Thr Gly Lys Gly Val Pro Lys Asp Leu Asp Lys Ala Ile Ser Tyr Tyr Lys Lys Gly Cys Thr Leu Gly Phe Ser Gly Ser Cys Lys Val Leu Glu Glu Val Ile Gly Lys Lys Ser Asp Asp Leu Gln Asp Asp Ala Gln Asn Asp Thr Gln Asp Asp Met Gln
(2) INFORMATION FOR SEQ ID NO:79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1001 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 101...865
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
GCGATTTTAG GTTAATTTTG AGTTTTTAGG AGCAGTTTTT ATG CAA CAA GAA GAG 115 Met Gln Gln Glu Glu Ile Ile Glu Gly Tyr Tyr Gly Ala Ser Lys Gly Leu Lys Lys Ser Gly 10 15 20 Ile Tyr Ala Lys Leu Asp Phe Leu Gln Ser Ala Thr Gly Leu Ile Leu Ala Leu Phe Met Ile Ala His Met Phe Leu Val Ser Ser Ile Leu Ile AGC GAT GAA GCC ATG TAT AAA GTG GCG AAA TTT TTT GAA GGG AGC TTG 307 Ser Asp Glu Ala Met Tyr Lys Val Ala Lys Phe Phe Glu Gly Ser Leu Phe Leu Lys Ala Gly Glu Pro Ala Ile Val Ser Val Val Ala Ala Gly ATT ATT CTT ATT TTA GTC GCG CAT GCT TTT TTG GCG TTA AGG AAA TTC 403 Ile Ile Leu Ile Leu Val Ala His Ala Phe Leu Ala Leu Arg Lys Phe Pro Ile Asn Tyr Arg Gln Tyr Lys Val Phe Lys Thr His Lys His Leu ATG AAA CAT GGC GAT ACG AGC TTG TGG TTT ATT CAA GCC CTC ACC GGG 499 Met Lys His Gly Asp Thr Ser Leu Trp Phe Ile Gln Ala Leu Thr Gly TTT GCG ATG TTT TTC TTA GCG AGT ATC CAC TTA TTT GTC ATG CTC ACA 547 Phe Ala Met Phe Phe Leu Ala Ser Ile His Leu Phe Val Met Leu Thr 135 140 145 Glu Pro Glu Ser Ile Gly Pro His Gly Ser Ser Tyr Arg Phe Val Thr 150 155 160 165 CAA AAC TTT TGG CTT TTG TAT ATT TTC TTA TTG TTT GCC GTA GAA TTG 643 Gln Asn Phe Trp Leu Leu Tyr Ile Phe Leu Leu Phe Ala Val Glu Leu 170 175 180 CAT GGC TCT ATT GGG TTG TAT CGT TTA GCG ATC AAA TGG GGG TGG TTT 691 His Gly Ser Ile Gly Leu Tyr Arg Leu Ala Ile Lys Trp Gly Trp Phe 185 190 195 AAA AAT GTG AGC ATT CAA GGT TTG AGA AAA GTC AAA TGG GCG ATG AGC 739 Lys Asn Val Ser Ile Gln Gly Leu Arg Lys Val Lys Trp Ala Met Ser Val Phe Phe Ile Val Leu Gly Leu Cys Thr Tyr Gly Ala Tyr Ile Lys Lys Gly Leu Glu Asn Lys Glu Asn Gly Ile Lys Thr Met Gln Glu Ala Ile Glu Ala Asp Gly Lys Phe His Lys Glu CAAACAAAAG GGTTTAAACA CCATCGTTTT AAGCCTAGTG CCTGTCAGGC GTT 1001
(2) INFORMATION FOR SEQ ID NO:80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
Met Gln Gln Glu Glu Ile Ile Glu Gly Tyr Tyr Gly Ala Ser Lys Gly Leu Lys Lys Ser Gly Ile Tyr Ala Lys Leu Asp Phe Leu Gln Ser Ala Thr Gly Leu Ile Leu Ala Leu Phe Met Ile Ala His Met Phe Leu Val Ser Ser Ile Leu Ile Ser Asp Glu Ala Met Tyr Lys Val Ala Lys Phe Phe Glu Gly Ser Leu Phe Leu Lys Ala Gly Glu Pro Ala Ile Val Ser Val Val Ala Ala Gly Ile Ile Leu Ile Leu Val Ala His Ala Phe Leu Ala Leu Arg Lys Phe Pro Ile Asn Tyr Arg Gln Tyr Lys Val Phe Lys Thr His Lys His Leu Met Lys His Gly Asp Thr Ser Leu Trp Phe Ile 115 120 125 Gln Ala Leu Thr Gly Phe Ala Met Phe Phe Leu Ala Ser Ile His Leu 130 135 140 Phe Val Met Leu Thr Glu Pro Glu Ser Ile Gly Pro His Gly Ser Ser Tyr Arg Phe Val Thr Gln Asn Phe Trp Leu Leu Tyr Ile Phe Leu Leu 165 170 175 Phe Ala Val Glu Leu His Gly Ser Ile Gly Leu Tyr Arg Leu Ala Ile 180 185 190 Lys Trp Gly Trp Phe Lys Asn Val Ser Ile Gln Gly Leu Arg Lys Val Lys Trp Ala Met Ser Val Phe Phe Ile Val Leu Gly Leu Cys Thr Tyr Gly Ala Tyr Ile Lys Lys Gly Leu Glu Asn Lys Glu Asn Gly Ile Lys Thr Met Gln Glu Ala Ile Glu Ala Asp Gly Lys Phe His Lys Glu
(2) INFORMATION FOR SEQ ID NO:81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 975 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 82...912
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
TTTTAAAATT AAAGAAAATT TTTTTTAAAG ATTATCACTC TTTTTTGATA AAGTAATCAT 60 Met Glu Glu Ser Thr Ala Phe Ile Leu Ala Leu Val Gly Leu Phe Thr Gly Ile Thr Ala Gly Phe Phe Gly Ile Gly GGG GGG GAG ATT GTC GTC CCT AGC GCG ATT TTT GCC CAT TTT AGC TAT 207 Gly Gly Glu Ile Val Val Pro Ser Ala Ile Phe Ala His Phe Ser Tyr AGC CAT GCG GTG GGT ATT TCG CTC ATG CAA ATG CTT TTT TCT TCA GTG 255 Ser His Ala Val Gly Ile Ser Leu Met Gln Met Leu Phe Ser Ser Val GTC GGC TCT ATC ATC AAT TAC AAA AAG GGC TTA TTG GAT TTG AGA GAA 303 Val Gly Ser Ile Ile Asn Tyr Lys Lys Gly Leu Leu Asp Leu Arg Glu Gly Ser Phe Ala Ala Leu Gly Gly Leu Met Gly Ala Ile Leu Gly Ser TTT ATC TTA AAA ATC ATT GAC GAT AAA ATT TTA ATG GCG GTG TTT GTG 399 Phe Ile Leu Lys Ile Ile Asp Asp Lys Ile Leu Met Ala Val Phe Val Val Val Val Cys Tyr Thr Phe Ile Lys Tyr Ala Phe Ser Ser Asn Lys Lys Pro Lys His Phe Glu Glu Met His Phe Asp Leu His Ala Asn Asn Lys Thr Pro Glu Lys Lys Arg Ala Ile Pro Phe Val Ser Met Asp Arg 140 145 150 Thr His Gly Val Leu Met Leu Ala Gly Phe Val Thr Gly Ile Phe Ser Ile Pro Leu Gly Met Gly Gly Gly Ile Leu Met Val Pro Phe Leu Gly 175 180 185 Tyr Phe Leu Lys Tyr Asp Ser Lys Lys Ile Val Pro Leu Gly Leu Phe Phe Val Val Phe Ala Ser Leu Ser Gly Val Ile Ser Leu Tyr Asn Gly 205 210 215 Arg Val Leu Asp Asn Ile Ser Val Gln Ala Gly Val Ile Thr Gly Ile GGA GCG TTT TTA GGC GTG GGC ATT GGC ATC AAG CTT ATC GCT TTG GCT 831 Gly Ala Phe Leu Gly Val Gly Ile Gly Ile Lys Leu Ile Ala Leu Ala Asn Glu Lys Val His Lys Ile Leu Leu Leu Leu Ile Tyr Ala Leu Ser Ile Leu Ala Thr Leu His Lys Leu Ile Met Gly
(2) INFORMATION FOR SEQ ID NO:82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
Met Glu Glu Ser Thr Ala Phe Ile Leu Ala Leu Val Gly Leu Phe Thr Gly Ile Thr Ala Gly Phe Phe Gly Ile Gly Gly Gly Glu Ile Val Val Pro Ser Ala Ile Phe Ala His Phe Ser Tyr Ser His Ala Val Gly Ile 35 40 45 Ser Leu Met Gln Met Leu Phe Ser Ser Val Val Gly Ser Ile Ile Asn Tyr Lys Lys Gly Leu Leu Asp Leu Arg Glu Gly Ser Phe Ala Ala Leu 65 70 75 80 Gly Gly Leu Met Gly Ala Ile Leu Gly Ser Phe Ile Leu Lys Ile Ile Asp Asp Lys Ile Leu Met Ala Val Phe Val Val Val Val Cys Tyr Thr Phe Ile Lys Tyr Ala Phe Ser Ser Asn Lys Lys Pro Lys His Phe Glu Glu Met His Phe Asp Leu His Ala Asn Asn Lys Thr Pro Glu Lys Lys 130 135 140 Arg Ala Ile Pro Phe Val Ser Met Asp Arg Thr His Gly Val Leu Met Leu Ala Gly Phe Val Thr Gly Ile Phe Ser Ile Pro Leu Gly Met Gly Gly Gly Ile Leu Met Val Pro Phe Leu Gly Tyr Phe Leu Lys Tyr Asp 180 185 190 Ser Lys Lys Ile Val Pro Leu Gly Leu Phe Phe Val Val Phe Ala Ser Leu Ser Gly Val Ile Ser Leu Tyr Asn Gly Arg Val Leu Asp Asn Ile Ser Val Gln Ala Gly Val Ile Thr Gly Ile Gly Ala Phe Leu Gly Val Gly Ile Gly Ile Lys Leu Ile Ala Leu Ala Asn Glu Lys Val His Lys Ile Leu Leu Leu Leu Ile Tyr Ala Leu Ser Ile Leu Ala Thr Leu His Lys Leu Ile Met Gly
(2) INFORMATION FOR SEQ ID NO:83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1667 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 220...1482
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
GGA GTG
Met Gly Val Gly Tyr Gln Ile Gly Gly Ala Gln Gln Asn Ile Asp Asn Lys Gly Ser Thr Leu Arg Asn Asn Val Ile Asn Asn Phe Arg Gln Val Gly Val Gly Met Ala Gly Gly Asn Gly Leu Leu Ala Leu Ala Thr Asn Thr Thr Met Asp Ala CTT TTA GGG ATA GGC AAC CAA ATT GTC AAT ACT AAT ACA ACT GTT AGC 426 Leu Leu Gly Ile Gly Asn Gln Ile Val Asn Thr Asn Thr Thr Val Ser Asn Asn Asn Ala Glu Leu Thr Gln Phe Lys Lys Ile Leu Pro Gln Ile Glu Gln Arg Phe Glu Thr Asn Lys Asn Ala Tyr Ser Val Gln Ala Leu Gln Val Tyr Leu Ser Asn Val Leu Tyr Asn Leu Val Asn Asn Ser Asn 105 110 115 Asn Gly Ser Asn Asn Gly Val Val Pro Glu Tyr Val Gly Ile Ile Lys 120 125 130 Val Leu Tyr Gly Ser Gln Asn Glu Phe Ser Leu Leu Ala Thr Glu Ser 135 140 145 Val Val Leu Leu Asn Ala Leu Thr Arg Val Asn Leu Asp Ser Asn Ser Val Phe Leu Lys Gly Leu Leu Ala Gln Met Gln Leu Phe Asn Asp Thr TCT TCA GCA AAG CTA GGC CAG ATC GCA GAA AAC TTG AAG AAC GGT GGT 810 Ser Ser Ala Lys Leu Gly Gln Ile Ala Glu Asn Leu Lys Asn Gly Gly Ala Gly Ser Met Leu Gln Lys Asp Val Lys Thr Ile Ser Asp Arg Ile GCT ACT TAC CAA GAG AAT CTA AAA CAG CTA GGA GGG ATG CTA AAG AAT 906 Ala Thr Tyr Gln Glu Asn Leu Lys Gln Leu Gly Gly Met Leu Lys Asn Tyr Asp Glu Pro Tyr Leu Pro Gln Phe Gly Pro Gly Thr Ser Ser Gln 230 235 240 245 His Gly Val Ile Asn Gly Phe Gly Ile Gln Val Gly Tyr Lys Gln Phe TTT GGG AAC AAG CGG AAT ATA GGC TTA CGA TAT TAC GCT TTC TTT GAT 1050 Phe Gly Asn Lys Arg Asn Ile Gly Leu Arg Tyr Tyr Ala Phe Phe Asp Tyr Gly Phe Thr Gln Leu Gly Ser Leu Ser Ser Ala Val Lys Ala Asn ATC TTT ACT TAT GGC GCT GGC ACG GAC TTT TTA TGG AAT ATC TTT AGA 1146 Ile Phe Thr Tyr Gly Ala Gly Thr Asp Phe Leu Trp Asn Ile Phe Arg AGG GTT TTT AGC GAT CAG TCC TTG AAT GTG GGG GTG TTT GGG GGC ATT 1194 Arg Val Phe Ser Asp Gln Ser Leu Asn Val Gly Val Phe Gly Gly Ile CAA ATA GCG GGT AAC ACT TGG GAT AGC TCT TTA AGA GGT CAA ATT GAA 1242 Gln Ile Ala Gly Asn Thr Trp Asp Ser Ser Leu Arg Gly Gln Ile Glu Asn Ser Phe Lys Glu Tyr Pro Thr Pro Thr Asn Phe Gln Phe Leu Phe AAT TTG GGT TTA AGG GCT CAT TTT GCC AGC ACC ATG CAC CGC CGG TTT 1338 Asn Leu Gly Leu Arg Ala His Phe Ala Ser Thr Met His Arg Arg Phe TTG AGC GCG TCT CAA AGC ATT CAG CAT GGG ATG GAA TTT GGC GTG AAA 1386 Leu Ser Ala Ser Gln Ser Ile Gln His Gly Met Glu Phe Gly Val Lys ATC CCG GCT ATC AAT CAA AGG TAT TTG AGG GCT AAT GGG GCT GAT GTG 1434 Ile Pro Ala Ile Asn Gln Arg Tyr Leu Arg Ala Asn Gly Ala Asp Val Asp Tyr Arg Arg Leu Tyr Ala Phe Tyr Ile Asn Tyr Thr Ile Gly Phe GGATCCAGTG GGGAGATGAG GGGAAGGGAA AAATTGTTGA TAGGATCGCT AAAGATTATG 1663 ACTT
(2) INFORMATION FOR SEQ ID NO:84:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 421 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:
Met Gly Val Gly Tyr Gln Ile Gly Gly Ala Gln Gln Asn Ile Asp Asn Lys Gly Ser Thr Leu Arg Asn Asn Val Ile Asn Asn Phe Arg Gln Val Gly Val Gly Met Ala Gly Gly Asn Gly Leu Leu Ala Leu Ala Thr Asn Thr Thr Met Asp Ala Leu Leu Gly Ile Gly Asn Gln Ile Val Asn Thr Asn Thr Thr Val Ser Asn Asn Asn Ala Glu Leu Thr Gln Phe Lys Lys Ile Leu Pro Gln Ile Glu Gln Arg Phe Glu Thr Asn Lys Asn Ala Tyr Ser Val Gln Ala Leu Gln Val Tyr Leu Ser Asn Val Leu Tyr Asn Leu Val Asn Asn Ser Asn Asn Gly Ser Asn Asn Gly Val Val Pro Glu Tyr 115 120 125 Val Gly Ile Ile Lys Val Leu Tyr Gly Ser Gln Asn Glu Phe Ser Leu Leu Ala Thr Glu Ser Val Val Leu Leu Asn Ala Leu Thr Arg Val Asn 145 150 155 160 Leu Asp Ser Asn Ser Val Phe Leu Lys Gly Leu Leu Ala Gln Met Gln 165 170 175 Leu Phe Asn Asp Thr Ser Ser Ala Lys Leu Gly Gln Ile Ala Glu Asn Leu Lys Asn Gly Gly Ala Gly Ser Met Leu Gln Lys Asp Val Lys Thr Ile Ser Asp Arg Ile Ala Thr Tyr Gln Glu Asn Leu Lys Gln Leu Gly Gly Met Leu Lys Asn Tyr Asp Glu Pro Tyr Leu Pro Gln Phe Gly Pro Gly Thr Ser Ser Gln His Gly Val Ile Asn Gly Phe Gly Ile Gln Val Gly Tyr Lys Gln Phe Phe Gly Asn Lys Arg Asn Ile Gly Leu Arg Tyr Tyr Ala Phe Phe Asp Tyr Gly Phe Thr Gln Leu Gly Ser Leu Ser Ser Ala Val Lys Ala Asn Ile Phe Thr Tyr Gly Ala Gly Thr Asp Phe Leu Trp Asn Ile Phe Arg Arg Val Phe Ser Asp Gln Ser Leu Asn Val Gly Val Phe Gly Gly Ile Gln Ile Ala Gly Asn Thr Trp Asp Ser Ser Leu Arg Gly Gln Ile Glu Asn Ser Phe Lys Glu Tyr Pro Thr Pro Thr Asn Phe Gln Phe Leu Phe Asn Leu Gly Leu Arg Ala His Phe Ala Ser Thr Met His Arg Arg Phe Leu Ser Ala Ser Gln Ser Ile Gln His Gly Met Glu Phe Gly Val Lys Ile Pro Ala Ile Asn Gln Arg Tyr Leu Arg Ala Asn Gly Ala Asp Val Asp Tyr Arg Arg Leu Tyr Ala Phe Tyr Ile Asn Tyr Thr Ile Gly Phe
(2) INFORMATION FOR SEQ ID NO:85:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 926 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 207...746
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:
GCATTTGCAA GAAACTTTGA TGATAGAAGT GGATAGGCTT GATTTTTCTT TAGTGGAGCG 180 ATG ATT
Met Lys Ser Met Arg Phe Ser Tyr Ile Glu Pro Arg Ala Lys Tyr Leu Ile Ser Lys Leu Ser Lys Ile Trp Val TTT TAC ATT TTT TTA TCT TTT GTG GTA ATA GGG GGG TTA GTG TGG TTT 329 Phe Tyr Ile Phe Leu Ser Phe Val Val Ile Gly Gly Leu Val Trp Phe ATG CAC AAC GCC ATT AAA AGC ACT CAA GAC AAC GCG TCC AGT TTG ACG 377 Met His Asn Ala Ile Lys Ser Thr Gln Asp Asn Ala Ser Ser Leu Thr Ile Gln Glu Arg Leu Tyr Arg His Glu Ile Ser Arg Leu Gln Val Lys Thr Asp Glu Thr Leu Lys Leu Ile Lys Glu Ala Lys Lys Arg Leu Asn Tyr Asn Asp Asp Ile Arg Asp Val Leu Gln Gly Leu Leu Asn Ile Val Pro Asp Ser Ile Thr Ile Asn Ser Ile Glu Ile Asp Gln Gln Ser Val 110 115 120 Val Val Ser Gly Lys Thr Pro Ser Lys Glu Ala Phe Tyr Phe Leu Phe Gln Asn Lys Leu Asn Pro Met Phe Asp Tyr Ser Arg Ala Glu Phe Phe CCC TTA GAT GGG TGG TTT AAT TTT TCC ACC AAC TTT TCT AAT
Pro Leu Ser Asp Gly Trp Phe Asn Phe Val Ser Thr Asn Phe Ser Asn TCC TTA ATA AAA AAT CCG GAG TCT AAA TGAAGCCATT GCATTTTTCA
Ser Leu Leu Ile Lys Asn Pro Glu Ser Ile Lys
(2) INFORMATION FOR SEQ ID NO:86:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:
Met Lys Ser Met Arg Phe Ser Tyr Ile Glu Pro Arg Ala Lys Tyr Leu Ile Ser Lys Leu Ser Lys Ile Trp Val Phe Tyr Ile Phe Leu Ser Phe Val Val Ile Gly Gly Leu Val Trp Phe Met His Asn Ala Ile Lys Ser Thr Gln Asp Asn Ala Ser Ser Leu Thr Ile Gln Glu Arg Leu Tyr Arg 50 55 60 His Glu Ile Ser Arg Leu Gln Val Lys Thr Asp Glu Thr Leu Lys Leu Ile Lys Glu Ala Lys Lys Arg Leu Asn Tyr Asn Asp Asp Ile Arg Asp Val Leu Gln Gly Leu Leu Asn Ile Val Pro Asp Ser Ile Thr Ile Asn Ser Ile Glu Ile Asp Gln Gln Ser Val Val Val Ser Gly Lys Thr Pro Ser Lys Glu Ala Phe Tyr Phe Leu Phe Gln Asn Lys Leu Asn Pro Met 130 135 140 Phe Asp Tyr Ser Arg Ala Glu Phe Phe Pro Leu Ser Asp Gly Trp Phe 145 150 155 160 Asn Phe Val Ser Thr Asn Phe Ser Asn Ser Leu Leu Ile Lys Asn Pro 165 170 175 Glu Ser Ile Lys
(2) INFORMATION FOR SEQ ID NO:87:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1440 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 151...1299
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:
Met Cys Val Val Leu Ser Val Lys Arg Asp Gly Glu Lys Thr Leu Glu Asn Asn Glu Glu Asn Lys Asp Glu Lys Leu Ile Leu Ile Asp Glu Phe Glu Val Leu Ala Asn Lys Phe Ile Ser Arg Leu Pro Asn Ile Pro Ser Thr Pro Arg Glu Phe Gly Leu Gly 45 50 55 Lys Gly Glu Ile Met Glu Ile Asp Val Pro Phe Gly Ser Ile Phe Ala TAC AGA CAC ATT GGC TCT ATC AGA CAA AAA GAA TAC AGG ATT GTA GGG 414 Tyr Arg His Ile Gly Ser Ile Arg Gln Lys Glu Tyr Arg Ile Val Gly Leu Tyr Arg Asn Asp Val Leu Leu Leu Ser Thr Lys Ser Leu Val Ile 90 95 100 Gln Pro Arg Asp Ile Leu Leu Val Ala Gly Asn Pro Glu Ile Leu Asn 105 110 115 120 Ala Val Tyr Leu Gln Val Lys Ser Asn Val Gly Gln Phe Pro Ala Pro 125 130 135 Phe Gly Lys Ser Ile Tyr Leu Tyr Ile Asp Met Arg Leu Gln Asn Arg Lys Ala Met Met Arg Asp Val Tyr Gln Ala Leu Phe Leu His Lys His Leu Lys Ser Tyr Lys Leu Tyr Ile Gln Val Leu His Pro Thr Ser Pro Lys Phe Tyr His Lys Phe Leu Ala Leu Glu Thr Glu Ser Ile Glu Val Asn Phe Asp Phe Tyr Arg Lys Ser Phe Ile Gln Lys Leu His Glu Asp His Gln Lys Lys Met Gly Leu Ile Val Val Gly Arg Glu Leu Phe Leu Ser Lys Lys His Arg Lys Ala Leu Tyr Lys Thr Ala Thr Pro Val Tyr Lys Thr Asn Thr Ser Gly Leu Ser Lys Thr Ser Gln Ser Val Val Val Leu Asn Glu Ser Leu Asp Ile Asn Glu Asp Met Ser Ser Val Ile Phe Asp Val Ser Met Gln Met Asp Leu Gly Leu Leu Leu Tyr Asp Phe Asp Pro Asn Lys Arg Tyr Lys Asn Glu Ile Val Asn His Tyr Glu Asn Leu GCC AAC GCG TTC AAC CGC AAG ATT GAG ATT TTC CAA ACC GAT ATT AGA 1134 Ala Asn Ala Phe Asn Arg Lys Ile Glu Ile Phe Gln Thr Asp Ile Arg Asn Pro Ile Met Tyr Leu Asn Ser Leu Arg Asn Pro Ile Leu His Phe Met Pro Phe Glu Glu Cys Ile Thr His Thr Arg Phe Trp Trp Phe Leu Ser Thr Lys Val Glu Lys Leu Ala Phe Leu Asn Asp Asp Asn Pro Gln Ile Phe Ile Pro Val Ala Glu
(2) INFORMATION FOR SEQ ID NO:88:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 383 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:
Met Cys Val Val Leu Ser Val Lys Arg Asp Gly Glu Lys Thr Leu Glu Asn Asn Glu Glu Asn Lys Asp Glu Lys Leu Ile Leu Ile Asp Glu Phe Glu Val Leu Ala Asn Lys Phe Ile Ser Arg Leu Pro Asn Ile Pro Ser Thr Pro Arg Glu Phe Gly Leu Gly Lys Gly Glu Ile Met Glu Ile Asp Val Pro Phe Gly Ser Ile Phe Ala Tyr Arg His Ile Gly Ser Ile Arg Gln Lys Glu Tyr Arg Ile Val Gly Leu Tyr Arg Asn Asp Val Leu Leu Leu Ser Thr Lys Ser Leu Val Ile Gln Pro Arg Asp Ile Leu Leu Val Ala Gly Asn Pro Glu Ile Leu Asn Ala Val Tyr Leu Gln Val Lys Ser Asn Val Gly Gln Phe Pro Ala Pro Phe Gly Lys Ser Ile Tyr Leu Tyr 130 135 140 Ile Asp Met Arg Leu Gln Asn Arg Lys Ala Met Met Arg Asp Val Tyr 145 150 155 160 Gln Ala Leu Phe Leu His Lys His Leu Lys Ser Tyr Lys Leu Tyr Ile 165 170 175 Gln Val Leu His Pro Thr Ser Pro Lys Phe Tyr His Lys Phe Leu Ala Leu Glu Thr Glu Ser Ile Glu Val Asn Phe Asp Phe Tyr Arg Lys Ser Phe Ile Gln Lys Leu His Glu Asp His Gln Lys Lys Met Gly Leu Ile 210 215 220 Val Val Gly Arg Glu Leu Phe Leu Ser Lys Lys His Arg Lys Ala Leu Tyr Lys Thr Ala Thr Pro Val Tyr Lys Thr Asn Thr Ser Gly Leu Ser Lys Thr Ser Gln Ser Val Val Val Leu Asn Glu Ser Leu Asp Ile Asn Glu Asp Met Ser Ser Val Ile Phe Asp Val Ser Met Gln Met Asp Leu Gly Leu Leu Leu Tyr Asp Phe Asp Pro Asn Lys Arg Tyr Lys Asn Glu Ile Val Asn His Tyr Glu Asn Leu Ala Asn Ala Phe Asn Arg Lys Ile Glu Ile Phe Gln Thr Asp Ile Arg Asn Pro Ile Met Tyr Leu Asn Ser Leu Arg Asn Pro Ile Leu His Phe Met Pro Phe Glu Glu Cys Ile Thr His Thr Arg Phe Trp Trp Phe Leu Ser Thr Lys Val Glu Lys Leu Ala Phe Leu Asn Asp Asp Asn Pro Gln Ile Phe Ile Pro Val Ala Glu
(2) INFORMATION FOR SEQ ID NO:89:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 517 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 51...464
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:
Met Val Gly Gly Gly Thr Val Lys Lys Asp Leu Lys Lys Ala Ile Gln Tyr Tyr Val Lys Ala Cys Glu Leu Asn Glu Met Phe Gly Cys Leu Ser Leu Val Ser Asn Ser Gln Ile Asn Lys Gln Lys Leu Phe Gln Tyr Leu Ser Lys Ala Cys Glu Leu Asn Ser Gly Asn Gly Cys Arg Phe Leu Gly Asp Phe Tyr Glu Asn Gly Lys Tyr Val Lys Lys Asp Leu Arg Lys Ala Ala Gln TAC TAC TCT AAA GCT TGT GGA TTA AAT GAT CAA GAT GGG TGT TTA ATA 344 Tyr Tyr Ser Lys Ala Cys Gly Leu Asn Asp Gln Asp Gly Cys Leu Ile CTA GGA TAT AAG CAA TAT GCT GGC AAG GGC GTA GTC AAA AAT GAA AAA 392 Leu Gly Tyr Lys Gln Tyr Ala Gly Lys Gly Val Val Lys Asn Glu Lys Gln Ala Val Lys Thr Phe Glu Lys Ala Cys Arg Leu Gly Ser Glu Asp 115 120 125 130 GCA TGT GGT ATT TTA AAC AAC TAC TAGATTTGAA ATAAATGCTG TTTTTTAGCT 494 Ala Cys Gly Ile Leu Asn Asn Tyr
(2) INFORMATION FOR SEQ ID NO:90:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 138 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:
Met Val Gly Gly Gly Thr Val Lys Lys Asp Leu Lys Lys Ala Ile Gln Tyr Tyr Val Lys Ala Cys Glu Leu Asn Glu Met Phe Gly Cys Leu Ser Leu Val Ser Asn Ser Gln Ile Asn Lys Gln Lys Leu Phe Gln Tyr Leu Ser Lys Ala Cys Glu Leu Asn Ser Gly Asn Gly Cys Arg Phe Leu Gly Asp Phe Tyr Glu Asn Gly Lys Tyr Val Lys Lys Asp Leu Arg Lys Ala Ala Gln Tyr Tyr Ser Lys Ala Cys Gly Leu Asn Asp Gln Asp Gly Cys Leu Ile Leu Gly Tyr Lys Gln Tyr Ala Gly Lys Gly Val Val Lys Asn 100 105 110 Glu Lys Gln Ala Val Lys Thr Phe Glu Lys Ala Cys Arg Leu Gly Ser 115 120 125 Glu Asp Ala Cys Gly Ile Leu Asn Asn Tyr
(2) INFORMATION FOR SEQ ID NO:91:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1663 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 68...1600
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:
Met Lys Lys Leu Leu Tyr Thr Ile Leu Ala Leu Leu Leu Ile GGC CTT TTA ACA ATC TAT CTC ATC CTT TTT ACA GAA TGG GGG AAT AAG 157 Gly Leu Leu Thr Ile Tyr Leu Ile Leu Phe Thr Glu Trp Gly Asn Lys Ile Ile Ala Ser Tyr Ile Glu Lys Lys Ile Asn Pro Asn Glu His Tyr Leu Ser Val Lys Thr Phe Lys Leu Arg Phe Asn Ser Leu Asp Phe Lys GCT CAA GCC AAC GAT GAT TCC ACG CTC ATT CTT AAG GGG GAT TTT TCA 301 Ala Gln Ala Asn Asp Asp Ser Thr Leu Ile Leu Lys Gly Asp Phe Ser CTT TTA AAG CAA AGC GTA AAT TTG AAT TAC CAT ATA GAT ATT AAA GAT 349 Leu Leu Lys Gln Ser Val Asn Leu Asn Tyr His Ile Asp Ile Lys Asp TTA CGC TCT TTC AAA GAA TGG ATA CCC TAC CCT TTA AGG GGG GCT GTT 397 Leu Arg Ser Phe Lys Glu Trp Ile Pro Tyr Pro Leu Arg Gly Ala Val 95 100 105 110 Ile Thr Ser Gly Asn Ile Lys Gly His Arg Lys Ala Leu Met Ile Gln Gly Val Ser Asn Val Ala Gln Ser His Thr Ala Tyr Asn Ala Leu Leu 130 135 140 Asp Asp Phe Lys Leu Ser Arg Leu Asn Leu Asn Ala Gln Asp Ala Asn Leu Glu Asp Leu Leu Tyr Leu Ile Asn Arg Pro Ala Tyr Ala Asn Ala 160 165 170 Lys Val Ser Leu Gln Ala Asp Phe Asn Ser Leu Lys Pro Leu Glu Gly His Leu Ile Leu Thr Ala Asn Asn Ala Leu Ile Asn Asn Ala Leu Ile Asn Gln Ile Phe His Leu Asn Leu Lys Asp Thr Leu Val Phe Ser Leu Ser His Ser Ser Asp Phe Lys Gly Asn Lys Ala Ile Ser Asp Thr Thr Leu Thr Ser Pro Leu Ala Asn Phe Lys Ala Leu Lys Ser Glu Tyr Leu Phe Ser Ile Leu Lys Leu Asn Ala Pro Tyr Thr Leu Glu Ile Pro Asn Leu Ala Lys Leu Tyr Asn Ile Thr Asn His Pro Leu Lys Gly Ser Leu Thr Leu Lys Gly Ala Ile Glu Gln Ser Pro Lys Leu Leu Lys Val Ser Gly His Ser Asn Leu Leu Asp Gly Ala Leu Asp Phe Thr Leu Leu Asn 305 310 315 Lys Asp Leu Lys Gly Arg Phe Ser Asn Ile Ser Thr Leu Lys Ala Leu Asp Leu Phe His Tyr Pro Lys Phe Phe Gln Ser Val Ala Asp Ala Asn Leu Asp Tyr Asp Leu Ile Ala Lys Gln Gly Val Leu Lys Ala Arg Leu Lys Asn Ala Arg Phe Leu Lys Asn Ala Phe Ser Asp Phe Leu Tyr Ser Ile Ser Lys Phe Asp Ile Thr Lys Glu Ile Tyr Asn Asp Ala Asn Leu Val Ser Gln Ile Asn Gln Gln Arg Leu Leu Ser Asp Leu Ser Leu Lys Ser Pro Lys Thr Gln Leu Lys Ile His Asn Gly Leu Leu Asp Leu Asn Thr Lys Gln Met Asn Met Leu Met Asp Ala Glu Ile Leu Lys Phe Ile Phe Lys Met Lys Leu Gln Gly Asn Met His Gln Pro Lys Phe Ser Leu Ile Leu Asn Glu Lys Ala Ile Gln Gln Asn Leu Gln Gln Gly Leu Lys Glu Ile Leu Lys Asn Asp Thr Leu Lys Lys Gly Leu Asp His Leu Leu Lys Asp Asp Lys Leu Lys Glu Lys Leu Glu Lys Gly Leu Lys Gly Leu Phe
(2) INFORMATION FOR SEQ ID NO:92:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 511 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:
Met Lys Lys Leu Leu Tyr Thr Ile Leu Ala Leu Leu Leu Ile Gly Leu Leu Thr Ile Tyr Leu Ile Leu Phe Thr Glu Trp Gly Asn Lys Ile Ile Ala Ser Tyr Ile Glu Lys Lys Ile Asn Pro Asn Glu His Tyr Leu Ser Val Lys Thr Phe Lys Leu Arg Phe Asn Ser Leu Asp Phe Lys Ala Gln Ala Asn Asp Asp Ser Thr Leu Ile Leu Lys Gly Asp Phe Ser Leu Leu Lys Gln Ser Val Asn Leu Asn Tyr His Ile Asp Ile Lys Asp Leu Arg Ser Phe Lys Glu Trp Ile Pro Tyr Pro Leu Arg Gly Ala Val Ile Thr Ser Gly Asn Ile Lys Gly His Arg Lys Ala Leu Met Ile Gln Gly Val l15 120 125 Ser Asn Val Ala Gln Ser His Thr Ala Tyr Asn Ala Leu Leu Asp Asp Phe Lys Leu Ser Arg Leu Asn Leu Asn Ala Gl:n Asp Ala Asn Leu Glu Asp Leu Leu Tyr Leu Ile Asn Arg Pro Ala Tyr Ala Asn Ala Lys Val Ser Leu Gln Ala Asp Phe Asn Ser Leu Lys Pro Leu Glu Gly His Leu Ile Leu Thr Ala Asn Asn Ala Leu Ile Asn Assn Ala Leu Ile Asn Gln l95 200 205 Ile Phe His Leu Asn Leu Lys Asp Thr Leu Val Phe Ser Leu Ser His Ser Ser Asp Phe Lys Gly Asn Lys Ala Ile Se:r Asp Thr Thr Leu Thr 225 230 23:5 240 Ser Pro Leu Ala Asn Phe Lys Ala Leu Lys Se:r Glu Tyr Leu Phe Ser Ile Leu Lys Leu Asn Ala Pro Tyr Thr Leu Gl~a Ile Pro Asn Leu Ala Lys Leu Tyr Asn Ile Thr Asn His Pro Leu Ly,s Gly Ser Leu Thr Leu Lys Gly Ala Ile Glu Gln Ser Pro Lys Leu Leu Lys Val Ser Gly His WO 98I21225 PCTlUS97/21353 -Ser Asn Leu Leu Asp ply-Ala Leu Asp Phe Thr Leu Leu Asn Lys Asp Leu Lys Gly Arg Phe Ser Asn Ile Ser Thr Leu Lys Ala Leu Asp Leu Phe His Tyr Pro Lys Phe Phe Gln 5er Val Ala Asp Ala Asn Leu Asp Tyr Asp Leu Ile Ala Lys Gln Gly Val Leu Lys Ala Arg Leu Lys Asn Ala Arg Phe Leu Lys Asn Ala Phe Ser Asp Phe Leu Tyr Ser Ile Ser Lys Phe Asp Ile Thr Lys Glu Ile Tyr Asn Asp Ala Asn Leu Val Ser Gln Ile Asn Gln Gln Arg Leu Leu Ser Asp Leu Ser Leu Lys Ser Pro Lys Thr Gln Leu Lys Ile His Asn Gly Leu Leu Asp Leu Asn Thr Lys Gln Met Asn Met Leu Met Asp Ala Glu Ile Leu Lys Phe Ile Phe Lys Met Lys Leu Gln Gly Asn Met His Gln Pro Lys Phe Ser Leu Ile Leu Asn Glu Lys Ala Ile Gln Gln Asn Leu Gln Gln Gly Leu Lys Glu Ile Leu 
Lys Asn Asp Thr Leu Lys Lys Gly Leu Asp His Leu Leu Lys Asp Asp Lys Leu Lys Glu Lys Leu Glu Lys Gly Leu Lys Gly Leu Phe (2) INFORMATION FOR SEQ ID NO:93:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 947 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 292...645 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:
AAGCTATTTT
Met Gly Tyr Tyr Ser Glu Val Thr Gly Asp Tyr Leu Phe Asn Tyr Asn Ser Thr 1 5 10 15 Ile Val Val Ala Tyr Asp Arg Ser Asp Ala Met Thr Ser Tyr Tyr Ile Asn Val Ile Val Tyr Glu Leu Gln Lys Leu Gly Phe Tyr Asn Val Phe ACG CAA GCG GAA TTC CCA CTA GAT AAA GCC AAA AAT GTG ATC TAT GCG 489 Thr Gln Ala Glu Phe Pro Leu Asp Lys Ala Lys Asn Val Ile Tyr Ala CGC ATT GTC CGT AAC ATC TCA GCT GTG CCG TTC TAC CAA TAC AAT TAC 537 Arg Ile Val Arg Asn Ile Ser Ala Val Pro Phe Tyr Gln Tyr Asn Tyr Gln Leu Ile Asp Gln Val Asn Lys Pro Cys Tyr Phe Leu Gly Gly Gln Phe Tyr Cys Ser Gln Thr Leu Arg Ile Ile Thr Leu Ser Met Ala Leu Ala Ser Lys Phe ACGGGTTTTA
AAATGCGCTCAAAATAT 947 (2) INFORMATION FOR SEQ ID NO:94:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 118 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:
Met Gly Tyr Tyr Ser Glu Val Thr Gly Asp Tyr Leu Phe Asn Tyr Asn Ser Thr Ile Val Val Ala Tyr Asp Arg Ser Asp Ala Met Thr Ser Tyr Tyr Ile Asn Val Ile Val Tyr Glu Leu Gln Lys Leu Gly Phe Tyr Asn Val Phe Thr Gln Ala Glu Phe Pro Leu Asp Lys Ala Lys Asn Val Ile Tyr Ala Arg Ile Val Arg Asn Ile Ser Ala Val Pro Phe Tyr Gln Tyr 65 70 75 80 Asn Tyr Gln Leu Ile Asp Gln Val Asn Lys Pro Cys Tyr Phe Leu Gly Gly Gln Phe Tyr Cys Ser Gln Thr Leu Arg Ile Ile Thr Leu Ser Met Ala Leu Ala Ser Lys Phe (2) INFORMATION FOR SEQ ID NO:95:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 875 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 348...716 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:
ATG
Met Gln Ala Phe Lys Ser Val Ser Ala Ile Lys Lys Asp Glu Asn Ile Thr Ala Asn Asn Thr Gln Lys Glu Arg Ile Leu Phe Gly Ala Leu Ser Asn Pro Leu Leu Glu Gly Ala Ile Asp Lys Val Ser Ala Lys Asn Phe Ile Pro Pro Asn Thr Leu Leu Ser Thr Asp Lys Thr Gln Ala Leu Ile Ile Val Arg Lys Asn Asp Ile Ile Thr Gly Val Tyr Glu Glu Gly Gln Ile Ser Ile 70 75 80 CTA CTA GCG ATT
Glu Ile Ser LysAla GluAsn Gly Leu Asn Gln Ile Leu Leu Ala Ile AAT AGC C'TC TTG
Gln Ala Lys LeuGlu AsnLys Ile Lys Ala Lys Val Asn Ser Leu Leu l00 105 1l0 115 AGC AGC TCT GCGCAA TTATAAAGGACA'TTCATGAAATT GGTTTTAGGC746 AAA ATC
Ser Ser Ser AlaGln Leu Lys Ile ATCAGTGGAG TTGCGGTT'TTTAGAAAAATT ACCCAAAGAA806 CGAGCGGGAT
ACCCCTAGCC
TTGTCGTGGC
GTCTAAAAAC
(2) INFORMATION FOR SEQ ID NO:96:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 123 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:96:
Met Gln Ala Phe Lys Ser Val Ser Ala Ile Lys Lys Asp Glu Asn Ile Thr Ala Asn Asn Thr Gln Lys Glu Arg Ile Leu Phe Gly Ala Leu Ser Asn Pro Leu Leu Glu Gly Ala Ile Asp Lys Val Ser Ala Lys Asn Phe Ile Pro Pro Asn Thr Leu Leu Ser Thr Asp Lys Thr Gln Ala Leu Ile Ile Val Arg Lys Asn Asp Ile Ile Thr Gly Val Tyr Glu Glu Gly Gln Ile Ser Ile Glu Ile Ser Leu Lys Ala Leu Glu Asn Gly Ala Leu Asn Gln Ile Ile Gln Ala Lys Asn Leu Glu Ser Asn Lys Ile Leu Lys Ala 100 105 110 Lys Val Leu Ser Ser Ser Lys Ala Gln Ile Leu 115 120 (2) INFORMATION FOR SEQ ID NO:97:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 394 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 160...345 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:97:
Met Gln Lys Glu Gln Glu Ala Gln Glu Ile Ala Lys Lys Ala Val Lys Ile Val Phe Phe Leu Gly Leu Val Val Val Leu Leu Met Met Ile Asn Leu Tyr Met Leu Ile Asn Gln Ile Asn Ala Ser Ala Gln Met Ser His Gln Ile Lys Lys Ile Glu Glu Arg Leu Asn Gln Glu Gln Lys (2) INFORMATION FOR SEQ ID NO:98:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 62 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:98:
Met Gln Lys Glu Gln Glu Ala Gln Glu Ile Ala Lys Lys Ala Val Lys Ile Val Phe Phe Leu Gly Leu Val Val Val Leu Leu Met Met Ile Asn Leu Tyr Met Leu Ile Asn Gln Ile Asn Ala Ser Ala Gln Met Ser His Gln Ile Lys Lys Ile Glu Glu Arg Leu Asn Gln Glu Gln Lys (2) INFORMATION FOR SEQ ID NO:99:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 982 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 320...880 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 320...400 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 401...880 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99:
ATTGAAGTTGGTGATTATAC CTATTTGTATCTTAAAAAT'TTGATTTTAAA AGTTTGAGAT180 GGTTTTGTAGGTGTATCCCA CTTATCCAATTTATATCAA'TATTTTCACTC TAAAACCCTC240 ATCCTTGATAAAAAATTAAA CCTTTTAGAAAAATAACCG:4TTTTAGGGTG TAACTTTAAT300 AAA GCT TTG
Met Ile Lys Arg Ile Cys Ile Leu Ser Ala Leu AGT GCG AGT TTA GCG CTG GCT GGC GAA GTG AA'T GGG TTT TTC ATG GGT 400 Ser Ala Ser Leu Ala Leu Ala Gly Glu Val Asn Gly Phe Phe Met Gly GCG GGT TAT CAG CAA GGT CGT TAT GGT CCT TA'P AAC AGC AAT TAC TCT 448 Ala Gly Tyr Gln Gln Gly Arg Tyr Gly Pro Ty:r Asn Ser Asn Tyr Ser GAT TGG CGC CAT GGC AAT GAT CTT TAT GGT TT(3 AAT TTC AAA TTA GGT 496 Asp Trp Arg His Gly Asn Asp Leu Tyr Gly Leu Asn Phe Lys Leu Gly Phe Val Gly Phe Ala Asn Lys Trp Phe Gly Al<~ Arg Val Tyr Gly Phe Leu Asp Trp Yrie Asn ~rnr-~Ser Gly Thr Glu His Thr Lys Thr Asn Leu Leu Thr Tyr Gly Gly Gly Gly Asp Leu Ile Val Asn Leu Ile Pro Leu 65 70 75 g0 Asp Lys Phe Ala Leu Gly Leu Ile Gly Gly Val Gln Leu Ala Gly Asn Thr Trp Met Phe Pro Tyr Asp Val Asn Gln Thr Arg Phe Gln Phe Leu 100 105 1l0 Trp Asn Leu Gly Gly Arg Met Arg Val Gly Asp Arg Ser Ala Phe Glu l15 120 125 Ala Gly Val Lys Phe Pro Met Val Asn Gln Gly Asn Lys Asp Val Arg Ala Tyr Pro Leu Leu Phe Leu Gly Met Trp Ile Met Phe Phe Thr Phe 145 150 l55 l60 (2) INFORMATION FOR SEQ ID N0:100:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 187 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:100:
Met Ile Lys Arg Ile Ala Cys Ile Leu Ser Leu Ser Ala Ser Leu Ala Leu Ala Gly Glu Val Asn Gly Phe Phe Met Gly Ala Gly Tyr Gln Gln Gly Arg Tyr Gly Pro Tyr Asn Ser Asn Tyr Ser Asp Trp Arg His Gly Asn Asp Leu Tyr Gly Leu Asn Phe Lys Leu Gly Phe Val Gly Phe Ala Asn Lys Trp Phe Gly Ala Arg Val Tyr Gly Phe Leu Asp Trp Phe Asn Thr Ser Gly Thr Glu His Thr Lys Thr Asn Leu Leu Thr Tyr Gly Gly Gly Gly Asp Leu Ile Val Asn Leu Ile Pro Leu Asp Lys Phe Ala Leu Gly Leu Ile Gly Gly Val Gln Leu Ala Gly Asn Thr Trp Met Phe Pro Tyr Asp Val Asn Gln Thr Arg Phe Gln Phe Leu Trp Asn Leu Gly Gly Arg Met Arg Val Gly Asp Arg Ser Ala Phe Glu Ala Gly Val Lys Phe Pro Met Val Asn Gln Gly Asn Lys Asp Val Arg Ala Tyr Pro Leu Leu Phe Leu Gly Met Trp Ile Met Phe Phe Thr Phe (2) INFORMATION FOR SEQ ID NO:101:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 843 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 262...777 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:101:
CCAATGGAGG CGTTTCCAAA AACCCAAACG GGCGCTTT7.'T AAAGAAAAAT CTCAAAAAAT 60 GCGATTTTAA TGAAGAAGAA TTAAAAATCA TGTTTGAAC:C TGAAGAAAAA AGGTTGTTAG 180 Met Ser Lys Lys Assn Ser Val Ile Ser Gly 1 _'~ I 0 Leu Met Asn Phe Phe Ser Glu Lys Asn Glu Arg Trp Leu Leu Ala His Arg His Thr Arg Gly Phe Val Ile Val Ala Trp Leu Phe Arg Phe Lys ~ AGC ATT GCG TTT TCT ATT TTG ATC ACT CTG TTG GTT ATT TTA GTG GAT 435 Ser Ile Ala Phe Ser Ile Leu Ile Thr Leu Leu Val Ile Leu Val Asp ATT TGG GTT TAT AGC GAT GTG CGT CAG TTT TT'A TTG GAC ACT TCT AGC 483 - Ile Trp Val Tyr Ser Asp Val Arg Gln Phe Leu Leu Asp Thr Ser Ser Gly Arg Tyr Gly Pro Tyr Asn Ser A
Ser Phe Ile Trp Leu Leu Ile Ala Leu Leu Ile Lys Trp Gly Val Ile Val Ile Ser Ala Arg Lys Cys Tyr Gln Phe Ser Gln Lys Met Phe Thr Leu Ile Gln Arg Lys Arg Gln Ile Arg Glu Asn Leu Lys Asn Arg Ser 110 115 120 Asn Tyr Lys Asp Thr Lys Asn Ala Glu Lys Leu Ser Ser Ile Ala Glu Glu Ile Ile Ser Lys Lys Gln Glu Glu Ser Arg Pro Lys Glu Asp Ser Asn His Glu Asn His Lys Glu Lys Leu Ser Asn Ile Thr Glu Glu Ser GAGGAATTGA AA
AAAGCTAAAA
AGGATAGGGG
Asp Ser (2) INFORMATION FOR SEQ ID NO:102:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 172 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:102:
Met Ser Lys Lys Asn Ser Val Ile Ser Gly Leu Met Asn Phe Phe Ser Glu Lys Asn Glu Arg Trp Leu Leu Ala His Arg His Thr Arg Gly Phe Val Ile Val Ala Trp Leu Phe Arg Phe Lys Ser Ile Ala Phe Ser Ile Leu Ile Thr Leu Leu Val Ile Leu Val Asp Ile Trp Val Tyr Ser Asp Val Arg Gln Phe Leu Leu Asp Thr Ser Ser Ser Phe Ile Trp Leu Leu Ile Ala Leu Leu Ile Lys Trp Gly Val Ile Val Ile Ser Ala Arg Lys 85 90 95 Cys Tyr Gln Phe Ser Gln Lys Met Phe Thr Leu Ile Gln Arg Lys Arg Gln Ile Arg Glu Asn Leu Lys Asn Arg Ser Asn Tyr Lys Asp Thr Lys Asn Ala Glu Lys Leu Ser Ser Ile Ala Glu Glu Ile Ile Ser Lys Lys Gln Glu Glu Ser Arg Pro Lys Glu Asp Ser Asn His Glu Asn His Lys Glu Lys Leu Ser Asn Ile Thr Glu Glu Ser Asp Ser (2) INFORMATION FOR SEQ ID NO:103:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1047 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 34...1005 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:103:
AGAAAGAAAC CATTCAAGGA ACGCATTGAT TTG ATG F.AT AAA CCA TTT TTA ATC 54 Met A.sn Lys Pro Phe Leu Ile Leu Leu Ile Ala Leu Ile Val Phe Ser Gly Cys Asn Met Arg Lys Tyr Phe Lys Pro Ala Lys His Gln Ile Lys Gly Glu Ala Tyr Phe Pro Asn His Leu Gln Glu Ser Ile Val Ser Ser Asn Arg Tyr Gly Ala Ile Leu ' Lys Asn Gly Ala Val Ile Gly Asp Lys Gly Leu Thr Gln Leu Arg Ile Gly Lys Asn L~rie Asn ryr-iilu Ser Ser Phe Leu Asn Glu Ser Gln Gly Phe Phe Ile Leu Ala Gln Asp Cys Leu Asn Lys Ile Asp Lys Lys Thr 90 95 loa Asn Lys Ser Lys Val Ala Lys Thr Glu Glu Thr Glu Leu Lys Leu Lys 105 110 l15 Gly Val Glu Ala Glu Val Gln Asp Lys Val Cys His Gln Val Glu Leu l20 125 130 135 Ile Ser Asn Asn Pro Asn Ala Ser Gln Gln Ser Ile Val Ile Pro Leu l40 145 150 Glu Thr Phe Ala Leu Ser Ala Ser Val Lys Gly Asn Leu Leu Ala Val Val Leu Ala Asp Asn Ser Ala Asn Leu Tyr Asp Ile Thr Ser Gln Lys l70 175 180 Leu Leu Phe Ser Glu Lys Gly Ser Pro Ser Thr Thr Ile Asn Ser Leu Met Ala Met Pro Ile Phe Met Asp Thr Val Val Val Phe Pro Met Leu Asp Gly Arg Leu Leu Val Val Asp Tyr Val His Gly Asn Pro Thr Pro Ile Arg Asn Ile Val Ile Ser Ser Asp Lys Phe Phe Asn Asn Ile Thr Tyr Leu Ile Val Asp Gly Asn Asn Met Ile Ala Ser Thr Gly Lys Arg Ile Leu Ser Val Val Ser Gly Gln Glu Phe Asn Tyr Asp Gly Asp Ile Val Asp Leu Leu Tyr Asp Lys Gly Thr Leu Tyr Val Leu Thr Leu Asp GGG CAG ATT TTV CAA AT3~.GAT AAG AGT TTG A.GG GAA TTA AAC AGC GTG 966 Gly Gln Ile Leu Gln Met Asp Lys Ser Leu Arg Glu Leu Asn Ser Val AAA CTG CCT NTC NTC GCT CAA CAC GAT TGT A.TT AAA CCA TAATAAATTG TA 1017 Lys Leu Pro Xaa Xaa Ala Gln His Asp Cys Ile Lys Pro (2} INFORMATION FOR SEQ ID N0:104:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 324 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:104:
Met Asn Lys Pro Phe Leu Ile Leu Leu Ile Ala Leu Ile Val Phe Ser Gly Cys Asn Met Arg Lys Tyr Phe Lys Pro Ala Lys His Gln Ile Lys Gly Glu Ala Tyr Phe Pro Asn His Leu Gln Glu Ser Ile Val Ser Ser Asn Arg Tyr Gly Ala Ile Leu Lys Asn Gly Ala Val Ile Gly Asp Lys Gly Leu Thr Gln Leu Arg Ile Gly Lys Asn Phe Asn Tyr Glu Ser Ser Phe Leu Asn Glu Ser Gln Gly Phe Phe Ile Leu Ala Gln Asp Cys Leu Asn Lys Ile Asp Lys Lys Thr Asn Lys Ser Lys Val Ala Lys Thr Glu Glu Thr Glu Leu Lys Leu Lys Gly Val Glu Ala Glu Val Gln Asp Lys Val Cys His Gln Val Glu Leu Ile Ser Asn Asn Pro Asn Ala Ser Gln 130 135 140 Gln Ser Ile Val Ile Pro Leu Glu Thr Phe Ala Leu Ser Ala Ser Val Lys Gly Asn Leu Leu Ala Val Val Leu Ala Asp Asn Ser Ala Asn Leu Tyr Asp Ile Thr Ser Gln Lys Leu Leu Phe Ser Glu Lys Gly Ser Pro Ser Thr Thr Ile Asn Ser Leu Met Ala Met Pro Ile Phe Met Asp Thr Val Val Val Phe Pro Met Leu Asp Gly Arg Leu Leu Val Val Asp Tyr 210 215 220 Val His Gly Asn Pro Thr Pro Ile Arg Asn Ile Val Ile Ser Ser Asp 225 230 235 240 Lys Phe Phe Asn Asn Ile Thr Tyr Leu Ile Val Asp Gly Asn Asn Met Ile Ala Ser Thr Gly Lys Arg Ile Leu Ser Val Val Ser Gly Gln Glu Phe Asn Tyr Asp Gly Asp Ile Val Asp Leu Leu Tyr Asp Lys Gly Thr Leu Tyr Val Leu Thr Leu Asp Gly Gln Ile Leu Gln Met Asp Lys Ser Leu Arg Glu Leu Asn Ser Val Lys Leu Pro Xaa Xaa Ala Gln His Asp Cys Ile Lys Pro (2) INFORMATION FOR SEQ ID NO:105:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1968 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 153...1793 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 153...219 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105:
Met Asp Lys Asn Asn Asn Asn Leu Arg Leu Ile Leu Ala Ile Ala Leu Ser Phe Leu Phe Ile Ala Leu Tyr Ser Tyr Phe Phe Gln Lys Pro Asn Lys Thr Thr Thr Gln Thr Thr Lys Gln Glu Thr Thr Asn Asn His Thr Ala Thr Ser Pro Asn Ala Pro Asn Ala Gln His Phe Ser Thr Thr Gln Thr Thr Pro Gln Glu Asn Leu WO 98/21225 PCT/US97/21353 w CTA AGC ACV A'r'r '1'L'1 T i T - GAG CAT GCC AGG ATT GAA ATT GAT TCT TTA 413 Leu Ser Thr Ile Ser Phe Glu His Ala Arg Ile Glu Ile Asp Ser Leu Gly Arg Ile Lys Gln Val Tyr Leu Lys Asp Lys Lys Tyr Leu Thr Pro - . Lys Gln Lys Gly Phe Leu Glu His Val Gly His Leu Phe Ser Ser Lys Glu Asn Ala Gln Pro Pro Leu Lys Glu Leu P:ro Leu Leu Ala Ala Asp 100 l05 110 Lys Leu Lys Pro Leu Glu Val Arg Phe Leu A;sp Pro Thr Leu Asn Asn 115 l20 125 AAA GCG TTC AAC ACC CCT TAT AGC GCT TCA A~~1A ACC ACT CTT GGG CCT 653 Lys Ala Phe Asn Thr Pro Tyr Ser Ala Ser Lys Thr Thr Leu Gly Pro 130 135 l40 _ 145 Asn Glu Gln Leu Val Leu Thr Gln Asp Leu G:ly Thr Leu Ser Ile Ile Lys Thr Leu Thr Phe Tyr Asp Asp Leu His Tyr Asp Leu Lys Ile Ala TTC AAA TCG CCC AAT AAC CTT ATC CCT AGC Ti~T GTG ATC ACC AAT GGT 797 Phe Lys Ser Pro Asn Asn Leu Ile Pro Ser Tyr Val Ile Thr Asn Gly Tyr Arg Pro Va1 Ala Asp Leu Asp Ser Tyr Thr Phe Ser Gly Val Leu TTA GAA AAT AGC GAC AAA AAA ATT GAA AAA A'CT GAA GAT AAA GAC GCT 893 Leu Glu Asn Ser Asp Lys Lys Ile Glu Lys I:Le Glu Asp Lys Asp Ala 210 2l5 220 225 AAA GAA ATC AAA CGC TTT TCT AAC ACC CTC T'CT TTA TCC AGC GTG GAT 941 Lys Glu Ile Lys Arg Phe Ser Asn Thr Leu Phe Leu Ser Ser Val Asp Arg Tyr Phe Thr Thr Leu Leu Phe Thr Lys Asp Pro Gln Gly Phe Glu Ala Leu Ile Asp Ser Glu Ile Gly Thr Lys Asn Pro Leu Gly Phe Ile Ser Leu Lys Asn Glu Ala Asn Leu His Gly Tyr Ile Gly Pro Lys Asp Tyr Arg Ser Leu Lys Ala Ile Ser Pro Met Leu Thr Asp Val Ile Glu Tyr Gly Leu Ile Thr Phe Phe Ala Lys Gly Val Phe Val Leu Leu Asp 310 3l5 320 TAT TTG TAT CAA TTC GTG GGC AAT TGG GGT TGG GCT ATC ATT CTT TTA 1229 _ Tyr Leu Tyr Gln Phe Val Gly Asn Trp Gly Trp Ala Ile Ile Leu Leu Thr Ile Ile Val Arg Ile Ile Leu 
Tyr Pro Leu Ser Tyr Lys Gly Met Val Ser Met Gln Lys Leu Lys Glu Leu Ala Pro Lys Met Lys Glu Leu Gln Glu Lys Tyr Lys Gly Glu Pro Gln Lys Leu Gln Ala His Met Met CAG CTT TAC AAA AAA CAT GGG GCT AAC CCA CTA GGG GGT TGT CTG CCC l421 Gln Leu Tyr Lys Lys His Gly Ala Asn Pro Leu Gly Gly Cys Leu Pro Leu Ile Leu Gln Ile Pro Val Phe Phe Ala Ile Tyr Arg Val Leu Tyr 405 4l0 415 AAC GCT GTG GAA TTG AAA AGC TCA GAG TGG ATC TTA TGG ATT CAT GAT l517 Asn Ala Val Glu Leu Lys Ser Ser Glu Trp Ile Leu Trp Ile His Asp Leu Ser Ile Met Asp Pro Tyr Phe Ile Leu Pro Leu Leu Met Gly Ala Ser Met Tyr Trp His Gln Ser Val Thr Pro Asn Thr Met Thr Asp Pro Met Gln Ala Lys Ile Phe Lys Leu Leu Pro Leu Leu Phe Thr Ile Phe Leu Ile Thr Phe Pro Ala Gly Leu Val Leu Tyr Trp Thr Thr Asn Asn ' ATC CTT TC'.G GTG TTCU Clan CAA CTC ATC ATC P.AT AAA GTC TTA GAG AAT 1757 Ile Leu Ser Val Leu Gln Gln Leu Ile Ile A,sn Lys Val Leu Glu Asn Lys Lys Arg Met His Ala Gln Asn Lys Lys Glu His (2) INFORMATION FOR SEQ ID N0:106:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 547 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...22 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:106:
Met Asp Lys Asn Asn Asn Asn Leu Arg Leu Ile Leu Ala Ile Ala Leu Ser Phe Leu Phe Ile Ala Leu Tyr Ser Tyr Phe Phe Gln Lys Pro Asn Lys Thr Thr Thr Gln Thr Thr Lys Gln Glu Thr Thr Asn Asn His Thr Ala Thr Ser Pro Asn Ala Pro Asn Ala Gln His Phe Ser Thr Thr Gln Thr Thr Pro Gln Glu Asn Leu Leu Ser Thr Ile Ser Phe Glu His Ala Arg Ile Glu Ile Asp Ser Leu Gly Arg Ile Lys Gln Val Tyr Leu Lys Asp Lys Lys Tyr Leu Thr Pro Lys Gln Lys Gly Phe Leu Glu His Val Gly His Leu Phe Ser Ser Lys Glu Asn Ala Gln Pro Pro Leu Lys Glu Leu Pro Leu Leu Ala Ala Asp Lys Leu Lys Pro Leu Glu Val Arg Phe 110 115 120 Leu Asp Pro Thr Leu Asn Asn Lys Ala Phe Asn Thr Pro Tyr Ser Ala 125 130 135 Ser Lys Thr Thr Leu Gly Pro Asn Glu Gln Leu Val Leu Thr Gln Asp Leu Gly Thr Leu Ser Ile Ile Lys Thr Leu Thr Phe Tyr Asp Asp Leu 155 160 165 170 His Tyr Asp Leu Lys Ile Ala Phe Lys Ser Pro Asn Asn Leu Ile Pro 175 180 185 Ser Tyr Val Ile Thr Asn Gly Tyr Arg Pro Val Ala Asp Leu Asp Ser Tyr Thr Phe Ser Gly Val Leu Leu Glu Asn Ser Asp Lys Lys Ile Glu Lys Ile Glu Asp Lys Asp Ala Lys Glu Ile Lys Arg Phe Ser Asn Thr Leu Phe Leu Ser Ser Val Asp Arg Tyr Phe Thr Thr Leu Leu Phe Thr Lys Asp Pro Gln Gly Phe Glu Ala Leu Ile Asp Ser Glu Ile Gly Thr Lys Asn Pro Leu Gly Phe Ile Ser Leu Lys Asn Glu Ala Asn Leu His Gly Tyr Ile Gly Pro Lys Asp Tyr Arg Ser Leu Lys Ala Ile Ser Pro Met Leu Thr Asp Val Ile Glu Tyr Gly Leu Ile Thr Phe Phe Ala Lys Gly Val Phe Val Leu Leu Asp Tyr Leu Tyr Gln Phe Val Gly Asn Trp Gly Trp Ala Ile Ile Leu Leu Thr Ile Ile Val Arg Ile Ile Leu Tyr Pro Leu Ser Tyr Lys Gly Met Val Ser Met Gln Lys Leu Lys Glu Leu Ala Pro Lys Met Lys Glu Leu Gln Glu Lys Tyr Lys Gly Glu Pro Gln Lys Leu Gln Ala His Met Met Gln Leu Tyr Lys Lys His Gly Ala Asn Pro Leu Gly Gly Cys Leu Pro Leu Ile Leu Gln Ile Pro Val Phe Phe Ala Ile Tyr Arg Val Leu Tyr Asn Ala Val Glu Leu Lys Ser Ser Glu Trp Ile Leu Trp Ile His Asp Leu Ser Ile Met Asp Pro Tyr Phe Ile Leu Pro Leu Leu Met Gly Ala Ser Met Tyr Trp His Gln Ser Val Thr Pro Asn Thr Met
Thr Asp Pro Met Gln Ala Lys Ile Phe Lys Leu Leu Pro Leu Leu Phe Thr Ile Phe Leu Ile Thr Phe Pro Ala Gly Leu Val Leu Tyr Trp Thr Thr Asn Asn Ile Leu Ser Val Leu Gln Gln Leu Ile Ile Asn Lys Val Leu Glu Asn Lys Lys Arg Met His Ala Gln Asn Lys 510 515 520 Lys Glu His (2) INFORMATION FOR SEQ ID NO:107:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3280 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 151...3207 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 151...241 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:107:
AGGCATGATC AACAATTTAG GGGAGGAATG ATG CTC GCT TCC ATT ATT GAA TTT l74 Met Leu Ala Ser Ile Ile Glu Phe Ser Leu Arg Gln Arg Val Ile Val Ile Val Gly Ala Ile Leu Ile Leu TTT TTT GGG ACT TAT AGT TTT ATC AAC ACT CC'A GTG GAC GCT TTC CCG 270 Phe Phe Gly Thr Tyr Ser Phe Ile Asn Thr Pro Val Asp Ala Phe Pro Asp Ile Ser Pro Thr Gln Val Lys Ile Ile Leu Lys Leu Pro Gly Ser Ser Pro Glu Glu Met Glu Asn Asn Ile Val Arg Pro Leu Glu Leu Glu Leu Leu Gly Leu Lys Gly Gln Lys Ser Leu Arg Ser Val Ser Lys Tyr Ser Ile Ser Asp Ile Thr Ile Asp Phe Asp Asp Ser Val Asp Ile Tyr Leu Ala Arg Asn Ile Val Asn Glu Arg Leu Ser Ser Val Met Lys Asp Leu Pro Val Gly Val Glu Gly Gly Met Ala Pro Ile Val Thr Pro Leu TCA GAT ATC TTT ATG 'i"1'L--ACT ATT GAT GGC AAT ATC ACT GAG ATA GAA 606 Ser Asp Ile Phe Met Phe Thr Ile Asp Gly Asn Ile Thr Glu Ile Glu Lys Arg Gln Leu Leu Asp Phe Val Ile Arg Pro Gln Leu Arg Met Ile Ser Gly Val Ala Asp Val Asn Ser Ile Gly Gly Phe Ser Arg Ala Phe Val Ile Val Pro Asp Phe Asn Asp Met Ala Arg Leu Gly Val Ser Ile Ser Asp Leu Glu Ser Ala Val Arg Val Asn Leu Arg Asn Ser Gly Ala Gly Arg Val Asp Arg Asp Gly Glu Thr Phe Leu Val Lys Ile Gln Thr Ala Ser Leu Ser Leu Glu Asp Ile Gly Lys Ile Thr Val Ser Thr Asn Leu Gly His Leu His Ile Lys Asp Phe Ala Lys Val Ile Ser Gln Ser Arg Thr Arg Leu Gly Phe Val Thr Lys Asp Gly Val Gly Glu Thr Thr GAA GGC TTG GTG CTT TCT TTA AAA GAC GCT AAC ACC AAA GAA ATC ATC l038 Glu Gly Leu Val Leu Ser Leu Lys Asp Ala Asn Thr Lys Glu Ile Ile Thr Gln Val Tyr Gln Lys Leu Glu Glu Leu Lys Pro Phe Leu Pro Asn Gly Val Ser Ile Asn Val Phe Tyr Asp Arg Ser Glu Phe Thr Gln Lys Ala Ile Ala Thr Val Ser Lys Thr Leu Ile Glu Ala Val VaI Leu Ile Ile Ile Thr Leu Phe Leu Phe Leu Gly Asn Leu Arg Ala Ser Val Ala -22e-GTG GGG GTG A'i'1' '1'1H C~T..TTA AGC TTG TCC G'TG GCG TTT ATT TTT ATC 1278 Val Gly Val Ile Leu Pro Leu Ser Leu Ser Val Ala Phe Ile Phe Ile Lys Phe Ser Asp Leu Thr Leu Asn Leu Met Se r Leu Gly Gly Leu Val ATC GCT ATA GGC ATG CTC ATT GAC TCA GCC G'.CG GTG GTG GTG GAA AAC 1374 
Ile Ala Ile Gly Met Leu Ile Asp Ser Ala Val Val Val Val Glu Asn GCT TTT GAA AAA TTA AGC GCT AAC ACT AAA A<:C ACT AAA CTC CAT GCA 1422 Ala Phe Glu Lys Leu Ser Ala Asn Thr Lys Thr Thr Lys Leu His Ala ATC TAT CGT TCG TGT AAA GAA ATC GCT GTT TC:A GTG GTG AGC GGG GTG 1470 I1e Tyr Arg Ser Cys Lys Glu Ile Ala Val Ser Val Val Ser Gly Val GTG ATC ATC ATT GTG TTT TTT GTG CCG ATT T7.'A ACC TTA CAG GGG TTA 15l8 Val Ile Ile Ile Val Phe Phe Val Pro Ile Le:u Thr Leu Gln Gly Leu Glu Gly Lys Met Phe Arg Pro Leu Ala Gln Seer Ile Val Tyr Ala Leu Leu Gly Thr Leu Val Leu Ser Ile Thr Ile Ile Pro Val Val Ser Ser Leu Val Leu Lys Ala Thr Pro His Ser Glu Thr Phe Leu Thr Arg Phe Leu Asn Arg Ile Tyr Ala Pro Leu Leu Glu Phe Phe Val His Asn Pro Lys Lys Val Ile Leu Gly Ala Phe Val Phe Leu Ile Ala Ser Leu Ser Leu Phe Pro Phe Val Gly Lys Asn Phe Met Pro Val Leu Asp Glu Gly Asp Val Val Leu Ser Val Glu Thr Thr Pro Ser Ile Ser Leu Asp Gln TCT AGG GAT CTC ATG CTA AAC ATT GAG AGC GCG ATT AAA AAG CAT GTC l902 Ser Arg Asp Leu Met Leu Asn Ile Glu Ser Al,a Ile Lys Lys His Val AAG GAA GTT AAA AGC A'1"1'-GTC GCG CGC ACA GGG AGC GAT GAA TTG GGG 1950 Lys Glu Val Lys 5er Ile Val Ala Arg Thr Gly Ser Asp Glu Leu Gly CTG GAT TTA GGA GGT TTG AAT CAA ACC GAT ACT TTT ATT TCT TTT ATT l998 Leu Asp Leu Gly Gly Leu Asn Gln Thr Asp Thr Phe Ile Ser Phe Ile Pro Lys Lys Glu Trp Ser Val Lys Thr Lys Asp Glu Leu Leu G1u Lys Ile Met Asp Ser Leu Lys Asp Phe Lys Gly Ile Asn Phe Ser Phe Thr Gln Pro Ile Glu Met Arg Ile Ser Glu Met Leu Thr Gly Val Arg Gly Asp Leu Ala Val Lys Ile Phe Gly Asp Gly Ile Ser Glu Leu Asn Glu Leu Ser Phe Gln Ile Ala Gln Ala Leu Lys Gly Ile Lys Gly Ser Ser GAA~GTT TTA ACC ACG CTT AAT GAG GGC GTG AAT TAT TTG TAT GTA ACC 2286 Glu Val Leu Thr Thr Leu Asn Glu Gly Val Asn Tyr Leu Tyr Val Thr Pro Asn Lys Glu Ser Met Ala Asp Val Gly Ile Thr Ser Asp Glu Phe Ser Lys Phe Leu Lys Ser Ala Leu Glu Gly Leu Val Val Asp Val Ile Pro Thr Gly Ile Ser Arg Thr Pro Val Met Ile Arg Gln Glu Ser Asp Phe Ala Ser Ser Ile Thr Lys Ile 
Lys Ser Leu Ala Leu Thr Ser Lys Tyr Gly Val Leu Val Pro Ile Thr Ser Ile Ala Lys Ile Glu Glu Val Asp Gly Pro Val Ser Val Val Arg Glu Asn Ser Met Arg Met Ser Val GTT CGC AGT AAT GTG G'1'G~~GGG CGC GAT TTG AAA TCT TTT GTA GAA GAG 2622 Val Arg Ser Asn Val Val Gly Arg Asp Leu Lys Ser Phe Val Glu Glu GCT AAA AAA GTG ATC GCT CAA AAC ATC AAA C'TC CCT CCC AGC TAC TAT 2670 Ala Lys Lys Val Ile Ala Gln Asn Ile Lys Leu Pro Pro Ser Tyr Tyr ATC ACT TAT GGG GGG CAG TTT GAA AAC CAG C.zIA CGG GCC AAT AAA AGG 2718 Ile Thr Tyr Gly Gly Gln Phe Glu Asn Gln Gln Arg Ala Asn Lys Arg Leu Ser Thr Val Ile Pro Leu Ser Ile Leu Ala Ile Phe Phe Tle Leu TTT TTC ACT TTT AAA AGC ATT CCT TTA GCC T'rG CTC ATT CTT TTG AAT 2814 Phe Phe Thr Phe Lys Ser Ile Pro Leu Ala L~=_u Leu Ile Leu Leu Asn Ile Pro Phe Ala Val Thr Gly Gly Leu Ile A.la Leu Phe Ala Val Gly GAG TAT ATT TCA GTG CCA GCG AGC GTG GGC T'rT ATC GCT CTT TTT GGG 29l0 Glu Tyr Ile Ser Val Pro Ala Ser Val Gly Plze Ile Ala Leu Phe Gly Ile Ala Val Leu Asn Gly Val Val Met Ile G:Ly Tyr Phe Lys Glu Leu CTC TTG CAA GGG AAA AGC GTA GAA GAA TGC G'rT TTA TTG GGC GCT AAA 3006 Leu Leu Gln Gly Lys Ser Val Glu Glu Cys V<~1 Leu Leu Gly Ala Lys Arg Arg Leu Arg Pro Val Leu Met Thr Ala Cys Ile Ala Gly Leu Gly Leu Leu Pro Leu Leu Phe Ser His Ser Val Gly Ser Glu Val Gln Lys CCT TTA GCG ATC GTG GTG CTT GGA GGC TTG G'.CT ACC TCA AGC GCT CTA 3150 Pro Leu Ala Ile Val Val Leu Gly Gly Leu Val Thr Ser Ser Ala Leu 955 960 9(i5 970 ACC TTA CTC CTA CTG CCG CCA ATG TTT ATG C'.CC ATC GCT AAA AAG ATT 3198 Thr Leu Leu Leu Leu Pro Pro Met Phe Met Le,u Ile Ala Lys Lys Ile AAA ATC GTT TGAGTTAAAG GATTTCACAT GCTCGCT'.CTA GAAATTTATA TTGATATTT 3256 Lys Ile Val GTTTGAAAGA CGCTTTAA'1'H GATT 3280 (2) INFORMATION FOR SEQ ID N0:108:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1019 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...30 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:108:
Met Leu Ala Ser Ile Ile Glu Phe Ser Leu Arg Gln Arg Val Ile Val Ile Val Gly Ala Ile Leu Ile Leu Phe Phe Gly Thr Tyr Ser Phe Ile Asn Thr Pro Val Asp Ala Phe Pro Asp Ile Ser Pro Thr Gln Val Lys Ile Ile Leu Lys Leu Pro Gly Ser Ser Pro Glu Glu Met Glu Asn Asn Ile Val Arg Pro Leu Glu Leu Glu Leu Leu Gly Leu Lys Gly Gln Lys Ser Leu Arg Ser Val Ser Lys Tyr Ser Ile Ser Asp Ile Thr Ile Asp Phe Asp Asp Ser Val Asp Ile Tyr Leu Ala Arg Asn Ile Val Asn Glu Arg Leu Ser Ser Val Met Lys Asp Leu Pro Val Gly Val Glu Gly Gly Met Ala Pro Ile Val Thr Pro Leu Ser Asp Ile Phe Met Phe Thr Ile 100 105 110 Asp Gly Asn Ile Thr Glu Ile Glu Lys Arg Gln Leu Leu Asp Phe Val 115 120 125 130 Ile Arg Pro Gln Leu Arg Met Ile Ser Gly Val Ala Asp Val Asn Ser 135 140 145 Ile Gly Gly Phe Ser Arg Ala Phe Val Ile Val Pro Asp Phe Asn Asp Met Ala Arg Leu Gly Val Ser Ile Ser Asp Leu Glu Ser Ala Val Arg Val Asn Leu Arg Asn Ser Gly Ala Gly Arg Val Asp Arg Asp Gly Glu 180 185 190 Thr Phe Leu Val Lys Ile Gln Thr Ala Ser Leu Ser Leu Glu Asp Ile Gly Lys Ile Thr Val Ser Thr Asn Leu Gly His Leu His Ile Lys Asp Phe Ala Lys Val Ile Ser Gln Ser Arg Thr Arg Leu Gly Phe Val Thr Lys Asp Gly Val Gly Glu Thr Thr Glu Gly Leu Val Leu Ser Leu Lys Asp Ala Asn Thr Lys Glu Ile Ile Thr Gln Val Tyr Gln Lys Leu Glu Glu Leu Lys Pro Phe Leu Pro Asn Gly Val Ser Ile Asn Val Phe Tyr Asp Arg Ser Glu Phe Thr Gln Lys Ala Ile Ala Thr Val Ser Lys Thr Leu Ile Glu Ala Val Val Leu Ile Ile Ile Thr Leu Phe Leu Phe Leu Gly Asn Leu Arg Ala Ser Val Ala Val Gly Val Ile Leu Pro Leu Ser 325 330 335 Leu Ser Val Ala Phe Ile Phe Ile Lys Phe Ser Asp Leu Thr Leu Asn Leu Met Ser Leu Gly Gly Leu Val Ile Ala Ile Gly Met Leu Ile Asp Ser Ala Val Val Val Val Glu Asn Ala Phe Glu Lys Leu Ser Ala Asn Thr Lys Thr Thr Lys Leu His Ala Ile Tyr Arg Ser Cys Lys Glu Ile Ala Val Ser Val Val Ser Gly Val Val Ile Ile Ile Val Phe Phe Val Pro Ile Leu Thr Leu Gln Gly Leu Glu Gly Lys Met Phe Arg Pro Leu Ala Gln Ser Ile Val Tyr Ala Leu Leu Gly Thr Leu Val Leu Ser Ile Thr Ile
Ile Pro Val Val Ser Ser Leu Val Leu Lys Ala Thr Pro His Ser Glu Thr Phe Leu Thr Arg Phe Leu Asn Arg Ile Tyr Ala Pro Leu Leu Glu Phe Phe Val His Asn Pro Lys Lys Val Ile Leu Gly Ala Phe Val Phe Leu Ile Ala Ser Leu Ser Leu Phe Pro Phe Val Gly Lys Asn Phe Met Pro Val Leu Asp Glu Gly Asp Val Val Leu Ser Val Glu Thr 515 520 525 530 Thr Pro Ser Ile Ser Leu Asp Gln Ser Arg Asp Leu Met Leu Asn Ile Glu Ser Ala Ile Lys Lys His Val Lys Glu Val Lys Ser Ile Val Ala Arg Thr Gly Ser Asp Glu Leu Gly Leu Asp Leu Gly Gly Leu Asn Gln Thr Asp Thr Phe Ile Ser Phe Ile Pro Lys Lys Glu Trp Ser Val Lys Thr Lys Asp Glu Leu Leu Glu Lys Ile Met Asp Ser Leu Lys Asp Phe Lys Gly Ile Asn Phe Ser Phe Thr Gln Pro Ile Glu Met Arg Ile Ser Glu Met Leu Thr Gly Val Arg Gly Asp Leu Ala Val Lys Ile Phe Gly Asp Gly Ile Ser Glu Leu Asn Glu Leu Ser Phe Gln Ile Ala Gln Ala Leu Lys Gly Ile Lys Gly Ser Ser Glu Val Leu Thr Thr Leu Asn Glu Gly Val Asn Tyr Leu Tyr Val Thr Pro Asn Lys Glu Ser Met Ala Asp Val Gly Ile Thr Ser Asp Glu Phe Ser Lys Phe Leu Lys Ser Ala Leu Glu Gly Leu Val Val Asp Val Ile Pro Thr Gly Ile Ser Arg Thr Pro Val Met Ile Arg Gln Glu Ser Asp Phe Ala Ser Ser Ile Thr Lys Ile Lys Ser Leu Ala Leu Thr Ser Lys Tyr Gly Val Leu Val Pro Ile Thr Ser Ile Ala Lys Ile Glu Glu Val Asp Gly Pro Val Ser Val Val Arg 755 760 765 770 Glu Asn Ser Met Arg Met Ser Val Val Arg Ser Asn Val Val Gly Arg Asp Leu Lys Ser Phe Val Glu Glu Ala Lys Lys Val Ile Ala Gln Asn Ile Lys Leu Pro Pro Ser Tyr Tyr Ile Thr Tyr Gly Gly Gln Phe Glu Asn Gln Gln Arg Ala Asn Lys Arg Leu Ser Thr Val Ile Pro Leu Ser Ile Leu Ala Ile Phe Phe Ile Leu Phe Phe Thr Phe Lys Ser Ile Pro Leu Ala Leu Leu Ile Leu Leu Asn Ile Pro Phe Ala Val Thr Gly Gly Leu Ile Ala Leu Phe Ala Val Gly Glu Tyr Ile Ser Val Pro Ala Ser Val Gly Phe Ile Ala Leu Phe Gly Ile Ala Val Leu Asn Gly Val Val Met Ile Gly Tyr Phe Lys Glu Leu Leu Leu Gln Gly Lys Ser Val Glu Glu Cys Val Leu Leu Gly Ala Lys Arg Arg Leu Arg Pro Val Leu Met Thr Ala Cys
Ile Ala Gly Leu Gly Leu Leu Pro Leu Leu Phe Ser His Ser Val Gly Ser Glu Val Gln Lys Pro Leu Ala Ile Val Val Leu Gly Gly Leu Val Thr Ser Ser Ala Leu Thr Leu Leu Leu Leu Pro Pro Met Phe Met Leu Ile Ala Lys Lys Ile Lys Ile Val (2) INFORMATION FOR SEQ ID NO:109:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 898 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 86...835 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 86...161 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:109:
GCATAAAATA AACAAACATT AAGTAAGGCT TATCAATA'rT TGATTACAAT TATAAGGGTT 60 Met Leu Gly Asn Val Lys Lys Thr Leu TTT GGG GTC TTG TGT TTG GGC ACG TTG TGT T'rG AGA GGG TTA ATG GCA 160 Phe Gly Val Leu Cys Leu Gly Thr Leu Cys Leu Arg Gly Leu Met Ala Glu Pro Asp Ala Lys Glu Leu Val Asn Leu G.Ly Ile Glu Ser Ala Lys 1 5 10 15_ AAG CAA GAT TTC GCT CAA GCT AAA ACG CAT T'CT GAA AAA GCT TGT GAG 256 Lys Gln Asp Phe Ala Gln Ala Lys Thr His Phe Glu Lys Ala Cys Glu Leu Lys Asn Gly Phe Gly Cys Val Phe Leu Gly Ala Phe Tyr Glu Glu Gly Lys Gly Val Gly Lys Asp Leu Lys Lys A.~a Ile Gln Phe Tyr Thr AAA GGT TGT GAA TTA AAT GAT GGT TAT GGG TCiT AAC CTG CTA GGA AAT 400 Lys Gly Cys Glu Leu Asn Asp Gly Tyr Gly Cys Asn Leu Leu Gly Asn 65 70 7'i 80 Leu Tyr Tyr Asn Gly Gln Gly Val Ser Lys A:>p Ala Lys Lys Ala Ser Gln Tyr Tyr Ser Lys Ala Cys Asp Leu Asn His Ala Glu Gly Cys Met VaI Leu Gly Ser Leu His His Tyr Gly Val Gl.y Thr Pro Lys Asp Leu Arg Lys Ala Leu Asp Leu Tyr Glu Lys Ala Cys Asp Leu Lys Asp Ser CCA GGG TGT ATT AAT u~A--GGA TAT ATA TAT AGT GTA ACA AAG AAT TTT 640 Pro Gly Cys Ile Asn Ala Gly Tyr I1e Tyr Ser Val Thr Lys Asn Phe 145 150 155 l60 Lys Glu Ala Ile Val Arg Tyr Ser Lys Ala Cys Glu Leu Lys Asp Gly Arg Gly Cys Tyr Asn Leu Gly Val Met Gln Tyr Asn Ala Gln Gly Thr 180 185 l90 GCA AAG GAC GAA AAG CAA GCG GTA GAA AAC TTT AAA AAA GGC TGC AAA 784 _ Ala Lys Asp Glu Lys Gln Ala Val Glu Asn Phe Lys Lys Gly Cys Lys Ser Ser Val Lys Glu Ala Cys Asp Ala Leu Lys Glu Leu Lys Ile Glu 2l0 2Z5 220 Leu TTTTAAG ggg (2) INFORMATION FOR SEQ ID NO:110:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 250 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...25 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:110:
Met Leu Gly Asn Val Lys Lys Thr Leu Phe Gly Val Leu Cys Leu Gly Thr Leu Cys Leu Arg Gly Leu Met Ala Glu Pro Asp Ala Lys Glu Leu Val Asn Leu Gly Ile Glu Ser Ala Lys Lys Gln Asp Phe Ala Gln Ala Lys Thr His Phe Glu Lys Ala Cys Glu Leu Lys Asn Gly Phe Gly Cys Val Phe Leu Gly Ala Phe Tyr Glu Glu Gly Lys Gly Val Gly Lys Asp Leu Lys Lys Ala Ile Gln Phe Tyr Thr Lys Gly Cys Glu Leu Asn Asp Gly Tyr Gly Cys Asn Leu Leu Gly Asn Leu Tyr Tyr Asn Gly Gln Gly 75 80 85 Val Ser Lys Asp Ala Lys Lys Ala Ser Gln Tyr Tyr Ser Lys Ala Cys Asp Leu Asn His Ala Glu Gly Cys Met Val Leu Gly Ser Leu His His Tyr Gly Val Gly Thr Pro Lys Asp Leu Arg Lys Ala Leu Asp Leu Tyr 120 125 130 135 Glu Lys Ala Cys Asp Leu Lys Asp Ser Pro Gly Cys Ile Asn Ala Gly Tyr Ile Tyr Ser Val Thr Lys Asn Phe Lys Glu Ala Ile Val Arg Tyr 155 160 165 Ser Lys Ala Cys Glu Leu Lys Asp Gly Arg Gly Cys Tyr Asn Leu Gly Val Met Gln Tyr Asn Ala Gln Gly Thr Ala Lys Asp Glu Lys Gln Ala Val Glu Asn Phe Lys Lys Gly Cys Lys Ser Ser Val Lys Glu Ala Cys 200 205 210 215 Asp Ala Leu Lys Glu Leu Lys Ile Glu Leu (2) INFORMATION FOR SEQ ID NO:111:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1079 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 169...834 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 169...289 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:111:
CP,AAAAAAAA AAHAAACAAT TTCAGTTTCT TATTAGCTAG GTTTGATTAA AATGAAAAGC 60 Met Ala Glu Asn Ser Phe Lys Asn Val Ser Thr Gln Pro Lys Val Phe Phe Leu Leu Pro Ala Lys Thr Leu Phe Leu Leu Gly Gly Val Phe Ser Ala Phe Phe Ile Leu Ile Ala Gly Leu Val Phe Phe Asp Tyr Ala His Leu Met Asp Asn Ala Ile Phe Asn Phe Ala Arg Ser Thr Pro Phe Asn Ser Ser Pro Ile Leu Thr Leu Ile Leu Gln Asn Ile Ala Asn Leu Gly Ser Ser Gln Phe Val Leu Pro Leu Ser Leu Leu Val Gly Val Phe Leu Ser Leu Tyr Arg Arg Asn Leu Val Leu Gly Val Trp Phe Val Leu Ser Val Ile Leu TTT GAA GCC CTT TTA GAA TCT TTA AAA CAC CTT TTT GCA TAT TCC ATT 561 Phe Glu Ala Leu Leu Glu Ser Leu Lys His Leu Phe Ala Tyr Ser Ile Gln Trp Leu Ser Arg Ser Ala Asn Phe Pro Asn Ala Thr Ala Leu Ser Leu Val Leu Phe Tyr Gly Leu Leu Ile Leu Leu Ile Pro His Leu Ile Thr His Gln Thr Leu Lys Asn Val Leu Phe Tyr Ser Leu Phe Gly Leu Ile Phe Leu Ile Gly Leu Ala Leu Ile Val Leu Gly Val Ser Phe Ser Ser Val Leu Gly Gly Phe Cys Leu Gly Ala Leu Gly Ala Cys Phe Ser Ile Gly Ile Tyr Leu Ser Val Phe Gln Lys Ile ATTTTTTCAT CAAGCTCAt~T AAAAAGCAAA AAATCGCCCT GATTGCAGCT GGGGTTTTGA 974 TCACGGCTTT GCTTGTGTTT TTATTGCTCT ATCCCTTTi~A AGAAAAAGAC TACACGCAAG 1034
(2) INFORMATION FOR SEQ ID NO:112:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 222 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...40 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:112:
Met Ala Glu Asn Ser Phe Lys Asn Val Ser Thr Gln Pro Lys Val Phe -40 -35 -30 -25 Phe Leu Leu Pro Ala Lys Thr Leu Phe Leu Leu Gly Gly Val Phe Ser Ala Phe Phe Ile Leu Ile Ala Gly Leu Val Phe Phe Asp Tyr Ala His Leu Met Asp Asn Ala Ile Phe Asn Phe Ala Arg Ser Thr Pro Phe Asn Ser Ser Pro Ile Leu Thr Leu Ile Leu Gln Asn Ile Ala Asn Leu Gly Ser Ser Gln Phe Val Leu Pro Leu Ser Leu Leu Val Gly Val Phe Leu Ser Leu Tyr Arg Arg Asn Leu Val Leu Gly Val Trp Phe Val Leu Ser Val Ile Leu Phe Glu Ala Leu Leu Glu Ser Leu Lys His Leu Phe Ala Tyr Ser Ile Gln Trp Leu Ser Arg Ser Ala Asn Phe Pro Asn Ala Thr Ala Leu Ser Leu Val Leu Phe Tyr Gly Leu Leu Ile Leu Leu Ile Pro 105 110 115 120 His Leu Ile Thr His Gln Thr Leu Lys Asn Val Leu Phe Tyr Ser Leu Phe Gly Leu Ile Phe Leu Ile Gly Leu Ala Leu Ile Val Leu Gly Val Ser Phe Ser Ser Val Leu Gly Gly Phe Cys Leu Gly Ala Leu Gly Ala Cys Phe Ser Ile Gly Ile Tyr Leu Ser Val Phe Gln Lys Ile 170 175 180
(2) INFORMATION FOR SEQ ID NO:113:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 962 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 97...912 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 97...217 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:113:
AAAGTTTGCA ACCTGATGAG AGTAATAATA GAGTTT ATG CTG ATT TCA TTA AAA 114 Met Leu Ile Ser Leu Lys Thr Phe Leu Lys Ile Leu Leu Lys Ile Phe Leu Lys Thr Phe Gln Lys Ile Trp Val Val Cys Val Ile Ile Trp Gly Leu Gly Cys Ser Phe Leu Asn Ala Asn Ser Ile Gln Leu Glu Glu Thr Leu Arg Arg Ser Pro Lys Asn Leu Ile Trp Gln His Phe Lys Lys Lys Phe Lys Lys Ser Asn Thr Ile Pro Tyr Ala Pro Asn Ser Arg Trp Lys Tyr Leu Gly Thr Ser Ile Gly Ile Leu Gly Val Ser Leu Val Ile Gly Ile Val Gly Leu Tyr Leu Met Pro Glu Ser Val Thr Asn Trp Asp Lys Glu Lys Phe Gly Ile Lys AGT TGG TTT GAA AAT GTC CGC ATG GGG CCA AAA CTG GAC AAT GAT AGT 498 Ser Trp Phe Glu Asn Val Arg Met Gly Pro Lys Leu Asp Asn Asp Ser Phe Ile Phe Asn Glu Ile Leu His Pro Tyr Phe Gly Ala Met Tyr Tyr 95 100 105 110 ATG CAA CCG CGC ATG GCT GGA TTT AGC TGG ATG GCA TCA GCG TTT TTT 594 Met Gln Pro Arg Met Ala Gly Phe Ser Trp Met Ala Ser Ala Phe Phe TCT TTT ATC ACT TCC ACG CTT TTT TGG GAA TAT GGC TTG GAA GCG TTT 642 Ser Phe Ile Thr Ser Thr Leu Phe Trp Glu Tyr Gly Leu Glu Ala Phe GTG GAA GTG CCT AGC TGG CAG GAT TTA GTG ATC ACG CCT TTA TTA GGC 690 Val Glu Val Pro Ser Trp Gln Asp Leu Val Ile Thr Pro Leu Leu Gly 145 150 155 TCC ATT TTA GGG GAG GGG TTT TAT CAG CTC ACG CGC TAT ATC CAA CGC 738 Ser Ile Leu Gly Glu Gly Phe Tyr Gln Leu Thr Arg Tyr Ile Gln Arg Asn Glu Gly Lys Leu Phe Gly Ser Leu Phe Leu Gly Arg Leu Val Ile Ala Leu Met Asp Pro Ile Gly Phe Ile Ile Arg Asp Leu Gly Leu Gly Glu Ala Leu Gly Ile Tyr Asn Lys His Glu Ile Arg Ser Ser Leu Ser Pro Asn Gly Leu Asn Leu Thr Tyr Lys Phe
(2) INFORMATION FOR SEQ ID NO:114:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 272 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...40 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:114:
Met Leu Ile Ser Leu Lys Thr Phe Leu Lys Ile Leu Leu Lys Ile Phe Leu Lys Thr Phe Gln Lys Ile Trp Val Val Cys Val Ile Ile Trp Gly Leu Gly Cys Ser Phe Leu Asn Ala Asn Ser Ile Gln Leu Glu Glu Thr Leu Arg Arg Ser Pro Lys Asn Leu Ile Trp Gln His Phe Lys Lys Lys Phe Lys Lys Ser Asn Thr Ile Pro Tyr Ala Pro Asn Ser Arg Trp Lys Tyr Leu Gly Thr Ser Ile Gly Ile Leu Gly Val Ser Leu Val Ile Gly Ile Val Gly Leu Tyr Leu Met Pro Glu Ser Val Thr Asn Trp Asp Lys Glu Lys Phe Gly Ile Lys Ser Trp Phe Glu Asn Val Arg Met Gly Pro Lys Leu Asp Asn Asp Ser Phe Ile Phe Asn Glu Ile Leu His Pro Tyr Phe Gly Ala Met Tyr Tyr Met Gln Pro Arg Met Ala Gly Phe Ser Trp 105 110 115 120 Met Ala Ser Ala Phe Phe Ser Phe Ile Thr Ser Thr Leu Phe Trp Glu 125 130 135 Tyr Gly Leu Glu Ala Phe Val Glu Val Pro Ser Trp Gln Asp Leu Val Ile Thr Pro Leu Leu Gly Ser Ile Leu Gly Glu Gly Phe Tyr Gln Leu 155 160 165 Thr Arg Tyr Ile Gln Arg Asn Glu Gly Lys Leu Phe Gly Ser Leu Phe Leu Gly Arg Leu Val Ile Ala Leu Met Asp Pro Ile Gly Phe Ile Ile Arg Asp Leu Gly Leu Gly Glu Ala Leu Gly Ile Tyr Asn Lys His Glu 205 210 215 Ile Arg Ser Ser Leu Ser Pro Asn Gly Leu Asn Leu Thr Tyr Lys Phe
(2) INFORMATION FOR SEQ ID NO:115:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1422 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 216...1202 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 216...273 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:115:
GATTTTGAAA ATAACCCTAA TGAGCAATCA GCGCTCTTTG TCTTGCCCCT TTCAGCGGTT 120 TTT
Met Lys Arg Phe Val Leu TTT TTA TTG TTC ATG TGC GTT TGC GTT CAA GCT TAC GCC GAG CAA GAT 281 Phe Leu Leu Phe Met Cys Val Cys Val Gln Ala Tyr Ala Glu Gln Asp Tyr Phe Phe Arg Asp Phe Lys Ser Arg Asp Leu Pro Gln Lys Leu His Leu Asp Lys Lys Leu Ser Gln Thr Ile Gln Pro Cys Met Gln Leu Asn Ala Ser Lys His Tyr Thr Ser Thr Gly Val Arg Glu Pro Asp Lys Cys Thr Lys Ser Phe Lys Lys Ser Ala Leu Met Ser Tyr Asp Leu Ala Leu GGT TAT TTG GTG AGT AAG AAT AAG CAA TAC GG~~ TTA AAG GCT ATA GAA 521 Gly Tyr Leu Val Ser Lys Asn Lys Gln Tyr Gly Leu Lys Ala Ile Glu Ile Leu Asn Ala Trp Ala Lys Glu Leu Gln Ser Val Asp Thr Tyr Gln AGC GAG GAT AAT ATC AAT TTT TAC ATG CCT TA'P ATG AAC ATG GCT TAT 617 Ser Glu Asp Asn Ile Asn Phe Tyr Met Pro Tyr Met Asn Met Ala Tyr Trp Phe Val Lys Lys Ala Phe Pro Ser Pro Glu Tyr Glu Asp Phe Ile AAG CGG ATG CGC CAG TAT TCT CAA TCA GCT CT':C AAC ACT AAC CAT GGG 713 Lys Arg Met Arg Gln Tyr Ser Gln Ser Ala Leu Asn Thr Asn His Gly 135 140 145 Ala Trp Gly Ile Leu Phe Asp Val Ser Ser Ala Leu Ala Leu Asp Asp Asn Ala Leu Leu His Asn Ser Ala Asn Arg Trp Gln Glu Trp Val Phe Lys Ala Ile Asp Glu Asn Gly Val Ile Xaa Ser Ala Ile Thr Arg Ser Asp Thr Ser Asp Tyr His Gly Gly Pro Thr Lys Gly Ile Lys Gly Ile 200 205 210 Ala Tyr Thr Asn Phe Ala Leu Leu Ala Leu Thr Ile Ser Gly Glu Leu Leu Phe Glu Asn Gly Tyr Asp Leu Trp Gly Ser Gly Ala Gly Lys Arg Leu Ser Val Ala Tyr Asn Lys Val Ala Thr Trp Ile Leu Asn Pro Glu Thr Phe Pro Tyr Phe Gln Pro Asn Leu Ile Gly Val His Asn Asn Ala Tyr Phe Ile Ile Leu Ala Lys His Tyr Ser Ser Pro Ser Ala Asn Glu Leu Leu Lys Gln Gly Asp Leu His Glu Asp Gly Phe Arg Leu Lys Leu Arg Ser Pro
(2) INFORMATION FOR SEQ ID NO:116:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...19 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:116:
Met Lys Arg Phe Val Leu Phe Leu Leu Phe Met Cys Val Cys Val Gln Ala Tyr Ala Glu Gln Asp Tyr Phe Phe Arg Asp Phe Lys Ser Arg Asp Leu Pro Gln Lys Leu His Leu Asp Lys Lys Leu Ser Gln Thr Ile Gln Pro Cys Met Gln Leu Asn Ala Ser Lys His Tyr Thr Ser Thr Gly Val Arg Glu Pro Asp Lys Cys Thr Lys Ser Phe Lys Lys Ser Ala Leu Met Ser Tyr Asp Leu Ala Leu Gly Tyr Leu Val Ser Lys Asn Lys Gln Tyr Gly Leu Lys Ala Ile Glu Ile Leu Asn Ala Trp Ala Lys Glu Leu Gln Ser Val Asp Thr Tyr Gln Ser Glu Asp Asn Ile Asn Phe Tyr Met Pro Tyr Met Asn Met Ala Tyr Trp Phe Val Lys Lys Ala Phe Pro Ser Pro 110 115 120 125 Glu Tyr Glu Asp Phe Ile Lys Arg Met Arg Gln Tyr Ser Gln Ser Ala Leu Asn Thr Asn His Gly Ala Trp Gly Ile Leu Phe Asp Val Ser Ser Ala Leu Ala Leu Asp Asp Asn Ala Leu Leu His Asn Ser Ala Asn Arg Trp Gln Glu Trp Val Phe Lys Ala Ile Asp Glu Asn Gly Val Ile Xaa Ser Ala Ile Thr Arg Ser Asp Thr Ser Asp Tyr His Gly Gly Pro Thr Lys Gly Ile Lys Gly Ile Ala Tyr Thr Asn Phe Ala Leu Leu Ala Leu 210 215 220 Thr Ile Ser Gly Glu Leu Leu Phe Glu Asn Gly Tyr Asp Leu Trp Gly Ser Gly Ala Gly Lys Arg Leu Ser Val Ala Tyr Asn Lys Val Ala Thr 240 245 250 Trp Ile Leu Asn Pro Glu Thr Phe Pro Tyr Phe Gln Pro Asn Leu Ile 255 260 265 Gly Val His Asn Asn Ala Tyr Phe Ile Ile Leu Ala Lys His Tyr Ser Ser Pro Ser Ala Asn Glu Leu Leu Lys Gln Gly Asp Leu His Glu Asp Gly Phe Arg Leu Lys Leu Arg Ser Pro
(2) INFORMATION FOR SEQ ID NO:117:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1080 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 157...987 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 157...226 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:117:
Met Lys Thr Asn Gly Leu Phe Lys Met Trp Gly Leu Phe Leu Val Leu Ile Ala Leu Val Phe Asn Ala Cys Ser Asp Ser His Lys Glu Lys Lys Asp Ala Leu Glu Val Ile Lys Gln Arg Gly Val Leu Lys Val Gly Val Phe Ser Asp Lys Pro Pro Phe Gly Ser Val Asp Ser Lys Gly Lys Tyr Gln Gly Tyr Asp Val Val Ile Ala Lys Arg Met Ala Leu Asp Leu Leu Gly Asp Glu Asn Lys Ile GAG TTT ATT CCT GTA GAA GCT TCA GCT AGG GTG GAA TTT TTA AAA GCC 462 Glu Phe Ile Pro Val Glu Ala Ser Ala Arg Val Glu Phe Leu Lys Ala AAT AAA GTG GAT ATT ATC ATG GCT AAT TTC ACG CGC ACT AAA GAA AGA 510 Asn Lys Val Asp Ile Ile Met Ala Asn Phe Thr Arg Thr Lys Glu Arg Glu Lys Val Val Asp Phe Ala Lys Pro Tyr Met Lys Val Ala Leu Gly Val Val Ser Lys Asp Gly Val Ile Lys Asn Ile Glu Glu Leu Lys Asp Lys Glu Leu Ile Val Asn Lys Gly Thr Thr Ala Asp Phe Tyr Phe Thr 130 135 140 AAA AAT TAC CCC AAT ATC AAG CTT TTG AAA TTT GAG CAA AAT ACA GAG 702 Lys Asn Tyr Pro Asn Ile Lys Leu Leu Lys Phe Glu Gln Asn Thr Glu 145 150 155 Thr Phe Leu Ala Leu Leu Asn Asn Lys Ala Thr Ala Leu Ala His Asp 160 165 170 175 AAC ACT TTA TTG CTC GCT TGG ACG AAA CAA CAC CCT GAA TTT AAA TTA 798
Asn Thr Leu Leu Leu Ala Trp Thr Lys Gln His Pro Glu Phe Lys Leu GGC ATT ACA AGC CTT GGC GAT AAG GAT GTG ATC GCT CCA GCG ATT AAA 846 Gly Ile Thr Ser Leu Gly Asp Lys Asp Val Ile Ala Pro Ala Ile Lys Lys Gly Asn Pro Lys Leu Leu Glu Trp Leu Asn Asn Glu Ile Asp Ser 210 215 220 CTC ATT TCT AGC GAC TTC TTA AAA GAA GCT TAT CAA GAG ACT TTA GCA 942 Leu Ile Ser Ser Asp Phe Leu Lys Glu Ala Tyr Gln Glu Thr Leu Ala Pro Val Tyr Gly Asp Glu Ile Lys Pro Glu Glu Ile Ile Phe Glu TCTTTAGGCT TTGAATTCTT GACAGGGTGC GTTTTTAT'CG CTAAATTAGC AATTTTGTGA 1052 TCTTTTTGTT TTTCATTTTG AGATATAT 1080
(2) INFORMATION FOR SEQ ID NO:118:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...23 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:118:
Met Lys Thr Asn Gly Leu Phe Lys Met Trp Gly Leu Phe Leu Val Leu Ile Ala Leu Val Phe Asn Ala Cys Ser Asp Ser His Lys Glu Lys Lys Asp Ala Leu Glu Val Ile Lys Gln Arg Gly Val Leu Lys Val Gly Val Phe Ser Asp Lys Pro Pro Phe Gly Ser Val Asp Ser Lys Gly Lys Tyr Gln Gly Tyr Asp Val Val Ile Ala Lys Arg Met Ala Leu Asp Leu Leu Gly Asp Glu Asn Lys Ile Glu Phe Ile Pro Val Glu Ala Ser Ala Arg Val Glu Phe Leu Lys Ala Asn Lys Val Asp Ile Ile Met Ala Asn Phe Thr Arg Thr Lys Glu Arg Glu Lys Val Val Asp Phe Ala Lys Pro Tyr Met Lys Val Ala Leu Gly Val Val Ser Lys Asp Gly Val Ile Lys Asn 110 115 120 Ile Glu Glu Leu Lys Asp Lys Glu Leu Ile Val Asn Lys Gly Thr Thr Ala Asp Phe Tyr Phe Thr Lys Asn Tyr Pro Asn Ile Lys Leu Leu Lys 140 145 150 Phe Glu Gln Asn Thr Glu Thr Phe Leu Ala Leu Leu Asn Asn Lys Ala Thr Ala Leu Ala His Asp Asn Thr Leu Leu Leu Ala Trp Thr Lys Gln 170 175 180 185 His Pro Glu Phe Lys Leu Gly Ile Thr Ser Leu Gly Asp Lys Asp Val Ile Ala Pro Ala Ile Lys Lys Gly Asn Pro Lys Leu Leu Glu Trp Leu Asn Asn Glu Ile Asp Ser Leu Ile Ser Ser Asp Phe Leu Lys Glu Ala Tyr Gln Glu Thr Leu Ala Pro Val Tyr Gly Asp Glu Ile Lys Pro Glu Glu Ile Ile Phe Glu
(2) INFORMATION FOR SEQ ID NO:119:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1114 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 37...1050 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:119:
Met Phe Phe Lys Thr Tyr Gln Lys Leu Leu Gly Ala Ser Cys Leu Thr Leu Tyr Leu Ala Gly Cys GGG AGT GAT AGT AGC GAG CCA TTG GTG GGA ATT GAA AAA AAT AGC TTC 150 Gly Ser Asp Ser Ser Glu Pro Leu Val Gly Ile Glu Lys Asn Ser Phe AAT TCT ACC GTG AAA ATC ATT TCT AAA ACC GAC AAC ATA GAA ATC CAA 198 Asn Ser Thr Val Lys Ile Ile Ser Lys Thr Asp Asn Ile Glu Ile Gln Asp Leu Lys Leu Asn Arg Gly Asn Cys Glu His Asp Gln Asn Phe Leu 55 60 65 70 GTA AAG TTA ATC CAA GAA ACA GCC AAT ACA TAC CTG TTT GCA TCA GAA 294 Val Lys Leu Ile Gln Glu Thr Ala Asn Thr Tyr Leu Phe Ala Ser Glu AAA GAA AAA GCG ATC AAA AAC CAC CAA GCA AAA ATC GCA AGA CTT CAA 342 Lys Glu Lys Ala Ile Lys Asn His Gln Ala Lys Ile Ala Arg Leu Gln Lys Asp Leu Glu Glu Leu Thr Gln His Val Gln Gln Ser Asn Asn Leu GAT AAA TTG TTA GAA AAT GGA GGA CTA TTC GTT AGT GGC CAT GAT TAT 438 Asp Lys Leu Leu Glu Asn Gly Gly Leu Phe Val Ser Gly His Asp Tyr AAA TAT ACA AAA GAT GAT AAC CCA ATA TAT GTT GTT AAG AGG ATG CTT 486 Lys Tyr Thr Lys Asp Asp Asn Pro Ile Tyr Val Val Lys Arg Met Leu 135 140 145 150 Asp Asn Leu Asp Ser Tyr Lys Tyr Glu Ser Asp Asp Val Leu Asp Val 155 160 165 Pro Tyr Glu Lys Leu Leu Glu Ile Ser Ile Ala Ile Glu Asp Thr Lys Asn Pro Lys Asp Tyr Pro Tyr Ile Asn Leu Lys Glu Leu Lys Lys Leu Ile Asp Ser Ile Ile Asp Asp His Gly Tyr Met Ala Asp Gly Phe Leu Asn Glu Tyr Ser Asn Arg Val Ser Lys Lys Gly Leu Gln Ile Leu Ala AAA CTA AAA TCC ATG TGG CCT AGC GTA GGG AAA TTT TAT TTC GCC TCT 774 Lys Leu Lys Ser Met Trp Pro Ser Val Gly Lys Phe Tyr Phe Ala Ser Leu Lys Glu Ala Ile Pro Arg His Ala Lys Glu Val Thr Asp Lys Met Ile Ser Ser Glu Glu Lys Ser Ile Lys Ala Asn Gln Val Lys Leu Thr Glu Ala Lys Gln Asp Ile Asp Lys Met Glu Lys Ile Ile Lys Asp Leu Glu Ser Lys Lys Asn Thr Leu Ser Val Tyr Leu Lys Phe Gly Glu Ser Phe Thr Ala His Tyr Lys Cys Gln Asn Leu Ile Glu Val Gly Val Lys ACC GAT AAA GGC TCC TGG ACT TTC AAC TTT AAC AGA TAAATCAGGC AAATAT 1066 Thr Asp Lys Gly Ser Trp Thr Phe Asn Phe Asn Arg
(2) INFORMATION FOR SEQ ID NO:120:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 338 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:120:
Met Phe Phe Lys Thr Tyr Gln Lys Leu Leu Gly Ala Ser Cys Leu Thr Leu Tyr Leu Ala Gly Cys Gly Ser Asp Ser Ser Glu Pro Leu Val Gly Ile Glu Lys Asn Ser Phe Asn Ser Thr Val Lys Ile Ile Ser Lys Thr Asp Asn Ile Glu Ile Gln Asp Leu Lys Leu Asn Arg Gly Asn Cys Glu His Asp Gln Asn Phe Leu Val Lys Leu Ile Gln Glu Thr Ala Asn Thr Tyr Leu Phe Ala Ser Glu Lys Glu Lys Ala Ile Lys Asn His Gln Ala Lys Ile Ala Arg Leu Gln Lys Asp Leu Glu Glu Leu Thr Gln His Val Gln Gln Ser Asn Asn Leu Asp Lys Leu Leu Glu Asn Gly Gly Leu Phe Val Ser Gly His Asp Tyr Lys Tyr Thr Lys Asp Asp Asn Pro Ile Tyr 130 135 140 Val Val Lys Arg Met Leu Asp Asn Leu Asp Ser Tyr Lys Tyr Glu Ser Asp Asp Val Leu Asp Val Pro Tyr Glu Lys Leu Leu Glu Ile Ser Ile Ala Ile Glu Asp Thr Lys Asn Pro Lys Asp Tyr Pro Tyr Ile Asn Leu Lys Glu Leu Lys Lys Leu Ile Asp Ser Ile Ile Asp Asp His Gly Tyr Met Ala Asp Gly Phe Leu Asn Glu Tyr Ser Asn Arg Val Ser Lys Lys Gly Leu Gln Ile Leu Ala Lys Leu Lys Ser Met Trp Pro Ser Val Gly 225 230 235 240 Lys Phe Tyr Phe Ala Ser Leu Lys Glu Ala Ile Pro Arg His Ala Lys Glu Val Thr Asp Lys Met Ile Ser Ser Glu Glu Lys Ser Ile Lys Ala Asn Gln Val Lys Leu Thr Glu Ala Lys Gln Asp Ile Asp Lys Met Glu Lys Ile Ile Lys Asp Leu Glu Ser Lys Lys Asn Thr Leu Ser Val Tyr Leu Lys Phe Gly Glu Ser Phe Thr Ala His Tyr Lys Cys Gln Asn Leu 305 310 315 320 Ile Glu Val Gly Val Lys Thr Asp Lys Gly Ser Trp Thr Phe Asn Phe Asn Arg
(2) INFORMATION FOR SEQ ID NO:121:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1101 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 40...1026 (D) OTHER INFORMATION:
(A) NAME/KEY: sig peptide (B) LOCATION: 40...99 (D) OTHER INFORMATION:
(A) NAME/KEY: mat peptide (B) LOCATION: 100...1026 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:121:
Met Gln Gln Arg His Leu Gly Pro Leu Lys Val Gly Ala Leu Ala Leu Gly Cys Met Gly Met Thr Tyr Gly Tyr Gly Glu Val His Asp Lys Lys Gln Met Val Lys Leu Ile His Lys Ala Leu Glu Leu Gly Ile Asn Phe Phe Asp Thr Ala Glu Ala Tyr Gly Glu Asp Asn Glu Lys Leu Leu Ala Lys Arg Ser Ser Leu Ile Lys Asp Lys Val Val Val Ala Ser Lys Phe Gly Ile Tyr Tyr Ala Asp Pro Asn Asp Lys Tyr Ala Thr Met Phe Leu Asp Ser Ser Ser Asn Arg Ile Lys Ser Ala Ile Glu Gly Ser Leu Lys Arg Leu Lys Val Glu TGC ATT GAT TTA TAC TAC CAA CAC CGC ATG GAT ACT AAC ACG CCC ATA 438 Cys Ile Asp Leu Tyr Tyr Gln His Arg Met Asp Thr Asn Thr Pro Ile GAA GAA GTG GCA GAA GTT ATG CAA GCT CTT ATT AAA GAA GGA AAA ATT 486 Glu Glu Val Ala Glu Val Met Gln Ala Leu Ile Lys Glu Gly Lys Ile Lys Ala Trp Gly Met Ser Glu Ala Gly Leu Ser Ser Ile Gln Lys Ala 130 135 140 145 His Gln Ile Cys Pro Leu Ser Ala Leu Gln Ser Glu Tyr Ser Leu Trp TGG CGC GAA CCT GAA AAA GAG ATT TTA GGT TTT TTA GAA AAA GAA AAA 630 Trp Arg Glu Pro Glu Lys Glu Ile Leu Gly Phe Leu Glu Lys Glu Lys Ile Gly Phe Val Ala Phe Ser Pro Leu Gly Lys Gly Phe Leu Gly Ala 180 185 190 Lys Phe Glu Lys Asn Ala Thr Phe Ala Ser Glu Asp Phe Arg Ser Val TCT CCT AGG TTT AAT CAA GAA AAT CTA GCC AAA AAT TAC GTC TTG GTG 774 Ser Pro Arg Phe Asn Gln Glu Asn Leu Ala Lys Asn Tyr Val Leu Val Glu Leu Ile Gln Asp His Ala His Ala Lys Gly Val Thr Pro Ala Gln Leu Ala Leu Ser Trp Ile Leu His Thr Gln Lys Ile Ile Val Pro Leu TTT GGC ACC ACC AAA GAA TCC AGG CTC ATA GAA AAT ATA GGG GCT TTG 918 Phe Gly Thr Thr Lys Glu Ser Arg Leu Ile Glu Asn Ile Gly Ala Leu CAG GTT TCT TGG AGT CAA AAA GAA TTG GAG ATT TTT CAA AAA GAA TTG 966 Gln Val Ser Trp Ser Gln Lys Glu Leu Glu Ile Phe Gln Lys Glu Leu ACT GCA ATC AAA ATA GAA GGG GCC CGC TAC CCT GAA AGA ATC AAT GAA 1014 Thr Ala Ile Lys Ile Glu Gly Ala Arg Tyr Pro Glu Arg Ile Asn Glu ATG GTG AAT CAA TAAAAGTATT GGGTATTTAT AATTGCATTG GCTCTTTTAA AAGAG 1071 Met Val Asn Gln
(2) INFORMATION FOR SEQ ID NO:122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:122:
Met Gln Gln Arg His Leu Gly Pro Leu Lys Val Gly Ala Leu Ala Leu Gly Cys Met Gly Met Thr Tyr Gly Tyr Gly Glu Val His Asp Lys Lys Gln Met Val Lys Leu Ile His Lys Ala Leu Glu Leu Gly Ile Asn Phe Phe Asp Thr Ala Glu Ala Tyr Gly Glu Asp Asn Glu Lys Leu Leu Ala Lys Arg Ser Ser Leu Ile Lys Asp Lys Val Val Val Ala Ser Lys Phe Gly Ile Tyr Tyr Ala Asp Pro Asn Asp Lys Tyr Ala Thr Met Phe Leu Asp Ser Ser Ser Asn Arg Ile Lys Ser Ala Ile Glu Gly Ser Leu Lys Arg Leu Lys Val Glu Cys Ile Asp Leu Tyr Tyr Gln His Arg Met Asp 95 100 105 Thr Asn Thr Pro Ile Glu Glu Val Ala Glu Val Met Gln Ala Leu Ile Lys Glu Gly Lys Ile Lys Ala Trp Gly Met Ser Glu Ala Gly Leu Ser Ser Ile Gln Lys Ala His Gln Ile Cys Pro Leu Ser Ala Leu Gln Ser Glu Tyr Ser Leu Trp Trp Arg Glu Pro Glu Lys Glu Ile Leu Gly Phe Leu Glu Lys Glu Lys Ile Gly Phe Val Ala Phe Ser Pro Leu Gly Lys Gly Phe Leu Gly Ala Lys Phe Glu Lys Asn Ala Thr Phe Ala Ser Glu Asp Phe Arg Ser Val Ser Pro Arg Phe Asn Gln Glu Asn Leu Ala Lys Asn Tyr Val Leu Val Glu Leu Ile Gln Asp His Ala His Ala Lys Gly Val Thr Pro Ala Gln Leu Ala Leu Ser Trp Ile Leu His Thr Gln Lys Ile Ile Val Pro Leu Phe Gly Thr Thr Lys Glu Ser Arg Leu Ile Glu Asn Ile Gly Ala Leu Gln Val Ser Trp Ser Gln Lys Glu Leu Glu Ile Phe Gln Lys Glu Leu Thr Ala Ile Lys Ile Glu Gly Ala Arg Tyr Pro Glu Arg Ile Asn Glu Met Val Asn Gln
(2) INFORMATION FOR SEQ ID NO:123:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 955 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 126...806 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 126...237 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:123:
Met Val Phe Asp Arg Thr Ile Ser Val Arg Glu Lys Lys Ala Ala Lys Thr Leu Gly Ile Ile Gly Ile Val Phe Phe Ile Leu Phe Gly Ile Val Ile Ser Gly Val Ala Phe Gln Lys Glu Trp Val Gln Gln Leu Asp TTA TTT TTT ATA GAC TTG ATC CGC AAC CCT GCC CCC ATT CAA AAA AGC 314 Leu Phe Phe Ile Asp Leu Ile Arg Asn Pro Ala Pro Ile Gln Lys Ser Ala Trp Leu Ser Phe Val Phe Phe Ser Thr Trp Phe Ala Gln Ser Lys Leu Thr Thr Pro Ile Ala Leu Leu Ile Gly Leu Trp Phe Gly Phe Gln Lys Arg Ile Ala Leu Gly Val Trp Phe Phe Phe Ser Ile Leu Leu Gly Glu Phe Thr Leu Lys Ser Leu Lys Leu Leu Val Ala Arg Pro Arg Pro Val Thr Asn Gly Glu Leu Val Phe Ala His Gly Phe Ser Phe Pro Ser Gly His Ala Leu Ala Ser Ala Leu Phe Tyr Gly Ser Leu Ala Leu Leu 110 115 120 Leu Cys Tyr Ser Asn Ala Asn Asn Arg Ile Lys Thr Ile Ile Ala Val Val Leu Leu Phe Trp Ile Phe Leu Met Ala Tyr Asp Arg Val Tyr Leu Gly Val His Tyr Pro Ser Asp Val Leu Gly Gly Phe Leu Leu Gly Ile 155 160 165 170 Ala Trp Ser Cys Cys Ser Leu Ala Leu Tyr Leu Gly Phe Leu Lys Arg 175 180 185 Pro Tyr Asn Gln
(2) INFORMATION FOR SEQ ID NO:124:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 227 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...37 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:124:
Met Val Phe Asp Arg Thr Ile Ser Val Arg Glu Lys Lys Ala Ala Lys Thr Leu Gly Ile Ile Gly Ile Val Phe Phe Ile Leu Phe Gly Ile Val Ile Ser Gly Val Ala Phe Gln Lys Glu Trp Val Gln Gln Leu Asp Leu Phe Phe Ile Asp Leu Ile Arg Asn Pro Ala Pro Ile Gln Lys Ser Ala Trp Leu Ser Phe Val Phe Phe Ser Thr Trp Phe Ala Gln Ser Lys Leu Thr Thr Pro Ile Ala Leu Leu Ile Gly Leu Trp Phe Gly Phe Gln Lys Arg Ile Ala Leu Gly Val Trp Phe Phe Phe Ser Ile Leu Leu Gly Glu Phe Thr Leu Lys Ser Leu Lys Leu Leu Val Ala Arg Pro Arg Pro Val Thr Asn Gly Glu Leu Val Phe Ala His Gly Phe Ser Phe Pro Ser Gly His Ala Leu Ala Ser Ala Leu Phe Tyr Gly Ser Leu Ala Leu Leu Leu Cys Tyr Ser Asn Ala Asn Asn Arg Ile Lys Thr Ile Ile Ala Val Val 125 130 135 Leu Leu Phe Trp Ile Phe Leu Met Ala Tyr Asp Arg Val Tyr Leu Gly Val His Tyr Pro Ser Asp Val Leu Gly Gly Phe Leu Leu Gly Ile Ala Trp Ser Cys Cys Ser Leu Ala Leu Tyr Leu Gly Phe Leu Lys Arg Pro Tyr Asn Gln
(2) INFORMATION FOR SEQ ID NO:125:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1183 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 91...1032 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 91...148 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:125:
TAGTTACAAC TATTTATTGT AAAGGCTAAA ATG TTG AAA TTT AAA TAT GGT TTG 114 Met Leu Lys Phe Lys Tyr Gly Leu Ile Tyr Ile Ala Leu Ile Leu Gly Leu Gln Ala Thr Asp Tyr Asp Asn Leu Glu Glu Glu Asn Gln Gln Leu Asp Glu Lys Ile Asn His Leu Lys Gln Gln Leu Thr Glu Lys Gly Val Ser Pro Lys Glu Met Asp Lys Asp Lys Phe Glu Glu Glu Tyr Ile Asn Arg Ser Tyr Pro Lys Ile Ser Ser Lys Lys Lys Glu Lys Leu Leu Lys Ser Phe Ser Ile Ala Asp Asp Lys Ser Gly Val Phe Leu Gly Gly Gly Tyr Ala Tyr Gly Glu Leu Asn Leu Ser Tyr Gln Gly Glu Met Leu Asp Arg Tyr Gly Ala Asn Ala Pro Ser Ala Phe Lys Asn Asn Ile Asn Ile Asn Ala Pro Val Ser Met Ile Ser Ala Lys Phe Gly Tyr Gln Lys Tyr Phe Val Ser Tyr Phe Gly Thr Arg Phe Tyr Gly Asp Leu Leu Leu Gly Gly Gly Ala Leu Lys Glu Asp Ala 135 140 145 Ile Lys Gln Pro Val Gly Ser Phe Ile Tyr Val Leu Gly Ala Val Asn 150 155 160 165
ACC GAT TTA TTG TTT GAT ATG CCT TTA GAT TTT AAA ACT AAA AAG CAT 690
Thr Asp Leu Leu Phe Asp Met Pro Leu Asp Phe Lys Thr Lys Lys His
170 175 180
... TTT GGG ...
Phe Leu Gly Val Tyr Ala Gly Phe Gly Ile Gly Leu Met Leu Tyr Gln
... AGG AAT ...
Asp Arg Pro Asn Gln Asn Gly Arg Asn Leu Val Val Gly Gly Tyr Ser
... TCT TTG ...
Ser Pro Asn Phe Leu Trp Lys Ser Leu Ile Glu Val Asp Tyr Thr Phe
AAT GTG GGC GTG AGT TTA ACG CTT TAT AGG AAA CAC CGT TTA GAG ATT 882
Asn Val Gly Val Ser Leu Thr Leu Tyr Arg Lys His Arg Leu Glu Ile
GGC ACA AAA TTG CCG ATT AGC TAT TTG AGA ATG GGA GTG GAA GAG GGA 930
Gly Thr Lys Leu Pro Ile Ser Tyr Leu Arg Met Gly Val Glu Glu Gly
GCG ATT TAT CAA AAT AAA GAA GAT GAT GAG CGT TTG TTG GTT TCG GCT 978
Ala Ile Tyr Gln Asn Lys Glu Asp Asp Glu Arg Leu Leu Val Ser Ala
AAC AAC CAG TTC AAG CGA TCC AGT TTT TTA TTA GTG AAT TAT GCG TTT 1026
Asn Asn Gln Phe Lys Arg Ser Ser Phe Leu Leu Val Asn Tyr Ala Phe
TCTTGGAGTT AA
AAGGTTTAAA
ATTTTAGCGT
Ile Phe TTATTTGATT TTAAGTTTTA
TTTAACGCTT
TAATCACAAA
CAAAGAGGGT
GCGCTTAATG
ACAATGAT'G
(2) INFORMATION FOR SEQ ID NO:126:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 314 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...19 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:126:
Met Leu Lys Phe Lys Tyr Gly Leu Ile Tyr Ile Ala Leu Ile Leu Gly Leu Gln Ala Thr Asp Tyr Asp Asn Leu Glu Glu Glu Asn Gln Gln Leu Asp Glu Lys Ile Asn His Leu Lys Gln Gln Leu Thr Glu Lys Gly Val Ser Pro Lys Glu Met Asp Lys Asp Lys Phe Glu Glu Glu Tyr Ile Asn Arg Ser Tyr Pro Lys Ile Ser Ser Lys Lys Lys Glu Lys Leu Leu Lys Ser Phe Ser Ile Ala Asp Asp Lys Ser Gly Val Phe Leu Gly Gly Gly Tyr Ala Tyr Gly Glu Leu Asn Leu Ser Tyr Gln Gly Glu Met Leu Asp Arg Tyr Gly Ala Asn Ala Pro Ser Ala Phe Lys Asn Asn Ile Asn Ile Asn Ala Pro Val Ser Met Ile Ser Ala Lys Phe Gly Tyr Gln Lys Tyr Phe Val Ser Tyr Phe Gly Thr Arg Phe Tyr Gly Asp Leu Leu Leu Gly Gly Gly Ala Leu Lys Glu Asp Ala Ile Lys Gln Pro Val Gly Ser Phe Ile Tyr Val Leu Gly Ala Val Asn Thr Asp Leu Leu Phe Asp Met Pro 160 165 170 Leu Asp Phe Lys Thr Lys Lys His Phe Leu Gly Val Tyr Ala Gly Phe Gly Ile Gly Leu Met Leu Tyr Gln Asp Arg Pro Asn Gln Asn Gly Arg Asn Leu Val Val Gly Gly Tyr Ser Ser Pro Asn Phe Leu Trp Lys Ser Leu Ile Glu Val Asp Tyr Thr Phe Asn Val Gly Val Ser Leu Thr Leu Tyr Arg Lys His Arg Leu Glu Ile Gly Thr Lys Leu Pro Ile Ser Tyr Leu Arg Met Gly Val Glu Glu Gly Ala Ile Tyr Gln Asn Lys Glu Asp Asp Glu Arg Leu Leu Val Ser Ala Asn Asn Gln Phe Lys Arg Ser Ser Phe Leu Leu Val Asn Tyr Ala Phe Ile Phe
(2) INFORMATION FOR SEQ ID NO:127:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1851 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 238...1665 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 238...313 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:127:
GAGCTAGTTT TAAAAAGTTA GTTTTGTTTT AAAAAGTTi~A TACTATTTTG AAGCACTCCT 60 ATTCAGATGG CTAAGGCACA CAAGAAATTA GGGGACTC'L'G CTGTATTCCT ACCCTGAAGC 120 GTTACCCTAA AATCCTATTG CATAGGTCTA AATAAGAG(~T TAGGGATCAT TTTAGCCATA 180 Met TCA ATT AAA AGG GTT AGA TTG AAA ATA TTC GTT CTG TTG ATG TCG GTA 288
Ser Ile Lys Arg Val Arg Leu Lys Ile Phe Val Leu Leu Met Ser Val Ile Leu Gly Ile Ser Leu Thr Gly Cys Ile Gly Tyr Arg Met Asp Leu GAA CAT TTT AAC ACG CTC TAT TAT GAA GAA AGC CCT AAA AAA GCT TAT 384 Glu His Phe Asn Thr Leu Tyr Tyr Glu Glu Ser Pro Lys Lys Ala Tyr Glu Tyr Ser Lys Gln Phe Thr Lys Lys Lys Lys Asn Ala Leu Leu Trp GAC TTG CAA AAC GGC TTG AGC GCT TTA TAC GCC AGA GAT TAC CAG ACT 480 Asp Leu Gln Asn Gly Leu Ser Ala Leu Tyr Ala Arg Asp Tyr Gln Thr Ser Leu Gly Val Leu Asp Gln Ala Glu Gln Arg Phe Asp Lys Thr Gln Ser Ala Phe Thr Arg Gly Ala Gly Tyr Val Gly Ala Thr Met Ile Asn GAT AAT GTG CGC GCT TAT GGG GGG AAT ATT TAT GAG GGC GTT TTA ATC 624 Asp Asn Val Arg Ala Tyr Gly Gly Asn Ile Tyr Glu Gly Val Leu Ile 90 95 100 Asn Tyr Tyr Lys Ala Ile Asp Tyr Met Leu Leu Asn Asp Ser Ala Lys 105 110 115 120 Ala Arg Val Gln Phe Asn Arg Ala Asn Glu Arg Gln Arg Arg Ala Lys 125 130 135 Glu Phe Tyr Tyr Glu Glu Val Gln Lys Ala Ile Lys Glu Ile Asp Ser Ser Lys Lys His Asn Ile Asn Met Glu Arg Ser Arg Val Glu Val Ser 155 160 165 Glu Ile Leu Asn Asn Thr Tyr Ser Asn Leu Asp Lys Tyr Glu Ala Tyr 170 175 180 Gln Gly Leu Leu Asn Pro Ala Val Ser Tyr Leu Ser Gly Leu Phe Tyr 185 190 195 200 Ala Leu Asn Gly Asp Glu Asn Lys Gly Leu Gly Tyr Leu Asn Glu Ala Tyr Gly Ile Ser Gln Ser Pro Phe Val Ala Gln Asp Leu Val Phe Phe Lys Asn Pro Asn Arg Ser His Phe Thr Trp Ile Ile Ile Glu Asp Gly Lys Glu Pro Gln Lys Ser Glu Phe Lys Ile Asp Val Pro Ile Phe Met Ile Asp Ser Val Tyr Asn Val Ser Ile Ala Leu Pro Lys Leu Glu Lys Gly Glu Ala Phe Tyr Gln Asn Phe Thr Leu Lys Asp Gly Glu Lys Val Thr Pro Phe Asp Thr Leu Ala Ser Ile Asp Ala Val Val Ala Ser Glu Phe Arg Lys Gln Leu Pro Tyr Ile Ile Thr Arg Ala Ile Leu Ser Ala ACT TTT AAA GTG GGC ATG CAA GCG GTG GCG AAC TAT TAT TTG GGG TTT 1344 Thr Phe Lys Val Gly Met Gln Ala Val Ala Asn Tyr Tyr Leu Gly Phe Val Gly Gly Leu Val Thr Ser Leu Tyr Ser Gly Val Ser Thr Phe Ala Asp Thr Arg Ser Thr Ser Ile Phe Ala His Lys Ile Tyr Leu Met Arg Ile Lys Asn Lys Ala Phe Glu Ser
Tyr Glu Val Arg Ala Asp Ser Ile Asp Ala Phe Ser Phe Ser Leu Lys Pro Cys Lys Arg Ser Leu Glu Ser Pro Lys Ile Ile Asp Ala Arg Glu Leu Leu Ser Gly Phe Val Ala Ala Pro Gln Ile Phe Cys Ser Asn Arg His Asn Ile Leu Tyr Val Arg Ser TTT AAA AAC GGG TTT GTT TTG AGT CGT TTA AAA TGATTTCAAA ACCCCCACCA 1685 Phe Lys Asn Gly Phe Val Leu Ser Arg Leu Lys CGATACGAAA ACCTAAATTA AGGGGAAGTC ATGGCTGA'rA GTTTAGCGGG CATTGATCAA 1805
(2) INFORMATION FOR SEQ ID NO:128:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 476 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...25 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:128:
Met Ser Ile Lys Arg Val Arg Leu Lys Ile Phe Val Leu Leu Met Ser Val Ile Leu Gly Ile Ser Leu Thr Gly Cys Ile Gly Tyr Arg Met Asp Leu Glu His Phe Asn Thr Leu Tyr Tyr Glu Glu Ser Pro Lys Lys Ala Tyr Glu Tyr Ser Lys Gln Phe Thr Lys Lys Lys Lys Asn Ala Leu Leu Trp Asp Leu Gln Asn Gly Leu Ser Ala Leu Tyr Ala Arg Asp Tyr Gln Thr Ser Leu Gly Val Leu Asp Gln Ala Glu Gln Arg Phe Asp Lys Thr Gln Ser Ala Phe Thr Arg Gly Ala Gly Tyr Val Gly Ala Thr Met Ile Asn Asp Asn Val Arg Ala Tyr Gly Gly Asn Ile Tyr Glu Gly Val Leu 90 95 100 Ile Asn Tyr Tyr Lys Ala Ile Asp Tyr Met Leu Leu Asn Asp Ser Ala 105 110 115 Lys Ala Arg Val Gln Phe Asn Arg Ala Asn Glu Arg Gln Arg Arg Ala Lys Glu Phe Tyr Tyr Glu Glu Val Gln Lys Ala Ile Lys Glu Ile Asp 140 145 150 Ser Ser Lys Lys His Asn Ile Asn Met Glu Arg Ser Arg Val Glu Val Ser Glu Ile Leu Asn Asn Thr Tyr Ser Asn Leu Asp Lys Tyr Glu Ala 170 175 180 Tyr Gln Gly Leu Leu Asn Pro Ala Val Ser Tyr Leu Ser Gly Leu Phe Tyr Ala Leu Asn Gly Asp Glu Asn Lys Gly Leu Gly Tyr Leu Asn Glu Ala Tyr Gly Ile Ser Gln Ser Pro Phe Val Ala Gln Asp Leu Val Phe Phe Lys Asn Pro Asn Arg Ser His Phe Thr Trp Ile Ile Ile Glu Asp Gly Lys Glu Pro Gln Lys Ser Glu Phe Lys Ile Asp Val Pro Ile Phe Met Ile Asp Ser Val Tyr Asn Val Ser Ile Ala Leu Pro Lys Leu Glu Lys Gly Glu Ala Phe Tyr Gln Asn Phe Thr Leu Lys Asp Gly Glu Lys Val Thr Pro Phe Asp Thr Leu Ala Ser Ile Asp Ala Val Val Ala Ser Glu Phe Arg Lys Gln Leu Pro Tyr Ile Ile Thr Arg Ala Ile Leu Ser Ala Thr Phe Lys Val Gly Met Gln Ala Val Ala Asn Tyr Tyr Leu Gly Phe Val Gly Gly Leu Val Thr Ser Leu Tyr Ser Gly Val Ser Thr Phe Ala Asp Thr Arg Ser Thr Ser Ile Phe Ala His Lys Ile Tyr Leu Met Arg Ile Lys Asn Lys Ala Phe Glu Ser Tyr Glu Val Arg Ala Asp Ser Ile Asp Ala Phe Ser Phe Ser Leu Lys Pro Cys Lys Arg Ser Leu Glu Ser Pro Lys Ile Ile Asp Ala Arg Glu Leu Leu Ser Gly Phe Val Ala Ala Pro Gln Ile Phe Cys Ser Asn Arg His Asn Ile Leu Tyr Val Arg Ser Phe Lys Asn Gly Phe Val Leu Ser Arg Leu Lys
(2) INFORMATION FOR SEQ ID NO:129:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 435 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1...432 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:129:
ATG TTA GAA AAA TTG ATT GAA AGA GTG TTG TTT GCC ACT CGT TGG TTG 48
Met Leu Glu Lys Leu Ile Glu Arg Val Leu Phe Ala Thr Arg Trp Leu
CTA GCC CCT TTA TGT ATT GCC ATG TCG TTA GTG CTG GTG GTT TTA GGC 96
Leu Ala Pro Leu Cys Ile Ala Met Ser Leu Val Leu Val Val Leu Gly
TAT GTG TTC ATG AAA GAG TTG TGG CAC ATG CTC AGC CAT TTA AAC ACG 144
Tyr Val Phe Met Lys Glu Leu Trp His Met Leu Ser His Leu Asn Thr
ATC AGC GAA ACG GAT TTG GTT TTA TCA GCC TTA GGA TTA GTG GAT TTG 192
Ile Ser Glu Thr Asp Leu Val Leu Ser Ala Leu Gly Leu Val Asp Leu
TTG TTT ATG GCC GGG CTT GTT TTA ATG GTG TTA CTC GCC AGT TAT GAA 240
Leu Phe Met Ala Gly Leu Val Leu Met Val Leu Leu Ala Ser Tyr Glu
65 70 75 80
Ser Phe Val Ser Lys Leu Asp Lys Val Asp Ala Ser Glu Ile Thr Trp
CTA AAG CAC ACG GAT TTT AAC GCT TTA AAA TTA AAG GTT TCA CTC TCC 336
Leu Lys His Thr Asp Phe Asn Ala Leu Lys Leu Lys Val Ser Leu Ser
Ile Val Ala Ile Ser Ala Ile Phe Leu Leu Lys Arg Tyr Met Ser Leu
Glu Arg Cys Phe Ile Pro Ala Phe Pro Lys Asp Thr Pro Pro Ile Ala
(2) INFORMATION FOR SEQ ID NO:130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 144 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:130:
Met Leu Glu Lys Leu Ile Glu Arg Val Leu Phe Ala Thr Arg Trp Leu
Leu Ala Pro Leu Cys Ile Ala Met Ser Leu Val Leu Val Val Leu Gly
Tyr Val Phe Met Lys Glu Leu Trp His Met Leu Ser His Leu Asn Thr
Ile Ser Glu Thr Asp Leu Val Leu Ser Ala Leu Gly Leu Val Asp Leu
Leu Phe Met Ala Gly Leu Val Leu Met Val Leu Leu Ala Ser Tyr Glu
Ser Phe Val Ser Lys Leu Asp Lys Val Asp Ala Ser Glu Ile Thr Trp
Leu Lys His Thr Asp Phe Asn Ala Leu Lys Leu Lys Val Ser Leu Ser
100 105 110
Ile Val Ala Ile Ser Ala Ile Phe Leu Leu Lys Arg Tyr Met Ser Leu
Glu Arg Cys Phe Ile Pro Ala Phe Pro Lys Asp Thr Pro Pro Ile Ala
130 135 140
(2) INFORMATION FOR SEQ ID NO:131:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2234 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 213...2081 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 213...273 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:131:
GCTCACAATTGAGCTAAAGCCCGCTTTTTA GGGATAAATAAAAAGCGTTTTCAAATTGCA 120
TGGGTAACTTTATGGGGCGAAGCGTTTCTA AATTTTGGTATAATCGCTAGAAATTGTGAG 180
CGT TTG
Met Arg Leu Leu Trp Trp Leu GTA TTG GTA TTA TCG CTC TTT TTA AAT CCT T'TG AGA GCG GTT GAA GAG 281 Val Leu Val Leu Ser Leu Phe Leu Asn Pro L.~u Arg Ala Val Glu Glu CAT GAA ACA GAT GCG GTG GAT TTG TTT TTG A'TT TTC AAT CAA ATC AAC 329 His Glu Thr Asp Ala Val Asp Leu Phe Leu I.le Phe Asn Gln Ile Asn CAG CTC AAT CAA GTC ATT GAA ACT TAC AAA Ai~A AAC CCT GAA AGA AGC 377 Gln Leu Asn Gln Val 11e Glu Thr Tyr Lys Lys Asn Pro Glu Arg Ser Ala Glu Ile Ser Leu Tyr Asn Thr Gln Lys Asn Asp Leu Ile Lys Ser Leu Thr Ser Lys Val Leu Asn Glu Arg Asp LVS Ile Gly Ile Asp I1e 55 60 ' 65 Asn Gln Asn Leu Lys Glu Gln Glu Lys Ile Lys Lys Arg Leu Ser Lys Ser Ile Asn Gly Asp Asp Phe Tyr Thr Phe Mea Lys Asp Arg Leu Ser Leu Asp Ile Leu Leu Ile Asp Glu Ile Leu Tyr Arg Phe Ile Asp Lys 100 105 17_0 115 Ile Arg Ser Ser Ile Asp Ile Phe Ser Glu G7.n Lys Asp Val Glu Ser Ile Ser Asp Ala Phe Leu Leu Arg Leu Gly Gln Phe Lys Leu Tyr Thr Phe Pro Lys Asn Leu Gly Asn Val Lys Met His Glu Leu Glu Gln Met Phe Ser Asp Tyr Glu Leu Arg Leu Asn Thr Tyr Thr Glu Val Leu Arg Tyr Ile Lys Asn His Pro Lys Glu Val Leu Pro Lys Asn Leu Ile Met Glu Val Asn Met Asp Phe Val Leu Asn Lys Ile Ser Lys Val Leu Pro-200 205 2l0 Phe Thr Thr His Ser Leu Gln Val Ser Lys Ile Val Leu Ala Leu Thr Ile Leu Ala Leu Leu Leu Gly Leu Arg Lys Leu Ile Thr Trp Leu Leu Ala Leu Leu Leu Asp Arg Ile Phe Glu Ile Met Gln Arg Asn Lys Lys Met His Val Asn Val Gln Lys Ser Ile Val Ser Pro Val Ser Val Phe TTA GCC CTA TTT AGT TGC GAT GTG GCT TTA GAT ATT TTC TAC TAC CCT 1l45 Leu Ala Leu Phe Ser Cys Asp Val Ala Leu Asp Ile Phe Tyr Tyr Pro Asn Ala Ser Pro Pro Lys Val Ser Met Trp Val Gly Ala Val Tyr Ile Met Leu Leu Ala Trp Leu Val Ile Ala Leu Phe Lys Gly Tyr Gly Glu 310 3l5 320 Ala Leu Val Thr Asn Met Ala Thr Lys Ser Thr His Asn Phe Arg Lys Glu Val Ile Asn Leu Ile Leu Lys Val Val Tyr Phe Leu Ile Phe Ile Val Ala Leu Leu Gly Val Leu Lys Gln Leu Gly Phe Asn Val Ser Ala Ile Ile Ala Ser Leu Gly Ile Gly Gly Leu Ala Va1 Ala Leu Ala Val Lys Asp Val Leu Ala Asn Phe Phe Ala Ser 
Val Ile Leu Leu Leu Asp Asn Ser Phe Ser Gln Gly Asp Trp Ile Val Cys Gly Glu Val Glu Gly ACG GTG GTG GAA ATG GGG TTA AGG CGC ACC ACG ATC AGA GCC TTT GAC 1577 Thr Val Val Glu Met Gly Leu Arg Arg Thr Thr Ile Arg Ala Phe Asp Asn Ala Leu Leu Ser Val Pro Asn Ser Glu Leu Ala Gly Lys Pro Ile Arg Asn Trp Ser Arg Arg Lys Val Gly Arg Arg Ile Lys Met Glu Ile GGC TTA ACT TAT AGC TCC AGT CAA AGC GCT TTA CAG CTT TGC GTG AAA 1721 Gly Leu Thr Tyr Ser Ser Ser Gln Ser Ala Leu Gln Leu Cys Val Lys Asp Ile Lys Glu Met Leu Glu Asn His Pro Lys Ile Ala Asn Gly Ala Asp Ser Ala Leu Gln Asn Val Ser Asp Tyr Arg Tyr Met Phe Lys Lys 500 505 510 515 Asp Ile Val Ser Ile Asp Asp Phe Leu Gly Tyr Lys Asn Asn Leu Phe GTC TTT TTA GAT CAG TTT GCG GAC AGC TCT ATT AAT ATT TTA GTG TAT 1913 Val Phe Leu Asp Gln Phe Ala Asp Ser Ser Ile Asn Ile Leu Val Tyr Cys Phe Ser Lys Thr Val Val Trp Glu Glu Trp Leu Glu Val Lys Glu Asp Val Met Leu Lys Ile Met Gly Ile Val Glu Lys His His Leu Ser GAA GTG ATC AAC TTG ATT TTA AAA GT
Phe Ala Phe Pro Ser Gln Ser Leu Tyr Val Glu Ser Leu Pro Glu Val Ser Leu Lys Glu Gly Ala Lys Ile
(2) INFORMATION FOR SEQ ID NO:132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 623 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...20 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:132:
Met Arg Leu Leu Leu Trp Trp Val Leu Val Leu Ser Leu Phe Leu Asn Pro Leu Arg Ala Val Glu Glu His Glu Thr Asp Ala Val Asp Leu Phe Leu Ile Phe Asn Gln Ile Asn Gln Leu Asn Gln Val Ile Glu Thr Tyr Lys Lys Asn Pro Glu Arg Ser Ala Glu Ile Ser Leu Tyr Asn Thr Gln Lys Asn Asp Leu Ile Lys Ser Leu Thr Ser Lys Val Leu Asn Glu Arg Asp Lys Ile Gly Ile Asp Ile Asn Gln Asn Leu Lys Glu Gln Glu Lys Ile Lys Lys Arg Leu Ser Lys Ser Ile Asn Gly Asp Asp Phe Tyr Thr Phe Met Lys Asp Arg Leu Ser Leu Asp Ile Leu Leu Ile Asp Glu Ile Leu Tyr Arg Phe Ile Asp Lys Ile Arg Ser Ser Ile Asp Ile Phe Ser 1l0 115 120 Glu Gln Lys Asp Val Glu Ser Ile Ser Asp Ala Phe Leu Leu Arg Leu Gly Gln Phe Lys Leu Tyr Thr Phe Pro Lys Asn Leu Gly Asn Val Lys Met His Glu Leu Glu Gln Met Phe Ser Asp Tyr Glu Leu Arg Leu Asn -z7o-' 160 165 170 Thr Tyr Thr Glu Val Leu Arg Tyr Ile Lys Assn His Pro Lys Glu Val l75 1S0 185 Leu Pro Lys Asn Leu Ile Met Glu Val Asn Me=t Asp Phe Val Leu Asn Lys Ile Ser Lys Val Leu Pro Phe Thr Thr H_is Ser Leu Gln Val Ser 205 210 2:L5 220 Lys Ile Val Leu Ala Leu Thr Ile Leu Ala Lc=a Leu Leu Gly Leu Arg ' Lys Leu Ile Thr Trp Leu Leu Ala Leu Leu Le:u Asp Arg Ile Phe Glu Ile Met Gln Arg Asn Lys Lys Met His Val Asn Val Gln Lys Ser Ile 255 260 265 _ Val Ser Pro Val Ser Val Phe Leu Ala Leu Phe Ser Cys Asp Val Ala Leu Asp Ile Phe Tyr Tyr Pro Asn Ala Ser Pro Pro Lys Val Ser Met Trp Val Gly Ala Val Tyr Ile Met Leu Leu Ala Trp Leu Val Ile Ala Leu Phe Lys Gly Tyr Gly Glu Ala Leu Val Thr Asn Met Ala Thr Lys Ser Thr His Asn Phe Arg Lys Glu Val Ile Asn Leu Ile Leu Lys Val Val Tyr Phe Leu Ile Phe Ile Val Ala Leu Leu Gly Val Leu Lys Gln Leu Gly Phe Asn Val Ser Ala Ile Ile Ala Se:r Leu Gly Ile Gly Gly Leu Ala Val Ala Leu Ala Val Lys Asp Val LE:u Ala Asn Phe Phe Ala Ser Val Ile Leu Leu Leu Asp Asn Ser Phe Ser Gln Gly Asp Trp Ile Val Cys Gly Glu Val Glu Gly Thr Val Val G7.u Met Gly Leu Arg Arg Thr Thr Ile Arg Ala Phe Asp Asn Ala Leu Leu Ser Val Pro Asn Ser Glu Leu Ala Gly Lys Pro Ile Arg Asn Trp Ser Arg Arg Lys Val 
Gly Arg Arg Ile Lys Met Glu Ile Gly Leu Thr Tyr Ser Ser Ser Gln Ser Ala Leu Gln Leu Cys Val Lys Asp Ile Lys Glu Met Leu Glu Asn His Pro Lys Ile Ala Asn Gly Ala Asp Ser Ala Leu Gln Asn Val Ser Asp Tyr Arg Tyr Met Phe Lys Lys Asp Ile Val Ser Ile Asp Asp Phe Leu Gly Tyr Lys Asn Asn Leu Phe Val Phe Leu Asp Gln Phe Ala Asp Ser Ser Ile Asn Ile Leu Val Tyr Cys Phe Ser Lys Thr Val Val Trp Glu Glu Trp Leu Glu Val Lys Glu Asp Val Met Leu Lys Ile Met Gly Ile Val Glu Lys His His Leu Ser Phe Ala Phe Pro Ser Gln Ser Leu Tyr Val Glu Ser Leu Pro Glu Val Ser Leu Lys Glu Gly Ala Lys Ile
(2) INFORMATION FOR SEQ ID NO:133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1...429 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 1...93 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 94...429 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:133:
Met Lys Lys Phe Phe Ser Gln Ser Leu Leu Ala Leu Ile Val Ser Met
Asn Ala Leu Leu Ala Met Asp Gly Asn Gly Val Phe Leu Gly Ala Gly
Tyr Leu Gln Gly Gln Ala Gln Met His Ala Asp Ile Asn Ser Gln Lys
Gln Ala Thr Asn Ala Thr Ile Lys Gly Phe Asp Ala Leu Leu Gly Tyr
Gln Phe Phe Phe Gly Lys Tyr Phe Gly Leu Arg Ala Tyr Gly Phe Phe
Asp Tyr Ala His Ala Asn Ser Ile Arg Leu Lys Asn Pro Asn Tyr Asn
Ser Glu Val Ala Gln Leu Ala Gly Gln Ile Leu Gly Lys Gln Glu Ile
Asn Arg Leu Thr Ser Leu Ala Asp Pro Lys Thr Phe Glu Pro Asn Met
Leu Thr Tyr Gly Gly Ala Met Asp Leu Met Val Asn Val His Gln
(2) INFORMATION FOR SEQ ID NO:134:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:134:
Met Lys Lys Phe Phe Ser Gln Ser Leu Leu Ala Leu Ile Val Ser Met
Asn Ala Leu Leu Ala Met Asp Gly Asn Gly Val Phe Leu Gly Ala Gly
Tyr Leu Gln Gly Gln Ala Gln Met His Ala Asp Ile Asn Ser Gln Lys
Gln Ala Thr Asn Ala Thr Ile Lys Gly Phe Asp Ala Leu Leu Gly Tyr
Gln Phe Phe Phe Gly Lys Tyr Phe Gly Leu Arg Ala Tyr Gly Phe Phe
Asp Tyr Ala His Ala Asn Ser Ile Arg Leu Lys Asn Pro Asn Tyr Asn
Ser Glu Val Ala Gln Leu Ala Gly Gln Ile Leu Gly Lys Gln Glu Ile
Asn Arg Leu Thr Ser Leu Ala Asp Pro Lys Thr Phe Glu Pro Asn Met
Leu Thr Tyr Gly Gly Ala Met Asp Leu Met Val Asn Val His Gln
100 105 110
(2) INFORMATION FOR SEQ ID NO:135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 336 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1...333 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 1...60 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 61...333 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:135:
Met Lys Thr Phe Lys Asn Leu Leu Cys Phe Ser Leu Ile Ala Met Ser
Trp Leu Gln Ala Asp Met Leu Asp Asn Phe Thr Arg Ala Ile Asn Ser
Tyr Thr Thr Lys Lys Leu Asn Glu Ile Lys Asp Gln Val Asn Ser Ala
AAC CCT ACT AAA AAT CAC AAT ACC ACT TAT AAC GCT AAT GGC ATG CTC 192
Asn Pro Thr Lys Asn His Asn Thr Thr Tyr Asn Ala Asn Gly Met Leu
Ile Asn Ile Asp Cys Lys Val Leu Lys Asn Asn Phe Tyr Ser Val Cys
Tyr Ser Ser Glu Leu Lys Asn Pro Ile Tyr Gly Val Ser Val Leu Phe
Gly Asp Leu Val Asp Lys Asn Asn Ile Glu Lys Arg Tyr Glu Phe
(2) INFORMATION FOR SEQ ID NO:136:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:136:
Met Lys Thr Phe Lys Asn Leu Leu Cys Phe Ser Leu Ile Ala Met Ser
Trp Leu Gln Ala Asp Met Leu Asp Asn Phe Thr Arg Ala Ile Asn Ser
Tyr Thr Thr Lys Lys Leu Asn Glu Ile Lys Asp Gln Val Asn Ser Ala
Asn Pro Thr Lys Asn His Asn Thr Thr Tyr Asn Ala Asn Gly Met Leu
Ile Asn Ile Asp Cys Lys Val Leu Lys Asn Asn Phe Tyr Ser Val Cys
Tyr Ser Ser Glu Leu Lys Asn Pro Ile Tyr Gly Val Ser Val Leu Phe
Gly Asp Leu Val Asp Lys Asn Asn Ile Glu Lys Arg Tyr Glu Phe
(2) INFORMATION FOR SEQ ID NO:137:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2185 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 81...2069 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 81...144 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:137:
GTAAAAAATG GCTTATCTGT TCTAGCCTAC TCCCCTT.ATT TTTTCTTAAT CCCTTAGCGG 60 CAGAAGATGA TGGGTTTTTT ATG GGG GTG AGT T.AT CAA ACT TCT CTA GCT 110 Met Gly Val Ser Tyr Gln Thr Ser Leu Ala Ile Gln Arg Val Asp Asn Ser Gly Leu Asn .Ala Ser Gln Ala Ala Ser Thr Tyr Ile Arg Gln Asn Ala Ile Ala Leu Glu Ser Ala Ala Val Pro Leu Ala Tyr Tyr Leu Glu Ala Met Gly Gln Gln Thr Arg Val Leu Met Gln Met Leu Cys Pro Asp Pro Ser Lys Arg Cys Leu Leu Tyr Ala Gly GGT TAT 'AAA AAC GGA TCA AGT AAT ACT AAC GGC GAT ACA GGC AAC AAC 350 Gly Tyr Lys Asn Gly Ser Ser Asn Thr Asn Gly Asp Thr Gly Asn Asn Pro Pro Arg Gly Asn Val Asn Ala Thr Phe Asp Met Gln Ser Leu Val Asn Asn Leu Asn Lys Leu Thr Gln Leu Ile Gly Glu Thr Leu Ile Arg Asn Pro Glu Asn Leu Ser Asn Ala Lys Val Phe Asn Val Lys Phe Gly l05 110 115 Asn Gln Ser Thr Val Ile Ala Leu Pro Glu Gly Leu Ala Asn Thr Met 120 l25 130 Asn Ala Leu Asn Asp Asp Ile Thr Asn Ala Leu Thr Thr Leu Trp Tyr Asn Gln Thr Leu Thr Asn Lys Ser Phe Asn Ser Gly Asn Ser Val Asn Phe Ser Pro Gln Val Leu Gln His Leu Leu G1n Asp Gly Leu Ala Thr 170 l75 180 Ser Asn Gln Thr Ile Cys Ser Thr Gln Asn Gln Cys Thr Ala Thr Asn Glu Ala Lys Ser Ile Ala Gln Asn Ala Gln Asn Ile Phe Gln Ala Leu Met Gln Ala Gly Ile Leu Gly Gly Leu Ala Asn Glu Lys Gln Phe Gly Phe Thr Tyr Asn Lys Ala Pro Asn Gly Ser Asp Ser Gln Gln Gly Tyr WO 98/21225 PCT/US97/21353 ' CAA AGC TTT AGC GGC CCG GGT TAT TAC ACT .AAA AAC GGC GCT AAT GGC 926 Gln Ser Phe Ser Gly Pro Gly Tyr Tyr Thr Lys Asn Gly Ala Asn Gly Thr Thr Gln Ala Pro Leu Lys Ala Leu Pro .~la Gly Ala Thr Ile Gly Ser Gly Asn Gly Gln Tyr Thr Tyr His Pro Ser Ser Ala Val Tyr Tyr TTA GCC GAT AGC ATC ATT GCT AAT GGC ATC i3CC GCT TCT ATG ATT TTT 1070 Leu Ala Asp Ser Ile Ile Ala Asn Gly Ile 'L'hr Ala Ser Met Ile Phe Ser Gly Met Gln Asn Phe Ala Asn Lys Ala Ala Lys Leu Thr Gly Thr Ser Ser Tyr Ser Gln Met Gln Asp Ala Ile Asn Tyr Gly G1u Ser Leu Leu Ser Asn Thr Val Ala Tyr Gly Asp Phe 7:1e Thr Asn Trp Val Ala Pro Tyr Leu Asp Leu Asn Asn Lys Gly Leu Asn Phe Leu Pro Ser Tyr Gly Gly 
Gln Leu Asn Gly Ala Asn His Gln Thr Pro Gln Leu Thr Pro CAA CAA GCC CAA CAA GAG CAA AAA GTC ATC F,TG AAC CAA CTA GAG CAA 1358 Gln Gln Ala Gln Gln Glu Gln Lys Val Ile Nfet Asn Gln Leu Glu Gln GCC ACA AAC GCC CCC ACC CCC GCG CAA ATA A.AC AGG ATT TTA GCC AAC l406 Ala Thr Asn Ala Pro Thr Pro A1a Gln Ile A.sn Arg Ile Leu Ala Asn Pro Tyr Ser Pro Thr Ala Lys Thr Leu Met A.la Tyr Gly Leu Tyr Arg Ser Lys Ala Val Ile Gly Gly Val Ile Asp Glu Met Gln Thr Lys Val Asn Gln Val Tyr Gln Met Gly Phe Ala Arg Asn Phe Leu Glu His Asn Ser Asn Ser Asn Asn Met Asn Gly Phe Gly Val Lys Met Gly Tyr Lys Gln Phe Phe Gly Lys Lys Arg Met Phe Gly Leu Arg Tyr Tyr Gly Phe Tyr Asp Phe Gly Tyr Ala Gln Phe Gly Ala Glu Ser Ser Leu Val Lys Ala Thr Leu Ser Ser Tyr Gly Ala Gly Thr Asp Phe Leu Tyr Asn Val TTT ACC CGA AAA AGA GGG ACT GAA GCG ATA GAT ATC GGT TTT TTT GCC l790 Phe Thr Arg Lys Arg Gly Thr Glu Ala Ile Asp Ile Gly Phe Phe Ala GGT ATC CAA CTT GCA GGG CAA ACT TGG AAA ACG AAT TTT TTA GAT CAA 18.38 Gly Ile Gln Leu Ala Gly Gln Thr Trp Lys Thr Asn Phe Leu Asp Gln Val Asp Gly Asn His Leu Lys Pro Lys Asp Thr Ser Phe Gln Phe Leu Phe Asp Leu Gly Ile Arg Thr Asn Phe Ser Lys Ile Ala His Gln Lys Arg Ser Arg Phe Ser Gln Gly Ile Glu Phe Gly Leu Lys Ile Pro Val Leu Tyr His Thr Tyr Tyr Gln Ser Glu Gly Val Thr Ala Lys Tyr Arg Arg A1a Phe Ser Phe Tyr Val Gly Tyr Asn Ile Gly Phe (2) INFORMATION FOR SEQ ID N0:138:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 663 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...21 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:138:
Met Gly Val Ser Tyr Gln Thr Ser Leu Ala _Cle Gln Arg Val Asp Asn Ser Gly Leu Asn Ala Ser Gln Ala Ala Ser 'Chr Tyr Ile Arg Gln Asn Ala Ile Ala Leu Glu Ser Ala Ala Val Pro Leu Ala Tyr Tyr Leu Glu Ala Met Gly Gln Gln Thr Arg Val Leu Met Gln Met Leu Cys Pro Asp Pro Ser Lys Arg Cys Leu Leu Tyr Ala Gly Gly Tyr Lys Asn Gly Ser Ser Asn Thr Asn Gly Asp Thr Gly Asn Asn Pro Pro Arg Gly Asn Val Asn Ala Thr Phe Asp Met Gln Ser Leu Val Asn Asn Leu Asn Lys Leu Thr Gln Leu Ile Gly Glu Thr Leu Ile Arg Asn Pro Glu Asn Leu Ser Asn Ala Lys Val Phe Asn Val Lys Phe Gly Asn Gln Ser Thr Val Ile Ala Leu Pro Glu Gly Leu Ala Asn Thr Met Asn Ala Leu Asn Asp Asp Ile Thr Asn Ala Leu Thr Thr Leu Trp Tyr Asn Gln Thr Leu Thr Asn 140 l45 7.50 155 Lys Ser Phe Asn Ser Gly Asn Ser Val Asn Phe Ser Pro Gln Val Leu Gln His Leu Leu Gln Asp Gly Leu Ala Thr ~~er Asn Gln Thr Ile Cys 175 180 l85 Ser Thr Gln Asn Gln Cys Thr Ala Thr Asn Cilu Ala Lys Ser Ile Ala Gln Asn Ala Gln Asn Ile Phe Gln Ala Leu Nlet Gln Ala Gly Ile Leu Gly Gly Leu Ala Asn Glu Lys Gln Phe Gly F~he Thr Tyr Asn Lys Ala Pro Asn Gly Ser Asp Ser Gln Gln Gly Tyr Gln Ser Phe Ser Gly Pro Gly Tyr Tyr Thr Lys Asn Gly Ala Asn Gly T'hr Thr Gln Ala Pro Leu Lys Ala Leu Pro Ala Gly Ala Thr Ile Gly 5'er Gly Asn Gly Gln Tyr Thr Tyr His Pro Ser Ser Ala Val Tyr Tyr Lieu Ala Asp Ser Ile Ile Ala Asn Gly Ile Thr Ala Ser Met Ile Phe Ser Gly Met Gln Asn Phe Ala Asn Lys Ala Ala Lys Leu Thr Gly Thr Ser Ser Tyr Ser Gln Met Gln Asp Ala Ile Asn Tyr Gly Glu Ser Leu Leu Ser Asn Thr Val Ala Tyr Gly Asp Phe Ile Thr Asn Trp Val Ala Pro Tyr Leu Asp Leu Asn Asn Lys Gly Leu Asn Phe Leu Pro Ser Tyr Gly Gly Gln Leu Asn Gly Ala Asn His Gln Thr Pro Gln Leu Thr Pro Gln Gln Ala Gln Gln Glu Gln Lys Val Ile Met Asn Gln Leu Glu Gln Ala Thr Asn Ala Pro Thr Pro Ala Gln Ile Asn Arg Ile Leu Ala Asn Pro Tyr Ser Pro Thr Ala Lys Thr Leu Met Ala Tyr Gly Leu Tyr Arg Ser Lys Ala Val Ile Gly Gly Val Ile Asp Glu Met Gln Thr Lys Val Asn Gln Val Tyr Gln Met Gly Phe Ala Arg Asn Phe Leu Glu His Asn 
Ser Asn Ser Asn Asn Met Asn Gly Phe Gly Val Lys Met Gly Tyr Lys Gln Pl-ie Phe Gly Lys Lys Arg Met Phe Gly Leu Arg Tyr Tyr Gly Phe Tyr Asp Phe Gly Tyr Ala Gln Phe Gly Ala Glu Ser Ser Leu Val Lys Ala Thr Leu Ser Ser Tyr Gly Ala Gly Thr Asp Phe Leu Tyr Asn Val Phe Thr Arg Lys Arg Gly Thr Glu Ala Ile Asp Ile Gly Phe Phe Ala Gly Ile Gln Leu Ala Gly Gln Thr Trp Lys Thr Asn Phe Leu Asp Gln Val Asp Gly Asn His Leu Lys Pro Lys Asp Thr Ser Phe Gln Phe Leu Phe Asp Leu Gly Ile Arg Thr Asn Phe Ser Lys Ile Ala His Gln Lys Arg Ser Arg Phe Ser Gln Gly Ile Glu Phe Gly Leu Lys Ile Pro Val Leu Tyr His Thr Tyr Tyr 605 6l0 615 Gln Ser Glu Gly Val Thr Ala Lys Tyr Arg Arg Ala Phe Ser Phe Tyr Val Gly Tyr Asn Ile Gly Phe (2) INFORMATION FOR SEQ ID N0:139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1213 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...1160 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:139:
Met Gly Phe Xaa Leu Ala Leu Gly Tyr Leu Cys Leu Phe Ile Phe Val Leu Ser Ala Ser Leu Ile Ser Glu Lys Ala Leu Ser Lys Gln Tyr Leu Gln Thr GCT AAA GAT AAA ATC ACC TCT TTA AAG AAT 'TTA AAA GTC ATC GCC ATT 200 Ala Lys Asp Lys Ile Thr Ser Leu Lys Asn Leu Lys Val Ile Ala Ile ACC GGA AGC TTT GGG AAA ACC AGC ACC AAA .AAT TTC TTG CTT CAA ATC 248 Thr Gly Ser Phe Gly Lys Thr Ser Thr Lys .Asn Phe Leu Leu Gln Ile Leu Gln Thr Thr Phe Asn Ala His Ala Ser Pro Lys Ser Val Asn Thr CTT TTA GGG CTT GCG AAT GAT ATT AAT CAG .~1AT TTA GAC GAT AGG AGT 344 Leu Leu Gly Leu Ala Asn Asp Ile Asn Gln .Asn Leu Asp Asp Arg Ser GAA ATC TAT ATC GCT GAA GCC GGG GCA AGG ,SAT AAG GGC GAT ATT AAA 392 Glu Ile Tyr Ile Ala Glu Ala Gly Ala Arg ,Asn Lys Gly Asp Ile Lys Glu Ile Thr Cys Leu Ile Glu Pro His Leu 'Val Val Val Ala Glu Val GGC GAA CAG CAT TTA GAA TAC TTT AAA ACT 'rTA GAA AAT ATT TGC GAG 488 Gly Glu Gln His Leu Glu Tyr Phe Lys Thr :Leu Glu Asn Ile Cys Glu ACT AAA GCG GAA TTA TTG GAT TCC AAA CGC 'TTA GAA AAA GCC TTT TGT 536 Thr Lys Ala Glu Leu Leu Asp Ser Lys Arg :Leu Glu Lys Ala Phe Cys Tyr Ser Val Glu Lys Ile Lys Pro Tyr Ala :Pro Lys Asp Ser Pro Leu 165 l70 175 Ile Asp Tyr Ser Ser Leu Val Lys Asn Ile Gln Ser Thr Leu Lys Gly Thr Ser Phe Glu Met Leu Ile Gly Ser Val Trp Glu Arg Phe Glu Thr Lys Val Leu Gly Glu Phe Ser Ala Tyr Asn Ile Ala Ser Ala Ile Leu Ile Ala Lys His Leu Gly Leu Glu Thr Glu Arg Ile Lys Arg Leu Val Leu Glu Leu Asn Pro Ile Ala His Arg Leu Gln Leu Leu Glu Val Asn Gln Lys Ile Ile Ile Asp Asp 5er Phe Asn Gly Asn Leu Lys Gly Met Leu Glu Gly Ile Arg Leu Ala Ser Leu His Lys Gly Arg Lys Val Ile Val Thr Pro Gly Leu Val Glu Ser Asn Thr Glu Ser Asn Glu Ala Leu Ala Gln Lys Ile Asp Gly Val Phe Asp Val Ala Ile Ile Thr Gly Glu Leu Asn Ser Lys Thr Ile Ala Ser Gln Leu Lys Thr Pro Gln Lys Ile Leu Leu Lys Asp Lys Ala Gln Leu Glu Asn Ile Leu Gln Ala Thr Thr Ile Gln Gly Asp Leu Ile Leu Phe Ala Asn Asp Ala Pro Asn Tyr Ile AGGAAATGAA CATGCAACAT TTATACGCTC CTTGGCGCGA AAGTTATTTG AA l213 (2) INFORMATION FOR 
SEQ ID NO:140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 370 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:140:
Met Gly Phe Xaa Leu Ala Leu Gly Tyr Leu Cys Leu Phe Ile Phe Val Leu Ser Ala Ser Leu Ile Ser Glu Lys Ala Leu Ser Lys Gln Tyr Leu Gln Thr Ala Lys Asp Lys Ile Thr Ser Leu hys Asn Leu Lys Val Ile Ala Ile Thr Gly Ser Phe Gly Lys Thr Ser '.Chr Lys Asn Phe Leu Leu Gln Ile Leu Gln Thr Thr Phe Asn Ala His Ala Ser Pro Lys Ser Val 65 70 '75 80 Asn Thr Leu Leu Gly Leu Ala Asn Asp Ile Asn Gln Asn Leu Asp Asp Arg Ser Glu Ile Tyr Ile Ala Glu Ala Gly Ala Arg Asn Lys Gly Asp Ile Lys Glu Ile Thr Cys Leu Ile G1u Pro His Leu Val Val Val Ala l15 120 125 Glu Val Gly Glu Gln His Leu Glu Tyr Phe Lys Thr Leu Glu Asn Ile l30 135 140 Cys Glu Thr Lys Ala Glu Leu Leu Asp Ser Lys Arg Leu Glu Lys Ala 145 150 7_55 l60 Phe Cys Tyr Ser Val Glu Lys Ile Lys Pro Tyr Ala Pro Lys Asp Ser Pro Leu Ile Asp Tyr Ser Ser Leu Val Lys Asn Ile Gln Ser Thr Leu Lys Gly Thr Ser Phe Glu Met Leu Ile Gly :>er Val Trp Glu Arg Phe Glu Thr Lys Val Leu Gly Glu Phe Ser Ala Tyr Asn Ile Ala Ser Ala Ile Leu Ile Ala Lys His Leu Gly Leu Glu Thr Glu Arg Ile Lys Arg Leu Val Leu Glu Leu Asn Pro Ile Ala His Arg Leu Gln Leu Leu Glu Val Asn Gln Lys Ile Ile Ile Asp Asp Ser Phe Asn Gly Asn Leu Lys Gly Met Leu Glu Gly Ile Arg Leu Ala Ser Leu His Lys Gly Arg Lys Val Ile Val Thr Pro Gly Leu Val Glu Ser Asn Thr Glu Ser Asn Glu Ala Leu Ala Gln Lys Ile Asp Gly Val Phe Asp Val Ala Ile Ile Thr 305 310 .15 320 Gly Glu Leu Asn Ser Lys Thr Ile Ala Ser C~ln Leu Lys Thr Pro Gln Lys Ile Leu Leu Lys Asp Lys Ala Gln Leu Glu Asn Ile Leu Gln Ala Thr Thr Ile Gln Gly Asp Leu Ile Leu Phe Ala Asn Asp Ala Pro Asn Tyr Ile (2) INFORMATION FOR SEQ ID N0:141:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 360 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 82...270 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:141:
AGACTTTTTT TGAATGAGTA A GGA GAA AAT ATT TTG TTC CAT AAA CTG ATC 111
Gly Glu Asn Ile Leu Phe His Lys Leu Ile
Leu Thr Cys Phe Leu Ala Leu Val Ala Ile Thr Ile Gln Ala Cys Gly
Tyr Lys Ala Pro Pro Phe Asn Glu Lys Pro Ala Lys Lys Thr Ser Asn
Ser Ser Asn Ser Ser Met Gln Thr Pro Thr Asn Ser Thr Thr Pro Glu
Phe Leu Asn Gln Pro
(2) INFORMATION FOR SEQ ID NO:142:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 63 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:142:
Gly Glu Asn Ile Leu Phe His Lys Leu Ile Leu Thr Cys Phe Leu Ala
1 5 10 15
Leu Val Ala Ile Thr Ile Gln Ala Cys Gly Tyr Lys Ala Pro Pro Phe
Asn Glu Lys Pro Ala Lys Lys Thr Ser Asn Ser Ser Asn Ser Ser Met
Gln Thr Pro Thr Asn Ser Thr Thr Pro Glu Phe Leu Asn Gln Pro
(2) INFORMATION FOR SEQ ID NO:143:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1024 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 115...921 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:143:
Met AAG AGA GTT AGA GAA CTT GTA AAA AAA CAT C'.CC GAG AAA AGC AGT GTG 165 Lys Arg Val Arg Glu Leu Val Lys Lys His Pro Glu Lys Ser Ser Val Ala Leu Val Val Leu Thr His Ala Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln F~la Glu Lys Glu Asn Gln ATC AAT TGG TGG AAA TAT TCA GGA TTA ACA F~TA GCG ACA AGT TTA TTA 309 Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr I:le Ala Thr Ser Leu Leu 50 55 6.0 65 Leu Ala Ala Cys Ser Val Gly Asp Ile Asp L~ys Gln Ile Glu Leu Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg P.sp Arg Ala Asn Lys Ser Gly Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ile Lys Glu Gln Lys l00 105 110 Asp Leu Val Lys Lys Ala Glu Gln Asn Cys Gln Glu Asn His Gly Gln Phe Phe Met Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala Ile Glu Val l30 135 140 145 Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro His Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu TCT CTG CCC TAT TCA CAA AAA GCT ATC GCT AAA CAA GTG AAT TTT TAC 74l Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asn Phe Tyr Arg Pro Ser Ser Val Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Gly Asn Glu Lys Pro Thr Ser Pro Pro Phe Asn Leu 5er Lys Pro Phe Val Arg Ser Lys Asn Ile Cys (2) INFORMATION FOR SEQ ID N0:144:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 269 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:144:
Met Lys Arg Val Arg Glu Leu Val Lys Lys His Pro Glu Lys Ser Ser
Val Ala Leu Val Val Leu Thr His Ala Ala Cys Lys Lys Ala Lys Glu
Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn
Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser Leu
Leu Leu Ala Ala Cys Ser Val Gly Asp Ile Asp Lys Gln Ile Glu Leu
Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala Asn Lys
Ser Gly Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ile Lys Glu Gln
Lys Asp Leu Val Lys Lys Ala Glu Gln Asn Cys Gln Glu Asn His Gly
115 120 125
Gln Phe Phe Met Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala Ile Glu
Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr
Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro His Ser Gln
Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu
Glu Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asn Phe
Tyr Arg Pro Ser Ser Val Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe
Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys
Ala Gln Ala Lys Met Leu Gly Asn Glu Lys Pro Thr Ser Pro Pro Phe
Asn Leu Ser Lys Pro Phe Val Arg Ser Lys Asn Ile Cys
(2) INFORMATION FOR SEQ ID NO:145:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 669 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 88...603 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:145:
Met Phe Asp Lys Lys Leu Ser Ser Asn
Asp Trp His Ile Gln Lys Val Glu Met Asn His Gln Val Tyr Asp Ile
15 20 25
Glu Thr Met Leu Ala Asp Ser Ala Phe Arg Glu His Glu Glu Glu Gln
Asp Ser Ser Leu Asn Thr Ala Leu Pro Glu Asp Lys Thr Ala Ile Glu
Ala Lys Glu Gln Glu Gln Lys Glu Lys Arg Lys Arg Trp Tyr Glu Leu
Phe Lys Lys Lys Pro Lys Pro Lys Ser Ser Met Gly Glu Phe Val Phe
Asp Gln Lys Glu Asn Arg Ile Tyr Gly Lys Gly Tyr Cys Asn Arg Tyr
Phe Ala Ser Tyr Val Trp Gln Gly Asp Arg His Ile Gly Ile Glu Asp
Ser Gly Ile Ser Arg Lys Val Cys Lys Asp Glu His Leu Met Ala Phe
Glu Leu Glu Phe Met Glu Asn Phe Lys Gly Asn Phe Thr Val Thr Lys
Gly Lys Asp Thr Leu Ile Leu Asp Asn Gln Lys Met Lys Ile Tyr Leu
Lys Thr Pro
170
(2) INFORMATION FOR SEQ ID NO:146:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 172 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:146:
Met Phe Asp Lys Lys Leu Ser Ser Asn Asp Trp His Ile Gln Lys Val
Glu Met Asn His Gln Val Tyr Asp Ile Glu Thr Met Leu Ala Asp Ser
Ala Phe Arg Glu His Glu Glu Glu Gln Asp Ser Ser Leu Asn Thr Ala
Leu Pro Glu Asp Lys Thr Ala Ile Glu Ala Lys Glu Gln Glu Gln Lys
Glu Lys Arg Lys Arg Trp Tyr Glu Leu Phe Lys Lys Lys Pro Lys Pro
Lys Ser Ser Met Gly Glu Phe Val Phe Asp Gln Lys Glu Asn Arg Ile
Tyr Gly Lys Gly Tyr Cys Asn Arg Tyr Phe Ala Ser Tyr Val Trp Gln
Gly Asp Arg His Ile Gly Ile Glu Asp Ser Gly Ile Ser Arg Lys Val
115 120 125
Cys Lys Asp Glu His Leu Met Ala Phe Glu Leu Glu Phe Met Glu Asn
Phe Lys Gly Asn Phe Thr Val Thr Lys Gly Lys Asp Thr Leu Ile Leu
Asp Asn Gln Lys Met Lys Ile Tyr Leu Lys Thr Pro
(2) INFORMATION FOR SEQ ID NO:147:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1350 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 87...1280 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:147:
GATTGTTTTT TAAAAAAAGG TTGGTA ATG GAA TCA GTA AAA ACA GGA AAA ACA 1l3 Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Thr His Phe Lys Gln Val Ser Ala Ile Thr Asn Ile Ile Arg Ser Val Gly Gly Phe Phe Thr Lys Ile Ala Lys Arg Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Ala Ser Leu Leu Leu Ala Ala Cys Ser Ala Gly Asp Thr Asp Lys Gln Ile Glu Leu Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala Asn Lys Ser Gly Ile Glu Leu Glu Gln Glu Arg Gln Lys Thr Asn Lys Ser Gly Ile Glu Leu Ala Asn Ser Gln Ile Lys Ala Glu Gln Glu Arg Gln Lys Thr Glu Gln Glu Lys Gln Lys Ala l70 175 180 185 Asn Lys Ser Ala Ile Glu Leu Glu Gln G1n Lys Gln Lys Thr Ile Asn Thr Gln Arg Asp Leu Ile Lys Glu Gln Lys Asp Phe Ile Lys Glu Thr Glu Gln Asn Cys Gln Glu Asn His Asn Gln F~he Phe Ile Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala Ile Glu Val G'~lu Ala Glu Cys Lys Thr CCT AAA CCT GCA AAA ACC AAT CAA ACC CCT A,TC CAG CCA AAA CAC CTC 881 Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro His Ser Gln Arg Gly Ser Lys Ala Gln Glu Phe Ile Ala Tyr Leu Gln Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asn Phe Tyr Lys Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asp Leu Lys Pro Asp Pro G.Ln Ala His Leu Pro Thr Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met (2) INFORMATION FOR SEQ ID N0:148:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 398 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:148:
Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Thr His Phe Lys Gln Val Ser Ala Ile Thr Asn Ile Ile Arg Ser Val Gly Gly Phe Phe Thr Lys I1e Ala Lys Arg Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Tle Ala Ala Ser Leu Leu Leu Ala Ala Cys Ser Ala Gly Asp Thr Asp Lys Gln Ile Glu 115 l20 125 Leu Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala Asn Lys Ser Gly Ile Glu Leu Glu Gln Glu Arg Gln Lys Thr Asn Lys Ser 14S l50 155 160 Gly Ile Glu Leu Ala Asn Ser Gln Ile Lys Ala Glu Gln Glu Arg Gln Lys Thr Glu Gln Glu Lys Gln Lys Ala Asn Lys Ser Ala Ile Glu Leu Glu Gln Gln Lys Gln Lys Thr Ile Asn Thr Gln Arg Asp Leu Ile Lys Glu Gln Lys Asp Phe Ile Lys Glu Thr Glu Gln Asn Cys Gln Glu Asn 2l0 215 220 His Asn Gln Phe Phe Ile Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Glri Pro His Ser Gln Arg Gly Ser Lys Ala Gln Glu Phe Ile Ala Tyr Leu Gln Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asn Phe Tyr Lys Pro Ser Ser Tle Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asp Leu Lys Pro Asp Pro Gln Ala His Leu Pro Thr Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Vai Hsn Lys Glu Ile Glu A:la Val Ala Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met 385 390 3:95 (2) INFORMATION FOR SEQ ID N0:149:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 709 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 336...443 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:149:
TAAGGGATATTGCTAACGAT TAAGCTGTATTGGAAGAG'CTTATTTTGCAAGAATTAATCT60 TCTAAGATTACAAAGGGTAG CGTTTCTGTTTTTGGATT".CAGAGCGTTATTTTGATTGTTT240 ATG AAA
ACC ATT
Met Lys Thr Ile Arg Asn AGC GTG TTT ATT GGA GCG TCT TTA CTC GGC GGT TGC GCT AGC GTT GAG 401 Ser Val Phe Ile Gly Ala Ser Leu Leu Gly Gly Cys Ala Ser Val Glu GCT TAT TTT GAC GCT TTG CAT GTT GCT CGC GTT AAA GAC GCT TGTTTATAG 452 Ala Tyr Phe Asp Ala Leu His Val Ala Arg Val Lys Asp Ala AAAAAGAAGC ACACCACACG CCCAAAGACT TTGATAGCCC TTACCACACT GACTAAACCG 512 GCACTAGGTT TTAGTTGGGG GTTTTTAGGG GTGTTATTTT AGATACTCTC TGTTCCCTTA 572 AAGAAAATAA ATTTCTACCA TAAAATAAAA TCTTAAATTA AGGCGACTAA AACCCCACTT 632
(2) INFORMATION FOR SEQ ID NO:150:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:150:
Met Lys Thr Ile Arg Asn Ser Val Phe Ile Gly Ala Ser Leu Leu Gly Gly Cys Ala Ser Val Glu Ala Tyr Phe Asp Ala Leu His Val Ala Arg Val Lys Asp Ala (2) INFORMATION FOR SEQ ID N0:151:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 888 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 19...837 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:151:
Met Glu Phe Met Lys Lys Phe Val Ala Leu Gly Leu Leu Ser Ala Val Leu Ser Ser Ser Leu Leu Ala Glu Gly Asp Gly Val Tyr Ile Gly Thr Asn Tyr Gln Leu Gly Gln Ala Arg Leu Asn Ser Asn Ile Tyr Asn Thr Gly Asp Cys Thr Gly Ser Val Val Gly Cys Pro Pro Gly Leu Thr Ala Asn Lys His Asn Pro Gly Gly Thr Asn Ile Asn Trp His Ala Lys Tyr Ala Asn Gly Ala Leu Asn Gly Leu Gly Leu Asn Val Gly Tyr Lys Lys Phe Phe Gln Phe Lys Ser Phe Asp Met Thr Ser AAG TGG TTT GGT TTT AGA GTG TAT GGG CTT TTT GAT TAT GGG CAT GCC 387 Lys Trp Phe Gly Phe Arg Val Tyr Gly Leu Phe Asp Tyr Gly His Ala Thr Leu Gly Lys Gln Val Tyr Ala Pro Asn Lys Ile Gln Leu Asp Met 125 130 135 GTC TCT TGG GGT GTG GGG AGC GAT TTG TTA GCT GAT ATT ATT GAT AAC 483 Val Ser Trp Gly Val Gly Ser Asp Leu Leu Ala Asp Ile Ile Asp Asn 140 145 150 155 GAT AAC GCT TCT TTT GGT ATT TTT GGT GGG GTC GCT ATC GGC GGT AAC 531 Asp Asn Ala Ser Phe Gly Ile Phe Gly Gly Val Ala Ile Gly Gly Asn ACT TGG AAA AGC TCA GCG GCA AAC TAT TGG AAA GAG CAA ATC ATT GAA 579 Thr Trp Lys Ser Ser Ala Ala Asn Tyr Trp Lys Glu Gln Ile Ile Glu Ala Lys Gly Pro Asp Val Cys Thr Pro Thr Tyr Cys Asn Pro Asn Ala Pro Tyr Ser Thr Lys Thr Ser Thr Val Ala Phe Gln Val Trp Leu Asn Phe Gly Val Arg Ala Asn Ile Tyr Lys His Asn Gly Val Glu Phe Gly 220 225 230 235 GTG AGA GTG CCG CTA CTC ATC AAC AAG TTT TTG AGT GCG GGT CCT AAC 771 Val Arg Val Pro Leu Leu Ile Asn Lys Phe Leu Ser Ala Gly Pro Asn Ala Thr Asn Leu Tyr Tyr His Leu Lys Arg Asp Tyr Ser Leu Tyr Leu Gly Tyr Asn Tyr Thr Phe CCTTATAAAA AGG 888
(2) INFORMATION FOR SEQ ID NO:152:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 273 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:152:
Met Glu Phe Met Lys Lys Phe Val Ala Leu Gly Leu Leu Ser Ala Val Leu Ser Ser Ser Leu Leu Ala Glu Gly Asp Gly Val Tyr Ile Gly Thr Asn Tyr Gln Leu Gly Gln Ala Arg Leu Asn Ser Asn Ile Tyr Asn Thr Gly Asp Cys Thr Gly Ser Val Val Gly Cys Pro Pro Gly Leu Thr Ala Asn Lys His Asn Pro Gly Gly Thr Asn Ile Asn Trp His Ala Lys Tyr Ala Asn Gly Ala Leu Asn Gly Leu Gly Leu Asn Val Gly Tyr Lys Lys Phe Phe Gln Phe Lys Ser Phe Asp Met Thr Ser Lys Trp Phe Gly Phe Arg Val Tyr Gly Leu Phe Asp Tyr Gly His Ala Thr Leu Gly Lys Gln Val Tyr Ala Pro Asn Lys Ile Gln Leu Asp Met Val Ser Trp Gly Val Gly Ser Asp Leu Leu Ala Asp Ile Ile Asp Asn Asp Asn Ala Ser Phe Gly Ile Phe Gly Gly Val Ala Ile Gly Gly Asn Thr Trp Lys Ser Ser Ala Ala Asn Tyr Trp Lys Glu Gln Ile Ile Glu Ala Lys Gly Pro Asp Val Cys Thr Pro Thr Tyr Cys Asn Pro Asn Ala Pro Tyr Ser Thr Lys Thr Ser Thr Val Ala Phe Gln Val Trp Leu Asn Phe Gly Val Arg Ala Asn Ile Tyr Lys His Asn Gly Val Glu Phe Gly Val Arg Val Pro Leu Leu Ile Asn Lys Phe Leu Ser Ala Gly Pro Asn Ala Thr Asn Leu Tyr Tyr His Leu Lys Arg Asp Tyr Ser Leu Tyr Leu Gly Tyr Asn Tyr Thr Phe (2) INFORMATION FOR SEQ ID N0:153:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 310 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 10...279 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:153:
Val Ala Val Lys Lys Ile Val Val Ser Trp Cys Val Ala Leu Ala Phe Leu Ser Ala Asp Ser Ala Gln Ala Asn Lys Ala Ile Ser Asn GCG GAT TTG ATT AAA GAG ATA AGG GAT TTA AAA AAA ATC ATC AGC GCG 147 Ala Asp Leu Ile Lys Glu Ile Arg Asp Leu Lys Lys Ile Ile Ser Ala Gln Asn Thr Glu Ile Asn Asn Leu Arg Lys Val Gln Glu Val Leu Ser GGG CAA TTA GGG GAC ATG CGT AAG GAT ATA TTA AGC ACT AGA GAT TAT 243 Gly Gln Leu Gly Asp Met Arg Lys Asp Ile Leu Ser Thr Arg Asp Tyr Cys Ile Ser Leu Arg Pro Tyr Ile Tyr Asn Trp Arg
(2) INFORMATION FOR SEQ ID NO:154:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:154:
Val Ala Val Lys Lys Ile Val Val Ser Trp Cys Val Ala Leu Ala Phe Leu Ser Ala Asp Ser Ala Gln Ala Asn Lys Ala Ile Ser Asn Ala Asp Leu Ile Lys Glu Ile Arg Asp Leu Lys Lys Ile Ile Ser Ala Gln Asn Thr Glu Ile Asn Asn Leu Arg Lys Val Gln Glu Val Leu Ser Gly Gln Leu Gly Asp Met Arg Lys Asp Ile Leu Ser Thr Arg Asp Tyr Cys Ile Ser Leu Arg Pro Tyr Ile Tyr Asn Trp Arg
(2) INFORMATION FOR SEQ ID NO:155:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 549 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 16...474 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:155:
Met Glu Gln Asn Ile Phe Ser Leu Leu Ile Gln Lys Lys Ser Tyr Lys Lys Leu Glu Thr Leu Leu Lys Leu Lys Lys Leu Lys GTT TTT ATG CCT TTA AGT TTA CAA GAA AAT TTG CTT TTT ATC TTC ATA 147 Val Phe Met Pro Leu Ser Leu Gln Glu Asn Leu Leu Phe Ile Phe Ile Lys Asp Ser Lys Leu Leu Phe Ala Phe Lys Asp Ile Trp Ala Ser Lys Glu Phe Asn Gln Arg Phe Ala Lys Glu Ile Ser His Phe Leu Asn Thr CAA GGG CAT GCT TAT GGG TTT GAC GGG TTG AAT GGG TTA GAA ATT TTA 291 Gln Gly His Ala Tyr Gly Phe Asp Gly Leu Asn Gly Leu Glu Ile Leu Gly Tyr Val Pro Lys Asp Ala Leu Lys Lys Ser Asn Phe Tyr Ala Pro 95 100 105 Ile Lys Lys Gln Ala Arg Phe Phe Arg Pro Ser Ala Leu Gly Leu Phe His Asn Pro Ile Lys Asp Ala Arg Leu His Glu Cys Phe Glu Lys Ala 125 130 135 140 Arg Ala Leu Ile His Tyr Gln Arg Ser Phe Phe Glu Glu TTATTGTCCA GTTTAAAAAA CCTTCCTAAC AGCAGTGGCG TGTATCAATA TTTTGATAAA 546
(2) INFORMATION FOR SEQ ID NO:156:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 153 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:156:
Met Glu Gln Asn Ile Phe Ser Leu Leu Ile Gln Lys Lys Ser Tyr Lys Lys Leu Glu Thr Leu Leu Lys Leu Lys Lys Leu Lys Val Phe Met Pro Leu Ser Leu Gln Glu Asn Leu Leu Phe Ile Phe Ile Lys Asp Ser Lys Leu Leu Phe Ala Phe Lys Asp Ile Trp Ala Ser Lys Glu Phe Asn Gln Arg Phe Ala Lys Glu Ile Ser His Phe Leu Asn Thr Gln Gly His Ala Tyr Gly Phe Asp Gly Leu Asn Gly Leu Glu Ile Leu Gly Tyr Val Pro Lys Asp Ala Leu Lys Lys Ser Asn Phe Tyr Ala Pro Ile Lys Lys Gln Ala Arg Phe Phe Arg Pro Ser Ala Leu Gly Leu Phe His Asn Pro Ile Lys Asp Ala Arg Leu His Glu Cys Phe Glu Lys Ala Arg Ala Leu Ile His Tyr Gln Arg Ser Phe Phe Glu Glu (2) INFORMATION FOR SEQ ID N0:157:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2627 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 18...2582 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:157:
Met Lys Ser Lys Lys Leu Tyr Leu Ala Leu Ile Ile Gly Val Leu Leu Ala Phe Leu Thr Leu Ser Ser Trp Leu Gly Asn AGC GGT TTA GTG GGG CGT TTT GGG GTG TGG TTT GCC GCA CTC AAT AAA 7.46 Ser Gly Leu Val Gly Arg Phe Gly Val Trp Phe Ala Ala Leu Asn Lys Lys Tyr Phe Gly His Leu Ser Phe Ile Asn Leu Pro Tyr Leu Ala Trp Val Leu Phe Leu Leu Tyr Lys Thr Lys Asn Pro Phe Thr Glu Ile Val Leu Glu Lys Thr Leu Gly His Leu Leu Gly Ile Leu Ser Leu Leu Phe Leu Gln Ser Ser Leu Leu Asn Gln Gly Glu Ile Gly Asn Ser Ala Arg Leu Phe Leu Arg Pro Phe Ile Gly Asp Phe Gly Leu Tyr Ala Leu Ile Thr Leu Met Val Val Ile Ser Tyr Leu Ile Leu Phe Lys Leu Pro Pro Lys Ser Val Phe Tyr Pro Tyr Met Asn Lys Thr Gln Asn Leu Leu Lys Glu Ile Tyr Lys Gln Cys Leu Gln Ala Phe Ser Pro Asn Phe Ser Pro Lys Lys Glu Gly Phe Glu Asn Thr Pro Ser Asp Ile Gln Lys Lys Glu ACC AAA AAC GAC AAt~ GAA AAA GAA AAC CGC AAF1 GAA AAC CCT ATT A&T 626 Thr Lys Asn Asp Lys Glu Lys Glu Asn Arg Lvs Glu Asn Pro Ile Asn GAA AAC CAC AAA ACC CCT AAC GAA GAA CCG T'.CT TTA GCG ATC CCT ACC 674 Glu Asn His Lys Thr Pro Asn Glu Glu Pro Phe Leu Ala Ile Pro Thr CCC TAT AAC ACG ACT TTA AAT GAT TCA GAG CC:G CAA GAA GGC TTA GTC 722 Pro Tyr Asn Thr Thr Leu Asn Asp Ser Glu Pro Gln Glu Gly Leu Val 220 225 2.l0 235 CAA ATT TCC TCC CAC CCC CCT ACC CAT TAC AC:C ATT TAC CCT AAA AGA 770 Gln Ile Ser Ser His Pro Pro Thr His Tyr Thr Ile Tyr Pro Lys Arg Asn Arg Phe Asp Asp Leu Thr Asn Pro Thr A:>n Pro Pro Leu Lys Glu ATT AAA CAA GAA ACT AAA GAA AGA GAA CCC AC:G CCT ACA AAA GAA ACT 866 Ile Lys Gln Glu Thr Lys Glu Arg Glu Pro Thr Pro Thr Lys Glu Thr CTT ACG CCC ACC ACG CCC AAA CCT ATC ATG CC,'C ACA CTT GCA CCC ATA 914 Leu Thr Pro Thr Thr Pro Lys Pro Ile Met Pro Thr Leu Ala Pro Ile Ile Glu Asn Asp Asn Lys Thr Glu Asn Gln Lys Thr Pro Asn His Pro 300 305 3l.0 315 Lys Lys Glu Glu Asn Pro Gln Glu Asn Thr Gl.n Glu Glu Met Ile Glu Gly Arg Ile Glu Glu Met Ile Lys Glu Asn Le:u Lys Lys Glu Glu Lys Glu Val Gln Asn Ala Pro Asn Phe Ser Pro Va.l Thr Pro Thr Ser Ala Lys Lys Pro Val Met Val 
Lys Glu Leu Ser Glu Asn Lys Glu Ile Leu GAC GGA TTG GAT TAT GGC GAA GTG CAA AAA CCC AAA GAT TAT GAG CTT 1202 Asp Gly Leu Asp Tyr Gly Glu Val Gln Lys Pro Lys Asp Tyr Glu Leu 380 385 390 395 CCC ACC ACG CAA TTA TTG AAT GCG GTT TGT TTG AAA GAC ACT TCT TTA 1250 Pro Thr Thr Gln Leu Leu Asn Ala Val Cys Leu Lys Asp Thr Ser Leu GAC GAA AAC GAG ATT GAC CAA AAA ATC CAG GAT CTA TTG AGC AAA CTG 1298
Asp Glu Asn Glu Ile Asp Gln Lys Ile Gln Asp Leu Leu Ser Lys Leu Arg Thr Phe Lys Ile Asp Gly Asp Ile Ile Arg Thr Tyr Ser Gly Pro Ile Val Thr Thr Phe Glu Phe Arg Pro Ala Pro Asn Val Lys Val Ser Arg Ile Leu Gly Leu Ser Asp Asp Leu Ala Met Thr Leu Cys Ala Glu Ser Ile Arg Ile Gln Ala Pro Ile Lys Gly Lys Asp Val Val Gly Ile _ Glu Ile Pro Asn Ser Gln Ser Gln Ile Ile Tyr Leu Arg Glu Ile Leu Glu Ser Glu Leu Phe Gln Lys Ser Ser Ser Pro Leu Thr Leu Ala Leu Gly Lys Asp Ile Val Gly Asn Pro Phe Ile Thr Asp Leu Lys Lys Leu Pro His Leu Leu Ile Ala Gly Thr Thr Gly Ser Gly Lys Ser Val Gly Val Asn Ala Met Ile Leu Ser Leu Leu Tyr Lys Asn Pro Pro Asp Gln Leu Lys Leu Val Met Ile Asp Pro Lys Met Val Glu Phe Ser Ile Tyr Ala Asp Ile Pro His Leu Leu Thr Pro Ile Ile Thr Asp Pro Lys Lys Ala Ile Gly Ala Leu Gln Ser Val Ala Lys Glu Met Glu Arg Arg Tyr Ser Leu Met Ser Glu Tyr Lys Val Lys Thr Ile Asp Ser Tyr Asn Glu Gln Ala Pro Ser Asn Gly Val Glu Ala Phe Pro Tyr Leu Ile Val Val Ile Asp Glu Leu Ala Asp Leu Met Met Thr Gly Gly Lys Glu Ala Glu Phe Pro Ile Ala Arg Ile Ala Gln Met Gly Arg Ala Ser Gly Leu His CTC ATT GTA GCG ACC CAA CGC CCA AGC GTG G:AT GTC GTA ACC GGC TTG 2114 Leu Ile Val Ala Thr Gln Arg Pro Ser Val Asp Va1 Val Thr Gly Leu ATT AAA ACC AAC TTG CCT TCA AGG GTG AGT T'TT AGG GTA GGC ACT AAG 2162 Ile Lys Thr Asn Leu Pro Ser Arg Val Ser Phe Arg Val Gly Thr Lys 700 705 7'10 715 Ile Asp Ser Lys Val Tle Leu Asp Thr Asp G.ly Ala Gln Ser Leu Leu Gly Arg Gly Asp Met Leu Phe Thr Pro Pro G:ly Ala Asn Gly Leu Val Arg Leu His Ala Pro Phe Ala Thr Glu Asp G.lu Ile Lys Lys Ile Val GAT TTT ATT AAA GCC CAA AAA GAA GTA CAA Ti~C GAT AAA GAT TTC TTG 2354 Asp Phe Ile Lys Ala Gln Lys Glu Val Gln T,,rr Asp Lys Asp Phe Leu Leu Glu Glu Ser Arg Met Pro Leu Asp Thr Pro Asn Tyr Gln Gly Asp GAC ATT TTA GAA AGG GCT AAA GCG GTG ATT T'.CA GAA AAA AAG ATC ACT 2450 Asp Ile Leu Glu Arg Ala Lys Ala Val Ile Le.u Glu Lys Lys Ile Thr Ser Thr Ser Phe Leu Gln Arg Gln Leu Lys Ile Gly Tyr Asn Gln Ala GCT ACC ATT ACT GAC GAA TTA 
GAA GCT CAA GGC TTT TTA TCC CCA AGA 2546 Ala Thr Ile Thr Asp Glu Leu Glu Ala Gln Gly Phe Leu Ser Pro Arg Asn Ala Lys Gly Asn Arg Glu Ile Leu Gln Asn Phe TGGATATTGG CAAACAT'1'AU TTTTGATTT 2627
(2) INFORMATION FOR SEQ ID NO:158:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 855 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:158:
Met Lys Ser Lys Lys Leu Tyr Leu Ala Leu Ile Ile Gly Val Leu Leu Ala Phe Leu Thr Leu Ser Ser Trp Leu Gly Asn Ser Gly Leu Val Gly Arg Phe Gly Val Trp Phe Ala Ala Leu Asn Lys Lys Tyr Phe Gly His Leu Ser Phe Ile Asn Leu Pro Tyr Leu Ala Trp Val Leu Phe Leu Leu Tyr Lys Thr Lys Asn Pro Phe Thr Glu Ile Val Leu Glu Lys Thr Leu Gly His Leu Leu Gly Ile Leu Ser Leu Leu Phe Leu Gln Ser Ser Leu Leu Asn Gln Gly Glu Ile Gly Asn Ser Ala Arg Leu Phe Leu Arg Pro 100 105 110 Phe Ile Gly Asp Phe Gly Leu Tyr Ala Leu Ile Thr Leu Met Val Val Ile Ser Tyr Leu Ile Leu Phe Lys Leu Pro Pro Lys Ser Val Phe Tyr 130 135 140 Pro Tyr Met Asn Lys Thr Gln Asn Leu Leu Lys Glu Ile Tyr Lys Gln Cys Leu Gln Ala Phe Ser Pro Asn Phe Ser Pro Lys Lys Glu Gly Phe Glu Asn Thr Pro Ser Asp Ile Gln Lys Lys Glu Thr Lys Asn Asp Lys Glu Lys Glu Asn Arg Lys Glu Asn Pro Ile Asn Glu Asn His Lys Thr Pro Asn Glu Glu Pro Phe Leu Ala Ile Pro Thr Pro Tyr Asn Thr Thr 210 215 220 Leu Asn Asp Ser Glu Pro Gln Glu Gly Leu Val Gln Ile Ser Ser His Pro Pro Thr His Tyr Thr Ile Tyr Pro Lys Arg Asn Arg Phe Asp Asp Leu Thr Asn Pro Thr Asn Pro Pro Leu Lys Glu Ile Lys Gln Glu Thr Lys Glu Arg Glu Pro Thr Pro Thr Lys Glu Thr Leu Thr Pro Thr Thr Pro Lys Pro Ile Met Pro Thr Leu Ala Pro Ile Ile Glu Asn Asp Asn Lys Thr Glu Asn Gln Lys Thr Pro Asn His Pro Lys Lys Glu Glu Asn Pro Gln Glu Asn Thr Gln Glu Glu Met Ile Glu Gly Arg Ile Glu Glu Met Ile Lys Glu Asn Leu Lys Lys Glu Glu Lys Glu Val Gln Asn Ala Pro Asn Phe Ser Pro Val Thr Pro Thr Ser Ala Lys Lys Pro Val Met 355 360 365 Val Lys Glu Leu Ser Glu Asn Lys Glu Ile Leu Asp Gly Leu Asp Tyr Gly Glu Val Gln Lys Pro Lys Asp Tyr Glu Leu Pro Thr Thr Gln Leu 385 390 395 400 Leu Asn Ala Val Cys Leu Lys Asp Thr Ser Leu Asp Glu Asn Glu Ile Asp Gln Lys Ile Gln Asp Leu Leu Ser Lys Leu Arg Thr Phe Lys Ile Asp Gly Asp Ile Ile Arg Thr Tyr Ser Gly Pro Ile Val Thr Thr Phe Glu Phe Arg Pro Ala Pro Asn Val Lys Val Ser Arg Ile Leu Gly Leu Ser Asp Asp Leu Ala Met Thr Leu Cys Ala Glu Ser Ile Arg Ile Gln 465 470
475 480 Ala Pro Ile Lys Gly Lys Asp Val Val Gly Ile Glu Ile Pro Asn Ser Gln Ser Gln Ile Ile Tyr Leu Arg Glu Ile Leu Glu Ser Glu Leu Phe Gln Lys Ser Ser Ser Pro Leu Thr Leu Ala Leu Gly Lys Asp Ile Val Gly Asn Pro Phe Ile Thr Asp Leu Lys Lys Leu Pro His Leu Leu Ile Ala Gly Thr Thr Gly Ser Gly Lys Ser Val Gly Val Asn Ala Met Ile Leu Ser Leu Leu Tyr Lys Asn Pro Pro Asp Gln Leu Lys Leu Val Met Ile Asp Pro Lys Met Val Glu Phe Ser Ile Tyr Ala Asp Ile Pro His Leu Leu Thr Pro Ile Ile Thr Asp Pro Lys Lys Ala Ile Gly Ala Leu Gln Ser Val Ala Lys Glu Met Glu Arg Arg Tyr Ser Leu Met Ser Glu 610 615 620 Tyr Lys Val Lys Thr Ile Asp Ser Tyr Asn Glu Gln Ala Pro Ser Asn 625 630 635 640 Gly Val Glu Ala Phe Pro Tyr Leu Ile Val Val Ile Asp Glu Leu Ala Asp Leu Met Met Thr Gly Gly Lys Glu Ala Glu Phe Pro Ile Ala Arg Ile Ala Gln Met Gly Arg Ala Ser Gly Leu His Leu Ile Val Ala Thr Gln Arg Pro Ser Val Asp Val Val Thr Gly Leu Ile Lys Thr Asn Leu Pro Ser Arg Val Ser Phe Arg Val Gly Thr Lys Ile Asp Ser Lys Val 705 710 715 720 Ile Leu Asp Thr Asp Gly Ala Gln Ser Leu Leu Gly Arg Gly Asp Met Leu Phe Thr Pro Pro Gly Ala Asn Gly Leu Val Arg Leu His Ala Pro Phe Ala Thr Glu Asp Glu Ile Lys Lys Ile Val Asp Phe Ile Lys Ala Gln Lys Glu Val Gln Tyr Asp Lys Asp Phe Leu Leu Glu Glu Ser Arg Met Pro Leu Asp Thr Pro Asn Tyr Gln Gly Asp Asp Ile Leu Glu Arg Ala Lys Ala Val Ile Leu Glu Lys Lys Ile Thr Ser Thr Ser Phe Leu Gln Arg Gln Leu Lys Ile Gly Tyr Asn Gln Ala Ala Thr Ile Thr Asp Glu Leu Glu Ala Gln Gly Phe Leu Ser Pro Arg Asn Ala Lys Gly Asn Arg Glu Ile Leu Gln Asn Phe
(2) INFORMATION FOR SEQ ID NO:159:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1956 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 56...1945 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:159:
Met Gln Asp Ser Leu His Phe Lys Val Asn Glu Val Gln Gly Val Leu Glu Asn Thr Tyr Thr Ser Met Gly Ile Val Lys Glu Met Leu Pro Lys Asp Thr Lys Arg Glu Ile Lys Ile Gly Leu Leu Lys Asn Phe Ile Leu Ala Asn Ser His Val Ala Gly Val Ser Met Phe Phe Lys Gly Arg Glu Asp Leu Arg Leu Thr Leu Leu Arg Asp Asn Asn Thr Ile Lys Leu Val Glu AAT CCG TCA T'1'A GAU hAT AGC CCT TTA GCG CA;4 AAA GCG ATG AAA AAT 346 Asn Pro Ser Leu Glu Asn Ser Pro Leu Ala Gl:n Lys Ala Met Lys Asn AAA GAA ATT TCT AAA AGT TTG GGT TAT TAT AGc3 AAA ATG CCT AAT GGG 394 Lys Glu Ile Ser Lys Ser Leu Gly Tyr Tyr Arg Lys Met Pro Asn Gly GCG GAA GTT TAT GGG GTG GAT ATT CTT TTA CC'T TTA TTG AAT GAG AAC 442 Ala Glu Val Tyr Gly Val Asp Ile Leu Leu Pro Leu Leu Asn Glu Asn GCT CAA GAG GTT GTA GGG GCT TTG ATG ATT TT'r ATT TCC ATT GAC AGC 490 Ala Gln Glu Val Val Gly Ala Leu Met Ile Phe Ile Ser Ile Asp Ser TTC AGC AAT GAA ATC ACT AAA AAC AGG AGC GA'r TTA TTT TTA ATT GGC 538 Phe Ser Asn Glu Ile Thr Lys Asn Arg Ser Asp Leu Phe Leu Ile Gly Thr Lys Gly Lys Val Leu Leu Ser Ala Asn Lys Ser Leu Gln Asp Lys CCT ATC GCA GAA ATT TAT AAG AGC GTG CCT AAi~ GCC ACC AAC GAA GTG 634 Pro Ile Ala Glu Ile Tyr Lys Ser Val Pro Ly:~ Ala Thr Asn Glu Val ATG GCT ATT TTA GAA AAC GGC TCT AAA GCG AC'C TTA GAA TAC TTA GAT 682 Met Ala Ile Leu Glu Asn Gly Ser Lys Ala Th=r Leu Glu Tyr Leu Asp CCC TTT AGC CAT AAG GAA AAT TTT TTA GCC GT'C GAA ACC TTT AAA ATG 730 Pro Phe Ser His Lys Glu Asn Phe Leu Ala Va.l Glu Thr Phe Lys Met CTA GGC AAA ACA GAA AGT AAA GAC AAT CTT AA'C TGG ATG ATC GCT TTA 778 Leu Gly Lys Thr Glu Ser Lys Asp Asn Leu Asn Trp Met Ile Ala Leu Ile Ile Glu Lys Asp Lys Val Tyr Glu Gln Va.L Gly Ser Val Arg Phe Val Val Ile Ile Ala Ser Ala Ile Met Val Leu Ala Leu Ile Ile Ala ATC ACT CTC TTA ATG CGA GCG ATC GTG AGC AG'C CGT TTG GAA GCC GTT 922 Ile Thr Leu Leu Met Arg Ala Tle Val Ser Ser Arg Leu Glu Ala Val Ser Ser Thr Leu Ser His Phe Phe Lys Leu Leu Asn Asn Gln Ala Asn 290 29S 30c) 305 Phe Ala Thr Glu Asp Glu Ile Lys Lys Ile V
TCT AGC GGT A'1"1' AAt~ 'i iG i-PTT GAA GCG AAA TCC AAT GAC GAG TTA GGC 1018 Ser Ser Gly Ile Lys Leu Ile Glu Ala Lys Ser Asn Asp Glu Leu Gly 310 3l5 320 Arg Met Gln Thr Ala Ile Asn Lys Asn Ile Leu Gln Thr Gln Lys Ile ATG CAA GAA GAC AGG CAA GCC GTC CAA GAC ACC ATT AAA GTG GTT TCA 11l4 Met Gln Glu Asp Arg Gln Ala Val Gln Asp Thr Ile Lys Val Val Ser GAT GTG AAA GCA GGG AAT TTT GCG GTG CGC ATC ACG GCT GAG CCC GCA l162 Asp Val Lys Ala Gly Asn Phe Ala Val Arg Ile Thr Ala Glu Pro Ala Ser Pro Asp Leu Lys Glu Leu Arg Asp Ala Leu Asn Gly Ile Met Asp Tyr Leu Gln Glu Ser Val Gly Thr His Met Pro Ser Ile Phe Lys Ile Phe Glu Ser Tyr Ser Gly Leu Asp Phe Arg Gly Arg Ile Gln Asn Ala Ser Gly Arg Val Glu Leu Val Thr Asn Ala Leu Gly Gln Glu Ile Gln Lys Met Leu Glu Thr Ser Ser Asn Phe Ala Lys Asp Leu Ala Asn Asp Ser Ala Asn Leu Lys Glu Cys Val Gln Asn Leu Glu Lys Ala Ser Asn Ser Gln His Lys Ser Leu Met Glu Thr Ser Lys Thr Ile Glu Asn Ile Thr Thr Ser Ile Gln Gly Val Ser Ser Gln Ser Glu Ala Met I1e Glu CAA GGG CAA GAC ATT AAA AGC ATT GTA GAA ATC ATT AGA GAT ATT GCT l594 Gln Gly Gln Asp Ile Lys Ser Ile Val Glu Ile Ile Arg Asp Ile Ala Asp Gln Thr Asn Leu Leu Ala Leu Asn Ala Ala Ile Glu Ala Ala Arg 5l5 520 525 GCC GGC GAG CAT GUC: huia GGC TTT GCG GTG G'TG GCT GAT GAG GTA AGA 1690 Ala Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg Lys Leu Ala Glu Arg Thr Gln Lys Ser Leu Ser Glu Ile Glu Ala Asn Ile Asn Tle Leu Val Gln Ser Ile Ser Asp T.hr Ser Glu Ser Ile Lys Asn Gln Val Lys G1u Val Glu Glu Ile Asn Ala Ser Ile Glu Ala Leu Arg Ser Val Thr Glu Gly Asn Leu Lys Ile Ala Ser Asp Ser Leu Glu ATC AGT CAA GAA ATT GAC AAA GTT TCT AAC G.~1T ATT TTA GAA GAT GTG 1930 Ile Ser Gln Glu Ile Asp Lys Val Ser Asn Asp Ile Leu Glu Asp Val 610 6l5 6:20 _ 625 Asn Lys Lys Gln Phe (2) INFORMATION FOR SEQ ID N0:160:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 630 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:160:
Met Gln Asp Ser Leu His Phe Lys Val Asn Glu Val Gln Gly Val Leu Glu Asn Thr Tyr Thr Ser Met Gly Ile Val Lys Glu Met Leu Pro Lys Asp Thr Lys Arg Glu Ile Lys Ile Gly Leu Leu Lys Asn Phe Ile Leu Ala Asn Ser His Val Ala Gly Val Ser Met Phe Phe Lys Gly Arg Glu Asp Leu Arg Leu Thr Leu Leu Arg Asp Asn Asn Thr Ile Lys Leu Val Glu Asn Pro Ser Leu Glu Asn Ser Pro Leu Ala Gln Lys Ala Met Lys Asn Lys Glu Ile Ser Lys Ser Leu Gly Tyr Tyr Arg Lys Met Pro Asn Gly Ala Glu Val Tyr Gly Val Asp Ile Leu Leu Pro Leu Leu Asn Glu 115 120 125 Asn Ala Gln Glu Val Val Gly Ala Leu Met Ile Phe Ile Ser Ile Asp Ser Phe Ser Asn Glu Ile Thr Lys Asn Arg Ser Asp Leu Phe Leu Ile Gly Thr Lys Gly Lys Val Leu Leu Ser Ala Asn Lys Ser Leu Gln Asp Lys Pro Ile Ala Glu Ile Tyr Lys Ser Val Pro Lys Ala Thr Asn Glu Val Met Ala Ile Leu Glu Asn Gly Ser Lys Ala Thr Leu Glu Tyr Leu 195 200 205 Asp Pro Phe Ser His Lys Glu Asn Phe Leu Ala Val Glu Thr Phe Lys Met Leu Gly Lys Thr Glu Ser Lys Asp Asn Leu Asn Trp Met Ile Ala Leu Ile Ile Glu Lys Asp Lys Val Tyr Glu Gln Val Gly Ser Val Arg Phe Val Val Ile Ile Ala Ser Ala Ile Met Val Leu Ala Leu Ile Ile Ala Ile Thr Leu Leu Met Arg Ala Ile Val Ser Ser Arg Leu Glu Ala Val Ser Ser Thr Leu Ser His Phe Phe Lys Leu Leu Asn Asn Gln Ala Asn Ser Ser Gly Ile Lys Leu Ile Glu Ala Lys Ser Asn Asp Glu Leu Gly Arg Met Gln Thr Ala Ile Asn Lys Asn Ile Leu Gln Thr Gln Lys Ile Met Gln Glu Asp Arg Gln Ala Val Gln Asp Thr Ile Lys Val Val Ser Asp Val Lys Ala Gly Asn Phe Ala Val Arg Ile Thr Ala Glu Pro Ala Ser Pro Asp Leu Lys Glu Leu Arg Asp Ala Leu Asn Gly Ile Met Asp Tyr Leu Gln Glu Ser Val Gly Thr His Met Pro Ser Ile Phe Lys Ile Phe Glu Ser Tyr Ser Gly Leu Asp Phe Arg Gly Arg Ile Gln Asn Ala Ser Gly Arg Val Glu Leu Val Thr Asn Ala Leu Gly Gln Glu Ile Gln Lys Met Leu Glu Thr Ser Ser Asn Phe Ala Lys Asp Leu Ala Asn Asp Ser Ala Asn Leu Lys Glu Cys Val Gln Asn Leu Glu Lys Ala Ser Asn Ser Gln His Lys Ser Leu Met Glu Thr Ser Lys Thr Ile Glu Asn Ile Thr Thr Ser Ile Gln
Gly Val Ser Ser Gln Ser Glu Ala Met Ile Glu Gln Gly Gln Asp Ile Lys Ser Ile Val Glu Ile Ile Arg Asp Ile Ala Asp Gln Thr Asn Leu Leu Ala Leu Asn Ala Ala Ile Glu Ala Ala 515 520 525 Arg Ala Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg Lys Leu Ala Glu Arg Thr Gln Lys Ser Leu Ser Glu Ile Glu Ala Asn Ile Asn Ile Leu Val Gln Ser Ile Ser Asp Thr Ser Glu Ser Ile Lys Asn Gln Val Lys Glu Val Glu Glu Ile Asn Ala Ser Ile Glu Ala Leu Arg Ser Val Thr Glu Gly Asn Leu Lys Ile Ala Ser Asp Ser Leu Glu Ile Ser Gln Glu Ile Asp Lys Val Ser Asn Asp Ile Leu Glu Asp Val Asn Lys Lys Gln Phe
(2) INFORMATION FOR SEQ ID NO:161:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1758 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 8...1702 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:161:
GAGATAA ATG ATG TTT TCT TCA ATG TTT GCT T'CG TTG GGG ACT CGT ATC 49 Met Met Phe Ser Ser Met Phe Ala Ser Leu Gly Thr Arg Ile ATG CTG GTC GTG TTA GCC GCT CTT TTA GGT T'TA GGG GGG CTT TTT ATT 97 Met Leu Val Val Leu Ala Ala Leu Leu Gly Leu Gly Gly Leu Phe Ile GGT TTT GTA AAG GTT ATG CAA AAA GAT GTG T'TA GCG CAA CTC ATG GAG 145 Gly Phe Val Lys Val Met Gln Lys Asp Val Leu Ala Gln Leu Met Glu His Leu Glu Thr Gly Gln Tyr Lys Lys Arg Glu Lys Thr Leu Ala Tyr Met Thr Lys Ile Ile Glu Gln Gly Ile His Glu Tyr Tyr Lys Asn Phe Asp Asn Ala Thr Ala Arg Lys Met Ala Leu A.sp Tyr Phe Lys Arg Ile AAC GAC GAT AAG GGC A'I'G ATT TAT ATG GTG GTG GTG GAT AAA AAC GGG 337 Asn Asp Asp Lys Gly Met Ile Tyr Met Val Val Val Asp Lys Asn Gly 95 100 105 l10 Val Val Leu Phe Asp Pro Val Asn Pro Lys Thr Val Xaa Gln Ser Gly 115 l20 125 Leu Asp Ala Gln Ser Val Asp Gly Val Tyr Tyr Val Arg Gly Tyr Leu GAG GCG GCC AAA AAA GGG GGA GGC TAC ACT TAT TAT AAA ATG CCT AAA 48l Glu Ala Ala Lys Lys Gly Gly Gly Tyr Thr Tyr Tyr Lys Met Pro Lys Tyr Asp Gly Gly Val Pro Glu Lys Lys Phe Ala Tyr Ser His Tyr Asp Glu Val Ser Gln Met Val Ile Ala Thr Thr Ser Tyr Tyr Thr Asp Ile Asn Thr Glu Asn Lys Ala Ile Lys Glu Gly Val Asn Lys Val Phe Asp Glu Asn Thr Thr Lys Leu Phe Leu Trp Ile Leu Thr Ala Thr Ile Ala Leu Val Val Leu Thr Leu Ile Tyr Ala Lys Leu Arg Ile Val Lys Arg Ile Asp Glu Leu Val Leu Lys Ile Asn Ala Phe Ser Arg Gly Asp Lys Asp Leu Arg Ala Lys Ile Asp Val Gly Asp Arg Asn Asp Glu Ile Ser Gln Val Gly Arg Gly Ile Asn Leu Phe Val Glu Asn Ala Arg Leu Ile Met Glu Glu Ile Lys Gly Ile Ser Thr Leu Asn Lys Thr Ser Met Asp Lys Leu Val Gln Ile Thr Gln Glu Thr Gln Lys Ser Met Lys Asp Ser TCA ACC ACC C1'A AA'1' 'lw~ GTG AAA AAT AAA GCC ACT GAT ATA GCG AGC 1009 Ser Thr Thr Leu Asn Ser Val Lys Asn Lys Ala Thr Asp Ile Ala Ser Met Met Asn Ala Ser Ile Glu Gln Ser Gln Gly Leu Arg Lys Arg Leu Ile Glu Thr Gln Gly Leu Val Lys Glu Ser Lys Asp Ala Ile Gly Asp Leu Phe Ser Gln Ile Thr Glu Ser Ala His Thr Glu Glu Glu Leu Ser AGC AAA GTG GAG CAG CTA AGC 
CGT AAC GCT GAT GAT GTC AAA TCC ATT 120l Ser Lys Val Glu Gln Leu Ser Arg Asn Ala Asp Asp Val Lys Ser Ile Leu Asp Ile Ile Asn Asp Ile Ala Asp Gln Thr Asn Leu Leu Ala Leu Asn Ala Ala Ile Glu Ala Ala Arg Ala Gly G.Lu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg Asn Leu Ala Gly Arg Thr Gln Lys TCT TTA GCC GAA ATC AAT TCC ACT ATC ATG G'CG ATT GTC CAA GAA ATC 1393 Ser Leu Ala Glu Ile Asn Ser Thr Ile Met Val Ile Val Gln Glu Ile Asn Ala Val Ser Ser Gln Met Asn Leu Asn Ser Gln Lys Met Glu Arg Leu Ser Asp Met Ser Lys Ser Val Gln Glu Tlzr Tyr Glu Lys Met Ser TCT AAT TTA AGC TCA GTC GTG TCA GAC AGC Ai~T CAA AGC ATG GAC GAT 1537 Ser Asn Leu Ser Ser Val Val Ser Asp Ser A:an Gln Ser Met Asp Asp TAC GCC AAA TCC GGA CAC CAA ATT GAA GTT A'rG GTA AGC GAT TTT GCA 1585 Tyr Ala Lys Ser Gly His Gln Ile Glu Val M~a Val Ser Asp Phe Ala GAG GTG GAA AAA GTG GCT TCT AAG ACT TTA GCG GAT TCT TCA GAT ATT l633 Glu Val Glu Lys Val Ala Ser Lys Thr Leu A.la Asp Ser Ser Asp Ile TTA AAC ATC GC;T AC:U ~Wi GTG AGT GGA ACG ACC ATG AAT TTA GAC AAA 1681 Leu Asn Ile Ala Thr His Val Ser Gly Thr Thr Met Asn Leu Asp Lys Gln Val Asn Leu Phe Lys Thr (2) INFORMATION FOR SEQ ID N0:162:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 565 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:162:
Met Met Phe Ser Ser Met Phe Ala Ser Leu Gly Thr Arg Ile Met Leu Val Val Leu Ala Ala Leu Leu Gly Leu Gly Gly Leu Phe Ile Gly Phe Val Lys Val Met Gln Lys Asp Val Leu Ala Gln Leu Met Glu His Leu Glu Thr Gly Gln Tyr Lys Lys Arg Glu Lys Thr Leu Ala Tyr Met Thr Lys Ile Ile Glu Gln Gly Ile His Glu Tyr Tyr Lys Asn Phe Asp Asn Ala Thr Ala Arg Lys Met Ala Leu Asp Tyr Phe Lys Arg Ile Asn Asp Asp Lys Gly Met Ile Tyr Met Val Val Val Asp Lys Asn Gly Val Val Leu Phe Asp Pro Val Asn Pro Lys Thr Val Xaa Gln Ser Gly Leu Asp 115 120 125 Ala Gln Ser Val Asp Gly Val Tyr Tyr Val Arg Gly Tyr Leu Glu Ala Ala Lys Lys Gly Gly Gly Tyr Thr Tyr Tyr Lys Met Pro Lys Tyr Asp Gly Gly Val Pro Glu Lys Lys Phe Ala Tyr Ser His Tyr Asp Glu Val Ser Gln Met Val Ile Ala Thr Thr Ser Tyr Tyr Thr Asp Ile Asn Thr Glu Asn Lys Ala Ile Lys Glu Gly Val Asn Lys Val Phe Asp Glu Asn Thr Thr Lys Leu Phe Leu Trp Ile Leu Thr Ala Thr Ile Ala Leu Val 210 215 220 Val Leu Thr Leu Ile Tyr Ala Lys Leu Arg Ile Val Lys Arg Ile Asp Glu Leu Val Leu Lys Ile Asn Ala Phe Ser Arg Gly Asp Lys Asp Leu Arg Ala Lys Ile Asp Val Gly Asp Arg Asn Asp Glu Ile Ser Gln Val Gly Arg Gly Ile Asn Leu Phe Val Glu Asn Ala Arg Leu Ile Met Glu Glu Ile Lys Gly Ile Ser Thr Leu Asn Lys Thr Ser Met Asp Lys Leu Val Gln Ile Thr Gln Glu Thr Gln Lys Ser Met Lys Asp Ser Ser Thr Thr Leu Asn Ser Val Lys Asn Lys Ala Thr Asp Ile Ala Ser Met Met Asn Ala Ser Ile Glu Gln Ser Gln Gly Leu Arg Lys Arg Leu Ile Glu Thr Gln Gly Leu Val Lys Glu Ser Lys Asp Ala Ile Gly Asp Leu Phe Ser Gln Ile Thr Glu Ser Ala His Thr Glu Glu Glu Leu Ser Ser Lys Val Glu Gln Leu Ser Arg Asn Ala Asp Asp Val Lys Ser Ile Leu Asp Ile Ile Asn Asp Ile Ala Asp Gln Thr Asn Leu Leu Ala Leu Asn Ala Ala Ile Glu Ala Ala Arg Ala Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg Asn Leu Ala Gly Arg Thr Gln Lys Ser Leu Ala Glu Ile Asn Ser Thr Ile Met Val Ile Val Gln Glu Ile Asn Ala Val Ser Ser Gln Met Asn Leu Asn Ser Gln Lys Met Glu Arg Leu Ser Asp Met Ser Lys Ser Val Gln Glu Thr Tyr Glu Lys Met Ser
Ser Asn Leu Ser Ser Val Val Ser Asp Ser Asn Gln Ser Met Asp Asp Tyr Ala Lys Ser Gly His Gln Ile Glu Val Met Val Ser Asp Phe Ala Glu Val Glu Lys Val Ala Ser Lys Thr Leu Ala Asp Ser Ser Asp Ile Leu Asn Ile Ala Thr His Val Ser Gly Thr Thr Met Asn Leu Asp Lys Gln Val Asn Leu Phe Lys Thr
(2) INFORMATION FOR SEQ ID NO:163:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 686 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 16...660 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:163:
Met Lys Lys Pro Tyr Arg Lys Ile Ser Asp Tyr Ala Ile Val Gly Gly Leu Ser Ala Leu Val Met Val Ser Ile Val Gly Cys Lys Ser Asn Ala Asp Asp Lys Pro Lys Glu Gln Ser Ser Leu Ser Gln Ser Val Gln Lys Gly Ala Phe Val Ile Leu Glu Glu Gln Lys Asp Lys Ser Tyr Lys Val Val Glu Glu Tyr Pro Ser Ser Arg Thr His Ile Ile Val Arg Asp Leu Gln Gly Asn Glu Arg Val Leu Ser Asn Glu Glu Ile Gln Lys Leu Ile Lys Giu Glu Glu Ala Lys Ile Asp Asn Gly Thr Ser Lys Leu Val Gln Pro Asn Asn Gly Gly Ser Asn Glu Gly Ser Gly Phe Gly Leu Gly Ser Ala Ile Leu Gly Ser Ala Ala Gly Ala Ile Leu Gly Ser Tyr Ile Gly Asn Lys Leu Phe Asn Asn Pro Asn Tyr Gln Gln Asn Ala Gln Arg Thr Tyr Lys Ser Pro Gln Ala Tyr Gln Arg Ser Gln Asn Ser Phe Ser Lys Ser Ala Pro Ser Ala Ser Ser Met Gly Gly Ala Ser Lys Gly Gln Ser Gly Phe Phe Gly Ser Ser Arg Pro Thr Ser Ser Pro GCG GTA AGC 1'L'1' (~c;t~ t.~;H AGG GGC TTT AAC TC'.A TAATTTAATT GATTCAAGGC 6 Ala Val Ser Ser Gly Thr Arg Gly Phe Asn Se:r 205 210 21.5 (2) INFORMATION FOR SEQ ID N0:164:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 215 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:164:
Met Lys Lys Pro Tyr Arg Lys Ile Ser Asp Tyr Ala Ile Val Gly Gly Leu Ser Ala Leu Val Met Val Ser Ile Val Gly Cys Lys Ser Asn Ala Asp Asp Lys Pro Lys Glu Gln Ser Ser Leu Ser Gln Ser Val Gln Lys Gly Ala Phe Val Ile Leu Glu Glu Gln Lys Asp Lys Ser Tyr Lys Val Val Glu Glu Tyr Pro Ser Ser Arg Thr His Ile Ile Val Arg Asp Leu Gln Gly Asn Glu Arg Val Leu Ser Asn Glu Glu Ile Gln Lys Leu Ile Lys Glu Glu Glu Ala Lys Ile Asp Asn Gly Thr Ser Lys Leu Val Gln Pro Asn Asn Gly Gly Ser Asn Glu Gly Ser Gly Phe Gly Leu Gly Ser 115 120 125 Ala Ile Leu Gly Ser Ala Ala Gly Ala Ile Leu Gly Ser Tyr Ile Gly 130 135 140 Asn Lys Leu Phe Asn Asn Pro Asn Tyr Gln Gln Asn Ala Gln Arg Thr 145 150 155 160 Tyr Lys Ser Pro Gln Ala Tyr Gln Arg Ser Gln Asn Ser Phe Ser Lys Ser Ala Pro Ser Ala Ser Ser Met Gly Gly Ala Ser Lys Gly Gln Ser Gly Phe Phe Gly Ser Ser Arg Pro Thr Ser Ser Pro Ala Val Ser Ser Gly Thr Arg Gly Phe Asn Ser
(2) INFORMATION FOR SEQ ID NO:165:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8748 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 16...8694
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:165:
Met Lys Lys Phe Lys Lys Lys Pro Lys Ser Ile Lys Arg Ser His Gln Asn Gln Lys Thr Ile Leu Lys Arg Pro Leu Trp Leu Met Pro Leu Leu Ile Ser Gly Phe Ala Ser Gly Val Tyr Ala Asn Asn Leu Trp Asp Leu Leu Asn Pro Lys Val Gly Gly Glu Tyr Val His Trp Val- Lys Gly Ser Gln Tyr Cys Ala Trp Trp Glu Phe Ala Gly Cys Leu ..
Lys Asn Val Trp Gly Ala Asn His Lys Gly Tyr Asp Ala Gly Asn Ala Ala Asn Tyr Leu Ser Ser Gln Asn Tyr Gln Ala Ile Ser Val Gly Ser Gly Asn Glu Thr Gly Thr Tyr Ser Leu Ser Gly Phe Thr Asn Tyr Val 1l0 115 120 Gly Gly Asn Leu Thr Ile Asn Leu Gly Asn Ser Val Val Leu Asp Leu Ser Gly Ser Asn Ser Phe Thr Ser Tyr Gln Gly Tyr Asn Gln Gly Lys Asp Asp Val Thr Phe Thr Val Gly Ala Ile Asn Leu Asn Gly Thr Leu l60 l65 170 GAA GTG GGT AAT CG'1' c~'lv GGA TCG GGA GCT GGC ACG CAC ACC GGC ACA 579 Glu Val Gly Asn Arg Val Gly Ser Gly Ala Gly Thr His Thr Gly Thr Ala Thr Leu Asn Leu Asn Ala Asn Lys Val Asn Ile Asn Ser Asn Ile Asn Ala Tyr Lys Thr Ser Gln Val Asn Ile Gly Asn Ala Asn Ser Val ATT ACC ATT GGT TCG GTT TCT TTG AGT GGG GA'T GTT TGC AGT TCT TTA 723 Ile Thr Ile Gly Ser Val Ser Leu Ser Gly As:p Val Cys Ser Ser Leu GCT AGC GTT GGG ATA GGG GCT AAT TGC TCC AC'T TCT GGG CCT AGC TAT 771 Ala Ser Val Gly Ile Gly Ala Asn Cys Ser Thr Ser Gly Pro Ser Tyr Ser Phe Lys Gly Thr Thr Asn Ala Thr Asn Th:r Ala Phe Ser Asn Ala AGC GGC AGT TTC ACT TTT GAA GAG AAC GCC AC'r TTT AGC GGG GCG AAA 867 Ser Gly Ser Phe Thr Phe Glu Glu Asn Ala Th:r Phe Ser Gly Ala Lys Trp Asn Gly Gly Thr Tyr Thr Phe Asn Lys Glu Phe Ser Ala Thr Asn 285 290 29!i 300 Asn Thr Ala Phe Ser Ser Gly Ser Phe Asn Ph<' Lys Gly Val Ser Ser TTT AAT GGT ACT TCG TTT AGT AAC GCT TCT TA'C ACT TTT GAC AAT CAA 101l Phe Asn Gly Thr Ser Phe Ser Asn Ala Ser Tyr Thr Phe Asp Asn Gln Ala Thr Phe Gln Asn Ser Ser Phe Asn Gly Gly Thr Phe Thr Phe Asn AAC CAA ACT AAT CCA ACT AAC AAC GCT CAG CAC. 
CCC CAA ATT CAA AAC 1107 Asn Gln Thr Asn Pro Thr Asn Asn Ala Gln His Pro Gln Ile Gln Asn Ser Ser Phe Ser Gly Asn Ala Thr Thr Leu Lys Gly Phe Val Asn Phe Gln Gln Ala Phe Asn Asn Ser Asn His Gln Leu Thr Ile Gln Asn Ala TCC TTT AAT AAC: GC:L ~.CU TTT AAC AAT ACC GGT AAA ATC ACT ATA GAA 1251 Ser Phe Asn Asn Ala Thr Phe Asn Asn Thr Gly Lys Ile Thr Ile Glu Lys Asp Ala Ser Phe Asn Asn Thr Thr Phe Asn Thr Ser Val Asp Thr Asn Asn Met Ser Val Thr Gly Gly Val Thr Leu Ser Gly Lys Asn Asp Leu Lys Asn Gly Ser Thr Leu Asp Phe Gly Ser Ser Lys Ile Thr Leu Ala Gln Gly Thr Thr Phe Asn Leu Thr Ser Leu Gly Ser Glu Lys Ser Val Thr Ile Leu Asn Ser Ser Gly Gly Ile Thr Tyr Ser Asn Leu Leu Asn His Ala Ile Asn Gly Leu Thr Ser Ala Leu Lys Thr Asn Glu Ser Leu Ser Asn Pro Gln Ser Phe Ala Gln Gly Leu Trp Asp Ile Ile Thr Tyr Asn Gly Val Thr Gly Gln Leu Leu Asn Glu Asn Ala Ala Thr Ser Lys Pro Thr Asp Ser Ser Pro Ser Lys Ser Ser Thr Asn Ser Thr Gln Val Tyr Gln Val Gly Tyr Lys Ile Gly Asp Thr Ile Tyr Lys Leu Gln Glu Thr Phe Ser His Asn Ser Ile Ile Ile Gln Ala Leu Glu Ser Gly Thr Tyr Thr Pro Pro Pro Val Ile Asn Gly Ser Lys Phe Asp Leu Ser Ala Ser Asn Tyr Ile Asn Ala Asp Met Pro Trp Tyr Asp His Lys Tyr WO 98/21225 PCTlUS97/21353 -TAC ATC CCT AAA TCC LI~ AAT TTT ACA GAG AGC GGG ACT TAT TAC TTG 1923 Tyr Ile Pro Lys Ser Gln Asn Phe Thr Glu Se r Gly Thr Tyr Tyr Leu Pro Ser Val Gln Ile Trp Gly Ser Tyr Thr Assn Ser Phe Lys Gln Thr TTT AGC GCA AAT GGT AGT AAT CTG GTG ATT G(iG TAT AAC TCA ACA TGG 2019 Phe Ser Ala Asn Gly Ser Asn Leu Val Ile G.ly Tyr Asn Ser Thr Trp ACT GAT CAT AAT GTC TCT TCT AGC GGC ACG G'rG TCT TTT GGG GAC ACT 2067 Thr Asp His Asn Val Ser Ser Ser Gly Thr V<~1 Ser Phe Gly Asp Thr Ser Gly Ser Ala Leu Asn Gly His Cys Gly Pro Trp Pro Tyr Tyr Gln Cys Thr Gly Thr Thr Asn Gly Thr Tyr Ser A:La Tyr His Val Tyr Ile Thr Ala Asn Leu Arg Ser Gly Asn Arg Ile Gly Thr Gly Gly Ala Ala Asn Leu Ile Phe Asn Gly Val Asp Ser Ile Asn Ile Ala Asn Ala Thr Ile Thr Gln His Asn Ala Gly Ile Tyr Ser Se.r Ser Met Thr Phe 
Ser Thr Gln Ser Met Asp Asn Ser Gln Asn Leu Asn Gly Leu Asn Ser Asn 765 770 7'75 780 Gly Lys Leu Ser Val Tyr Gly Thr Thr Phe Thr Asn Glu Ala Lys Asp Gly Lys Phe Ile Phe Asn Ala Gly Gln Ala Val Phe Glu Asn Thr Asn Phe Asn Gly Gly Ser Tyr Gln Phe Ser Gly Asp Ser Leu Asn Phe Ser Asn Asn Asn Gln Phe Asn Ser Gly Ser Phe G7_u Ile Ser Ala Lys Asn WO 98/21225 PCT/US97/21353 w GCT TCG TTC AA'C' AAC VCt AAC TTT AAC AAC AGC GCT TCT TTT AAT TTC 2595 Ala Ser Phe Asn Asn Ala Asn Phe Asn Asn Ser Ala Ser Phe Asn Phe Asn Asn Ser Asn Ala Thr Thr Ser Phe Val Gly Asp Phe Thr Asn Ala Asn Ser Asn Leu Gln Ile Ala Gly Asn Ala Val Phe Gly Asn Ser Thr Asn Gly Ser Gln Asn Thr Ala Asn Phe Asn Asn Thr Gly Ser Val Asn Ile Ser Gly Asn Ala Thr Phe Asp Asn Val Val Phe Asn Gly Pro Thr Asn Thr Ser Val Lys Gly Gln Val Thr Leu Asn Asn Ile Thr Leu Lys Asn Leu Asn Ala Pro Leu Ser Phe Gly Asp Gly Thr Ile Thr Phe Asn Ala His Ser Val Ile Asn Ile Ala Glu Ser Ile Thr Asn Gly Asn Pro Ile Thr Leu Val Ser Ser Ser Lys Glu Ile Glu Tyr Asn Asn Ala Phe Ser Lys Asn Leu Trp Gln Leu Ile Asn Tyr Gln Gly His Gly Ala Ser Ser Glu Lys Leu Val Ser Ser Ala Gly Asn Gly Val Tyr Asp Val Val Tyr Ser Phe Asn Asn Gln Thr Tyr Asn Phe Gln Glu Val Phe Ser Gln Asn Ser Ile Ser Ile Arg Arg Leu Gly Val Asn Met Val Phe Asp Tyr Val Asp Met Glu Lys Ser Asp His Leu Tyr Tyr Gln Asn Ala Leu Gly 1055 1060 l065 TTT ATG ACC TAI: Al ii l.tyT iii-~T AGC TAT AAC AFvT AAT TTA GGG AAT GCA 3 2 6 Phe Met Thr Tyr Met Pro Asn Ser Tyr Asn Assn Asn Leu Gly Asn Ala Asn Asn Thr Ile Tyr Tyr Tyr Asp Lys Ser Ii.e Asp Phe Tyr Ala Ser GGG AAA ACT CTA TTC ACT AAA GCG GAA TTT TC',T CAA ACA TTC ACC GGG 3363 Gly Lys Thr Leu Phe Thr Lys Ala Glu Phe Ser Gln Thr Phe Thr Gly Gln Asn Ser Ala Ile Val Phe Gly Ala Lys Se:r Ile Trp Thr Ser Leu Ser Asp Ala Pro Gln Ser Asn Thr Ile Ile Arg Phe Gly Asp Asn Lys Gly Ala Gly Ser Asn Asp Ala Ser Gly His Cys Trp Asn Leu Gln Cys 1150 1155 1l60 ATA GGC TTT ATT ACA GGG CAT TAT GAA GCG CP.A AAG ATT TAC ATC ACC 3555 Ile Gly Phe Ile Thr Gly 
His Tyr Glu Ala Gln Lys Ile Tyr Ile Thr l165 1170 1175 1180 Gly Ser Ile Glu Ser Gly Asn Arg Ile Ser Ser Gly Gly Gly Ala Ser Leu Asn Phe Asn Gly Leu Gln Gly Ile Leu Leu Thr Asn Ala Thr Leu 1200 l205 1210 TAT AAC CGC GCC GCT GGC ACG CAA AGC TCG TC'T ATG AAT TTT ATC TCT 3699 Tyr Asn Arg Ala Ala Gly Thr Gln Ser Ser Ser Met Asn Phe Ile Ser AAC AGC GCG AAC ATT CAG GCT CAA AAC TCC TA.T TTT ATA GAC GAT ACC 3747 Asn Ser Ala Asn Ile Gln Ala Gln Asn Ser Tyr Phe Ile Asp Asp Thr GCA CAA AAT GGC GGT AAC CCT AAT TTC AGT TT'C AAC GCT TTG AAT CTG 3795 Ala Gln Asn Gly Gly Asn Pro Asn Phe Ser Phe Asn Ala Leu Asn Leu GAT TTT TCT AAC AGC TCT TTT AGA GGC TAT GT'G GGG AAA ACG CAA TCT 3843 Asp Phe Ser Asn Ser Ser Phe Arg Gly Tyr Val Gly Lys Thr Gln Ser Val Phe Lys Phe Asn Ala Lys Asn Ala Ile Ser Phe Thr Asn Ser Thr AAT TTA AGC TC'1' (~U'1 T'ics ThI CAA ATG CAA GCT AAA AGC GTG TTG TTT 3939 Asn Leu Ser Ser Gly Leu Tyr Gln Met Gln Ala Lys Ser Val Leu Phe Asp Asn Ser Asn Leu Ser Val Ser Val Gly Thr Ser Ser Ile Lys Ala l310 1315 1320 Asn Ala Ile Asn Leu Ser Gln Asn Ala Ser Ile Asn Ala Ser Asn His l325 1330 1335 1340 TCA ACC TTA GAA CTT CAA GGC GAT TTG AAT GTG AAC GAC ACC AGC TCG 40B3 _ Ser Thr Leu Glu Leu Gln Gly Asp Leu Asn Val Asn Asp Thr Ser Ser Leu Asn Leu Asn Gln Ser Thr Ile Asn Val Ser Asn Asn Ala Thr Ile Asn Asp Tyr Ala Ser Leu Ile Ala Ser Asn Gly Ser His Leu Asn Phe 1375 l380 1385 Asn Gly Ala Val Asn Phe Asn Ser Ala Asn Ile Thr Thr Ser Leu Asn Asn Ser Ser Ile Val Phe Lys Gly Ala Val Ser Leu Gly Gly Gln Phe l405 1410 1415 1420 Asn Leu Ser Asn Asn Ser Ser Leu Asp Phe Gln Gly Ser Ser Ala Ile Thr Ser Asn Thr Ala Phe Asn Phe Tyr Asp Asn Ala Phe Ser Gln Ser 1440 1445 l450 CCC ATC ACT TTC CAT CAA GCC CTT GAC ATT AAA GCG CCC TTA AGT TTG 44l9 Pry Ile Thr Phe His Gln Ala Leu Asp Ile Lys Ala Pro Leu Ser Leu Gly Gly Asn Leu Leu Asn Pro Asn Asn Ser Ser Val Leu Asp Leu Lys l470 l475 1480 Asn Ser Gln Leu Val Phe Gly Asp Gln Gly Ser Leu Asn Ile Ala Asn l485 1490 1495 1500 Ile Asp Leu Leu Ser Asp Leu 
Asn Asp Asn Lys Asn Arg Val Tyr Asn WO 98l21225 PCT/US97/2I353 -ATC ATT CAA GC'.ti CiAI: H1V ~i AGT AAT TGG TAT GAG CGT ATC AGC TTC 4611 Ile Ile Gln Ala Asp Met Asn Ser Asn Trp T;rr Glu Arg Ile Ser Phe Phe Gly Met His Ile Asn Asp Gly Ile Tyr A:>p Ala Lys Asn Gln Thr Tyr Ser Phe Thr Asn Pro Leu Asn Asn Ala Leu Lys Ile Thr Glu Ser TTT AAA GAC AAC CAA CTA AGC GTT ACG CTC TC'_T CAA ATC CCG GGT ATT 4755 Phe Lys Asp Asn Gln Leu Ser Val Thr Leu SE:r Gln Ile Pro Gly Ile 1565 1570 1d75 1580 AAA AAC ACG CTC TAT AAC ATT GGC TCT GAA A7.'T TTT AAC TAC CAA AAA 4803 Lys Asn Thr Leu Tyr Asn Ile Gly Ser Glu Il.e Phe Asn Tyr Gln Lys 1585 l590 l595 Val Tyr Asn Asn Ala Asn Gly Val Tyr Ser Tyr Ser Asp Asp Ala Gln Gly Val Phe Tyr Leu Thr Ser Asn Val Lys Gl.y Tyr Tyr Asn Pro Asn Gln Ser Tyr Gln Ala Ser Gly Ser Asn Asn Thr Thr Lys Asn Asn Asn CTA ACC TCT GAA TCT TCT ATC ATC TCG CAA AC.'C TAT AAC GCG CAA GGC 4995 Leu Thr Ser Glu Ser Ser I1e Ile Ser Gln Thr Tyr Asn Ala Gln Gly 1645 1650 1E.55 1660 Asn Pro Ile Ser Ala Leu His Ile Tyr Asn Lys Gly Tyr Asn Phe Asn AAT ATC AAA GCG TTA GGG CAA ATG GCT CTC AF,A CTC TAC CCT GAA ATC S091 Asn Ile Lys Ala Leu Gly Gln Met Ala Leu Lys Leu Tyr Pro Glu Ile AAA AAG GTA TTA GGG AAT GAT TTT TCG CCC TC.'A AGT TTG AAC GCT TTA 5139 Lys Lys Val Leu Gly Asn Asp Phe Ser Pro Ser Ser Leu Asn Ala Leu Asn Ser Asn Ala Leu Asn Gln Leu Thr Lys Le:u Ile Thr Pro Asn Asp TGG AAA AAC ATT AAC GAG TTG ATT GAT AAC GC'A AAC AAT TCG GTG GTG 5235 Trp Lys Asn Ile Asn Glu Leu Ile Asp Asn Ala Asn Asn Ser Val Val CAA AAT TTC AA'1' AAL c~t3i: ACT TTG ATT GTG GGA GCG ACT CAA ATA GGG 5283 Gln Asn Phe Asn Asn Gly Thr Leu Ile Val Gly Ala Thr Gln Ile Gly 1745 1750 l755 CAA ACA GAC ACC AAT AGC GCG GTT GTT TTT GGG GGC TTG GGC TAT CAA 533l Gln Thr Asp Thr Asn Ser Ala Val Val Phe Gly Gly Leu Gly Tyr Gln Thr Pro Cys Asp Tyr Thr Asp Ile Val Cys Gln Lys Phe Arg Gly Thr l775 1780 1785 Tyr Leu Gly Gln Leu Leu Glu Ser Ser Ser Ala Asp Leu Gly Tyr Ile Asp Thr Thr Phe Asn Ala Lys Glu Ile Tyr Leu Thr Gly 
Thr Leu Gly Ser Gly Asn Ala Trp Gly Thr Gly Gly Ser Ala Ser Val Thr Phe Asn Ser Gln Thr Ser Leu Ile Leu Asn Gln Ala Asn Ile Val Ser Ser Gln 1840 l845 l850 Thr Asp Gly Ile Phe Ser Met Leu Gly Gln Glu Gly Ile Asn Lys Val 1B55 1860 l865 Phe Asn Gln Ala Gly Leu Ala Asn Ile Leu Gly Glu Val Ala Val G1n Ser Ile Asn Lys Ala Gly Gly Leu Gly Asn Leu Ile Val Asn Thr Leu 1885 1890 1B95 l900 Gly Ser Asn Ser Val Ile Gly Gly Tyr Leu Thr Pro Glu Gln Lys Asn l905 1910 1915 CAA ACC CTA AGC CAG CTT TTA GGG CAG AAT AAC TTT GAT AAT CTC ATG 581l Gln Thr Leu Ser Gln Leu Leu Gly Gln Asn Asn Phe Asp Asn Leu Met l920 1925 1930 Asn Asp Ser Gly Leu Asn Thr Ala Ile Lys Asp Leu Ile Arg Gln Lys Leu Gly Phe Trp Thr Gly Leu Val Gly Gly Leu Ala Gly Leu Gly Gly 1950 l955 1960 ATT GAT TTG CAA AAI: i:C-I G~ AAG CTT ATA GGC AGC ATG TCA ATC AAT 595S
Ile Asp Leu Gln Asn Pro Glu Lys Leu Ile G_Ly Ser Met Ser Ile Asn Asp Leu Leu Ser Lys Lys Gly Leu Phe Asn G7_n Ile Thr Gly Phe Ile TCC GCT AAC GAT ATA GGG CAA GTC ATA AGC G7-'A ATG TTG CAA GAT ATT 6052 Ser Ala Asn Asp Ile Gly Gln Val Ile Ser Val Met Leu Gln Asp Ile GTC AAA CCG AGC AAC GCT TTA AAA AAC GAT G7.'A GCG GCT TTA GGC AAG 6099 Val Lys Pro Ser Asn Ala Leu Lys Asn Asp Val Ala Ala Leu Gly Lys CAA ATG ATT GGC GAA TTT TTA GGC CAA GAC AC:G CTC AAT TCT TTA GAA 6147 Gln Met Ile Gly Glu Phe Leu Gly Gln Asp Thr Leu Asn Ser Leu Glu Ser Leu Leu Gln Asn Gln Gln Ile Lys Ser Va.l Leu Asp Lys Val Leu 2045 2050 20'55 2060 GCG GCT AAA GGT TTA GGG CCT ATT TAT GAA CF,A GGC TTG GGG GAT TTG 6243 Ala Ala Lys Gly Leu Gly Pro Tle Tyr Glu Gln Gly Leu Gly Asp Leu Ile Pro Asn Leu Gly Lys Lys Gly Leu Phe Ala Pro Tyr Gly Leu Ser -CAA GTG TGG CAA AAA GGG GAT TTT AGT TTC AA.C GCA CAA GGC AAT GTT 6339 Gln Val Trp Gln Lys Gly Asp Phe Ser Phe Asn Ala Gln Gly Asn Val Phe Val Gln Asn Ser Thr Phe Ser Asn Ala Asn Gly Gly Thr Leu Ser Phe Asn Ala Gly Asn Ser Leu Ile Phe Ala Gly Asn Asn His Ile Ala Phe Thr Asn His Ala Gly Thr Leu Gln Leu Leu Ser Asp Gln Val Ser Asn Ile Asn Ile Thr Thr Leu Asn Ala Ser Asn Gly Leu Lys Ile Asn Ala Ala Asn Asn Asn Val Ser Val Ser Gln Gly Asn Leu Phe Val Ser 2175 2l80 2l85 Ala Ser Cys Ala Gln Gln Ser Asp Pro Thr Thr Ala Asn Ile Ala Asn Pro Cys Ala Leu Ser Ala Gln Ser Thr Asn Gly Ala Ser Ser Asn Asn Ala Ser Asn Asn Ala Pro Ile Ala Leu Ser Asn Asn Asp Glu Ser Leu Met Val Ala Ala Asn Asp Phe Asn Phe Ser Gly Asn Ile Tyr Ala Asn Gly Val Val Asp Phe Ser Lys Ile Lys Gly Ser Ala Asn Ile Lys Asn Leu Tyr Leu Tyr Asn Asn Ala Gln Phe Gln Ala Asn Asn Leu Thr Ile Ser Asn Gln Ala Val Leu Glu Lys Asn Ala Ser Phe Val Thr Asn Asn Leu Asn Ile Gln Gly Ala Phe Asn Asn Asn Ala Thr Gln Lys Ile Glu Val Leu Gln Asn Leu Val Ile Ala Ser Asn Ala Ser Leu Ser Thr Gly Ile Tyr Gly Leu Glu Val Gly Gly Ala Leu Asn Asn Ser Gly Ala Ile CAT TTT AAT TTA GAA AAT ACC CAA ACG CCA ACG CCG CTC ATT CAA GCA 7l07 
His Phe Asn Leu Glu Asn Thr Gln Thr Pro Thr Pro Leu Ile Gln Ala Glu Gly Ile Ile Asn Leu Asn Thr Thr Gln Thr Pro Phe Met Asn Val Asn Asn Ser Met Ala Asn Asn Thr Thr Tyr Thr Leu Leu Lys Ser Ser Arg Tyr Ile Asp Tyr Asn Ile Asn Pro Asn Ser Leu Gln Ser Tyr Leu AAT CTC TAC: !1C'1' '1"1'A LTC -AAT ATC AAC GGG AAC CAC ATA GAG GAA AAA 7299 Asn Leu Tyr Thr Leu Ile Asn Ile Asn Gly Asn His Ile Glu Glu Lys 24l5 2420 2425 AAC GGC GCA TTG ACT TAT TTG GGC CAA CGG G7.'T TTG TTG CAA GAT AAG 7347 Asn Gly Ala Leu Thr Tyr Leu Gly Gln Arg Val Leu Leu Gln Asp Lys GGG TTA TTG TTA AGC GTA GCG CTG CCC AAC TC:A AAC AAC GCT TCT CAA 7395 Gly Leu Leu Leu Ser Val Ala Leu Pro Asri Se'r Asn Asn Ala Ser Gln Asn Asri Ile Leu Ser Leu Ser Val Leu Tyr A:~n Gln Val Lys Met Ser TGC GGC GAT AAA GCG ATG GAT TTT ACC CCC CC'T ACC TTA CAA GAT TAC 7491 Cys Gly Asp Lys Ala Met Asp Phe Thr Pro Pro Thr Leu Gln Asp Tyr Ile Val Gly Ile Gln Gly Gln Ser Ala Leu A:>n Gln Ile Glu Ala Val GGG GGG AAC GCT ATC AAG TGG CTT TCA ACA T7.'G ATG ATG GAG ACT AAA 7587 Gly Gly Asn Ala Ile Lys Trp Leu Ser Thr Leu Met Met Glu Thr Lys Glu Asn Pro Phe Phe Ala Pro Ile Tyr Leu Lys Asn His Ser Leu Asn Glu Ile Leu Gly Val Thr Lys Asp Leu Gln Aeon Thr Ala Ser Leu Ile Ser Asn Pro Asn Phe Arg Asp Asn Ala Thr A~~n Leu Leu Glu Leu Ala Ser Tyr Thr Gln Gln Thr Ser Arg Leu Thr Lys Leu Ser Asp Phe Arg TCT AGA GAG GGA GAG TCT GAT TTT TCT TTG T7.'A GAG CTT AAA AAC AAG 7827 Ser Arg Glu Gly Glu Ser Asp Phe Ser Leu Leu Glu Leu Lys Asn Lys Arg Phe Ser Asp Pro Asn Pro Glu Val Phe Val Lys Tyr Ser Gln Leu 2605 2610 2E.15 2620 Ser Lys His Pro Asn Asn Leu Trp Val Gln Gl.y Val Gly Gly Ala Ser TTT ATT TCT GGG GGC: IjAT-GGC ACG CTT TAT GGC TTG AAT GCG GGC TAT 797I
Phe Ile Ser Gly Gly Asn Gly Thr Leu Tyr Gly Leu Asn Ala Gly Tyr Asp Arg Leu Val Lys Asn Val Ile Leu Gly Gly Tyr Val Ala Tyr Gly Tyr Ser Asp Phe Asn Gly Asn Ile Met His Ser Leu Gly Asn Asn Val GAT GTG GGG ATG TAT GCG AGG GCT TTT TTA AAA AGG AAC GAA TTC ACT 81l5 Asp Val Gly Met Tyr Ala Arg Ala Phe Leu Lys Arg Asn Glu Phe Thr Leu Ser Ala Asn Glu Thr Tyr Gly Gly Asn Ala Thr Ser Ile Asn Ser 2705 2710 27l5 Ser Asn Ser Leu Leu Ser Val Leu Asn Gln Arg Tyr Asn Tyr Asn Thr Trp Thr Thr Ser Val Asn Gly Asn Tyr Gly Tyr Asp Phe Met Phe Lys Gln Lys Ser Val Val Leu Lys Pro Gln Val Gly Leu Ser Tyr His Phe Ile Gly Leu Ser Gly Met Lys Gly Asn Asp Ala Ala Tyr Lys Gln Phe Leu Met His Ser Asn Pro Ser Asn Glu Ser Val Leu Thr Leu Asn Met Gly Leu Glu Ser Arg Lys Tyr Phe Gly Lys Asn Ser Tyr Tyr Phe Val Thr Ala Arg Leu Gly Arg Asp Leu Leu Ile Lys Ser Lys Gly Ser Asn Thr Val Arg Phe Val Gly Glu Asn Thr Leu Leu Tyr Arg Lys Gly Glu Val Phe Asn Thr Phe Ala Ser Val Ile Thr Gly Gly Glu Met His Leu TGG CGT TTG G'i'G '1'A'i GTG -iiAT GCG GGG GTG GGG CTT AAG ATG GGC TTG 8 64 3 Trp Arg Leu Val Tyr Val Rsn Ala Gly Val Gl.y Leu Lys Met Gly Leu Gln Tyr Gln Asp Ile Asn Ile Thr Gly Asn Va.l Gly Met Arg Val Ala TTT TAGCTTTTTT GCTATAATGC TTCGTTCAAA TTTTp,TGGTT AGGTTTTTCT ATGT 8748 Phe (2) INFORMATION FOR SEQ ID N0:166:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2893 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:166:
Met Lys Lys Phe Lys Lys Lys Pro Lys Ser Ile Lys Arg Ser His Gln Asn Gln Lys Thr Ile Leu Lys Arg Pro Leu Trp Leu Met Pro Leu Leu Ile Ser Gly Phe Ala Ser Gly Val Tyr Ala Asn Asn Leu Trp Asp Leu Leu Asn Pro Lys Val Gly Gly Glu Tyr Val His Trp Val Lys Gly Ser Gln Tyr Cys Ala Trp Trp Glu Phe Ala Gly Cys Leu Lys Asn Val Trp Gly Ala Asn His Lys Gly Tyr Asp Ala Gly Asn Ala Ala Asn Tyr Leu Ser Ser Gln Asn Tyr Gln Ala Ile Ser Val Gly Ser Gly Asn Glu Thr Gly Thr Tyr Ser Leu Ser Gly Phe Thr Asn Tyr Val Gly G1y Asn Leu Thr Ile Asn Leu Gly Asn Ser Val Val Leu Asp Leu Ser Gly Ser Asn Ser Phe Thr Ser Tyr Gln Gly Tyr Asn Gln Gly Lys Asp Asp Val Thr Phe Thr Val Gly Ala Ile Asn Leu Asn Gly Thr Leu Glu Val Gly Asn Arg Val Gly Ser Gly Ala Gly Thr His Thr Gly Thr Ala Thr Leu Asn Leu Asn Ala Asn Lys Val Asn Ile Asn Ser Asn Ile Asn Ala Tyr Lys Thr Ser Gln Val Asn Ile Gly Asn Ala Asn Ser Val Ile Thr Ile Gly Ser Val Ser Leu Ser Gly Asp Val Cys Ser Ser Leu Ala Ser Val Gly Ile Gly Ala Asn C;ys ser-ihr Ser Gly Pro Ser Tyr Ser Phe Lys G1y Thr Thr Asn Ala Thr Asn Thr Ala Phe Ser Asn Ala Ser Gly Ser Phe Thr Phe Glu Glu Asn Ala Thr Phe Ser Gly Ala Lys Trp Asn Gly Gly Thr Tyr Thr Phe Asn Lys Glu Phe Ser Ala Thr Asn Asn Thr Ala Phe Ser 5er Gly Ser Phe Asn Phe Lys Gly Val Ser Ser Phe Asn Gly Thr Ser Phe Ser Asn Ala Ser Tyr Thr Phe Asp Asn Gln Ala Thr Phe Gln Asn Ser Ser Phe Asn Gly Gly Thr Phe Thr Phe Asn Asn Gln Thr Asn Pro Thr Asn Asn Ala Gln His Pro Gln Ile Gln Asn Ser Ser Phe Ser Gly Asn Ala Thr Thr Leu Lys Gly Phe Val Asn Phe Gln Gln Ala Phe Asn Asn Ser Asn His Gln Leu Thr Ile Gln Asn Ala Ser Phe Asn Asn Ala Thr Phe Asn Asn Thr Gly Lys Ile Thr I1e Glu Lys Asp Ala Ser 405 4l0 415 Phe Asn Asn Thr Thr Phe Asn Thr Ser Val Asp Thr Asn Asn Met Ser Val Thr Gly Gly Val Thr Leu Ser Gl.y Lys Asn Asp Leu Lys Asn Gly Ser Thr Leu Asp Phe Gly Ser Ser Lys Ile Thr Leu Ala Gln Gly Thr Thr Phe Asn Leu Thr Ser Leu Gly Ser Glu Lys Ser Val Thr Ile Leu Asn Ser Ser Gly Gly Ile Thr Tyr Ser Asn Leu Leu Asn His Ala Ile 
Asn Gly Leu Thr Ser Ala Leu Lys Thr Asn Glu Ser Leu Ser Asn Pro Gln Ser Phe Ala Gln Gly Leu Trp Asp Ile Ile Thr Tyr Asn Gly Val Thr Gly Gln Leu Leu Asn Glu Asn Ala Ala Thr Ser Lys Pro Thr Asp Ser Ser Pro Ser Lys Ser Ser Thr Asn Ser Thr Gln Val Tyr Gln Val Gly Tyr Lys Ile Gly Asp Thr Ile Tyr Lys Leu Gln Glu Thr Phe Ser His Asn Ser Ile Ile Ile Gln Ala Leu Glu Ser Gly Thr Tyr Thr Pro Pro Pro Val Ile Asn Gly Ser Lys Phe Asp Leu Ser Ala Ser Asn Tyr Ile Asn Ala Asp Met Pro Trp Tyr Asp His Lys Tyr Tyr Ile Pro Lys Ser Gln Asn Phe Thr Glu Ser Gly Thr Tyr Tyr Leu Pro Ser Val Gln Ile Trp Gly Ser Tyr Thr Asn Ser Phe Lys Gln Thr Phe Ser Ala Asn Gly Ser Asn Leu Val Ile Gly Tyr Asn Ser Thr Trp Thr Asp His Asn Val Ser Ser Ser G1y Thr Val Ser Phe Gly Asp Thr Ser Gly Ser Ala Leu Asn Gly His Cys Gly Pro Trp Pro Tyr T;rr Gln Cys Thr G1y Thr Thr Asn Gly Thr Tyr Ser Ala Tyr His Val T;rr Ile Thr Ala Asn Leu 705 710 7.L5 720 Arg Ser Gly Asn Arg Ile Gly Thr Gly Gly A7La Ala Asn Leu Ile Phe Asn Gly Val Asp Ser Ile Asn Ile Ala Asn A7_a Thr Ile Thr Gln His Asn Ala Gly Ile Tyr Ser Ser Ser Met Thr Phe Ser Thr Gln Ser Met Asp Asn Ser Gln Asn Leu Asn Gly Leu Asn Ser Asn Gly Lys Leu Ser Val Tyr Gly Thr Thr Phe Thr Asn Glu Ala Lys Asp Gly Lys Phe Ile 785 790 7~>5 800 Phe Asn Ala Gly Gln Ala Val Phe Glu Asn Thr Asn Phe Asn Gly Gly Ser Tyr Gln Phe Ser Gly Asp Ser Leu Asn Phe Ser Asn Asn Asn Gln Phe Asn Ser Gly Ser Phe Glu Ile Ser Ala Lys Asn Ala Ser Phe Asn Asn Ala Asn Phe Asn Asn Ser Ala Ser Phe A:~n Phe Asn Asn Ser Asn Ala Thr Thr Ser Phe Val Gly Asp Phe Thr A~;n Ala Asn Ser Asn Leu Gln Ile Ala Gly Asn Ala Val Phe Gly Asn Se:r Thr Asn Gly Ser Gln Asn Thr Ala Asn Phe Asn Asn Thr Gly Ser Val Asn Ile Ser Gly Asn Ala Thr Phe Asp Asn Val Val Phe Asn Gly Pro Thr Asn Thr Ser Val Lys Gly Gln Val Thr Leu Asn Asn Ile Thr Le:u Lys Asn Leu Asn Ala Pro Leu Ser Phe Gly Asp Gly Thr Ile Thr Phe Asn Ala His Ser Val Ile Asn Ile Ala Glu Ser Ile Thr Asn Gly Asn Pro Ile Thr Leu Val Ser Ser Ser Lys Glu Ile Glu Tyr Asn 
Asn Ala Phe Ser Lys Asn Leu Trp Gln Leu Ile Asn Tyr Gln Gly His Gly Ala Ser Ser Glu Lys Leu Val Ser Ser Ala Gly Asn Gly Val Tyr Asp Va.l Val Tyr Ser Phe Asn Asn Gln Thr Tyr Asn Phe Gln Glu Val Phe Ser Gln Asn Ser Ile Ser Ile Arg Arg Leu Gly Val Asn Met Val Phe Asp Tyr Val Asp Met Glu Lys Ser Asp His Leu Tyr Tyr Gln Asn Ala Leu Gly Phe Met Thr Tyr Met Pro Asn Ser Tyr Asn Asn Asn Leu Gly Asn Ala Asn Asn Thr Ile Tyr Tyr Tjrr Asp Lys Ser Ile Asp Phe Tyr Ala Ser Gly Lys Thr Leu Phe Thr Lys Ala Glu Phe Ser Gln Thr Phe Th.r Gly Gln Asn Ser Ala Ile Val Phe Gly Ala Lys-Ser Ile Trp Thr Ser Leu Ser Asp Ala Pro 1125 l130 l135 Gln Ser Asn Thr Ile Ile Arg Phe Gly Asp Asn Lys Gly Ala Gly Ser 1140 l145 1150 Asn Asp Ala Ser Gly His Cys Trp Asn Leu Gln Cys Ile Gly Phe Ile 1155 l160 I165 Thr Gly His Tyr Glu Ala Gln Lys Ile Tyr Ile Thr Gly Ser Ile Glu Ser Gly Asn Arg Ile Ser Ser Gly Gly Gly Ala Ser Leu Asn Phe Asn Gly Leu Gln Gly Ile Leu Leu Thr Asn Ala Thr Leu Tyr Asn Arg Ala I205 1210 12l5 Ala Gly Thr Gln Ser Ser Ser Met Asn Phe Ile Ser Asn Ser Ala Asn Ile Gln Ala Gln Asn Ser Tyr Phe Ile Asp Asp Thr Ala Gln Asn Gly l235 1240 1245 Gly Asn Pro Asn Phe Ser Phe Asn Ala Leu Asn Leu Asp Phe Ser Asn l250 1255 1260 Ser Ser Phe Arg Gly Tyr Val Gly Lys Thr Gln Ser Val Phe Lys Phe Asn Ala Lys Asn Ala Ile Ser Phe Thr Asn Ser Thr Asn Leu Ser Ser 1285 l290 1295 Gly Leu Tyr Gln Met Gln Ala Lys Ser Val Leu Phe Asp Asn Ser Asn 1300 l305 1310 Leu Ser Val Ser Val Gly Thr Ser Ser Ile Lys Ala Asn Ala Ile Asn Leu Ser Gln Asn Ala Ser Ile Asn Ala Ser Asn His Ser Thr Leu Glu Leu Gln Gly Asp Leu Asn Val Asn Asp Thr Ser Ser Leu Asn Leu Asn Gln Ser Thr Ile Asn Val Ser Asn Asn Ala Thr Ile Asn Asp Tyr A1a l365 1370 l375 Ser Leu Ile Ala Ser Asn Gly Ser His Leu Asn Phe Asn Gly Ala Val Asn Phe Asn Ser Ala Asn Ile Thr Thr Ser Leu Asn Asn Ser Ser Ile 1395 l400 1405 Val Phe Lys Gly Ala Val Ser Leu Gly Gly Gln Phe Asn Leu Ser Asn 1410 l415 1420 Asn Ser Ser Leu Asp Phe Gln Gly Ser Ser Ala Ile Thr Ser Asn Thr Ala Phe Asn 
Phe Tyr Asp Asn Ala Phe Ser Gln Ser Pro I1e Thr Phe His Gln Ala Leu Asp Ile Lys Ala Pro Leu Ser Leu Gly Gly Asn Leu l460 1465 1470 Leu Asn Pro Asn Asn Ser Ser Val Leu Asp Leu Lys Asn Ser Gln Leu 1475 1480 l485 Val Phe Gly Asp Gln Gly Ser Leu Asn Ile Ala Asn Ile Asp Leu Leu Ser Asp Leu Asn Asp Asn Lys Asn Arg Val Tyr Asn Ile Ile Gln Ala Asp Met Asn Ser Asn Trp Tyr Glu Arg Ile Ser Phe Phe Gly Met His Ile Asn Asp Gly Ile Tyr Asp Ala Lys Asn Gln Thr Tyr Ser Phe Thr Asn Pro Leu Asn Asn Ala Leu Lys Ile Thr Glu Ser Phe Lys Asp Asn Gln Leu Ser Val Thr Leu Ser Gln Ile Pro Gly Ile Lys Asn Thr Leu Tyr Asn Ile Gly Ser Glu Ile Phe Asn Tyr Gln Lys Val Tyr Asn Asn 5B5 1590 15'.35 1600 Ala Asn Gly Val Tyr Ser Tyr Ser Asp Asp A.La Gln Gly Val Phe Tyr Leu Thr Ser Asn Val Lys Gly Tyr Tyr Asn Pro Asn Gln Ser Tyr Gln Ala Ser Gly Ser Asn Asn Thr Thr Lys Asn Asn Asn Leu Thr Ser Glu Ser Ser Ile Ile Ser Gln Thr Tyr Asn Ala Gln Gly Asn Pro Ile Ser Ala Leu His Ile Tyr Asn Lys Gly Tyr Asn Phe Asn Asn Ile Lys Ala 665 1670 16'15 1680 Leu Gly Gln Met Ala Leu Lys Leu Tyr Pro Glu Ile Lys Lys Val Leu Gly Asn Asp Phe Ser Pro Ser Ser Leu Asn Ala Leu Asn Ser Asn Ala 1700 1705 17l0 Leu Asn Gln Leu Thr Lys Leu Ile Thr Pro Asn Asp Trp Lys Asn Ile Asn Glu Leu Ile Asp Asn Ala Asn Asn Ser V<~l Val Gln Asn Phe Asn Asn Gly Thr Leu Ile Val Gly Ala Thr Gln Ile Gly Gln Thr Asp Thr Asn Ser Ala Val Val Phe Gly Gly Leu Gly Tvr Gln Thr Pro Cys Asp 1765 l770 J 1775 Tyr Thr Asp Ile Val Cys Gln Lys Phe Arg G;Ly Thr Tyr Leu Gly Gln Leu Leu Glu Ser Ser Ser Ala Asp Leu Gly T~rr Ile Asp Thr Thr Phe 1795 1800 J l805 Asn Ala Lys Glu Ile Tyr Leu Thr Gly Thr Leu G1y Ser Gly Asn Ala Trp Gly Thr Gly Gly Ser Ala Ser Val Thr Phe Asn Ser Gln Thr Ser Leu Ile Leu Asn Gln Ala Asn Ile Val Ser Ser Gln Thr Asp Gly Ile Phe Ser Met Leu Gly Gln Glu Gly Ile Asn Lys Val Phe Asn Gln Ala Gly Leu Ala Asn Ile Leu Gly Glu Val Ala Val Gln Ser Ile Asn Lys l875 18S0 1885 Ala Gly Gly Leu Gly Asn Leu Ile Val Asn Thr Leu Gly Ser Asn Ser Val Ile Gly Gly 
Tyr Leu Thr Pro Glu Gln Lys Asn Gln Thr Leu Ser Gln Leu Leu Gly Gln Asn Asn Phe Asp Asn Lc,u Met Asn Asp Ser Gly Leu Asn Thr Ala Ile Lys Asp Leu Ile Arg Gln Lys Leu Gly Phe Trp 1940 l945 l950 Thr Gly Leu Val Gly Gly Leu Ala Gly Leu G.Ly Gly Ile Asp Leu Gln Asn Pro Glu Lys Leu Ile Gly Ser Met Ser I.Le Asn Asp Leu Leu Ser Lys Lys Gly Leu Phe Asn Gln Ile Thr Gly Plze Ile Ser Ala Asn Asp Ile Gly Gln vat m a Ser-Val Met Leu Gln Asp Ile Val Lys Pro Ser Asn Ala Leu Lys Asn Asp Val Ala Ala Leu Gly Lys Gln Met Ile Gly Glu Phe Leu Gly Gln Asp Thr Leu Asn Ser Leu Glu Ser Leu Leu Gln Asn Gln Gln Ile Lys Ser Val Leu Asp Lys Val Leu Ala Ala Lys Gly Leu Gly Pro Ile Tyr Glu Gln Gly Leu Gly Asp Leu Ile Pro Asn Leu Gly Lys Lys Gly Leu Phe Ala Pro Tyr Gly Leu Ser Gln Val Trp Gln Lys Gly Asp Phe Ser Phe Asn Ala Gln Gly Asn Val Phe Val Gln Asn Ser Thr Phe Ser Asn Ala Asn Gly Gly Thr Leu Ser Phe Asn Ala Gly 21l5 2120 2125 Asn Ser Leu Ile Phe Ala Gly Asn Asn His Ile Ala Phe Thr Asn His Ala Gly Thr Leu Gln Leu Leu Ser Asp Gln Val Ser Asn Ile Asn Ile 14S 2150 2l55 2160 Thr Thr Leu Asn Ala Ser Asn Gly Leu Lys Ile Asn Ala Ala Asn Asn 2165 2l70 2175 Asn Val Ser Val Ser Gln Gly Asn Leu Phe Val Ser Ala Ser Cys Ala Gln Gln Ser Asp Pro Thr Thr Ala Asn Ile Ala Asn Pro Cys Ala Leu Ser Ala Gln Ser Thr Asn Gly Ala Ser Ser Asn Asn Ala Ser Asn Asn Ala Pro Ile Ala Leu Ser Asn Asn Asp Glu Ser Leu Met Val Ala Ala Asn Asp Phe Asn Phe Ser Gly Asn Ile Tyr Ala Asn Gly Val Val Asp Phe Ser Lys Ile Lys Gly Ser Ala Asn Ile Lys Asn Leu Tyr Leu Tyr Asn Asn Ala Gln Phe Gln Ala Asn Asn Leu Thr Ile Ser Asn Gln Ala Val Leu Glu Lys Asn Ala Ser Phe Val Thr Asn Asn Leu Asn Ile Gln Gly Ala Phe Asn Asn Asn Ala Thr Gln Lys Ile Glu Val Leu G1n Asn Leu Val Ile Ala Ser Asn Ala Ser Leu Ser Thr Gly Ile Tyr Gly Leu Glu Val Gly Gly Ala Leu Asn Asn Ser Gly Ala Ile His Phe Asn Leu Glu Asn Thr Gln Thr Pro Thr Pro Leu Ile Gln Ala Glu Gly Ile Ile Asn Leu Asn Thr Thr Gln Thr Pro Phe Met Asn Val Asn Asn Ser Met A1a Asn Asn Thr Thr Tyr Thr 
Leu Leu Lys Ser Ser Arg Tyr Ile Asp Tyr Asn Ile Asn Pro Asn Ser Leu Gln Ser Tyr Leu Asn Leu Tyr Thr Leu Ile Asn Ile Asn Gly Asn His Ile Glu Glu Lys Asn Gly Ala Leu Thr Tyr Leu Gly Gln Arg Val Leu Leu Gln Asp Lys Gly Leu Leu Leu Ser Val Ala Leu Pro Asn Ser Asn Asn Ala Ser Gln Asn Asn Ile Leu Ser Leu Ser Val Leu Tyr Asn Gln Val Lys Met Ser Cys Gly Asp Lys 465 2470 24'75 2480 Ala Met Asp Phe Thr Pro Pro Thr Leu Gln Aap Tyr Ile Val Gly Ile Gln Gly Gln Ser Ala Leu Asn Gln Ile Glu A.La Val Gly Gly Asn Ala Ile Lys Trp Leu Ser Thr Leu Met Met Glu Thr Lys Glu Asn Pro Phe Phe Ala Pro Ile Tyr Leu Lys Asn His Ser Lc.u Asn Glu Ile Leu Gly Val Thr Lys Asp Leu G1n Asn Thr Ala Ser Lc~u Ile Ser Asn Pro Asn Phe Arg Asp Asn Ala Thr Asn Leu Leu Glu Leu Ala Ser Tyr Thr Gln Gln Thr Ser Arg Leu Thr Lys Leu Ser Asp Phe Arg Ser Arg Glu Gly Glu Ser Asp Phe Ser Leu Leu Glu Leu Lys Asn Lys Arg Phe Ser Asp Pro Asn Pro Glu Val Phe Val Lys Tyr Ser Gln Leu Ser Lys His Pro Asn Asn Leu Trp Val Gln Gly Val Gly Gly A7_a Ser Phe Ile 5er Gly 625 2630 26,S5 2640 Gly Asn Gly Thr Leu Tyr Gly Leu Asn Ala G7_y Tyr Asp Arg Leu Val Lys Asn Val Ile Leu Gly Gly Tyr Val Ala Tyr Gly Tyr Ser Asp Phe Asn Gly Asn Ile Met His Ser Leu Gly Asn A~~n Val Asp Val Gly Met Tyr Ala Arg Ala Phe Leu Lys Arg Asn Glu Phe Thr Leu Ser Ala Asn Glu Thr Tyr Gly Gly Asn Ala Thr Ser Ile A:cn Ser Ser Asn Ser Leu 705 2710 277.5 2720 Leu Ser Val Leu Asn Gln Arg Tyr Asn Tyr A~:n Thr Trp Thr Thr Ser Val Asn Gly Asn Tyr Gly Tyr Asp Phe Met Phe Lys Gln Lys Ser Val Val Leu Lys Pro Gln Val Gly Leu Ser Tyr Hi.s Phe Ile Gly Leu Ser Gly Met Lys Gly Asn Asp Ala Ala Tyr Lys Gl.n Phe Leu Met His Ser Asn Pro Ser Asn Glu Ser Val Leu Thr Leu As~n Met Gly Leu Glu Ser Arg Lys Tyr Phe Gly Lys Asn Ser Tyr Tyr Phe Val Thr Ala Arg Leu Gly Arg Asp Leu Leu Ile Lys Ser Lys Gly Ser Asn Thr Val Arg Phe Val Gly Glu Asn Thr Leu Leu Tyr Arg Lys Gly Glu Val Phe Asn Thr Phe Ala Ser Val Ile Thr Gly Gly Glu Met His Leu Trp Arg Leu Val Tyr Val Asn Ala Gly Val Gly Leu Lys 
Met Gly Leu Gln Tyr Gln Asp Ile Asn Ile Thr Gly Asn Val Gly Met Arg Val Ala Phe
(2) INFORMATION FOR SEQ ID NO:167:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1376 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 13...1338
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:167:
Met Gly Asn His Phe Ser Lys Leu Gly Phe Val Leu Ala Ala Leu Gly Ser Ala Ile Gly Leu Gly His Ile Trp Arg Phe Pro Tyr Met Thr Gly Val Ser Gly Gly Gly Ala Phe Val Leu Leu Phe Leu Phe Leu Ser Leu Ser Val Gly Ala Ala Met Phe Ile Ala Glu Met Leu Leu Gly Gln Ser Thr Gln Lys Asn Val Thr Glu Ala Phe Lys Glu Leu Asp Ile Asn Pro Lys Lys Arg Trp Lys Tyr Ala Gly Leu Leu Leu Val Ser Gly Pro Leu Ile Leu Thr Phe Tyr Gly Thr Ile Leu Gly Trp Val Leu Tyr Tyr Leu Val Ser Val Ser Phe Asn Leu Pro Asn Asn Ile Gln Glu 110 115 120 125 Ser Glu Gln Ile Phe Thr Gln Thr Leu Gln Ser Ile Gly Leu Gln Ser Ile Gly Leu Phe Ser Val Leu Leu Ile Thr Gly Trp Ile Val Ser Arg Gly Ile Lys Glu Gly Ile Glu Lys Leu Asn Leu Val Leu Met Pro Leu Leu Phe Ala Thr Phe Phe Gly Leu Leu Phe Tyr Ala Met Ser Met Asp 175 180 185 Ser Phe Ser Lys Ala Phe His Phe Met Phe Asp Phe Lys Pro Lys Asp Leu Thr Ser Gln Val Phe Thr Tyr Ser Leu Gly Gln Val Phe Phe Ser Leu Ser Ile Gly Leu Gly Ile Asn Ile Thr Tyr Ala Ala Val Thr Asp Lys Thr Gln Asn Leu Leu Lys Ser Thr Ile Trp Val Val Leu Ser Gly Ile Leu Ile Ser Leu Val Ala Gly Leu Met Ile Phe Thr Phe Val Phe Glu Tyr Gly Ala Asn Val Ser Gln Gly Thr Gly Leu Ile Phe Thr Ser
TTA CCG GTG GTT TTT GGC CAA ATG GGA GCG ATA GGC ATT CTT GTT TCG 915
Leu Pro Val Val Phe Gly Gln Met Gly Ala Ile Gly Ile Leu Val Ser
Ile Leu Phe Leu Leu Ala Leu Ala Phe Ala Gly Ile Thr Ser Thr Val Ala Leu Leu Glu Pro Ser Val Met Tyr Leu Thr Glu Arg Tyr Gln Tyr Ser Arg Phe Lys Val Thr Trp Gly Leu Val Ala Leu Ile Phe Val Val Gly Val Val Leu Ile Phe Ser Leu His Lys Asp Tyr Lys Asp Tyr Leu Thr Phe Phe Glu Lys Ser Leu Phe Asp Trp Leu Asp Phe Ala Ser Ser
ACC ATT ATC ATG CCT TTA GGC GGG ATG GCA ACC TTT ATT TTT ATG GGT 1203
Thr Ile Ile Met Pro Leu Gly Gly Met Ala Thr Phe Ile Phe Met Gly
TGG GTT TTG AAA AAA GAA AAA TTG CGT CTT TTG AGC GTG CAC TTT TTA 1251
Trp Val Leu Lys Lys Glu Lys Leu Arg Leu Leu Ser Val His Phe Leu
Gly Pro Lys Leu Phe Ala Thr Trp Tyr Phe Leu Leu Lys Tyr Ile Thr 415 420 425 Pro Leu Ile Val Phe Ser Ile Trp Leu Ser Lys Ile Tyr
(2) INFORMATION FOR SEQ ID NO:168:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 442 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:168:
Met Gly Asn His Phe Ser Lys Leu Gly Phe Val Leu Ala Ala Leu Gly Ser Ala Ile Gly Leu Gly His Ile Trp Arg Phe Pro Tyr Met Thr Gly Val Ser Gly Gly Gly Ala Phe Val Leu Leu Phe Leu Phe Leu Ser Leu Ser Val Gly Ala Ala Met Phe Ile Ala Glu Met Leu Leu Gly Gln Ser Thr Gln Lys Asn Val Thr Glu Ala Phe Lys Glu Leu Asp Ile Asn Pro Lys Lys Arg Trp Lys Tyr Ala Gly Leu Leu Leu Val Ser Gly Pro Leu Ile Leu Thr Phe Tyr Gly Thr Ile Leu Gly Trp Val Leu Tyr Tyr Leu Val Ser Val Ser Phe Asn Leu Pro Asn Asn Ile Gln Glu Ser Glu Gln Ile Phe Thr Gln Thr Leu Gln Ser Ile Gly Leu Gln Ser Ile Gly Leu 130 135 140 Phe Ser Val Leu Leu Ile Thr Gly Trp Ile Val Ser Arg Gly Ile Lys Glu Gly Ile Glu Lys Leu Asn Leu Val Leu Met Pro Leu Leu Phe Ala Thr Phe Phe Gly Leu Leu Phe Tyr Ala Met Ser Met Asp Ser Phe Ser Lys Ala Phe His Phe Met Phe Asp Phe Lys Pro Lys Asp Leu Thr Ser Gln Val Phe Thr Tyr Ser Leu Gly Gln Val Phe Phe Ser Leu Ser Ile Gly Leu Gly Ile Asn Ile Thr Tyr Ala Ala Val Thr Asp Lys Thr Gln Asn Leu Leu Lys Ser Thr Ile Trp Val Val Leu Ser Gly Ile Leu Ile Ser Leu Val Ala Gly Leu Met Ile Phe Thr Phe Val Phe Glu Tyr Gly Ala Asn Val Ser Gln Gly Thr Gly Leu Ile Phe Thr Ser Leu Pro Val Val Phe Gly Gln Met Gly Ala Ile Gly Ile Leu Val Ser Ile Leu Phe Leu Leu Ala Leu Ala Phe Ala Gly Ile Thr Ser Thr Val Ala Leu Leu 305 310 315 320 Glu Pro Ser Val Met Tyr Leu Thr Glu Arg Tyr Gln Tyr Ser Arg Phe Lys Val Thr Trp Gly Leu Val Ala Leu Ile Phe Val Val Gly Val Val Leu Ile Phe Ser Leu His Lys Asp Tyr Lys Asp Tyr Leu Thr Phe Phe Glu Lys Ser Leu Phe Asp Trp Leu Asp Phe Ala Ser Ser Thr Ile Ile Met Pro Leu Gly Gly Met Ala Thr Phe Ile Phe Met Gly Trp Val Leu Lys Lys Glu Lys Leu Arg Leu Leu Ser Val His Phe Leu Gly Pro Lys Leu Phe Ala Thr Trp Tyr Phe Leu Leu Lys Tyr Ile Thr Pro Leu Ile Val Phe Ser Ile Trp Leu Ser Lys Ile Tyr
(2) INFORMATION FOR SEQ ID NO:169:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1392 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 22...1356 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:169:
Met Lys Ile Phe Gly Thr Asp Gly Val Arg Gly Lys Ala Gly Val Lys Leu Thr Pro Met Phe Val Met Arg Leu Gly Ile Ala Ala Gly Leu Tyr Phe Lys Lys His Ser Gln Thr Asn Lys Ile CTA ATC GGT AAA GAC ACC AGA AAA AGC GGC TAT ATG GTA GAA AAC GCT l95 Leu Ile Gly Lys Asp Thr Arg Lys Ser Gly Tyr Met Val Glu Asn Ala Leu Val Ser Ala Leu Thr 5er Tle Gly Tyr Asn Val Ile Gln Ile Gly Pro Met Pro Thr Pro Ala Ile Ala Phe Leu Thr Glu Asp Met Arg Cys Asp Ala Gly Ile Met Ile Ser Ala Ser His Asn Pro Phe Glu Asp Asn Gly Ile Lys Phe Phe Asn Ser Tyr Gly Tyr Lys Leu Lys Glu Glu Glu Glu Lys Ala Ile Glu Glu Ile Phe His Asp Glu Glu Leu Leu His Ser 125 130 l35 Ser Tyr Lys Val Gly Glu Ser Val Gly Ser Ala Lys Arg Ile Asp Asp l40 145 150 Val Ile Gly Arg Tyr Ile Ala His Leu Lys His Ser Phe Pro Lys His Leu Asn Leu Gln Ser Leu Arg Ile Val Leu Asp Thr Ala Asn Gly Ala 175 180 l85 Ala Tyr Lys Val Ala Pro Val Val Phe Ser Glu Leu Gly Ala Asp Val Leu Val Ile Asn Asp Glu Pro Asn Gly Cys Asn Ile Asn Asp Gln Cys Gly Ala Leu His Pro Asn Gln Leu Ser Gln G:Lu Val Lys Lys Tyr Arg Ala Asp Leu Gly Phe Ala Phe Asp Gly Asp Ala Asp Arg Leu Val Val Val Asp Asn Leu Gly Asn Ile Val His Gly Asp Lys Leu Leu Gly Val TTA GGG GTT TAT CAA AAA TCT AAA AAC GCC C'.CT TCT TCT CAA GCG GTT 867 Leu Gly Val Tyr Gln Lys Ser Lys Asn Ala Leu Ser Ser Gln Ala Val Val Ala Thr Asn Met Ser Asn Leu Ala Leu Lys Glu Tyr Leu Lys Ser Gln Asp Leu Glu Leu Lys His Cys Ala Ile Gl_y Asp Lys Phe Val Ser GAA TGC ATG CAA TTG AAT AAA GCC AAT TTT GCiA GGC GAG CAA AGC GGG 1011 Glu Cys Met Gln Leu Asn Lys Ala Asn Phe Gl.y Gly Glu Gln Ser Gly CAT ATC ATT TTT AGC GAT TAC GCT AAA ACA GCiC GAT GGT TTG GTG TGC 1059 His Ile Ile Phe Ser Asp Tyr Ala Lys Thr Gl.y Asp Gly Leu Val Cys Ala Leu Gln Val Ser Ala Leu Val Leu Glu Se:r Lys Gln Val Ser Ser GTT GCG TTA AAC CCC TTT GAA TTA TAC CCC CP,A AGC CTA GTG AAT TTG 1155 Val Ala Leu Asn Pro Phe Glu Leu Tyr Pro Gln Ser Leu Val Asn Leu Asn Val Gln Lys Lys Pro Pro Leu Glu Ser Leu Lys Gly Tyr Ser Ala Leu Leu Lys Glu Leu Asp Lys Leu 
Glu Ile Arg His Leu Ile Arg Tyr AGC GGC ACT GAA AAC AAA TTG CGA ATC CTT TTA GAA GCT AAA GAT GAA 1299 Ser Gly Thr Glu Asn Lys Leu Arg Ile Leu Leu Glu Ala Lys Asp Glu Lys Leu Leu Glu Ser Lys Met Gln Glu Leu Lys Glu Phe Phe Glu Gly CAT TTG TGC TAAAAACCAC TAAAAAAAGC CTGTTGGTTT TTATGG 1392 His Leu Cys (2) INFORMATION FOR SEQ ID NO:170:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 445 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID N0:170:
Met Lys Ile Phe Gly Thr Asp Gly Val Arg Gly Lys Ala Gly Val Lys Leu Thr Pro Met Phe Val Met Arg Leu Gly Ile Ala Ala Gly Leu Tyr Phe Lys Lys His Ser Gln Thr Asn Lys Ile Leu Ile Gly Lys Asp Thr Arg Lys Ser Gly Tyr Met Val Glu Asn Ala Leu Val Ser Ala Leu Thr Ser Ile Gly Tyr Asn Val Ile Gln Ile Gly Pro Met Pro Thr Pro Ala Ile Ala Phe Leu Thr Glu Asp Met Arg Cys Asp Ala Gly Ile Met Ile Ser Ala Ser His Asn Pro Phe Glu Asp Asn Gly Ile Lys Phe Phe Asn Ser Tyr Gly Tyr Lys Leu Lys Glu Glu Glu Glu Lys Ala Ile Glu Glu Ile Phe His Asp Glu Glu Leu Leu His Ser Ser Tyr Lys Val Gly Glu Ser Val Gly Ser Ala Lys Arg Ile Asp Asp Val Ile Gly Arg Tyr Ile Ala His Leu Lys His Ser Phe Pro Lys His Leu Asn Leu Gln Ser Leu Arg Ile Val Leu Asp Thr Ala Asn Gly Ala Ala Tyr Lys Val Ala Pro Val Val Phe Ser Glu Leu Gly Ala Asp Val Leu Val Ile Asn Asp Glu Pro Asn Gly Cys Asn Ile Asn Asp Gln Cys Gly Ala Leu His Pro Asn Gln Leu Ser Gln Glu Val Lys Lys Tyr Arg Ala Asp Leu Gly Phe Ala Phe Asp Gly Asp Ala Asp Arg Leu Val Val Val Asp Asn Leu Gly Asn Ile Val His Gly Asp Lys Leu Leu Gly Val Leu Gly Val Tyr Gln Lys Ser Lys Asn Ala Leu Ser Ser Gln Ala Val Val Ala Thr Asn Met Ser Asn Leu Ala Leu Lys Glu Tyr Leu Lys Ser Gln Asp Leu Glu Leu Lys His Cys Ala Ile Gly Asp Lys Phe Val Ser Glu Cys Met Gln Leu Asn Lys Ala Asn Phe Gly Gly Glu Gln Ser Gly His Ile Ile Phe Ser Asp Tyr Ala Lys Thr Gly Asp Gly Leu Val Cys Ala Leu Gln Val Ser Ala Leu Val Leu Glu Ser Lys Gln Val Ser Ser Val Ala Leu Asn Pro Phe Glu Leu Tyr Pro Gln Ser Leu Val Asn Leu Asn Val Gln Lys Lys Pro Pro Leu Glu Ser Leu Lys Gly Tyr Ser Ala Leu Leu Lys Glu Leu Asp Lys Leu Glu Ile Arg His Leu Ile Arg Tyr Ser Gly Thr Glu Asn Lys Leu Arg Ile Leu Leu Glu Ala Lys Asp Glu Lys Leu Leu Glu Ser Lys Met Gln Glu Leu Lys Glu Phe Phe Glu Gly His Leu Cys (2) INFORMATION FOR SEQ ID NO:171:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:171:
(2) INFORMATION FOR SEQ ID N0:172:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:172:
(2) INFORMATION FOR SEQ ID N0:173:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:173:
(2) INFORMATION FOR SEQ ID N0:174:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:174:
(2) INFORMATION FOR SEQ ID N0:175:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:175:
(2) INFORMATION FOR SEQ ID N0:176:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:176:
(2) INFORMATION FOR SEQ ID N0:177:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:177:
(2) INFORMATION FOR SEQ ID NO:178:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:178:
(2) INFORMATION FOR SEQ ID NO:179:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:179:
(2) INFORMATION FOR SEQ ID N0:180:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:180:
GCCCTCGAGT CATTTTAAAC GACTCAAAAC AAA 33
(2) INFORMATION FOR SEQ ID NO:181:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:181:
(2) INFORMATION FOR SEQ ID NO:182:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:182:
(2) INFORMATION FOR SEQ ID N0:183:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:183:
(2) INFORMATION FOR SEQ ID N0:184:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:184:
(2) INFORMATION FOR SEQ ID N0:185:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:185:
(2) INFORMATION FOR SEQ ID N0:186:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:186:
(2) INFORMATION FOR SEQ ID NO:187:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:187:
(2) INFORMATION FOR SEQ ID N0:188:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:188:
(2) INFORMATION FOR SEQ ID NO:189:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:189:
(2) INFORMATION FOR SEQ ID N0:190:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:190:
191, GHPO 213, GHPO 240, GHPO 408, GHPO 411, GHPO 419, GHPO 431, GHPO 474, GHPO 591, GHPO 596, GHPO 699, GHPO 724, GHPO 730, GHPO 761, GHPO 804, GHPO 805, GHPO 812, GHPO 879, GHPO 888, GHPO 986, GHPO 1056, GHPO 1081, GHPO 1100, GHPO 1140, GHPO 1148, GHPO 1200, GHPO 1212, GHPO 1258, GHPO 1263, GHPO 1273, GHPO 1284, GHPO 1299, GHPO 1327, GHPO 1346, GHPO 1378, GHPO 1412, GHPO 1443, GHPO 1466, GHPO 1476, GHPO 1536, GHPO 1559, GHPO 427, GHPO 1045, GHPO 1262, GHPO 1688, GHPO 1538, GHPO 346, GHPO 1012, GHPO 470, GHPO 1398, GHPO 1550, GHPO 276, GHPO 1501, GHPO 706, GHPO 1001, GHPO 732, GHPO 329, GHPO 574, GHPO 1190, GHPO 1374, GHPO 1620, GHPO 956, HPO 98, GHPO 689, GHPO 208, GHPO 296, GHPO 726, GHPO 1026, GHPO 1301, GHPO 1536, GHPO 166, GHPO 253, GHPO 297, GHPO 615, GHPO 1278, GHPO 1282, GHPO 1420, GHPO 1484, GHPO 1719, and GHPO 1252.
An isolated polynucleotide of the invention encodes:
(i) a polypeptide having an amino acid sequence that is homologous to a Helicobacter amino acid sequence of a polypeptide, the Helicobacter amino acid sequence being selected from the group consisting of the amino acid sequences shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO 73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO 136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO 474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO 699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO 888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO 1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO 1212), SEQ ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO 1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO 1378), SEQ ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO 1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO 1045), SEQ ID NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID NO:102 (GHPO 1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO 1012), SEQ ID NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID NO:112 (GHPO 1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO 1501), SEQ ID NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 1001), SEQ ID NO:122 (GHPO 732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ ID NO:134 (GHPO 956), SEQ ID NO:136 (HPO 98), SEQ ID NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ ID NO:154 (GHPO 253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO 615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID NO:164 (GHPO 1420), SEQ ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO 1719), and SEQ ID NO:170 (GHPO 1252); or
(ii) a derivative of the polypeptide.
In addition to the full-length polypeptides encoded by the polynucleotides of the invention, as set forth above, polynucleotides included in the invention can also encode polypeptides that lack signal sequences, as well as other polypeptide or peptide fragments of the full-length polypeptides.
The term "isolated polynucleotide" is defined as a polynucleotide that is removed from the environment in which it naturally occurs. For example, a naturally-occurring DNA molecule present in the genome of a living bacterium or as part of a gene bank is not isolated, but the same molecule, separated from the remaining part of the bacterial genome, as a result of, e.g., a cloning event (amplification), is "isolated." Typically, an isolated DNA
molecule is free from DNA regions (e.g., coding regions) with which it is immediately contiguous, at the 5' or 3' ends, in the naturally occurring genome.
Such isolated polynucleotides can be part of a vector or a composition and still be isolated, as such a vector or composition is not part of its natural environment.
A polynucleotide of the invention can consist of RNA or DNA (e.g., cDNA, genomic DNA, or synthetic DNA), or modifications or combinations of RNA or DNA. The polynucleotide can be double-stranded or single-stranded and, if single-stranded, can be the coding (sense) strand or the non-coding (anti-sense) strand. The sequences that encode polypeptides of the invention, as shown in any of SEQ ID NOs:2-170 (even numbers), can be (a) the coding sequence as shown in any of SEQ ID NOs:1-169 (odd numbers); (b) a ribonucleotide sequence derived by transcription of (a); or (c) a different coding sequence that, as a result of the redundancy or degeneracy of the genetic code, encodes the same polypeptides as the polynucleotide molecules having the sequences illustrated in any of SEQ ID NOs:1-169 (odd numbers). The polypeptide can be one that is naturally secreted or excreted by, e.g., H. felis, H. mustelae, H. heilmanii, or H. pylori.
By "polypeptide" or "protein" is meant any chain of amino acids, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation). Both terms are used interchangeably in the present application.
By "homologous amino acid sequence" is meant an amino acid sequence that differs from an amino acid sequence shown in any of SEQ ID
NOs:2-170 (even numbers), or an amino acid sequence encoded by the nucleotide sequence of any of SEQ ID NOs:1-169 (odd numbers), by one or more non-conservative amino acid substitutions, deletions, or additions located at positions at which they do not destroy the specific antigenicity of the polypeptide. Preferably, such a sequence is at least 75%, more preferably at least 80%, and most preferably at least 90% identical to an amino acid sequence shown in any of SEQ ID NOs:2-170 (even numbers).
Homologous amino acid sequences include sequences that are identical or substantially identical to an amino acid sequence as shown in any of SEQ ID NOs:2-170 (even numbers). By "amino acid sequence that is substantially identical" is meant a sequence that is at least 90%, preferably at least 95%, more preferably at least 97%, and most preferably at least 99%
identical to an amino acid sequence of reference and that differs from the sequence of reference, if at all, by a majority of conservative amino acid substitutions.
Conservative amino acid substitutions typically include substitutions among amino acids of the same class. These classes include, for example, amino acids having uncharged polar side chains, such as asparagine, glutamine, serine, threonine, and tyrosine; amino acids having basic side chains, such as lysine, arginine, and histidine; amino acids having acidic side chains, such as aspartic acid and glutamic acid; and amino acids having nonpolar side chains, such as glycine, alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan, and cysteine.
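The class-based definition above can be sketched in a few lines of Python. This is a minimal illustration only: the four classes are copied from the preceding paragraph, while the function names and the use of three-letter residue codes are assumptions made for the example and are not part of the specification.

```python
# Side-chain classes as listed in the paragraph above (three-letter codes).
SIDE_CHAIN_CLASSES = {
    "uncharged polar": {"Asn", "Gln", "Ser", "Thr", "Tyr"},
    "basic": {"Lys", "Arg", "His"},
    "acidic": {"Asp", "Glu"},
    "nonpolar": {"Gly", "Ala", "Val", "Leu", "Ile", "Pro",
                 "Phe", "Met", "Trp", "Cys"},
}


def side_chain_class(residue):
    """Return the class name for a three-letter residue code, or None if unlisted."""
    for name, members in SIDE_CHAIN_CLASSES.items():
        if residue in members:
            return name
    return None


def is_conservative(original, replacement):
    """True when the two residues differ but belong to the same side-chain class."""
    cls = side_chain_class(original)
    return (
        original != replacement
        and cls is not None
        and cls == side_chain_class(replacement)
    )
```

Under this sketch, a Lys to Arg change counts as conservative (both basic), whereas a Lys to Asp change does not.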
Homology can be measured using sequence analysis software (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, WI 53705). Similar amino acid sequences are aligned to obtain the maximum degree of homology (i.e., identity). To this end, it may be necessary to artificially introduce gaps into the sequence. Once the optimal alignment has been set up, the degree of homology (i.e., identity) is established by recording all of the positions in which the amino acids of both sequences are identical, relative to the total number of positions.
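The final counting step can be illustrated as follows. The sketch assumes that the two sequences have already been aligned by other software, that gaps are written as "-", and that "the total number of positions" means the full length of the alignment including gapped columns; each of these is an assumption for the example, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned strings of equal length; gap columns
    count toward the total but never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)
```

For the toy alignment `MKIF-GTD` against `MKIFAGSD`, six of eight columns match, giving 75% identity, which would just meet the lowest preferred threshold stated above for a homologous amino acid sequence.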
Homologous polynucleotide sequences are defined in a similar way.
Preferably, a homologous sequence is one that is at least 45%, more preferably at least 60%, and most preferably at least 85% identical to a coding sequence of any of SEQ ID NOs:1-169 (odd numbers).
Polypeptides having a sequence homologous to any one of the sequences shown in SEQ ID NOs:2-170 (even numbers) include naturally occurring allelic variants, as well as mutants or any other non-naturally occurring variants that are analogous, in terms of antigenicity, to a polypeptide having a sequence as shown in any one of SEQ ID NOs:2-170 (even numbers).
As is known in the art, an allelic variant is an alternate form of a polypeptide that is characterized as having a substitution, deletion, or addition of one or more amino acids that does not alter the biological function of the polypeptide. By "biological function" is meant a function of the polypeptide in the cells in which it naturally occurs, even if the function is not necessary for the growth or survival of the cells. For example, the biological function of a porin is to allow the entry into cells of compounds present in the extracellular medium. The biological function is distinct from the antigenic function. A
polypeptide can have more than one biological function.
Allelic variants are very common in nature. For example, a bacterial species, e.g., H. pylori, is usually represented by a variety of strains that differ from each other by minor allelic variations. Indeed, a polypeptide that fulfills the same biological function in different strains can have an amino acid sequence that is not identical in each of the strains. Such an allelic variation can be equally reflected at the polynucleotide level.
Support for the use of allelic variants of polypeptide antigens comes from, e.g., studies of the Helicobacter urease antigen. The amino acid sequence of Helicobacter urease varies widely from species to species, yet cross-species protection occurs, indicating that the urease molecule, when used as an immunogen, is highly tolerant of amino acid variations. Even among different strains of the single species H. pylori, there are amino acid sequence variations.
For example, although the amino acid sequences of the UreA and UreB subunits of H. pylori and H. felis ureases differ from one another by 26.5% and 11.8%, respectively (Ferrero et al., Molecular Microbiology 9(2):323-333, 1993), it has been shown that H. pylori urease protects mice from H. felis infection (Michetti et al., Gastroenterology 107:1002, 1994). In addition, it has been shown that the individual structural subunits of urease, UreA and UreB, which contain distinct amino acid sequences, are both protective antigens against Helicobacter infection (Michetti et al., supra).
Similarly, Cuenca et al. (Gastroenterology 110:1770, 1996) showed that therapeutic immunization of H. mustelae-infected ferrets with H. pylori urease was effective at eradicating H. mustelae infection. Further, several urease variants have been reported to be effective vaccine antigens, including, e.g., recombinant UreA + UreB apoenzyme expressed from pORV142 (UreA and UreB sequences derived from H. pylori strain CPM630; Lee et al., J. Infect. Dis. 172:161, 1995); recombinant UreA + UreB apoenzyme expressed from pORV214 (UreA and UreB sequences differ from H. pylori strain CPM630 by one and two amino acid changes, respectively; Lee et al., supra, 1995); a UreA-glutathione-S-transferase fusion protein (UreA sequence from H. pylori strain ATCC 43504; Thomas et al., Acta Gastro-Enterologica Belgica 56:54, 1993); UreA + UreB holoenzyme purified from H. pylori strain NCTC11637 (Marchetti et al., Science 267:1655, 1995); a UreA-MBP fusion protein (UreA from H. pylori strain 85P; Ferrero et al., Infection and Immunity 62:4981, 1994); a UreB-MBP fusion protein (UreB from H. pylori strain 85P; Ferrero et al., supra); a UreA-MBP fusion protein (UreA from H. felis strain ATCC 49179; Ferrero et al., supra); a UreB-MBP fusion protein (UreB from H. felis strain ATCC 49179; Ferrero et al., supra); and a 37 kDa fragment of UreB containing amino acids 220-569 (Dore-Davin et al., "A 37 kD fragment of UreB is sufficient to confer protection against Helicobacter felis infection in mice"). Finally, Thomas et al. (supra) showed that oral immunization of mice with crude sonicates of H. pylori protected mice from subsequent challenge with H. felis.
Polynucleotides, e.g., DNA molecules, encoding allelic variants can easily be obtained by polymerase chain reaction (PCR) amplification of genomic bacterial DNA extracted by conventional methods. This involves the use of synthetic oligonucleotide primers matching sequences that are upstream and downstream of the 5' and 3' ends of the coding region. Suitable primers can be designed based on the nucleotide sequence information provided in any of SEQ ID NOs:1-169 (odd numbers). Typically, a primer consists of 10 to 40, preferably 15 to 25 nucleotides. It can also be advantageous to select primers containing C and G nucleotides in proportions sufficient to ensure efficient hybridization, e.g., an amount of C and G nucleotides of at least 40%, preferably 50%, of the total nucleotide amount. Those skilled in the art can readily design primers that can be used to isolate the polynucleotides of the invention from different Helicobacter strains. Experimental conditions for carrying out PCR can readily be determined by one skilled in the art and an illustration of carrying out PCR is provided in the Examples below. As is well known in the art, restriction endonuclease recognition sites that contain, typically, 4 to 6 nucleotides (for example, the sequences 5'-GGATCC-3' (BamHI) or 5'-CTCGAG-3' (XhoI)), can be included on the 5' ends of the primers. Restriction sites can be selected by those skilled in the art so that the amplified DNA can be conveniently cloned into an appropriately digested vector, such as a plasmid.
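The primer guidelines above (length of 10 to 40 nucleotides, C and G content of at least 40%, and an optional restriction site) lend themselves to a simple check. The following is a sketch under stated assumptions: the function names are hypothetical, only the two restriction sites named in the text are looked for, and the site is searched anywhere in the primer rather than strictly at its 5' end.

```python
# Recognition sequences named in the paragraph above.
RESTRICTION_SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}


def gc_percent(primer):
    """Percentage of C and G nucleotides in the primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def check_primer(primer, min_len=10, max_len=40, min_gc=40.0):
    """Report whether a primer meets the length and G+C guidelines,
    and which of the known restriction sites it contains."""
    primer = primer.upper()
    return {
        "length_ok": min_len <= len(primer) <= max_len,
        "gc_ok": gc_percent(primer) >= min_gc,
        "sites": [
            name for name, site in RESTRICTION_SITES.items() if site in primer
        ],
    }
```

For instance, the made-up 21-mer `GGATCCATGAAAGCGCTGACC` (not a sequence of the invention) passes both checks and is reported as carrying a BamHI site.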
Useful homologs that do not occur naturally can be designed using known methods for identifying regions of an antigen that are likely to be tolerant of amino acid sequence changes and/or deletions. For example, sequences of the antigen from different species can be compared to identify conserved sequences.
Polypeptide derivatives that are encoded by polynucleotides of the invention include, e.g., fragments, polypeptides having large internal deletions derived from full-length polypeptides, and fusion proteins. Polypeptide fragments of the invention can be derived from a polypeptide having a sequence homologous to any of the sequences of SEQ ID NOs:2-170 (even numbers), to the extent that the fragments retain the substantial antigenicity of the parent polypeptide (specific antigenicity). Polypeptide derivatives can also be constructed by large internal deletions that remove a substantial part of the parent polypeptide, while retaining specific antigenicity. Generally, polypeptide derivatives should be at least about 12 amino acids in length to maintain antigenicity. Advantageously, they can be at least 20 amino acids, preferably at least 50 amino acids, more preferably at least 75 amino acids, and most preferably at least 100 amino acids in length.
Useful polypeptide derivatives, e.g., polypeptide fragments, can be designed using computer-assisted analysis of amino acid sequences in order to identify sites in protein antigens having potential as surface-exposed, antigenic regions (Hughes et al., Infect. Immun. 60(9):3497, 1992). For example, the Laser Gene Program from DNA Star can be used to obtain hydrophilicity, antigenic index, and intensity index plots for the polypeptides of the invention. This program can also be used to obtain information about homologies of the polypeptides with known protein motifs. One skilled in the art can readily use the information provided in such plots to select peptide fragments for use as vaccine antigens. For example, fragments spanning regions of the plots in which the antigenic index is relatively high can be selected. One can also select fragments spanning regions in which both the antigenic index and the intensity plots are relatively high. Fragments containing conserved sequences, particularly hydrophilic conserved sequences, can also be selected.
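As a rough illustration of how a hydrophilicity plot can point to candidate fragments, the sketch below scans a sequence with a sliding window. It does not reproduce the Laser Gene antigenic index or intensity index; it substitutes the Kyte-Doolittle hydropathy scale, on which more negative values indicate more hydrophilic residues. The window size, function names, and one-letter input format are assumptions made for the example.

```python
# Kyte-Doolittle hydropathy values (one-letter codes); negative = hydrophilic.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def window_scores(sequence, window=7):
    """Mean hydropathy of every window of the given size along the sequence."""
    return [
        sum(KYTE_DOOLITTLE[aa] for aa in sequence[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, fragment) of the window with the lowest mean hydropathy."""
    scores = window_scores(sequence, window)
    start = min(range(len(scores)), key=scores.__getitem__)
    return start, sequence[start:start + window]
```

In practice such a scan would be only a first filter; the text above contemplates combining hydrophilicity with antigenic index, intensity index, and sequence conservation before a fragment is chosen.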
Polypeptide fragments and polypeptides having large internal deletions can be used for revealing epitopes that are otherwise masked in the parent polypeptide and that may be of importance for inducing a protective T
cell-dependent immune response. Deletions can also remove immunodominant regions of high variability among strains.
It is an accepted practice in the field of immunology to use fragments and variants of protein immunogens as vaccines, as all that is required to induce an immune response to a protein is a small (e.g., 8 to 10 amino acids) immunogenic region of the protein. This has been done for a number of vaccines against pathogens other than Helicobacter. For example, short synthetic peptides corresponding to surface-exposed antigens of pathogens such as murine mammary tumor virus (peptide containing 11 amino acids; Dion et al., Virology 179:474-477, 1990), Semliki Forest virus (peptide containing 16 amino acids; Snijders et al., J. Gen. Virol. 72:557-565, 1991), and canine parvovirus (2 overlapping peptides, each containing 15 amino acids; Langeveld et al., Vaccine 12(15):1473-1480, 1994) have been shown to be effective vaccine antigens against their respective pathogens.
Polynucleotides encoding polypeptide fragments and polypeptides having large internal deletions can be constructed using standard methods (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley &
Sons Inc., 1994), for example, by PCR, including inverse PCR, by restriction enzyme treatment of the cloned DNA molecules, or by the method of Kunkel et al. (Proc. Natl. Acad. Sci. USA 82:448, 1985; biological material available at Stratagene).
A polypeptide derivative can also be produced as a fusion polypeptide that contains a polypeptide or a polypeptide derivative of the invention fused, e.g., at the N- or C-terminal end, to any other polypeptide (hereinafter referred to as a peptide tail). Such a product can be easily obtained by translation of a genetic fusion, i.e., a hybrid gene. Vectors for expressing fusion polypeptides are commercially available, and include the pMal-c2 or pMal-p2 systems of New England Biolabs, in which the peptide tail is a maltose binding protein, the glutathione-S-transferase system of Pharmacia, or the His-Tag system available from Novagen. These and other expression systems provide convenient means for further purification of polypeptides and derivatives of the invention.
Another particular example of fusion polypeptides included in the invention includes a polypeptide or polypeptide derivative of the invention fused to a polypeptide having adjuvant activity, such as, e.g., subunit B of either cholera toxin or E. coli heat-labile toxin. Several possibilities can be used for producing such fusion proteins. First, the polypeptide of the invention can be fused to the N-terminal end or, preferably, to the C-terminal end of the polypeptide having adjuvant activity. Second, a polypeptide fragment of the invention can be fused within the amino acid sequence of the polypeptide having adjuvant activity.
Spacer sequences can also be included, if desired.
As stated above, the polynucleotides of the invention encode Helicobacter polypeptides in precursor or mature form. They can also encode hybrid precursors containing heterologous signal peptides, which can mature into polypeptides of the invention. By "heterologous signal peptide" is meant a signal peptide that is not found in the naturally-occurring precursor of a polypeptide of the invention.
A polynucleotide of the invention hybridizes, preferably under stringent conditions, to a polynucleotide having a sequence as shown in any of SEQ ID NOs:1-169 (odd numbers). Hybridization procedures are, e.g., described by Ausubel et al. (supra); Silhavy et al. (Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1984); and Davis et al. (A Manual for Genetic Engineering. Advanced Bacterial Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1980). Important parameters that can be considered for optimizing hybridization conditions are reflected in the following formula, which facilitates calculation of the melting temperature (Tm), which is the temperature above which two complementary DNA strands separate from one another (Casey et al., Nucl. Acid Res. 4:1539, 1977): Tm = 81.5 + 0.5 x (% G+C) + 1.6 log (positive ion concentration) - 0.6 x (% formamide). Under appropriate stringency conditions, hybridization temperature (Th) is approximately 20 to 40°C, 20 to 25°C, or, preferably, 30 to 40°C below the calculated Tm. Those skilled in the art will understand that optimal temperature and salt conditions can be readily determined empirically in preliminary experiments using conventional procedures. For example, stringent conditions can be achieved, both for pre-hybridizing and hybridizing incubations, (i) within 4-16 hours at 42°C, in 6 x SSC containing 50% formamide or (ii) within 4-16 hours at 6~~°C in an aqueous 6 x SSC solution (1 M NaCl, 0.1 M sodium citrate (pH 7.0)). For polynucleotides containing 30 to 600 nucleotides, the above formula is used and then is corrected by subtracting (600/polynucleotide size in base pairs). Stringency conditions are defined by a Th that is 5 to 10°C below Tm.
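The melting temperature calculation described above can be written out as a short function. This is a sketch only: it reads the coefficients of the formula as 0.5, 1.6, and 0.6, assumes a base-10 logarithm and a positive ion concentration expressed in mol/L (neither is stated explicitly in the text), and uses hypothetical function and parameter names.

```python
import math


def melting_temperature(gc_percent, cation_molar, formamide_percent=0.0,
                        length_bp=None):
    """Tm (in °C) from the formula in the text.

    Tm = 81.5 + 0.5 * (%G+C) + 1.6 * log10([positive ion]) - 0.6 * (%formamide),
    with 600 / length subtracted for polynucleotides of 30 to 600 nucleotides.
    """
    tm = (81.5 + 0.5 * gc_percent
          + 1.6 * math.log10(cation_molar)
          - 0.6 * formamide_percent)
    if length_bp is not None and 30 <= length_bp <= 600:
        tm -= 600.0 / length_bp
    return tm


def stringent_th_range(tm, low_offset=5.0, high_offset=10.0):
    """Hybridization temperature window lying 5 to 10 °C below Tm."""
    return tm - high_offset, tm - low_offset
```

As a worked case, a 100-nucleotide probe with 50% G+C at a 1 M positive ion concentration and no formamide gives 81.5 + 25 + 0 - 0 - 6 = 100.5°C under these assumptions, so a Th of about 90.5 to 95.5°C would fall within the 5 to 10°C window.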
Hybridization conditions with oligonucleotides shorter than 20-30 bases do not precisely follow the rules set forth above. In such cases, the formula for calculating the Tm is as follows: Tm = 4 x (G+C) + 2 (A+T). For example, an 18 nucleotide fragment of 50% G+C would have an approximate Tm of 54°C.
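The short-oligonucleotide rule can be checked directly against the worked example in the text. The function name and the test sequence below are hypothetical; only the formula itself comes from the paragraph above.

```python
def short_oligo_tm(sequence):
    """Approximate Tm (°C) for a short oligonucleotide: 4 x (G+C) + 2 x (A+T)."""
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    at = sequence.count("A") + sequence.count("T")
    return 4 * gc + 2 * at
```

An 18-mer with nine G or C and nine A or T nucleotides gives 4 x 9 + 2 x 9 = 54, in agreement with the 54°C stated above.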
A polynucleotide molecule of the invention, containing RNA, DNA, or modifications or combinations thereof, can have various applications. For example, a polynucleotide molecule can be used (i) in a process for producing the encoded polypeptide in a recombinant host system, (ii) in the construction of vaccine vectors, such as poxviruses, which are further used in methods and compositions for preventing and/or treating Helicobacter infection, (iii) as a vaccine agent, in a naked form or formulated with a delivery vehicle, and (iv) in the construction of attenuated Helicobacter strains that can over-express a polynucleotide of the invention or express it in a non-toxic, mutated form.
According to a second aspect of the invention, there is therefore provided (i) an expression cassette containing a polynucleotide molecule of the invention placed under the control of elements (e.g., a promoter) required for expression; (ii) an expression vector containing an expression cassette of the invention; (iii) a procaryotic or eucaryotic cell transformed or transfected with an expression cassette and/or vector of the invention, as well as (iv) a process for producing a polypeptide or polypeptide derivative encoded by a polynucleotide of the invention, which involves culturing a procaryotic or eucaryotic cell transformed or transfected with an expression cassette and/or vector of the invention, under conditions that allow expression of the polynucleotide molecule of the invention, and recovering the encoded polypeptide or polypeptide derivative from the cell culture.
A recombinant expression system can be selected from procaryotic and eucaryotic hosts. Eucaryotic hosts include, for example, yeast cells (e.g., Saccharomyces cerevisiae or Pichia pastoris), mammalian cells (e.g., COS 1, NIH3T3, or JEG3 cells), arthropod cells (e.g., Spodoptera frugiperda (SF9) cells), and plant cells. Preferably, a procaryotic host such as E. coli is used.
Bacterial and eucaryotic cells are available from a number of different sources that are known to those skilled in the art, e.g., the American Type Culture Collection (ATCC; Rockville, Maryland).
The choice of the expression cassette will depend on the host system selected, as well as the features desired for the expressed polypeptide. For example, it may be useful to produce a polypeptide of the invention in a particular lipidated form or any other form. Typically, an expression cassette includes a constitutive or inducible promoter that is functional in the selected host system; a ribosome binding site; a start codon (ATG); if necessary, a region encoding a signal peptide, e.g., a lipidation signal peptide; a polynucleotide molecule of the invention; a stop codon; and, optionally, a 3' terminal region (translation and/or transcription terminator). The signal peptide-encoding region is adjacent to the polynucleotide of the invention and is placed in the proper reading frame. The signal peptide-encoding region can be homologous or heterologous to the polynucleotide molecule encoding the mature polypeptide and it can be specific to the secretion apparatus of the host used for expression. The open reading frame constituted by the polynucleotide molecule of the invention, alone or together with the signal peptide, is placed under the control of the promoter so that transcription and translation occur in the host system. Promoters and signal peptide-encoding regions are widely known and available to those skilled in the art and include, for example, the promoter of Salmonella typhimurium (and derivatives) that is inducible by arabinose (promoter araB) and is functional in Gram-negative bacteria such as E. coli (U.S. Patent No. 5,028,530; Cagnon et al., Protein Engineering 4(7): 843, 1991); the promoter of the bacteriophage T7 RNA polymerase gene, which is functional in a number of E. coli strains expressing T7 polymerase (U.S. Patent No. 4,952,496); the OspA lipidation signal peptide; and the RlpB lipidation signal peptide (Takase et al., J. Bact. 169:4692, 1987).
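The reading-frame requirement described above (start codon, signal peptide-encoding region in frame with the polynucleotide, terminal stop codon) can be illustrated with a minimal Python sketch. The function name and the toy sequences are hypothetical and chosen only for illustration; they do not correspond to any sequence of the invention, and the check covers only the standard stop codons TAA, TAG, and TGA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_open_reading_frame(signal_peptide_region, coding_region):
    """Return a list of problems found in the joined open reading frame.

    signal_peptide_region: DNA expected to begin with the ATG start codon
                           (pass "" when no signal peptide is used and the
                           coding region itself begins with ATG).
    coding_region:         DNA expected to end with a stop codon.
    An empty list means the simple checks below all passed.
    """
    orf = (signal_peptide_region + coding_region).upper()
    problems = []
    if not orf.startswith("ATG"):
        problems.append("missing ATG start codon")
    if len(orf) % 3 != 0:
        problems.append("length is not a multiple of 3 (frame shift)")
    # Split into complete codons only, ignoring any trailing partial codon.
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("missing terminal stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("premature stop codon")
    return problems


if __name__ == "__main__":
    # Hypothetical toy fragments, in frame: ATG AAA | GCT GGT TAA
    print(check_open_reading_frame("ATGAAA", "GCTGGTTAA"))
    # One extra base in the coding region shifts the frame.
    print(check_open_reading_frame("ATGAAA", "GGCTGGTTAA"))
```

A check of this kind is a convenience only; it does not replace sequencing of the assembled construct.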
The expression cassette is typically part of an expression vector, which is selected for its ability to replicate in the chosen expression system.
Expression vectors (e.g. plasmids or viral vectors) can be chosen from, for example, those described in Pouwels et al. (Cloning Vectors: A Laboratory Manual, 1985, Supp. 1987), and can be purchased from various commercial sources. Methods for transforming or transfecting host cells with expression vectors are well known in the art and will depend on the host system selected, as described in Ausubel et al. (supra).
Upon expression, a recombinant polypeptide of the invention (or a polypeptide derivative) is produced and remains in the intracellular compartment, is secreted/excreted in the extracellular medium or in the periplasmic space, or is embedded in the cellular membrane. The polypeptide can then be recovered in a substantially purified form from the cell extract or from the supernatant after centrifugation of the cell culture. Typically, the recombinant polypeptide can be purified by antibody-based affinity purification or by any other method known in the art, such as by genetic fusion to a small affinity-binding domain. Antibody-based affinity purification methods are also available for purifying a polypeptide of the invention extracted from a Helicobacter strain. Antibodies useful for immunoaffinity purification of the polypeptides of the invention can be obtained using methods described below.
Polynucleotides of the invention can also be used in DNA
vaccination methods, using either a viral or bacterial host as gene delivery vehicle (live vaccine vector) or administering the gene in a free form, e.g., inserted into a plasmid. Therapeutic or prophylactic efficacy of a polynucleotide of the invention can be evaluated as is described below.
Accordingly, in a third aspect of the invention, there is provided (i) a vaccine vector such as a poxvirus, containing a polynucleotide molecule of the invention placed under the control of elements required for expression; (ii) a composition of matter containing a vaccine vector of the invention, together with a diluent or carrier; (iii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a vaccine vector of the invention; (iv) a method for inducing an immune response against Helicobacter in a mammal (e.g., a human; alternatively, the method can be used in veterinary applications for treating or preventing Helicobacter infection of animals, e.g., cats or birds), which involves administering to the mammal an immunogenically effective amount of a vaccine vector of the invention to elicit an immune response, e.g., a protective or therapeutic immune response to Helicobacter; and (v) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, which involves administering a prophylactic or therapeutic amount of a vaccine vector of the invention to an individual in need. Additionally, the third aspect of the invention encompasses the use of a vaccine vector of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection.
A vaccine vector of the invention can express one or several polypeptides or derivatives of the invention, as well as at least one additional Helicobacter antigen such as a urease apoenzyme or a subunit, fragment, homolog, mutant, or derivative thereof. In addition, it can express a cytokine, such as interleukin-2 (IL-2) or interleukin-12 (IL-12), that enhances the immune response. Thus, a vaccine vector can include an additional polynucleotide molecule encoding, e.g., urease subunit A, B, or both, or a cytokine, placed under the control of elements required for expression in a mammalian cell.
Alternatively, a composition of the invention can include several vaccine vectors, each of which is capable of expressing a polypeptide or derivative of the invention. A composition can also contain a vaccine vector capable of expressing an additional Helicobacter antigen, such as urease apoenzyme, a subunit, fragment, homolog, mutant, or derivative thereof, or a cytokine such as IL-2 or IL-12.
In vaccination methods for treating or preventing infection in a mammal, a vaccine vector of the invention can be administered by any conventional route in use in the vaccine field, for example, to a mucosal (e.g., ocular, intranasal, oral, gastric, pulmonary, intestinal, rectal, vaginal, or urinary tract) surface or via a parenteral (e.g., subcutaneous, intradermal, intramuscular, intravenous, or intraperitoneal) route, or a combination thereof.
Preferred routes depend upon the choice of the vaccine vector. The administration can be achieved in a single dose or repeated at intervals. The appropriate dosage depends on various parameters that are understood by those skilled in the art, such as the nature of the vaccine vector itself, the route of administration, and the condition of the mammal to be vaccinated (e.g., the weight, age, and general health of the mammal).
Live vaccine vectors that can be used in the invention include viral vectors, such as adenoviruses and poxviruses, as well as bacterial vectors, e.g., Shigella, Salmonella, Vibrio cholerae, Lactobacillus, Bacille bilié de Calmette-Guérin (BCG), and Streptococcus. An example of an adenovirus vector, as well as a method for constructing an adenovirus vector capable of expressing a polynucleotide molecule of the invention, is described in U.S. Patent No. 4,920,209. Poxvirus vectors that can be used in the invention include, e.g., vaccinia and canary pox viruses, which are described in U.S. Patent No. 4,722,848 and U.S. Patent No. 5,364,773, respectively (also see, e.g., Tartaglia et al., Virology 188:217, 1992, for a description of a vaccinia virus vector, and Taylor et al., Vaccine 13:539, 1995, for a description of a canary poxvirus vector). Poxvirus vectors capable of expressing a polynucleotide of the invention can be obtained by homologous recombination, as described in Kieny et al. (Nature 312:163, 1984), so that the polynucleotide of the invention is inserted into the viral genome under appropriate conditions for expression in mammalian cells. Generally, the dose of viral vector vaccine, for therapeutic or prophylactic use, can be from about 1 x 10^4 to about 1 x 10^11, advantageously from about 1 x 10^7 to about 1 x 10^10, or, preferably, from about 1 x 10^7 to about 1 x 10^9 plaque-forming units per kilogram. Preferably, viral vectors are administered parenterally, for example, in 3 doses that are 4 weeks apart.
Those skilled in the art will recognize that it is preferable to avoid adding a chemical adjuvant to a composition containing a viral vector of the invention, thereby minimizing the immune response to the viral vector itself.
Non-toxicogenic Vibrio cholerae mutant strains that can be used in live oral vaccines are described by Mekalanos et al. (Nature 306:551, 1983) and in U.S. Patent No. 4,882,278 (strain in which a substantial amount of the coding sequence of each of the two ctxA alleles has been deleted so that no functional cholera toxin is produced); WO 92/11354 (strain in which the irgA locus is inactivated by mutation; this mutation can be combined in a single strain with ctxA mutations); and WO 94/1533 (deletion mutant lacking functional ctxA and attRS1 DNA sequences). These strains can be genetically engineered to express heterologous antigens, as described in WO 94/19482.
An effective vaccine dose of a V. cholerae strain capable of expressing a polypeptide or polypeptide derivative encoded by a polynucleotide molecule of the invention can contain, e.g., about 1 x 10^5 to about 1 x 10^9, preferably about 1 x 10^6 to about 1 x 10^8, viable bacteria in an appropriate volume for the selected route of administration. Preferred routes of administration include all mucosal routes, but, most preferably, these vectors are administered intranasally or orally.
Attenuated Salmonella typhimurium strains, genetically engineered for recombinant expression of heterologous antigens, and their use as oral vaccines, are described by Nakayama et al. (Bio/Technology 6:693, 1988) and in WO 92/11361. Preferred routes of administration for these vectors include all mucosal routes. Most preferably, the vectors are administered intranasally or orally.
Other bacterial strains useful as vaccine vectors are described by High et al. (EMBO 11:1991, 1992) and Sizemore et al. (Science 270:299, 1995; Shigella flexneri); Medaglini et al. (Proc. Natl. Acad. Sci. USA 92:6868, 1995; Streptococcus gordonii); Flynn (Cell. Mol. Biol. 40 (suppl. I):31, 1994), and in WO 88/6626, WO 90/0594, WO 91/13157, WO 92/1796, and WO 92/21376 (Bacille Calmette Guerin). In bacterial vectors, a polynucleotide of the invention can be inserted into the bacterial genome or it can remain in a free state, for example, carried on a plasmid.
An adjuvant can also be added to a composition containing a bacterial vector vaccine. A number of adjuvants that can be used are known to those skilled in the art. For example, preferred adjuvants can be selected from the list provided below.
According to a fourth aspect of the invention, there is also provided (i) a composition of matter containing a polynucleotide of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a polynucleotide of the invention; (iii) a method for inducing an immune response against Helicobacter in a mammal by administering to the mammal an immunogenically effective amount of a polynucleotide of the invention to elicit an immune response, e.g., a protective immune response to Helicobacter; and (iv) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, by administering a prophylactic or therapeutic amount of a polynucleotide of the invention to an individual in need of such treatment. Additionally, the fourth aspect of the invention encompasses the use of a polynucleotide of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection. The fourth aspect of the invention preferably includes the use of a polynucleotide molecule placed under conditions for expression in a mammalian cell, e.g., in a plasmid that is unable to replicate in mammalian cells and to substantially integrate into a mammalian genome.
Polynucleotides (for example, DNA or RNA molecules) of the invention can also be administered as such to a mammal as a vaccine. When a DNA molecule of the invention is used, it can be in the form of a plasmid that is unable to replicate in a mammalian cell and unable to integrate into the mammalian genome. Typically, a DNA molecule is placed under the control of a promoter suitable for expression in a mammalian cell. The promoter can function ubiquitously or tissue-specifically. Examples of non-tissue specific promoters include the early Cytomegalovirus (CMV) promoter (U.S. Patent No. 4,168,062) and the Rous Sarcoma Virus promoter (Norton et al., Molec. Cell Biol. 5:281, 1985). The desmin promoter (Li et al., Gene 78:243, 1989; Li et al., J. Biol. Chem. 266:6562, 1991; Li et al., J. Biol. Chem. 268:10403, 1993) is tissue-specific and drives expression in muscle cells. More generally, useful promoters and vectors are described, e.g., in WO 94/21797 and by Hartikka et al. (Human Gene Therapy 7:1205, 1996).
For DNA/RNA vaccination, the polynucleotide of the invention can encode a precursor or a mature form of a polypeptide of the invention. When it encodes a precursor form, the precursor sequence can be homologous or heterologous. In the latter case, a eucaryotic leader sequence can be used, such as the leader sequence of the tissue-type plasminogen activator (tPA).
A composition of the invention can contain one or several polynucleotides of the invention. It can also contain at least one additional polynucleotide encoding another Helicobacter antigen, such as urease subunit A, B, or both, or a fragment, derivative, mutant, or analog thereof. A
polynucleotide encoding a cytokine, such as interleukin-2 (IL-2) or interleukin-12 (IL-12), can also be added to the composition so that the immune response is enhanced. These additional polynucleotides are placed under appropriate control for expression. Advantageously, DNA molecules of the invention and/or additional DNA molecules to be included in the same composition are carried in the same plasmid.
Standard methods can be used in the preparation of therapeutic polynucleotides of the invention. For example, a polynucleotide can be used in a naked form, free of any delivery vehicles, such as anionic liposomes, cationic lipids, microparticles, e.g., gold microparticles, precipitating agents, e.g., calcium phosphate, or any other transfection-facilitating agent. In this case, the polynucleotide can be simply diluted in a physiologically acceptable solution, such as sterile saline or sterile buffered saline, with or without a carrier.
When present, the carrier preferably is isotonic, hypotonic, or weakly hypertonic, and has a relatively low ionic strength, such as provided by a sucrose solution, e.g., a solution containing 20% sucrose.
Alternatively, a polynucleotide can be associated with agents that assist in cellular uptake. It can be, e.g., (i) complemented with a chemical agent that modifies cellular permeability, such as bupivacaine (see, e.g., 7), (ii) encapsulated into liposomes, or (iii) associated with cationic lipids or silica, gold, or tungsten microparticles.
Anionic and neutral liposomes are well-known in the art (see, e.g., Liposomes: A Practical Approach, RPC New Ed, IRL Press, 1990, for a detailed description of methods for making liposomes) and are useful for delivering a large range of products, including polynucleotides.
Cationic lipids can also be used for gene delivery. Such lipids include, for example, Lipofectin™, which is also known as DOTMA (N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride), DOTAP (1,2-bis(oleyloxy)-3-(trimethylammonio)propane), DDAB (dimethyldioctadecylammonium bromide), DOGS (dioctadecylamidoglycyl spermine), and cholesterol derivatives. A description of these cationic lipids can be found in EP 187,702, WO 90/11092, U.S. Patent No. 5,283,185, WO 91/15501, WO 95/26356, and U.S. Patent No. 5,527,928. Cationic lipids for gene delivery are preferably used in association with a neutral lipid, such as DOPE (dioleyl phosphatidylethanolamine; WO 90/11092). Other transfection-facilitating compounds can be added to a formulation containing cationic liposomes. A number of them are described in, e.g., WO 93/18759, WO 93/19768, WO 94/25608, and WO 95/2397. They include, e.g., spermine derivatives useful for facilitating the transport of DNA through the nuclear membrane (see, for example, WO 93/18759) and membrane-permeabilizing compounds such as GALA, Gramicidine S, and cationic bile salts (see, for example, WO 93/19768).
Gold or tungsten microparticles can also be used for gene delivery, as described in WO 91/359, WO 93/17706, and by Tang et al. (Nature 356:152, 1992). In this case, the microparticle-coated polynucleotides can be injected via intradermal or intraepidermal routes using a needleless injection device ("gene gun"), such as those described in U.S. Patent No. 4,945,050, U.S. Patent No. 5,015,580, and WO 94/24263.
The amount of DNA to be used in a vaccine depends, e.g., on the strength of the promoter used in the DNA construct, the immunogenicity of the expressed gene product, the condition of the mammal intended for administration (e.g., the weight, age, and general health of the mammal), the mode of administration, and the type of formulation. In general, a therapeutically or prophylactically effective dose from about 1 µg to about 1 mg, preferably, from about 10 µg to about 800 µg, and, more preferably, from about 25 µg to about 250 µg, can be administered to a human adult. The administration can be achieved in a single dose or repeated at intervals.
The route of administration can be any conventional route used in the vaccine field. As general guidance, a polynucleotide of the invention can be administered via a mucosal surface, e.g., an ocular, intranasal, pulmonary, oral, intestinal, rectal, vaginal, or urinary tract surface, or via a parenteral route, e.g., by an intravenous, subcutaneous, intraperitoneal, intradermal, intraepidermal, or intramuscular route. The choice of administration route will depend on, e.g., the formulation that is selected. A polynucleotide formulated in association with bupivacaine is advantageously administered into muscle. When a neutral or anionic liposome or a cationic lipid, such as DOTMA, is used, the formulation can be advantageously administered via intravenous, intranasal (for example, by aerosolization), intramuscular, intradermal, and subcutaneous routes. A polynucleotide in a naked form can advantageously be administered via the intramuscular, intradermal, or subcutaneous routes. Although not absolutely required, such a composition can also contain an adjuvant. A
systemic adjuvant that does not require concomitant administration in order to exhibit an adjuvant effect is preferable.
The sequence information provided in the present application enables the design of specific nucleotide probes and primers that can be used in diagnostic methods. Accordingly, in a fifth aspect of the invention, there is provided a nucleotide probe or primer having a sequence found in, or derived by degeneracy of the genetic code from, a sequence shown in any of SEQ ID NOs:1-169 (odd numbers).
The term "probe" as used in the present application refers to a DNA (preferably single stranded) or RNA molecule (or modifications or combinations thereof) that hybridizes under the stringent conditions, as defined above, to a polynucleotide molecule having a sequence homologous to any of those shown in SEQ ID NOs:1-169 (odd numbers), or to a complementary or anti-sense sequence of any of those shown in SEQ ID NOs:1-169 (odd numbers). Generally, probes are significantly shorter than the full-length sequences shown in SEQ ID NOs:1-169 (odd numbers). For example, they can contain from about 5 to about 100, preferably from about 10 to about 80, nucleotides. In particular, probes have sequences that are at least 75%, preferably at least 85%, more preferably 95% homologous to a portion of a sequence as shown in any of SEQ ID NOs:1-169 (odd numbers) or a sequence complementary to any of such sequences.
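As an illustration of the percentage thresholds mentioned above, a candidate probe can be compared with every same-length portion of a target sequence. The following Python sketch makes a simplifying assumption that is not stated in the text: "homologous" is approximated here by ungapped percent identity over the full probe length. The function name and the toy sequences are hypothetical.

```python
def best_ungapped_identity(probe, target):
    """Return the highest percent identity between the probe and any
    same-length window of the target (no gaps allowed).

    Assumption: ungapped identity stands in for "homology"; other
    alignment-based measures could give different values.
    """
    probe, target = probe.upper(), target.upper()
    if not probe or len(probe) > len(target):
        raise ValueError("probe must be non-empty and no longer than target")
    best = 0.0
    for start in range(len(target) - len(probe) + 1):
        window = target[start:start + len(probe)]
        matches = sum(p == t for p, t in zip(probe, window))
        best = max(best, 100.0 * matches / len(probe))
    return best


if __name__ == "__main__":
    target = "TTACGTTT"  # hypothetical target fragment
    for probe in ("ACGT", "ACGA"):
        identity = best_ungapped_identity(probe, target)
        print(probe, identity, identity >= 75.0)
```

For real probe design, a gapped alignment tool would normally be used in place of this window scan.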
Probes can contain modified bases, such as inosine, methyl-5-deoxycytidine, deoxyuridine, dimethylamino-5-deoxyuridine, or diamino-2,6-purine. Sugar or phosphate residues can also be modified or substituted. For example, a deoxyribose residue can be replaced by a polyamide (Nielsen et al., Science 254:1497, 1991) and phosphate residues can be replaced by ester groups, such as diphosphate, alkyl, arylphosphonate, and phosphorothioate esters. In addition, the 2'-hydroxyl group on ribonucleotides can be modified by addition of, e.g., alkyl groups.
Probes of the invention can be used in diagnostic tests or as capture or detection probes. Such capture probes can be immobilized on solid supports, directly or indirectly, by covalent means or by passive adsorption. A
detection probe can be labeled by a detectable label, for example, a label selected from radioactive isotopes; enzymes, such as peroxidase and alkaline phosphatase;
enzymes that are able to hydrolyze a chromogenic, fluorogenic, or luminescent substrate; compounds that are chromogenic, fluorogenic, or luminescent;
nucleotide base analogs; and biotin.
Probes of the invention can be used in any conventional hybridization method, such as in dot blot methods (Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1982), Southern blot methods (Southern, J. Mol.
Biol. 98:503, 1975), northern blot methods (identical to Southern blot with the exception that RNA is used as a target), or a sandwich method (Dunn et al., Cell 12:23, 1977). As is known in the art, the latter technique involves the use of a specific capture probe and a specific detection probe that have nucleotide sequences that are at least partially different from each other.
Primers used in the invention usually contain about 10 to 40 nucleotides and are used to initiate enzymatic polymerization of DNA in an amplification process (e.g., PCR), an elongation process, or a reverse transcription method. In a diagnostic method involving PCR, the primers can be labeled.
Thus, the invention also encompasses (i) a reagent containing a probe of the invention for detecting and/or identifying the presence of Helicobacter in a biological material; (ii) a method for detecting and/or identifying the presence of Helicobacter in a biological material, in which (a) a sample is recovered or derived from the biological material, (b) DNA or RNA is extracted from the material and denatured, and (c) the sample is exposed to a probe of the invention, for example, a capture probe, a detection probe, or both, under stringent hybridization conditions, so that hybridization is detected; and (iii) a method for detecting and/or identifying the presence of Helicobacter in a biological material, in which (a) a sample is recovered or derived from the biological material, (b) DNA is extracted therefrom, (c) the extracted DNA is contacted with at least one, or, preferably two, primers of the invention, and amplified by the polymerase chain reaction, and (d) an amplified DNA
molecule is produced.
As mentioned above, polypeptides that can be produced by expression of the polynucleotides of the invention can be used as vaccine antigens. Accordingly, a sixth aspect of the invention features a substantially purified polypeptide or polypeptide derivative having an amino acid sequence encoded by a polynucleotide of the invention.
A "substantially purified polypeptide" is defined as a polypeptide that is separated from the environment in which it naturally occurs and/or a polypeptide that is free of most of the other polypeptides that are present in the environment in which it was synthesized. The polypeptides of the invention can be purified from a natural source, such as a Helicobacter strain, or can be produced using recombinant methods.
Homologous polypeptides or polypeptide derivatives encoded by polynucleotides of the invention can be screened for specific antigenicity by testing cross-reactivity with an antiserum raised against a polypeptide having an amino acid sequence as shown in any of SEQ ID NOs:2-170 (even numbers). Briefly, a monospecific hyperimmune antiserum can be raised against a purified reference polypeptide as such or as a fusion polypeptide, for example, an expression product of MBP, GST, or His-tag systems, or a synthetic peptide predicted to be antigenic. The homologous polypeptide or derivative that is screened for specific antigenicity can be produced as such or as a fusion polypeptide. In the latter case, and if the antiserum is also raised against a fusion polypeptide, two different fusion systems are employed.
Specific antigenicity can be determined using a number of methods, including Western blot (Towbin et al., Proc. Natl. Acad. Sci. USA 76:4350, 1979), dot blot, and ELISA methods, as described below.
In a Western blot assay, the product to be screened, either as a purified preparation or a total E. coli extract, is fractionated by SDS-PAGE, as described, for example, by Laemmli (Nature 227:680, 1970). After being transferred to a filter, such as a nitrocellulose membrane, the material is incubated with the monospecific hyperimmune antiserum, which is diluted in a range of dilutions from about 1:50 to about 1:5,000, preferably from about 1:100 to about 1:500. Specific antigenicity is shown once a band corresponding to the product exhibits reactivity at any of the dilutions in the range.
In an ELISA assay, the product to be screened can be used as the coating antigen. A purified preparation is preferred, but a whole cell extract can also be used. Briefly, about 100 µl of a preparation of about 10 µg protein/ml is distributed into wells of a 96-well ELISA plate. The plate is incubated for about 2 hours at 37°C, then overnight at 4°C. The plate is washed with phosphate buffered saline (PBS) containing 0.05% Tween 20 (PBS/Tween buffer) and the wells are saturated with 250 µl PBS containing 1% bovine serum albumin (BSA), to prevent non-specific antibody binding. After 1 hour of incubation at 37°C, the plate is washed with PBS/Tween buffer. The antiserum is serially diluted in PBS/Tween buffer containing 0.5% BSA, and 100 µl dilutions are added to each well. The plate is incubated for 90 minutes at 37°C, washed, and evaluated using standard methods. For example, a goat anti-rabbit peroxidase conjugate can be added to the wells when the specific antibodies used were raised in rabbits. Incubation is carried out for about 90 minutes at 37°C and the plate is washed. The reaction is developed with the appropriate substrate and the reaction is measured by colorimetry (absorbance measured spectrophotometrically). Under these experimental conditions, a positive reaction is shown once an O.D. value of 1.0 is detected with a dilution of at least about 1:50, preferably of at least about 1:500.
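The positivity criterion stated for the ELISA assay can be expressed as a short Python sketch. It assumes, for illustration only, that "an O.D. value of 1.0 is detected" means an absorbance of at least 1.0, and that readings are stored by dilution factor (50 for a 1:50 dilution). The function names and the absorbance values are hypothetical and are not experimental data.

```python
def highest_reactive_dilution(od_by_dilution, od_cutoff=1.0):
    """Return the largest dilution factor whose O.D. reaches the cutoff,
    or None if no dilution does."""
    reactive = [factor for factor, od in od_by_dilution.items()
                if od >= od_cutoff]
    return max(reactive) if reactive else None


def elisa_is_positive(od_by_dilution, min_dilution=50, od_cutoff=1.0):
    """Apply the stated criterion: the O.D. cutoff is reached at a
    dilution of at least 1:min_dilution (50, or preferably 500)."""
    top = highest_reactive_dilution(od_by_dilution, od_cutoff)
    return top is not None and top >= min_dilution


if __name__ == "__main__":
    # Hypothetical readings: dilution factor -> absorbance.
    readings = {50: 2.1, 100: 1.6, 500: 1.05, 1000: 0.4}
    print(highest_reactive_dilution(readings))
    print(elisa_is_positive(readings, min_dilution=50))
    print(elisa_is_positive(readings, min_dilution=500))
```

Passing min_dilution=500 applies the preferred, more demanding threshold.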
In a dot blot assay, a purified product is preferred, although a whole cell extract can be used. Briefly, a solution of the product at a concentration of about 100 µg/ml is serially diluted two-fold with 50 mM Tris-HCl (pH 7.5). One hundred µl of each dilution is applied to a filter, such as a 0.45 µm nitrocellulose membrane, set in a 96-well dot blot apparatus (Biorad). The buffer is removed by applying vacuum to the system. Wells are washed by addition of 50 mM Tris-HCl (pH 7.5) and the membrane is air-dried. The membrane is saturated in blocking buffer (50 mM Tris-HCl (pH 7.5), 0.15 M NaCl, 10 g/l skim milk) and incubated with an antiserum diluted from about 1:50 to about 1:5000, preferably about 1:500. The reaction is detected using standard methods. For example, a goat anti-rabbit peroxidase conjugate can be added to the wells when rabbit antibodies are used. Incubation is carried out for about 90 minutes at 37°C and the blot is washed. The reaction is developed with the appropriate substrate and stopped. The reaction is then measured visually by the appearance of a colored spot, e.g., by colorimetry. Under these experimental conditions, a positive reaction is associated with detection of a colored spot for reactions carried out with a dilution of at least about 1:50, preferably, of at least about 1:500. Therapeutic or prophylactic efficacy of a polypeptide or polypeptide derivative of the invention can be evaluated as described below.
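The two-fold dilution series used for the dot blot assay can be tabulated with a short Python sketch. The starting concentration (about 100 µg/ml) and the applied volume (100 µl) come from the text; the number of dilution steps is not specified there, so the default of eight steps, like the function name, is an assumption made for illustration.

```python
def twofold_series(start_ug_per_ml=100.0, steps=8, spot_volume_ul=100.0):
    """Return (step, concentration in ug/ml, amount per spot in ug) for a
    two-fold serial dilution.

    Assumption: 'steps' (number of spots, including the undiluted one)
    is arbitrary; the text does not state how many dilutions are made.
    """
    rows = []
    concentration = start_ug_per_ml
    for step in range(steps):
        # Amount applied = concentration (ug/ml) x volume (ml).
        amount_ug = concentration * spot_volume_ul / 1000.0
        rows.append((step, concentration, amount_ug))
        concentration /= 2.0
    return rows


if __name__ == "__main__":
    for step, conc, amount in twofold_series():
        print(f"dilution 1:{2 ** step:<4d} {conc:8.3f} ug/ml {amount:7.4f} ug/spot")
```

Under these assumptions the undiluted spot carries about 10 µg of product and each subsequent spot carries half the amount of the previous one.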
According to a seventh aspect of the invention, there is provided (i) a composition of matter containing a polypeptide of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a polypeptide of the invention; (iii) a method for inducing an immune response against Helicobacter in a mammal by administering to the mammal an immunogenically effective amount of a polypeptide of the invention to elicit an immune response, e.g., a protective immune response to Helicobacter; and (iv) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H.
heilmanii) infection, by administering a prophylactic or therapeutic amount of a polypeptide of the invention to an individual in need of such treatment.
Additionally, this aspect of the invention includes the use of a polypeptide of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection.
The immunogenic compositions of the invention can be administered by any conventional route in use in the vaccine field, for example, to a mucosal (e.g., ocular, intranasal, pulmonary, oral, gastric, intestinal, rectal, vaginal, or urinary tract) surface or via a parenteral (e.g., subcutaneous, intradermal, intramuscular, intravenous, or intraperitoneal) route. The choice of the administration route depends upon a number of parameters, such as the adjuvant used. For example, if a mucosal adjuvant is used, the intranasal or oral route will be preferred, and if a lipid formulation or an aluminum compound is used, a parenteral route will be preferred. In the latter case, the subcutaneous or intramuscular route is most preferred. The choice of administration route can also depend upon the nature of the vaccine agent. For example, a polypeptide of the invention fused to CTB or to LTB will be best administered to a mucosal surface.
A composition of the invention can contain one or several polypeptides or derivatives of the invention. It can also contain at least one additional Helicobacter antigen, such as the urease apoenzyme, or a subunit, fragment, homolog, mutant, or derivative thereof.
For use in a composition of the invention, a polypeptide or polypeptide derivative can be formulated into or with liposomes, such as neutral or anionic liposomes, microspheres, ISCOMS, or virus-like particles (VLPs), to facilitate delivery and/or enhance the immune response. These compounds are readily available to those skilled in the art; for example, see Liposomes: A Practical Approach (supra). Adjuvants other than liposomes can also be used in the invention and are well known in the art (see, for example, the list provided below).
Administration can be achieved in a single dose or repeated as necessary at appropriate intervals that can be determined by those skilled in the art. For example, a priming dose can be followed by three booster doses at weekly or monthly intervals. An appropriate dose depends on various parameters, including the nature of the recipient (e.g., whether the recipient is an adult or an infant), the particular vaccine antigen, the route and frequency of administration, the presence/absence or type of adjuvant, and the desired effect (e.g., protection and/or treatment), and can be readily determined by one skilled in the art. In general, a vaccine antigen of the invention can be administered mucosally in an amount ranging from about 10 µg to about 500 mg, preferably from about 1 mg to about 200 mg. For a parenteral route of administration, the dose usually should not exceed about 1 mg, and is, preferably, about 100 µg.
When used as components of a vaccine, the polynucleotides and polypeptides of the invention can be used sequentially as part of a multi-step immunization process. For example, a mammal can be initially primed with a vaccine vector of the invention, such as a pox virus, e.g., via a parenteral route, and then boosted twice with a polypeptide encoded by the vaccine vector, e.g., via the mucosal route. In another example, liposomes associated with a polypeptide or polypeptide derivative of the invention can be used for priming, with boosting being carried out mucosally using a soluble polypeptide or polypeptide derivative of the invention, in combination with a mucosal adjuvant (e.g., LT).
Polypeptides and polypeptide derivatives of the invention can also be used as diagnostic reagents for detecting the presence of anti-Helicobacter antibodies, e.g., in blood samples. Such polypeptides can be about 5 to about 80, preferably, about 10 to about 50, amino acids in length and can be labeled or unlabeled, depending upon the diagnostic method. Diagnostic methods involving such a reagent are described below.
Upon expression of a polynucleotide molecule of the invention, a polypeptide or polypeptide derivative is produced and can be purified using known methods. For example, the polypeptide or polypeptide derivative can be produced as a fusion protein containing a fused tail that facilitates purification.
The fusion product can be used to immunize a small mammal, e.g., a mouse or a rabbit, in order to raise monospecific antibodies against the polypeptide or polypeptide derivative. The eighth aspect of the invention thus provides a monospecific antibody that binds to a polypeptide or polypeptide derivative of the invention.
By "monospecific antibody" is meant an antibody that is capable of reacting with a unique, naturally-occurring Helicobacter polypeptide. An antibody of the invention can be polyclonal or monoclonal. Monospecific antibodies can be recombinant, e.g., chimeric (e.g., consisting of a variable region of murine origin and a human constant region), humanized (e.g., a human immunoglobulin constant region and a variable region of animal, e.g., murine, origin), and/or single chain. Both polyclonal and monospecific antibodies can also be in the form of immunoglobulin fragments, e.g., F(ab')2 or Fab fragments. The antibodies of the invention can be of any isotype, e.g., IgG or IgA, and polyclonal antibodies can be of a single isotype or can contain a mixture of isotypes.
The antibodies of the invention, which can be raised against a polypeptide or polypeptide derivative of the invention, can be produced and identified using standard immunological assays, e.g., Western blot assays, dot blot assays, or ELISA (see, e.g., Coligan et al., Current Protocols in Immunology, John Wiley & Sons, Inc., New York, NY, 1994). The antibodies can be used in diagnostic methods to detect the presence of Helicobacter antigens in a sample, such as a biological sample. The antibodies can also be used in affinity chromatography methods for purifying a polypeptide or polypeptide derivative of the invention. As is discussed further below, the antibodies can also be used in prophylactic and therapeutic passive immunization methods.
Accordingly, a ninth aspect of the invention provides (i) a reagent for detecting the presence of Helicobacter in a biological sample that contains an antibody, polypeptide, or polypeptide derivative of the invention; and (ii) a diagnostic method for detecting the presence of Helicobacter in a biological sample, by contacting the biological sample with an antibody, a polypeptide, or a polypeptide derivative of the invention, so that an immune complex is formed, and detecting the complex as an indication of the presence of Helicobacter in the sample or the organism from which the sample was derived. The immune complex is formed between a component of the sample and the antibody, polypeptide, or polypeptide derivative, and any unbound material can be removed prior to detecting the complex. A polypeptide reagent can be used for detecting the presence of anti-Helicobacter antibodies in a sample, e.g., a blood sample, while an antibody of the invention can be used for screening a sample, such as a gastric extract or biopsy sample, for the presence of Helicobacter polypeptides.
For use in diagnostic methods, the reagent (e.g., the antibody, polypeptide, or polypeptide derivative of the invention) can be in a free state or can be immobilized on a solid support, such as, for example, on the interior surface of a tube or on the surface, or within pores, of a bead.
Immobilization can be achieved using direct or indirect means. Direct means include passive adsorption (i.e., non-covalent binding) or covalent binding between the support and the reagent. By "indirect means" is meant that an anti-reagent compound that interacts with the reagent is first attached to the solid support. For example, if a polypeptide reagent is used, an antibody that binds to it can serve as an anti-reagent, provided that it binds to an epitope that is not involved in recognition of antibodies in biological samples. Indirect means can also employ a ligand-receptor system; for example, a molecule, such as a vitamin, can be grafted onto the polypeptide reagent and the corresponding receptor can be immobilized on the solid phase. This concept is illustrated by the well known biotin-streptavidin system. Alternatively, indirect means can be used, e.g., by adding to the reagent a peptide tail, chemically or by genetic engineering, and immobilizing the grafted or fused product by passive adsorption or covalent linkage of the peptide tail.
According to a tenth aspect of the invention, there is provided a process for purifying from a biological sample a polypeptide or polypeptide derivative of the invention, which involves carrying out antibody-based affinity chromatography with the biological sample, wherein the antibody is a monospecific antibody of the invention.
For use in a purification process of the invention, the antibody can be polyclonal or monospecific, and preferably is of the IgG type. Purified IgGs can be prepared from an antiserum using standard methods (see, e.g., Coligan et al., supra). Conventional chromatography supports, as well as standard methods for grafting antibodies, are described, for example, by Harlow et al. (Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1988).
Briefly, a biological sample, such as an H. pylori extract, preferably in a buffer solution, is applied to a chromatography material, which is, preferably, equilibrated with the buffer used to dilute the biological sample, so that the polypeptide or polypeptide derivative of the invention (i.e., the antigen) is allowed to adsorb onto the material. The chromatography material, such as a gel or a resin coupled to an antibody of the invention, can be in batch form or in a column. The unbound components are washed off and the antigen is eluted with an appropriate elution buffer, such as a glycine buffer, a buffer containing a chaotropic agent, e.g., guanidine HCl, or a buffer having high salt concentration (e.g., 3 M MgCl2). Eluted fractions are recovered and the presence of the antigen is detected, e.g., by measuring the absorbance at 280 nm.
An antibody of the invention can be screened for therapeutic efficacy as follows. According to an eleventh aspect of the invention, there is provided (i) a composition of matter containing a monospecific antibody of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a monospecific antibody of the invention, and (iii) a method for treating or preventing Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, by administering a therapeutic or prophylactic amount of a monospecific antibody of the invention to an individual in need of such treatment. In addition, the eleventh aspect of the invention includes the use of a monospecific antibody of the invention in the preparation of a medicament for treating or preventing Helicobacter infection.
The monospecific antibody can be polyclonal or monoclonal, and is, preferably, predominantly of the IgA isotype. In passive immunization methods, the antibody is administered to a mucosal surface of a mammal, e.g., the gastric mucosa, e.g., orally or intragastrically, optionally, in the presence of a bicarbonate buffer. Alternatively, systemic administration, not requiring a bicarbonate buffer, can be carried out. A monospecific antibody of the invention can be administered as a single active agent or as a mixture with at least one additional monospecific antibody specific for a different Helicobacter polypeptide. The amount of antibody and the particular regimen used can be readily determined by one skilled in the art. For example, daily administration of about 100 to 1,000 mg of antibody over one week, or three doses per day of about 100 to 1,000 mg of antibody over two or three days, can be effective regimens for most purposes.
Therapeutic or prophylactic efficacy can be evaluated using standard methods in the art, e.g., by measuring induction of a mucosal immune response or induction of protective and/or therapeutic immunity using, e.g., the H. felis mouse model and the procedures described by Lee et al. (Eur. J. Gastroenterology & Hepatology 7:303, 1995) or Lee et al. (J. Infect. Dis. 172:161, 1995). Those skilled in the art will recognize that the H. felis strain of the model can be replaced with another Helicobacter strain. For example, the efficacy of polynucleotide molecules and polypeptides from H. pylori is, preferably, evaluated in a mouse model using an H. pylori strain. Protection can be determined by comparing the degree of Helicobacter infection in the gastric tissue assessed by, for example, urease activity, bacterial counts, or gastritis, to that of a control group. Protection is shown when infection is reduced by comparison to the control group. Such an evaluation can be made for polynucleotides, vaccine vectors, polypeptides, and polypeptide derivatives, as well as for antibodies of the invention.
For example, various doses of an antibody of the invention can be administered to the gastric mucosa of mice previously challenged with an H.
pylori strain as described, e.g., by Lee et al. (supra). Then, after an appropriate period of time, the bacterial load of the mucosa can be estimated by assessing urease activity, as compared to a control. Reduced urease activity indicates that the antibody is therapeutically effective.
Adjuvants that can be used in any of the vaccine compositions described above are described as follows. Adjuvants for parenteral administration include, for example, aluminum compounds, such as aluminum hydroxide, aluminum phosphate, and aluminum hydroxy phosphate. The antigen can be precipitated with, or adsorbed onto, the aluminum compound using standard methods. Other adjuvants, such as RIBI (ImmunoChem, Hamilton, MT), can also be used in parenteral administration.
Adjuvants that can be used for mucosal administration include, for example, bacterial toxins, e.g., the cholera toxin (CT), the E. coli heat-labile toxin (LT), the Clostridium difficile toxin A, the pertussis toxin (PT), and combinations, subunits, toxoids, or mutants thereof. For example, a purified preparation of native cholera toxin subunit B (CTB) can be used. Fragments, homologs, derivatives, and fusions to any of these toxins can also be used, provided that they retain adjuvant activity. Preferably, a mutant having reduced toxicity is used. Suitable mutants are described, e.g., in WO 95/17211 (Arg-7-Lys CT mutant), WO 96/6627 (Arg-192-Gly LT mutant), and WO 95/34323 (Arg-9-Lys and Glu-129-Gly PT mutant). Additional LT mutants that can be used in the methods and compositions of the invention include, e.g., Ser-63-Lys, Ala-69-Gly, Glu-110-Asp, and Glu-112-Asp mutants. Other adjuvants, such as the bacterial monophosphoryl lipid A (MPLA) of, e.g., E. coli, Salmonella minnesota, Salmonella typhimurium, or Shigella flexneri; saponins, and polylactide glycolide (PLGA) microspheres, can also be used in mucosal administration. Adjuvants useful for both mucosal and parenteral administration, such as polyphosphazene (WO 95/2415), can also be used.
Any pharmaceutical composition of the invention, containing a polynucleotide, polypeptide, polypeptide derivative, or antibody of the invention, can be manufactured using standard methods. It can be formulated with a pharmaceutically acceptable diluent or carrier, e.g., water or a saline solution, such as phosphate buffered saline, optionally, including a bicarbonate salt, such as sodium bicarbonate, e.g., 0.1 to 0.5 M. Bicarbonate can advantageously be added to compositions intended for oral or intragastric administration. In general, a diluent or carrier can be selected on the basis of the mode and route of administration, and standard pharmaceutical practice.
Suitable pharmaceutical carriers and diluents, as well as pharmaceutical necessities for their use in pharmaceutical formulations, are described in Remington's Pharmaceutical Sciences, a standard reference text in this field and in the USP/NF.
The invention also includes methods in which gastroduodenal infections, such as Helicobacter infection, are treated by oral administration of a Helicobacter polypeptide of the invention and a mucosal adjuvant, in combination with an antibiotic, an antisecretory agent, a bismuth salt, an antacid, sucralfate, or a combination thereof. Examples of such compounds that can be administered with the vaccine antigen and an adjuvant are antibiotics, including, e.g., macrolides, tetracyclines, β-lactams, aminoglycosides, quinolones, penicillins, and derivatives thereof (specific examples of antibiotics that can be used in the invention include, e.g., amoxicillin, clarithromycin, tetracycline, metronidazole, erythromycin, cefuroxime, and erythromycin); antisecretory agents, including, e.g., H2-receptor antagonists (e.g., cimetidine, ranitidine, famotidine, nizatidine, and roxatidine), proton pump inhibitors (e.g., omeprazole, lansoprazole, and pantoprazole), prostaglandin analogs (e.g., misoprostol and enprostil), and anticholinergic agents (e.g., pirenzepine, telenzepine, carbenoxolone, and proglumide); and bismuth salts, including colloidal bismuth subcitrate, tripotassium dicitrate bismuthate, bismuth subsalicylate, bicitropeptide, and pepto-bismol (see, e.g., Goodwin et al., Helicobacter pylori, Biology and Clinical Practice, CRC Press, Boca Raton, FL, pp. 366-395, 1993; Physicians' Desk Reference, 49th edn., Medical Economics Data Production Company, Montvale, New Jersey, 1995). In addition, compounds containing more than one of the above-listed components coupled together, e.g., ranitidine coupled to bismuth subcitrate, can be used. The invention also includes compositions for carrying out these methods, i.e., compositions containing a Helicobacter antigen (or antigens) of the invention, an adjuvant, and one or more of the above-listed compounds, in a pharmaceutically acceptable carrier or diluent.
Amounts of the above-listed compounds used in the methods and compositions of the invention can readily be determined by one skilled in the art. In addition, one skilled in the art can readily design treatment/immunization schedules. For example, the non-vaccine components can be administered on days 1-14, and the vaccine antigen + adjuvant can be administered on days 7, 14, 21, and 28.
Methods and pharmaceutical compositions of the invention can be used to treat or to prevent Helicobacter infections and, accordingly, gastroduodenal diseases associated with these infections, including acute, chronic, and atrophic gastritis, and peptic ulcer diseases, e.g., gastric and duodenal ulcers.
The clones of the invention were originally isolated by a transposon shuttle mutagenesis method. Briefly, in this method, a TnMax9 mini-blaM transposon was used for insertional mutagenesis of an H. pylori gene library established in E. coli. 192 E. coli clones expressing active β-lactamase fusion proteins were obtained, indicating that the corresponding target plasmids carry H. pylori genes encoding extracytoplasmic proteins. Individual mutants were transferred onto the chromosome of H. pylori P1 or P12 by natural transformation, resulting in 135 distinct H. pylori mutants. This method is described in further detail, as follows.
The transposon TnMax9 (Kahrs et al., Gene 167:53, 1995) was used to generate mutations in an H. pylori library in E. coli. As illustrated in Fig. 1A, TnMax9 contains, in addition to a catGC-resistance gene close to the inverted repeat (IR), an unexpressed open reading frame encoding β-lactamase without a promoter or signal sequence (mature β-lactamase, blaM; Kahrs et al., supra). For production of extracytoplasmic BlaM fusion proteins resulting in ampicillin-resistant (ampR) clones, expression of the cloned H. pylori genes in E. coli is obligatory. The minimal vector pMin2 (Kahrs et al., supra; see Fig. 1B), containing a weak constitutive promoter (Piga) upstream of the multiple cloning site, was used for construction of the H. pylori library to ensure expression of H. pylori genes in E. coli.
In construction of the library, H. pylori DNA was partially digested with Sau3A and HpaII, size fractionated by preparative agarose gel electrophoresis, and 3-6 kilobase fragments were ligated into the BglII and ClaI sites of pMin2. The library was introduced into E. coli strain E181 (pTnMax9), which is a derivative of HB101 containing the TnMax9 transposon, by electroporation. This generated approximately 2,400 independent transformants. More than 95% of the plasmids contained an insert of between 3 and 6 kilobases, showing that the 1.7 megabase H. pylori chromosome was statistically covered. Since not every plasmid could be expected to contain a target gene carrying an export signal, the library was partitioned into a total of 198 pools (24 pools of 20 clones and 174 pools of 11 clones). Using a cotton swab, either eleven or twenty individual colonies were inoculated in 0.5 ml LB medium in Eppendorf tubes, vortexed, and 100 µl of the suspension was spread on LB agar plates supplemented with tetracycline and chloramphenicol to select for maintenance of both plasmids. Insertion of TnMax9 into the target plasmids was induced with 100 mM isopropyl-β-D-thiogalactoside (IPTG) separately for each pool (Haas et al., Gene 130:23-31, 1993). Plasmids were transferred into E145 by triparental mating, in which 25 µl of the donor strain (E181), 25 µl of the mobilisator (HB101 (pRK2013)), and 50 µl of the recipient strain (E145) were mixed from corresponding bacterial suspensions (O.D.550 = 10). The matings were performed for 2-3 hours at 37°C on nitrocellulose filters, which were placed on LB plates. Bacteria were suspended in 1 ml LB and aliquots were spread on LB plates containing chloramphenicol, tetracycline, and rifampicin. Each pool gave rise to chloramphenicol-resistant transconjugates in E145, demonstrating that both transposition and conjugation were successful. Generally, several thousand chloramphenicol-resistant transconjugates were obtained, but the number of ampR colonies varied in different pools, ranging from one to several hundred colonies. Two ampR colonies from each positive pool were isolated, plasmid DNA was extracted, and the DNA was characterized by further restriction analysis. Only those TnMax9 insertions of a single pool that mapped in obviously different plasmid clones, or in markedly different regions of the same clone, were used further.
From 158 of the 198 pools, ampicillin-resistant E145 transconjugates were obtained (80%), showing that in several pools, TnMax9 inserted into expressed genes, resulting in production of extracytoplasmic BlaM fusion proteins. Thus, a total of 192 ampR E145 clones could be isolated by conjugal transfer of plasmids from 198 pools.
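The figures quoted above for the library can be checked with a short calculation. The following is a minimal sketch only: the pool sizes, pool counts, genome size, and insert range are taken from the text, whereas the use of the Clarke-Carbon estimate for the probability that a given locus is represented is an assumption made here for illustration, since the text does not state how statistical coverage was assessed. All function names are hypothetical.

```python
# Numbers taken from the text; the Clarke-Carbon estimate is an assumption.
GENOME_BP = 1_700_000          # approximate H. pylori chromosome size
POOLS = {20: 24, 11: 174}      # clones per pool -> number of such pools


def total_clones(pools):
    """Number of clones distributed over all pools."""
    return sum(size * count for size, count in pools.items())


def fold_coverage(n_clones, insert_bp, genome_bp=GENOME_BP):
    """Total cloned DNA expressed as multiples of the genome size."""
    return n_clones * insert_bp / genome_bp


def prob_locus_present(n_clones, insert_bp, genome_bp=GENOME_BP):
    """Clarke-Carbon estimate: P = 1 - (1 - insert/genome) ** N."""
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones


if __name__ == "__main__":
    n = total_clones(POOLS)
    print("clones in pools:", n)
    print("positive pools (%):", round(100 * 158 / 198))
    for insert in (3_000, 6_000):
        print(insert, "bp inserts:",
              round(fold_coverage(n, insert), 1), "x coverage,",
              "P(locus present) =", round(prob_locus_present(n, insert), 3))
```

Under these assumptions, even the smallest insert size in the stated range gives several-fold coverage of the chromosome, which is consistent with the statement that the chromosome was statistically covered.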
To analyze the mutant library, it was determined whether defined gene sequences inactivated by TnMax9 were represented once or several times in the whole library. Five transposon-containing plasmids conferring an ampR phenotype to E145 (pMu7, pMu13, pMu75, pMu94, and pMu110) were randomly selected and DNA fragments flanking the TnMax9 insert were isolated and used as probes in Southern hybridization of 120 ampR clones. The hybridization probes isolated from clones pMu7, pMu75, and pMu94 were between 0.9 and 1.1 kilobases in size, and hybridized exclusively with the inserts of the homologous plasmids. In contrast, the TnMax9 flanking regions of clones pMu13 and pMu110 were 4.0 and 5.5 kilobases, respectively. They each hybridized with the homologous plasmids, and with one additional clone of the library. Such a result was expected, since the longer the hybridization probe, the higher the chance that it finds a homologous sequence in the library.
In order to verify the insertion of the transposon into distinct ORFs encoding putative exported proteins, the TnMax9-flanking DNA of five representative ampR mutant clones (pMu7, pMu12, pMu18, pMu20, and pMu26) was sequenced, taking advantage of the M13 forward and reverse primers on TnMax9 (Fig. 1A). This analysis revealed that the mini-transposon was inserted into different sequences in each plasmid, thereby interrupting ORFs encoding putative proteins. For two clones, the sequences located upstream of the blaM gene revealed a putative ribosome-binding site and a potential translational start codon (ATG). Other clones either revealed an ORF spanning the complete sequence (approximately 400 base pairs upstream and downstream of the TnMax9 insertion) or terminating shortly after the site of TnMax9 insertion. The partial protein sequences from different ORFs were used for database searches, but no significant homologies with known proteins were found.
In a further approach, it was determined whether a known gene, like vacA, encoding the extracellular vacuolating cytotoxin of H. pylori, could be identified using this method and how often such a mutation would be represented in the mutant library. Total cell lysates of the 135 mutants were tested in an immunoblot using the H. pylori cytotoxin-specific rabbit antiserum AK197 (Schmitt et al., Mol. Microbiol. 12:307-319, 1994). Two mutants were identified that no longer produced the cytotoxin antigen (mutants P1-26 and P1-47) and partial DNA sequencing of the insertion sites revealed that TnMax9 was inserted at distinct positions in the vacA gene, 56 and 53 codons downstream of the ATG start codon.
Thus, the characterization of the mutant collection confirmed that a representative gene library was constructed in E. coli, in which target genes encoding exported H. pylori proteins were efficiently tagged by TnMax9.
In order to establish a collection of mutants lacking distinct exported proteins, the mutations had to be transferred back into the H. pylori chromosome. By means of natural transformation, 86 plasmids could be transformed into the original strain P1. H. pylori strains P1 or P12, which were naturally competent for DNA transformation, were transformed with circular plasmid DNA (0.2-0.5 µg/transformation). Transformations to streptomycin resistance were performed with chromosomal DNA (1 µg/transformation), isolated from a streptomycin-resistant NCTC 11637 H. pylori mutant according to the procedure described in Haas et al. (Mol. Microbiol. 8:753-760). Selection was performed on serum plates containing 4 µg/ml chloramphenicol or 500 µg/ml streptomycin. The transformation frequency for a given mutant was calculated as the number of chloramphenicol-, streptomycin-, or erythromycin-resistant colonies per cfu (average of three experiments). The blaM gene was deleted by NotI digestion, and the plasmid religated, in those plasmids that did not transform strain P1 directly. This procedure, which resulted in a twenty- to thirty-fold higher frequency of transformation, as compared to the same plasmid containing blaM, resulted in 36 additional mutant P1 strains. The blaM deletion plasmids that still did not transform strain P1 were used to transform the heterologous H. pylori strain P12, possessing an approximately 10-fold higher transformation frequency compared to P1. This resulted in thirteen further mutants.
Thus, from the 192 ampR plasmids, a total of 135 H. pylori mutants (122 mutants in P1 and 13 mutants in P12) were finally obtained by selection for chloramphenicol resistance (70%). The transformation frequency varied between different plasmids in the range of 1 x 10^-5 to 1 x 10^-7. The remaining plasmids did not result in any transformants. The collection was frozen as individual mutants in stock cultures at -70°C. To verify the correct insertion of the mini-transposon into the H. pylori chromosome, ten representative mutants were tested by Southern hybridization of chromosomal DNA using catGC DNA and the vector pMin2 as probes. Consistent with our previous experience concerning TnMax9-based shuttle mutagenesis of H. pylori, the mini-transposon was, in all cases, inserted into the chromosome without integration of the vector DNA, which probably means by a double cross-over, rather than by a single cross-over event. As judged from the hybridization pattern obtained with the cat gene as a probe, it appears that TnMax9 is located in different regions of the chromosome, showing that distinct target genes have been interrupted in individual mutants.
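The mutant counts reported for the three successive transformation steps can be reconciled with the stated totals as follows. This is a minimal sketch; the counts are those given in the text, and the variable and function names are hypothetical labels introduced here.

```python
# Counts taken from the text; names are illustrative only.
AMP_R_PLASMIDS = 192

MUTANTS_BY_STEP = {
    "P1, direct transformation": 86,
    "P1, after blaM deletion": 36,
    "P12, blaM deletion plasmids": 13,
}


def mutants_in_strain(strain, by_step=MUTANTS_BY_STEP):
    """Sum the mutants obtained in one recipient strain (label prefix match)."""
    return sum(n for step, n in by_step.items() if step.startswith(strain + ","))


def overall_yield_percent(by_step=MUTANTS_BY_STEP, plasmids=AMP_R_PLASMIDS):
    """Percentage of ampR plasmids that gave rise to an H. pylori mutant."""
    return 100 * sum(by_step.values()) / plasmids


if __name__ == "__main__":
    print("P1 mutants:", mutants_in_strain("P1"))
    print("P12 mutants:", mutants_in_strain("P12"))
    print("total mutants:", sum(MUTANTS_BY_STEP.values()))
    print("yield (%):", round(overall_yield_percent()))
```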
The mutants were analyzed for motility, transformation competence, and adherence to KatoIII cells. Screening of the H. pylori mutant collection allowed identification of mutants impaired in motility, natural transformation competence, and adherence to gastric epithelial cell lines. Motility mutants could be grouped into distinct classes: (i) mutants lacking the major flagellin subunit FlaA and intact flagella; (ii) mutants with apparently normal flagella, but reduced motility; and (iii) mutants with obviously normal flagella, but completely abolished motility. Two independent mutations, which exhibited defects in natural competence for genetic transformation, mapped to different genetic loci. In addition, two independent mutants were isolated by their failure to bind to the human gastric carcinoma cell line KatoIII. Both mutants carried a transposon in the same gene, approximately 0.8 kilobases apart, and showed decreased autoagglutination, when compared to the wild type strain.
Sequences of clones obtained using the above-described transposon shuttle mutagenesis method were used to identify intact genes, lacking inserted transposons, in the H. pylori genome, as is described below in Example 5.
The invention is further illustrated by the following examples.
Example 1 describes identification of genes, such as genes that encode the polypeptides of the invention, in the Helicobacter genome, as well as identification of signal sequences and primer design for amplification of genes lacking signal sequences. Example 2 describes cloning of DNA encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13 into a vector that provides a histidine tag, and production and purification of the resulting his-tagged fusion proteins. Example 3 describes methods for cloning DNA encoding the polypeptides of the invention so that they can be produced without his-tags, and Example 4 describes methods for purifying recombinantly produced polypeptides of the invention. Example 5 describes methods for obtaining the nucleic acids of the invention from the deposited clones. Example 6 describes purification of recombinant H. pylori antigen GHPO 1190.
EXAMPLE 1: Identification of genes in the H. pylori genome, identification of signal sequences, and primer design for amplification of genes lacking signal sequences
1.A. Creating H. pylori genomic databases
The H. pylori genome was provided as a text file containing a single contiguous string of nucleotides that had been determined to be 1.76 Megabases in length. The complete genome was split into 17 separate files using the program SPLIT (Creativity in Action), giving rise to 16 contigs, each containing 100,000 nucleotides, and a 17th contig containing the remaining 76,000 nucleotides. A header was added to each of the 17 files using the format: >hpg0.txt (representing contig 1), >hpg1.txt (representing contig 2), etc. The resulting 17 files, named hpg0 through hpg16, were then copied together to form one file that represented the plus strand of the complete H. pylori genome.
The constructed database was given the designation "H." A negative strand database of the H. pylori genome was created similarly by first creating a reverse complement of the positive strand using the program SeqPup (D.G.
Gilbert, Indiana University Biology Department) and then performing the same procedure as described above for the plus strand. This database was given the designation "N."
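The construction of the "H" and "N" databases can be sketched in a few lines of code. The example below is an illustration only and does not reproduce the SPLIT or SeqPup programs named above: it assumes the genome is available as a plain string of A, C, G, and T, uses the contig size and header format given in the text, and all function names are hypothetical.

```python
# Illustrative stand-in for the SPLIT / SeqPup steps; not the original tools.
CONTIG_SIZE = 100_000

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_into_contigs(genome, size=CONTIG_SIZE, prefix="hpg"):
    """Cut the genome into consecutive contigs with headers >hpg0.txt, >hpg1.txt, ..."""
    return [
        (f">{prefix}{index}.txt", genome[start:start + size])
        for index, start in enumerate(range(0, len(genome), size))
    ]


def build_database(genome, size=CONTIG_SIZE):
    """Concatenate header/sequence pairs into one FASTA-style text."""
    lines = []
    for header, contig in split_into_contigs(genome, size):
        lines.append(header)
        lines.append(contig)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    genome = "ACGT" * 419_000                 # placeholder, 1,676,000 nt
    plus_db = build_database(genome)          # corresponds to database "H"
    minus_db = build_database(reverse_complement(genome))  # database "N"
    print(plus_db.count(">"), minus_db.count(">"))
```

Sixteen contigs of 100,000 nucleotides plus one of 76,000 nucleotides correspond to a placeholder sequence of 1,676,000 nucleotides, which is the length used in the usage example.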
The regions predicted to encode open reading frames (ORFs) were defined for the complete H. pylori genome using the program GENEMARK™ (Borodovsky et al., Comp. Chem. 17:123, 1993). A database was created from a text file containing an annotated version of all ORFs predicted to be encoded by the H. pylori genome for both the plus and minus strands, and was given the designation "O." Each ORF was assigned a number indicating its location on the genome and its position relative to other genes. No manipulation of the text file was required.
1.B. Searching the H. pylori databases
The databases constructed as described above were searched using the program FASTA (Pearson et al., Proc. Natl. Acad. Sci. USA 85:2444-2448, 1988). FASTA was used for searching either a DNA sequence against either of the gene databases ("H" and/or "N"), or a peptide sequence against the ORF library ("O"). TFASTX was used to search a peptide sequence against all possible reading frames of a DNA database ("H" and/or "N" libraries).
With potential frameshifts also being resolved, FASTX was used for searching the translated reading frames of a DNA sequence against either a DNA database, or a peptide sequence against the protein database.
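Searching "all possible reading frames" of a DNA sequence means comparing against its six conceptual translations (three per strand). The sketch below illustrates only that translation step, using the standard genetic code; it is not the FASTA, TFASTX, or FASTX implementation, it does not handle frameshifts, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq: str) -> dict:
    """Return the translations of frames +1..+3 and -1..-3."""
    seq = seq.upper()
    minus = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(minus[offset:])
    return frames
```

A peptide query would then be compared against each of the six translated frames of every database entry.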
1.C. Isolation of DNA sequences from the H. pylori genome
The FASTA searches against the constructed DNA databases identified exact nucleotide coordinates on one or more of the isolated contigs, and therefore the location of the target DNA. Once the exact location of the target sequence was known, the contig identified to carry the gene was exported into the software package MapDraw (DNAStar, Inc.) and the gene was isolated. Gene sequences with flanking DNA were then excised and copied into the EditSeq software package (DNAStar, Inc.) for further analysis.
1.D. Identification of signal sequences
The deduced protein encoded by a target gene sequence was analyzed using the PROTEAN software package (DNAStar, Inc.). This analysis predicts those areas of the protein that are hydrophobic by using the Kyte-Doolittle algorithm, and identifies any potential polar residues preceding the hydrophobic core region, which is typical for many signal sequences. For confirmation, the target protein was then searched against a PROSITE database (DNAStar, Inc.) consisting of motifs and signatures. The identification of predicted prokaryotic lipid attachment sites is characteristic of many signal sequences and of hydrophobic regions in general. Where the two approaches agreed at the N-terminus of a protein, putative cleavage sites were sought. Specifically, this includes the presence of an Alanine (A), Serine (S), or Glycine (G) residue immediately after the core hydrophobic region. In the case of lipoproteins, a Cysteine (C) residue would be identified as the +1 residue, post-cleavage.
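The hydrophobicity scan described above can be illustrated with a sliding-window average over the Kyte-Doolittle hydropathy scale. This is a rough sketch rather than the PROTEAN implementation: the window length, the threshold, the N-terminal search limit, and all function names are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Mean hydropathy of every full window along the protein."""
    scores = [KD[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window for i in range(len(scores) - window + 1)]


def hydrophobic_core(protein: str, window: int = 9, threshold: float = 1.6,
                     search_limit: int = 40):
    """Return (start, end) of the most hydrophobic N-terminal window, or None.

    threshold and search_limit are illustrative assumptions, not values from the text.
    """
    profile = hydropathy_profile(protein[:search_limit], window)
    if not profile:
        return None
    best = max(range(len(profile)), key=profile.__getitem__)
    if profile[best] < threshold:
        return None
    return best, best + window


def residue_after_core(protein: str, **kwargs):
    """Return the residue right after the core if it is A, S, G or C, else None."""
    core = hydrophobic_core(protein, **kwargs)
    if core is None or core[1] >= len(protein):
        return None
    residue = protein[core[1]].upper()
    return residue if residue in "ASGC" else None
```

A returned "C" would flag a possible lipoprotein +1 residue in the sense used above; any such hit remains a candidate to be checked against motif searches.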
1.E. Rational design of PCR primers based on the identification of signal sequences
To clone gene sequences as N-terminal translational fusions for the generation of recombinant proteins with N-terminal Histidine tags, the gene sequence that specifies the signal sequence is omitted. The 5'-end of the gene-specific portion of the N-terminal primer is designed to start at the first codon beyond the cleavage site. In the case of lipoproteins, the 5'-end of the N-terminal primer begins at the second codon, immediately after the modifiable residue at position +1 post-cleavage. The omission of the signal sequence from the recombinant allows for one-step purification, and potential problems associated with insertion of signal sequences in the membrane of the host strain carrying the hybrid construct are avoided.
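The primer design rule above can be expressed as a short sketch. It assumes a GCC clamp and BamHI/XhoI recognition sequences, as in the primers of Example 2; the function name, the fixed gene-specific length, and the codon-index convention are illustrative assumptions, not part of the described method.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_primers(orf: str, first_codon: int, length: int = 21,
                   clamp: str = "GCC", n_site: str = "GGATCC", c_site: str = "CTCGAG"):
    """Build N- and C-terminal primers for an ORF lacking its signal sequence.

    orf         -- coding strand, 5' to 3'
    first_codon -- 0-based index of the first codon to keep, i.e. the first codon
                   beyond the cleavage site (for a lipoprotein, pass the codon
                   after the +1 cysteine)
    length      -- gene-specific length in nucleotides (an assumed default)
    """
    orf = orf.upper()
    start = 3 * first_codon
    n_primer = clamp + n_site + orf[start:start + length]
    c_primer = clamp + c_site + reverse_complement(orf[-length:])
    return n_primer, c_primer
```

In practice the gene-specific length would be tuned per primer to reach a suitable melting temperature; a fixed length is used here only to keep the sketch short.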
EXAMPLE 2: Preparation of isolated DNA encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13, and production of these polypeptides as histidine-tagged fusion proteins
2.A. Preparation of genomic DNA from Helicobacter pylori
Helicobacter pylori strain ORV2001, stored in LB medium containing 50% glycerol at -70°C, is grown on Columbia agar containing 7% sheep blood for 48 hours under microaerophilic conditions (8-10% CO2, 5-7% O2, 85-87% N2). Cells are harvested, washed with phosphate buffer saline (PBS) (pH 7.2), and DNA is then extracted from the cells using the Rapid Prep Genomic DNA Isolation kit (Pharmacia Biotech).
2.B. PCR amplification
DNA molecules encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13 are amplified from genomic DNA, as can be prepared as is described above, by the Polymerase Chain Reaction (PCR) using the following primers:
GHPO 732 (HPO 64):
N-terminal primer:
5'-GCCGGATCCATGACTTATGGGTATGGGGAA-3' (SEQ ID NO:171);
and C-terminal primer:
5'-GCCCTCGAGACTTTTATTGATTCACCATTTCATT-3' (SEQ ID NO:172).
GHPO 419 (HPO 54):
N-terminal primer:
5'-GCCGGATCCATCGCTGAAGAAAATGGGGCG-3' (SEQ ID NO:173);
and C-terminal primer:
5'-GCCCGGCCGCCCTAAAAACTATAAACATAACTC-3' (SEQ ID NO:174).
GHPO 1398 (HPO 15):
N-terminal primer:
5'-GCCGGATCCGGTATTAGGAAGCTTATACCATC-3' (SEQ ID NO:175);
and C-terminal primer:
5'-GCCCTCGAGAAGTTCTATTTTTAATTCCTTGAGAG-3' (SEQ ID NO:176).
GHPO 706 (HPO 50):
N-terminal primer:
5'-GCCGGATCCTCTGATAGCCATAAAGAAAAAAAGGAC-3' (SEQ ID NO:177);
and C-terminal primer:
5'-GCCCTCGAGATCTTTAGAAATCAACCCCCAAAGC-3' (SEQ ID NO:178).
GHPO 1190 (HPO 76):
N-terminal primer:
5'-GCCGGATCCGACTTAGAACATTTTAACACGCTC-3' (SEQ ID NO:179);
and C-terminal primer:
5'-GCCCTCGAGTCATTTTAAACGACTCAAAACAAA-3' (SEQ ID NO:180).
GHPO 986:
N-terminal primer:
5'-GCCGGATCCGGCCAAAGCGTGCGCACTTATTGG-3' (SEQ ID NO:181);
and C-terminal primer:
5'-GCCCTCGAGTTATTGTTCCAACCCCCACGCATC-3' (SEQ ID NO:182).
GHPO 1420:
N-terminal primer:
5'-GCCGGATCCAAGAGCAATGCTGATGACAAACC-3' (SEQ ID NO:183);
and C-terminal primer:
5'-GCCCTCGAGTTATGAGTTAAAGCCC(~TTGTCC-3' (SEQ ID NO:184).
GHPO 1299:
N-terminal primer:
5'-GCCGGATCCGAATCAGTAAAAACAGGAAAAAC-3' (SEQ ID NO:185);
and C-terminal primer:
5'-GCCCTCGAGCGGCTCTTTGGAGTTTTATTG-3' (SEQ ID NO:186).
GHPO 13:
N-terminal primer:
5'-GCCGGATCCATCATTCCCTCTCGCTCTATGG-3' (SEQ ID NO:187);
and C-terminal primer:
5'-GCCCTCGAGACCTTAATGCGTTGCGTTTTCTTT-3' (SEQ ID NO:188).
The N-terminal and C-terminal primers for each clone both include a 5' clamp and a restriction enzyme recognition sequence for cloning purposes (BamHI (GGATCC) and XhoI (CTCGAG) or NotI (CGGCCG) recognition sequences). The N-terminal primer is designed so that the amplified product does not encode the signal sequence and the potential cleavage site.
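The primer layout just described (clamp, recognition sequence, gene-specific portion) can be checked mechanically. The following sketch splits a primer into those three parts; the helper name is hypothetical, the enzyme labels simply follow the wording of the text, and the two example sequences are SEQ ID NO:171 and SEQ ID NO:172 as listed above.

```python
# Recognition sequences as named in the text (labels follow the text's wording).
SITES = {"GGATCC": "BamHI", "CTCGAG": "XhoI", "CGGCCG": "NotI"}


def split_primer(primer: str, clamp: str = "GCC"):
    """Split a primer into (clamp, enzyme name, gene-specific portion).

    Raises ValueError if the clamp or a known 6-nucleotide site is missing.
    """
    primer = primer.upper()
    if not primer.startswith(clamp):
        raise ValueError("primer does not start with the expected clamp")
    site = primer[len(clamp):len(clamp) + 6]
    if site not in SITES:
        raise ValueError(f"no known recognition sequence after the clamp: {site}")
    return clamp, SITES[site], primer[len(clamp) + 6:]


if __name__ == "__main__":
    print(split_primer("GCCGGATCCATGACTTATGGGTATGGGGAA"))       # SEQ ID NO:171
    print(split_primer("GCCCTCGAGACTTTTATTGATTCACCATTTCATT"))   # SEQ ID NO:172
```

Running the same check over every listed primer is a quick way to catch transcription errors in the clamp or recognition sequence.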
Amplification of gene-specific DNA is carried out using Pwo DNA Polymerase (Boehringer Mannheim), which is a proofreading polymerase, according to general guidance provided by the manufacturer. Because of the exonuclease activity of the polymerase, two reaction mixtures (mixtures 1 and 2) are first prepared separately and combined just prior to amplification.
These mixtures are as follows:
Ingredient (final conc.)                       Mixture 1 (µl)   Mixture 2 (µl)
distilled H2O                                  160              79
dNTPs (200 µM each)                            40               ---
10x PCR buffer                                 ---              20
primers (100 nM each)                          1                ---
DNA template (200 ng), as obtained in 5.A.     2                ---
(10x PCR buffer contains 100 mM Tris-HCl (pH 8.85), 250 mM KCl, 50 mM (NH4)2SO4, 20 mM MgSO4)
Amplification is carried out as follows:
Cycling conditions         Temp. (°C)   Time (min)   Number of cycles
Initial denaturing step    96           4            1
Denaturing step            94           0.5          20
Annealing step             50           1            20
Extension step             72           1            20
Final extension step       72           5            1
2.C. Transformation and selection of transformants
A single PCR product is thus amplified and is then digested at 37°C
for 2 hours with BamHI and XhoI or NotI concurrently in a 20 µl reaction volume. The digested product is ligated to similarly cleaved pET28a (Novagen) that is dephosphorylated prior to the ligation by treatment with Calf Intestinal Alkaline Phosphatase (CIP). The gene fusion constructed in this manner allows one-step affinity purification of the resulting fusion protein because of the presence of histidine residues at the N-terminus of the fusion protein, which are encoded by the vector.
The ligation reaction (20 µl) is carried out at 14°C overnight and then is used to transform 100 µl fresh E. coli XL1-blue competent cells (Novagen). The cells are incubated on ice for 2 hours, heat-shocked at 42°C for 30 seconds, and returned to ice for 90 seconds. The samples are then added to 1 ml LB broth in the absence of selection and grown at 37°C for 2 hours.
The cells are plated out on LB agar containing kanamycin (50 µg/ml) at a 10x and neat dilution and incubated overnight at 37°C. The following day, 50 colonies are picked onto secondary plates and incubated at 37°C overnight.
Five colonies are picked into 3 ml LB broth supplemented with kanamycin (100 µg/ml) and are grown overnight at 37°C. Plasmid DNA is extracted using the Qiagen mini-prep method and is quantitated by agarose gel electrophoresis.
PCR is performed with the gene-specific primers under the conditions set forth above and transformant DNA is confirmed to contain the desired insert. If PCR-positive, one of the five plasmid DNA samples (500 ng) extracted from the E. coli XL1-blue cells is used to transform BL21(λDE3) E. coli competent cells (Novagen; as described previously).
Transformants (10) are picked onto selective kanamycin (50 µg/ml) containing LB agar plates and stored as a research stock in LB containing 50% glycerol.
2.D. Purification of recombinant proteins
One ml of frozen glycerol stock prepared as described in 2.C. is used to inoculate 50 ml of LB medium containing 25 µg/ml of kanamycin in a 250 ml Erlenmeyer flask. The flask is incubated at 37°C for 2 hours or until the absorbance at 600 nm (OD600) reaches 0.4-1.0. The culture is stopped from growing by placing the flask at 4°C overnight. The following day, 10 ml of the overnight culture are used to inoculate 240 ml LB medium containing kanamycin (25 µg/ml), with the initial OD600 about 0.02-0.04. Four flasks are inoculated for each ORF. The cells are grown to an OD600 of 1.0 (about 2 hours at 37°C), a 1 ml sample is harvested by centrifugation, and the sample is analyzed by SDS-PAGE to detect any leaky expression. The remaining culture is induced with 1 mM IPTG and the induced cultures are grown for an additional 2 hours at 37°C.
The final OD600 is taken and the cells are harvested by centrifugation at 5,000 x g for 15 minutes at 4°C. The supernatant is discarded and the pellets are resuspended in 50 mM Tris-HCl (pH 8.0), 2 mM EDTA. Two hundred and fifty ml of buffer are used for 1 liter of culture and the cells are recovered by centrifugation at 12,000 x g for 20 minutes. The supernatant is discarded and the pellets are stored at -45°C.
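The culture volumes in 2.D. follow simple dilution arithmetic: 10 ml of starter culture added to 240 ml of medium is a 25-fold dilution, so a starter at OD600 0.5-1.0 gives an initial OD600 of about 0.02-0.04. The sketch below reproduces that arithmetic; the function names are hypothetical, and the calculation assumes that OD600 scales linearly with dilution.

```python
def diluted_od(od_start: float, inoculum_ml: float, medium_ml: float) -> float:
    """OD600 right after inoculation, assuming OD scales linearly with dilution."""
    return od_start * inoculum_ml / (inoculum_ml + medium_ml)


def wash_buffer_ml(culture_ml: float, buffer_ml_per_litre: float = 250.0) -> float:
    """Resuspension buffer volume at 250 ml of buffer per litre of culture."""
    return culture_ml / 1000.0 * buffer_ml_per_litre


if __name__ == "__main__":
    for od in (0.5, 1.0):
        print(od, round(diluted_od(od, inoculum_ml=10, medium_ml=240), 3))
    # Four flasks of 250 ml each correspond to 1 litre of culture.
    print(wash_buffer_ml(4 * 250))
```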
2.E. Protein purification
Pellets obtained from 2.D. are thawed and resuspended in 95 ml of 50 mM Tris-HCl (pH 8.0). Pefabloc and lysozyme are added to final concentrations of 100 µM and 100 µg/ml, respectively. The mixture is homogenized with magnetic stirring at 5°C for 30 minutes. Benzonase (Merck) is added at a 1 U/ml final concentration, in the presence of 10 mM MgCl2, to ensure total digestion of the DNA. The suspension is sonicated (Branson Sonifier 450) for 3 cycles of 2 minutes each at maximum output. The homogenate is centrifuged at 19,000 x g for 15 minutes and both the supernatant and the pellet are analyzed by SDS-PAGE to detect the cellular location of the target protein in the soluble or insoluble fractions, as is described further below.
2.E.1. Soluble fraction
If the target protein is produced in a soluble form (i.e., in the supernatant obtained in 2.E.), NaCl and imidazole are added to the supernatant to final concentrations of 50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, and 10 mM imidazole (buffer A). The mixture is filtered through a 0.45 µm membrane and loaded onto an IMAC column (Pharmacia HiTrap chelating Sepharose; 1 ml), which has been charged with nickel ions according to the manufacturer's recommendations. After loading, the column is washed with 50 column volumes of buffer A and the recombinant target protein is eluted with 5 ml of buffer B (50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 500 mM imidazole).
The elution profile is monitored by measuring the absorbance of the fractions at 280 nm. Fractions corresponding to the protein peak are pooled, dialyzed against PBS containing 0.5 M arginine, filtered through a 0.22 µm membrane, and stored at -45°C.
2.E.2. Insoluble fraction
If the target protein is expressed in the insoluble fraction (pellets obtained from 2.E.), purification is conducted under denaturing conditions.
NaCl, imidazole, and urea are added to the resuspended pellet to final concentrations of 50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 10 mM imidazole, and 6 M urea (buffer C). After complete solubilization, the mixture is filtered through a 0.45 µm membrane and loaded onto an IMAC column.
The purification procedures on the IMAC column are the same as described in 2.E.1., except that 6 M urea is included in all buffers used and column volumes of buffer C are used to wash the column after protein loading, instead of 50 column volumes.
The protein fractions eluted from the IMAC column with buffer D (buffer C containing 500 mM imidazole) are pooled. Arginine is added to the solution to a final concentration of 0.5 M and the mixture is dialyzed against PBS containing 0.5 M arginine and various concentrations of urea (4 M, 3 M, 2 M, 1 M, and 0.5 M) to progressively decrease the concentration of urea. The final dialysate is filtered through a 0.22 µm membrane and stored at -45°C.
Alternatively, when the above purification process is not sufficiently efficient, two other processes may be used as follows. A first alternative involves the use of a mild denaturant, N-octyl glucoside (NOG). Briefly, a pellet obtained in 2.E. is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi and is clarified by centrifugation at 4,000-5,000 x g. The pellet is recovered, resuspended in 50 mM NaPO4 (pH 7.5) containing 1-2% weight/volume NOG, and homogenized. The NOG-soluble impurities are removed by centrifugation. The pellet is extracted once more by repeating the preceding extraction step. The pellet is dissolved in 8 M urea, 50 mM Tris (pH 8.0). The urea-solubilized protein is diluted with an equal volume of 2 M arginine, 50 mM Tris (pH 8.0), and is dialyzed against 1 M arginine for 24-48 hours to remove the urea. The final dialysate is filtered through a 0.22 µm membrane and stored at -45°C.
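The equal-volume dilution step in the NOG procedure can be checked with simple mixing arithmetic: combining one volume of 8 M urea, 50 mM Tris with one volume of 2 M arginine, 50 mM Tris gives 4 M urea, 1 M arginine, 50 mM Tris, so the arginine concentration already matches the 1 M arginine dialysis solution. A small sketch follows; the function name and dictionary keys are hypothetical, and ideal mixing with additive volumes is assumed.

```python
def mix(conc_a: dict, vol_a: float, conc_b: dict, vol_b: float) -> dict:
    """Concentrations after combining two solutions, assuming additive volumes."""
    total = vol_a + vol_b
    species = set(conc_a) | set(conc_b)
    return {
        s: (conc_a.get(s, 0.0) * vol_a + conc_b.get(s, 0.0) * vol_b) / total
        for s in species
    }


if __name__ == "__main__":
    solubilized = {"urea_M": 8.0, "tris_mM": 50.0}
    diluent = {"arginine_M": 2.0, "tris_mM": 50.0}
    print(mix(solubilized, 1.0, diluent, 1.0))
```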
A second alternative involves the use of a strong denaturant, such as guanidine hydrochloride. Briefly, a pellet obtained in 2.E. is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi and clarified by centrifugation at 4,000-5,000 x g. The pellet is recovered, resuspended in 6 M guanidine hydrochloride, and passed through an IMAC column charged with Ni2+. The bound antigen is eluted with 8 M urea (pH 8.5). Beta-mercaptoethanol is added to the eluted protein to a final concentration of 1 mM, then the eluted protein is passed through a Sephadex G-25 column equilibrated in 0.1 M acetic acid.
Protein eluted from the column is slowly added to 4 volumes of 50 mM
phosphate buffer (pH 7.0). The protein remains in solution.
2.F. Evaluation of the protective activity of the purified protein
Groups of 10 Swiss Webster mice (Taconic Labs) are immunized rectally with 25 µg of the purified recombinant protein, admixed with 1 µg of cholera toxin (Berna) in physiological buffer. Mice are immunized on days 0, 7, 14, and 21. Fourteen days after the last immunization, the mice are challenged with H. pylori strain ORV2001 grown in liquid media (the cells are grown on agar plates, as described in 2.A., and, after harvest, the cells are resuspended in Brucella broth; the flasks are then incubated overnight at 37°C).
Fourteen days after challenge, the mice are sacrificed and their stomachs are removed. The amount of H. pylori is determined by measuring the urease activity in the stomach and by culture.
2.G. Production of monospecific polyclonal antibodies
2.G.1. Hyperimmune rabbit antiserum
New Zealand rabbits are injected both subcutaneously and intramuscularly with 100 µg of a purified fusion polypeptide, as obtained in 2.E.1. or 2.E.2., in the presence of Freund's complete adjuvant and in a total volume of approximately 2 ml. Twenty one and 42 days after the initial injection, booster doses, which are identical to priming doses, except that Freund's incomplete adjuvant is used, are administered in the same way.
Fifteen days after the last injection, animal serum is recovered, decomplemented, and filtered through a 0.45 µm membrane.
2.G.2. Mouse hyperimmune ascites fluid
Ten mice are injected subcutaneously with 10-50 µg of a purified fusion polypeptide as obtained in 2.E.1. or 2.E.2., in the presence of Freund's complete adjuvant and in a volume of approximately 200 µl. Seven and 14 days after the initial injection, booster doses, which are identical to the priming doses, except that Freund's incomplete adjuvant is used, are administered in the same way.
Twenty one and 28 days after the initial injection, mice receive 50 µg of the antigen alone intraperitoneally. On day 21, mice are also injected intraperitoneally with sarcoma 180/TG cells CM26684 (Lennette et al., Diagnostic Procedures for Viral, Rickettsial, and Chlamydial Infections, 5th Ed. Washington DC, American Public Health Association, 1979). Ascites fluid is collected 10-13 days after the last injection.
EXAMPLE 3: Methods for producing transcriptional fusions lacking His-tags
Methods for amplification and cloning of DNA encoding the polypeptides of the invention as transcriptional fusions lacking His-tags are described as follows. Two PCR primers for each clone are designed based upon the sequences of the polynucleotides that encode them (SEQ ID NOs: 1-169 (odd numbers)). These primers can be used to amplify DNA encoding the polypeptides of the invention from any Helicobacter pylori strain, including, for example, ORV2001 and the strain deposited as ATCC deposit number 43579, as well as from other Helicobacter species.
The N-terminal primers are designed to include the ribosome binding site of the target gene, the ATG start site, and any signal sequence and cleavage site. The N-terminal primers can include a 5' clamp and a restriction endonuclease recognition site, such as that for BamHI (GGATCC), which facilitates subsequent cloning. Similarly, the C-terminal primers can include a restriction endonuclease recognition site, such as that for XhoI (CTCGAG), which can be used in subsequent cloning, and a TAA stop codon.
Amplification of genes encoding the polypeptides of the invention is carried out using Thermalase DNA Polymerase under the conditions described above in Example 2. Alternatively, Vent DNA polymerase (New England Biolabs), Pwo DNA polymerase (Boehringer Mannheim), or Taq DNA polymerase (Appligene) can be used, according to instructions provided by the manufacturers.
A single PCR product for each clone is amplified and cloned into appropriately cleaved pET 24 (e.g., BamHI-XhoI cleaved pET 24), resulting in construction of a transcriptional fusion that permits expression of the proteins without His-tags. The expressed products can be purified as denatured proteins that are refolded by dialysis into 1 M arginine.
Cloning into pET 24 allows transcription of the genes from the T7 promoter, which is supplied by the vector, but relies upon binding of the RNA-specific DNA polymerase to the intrinsic ribosome binding sites of the genes, and thereby expression of the complete ORF. The amplification, digestion, and cloning protocols are as described above for constructing translational fusions.
Amplification of clone GHPO 1190 DNA
Design of PCR primers for cloning
Two PCR primers are designed based on the complete gene sequence (see Table 1). The N-terminal primer (FC1) is designed to include the ribosome binding site of the target gene, the ATG start site, and the signal sequence (with cleavage site). It includes a clamp (GCC) at the 5' most end, and a SacI recognition sequence (GAGCTC) for cloning purposes.
The C-terminal primer (RN2) includes an XhoI recognition sequence for cloning purposes, and the natural TAA stop codon.
N-terminal primer (FC1):
5'-GCCGAGCTCCAAGCAAAAAAATGTCAATTAAAAGGG-3' (SEQ ID NO:189)
C-terminal primer (RN2):
5'-GCCCTCGAGGTCTAAATTAGAATAAGTGTTGTT-3' (SEQ ID NO:190)
Amplification of each specified gene can be achieved by employing FC1/RN2 primers for any of the genes described (see Table 1).
PCR conditions
Amplification of gene-specific DNA is carried out using Pwo DNA Polymerase (Boehringer Mannheim) under the following conditions. Due to the exonuclease activity of the polymerase, two reaction mixtures are prepared separately and combined just prior to amplification.
Reaction ingredients:
Ingredient (final conc.)    Mixture 1 (µl)   Mixture 2 (µl)
distilled H2O               160              79
dNTPs (200 µM each)         40               -
10X buffer                  -                20
primer 1 (100 nM)           1                -
primer 2 (100 nM)           1                -
Template (200 ng)           2                -
Cycling conditions          Temp. (°C)   Time (min)   Number of cycles
Initial denaturing step     96           4            1
Denaturing step             94           0.5          20
Annealing step              50           1            20
Extension step              72           1            20
Final extension step        72           1            1
A single PCR product of 624 basepairs is amplified and cloned into SacI-XhoI cleaved pET 24, allowing construction of a transcriptional fusion and expression of GHPO 1190 antigen in the absence of a His-tag. In this instance, expressed product can be purified as a denatured protein that is re-folded by dialysis into 1 M arginine.
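As a quick sanity check on the cycling programme tabulated above, the total programmed hold time can be summed step by step. The sketch below encodes the five steps as data; ramp times between temperatures are instrument-dependent and are not included, and the doubling figure is only a theoretical upper bound that assumes perfect efficiency in every cycle. All names are illustrative.

```python
# (step name, temperature in deg C, minutes per cycle, number of cycles)
PROGRAM = [
    ("initial denaturing", 96, 4.0, 1),
    ("denaturing", 94, 0.5, 20),
    ("annealing", 50, 1.0, 20),
    ("extension", 72, 1.0, 20),
    ("final extension", 72, 1.0, 1),
]


def total_hold_minutes(program) -> float:
    """Sum of programmed hold times, ignoring temperature ramping."""
    return sum(minutes * cycles for _name, _temp, minutes, cycles in program)


def max_fold_amplification(cycles: int) -> int:
    """Theoretical upper bound on amplification, assuming perfect doubling."""
    return 2 ** cycles


if __name__ == "__main__":
    print(total_hold_minutes(PROGRAM))
    print(max_fold_amplification(20))
```

For this programme the hold times add up to 55 minutes, and 20 cycles bound the amplification at 2^20, roughly a million-fold.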
Cloning into pET 24 allows transcription from the T7 promoter, supplied by the vector, but relies upon binding of the RNA-specific DNA polymerase to the intrinsic ribosome binding site for GHPO 1190, and thereby expression of the complete ORF. The amplification, restriction, and cloning protocols are as previously described for constructing translational fusions.
EXAMPLE 4: Purification of the polypeptides of the invention by immunoaffinity
4.A. Purification of specific IgGs
An immune serum, as prepared in section 2.G., is applied to a protein A Sepharose Fast Flow column (Pharmacia) equilibrated in 100 mM Tris-HCl (pH 8.0). The resin is washed by applying 10 column volumes of 100 mM Tris-HCl and 10 volumes of 10 mM Tris-HCl (pH 8.0) to the column. IgG antibodies are eluted with 0.1 M glycine buffer (pH 3.0) and are collected as ml fractions to which is added 0.25 ml 1 M Tris-HCl (pH 8.0). The optical density of the eluate is measured at 280 nm and the fractions containing the IgG antibodies are pooled, dialyzed against 50 mM Tris-HCl (pH 8.0), and, if necessary, stored frozen at -70°C.
4.B. Preparation of the column
An appropriate amount of CNBr-activated Sepharose 4B gel (1 g of dried gel provides for approximately 3.5 ml of hydrated gel; gel capacity is from 5 to 10 mg coupled IgG/ml of gel) manufactured by Pharmacia (17-0430-01) is suspended in 1 mM HCl buffer and washed with a buchner by adding small quantities of 1 mM HCl buffer. The total volume of buffer is 200 ml per gram of gel.
Purified IgG antibodies are dialyzed for 4 hours at 20 ± 5°C against 50 volumes of 500 mM sodium phosphate buffer (pH 7.5). The antibodies are then diluted in 500 mM phosphate buffer (pH 7.5) to a final concentration of 3 mg/ml.
IgG antibodies are mixed with the gel overnight at 5 ± 3°C. The gel is packed into a chromatography column and is washed with 2 column volumes of 500 mM phosphate buffer (pH 7.5), and 1 column volume of 50 mM sodium phosphate buffer, containing 500 mM NaCl (pH 7.5). The gel is then transferred to a tube, mixed with 100 mM ethanolamine (pH 7.5) for 4 hours at room temperature, and washed twice with 2 column volumes of PBS. The gel is then stored in 1/10,000 PBS/merthiolate. The amount of IgG antibodies coupled to the gel is determined by measuring the optical density (OD) at 280 nm of the IgG solution and the direct eluate, plus washings.
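The quantities given in 4.B. (about 3.5 ml of hydrated gel per gram of dried gel, 5 to 10 mg of coupled IgG per ml of gel, 200 ml of 1 mM HCl per gram of gel, IgG diluted to 3 mg/ml) allow the amounts needed for a given batch of antibody to be estimated. The sketch below uses the lower capacity figure by default, which is a conservative assumption; the function and key names are hypothetical.

```python
def gel_plan(igg_mg: float,
             capacity_mg_per_ml: float = 5.0,   # lower end of the 5-10 mg/ml range
             swell_ml_per_g: float = 3.5,
             wash_ml_per_g: float = 200.0,
             igg_mg_per_ml: float = 3.0) -> dict:
    """Estimate gel, wash buffer and IgG solution volumes for a coupling run."""
    gel_ml = igg_mg / capacity_mg_per_ml
    dry_g = gel_ml / swell_ml_per_g
    return {
        "hydrated_gel_ml": gel_ml,
        "dry_gel_g": dry_g,
        "hcl_wash_ml": dry_g * wash_ml_per_g,
        "igg_solution_ml": igg_mg / igg_mg_per_ml,
    }


if __name__ == "__main__":
    for key, value in gel_plan(35.0).items():
        print(f"{key}: {value:.1f}")
```

For 35 mg of IgG at the lower capacity this gives 7 ml of hydrated gel from 2 g of dried gel, washed with 400 ml of 1 mM HCl.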
4.C. Adsorption and elution of the antigen
An antigen solution in 50 mM Tris-HCl (pH 8.0), 2 mM EDTA, for example, the supernatant obtained in 3.E. or the solubilized pellet obtained in 3.E., after centrifugation and filtration through a 0.45 µm membrane, is applied to a column equilibrated with 50 mM Tris-HCl (pH 8.0), 2 mM EDTA, at a flow rate of about 10 ml/hour. The column is then washed with 20 volumes of 50 mM Tris-HCl (pH 8.0), 2 mM EDTA. Alternatively, adsorption can be achieved by mixing overnight at 5 ± 3°C.
The adsorbed gel is washed with 2 to 6 volumes of 10 mM sodium phosphate buffer (pH 6.8) and the antigen is eluted with 100 mM glycine buffer (pH 2.5). The eluate is recovered in 3 mL fractions, to each of which is added 150 µl of 1 M sodium phosphate buffer (pH 8.0). Absorption is measured at 280 nm for each fraction; those fractions containing the antigen are pooled and stored at -20°C.
EXAMPLE 5: Preparation of isolated DNA encoding the polypeptides of the invention from the deposited clones.
As mentioned above, E. coli strains including plasmids containing nucleic acids encoding GHPO 1190 (formerly HPO76, ATCC# 98197), GHPO 1212 (formerly HPO18, ATCC# 98210), GHPO 1012 (formerly HPO121, ATCC# 98201), GHPO 1501 (formerly HPO45, ATCC# 98208), GHPO 1688 (formerly HPO101, ATCC# 98198), GHPO 346 (formerly HPO116, ATCC# 98200), GHPO 1200 (formerly HPO7, ATCC# 98211), GHPO 1538 (formerly HPO104, ATCC# 98199), GHPO 1398 (formerly HPO15, ATCC# 98214), GHPO 1001 (formerly HPO58, ATCC# 98206), GHPO 470 (formerly HPO132, ATCC# 98202), GHPO 689 (formerly HPO9, ATCC# 98203), GHPO 1550 (formerly HPO38, ATCC# 98204), GHPO 1620 (formerly HPO87, ATCC# 98205), GHPO 574 (formerly HPO71, ATCC# 98217), GHPO 329 (formerly HPO70, ATCC# 98219), GHPO 1374 (formerly HPO80, ATCC# 98215), GHPO 956 (formerly HPO95, ATCC# 98216), HPO 98 (ATCC# 98218), GHPO 1346 (formerly HPO57, ATCC# 98220), GHPO 706 (formerly HPO50, ATCC# 98207), GHPO 732 (formerly HPO64, ATCC# 98213), GHPO 419 (formerly HPO54, ATCC# 98212), and GHPO 276 (formerly HPO42, ATCC# 98209) were deposited in E. coli strain DH5α under the Budapest Treaty with the American Type Culture Collection (ATCC; Rockville, Maryland) on October 9, 1996 and were designated with accession numbers indicated in parentheses above. These plasmids each contain a genomic DNA BglII-ClaI insert from H. pylori strain P1 or P12 (referred to as 69-A and 888-0 in Haas et al., Mol. Microbiol. (1993) 8:753). Each of the inserts is disrupted by the presence of transposon TnMax9 (Kahrs et al., Gene (1995) 167:53). DNA molecules lacking the transposon can be amplified from the plasmids using standard PCR techniques, such as inverse and recombinant PCR (see, e.g., Innis et al., supra), so that a full-length H. pylori insert is reconstituted. For example, the H. pylori sequences flanking the transposon can each be amplified by PCR, and then ligated together to form the full-length H. pylori gene lacking the transposon. Primers that can be used in these methods for each of the twenty-four deposited clones of the invention are shown in Table 1. The locations of insertion of the transposon in each of the deposited clones are between the nucleotides indicated in parentheses after the name of each clone, as follows: HPO101 (497-498), GHPO 1538 (428-429), GHPO 346 (433-444), GHPO 1012 (463-464), GHPO 132 (408-409), GHPO 1212 (226-227), GHPO 1550 (347-348), GHPO 276 (372-373), GHPO 1501 (299-300), GHPO 706 (29-293), GHPO 419 (351-352), GHPO 1346 (266-267), GHPO 1001 (434-435), GHPO 732 (224-225), GHPO 329 (114-115), GHPO 574 (274-275), GHPO 1190 (412-413), GHPO 1200 (349-350), GHPO 1374 (105-106), GHPO 1620 (26-27), GHPO 956 (64-65), HPO 98 (43-44), and GHPO 689 (346-347).
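The reconstitution step, in which the two H. pylori fragments flanking the transposon are joined into one full-length gene, can be modelled as merging two sequences that share a short overlap (Table 1 describes the overlap between the two PCR products as 10 nucleotides). The sketch below shows only that string-level join; it is not a simulation of the PCR itself, and the function name and example sequences are made up for illustration.

```python
def join_by_overlap(left: str, right: str, overlap: int = 10) -> str:
    """Join two fragments whose ends share an identical overlap region.

    left  -- upstream fragment (coding strand), ending with the overlap
    right -- downstream fragment (coding strand), starting with the overlap
    """
    left, right = left.upper(), right.upper()
    if len(left) < overlap or len(right) < overlap:
        raise ValueError("fragments are shorter than the overlap")
    if left[-overlap:] != right[:overlap]:
        raise ValueError("fragment ends do not share the expected overlap")
    return left + right[overlap:]


if __name__ == "__main__":
    upstream = "ATGAAACCCGGGTTTACGTACGTAC"          # made-up example fragment
    downstream = upstream[-10:] + "GGGTTTAAATAA"    # made-up fragment sharing 10 nt
    print(join_by_overlap(upstream, downstream))
```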
EXAMPLE 6: Purification of recombinant H. pylori antigen from GHPO 1190.
A pellet of E. coli expressing GHPO 1190 is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi, and clarified by centrifugation at 4000-5000 x g.
Method 1
The pellet containing cloned protein is suspended in buffer containing 2% N-octyl glucoside (NOG) and is homogenized. The NOG soluble protein is removed by centrifugation. The pellet is extracted one more time with 2%
NOG. After centrifugation, the pellet is dissolved in 8 M urea. The urea solubilized protein is diluted with an equal volume of 2 M arginine and dialyzed against 1 M arginine for 24-48 hours to remove urea. The cloned protein remains in solution. SDS-PAGE and Coomassie staining, followed by densitometric scanning, shows that the protein is 80-85% pure cloned antigen.
Method 2
The pellet containing cloned protein is solubilized in 6 M guanidine hydrochloride and is passed through an IMAC column charged with Ni2+. The bound antigen is eluted with 8 M urea (pH 8.5). β-Mercaptoethanol is added to the eluted protein to a final concentration of 1 mM, and the protein is then passed through a Sephadex G-25 column equilibrated in 0.1 M acetic acid. Protein eluted from the Sephadex G-25 column is slowly added to 4 volumes of 50 mM phosphate (pH 7.0). The protein remains in solution.
Purification of recombinant proteins
Recombinant proteins expressed as Histidine-tagged fusion proteins can be solubilized and purified by using a metal affinity column (nickel column).
The bound protein can be eluted with imidazole buffer, with or without urea, or by using low pH buffers, with or without urea. Urea or guanidine hydrochloride-denatured proteins can then be renatured using appropriate renaturing buffers. With a number of recombinant H. pylori antigens (HpaA and clone GHPO 1190), renaturation conditions using arginine hydrochloride (0.25-1 M) have been determined.
Recombinant proteins without a His-tag can be solubilized and purified using immunoaffinity, ion-exchange, sizing, and/or hydrophobic chromatography. Proteins expressed as insoluble aggregates in inclusion bodies can be solubilized in denaturing agents, such as 8 M urea or 6 M guanidine hydrochloride. Appropriate folding and renaturation can readily be determined by one skilled in the art.
The above pellet containing cloned protein is suspended in 50 mM NaPO4 (pH 7.5) containing 1% weight/volume N-octyl glucoside (NOG) and mixed vigorously. The NOG soluble impurities are removed by centrifugation.
The remaining pellet is extracted one more time with the 1% NOG solution to further remove impurities. After centrifugation, the pellet is solubilized in 8 M urea, 50 mM Tris (pH 8.0). The urea-solubilized protein is diluted with an equal volume of 2 M arginine, 50 mM Tris (pH 8.0), and is dialyzed against 1 M arginine, 50 mM Tris, 50 mM NaCl (pH 8.0) for 24-48 hours to remove urea. The cloned protein remains in solution following dialysis. SDS-PAGE and Coomassie staining followed by densitometric scanning shows that the protein is 80-85% pure cloned antigen.
Other embodiments are within the following claims.
Itl~;-(.'()N5'1'Itll(:'1'1()N (tl~ A (:U>\II'LII'1? Oltl~' 13l' RI?C0119131NAN'I' 1'CK
O
I=' denotes lorward primer R' denotes reverse primer n~
r.
C' denotes coding strand N' denotes non-coding strand AIf 1-C t and RN2 primers have incorporated at their 5' end a clamp and a recognition sequence for cloning purposes GUC clamp present for oning of entire gene sequence amplilication and cl from chromosomal DNA
[Xj denotes any nucleotide sequence not present in the completed gene sequence () hientifies region lap betweenIwo original PCR products, each clone of over the and is consistently 10 nucleotides long for CLONE Primer nt positionsPrimer sequence (5' - 3') Length Tm (oC) Vy tdo. type of gene seq.
C
7 6 FC1 304 - GCC [X] CAAGCAAAAAAATGTCAATTAAAAGGG2 7 7 0 ,,''', (TATGGAACTTA)GAACATTTTAACACGCTCTATTA33 60 RN2 927 - GCC(XJGTCTAAATTAGAATAAGTGTTGTT24 60 w 1 8 FC1 101 - GCC [Xj AATATATGGGAACTTAATGAGAAT2 4 6 0 ' ,.ml FC2 218 - (AAATCTCGCA) GAAATCTTTCACAAGCGAGCAA32 60 ",' ~~ RN2 922 - GCC (XJ ATGTCATGTCAAACTATGAAGC2 2 C
t21 FC1 141 - GCC[XJTCACAATGGATAAAAACAACAACA24 62 N RN1 45t - GCCCTTTTGTTTAGGGGTTAG 21 v FC2 455- 485 (ACAAAAGGGC)TTTTTAGAGCATGTGAGCCATC32 62 RN2 814 - GCC[XjCTGTCCAAATCAGCCACCC 19 G0 4 5 FC1 1 - 26 GCC [XJ ATGAAMGATTTGATTTGTTTf~ATC2 6 6 2 b FC2 290 - (AATACGGCTTTAAAGCTATAGAAAATTTAAACGC)34 60 n RN2 603 - GCC[XJTTAAATATCCCAATCCTGCCAC 22 62 ~o J
N.
W
W
f 101 FC1 3D8 - GCC (X] GMGGATTTATTATGATTAAAAGAA2 5 t~ 0 FC2 488 - (AAATTAGGTT)TTGTAGGCTTTGCCAATAMTG32 60 RN2 893 - GCC [X] AAGGMTAAATTAGAAAGTGAAGM2 5 6 2 N
W
t t s FC1 236 - GCC (X] CGCATTGATTTGATGAATAAACC2 3 6 2 FC2 425 - (GTTATAGGCG) ATAMGGTTTMCGCAGCTAAG3 2 6 0 RN2 812 - GCC (X] CTCACTAAAAAGCMTTTTTGAG'Z 3 6 0 7 FC1 195 - GCC [X] TMGGAATGAAGTTGATAAAATTTGT2 6 6 4 FC2 339 - (ATGMMTGC) ACGCCCAMTAATAAGGAAGTA3 2 6 0 RN2 738 - GCC [X] GGATTTATTGAGCTTTCCCCTT2 2 G 2 1 Oa FC1 25t - GCC (X] AMGGGCGAAMTGAGCMGA ? 1 6 D
RNi 429 - TMMTAACCMCAGAGTGATCA ~' a G 0 FC2 420 - (GGTTATTTTA) GTGGATATTTGGGTTTATAGCGA3 3 6 2 RN2 7S4 - GCC (X] TTTTTTAAGAATCACTTTCTTCGG2 4 f 2 N
m N vo 5 8 Fr1 11, 8 Gl_'C (X1 ATAC_;('_AACAA(TC.ATt~ITTT1-fAAAAC;2 f~ G 6 - 1, d, 3 FC2 425 - (CMGACTTCA) AAAAAGMGGAGCGGTTGCC3 0 6 0 RN2 650 - GCC [X] CTGGCTTATTGCGTATCATC 2 0 E; 0 1 3 2 Fc1 294 - GGC [X] GGMGMTMTGCTCGCTTCC 2 S 6 2 FC2 A00 - (ACACTCCAGT)AGATGCTTTCCCGGATATTrC31 60 RN2 761 - GCC [X] CTATTCTCCAGGGATATGGCC2 1 6 4 9 FC1 211 - GCC [Xj GATGGATTTTTTATGGGGGTGAG2 3 6 4 328 f7 FC2 338 - (CGGCAGTGCC) TTTAGCCTATTATTTAGMGCGA3 3 6 0 RN2 686 - GCC [X] ATGGTATTTGTCTAAGACCCTC2 2 6 2 s N
3 8 FC1 220 - GCC [X] AAAAGGGTTTTAAATAATGGCTG2 3 6 0 FC2 239 - (TTATCCTTGT) TGCTGGCTTGGTTTTTTTTAATT3 3 6 0 t~
RN2 597 - GCC [X] AAGATTCTAAAAGGGCTTCAAAT2 3 6 0 7 1 FCi 1 - 25 GCC [X] ATGTTGAAATTTAAATATGGTTTGA2 5 G 0 FC2 265 - (AGTGGGGTTT) TTTTAGGGGGTGGGTATGCT3 0 6 0 RN2 ~ 524 GCC [X] GAGCCTACAGGTTGCTTGC 2 0 G 0 70 FC1 1-23 GCC[X]ATGGTATTTGACAGAACAATCAG23 G2 RN1 115 - GAAAAGCCACCCCGCTTATT 20 f 0 FC2 106 - (GTGGCTTTTC)AAAAAGAGTGGGTGCAACAATT32 60 RN'? 495 - GCC [X) TTAGGAATAGCATAACAAACAAACG2 5 6 6 N
8 0 FC1 1 - 25 GCC [X] ATGTTAGAAAAATTGAITGAAAGAG2 5 6 2 FC2 97 - (TATGTGTTCA) TGAAAGAGTTGTGGCACATGC3 1 6 2 V
RN2 435 - GCC [X] TTATGCGATAGGGGGCGTATC2 1 6 6 m m 95 FC1 1-27 GCC[XjATGAAAAAATTTTITfCTCAATCTTT27 C FC2 55 - (CTACTGGCCA) TGGATGGCAATGGCGTTTnTtAG3 4 6 8 r- RN2 432 - GCC [X] TTATTGATGAACATTAACCATTAAA2 m 98 FC1 1-22 GCC[XjATGAAAACCTTTAAAAACCTGC 22 58 FC2 34 - (CTGATCGCTA) TGAGTTGGCTCCAAGCGGA2 9 6 0 RN2 336 - GCC [X] TTAAAACTCATAGCGTTTTTCAAT2 4 G 0 42 FC1 18-51 GCC[X]GAGAGTAGTGGCAGAGTTTATGCTGATTCC34 98 (,b RN1 380-351 (AACTTTTC)TCTATCCCAATTCGTTACGCTC30 64 FC2 366-396 (GGATAGA)GAAAAGTTTGGCGTCAAAAGTTGG31 f 8 RN2 822-801 GCC [X] GGCTTAAACTGGAACGGATTTC2 2 f 1 N
50 FC1 140-170 GCC[X]TAAAGTTTGCTAAAAAGATGGTTTTAATT131 76 RNt 297-270 (GACTTCTAAAG)CGTCCTT-TTITfCTTTA28 56 O
FC2 287-31 (CTTTA)GAAGTCATTAAACAAAGAGGGGT2 9 6 4 0~
RN2 607-584 GCC[X]CCCATCTTTAGAAATCAACCCCCA24 70 ' N
64 FC1 23-so GCC[X]GAAATCAAGGAGTTTGTATGCAACAGCG28 80 RN1 225-149 (A)AGCTTTTCATTATCTTCCCCATAAGC27 74 FC2 216-244 (fGAAAAGCT)TTTAGCGAAGCGATCAAGCC20 60 RN2 1039-1012GCC[X]CCCAATACTTTTATTGATTCACCATTTC28 74 54 FC1 21-48 GCG[X]CAATAAAACACCAAAATGAATGAGTTAC2(3 68 !' RN1 352-327 (A)GATTTTGTTTTGAGCGTTAGAAATG 26 m FC2 345-376 (CAAAATC)TATAAACTCAATCAAGTCAAAAATG32 62 RN2 1280-1255GCC[X]GCATTTACCCCCTAAAAACTATAAAC26 70 y "i o 15 FC1 14-35 GCC[X]CTGAAGGGTGTATGGTATTAGG 22 64 RN1 157-132 (C)ACCATACATGTATCCTGCATTAATG 26 68 FC2 147-179 (CATGTATGGT)GTAGCAAAGAATTTTAAGGAGGC33 64 RN2 377-349 GCC(X]CGTTAAAACTAAAGTTCTATTTTTAATTC29 70 w N vp 57 FC1 13-39 GCC[X1GTAAGGAATGAGATGATAAAGAGTTGG27 74 C RN1 267-244 (-~GGAATATTCTGATCCACGCCATC 24 FC2 258-294 (GAATATTCC)AAAAGCCGTTTTTTATTACAGAAGAC37 76 N RN2 957-934 GCC[X]CTAAACTCTGGCTTATTGCGTATC2a 6(3 Of ..
B 7 FCt t -22 GCC [XJ ATGCGTTTATTATTGTGGTGGG2 2 6 2 RN1 27-3 (C)AATACCCACCACAATAATAAACGCAT2 5 6 (i FC2 18-50 (GTGGGTATT)GGTATTATCGCTCTTTTTAAATCC33 64 RN2 519-4 GCC [X] TTAAATTTTTAGGGAAAGGGTA2 2 6 2
CONDITIONS FOR RECOMBINANT PCR
Two independent PCR reactions are carried out, one with the FC1/RN1 primers and one with the FC2/RN2 primers, under the same conditions proposed for cloning genes for expression.
After 20 cycles, the product of each reaction is used as template for a further 20 cycles with FC1/RN2 only. The product will encompass the full-length gene minus the transposon.
The presence of restriction sites at the 5' ends of these primers allows for cloning/expression studies.
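The two-round procedure described above can be illustrated with a minimal Python sketch. It is not part of the disclosed protocol: the function name `join_by_overlap` and the toy fragments are hypothetical, and the sketch only models how two first-round products that share a short overlap region would combine into one product when amplified with the outer primers alone.

```python
def join_by_overlap(upstream: str, downstream: str, min_overlap: int = 10) -> str:
    """Join two PCR products (coding strand, 5'->3') that share an overlap.

    Looks for the longest suffix of `upstream` that equals a prefix of
    `downstream` and merges the two fragments across it.
    """
    max_len = min(len(upstream), len(downstream))
    for n in range(max_len, min_overlap - 1, -1):
        if upstream[-n:] == downstream[:n]:
            return upstream + downstream[n:]
    raise ValueError("fragments do not share a sufficient overlap")


# Toy fragments (hypothetical, chosen only for illustration).
overlap = "ACAAAAGGGC"                                # 10-nt shared region
product_fc1_rn1 = "ATGAAAGGTCTTGCA" + overlap         # first-round product 1
product_fc2_rn2 = overlap + "TTTTTAGAGCATGTGAGC"      # first-round product 2

full_length = join_by_overlap(product_fc1_rn1, product_fc2_rn2)
print(full_length)
```

The default `min_overlap` of 10 mirrors the 10-nucleotide overlap mentioned in the table legend; real fragments would of course be read from sequencing or design files rather than typed in.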
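The Tm values in the primer table appear consistent with the simple "2 + 4" (Wallace) rule, Tm = 2 x (A+T) + 4 x (G+C), applied to the gene-specific part of each primer. The document does not state which formula was used, so this is an assumption; the helper name `wallace_tm` is hypothetical. The sketch below checks two primers whose table entries are legible.

```python
def wallace_tm(primer: str) -> int:
    """Estimate Tm in degrees C with the 2+4 rule: 2 per A/T, 4 per G/C."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


# Gene-specific parts of two FC1 primers as printed in the table
# (the GCC clamp and the [X] recognition sequence are left out).
clone_98_fc1 = "ATGAAAACCTTTAAAAACCTGC"   # listed as length 22, Tm 58
clone_15_fc1 = "CTGAAGGGTGTATGGTATTAGG"   # listed as length 22, Tm 64

print(len(clone_98_fc1), wallace_tm(clone_98_fc1))  # 22 58
print(len(clone_15_fc1), wallace_tm(clone_15_fc1))  # 22 64
```

A check of this kind can help decide between competing readings of a damaged Length or Tm cell, but it cannot recover a sequence whose bases are themselves illegible.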
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: ORAVAX, INC.
(ii) TITLE OF THE INVENTION: HELICOBACTER POLYPEPTIDES AND CORRESPONDING POLYNUCLEOTIDE MOLECULES
(iii) NUMBER OF SEQUENCES: 190
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Clark & Elbing LLP
(B) STREET: 176 Federal Street
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: USA
(F) ZIP: 02110-2214
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: UNKNOWN
(B) FILING DATE: 14-NOV-1997
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/749,051
(B) FILING DATE: 14-NOV-1996
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/831,309
(B) FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/834,705
(B) FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/833,457
(B) FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/881,227
(B) FILING DATE: 24-JUN-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/902,615
(B) FILING DATE: 29-JUL-1997
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Clark, Paul T.
(B) REGISTRATION NUMBER: 30,175
(C) REFERENCE/DOCKET NUMBER: 06132/028W01
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 617-428-0200
(B) TELEFAX: 617-428-7045
(C) TELEX:
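Each nucleotide entry in the listing gives a coding-sequence LOCATION, and the protein entry that follows it gives a LENGTH in amino acids. As a consistency check (a sketch only, not part of the listing; `cds_codons` is a hypothetical helper), the inclusive nucleotide span divided by three should equal the stated number of residues, assuming the LOCATION excludes the stop codon.

```python
def cds_codons(start: int, end: int) -> int:
    """Number of codons in an inclusive, 1-based coding-sequence location."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"span {span} is not a whole number of codons")
    return span // 3


# (LOCATION start, LOCATION end, stated protein length) for
# SEQ ID NO:1/2, NO:7/8 and NO:13/14 as given in the listing.
entries = [(71, 940, 290), (142, 2682, 847), (77, 535, 153)]
for start, end, stated in entries:
    print(start, end, cds_codons(start, end), stated)
```

For these three pairs the computed codon count matches the stated protein length, which makes the same arithmetic a useful cross-check wherever a digit in a LOCATION or LENGTH field is hard to read.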
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 989 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 71...940
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Met Lys Phe Leu Arg Ser Val Tyr Ala Phe Cys Ser Ser Trp Val Gly Thr Ile Val Ile Val Leu Leu Val Ile Phe Phe Ile Ala Gln Ala Phe Ile Ile Pro Ser Arg Ser Met Val Gly Thr Leu Tyr Glu Gly Asp Met Leu Phe Val Lys Lys Phe Ser Tyr Gly Ile Pro Ile Pro Lys Ile Pro Trp Ile Glu Leu Pro Val Met Pro Asp Phe Lys Asn Asn Gly His Leu Ile Glu Gly Asp Arg Pro Lys Arg Gly Glu Val Val Val TTT ATC CCT CCC CAT GAA AAA AAG TCT TAC T.AT GTT AAA AGG AAT TTT 397 Phe Ile Pro Pro His Glu Lys Lys Ser Tyr Tyr Val Lys Arg Asn Phe 95 l00 105 Ala Ile Gly Gly Asp Glu Val Leu Phe Thr Asn Glu Gly Phe Tyr Leu CAC CCT TTT GAG AGC GAC ACG GAC AAA AAT T.AC ATC GCT AAA CAT TAC 493 His Pro Phe Glu Ser Asp Thr Asp Lys Asn Tyr Ile Ala Lys His Tyr Pro Asn Ala Met Thr Lys Glu Phe Met Gly Lys Ile Phe Val Leu Asn CCT TAT AAA AAT GAG CAT CCG GGT ATC CAT T.AC CAA AAA GAC AAT GAA 589 Pro Tyr Lys Asn Glu His Pro Gly Ile His Tyr Gln Lys Asp Asn Glu ACC TTC CAC TTA ATG GAG CAA TTA GCC ACT C:AA GGC GCA GAA GCT AAT 637 Thr Phe His Leu Met Glu Gln Leu Ala Thr Gln Gly Ala Glu Ala Asn Ile Ser Met Gln Leu Ile Gln Met Glu Gly G.lu Lys Val Phe Tyr Lys l90 195 200 205 Lys Ile Asn Asp Asp Glu Phe Phe Met Ile GLy Asp Asn Arg Asp Asn Ser.Ser Asp Ser Arg Phe Trp Gly Ser Val A.la Tyr Lys Asn Ile Val Gly Ser Pro Trp Phe Val Tyr Phe Ser Leu S~'r Leu Lys Asn Ser Leu Glu Met Asp Ala Glu Asn Asn Pro Lys Lys A:rg Tyr Leu Val Arg Trp GAA CGC ATG TTT AAA AGC GTT GGA GGC TTA Gi~A AAA ATC ATT AAA AAA 925 Glu Arg Met Phe Lys Ser Val Gly Gly Leu G:Lu Lys Ile Ile Lys Lys Glu Asn Ala Thr His _77_ (2) INFORMATION FOR SEQ ID N0:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 290 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Lys Phe Leu Arg Ser Val Tyr Ala Phe Cys Ser Ser Trp Val Gly
Thr Ile Val Ile Val Leu Leu Val Ile Phe Phe Ile Ala Gln Ala Phe
Ile Ile Pro Ser Arg Ser Met Val Gly Thr Leu Tyr Glu Gly Asp Met
Leu Phe Val Lys Lys Phe Ser Tyr Gly Ile Pro Ile Pro Lys Ile Pro
Trp Ile Glu Leu Pro Val Met Pro Asp Phe Lys Asn Asn Gly His Leu
Ile Glu Gly Asp Arg Pro Lys Arg Gly Glu Val Val Val Phe Ile Pro
Pro His Glu Lys Lys Ser Tyr Tyr Val Lys Arg Asn Phe Ala Ile Gly
Gly Asp Glu Val Leu Phe Thr Asn Glu Gly Phe Tyr Leu His Pro Phe
Glu Ser Asp Thr Asp Lys Asn Tyr Ile Ala Lys His Tyr Pro Asn Ala
Met Thr Lys Glu Phe Met Gly Lys Ile Phe Val Leu Asn Pro Tyr Lys
Asn Glu His Pro Gly Ile His Tyr Gln Lys Asp Asn Glu Thr Phe His
Leu Met Glu Gln Leu Ala Thr Gln Gly Ala Glu Ala Asn Ile Ser Met
Gln Leu Ile Gln Met Glu Gly Glu Lys Val Phe Tyr Lys Lys Ile Asn
Asp Asp Glu Phe Phe Met Ile Gly Asp Asn Arg Asp Asn Ser Ser Asp
Ser Arg Phe Trp Gly Ser Val Ala Tyr Lys Asn Ile Val Gly Ser Pro
Trp Phe Val Tyr Phe Ser Leu Ser Leu Lys Asn Ser Leu Glu Met Asp
Ala Glu Asn Asn Pro Lys Lys Arg Tyr Leu Val Arg Trp Glu Arg Met
Phe Lys Ser Val Gly Gly Leu Glu Lys Ile Ile Lys Lys Glu Asn Ala
Thr His
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
WO 98/21225 PCT/US97/21353
(A) LENGTH: 514 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 112...471
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GGATTTTTTA GAGCTCTTAG TCAATGATAA TGTGGTAGAi'~ ACGATTGAAA AAGGCTTTGT 60 Met Gly GCA GTG GTT GTT TTA TTT TTA ACG CTG GTT TTi~ TTG TTT TTA GTT TTA 165 Ala Val Val Val Leu Phe Leu Thr Leu Val Leu Leu Phe Leu Val Leu Arg Asp Phe Gly Leu A1a Ser Pro Lys Gln Lys Ile Leu Ala Phe Leu ATC-GTA GGG ATT ATA GGA GCG AGC ATC AGC GT'r TAT ACT TAC AAG CAA 261 -Ile Val Gly Ile Ile Gly Ala Ser Ile Ser Va:L Tyr Thr Tyr Lys Gln AAC CAA CAA AAC CAA CAA GAG ATC GCT TTG CAe~ AGA GCG TTT TTA AGG 309 Asn Gln Gln Asn Gln Gln Glu Ile Ala Leu Gln Arg Ala Phe Leu Arg Gly Glu Thr Leu Leu Cys Lys Gly Ile Lys Va:1 Asn Asn Gln Thr Phe Asn Leu Val Ser Gly Thr Leu Ser Phe Leu G1~,~ Lys Lys Gln Thr Pro ATG AAA GAC GTT CTT GTG GAT TTG GAT TCT TG'P CAG ACG CTC CAA AAA 453 Met Lys Asp Val Leu Val Asp Leu Asp Ser Cy;~ Gln Thr Leu Gln Lys GAT CCC TTA ATC CAA CCC TAATGATGAA TAATAATi'~AT ACCCCACCCA AACCCCTA 509 Asp Pro Leu Ile Gln Pro - (2) INFORMATION FOR SEQ ID N0:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 120 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Gly Ala Val Val Val Leu Phe Leu Thr Leu Val Leu Leu Phe Leu
Val Leu Arg Asp Phe Gly Leu Ala Ser Pro Lys Gln Lys Ile Leu Ala
Phe Leu Ile Val Gly Ile Ile Gly Ala Ser Ile Ser Val Tyr Thr Tyr
Lys Gln Asn Gln Gln Asn Gln Gln Glu Ile Ala Leu Gln Arg Ala Phe
Leu Arg Gly Glu Thr Leu Leu Cys Lys Gly Ile Lys Val Asn Asn Gln
Thr Phe Asn Leu Val Ser Gly Thr Leu Ser Phe Leu Gly Lys Lys Gln
Thr Pro Met Lys Asp Val Leu Val Asp Leu Asp Ser Cys Gln Thr Leu
Gln Lys Asp Pro Leu Ile Gln Pro
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1233 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 135...1049
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Met Lys Lys Ala Leu Leu Leu Thr Leu Ser Leu Ser Phe Trp Leu His Ala Glu Arg Asn Gly Phe Tyr Leu Gly Leu Asn Phe -so-Leu Glu Gly Ser Tyr Ile Lys Gly Gln Gly Se:r Ile Gly Lys Lys Ala TCA GCA GAA AAC GCC TTA AAT GAA GCG ATC AA'C AAC GCA AAA AAT TCA 314 Ser Ala Glu Asn Ala Leu Asn Glu A1a Ile Asn Asn Ala Lys Asn Ser Leu Phe Pro Asn Thr Lys Ala Ile Arg Asp Ala Gln Asn Ala Leu Asn Ala Val Lys Asp Ser Asn Lys Ile Ala Ser Arc_3 Phe Ala Gly Asn Gly GGA TCG GGC GGT CTT TTT AAT GAG CTC AGC TT7.' GGG TAT AAA TAT TTT 458 Gly Ser Gly Gly Leu Phe Asn Glu Leu Ser Phee Gly Tyr Lys Tyr Phe TTG GGT AAA AAA AGG ATT ATA GGG TTT AGG CAC' TCT CTT TTT TTC GGT 506.
Leu Gly Lys Lys Arg Ile Ile Gly Phe Arg Hia Ser Leu Phe Phe Gly TAC CAA CTT GGT GGC GTT GGT TCT GTT CCT GG7.' AGC GGT TTA ATC GTT 554 Tyr Gln Leu Gly Gly Val Gly Ser Val Pro Gly Ser Gly Leu Ile Val TTT TTA CCC TAT GGT TTC AAT ACG GAT TTG CTC: ATT AAT TGG ACT AAC 602 Phe Leu Pro Tyr Gly Phe Asn Thr Asp Leu Leu Ile Asn Trp Thr Asn GAT AAG CGA GCG TCC CAA AAA TAT GTT GAA CGP, AGG GTA AAA GGG CTC 650 Asp Lys Arg Ala Ser Gln Lys Tyr Val Glu Arc_~ Arg Val Lys Gly Leu l60 165 170 Ser Ile Phe Tyr Lys Asp Met Thr Gly Arg Thr Leu Asp Ala Asn Thr TTA AAA AAA GCA TCA AGG CAT GTA TTT AGA AAA, TCT TCA GGG CTT GTG 746 Leu Lys Lys Ala Ser Arg His Val Phe Arg Lys Ser Ser Gly Leu Val Ile Gly Met Glu Leu Gly Gly Ser Thr Trp Phe Ala Ser Asn Asn Leu 205 2l0 215 220 ACC CCT TTC AAT CAA GTC AAG AGT CGC ACG ATT' TTT CAG TTG CAA GGA 842 Thr Pro Phe Asn Gln Val Lys Ser Arg Thr Ile Phe Gln Leu Gln Gly Lys Phe Gly Val Arg Trp Asn Asn Asp Glu Tyr Asp Ile Asp Arg Tyr -s1-Gly Asp Glu Ile Tyr Leu Gly Gly Ser Ser Val Glu Leu Gly Val Lys Val Pro Ala Phe Lys Val Asn Tyr Tyr Ser Asp Asp Tyr Gly Asp Lys Leu Asp Tyr Lys Arg Val Val Ser Val Tyr Leu Asn Tyr Thr Tyr Asn Phe Lys Asn Lys His AACCTTATTT TTTATTAGCT TGAAACTCTT CAAAGCCTTT TTTTCTCAAT TGGCATGCCG 1l50 (2) INFORMATION FOR SEQ ID N0:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Met Lys Lys Ala Leu Leu Leu Thr Leu Ser Leu Ser Phe Trp Leu His Ala Glu Arg Asn Gly Phe Tyr Leu Gly Leu Asn Phe Leu Glu Gly Ser Tyr Ile Lys Gly Gln Gly Ser Ile Gly Lys Lys Ala Ser Ala Glu Asn Ala Leu Asn Glu Ala Ile Asn Asn Ala Lys Asn Ser Leu Phe Pro Asn Thr Lys Ala Ile Arg Asp Ala Gln Asn Ala Leu Asn Ala Val Lys Asp Ser Asn Lys Ile Ala Ser Arg Phe Ala Gly Asn Gly Gly Ser Gly Gly Leu Phe Asn Glu Leu Ser Phe Gly Tyr Lys Tyr Phe Leu Gly Lys Lys Arg Ile Ile Gly Phe Arg His Ser Leu Phe Phe Gly Tyr Gln Leu Gly 115 120 l25 Gly Val Gly Ser Val Pro Gly Ser Gly Leu Ile Val Phe Leu Pro Tyr Gly Phe Asn Thr Asp Leu Leu Ile Asn Trp Thr Asn Asp Lys Arg Ala Ser Gln Lys Tyr Val Glu Arg Arg Val Lys Gly Leu Ser Ile Phe Tyr 165 170 l75 Lys Asp Met Thr Gly Arg Thr Leu Asp Ala Asn Thr Leu Lys Lys Ala Ser Arg His Val Phe Arg Lys Ser Ser Gly Leu Val Ile Gly Met Glu Leu Gly Gly Ser Thr Trp Phe Ala Ser Asn Asn Leu Thr Pro Phe Asn Gln Val Lys Ser Arg Thr Ile Phe Gln Leu Gln Gly Lys Phe Gly Val Arg Trp Asn Asn Asp Glu Tyr Asp Ile Asp Arch Tyr Gly Asp Glu Ile 245 250 ~ 255 Tyr Leu Gly Gly Ser Ser Val Glu Leu Gly Val Lys Val Pro Ala Phe Lys Val Asn Tyr Tyr Ser Asp Asp Tyr Gly Asp Lys Leu Asp Tyr Lys Arg Val Val Ser Val Tyr Leu Asn Tyr Thr Tyr Asn Phe Lys Asn Lys His (2) INFORMATION FOR SEQ ID N0:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3012 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 142...2682
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Met Lys Val Lys Ser Ile Ser Tyr Val Gly Leu Ser Tyr Met Ser Asp Met Leu Ala Asn Glu Ile Val Lys Ile Arg ~ 15 20 25 Va1 Gly Asp Ile Val Asp Ser Lys Lys Ile Asp Thr Ala Val Leu Ala Leu Phe Asn Gln Gly Tyr Phe Lys Asp Val Tyr Ala Thr Phe Glu Gly Gly Ile Leu Glu Phe His Phe Asp Glu Lys Ala Arg Ile Ala Gly Val GAA ATC AAG GGT TAT GGG ACT GAA AAG GAA AAA GAC GGC TTA AAA TCC 4l1 Glu Ile Lys Gly Tyr Gly Thr Glu Lys Glu Lys Asp Gly Leu Lys Ser Gln Met Gly Ile Lys Lys Gly Asp Thr Phe Asp Glu Gln Lys Leu Glu 95 100 l05 His Ala Lys Thr Ala Leu Lys Thr Ala Leu Glu G1y Gln Gly Tyr Tyr Gly Ser Val Val Glu Val Arg Thr Glu Lys Val Ser Glu Gly Ala Leu 125 l30 l35 Leu Ile Val Phe Asp val Asn Arg Gly Asp Ser Ile Tyr Ile Lys Gln l40 145 l50 Ser Ile Tyr Glu Gly Ser Ala Lys Leu Lys Arg Arg Met Ile Glu Ser 155 160 165 l70 Leu Ser Ala Asn Lys Gln Arg Asp Phe Met Gly Trp Met Trp Gly Leu l75 l80 185 Asn Asp Gly Lys Leu Arg Leu Asp Gln Leu Glu Tyr Asp Ser Met Arg 190 l95 200 ATC CAA GAT GTG TAT ATG CGT AGG GGT TAC TTA GAC GCT CAT ATT'TCT 795 Ile Gln Asp Val Tyr Met Arg Arg Gly Tyr Leu Asp Ala His Ile Ser Ser Pro Phe Leu Lys Thr Asp Phe Ser Thr His Asp Ala Lys Leu His Tyr Lys Val Lys Glu Gly Ile Gln Tyr Arg Ile Ser Asp Ile Leu Ile Glu Ile Asp Asn Pro Val Val Pro Leu Lys Thr Leu Glu Lys Ala Leu Lys Val Lys Arg Lys Asp Val Phe Asn Ile Glu His Leu Arg Ala Asp GCG CAA ATT TTA AAA ACC GAA ATC GCC GAT .AAG GGT TAT GCG TTT GCG 1035 Ala Gln Ile Leu Lys Thr Glu Ile Ala Asp Lys C3ly Tyr Ala Phe Ala GTG GTG AAG CCA GAC TTG GAT AAA GAT GAA .AAA AAC GGG CTT GTG AAA 1b83 ~ Val Val Lys Pro Asp Leu Asp Lys Asp Glu :Lys Asn Gly Leu Val Lys 300 305 3l0 Val Ile Tyr Arg Ile Glu Val Gly Asp Met 'Jal Tyr Ile Asn Asp Val ATC ATT TCA GGG AAC CAG CGC ACG AGC GAT AGG ATC ATT AGA AGG GAG 1l79 Ile Ile Ser Gly Asn Gln Arg Thr Ser Asp Arg Ile Ile Arg Arg Glu TTA TTG TTA GGG CCT AAG GAT AAA TAC AAC 'rTG ACC AAA CTG AGA AAT 1227 Leu Leu Leu Gly Pro Lys Asp Lys Tyr Asn :~eu Thr Lys Leu Arg Asn TCC GAA AAT TCT TTA AGG CGT TTA 
GGA TTC 'CTC TCT AAA GTC AAA ATT 1275 Ser Glu Asn Ser Leu Arg Arg Leu Gly Phe 7?he Ser Lys Val Lys Ile Glu Glu Lys Arg Val Asn Ser Ser Leu Met Asp Leu Leu Val Ser Val GAA GAG GGG CGT ACT GGG CAG TTG CAA TTT (3GG TTA GGC TAT GGC TCT 1371 Glu Glu Gly Arg Thr Gly Gln Leu Gln Phe Gly Leu Gly Tyr Gly Ser 395 400 405 4l0 Tyr Gly Gly Leu Met Leu Asn Gly Ser Val Ser Glu Arg Asn Leu Phe GGC ACA GGG CAA AGC ATG AGC TTG TAT GCT AAC ATC GCT ACA GGG GGG l467 Gly Thr Gly Gln Ser Met Ser Leu Tyr Ala Asn Ile Ala Thr Gly Gly Gly Arg Ser Tyr Pro Gly Met Pro Lys Gly Ala Gly Arg Met Phe Ala GGG AAT TTG AGC TTG ACT AAT CCA AGG ATT 7.'TT GAC AGC TGG TAT AGC 1563 Gly Asn Leu Ser Leu Thr Asn Pro Arg Ile Phe Asp Ser Trp Tyr Ser Ser Thr Ile Asn Leu Tyr Ala Asp Tyr Arg 7.1e Ser Tyr Gln Tyr Ile 475 480 9a85 490 Gln Gln Gly Gly Gly Phe Gly Val Asn Val Gly Arg Met Leu Gly Asn WO 98/21225 PCTlUS97121353 -AGA ACC CAT GTG AGC TTA GGG TAT AAC TTG AAT GTT ACC AAA CTC CTT l707 Arg Thr His Val Ser Leu Gly Tyr Asn Leu Asn Val Thr Lys Leu Leu Gly Phe Ser Ser Pro Leu Tyr Asn Arg Tyr Tyr Ser Ser Val Asn Glu GTG GTT TCT CCA AGG CAA TGT TCT ACC CCC GCA TCG GTG ATT ATC AAT l803 Val Val Ser Pro Arg Gln Cys Ser Thr Pro Ala Ser Val Ile Ile Asn Arg Leu Ser Gly Gly Lys Thr Pro Leu Gln Pro Glu Ser Cys Ser Ser Pro Gly Ala Ile Thr Thr Sex Pro Glu Ile Arg Gly Ile Trp Asp Arg Asp Tyr His Thr Pro Ile Thr Ser Ser Phe Thr Leu Asp Val Ser Tyr Asp Asn Thr Asp Asp Tyr Tyr Phe Pro Arg Asn Gly Val Ile Phe Ser 605 6l0 615 Ser Tyr Ala Thr Met Ser Gly Leu Pro Ser Ser Gly Thr Leu Asn Ser Trp Asn Gly Leu Gly Gly Asn Val Arg Asn Thr Lys Val Tyr Gly Lys Phe Ala Ala Tyr His His Leu Gln Lys Tyr Leu Leu Ile Asp Leu Ile Ala Arg Phe Lys Thr Gln Gly Gly Tyr Ile Phe Arg Tyr Asn Thr Asp Asp Tyr Leu Pro Leu Asn Ser Thr Phe Tyr Met Gly Gly Val Thr Thr Val Arg Gly Phe Arg Asn Gly Ser Val Thr Pro Lys Asp Glu Phe Gly TTG TGG CTT GGA GGC GAT GGG ATT TTT ACC GCT TCT ACT GAA TTG AGC 233l Leu Trp Leu Gly Gly Asp Gly Ile Phe Thr Ala Ser Thr.Glu Leu 
Ser WO 98I21225 PCT/US97/21353 w _g6._ Tyr Gly Val Leu Lys Ala Ala Lys Met Arg Leu Ala Trp Phe Phe Asp Phe Gly Phe Leu Thr Phe Lys Thr Pro Thr Arg Gly Ser Phe Phe Tyr _ AAC GCT CCT GTT ACG ACA GCG AAT TTT AAA GAT TAT GGC GTT ATA GGG 247S
Asn Ala Pro Val Thr Thr Ala Asn Phe Lys Asp Tyr Gly Val Ile Gly Ala Gly Phe Glu Arg Ala Thr Trp Arg Ala Ser Thr Gly Leu Gln Ile Glu Trp Ile Ser Pro Met Gly Pro Leu Val Leu Ile Phe Pro Ile Ala TTT TTC AAC CAA TGG GGC GAT GGC AAT GGC AAG AAA TGT AAA GGG CTA 2.619 Phe Phe Asn Gln Trp Gly Asp Gly Asn Gly Lys Lys Cys Lys Gly Leu Cys Phe Asn Pro Asn Met Asp Asp Tyr Thr Gln His Phe Glu Phe Ser Met Gly Thr Arg Phe CTGAAAACTTGACGAC'TTTT ATTGTGGATAGGAATA.TCAATTACACCAATATTTGTTTTG2843 (2) INFORMATION FOR SEQ ID N0:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 847 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
Met Lys Val Lys Ser Ile Ser Tyr Val Gly Leu Ser Tyr Met Ser Asp Met Leu Ala Asn Glu Ile Val Lys Ile Arg Val Gly Asp Ile Val Asp _s7_ Ser Lys Lys Ile Asp Thr Ala Val Leu Ala Leu Phe Asn Gln Gly Tyr Phe Lys Asp Val Tyr Ala Thr Phe Glu Gly Gly Ile Leu Glu Phe His Phe Asp Glu Lys Ala Arg Ile Ala Gly Val Glu Ile Lys Gly Tyr Gly Thr Glu Lys Glu Lys Asp Gly Leu Lys Ser Gln Met Gly Ile Lys Lys Gly Asp Thr Phe Asp Glu Gln Lys Leu Glu His Ala Lys Thr Ala Leu l00 105 l10 Lys Thr Ala Leu Glu Gly Gln Gly Tyr Tyr Gly Ser Val Val Glu Val Arg Thr Glu Lys Val Ser Glu Gly Ala Leu Leu Ile Val Phe Asp Val Asn Arg Gly Asp Ser Ile Tyr Ile Lys Gln Ser Ile Tyr Glu Gly Ser l45 l50 l55 160 Ala Lys Leu Lys Arg Arg Met Ile Glu Ser Leu Ser Ala Asn Lys Gln Arg Asp Phe Met Gly Trp Met Trp Gly Leu Asn Asp Gly Lys Leu Arg l80 185 190 Leu Asp Gln Leu Glu Tyr Asp Ser Met Arg Ile Gln Asp Val Tyr Met Arg Arg Gly Tyr Leu Asp Ala His Tle Ser Ser Pro Phe Leu Lfys Thr 2l0 215 220 Asp Phe 5er Thr His Asp Ala Lys Leu His Tyr Lys Val Lys Glu Gly Ile Gln Tyr Arg Ile Ser Asp Ile Leu Ile Glu Ile Asp Asn Pro Val Val Pro Leu Lys Thr Leu Glu Lys Ala Leu Lys Val Lys Arg Lys Asp Val Phe Asn Ile Glu His Leu Arg Ala Asp Ala Gln Ile Leu Lys Thr Glu Ile Ala Asp Lys Gly Tyr Ala Phe Ala Val Val Lys Pro Asp Leu Asp Lys Asp Glu Lys Asn Gly Leu Val Lys Val Ile Tyr Arg Ile Glu Val Gly Asp Met Val Tyr Ile Asn Asp Val Ile Ile Ser Gly Asn Gln Arg Thr Ser Asp Arg Ile Ile Arg Arg Glu Leu Leu Leu Gly Pro Lys Asp Lys Tyr Asn Leu Thr Lys Leu Arg Asn Ser Glu Asn Ser Leu Arg Arg Leu Gly Phe Phe Ser Lys Val Lys Ile Glu Glu Lys Arg Val Asn Ser Ser Leu Met Asp Leu Leu Val Ser Val Glu Glu Gly Arg Thr Gly Gln Leu Gln Phe Gly Leu Gly Tyr Gly Ser Tyr Gly Gly Leu Met Leu Asn Gly Ser Val Ser Glu Arg Asn Leu Phe Gly Thr Gly Gln Ser Met Ser Leu Tyr Ala Asn Ile Ala Thr Gly Gly Gly Arg Ser Tyr Pro Gly Met Pro Lys Gly Ala Gly Arg Met Phe Ala Gly Asn Leu Ser Leu Thr Asn Pro Arg Ile Phe Asp Ser Trp Tyr Ser Ser Thr Ile Asn Leu Tyr _88_ Ala Asp Tyr Arg 
Ile Ser Tyr Gln Tyr Ile Gln Gln G1y Gly Gly Phe Gly Val Asn Val Gly Arg Met Leu Gly Asn Arg Thr His Val Ser Leu Gly Tyr Asn Leu Asn Val Thr Lys Leu Leu Gly Phe Ser Ser Pro Leu Tyr Asn Arg Tyr Tyr Ser Ser Val Asn Glu Val Val Ser Pro Arg Gln Cys Ser Thr Pro Ala Ser Val Ile Ile Asn Arg Leu Ser Gly Gly Lys Thr Pro Leu Gln Pro Glu Ser Cys Ser Ser Pro Gly Ala Ile Thr Thr Ser Pro Glu Ile Arg Gly Ile Trp Asp Arg Asp Tyr His Thr Pro Ile Thr Ser Ser Phe Thr Leu Asp Val Ser Tyr Asp Asn Thr Asp Asp Tyr Tyr Phe Pro Arg Asn Gly Val Ile Phe Ser Ser Tyr Ala Thr Met Ser Gly Leu Pro Ser Ser Gly Thr Leu Asn Ser Trp Asn Gly Leu Gly Gly Asn Val Arg Asn Thr Lys Val Tyr Gly Lys Phe Ala Ala Tyr His His Leu Gln Lys Tyr Leu Leu Ile Asp Leu Ile Ala Arg Phe Lys Thr Gln Gly Gly Tyr Ile Phe Arg Tyr Asn Thr Asp Asp Tyr Leu Pro Leu Asn Ser Thr Phe Tyr Met Gly Gly Val Thr Thr Val Arg Gly Phe Arg Asn Gly Ser Val Thr Pro Lys Asp Glu Phe G1y Leu Trp Leu Gly Gly Asp Gly Ile Phe Thr Ala Ser Thr Glu Leu Ser Tyr Gly Val Leu Lys Ala Ala Lys Met Arg Leu Ala Trp Phe Phe Asp Phe Gly Phe Leu Thr Phe Lys Thr Pro Thr Arg Gly Ser Phe Phe Tyr Asn Ala Pro Val Thr Thr Ala Asn Phe Lys Asp Tyr Gly Val Ile Gly Ala Gly Phe Glu Arg Ala Thr Trp Arg Ala Ser Thr Gly Leu Gln Ile Glu Trp Ile Ser Pro Met Gly Pro Leu Val Leu Ile Phe Pro Ile Ala Phe Phe Asn Gln Trp Gly Asp Gly Asn Gly Lys Lys Cys Lys Gly Leu Cys Phe Asn Pro Asn Met Asp Asp Tyr Thr Gln His Phe Glu Phe Ser Met Gly Thr Arg Phe (2) INFORMATION FOR SEQ ID N0:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1032 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 149...913
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
Met Asp Ile Tyr Ala Leu Tyr Ile Ala Ile Gly Leu Phe Thr Gly Ile Leu Ser Gly Ile Phe Gly Ile Gly Gly Gly Leu Ile Ile Val Pro Ile Met Leu Ala Thr Gly His Ser Phe Glu Glu Ser Ile Gly Ile Ser Ile Leu Gln Met Ala Leu Ser Ser Phe Val Gly Ser Val Leu Asn Phe Lys Lys Lys Ser Leu Asp Phe Ser Leu Gly Leu Leu Ile Gly Ala Gly Gly Leu Ile Gly Ala Ser Phe Ser Gly Phe Val Leu Lys Ile Val Ser Ser Lys Ile Leu Met Val Ile Phe Ala Leu Leu Val Val Tyr Ser Met Ile Gln Phe Val Leu Lys Pro Lys Lys l05 110 115 120 Lys Asp Leu Ile Ala Asp Thr Lys Arg Tyr His Leu Gln Gly Leu Lys l25 l30 135 Leu Phe Leu Ile Gly Thr Leu Thr Gly Phe Phe Ala Ile Thr Leu Gly l40 145 150 Ile Gly Gly Gly Met Leu Met Val Pro Leu Met His Tyr Phe Leu Gly TAT GAT TCT AAA AAA TGC GTG GCT CTA GGG 'TTA TTT TTC ATC TTG TTT 700 Tyr Asp Ser Lys Lys Cys Val Ala Leu Gly :Leu Phe Phe Ile Leu Phe l70 175 1B0 TCT TCT ATT TCA GGA GCT TTT TCT TTA ATG 'TAT CAC CAC ATC ATC AAT 748 Ser Ser Ile Ser Gly Ala Phe Ser Leu Met 'Tyr His His Ile Ile Asn Lys Glu Val Leu Leu Ala Gly Ala Ile Val Gly Leu Gly Ser Val Met 205 2l0 215 Gly Val Ser Ile Gly Ile Lys Trp Ile Met Gly Leu Leu Asn Glu Lys Met His Lys Ala Leu Ile Leu Gly Val Tyr Gly Leu Ser Leu Leu Ile GTT TTA TAC AAA CTC TTT TTT TAATTGATGG T'TTTATACCA CTACTATTTT RAGA 9.47 Val Leu Tyr Lys Leu Phe Phe (2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Met Asp Ile Tyr Ala Leu Tyr Ile Ala Ile Gly Leu Phe Thr Gly Ile Leu Ser Gly Ile Phe Gly Ile Gly Gly Gly Leu Ile Ile Val Pro Ile Met Leu Ala Thr Gly His Ser Phe Glu Glu Ser Ile Gly Ile Ser Ile Leu Gln Met Ala Leu Ser Ser Phe Val Gly :3er Val Leu Asn Phe Lys Lys Lys Ser Leu Asp Phe Ser Leu Gly Leu Leu Ile Gly Ala Gly Gly 65 70 '75 80 Leu Ile Gly Ala Ser Phe Ser Gly Phe Val Leu Lys Ile Val Ser Ser Lys Ile Leu Met Val Ile Phe Ala Leu Leu Val Val Tyr Ser Met Ile - Gln Phe Val Leu Lys Pro Lys Lys Lys Asp Leu Ile Ala Asp Thr Lys WO 98I21225 PCT/US97121353 w 1l5 120 125 Arg Tyr His Leu Gln Gly Leu Lys Leu Phe Leu Ile Gly Thr Leu Thr 130 135 l40 Gly Phe Phe Ala Ile Thr Leu Gly Ile Gly Gly Gly Met Leu Met Val Pro Leu Met His Tyr Phe Leu Gly Tyr Asp Ser Lys Lys Cys Val Ala Leu Gly Leu Phe Phe Ile Leu Phe Ser Ser Ile Ser Gly Ala Phe Ser l80 185 190 Leu Met Tyr His His Ile Ile Asn Lys Glu Val Leu Leu Ala Gly Ala Ile Val Gly Leu Gly Ser Val Met Gly Val Ser Ile Gly Ile Lys Trp Ile Met Gly Leu Leu Asn Glu Lys Met His Lys Ala Leu Ile Leu Gly Val Tyr Gly Leu Ser Leu Leu Ile Val Leu Tyr Lys Leu Phe Phe (2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1057 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 66...980
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
Met Gly Arg Ile Glu Ser Lys Lys Arg Leu Lys Ala Leu Val Phe Leu Ala Ser Leu Gly Val Leu Trp Gly Asn Ser Ala Glu Lys Thr Pro Phe Phe Lys Thr Lys Asn His Ile Tyr Leu Gly Phe Arg Leu Gly Thr Gly Ala Asn Val His Thr Ser Met Trp Gln Gln Ala Tyr Lys Asp Asn Pro Thr Cys Pro Gly Ser Val Cys Tyr Gly Glu Lys Leu Glu Ala His Tyr Gln Gly Gly Lys Asn Leu Ser Tyr Thr Gly Gln Ile Gly Asp Glu ATA GCT TTT GAT AAA CAC CAT ATT TTA GGC 'TTA AGG GTG TGG GGG GAT 398 Ile Ala Phe Asp Lys His His Ile Leu Gly :Leu Arg Val Trp Gly Asp 100 105 1l0 GTA GAA TAC GCT AAA GCG CAA TTA GGT CAA ,AAA GTG GGG GGT AAT ACC 446 Val Glu Tyr Ala Lys Ala Gln Leu Gly Gln :Lys Val Gly Gly Asn Thr 115 l20 125 Leu Leu Ser Gln Ala Asn Tyr Asp Pro Asn ;11a Ile Lys Thr Tyr Asp TCT GCT TCA AAC ACT CAA GGC CCT TTA GTT 'rTG CAA AAA ACC CCA AGC 542 Ser Ala Ser Asn Thr Gln Gly Pro Leu Val :Leu Gln Lys Thr Pro Ser 145 l50 155 CCT CAA AAC TTC CTT TTC AAT AAC GGG CAT 'rTC ATG GCG TTT GGT TTG 590 Pro Gln Asn Phe Leu Phe Asn Asn Gly His :Phe Met Ala Phe Gly Leu Asn Val Asn Val Phe Val Asn Leu Pro Ile Asp Thr Leu Leu Lys Leu GCT TTA AAA ACA GAA AAA ATG CTG TTT TTT i~AA ATA GGC GTG TTT GGT 686 Ala Leu Lys Thr Glu Lys Met Leu Phe Phe l~ys Ile Gly Val Phe Gly GGG GGC GGG GTG GAA TAC GCA ATA TTA TGG i~GT CCT AAC TAT CAA AAT 734 Gly Gly Gly Val Glu Tyr Ala Ile Leu Trp tier Pro Asn Tyr Gln Asn 210 2l5 220 CAA AAC ACG AAA CAA GGC GAT AAA TTT TTT (3CA GCG GGT GGG GGG TTT 782 Gln Asn Thr Lys Gln Gly Asp Lys Phe Phe Ala Ala Gly Gly Gly Phe TTT GTG AAT TTT GGG GGT TCT TTG TAT ATA (3GC AAA CGC AAC CGC TTC 830 Phe Val Asn Phe Gly Gly Ser Leu Tyr Ile (31y Lys Arg Asn Arg Phe AAT GTG GGG TTA AAA ATC CCT TAC TAT AGC 'CTG AGC GCG CAA AGT TGG 87B
Asn Val Gly Leu Lys Ile Pro Tyr Tyr Ser Leu Ser Ala Gln Ser Trp Lys Asn Phe Gly Ser Ser Asn Val Trp Gln Gln Gln Thr Ile Arg Gln AAC TTC AGC GTT TTT AGG AAT AAA GAA GTT 'CTT GTC AGC TAC GCG TTC 974 Asn Phe Ser Val Phe Arg Asn Lys Glu Val I?he Val Ser Tyr Ala Phe Leu Phe (2) INFORMATION FOR SEQ ID N0:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Met Gly Arg Ile Glu Ser Lys Lys Arg Leu Lys Ala Leu Val Phe Leu Ala Ser Leu Gly Val Leu Trp Gly Asn Ser Ala Glu Lys Thr Pro Phe Phe Lys Thr Lys Asn His Ile Tyr Leu Gly Phe Arg Leu Gly Thr Gly Ala Asn Val His Thr Ser Met Trp Gln Gln Ala Tyr Lys Asp Asn Pro Thr Cys Pro Gly Ser Val Cys Tyr Gly Glu Lys Leu Glu Ala His Tyr Gln Gly Gly Lys Asn Leu Ser Tyr Thr Gly Gln Ile Gly Asp Glu Ile Ala Phe Asp Lys His His Ile Leu Gly Leu Arg Val Trp Gly Asp Val l00 105 1l0 Glu Tyr Ala Lys Ala Gln Leu Gly Gln Lys Val Gly Gly Asn Thr Leu l15 120 125 Leu Ser Gln Ala Asn Tyr Asp Pro Asn Ala Ile Lys Thr Tyr Asp Ser Ala Ser Asn Thr Gln Gly Pro Leu Val Leu Gln Lys Thr Pro Ser Pro 145 150 155 l60 Gln Asn Phe Leu Phe Asn Asn Gly His Phe Met Ala Phe Gly Leu Asn Val Asn Val Phe Val Asn Leu Pro Ile Asp Thr Leu Leu Lys Leu Ala Leu Lys Thr Glu Lys Met Leu Phe Phe Lys Ile Gly Val Phe Gly Gly Gly Gly Val Glu Tyr Ala Ile Leu Trp Ser Pro Asn Tyr Gln Asn Gln Asn Thr Lys Gln Gly Asp Lys Phe Phe Ala Ala Gly Gly Gly Phe Phe Val Asn Phe Gly Gly Ser Leu Tyr Ile Gly Lys Arg Asn Arg Phe Asn Val Gly Leu Lys Ile Pro Tyr Tyr Ser Leu Ser Ala Gln Ser Trp Lys Asn Phe Gly Ser Ser Asn Val Trp Gln Gln Gln Thr Ile Arg Gln Asn WO 98l21225 PCTlUS97121353 -Phe Ser Val Phe Arg Asn Lys Glu Val Phe Val Ser Tyr Ala Phe Leu Phe (2) INFORMATION FOR SEQ ID N0:1.3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 624 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 77...535
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
Met Glu Asn Asn Glu Asn His Glu Lys Leu Asn Gly GTT TTG CGC AAG TTT TTA GGC GAT GCG TTC .ACG CTT GAT GGG AAA GAA 160 Val Leu Arg Lys Phe Leu Gly Asp Ala Phe Thr Leu Asp Gly Lys Glu Gly Gly Leu Asn Met Glu Lys Leu Arg Glu .Ala Ile Lys Lys Glu Lys CCA ATC ATG AAT ATT TTG CTC ATG GGA GCT .?ACT GGG GTG GGT AAA AGC 256 Pro Ile Met Asn Ile Leu Leu Met Gly Ala 'rhr Gly Val Gly Lys Ser Ser Leu Ile Asn Ala Leu Phe Gly Lys Glu 'Val Ala Lys Ala Gly Val GGA AAA CCC ATC ACT CAG CAT CTT GAA AAA 'rAT GTT GAT GAA GAA AAA 352 Gly Lys Pro Ile Thr Gln His Leu Glu Lys 'Pyr Val Asp Glu G1u Lys GGC TTG ATT TTA TGG GAC ACT AAA GGC ATT (3AA GAT AAA GAT TAT GAA 400 Gly Leu Ile Leu Trp Asp Thr Lys Gly Ile Glu Asp Lys Asp Tyr Glu Asn Thr Leu Glu Ser Ile Lys Lys Glu Met Glu Asp Ser Phe Lys Thr Leu Asp Glu Lys Glu Ala Ile Asp Val Ala Tyr Leu Cys Val Lys Glu l25 l30 135 140 Thr Ser Gly Arg Val Gln Glu Arg Glu Arg Glu Ser Tyr (2) INFORMATION FOR SEQ ID N0:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 153 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
Met Glu Asn Asn Glu Asn His Glu Lys Leu Asn Gly Val Leu Arg Lys
Phe Leu Gly Asp Ala Phe Thr Leu Asp Gly Lys Glu Gly Gly Leu Asn
Met Glu Lys Leu Arg Glu Ala Ile Lys Lys Glu Lys Pro Ile Met Asn
Ile Leu Leu Met Gly Ala Thr Gly Val Gly Lys Ser Ser Leu Ile Asn
Ala Leu Phe Gly Lys Glu Val Ala Lys Ala Gly Val Gly Lys Pro Ile
Thr Gln His Leu Glu Lys Tyr Val Asp Glu Glu Lys Gly Leu Ile Leu
Trp Asp Thr Lys Gly Ile Glu Asp Lys Asp Tyr Glu Asn Thr Leu Glu
Ser Ile Lys Lys Glu Met Glu Asp Ser Phe Lys Thr Leu Asp Glu Lys
Glu Ala Ile Asp Val Ala Tyr Leu Cys Val Lys Glu Thr Ser Gly Arg
Val Gln Glu Arg Glu Arg Glu Ser Tyr
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1083 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 155...1033
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
AACCAAACAG TGCAATTTCA GGTGTCAGTA TTGC ATCi CCT GCT ACG CCA TTA AAT 175 Met: Pro Ala Thr Pro Leu Asn Phe Phe Asp Asn Glu Glu Leu Leu Pro Leu F~sp Asn Val Leu Glu Phe CTC AAA ATC GCC ATT GAT GAG GGC GTT AAA F,AA ATT AGA ATC ACG GGT 271 Leu Lys Ile Ala Ile Asp Glu Gly Val Lys L~ys Ile Arg Ile Thr Gly Gly Glu Pro Leu Leu Arg Lys Gly Leu Asp C~lu Phe Ile Ala Lys Leu CAC GCT TAC AAT AAA GAA GTG GAG TTA GTT T'TA AGC ACT AAT GGT TTT 367 His Ala Tyr Asn Lys Glu Val Glu Leu Val Leu Ser Thr Asn Gly Phe Leu Leu Lys Lys Met Ala Lys Asp Leu Lys A.sn Ala Gly Leu Ala Gln Val Asn Val Ser Leu Asp Ser Leu Lys Ser Asp Arg Val Leu Lys Tle Ser Gln Lys Asp Ala Leu Lys Asn Thr Leu Glu Gly Ile Glu Glu Ser Leu Lys Val Gly Leu Lys Leu Lys Leu Asn Thr Val Val Ile Lys Ser 120 l25 l30 135 Val Asn Asp Asp Glu Ile Leu Glu Leu Leu Glu Tyr Ala Lys Asn Arg His Ile Gln Ile Arg Tyr Ile Glu Phe Met Glu Asn Thr His Ala Lys _97_ Ser Leu Val Lys Gly Leu Lys Glu Arg Glu Ile Leu Asp Leu Ile Ala CAA AAA TAT CAA ATC ATT GAG GCA GAA AAA CCC AAA CAA GGG TCT TCT 75l Gln Lys Tyr Gln Ile Ile Glu Ala Glu Lys Pro Lys Gln Gly Ser Ser l85 190 195 Lys Ile Tyr Thr Leu Glu Asn Gly Tyr Gln Phe Gly Ile Ile Ala Pro His Ser Asp Asp Phe Cys Gln Ser Cys Asn Arg Ile Arg Leu Ala Ser Asp Gly Lys Ile Cys Pro Cys Leu Tyr Tyr Gln Asp Ala Ile Asp Ala Lys Glu Ala Ile Ile Asn Lys Asp Thr Lys Asn Ile Lys Arg Leu Leu Lys Gln Ser Val Ile Asn Lys Pro Glu Lys Asn Met Trp Asn Asp Lys Asn Ser Glu Thr Pro Thr Arg Ala Phe Tyr Tyr Thr Gly Gly (2) INFORMATION FOR SEQ ID N0:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 293 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
Met Pro Ala Thr Pro Leu Asn Phe Phe Asp Asn Glu Glu Leu Leu Pro
Leu Asp Asn Val Leu Glu Phe Leu Lys Ile Ala Ile Asp Glu Gly Val
Lys Lys Ile Arg Ile Thr Gly Gly Glu Pro Leu Leu Arg Lys Gly Leu
Asp Glu Phe Ile Ala Lys Leu His Ala Tyr Asn Lys Glu Val Glu Leu
Val Leu Ser Thr Asn Gly Phe Leu Leu Lys Lys Met Ala Lys Asp Leu
Lys Asn Ala Gly Leu Ala Gln Val Asn Val Ser Leu Asp Ser Leu Lys
Ser Asp Arg Val Leu Lys Ile Ser Gln Lys Asp Ala Leu Lys Asn Thr
Leu Glu Gly Ile Glu Glu Ser Leu Lys Val Gly Leu Lys Leu Lys Leu
Asn Thr Val Val Ile Lys Ser Val Asn Asp Asp Glu Ile Leu Glu Leu
Leu Glu Tyr Ala Lys Asn Arg His Ile Gln Ile Arg Tyr Ile Glu Phe
Met Glu Asn Thr His Ala Lys Ser Leu Val Lys Gly Leu Lys Glu Arg
Glu Ile Leu Asp Leu Ile Ala Gln Lys Tyr Gln Ile Ile Glu Ala Glu
Lys Pro Lys Gln Gly Ser Ser Lys Ile Tyr Thr Leu Glu Asn Gly Tyr
Gln Phe Gly Ile Ile Ala Pro His Ser Asp Asp Phe Cys Gln Ser Cys
Asn Arg Ile Arg Leu Ala Ser Asp Gly Lys Ile Cys Pro Cys Leu Tyr
Tyr Gln Asp Ala Ile Asp Ala Lys Glu Ala Ile Ile Asn Lys Asp Thr
Lys Asn Ile Lys Arg Leu Leu Lys Gln Ser Val Ile Asn Lys Pro Glu
Lys Asn Met Trp Asn Asp Lys Asn Ser Glu Thr Pro Thr Arg Ala Phe
Tyr Tyr Thr Gly Gly
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1181 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 121...1137 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
TACGATTACA AAGATGTTTT TGGGTTTAAG GCGGGGC'GCT ATGAAGCGAA TATTGATTTC 120 " Met Ser Gly Ser Asn Gln Gly Trp Glu Val Tyr Tyr Gln Pro Tyr Lys Thr Glu Thr Gln Arg Leu Arg Phe Trp Trp Trp Ser Ser Phe Gly Arg Gly Leu Ala Phe Asn Ser Trp Ile Tyr Glu Phe Phe Ala Thr Val Pro Tyr Leu Lys Lys Gly Gly Asn Pro Asn Asn Ser Asn Asp Phe Ile Asn Tyr Gly Trp His Gly Ile Thr Thr Thr Tyr Ser Tyr Lys Gly Leu Asp Ala Gln Phe Phe Tyr Tyr Phe Ala Pro Lys Thr Tyr Asn Ala Pro Gly Phe Lys Leu Val Tyr Asp Thr Asn Arg Asn Phe Gln Asn Val Gly Phe l00 l05 110 Arg Ser Gln Ser Met Ile Met Thr Thr Phe Pro Leu Tyr Tyr Arg Gly 115 l20 l25 Trp Tyr Asn Pro Glu Thr Asn Thr Tyr Ser Leu Glu Asp Ser Thr Pro His Gly Ser Leu Leu Gly Arg Asn Gly Val Thr Leu Asn Ile Arg Gln Val Phe Trp Trp Asp Asn Phe Asn Trp Ser Ile Gly Phe Tyr Asn Thr l65 170 175 Phe Gly Asn Ser Asp Ala Phe Leu Gly Ser His Thr Met Pro Arg Gly Asn Asn Thr Ser Tyr Ile Gly Ser Glu Ile Ser Ile Thr Thr Arg His Ala Gly Met Ile Gly Tyr Asp Phe Trp Asp Asn Thr Ala Tyr Asp Gly 210 2l5 220 Leu Ala Asp Ala Ile Thr Asn Ala Asn Thr Phe Thr Phe Tyr Thr Ser -l00-GTT GGA GGG ATC CAT AAG CGT TTT GCA TGG (.AT GTT TTT GGG CGC GTC 888 Val Gly Gly Ile His Lys Arg Phe Ala Trp His Val Phe Gly Arg Val Ser His Ala Asn Lys Asn Ala Leu Gly Gln Val Gly Arg Ala Asn Glu Tyr Ser Leu Gln Phe Asn Ala Ser Tyr Ala F?he Thr Glu Ser Ile Leu Leu Asn Phe Arg Ile Thr Tyr Tyr Gly Ala Arg Ile Asn Lys Gly Tyr Gln Ala Gly Tyr Phe Gly Ala Pro Lys Phe Asn Asn Pro Asp Gly Asp 305 310 3l5 320 Phe Ser Ala Asn Tyr Gln Asp Arg Ser Tyr Met Met Thr Asn Leu Thr Leu Lys Phe (2) INFORMATION FOR SEQ ID NO:1F3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 339 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
Met Ser Gly Ser Asn Gln Gly Trp Glu Val Tyr Tyr Gln Pro Tyr Lys
Thr Glu Thr Gln Arg Leu Arg Phe Trp Trp Trp Ser Ser Phe Gly Arg
Gly Leu Ala Phe Asn Ser Trp Ile Tyr Glu Phe Phe Ala Thr Val Pro
Tyr Leu Lys Lys Gly Gly Asn Pro Asn Asn Ser Asn Asp Phe Ile Asn
Tyr Gly Trp His Gly Ile Thr Thr Thr Tyr Ser Tyr Lys Gly Leu Asp
Ala Gln Phe Phe Tyr Tyr Phe Ala Pro Lys Thr Tyr Asn Ala Pro Gly
Phe Lys Leu Val Tyr Asp Thr Asn Arg Asn Phe Gln Asn Val Gly Phe
Arg Ser Gln Ser Met Ile Met Thr Thr Phe Pro Leu Tyr Tyr Arg Gly
Trp Tyr Asn Pro Glu Thr Asn Thr Tyr Ser Leu Glu Asp Ser Thr Pro
His Gly Ser Leu Leu Gly Arg Asn Gly Val Thr Leu Asn Ile Arg Gln
Val Phe Trp Trp Asp Asn Phe Asn Trp Ser Ile Gly Phe Tyr Asn Thr
Phe Gly Asn Ser Asp Ala Phe Leu Gly Ser His Thr Met Pro Arg Gly
Asn Asn Thr Ser Tyr Ile Gly Ser Glu Ile Ser Ile Thr Thr Arg His
Ala Gly Met Ile Gly Tyr Asp Phe Trp Asp Asn Thr Ala Tyr Asp Gly
Leu Ala Asp Ala Ile Thr Asn Ala Asn Thr Phe Thr Phe Tyr Thr Ser
Val Gly Gly Ile His Lys Arg Phe Ala Trp His Val Phe Gly Arg Val
Ser His Ala Asn Lys Asn Ala Leu Gly Gln Val Gly Arg Ala Asn Glu
Tyr Ser Leu Gln Phe Asn Ala Ser Tyr Ala Phe Thr Glu Ser Ile Leu
Leu Asn Phe Arg Ile Thr Tyr Tyr Gly Ala Arg Ile Asn Lys Gly Tyr
Gln Ala Gly Tyr Phe Gly Ala Pro Lys Phe Asn Asn Pro Asp Gly Asp
Phe Ser Ala Asn Tyr Gln Asp Arg Ser Tyr Met Met Thr Asn Leu Thr
Leu Lys Phe
(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 959 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 133...879 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
AGGATTTTAA GA ATG AAT GAC AAG CGT TTT AGA AAA TAT TGT AGT TTT TCT 17l Met Asn Asp Lys Arg Phe Arg Lys Tyr Cys Ser Phe Ser -loz-ATT TTT TTG TCC TTA TTA GGA ACG TTT GAA T7.'A GAG GCT AAA GAA GAA 219 Ile Phe Leu Ser Leu Leu Gly Thr Phe Glu Le:u Glu Ala Lys Glu Glu Glu Glu Lys Glu Glu Arg Lys Thr Glu Arg Lys Lys Glu Lys Asn Ala CAA CAC ACT CTA GGC AAG GTT ACC ACT CAA GC'.G GCT AAA ATC TTT AAC 3l5 Gln His Thr Leu Gly Lys Val Thr Thr Gln A7.a Ala Lys Ile Phe Asn Tyr Asn Asn Gln Thr Thr Ile Ser Ser Lys Gl.u Leu Glu Arg Arg Gln Ala Asn Gln Ile Ser Asp Met Phe Arg Arg A~3n Pro Asn Ile Asn Val Gly Gly Gly Ala Val Ile Ala Gln Lys Ile Tyr Val Arg Gly Ile Glu GAC AGA TTG GCT CGG GTT ACG GTG GAT GGG GC'.G GCG CAA ATG GGT GCA 507 Asp Arg Leu Ala Arg Val Thr Val Asp Gly Al.a Ala Gln Met Gly Ala 110 115 1a;0 125 AGC' TAT GGG CAT CAA GGC AAT ACG ATC ATT GAC CCT GGA ATG CTT AAA 555 Ser Tyr Gly His Gln Gly Asn Thr Ile Ile A~~p Pro Gly Met Leu Lys AGC GTG GTG GTT ACT AAA GGG GCG GCT CAA GC'.G AGC GCG GGG CCT ATG 603 Ser Val Val Val Thr Lys Gly Ala Ala Gln A7.a Ser Ala Gly Pro Met Ala Leu Ile Gly Ala Ile Lys Met Glu Thr Lys Ser Ala Ser Asp Phe Ile Pro Lys Gly Lys Asp Tyr Ala Ile Ser G7.y Ala Ala Thr Phe Leu l75 180 185 Thr Asn Phe Gly Asp Arg Glu Thr Va1 Met G7.y Ala Tyr Arg His Asn His Phe Asp Ala Leu Leu Tyr Tyr Thr His G7.n Asn Ile Phe Tyr Tyr CGT GAT GGG GAT AAT GCT ACA AAA GAT CTC T7.'T AGA CCT AAA GCG GAG 843 Arg Asp Gly Asp Asn Ala Thr Lys Asp Leu Phe Arg Pro Lys Ala Glu Asn Lys Val Thr Glu Val Leu Ala Ser Lys Thr Met (2) INFORMATION FOR SEQ ID N0:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 249 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
Met Asn Asp Lys Arg Phe Arg Lys Tyr Cys Ser Phe Ser Ile Phe Leu
Ser Leu Leu Gly Thr Phe Glu Leu Glu Ala Lys Glu Glu Glu Glu Lys
Glu Glu Arg Lys Thr Glu Arg Lys Lys Glu Lys Asn Ala Gln His Thr
Leu Gly Lys Val Thr Thr Gln Ala Ala Lys Ile Phe Asn Tyr Asn Asn
Gln Thr Thr Ile Ser Ser Lys Glu Leu Glu Arg Arg Gln Ala Asn Gln
Ile Ser Asp Met Phe Arg Arg Asn Pro Asn Ile Asn Val Gly Gly Gly
Ala Val Ile Ala Gln Lys Ile Tyr Val Arg Gly Ile Glu Asp Arg Leu
Ala Arg Val Thr Val Asp Gly Ala Ala Gln Met Gly Ala Ser Tyr Gly
His Gln Gly Asn Thr Ile Ile Asp Pro Gly Met Leu Lys Ser Val Val
Val Thr Lys Gly Ala Ala Gln Ala Ser Ala Gly Pro Met Ala Leu Ile
Gly Ala Ile Lys Met Glu Thr Lys Ser Ala Ser Asp Phe Ile Pro Lys
Gly Lys Asp Tyr Ala Ile Ser Gly Ala Ala Thr Phe Leu Thr Asn Phe
Gly Asp Arg Glu Thr Val Met Gly Ala Tyr Arg His Asn His Phe Asp
Ala Leu Leu Tyr Tyr Thr His Gln Asn Ile Phe Tyr Tyr Arg Asp Gly
Asp Asn Ala Thr Lys Asp Leu Phe Arg Pro Lys Ala Glu Asn Lys Val
Thr Glu Val Leu Ala Ser Lys Thr Met
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1306 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 40...1266 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 40...219 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 220...1266 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
Met Ser Tyr Thr Lys Lys Tyr Ser Thr Pro Pro Asn Arg Arg Lys Met Gln Asn Ile Ile Ala Ile Lys Arg Ser Ser Arg Val Asp Leu Gln Ala Cys Lys Leu Ala Phe Ala Ser Ser Arg Ser Pro Met Gln Phe Gln Lys Thr Leu Phe Pro Leu Pro Leu Leu Phe Leu Ser Cys Cys Ile Ala Glu Glu Asn Gly Ala Tyr Ala Ser Val Gly Phe Glu Tyr Ser Ile Ser His Ala Val Glu His Asn ' Asn Pro Phe Leu Asn Gln Glu Arg Ile Gln Ile Ile Ser Asn Ala Gln Asn Lys Ile Tyr Lys Leu Asn Gln Val Lys Asn Glu Ile Thr Ser Met Gln Asn Thr Phe Asn Tyr Ile Asn Asn Ala Leu Lys Asn Asn Ala Lys Leu Thr Pro Thr Glu Ile Gln Ala Glu Lys Tyr Tyr Leu Gln Ser Thr Leu Gln Asn Ile Glu Lys Ile Val Thr Leu Ser Gly Gly Val Ala Ser 90 95 l00 l05 Asn Pro Lys Leu Val Gln Ala Leu Glu Lys Met Gln Glu Pro Ile Thr 110 l15 120 Asn Pro Leu Glu Leu Ala Glu Asn Leu Arg Asn Leu Glu Leu Gln Phe l25 130 135 Ala Gln Ser Gln Asn Arg Met Leu Ser Ser Leu Ser Ser Gln Thr Ala Gl.n Ile Ser Asn Ser Leu Asn Ala Leu Asp Pro Ser Ser Tyr Ser Lys l55 160 165 Asn Ile Ser Ser Met Ser Gly Val Ser Leu Ser Val Gly Tyr Lys His Phe Phe Thr Lys Lys Lys Asn Gln Gly Phe Arg Tyr Tyr Leu Phe Tyr Asp Tyr Gly Tyr Thr Asn Phe Gly Phe Val Gly Asn Gly Phe Asp Gly Leu Gly Lys Met Asn Asn His Leu Tyr Gly Leu Gly Ile Asn Tyr Leu Tyr Asn Phe Ile Asp Asn Ala Gln Lys His Ser Ser Val Gly Phe Tyr Ala Gly Phe Ala Leu Ala Gly Asn Ser Trp Val Gly Asn Gly Leu Gly ATG TGG GTG AGC CAA ACG GAT TTT ATC AAC AAT TAC TTG ATG GGC TAT l062 Met Trp Val Ser Gln Thr Asp Phe Ile Asn Asn Tyr Leu Met Gly Tyr Gln Ala Lys Ile His Thr Asn Phe Phe Gln 7:1e Pro Leu Asn Phe Gly GTT CGT GTG AAT GTC AAT AGG CAT AAC GGA 7.'TT GAA ATG GGC CTA AAA 1158 Val Arg Val Asn Val Asn Arg His Asn Gly Phe Glu Met Gly Leu Lys Ile Pro Leu Ala Val Asn Ser Phe Tyr Glu Thr His Gly Lys Gly Leu Asn Thr Ser Leu Phe Phe Lys Arg Leu Val L'al Phe Asn Val Ser Tyr Val Tyr Ser Phe (2) INFORMATION FOR SEQ ID N0:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 409 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
Met Ser Tyr Thr Lys Lys Tyr Ser Thr Pro Pro Asn Arg Arg Lys Met
Gln Asn Ile Ile Ala Ile Lys Arg Ser Ser Arg Val Asp Leu Gln Ala
Cys Lys Leu Ala Phe Ala Ser Ser Arg Ser Pro Met Gln Phe Gln Lys
Thr Leu Phe Pro Leu Pro Leu Leu Phe Leu Ser Cys Cys Ile Ala Glu
Glu Asn Gly Ala Tyr Ala Ser Val Gly Phe Glu Tyr Ser Ile Ser His
Ala Val Glu His Asn Asn Pro Phe Leu Asn Gln Glu Arg Ile Gln Ile
Ile Ser Asn Ala Gln Asn Lys Ile Tyr Lys Leu Asn Gln Val Lys Asn
Glu Ile Thr Ser Met Gln Asn Thr Phe Asn Tyr Ile Asn Asn Ala Leu
Lys Asn Asn Ala Lys Leu Thr Pro Thr Glu Ile Gln Ala Glu Lys Tyr
Tyr Leu Gln Ser Thr Leu Gln Asn Ile Glu Lys Ile Val Thr Leu Ser
Gly Gly Val Ala Ser Asn Pro Lys Leu Val Gln Ala Leu Glu Lys Met
Gln Glu Pro Ile Thr Asn Pro Leu Glu Leu Ala Glu Asn Leu Arg Asn
Leu Glu Leu Gln Phe Ala Gln Ser Gln Asn Arg Met Leu Ser Ser Leu
Ser Ser Gln Thr Ala Gln Ile Ser Asn Ser Leu Asn Ala Leu Asp Pro
Ser Ser Tyr Ser Lys Asn Ile Ser Ser Met Ser Gly Val Ser Leu Ser
Val Gly Tyr Lys His Phe Phe Thr Lys Lys Lys Asn Gln Gly Phe Arg
Tyr Tyr Leu Phe Tyr Asp Tyr Gly Tyr Thr Asn Phe Gly Phe Val Gly
Asn Gly Phe Asp Gly Leu Gly Lys Met Asn Asn His Leu Tyr Gly Leu
Gly Ile Asn Tyr Leu Tyr Asn Phe Ile Asp Asn Ala Gln Lys His Ser
Ser Val Gly Phe Tyr Ala Gly Phe Ala Leu Ala Gly Asn Ser Trp Val
Gly Asn Gly Leu Gly Met Trp Val Ser Gln Thr Asp Phe Ile Asn Asn
Tyr Leu Met Gly Tyr Gln Ala Lys Ile His Thr Asn Phe Phe Gln Ile
Pro Leu Asn Phe Gly Val Arg Val Asn Val Asn Arg His Asn Gly Phe
Glu Met Gly Leu Lys Ile Pro Leu Ala Val Asn Ser Phe Tyr Glu Thr
His Gly Lys Gly Leu Asn Thr Ser Leu Phe Phe Lys Arg Leu Val Val
Phe Asn Val Ser Tyr Val Tyr Ser Phe
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1030 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 342...824 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
ACC ATC
AAA GTT
Met Thr Ile Lys Val Phe Ser Pro Lys Tyr Pro Thr Glu Leu Glu Glu Phe Tyr Ala Glu Arg Ile Ala Asp Asn Pro Leu Gly Phe Ile Gln ,Arg Leu Asp Leu Leu Pro Ser Ile Ser Gly Phe Val Gln Lys Leu Arg Glu His Gly Gly Glu Phe Phe Glu Met Arg Glu Gly Asn Lys Leu Ile Gly Ile Cys Gly Leu Asn CCT ATC AAT CAA ACA GAA GCC GAG CTG TGC .AAA TTC CAC ATA AAT AGT 596 Pro Ile Asn Gln Thr Glu Ala Glu Leu Cys Lys Phe His Ile Asn Ser Ala Tyr Gln Ser Gln Gly Leu Gly Gln Lys :Leu Tyr Glu Ser Val Glu AAA TAC GCT TTC ATT AAA GGC TAT ACT AAA .ATC TCT CTG CAT GTG AGC 692 Lys Tyr Ala Phe Ile Lys Gly Tyr Thr Lys Ile Ser Leu His Val Ser 105 l10 115 Lys Ser Gln Ile Lys Ala Cys Asn Leu Tyr Gln Lys Leu Gly Phe Val 120 l25 130 CAC ATC AAA GAA GAG GAT TGC GTG GTG GAG 'TTG GGC GAA GAG ACT TTG 788 His Ile Lys Glu Glu Asp Cys Val Val Glu :Leu Gly Glu Glu Thr Leu ATT TTC CCC ACT CTT TTT ATG GAA AAG ATT 'TTG TCT TGATTGGTGC ATCCAT 840 - Ile Phe Pro Thr Leu Phe Met Glu Lys Ile :Leu Ser l50 155 l60 TTGACACACG CCCAAGCGAC ATTCAAACTA TCAAACT'TTC ATTAACACAA CCCAATTAAC 900 GCTAAATAAA CCCTAAAACA AACACTCGTT GTTAAAA'TTT TGTTTTTCAA GCGCTTCGCA 960 AAGTTTTAGA AGCCCTATTT AGGGGTTAAC GCTAAAA'TAG GCTATCAAAA CTACTTTAAT 1020 ' GATTTTATAG 1030 (2) INFORMATION FOR SEQ ID N0:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 161 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
Met Thr Ile Lys Val Phe Ser Pro Lys Tyr Pro Thr Glu Leu Glu Glu
Phe Tyr Ala Glu Arg Ile Ala Asp Asn Pro Leu Gly Phe Ile Gln Arg
Leu Asp Leu Leu Pro Ser Ile Ser Gly Phe Val Gln Lys Leu Arg Glu
His Gly Gly Glu Phe Phe Glu Met Arg Glu Gly Asn Lys Leu Ile Gly
Ile Cys Gly Leu Asn Pro Ile Asn Gln Thr Glu Ala Glu Leu Cys Lys
Phe His Ile Asn Ser Ala Tyr Gln Ser Gln Gly Leu Gly Gln Lys Leu
Tyr Glu Ser Val Glu Lys Tyr Ala Phe Ile Lys Gly Tyr Thr Lys Ile
Ser Leu His Val Ser Lys Ser Gln Ile Lys Ala Cys Asn Leu Tyr Gln
Lys Leu Gly Phe Val His Ile Lys Glu Glu Asp Cys Val Val Glu Leu
Gly Glu Glu Thr Leu Ile Phe Pro Thr Leu Phe Met Glu Lys Ile Leu
Ser
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1477 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 374...1267 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
TTATTTCTTA ATACAAAAGGTAGGCGTTTT GAAACAT7.'TA CTCACACCAT240 ACCCCACTCA
CGTTACTAAA GTC ATG ACC AAT TGC GAC AF,T ATT TTT AAC 409 ATG AAA GAT
Met Met Thr Asn Cys Asp Asn Ile Phe Asn Lys Asp Ala Lys Gln Lys Glu Val Leu Lys Ala Ala Tyr Gln Phe Gly Ser Lys Glu Asn Leu Gly Tyr Glu Met Ala Gly Ile A.la Trp Lys Glu Ser Cys Ala Gly Val Tyr Lys Ile Asn Phe Ser Asp Pro Ser Ala Gly Val Tyr CAT TCT TAT ATC CCA AGC GTT CTA AAA AGC TAT GGG CAT AAT GAT AGC 60l His Ser Tyr Ile Pro Ser Val Leu Lys Ser Tyr Gly His Asn Asp Ser Pro Phe Leu Arg Asn Val Met Gly Glu Leu Leu Ile Lys Asp Asp Ala TTT GCT TCT GAA GTG GCT TTA AAA GAG TTG C'TC TAT TGG AAA ACA CGC 697 Phe Ala Ser Glu Val Ala Leu Lys Glu Leu Leu Tyr Trp Lys Thr Arg Tyr His Asp Asn Leu Lys Asp Met Ile Lys S~~_r Tyr Asn Lys Gly Ser Arg Trp Glu Arg Ser Glu Lys Ser Asn Ala Asp Ala Glu Lys Tyr Tyr l25 130 l35 140 Glu Glu Ile Gln Asp Arg Ile Arg Arg Leu Lys Glu Ser Lys Ile Phe Asp Ser Gln Ser Ser Asn Asp Gln Glu Leu Gln Lys Ser Ala Asn Ser 160 l65 170 AAC CTG GAT TTA GAC CCT ATC GGC AAC GCC A'CG CCC CAA GCC TTA ATT 937 Asn Leu Asp Leu Asp Pro Ile Gly Asn Ala Met Pro Gln Ala Leu Ile Ala Lys Glu Thr Lys Ile Glu Glu Thr Gln Ala Glu Lys Ser Gln Glu Met Lys Glu Thr Thr Ser Glu Gln Thr Lys Ser Lys Pro Glu Lys Ala 205 2l0 2l5 220 AAA GAT AAA CCC ATG TAT TTG GCG CAA ATC AAC AGC ACT GAT TTC ACA 108l Lys Asp Lys Pro Met Tyr Leu Ala Gln Ile Asn Ser Thr Asp Phe Thr Pro Val Lys Lys Ser Pro Lys Lys Pro Ala Lys Val Ser Gln Lys His TCC TTT AAG AAT AAC ATT AAA AAT AAT GTA AAA AAC AAC GCC AAA ACC 1177 _ Ser Phe Lys Asn Asn Ile Lys Asn Asn Val Lys Asn Asn Ala Lys Thr Ala Ser Lys Lys Gln Glu Met Cys Lys Asn Cys Ser Pro Gly Gln Arg Asn Ala Ile Leu Ala Asn His Ile Thr Leu Met Gln Glu Leu (2) INFORMATION FOR SEQ ID N0:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 298 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
Met Met Thr Asn Cys Asp Asn Ile Lys Asp Phe Asn Ala Lys Gln Lys
Glu Val Leu Lys Ala Ala Tyr Gln Phe Gly Ser Lys Glu Asn Leu Gly
Tyr Glu Met Ala Gly Ile Ala Trp Lys Glu Ser Cys Ala Gly Val Tyr
Lys Ile Asn Phe Ser Asp Pro Ser Ala Gly Val Tyr His Ser Tyr Ile
Pro Ser Val Leu Lys Ser Tyr Gly His Asn Asp Ser Pro Phe Leu Arg
Asn Val Met Gly Glu Leu Leu Ile Lys Asp Asp Ala Phe Ala Ser Glu
Val Ala Leu Lys Glu Leu Leu Tyr Trp Lys Thr Arg Tyr His Asp Asn
Leu Lys Asp Met Ile Lys Ser Tyr Asn Lys Gly Ser Arg Trp Glu Arg
Ser Glu Lys Ser Asn Ala Asp Ala Glu Lys Tyr Tyr Glu Glu Ile Gln
Asp Arg Ile Arg Arg Leu Lys Glu Ser Lys Ile Phe Asp Ser Gln Ser
Ser Asn Asp Gln Glu Leu Gln Lys Ser Ala Asn Ser Asn Leu Asp Leu
Asp Pro Ile Gly Asn Ala Met Pro Gln Ala Leu Ile Ala Lys Glu Thr
Lys Ile Glu Glu Thr Gln Ala Glu Lys Ser Gln Glu Met Lys Glu Thr
Thr Ser Glu Gln Thr Lys Ser Lys Pro Glu Lys Ala Lys Asp Lys Pro
Met Tyr Leu Ala Gln Ile Asn Ser Thr Asp Phe Thr Pro Val Lys Lys
Ser Pro Lys Lys Pro Ala Lys Val Ser Gln Lys His Ser Phe Lys Asn
Asn Ile Lys Asn Asn Val Lys Asn Asn Ala Lys Thr Ala Ser Lys Lys
Gln Glu Met Cys Lys Asn Cys Ser Pro Gly Gln Arg Asn Ala Ile Leu
Ala Asn His Ile Thr Leu Met Gln Glu Leu
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1515 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 141...1340 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
TTAGTGTTGA TTTTTTTATC GTTAGTGTTT GTGCGTCC'TT TAGAGGCTTT GAGCGTGTTT 60 Met Lys Glu Ser Phe Tyr Ile Glu Gly Met Thr Cys Thr Ala Cys Ser Ser Gly Ile Glu A.rg Ser Leu Gly Arg Lys AGT TTT GTG AAA AAA ATA GAA GTG AGC CTT T'TA AAT AAG AGC GCT AAC 266 Ser Phe Val Lys Lys Ile Glu Val Ser Leu Leu Asn Lys Ser Ala Asn I1e Glu Phe Asp Glu Asn Gln Thr Asn Leu Asp Glu Ile Phe Lys Leu Ile Glu Lys Leu Gly Tyr Ser Pro Lys Lys Ala Leu Thr Lys Glu Lys Lys Glu Phe Phe Ser Pro Asn Val Lys Leu Ala Leu Ala Val Ile Phe Thr Leu Phe Val Val Tyr Leu Ser Met Gly Ala Met Leu Ser Pro Ser 95 l00 10S
Leu Leu Pro Glu Ser Leu Leu Ala Ile Asp Asn His Ser Asn Phe Leu 110 l15 120 Asn Ala Cys Leu Gln Leu Ile Gly Ala Leu Ile Val Met His Leu Gly l25 l30 135 Arg Asp Phe Tyr Ile Gln Gly Phe Lys Ala Leu Trp His Arg Gln Pro l40 145 150 Asn Met Ser Ser Leu Ile Ala Ile Gly Thr Ser Ala Ala Leu Ile Ser l55 160 165 170 Ser Leu Trp Gln Leu Tyr Leu Val Tyr Thr Asn His Tyr Thr Asp Gln Trp Ser Tyr Gly His Tyr Tyr Phe Glu Ser Val Cys Val Ile Leu Met l90 19S 200 Phe Val Met Val Gly Lys Arg Ile Glu Asn Val Ser Lys Asp Lys Ala Leu Asp Ala Met Gln Ala Leu Met Lys Asn Ala Pro Lys Thr Ala Leu Lys Met Gln Asn Asn Gln Gln Ile Glu Val Leu Val Asp Ser Ile Val Val Gly Asp Ile Leu Lys Val Leu Pro Gly ~~er Ala Ile Ala Val Asp Gly Glu Ile Ile Glu Gly Glu Gly Glu Leu F,sp Glu Ser Met Leu Ser Gly Glu Ala Leu Pro Val Tyr Lys Lys Val Gly Asp Lys Val Phe Ser Gly Thr Phe Asn Ser His Thr Ser Phe Leu Met Lys Ala Thr Gln Asn Asn Lys Asn Ser Thr Leu Ser Gln Ile Ile Glu Met Ile Tyr Asn Ala CAA AGT TCA AAG GCA GAG ATT TCT CGC TTA GCG GAT AAG GTT TCA AGC 1l78 Gln Ser Ser Lys Ala Glu Ile Ser Arg Leu Ala Asp Lys Val Ser Ser Val Phe Val Pro Ser Val Ile Ala Ile Ser Ile Leu A1a Phe Val Val Trp Leu Ile Ile Ala Pro Lys Pro Asp Phe Trp Trp Asn Phe Gly Ile Ala Leu Glu Val Phe Val Ser Val Leu Val Ile Ser Cys Pro Cys Ala Leu Gly Leu Leu Arg Leu GGGTTATTTT TTAAAGACGC TAAAAGTTTA GAAAAAGC.~A GGCTAGTCAA TACGATCGTT 1438 (2) INFORMATION FOR SEQ ID N0:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 400 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
Met Lys Glu Ser Phe Tyr Ile Glu Gly Met Thr Cys Thr Ala Cys Ser
Ser Gly Ile Glu Arg Ser Leu Gly Arg Lys Ser Phe Val Lys Lys Ile
Glu Val Ser Leu Leu Asn Lys Ser Ala Asn Ile Glu Phe Asp Glu Asn
Gln Thr Asn Leu Asp Glu Ile Phe Lys Leu Ile Glu Lys Leu Gly Tyr
Ser Pro Lys Lys Ala Leu Thr Lys Glu Lys Lys Glu Phe Phe Ser Pro
Asn Val Lys Leu Ala Leu Ala Val Ile Phe Thr Leu Phe Val Val Tyr
Leu Ser Met Gly Ala Met Leu Ser Pro Ser Leu Leu Pro Glu Ser Leu
Leu Ala Ile Asp Asn His Ser Asn Phe Leu Asn Ala Cys Leu Gln Leu
Ile Gly Ala Leu Ile Val Met His Leu Gly Arg Asp Phe Tyr Ile Gln
Gly Phe Lys Ala Leu Trp His Arg Gln Pro Asn Met Ser Ser Leu Ile
Ala Ile Gly Thr Ser Ala Ala Leu Ile Ser Ser Leu Trp Gln Leu Tyr
Leu Val Tyr Thr Asn His Tyr Thr Asp Gln Trp Ser Tyr Gly His Tyr
Tyr Phe Glu Ser Val Cys Val Ile Leu Met Phe Val Met Val Gly Lys
Arg Ile Glu Asn Val Ser Lys Asp Lys Ala Leu Asp Ala Met Gln Ala
Leu Met Lys Asn Ala Pro Lys Thr Ala Leu Lys Met Gln Asn Asn Gln
Gln Ile Glu Val Leu Val Asp Ser Ile Val Val Gly Asp Ile Leu Lys
Val Leu Pro Gly Ser Ala Ile Ala Val Asp Gly Glu Ile Ile Glu Gly
Glu Gly Glu Leu Asp Glu Ser Met Leu Ser Gly Glu Ala Leu Pro Val
Tyr Lys Lys Val Gly Asp Lys Val Phe Ser Gly Thr Phe Asn Ser His
Thr Ser Phe Leu Met Lys Ala Thr Gln Asn Asn Lys Asn Ser Thr Leu
Ser Gln Ile Ile Glu Met Ile Tyr Asn Ala Gln Ser Ser Lys Ala Glu
Ile Ser Arg Leu Ala Asp Lys Val Ser Ser Val Phe Val Pro Ser Val
Ile Ala Ile Ser Ile Leu Ala Phe Val Val Trp Leu Ile Ile Ala Pro
Lys Pro Asp Phe Trp Trp Asn Phe Gly Ile Ala Leu Glu Val Phe Val
Ser Val Leu Val Ile Ser Cys Pro Cys Ala Leu Gly Leu Leu Arg Leu
(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1443 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 76...1389 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
Met Lys Lys Ile Trp Leu Leu Val Trp Gly Leu Cys TCT TGG GTG TTT TTG CAT GCG ATA GAG ATG ATA GAA AAA GCC CCT ACA l59 Ser Trp Val Phe Leu His Ala Ile Glu Met Ile Glu Lys Ala Pro Thr Asn Val Glu Asp Arg Asp Lys Ala Pro His Leu Leu Leu Leu Ala Gly Ile Gln Gly Asp Glu Pro Gly Gly Phe Asn Ala Thr Asn Leu Phe Leu Met His Tyr Ser Val Leu Lys Gly Leu Val Glu Val Val Pro Val Leu Asn Lys Pro Ser Met Leu Arg Asn His Arg Gly Leu Tyr Gly Asp Met Asn Arg Lys Phe Ala Ala Leu Asp Lys Asn Asp Pro Glu Tyr Pro Thr Ile Gln Glu Ile Lys Ser Leu Ile Ala Lys Pro Ser Ile Asp Ala Val 1l0 1l5 120 Leu His Leu His Asp Gly Gly Gly Tyr Tyr Arg Pro Val Tyr Val Asp Ala Met Leu Asn Pro Lys Arg Trp Gly Asn Cys Phe Ile Ile Asp Gln Asp Glu Val Lys Gly Ala Lys Phe Pro Asn Leu Leu Ala Phe Ala Asn 160 l65 170 Asn Thr Ile Glu Ser Ile Asn Ala His Leu Leu His Pro Ile Glu Glu Tyr His Leu Lys Asn Thr Arg Thr Ala Gln Gly Asp Thr Glu Met Gln 190 l95 200 Lys Ala Leu Thr Phe Tyr Ala Ile Asn Gln Lys Lys Ser Ala Phe Ala 205 2l0 2l5 220 Asn Glu Ala Ser Lys Glu Leu Pro Leu Ala Ser Arg Val Phe Tyr His CTG CAA GCC ATT GAG GGC TTA CTC AAT CAG CTC AAT ATC CCT TTT AAG 8.31 Leu Gln Ala Ile Glu Gly Leu Leu Asn Gln Leu Asn Ile Pro Phe Lys Arg Asp Phe Asp Leu Asn Pro Asn Ser Val His Ala Leu Ile Asn Asp AAA AAC TTG TGG GCA AAA ATC AGC TCT TTG CCT AAA ATG CCC CTT TTT 927 -.
Lys Asn Leu Trp Ala Lys Ile Ser Ser Leu Pro Lys Met Pro Leu Phe Asn Leu Arg Pro Lys Leu Asn His Phe Pro Leu Pro His Asn Thr Lys Ile Pro Gln Ile Pro Ile Glu Ser Asn Ala Tyr Ile Val Gly Leu Val Lys Asn Lys Gln Glu Val Phe Leu Lys Tyr Gly Asn Lys Leu Met Thr CGA TTA TCG CCT TTT TAC ATA GAG TTT GAT CCT TCT TTA GAA GAA GTG 11l9 Arg Leu Ser Pro Phe Tyr Ile Glu Phe Asp Pro Ser Leu Glu Glu Val Lys Met Gln Ile Asp Asn Lys Asp Gln Met Val Lys Ile Gly Ser Val Val Glu Val Lys Glu Ser Phe Tyr Ile His Ala Met Asp Asn Ile Arg Ala Asn Val Ile Gly Phe Ser Val Ser Asn Glu Asn Lys Pro Asn Glu Ala Gly Tyr Thr Ile Lys Phe Lys Asp Phe Gln Lys Arg Phe Ser Leu Asp Lys Gln Glu Arg Ile Tyr Arg Ile Glu Phe Tyr Lys Asn Asn Ala TTT AGC GGG ATG ATC TTA GTG AAA TTT GTG T.AGGAATGGA TAAATCTCAT TGC 1412 Phe Ser Gly Met Ile Leu Val Lys Phe Val (2} INFORMATION FOR SEQ ID N0:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
Met Lys Lys Ile Trp Leu Leu Val Trp Gly Leu Cys Ser Trp Val Phe
Leu His Ala Ile Glu Met Ile Glu Lys Ala Pro Thr Asn Val Glu Asp
Arg Asp Lys Ala Pro His Leu Leu Leu Leu Ala Gly Ile Gln Gly Asp
Glu Pro Gly Gly Phe Asn Ala Thr Asn Leu Phe Leu Met His Tyr Ser
Val Leu Lys Gly Leu Val Glu Val Val Pro Val Leu Asn Lys Pro Ser
Met Leu Arg Asn His Arg Gly Leu Tyr Gly Asp Met Asn Arg Lys Phe
Ala Ala Leu Asp Lys Asn Asp Pro Glu Tyr Pro Thr Ile Gln Glu Ile
Lys Ser Leu Ile Ala Lys Pro Ser Ile Asp Ala Val Leu His Leu His
Asp Gly Gly Gly Tyr Tyr Arg Pro Val Tyr Val Asp Ala Met Leu Asn
Pro Lys Arg Trp Gly Asn Cys Phe Ile Ile Asp Gln Asp Glu Val Lys
Gly Ala Lys Phe Pro Asn Leu Leu Ala Phe Ala Asn Asn Thr Ile Glu
Ser Ile Asn Ala His Leu Leu His Pro Ile Glu Glu Tyr His Leu Lys
Asn Thr Arg Thr Ala Gln Gly Asp Thr Glu Met Gln Lys Ala Leu Thr
Phe Tyr Ala Ile Asn Gln Lys Lys Ser Ala Phe Ala Asn Glu Ala Ser
Lys Glu Leu Pro Leu Ala Ser Arg Val Phe Tyr His Leu Gln Ala Ile
Glu Gly Leu Leu Asn Gln Leu Asn Ile Pro Phe Lys Arg Asp Phe Asp
Leu Asn Pro Asn Ser Val His Ala Leu Ile Asn Asp Lys Asn Leu Trp
Ala Lys Ile Ser Ser Leu Pro Lys Met Pro Leu Phe Asn Leu Arg Pro
Lys Leu Asn His Phe Pro Leu Pro His Asn Thr Lys Ile Pro Gln Ile
Pro Ile Glu Ser Asn Ala Tyr Ile Val Gly Leu Val Lys Asn Lys Gln
Glu Val Phe Leu Lys Tyr Gly Asn Lys Leu Met Thr Arg Leu Ser Pro
Phe Tyr Ile Glu Phe Asp Pro Ser Leu Glu Glu Val Lys Met Gln Ile
Asp Asn Lys Asp Gln Met Val Lys Ile Gly Ser Val Val Glu Val Lys
Glu Ser Phe Tyr Ile His Ala Met Asp Asn Ile Arg Ala Asn Val Ile
Gly Phe Ser Val Ser Asn Glu Asn Lys Pro Asn Glu Ala Gly Tyr Thr
Ile Lys Phe Lys Asp Phe Gln Lys Arg Phe Ser Leu Asp Lys Gln Glu
Arg Ile Tyr Arg Ile Glu Phe Tyr Lys Asn Asn Ala Phe Ser Gly Met
Ile Leu Val Lys Phe Val
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1280 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 66...1223 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
Met Leu Arg Lys Asn Ile Leu Ala Tyr Tyr Gly Ala Asn Phe Leu -1zo-TTA ATC ATC GCT CAA AGC TTA CCC CAT GCG .ATT TTA ACC CCC TTG TTG 158 Leu Ile Ile Ala Gln Ser Leu Pro His Ala Ile Leu Thr Pro Leu Leu CTT TCT AAA GGG CTT AGT TTG AGT GAA ATC 'rTG CTC GTG CAA ACC TTT 206 Leu Ser Lys Gly Leu Ser Leu Ser Glu Ile Leu Leu Val Gln Thr Phe TTT AGC TTT TGC GTG CTA GTG GCT GAA TAC ~~CA AGC GGC GTT TTA GCG 254 Phe Ser Phe Cys Val Leu Val Ala Glu Tyr Pro Ser Gly Val Leu Ala GAT TTG ATG AGC CGA AAA AAT TTA TTC CTG GTT TCT AAT GCC TTT TTA 302 _ Asp Leu Met Ser Arg Lys Asn Leu Phe Leu 'Jal Ser Asn Ala Phe Leu Ile Ala Ser Phe Ser Phe Val Leu Phe Phe ,Asp Ser Phe Ile Phe Met CTT TTA GCG TGG GGG TTG TAT GGT TTG TAT :4GC GCA TGC TCT AGC GGC 39B
Leu Leu Ala Trp Gly Leu Tyr Gly Leu Tyr Ser Ala Cys Ser Ser Gly ACG ATT GAA GCT TCA CTC ATC ACA GAC ATT i~AG GAA AAC AAA AAA GAT 446 Thr Tle Glu Ala Ser Leu Ile Thr Asp Ile Lys Glu Asn Lys Lys Asp l15 120 l25 Leu Ser Lys Phe Leu Ala Lys Asn Asn Gln :Cle Thr Tyr Leu Gly Met 130 135 l40 ATT ATA GGG AGT TCT TTG GGA TCG TTT TTG 'CAT CTC AAA GTC CAT GCG 542 Ile Ile Gly Ser Ser Leu Gly Ser Phe Leu 'Cyr Leu Lys Val His Ala Met Leu Tyr Ile Val Gly Ile Phe Leu Ile Met Leu Cys Val Leu Thr l60 165 :L70 175 Ile Ile Phe Tyr Phe Lys Glu Lys Glu Gly Asp Phe Lys Ser Gln Lys l80 185 190 AGC CTG AAA CTC CTT AAA GAG CAA GTC AAA CiGC AGT CTT AAA GAG CTT 686 Ser Leu Lys Leu Leu Lys Glu Gln Val Lys Gly Ser Leu Lys Glu Leu ' Lys Asp Asn Pro Lys Leu Lys Ile Leu Leu Val Gly His Leu Ile Thr CCC GTC TTT TTT ATG AGC CAT TTT CAA ATG 7.'GG CAA GCG TAT TTT TTA 782 Pro Val Phe Phe Met Ser His Phe Gln Met 7.'rp Gln Ala Tyr Phe Leu WO 98l21225 PCTlUS97l21353 -Lys Gln Gly Val Lys Glu Gln Tyr Leu Phe Val Phe Tyr Ile Ala Phe G1n Val Ile Ser Ile Leu Ile His Phe Leu Lys Ala Ser Ser Tyr Ser Gln Lys Ile Ala Leu Ser Ser Leu Val Val Leu Leu Gly Val Ser Pro Leu Leu Leu Ser Asn Ile Pro Tyr Cys Phe Ile Gly Val Tyr Ala Leu Met Val Ala Phe Phe Thr Tyr Met Ser Tyr Cys Leu Asn Tyr Gln Phe 305 3l0 315 Ser Lys Phe Val Ser Lys Asn Asn Ile Ser Ser Leu Ser Ser Leu Leu Ser Ser Cys Val Arg Val Val Ser Val Leu Ile Leu Ser Leu Ser Ser Leu Glu Leu Arg Tyr Phe Ser Pro Leu Thr Ile Ile Thr Met His Phe GCC TTG ACG CTT ATC ATC CTC TTT TTC TTT TTG TAT AAG GCT AAG CCG l214 Ala Leu Thr Leu Ile Ile Leu Phe Phe Phe Leu Tyr Lys Ala Lys Pro TTT GAT GAG TGAGCGGCTT TAAGAGTGCA ACCTTTTAGC GATTTCTATA GCAACATCA l272 Phe Asp Glu (2) INFORMATION FOR SEQ ID N0:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 386 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
Met Leu Arg Lys Asn Ile Leu Ala Tyr Tyr Gly Ala Asn Phe Leu Leu
Ile Ile Ala Gln Ser Leu Pro His Ala Ile Leu Thr Pro Leu Leu Leu
Ser Lys Gly Leu Ser Leu Ser Glu Ile Leu Leu Val Gln Thr Phe Phe
Ser Phe Cys Val Leu Val Ala Glu Tyr Pro Ser Gly Val Leu Ala Asp
Leu Met Ser Arg Lys Asn Leu Phe Leu Val Ser Asn Ala Phe Leu Ile
Ala Ser Phe Ser Phe Val Leu Phe Phe Asp Ser Phe Ile Phe Met Leu
Leu Ala Trp Gly Leu Tyr Gly Leu Tyr Ser Ala Cys Ser Ser Gly Thr
Ile Glu Ala Ser Leu Ile Thr Asp Ile Lys Glu Asn Lys Lys Asp Leu
Ser Lys Phe Leu Ala Lys Asn Asn Gln Ile Thr Tyr Leu Gly Met Ile
Ile Gly Ser Ser Leu Gly Ser Phe Leu Tyr Leu Lys Val His Ala Met
Leu Tyr Ile Val Gly Ile Phe Leu Ile Met Leu Cys Val Leu Thr Ile
Ile Phe Tyr Phe Lys Glu Lys Glu Gly Asp Phe Lys Ser Gln Lys Ser
Leu Lys Leu Leu Lys Glu Gln Val Lys Gly Ser Leu Lys Glu Leu Lys
Asp Asn Pro Lys Leu Lys Ile Leu Leu Val Gly His Leu Ile Thr Pro
Val Phe Phe Met Ser His Phe Gln Met Trp Gln Ala Tyr Phe Leu Lys
Gln Gly Val Lys Glu Gln Tyr Leu Phe Val Phe Tyr Ile Ala Phe Gln
Val Ile Ser Ile Leu Ile His Phe Leu Lys Ala Ser Ser Tyr Ser Gln
Lys Ile Ala Leu Ser Ser Leu Val Val Leu Leu Gly Val Ser Pro Leu
Leu Leu Ser Asn Ile Pro Tyr Cys Phe Ile Gly Val Tyr Ala Leu Met
Val Ala Phe Phe Thr Tyr Met Ser Tyr Cys Leu Asn Tyr Gln Phe Ser
Lys Phe Val Ser Lys Asn Asn Ile Ser Ser Leu Ser Ser Leu Leu Ser
Ser Cys Val Arg Val Val Ser Val Leu Ile Leu Ser Leu Ser Ser Leu
Glu Leu Arg Tyr Phe Ser Pro Leu Thr Ile Ile Thr Met His Phe Ala
Leu Thr Leu Ile Ile Leu Phe Phe Phe Leu Tyr Lys Ala Lys Pro Phe
Asp Glu
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1264 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...1205 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
ATTAAATATG ACTATATACA CTACAACAAT AAGATTTTGA AAGGTTGGTA ATG GAA 56 _ Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met GCT AAT ACA AAG GCA AAT AAA GAG GCT CAT TTT AAA CAA GCG AGC ACC 1.52 Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gln Ala Ser Thr ATT ACA AAT ATA ATC AGA TCA ATT CGT GGG ATT TTT ACA AAA ATT GCA 2d0 Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln Ile Glu Leu Glu 1l5 120 l25 130 Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Glu Gln Glu Xaa Gln Lys l65 170 175 ' Thr Glu Gln Glu Arg Gln Lys Thr Glu Gln C~lu Lys Gln Lys Thr Ile Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys Gln Glu 195 200 a!05 210 AAT CAT AAT CAA TTC TTT ATT GAA AAA GGA CiGA ATT AAG GCT GGT ATT 728 Asn His Asn Gln Phe Phe Ile Glu Lys Gly Cily Ile Lys Ala Gly Ile GGT ATA GAA GTA GAA GCT GAA TGC AAA ACC C'.CT AAA CCT GCA AAA ACC 776 Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr AAT CAA ACC CCT ATC CAG CCA AAA CAC CTC C:CA AAC TCT AAA CAA CCC 824 -- Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro CGC TCT CAA AGA GGA TCA AAA GCG CAA GAG C'.TT ATC GCT TAT TTG CAA 872 Arg Ser Gln Arg Gly Ser Lys A1a Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln GTG GAT TTT TAT AGA.CCA AGT TCT ATC GCT TAT TTA GAA CTA GAC CCT 968 Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro AGA GAT TTT AAT GTT ACA GAA GAA TGG CAA AAA GAA AAT TTA AAA ATA 10l6 Arg Asp Phe Asn Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile CGC TCT AAA GCT CAA GCT AAA ATG CTT GAA 
ATG AGG AGT TTA AAA CCA l064 Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Ser Leu Lys Pro GAC TCA CAA GCC CAC CTT TCA ACC TCT CAA AGC CTT TTG TTC GTT CAA 11l2 Asp Ser Gln Ala His Leu Ser Thr Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Lys Val Val Ala Asn Thr GAA AAG AAA GCA GAA AAA GCG GGT TAT GGT 'CAT AGT AAA AGG ATG TAGGC 1210 G1u Lys Lys Ala Glu Lys Ala Gly Tyr Gly '.t'yr Ser Lys Arg Met ' 375 3B0 385 WO 98/21225 PCT/US97/21353 ' ATAAGAAAAC ACCATAAAAT CGTTCTTAGC TTATTTATAG TATTTTAAAA ACTC l264 (2) INFORMATION FOR SEQ ID N0:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 385 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gln Ala Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln 130 135 140 Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn 145 150 155 160 Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Glu Gln Glu Xaa 165 170 175 Gln Lys Thr Glu Gln Glu Arg Gln Lys Thr Glu Gln Glu Lys Gln Lys 180 185 190 Thr Ile Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys Gln Glu Asn His Asn Gln Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala Gly Ile Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro Arg Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Asn Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Ser Leu Lys Pro Asp Ser Gln Ala His Leu Ser Thr Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Lys Val Val Ala 355 360 365 Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 410 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 62...340 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
ATTCATTTAC TTTTGAGAAA TATAATTCTC TCGCTTT'iCAA GATCATCACA AGGAGTTTCG 60 Met Lys Lys Gln Ile Leu Thr Gly Val Leu Leu Ser Val Leu Ala Val AGT TCT GCA TAC GCT CAC AAA GAT AAA AAA GAC GCC AAA AAA CCT AAA l57 Ser Ser Ala Tyr Ala His Lys Asp Lys Lys Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys Asp Ala Lys AAA CCT AAA TTT AGC ACA GAA TTA GTC GTG CiCT CAA AAC GAC AAA AAA 253 Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys GAC GCT AAA AAA CCT AAA TTT AGC ACA GAA 7.'TA GTC GTG GCT CAA AAC 301 Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn 65 70 ',~5 80 _ GAC AAA AAA GAC GCT AAA AAA CCT AAA AAC 7.'CA GTG GTC TAATGGCTTT GA 352 Asp Lys Lys Asp Ala Lys Lys Pro Lys Asn Ser Val Val ' CTCTAAAAAA GCGTTTTTAA AAACGCTTTT TTGGATA7.'TA TCCTATAATT TCCTACCA 410 WO 9$/21225 PCT/US97/21353 (2) INFORMATION FOR SEQ ID N0:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
Met Lys Lys Gln Ile Leu Thr Gly Val Leu Leu Ser Val Leu Ala Val Ser Ser Ala Tyr Ala His Lys Asp Lys Lys Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys Asp Ala Lys Lys Pro Lys Asn Ser Val Val
(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2097 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 67...2046 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
Met Ile Tyr Trp Leu Tyr Leu Ala Val Phe Phe Leu Leu Ser Ala Leu Asp Ala Lys Glu Ile Ala Met Gln Arg Phe Asp Lys Gln Asn His Lys Ile Phe Glu Ile Leu Ala Asp Lys Val Ser Ala Lys Asp Asn Val Ile Thr Ala Ser Gly Asn Ala Ile Leu Leu Asn Tyr Asp Val Tyr Ile Leu Ala Asp Lys Val Arg Tyr Asp Thr Lys Thr Lys Glu Ala Leu Leu Glu Gly Asn Ile Lys Val Tyr Arg Gly Glu Gly Leu Leu Val Lys Thr Asp Tyr Val Lys Leu Ser Leu Asn Glu Lys Tyr Glu Ile Ile Phe 95 l00 l05 110 Pro Phe Tyr Val Gln Asp Ser Val Ser Gly Ile Trp Val Ser Ala Asp l15 120 125 Ile Ala Ser Gly Lys Asp Gln Lys Tyr Lys Val Lys Asn Met Ser Thr l30 l35 l40 Ser Gly Cys Ser Ile Asp Asn Pro I1e Trp His Val Asn Ala Thr Ser 145 l50 155 Gly Ser Phe Asn Met Gln Lys Ser His Leu Ser Met Trp Asn Pro Lys l60 l65 170 Ile Tyr Val Gly Asp Ile Pro Val Leu Tyr Leu Pro Tyr Ile Phe Met l75 l80 1B5 190 Ser Thr Ser Asn Lys Arg Thr Thr Gly Phe Leu Tyr Pro Glu Phe Gly l95 200 205 Thr Ser Asn Leu Asp Gly Phe Ile Tyr Leu Gln Pro Phe Tyr Leu Ala Pro Lys Asn Ser Trp Asp Met Thr Phe Thr Pro Gln Ile Arg Tyr Lys Arg Gly Phe Gly Leu Asn Phe Glu Ala Arg Tyr Ile Asn Ser Lys Asn ' Asp Arg Phe Leu Phe Asn Ala Arg Tyr Phe Arg Asn Tyr Thr Gln Tyr 255 2fi0 265 270 Val Lys Arg Tyr Asp Leu Arg Asn Gln Asn Ile Tyr Gly Phe Glu Phe Leu Ser Ser Ser Arg Asp Thr Leu Gln Lys Tyr Phe His Leu Lys Ser AAT ATT GAC AAC GGG CAT TAC ATT GAC TTT TTA TAC ATG AAC GAT TTG l020 Asn Ile Asp Asn Gly His Tyr Ile Asp Phe Leu Tyr Met Asn Asp Leu 305 310 3l5 GAC TAT GTG CGT TTT GAA AAG GTT AAT AAG CGT ATC ACA GAC GCC ACG l068 Asp Tyr Val Arg Phe Glu Lys Val Asn Lys Arg Ile Thr Asp Ala Thr His Met Ser Arg Ala Asn Tyr Tyr Leu Gln Thr Glu Asn Asn Tyr Tyr Gly Leu Asn Ile Lys Tyr Phe Leu Asn Leu Asn Lys Ile Rsn Asn Asn CGC ACT TTC CAA TCT GTC CCT AAT TTG CAA TAC CAT AAA TAT TTA AAT l212 Arg Thr Phe Gln Ser Val Pro Asn Leu Gln Tyr His Lys Tyr Leu Asn TCT TTG TAT TTT AGA AAT TTG TTG TAT TCG GTG GAT TAT CAG TTT AGA l260 Ser Leu Tyr Phe Arg Asn Leu Leu Tyr Ser Val Asp Tyr Gln Phe Arg Asn Thr 
Ala Arg Glu Ile Gly Tyr Gly Tyr Val Gln Asn Ala Leu Asn Val Pro Val Gly Leu Gln Phe Ser Leu Phe Lys Lys Tyr Leu Ser Leu Gly Leu Trp Asn Asp Leu Gln Leu Ser Asn Val Ala Leu Met Gln Ser Lys Asn Ser Phe Val Pro Thr Ile Pro Asn Glu Ser Arg Glu Phe Gly AAT TTT GTG TCT TCA AAT TTT TCC ATG TAT GTC AAT ACG GAT TTG GCT l500 Asn Phe Val Ser Ser Asri Phe Ser Met Tyr Val Asn Thr Asp Leu Ala Arg Glu Tyr Asn Lys Leu Phe His Thr Ile Gln Leu Glu Ala Ile Phe Asn Ile Pro Tyr Tyr Thr Phe Lys Asn Gly Leu Phe Ser Gln Asn Met ' 495 500 505 510 TAT GCT TTA AGC GCG CAA GCC TTA AAC AGC TAC ACT TCG CCT TTA TTG l644 Tyr Ala Leu Ser Ala Gln Ala Leu Asn Ser Tyr Thr Ser Pro Leu Leu Arg Asp Tyr Asp Tyr Gln Gly Arg Leu Tyr Asp Ser Val Trp Asn Pro Ser Ser Ile Leu Pro Ser Asn Ala Ser Asn Lys Thr Val Asp Leu Thr Leu Thr Gln Tyr Leu Tyr Gly Leu G7y Gly Gln Glu Leu Leu Tyr Phe Lys Ile Ser Gln Leu Ile Asn Leu Asp Asp Lys Val Ser Pro Phe Arg Met Pro Leu Glu Ser Lys Ile Gly Phe Ser Pro Leu Thr Gly Leu Asn Ile Phe Gly Asn Val Phe Tyr Ser Phe Tyr Gln Asn Arg Leu Glu Glu Ile Ser Val Asn Ala Asn Tyr Gln Arg Lys Phe Leu Ser Phe Asn Leu Ser Tyr Phe Leu Lys Asn Asn Phe Ser Ser Gly Ile Asn Ser Ile Val Glu Asn Leu Arg Ile Ile (2) INFORMATION FOR SEQ ID N0:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 660 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
Met Ile Tyr Trp Leu Tyr Leu Ala Val Phe Phe Leu Leu Ser Ala Leu Asp Ala Lys Glu Ile Ala Met Gln Arg Phe Asp Lys Gln Asn His Lys Ile Phe Glu Ile Leu Ala Asp Lys Val Ser Ala Lys Asp Asn Val Ile Thr Ala Ser Gly Asn Ala Ile Leu Leu Asn Tyr Asp Val Tyr Ile Leu Ala Asp Lys Val Arg Tyr Asp Thr Lys Thr Lys Glu Ala Leu Leu Glu Gly Asn Ile Lys Val Tyr Arg Gly Glu Gly Leu Leu Val Lys Thr Asp Tyr Val Lys Leu Ser Leu Asn Glu Lys Tyr Glu Ile Ile Phe Pro Phe Tyr Val Gln Asp Ser Val Ser Gly Ile Trp Val Ser Ala Asp Ile Ala Ser Gly Lys Asp Gln Lys Tyr Lys Val Lys Asn Met Ser Thr Ser Gly Cys Ser Ile Asp Asn Pro Ile Trp His Val Asn Ala Thr Ser Gly Ser 145 150 155 160 Phe Asn Met Gln Lys Ser His Leu Ser Met Trp Asn Pro Lys Ile Tyr 165 170 175 Val Gly Asp Ile Pro Val Leu Tyr Leu Pro Tyr Ile Phe Met Ser Thr Ser Asn Lys Arg Thr Thr Gly Phe Leu Tyr Pro Glu Phe Gly Thr Ser 195 200 205 Asn Leu Asp Gly Phe Ile Tyr Leu Gln Pro Phe Tyr Leu Ala Pro Lys Asn Ser Trp Asp Met Thr Phe Thr Pro Gln Ile Arg Tyr Lys Arg Gly Phe Gly Leu Asn Phe Glu Ala Arg Tyr Ile Asn Ser Lys Asn Asp Arg Phe Leu Phe Asn Ala Arg Tyr Phe Arg Asn Tyr Thr Gln Tyr Val Lys Arg Tyr Asp Leu Arg Asn Gln Asn Ile Tyr Gly Phe Glu Phe Leu Ser 275 280 285 Ser Ser Arg Asp Thr Leu Gln Lys Tyr Phe His Leu Lys Ser Asn Ile Asp Asn Gly His Tyr Ile Asp Phe Leu Tyr Met Asn Asp Leu Asp Tyr Val Arg Phe Glu Lys Val Asn Lys Arg Ile Thr Asp Ala Thr His Met Ser Arg Ala Asn Tyr Tyr Leu Gln Thr Glu Asn Asn Tyr Tyr Gly Leu Asn Ile Lys Tyr Phe Leu Asn Leu Asn Lys Ile Asn Asn Asn Arg Thr Phe Gln Ser Val Pro Asn Leu Gln Tyr His Lys Tyr Leu Asn Ser Leu Tyr Phe Arg Asn Leu Leu Tyr Ser Val Asp Tyr Gln Phe Arg Asn Thr Ala Arg Glu Ile Gly Tyr Gly Tyr Val Gln Asn Ala Leu Asn Val Pro Val Gly Leu Gln Phe Ser Leu Phe Lys Lys Tyr Leu Ser Leu Gly Leu Trp Asn Asp Leu Gln Leu Ser Asn Val Ala Leu Met Gln Ser Lys Asn 435 440 445 Ser Phe Val Pro Thr Ile Pro Asn Glu Ser Arg Glu Phe Gly Asn Phe Val Ser Ser Asn Phe Ser Met Tyr Val Asn Thr Asp Leu Ala Arg Glu Tyr Asn Lys Leu Phe His Thr Ile Gln Leu Glu Ala Ile Phe Asn Ile Pro Tyr Tyr Thr Phe Lys Asn Gly Leu Phe Ser Gln Asn Met Tyr Ala Leu Ser Ala Gln Ala Leu Asn Ser Tyr Thr Ser Pro Leu Leu Arg Asp Tyr Asp Tyr Gln Gly Arg Leu Tyr Asp Ser Val Trp Asn Pro Ser Ser Ile Leu Pro Ser Asn Ala Ser Asn Lys Thr Val Asp Leu Thr Leu Thr Gln Tyr Leu Tyr Gly Leu Gly Gly Gln Glu Leu Leu Tyr Phe Lys Ile Ser Gln Leu Ile Asn Leu Asp Asp Lys Val Ser Pro Phe Arg Met Pro Leu Glu Ser Lys Ile Gly Phe Ser Pro Leu Thr Gly Leu Asn Ile Phe Gly Asn Val Phe Tyr Ser Phe Tyr Gln Asn Arg Leu Glu Glu Ile Ser Val Asn Ala Asn Tyr Gln Arg Lys Phe Leu Ser Phe Asn Leu Ser Tyr 625 630 635 640 Phe Leu Lys Asn Asn Phe Ser Ser Gly Ile Asn Ser Ile Val Glu Asn Leu Arg Ile Ile
(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 961 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 168...764 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
Met Thr Ser Ala Leu Leu Gly Leu Gln Ile Val Leu Ala Val Leu Ile Val Val Val Val Leu Leu Gln Lys Ser Ser Ser Ile Gly Leu Gly Ala Tyr Ser Gly Ser Asn Glu Ser Leu Phe Gly Ala Lys Gly Pro Ala Ser Phe Met Ala Lys Leu Thr Met Phe Leu Gly Leu Leu Phe Val Ile Asn Thr Ile Ala Leu Gly Tyr Phe Tyr Asn Lys Glu Tyr Gly Lys Ser Val Leu Asp Glu Thr Lys Thr Asn Lys Glu Leu Ser Pro Leu Val Pro Ala Thr Gly Thr Leu Asn Pro Ala Leu Asn Pro Thr Leu Asn Pro Thr Leu Asn Pro Leu 100 l05 l10 l15 Glu Gln Ala Pro Thr Asn Pro Leu Met Pro Gln Gln Thr Pro Asn Glu Leu Pro Lys Glu Pro Ala Lys Thr Pro Ser Val Glu Ser Pro Lys Gln l35 140 145 Asn Glu Lys Asn Glu Lys Asn Asp Ala Lys Glu Asn Gly Ile Lys Gly 150 l55 160 Val Glu Lys Thr Lys Glu Asn Ala Lys Thr Pro Pro Thr Thr His Gln Lys Pro Lys Thr His Ala Thr Gln Thr Asn Ala His Thr Asn Gln Lys l80 185 190 195 Lys Asp Glu Lys WO 98/Z1225 PCT/US97/21353 ' AAAGCATTCA AGCTTTAAAC AGGGATTTTT CCACTCT'AAG GAGCGCGAAA GTTTCAGTCA 869 ATATTTTAGA TCACATCAAA GTGGATTATT ACGGCAC'GCC CACGGCATTA AATCAAGTCG 929 (2) INFORMATION FOR SEQ ID N0:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 199 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
Met Thr Ser Ala Leu Leu Gly Leu Gln Ile Val Leu Ala Val Leu Ile Val Val Val Val Leu Leu Gln Lys Ser Ser Ser Ile Gly Leu Gly Ala Tyr Ser Gly Ser Asn Glu Ser Leu Phe Gly Ala Lys Gly Pro Ala Ser Phe Met Ala Lys Leu Thr Met Phe Leu Gly Leu Leu Phe Val Ile Asn Thr Ile Ala Leu Gly Tyr Phe Tyr Asn Lys Glu Tyr Gly Lys Ser Val Leu Asp Glu Thr Lys Thr Asn Lys Glu Leu Ser Pro Leu Val Pro Ala Thr Gly Thr Leu Asn Pro Ala Leu Asn Pro Thr Leu Asn Pro Thr Leu Asn Pro Leu Glu Gln Ala Pro Thr Asn Pro Leu Met Pro Gln Gln Thr 115 120 125 Pro Asn Glu Leu Pro Lys Glu Pro Ala Lys Thr Pro Ser Val Glu Ser 130 135 140 Pro Lys Gln Asn Glu Lys Asn Glu Lys Asn Asp Ala Lys Glu Asn Gly Ile Lys Gly Val Glu Lys Thr Lys Glu Asn Ala Lys Thr Pro Pro Thr 165 170 175 Thr His Gln Lys Pro Lys Thr His Ala Thr Gln Thr Asn Ala His Thr 180 185 190 Asn Gln Lys Lys Asp Glu Lys
(2) INFORMATION FOR SEQ ID NO:41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1058 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 325...879 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
ACCCAACGCT
TAT AAC
Met Leu Gln Ala Ile Tyr Glu Thr Asn Lys Asp Leu Met Gln Lys Ser Ile Gln Ala Leu Asn Arg Asp Phe Ser Thr Leu Arg Ser Ala Lys Val Ser Val Asn Ile Leu Asp His Ile Lys Val Asp Tyr Tyr Gly Thr Pro Thr Ala Leu Asn Gln Val Gly Ser Val Met Ser Leu Asp Ala Thr Thr Leu Gln Ile Ser Pro Trp Glu Lys Asn Leu Leu Lys Glu Ile Glu Arg Ser Ile Gln Glu Ala Asn Ile Gly Val Asn Pro Asn Asn Asp Gly Glu Thr Ile Lys Leu Phe Phe Pro Pro Met Thr Ser Glu Gln Arg Lys Leu Ile Ala Lys Asp Ala Lys Ala Met Gly Glu Lys Ala Lys Val Ala Val Arg Asn Ile Arg Gln Asp Ala Asn Asn Gln Val Lys Lys Leu Glu Lys Asp Lys Glu Ile Ser Glu Asp Glu Ser 140 l45 150 Lys Lys Ala Gln Glu Gln Ile Gln Lys Ile Thr Asp Glu Ala Ile Lys 155 160 l65 AAA ATT GAT GAA AGC GTG AAA AAC AAA GAA (3AC GCG ATC TTA AAG GTC T 88U
Lys Ile Asp Glu Ser Val Lys Asn Lys Glu i~sp Ala Ile Leu Lys Val 170 175 :L80 l85 ' TGCTCAGCAG TGGGTTTCAT TCCAATTATT ATTTGCAATC CGCTAAAGTT TTAGAAGATC 1000 CCAAACTAGC CGAACAATTA GCGCTAGAAT TAGCCAAe~CA AATCCAAGAA GCTCATTT 1058 (2) INFORMATION FOR SEQ ID N0:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
Met Leu Gln Ala Ile Tyr Asn Glu Thr Lys Asp Leu Met Gln Lys Ser Ile Gln Ala Leu Asn Arg Asp Phe Ser Thr Leu Arg Ser Ala Lys Val Ser Val Asn Ile Leu Asp His Ile Lys Val Asp Tyr Tyr Gly Thr Pro Thr Ala Leu Asn Gln Val Gly Ser Val Met Ser Leu Asp Ala Thr Thr Leu Gln Ile Ser Pro Trp Glu Lys Asn Leu Leu Lys Glu Ile Glu Arg 65 70 75 80 Ser Ile Gln Glu Ala Asn Ile Gly Val Asn Pro Asn Asn Asp Gly Glu Thr Ile Lys Leu Phe Phe Pro Pro Met Thr Ser Glu Gln Arg Lys Leu 100 105 110 Ile Ala Lys Asp Ala Lys Ala Met Gly Glu Lys Ala Lys Val Ala Val 115 120 125 Arg Asn Ile Arg Gln Asp Ala Asn Asn Gln Val Lys Lys Leu Glu Lys 130 135 140 Asp Lys Glu Ile Ser Glu Asp Glu Ser Lys Lys Ala Gln Glu Gln Ile Gln Lys Ile Thr Asp Glu Ala Ile Lys Lys Ile Asp Glu Ser Val Lys Asn Lys Glu Asp Ala Ile Leu Lys Val
(2) INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1669 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 163...1389 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
Met Ala Gln Asn Phe Thr Lys Leu Asn Pro Gln Phe Glu Asn Ile Ile Phe Glu His Asp Asp Asn Gln Met Ile Leu Asn Phe Gly Pro Gln His Pro Ser Ser His GGG CAA TTG CGC TTG ATT TTG GAA TTA GAG GGC GAA AAA ATC-ACT AAG 3l8 Cly Gln Leu Arg Leu Ile Leu Glu Leu Glu Gly Glu Lys Ile Ile Lys Ala Thr Pro Glu Ile Gly Tyr Leu His Arg Gly Cys Glu Lys Leu Gly GAA AAC ATG ACC TAT AAC GAA TAC ATG CCC ACT ACT GAT AGA TTG GAT 4l4 Glu Asn Met Thr Tyr Asn Glu Tyr Met Pro Thr Thr Asp Arg Leu Asp Tyr Thr Ser Ser Thr Ser Asn Asn Tyr Ala Tyr Ala Tyr Ala Val Glu 85 90 95 l00 Thr Leu Leu Asn Leu Glu Ile Pro Arg Arg Ala Gln Val Ile Arg Thr l05 110 115 Ile Leu Leu Glu Leu Asn Arg Met Ile Ser His Ile Phe Phe Ile Ser 120 125 l30 Val His Ala Leu Asp Val Gly Ala Met Ser Val Phe Leu Tyr Ala Phe 135 l40 l45 Lys Thr Arg Glu Tyr Gly Leu Asp Leu Met Glu Asp Tyr Cys Gly Ala l50 155 160 Arg Leu Thr His Asn Ala Ile Arg Ile Gly G1y Val Pro Leu Asp Leu 165 170 .L75 180 CCC CCT AAT TGG TTA GAA GGC TTA AAA AAG '.CTT TTA GGC GAA ATG AGG 750 Pro Pro Asn Trp Leu Glu Gly Leu Lys Lys I?he Leu Gly Glu Met Arg GAA TGC AAA AAA CTC ATT CAA GGC TTA TTG (3AT AAG AAT CGC ATT TGG 798 Glu Cys Lys Lys Leu Ile Gln Gly Leu Leu Asp Lys Asn Arg Ile Trp Arg Met Arg Leu Glu Asn Val Gly Val Val 'Chr Gln Lys Met Ala Gln Ser Trp Gly Met Ser Gly Ile Met Leu Arg Gly Thr Gly Ile Ala Tyr GAC ATC AGA AAA GAA GAG CCT TAT GAG CTT 'CAT AAA GAG CTT GAT TTT 942 Asp Ile Arg Lys Glu Glu Pro Tyr Glu Leu 'Cyr Lys Glu Leu Asp Phe GAT GTG CCG GTG GGC AAT TAT GGC GAT AGT 'CAT GAT AGG TAT TGT TTG 990 Asp Val Pro Val Gly Asn Tyr Gly Asp Ser 'Cyr Asp Arg Tyr Cys Leu TAT ATG TTA GAA ATT GAT GAA AGC GTT CGC i~TC ATT GAA CAG CTC ATT 1038 Tyr Met Leu Glu Ile Asp Glu Ser Val Arg :Lle Ile Glu Gln Leu Ile Pro Met Tyr Ala Lys Thr Asp Thr Pro Ile Met Ala Gln Asn Pro His Tyr Ile Ser Ala Pro Lys Glu Asp Ile Met 'rhr Gln Asn Tyr Ala Leu Met Gln His Phe Val Leu Val Ala Gln Gly IKet Arg Pro Pro Val Gly GAA GTG TAT GCC CCC ACA GAA AGC CCT AAA t3GG GAA TTA GGG TTT TTT 1230 
Glu Val Tyr Ala Pro Thr Glu Ser Pro Lys t3ly Glu Leu Gly Phe Phe ' 345 350 355 Ile His Ser Glu Gly Glu Pro Tyr Pro His Arg Leu Lys Ile Arg Ala CCT AGC TTT TAT CAC ATT GGG GCT TTG AGC c,;AC ATT TTA GTG GGG CAA 1326 Pro Ser Phe Tyr His Ile Gly Ala Leu Ser ,?asp Ile Leu Val Gly Gln Tyr Leu Ala Asp Ala Val Thr Val Ile Gly Ser Thr Asn Ala Val Phe Gly Glu Val Asp Arg (2) INFORMATION FOR SEQ ID N0:44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 409 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
Met Ala Gln Asn Phe Thr Lys Leu Asn Pro Gln Phe Glu Asn Ile Ile Phe Glu His Asp Asp Asn Gln Met Ile Leu Asn Phe Gly Pro Gln His Pro Ser Ser His Gly Gln Leu Arg Leu Ile Leu Glu Leu Glu Gly Glu Lys Ile Ile Lys Ala Thr Pro Glu Ile Gly Tyr Leu His Arg Gly Cys Glu Lys Leu Gly Glu Asn Met Thr Tyr Asn Glu Tyr Met Pro Thr Thr Asp Arg Leu Asp Tyr Thr Ser Ser Thr Ser Asn Asn Tyr Ala Tyr Ala Tyr Ala Val Glu Thr Leu Leu Asn Leu Glu Ile Pro Arg Arg Ala Gln 100 105 110 Val Ile Arg Thr Ile Leu Leu Glu Leu Asn Arg Met Ile Ser His Ile Phe Phe Ile Ser Val His Ala Leu Asp Val Gly Ala Met Ser Val Phe Leu Tyr Ala Phe Lys Thr Arg Glu Tyr Gly Leu Asp Leu Met Glu Asp 145 150 155 160 Tyr Cys Gly Ala Arg Leu Thr His Asn Ala Ile Arg Ile Gly Gly Val Pro Leu Asp Leu Pro Pro Asn Trp Leu Glu Gly Leu Lys Lys Phe Leu Gly Glu Met Arg Glu Cys Lys Lys Leu Ile Gln Gly Leu Leu Asp Lys Asn Arg Ile Trp Arg Met Arg Leu Glu Asn Val Gly Val Val Thr Gln 210 215 220 Lys Met Ala Gln Ser Trp Gly Met Ser Gly Ile Met Leu Arg Gly Thr Gly Ile Ala Tyr Asp Ile Arg Lys Glu Glu Pro Tyr Glu Leu Tyr Lys Glu Leu Asp Phe Asp Val Pro Val Gly Asn Tyr Gly Asp Ser Tyr Asp Arg Tyr Cys Leu Tyr Met Leu Glu Ile Asp Glu Ser Val Arg Ile Ile Glu Gln Leu Ile Pro Met Tyr Ala Lys Thr Asp Thr Pro Ile Met Ala Gln Asn Pro His Tyr Ile Ser Ala Pro Lys Glu Asp Ile Met Thr Gln Asn Tyr Ala Leu Met Gln His Phe Val Leu Val Ala Gln Gly Met Arg Pro Pro Val Gly Glu Val Tyr Ala Pro Thr Glu Ser Pro Lys Gly Glu Leu Gly Phe Phe Ile His Ser Glu Gly Glu Pro Tyr Pro His Arg Leu Lys Ile Arg Ala Pro Ser Phe Tyr His Ile Gly Ala Leu Ser Asp Ile Leu Val Gly Gln Tyr Leu Ala Asp Ala Val Thr Val Ile Gly Ser Thr Asn Ala Val Phe Gly Glu Val Asp Arg
(2) INFORMATION FOR SEQ ID NO:45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 869 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 358...732 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
CCAATATTGG CTATCATTTA CTTTGATTTCGCCCATC,"GTGTCATGTTCAATTCTAAATTG180 GTTATTATCC GTTCGCAACA AGAATTTTCTTGTTATC.'TTAATGTAAAGGTCAAAACG 360 ATG
Met TTT TTA GTG GTT ATG
GGT
Lys Lys Leu Ala Ala Leu Val Ser Leu Gly Met Gly Phe Leu Val Val Leu Asn Ala Trp Glu Gln Thr Leu Lys Ala Asn Asp Leu Glu Val Lys Ile Lys Ser Val Gly Asn Pro Ile Lys Gly Asp Asn Thr Phe Ile Leu Ser Pro Thr Leu Lys Gly Lys Ala Leu Glu Lys Ala Ile Val Arg Val Gln Phe Met Met Pro Glu Met Pro Gly Met Pro Ala Met Lys Glu Met Ala Gln Val Ser Glu Lys Asn Gly Leu Tyr Glu Ala Lys Thr Asn Leu Ser Met Asn Gly Thr Trp Gln Val Arg Val Asp Ile Lys Ser Lys Glu l00 l05 110 Gly Gln Val Tyr Arg Ala Lys Thr Ser Leu Asp Leu (2) INFORMATION FOR SEQ ID N0:46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 125 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
Met Lys Lys Leu Ala Ala Leu Phe Leu Val Ser Val Leu Gly Val Met Gly Leu Asn Ala Trp Glu Gln Thr Leu Lys Ala Asn Asp Leu Glu Val Lys Ile Lys Ser Val Gly Asn Pro Ile Lys Gly Asp Asn Thr Phe Ile Leu Ser Pro Thr Leu Lys Gly Lys Ala Leu Glu Lys Ala Ile Val Arg Val Gln Phe Met Met Pro Glu Met Pro Gly Met Pro Ala Met Lys Glu Met Ala Gln Val Ser Glu Lys Asn Gly Leu Tyr Glu Ala Lys Thr Asn Leu Ser Met Asn Gly Thr Trp Gln Val Arg Val Asp Ile Lys Ser Lys 100 105 110 Glu Gly Gln Val Tyr Arg Ala Lys Thr Ser Leu Asp Leu 115 120 125
(2) INFORMATION FOR SEQ ID NO:47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1217 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 73...1152 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
TCCATGCGTT TTGATGCGAT TTTAAAAAAT CTTTGGG7.'AT TTTAGCATGC CAATGGTTAA 60 Met Asn Gly Phe Cys Ala Arq_ Leu Arg Ala Ile Thr His AAT GAA AGA TTA AAA ATG AAA ATA GCG GTA 7.'TA CTC AGT GGG GGG GTG 159 Asn Glu Arg Leu Lys Met Lys Ile Ala Val Leu Leu Ser Gly Gly Val Asp Ser Ser Tyr Ser Ala Tyr Ser Leu Lys Cilu Gln Gly His Glu Leu Val Gly Ile Tyr Leu Lys Leu His Ala Ser C~lu Lys Lys His Asp Leu TAC ATC AAA AAC GCT CAA AAA GCA TGC GAG 7.'TT TTA GGC ATT CCT TTA 303 Tyr Ile Lys Asn Ala Gln Lys Ala Cys Glu Phe Leu Gly Ile Pro Leu Glu Val Leu Asp Phe Gln Lys Asp Phe Lys Cter Ala Val Tyr Asp Glu TTT ATC AAC GCC TAT GAA GAA GGG CAA ACC C.'CA AAC CCT TGT GCG TTG 399 Phe Ile Asn Ala Tyr Glu Glu Gly Gln Thr Pro Asn Pro Cys Ala Leu ~ TGC AAC CCT TTA ATG AAG TTT GGG CTA GCT TTG GAT CAC GCT TTA AAA 447 Cys Asn Pro Leu Met Lys Phe Gly Leu Ala Leu Asp His Ala Leu Lys 110 1L5 120 l25 Leu Gly Cys Glu Lys Ile Ala Thr Gly His Tyr Ala Arg Val Lys Glu 130 l35 l40 Ile Asp Lys Ile Ser Tyr Ile Gln Glu Ala Leu Asp Lys Thr Lys Asp Gln Ser Tyr Phe Leu Tyr Ala Leu Glu His Glu Val Ile Ala Lys Leu 160 165 l70 Val Phe Pro Leu Gly Asp Leu Leu Lys Lys Asp Ile Lys Pro Leu Ala Leu Asn Ala Met Pro Phe Leu Gly Thr Leu Glu Thr Tyr Lys Glu Ser 190 l95 200 205 Gln Glu Ile Cys Phe Val Glu Lys Ser Tyr Ile Asp Thr Leu Lys Lys Hi's Val Glu Val Glu Lys Glu Gly Val Val Lys Asn Leu Gln Gly Glu Val Ile Gly Thr His Lys Gly Tyr Met Gln Tyr Thr I1e Gly Lys Arg Lys Gly Phe Ser Ile Lys Gly Ala Leu' Glu Pro His Phe Val Val Gly Ile Asp Ala Lys Lys Asn Glu Leu Val Val Gly Lys Lys Glu Asp Leu Ala Thr His Ser Leu Lys Ala Lys Asn Lys Ser Leu Met Lys Asp Phe Lys Asp Gly Glu Tyr Phe Ile Lys Ala Arg Tyr Arg Ser Val Pro Ala AAA GCG CAT GTG AGT TTG AAA GAT GAG GTG ATT GAA GTG GGG TTT AAA l071 Lys Ala His Val Ser Leu Lys Asp Glu Val Ile Glu Val Gly Phe Lys WO 98I21225 PCT/(JS97/21353 -Glu Pro Phe Tyr Gly Val Ala Lys Gly Gln Ala Leu Val Val Tyr Lys Asp Asp Ile Leu Leu Gly Gly Gly Val Ile Val " GATACGCCTT TTGGCAGTCT CTTAATGTTT TATTGAF~TAG 
GCGTT 1217 (2) INFORMATION FOR SEQ ID N0:48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 360 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
Met Asn Gly Phe Cys Ala Arg Leu Arg Ala Ile Thr His Asn Glu Arg Leu Lys Met Lys Ile Ala Val Leu Leu Ser Gly Gly Val Asp Ser Ser Tyr Ser Ala Tyr Ser Leu Lys Glu Gln Gly His Glu Leu Val Gly Ile Tyr Leu Lys Leu His Ala Ser Glu Lys Lys His Asp Leu Tyr Ile Lys Asn Ala Gln Lys Ala Cys Glu Phe Leu Gly Ile Pro Leu Glu Val Leu Asp Phe Gln Lys Asp Phe Lys Ser Ala Val Tyr Asp Glu Phe Ile Asn Ala Tyr Glu Glu Gly Gln Thr Pro Asn Pro Cys Ala Leu Cys Asn Pro 100 105 110 Leu Met Lys Phe Gly Leu Ala Leu Asp His Ala Leu Lys Leu Gly Cys 115 120 125 Glu Lys Ile Ala Thr Gly His Tyr Ala Arg Val Lys Glu Ile Asp Lys 130 135 140 Ile Ser Tyr Ile Gln Glu Ala Leu Asp Lys Thr Lys Asp Gln Ser Tyr Phe Leu Tyr Ala Leu Glu His Glu Val Ile Ala Lys Leu Val Phe Pro 165 170 175 Leu Gly Asp Leu Leu Lys Lys Asp Ile Lys Pro Leu Ala Leu Asn Ala 180 185 190 Met Pro Phe Leu Gly Thr Leu Glu Thr Tyr Lys Glu Ser Gln Glu Ile 195 200 205 Cys Phe Val Glu Lys Ser Tyr Ile Asp Thr Leu Lys Lys His Val Glu 210 215 220 Val Glu Lys Glu Gly Val Val Lys Asn Leu Gln Gly Glu Val Ile Gly Thr His Lys Gly Tyr Met Gln Tyr Thr Ile Gly Lys Arg Lys Gly Phe Ser Ile Lys Gly Ala Leu Glu Pro His Phe Val Val Gly Ile Asp Ala Lys Lys Asn Glu Leu Val Val Gly Lys Lys Glu Asp Leu Ala Thr His Ser Leu Lys Ala Lys Asn Lys Ser Leu Met Lys Asp Phe Lys Asp Gly Glu Tyr Phe Ile Lys Ala Arg Tyr Arg Ser Val Pro Ala Lys Ala His Val Ser Leu Lys Asp Glu Val Ile Glu Val Gly Phe Lys Glu Pro Phe Tyr Gly Val Ala Lys Gly Gln Ala Leu Val Val Tyr Lys Asp Asp Ile Leu Leu Gly Gly Gly Val Ile Val
(2) INFORMATION FOR SEQ ID NO:49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 975 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 191...793 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
AACCCACTGA
GGGTTTTAGTTTTAGCATGTTAGCATTCAG CCACCACTCTTTTTAAGGAATTTGTTTGAAl80 TTG TTA TCT
GCC ACT
CTT TTA
Met Ser Leu Ala Cys Leu Leu Leu Ser Ala Thr Leu Leu Pro Pro Lys Gly His His Ser Gly Leu Val Asn Leu Tyr Ile Ala His Gln Gly Gln Ser Val Arg Thr Tyr Trp Arg Lys Val Asp Arg Gly Val Ile Ala Lys His Asn Glu Ala Leu Lys Lys Asp Pro Lys Ala Lys Leu Lys Asp Pro Arg Gly Pro Leu Phe Met Leu Gly Ser Glu Arg Phe Met WO 98l21225 PCT/US97/21353 Leu Leu Trp Lys Asn Arg Tyr Ala Leu Ala Lys Pro Gln Ser Phe Arg - Leu Glu Pro Gly Phe Tyr Tyr Leu Asp Ser Phe Ser Val Glu Thr Gln 95 l00 105 Lys Gly Val Leu Gln Ser Ala Pro Gly Tyr Ser Tyr Thr Lys Asn Gly 110 l15 120 125 Tyr Asp Phe Lys Asn Asn Arg Pro Phe Phe Leu Ala Phe Glu Val Lys l30 135 140 Pro Asp Gly Lys Thr Ile Leu Pro Ser Val Glu Leu Ser Leu Ile Lys 145 l50 155 Thr Pro Arg Gly Phe Leu Gly Val Phe Leu Phe Asp Asn Asn Glu Lys l60 165 170 Gly Thr Asn Ala Lys Trp Ile Glu Gly Ser Leu Asn Leu Lys Leu Lys Asn Ala Ser Phe Lys Asp Ala Trp Gly Leu Glu Gln AGATTTTATT ACCCCTATTC AATTGGAACA AAGCCA'CTAA ATTTTTAAAA ACTTTTAAAA 929 ACGATAAACA TAATCCGCGC TCCAAGTAAC ATAGCT'CTCA AAAATG 975 (2) INFORMATION FOR SEQ ID NO:!i0:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 201 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
Met Ser Leu Leu Ala Thr Leu Leu Leu Ala Ser Cys Leu Pro Pro Lys Gly His His Ser Gly Leu Val Asn Leu Tyr Ile Ala His Gln Gly Gln Ser Val Arg Thr Tyr Trp Arg Lys Val Asp Arg Gly Val Ile Ala Lys His Asn Glu Ala Leu Lys Lys Asp Pro Lys Ala Lys Leu Lys Asp Pro Arg Gly Pro Leu Phe Met Leu Gly Ser Glu Arg Phe Met Leu Leu Trp Lys Asn Arg Tyr Ala Leu Ala Lys Pro Gln Ser Phe Arg Leu Glu Pro Gly Phe Tyr Tyr Leu Asp Ser Phe Ser Val Glu Thr Gln Lys Gly Val Leu Gln Ser Ala Pro Gly Tyr Ser Tyr Thr Lys Asn Gly Tyr Asp Phe 115 120 125 Lys Asn Asn Arg Pro Phe Phe Leu Ala Phe Glu Val Lys Pro Asp Gly 130 135 140 Lys Thr Ile Leu Pro Ser Val Glu Leu Ser Leu Ile Lys Thr Pro Arg 145 150 155 160 Gly Phe Leu Gly Val Phe Leu Phe Asp Asn Asn Glu Lys Gly Thr Asn Ala Lys Trp Ile Glu Gly Ser Leu Asn Leu Lys Leu Lys Asn Ala Ser Phe Lys Asp Ala Trp Gly Leu Glu Gln
(2) INFORMATION FOR SEQ ID NO:51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1116 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 90...1076 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
Met Ser Asn Ser Met Leu Asp Lys Asn Lys Ala Ile Leu Thr Gly Gly Gly Ala Leu Leu Leu Gly Leu Ile Val Leu Phe Tyr Leu Ala Tyr Arg Pro Lys Ala Glu Val Leu Gln Gly Phe Leu Glu Ala Arg Glu Tyr Ser Val Ser Ser Lys Val Pro Gly Arg ATT GAA AAG GTG TTT GTL' AAA AAA GGC GAT CAC ATT AAA AAG GGC GAT 305 Ile Glu Lys Val Phe Val Lys Lys Gly Asp His Ile Lys Lys Gly Asp ' Leu Val Phe Ser Ile Ser Ser Pro Glu Leu Glu Ala Lys Leu Ala Gln Ala Glu Ala Gly His Lys Ala Ala Lys Ala Leu Ser Asp Glu Val Lys Arg Gly Sex Arg Asp Glu Thr Ile Asn Ser Ala Arg Asp Val Trp Gln l05 110 115 120 Ala Ala Lys Ser Gln Ala Thr Leu Ala Lys Glu Thr Tyr Lys Arg Val 125 130 l35 -- Gln Asp Leu Tyr Asp Asn Gly Val Ala Ser Leu Gln Lys Arg Asp Glu 140 l45 150 Ala Tyr Ala Ala Tyr Glu Ser Thr Lys Tyr Asn Glu Ser Ala Ala Tyr Gln Lys Tyr Lys Met Ala Leu Gly Gly Ala Ser Ser Glu Ser Lys Ile Ala Ala Lys Ala Lys Glu 5er Ala Ala Leu Gly Gln Val Asn Glu Val Glu Ser Tyr Leu Lys Asp Val Lys Ala Thr Ala Pro Ile Asp Gly Glu Val Ser Asn Val Leu Leu Ser Gly Gly Glu Leu Ser Pro Lys Gly Phe Pro Val Val Leu Met Ile Asp Leu Lys Asp Ser Trp Leu Lys Ile Ser Val Pro Glu Lys Tyr Leu Asn Glu Phe Lys Val Gly Lys Glu Phe Glu Gly Tyr Ile Pro Ala Leu Lys Lys Ser Thr Lys Phe Arg Val Lys Tyr ' 265 270 275 280 WO 98/2I225 PCT/US97i21353 Leu Ser Val Met Gly Asp Phe Ala Thr Trp Lys Ala Thr Asn Asn Ser Asn Thr Tyr Asp Met Lys Ser Tyr Glu Val Glu Ala Ile Pro Leu Glu Glu Leu Glu Asn Phe Arg Val Gly Met Ser Val Leu Val Thr Ile Lys CCT TAAAAAGGAT TGTTTTGTTC AGATTGATAA GCGCATGGGT 1l16 Pro (2) INFORMATION FOR SEQ ID N0:52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
Met Ser Asn Ser Met Leu Asp Lys Asn Lys Ala Ile Leu Thr Gly Gly
Gly Ala Leu Leu Leu Gly Leu Ile Val Leu Phe Tyr Leu Ala Tyr Arg
Pro Lys Ala Glu Val Leu Gln Gly Phe Leu Glu Ala Arg Glu Tyr Ser
Val Ser Ser Lys Val Pro Gly Arg Ile Glu Lys Val Phe Val Lys Lys
Gly Asp His Ile Lys Lys Gly Asp Leu Val Phe Ser Ile Ser Ser Pro
Glu Leu Glu Ala Lys Leu Ala Gln Ala Glu Ala Gly His Lys Ala Ala
Lys Ala Leu Ser Asp Glu Val Lys Arg Gly Ser Arg Asp Glu Thr Ile
Asn Ser Ala Arg Asp Val Trp Gln Ala Ala Lys Ser Gln Ala Thr Leu
Ala Lys Glu Thr Tyr Lys Arg Val Gln Asp Leu Tyr Asp Asn Gly Val
Ala Ser Leu Gln Lys Arg Asp Glu Ala Tyr Ala Ala Tyr Glu Ser Thr
Lys Tyr Asn Glu Ser Ala Ala Tyr Gln Lys Tyr Lys Met Ala Leu Gly
Gly Ala Ser Ser Glu Ser Lys Ile Ala Ala Lys Ala Lys Glu Ser Ala
Ala Leu Gly Gln Val Asn Glu Val Glu Ser Tyr Leu Lys Asp Val Lys
Ala Thr Ala Pro Ile Asp Gly Glu Val Ser Asn Val Leu Leu Ser Gly
Gly Glu Leu Ser Pro Lys Gly Phe Pro Val Val Leu Met Ile Asp Leu
Lys Asp Ser Trp Leu Lys Ile Ser Val Pro Glu Lys Tyr Leu Asn Glu
Phe Lys Val Gly Lys Glu Phe Glu Gly Tyr Ile Pro Ala Leu Lys Lys
Ser Thr Lys Phe Arg Val Lys Tyr Leu Ser Val Met Gly Asp Phe Ala
Thr Trp Lys Ala Thr Asn Asn Ser Asn Thr Tyr Asp Met Lys Ser Tyr
Glu Val Glu Ala Ile Pro Leu Glu Glu Leu Glu Asn Phe Arg Val Gly
Met Ser Val Leu Val Thr Ile Lys Pro
(2) INFORMATION FOR SEQ ID NO:53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1514 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 94...1467 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
Met Leu Glu Thr Ser Ser His Phe Leu Lys Ser Phe Arg Leu Lys Arg Tyr Ile Gly Phe Leu Leu Ile TCT TTA GCG TTA TTA ATC ACG CCC TTT GTT CGC ATT GAT GGG GCG CAT 2l0 Ser Leu Ala Leu Leu Ile Thr Pro Phe Val Arg Ile Asp Gly Ala His Leu Phe Leu Ile Ser Phe Glu His Lys Gln Leu His Phe Leu Gly Lys ~ Ile Phe Ser Ala Glu Glu Leu Gln Val Met Pro Phe Met Val Ile Leu Leu Phe Ile Gly Ile Phe Phe Ile Thr Thr Ser Leu Gly Arg Val Trp Cys Gly Trp Ala Cys Pro Gln Thr Phe Leu Arg Val Leu Tyr Arg Asp Val Ile Glu Thr Lys Ile Phe Lys Leu His Lys Lys Ile Ser Asn Lys l05 l10 115 Gln Glu Ser Pro Lys Asn Thr Pro Ser Tyr Lys Ile Arg Lys Val Leu 120 125 l30 13S
Ser Val Leu Leu Phe Ala Pro Val Val Ala Gly Leu Met Met Leu Phe Phe Phe Tyr Phe Ile Ala Pro Glu Asp Phe Phe Met Tyr Leu Lys Asn 155 l60 165 Pro Ser Asp His Pro Ile Ala Met Gly Phe Trp Leu Phe Ser Thr Ala Val Val Leu Phe Asp Ile Val Val Val Ala Glu Arg Phe Cys Ile Tyr 185 l90 195 Leu Cys Pro Tyr Ala Arg Val Gln Ser Val Leu Tyr Asp Asn Asp Thr 200 205 2l0 215 Leu Asn Pro Ile Tyr Asp Glu Lys Arg Gly Gly Ala Leu Tyr Asn Asn Gln Gly His Leu Phe Pro Leu Pro Pro Lys Lys Arg Ser Pro Glu Asn Glu Cys Val Asn Cys Leu His Cys Val Gln Val Cys Pro Thr His Ile Asp Ile Arg Lys Gly Leu Gln Leu Glu Cys Ile Asn Cys Leu Glu Cys Val Asp Ala Cys Thr Ile Thr Met Ala Lys Phe Asn Arg Pro Ser Leu Ile Gln Trp Ser Ser Thr Asn Ala Ile Asn Thr Arg Gln Lys Val His Leu Val Arg Leu Lys Thr Ile Ala Tyr Met C;ly Val Ile Ala Ile Val ATC GCT CTT TTA GCC ATC ACT TCG TTT AAA FAA GAA CGC ATG CTC TTA 1l22 Ile Ala Leu Leu Ala Ile Thr Ser Phe Lys Lys Glu Arg Met Leu Leu GAC ATT AAC CGC AAC AGC GAT CTG TAT GAA TTG CGC TCT AGC GGG TAT. 1170 Asp Ile Asn Arg Asn 5er Asp Leu Tyr Glu Leu Arg Ser Ser Gly Tyr GTG GAT AAC GAT TAC GTG TTT TTA TTC CAC F,AC ACG GAC AAT AAA GAC 1218 Val Asp Asn Asp Tyr Val Phe Leu Phe His Asn Thr Asp Asn Lys Asp His Glu Phe Tyr Phe Lys Val Leu Gly Gln hys Asp Ile Gln Ile Lys AAG CCT TTA AAT CCT ATC GCC ATT AAA GCC C;GG CAA AAG ATT AAA GCG 1314 Lys Pro Leu Asn Pro Ile Ala Ile Lys Ala C;ly Gln Lys Ile Lys Ala GTA GTG ATT TTA AGA AAA CCC CTA AAG AGT F~AC GCC ACA GAA TAC AAG 1362 Val Val Ile Leu Arg Lys Pro Leu Lys Ser F~sn Ala Thr Glu Tyr Lys 4l0 415 420 AAC GCT AAA GAC GCT CTA ATC CCC ATT ACC F~TA CAA GCT TAT AGC GCG 1410 Asn Ala Lys Asp Ala Leu Ile Pro Ile Thr I:le Gln Ala Tyr Ser A1a Asp Asp Lys Asn Ile Thr Ile Glu Arg Glu Ser Val Phe Ile Ala Pro 440 445 9.50 455 AGT GAG GAT TGAAGCCTAA AACTAGCGTT CAATCAC'.TTC ATAAGGCAAG CCTTGTT 1514 Ser Glu Asp (2) INFORMATION FOR SEQ ID N0:59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 458 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
Met Leu Glu Thr Ser Ser His Phe Leu Lys Ser Phe Arg Leu Lys Arg
Tyr Ile Gly Phe Leu Leu Ile Ser Leu Ala Leu Leu Ile Thr Pro Phe
Val Arg Ile Asp Gly Ala His Leu Phe Leu Ile Ser Phe Glu His Lys
Gln Leu His Phe Leu Gly Lys Ile Phe Ser Ala Glu Glu Leu Gln Val
Met Pro Phe Met Val Ile Leu Leu Phe Ile Gly Ile Phe Phe Ile Thr
Thr Ser Leu Gly Arg Val Trp Cys Gly Trp Ala Cys Pro Gln Thr Phe
Leu Arg Val Leu Tyr Arg Asp Val Ile Glu Thr Lys Ile Phe Lys Leu
His Lys Lys Ile Ser Asn Lys Gln Glu Ser Pro Lys Asn Thr Pro Ser
Tyr Lys Ile Arg Lys Val Leu Ser Val Leu Leu Phe Ala Pro Val Val
Ala Gly Leu Met Met Leu Phe Phe Phe Tyr Phe Ile Ala Pro Glu Asp
Phe Phe Met Tyr Leu Lys Asn Pro Ser Asp His Pro Ile Ala Met Gly
Phe Trp Leu Phe Ser Thr Ala Val Val Leu Phe Asp Ile Val Val Val
Ala Glu Arg Phe Cys Ile Tyr Leu Cys Pro Tyr Ala Arg Val Gln Ser
Val Leu Tyr Asp Asn Asp Thr Leu Asn Pro Ile Tyr Asp Glu Lys Arg
Gly Gly Ala Leu Tyr Asn Asn Gln Gly His Leu Phe Pro Leu Pro Pro
Lys Lys Arg Ser Pro Glu Asn Glu Cys Val Asn Cys Leu His Cys Val
Gln Val Cys Pro Thr His Ile Asp Ile Arg Lys Gly Leu Gln Leu Glu
Cys Ile Asn Cys Leu Glu Cys Val Asp Ala Cys Thr Ile Thr Met Ala
Lys Phe Asn Arg Pro Ser Leu Ile Gln Trp Ser Ser Thr Asn Ala Ile
Asn Thr Arg Gln Lys Val His Leu Val Arg Leu Lys Thr Ile Ala Tyr
Met Gly Val Ile Ala Ile Val Ile Ala Leu Leu Ala Ile Thr Ser Phe
Lys Lys Glu Arg Met Leu Leu Asp Ile Asn Arg Asn Ser Asp Leu Tyr
Glu Leu Arg Ser Ser Gly Tyr Val Asp Asn Asp Tyr Val Phe Leu Phe
His Asn Thr Asp Asn Lys Asp His Glu Phe Tyr Phe Lys Val Leu Gly
Gln Lys Asp Ile Gln Ile Lys Lys Pro Leu Asn Pro Ile Ala Ile Lys
Ala Gly Gln Lys Ile Lys Ala Val Val Ile Leu Arg Lys Pro Leu Lys
Ser Asn Ala Thr Glu Tyr Lys Asn Ala Lys Asp Ala Leu Ile Pro Ile
Thr Ile Gln Ala Tyr Ser Ala Asp Asp Lys Asn Ile Thr Ile Glu Arg
Glu Ser Val Phe Ile Ala Pro Ser Glu Asp
(2) INFORMATION FOR SEQ ID NO:55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 990 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 228...782 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:
ACGATTTGATCAATAACGAA AATAAAATTGATGAAATC'AATAATGAAGAA AACGCTGATC60 CTTCGCAAAAAAGAACGAAC AACGTTTTGCAACGAGCC'ACTAACCACCAA GACAATCTCA120 ATTCCCCACTCAACAGGAAG TATTAAAGTGTGAAACTT'TTTTCAAAGGAT TTATTTAAAA180 AAGTAACCCCTTTATTTTTA AGCGTTTATTTTTTAAAC'CCCACCATT ATG CAA 236 GCC
Met Gln Ala Lys Ser Arg Phe Tyr Val Ala Ser Gln Tyr Gln Val Gly Lys Met Ile ATG AAA AAA TAC AAC GAT CTC AAA CGC ACG A.TT GAA GGG GCG AGC TTT 332 Met Lys Lys Tyr Asn Asp Leu Lys Arg Thr Ile Glu Gly Ala Ser Phe Ser Leu Gly Trp Glu Ile Asn Pro Thr Asn Tyr Trp Phe Tyr Ser Arg Tyr Tyr Phe Phe Met Asp Tyr Gly Asn Val Ile Leu Asn Lys Arg Thr Gly Ala Gln Ala Asn Met Phe Thr Tyr Gly Phe Gly Gly Asp Leu Ile Val Glu Tyr Asn Lys Asn Pro Leu Tyr Val Phe Ser Leu Phe Tyr Gly Met Gln Val Ala Glu Asn Thr Trp Thr Ile Ser Lys His Ser Ala Asn Phe Ile Ile Asp Asp Trp Arg Ser Ile Gln Gly Phe Ser Leu Lys Thr Ser Asn Phe Arg Met Leu Gly Leu Val Gly Phe Lys Phe Gln Thr Val Leu Phe His His Asp Ala Ser Ile Glu Val Gly Ile Lys Trp Pro Phe Ala Phe Glu Tyr Asp Ser Ala Phe Val Arg Leu Phe Ser Val Phe Ile l65 170 175 Ser His Thr Phe Tyr Leu 180 l85 (2) INFORMATION FOR SEQ ID N0:56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
Met Gln Ala Lys Ser Arg Phe Tyr Val Ala Ser Gln Tyr Gln Val Gly
Lys Met Ile Met Lys Lys Tyr Asn Asp Leu Lys Arg Thr Ile Glu Gly
Ala Ser Phe Ser Leu Gly Trp Glu Ile Asn Pro Thr Asn Tyr Trp Phe
Tyr Ser Arg Tyr Tyr Phe Phe Met Asp Tyr Gly Asn Val Ile Leu Asn
Lys Arg Thr Gly Ala Gln Ala Asn Met Phe Thr Tyr Gly Phe Gly Gly
Asp Leu Ile Val Glu Tyr Asn Lys Asn Pro Leu Tyr Val Phe Ser Leu
Phe Tyr Gly Met Gln Val Ala Glu Asn Thr Trp Thr Ile Ser Lys His
Ser Ala Asn Phe Ile Ile Asp Asp Trp Arg Ser Ile Gln Gly Phe Ser
Leu Lys Thr Ser Asn Phe Arg Met Leu Gly Leu Val Gly Phe Lys Phe
Gln Thr Val Leu Phe His His Asp Ala Ser Ile Glu Val Gly Ile Lys
Trp Pro Phe Ala Phe Glu Tyr Asp Ser Ala Phe Val Arg Leu Phe Ser
Val Phe Ile Ser His Thr Phe Tyr Leu
(2) INFORMATION FOR SEQ ID NO:57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1161 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 109...1113 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
Met Leu Val Thr Arg Phe Lys Lys Ala Phe Ile Ser Tyr Ser Leu Gly Val Leu Val Ala Ser Leu Trp Leu Asn Val Cys Asn Ala Ser Ala Gln Glu Val Lys GTC AAG GAT TAT TTC GGG GAG CAA ACC ATC AAG CTT CCT GTT TCT AAA 26l Val Lys Asp Tyr Phe Gly Glu Gln Thr Ile Lys Leu Pro Val Ser Lys Ile Ala Tyr Ile Gly Ser Tyr Val Glu Val Pro Ala Met Leu Asn Val Trp Asn Arg Val Val Gly Val Ser Asp Tyr Ala Phe Lys Asp Asp Ile Val Lys Ala Thr Leu Lys Gly Glu Asp Leu Lys Arg Val Lys His Met Ser Thr Asp His Thr Ala Ala Leu Asn Val Glu Leu Leu Lys Lys Leu Ser Pro Asp Leu Val Val Thr Phe Val Gly Asn Pro Lys Ala Val Glu His Ala Lys Lys Phe Gly Ile Ser Phe Leu Ser Phe Gln Glu Thr Thr Ile Ala Glu Ala Met Gln Ala Met Gln Ala G1n Ala Thr Val Leu Glu l50 155 160 Ile Asp Ala Ser Lys Lys Phe Ala Lys Met Gln Glu Thr Leu Asp Phe l65 170 175 Ile Ala Glu Arg Leu Lys Asn Val Lys Lys Lys Lys Gly Va1 Glu Leu 180 l85 190 l95 Phe His Lys Ala Asn Lys Ile Ser Gly His Gln Ala Ile Ser Ser Asp Ile Leu Glu Lys Gly Gly Ile Asp Asn Phe Gly Leu Lys Tyr Val Lys Phe Gly Arg Ala Asp Ile Ser Val Glu Lys Ile Val Lys Glu Asn Pro Glu Ile Ile Phe Ile Trp Trp Ile Ser Pro Leu Thr Pro Glu Asp Val Leu Asn Asn Pro Lys Phe Ala Thr Ile Lys Ala Ile Lys Asn Lys Gln Val Tyr Lys Leu Pro Thr Met Asp Ile Gly Gly Pro Arg Ala Pro Leu Ile Ser Leu Phe Ile Ala Leu Lys Ala His Pro Glu Ala Phe Lys Gly GTG GAT ATT AAT GCG ATG GTT AAA GAC TAC T'AT AAA GTG GTT TTT_GAT 1077 Val Asp Ile Asn Ala Met Val Lys Asp Tyr Tyr Lys Val Val Phe Asp ~ Leu Asn Asp Ala Glu Val Glu Pro Phe Leu Trp His GTTGATGTTT TTAGCCTTTC GTGTATCGCG CT l161 (2) INFORMATION FOR SEQ ID N0:58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 335 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
Met Leu Val Thr Arg Phe Lys Lys Ala Phe Ile Ser Tyr Ser Leu Gly
Val Leu Val Ala Ser Leu Trp Leu Asn Val Cys Asn Ala Ser Ala Gln
Glu Val Lys Val Lys Asp Tyr Phe Gly Glu Gln Thr Ile Lys Leu Pro
Val Ser Lys Ile Ala Tyr Ile Gly Ser Tyr Val Glu Val Pro Ala Met
Leu Asn Val Trp Asn Arg Val Val Gly Val Ser Asp Tyr Ala Phe Lys
Asp Asp Ile Val Lys Ala Thr Leu Lys Gly Glu Asp Leu Lys Arg Val
Lys His Met Ser Thr Asp His Thr Ala Ala Leu Asn Val Glu Leu Leu
Lys Lys Leu Ser Pro Asp Leu Val Val Thr Phe Val Gly Asn Pro Lys
Ala Val Glu His Ala Lys Lys Phe Gly Ile Ser Phe Leu Ser Phe Gln
Glu Thr Thr Ile Ala Glu Ala Met Gln Ala Met Gln Ala Gln Ala Thr
Val Leu Glu Ile Asp Ala Ser Lys Lys Phe Ala Lys Met Gln Glu Thr
Leu Asp Phe Ile Ala Glu Arg Leu Lys Asn Val Lys Lys Lys Lys Gly
Val Glu Leu Phe His Lys Ala Asn Lys Ile Ser Gly His Gln Ala Ile
Ser Ser Asp Ile Leu Glu Lys Gly Gly Ile Asp Asn Phe Gly Leu Lys
Tyr Val Lys Phe Gly Arg Ala Asp Ile Ser Val Glu Lys Ile Val Lys
Glu Asn Pro Glu Ile Ile Phe Ile Trp Trp Ile Ser Pro Leu Thr Pro
Glu Asp Val Leu Asn Asn Pro Lys Phe Ala Thr Ile Lys Ala Ile Lys
Asn Lys Gln Val Tyr Lys Leu Pro Thr Met Asp Ile Gly Gly Pro Arg
Ala Pro Leu Ile Ser Leu Phe Ile Ala Leu Lys Ala His Pro Glu Ala
Phe Lys Gly Val Asp Ile Asn Ala Met Val Lys Asp Tyr Tyr Lys Val
Val Phe Asp Leu Asn Asp Ala Glu Val Glu Pro Phe Leu Trp His
(2) INFORMATION FOR SEQ ID NO:59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 800 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 121...669 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
TGAAATCAAA CAAAGCCAAA AAGAAAAGAA AAAATTCCCC ACTTTCAAAG GAGGTTTTTA l20 Met Arg Trp Trp Cys Phe Leu Val Cys Cys Phe Gly Ile Leu Ser Val Met Asp Ala Lys Lys Leu Glu Asn Lys Asn Leu Lys Lys Glu Arg Glu Leu Leu Glu Ile Thr Gly Asn Gln Phe Val Ala Asn Asp Lys Thr Lys Thr Ala Val Ile Gln Gly Asn Val Gln Ile Lys Lys Gly Lys Asp Arg Leu Phe Ala Asp Lys Val Ser Val Phe Leu Asn Asp Lys Arg Lys Pro Glu Arg Tyr Glu Ala Thr Gly Asn Thr His Phe Asn Ile Phe Thr Glu GAC AAT CGT GAA ATC AGC GGG AGT GCT GAC A.AG CTC ATT TAT AAC GCG 456 Asp Asn Arg Glu Ile Ser Gly Ser Ala Asp L~ys Leu Ile Tyr Asn Ala Leu Asn Gly Glu Tyr Lys Leu Leu Gln Asn A.la Val Val Arg Glu Val GGG AAA TCC AAT GTC ATC ACC GGC GAT GAA A.TC ATT TTA AAC AAA ACT 552 Gly Lys Ser Asn Val Ile Thr Gly Asp Glu Ile Ile Leu Asn Lys Thr 130 135 l40 Lys Gly Tyr Ala Asp Val Leu Gly Ser Ala Lys Arg Pro Ala Lys Phe 145 150 l55 160 Val Phe Asp Met Glu Asp Ile Asn Glu Glu Asn Arg Lys Ala Lys Leu 165 170 l75 Lys Lys Lys Gly Glu Lys Pro (2) INFORMATION FOR SEQ ID N0:60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 183 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
Met Arg Trp Trp Cys Phe Leu Val Cys Cys Phe Gly Ile Leu Ser Val
Met Asp Ala Lys Lys Leu Glu Asn Lys Asn Leu Lys Lys Glu Arg Glu
Leu Leu Glu Ile Thr Gly Asn Gln Phe Val Ala Asn Asp Lys Thr Lys
Thr Ala Val Ile Gln Gly Asn Val Gln Ile Lys Lys Gly Lys Asp Arg
Leu Phe Ala Asp Lys Val Ser Val Phe Leu Asn Asp Lys Arg Lys Pro
Glu Arg Tyr Glu Ala Thr Gly Asn Thr His Phe Asn Ile Phe Thr Glu
Asp Asn Arg Glu Ile Ser Gly Ser Ala Asp Lys Leu Ile Tyr Asn Ala
Leu Asn Gly Glu Tyr Lys Leu Leu Gln Asn Ala Val Val Arg Glu Val
Gly Lys Ser Asn Val Ile Thr Gly Asp Glu Ile Ile Leu Asn Lys Thr
Lys Gly Tyr Ala Asp Val Leu Gly Ser Ala Lys Arg Pro Ala Lys Phe
Val Phe Asp Met Glu Asp Ile Asn Glu Glu Asn Arg Lys Ala Lys Leu
Lys Lys Lys Gly Glu Lys Pro
(2) INFORMATION FOR SEQ ID NO:61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 724 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 88...618 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:
Met Lys Leu Ile Lys Phe Val Arg Asn Val Val Leu Phe Ile Leu Thr Ala Ile Phe Leu Ala Phe Met Leu Leu Val Ser Tyr Cys Met Pro His Tyr Ser Ala Ala Val Ile Ser Gly Val Glu Val Lys Arg Met Asn Glu Asn Glu Asn Thr Pro Asn Asn Lys Glu Val Lys Thr Leu Ala Arg Asp Val Tyr Phe Val Gln Thr Tyr Asp Pro Lys Asp Gln Lys Ser Val Thr Val Tyr Arg Asn Glu Asp Thr Arg Phe Ser Phe Pro Phe Tyr Phe Lys Phe Asn Ser Ala Asp Ile Ser Ala Leu Ala Gln Ser Leu Ile Asn Gln Gln Val Glu Val Lys Tyr Tyr Gly Trp Arg I1e Asn Leu Phe Asn Met Phe Pro Asn Val Ile Phe Leu Lys Pro Leu Lys Glu Ser Thr Asp Ile Ser Lys Pro Ile Phe Ser Trp Ile Leu Tyr Ala Leu Leu Leu Met Gly Phe Phe Ile Ser Ala Arg Ser Val Cys 155 160 l65 ACT TTA TTT AAG AGC AAA GCT CAT TAAAACTTT'T AGGCTTTGTT GGAAAATCAC 648 Thr Leu Phe Lys Ser Lys Ala His AATGGGGTTA TTGGAGCGTG TATTAAAAAG CTCAATAT.?~G GGCAAGCTGA TGCTGTGAAA 708 (2) INFORMATION FOR 5EQ ID N0:62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 177 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
Met Lys Leu Ile Lys Phe Val Arg Asn Val Val Leu Phe Ile Leu Thr
Ala Ile Phe Leu Ala Phe Met Leu Leu Val Ser Tyr Cys Met Pro His
Tyr Ser Ala Ala Val Ile Ser Gly Val Glu Val Lys Arg Met Asn Glu
Asn Glu Asn Thr Pro Asn Asn Lys Glu Val Lys Thr Leu Ala Arg Asp
Val Tyr Phe Val Gln Thr Tyr Asp Pro Lys Asp Gln Lys Ser Val Thr
Val Tyr Arg Asn Glu Asp Thr Arg Phe Ser Phe Pro Phe Tyr Phe Lys
Phe Asn Ser Ala Asp Ile Ser Ala Leu Ala Gln Ser Leu Ile Asn Gln
Gln Val Glu Val Lys Tyr Tyr Gly Trp Arg Ile Asn Leu Phe Asn Met
Phe Pro Asn Val Ile Phe Leu Lys Pro Leu Lys Glu Ser Thr Asp Ile
Ser Lys Pro Ile Phe Ser Trp Ile Leu Tyr Ala Leu Leu Leu Met Gly
Phe Phe Ile Ser Ala Arg Ser Val Cys Thr Leu Phe Lys Ser Lys Ala
His
(2) INFORMATION FOR SEQ ID NO:63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 982 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 117...911 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 117...167 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 168...911 (D) OTHER INFORMATION:
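The three feature entries above partition a single open reading frame: the signal peptide occupies positions 117 to 167 and the mature peptide positions 168 to 911, together spanning the full coding sequence. A minimal sketch of that bookkeeping follows; the dictionary layout and the helper name `span_length` are illustrative assumptions, not part of the listing.

```python
def span_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive span."""
    return end - start + 1


# Feature coordinates as stated for SEQ ID NO:63
features = {
    "Coding Sequence": (117, 911),
    "sig_peptide": (117, 167),
    "mat_peptide": (168, 911),
}

cds_start, cds_end = features["Coding Sequence"]
sig_start, sig_end = features["sig_peptide"]
mat_start, mat_end = features["mat_peptide"]

# The signal and mature peptides should tile the coding sequence exactly.
contiguous = (
    sig_start == cds_start
    and mat_start == sig_end + 1
    and mat_end == cds_end
)

sig_residues = span_length(sig_start, sig_end) // 3
mat_residues = span_length(mat_start, mat_end) // 3

print(contiguous, sig_residues, mat_residues, sig_residues + mat_residues)
# True 17 248 265
```

The total of 265 residues matches the length stated for the corresponding protein, SEQ ID NO:64, with a 17-residue signal peptide preceding a 248-residue mature peptide.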
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
TGGTTAAAAA GGACACAATA AACCCCAAAA ATGAAATTTA AATATATGGG AACTTA ATG l19 Met Arg Ile Phe Phe Val Ile Met Gly Leu Val Phe Phe Gly Cys Thr Ser Lys Val His Glu Met Lys Lys Ser Pro Cys Thr Leu Tyr Glu Asn Arg Leu Asn Leu Ala Glu Ile Phe His Lys Arg Ala Ile Asp Leu Phe Arg Glu Leu Leu Ser His Gln Glu Lys His Leu Glu Asn Lys Leu Ser Gly Phe SerValSer AspLeuAsp MetGlnSerValPhe ArgLeuGlu Arg Asn ArgLeuLys IleAlaTyr LysLeuLeuGlyLeu MetSerPhe Ile Ala LeuIleLeu AlaIleVal LeuI1eSerLeuLeu ProLeuGln Lys Thr GluHisHis PheValAsp PheLeuAsnGlnAsp LysHisTyr Val 100 105 1l0 Ile IleGlnArg AlaAspLys SerIleSerSerAsn GluAlaLeu Ala Arg SerLeuIle GlyAlaTyr ValLeuAsnArgGlu SerIleAsn Arg 130 135 l40 ATT GACGATAAA TCGCGCTAT GAATTGGTGC~~CTTG CAAAGCAGT TCT 647 Ile AspAspLys SerArgTyr GluLeuValArgLeu GlnSerSer Ser Lys ValTrpGln ArgPheGlu AspLeuIleL-ysThr GlnAsnSer Ile l65 170 175 TAT GTGCAAAGC CATTTGGAA AGAGAAGTCC.ATATC GTCAATATT GCG 743 Tyr ValGlnSer HisLeuGlu ArgGluValHisIle ValAsnIle Ala Ile TyrGlnGln AspAsnAsn ProIleAlaSe Val SerIleAla Ala r AAA CTTTTGAAT GAAAACAAG CTGGTGTATG:~AAAG CGTTATAAA ATC 839 Lys LeuLeuAsn GluAsnLys LeuValTyrG.LuLys ArgTyrLys Ile GTA TTGAGTTAT TTGTTTGAC ACCCCGATGAi'~TTCA AGCTTGCAA GCT 887 Val LeuSerTyr LeuPheAsp ThrProMetA;>nSer SerLeuGln Ala _ TGC AAGCTCTCA GGCTTCATA GTTTGACATGACi~ 941 TATAGATGAG
CTTTATGCGG
Cys LysLeuSer GlyPheIle Val ACAGAATGGC
TAACGCAGCA
GGCACCGA(3T
(2) INFORMATION FOR SEQ ID NO:64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
Met Arg Ile Phe Phe Val Ile Met Gly Leu Val Phe Phe Gly Cys Thr
Ser Lys Val His Glu Met Lys Lys Ser Pro Cys Thr Leu Tyr Glu Asn
Arg Leu Asn Leu Ala Glu Ile Phe His Lys Arg Ala Ile Asp Leu Phe
Arg Glu Leu Leu Ser His Gln Glu Lys His Leu Glu Asn Lys Leu Ser
Gly Phe Ser Val Ser Asp Leu Asp Met Gln Ser Val Phe Arg Leu Glu
Arg Asn Arg Leu Lys Ile Ala Tyr Lys Leu Leu Gly Leu Met Ser Phe
Ile Ala Leu Ile Leu Ala Ile Val Leu Ile Ser Leu Leu Pro Leu Gln
Lys Thr Glu His His Phe Val Asp Phe Leu Asn Gln Asp Lys His Tyr
Val Ile Ile Gln Arg Ala Asp Lys Ser Ile Ser Ser Asn Glu Ala Leu
Ala Arg Ser Leu Ile Gly Ala Tyr Val Leu Asn Arg Glu Ser Ile Asn
Arg Ile Asp Asp Lys Ser Arg Tyr Glu Leu Val Arg Leu Gln Ser Ser
Ser Lys Val Trp Gln Arg Phe Glu Asp Leu Ile Lys Thr Gln Asn Ser
Ile Tyr Val Gln Ser His Leu Glu Arg Glu Val His Ile Val Asn Ile
Ala Ile Tyr Gln Gln Asp Asn Asn Pro Ile Ala Ser Val Ser Ile Ala
Ala Lys Leu Leu Asn Glu Asn Lys Leu Val Tyr Glu Lys Arg Tyr Lys
Ile Val Leu Ser Tyr Leu Phe Asp Thr Pro Met Asn Ser Ser Leu Gln
Ala Cys Lys Leu Ser Gly Phe Ile Val
(2) INFORMATION FOR SEQ ID NO:65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2059 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 183...1961 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
GTTTTTTAAA
TACTTTGGCT
AGTCATTTTGATTTCTAAAA ATAGTCTATA ATGCTCG('_AA TTAAGGTTAT180 GAGATATTTT
CA ATG GCT ATA AAA ATA CTT TTT ATA A'..~G TTA AAC 227 AAA ACA CTC AGT
Met Lys Ala Ile Lys Ile Leu Phe Ile Me't Leu Asn Thr Leu Ser GCT ATC AGC GTG AAT AGG GCG TTG TTT GAT 7:'TA AAA GAT TCG CAA TTA 275 Ala Ile Ser Val Asn Arg Ala Leu Phe Asp Leu Lys Asp Ser Gln Leu AAA GGG GAA TTA ACG CCA AAA ATA GTG AAT 7.'TT GGG GGT TAT AAA AGC 323 Lys Gly Glu Leu Thr Pro Lys Ile Val Asn F>he Gly Gly Tyr Lys Ser AGC ACT GAA GAG TGG GGG GCT ACG GCT TTA dIAC TAT ATC AAT GCG GCT 371 Ser Thr Glu Glu Trp Gly Ala Thr Ala Leu Asn Tyr Ile Asn Ala Ala AAT GGC GAT GCG AAA AAA TTC AGC ACT CTA CyTG GAA AAA ATG CGT TTT 419 Asn Gly Asp Ala Lys Lys Phe Ser Thr Leu Val Glu Lys Met Arg Phe AAC TCC GGT ATA TTG GGG AAT TTA AGA GTG C:AT GCA CGT TTG AGG CAA 467 Asn Ser G1y Ile Leu Gly Asn Leu Arg Val His Ala Arg Leu Arg Gln 80 85 ~)0 95 GCC CTA AAA TTG CAA AAG AAT TTG AAA TAT 7.'GC CTT AAA ATC ATC GCT 515 Ala Leu Lys Leu Gln Lys Asn Leu Lys Tyr C'ys Leu Lys Ile Ile Ala Arg Asp Ser Phe Tyr Ser Tyr Arg Thr Gly l:le Tyr Ile Pro Leu Gly 115 l20 125 Ile Ser Leu Lys Asp Gln Lys Thr Ala Gln Lys Met Leu Ala Asp Leu AGC GTG GTA GGG GCG TAT CTT AAA AAA CAA C:AA GAG AAT GAA AAG GCT 659 Ser Val Val Gly Ala Tyr Leu Lys Lys Gln C:ln Glu Asn Glu Lys Ala Gln Ser Pro Tyr Tyr Arg Asn Asn Asn Tyr Tyr Asn Ser Tyr Tyr Ser 160 165 l70 175 Pro Tyr Tyr Gly Met Tyr Gly Met Tyr Gly Met Gly Met Tyr Gly Met Tyr Gly Met Gly Met Tyr Asp Phe Tyr Asp Phe Tyr Asp Gly Met Tyr GGA TTC TAC CCT AAC ATG TTT TTC ATG ATG CAA GTT CAA GAT TAC TTG 85l Gly Phe Tyr Pro Asn Met Phe Phe Met Met Gln Val Gln Asp Tyr Leu 210 2l5 220 Met Leu Glu Asn Tyr Met Tyr Ala Leu Asp Gln Glu Glu Ile Leu Asp His Asp Ala 5er Thr Asp Gln Leu Asp Thr Pro Thr Asp Asp Asp Lys Asp Asp Lys Asp Asp Lys Ser Leu Gln Gln Ala Asn Leu Met Asn Phe Tyr Arg Asp Pro Lys Phe Ser Lys Gly Ile Gln Thr Asn Arg Leu Asn Ser Ala Leu Val Asn Leu Asp Asn 5er Arg Met Leu Lys Asp Asn Ser Leu Phe His Thr Lys Ala Met Pro Thr Lys Ser Val Asp Ala Ile Thr Ser Gln Ala Lys Glu Leu Asn His Leu Val Gly Gln Ile Lys Glu Met Lys Gln Asp Gly Ala Ser Pro Ser Lys Ile Asp Ser 
Val Val Asn Lys Ala Met Glu Val Arg Asp Lys Leu Asp Asn Asn Leu Asn Gln Leu Asp AAT GAC TTA AAA GAT CAA AAA GGG CTT TCA AGC GAG CAA CAA GCT CAA 133l Asn Asp Leu Lys Asp Gln Lys Gly Leu Ser Ser Glu Gln Gln Ala Gln GTG GATAAAGCC CTAGACAGCGTG CAACAA'TTAAGC CATAGCAGCGAT 1379 Val AspLysAla LeuAspSerVal GlnGln:LeuSer HisSerSerAsp GTG GTGGGGAAT TATTTAGACGGG AGTTTG.AAAATT GATGGCGATGAT 1427 Val ValGlyAsn TyrLeuAspGly SerLeu:LysI1e AspGlyAspAsp Arg AspAspLeu AsnAspAlaMet AsnAsnProMet GlnGlnProVal CAA CAAACGCCT ACTAGCAACATG GCCGAC.ACCCAT GCAAATGACAGC 1523 Gln GlnThrPro ThrSerAsnMet AlaAsp'rhrHis AlaAsnAspSer AAG GATCAAGGG AGTAACGCGCTC ATAAACCCTAAC AGCGCCACTAAC 157l Lys AspGlnGly SerAsnAlaLeu IleAsnProAsn SerAlaThrAsn GCC GACGACACT CACACTGACGAT ACTCAC.ACTGAC ACTAACACCACA 1619 Ala AspAspThr HisThrAspAsp ThrHis'ThrAsp ThrAsnThrThr Asn AspAlaSer ThrThrAspThr ProThr.AspAsp LysAspAlaSer GGC TTGAACAAT ACCGGCGATA'rGAATAAC.ACGGAT ACCGGCAACACG 1715 Gly LeuAsnAsn ThrGlyAspMet AsnAsn'rhrAsp ThrGlyAsnThr Asp ThrGlyAsn ThrAspThrGly AsnThr,AspAsp MetSerAsnMet AAC AACGGCAAC GATGATACGGGT AACGCT.AATGAC GACATGAGCAAC 1811 Asn AsnGlyAsn AspAspThrGly AsnAla.AsnAsp AspMetSerAsn Gly AsnAspMet GlyAspAspLeu AsnAsn.AlaAsn AspMetAsnAsp Asp MetGlyAsn GlyAsnAspAsp MetGly.AspMet GlyAspMetAsn Asp AspMetGly GlyAspMetGly AspMetGlyAsp MetGlyAspMet ' GGG AATTGAGATTAAC CCCAATATCA TAAGGAATAT
AAGAGTGATA TT
GCCAAAACTT
Gly Asn (2) INFORMATION FOR SEQ ID N0:66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 593 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
Met Lys Ala Ile Lys Ile Leu Phe Ile Met Thr Leu Ser Leu Asn Ala
Ile Ser Val Asn Arg Ala Leu Phe Asp Leu Lys Asp Ser Gln Leu Lys
Gly Glu Leu Thr Pro Lys Ile Val Asn Phe Gly Gly Tyr Lys Ser Ser
Thr Glu Glu Trp Gly Ala Thr Ala Leu Asn Tyr Ile Asn Ala Ala Asn
Gly Asp Ala Lys Lys Phe Ser Thr Leu Val Glu Lys Met Arg Phe Asn
Ser Gly Ile Leu Gly Asn Leu Arg Val His Ala Arg Leu Arg Gln Ala
Leu Lys Leu Gln Lys Asn Leu Lys Tyr Cys Leu Lys Ile Ile Ala Arg
Asp Ser Phe Tyr Ser Tyr Arg Thr Gly Ile Tyr Ile Pro Leu Gly Ile
Ser Leu Lys Asp Gln Lys Thr Ala Gln Lys Met Leu Ala Asp Leu Ser
Val Val Gly Ala Tyr Leu Lys Lys Gln Gln Glu Asn Glu Lys Ala Gln
Ser Pro Tyr Tyr Arg Asn Asn Asn Tyr Tyr Asn Ser Tyr Tyr Ser Pro
Tyr Tyr Gly Met Tyr Gly Met Tyr Gly Met Gly Met Tyr Gly Met Tyr
Gly Met Gly Met Tyr Asp Phe Tyr Asp Phe Tyr Asp Gly Met Tyr Gly
Phe Tyr Pro Asn Met Phe Phe Met Met Gln Val Gln Asp Tyr Leu Met
Leu Glu Asn Tyr Met Tyr Ala Leu Asp Gln Glu Glu Ile Leu Asp His
Asp Ala Ser Thr Asp Gln Leu Asp Thr Pro Thr Asp Asp Asp Lys Asp
Asp Lys Asp Asp Lys Ser Leu Gln Gln Ala Asn Leu Met Asn Phe Tyr
Arg Asp Pro Lys Phe Ser Lys Gly Ile Gln Thr Asn Arg Leu Asn Ser
Ala Leu Val Asn Leu Asp Asn Ser Arg Met Leu Lys Asp Asn Ser Leu
Phe His Thr Lys Ala Met Pro Thr Lys Ser Val Asp Ala Ile Thr Ser
Gln Ala Lys Glu Leu Asn His Leu Val Gly Gln Ile Lys Glu Met Lys
Gln Asp Gly Ala Ser Pro Ser Lys Ile Asp Ser Val Val Asn Lys Ala
Met Glu Val Arg Asp Lys Leu Asp Asn Asn Leu Asn Gln Leu Asp Asn
Asp Leu Lys Asp Gln Lys Gly Leu Ser Ser Glu Gln Gln Ala Gln Val
Asp Lys Ala Leu Asp Ser Val Gln Gln Leu Ser His Ser Ser Asp Val
Val Gly Asn Tyr Leu Asp Gly Ser Leu Lys Ile Asp Gly Asp Asp Arg
Asp Asp Leu Asn Asp Ala Met Asn Asn Pro Met Gln Gln Pro Val Gln
Gln Thr Pro Thr Ser Asn Met Ala Asp Thr His Ala Asn Asp Ser Lys
Asp Gln Gly Ser Asn Ala Leu Ile Asn Pro Asn Ser Ala Thr Asn Ala
Asp Asp Thr His Thr Asp Asp Thr His Thr Asp Thr Asn Thr Thr Asn
Asp Ala Ser Thr Thr Asp Thr Pro Thr Asp Asp Lys Asp Ala Ser Gly
Leu Asn Asn Thr Gly Asp Met Asn Asn Thr Asp Thr Gly Asn Thr Asp
Thr Gly Asn Thr Asp Thr Gly Asn Thr Asp Asp Met Ser Asn Met Asn
Asn Gly Asn Asp Asp Thr Gly Asn Ala Asn Asp Asp Met Ser Asn Gly
Asn Asp Met Gly Asp Asp Leu Asn Asn Ala Asn Asp Met Asn Asp Asp
Met Gly Asn Gly Asn Asp Asp Met Gly Asp Met Gly Asp Met Asn Asp
Asp Met Gly Gly Asp Met Gly Asp Met Gly Asp Met Gly Asp Met Gly
Asn
(2) INFORMATION FOR SEQ ID NO:67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1527 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 112...1461 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
Met Ser Met Glu Phe Asp Ala Val Ile Ile Gly Gly Gly Val Ser Gly Cys Ala Thr Phe Tyr Thr Leu Ser Glu Tyr Ser Ser Leu Lys Arg Val Ala Ile Val Glu Lys Cys Ser Lys Leu Ala Gln Ile Ser Ser Ser Ala Lys Ala Asn Ser Gln Thr Ile His Asp Gly Ser Ile Glu Thr Asn Tyr Thr Pro Glu Lys Ala Lys Lys Val Arg Leu Ser Ala Tyr Lys Thr Arg Gln Tyr Ala Leu Asn Lys Gly Leu Gln Asn Glu Val Ile Phe Glu Thr Gln Lys Met Ala Ile Gly Val Gly Asp Glu Glu Cys Glu Phe Met Lys Lys Arg l00 10S 110 Tyr Glu Ser Phe Lys Glu Ile Phe Val Gly Leu Glu Glu Phe Asp Lys l15 120 125 130 Gln Lys Ile Lys Glu Leu Glu Pro Asn Val Ile Leu Gly Ala Asn Gly Ile Asp Arg His Glu Asn Ile Ile Gly His Gly Tyr Arg Lys Asp Trp Ser Thr Met Asn Phe Ala Lys Leu Ser Glu Asn Phe Val Glu Glu Ala 165 170 l75 Leu Lys Leu Lys Pro Asn Asn Gln Val Phe Leu Asn Phe Lys Val Lys 180 185 l90 Lys Ile Glu Lys Arg Asn Asp Thr Tyr Ala Val Ile Ser Glu Asp Ala GAA GAA GTG TAT GCT AAA TTC GTG CTG GTC .AAT GCC GGC TCT TAC GCT 7B9 Glu Glu Val Tyr Ala Lys Phe Val Leu Val .Asn Ala Gly Ser Tyr Ala TTG CCT TTG GCT CAG AGC ATG GGC TAT GGC ~~TA GAT TTA GGG TGC TTG 837 Leu Pro Leu Ala Gln Ser Met Gly Tyr Gly :Leu Asp Leu Gly Cys Leu Pro Val Ala Gly Ser Phe Tyr Phe Val Pro ,asp Leu Leu Arg Gly Lys GTT TAT ACC GTT CAA AAC CCC AAA CTC CCT 'TTT GCA GCC GTG CAT GGC 933 Val Tyr Thr Val Gln Asn Pro Lys Leu Pro :Phe Ala Ala Val His Gly Asp Pro Asp Ala Val Ile Lys Gly Lys Thr Arg Ile Gly Pro Thr Ala 275 280 285 ~ 290 TTA ACG ATG CCT AAA TTA GAA CGC AAC AAA 'TGT TGG CTT AAG GGC ATT 1029 Leu Thr Met Pro Lys Leu Glu Arg Asn Lys Cys Trp Leu Lys Gly Ile AGC TTG GAA TTG TTG AAA ATG GAT TTG AAT i~AA GAT GTG TTT AAA ATT 1077 Ser Leu Glu Leu Leu Lys Met Asp Leu Asn Lys Asp Val Phe Lys Ile Ala Phe Asp Leu Met Ser Asp Lys Glu Ile Arg Asn Tyr Val Phe Lys Asn Met Val Phe Glu Leu Pro Ile Ile Gly Lys Arg Lys Phe Leu Lys GAC GCT CAA AAA ATC ATC CCC TCT CTT AGC CTA GAA GAT CTA GAA TAC 122l Asp Ala Gln Lys Ile Ile Pro Ser Leu Ser Leu Glu Asp Leu Glu Tyr Ala His Gly Phe Gly Glu Val 
Arg Pro Gln Val Leu Asp Arg Thr Lys Arg Lys Leu Glu Leu Gly Glu Lys Lys Ile Cys Thr His Lys Gly Ile Thr Phe Asn Met Thr Pro Ser Pro Gly Ala Thr Ser Cys Leu Gln Asn Ala Leu Val Asp Ser Gln Glu Ile Ala Ala Tyr Leu Gly Glu Ser Phe Glu Leu Glu Arg Phe Tyr Lys Asp Leu Ser Pro Glu Glu Leu Glu Asn (2) INFORMATION FOR SEQ ID N0:68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 450 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
Met Ser Met Glu Phe Asp Ala Val Ile Ile Gly Gly Gly Val Ser Gly
Cys Ala Thr Phe Tyr Thr Leu Ser Glu Tyr Ser Ser Leu Lys Arg Val
Ala Ile Val Glu Lys Cys Ser Lys Leu Ala Gln Ile Ser Ser Ser Ala
Lys Ala Asn Ser Gln Thr Ile His Asp Gly Ser Ile Glu Thr Asn Tyr
Thr Pro Glu Lys Ala Lys Lys Val Arg Leu Ser Ala Tyr Lys Thr Arg
Gln Tyr Ala Leu Asn Lys Gly Leu Gln Asn Glu Val Ile Phe Glu Thr
Gln Lys Met Ala Ile Gly Val Gly Asp Glu Glu Cys Glu Phe Met Lys
Lys Arg Tyr Glu Ser Phe Lys Glu Ile Phe Val Gly Leu Glu Glu Phe
Asp Lys Gln Lys Ile Lys Glu Leu Glu Pro Asn Val Ile Leu Gly Ala
Asn Gly Ile Asp Arg His Glu Asn Ile Ile Gly His Gly Tyr Arg Lys
Asp Trp Ser Thr Met Asn Phe Ala Lys Leu Ser Glu Asn Phe Val Glu
Glu Ala Leu Lys Leu Lys Pro Asn Asn Gln Val Phe Leu Asn Phe Lys
Val Lys Lys Ile Glu Lys Arg Asn Asp Thr Tyr Ala Val Ile Ser Glu
Asp Ala Glu Glu Val Tyr Ala Lys Phe Val Leu Val Asn Ala Gly Ser
Tyr Ala Leu Pro Leu Ala Gln Ser Met Gly Tyr Gly Leu Asp Leu Gly
Cys Leu Pro Val Ala Gly Ser Phe Tyr Phe Val Pro Asp Leu Leu Arg
Gly Lys Val Tyr Thr Val Gln Asn Pro Lys Leu Pro Phe Ala Ala Val
His Gly Asp Pro Asp Ala Val Ile Lys Gly Lys Thr Arg Ile Gly Pro
Thr Ala Leu Thr Met Pro Lys Leu Glu Arg Asn Lys Cys Trp Leu Lys
Gly Ile Ser Leu Glu Leu Leu Lys Met Asp Leu Asn Lys Asp Val Phe
Lys Ile Ala Phe Asp Leu Met Ser Asp Lys Glu Ile Arg Asn Tyr Val
Phe Lys Asn Met Val Phe Glu Leu Pro Ile Ile Gly Lys Arg Lys Phe
Leu Lys Asp Ala Gln Lys Ile Ile Pro Ser Leu Ser Leu Glu Asp Leu
Glu Tyr Ala His Gly Phe Gly Glu Val Arg Pro Gln Val Leu Asp Arg
Thr Lys Arg Lys Leu Glu Leu Gly Glu Lys Lys Ile Cys Thr His Lys
Gly Ile Thr Phe Asn Met Thr Pro Ser Pro Gly Ala Thr Ser Cys Leu
Gln Asn Ala Leu Val Asp Ser Gln Glu Ile Ala Ala Tyr Leu Gly Glu
Ser Phe Glu Leu Glu Arg Phe Tyr Lys Asp Leu Ser Pro Glu Glu Leu
Glu Asn
(2) INFORMATION FOR SEQ ID NO:69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 653 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 63...590 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
Met Pro Lys Pro Lys Lys Asn Thr Leu Pro Cys Ser Leu Ser Val Lys Met Ser Tyr Phe Met Arg Phe Leu Ile Lys Trp Arg Thr Arg Ser Leu Ser His Lys Met Met Thr Leu Ile Gln Ile Leu Ser Ile Leu Ala TTA GCG AGC AAG GCC AGT GAA GAT TTA GAA GAG CAA CTC AAA AAA ATC 25l Leu Ala Ser Lys Ala Ser Glu Asp Leu Glu Glu Gln Leu Lys Lys Tle Lys Asp Tyr Ile Tyr Arg Thr Leu Asn Ala Lys Ile Ala Ser Asp Val Tyr Asn Arg Val Leu Ile Leu Val Asn Glu Tyr Cys Thr Asn Glu Glu 80 85 90 g5 Leu Phe Asp Lys Glu Ser Val Lys Ile Ser Asp Leu Leu Ile Gln Asp Ile Gln Leu Tyr Ala Leu Val Asp Glu Met Leu Lys Glu Asp Lys Tyr Gln Val Gln His Thr Ile Leu Lys Gly Ile Ile Lys Arg Lys Tyr Asp l30 13S 140 Glu Ala Tyr Ser Leu Asn Ser Glu Asp Arg Ile Leu Leu Glu Tyr Gln l45 150 l55 Glu Arg Leu Leu Glu His Ser His Ala Ser Phe Ser Asn Lys Lys Phe 160 l65 170 175 Lys (2) INFORMATION FOR SEQ ID N0:70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 176 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:
Met Pro Lys Pro Lys Lys Asn Thr Leu Pro Cys Ser Leu Ser Val Lys Met Ser Tyr Phe Met Arg Phe Leu Ile Lys Trp Arg Thr Arg Ser Leu Ser His Lys Met Met Thr Leu Ile Gln Ile Leu Ser Ile Leu Ala Leu Ala Ser Lys Ala Ser Glu Asp Leu Glu Glu Gln Leu Lys Lys Ile Lys Asp Tyr Ile Tyr Arg Thr Leu Asn Ala Lys Ile Ala Ser Asp Val Tyr Asn Arg Val Leu Ile Leu Val Asn Glu Tyr Cys Thr Asn Glu Glu Leu Phe Asp Lys Glu Ser Val Lys Ile Ser Asp Leu Leu Ile Gln Asp Ile Gln Leu Tyr Ala Leu Val Asp Glu Met Leu Lys Glu Asp Lys Tyr Gln 115 120 125 Val Gln His Thr Ile Leu Lys Gly Ile Ile Lys Arg Lys Tyr Asp Glu Ala Tyr Ser Leu Asn Ser Glu Asp Arg Ile Leu Leu Glu Tyr Gln Glu 145 150 155 160 Arg Leu Leu Glu His Ser His Ala Ser Phe Ser Asn Lys Lys Phe Lys (2) INFORMATION FOR SEQ ID NO:71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1883 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 91...1833 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
AAGCGTTAAA TTCCAATCAA AAACCATCGT ATCGGTG'CTA ATATTGTGTA AAAATTAATG 60 Met Lys Lys Leu Val Leu Val Ile Phe Leu Thr Leu Ala Leu Ser Ile Ser Ala Lys Glu Val Lys Ile Val Phe Leu Glu Thr Ser Asp Ile His Gly Arg Leu Phe Ser Tyr Asp Tyr GCG ATT GGC GAG CAA AAA CCC AAT AAC GGC '.CTG ACA AGG ATT GCG ACT 258 Ala Ile Gly Glu Gln Lys Pro Asn Asn Gly Leu Thr Arg Ile Ala Thr Leu Ile Lys Lys Gln Arg Ala Glu Asn Lys Asn Val Val Leu Ile Asp Ser Gly Asp Leu Leu Gln Gly Asn Ser Ala Glu Leu Phe Asn Asp Glu Pro Ile His Pro Leu Val Arg Ala Glu Asn Asp Leu Lys Phe Asp Ile Arg Val Leu Gly Asn His Glu Phe Asn Phe Ser Lys Asp Phe Leu Glu 105 110 115 120 Lys Asn Ile Lys Gly Phe Asn Gly Asp Val Met Asn Ala Asn Ile Ile Lys Ile Ala Asp Asn Lys Pro Phe Val Lys Pro Tyr Ile Ile Lys Lys 140 145 150 Ile Asp Gly Val Arg Val Ala Val Val Gly Tyr Val Val Ala His Ile Pro Thr Trp Glu Ala Ser Thr Pro Glu His Phe Ala Gly Leu Lys Phe 170 175 180 Leu Asp Ala Glu Glu Ala Leu Lys Lys Thr Leu Lys Glu Leu Lys Gly Lys Tyr Asp Ile Leu Ile Gly Ala Phe His Leu Gly Arg Glu Asp Glu 205 210 215 Lys Gly Gly Asp Gly Ile Pro Asp Leu Ala Lys Lys Phe Pro Gln Phe Asp Ile Ile Phe Ala Gly His Glu His Ala Val Tyr Asn Thr Lys Val Gly Lys Val His Thr Ile Glu Pro Gly Ala Tyr Gly Ala Tyr Leu Ala Lys Gly Val Val Val Phe Asp Thr Lys Thr Lys Lys Lys Ile Ile Thr Thr Glu Asn Leu Pro Thr Lys Asp Val Pro Glu Asp Glu Glu Leu Ala Lys Lys Tyr Glu Tyr Val Asp Lys Lys Ser Lys Glu Tyr Ala Asn Glu 300 305 310 Val Val Gly Glu Val Thr Lys Thr Phe Ile Asp Arg Pro Asp Phe Ile Thr Gly Glu Glu Lys Ile Thr Thr Met Pro Thr Ala Ala Leu Gln Glu Thr Pro Val Ile Glu Leu Ile Asn Lys Val Gln Lys Tyr Tyr Ala Lys 345 350 355 360 GCC GAT GTT TCA GCG GCA GCC TTA TTC AAT TTT GGG GCG AAT TTG AAA 1218 Ala Asp Val Ser Ala Ala Ala Leu Phe Asn Phe Gly Ala Asn Leu Lys AAA GGG CCT TTC AAA AGA AAA GAT GTC ACT TAT ATT TAC AAG TTC GCT 1266 Lys Gly Pro Phe Lys Arg Lys Asp Val Thr Tyr Ile Tyr Lys Phe Ala AAT ACG CTC ATT GGA GTG CGT ATA ACG GGT GAA AAT CTG TTG AAA TAC 1314 Asn Thr Leu Ile Gly Val Arg Ile Thr Gly Glu Asn Leu Leu Lys Tyr ATG GAA TGG TCA TAC CGA TTT TAC AAT CAG 'CTG CAA CCA GGA GAT TTG 1362 Met Glu Trp Ser Tyr Arg Phe Tyr Asn Gln Leu Gln Pro Gly Asp Leu ACG ATC AGT TTT AAT GAA AAC ATT CGC GGC TAT AAC TTT GAT ATG TTT 1410 Thr Ile Ser Phe Asn Glu Asn Ile Arg Gly Tyr Asn Phe Asp Met Phe Ser Gly Val Lys Tyr Gln Val Asp Val Thr Lys Pro Ala Gly Gln Arg Ile Ile Asn Pro Thr Ile Asn Asn Lys Pro Ile Asp Pro Lys Ala Ile Tyr Lys Leu Ala Ile Asn Asn Tyr Arg Phe Gly Thr Leu Ser Thr Thr Leu Asn Leu Val Thr Asp Ala Xaa Arg Tyr Tyr Asn Ser Tyr Asp Glu Leu Gln Asp Asn Gly Gln Ile Arg Asp Leu Ile Ile Lys Tyr Ile Thr 505 510 515 520 Glu Glu Lys Gly Gly Lys Val Thr Pro Glu Leu Glu Gly Asn Trp Glu Ile Ile Asn Tyr Asp Phe Lys Asn Pro Leu Leu Glu Lys Leu Arg Glu Lys Leu Lys Glu Gly Ser Ile Lys Ile Pro Thr Ser Lys Asp Gly Arg Thr Leu Asn Val Lys Ser Ile Lys Glu Ser Glu Val Lys (2) INFORMATION FOR SEQ ID NO:72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 581 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
Met Lys Lys Leu Val Leu Val Ile Phe Leu Thr Leu Ala Leu Ser Ile Ser Ala Lys Glu Val Lys Ile Val Phe Leu Glu Thr Ser Asp Ile His Gly Arg Leu Phe Ser Tyr Asp Tyr Ala Ile Gly Glu Gln Lys Pro Asn Asn Gly Leu Thr Arg Ile Ala Thr Leu Ile Lys Lys Gln Arg Ala Glu Asn Lys Asn Val Val Leu Ile Asp Ser Gly Asp Leu Leu Gln Gly Asn Ser Ala Glu Leu Phe Asn Asp Glu Pro Ile His Pro Leu Val Arg Ala Glu Asn Asp Leu Lys Phe Asp Ile Arg Val Leu Gly Asn His Glu Phe 100 105 110 Asn Phe Ser Lys Asp Phe Leu Glu Lys Asn Ile Lys Gly Phe Asn Gly 115 120 125 Asp Val Met Asn Ala Asn Ile Ile Lys Ile Ala Asp Asn Lys Pro Phe Val Lys Pro Tyr Ile Ile Lys Lys Ile Asp Gly Val Arg Val Ala Val Val Gly Tyr Val Val Ala His Ile Pro Thr Trp Glu Ala Ser Thr Pro Glu His Phe Ala Gly Leu Lys Phe Leu Asp Ala Glu Glu Ala Leu Lys 180 185 190 Lys Thr Leu Lys Glu Leu Lys Gly Lys Tyr Asp Ile Leu Ile Gly Ala Phe His Leu Gly Arg Glu Asp Glu Lys Gly Gly Asp Gly Ile Pro Asp 210 215 220 Leu Ala Lys Lys Phe Pro Gln Phe Asp Ile Ile Phe Ala Gly His Glu 225 230 235 240 His Ala Val Tyr Asn Thr Lys Val Gly Lys Val His Thr Ile Glu Pro Gly Ala Tyr Gly Ala Tyr Leu Ala Lys Gly Val Val Val Phe Asp Thr Lys Thr Lys Lys Lys Ile Ile Thr Thr Glu Asn Leu Pro Thr Lys Asp Val Pro Glu Asp Glu Glu Leu Ala Lys Lys Tyr Glu Tyr Val Asp Lys Lys Ser Lys Glu Tyr Ala Asn Glu Val Val Gly Glu Val Thr Lys Thr 305 310 315 320 Phe Ile Asp Arg Pro Asp Phe Ile Thr Gly Glu Glu Lys Ile Thr Thr Met Pro Thr Ala Ala Leu Gln Glu Thr Pro Val Ile Glu Leu Ile Asn Lys Val Gln Lys Tyr Tyr Ala Lys Ala Asp Val Ser Ala Ala Ala Leu Phe Asn Phe Gly Ala Asn Leu Lys Lys Gly Pro Phe Lys Arg Lys Asp Val Thr Tyr Ile Tyr Lys Phe Ala Asn Thr Leu Ile Gly Val Arg Ile Thr Gly Glu Asn Leu Leu Lys Tyr Met Glu Trp Ser Tyr Arg Phe Tyr Asn Gln Leu Gln Pro Gly Asp Leu Thr Ile Ser Phe Asn Glu Asn Ile Arg Gly Tyr Asn Phe Asp Met Phe Ser Gly Val Lys Tyr Gln Val Asp Val Thr Lys Pro Ala Gly Gln Arg Ile Ile Asn Pro Thr Ile Asn Asn Lys Pro Ile Asp Pro Lys Ala Ile Tyr Lys Leu Ala Ile Asn Asn Tyr 465 470 475 480 Arg Phe Gly Thr Leu Ser Thr Thr Leu Asn Leu Val Thr Asp Ala Xaa Arg Tyr Tyr Asn Ser Tyr Asp Glu Leu Gln Asp Asn Gly Gln Ile Arg Asp Leu Ile Ile Lys Tyr Ile Thr Glu Glu Lys Gly Gly Lys Val Thr Pro Glu Leu Glu Gly Asn Trp Glu Ile Ile Asn Tyr Asp Phe Lys Asn Pro Leu Leu Glu Lys Leu Arg Glu Lys Leu Lys Glu Gly Ser Ile Lys Ile Pro Thr Ser Lys Asp Gly Arg Thr Leu Asn Val Lys Ser Ile Lys Glu Ser Glu Val Lys (2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1339 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 68...1252 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
GTTGGTA ATG GAA TCA GTA AAA ACA GGA AAA ACA AAT AAG GTT GGC AAG 109 Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gln Ala Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala 95 100 105 110 Thr Ser Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln 115 120 125 Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn 130 135 140 Asn Gln Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Ser Asn 160 165 170 Thr Gln Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln 175 180 185 190 Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln Lys Asp TTG GTT AAA ACA CAG AAA GAT TTC ATT AAA TAT GTA GAA CAA AAT TGC 733 Leu Val Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys Gln Glu Asn His Asn Gln Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala Gly Ile Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala AAA ACC AAT CAA ACC CCT ATC CAG CCA AAA CAC CTC CCA AAC TCT AAA 877 Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro Arg Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asn Pro CAA GCC CAC CTT TCA AAC TCT CAA AGC CTT TTG TTC GTT CAA AAA ATA 1165 Gln Ala His Leu Ser Asn Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met ACATTGCACC AAGTTTTTAA TTATCTGTCG GCTTTTGAAA ACATTTTTTA TGGTAGCGTT 1324 (2) INFORMATION FOR SEQ ID NO:74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 395 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gln Ala Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser 100 105 110 Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn 145 150 155 160 Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Ser Asn Thr Gln 165 170 175 Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln Lys Asp 180 185 190 Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val 195 200 205 Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys Gln Glu 210 215 220 Asn His Asn Gln Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala Gly Ile Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro Arg Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asn Pro Gln Ala His Leu Ser Asn Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met (2) INFORMATION FOR SEQ ID NO:75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 904 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 70...864 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
TAATAACTCA ATCCCATTTG AATGGCATTT TTAAGCCi~AA TTGCTACTAT CTTTGGCTAA 60 AGGTTAAAC ATG ATT AAA CAA ACC CTC ATC ATC CTT GCC CCT TTT TTT ATC 111 Met Ile Lys Gln Thr Leu Ile Ile Leu Ala Pro Phe Phe Ile GCA ACG CTG TTG TAT TTT TTA GGC GCA CCG GAT GGG TTA AGA CCT AAC 159 Ala Thr Leu Leu Tyr Phe Leu Gly Ala Pro Asp Gly Leu Arg Pro Asn Ala Trp Leu Tyr Phe Cys Ile Phe Met Gly Met Ile Ile Gly Leu Ile Leu Glu Pro Val Pro Ser Gly Leu Ile Ala Leu Ser Ala Leu Val Leu Cys Ile Ala Leu Lys Ile Gly Ala Ser Asp Lys Val Ala Ser Ala Asn Lys Ala Ile Ser Trp Gly Leu Ser Gly Tyr Ala Asn Lys Thr Val Trp Leu Val Phe Val Ala Phe Ile Leu Gly Leu Gly Tyr Glu Lys Ser Leu 95 100 105 110 Leu Gly Lys Arg Ile Ala Leu Leu Leu Ile Arg Phe Leu Gly Gln Thr 115 120 125 Pro Leu Gly Leu Gly Tyr Ala Ile Gly Leu Ser Glu Leu Cys Leu Ala Pro Phe Ile Pro Ser Asn Ser Ala Arg Ser Gly Gly Ile Leu Tyr Pro Ile Val Ser Ser Ile Pro Pro Leu Met Gly Ser Thr Pro Asn Asn Asn 160 165 170 Pro Asp Lys Ile Gly Ala Tyr Leu Met Trp Val Ala Leu Ala Ser Thr Cys Ile Thr Ser Ser Met Phe Leu Thr Ala Leu Ala Pro Asn Pro Leu 195 200 205 Ala Met Glu Ile Ala Ala Lys Met Gly Val Asn Glu Ile Ser Trp Phe Ser Trp Phe Leu Ala Phe Leu Pro Cys Gly Val Val Leu Ile Leu Leu Val Pro Leu Leu Ala Tyr Lys Thr Cys Lys Pro Thr Leu Lys Gly Ser Lys Glu Val Ser Leu Trp Ala Lys Lys Arg Asn (2) INFORMATION FOR SEQ ID NO:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
Met Ile Lys Gln Thr Leu Ile Ile Leu Ala Pro Phe Phe Ile Ala Thr Leu Leu Tyr Phe Leu Gly Ala Pro Asp Gly Leu Arg Pro Asn Ala Trp Leu Tyr Phe Cys Ile Phe Met Gly Met Ile Ile Gly Leu Ile Leu Glu Pro Val Pro Ser Gly Leu Ile Ala Leu Ser Ala Leu Val Leu Cys Ile Ala Leu Lys Ile Gly Ala Ser Asp Lys Val Ala Ser Ala Asn Lys Ala Ile Ser Trp Gly Leu Ser Gly Tyr Ala Asn Lys Thr Val Trp Leu Val Phe Val Ala Phe Ile Leu Gly Leu Gly Tyr Glu Lys Ser Leu Leu Gly 100 105 110 Lys Arg Ile Ala Leu Leu Leu Ile Arg Phe Leu Gly Gln Thr Pro Leu Gly Leu Gly Tyr Ala Ile Gly Leu Ser Glu Leu Cys Leu Ala Pro Phe Ile Pro Ser Asn Ser Ala Arg Ser Gly Gly Ile Leu Tyr Pro Ile Val 145 150 155 160 Ser Ser Ile Pro Pro Leu Met Gly Ser Thr Pro Asn Asn Asn Pro Asp 165 170 175 Lys Ile Gly Ala Tyr Leu Met Trp Val Ala Leu Ala Ser Thr Cys Ile Thr Ser Ser Met Phe Leu Thr Ala Leu Ala Pro Asn Pro Leu Ala Met 195 200 205 Glu Ile Ala Ala Lys Met Gly Val Asn Glu Ile Ser Trp Phe Ser Trp Phe Leu Ala Phe Leu Pro Cys Gly Val Val Leu Ile Leu Leu Val Pro 225 230 235 240 Leu Leu Ala Tyr Lys Thr Cys Lys Pro Thr Leu Lys Gly Ser Lys Glu Val Ser Leu Trp Ala Lys Lys Arg Asn (2) INFORMATION FOR SEQ ID NO:77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1194 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 152...1069 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
Met Ile Lys Ser Trp Thr Lys Lys Trp Phe Leu Ile Leu Phe Leu Met Ala Ser Cys Ser Ser Tyr Leu Val Ala Thr Thr Gly Glu Lys Tyr Phe Lys Met Ala Thr Gln Ala Phe AAG AGA GGG GAC TAC CAT AAA GCG GTG GCT TTT TAT AAG AGG AGC TGT 316 Lys Arg Gly Asp Tyr His Lys Ala Val Ala Phe Tyr Lys Arg Ser Cys Asn Leu Arg Val Gly Val Gly Cys Thr Ser Leu Gly Ser Met Tyr Glu Asp Gly Asp Gly Val Asp Gln Asn Ile Thr Lys Ala Val Phe Tyr Tyr Arg Arg Gly Cys Asn Leu Arg Asn His Leu Ala Cys Ala Ser Leu Gly Ser Met Tyr Glu Asp Gly Asp Gly Val Gln Lys Asn Leu Pro Lys Ala 105 110 115 Ile Tyr Tyr Tyr Arg Arg Gly Cys His Leu Lys Gly Gly Val Ser Cys 120 125 130 135 Gly Ser Leu Gly Phe Met Tyr Phe Asn Gly Thr Gly Val Lys Gln Asn 140 145 150 Tyr Ala Lys Ala Leu Phe Leu Ser Lys Tyr Ala Cys Ser Leu Asn Tyr 155 160 165 Gly Ile Ser Cys Asn Phe Val Gly Tyr Met Tyr Arg Asn Ala Lys Gly Val Gln Lys Asp Leu Lys Lys Ala Leu Ala Asn Phe Lys Arg Gly Cys His Leu Lys Asp Gly Ala Ser Cys Val Ser Leu Gly Tyr Met Tyr Glu GTC GGT ATG GAT GTC AAA CAA AAT GGA GAG CAA GCC TTG AAT CTT TAT 844 Val Gly Met Asp Val Lys Gln Asn Gly Glu Gln Ala Leu Asn Leu Tyr Lys Lys Gly Cys Tyr Leu Lys Arg Gly Ser Gly Cys His Asn Val Ala GTG ATG TAT TAC ACC GGT AAG GGC GTT CCA AAG GAT TTA GAT AAA GCC 940 Val Met Tyr Tyr Thr Gly Lys Gly Val Pro Lys Asp Leu Asp Lys Ala ATT TCG TAT TAT AAG AAA GGT TGC ACT CTA GGC TTT AGT GGT AGC TGT 988 Ile Ser Tyr Tyr Lys Lys Gly Cys Thr Leu Gly Phe Ser Gly Ser Cys AAA GTG TTA GAA GAA GTG ATT GGC AAG AAG TCT GAT GAT TTG CAA GAT 1036 Lys Val Leu Glu Glu Val Ile Gly Lys Lys Ser Asp Asp Leu Gln Asp Asp Ala Gln Asn Asp Thr Gln Asp Asp Met Gln AATGATTAAA ACTCATCTTA TAGAAATCTT TCTACTCTCT TGTTATCAAA TAGGGATTAA 1149 (2) INFORMATION FOR SEQ ID NO:78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
Met Ile Lys Ser Trp Thr Lys Lys Trp Phe Leu Ile Leu Phe Leu Met Ala Ser Cys Ser Ser Tyr Leu Val Ala Thr Thr Gly Glu Lys Tyr Phe Lys Met Ala Thr Gln Ala Phe Lys Arg Gly Asp Tyr His Lys Ala Val Ala Phe Tyr Lys Arg Ser Cys Asn Leu Arg Val Gly Val Gly Cys Thr Ser Leu Gly Ser Met Tyr Glu Asp Gly Asp Gly Val Asp Gln Asn Ile Thr Lys Ala Val Phe Tyr Tyr Arg Arg Gly Cys Asn Leu Arg Asn His Leu Ala Cys Ala Ser Leu Gly Ser Met Tyr Glu Asp Gly Asp Gly Val 100 105 110 Gln Lys Asn Leu Pro Lys Ala Ile Tyr Tyr Tyr Arg Arg Gly Cys His 115 120 125 Leu Lys Gly Gly Val Ser Cys Gly Ser Leu Gly Phe Met Tyr Phe Asn Gly Thr Gly Val Lys Gln Asn Tyr Ala Lys Ala Leu Phe Leu Ser Lys 145 150 155 160 Tyr Ala Cys Ser Leu Asn Tyr Gly Ile Ser Cys Asn Phe Val Gly Tyr Met Tyr Arg Asn Ala Lys Gly Val Gln Lys Asp Leu Lys Lys Ala Leu 180 185 190 Ala Asn Phe Lys Arg Gly Cys His Leu Lys Asp Gly Ala Ser Cys Val Ser Leu Gly Tyr Met Tyr Glu Val Gly Met Asp Val Lys Gln Asn Gly Glu Gln Ala Leu Asn Leu Tyr Lys Lys Gly Cys Tyr Leu Lys Arg Gly Ser Gly Cys His Asn Val Ala Val Met Tyr Tyr Thr Gly Lys Gly Val Pro Lys Asp Leu Asp Lys Ala Ile Ser Tyr Tyr Lys Lys Gly Cys Thr Leu Gly Phe Ser Gly Ser Cys Lys Val Leu Glu Glu Val Ile Gly Lys Lys Ser Asp Asp Leu Gln Asp Asp Ala Gln Asn Asp Thr Gln Asp Asp Met Gln (2) INFORMATION FOR SEQ ID NO:79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1001 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 101...865 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
GCGATTTTAG GTTAATTTTG AGTTTTTAGG AGCAGTTTTT ATG CAA CAA GAA GAG 115 Met Gln Gln Glu Glu Ile Ile Glu Gly Tyr Tyr Gly Ala Ser Lys Gly Leu Lys Lys Ser Gly 10 15 20 Ile Tyr Ala Lys Leu Asp Phe Leu Gln Ser Ala Thr Gly Leu Ile Leu Ala Leu Phe Met Ile Ala His Met Phe Leu Val Ser Ser Ile Leu Ile AGC GAT GAA GCC ATG TAT AAA GTG GCG AAA TTT TTT GAA GGG AGC TTG 307 Ser Asp Glu Ala Met Tyr Lys Val Ala Lys Phe Phe Glu Gly Ser Leu Phe Leu Lys Ala Gly Glu Pro Ala Ile Val Ser Val Val Ala Ala Gly ATT ATT CTT ATT TTA GTC GCG CAT GCT TTT TTG GCG TTA AGG AAA TTC 403 Ile Ile Leu Ile Leu Val Ala His Ala Phe Leu Ala Leu Arg Lys Phe Pro Ile Asn Tyr Arg Gln Tyr Lys Val Phe Lys Thr His Lys His Leu ATG AAA CAT GGC GAT ACG AGC TTG TGG TTT ATT CAA GCC CTC ACC GGG 499 Met Lys His Gly Asp Thr Ser Leu Trp Phe Ile Gln Ala Leu Thr Gly TTT GCG ATG TTT TTC TTA GCG AGT ATC CAC TTA TTT GTC ATG CTC ACA 547 Phe Ala Met Phe Phe Leu Ala Ser Ile His Leu Phe Val Met Leu Thr 135 140 145 Glu Pro Glu Ser Ile Gly Pro His Gly Ser Ser Tyr Arg Phe Val Thr 150 155 160 165 CAA AAC TTT TGG CTT TTG TAT ATT TTC TTA TTG TTT GCC GTA GAA TTG 643 Gln Asn Phe Trp Leu Leu Tyr Ile Phe Leu Leu Phe Ala Val Glu Leu 170 175 180 CAT GGC TCT ATT GGG TTG TAT CGT TTA GCG ATC AAA TGG GGG TGG TTT 691 His Gly Ser Ile Gly Leu Tyr Arg Leu Ala Ile Lys Trp Gly Trp Phe 185 190 195 AAA AAT GTG AGC ATT CAA GGT TTG AGA AAA GTC AAA TGG GCG ATG AGC 739 Lys Asn Val Ser Ile Gln Gly Leu Arg Lys Val Lys Trp Ala Met Ser Val Phe Phe Ile Val Leu Gly Leu Cys Thr Tyr Gly Ala Tyr Ile Lys Lys Gly Leu Glu Asn Lys Glu Asn Gly Ile Lys Thr Met Gln Glu Ala Ile Glu Ala Asp Gly Lys Phe His Lys Glu CAAACAAAAG GGTTTAAACA CCATCGTTTT AAGCCTAGTG CCTGTCAGGC GTT 1001 (2) INFORMATION FOR SEQ ID NO:80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
Met Gln Gln Glu Glu Ile Ile Glu Gly Tyr Tyr Gly Ala Ser Lys Gly Leu Lys Lys Ser Gly Ile Tyr Ala Lys Leu Asp Phe Leu Gln Ser Ala Thr Gly Leu Ile Leu Ala Leu Phe Met Ile Ala His Met Phe Leu Val Ser Ser Ile Leu Ile Ser Asp Glu Ala Met Tyr Lys Val Ala Lys Phe Phe Glu Gly Ser Leu Phe Leu Lys Ala Gly Glu Pro Ala Ile Val Ser Val Val Ala Ala Gly Ile Ile Leu Ile Leu Val Ala His Ala Phe Leu Ala Leu Arg Lys Phe Pro Ile Asn Tyr Arg Gln Tyr Lys Val Phe Lys Thr His Lys His Leu Met Lys His Gly Asp Thr Ser Leu Trp Phe Ile 115 120 125 Gln Ala Leu Thr Gly Phe Ala Met Phe Phe Leu Ala Ser Ile His Leu 130 135 140 Phe Val Met Leu Thr Glu Pro Glu Ser Ile Gly Pro His Gly Ser Ser Tyr Arg Phe Val Thr Gln Asn Phe Trp Leu Leu Tyr Ile Phe Leu Leu 165 170 175 Phe Ala Val Glu Leu His Gly Ser Ile Gly Leu Tyr Arg Leu Ala Ile 180 185 190 Lys Trp Gly Trp Phe Lys Asn Val Ser Ile Gln Gly Leu Arg Lys Val Lys Trp Ala Met Ser Val Phe Phe Ile Val Leu Gly Leu Cys Thr Tyr Gly Ala Tyr Ile Lys Lys Gly Leu Glu Asn Lys Glu Asn Gly Ile Lys Thr Met Gln Glu Ala Ile Glu Ala Asp Gly Lys Phe His Lys Glu (2) INFORMATION FOR SEQ ID NO:81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 975 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 82...912 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
TTTTAAAATT AAAGAAAATT TTTTTTAAAG ATTATCAC'rC TTTTTTGATA AAGTAATCAT 60 Met Glu Glu Ser Thr Ala Phe Ile Leu Ala Leu Val Gly Leu Phe Thr Gly Ile Thr Ala Gly Phe Phe Gly Ile Gly GGG GGG GAG ATT GTC GTC CCT AGC GCG ATT TTT GCC CAT TTT AGC TAT 207 Gly Gly Glu Ile Val Val Pro Ser Ala Ile Phe Ala His Phe Ser Tyr AGC CAT GCG GTG GGT ATT TCG CTC ATG CAA ATG CTT TTT TCT TCA GTG 255 Ser His Ala Val Gly Ile Ser Leu Met Gln Met Leu Phe Ser Ser Val GTC GGC TCT ATC ATC AAT TAC AAA AAG GGC TTA TTG GAT TTG AGA GAA 303 Val Gly Ser Ile Ile Asn Tyr Lys Lys Gly Leu Leu Asp Leu Arg Glu Gly Ser Phe Ala Ala Leu Gly Gly Leu Met Gly Ala Ile Leu Gly Ser TTT ATC TTA AAA ATC ATT GAC GAT AAA ATT TTA ATG GCG GTG TTT GTG 399 Phe Ile Leu Lys Ile Ile Asp Asp Lys Ile Leu Met Ala Val Phe Val Val Val Val Cys Tyr Thr Phe Ile Lys Tyr Ala Phe Ser Ser Asn Lys Lys Pro Lys His Phe Glu Glu Met His Phe Asp Leu His Ala Asn Asn Lys Thr Pro Glu Lys Lys Arg Ala Ile Pro Phe Val Ser Met Asp Arg 140 145 150 Thr His Gly Val Leu Met Leu Ala Gly Phe Val Thr Gly Ile Phe Ser Ile Pro Leu Gly Met Gly Gly Gly Ile Leu Met Val Pro Phe Leu Gly 175 180 185 Tyr Phe Leu Lys Tyr Asp Ser Lys Lys Ile Val Pro Leu Gly Leu Phe Phe Val Val Phe Ala Ser Leu Ser Gly Val Ile Ser Leu Tyr Asn Gly 205 210 215 Arg Val Leu Asp Asn Ile Ser Val Gln Ala Gly Val Ile Thr Gly Ile GGA GCG TTT TTA GGC GTG GGC ATT GGC ATC AAG CTT ATC GCT TTG GCT 831 Gly Ala Phe Leu Gly Val Gly Ile Gly Ile Lys Leu Ile Ala Leu Ala Asn Glu Lys Val His Lys Ile Leu Leu Leu Leu Ile Tyr Ala Leu Ser Ile Leu Ala Thr Leu His Lys Leu Ile Met Gly (2) INFORMATION FOR SEQ ID NO:82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
Met Glu Glu Ser Thr Ala Phe Ile Leu Ala Leu Val Gly Leu Phe Thr Gly Ile Thr Ala Gly Phe Phe Gly Ile Gly Gly Gly Glu Ile Val Val Pro Ser Ala Ile Phe Ala His Phe Ser Tyr Ser His Ala Val Gly Ile 35 40 45 Ser Leu Met Gln Met Leu Phe Ser Ser Val Val Gly Ser Ile Ile Asn Tyr Lys Lys Gly Leu Leu Asp Leu Arg Glu Gly Ser Phe Ala Ala Leu 65 70 75 80 Gly Gly Leu Met Gly Ala Ile Leu Gly Ser Phe Ile Leu Lys Ile Ile Asp Asp Lys Ile Leu Met Ala Val Phe Val Val Val Val Cys Tyr Thr Phe Ile Lys Tyr Ala Phe Ser Ser Asn Lys Lys Pro Lys His Phe Glu Glu Met His Phe Asp Leu His Ala Asn Asn Lys Thr Pro Glu Lys Lys 130 135 140 Arg Ala Ile Pro Phe Val Ser Met Asp Arg Thr His Gly Val Leu Met Leu Ala Gly Phe Val Thr Gly Ile Phe Ser Ile Pro Leu Gly Met Gly Gly Gly Ile Leu Met Val Pro Phe Leu Gly Tyr Phe Leu Lys Tyr Asp 180 185 190 Ser Lys Lys Ile Val Pro Leu Gly Leu Phe Phe Val Val Phe Ala Ser Leu Ser Gly Val Ile Ser Leu Tyr Asn Gly Arg Val Leu Asp Asn Ile Ser Val Gln Ala Gly Val Ile Thr Gly Ile Gly Ala Phe Leu Gly Val Gly Ile Gly Ile Lys Leu Ile Ala Leu Ala Asn Glu Lys Val His Lys Ile Leu Leu Leu Leu Ile Tyr Ala Leu Ser Ile Leu Ala Thr Leu His Lys Leu Ile Met Gly (2) INFORMATION FOR SEQ ID NO:83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1667 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 220...1482 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
GGA GTG
Met Gly Val Gly Tyr Gln Ile Gly Gly Ala Gln Gln Asn Ile Asp Asn Lys Gly Ser Thr Leu Arg Asn Asn Val Ile Asn Asn Phe Arg Gln Val Gly Val Gly Met Ala Gly Gly Asn Gly Leu Leu Ala Leu Ala Thr Asn Thr Thr Met Asp Ala CTT TTA GGG ATA GGC AAC CAA ATT GTC AAT ACT AAT ACA ACT GTT AGC 426 Leu Leu Gly Ile Gly Asn Gln Ile Val Asn Thr Asn Thr Thr Val Ser Asn Asn Asn Ala Glu Leu Thr Gln Phe Lys Lys Ile Leu Pro Gln Ile Glu Gln Arg Phe Glu Thr Asn Lys Asn Ala Tyr Ser Val Gln Ala Leu Gln Val Tyr Leu Ser Asn Val Leu Tyr Asn Leu Val Asn Asn Ser Asn 105 110 115 Asn Gly Ser Asn Asn Gly Val Val Pro Glu Tyr Val Gly Ile Ile Lys 120 125 130 Val Leu Tyr Gly Ser Gln Asn Glu Phe Ser Leu Leu Ala Thr Glu Ser 135 140 145 Val Val Leu Leu Asn Ala Leu Thr Arg Val Asn Leu Asp Ser Asn Ser Val Phe Leu Lys Gly Leu Leu Ala Gln Met Gln Leu Phe Asn Asp Thr TCT TCA GCA AAG CTA GGC CAG ATC GCA GAA AAC TTG AAG AAC GGT GGT 810 Ser Ser Ala Lys Leu Gly Gln Ile Ala Glu Asn Leu Lys Asn Gly Gly Ala Gly Ser Met Leu Gln Lys Asp Val Lys Thr Ile Ser Asp Arg Ile GCT ACT TAC CAA GAG AAT CTA AAA CAG CTA GGA GGG ATG CTA AAG AAT 906 Ala Thr Tyr Gln Glu Asn Leu Lys Gln Leu Gly Gly Met Leu Lys Asn Tyr Asp Glu Pro Tyr Leu Pro Gln Phe Gly Pro Gly Thr Ser Ser Gln 230 235 240 245 His Gly Val Ile Asn Gly Phe Gly Ile Gln Val Gly Tyr Lys Gln Phe TTT GGG AAC AAG CGG AAT ATA GGC TTA CGA TAT TAC GCT TTC TTT GAT 1050 Phe Gly Asn Lys Arg Asn Ile Gly Leu Arg Tyr Tyr Ala Phe Phe Asp Tyr Gly Phe Thr Gln Leu Gly Ser Leu Ser Ser Ala Val Lys Ala Asn ATC TTT ACT TAT GGC GCT GGC ACG GAC TTT TTA TGG AAT ATC TTT AGA 1146 Ile Phe Thr Tyr Gly Ala Gly Thr Asp Phe Leu Trp Asn Ile Phe Arg AGG GTT TTT AGC GAT CAG TCC TTG AAT GTG GGG GTG TTT GGG GGC ATT 1194 Arg Val Phe Ser Asp Gln Ser Leu Asn Val Gly Val Phe Gly Gly Ile CAA ATA GCG GGT AAC ACT TGG GAT AGC TCT TTA AGA GGT CAA ATT GAA 1242 Gln Ile Ala Gly Asn Thr Trp Asp Ser Ser Leu Arg Gly Gln Ile Glu Asn Ser Phe Lys Glu Tyr Pro Thr Pro Thr Asn Phe Gln Phe Leu Phe AAT TTG GGT TTA AGG GCT CAT TTT GCC AGC ACC ATG CAC CGC CGG TTT 1338 Asn Leu Gly Leu Arg Ala His Phe Ala Ser Thr Met His Arg Arg Phe TTG AGC GCG TCT CAA AGC ATT CAG CAT GGG ATG GAA TTT GGC GTG AAA 1386 Leu Ser Ala Ser Gln Ser Ile Gln His Gly Met Glu Phe Gly Val Lys ATC CCG GCT ATC AAT CAA AGG TAT TTG AGG GC'_C AAT GGG GCT GAT GTG 1434 Ile Pro Ala Ile Asn Gln Arg Tyr Leu Arg Ala Asn Gly Ala Asp Val Asp Tyr Arg Arg Leu Tyr Ala Phe Tyr Ile Asn Tyr Thr Ile Gly Phe GGATCCAGTG GGGAGATGAG GGGAAGGGAA AAATTGTTGA TAGGATCGCT AAAGATTATG 1663 ACTT
(2) INFORMATION FOR SEQ ID NO:84:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 421 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:
Met Gly Val Gly Tyr Gln Ile Gly Gly Ala Gln Gln Asn Ile Asp Asn Lys Gly Ser Thr Leu Arg Asn Asn Val Ile Asn Asn Phe Arg Gln Val Gly Val Gly Met Ala Gly Gly Asn Gly Leu Leu Ala Leu Ala Thr Asn Thr Thr Met Asp Ala Leu Leu Gly Ile Gly Asn Gln Ile Val Asn Thr Asn Thr Thr Val Ser Asn Asn Asn Ala Glu Leu Thr Gln Phe Lys Lys Ile Leu Pro Gln Ile Glu Gln Arg Phe Glu Thr Asn Lys Asn Ala Tyr Ser Val Gln Ala Leu Gln Val Tyr Leu Ser Asn Val Leu Tyr Asn Leu Val Asn Asn Ser Asn Asn Gly Ser Asn Asn Gly Val Val Pro Glu Tyr 115 120 125 Val Gly Ile Ile Lys Val Leu Tyr Gly Ser Gln Asn Glu Phe Ser Leu Leu Ala Thr Glu Ser Val Val Leu Leu Asn Ala Leu Thr Arg Val Asn 145 150 155 160 Leu Asp Ser Asn Ser Val Phe Leu Lys Gly Leu Leu Ala Gln Met Gln 165 170 175 Leu Phe Asn Asp Thr Ser Ser Ala Lys Leu Gly Gln Ile Ala Glu Asn Leu Lys Asn Gly Gly Ala Gly Ser Met Leu Gln Lys Asp Val Lys Thr Ile Ser Asp Arg Ile Ala Thr Tyr Gln Glu Asn Leu Lys Gln Leu Gly Gly Met Leu Lys Asn Tyr Asp Glu Pro Tyr Leu Pro Gln Phe Gly Pro Gly Thr Ser Ser Gln His Gly Val Ile Asn Gly Phe Gly Ile Gln Val Gly Tyr Lys Gln Phe Phe Gly Asn Lys Arg Asn Ile Gly Leu Arg Tyr Tyr Ala Phe Phe Asp Tyr Gly Phe Thr Gln Leu Gly Ser Leu Ser Ser Ala Val Lys Ala Asn Ile Phe Thr Tyr Gly Ala Gly Thr Asp Phe Leu Trp Asn Ile Phe Arg Arg Val Phe Ser Asp Gln Ser Leu Asn Val Gly Val Phe Gly Gly Ile Gln Ile Ala Gly Asn Thr Trp Asp Ser Ser Leu Arg Gly Gln Ile Glu Asn Ser Phe Lys Glu Tyr Pro Thr Pro Thr Asn Phe Gln Phe Leu Phe Asn Leu Gly Leu Arg Ala His Phe Ala Ser Thr Met His Arg Arg Phe Leu Ser Ala Ser Gln Ser Ile Gln His Gly Met Glu Phe Gly Val Lys Ile Pro Ala Ile Asn Gln Arg Tyr Leu Arg Ala Asn Gly Ala Asp Val Asp Tyr Arg Arg Leu Tyr Ala Phe Tyr Ile Asn Tyr Thr Ile Gly Phe (2) INFORMATION FOR SEQ ID NO:85:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 926 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 207...746 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:
GCATTTGCAA GAAACTTTGA TGATAGAAGT GGATAGGCTT GATTTTTCTT TAGTGGAGCG 180 ATG ATT
Met Lys Ser Met Arg Phe Ser Tyr Ile Glu Pro Arg Ala Lys Tyr Leu Ile Ser Lys Leu Ser Lys Ile Trp Val TTT TAC ATT TTT TTA TCT TTT GTG GTA ATA GGG GGG TTA GTG TGG TTT 329 Phe Tyr Ile Phe Leu Ser Phe Val Val Ile Gly Gly Leu Val Trp Phe ATG CAC AAC GCC ATT AAA AGC ACT CAA GAC AAC GCG TCC AGT TTG ACG 377 Met His Asn Ala Ile Lys Ser Thr Gln Asp Asn Ala Ser Ser Leu Thr Ile Gln Glu Arg Leu Tyr Arg His Glu Ile Ser Arg Leu Gln Val Lys Thr Asp Glu Thr Leu Lys Leu Ile Lys Glu Ala Lys Lys Arg Leu Asn Tyr Asn Asp Asp Ile Arg Asp Val Leu Gln Gly Leu Leu Asn Ile Val Pro Asp Ser Ile Thr Ile Asn Ser Ile Glu Ile Asp Gln Gln Ser Val 110 115 120 Val Val Ser Gly Lys Thr Pro Ser Lys Glu Ala Phe Tyr Phe Leu Phe Gln Asn Lys Leu Asn Pro Met Phe Asp Tyr Ser Arg Ala Glu Phe Phe CCC TTA GAT GGG TGG TTT AAT TTT TCC ACC AAC TTT TCT AAT
Pro Leu Ser Asp Gly Trp Phe Asn Phe Val Ser Thr Asn Phe Ser Asn TCC TTA ATA AAA AAT CCG GAG TCT AAA TGAAGCCATT GCATTTTTCA
Ser Leu Leu Ile Lys Asn Pro Glu Ser Ile Lys (2) INFORMATION FOR SEQ ID NO:86:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:
Met Lys Ser Met Arg Phe Ser Tyr Ile Glu Pro Arg Ala Lys Tyr Leu Ile Ser Lys Leu Ser Lys Ile Trp Val Phe Tyr Ile Phe Leu Ser Phe Val Val Ile Gly Gly Leu Val Trp Phe Met His Asn Ala Ile Lys Ser Thr Gln Asp Asn Ala Ser Ser Leu Thr Ile Gln Glu Arg Leu Tyr Arg 50 55 60 His Glu Ile Ser Arg Leu Gln Val Lys Thr Asp Glu Thr Leu Lys Leu Ile Lys Glu Ala Lys Lys Arg Leu Asn Tyr Asn Asp Asp Ile Arg Asp Val Leu Gln Gly Leu Leu Asn Ile Val Pro Asp Ser Ile Thr Ile Asn Ser Ile Glu Ile Asp Gln Gln Ser Val Val Val Ser Gly Lys Thr Pro Ser Lys Glu Ala Phe Tyr Phe Leu Phe Gln Asn Lys Leu Asn Pro Met 130 135 140 Phe Asp Tyr Ser Arg Ala Glu Phe Phe Pro Leu Ser Asp Gly Trp Phe 145 150 155 160 Asn Phe Val Ser Thr Asn Phe Ser Asn Ser Leu Leu Ile Lys Asn Pro 165 170 175 Glu Ser Ile Lys (2) INFORMATION FOR SEQ ID NO:87:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1440 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 151...1299 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:
Met Cys Val Val Leu Ser Val Lys Arg Asp Gly Glu Lys Thr Leu Glu Asn Asn Glu Glu Asn Lys Asp Glu Lys Leu Ile Leu Ile Asp Glu Phe Glu Val Leu Ala Asn Lys Phe Ile Ser Arg Leu Pro Asn Ile Pro Ser Thr Pro Arg Glu Phe Gly Leu Gly 45 50 55 Lys Gly Glu Ile Met Glu Ile Asp Val Pro Phe Gly Ser Ile Phe Ala TAC AGA CAC ATT GGC TCT ATC AGA CAA AAA GAA TAC AGG ATT GTA GGG 414 Tyr Arg His Ile Gly Ser Ile Arg Gln Lys Glu Tyr Arg Ile Val Gly Leu Tyr Arg Asn Asp Val Leu Leu Leu Ser Thr Lys Ser Leu Val Ile 90 95 100 Gln Pro Arg Asp Ile Leu Leu Val Ala Gly Asn Pro Glu Ile Leu Asn 105 110 115 120 Ala Val Tyr Leu Gln Val Lys Ser Asn Val Gly Gln Phe Pro Ala Pro 125 130 135 Phe Gly Lys Ser Ile Tyr Leu Tyr Ile Asp Met Arg Leu Gln Asn Arg Lys Ala Met Met Arg Asp Val Tyr Gln Ala Leu Phe Leu His Lys His Leu Lys Ser Tyr Lys Leu Tyr Ile Gln Val Leu His Pro Thr Ser Pro Lys Phe Tyr His Lys Phe Leu Ala Leu Glu Thr Glu Ser Ile Glu Val Asn Phe Asp Phe Tyr Arg Lys Ser Phe Ile Gln Lys Leu His Glu Asp His Gln Lys Lys Met Gly Leu Ile Val Val Gly Arg Glu Leu Phe Leu Ser Lys Lys His Arg Lys Ala Leu Tyr Lys Thr Ala Thr Pro Val Tyr Lys Thr Asn Thr Ser Gly Leu Ser Lys Thr Ser Gln Ser Val Val Val Leu Asn Glu Ser Leu Asp Ile Asn Glu Asp Met Ser Ser Val Ile Phe Asp Val Ser Met Gln Met Asp Leu Gly Leu Leu Leu Tyr Asp Phe Asp Pro Asn Lys Arg Tyr Lys Asn Glu Ile Val Asn His Tyr Glu Asn Leu GCC AAC GCG TTC AAC CGC AAG ATT GAG ATT TTC CAA ACC GAT ATT AGA 1134 Ala Asn Ala Phe Asn Arg Lys Ile Glu Ile Phe Gln Thr Asp Ile Arg Asn Pro Ile Met Tyr Leu Asn Ser Leu Arg Asn Pro Ile Leu His Phe Met Pro Phe Glu Glu Cys Ile Thr His Thr Arg Phe Trp Trp Phe Leu Ser Thr Lys Val Glu Lys Leu Ala Phe Leu Asn Asp Asp Asn Pro Gln Ile Phe Ile Pro Val Ala Glu (2) INFORMATION FOR SEQ ID NO:88:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 383 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:
Met Cys Val Val Leu Ser Val Lys Arg Asp Gly Glu Lys Thr Leu Glu Asn Asn Glu Glu Asn Lys Asp Glu Lys Leu Ile Leu Ile Asp Glu Phe Glu Val Leu Ala Asn Lys Phe Ile Ser Arg Leu Pro Asn Ile Pro Ser Thr Pro Arg Glu Phe Gly Leu Gly Lys Gly Glu Ile Met Glu Ile Asp Val Pro Phe Gly Ser Ile Phe Ala Tyr Arg His Ile Gly Ser Ile Arg Gln Lys Glu Tyr Arg Ile Val Gly Leu Tyr Arg Asn Asp Val Leu Leu Leu Ser Thr Lys Ser Leu Val Ile Gln Pro Arg Asp Ile Leu Leu Val Ala Gly Asn Pro Glu Ile Leu Asn Ala Val Tyr Leu Gln Val Lys Ser Asn Val Gly Gln Phe Pro Ala Pro Phe Gly Lys Ser Ile Tyr Leu Tyr Ile Asp Met Arg Leu Gln Asn Arg Lys Ala Met Met Arg Asp Val Tyr Gln Ala Leu Phe Leu His Lys His Leu Lys Ser Tyr Lys Leu Tyr Ile Gln Val Leu His Pro Thr Ser Pro Lys Phe Tyr His Lys Phe Leu Ala Leu Glu Thr Glu Ser Ile Glu Val Asn Phe Asp Phe Tyr Arg Lys Ser Phe Ile Gln Lys Leu His Glu Asp His Gln Lys Lys Met Gly Leu Ile Val Val Gly Arg Glu Leu Phe Leu Ser Lys Lys His Arg Lys Ala Leu Tyr Lys Thr Ala Thr Pro Val Tyr Lys Thr Asn Thr Ser Gly Leu Ser Lys Thr Ser Gln Ser Val Val Val Leu Asn Glu Ser Leu Asp Ile Asn Glu Asp Met Ser Ser Val Ile Phe Asp Val Ser Met Gln Met Asp Leu Gly Leu Leu Leu Tyr Asp Phe Asp Pro Asn Lys Arg Tyr Lys Asn Glu Ile Val Asn His Tyr Glu Asn Leu Ala Asn Ala Phe Asn Arg Lys Ile Glu Ile Phe Gln Thr Asp Ile Arg Asn Pro Ile Met Tyr Leu Asn Ser Leu Arg Asn Pro Ile Leu His Phe Met Pro Phe Glu Glu Cys Ile Thr His Thr Arg Phe Trp Trp Phe Leu Ser Thr Lys Val Glu Lys Leu Ala Phe Leu Asn Asp Asp Asn Pro Gln Ile Phe Ile Pro Val Ala Glu
(2) INFORMATION FOR SEQ ID NO:89:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 517 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...464 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:
Met Val Gly Gly Gly Thr val Lys Lys Asp Leu Lys Lys Ala Ile Gln Tyr Tyr Val Lys Ala Cys Glu Leu Asn Glu Met Phe Gly Cys Leu Ser Leu Val Ser Asn Ser Gln Ile Asn Lys Gln Lys Leu Phe Gln Tyr Leu Ser Lys Ala Cys Glu Leu Asn Ser Gly Asn Gly Cys Arg Phe Leu Gly Asp Phe Tyr Glu Asn G1y Lys Tyr Val Lys Lys Asp Leu Arg Lys Ala Ala Gln TAC TAC TCT AAA GCT TGT GGA TTA AAT GAT C.~1A GAT GGG TGT TTA ATA 344 Tyr Tyr Ser Lys Ala Cys Gly Leu Asn Asp Gln Asp Gly Cys Leu Ile CTA GGA TAT AAG CAA TAT GCT GGC AAG GGC G'TA GTC AAA AAT GAA AAA 392 Leu Gly Tyr Lys Gln Tyr Ala Gly Lys Gly Val Val Lys Asn Glu Lys Gln Ala Val Lys Thr Phe Glu Lys Ala Cys A:rg Leu Gly Ser Glu Asp 1l5 120 125 130 GCA TGT GGT ATT TTA AAC AAC TAC TAGATTTGAi~ ATAAATGCTG TTTTTTAGCT 494 Ala Cys Gly Ile Leu Asn Asn Tyr (2) INFORMATION FOR SEQ ID N0:90:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 138 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:
Met Val Gly Gly Gly Thr Val Lys Lys Asp Leu Lys Lys Ala Ile Gln Tyr Tyr Val Lys Ala Cys Glu Leu Asn Glu Met Phe Gly Cys Leu Ser Leu Val Ser Asn Ser Gln Ile Asn Lys Gln Lys Leu Phe Gln Tyr Leu Ser Lys Ala Cys Glu Leu Asn Ser Gly Asn Gly Cys Arg Phe Leu Gly Asp Phe Tyr Glu Asn Gly Lys Tyr Val Lys Lys Asp Leu Arg Lys Ala Ala Gln Tyr Tyr Ser Lys Ala Cys Gly Leu Asn Asp Gln Asp Gly Cys Leu Ile Leu Gly Tyr Lys Gln Tyr Ala Gly Lys Gly Val Val Lys Asn Glu Lys Gln Ala Val Lys Thr Phe Glu Lys Ala Cys Arg Leu Gly Ser Glu Asp Ala Cys Gly Ile Leu Asn Asn Tyr
(2) INFORMATION FOR SEQ ID NO:91:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1663 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 68...1600 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:
Met Lys Lys Leu Leu Tyr Thr Ile Leu Ala Leu Leu Leu Ile GGC CTT TTA ACA ATC TAT CTC ATC CTT TTT ACA GAA TGG GGG AAT AAG l57 Gly Leu Leu Thr Ile Tyr Leu Ile Leu Phe Thr Glu Trp Gly Asn Lys Ile Ile Ala Ser Tyr Ile Glu Lys Lys Ile Asn Pro Asn Glu His Tyr Leu Ser Val Lys Thr Phe Lys Leu Arg Phe Asn Ser Leu Asp Phe Lys GCT CAA GCC AAC GAT GAT TCC ACG CTC ATT CTT AAG GGG GAT TTT TCA 30l Ala Gln Ala Asn Asp Asp Ser Thr Leu Ile Leu Lys Gly Asp Phe Ser CTT TTAAAGCAA AGCGTAAAT TTGAATTACC'ATATAGATATT AAAGAT 349 Leu LeuLysGln SerValAsn LeuAsnTyrH:isIleAspIle LysAsp TTA CGCTCTTTC AAAGAATGG ATACCCTACC'CTTTAAGGGGG GCTGTT 397 Leu ArgSerPhe LysGluTrp IleProTyrPro LeuArgGly AlaVal 95 100 l05 110 Ile ThrSerGly AsnIleLys GlyHisArgLys AlaLeuMet IleGln Gly ValSerAsn ValAlaGln SerHisThrAla TyrAsnAla LeuLeu 130 l35 140 Asp AspPheLys LeuSerArg LeuAsnLeuAsn AlaGlnAsp AlaAsn Leu GluAspLeu LeuTyrLeu IleAsnArgPro AlaTyrAla AsnAla l60 165 170 Lys ValSerLeu GlnAlaAsp PheAsnSerLeu LysProLeu GluGly His LeuIleLeu ThrAlaAsn AsnAlaLeuIle AsnAsnAla LeuIle Asn GlnIlePhe HisLeuAsn LeuLysAspThr LeuValPhe SerLeu Ser HisSerSer AspPheLys GlyAsnLysAla IleSerAsp ThrThr Leu ThrSerPro LeuAlaAsn PheLysAlaLeu LysSerGlu TyrLeu Phe SerIleLeu LysLeuAsn AlaProTyrThr LeuGluIle ProAsn Leu AlaLysLeu TyrAsnIle ThrAsnHisPro LeuLysGly SerLeu Thr Leu Lys Gly Ala Ile Glu Gln Ser Pro Lys Leu Leu Lys Val Ser Gly His Ser Asn Leu Leu Asp Gly Ala Leu Asp Phe Thr Leu Leu Asn 305 3l0 3Z5 Lys Asp Leu Lys Gly Arg Phe Ser Asn Ile Ser Thr Leu Lys Ala Leu Asp Leu Phe His Tyr Pro Lys Phe Phe Gln Ser Val Ala Asp Ala Asn Leu Asp Tyr Asp Leu Ile Ala Lys Gln Gly Val Leu Lys Ala Arg Leu Lys Asn Ala Arg Phe Leu Lys Asn Ala Phe Ser Asp Phe Leu Tyr Ser Ile Ser Lys Phe Asp Ile Thr Lys Glu Ile Tyr Asn Asp Ala Asn Leu Val Ser Gln Ile Asn Gln Gln Arg Leu Leu Ser Asp Leu Ser Leu Lys Ser Pro Lys Thr Gln Leu Lys Ile His Asn Gly Leu Leu Asp Leu Asn Thr Lys Gln Met Asn Met Leu Met Asp Ala Glu Ile Leu Lys Phe Ile Phe Lys Met Lys Leu Gln Gly Asn Met His 
Gln Pro Lys Phe Ser Leu Ile Leu Asn Glu Lys Ala Ile Gln Gln Asn Leu Gln Gln Gly Leu Lys Glu Ile Leu Lys Asn Asp Thr Leu Lys Lys Gly Leu Asp His Leu Leu Lys Asp Asp Lys Leu Lys Glu Lys Leu Glu Lys Gly Leu Lys Gly Leu Phe (2) INFORMATION FOR SEQ ID N0:92:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 511 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:
Met Lys Lys Leu Leu Tyr Thr Ile Leu Ala Leu Leu Leu Ile Gly Leu Leu Thr Ile Tyr Leu Ile Leu Phe Thr Glu Trp Gly Asn Lys Ile Ile Ala Ser Tyr Ile Glu Lys Lys Ile Asn Pro Asn Glu His Tyr Leu Ser Val Lys Thr Phe Lys Leu Arg Phe Asn Ser Leu Asp Phe Lys Ala Gln Ala Asn Asp Asp Ser Thr Leu Ile Leu Lys Gly Asp Phe Ser Leu Leu Lys Gln Ser Val Asn Leu Asn Tyr His Ile Asp Ile Lys Asp Leu Arg Ser Phe Lys Glu Trp Ile Pro Tyr Pro Leu Arg Gly Ala Val Ile Thr Ser Gly Asn Ile Lys Gly His Arg Lys Ala Leu Met Ile Gln Gly Val Ser Asn Val Ala Gln Ser His Thr Ala Tyr Asn Ala Leu Leu Asp Asp Phe Lys Leu Ser Arg Leu Asn Leu Asn Ala Gln Asp Ala Asn Leu Glu Asp Leu Leu Tyr Leu Ile Asn Arg Pro Ala Tyr Ala Asn Ala Lys Val Ser Leu Gln Ala Asp Phe Asn Ser Leu Lys Pro Leu Glu Gly His Leu Ile Leu Thr Ala Asn Asn Ala Leu Ile Asn Asn Ala Leu Ile Asn Gln Ile Phe His Leu Asn Leu Lys Asp Thr Leu Val Phe Ser Leu Ser His Ser Ser Asp Phe Lys Gly Asn Lys Ala Ile Ser Asp Thr Thr Leu Thr Ser Pro Leu Ala Asn Phe Lys Ala Leu Lys Ser Glu Tyr Leu Phe Ser Ile Leu Lys Leu Asn Ala Pro Tyr Thr Leu Glu Ile Pro Asn Leu Ala Lys Leu Tyr Asn Ile Thr Asn His Pro Leu Lys Gly Ser Leu Thr Leu Lys Gly Ala Ile Glu Gln Ser Pro Lys Leu Leu Lys Val Ser Gly His Ser Asn Leu Leu Asp Gly Ala Leu Asp Phe Thr Leu Leu Asn Lys Asp Leu Lys Gly Arg Phe Ser Asn Ile Ser Thr Leu Lys Ala Leu Asp Leu Phe His Tyr Pro Lys Phe Phe Gln Ser Val Ala Asp Ala Asn Leu Asp Tyr Asp Leu Ile Ala Lys Gln Gly Val Leu Lys Ala Arg Leu Lys Asn Ala Arg Phe Leu Lys Asn Ala Phe Ser Asp Phe Leu Tyr Ser Ile Ser Lys Phe Asp Ile Thr Lys Glu Ile Tyr Asn Asp Ala Asn Leu Val Ser Gln Ile Asn Gln Gln Arg Leu Leu Ser Asp Leu Ser Leu Lys Ser Pro Lys Thr Gln Leu Lys Ile His Asn Gly Leu Leu Asp Leu Asn Thr Lys Gln Met Asn Met Leu Met Asp Ala Glu Ile Leu Lys Phe Ile Phe Lys Met Lys Leu Gln Gly Asn Met His Gln Pro Lys Phe Ser Leu Ile Leu Asn Glu Lys Ala Ile Gln Gln Asn Leu Gln Gln Gly Leu Lys Glu Ile Leu Lys Asn Asp Thr Leu Lys Lys Gly Leu Asp His Leu Leu Lys Asp Asp Lys Leu Lys Glu Lys Leu Glu Lys Gly Leu Lys Gly Leu Phe
(2) INFORMATION FOR SEQ ID NO:93:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 947 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 292...645 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:
AAGCTATTTT
Met Gly Tyr Tyr 5er Glu Val Thr Gly Asp Tyr Leu Phe Asn Tyr Asn Ser Thr ' S 10 15 Ile Val Val Ala Tyr Asp Arg Ser Asp Ala Met Thr Ser Tyr Tyr Ile Asn Val Ile Val Tyr Glu Leu Gln Lys Leu Gly Phe Tyr Asn Val Phe ACG CAA GCG GAA TTC CCA CTA GAT AAA GCC P,AA AAT GTG ATC TAT GCG 489 Thr Gln Ala Glu Phe Pro Leu Asp Lys Ala L~ys Asn Val Ile Tyr Ala CGC ATT GTC CGT AAC ATC TCA GCT GTG CCG T'TC TAC CAA TAC AAT TAC 537 Arg Ile Val Arg Asn Ile Ser Ala Val Pro Phe Tyr Gln Tyr Asn Tyr Gln Leu Ile Asp Gln Val Asn Lys Pro Cys Tyr Phe Leu Gly Gly Gln Phe Tyr Cys Ser Gln Thr Leu Arg Ile Ile Thr Leu Ser Met Ala Leu Ala Ser Lys Phe ACGGGTTTTA
AAATGCGCTCAAAATAT 947 (2) INFORMATION FOR SEQ ID NO:94:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 118 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:
Met Gly Tyr Tyr Ser Glu Val Thr Gly Asp Tyr Leu Phe Asn Tyr Asn Ser Thr Ile Val Val Ala Tyr Asp Arg Ser Asp Ala Met Thr Ser Tyr Tyr Ile Asn Val Ile Val Tyr Glu Leu Gln Lys Leu Gly Phe Tyr Asn Val Phe Thr Gln Ala Glu Phe Pro Leu Asp Lys Ala Lys Asn Val Ile Tyr Ala Arg Ile Val Arg Asn Ile Ser Ala Val Pro Phe Tyr Gln Tyr Asn Tyr Gln Leu Ile Asp Gln Val Asn Lys Pro Cys Tyr Phe Leu Gly Gly Gln Phe Tyr Cys Ser Gln Thr Leu Arg Ile Ile Thr Leu Ser Met Ala Leu Ala Ser Lys Phe
(2) INFORMATION FOR SEQ ID NO:95:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 875 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 348...716 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:
ATG
Met Gln Ala Phe Lys Ser Val Ser Ala Ile Lys Lys Asp Glu Asn Ile Thr Ala Asn Asn Thr Gln Lys Glu Arg Ile Leu Phe Gly Ala Leu Ser Asn Pro Leu Leu Glu Gly Ala Ile Asp Lys Val Ser Ala Lys Asn Phe Ile Pro Pro Asn Thr Leu Leu Ser Thr Asp Lys Thr Gln Ala Leu Ile Ile Val Arg Lys Asn Asp Ile Ile Thr Gly Val Tyr Glu Glu Gly Gln Ile Ser Ile 7u 75 80 CTA CTA GCG ATT
Glu Ile Ser LysAla GluAsn Gly Leu Asn Gln Ile Leu Leu Ala Ile AAT AGC C'TC TTG
Gln Ala Lys LeuGlu AsnLys Ile Lys Ala Lys Val Asn Ser Leu Leu l00 105 1l0 115 AGC AGC TCT GCGCAA TTATAAAGGACA'TTCATGAAATT GGTTTTAGGC746 AAA ATC
Ser Ser Ser AlaGln Leu Lys Ile ATCAGTGGAG TTGCGGTT'TTTAGAAAAATT ACCCAAAGAA806 CGAGCGGGAT
ACCCCTAGCC
TTGTCGTGGC
GTCTAAAAAC
(2) INFORMATION FOR SEQ ID NO:96:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 123 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:96:
Met Gln Ala Phe Lys Ser Val Ser Ala Ile Lys Lys Asp Glu Asn Ile Thr Ala Asn Asn Thr Gln Lys Glu Arg Ile Leu Phe Gly Ala Leu Ser Asn Pro Leu Leu Glu Gly Ala Ile Asp Lys Val Ser Ala Lys Asn Phe Ile Pro Pro Asn Thr Leu Leu Ser Thr Asp Lys Thr Gln Ala Leu Ile Ile Val Arg Lys Asn Asp Ile Ile Thr Gly Val Tyr Glu Glu Gly Gln Ile Ser Ile Glu Ile Ser Leu Lys Ala Leu Glu Asn Gly Ala Leu Asn Gln Ile Ile Gln Ala Lys Asn Leu Glu Ser Asn Lys Ile Leu Lys Ala Lys Val Leu Ser Ser Ser Lys Ala Gln Ile Leu
(2) INFORMATION FOR SEQ ID NO:97:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 394 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 160...345 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:97:
Met Gln Lys Glu Gln Glu Ala Gln Glu Ile Ala Lys Lys Ala Val Lys Ile Val Phe Phe Leu Gly Leu Val Val Val Leu Leu Met Met Ile Asn Leu Tyr Met Leu Ile Asn Gln Ile Asn Ala Ser Ala Gln Met Ser His Gln Ile Lys Lys Ile Glu Glu Arg Leu Asn Gln Glu Gln Lys (2) INFORMATION FOR SEQ ID N0:98:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 62 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:98:
Met Gln Lys Glu Gln Glu Ala Gln Glu Ile Ala Lys Lys Ala Val Lys Ile Val Phe Phe Leu Gly Leu Val Val Val Leu Leu Met Met Ile Asn Leu Tyr Met Leu Ile Asn Gln Ile Asn Ala Ser Ala Gln Met Ser His Gln Ile Lys Lys Ile Glu Glu Arg Leu Asn Gln Glu Gln Lys (2) INFORMATION FOR SEQ ID N0:99:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 982 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 320...880 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 320...400 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 401...880 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99:
ATTGAAGTTGGTGATTATAC CTATTTGTATCTTAAAAAT'TTGATTTTAAA AGTTTGAGAT180 GGTTTTGTAGGTGTATCCCA CTTATCCAATTTATATCAA'TATTTTCACTC TAAAACCCTC240 ATCCTTGATAAAAAATTAAA CCTTTTAGAAAAATAACCG:4TTTTAGGGTG TAACTTTAAT300 AAA GCT TTG
Met Ile Lys Arg Ile Cys Ile Leu Ser Ala Leu AGT GCG AGT TTA GCG CTG GCT GGC GAA GTG AA'T GGG TTT TTC ATG GGT 400 Ser Ala Ser Leu Ala Leu Ala Gly Glu Val Asn Gly Phe Phe Met Gly GCG GGT TAT CAG CAA GGT CGT TAT GGT CCT TA'P AAC AGC AAT TAC TCT 448 Ala Gly Tyr Gln Gln Gly Arg Tyr Gly Pro Ty:r Asn Ser Asn Tyr Ser GAT TGG CGC CAT GGC AAT GAT CTT TAT GGT TT(3 AAT TTC AAA TTA GGT 496 Asp Trp Arg His Gly Asn Asp Leu Tyr Gly Leu Asn Phe Lys Leu Gly Phe Val Gly Phe Ala Asn Lys Trp Phe Gly Al<~ Arg Val Tyr Gly Phe Leu Asp Trp Yrie Asn ~rnr-~Ser Gly Thr Glu His Thr Lys Thr Asn Leu Leu Thr Tyr Gly Gly Gly Gly Asp Leu Ile Val Asn Leu Ile Pro Leu 65 70 75 g0 Asp Lys Phe Ala Leu Gly Leu Ile Gly Gly Val Gln Leu Ala Gly Asn Thr Trp Met Phe Pro Tyr Asp Val Asn Gln Thr Arg Phe Gln Phe Leu 100 105 1l0 Trp Asn Leu Gly Gly Arg Met Arg Val Gly Asp Arg Ser Ala Phe Glu l15 120 125 Ala Gly Val Lys Phe Pro Met Val Asn Gln Gly Asn Lys Asp Val Arg Ala Tyr Pro Leu Leu Phe Leu Gly Met Trp Ile Met Phe Phe Thr Phe 145 150 l55 l60 (2) INFORMATION FOR SEQ ID N0:100:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 187 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:100:
Met Ile Lys Arg Ile Ala Cys Ile Leu Ser Leu Ser Ala Ser Leu Ala Leu Ala Gly Glu Val Asn Gly Phe Phe Met Gly Ala Gly Tyr Gln Gln Gly Arg Tyr Gly Pro Tyr Asn Ser Asn Tyr Ser Asp Trp Arg His Gly Asn Asp Leu Tyr Gly Leu Asn Phe Lys Leu Gly Phe Val Gly Phe Ala Asn Lys Trp Phe Gly Ala Arg Val Tyr Gly Phe Leu Asp Trp Phe Asn Thr Ser Gly Thr Glu His Thr Lys Thr Asn Leu Leu Thr Tyr Gly Gly Gly Gly Asp Leu Ile Val Asn Leu Ile Pro Leu Asp Lys Phe Ala Leu Gly Leu Ile Gly Gly Val Gln Leu Ala Gly Asn Thr Trp Met Phe Pro Tyr Asp Val Asn Gln Thr Arg Phe Gln Phe Leu Trp Asn Leu Gly Gly Arg Met Arg Val Gly Asp Arg Ser Ala Phe Glu Ala Gly Val Lys Phe Pro Met Val Asn Gln Gly Asn Lys Asp Val Arg Ala Tyr Pro Leu Leu Phe Leu Gly Met Trp Ile Met Phe Phe Thr Phe
(2) INFORMATION FOR SEQ ID NO:101:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 843 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 262...777 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:101:
CCAATGGAGG CGTTTCCAAA AACCCAAACG GGCGCTTT7.'T AAAGAAAAAT CTCAAAAAAT 60 GCGATTTTAA TGAAGAAGAA TTAAAAATCA TGTTTGAAC:C TGAAGAAAAA AGGTTGTTAG 180 Met Ser Lys Lys Assn Ser Val Ile Ser Gly 1 _'~ I 0 Leu Met Asn Phe Phe Ser Glu Lys Asn Glu Arg Trp Leu Leu Ala His Arg His Thr Arg Gly Phe Val Ile Val Ala Trp Leu Phe Arg Phe Lys ~ AGC ATT GCG TTT TCT ATT TTG ATC ACT CTG TTG GTT ATT TTA GTG GAT 435 Ser Ile Ala Phe Ser Ile Leu Ile Thr Leu Leu Val Ile Leu Val Asp ATT TGG GTT TAT AGC GAT GTG CGT CAG TTT TT'A TTG GAC ACT TCT AGC 483 - Ile Trp Val Tyr Ser Asp Val Arg Gln Phe Leu Leu Asp Thr Ser Ser Gly Arg Tyr Gly Pro Tyr Asn Ser A
SerPheIleTrp LeuLeuIle AlaLeuLeuIle LysTrpGly ValIle ValIleSerAla ArgLysCys TyrGlnPheSer GlnLysMet PheThr LeuIleGlnArg LysArgGln IleArgGluAsn LeuLysAsn ArgSer 110 l15 120 AsnTyrLysAsp ThrLysAsn AlaGluLysLeu SerSerIle AlaGlu GluIleIleSer LysLysGln GluGluSerArg ProLysGlu AspSer AsnHisGluAsn HisLysGlu LysLeuSerAsn IleThrGlu GluSer GAGGAATTGA AA
AAAGCTAAAA
AGGATAGGGG
Asp Ser (2) INFORMATION FOR SEQ ID NO:102:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 172 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:102:
Met Ser Lys Lys Asn Ser Val Ile Ser Gly Leu Met Asn Phe Phe Ser Glu Lys Asn Glu Arg Trp Leu Leu Ala His Arg His Thr Arg Gly Phe Val Ile Val Ala Trp Leu Phe Arg Phe Lys Ser Ile Ala Phe Ser Ile Leu Ile Thr Leu Leu Val Ile Leu Val Asp Ile Trp Val Tyr Ser Asp Val Arg Gln Phe Leu Leu Asp Thr Ser Ser Ser Phe Ile Trp Leu Leu Ile Ala Leu Leu Ile Lys Trp Gly Val Ile Val Ile Ser Ala Arg Lys Cys Tyr Gln Phe Ser Gln Lys Met Phe Thr Leu Ile Gln Arg Lys Arg Gln Ile Arg Glu Asn Leu Lys Asn Arg Ser Asn Tyr Lys Asp Thr Lys Asn Ala Glu Lys Leu Ser Ser Ile Ala Glu Glu Ile Ile Ser Lys Lys Gln Glu Glu Ser Arg Pro Lys Glu Asp Ser Asn His Glu Asn His Lys Glu Lys Leu Ser Asn Ile Thr Glu Glu Ser Asp Ser
(2) INFORMATION FOR SEQ ID NO:103:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1047 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 34...1005 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:103:
AGAAAGAAAC CATTCAAGGA ACGCATTGAT TTG ATG F.AT AAA CCA TTT TTA ATC 54 Met A.sn Lys Pro Phe Leu Ile Leu Leu Ile Ala Leu Ile Val Phe Ser Gly Cys Asn Met Arg Lys Tyr Phe Lys Pro Ala Lys His Gln Ile Lys Gly Glu Ala Tyr Phe Pro Asn His Leu Gln Glu Ser Ile Val Ser Ser Asn Arg Tyr Gly Ala Ile Leu ' Lys Asn Gly Ala Val Ile Gly Asp Lys Gly Leu Thr Gln Leu Arg Ile Gly Lys Asn L~rie Asn ryr-iilu Ser Ser Phe Leu Asn Glu Ser Gln Gly Phe Phe Ile Leu Ala Gln Asp Cys Leu Asn Lys Ile Asp Lys Lys Thr 90 95 loa Asn Lys Ser Lys Val Ala Lys Thr Glu Glu Thr Glu Leu Lys Leu Lys 105 110 l15 Gly Val Glu Ala Glu Val Gln Asp Lys Val Cys His Gln Val Glu Leu l20 125 130 135 Ile Ser Asn Asn Pro Asn Ala Ser Gln Gln Ser Ile Val Ile Pro Leu l40 145 150 Glu Thr Phe Ala Leu Ser Ala Ser Val Lys Gly Asn Leu Leu Ala Val Val Leu Ala Asp Asn Ser Ala Asn Leu Tyr Asp Ile Thr Ser Gln Lys l70 175 180 Leu Leu Phe Ser Glu Lys Gly Ser Pro Ser Thr Thr Ile Asn Ser Leu Met Ala Met Pro Ile Phe Met Asp Thr Val Val Val Phe Pro Met Leu Asp Gly Arg Leu Leu Val Val Asp Tyr Val His Gly Asn Pro Thr Pro Ile Arg Asn Ile Val Ile Ser Ser Asp Lys Phe Phe Asn Asn Ile Thr Tyr Leu Ile Val Asp Gly Asn Asn Met Ile Ala Ser Thr Gly Lys Arg Ile Leu Ser Val Val Ser Gly Gln Glu Phe Asn Tyr Asp Gly Asp Ile Val Asp Leu Leu Tyr Asp Lys Gly Thr Leu Tyr Val Leu Thr Leu Asp GGG CAG ATT TTV CAA AT3~.GAT AAG AGT TTG A.GG GAA TTA AAC AGC GTG 966 Gly Gln Ile Leu Gln Met Asp Lys Ser Leu Arg Glu Leu Asn Ser Val AAA CTG CCT NTC NTC GCT CAA CAC GAT TGT A.TT AAA CCA TAATAAATTG TA 1017 Lys Leu Pro Xaa Xaa Ala Gln His Asp Cys Ile Lys Pro (2} INFORMATION FOR SEQ ID N0:104:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 324 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:104:
Met Asn Lys Pro Phe Leu Ile Leu Leu Ile Ala Leu Ile Val Phe Ser Gly Cys Asn Met Arg Lys Tyr Phe Lys Pro Ala Lys His Gln Ile Lys Gly Glu Ala Tyr Phe Pro Asn His Leu Gln Glu Ser Ile Val Ser Ser Asn Arg Tyr Gly Ala Ile Leu Lys Asn Gly Ala Val Ile Gly Asp Lys Gly Leu Thr Gln Leu Arg Ile Gly Lys Asn Phe Asn Tyr Glu Ser Ser Phe Leu Asn Glu Ser Gln Gly Phe Phe Ile Leu Ala Gln Asp Cys Leu Asn Lys Ile Asp Lys Lys Thr Asn Lys Ser Lys Val Ala Lys Thr Glu Glu Thr Glu Leu Lys Leu Lys Gly Val Glu Ala Glu Val Gln Asp Lys Val Cys His Gln Val Glu Leu Ile Ser Asn Asn Pro Asn Ala Ser Gln Gln Ser Ile Val Ile Pro Leu Glu Thr Phe Ala Leu Ser Ala Ser Val Lys Gly Asn Leu Leu Ala Val Val Leu Ala Asp Asn Ser Ala Asn Leu Tyr Asp Ile Thr Ser Gln Lys Leu Leu Phe Ser Glu Lys Gly Ser Pro Ser Thr Thr Ile Asn Ser Leu Met Ala Met Pro Ile Phe Met Asp Thr Val Val Val Phe Pro Met Leu Asp Gly Arg Leu Leu Val Val Asp Tyr Val His Gly Asn Pro Thr Pro Ile Arg Asn Ile Val Ile Ser Ser Asp Lys Phe Phe Asn Asn Ile Thr Tyr Leu Ile Val Asp Gly Asn Asn Met Ile Ala Ser Thr Gly Lys Arg Ile Leu Ser Val Val Ser Gly Gln Glu Phe Asn Tyr Asp Gly Asp Ile Val Asp Leu Leu Tyr Asp Lys Gly Thr Leu Tyr Val Leu Thr Leu Asp Gly Gln Ile Leu Gln Met Asp Lys Ser Leu Arg Glu Leu Asn Ser Val Lys Leu Pro Xaa Xaa Ala Gln His Asp Cys Ile Lys Pro
(2) INFORMATION FOR SEQ ID NO:105:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1968 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 153...1793 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 153...219 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105:
Met Asp Lys Asn Asn Asn Asn Leu Arg Leu Ile Leu Ala Ile Ala Leu Ser Phe Leu Phe Ile Ala Leu Tyr Ser Tyr Phe Phe Gln Lys Pro Asn Lys Thr Thr Thr Gln Thr Thr Lys Gln Glu Thr Thr Asn Asn His Thr Ala Thr Ser Pro Asn Ala Pro Asn Ala Gln His Phe Ser Thr Thr Gln Thr Thr Pro Gln Glu Asn Leu WO 98/21225 PCT/US97/21353 w CTA AGC ACV A'r'r '1'L'1 T i T - GAG CAT GCC AGG ATT GAA ATT GAT TCT TTA 413 Leu Ser Thr Ile Ser Phe Glu His Ala Arg Ile Glu Ile Asp Ser Leu Gly Arg Ile Lys Gln Val Tyr Leu Lys Asp Lys Lys Tyr Leu Thr Pro - . Lys Gln Lys Gly Phe Leu Glu His Val Gly His Leu Phe Ser Ser Lys Glu Asn Ala Gln Pro Pro Leu Lys Glu Leu P:ro Leu Leu Ala Ala Asp 100 l05 110 Lys Leu Lys Pro Leu Glu Val Arg Phe Leu A;sp Pro Thr Leu Asn Asn 115 l20 125 AAA GCG TTC AAC ACC CCT TAT AGC GCT TCA A~~1A ACC ACT CTT GGG CCT 653 Lys Ala Phe Asn Thr Pro Tyr Ser Ala Ser Lys Thr Thr Leu Gly Pro 130 135 l40 _ 145 Asn Glu Gln Leu Val Leu Thr Gln Asp Leu G:ly Thr Leu Ser Ile Ile Lys Thr Leu Thr Phe Tyr Asp Asp Leu His Tyr Asp Leu Lys Ile Ala TTC AAA TCG CCC AAT AAC CTT ATC CCT AGC Ti~T GTG ATC ACC AAT GGT 797 Phe Lys Ser Pro Asn Asn Leu Ile Pro Ser Tyr Val Ile Thr Asn Gly Tyr Arg Pro Va1 Ala Asp Leu Asp Ser Tyr Thr Phe Ser Gly Val Leu TTA GAA AAT AGC GAC AAA AAA ATT GAA AAA A'CT GAA GAT AAA GAC GCT 893 Leu Glu Asn Ser Asp Lys Lys Ile Glu Lys I:Le Glu Asp Lys Asp Ala 210 2l5 220 225 AAA GAA ATC AAA CGC TTT TCT AAC ACC CTC T'CT TTA TCC AGC GTG GAT 941 Lys Glu Ile Lys Arg Phe Ser Asn Thr Leu Phe Leu Ser Ser Val Asp Arg Tyr Phe Thr Thr Leu Leu Phe Thr Lys Asp Pro Gln Gly Phe Glu Ala Leu Ile Asp Ser Glu Ile Gly Thr Lys Asn Pro Leu Gly Phe Ile Ser Leu Lys Asn Glu Ala Asn Leu His Gly Tyr Ile Gly Pro Lys Asp Tyr Arg Ser Leu Lys Ala Ile Ser Pro Met Leu Thr Asp Val Ile Glu Tyr Gly Leu Ile Thr Phe Phe Ala Lys Gly Val Phe Val Leu Leu Asp 310 3l5 320 TAT TTG TAT CAA TTC GTG GGC AAT TGG GGT TGG GCT ATC ATT CTT TTA 1229 _ Tyr Leu Tyr Gln Phe Val Gly Asn Trp Gly Trp Ala Ile Ile Leu Leu Thr Ile Ile Val Arg Ile Ile Leu 
Tyr Pro Leu Ser Tyr Lys Gly Met Val Ser Met Gln Lys Leu Lys Glu Leu Ala Pro Lys Met Lys Glu Leu Gln Glu Lys Tyr Lys Gly Glu Pro Gln Lys Leu Gln Ala His Met Met CAG CTT TAC AAA AAA CAT GGG GCT AAC CCA CTA GGG GGT TGT CTG CCC l421 Gln Leu Tyr Lys Lys His Gly Ala Asn Pro Leu Gly Gly Cys Leu Pro Leu Ile Leu Gln Ile Pro Val Phe Phe Ala Ile Tyr Arg Val Leu Tyr 405 4l0 415 AAC GCT GTG GAA TTG AAA AGC TCA GAG TGG ATC TTA TGG ATT CAT GAT l517 Asn Ala Val Glu Leu Lys Ser Ser Glu Trp Ile Leu Trp Ile His Asp Leu Ser Ile Met Asp Pro Tyr Phe Ile Leu Pro Leu Leu Met Gly Ala Ser Met Tyr Trp His Gln Ser Val Thr Pro Asn Thr Met Thr Asp Pro Met Gln Ala Lys Ile Phe Lys Leu Leu Pro Leu Leu Phe Thr Ile Phe Leu Ile Thr Phe Pro Ala Gly Leu Val Leu Tyr Trp Thr Thr Asn Asn ' ATC CTT TC'.G GTG TTCU Clan CAA CTC ATC ATC P.AT AAA GTC TTA GAG AAT 1757 Ile Leu Ser Val Leu Gln Gln Leu Ile Ile A,sn Lys Val Leu Glu Asn Lys Lys Arg Met His Ala Gln Asn Lys Lys Glu His (2) INFORMATION FOR SEQ ID N0:106:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 547 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...22 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:106:
Met Asp Lys Asn Asn Asn Asn Leu Arg Leu Ile Leu Ala Ile Ala Leu Ser Phe Leu Phe Ile Ala Leu Tyr Ser Tyr Phe Phe Gln Lys Pro Asn Lys Thr Thr Thr Gln Thr Thr Lys Gln Glu Thr Thr Asn Asn His Thr Ala Thr Ser Pro Asn Ala Pro Asn Ala Gln His Phe Ser Thr Thr Gln Thr Thr Pro Gln Glu Asn Leu Leu Ser Thr Ile Ser Phe Glu His Ala Arg Ile Glu Ile Asp Ser Leu Gly Arg Ile Lys Gln Val Tyr Leu Lys Asp Lys Lys Tyr Leu Thr Pro Lys Gln Lys Gly Phe Leu Glu His Val Gly His Leu Phe Ser Ser Lys Glu Asn Ala Gln Pro Pro Leu Lys Glu Leu Pro Leu Leu Ala Ala Asp Lys Leu Lys Pro Leu Glu Val Arg Phe Leu Asp Pro Thr Leu Asn Asn Lys Ala Phe Asn Thr Pro Tyr Ser Ala Ser Lys Thr Thr Leu Gly Pro Asn Glu Gln Leu Val Leu Thr Gln Asp Leu Gly Thr Leu Ser Ile Ile Lys Thr Leu Thr Phe Tyr Asp Asp Leu His Tyr Asp Leu Lys Ile Ala Phe Lys Ser Pro Asn Asn Leu Ile Pro Ser Tyr Val Ile Thr Asn Gly Tyr Arg Pro Val Ala Asp Leu Asp Ser Tyr Thr Phe Ser Gly Val Leu Leu Glu Asn Ser Asp Lys Lys Ile Glu Lys Ile Glu Asp Lys Asp Ala Lys Glu Ile Lys Arg Phe Ser Asn Thr Leu Phe Leu Ser Ser Val Asp Arg Tyr Phe Thr Thr Leu Leu Phe Thr Lys Asp Pro Gln Gly Phe Glu Ala Leu Ile Asp Ser Glu Ile Gly Thr Lys Asn Pro Leu Gly Phe Ile Ser Leu Lys Asn Glu Ala Asn Leu His Gly Tyr Ile Gly Pro Lys Asp Tyr Arg Ser Leu Lys Ala Ile Ser Pro Met Leu Thr Asp Val Ile Glu Tyr Gly Leu Ile Thr Phe Phe Ala Lys Gly Val Phe Val Leu Leu Asp Tyr Leu Tyr Gln Phe Val Gly Asn Trp Gly Trp Ala Ile Ile Leu Leu Thr Ile Ile Val Arg Ile Ile Leu Tyr Pro Leu Ser Tyr Lys Gly Met Val Ser Met Gln Lys Leu Lys Glu Leu Ala Pro Lys Met Lys Glu Leu Gln Glu Lys Tyr Lys Gly Glu Pro Gln Lys Leu Gln Ala His Met Met Gln Leu Tyr Lys Lys His Gly Ala Asn Pro Leu Gly Gly Cys Leu Pro Leu Ile Leu Gln Ile Pro Val Phe Phe Ala Ile Tyr Arg Val Leu Tyr Asn Ala Val Glu Leu Lys Ser Ser Glu Trp Ile Leu Trp Ile His Asp Leu Ser Ile Met Asp Pro Tyr Phe Ile Leu Pro Leu Leu Met Gly Ala Ser Met Tyr Trp His Gln Ser Val Thr Pro Asn Thr Met Thr Asp Pro Met Gln Ala Lys Ile Phe Lys Leu Leu Pro Leu Leu Phe Thr Ile Phe Leu Ile Thr Phe Pro Ala Gly Leu Val Leu Tyr Trp Thr Thr Asn Asn Ile Leu Ser Val Leu Gln Gln Leu Ile Ile Asn Lys Val Leu Glu Asn Lys Lys Arg Met His Ala Gln Asn Lys Lys Glu His
(2) INFORMATION FOR SEQ ID NO:107:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3280 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 151...3207 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 151...241 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:107:
AGGCATGATC AACAATTTAG GGGAGGAATG ATG CTC GCT TCC ATT ATT GAA TTT l74 Met Leu Ala Ser Ile Ile Glu Phe Ser Leu Arg Gln Arg Val Ile Val Ile Val Gly Ala Ile Leu Ile Leu TTT TTT GGG ACT TAT AGT TTT ATC AAC ACT CC'A GTG GAC GCT TTC CCG 270 Phe Phe Gly Thr Tyr Ser Phe Ile Asn Thr Pro Val Asp Ala Phe Pro Asp Ile Ser Pro Thr Gln Val Lys Ile Ile Leu Lys Leu Pro Gly Ser Ser Pro Glu Glu Met Glu Asn Asn Ile Val Arg Pro Leu Glu Leu Glu Leu Leu Gly Leu Lys Gly Gln Lys Ser Leu Arg Ser Val Ser Lys Tyr Ser Ile Ser Asp Ile Thr Ile Asp Phe Asp Asp Ser Val Asp Ile Tyr Leu Ala Arg Asn Ile Val Asn Glu Arg Leu Ser Ser Val Met Lys Asp Leu Pro Val Gly Val Glu Gly Gly Met Ala Pro Ile Val Thr Pro Leu TCA GAT ATC TTT ATG 'i"1'L--ACT ATT GAT GGC AAT ATC ACT GAG ATA GAA 606 Ser Asp Ile Phe Met Phe Thr Ile Asp Gly Asn Ile Thr Glu Ile Glu Lys Arg Gln Leu Leu Asp Phe Val Ile Arg Pro Gln Leu Arg Met Ile Ser Gly Val Ala Asp Val Asn Ser Ile Gly Gly Phe Ser Arg Ala Phe Val Ile Val Pro Asp Phe Asn Asp Met Ala Arg Leu Gly Val Ser Ile Ser Asp Leu Glu Ser Ala Val Arg Val Asn Leu Arg Asn Ser Gly Ala Gly Arg Val Asp Arg Asp Gly Glu Thr Phe Leu Val Lys Ile Gln Thr Ala Ser Leu Ser Leu Glu Asp Ile Gly Lys Ile Thr Val Ser Thr Asn Leu Gly His Leu His Ile Lys Asp Phe Ala Lys Val Ile Ser Gln Ser Arg Thr Arg Leu Gly Phe Val Thr Lys Asp Gly Val Gly Glu Thr Thr GAA GGC TTG GTG CTT TCT TTA AAA GAC GCT AAC ACC AAA GAA ATC ATC l038 Glu Gly Leu Val Leu Ser Leu Lys Asp Ala Asn Thr Lys Glu Ile Ile Thr Gln Val Tyr Gln Lys Leu Glu Glu Leu Lys Pro Phe Leu Pro Asn Gly Val Ser Ile Asn Val Phe Tyr Asp Arg Ser Glu Phe Thr Gln Lys Ala Ile Ala Thr Val Ser Lys Thr Leu Ile Glu Ala Val VaI Leu Ile Ile Ile Thr Leu Phe Leu Phe Leu Gly Asn Leu Arg Ala Ser Val Ala -22e-GTG GGG GTG A'i'1' '1'1H C~T..TTA AGC TTG TCC G'TG GCG TTT ATT TTT ATC 1278 Val Gly Val Ile Leu Pro Leu Ser Leu Ser Val Ala Phe Ile Phe Ile Lys Phe Ser Asp Leu Thr Leu Asn Leu Met Se r Leu Gly Gly Leu Val ATC GCT ATA GGC ATG CTC ATT GAC TCA GCC G'.CG GTG GTG GTG GAA AAC 1374 
Ile Ala Ile Gly Met Leu Ile Asp Ser Ala Val Val Val Val Glu Asn GCT TTT GAA AAA TTA AGC GCT AAC ACT AAA A<:C ACT AAA CTC CAT GCA 1422 Ala Phe Glu Lys Leu Ser Ala Asn Thr Lys Thr Thr Lys Leu His Ala ATC TAT CGT TCG TGT AAA GAA ATC GCT GTT TC:A GTG GTG AGC GGG GTG 1470 I1e Tyr Arg Ser Cys Lys Glu Ile Ala Val Ser Val Val Ser Gly Val GTG ATC ATC ATT GTG TTT TTT GTG CCG ATT T7.'A ACC TTA CAG GGG TTA 15l8 Val Ile Ile Ile Val Phe Phe Val Pro Ile Le:u Thr Leu Gln Gly Leu Glu Gly Lys Met Phe Arg Pro Leu Ala Gln Seer Ile Val Tyr Ala Leu Leu Gly Thr Leu Val Leu Ser Ile Thr Ile Ile Pro Val Val Ser Ser Leu Val Leu Lys Ala Thr Pro His Ser Glu Thr Phe Leu Thr Arg Phe Leu Asn Arg Ile Tyr Ala Pro Leu Leu Glu Phe Phe Val His Asn Pro Lys Lys Val Ile Leu Gly Ala Phe Val Phe Leu Ile Ala Ser Leu Ser Leu Phe Pro Phe Val Gly Lys Asn Phe Met Pro Val Leu Asp Glu Gly Asp Val Val Leu Ser Val Glu Thr Thr Pro Ser Ile Ser Leu Asp Gln TCT AGG GAT CTC ATG CTA AAC ATT GAG AGC GCG ATT AAA AAG CAT GTC l902 Ser Arg Asp Leu Met Leu Asn Ile Glu Ser Al,a Ile Lys Lys His Val AAG GAA GTT AAA AGC A'1"1'-GTC GCG CGC ACA GGG AGC GAT GAA TTG GGG 1950 Lys Glu Val Lys 5er Ile Val Ala Arg Thr Gly Ser Asp Glu Leu Gly CTG GAT TTA GGA GGT TTG AAT CAA ACC GAT ACT TTT ATT TCT TTT ATT l998 Leu Asp Leu Gly Gly Leu Asn Gln Thr Asp Thr Phe Ile Ser Phe Ile Pro Lys Lys Glu Trp Ser Val Lys Thr Lys Asp Glu Leu Leu G1u Lys Ile Met Asp Ser Leu Lys Asp Phe Lys Gly Ile Asn Phe Ser Phe Thr Gln Pro Ile Glu Met Arg Ile Ser Glu Met Leu Thr Gly Val Arg Gly Asp Leu Ala Val Lys Ile Phe Gly Asp Gly Ile Ser Glu Leu Asn Glu Leu Ser Phe Gln Ile Ala Gln Ala Leu Lys Gly Ile Lys Gly Ser Ser GAA~GTT TTA ACC ACG CTT AAT GAG GGC GTG AAT TAT TTG TAT GTA ACC 2286 Glu Val Leu Thr Thr Leu Asn Glu Gly Val Asn Tyr Leu Tyr Val Thr Pro Asn Lys Glu Ser Met Ala Asp Val Gly Ile Thr Ser Asp Glu Phe Ser Lys Phe Leu Lys Ser Ala Leu Glu Gly Leu Val Val Asp Val Ile Pro Thr Gly Ile Ser Arg Thr Pro Val Met Ile Arg Gln Glu Ser Asp Phe Ala Ser Ser Ile Thr Lys Ile 
Lys Ser Leu Ala Leu Thr Ser Lys Tyr Gly Val Leu Val Pro Ile Thr Ser Ile Ala Lys Ile Glu Glu Val Asp Gly Pro Val Ser Val Val Arg Glu Asn Ser Met Arg Met Ser Val GTT CGC AGT AAT GTG G'1'G~~GGG CGC GAT TTG AAA TCT TTT GTA GAA GAG 2622 Val Arg Ser Asn Val Val Gly Arg Asp Leu Lys Ser Phe Val Glu Glu GCT AAA AAA GTG ATC GCT CAA AAC ATC AAA C'TC CCT CCC AGC TAC TAT 2670 Ala Lys Lys Val Ile Ala Gln Asn Ile Lys Leu Pro Pro Ser Tyr Tyr ATC ACT TAT GGG GGG CAG TTT GAA AAC CAG C.zIA CGG GCC AAT AAA AGG 2718 Ile Thr Tyr Gly Gly Gln Phe Glu Asn Gln Gln Arg Ala Asn Lys Arg Leu Ser Thr Val Ile Pro Leu Ser Ile Leu Ala Ile Phe Phe Tle Leu TTT TTC ACT TTT AAA AGC ATT CCT TTA GCC T'rG CTC ATT CTT TTG AAT 2814 Phe Phe Thr Phe Lys Ser Ile Pro Leu Ala L~=_u Leu Ile Leu Leu Asn Ile Pro Phe Ala Val Thr Gly Gly Leu Ile A.la Leu Phe Ala Val Gly GAG TAT ATT TCA GTG CCA GCG AGC GTG GGC T'rT ATC GCT CTT TTT GGG 29l0 Glu Tyr Ile Ser Val Pro Ala Ser Val Gly Plze Ile Ala Leu Phe Gly Ile Ala Val Leu Asn Gly Val Val Met Ile G:Ly Tyr Phe Lys Glu Leu CTC TTG CAA GGG AAA AGC GTA GAA GAA TGC G'rT TTA TTG GGC GCT AAA 3006 Leu Leu Gln Gly Lys Ser Val Glu Glu Cys V<~1 Leu Leu Gly Ala Lys Arg Arg Leu Arg Pro Val Leu Met Thr Ala Cys Ile Ala Gly Leu Gly Leu Leu Pro Leu Leu Phe Ser His Ser Val Gly Ser Glu Val Gln Lys CCT TTA GCG ATC GTG GTG CTT GGA GGC TTG G'.CT ACC TCA AGC GCT CTA 3150 Pro Leu Ala Ile Val Val Leu Gly Gly Leu Val Thr Ser Ser Ala Leu 955 960 9(i5 970 ACC TTA CTC CTA CTG CCG CCA ATG TTT ATG C'.CC ATC GCT AAA AAG ATT 3198 Thr Leu Leu Leu Leu Pro Pro Met Phe Met Le,u Ile Ala Lys Lys Ile AAA ATC GTT TGAGTTAAAG GATTTCACAT GCTCGCT'.CTA GAAATTTATA TTGATATTT 3256 Lys Ile Val GTTTGAAAGA CGCTTTAA'1'H GATT 3280 (2) INFORMATION FOR SEQ ID N0:108:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1019 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...30 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:108:
Met Leu Ala Ser Ile Ile Glu Phe Ser Leu Arg Gln Arg Val Ile Val Ile Val Gly Ala Ile Leu Ile Leu Phe Phe Gly Thr Tyr Ser Phe Ile Asn Thr Pro Val Asp Ala Phe Pro Asp Ile Ser Pro Thr Gln Val Lys Ile Ile Leu Lys Leu Pro Gly Ser Ser Pro Glu Glu Met Glu Asn Asn Ile Val Arg Pro Leu Glu Leu Glu Leu Leu Gly Leu Lys Gly Gln Lys Ser Leu Arg Ser Val Ser Lys Tyr Ser Ile Ser Asp Ile Thr Ile Asp Phe Asp Asp Ser Val Asp Ile Tyr Leu Ala Arg Asn Ile Val Asn Glu Arg Leu Ser Ser Val Met Lys Asp Leu Pro Val Gly Val Glu Gly Gly Met Ala Pro Ile Val Thr Pro Leu Ser Asp Ile Phe Met Phe Thr Ile 100 105 110 Asp Gly Asn Ile Thr Glu Ile Glu Lys Arg Gln Leu Leu Asp Phe Val 115 120 125 130 Ile Arg Pro Gln Leu Arg Met Ile Ser Gly Val Ala Asp Val Asn Ser 135 140 145 Ile Gly Gly Phe Ser Arg Ala Phe Val Ile Val Pro Asp Phe Asn Asp Met Ala Arg Leu Gly Val Ser Ile Ser Asp Leu Glu Ser Ala Val Arg Val Asn Leu Arg Asn Ser Gly Ala Gly Arg Val Asp Arg Asp Gly Glu 180 185 190 Thr Phe Leu Val Lys Ile Gln Thr Ala Ser Leu Ser Leu Glu Asp Ile Gly Lys Ile Thr Val Ser Thr Asn Leu Gly His Leu His Ile Lys Asp Phe Ala Lys Val Ile Ser Gln Ser Arg Thr Arg Leu Gly Phe Val Thr Lys Asp Gly Val Gly Glu Thr Thr Glu Gly Leu Val Leu Ser Leu Lys Asp Ala Asn Thr Lys Glu Ile Ile Thr Gln Val Tyr Gln Lys Leu Glu Glu Leu Lys Pro Phe Leu Pro Asn Gly Val Ser Ile Asn Val Phe Tyr Asp Arg Ser Glu Phe Thr Gln Lys Ala Ile Ala Thr Val Ser Lys Thr Leu Ile Glu Ala Val Val Leu Ile Ile Ile Thr Leu Phe Leu Phe Leu Gly Asn Leu Arg Ala Ser Val Ala Val Gly Val Ile Leu Pro Leu Ser 325 330 335 Leu Ser Val Ala Phe Ile Phe Ile Lys Phe Ser Asp Leu Thr Leu Asn Leu Met Ser Leu Gly Gly Leu Val Ile Ala Ile Gly Met Leu Ile Asp Ser Ala Val Val Val Val Glu Asn Ala Phe Glu Lys Leu Ser Ala Asn Thr Lys Thr Thr Lys Leu His Ala Ile Tyr Arg Ser Cys Lys Glu Ile Ala Val Ser Val Val Ser Gly Val Val Ile Ile Ile Val Phe Phe Val Pro Ile Leu Thr Leu Gln Gly Leu Glu Gly Lys Met Phe Arg Pro Leu Ala Gln Ser Ile Val Tyr Ala Leu Leu Gly Thr Leu Val Leu Ser Ile Thr Ile
Ile Pro Val Val Ser Ser Leu Val Leu Lys Ala Thr Pro His Ser Glu Thr Phe Leu Thr Arg Phe Leu Asn Arg Ile Tyr Ala Pro Leu Leu Glu Phe Phe Val His Asn Pro Lys Lys Val Ile Leu Gly Ala Phe Val Phe Leu Ile Ala Ser Leu Ser Leu Phe Pro Phe Val Gly Lys Asn Phe Met Pro Val Leu Asp Glu Gly Asp Val Val Leu Ser Val Glu Thr 515 520 525 530 Thr Pro Ser Ile Ser Leu Asp Gln Ser Arg Asp Leu Met Leu Asn Ile Glu Ser Ala Ile Lys Lys His Val Lys Glu Val Lys Ser Ile Val Ala Arg Thr Gly Ser Asp Glu Leu Gly Leu Asp Leu Gly Gly Leu Asn Gln Thr Asp Thr Phe Ile Ser Phe Ile Pro Lys Lys Glu Trp Ser Val Lys Thr Lys Asp Glu Leu Leu Glu Lys Ile Met Asp Ser Leu Lys Asp Phe Lys Gly Ile Asn Phe Ser Phe Thr Gln Pro Ile Glu Met Arg Ile Ser Glu Met Leu Thr Gly Val Arg Gly Asp Leu Ala Val Lys Ile Phe Gly Asp Gly Ile Ser Glu Leu Asn Glu Leu Ser Phe Gln Ile Ala Gln Ala Leu Lys Gly Ile Lys Gly Ser Ser Glu Val Leu Thr Thr Leu Asn Glu Gly Val Asn Tyr Leu Tyr Val Thr Pro Asn Lys Glu Ser Met Ala Asp Val Gly Ile Thr Ser Asp Glu Phe Ser Lys Phe Leu Lys Ser Ala Leu Glu Gly Leu Val Val Asp Val Ile Pro Thr Gly Ile Ser Arg Thr Pro Val Met Ile Arg Gln Glu Ser Asp Phe Ala Ser Ser Ile Thr Lys Ile Lys Ser Leu Ala Leu Thr Ser Lys Tyr Gly Val Leu Val Pro Ile Thr Ser Ile Ala Lys Ile Glu Glu Val Asp Gly Pro Val Ser Val Val Arg 755 760 765 770 Glu Asn Ser Met Arg Met Ser Val Val Arg Ser Asn Val Val Gly Arg Asp Leu Lys Ser Phe Val Glu Glu Ala Lys Lys Val Ile Ala Gln Asn Ile Lys Leu Pro Pro Ser Tyr Tyr Ile Thr Tyr Gly Gly Gln Phe Glu Asn Gln Gln Arg Ala Asn Lys Arg Leu Ser Thr Val Ile Pro Leu Ser Ile Leu Ala Ile Phe Phe Ile Leu Phe Phe Thr Phe Lys Ser Ile Pro Leu Ala Leu Leu Ile Leu Leu Asn Ile Pro Phe Ala Val Thr Gly Gly Leu Ile Ala Leu Phe Ala Val Gly Glu Tyr Ile Ser Val Pro Ala Ser Val Gly Phe Ile Ala Leu Phe Gly Ile Ala Val Leu Asn Gly Val Val Met Ile Gly Tyr Phe Lys Glu Leu Leu Leu Gln Gly Lys Ser Val Glu Glu Cys Val Leu Leu Gly Ala Lys Arg Arg Leu Arg Pro Val Leu Met Thr Ala Cys
Ile Ala Gly Leu Gly Leu Leu Pro Leu Leu Phe Ser His Ser Val Gly Ser Glu Val Gln Lys Pro Leu Ala Ile Val Val Leu Gly Gly Leu Val Thr Ser Ser Ala Leu Thr Leu Leu Leu Leu Pro Pro Met Phe Met Leu Ile Ala Lys Lys Ile Lys Ile Val
(2) INFORMATION FOR SEQ ID NO:109:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 898 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 86...835 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 86...161 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:109:
GCATAAAATA AACAAACATT AAGTAAGGCT TATCAATA'rT TGATTACAAT TATAAGGGTT 60 Met Leu Gly Asn Val Lys Lys Thr Leu TTT GGG GTC TTG TGT TTG GGC ACG TTG TGT T'rG AGA GGG TTA ATG GCA 160 Phe Gly Val Leu Cys Leu Gly Thr Leu Cys Leu Arg Gly Leu Met Ala Glu Pro Asp Ala Lys Glu Leu Val Asn Leu G.Ly Ile Glu Ser Ala Lys 1 5 10 15_ AAG CAA GAT TTC GCT CAA GCT AAA ACG CAT T'CT GAA AAA GCT TGT GAG 256 Lys Gln Asp Phe Ala Gln Ala Lys Thr His Phe Glu Lys Ala Cys Glu Leu Lys Asn Gly Phe Gly Cys Val Phe Leu Gly Ala Phe Tyr Glu Glu Gly Lys Gly Val Gly Lys Asp Leu Lys Lys A.~a Ile Gln Phe Tyr Thr AAA GGT TGT GAA TTA AAT GAT GGT TAT GGG TCiT AAC CTG CTA GGA AAT 400 Lys Gly Cys Glu Leu Asn Asp Gly Tyr Gly Cys Asn Leu Leu Gly Asn 65 70 7'i 80 Leu Tyr Tyr Asn Gly Gln Gly Val Ser Lys A:>p Ala Lys Lys Ala Ser Gln Tyr Tyr Ser Lys Ala Cys Asp Leu Asn His Ala Glu Gly Cys Met VaI Leu Gly Ser Leu His His Tyr Gly Val Gl.y Thr Pro Lys Asp Leu Arg Lys Ala Leu Asp Leu Tyr Glu Lys Ala Cys Asp Leu Lys Asp Ser CCA GGG TGT ATT AAT u~A--GGA TAT ATA TAT AGT GTA ACA AAG AAT TTT 640 Pro Gly Cys Ile Asn Ala Gly Tyr I1e Tyr Ser Val Thr Lys Asn Phe 145 150 155 l60 Lys Glu Ala Ile Val Arg Tyr Ser Lys Ala Cys Glu Leu Lys Asp Gly Arg Gly Cys Tyr Asn Leu Gly Val Met Gln Tyr Asn Ala Gln Gly Thr 180 185 l90 GCA AAG GAC GAA AAG CAA GCG GTA GAA AAC TTT AAA AAA GGC TGC AAA 784 _ Ala Lys Asp Glu Lys Gln Ala Val Glu Asn Phe Lys Lys Gly Cys Lys Ser Ser Val Lys Glu Ala Cys Asp Ala Leu Lys Glu Leu Lys Ile Glu 2l0 2Z5 220 Leu TTTTAAG ggg (2) INFORMATION FOR SEQ ID NO:110:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 250 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...25 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:110:
Met Leu Gly Asn Val Lys Lys Thr Leu Phe Gly Val Leu Cys Leu Gly Thr Leu Cys Leu Arg Gly Leu Met Ala Glu Pro Asp Ala Lys Glu Leu Val Asn Leu Gly Ile Glu Ser Ala Lys Lys Gln Asp Phe Ala Gln Ala Lys Thr His Phe Glu Lys Ala Cys Glu Leu Lys Asn Gly Phe Gly Cys Val Phe Leu Gly Ala Phe Tyr Glu Glu Gly Lys Gly Val Gly Lys Asp Leu Lys Lys Ala Ile Gln Phe Tyr Thr Lys Gly Cys Glu Leu Asn Asp Gly Tyr Gly Cys Asn Leu Leu Gly Asn Leu Tyr Tyr Asn Gly Gln Gly 75 80 85 Val Ser Lys Asp Ala Lys Lys Ala Ser Gln Tyr Tyr Ser Lys Ala Cys Asp Leu Asn His Ala Glu Gly Cys Met Val Leu Gly Ser Leu His His Tyr Gly Val Gly Thr Pro Lys Asp Leu Arg Lys Ala Leu Asp Leu Tyr 120 125 130 135 Glu Lys Ala Cys Asp Leu Lys Asp Ser Pro Gly Cys Ile Asn Ala Gly Tyr Ile Tyr Ser Val Thr Lys Asn Phe Lys Glu Ala Ile Val Arg Tyr 155 160 165 Ser Lys Ala Cys Glu Leu Lys Asp Gly Arg Gly Cys Tyr Asn Leu Gly Val Met Gln Tyr Asn Ala Gln Gly Thr Ala Lys Asp Glu Lys Gln Ala Val Glu Asn Phe Lys Lys Gly Cys Lys Ser Ser Val Lys Glu Ala Cys 200 205 210 215 Asp Ala Leu Lys Glu Leu Lys Ile Glu Leu
(2) INFORMATION FOR SEQ ID NO:111:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1079 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 169...834 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 169...289 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:111:
CP,AAAAAAAA AAHAAACAAT TTCAGTTTCT TATTAGCTAG GTTTGATTAA AATGAAAAGC 60 ~ Met Ala Glu Asn Ser Phe Lys Asn vai-5er Thr Gln Pro Lys Val Phe Phe Leu Leu Pro Ala Lys Thr Leu Phe Leu Leu Gly Gly Val Phe Ser Ala Phe Phe Ile Leu Ile Ala Gly Leu Val Phe Phe Asp Tyr Ala His Leu Met Asp Asn Ala Ile Phe Asn Phe Ala Arg Ser Thr Pro Phe Asn Ser Ser Pro Ile Leu Thr Leu Ile Leu Gln Asn Ile Ala Asn Leu Gly Ser Ser Gln Phe Val Leu Pro Leu Ser Leu Leu Val Gly Val Phe Leu Ser Leu Tyr Arg Arg Asn Leu Val Leu Gly Val Trp Phe Val Leu Ser Val Ile Leu TTT GAA GCC CTT TTA GAA TCT TTA AAA CAC CTT TTT GCA TAT TCC ATT 56l Phe Glu Ala Leu Leu Glu Ser Leu Lys His Leu Phe Ala Tyr Ser Ile Gln Trp Leu Ser Arg Ser Ala Asn Phe Pro Asn Ala Thr Ala Leu Ser Leu Val Leu Phe Tyr Gly Leu Leu Ile Leu Leu Ile Pro His Leu Ile Thr His Gln Thr Leu Lys Asn Val Leu Phe Tyr Ser Leu Phe Gly Leu Ile Phe Leu Ile Gly Leu Ala Leu Ile Val Leu Gly Val Ser Phe Ser Ser Val Leu Gly Gly Phe Cys Leu Gly Ala Leu Gly Ala Cys Phe Ser Ile Gly Ile Tyr Leu Ser Val Phe Gln Lys Ile ATTTTTTCAT CAAGCTCAt~T AAAAAGCAAA AAATCGCCCT GATTGCAGCT GGGGTTTTGA 974 TCACGGCTTT GCTTGTGTTT TTATTGCTCT ATCCCTTTi~A AGAAAAAGAC TACACGCAAG 1034 (2) INFORMATION FOR SEQ ID N0:112:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 222 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...40 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:112:
Met Ala Glu Asn Ser Phe Lys Asn Val Ser Thr Gln Pro Lys Val Phe -40 -35 -30 -25 Phe Leu Leu Pro Ala Lys Thr Leu Phe Leu Leu Gly Gly Val Phe Ser Ala Phe Phe Ile Leu Ile Ala Gly Leu Val Phe Phe Asp Tyr Ala His Leu Met Asp Asn Ala Ile Phe Asn Phe Ala Arg Ser Thr Pro Phe Asn Ser Ser Pro Ile Leu Thr Leu Ile Leu Gln Asn Ile Ala Asn Leu Gly Ser Ser Gln Phe Val Leu Pro Leu Ser Leu Leu Val Gly Val Phe Leu Ser Leu Tyr Arg Arg Asn Leu Val Leu Gly Val Trp Phe Val Leu Ser Val Ile Leu Phe Glu Ala Leu Leu Glu Ser Leu Lys His Leu Phe Ala Tyr Ser Ile Gln Trp Leu Ser Arg Ser Ala Asn Phe Pro Asn Ala Thr Ala Leu Ser Leu Val Leu Phe Tyr Gly Leu Leu Ile Leu Leu Ile Pro 105 110 115 120 His Leu Ile Thr His Gln Thr Leu Lys Asn Val Leu Phe Tyr Ser Leu Phe Gly Leu Ile Phe Leu Ile Gly Leu Ala Leu Ile Val Leu Gly Val Ser Phe Ser Ser Val Leu Gly Gly Phe Cys Leu Gly Ala Leu Gly Ala Cys Phe Ser Ile Gly Ile Tyr Leu Ser Val Phe Gln Lys Ile 170 175 180
(2) INFORMATION FOR SEQ ID NO:113:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 962 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 97...912 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 97...217 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:113:
AAAGTTTGCA ACCTGATGAG AGTAATAATA GAGTTT ATG CTG ATT TCA TTA AAA 1l4 Met Leu Ile Ser Leu Lys Thr Phe Leu Lys Ile Leu Leu Lys Ile Phe Leu Lys Thr Phe Gln Lys Ile Trp Val Val Cys Val Ile Ile Trp Gly Leu Gly Cys Ser Phe Leu Asn Ala Asn Ser Ile Gln Leu Glu Glu Thr Leu Arg Arg Ser Pro Lys Asn Leu Ile Trp Gln His Phe Lys Lys Lys Phe Lys Lys Ser Asn Thr Ile Pro Tyr Ala Pro Asn Ser Arg Trp Lys Tyr Leu Gly Thr Ser Ile Gly Ile Leu G1y Val Ser Leu Val Ile Gly Ile Val Gly Leu Tyr Leu Met Pro Glu Ser Val Thr Asn Trp Asp Lys Glu Lys Phe Gly Ile Lys AGT TGG TTT GAA AA'1' G'LC-CGC ATG GGG CCA FAA CTG GAC AAT GAT AGT 498 Ser Trp Phe Glu Asn Val Arg Met Gly Pro Lys Leu Asp Asn Asp Ser Phe Ile Phe Asn Glu Ile Leu His Pro Tyr F~he Gly Ala Met Tyr Tyr 95 100 1.05 110 ATG CAA CCG CGC ATG GCT GGA TTT AGC TGG A.TG GCA TCA GCG TTT TTT 594 Met Gln Pro Arg Met Ala Gly Phe Ser Trp Nfet Ala Ser Ala Phe Phe TCT TTT ATC ACT TCC ACG CTT TTT TGG GAA T'AT GGC TTG GAA GCG TTT 642 Ser Phe Ile Thr Ser Thr Leu Phe Trp Glu Tyr Gly Leu Glu Ala Phe GTG GAA GTG CCT AGC TGG CAG GAT TTA GTG A.TC ACG CCT TTA TTA GGC 690 Val Glu Val Pro Ser Trp Gln Asp Leu Val Ile Thr Pro Leu Leu Gly 145 150 l55 TCC ATT TTA GGG GAG GGG TTT TAT CAG CTC A.CG CGC TAT ATC CAA CGC 738 Ser Ile Leu Gly Glu Gly Phe Tyr Gln Leu Thr Arg Tyr Ile G1n Arg Asn Glu Gly Lys Leu Phe Gly Ser Leu Phe Leu Gly Arg Leu Val Ile Ala Leu Met Asp Pro Ile Gly Phe Ile Ile Arg Asp Leu Gly Leu Gly Glu Ala Leu Gly Ile Tyr Asn Lys His Glu Ile Arg Ser Ser Leu Ser Pro Asn Gly Leu Asn Leu Thr Tyr Lys Phe (2) INFORMATION FOR SEQ TD N0:114:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 272 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...40 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:114:
Met Leu Ile Ser Leu Lys Thr Phe Leu Lys Ile Leu Leu Lys Ile Phe Leu Lys Thr Phe Gln Lys Ile Trp Val Val Cys Val Ile Ile Trp Gly Leu Gly Cys Ser Phe Leu Asn Ala Asn Ser Ile Gln Leu Glu Glu Thr Leu Arg Arg Ser Pro Lys Asn Leu Ile Trp Gln His Phe Lys Lys Lys Phe Lys Lys Ser Asn Thr Ile Pro Tyr Ala Pro Asn Ser Arg Trp Lys Tyr Leu Gly Thr Ser Ile Gly Ile Leu Gly Val Ser Leu Val Ile Gly Ile Val Gly Leu Tyr Leu Met Pro Glu Ser Val Thr Asn Trp Asp Lys Glu Lys Phe Gly Ile Lys Ser Trp Phe Glu Asn Val Arg Met Gly Pro Lys Leu Asp Asn Asp Ser Phe Ile Phe Asn Glu Ile Leu His Pro Tyr Phe Gly Ala Met Tyr Tyr Met Gln Pro Arg Met Ala Gly Phe Ser Trp 105 110 115 120 Met Ala Ser Ala Phe Phe Ser Phe Ile Thr Ser Thr Leu Phe Trp Glu 125 130 135 Tyr Gly Leu Glu Ala Phe Val Glu Val Pro Ser Trp Gln Asp Leu Val Ile Thr Pro Leu Leu Gly Ser Ile Leu Gly Glu Gly Phe Tyr Gln Leu 155 160 165 Thr Arg Tyr Ile Gln Arg Asn Glu Gly Lys Leu Phe Gly Ser Leu Phe Leu Gly Arg Leu Val Ile Ala Leu Met Asp Pro Ile Gly Phe Ile Ile Arg Asp Leu Gly Leu Gly Glu Ala Leu Gly Ile Tyr Asn Lys His Glu 205 210 215 Ile Arg Ser Ser Leu Ser Pro Asn Gly Leu Asn Leu Thr Tyr Lys Phe
(2) INFORMATION FOR SEQ ID NO:115:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1422 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 216...1202 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 216...273 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:115:
GATTTTGAAA ATAACCCTAA TGAGCAATCAGCGCTCTTTG TCTTGCCCCTTTCAGCGGTTl20 TTT
Met Lys Arg Phe Val Leu TTT TTA TTG TTC ATG TGC GTT TGC GTT CAA GC'T TAC GCC GAG CAA GAT 2B1 Phe Leu Leu Phe Met Cys Val Cys Val Gln Ala Tyr Ala Glu Gln Asp Tyr Phe Phe Arg Asp Phe Lys Ser Arg Asp Leu Pro Gln Lys Leu His Leu Asp Lys Lys Leu Ser Gln Thr Ile Gln Pro Cys Met Gln Leu Asn Ala Ser Lys His Tyr Thr Ser Thr Gly Val Arg Glu Pro Asp Lys Cys Thr Lys Ser Phe Lys Lys Ser Ala Leu Met Ser Tyr Asp Leu Ala Leu GGT TAT TTG GTG AGT AAG AAT AAG CAA TAC GG~~ TTA AAG GCT ATA GAA 521 Gly Tyr Leu Val Ser Lys Asn Lys Gln Tyr Gly Leu Lys Ala Ile Glu Ile Leu Asn Ala Trp Ala Lys Glu Leu Gln Se:r Val Asp Thr Tyr Gln AGC GAG GAT AAT ATC AAT TTT TAC ATG CCT TA'P ATG AAC ATG GCT TAT 617 Ser Glu Asp Asn Ile Asn Phe Tyr Met Pro Ty:~ Met Asn Met Ala Tyr Trp Phe Val Lys Lys Ala Phe Pro Ser Pro Glu Tyr Glu Asp Phe Ile AAG CGG ATG CGC CAG TAT TCT CAA TCA GCT CT':C AAC ACT AAC CAT GGG 713 Lys Arg Met Arg Gln TyrwSer G1n Ser Ala Leu Asn Thr Asn His Gly 135 l40 145 Ala Trp Gly Ile Leu Phe Asp Val Ser Ser Ala Leu Ala Leu Asp Asp Asn Ala Leu Leu His Asn Ser Ala Asn Arg Trp Gln Glu Trp Val Phe Lys Ala Ile Asp Glu Asn Gly Val Ile Xaa Ser Ala Ile Thr Arg Ser Asp Thr Ser Asp Tyr His Gly Gly Pro Thr Lys Gly Ile Lys Gly Ile 200 205 2l0 Ala Tyr Thr Asn Phe Ala Leu Leu Ala Leu Thr Ile Ser Gly Glu Leu Leu Phe Glu Asn Gly Tyr Asp Leu Trp Gly Ser Gly Ala Gly Lys Arg Leu Ser Val Ala Tyr Asn Lys Val Ala Thr Trp Ile Leu Asn Pro Glu Thr Phe Pro Tyr Phe Gln Pro Asn Leu Ile Gly Val His Asn Asn Ala Tyr Phe Ile Ile Leu Ala Lys His Tyr Ser Ser Pro Ser Ala Asn Glu Leu Leu Lys Gln Gly Asp Leu His Glu Asp Gly Phe Arg Leu Lys Leu Arg Ser Pro (2) INFORMATION FOR SEQ ID N0:116:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...19 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:116:
Met Lys Arg Phe Val Leu Phe Leu Leu Phe Met Cys Val Cys Val Gln Ala Tyr Ala Glu Gln Asp Tyr Phe Phe Arg Asp Phe Lys Ser Arg Asp Leu Pro Gln Lys Leu His Leu Asp Lys Lys Leu Ser Gln Thr Ile Gln Pro Cys Met Gln Leu Asn Ala Ser Lys His Tyr Thr Ser Thr Gly Val Arg Glu Pro Asp Lys Cys Thr Lys Ser Phe Lys Lys Ser Ala Leu Met Ser Tyr Asp Leu Ala Leu Gly Tyr Leu Val Ser Lys Asn Lys Gln Tyr Gly Leu Lys Ala Ile Glu Ile Leu Asn Ala Trp Ala Lys Glu Leu Gln Ser Val Asp Thr Tyr Gln Ser Glu Asp Asn Ile Asn Phe Tyr Met Pro Tyr Met Asn Met Ala Tyr Trp Phe Val Lys Lys Ala Phe Pro Ser Pro 110 115 120 125 Glu Tyr Glu Asp Phe Ile Lys Arg Met Arg Gln Tyr Ser Gln Ser Ala Leu Asn Thr Asn His Gly Ala Trp Gly Ile Leu Phe Asp Val Ser Ser Ala Leu Ala Leu Asp Asp Asn Ala Leu Leu His Asn Ser Ala Asn Arg Trp Gln Glu Trp Val Phe Lys Ala Ile Asp Glu Asn Gly Val Ile Xaa Ser Ala Ile Thr Arg Ser Asp Thr Ser Asp Tyr His Gly Gly Pro Thr Lys Gly Ile Lys Gly Ile Ala Tyr Thr Asn Phe Ala Leu Leu Ala Leu 210 215 220 Thr Ile Ser Gly Glu Leu Leu Phe Glu Asn Gly Tyr Asp Leu Trp Gly Ser Gly Ala Gly Lys Arg Leu Ser Val Ala Tyr Asn Lys Val Ala Thr 240 245 250 Trp Ile Leu Asn Pro Glu Thr Phe Pro Tyr Phe Gln Pro Asn Leu Ile 255 260 265
Gly Val His Asn Asn Ala Tyr Phe Ile Ile Leu Ala Lys His Tyr Ser Ser Pro Ser Ala Asn Glu Leu Leu Lys Gln Gly Asp Leu His Glu Asp Gly Phe Arg Leu Lys Leu Arg Ser Pro
(2) INFORMATION FOR SEQ ID NO:117:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1080 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 157...987 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 157...226 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:117:
Met Lys Thr Asn Gly Leu Phe Lys Met Trp Gly Leu Phe Leu Val Leu Ile Ala Leu Val Phe Asn Ala Cys Ser Asp Ser His Lys Glu Lys Lys Asp Ala Leu Glu Val Ile Lys Gln Arg Gly Val Leu Lys Val Gly Val Phe Ser Asp Lys Pro Pro Phe Gly Ser Val Asp Ser Lys Gly Lys Tyr Gln Gly Tyr Asp Val Val Ile Ala Lys Arg Met Ala Leu Asp Leu Leu Gly Asp Glu Asn Lys Ile WO 98l21225 PCT/US97/21353 -GAG TTT ATT CCT GTA GhA GCT TCA GCT AGG GTG GAA TTT TTA AAA GCC 462 G1u Phe Ile Pro Val Glu Ala Ser Ala Arg Val Glu Phe Leu Lys Ala AAT AAA GTG GAT ATT ATC ATG GCT AAT TTC A.CG CGC ACT AAA GAA AGA 510 Asn Lys Val Asp Ile Ile Met Ala Asn Phe Thr Arg Thr Lys Glu Arg - ~ Glu Lys Val Val Asp Phe Ala Lys Pro Tyr Met Lys Val Ala Leu Gly Val Val Ser Lys Asp Gly Val Ile Lys Asn Ile Glu Glu Leu Lys Asp Lys Glu Leu Ile Val Asn Lys Gly Thr Thr Ala Asp Phe Tyr Phe Thr l30 135 140 AAA AAT TAC CCC AAT ATC AAG CTT TTG AAA T'rT GAG CAA AAT ACA GAG 702 Lys Asn Tyr Pro Asn Ile Lys Leu Leu Lys P.he Glu Gln Asn Thr Glu l45 150 155 Thr Phe Leu Ala Leu Leu Asn Asn Lys Ala T.hr Ala Leu Ala His Asp 160 l65 170 175 AAC ACT TTA TTG CTC GCT TGG ACG AAA CAA C,fiC CCT GAA TTT AAA TTA 79B
Asn Thr Leu Leu Leu Ala Trp Thr Lys Gln His Pro Glu Phe Lys Leu GGC ATT ACA AGC CTT GGC GAT AAG GAT GTG A'rC GCT CCA GCG ATT AAA 846 Gly Ile Thr Sex Leu Gly Asp Lys Asp Val I.le Ala Pro Ala Ile Lys Lys Gly Asn Pro Lys Leu Leu Glu Trp Leu Asn Asn Glu Ile Asp Ser 2l0 2l5 220 CTC ATT TCT AGC GAC TTC TTA AAA GAA GCT Ti~T CAA GAG ACT TTA GCA 942 Leu Ile Ser Ser Asp Phe Leu Lys Glu Ala Tyr Gln Glu Thr Leu Ala Pro Val Tyr Gly Asp Glu Ile Lys Pro Glu Glu Ile Ile Phe Glu TCTTTAGGCT TTGAATTCTT GACAGGGTGC GTTTTTAT'CG CTAAATTAGC AATTTTGTGA 1052 TCTTTTTGTT TTTCATTTTG AGATATAT l080 ' (2) INFORMATION FOR SEQ ID N0:1113:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...23 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:118:
Met Lys Thr Asn Gly Leu Phe Lys Met Trp Gly Leu Phe Leu Val Leu Ile Ala Leu Val Phe Asn Ala Cys Ser Asp Ser His Lys Glu Lys Lys Asp Ala Leu Glu Val Ile Lys Gln Arg Gly Val Leu Lys Val Gly Val Phe Ser Asp Lys Pro Pro Phe Gly Ser Val Asp Ser Lys Gly Lys Tyr Gln Gly Tyr Asp Val Val Ile Ala Lys Arg Met Ala Leu Asp Leu Leu Gly Asp Glu Asn Lys Ile Glu Phe Ile Pro Val Glu Ala Ser Ala Arg Val Glu Phe Leu Lys Ala Asn Lys Val Asp Ile Ile Met Ala Asn Phe Thr Arg Thr Lys Glu Arg Glu Lys Val Val Asp Phe Ala Lys Pro Tyr Met Lys Val Ala Leu Gly Val Val Ser Lys Asp Gly Val Ile Lys Asn 110 115 120 Ile Glu Glu Leu Lys Asp Lys Glu Leu Ile Val Asn Lys Gly Thr Thr Ala Asp Phe Tyr Phe Thr Lys Asn Tyr Pro Asn Ile Lys Leu Leu Lys 140 145 150 Phe Glu Gln Asn Thr Glu Thr Phe Leu Ala Leu Leu Asn Asn Lys Ala Thr Ala Leu Ala His Asp Asn Thr Leu Leu Leu Ala Trp Thr Lys Gln 170 175 180 185 His Pro Glu Phe Lys Leu Gly Ile Thr Ser Leu Gly Asp Lys Asp Val Ile Ala Pro Ala Ile Lys Lys Gly Asn Pro Lys Leu Leu Glu Trp Leu Asn Asn Glu Ile Asp Ser Leu Ile Ser Ser Asp Phe Leu Lys Glu Ala Tyr Gln Glu Thr Leu Ala Pro Val Tyr Gly Asp Glu Ile Lys Pro Glu Glu Ile Ile Phe Glu
(2) INFORMATION FOR SEQ ID NO:119:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1114 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 37...1050 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:119:
Met Phe Phe Lys Thr Tyr Gln Lys Leu Leu Gly Ala Ser Cys Leu Thr Leu Tyr Leu Ala Gly Cys GGG AGT GAT AGT AGC GAG CCA TTG GTG GGA A'TT GAA AAA AAT AGC TTC 150 Gly Ser Asp Ser Ser Glu Pro Leu Val Gly Ile Glu Lys Asn Ser Phe AAT TCT ACC GTG AAA ATC ATT TCT AAA ACC G.AC AAC ATA GAA ATC CAA 198 Asn Ser Thr Val Lys Ile Ile Ser Lys Thr Asp Asn Ile Glu Ile Gln Asp Leu Lys Leu Asn Arg Gly Asn Cys Glu His Asp Gln Asn Phe Leu 55 60 6!~ 70 GTA AAG TTA ATC CAA GAA ACA GCC AAT ACA T:9C CTG TTT GCA TCA GAA 294 Val Lys Leu Ile Gln Glu Thr Ala Asn Thr T~~r Leu Phe Ala Ser Glu AAA GAA AAA GCG ATC AAA AAC CAC CAA GCA Ai3A ATC GCA AGA CTT CAA 342 Lys Glu Lys Ala Ile Lys Asn His Gln Ala Lys Ile Ala Arg Leu Gln Lys Asp Leu Glu Glu Leu Thr Gln His Val Gln Gln Ser Asn Asn Leu - GAT AAA TTG TTA GAA AAT GGA GGA CTA TTC G'CT AGT GGC CAT GAT TAT 438 Asp Lys Leu Leu Glu Asn Gly Gly Leu Phe Val Ser Gly His Asp Tyr AAA TAT ACA AAA GAT GAT AAC CCA ATA TAT G'CT GTT AAG AGG ATG CTT 4S6 Lys Tyr Thr Lys Asp Asp Asn Pro Ile Tyr V<~l Val Lys Arg Met Leu l35 140 145 150 Asp Asn Leu Asp Ser Tyr Lys Tyr Glu Ser Asp Asp Val Leu Asp Val l55 160 165 Pro Tyr Glu Lys Leu Leu Glu Ile Ser Ile Ala Ile Glu Asp Thr Lys Asn Pro Lys Asp Tyr Pro Tyr Ile Asn Leu Lys Glu Leu Lys Lys Leu Ile Asp Ser Ile Ile Asp Asp His Gly Tyr Met Ala Asp Gly Phe Leu Asn Glu Tyr Ser Asn Arg Val Ser Lys Lys Gly Leu Gln Ile Leu Ala AAA CTA AAA TCC ATG TGG CCT AGC GTA GGG AAA TTT TAT TTC GCC..TCT 774 Lys Leu Lys Ser Met Trp Pro Ser Val Gly Lys Phe Tyr Phe Ala Ser Leu Lys Glu Ala Ile Pro Arg His Ala Lys Glu Val Thr Asp Lys Met Ile Ser Ser Glu Glu Lys Ser Ile Lys Ala Asn Gln Val Lys Leu Thr Glu Ala Lys Gln Asp Ile Asp Lys Met Glu Lys Ile Ile Lys Asp Leu Glu Ser Lys Lys Asn Thr Leu Ser Val Tyr Leu Lys Phe Gly Glu Ser Phe Thr Ala His Tyr Lys Cys Gln Asn Leu Ile Glu Val Gly Val Lys ACC GAT AAA GGC TCC TGG ACT TTC AAC TTT AAC AGA TAAATCAGGC AAATAT l066 Thr Asp Lys Gly Ser Trp Thr Phe Asn Phe Asn Arg (2) INFORMATION FOR SEQ ID N0:120:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 338 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:120:
Met Phe Phe Lys Thr Tyr Gln Lys Leu Leu Gly Ala Ser Cys Leu Thr Leu Tyr Leu Ala Gly Cys Gly Ser Asp Ser Ser Glu Pro Leu Val Gly Ile Glu Lys Asn Ser Phe Asn Ser Thr Val Lys Ile Ile Ser Lys Thr Asp Asn Ile Glu Ile Gln Asp Leu Lys Leu Asn Arg Gly Asn Cys Glu His Asp Gln Asn Phe Leu Val Lys Leu Ile Gln Glu Thr Ala Asn Thr Tyr Leu Phe Ala Ser Glu Lys Glu Lys Ala Ile Lys Asn His Gln Ala Lys Ile Ala Arg Leu Gln Lys Asp Leu Glu Glu Leu Thr Gln His Val Gln Gln Ser Asn Asn Leu Asp Lys Leu Leu Glu Asn Gly Gly Leu Phe Val Ser Gly His Asp Tyr Lys Tyr Thr Lys Asp Asp Asn Pro Ile Tyr 130 135 140 Val Val Lys Arg Met Leu Asp Asn Leu Asp Ser Tyr Lys Tyr Glu Ser Asp Asp Val Leu Asp Val Pro Tyr Glu Lys Leu Leu Glu Ile Ser Ile Ala Ile Glu Asp Thr Lys Asn Pro Lys Asp Tyr Pro Tyr Ile Asn Leu Lys Glu Leu Lys Lys Leu Ile Asp Ser Ile Ile Asp Asp His Gly Tyr Met Ala Asp Gly Phe Leu Asn Glu Tyr Ser Asn Arg Val Ser Lys Lys Gly Leu Gln Ile Leu Ala Lys Leu Lys Ser Met Trp Pro Ser Val Gly 225 230 235 240 Lys Phe Tyr Phe Ala Ser Leu Lys Glu Ala Ile Pro Arg His Ala Lys Glu Val Thr Asp Lys Met Ile Ser Ser Glu Glu Lys Ser Ile Lys Ala Asn Gln Val Lys Leu Thr Glu Ala Lys Gln Asp Ile Asp Lys Met Glu Lys Ile Ile Lys Asp Leu Glu Ser Lys Lys Asn Thr Leu Ser Val Tyr Leu Lys Phe Gly Glu Ser Phe Thr Ala His Tyr Lys Cys Gln Asn Leu 305 310 315 320 Ile Glu Val Gly Val Lys Thr Asp Lys Gly Ser Trp Thr Phe Asn Phe Asn Arg
(2) INFORMATION FOR SEQ ID NO:121:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1101 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 40...1026 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 40...99 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 100...1026 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:121:
Met Gln Gln Arg His Leu Gly Pro Leu Lys Val Gly Ala Leu Ala Leu Gly Cys Met Gly Met Thr Tyr Gly Tyr Gly Glu Val His Asp Lys Lys Gln Met Val Lys Leu Ile His Lys Ala Leu Glu Leu Gly Ile Asn Phe Phe Asp Thr Ala Glu Ala Tyr Gly Glu Asp Asn Glu Lys Leu Leu Ala Lys Arg Ser Ser Leu Ile Lys Asp Lys Val Val Val Ala Ser Lys Phe Gly Ile Tyr Tyr Ala Asp Pro Asn Asp Lys Tyr Ala Thr Met Phe Leu Asp Ser Ser Ser Asn Arg Ile Lys Ser Ala Ile Glu Gly Ser Leu Lys Arg Leu Lys Val Glu ~ TGC ATT GAT TTA TAC TAC CAA CAC CGC ATG GAT ACT AAC ACG CCC ATA 438 Cys Ile Asp Leu Tyr Tyr Gln His Arg Met Asp Thr Asn Thr Pro Ile GAA GAA GTG GCA GAA GTT ATG CAA GCT CTT A'.CT AAA GAA GGA AAA ATT 486 Glu Glu Val Ala Glu Val Met Gln Ala Leu Ile Lys Glu Gly Lys Ile Lys Ala Trp Gly Met Ser Glu Ala Gly Leu Se:r Ser Ile Gln Lys Ala l30 135 14E0 145 His Gln Ile Cys Pro Leu Ser Ala Leu Gln Se:r Glu Tyr Ser Leu Trp TGG CGC GAA CCT GAA AAA GAG ATT TTA GGT T7.'T TTA GAA AAA GAA AAA 630 Trp Arg Glu Pro Glu Lys Glu Ile Leu Gly Phe Leu Glu Lys Glu Lys Ile Gly Phe Val Ala Phe Ser Pro Leu Gly Lys Gly Phe Leu Gly Ala l80 185 190 Lys Phe Glu Lys Asn Ala Thr Phe Ala 5er Glu Asp Phe Arg Ser Val TCT CCT AGG TTT AAT CAA GAA AAT CTA GCC AAP. AAT TAC GTC TTG GTG 774 Ser Pro Arg Phe Asn Gln Glu Asn Leu Ala Lys Asn Tyr Val Leu Val Glu Leu Ile Gln Asp His Ala His Ala Lys Gly Val Thr Pro Ala Gln Leu Ala Leu Ser Trp Ile Leu His Thr Gln Lys Ile Ile Val Pro Leu TTT GGC ACC ACC AAA GAA TCC AGG CTC ATA GAA AAT ATA GGG GCT TTG 9l8 Phe Gly Thr Thr Lys Glu Ser Arg Leu Ile G7.u Asn Ile Gly Ala Leu CAG GTT TCT TGG AGT CAA AAA GAA TTG GAG A7.'T TTT CAA AAA GAA TTG 966 Gln Val Ser Trp Ser Gln Lys Glu Leu Glu Il.e Phe Gln Lys Glu Leu ACT GCA ATC AAA ATA GAA GGG GCC CGC TAC CC.'T GAA AGA ATC AAT GAA 1014 Thr Ala Ile Lys Ile Glu Gly Ala Arg Tyr Pro Glu Arg Ile Asn Glu WO 98I21225 PCT/US97/21353 w ATG GTG AAT CAA TAAAAGTATT GGGTATTTAT AATTGCATTG GCTCTTTTAA AAGAG 107l Met Val Asn Gln (2) INFORMATION FOR SEQ ID N0:122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:122:
Met Gln Gln Arg His Leu Gly Pro Leu Lys Val Gly Ala Leu Ala Leu Gly Cys Met Gly Met Thr Tyr Gly Tyr Gly Glu Val His Asp Lys Lys Gln Met Val Lys Leu Ile His Lys Ala Leu Glu Leu Gly Ile Asn Phe Phe Asp Thr Ala Glu Ala Tyr Gly Glu Asp Asn Glu Lys Leu Leu Ala Lys Arg Ser Ser Leu Ile Lys Asp Lys Val Val Val Ala Ser Lys Phe Gly Ile Tyr Tyr Ala Asp Pro Asn Asp Lys Tyr Ala Thr Met Phe Leu Asp Ser Ser Ser Asn Arg Ile Lys Ser Ala Ile Glu Gly Ser Leu Lys Arg Leu Lys Val Glu Cys Ile Asp Leu Tyr Tyr Gln His Arg Met Asp 95 100 105 Thr Asn Thr Pro Ile Glu Glu Val Ala Glu Val Met Gln Ala Leu Ile Lys Glu Gly Lys Ile Lys Ala Trp Gly Met Ser Glu Ala Gly Leu Ser Ser Ile Gln Lys Ala His Gln Ile Cys Pro Leu Ser Ala Leu Gln Ser Glu Tyr Ser Leu Trp Trp Arg Glu Pro Glu Lys Glu Ile Leu Gly Phe Leu Glu Lys Glu Lys Ile Gly Phe Val Ala Phe Ser Pro Leu Gly Lys Gly Phe Leu Gly Ala Lys Phe Glu Lys Asn Ala Thr Phe Ala Ser Glu Asp Phe Arg Ser Val Ser Pro Arg Phe Asn Gln Glu Asn Leu Ala Lys Asn Tyr Val Leu Val Glu Leu Ile Gln Asp His Ala His Ala Lys Gly Val Thr Pro Ala Gln Leu Ala Leu Ser Trp Ile Leu His Thr Gln Lys Ile Ile Val Pro Leu Phe Gly Thr Thr Lys Glu Ser Arg Leu Ile Glu Asn Ile Gly Ala Leu Gln Val Ser Trp Ser Gln Lys Glu Leu Glu Ile Phe Gln Lys Glu Leu Thr Ala Ile Lys Ile Glu Gly Ala Arg Tyr Pro Glu Arg Ile Asn Glu Met Val Asn Gln
(2) INFORMATION FOR SEQ ID NO:123:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 955 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 126...806 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 126...237 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:123:
Met Val Phe Asp Arg Thr Ile Ser Val Arg Glu Lys Lys A1a Ala Lys Thr Leu Gly Ile Ile Gly Ile Val Phe Phe Ile Leu Phe Gly Ile Val Ile Ser Gly Val Ala Phe Gln Lys Glu Trp Val Gln Gln Leu Asp TTA TTT TTT ATA GAC TTG ATC CGC AAC CCT GCC CCC ATT CAA AAA AGC 3l4 Leu Phe Phe Ile Asp Leu Ile Arg Asn Pro Ala Pro Ile Gln Lys Ser Ala Trp Leu Ser Phe Val Phe Phe Ser Thr Trip Phe Ala Gln Ser Lys WO 98!2122S PCT/US97/21353 -Leu Thr Thr Pro Ile Ala Leu Leu Ile Gly Leu Trp Phe Gly Phe Gln Lys Arg Ile Ala Leu Gly Val Trp Phe Phe Phe Ser Ile Leu Leu Gly Glu Phe Thr Leu Lys Ser Leu Lys Leu Leu Val Ala Arg Pro Arg Pro Val Thr Asn Gly Glu Leu Val Phe Ala His Gly Phe Ser Phe Pro Ser Gly His Ala Leu Ala Ser Ala Leu Phe Tyr Gly Ser Leu Ala Leu Leu 1l0 115 120 Leu Cys Tyr Ser Asn Ala Asn Asn Arg Ile Lys Thr Ile Ile Ala Val Val Leu Leu Phe Trp Ile Phe Leu Met Ala Tyr Asp Arg Val Tyr Leu Gly Val His Tyr Pro Ser Asp val Leu Gly Gly Phe Leu Leu Gly Ile l55 160 165 l70 Ala Trp Ser Cys Cys Ser Leu Ala Leu Tyr Leu Gly Phe Leu Lys Arg l75 180 185 Pro Tyr Asn Gln (2) INFORMATION FOR SEQ ID N0:124:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 227 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...37 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:124:
Met Val Phe Asp Arg Thr Ile Ser Val Arg Glu Lys Lys Ala Ala Lys Thr Leu Gly Ile Ile Gly Ile Val Phe Phe Ile Leu Phe Gly Ile Val Ile Ser Gly Val Ala Phe Gln Lys Glu Trp Val Gln Gln Leu Asp Leu Phe Phe Ile Asp Leu Ile Arg Asn Pro Ala Pro Ile Gln Lys Ser Ala Trp Leu Ser Phe Val Phe Phe Ser Thr Trp Phe Ala Gln Ser Lys Leu Thr Thr Pro Ile Ala Leu Leu Ile Gly Leu Trp Phe Gly Phe Gln Lys Arg Ile Ala Leu Gly Val Trp Phe Phe Phe Ser Ile Leu Leu Gly Glu Phe Thr Leu Lys Ser Leu Lys Leu Leu Val Ala Arg Pro Arg Pro Val Thr Asn Gly Glu Leu Val Phe Ala His Gly Phe Ser Phe Pro Ser Gly His Ala Leu Ala Ser Ala Leu Phe Tyr Gly Ser Leu Ala Leu Leu Leu Cys Tyr Ser Asn Ala Asn Asn Arg Ile Lys Thr Ile Ile Ala Val Val 125 130 135 Leu Leu Phe Trp Ile Phe Leu Met Ala Tyr Asp Arg Val Tyr Leu Gly Val His Tyr Pro Ser Asp Val Leu Gly Gly Phe Leu Leu Gly Ile Ala Trp Ser Cys Cys Ser Leu Ala Leu Tyr Leu Gly Phe Leu Lys Arg Pro Tyr Asn Gln
(2) INFORMATION FOR SEQ ID NO:125:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1183 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 91...1032 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 91...148 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:125:
TAGTTACAAC TATTTATTGT AAAGGCTAAA ATG TTG AAA TTT AAA TAT GGT TTG 114 Met Leu Lys Phe Lys Tyr Gly Leu Ile Tyr Ile Ala Leu Ile Leu Gly Leu Gln Ala Thr Asp Tyr Asp Asn Leu Glu Glu Glu Asn Gln Gln Leu Asp Glu Lys Ile Asn His Leu Lys Gln Gln Leu Thr Glu Lys Gly Val Ser Pro Lys Glu Met Asp Lys Asp Lys Phe Glu Glu Glu Tyr Ile Asn Arg Ser Tyr Pro Lys Ile Ser Ser Lys Lys Lys Glu Lys Leu Leu Lys Ser Phe Ser Ile Ala Asp Asp Lys Ser Gly Val Phe Leu Gly Gly Gly Tyr Ala Tyr Gly Glu Leu Asn Leu Ser Tyr Gln Gly Glu Met Leu Asp Arg Tyr Gly Ala Asn Ala Pro Ser Ala Phe Lys Asn Asn Ile Asn Ile Asn Ala Pro Val Ser Met Ile Ser Ala Lys Phe Gly Tyr Gln Lys Tyr Phe Val Ser Tyr Phe Gly Thr Arg Phe Tyr Gly Asp Leu Leu Leu Gly Gly Gly Ala Leu Lys Glu Asp Ala 135 140 145 Ile Lys Gln Pro Val Gly Ser Phe Ile Tyr Val Leu Gly Ala Val Asn 150 155 160 165 ACC GAT TTA TTG TTT GAT ATG CCT TTA GAT TTT AAA ACT AAA AAG CAT 690
Thr Asp Leu Leu Phe Asp Met Pro Leu Asp Phe Lys Thr Lys Lys His 170 175 180 TTT GGG
Phe Leu Gly Val Tyr Ala Gly Phe Gly Ile Gly Leu Met Leu Tyr Gln AGG AAT
Asp Arg Pro Asn Gln Asn Gly Arg Asn Leu Val Val Gly Gly Tyr Ser TCT TTG
Ser Pro Asn Phe Leu Trp Lys Ser Leu Ile Glu Val Asp Tyr Thr Phe AAT GTG GGC GTG AGT TTA ACG CTT TAT AGG AAA CAC CGT TTA GAG ATT 882
Asn Val Gly Val Ser Leu Thr Leu Tyr Arg Lys His Arg Leu Glu Ile GGC ACA AAA TTG CCG ATT AGC TAT TTG AGA ATG GGA GTG GAA GAG GGA 930
Gly Thr Lys Leu Pro Ile Ser Tyr Leu Arg Met Gly Val Glu Glu Gly GCG ATT TAT CAA AAT AAA GAA GAT GAT GAG CGT TTG TTG GTT TCG GCT 978
Ala Ile Tyr Gln Asn Lys Glu Asp Asp Glu Arg Leu Leu Val Ser Ala AAC AAC CAG TTC AAG CGA TCC AGT TTT TTA TTA GTG AAT TAT GCG TTT 1026
Asn Asn Gln Phe Lys Arg Ser Ser Phe Leu Leu Val Asn Tyr Ala Phe TCTTGGAGTT AA
AAGGTTTAAA
ATTTTAGCGT
Ile Phe TTATTTGATT TTAAGTTTTA
TTTAACGCTT
TAATCACAAA
CAAAGAGGGT
GCGCTTAATG
ACAATGAT'G
(2) INFORMATION FOR SEQ ID NO:126:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 314 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...19 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:126:
Met Leu Lys Phe Lys Tyr Gly Leu Ile Tyr Ile Ala Leu Ile Leu Gly Leu Gln Ala Thr Asp Tyr Asp Asn Leu Glu Glu Glu Asn Gln Gln Leu Asp Glu Lys Ile Asn His Leu Lys Gln Gln Leu Thr Glu Lys Gly Val Ser Pro Lys Glu Met Asp Lys Asp Lys Phe Glu Glu Glu Tyr Ile Asn Arg Ser Tyr Pro Lys Ile Ser Ser Lys Lys Lys Glu Lys Leu Leu Lys Ser Phe Ser Ile Ala Asp Asp Lys Ser Gly Val Phe Leu Gly Gly Gly Tyr Ala Tyr Gly Glu Leu Asn Leu Ser Tyr Gln Gly Glu Met Leu Asp Arg Tyr Gly Ala Asn Ala Pro Ser Ala Phe Lys Asn Asn Ile Asn Ile Asn Ala Pro Val Ser Met Ile Ser Ala Lys Phe Gly Tyr Gln Lys Tyr Phe Val Ser Tyr Phe Gly Thr Arg Phe Tyr Gly Asp Leu Leu Leu Gly Gly Gly Ala Leu Lys Glu Asp Ala Ile Lys Gln Pro Val Gly Ser Phe Ile Tyr Val Leu Gly Ala Val Asn Thr Asp Leu Leu Phe Asp Met Pro 160 165 170 Leu Asp Phe Lys Thr Lys Lys His Phe Leu Gly Val Tyr Ala Gly Phe Gly Ile Gly Leu Met Leu Tyr Gln Asp Arg Pro Asn Gln Asn Gly Arg Asn Leu Val Val Gly Gly Tyr Ser Ser Pro Asn Phe Leu Trp Lys Ser Leu Ile Glu Val Asp Tyr Thr Phe Asn Val Gly Val Ser Leu Thr Leu Tyr Arg Lys His Arg Leu Glu Ile Gly Thr Lys Leu Pro Ile Ser Tyr Leu Arg Met Gly Val Glu Glu Gly Ala Ile Tyr Gln Asn Lys Glu Asp Asp Glu Arg Leu Leu Val Ser Ala Asn Asn Gln Phe Lys Arg Ser Ser Phe Leu Leu Val Asn Tyr Ala Phe Ile Phe (2) INFORMATION FOR SEQ ID NO:127:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1851 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 238...1665 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 238...313 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:127:
GAGCTAGTTT TAAAAAGTTA GTTTTGTTTT AAAAAGTTi~A TACTATTTTG AAGCACTCCT 60 ATTCAGATGG CTAAGGCACA CAAGAAATTA GGGGACTC'L'G CTGTATTCCT ACCCTGAAGC 120 GTTACCCTAA AATCCTATTG CATAGGTCTA AATAAGAG(~T TAGGGATCAT TTTAGCCATA 180 Met TCA ATT AAA AGG GTT AGA TTG AAA ATA TTC GTT CTG TTG ATG TCG GTA 288
Ser Ile Lys Arg Val Arg Leu Lys Ile Phe Val Leu Leu Met Ser Val Ile Leu Gly Ile Ser Leu Thr Gly Cys Ile Gly Tyr Arg Met Asp Leu GAA CAT TTT AAC ACG CTC TAT TAT GAA GAA AGC CCT AAA AAA GCT TAT 384 Glu His Phe Asn Thr Leu Tyr Tyr Glu Glu Ser Pro Lys Lys Ala Tyr Glu Tyr Ser Lys Gln Phe Thr Lys Lys Lys Lys Asn Ala Leu Leu Trp GAC TTG CAA AAC GGC TTG AGC GCT TTA TAC GCC AGA GAT TAC CAG ACT 480 Asp Leu Gln Asn Gly Leu Ser Ala Leu Tyr Ala Arg Asp Tyr Gln Thr Ser Leu Gly Val Leu Asp Gln Ala Glu Gln Arg Phe Asp Lys Thr Gln Ser Ala Phe Thr Arg Gly Ala Gly Tyr Val Gly Ala Thr Met Ile Asn GAT AAT GTG CGC GCT TAT GGG GGG AAT ATT TAT GAG GGC GTT TTA ATC 624 Asp Asn Val Arg Ala Tyr Gly Gly Asn Ile Tyr Glu Gly Val Leu Ile 90 95 100 Asn Tyr Tyr Lys Ala Ile Asp Tyr Met Leu Leu Asn Asp Ser Ala Lys 105 110 115 120 Ala Arg Val Gln Phe Asn Arg Ala Asn Glu Arg Gln Arg Arg Ala Lys 125 130 135 Glu Phe Tyr Tyr Glu Glu Val Gln Lys Ala Ile Lys Glu Ile Asp Ser Ser Lys Lys His Asn Ile Asn Met Glu Arg Ser Arg Val Glu Val Ser 155 160 165 Glu Ile Leu Asn Asn Thr Tyr Ser Asn Leu Asp Lys Tyr Glu Ala Tyr 170 175 180 Gln Gly Leu Leu Asn Pro Ala Val Ser Tyr Leu Ser Gly Leu Phe Tyr 185 190 195 200 Ala Leu Asn Gly Asp Glu Asn Lys Gly Leu Gly Tyr Leu Asn Glu Ala Tyr Gly Ile Ser Gln Ser Pro Phe Val Ala Gln Asp Leu Val Phe Phe Lys Asn Pro Asn Arg Ser His Phe Thr Trp Ile Ile Ile Glu Asp Gly Lys Glu Pro Gln Lys Ser Glu Phe Lys Ile Asp Val Pro Ile Phe Met Ile Asp Ser Val Tyr Asn Val Ser Ile Ala Leu Pro Lys Leu Glu Lys Gly Glu Ala Phe Tyr Gln Asn Phe Thr Leu Lys Asp Gly Glu Lys Val Thr Pro Phe Asp Thr Leu Ala Ser Ile Asp Ala Val Val Ala Ser Glu Phe Arg Lys Gln Leu Pro Tyr Ile Ile Thr Arg Ala Ile Leu Ser Ala ACT TTT AAA GTG GGC ATG CAA GCG GTG GCG AAC TAT TAT TTG GGG TTT 1344 Thr Phe Lys Val Gly Met Gln Ala Val Ala Asn Tyr Tyr Leu Gly Phe Val Gly Gly Leu Val Thr Ser Leu Tyr Ser Gly Val Ser Thr Phe Ala Asp Thr Arg Ser Thr Ser Ile Phe Ala His Lys Ile Tyr Leu Met Arg Ile Lys Asn Lys Ala Phe Glu Ser
Tyr Glu Val Arg Ala Asp Ser Ile Asp Ala Phe Ser Phe Ser Leu Lys Pro Cys Lys Arg Ser Leu Glu Ser Pro Lys Ile Ile Asp Ala Arg Glu Leu Leu Ser Gly Phe Val Ala Ala Pro Gln Ile Phe Cys Ser Asn Arg His Asn Ile Leu Tyr Val Arg Ser TTT AAA AAC GGG TTT GTT TTG AGT CGT TTA AAA TGATTTCAAA ACCCCCACCA 1685 Phe Lys Asn Gly Phe Val Leu Ser Arg Leu Lys CGATACGAAA ACCTAAATTA AGGGGAAGTC ATGGCTGA'rA GTTTAGCGGG CATTGATCAA 1805 (2) INFORMATION FOR SEQ ID NO:128:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 476 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...25 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:128:
Met Ser Ile Lys Arg Val Arg Leu Lys Ile Phe Val Leu Leu Met Ser Val Ile Leu Gly Ile Ser Leu Thr Gly Cys Ile Gly Tyr Arg Met Asp Leu Glu His Phe Asn Thr Leu Tyr Tyr Glu Glu Ser Pro Lys Lys Ala Tyr Glu Tyr Ser Lys Gln Phe Thr Lys Lys Lys Lys Asn Ala Leu Leu Trp Asp Leu Gln Asn Gly Leu Ser Ala Leu Tyr Ala Arg Asp Tyr Gln Thr Ser Leu Gly Val Leu Asp Gln Ala Glu Gln Arg Phe Asp Lys Thr Gln Ser Ala Phe Thr Arg Gly Ala Gly Tyr Val Gly Ala Thr Met Ile Asn Asp Asn Val Arg Ala Tyr Gly Gly Asn Ile Tyr Glu Gly Val Leu 90 95 100 Ile Asn Tyr Tyr Lys Ala Ile Asp Tyr Met Leu Leu Asn Asp Ser Ala 105 110 115 Lys Ala Arg Val Gln Phe Asn Arg Ala Asn Glu Arg Gln Arg Arg Ala Lys Glu Phe Tyr Tyr Glu Glu Val Gln Lys Ala Ile Lys Glu Ile Asp 140 145 150 Ser Ser Lys Lys His Asn Ile Asn Met Glu Arg Ser Arg Val Glu Val Ser Glu Ile Leu Asn Asn Thr Tyr Ser Asn Leu Asp Lys Tyr Glu Ala 170 175 180 Tyr Gln Gly Leu Leu Asn Pro Ala Val Ser Tyr Leu Ser Gly Leu Phe Tyr Ala Leu Asn Gly Asp Glu Asn Lys Gly Leu Gly Tyr Leu Asn Glu Ala Tyr Gly Ile Ser Gln Ser Pro Phe Val Ala Gln Asp Leu Val Phe Phe Lys Asn Pro Asn Arg Ser His Phe Thr Trp Ile Ile Ile Glu Asp Gly Lys Glu Pro Gln Lys Ser Glu Phe Lys Ile Asp Val Pro Ile Phe Met Ile Asp Ser Val Tyr Asn Val Ser Ile Ala Leu Pro Lys Leu Glu Lys Gly Glu Ala Phe Tyr Gln Asn Phe Thr Leu Lys Asp Gly Glu Lys Val Thr Pro Phe Asp Thr Leu Ala Ser Ile Asp Ala Val Val Ala Ser Glu Phe Arg Lys Gln Leu Pro Tyr Ile Ile Thr Arg Ala Ile Leu Ser Ala Thr Phe Lys Val Gly Met Gln Ala Val Ala Asn Tyr Tyr Leu Gly Phe Val Gly Gly Leu Val Thr Ser Leu Tyr Ser Gly Val Ser Thr Phe Ala Asp Thr Arg Ser Thr Ser Ile Phe Ala His Lys Ile Tyr Leu Met Arg Ile Lys Asn Lys Ala Phe Glu Ser Tyr Glu Val Arg Ala Asp Ser Ile Asp Ala Phe Ser Phe Ser Leu Lys Pro Cys Lys Arg Ser Leu Glu Ser Pro Lys Ile Ile Asp Ala Arg Glu Leu Leu Ser Gly Phe Val Ala Ala Pro Gln Ile Phe Cys Ser Asn Arg His Asn Ile Leu Tyr Val Arg Ser Phe Lys Asn Gly Phe Val Leu Ser Arg Leu Lys (2) INFORMATION FOR SEQ ID NO:129:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 435 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1...432 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:129:
ATG TTA GAA AAA TTG ATT GAA AGA GTG TTG TTT GCC ACT CGT TGG TTG 48 Met Leu Glu Lys Leu Ile Glu Arg Val Leu Phe Ala Thr Arg Trp Leu CTA GCC CCT TTA TGT ATT GCC ATG TCG TTA GTG CTG GTG GTT TTA GGC 96 Leu Ala Pro Leu Cys Ile Ala Met Ser Leu Val Leu Val Val Leu Gly TAT GTG TTC ATG AAA GAG TTG TGG CAC ATG CTC AGC CAT TTA AAC ACG 144 Tyr Val Phe Met Lys Glu Leu Trp His Met Leu Ser His Leu Asn Thr ATC AGC GAA ACG GAT TTG GTT TTA TCA GCC TTA GGA TTA GTG GAT TTG 192 Ile Ser Glu Thr Asp Leu Val Leu Ser Ala Leu Gly Leu Val Asp Leu TTG TTT ATG GCC GGG CTT GTT TTA ATG GTG TTA CTC GCC AGT TAT GAA 240 Leu Phe Met Ala Gly Leu Val Leu Met Val Leu Leu Ala Ser Tyr Glu 65 70 75 80 Ser Phe Val Ser Lys Leu Asp Lys Val Asp Ala Ser Glu Ile Thr Trp CTA AAG CAC ACG GAT TTT AAC GCT TTA AAA TTA AAG GTT TCA CTC TCC 336 Leu Lys His Thr Asp Phe Asn Ala Leu Lys Leu Lys Val Ser Leu Ser Ile Val Ala Ile Ser Ala Ile Phe Leu Leu Lys Arg Tyr Met Ser Leu Glu Arg Cys Phe Ile Pro Ala Phe Pro Lys Asp Thr Pro Pro Ile Ala (2) INFORMATION FOR SEQ ID NO:130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 144 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:130:
Met Leu Glu Lys Leu Ile Glu Arg Val Leu Phe Ala Thr Arg Trp Leu Leu Ala Pro Leu Cys Ile Ala Met Ser Leu Val Leu Val Val Leu Gly Tyr Val Phe Met Lys Glu Leu Trp His Met Leu Ser His Leu Asn Thr Ile Ser Glu Thr Asp Leu Val Leu Ser Ala Leu Gly Leu Val Asp Leu Leu Phe Met Ala Gly Leu Val Leu Met Val Leu Leu Ala Ser Tyr Glu Ser Phe Val Ser Lys Leu Asp Lys Val Asp Ala Ser Glu Ile Thr Trp Leu Lys His Thr Asp Phe Asn Ala Leu Lys Leu Lys Val Ser Leu Ser 100 105 110 Ile Val Ala Ile Ser Ala Ile Phe Leu Leu Lys Arg Tyr Met Ser Leu Glu Arg Cys Phe Ile Pro Ala Phe Pro Lys Asp Thr Pro Pro Ile Ala 130 135 140 (2) INFORMATION FOR SEQ ID NO:131:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2234 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 213...2081 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 213...273 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:131:
GCTCACAATT GAGCTAAAGC CCGCTTTTTA GGGATAAATA AAAAGCGTTT TCAAATTGCA 120 TGGGTAACTT TATGGGGCGA AGCGTTTCTA AATTTTGGTA TAATCGCTAG AAATTGTGAG 180 CGT TTG
Met Arg Leu Leu Leu Trp Trp GTA TTG GTA TTA TCG CTC TTT TTA AAT CCT TTG AGA GCG GTT GAA GAG 281 Val Leu Val Leu Ser Leu Phe Leu Asn Pro Leu Arg Ala Val Glu Glu CAT GAA ACA GAT GCG GTG GAT TTG TTT TTG ATT TTC AAT CAA ATC AAC 329 His Glu Thr Asp Ala Val Asp Leu Phe Leu Ile Phe Asn Gln Ile Asn CAG CTC AAT CAA GTC ATT GAA ACT TAC AAA AAA AAC CCT GAA AGA AGC 377 Gln Leu Asn Gln Val Ile Glu Thr Tyr Lys Lys Asn Pro Glu Arg Ser Ala Glu Ile Ser Leu Tyr Asn Thr Gln Lys Asn Asp Leu Ile Lys Ser Leu Thr Ser Lys Val Leu Asn Glu Arg Asp Lys Ile Gly Ile Asp Ile 55 60 65 Asn Gln Asn Leu Lys Glu Gln Glu Lys Ile Lys Lys Arg Leu Ser Lys Ser Ile Asn Gly Asp Asp Phe Tyr Thr Phe Met Lys Asp Arg Leu Ser Leu Asp Ile Leu Leu Ile Asp Glu Ile Leu Tyr Arg Phe Ile Asp Lys 100 105 110 115 Ile Arg Ser Ser Ile Asp Ile Phe Ser Glu Gln Lys Asp Val Glu Ser Ile Ser Asp Ala Phe Leu Leu Arg Leu Gly Gln Phe Lys Leu Tyr Thr Phe Pro Lys Asn Leu Gly Asn Val Lys Met His Glu Leu Glu Gln Met Phe Ser Asp Tyr Glu Leu Arg Leu Asn Thr Tyr Thr Glu Val Leu Arg Tyr Ile Lys Asn His Pro Lys Glu Val Leu Pro Lys Asn Leu Ile Met Glu Val Asn Met Asp Phe Val Leu Asn Lys Ile Ser Lys Val Leu Pro 200 205 210 Phe Thr Thr His Ser Leu Gln Val Ser Lys Ile Val Leu Ala Leu Thr Ile Leu Ala Leu Leu Leu Gly Leu Arg Lys Leu Ile Thr Trp Leu Leu Ala Leu Leu Leu Asp Arg Ile Phe Glu Ile Met Gln Arg Asn Lys Lys Met His Val Asn Val Gln Lys Ser Ile Val Ser Pro Val Ser Val Phe TTA GCC CTA TTT AGT TGC GAT GTG GCT TTA GAT ATT TTC TAC TAC CCT 1145 Leu Ala Leu Phe Ser Cys Asp Val Ala Leu Asp Ile Phe Tyr Tyr Pro Asn Ala Ser Pro Pro Lys Val Ser Met Trp Val Gly Ala Val Tyr Ile Met Leu Leu Ala Trp Leu Val Ile Ala Leu Phe Lys Gly Tyr Gly Glu 310 315 320 Ala Leu Val Thr Asn Met Ala Thr Lys Ser Thr His Asn Phe Arg Lys Glu Val Ile Asn Leu Ile Leu Lys Val Val Tyr Phe Leu Ile Phe Ile Val Ala Leu Leu Gly Val Leu Lys Gln Leu Gly Phe Asn Val Ser Ala Ile Ile Ala Ser Leu Gly Ile Gly Gly Leu Ala Val Ala Leu Ala Val Lys Asp Val Leu Ala Asn Phe Phe Ala Ser
Val Ile Leu Leu Leu Asp Asn Ser Phe Ser Gln Gly Asp Trp Ile Val Cys Gly Glu Val Glu Gly ACG GTG GTG GAA ATG GGG TTA AGG CGC ACC ACG ATC AGA GCC TTT GAC 1577 Thr Val Val Glu Met Gly Leu Arg Arg Thr Thr Ile Arg Ala Phe Asp Asn Ala Leu Leu Ser Val Pro Asn Ser Glu Leu Ala Gly Lys Pro Ile Arg Asn Trp Ser Arg Arg Lys Val Gly Arg Arg Ile Lys Met Glu Ile GGC TTA ACT TAT AGC TCC AGT CAA AGC GCT TTA CAG CTT TGC GTG AAA 1721 Gly Leu Thr Tyr Ser Ser Ser Gln Ser Ala Leu Gln Leu Cys Val Lys Asp Ile Lys Glu Met Leu Glu Asn His Pro Lys Ile Ala Asn Gly Ala Asp Ser Ala Leu Gln Asn Val Ser Asp Tyr Arg Tyr Met Phe Lys Lys 500 505 510 515 Asp Ile Val Ser Ile Asp Asp Phe Leu Gly Tyr Lys Asn Asn Leu Phe GTC TTT TTA GAT CAG TTT GCG GAC AGC TCT ATT AAT ATT TTA GTG TAT 1913 Val Phe Leu Asp Gln Phe Ala Asp Ser Ser Ile Asn Ile Leu Val Tyr Cys Phe Ser Lys Thr Val Val Trp Glu Glu Trp Leu Glu Val Lys Glu Asp Val Met Leu Lys Ile Met Gly Ile Val Glu Lys His His Leu Ser GAA GTG ATC AAC TTG ATT TTA AAA GT
Phe Ala Phe Pro Ser Gln Ser Leu Tyr Val Glu Ser Leu Pro Glu Val Ser Leu Lys Glu Gly Ala Lys Ile (2) INFORMATION FOR SEQ ID NO:132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 623 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...20 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:132:
Met Arg Leu Leu Leu Trp Trp Val Leu Val Leu Ser Leu Phe Leu Asn Pro Leu Arg Ala Val Glu Glu His Glu Thr Asp Ala Val Asp Leu Phe Leu Ile Phe Asn Gln Ile Asn Gln Leu Asn Gln Val Ile Glu Thr Tyr Lys Lys Asn Pro Glu Arg Ser Ala Glu Ile Ser Leu Tyr Asn Thr Gln Lys Asn Asp Leu Ile Lys Ser Leu Thr Ser Lys Val Leu Asn Glu Arg Asp Lys Ile Gly Ile Asp Ile Asn Gln Asn Leu Lys Glu Gln Glu Lys Ile Lys Lys Arg Leu Ser Lys Ser Ile Asn Gly Asp Asp Phe Tyr Thr Phe Met Lys Asp Arg Leu Ser Leu Asp Ile Leu Leu Ile Asp Glu Ile Leu Tyr Arg Phe Ile Asp Lys Ile Arg Ser Ser Ile Asp Ile Phe Ser 110 115 120 Glu Gln Lys Asp Val Glu Ser Ile Ser Asp Ala Phe Leu Leu Arg Leu Gly Gln Phe Lys Leu Tyr Thr Phe Pro Lys Asn Leu Gly Asn Val Lys Met His Glu Leu Glu Gln Met Phe Ser Asp Tyr Glu Leu Arg Leu Asn 160 165 170 Thr Tyr Thr Glu Val Leu Arg Tyr Ile Lys Asn His Pro Lys Glu Val 175 180 185 Leu Pro Lys Asn Leu Ile Met Glu Val Asn Met Asp Phe Val Leu Asn Lys Ile Ser Lys Val Leu Pro Phe Thr Thr His Ser Leu Gln Val Ser 205 210 215 220 Lys Ile Val Leu Ala Leu Thr Ile Leu Ala Leu Leu Leu Gly Leu Arg Lys Leu Ile Thr Trp Leu Leu Ala Leu Leu Leu Asp Arg Ile Phe Glu Ile Met Gln Arg Asn Lys Lys Met His Val Asn Val Gln Lys Ser Ile 255 260 265 Val Ser Pro Val Ser Val Phe Leu Ala Leu Phe Ser Cys Asp Val Ala Leu Asp Ile Phe Tyr Tyr Pro Asn Ala Ser Pro Pro Lys Val Ser Met Trp Val Gly Ala Val Tyr Ile Met Leu Leu Ala Trp Leu Val Ile Ala Leu Phe Lys Gly Tyr Gly Glu Ala Leu Val Thr Asn Met Ala Thr Lys Ser Thr His Asn Phe Arg Lys Glu Val Ile Asn Leu Ile Leu Lys Val Val Tyr Phe Leu Ile Phe Ile Val Ala Leu Leu Gly Val Leu Lys Gln Leu Gly Phe Asn Val Ser Ala Ile Ile Ala Ser Leu Gly Ile Gly Gly Leu Ala Val Ala Leu Ala Val Lys Asp Val Leu Ala Asn Phe Phe Ala Ser Val Ile Leu Leu Leu Asp Asn Ser Phe Ser Gln Gly Asp Trp Ile Val Cys Gly Glu Val Glu Gly Thr Val Val Glu Met Gly Leu Arg Arg Thr Thr Ile Arg Ala Phe Asp Asn Ala Leu Leu Ser Val Pro Asn Ser Glu Leu Ala Gly Lys Pro Ile Arg Asn Trp Ser Arg Arg Lys Val
Gly Arg Arg Ile Lys Met Glu Ile Gly Leu Thr Tyr Ser Ser Ser Gln Ser Ala Leu Gln Leu Cys Val Lys Asp Ile Lys Glu Met Leu Glu Asn His Pro Lys Ile Ala Asn Gly Ala Asp Ser Ala Leu Gln Asn Val Ser Asp Tyr Arg Tyr Met Phe Lys Lys Asp Ile Val Ser Ile Asp Asp Phe Leu Gly Tyr Lys Asn Asn Leu Phe Val Phe Leu Asp Gln Phe Ala Asp Ser Ser Ile Asn Ile Leu Val Tyr Cys Phe Ser Lys Thr Val Val Trp Glu Glu Trp Leu Glu Val Lys Glu Asp Val Met Leu Lys Ile Met Gly Ile Val Glu Lys His His Leu Ser Phe Ala Phe Pro Ser Gln Ser Leu Tyr Val Glu Ser Leu Pro Glu Val Ser Leu Lys Glu Gly Ala Lys Ile (2) INFORMATION FOR SEQ ID NO:133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1...429 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 1...93 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 94...429 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:133:
Met Lys Lys Phe Phe Ser Gln Ser Leu Leu Ala Leu Ile Val Ser Met Asn Ala Leu Leu Ala Met Asp Gly Asn Gly Val Phe Leu Gly Ala Gly Tyr Leu Gln Gly Gln Ala Gln Met His Ala Asp Ile Asn Ser Gln Lys Gln Ala Thr Asn Ala Thr Ile Lys Gly Phe Asp Ala Leu Leu Gly Tyr Gln Phe Phe Phe Gly Lys Tyr Phe Gly Leu Arg Ala Tyr Gly Phe Phe Asp Tyr Ala His Ala Asn Ser Ile Arg Leu Lys Asn Pro Asn Tyr Asn Ser Glu Val Ala Gln Leu Ala Gly Gln Ile Leu Gly Lys Gln Glu Ile Asn Arg Leu Thr Ser Leu Ala Asp Pro Lys Thr Phe Glu Pro Asn Met Leu Thr Tyr Gly Gly Ala Met Asp Leu Met Val Asn Val His Gln (2) INFORMATION FOR SEQ ID NO:134:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:134:
Met Lys Lys Phe Phe Ser Gln Ser Leu Leu Ala Leu Ile Val Ser Met Asn Ala Leu Leu Ala Met Asp Gly Asn Gly Val Phe Leu Gly Ala Gly Tyr Leu Gln Gly Gln Ala Gln Met His Ala Asp Ile Asn Ser Gln Lys Gln Ala Thr Asn Ala Thr Ile Lys Gly Phe Asp Ala Leu Leu Gly Tyr Gln Phe Phe Phe Gly Lys Tyr Phe Gly Leu Arg Ala Tyr Gly Phe Phe Asp Tyr Ala His Ala Asn Ser Ile Arg Leu Lys Asn Pro Asn Tyr Asn Ser Glu Val Ala Gln Leu Ala Gly Gln Ile Leu Gly Lys Gln Glu Ile Asn Arg Leu Thr Ser Leu Ala Asp Pro Lys Thr Phe Glu Pro Asn Met Leu Thr Tyr Gly Gly Ala Met Asp Leu Met Val Asn Val His Gln 100 105 110 (2) INFORMATION FOR SEQ ID NO:135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 336 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1...333 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide (B) LOCATION: 1...60 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide (B) LOCATION: 61...333 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:135:
Met Lys Thr Phe Lys Asn Leu Leu Cys Phe Ser Leu Ile Ala Met Ser Trp Leu Gln Ala Asp Met Leu Asp Asn Phe Thr Arg Ala Ile Asn Ser Tyr Thr Thr Lys Lys Leu Asn Glu Ile Lys Asp Gln Val Asn Ser Ala AAC CCT ACT AAA AAT CAC AAT ACC ACT TAT AAC GCT AAT GGC ATG CTC 192 Asn Pro Thr Lys Asn His Asn Thr Thr Tyr Asn Ala Asn Gly Met Leu Ile Asn Ile Asp Cys Lys Val Leu Lys Asn Asn Phe Tyr Ser Val Cys Tyr Ser Ser Glu Leu Lys Asn Pro Ile Tyr Gly Val Ser Val Leu Phe Gly Asp Leu Val Asp Lys Asn Asn Ile Glu Lys Arg Tyr Glu Phe (2) INFORMATION FOR SEQ ID NO:136:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:136:
Met Lys Thr Phe Lys Asn Leu Leu Cys Phe Ser Leu Ile Ala Met Ser Trp Leu Gln Ala Asp Met Leu Asp Asn Phe Thr Arg Ala Ile Asn Ser Tyr Thr Thr Lys Lys Leu Asn Glu Ile Lys Asp Gln Val Asn Ser Ala Asn Pro Thr Lys Asn His Asn Thr Thr Tyr Asn Ala Asn Gly Met Leu Ile Asn Ile Asp Cys Lys Val Leu Lys Asn Asn Phe Tyr Ser Val Cys Tyr Ser Ser Glu Leu Lys Asn Pro Ile Tyr Gly Val Ser Val Leu Phe Gly Asp Leu Val Asp Lys Asn Asn Ile Glu Lys Arg Tyr Glu Phe (2) INFORMATION FOR SEQ ID NO:137:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2185 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 81...2069 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 81...144 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:137:
GTAAAAAATG GCTTATCTGT TCTAGCCTAC TCCCCTTATT TTTTCTTAAT CCCTTAGCGG 60 CAGAAGATGA TGGGTTTTTT ATG GGG GTG AGT TAT CAA ACT TCT CTA GCT 110 Met Gly Val Ser Tyr Gln Thr Ser Leu Ala Ile Gln Arg Val Asp Asn Ser Gly Leu Asn Ala Ser Gln Ala Ala Ser Thr Tyr Ile Arg Gln Asn Ala Ile Ala Leu Glu Ser Ala Ala Val Pro Leu Ala Tyr Tyr Leu Glu Ala Met Gly Gln Gln Thr Arg Val Leu Met Gln Met Leu Cys Pro Asp Pro Ser Lys Arg Cys Leu Leu Tyr Ala Gly GGT TAT AAA AAC GGA TCA AGT AAT ACT AAC GGC GAT ACA GGC AAC AAC 350 Gly Tyr Lys Asn Gly Ser Ser Asn Thr Asn Gly Asp Thr Gly Asn Asn Pro Pro Arg Gly Asn Val Asn Ala Thr Phe Asp Met Gln Ser Leu Val Asn Asn Leu Asn Lys Leu Thr Gln Leu Ile Gly Glu Thr Leu Ile Arg Asn Pro Glu Asn Leu Ser Asn Ala Lys Val Phe Asn Val Lys Phe Gly 105 110 115 Asn Gln Ser Thr Val Ile Ala Leu Pro Glu Gly Leu Ala Asn Thr Met 120 125 130 Asn Ala Leu Asn Asp Asp Ile Thr Asn Ala Leu Thr Thr Leu Trp Tyr Asn Gln Thr Leu Thr Asn Lys Ser Phe Asn Ser Gly Asn Ser Val Asn Phe Ser Pro Gln Val Leu Gln His Leu Leu Gln Asp Gly Leu Ala Thr 170 175 180 Ser Asn Gln Thr Ile Cys Ser Thr Gln Asn Gln Cys Thr Ala Thr Asn Glu Ala Lys Ser Ile Ala Gln Asn Ala Gln Asn Ile Phe Gln Ala Leu Met Gln Ala Gly Ile Leu Gly Gly Leu Ala Asn Glu Lys Gln Phe Gly Phe Thr Tyr Asn Lys Ala Pro Asn Gly Ser Asp Ser Gln Gln Gly Tyr CAA AGC TTT AGC GGC CCG GGT TAT TAC ACT AAA AAC GGC GCT AAT GGC 926 Gln Ser Phe Ser Gly Pro Gly Tyr Tyr Thr Lys Asn Gly Ala Asn Gly Thr Thr Gln Ala Pro Leu Lys Ala Leu Pro Ala Gly Ala Thr Ile Gly Ser Gly Asn Gly Gln Tyr Thr Tyr His Pro Ser Ser Ala Val Tyr Tyr TTA GCC GAT AGC ATC ATT GCT AAT GGC ATC ACC GCT TCT ATG ATT TTT 1070 Leu Ala Asp Ser Ile Ile Ala Asn Gly Ile Thr Ala Ser Met Ile Phe Ser Gly Met Gln Asn Phe Ala Asn Lys Ala Ala Lys Leu Thr Gly Thr Ser Ser Tyr Ser Gln Met Gln Asp Ala Ile Asn Tyr Gly Glu Ser Leu Leu Ser Asn Thr Val Ala Tyr Gly Asp Phe Ile Thr Asn Trp Val Ala Pro Tyr Leu Asp Leu Asn Asn Lys Gly Leu Asn Phe Leu Pro Ser Tyr Gly Gly
Gln Leu Asn Gly Ala Asn His Gln Thr Pro Gln Leu Thr Pro CAA CAA GCC CAA CAA GAG CAA AAA GTC ATC ATG AAC CAA CTA GAG CAA 1358 Gln Gln Ala Gln Gln Glu Gln Lys Val Ile Met Asn Gln Leu Glu Gln GCC ACA AAC GCC CCC ACC CCC GCG CAA ATA AAC AGG ATT TTA GCC AAC 1406 Ala Thr Asn Ala Pro Thr Pro Ala Gln Ile Asn Arg Ile Leu Ala Asn Pro Tyr Ser Pro Thr Ala Lys Thr Leu Met Ala Tyr Gly Leu Tyr Arg Ser Lys Ala Val Ile Gly Gly Val Ile Asp Glu Met Gln Thr Lys Val Asn Gln Val Tyr Gln Met Gly Phe Ala Arg Asn Phe Leu Glu His Asn Ser Asn Ser Asn Asn Met Asn Gly Phe Gly Val Lys Met Gly Tyr Lys Gln Phe Phe Gly Lys Lys Arg Met Phe Gly Leu Arg Tyr Tyr Gly Phe Tyr Asp Phe Gly Tyr Ala Gln Phe Gly Ala Glu Ser Ser Leu Val Lys Ala Thr Leu Ser Ser Tyr Gly Ala Gly Thr Asp Phe Leu Tyr Asn Val TTT ACC CGA AAA AGA GGG ACT GAA GCG ATA GAT ATC GGT TTT TTT GCC 1790 Phe Thr Arg Lys Arg Gly Thr Glu Ala Ile Asp Ile Gly Phe Phe Ala GGT ATC CAA CTT GCA GGG CAA ACT TGG AAA ACG AAT TTT TTA GAT CAA 1838 Gly Ile Gln Leu Ala Gly Gln Thr Trp Lys Thr Asn Phe Leu Asp Gln Val Asp Gly Asn His Leu Lys Pro Lys Asp Thr Ser Phe Gln Phe Leu Phe Asp Leu Gly Ile Arg Thr Asn Phe Ser Lys Ile Ala His Gln Lys Arg Ser Arg Phe Ser Gln Gly Ile Glu Phe Gly Leu Lys Ile Pro Val Leu Tyr His Thr Tyr Tyr Gln Ser Glu Gly Val Thr Ala Lys Tyr Arg Arg Ala Phe Ser Phe Tyr Val Gly Tyr Asn Ile Gly Phe (2) INFORMATION FOR SEQ ID NO:138:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 663 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence (B) LOCATION: 1...21 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:138:
Met Gly Val Ser Tyr Gln Thr Ser Leu Ala Ile Gln Arg Val Asp Asn Ser Gly Leu Asn Ala Ser Gln Ala Ala Ser Thr Tyr Ile Arg Gln Asn Ala Ile Ala Leu Glu Ser Ala Ala Val Pro Leu Ala Tyr Tyr Leu Glu Ala Met Gly Gln Gln Thr Arg Val Leu Met Gln Met Leu Cys Pro Asp Pro Ser Lys Arg Cys Leu Leu Tyr Ala Gly Gly Tyr Lys Asn Gly Ser Ser Asn Thr Asn Gly Asp Thr Gly Asn Asn Pro Pro Arg Gly Asn Val Asn Ala Thr Phe Asp Met Gln Ser Leu Val Asn Asn Leu Asn Lys Leu Thr Gln Leu Ile Gly Glu Thr Leu Ile Arg Asn Pro Glu Asn Leu Ser Asn Ala Lys Val Phe Asn Val Lys Phe Gly Asn Gln Ser Thr Val Ile Ala Leu Pro Glu Gly Leu Ala Asn Thr Met Asn Ala Leu Asn Asp Asp Ile Thr Asn Ala Leu Thr Thr Leu Trp Tyr Asn Gln Thr Leu Thr Asn 140 145 150 155 Lys Ser Phe Asn Ser Gly Asn Ser Val Asn Phe Ser Pro Gln Val Leu Gln His Leu Leu Gln Asp Gly Leu Ala Thr Ser Asn Gln Thr Ile Cys 175 180 185 Ser Thr Gln Asn Gln Cys Thr Ala Thr Asn Glu Ala Lys Ser Ile Ala Gln Asn Ala Gln Asn Ile Phe Gln Ala Leu Met Gln Ala Gly Ile Leu Gly Gly Leu Ala Asn Glu Lys Gln Phe Gly Phe Thr Tyr Asn Lys Ala Pro Asn Gly Ser Asp Ser Gln Gln Gly Tyr Gln Ser Phe Ser Gly Pro Gly Tyr Tyr Thr Lys Asn Gly Ala Asn Gly Thr Thr Gln Ala Pro Leu Lys Ala Leu Pro Ala Gly Ala Thr Ile Gly Ser Gly Asn Gly Gln Tyr Thr Tyr His Pro Ser Ser Ala Val Tyr Tyr Leu Ala Asp Ser Ile Ile Ala Asn Gly Ile Thr Ala Ser Met Ile Phe Ser Gly Met Gln Asn Phe Ala Asn Lys Ala Ala Lys Leu Thr Gly Thr Ser Ser Tyr Ser Gln Met Gln Asp Ala Ile Asn Tyr Gly Glu Ser Leu Leu Ser Asn Thr Val Ala Tyr Gly Asp Phe Ile Thr Asn Trp Val Ala Pro Tyr Leu Asp Leu Asn Asn Lys Gly Leu Asn Phe Leu Pro Ser Tyr Gly Gly Gln Leu Asn Gly Ala Asn His Gln Thr Pro Gln Leu Thr Pro Gln Gln Ala Gln Gln Glu Gln Lys Val Ile Met Asn Gln Leu Glu Gln Ala Thr Asn Ala Pro Thr Pro Ala Gln Ile Asn Arg Ile Leu Ala Asn Pro Tyr Ser Pro Thr Ala Lys Thr Leu Met Ala Tyr Gly Leu Tyr Arg Ser Lys Ala Val Ile Gly Gly Val Ile Asp Glu Met Gln Thr Lys Val Asn Gln Val Tyr Gln Met Gly Phe Ala Arg Asn Phe Leu Glu His Asn
Ser Asn Ser Asn Asn Met Asn Gly Phe Gly Val Lys Met Gly Tyr Lys Gln Phe Phe Gly Lys Lys Arg Met Phe Gly Leu Arg Tyr Tyr Gly Phe Tyr Asp Phe Gly Tyr Ala Gln Phe Gly Ala Glu Ser Ser Leu Val Lys Ala Thr Leu Ser Ser Tyr Gly Ala Gly Thr Asp Phe Leu Tyr Asn Val Phe Thr Arg Lys Arg Gly Thr Glu Ala Ile Asp Ile Gly Phe Phe Ala Gly Ile Gln Leu Ala Gly Gln Thr Trp Lys Thr Asn Phe Leu Asp Gln Val Asp Gly Asn His Leu Lys Pro Lys Asp Thr Ser Phe Gln Phe Leu Phe Asp Leu Gly Ile Arg Thr Asn Phe Ser Lys Ile Ala His Gln Lys Arg Ser Arg Phe Ser Gln Gly Ile Glu Phe Gly Leu Lys Ile Pro Val Leu Tyr His Thr Tyr Tyr 605 610 615 Gln Ser Glu Gly Val Thr Ala Lys Tyr Arg Arg Ala Phe Ser Phe Tyr Val Gly Tyr Asn Ile Gly Phe (2) INFORMATION FOR SEQ ID NO:139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1213 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...1160 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:139:
Met Gly Phe Xaa Leu Ala Leu Gly Tyr Leu Cys Leu Phe Ile Phe Val Leu Ser Ala Ser Leu Ile Ser Glu Lys Ala Leu Ser Lys Gln Tyr Leu Gln Thr GCT AAA GAT AAA ATC ACC TCT TTA AAG AAT TTA AAA GTC ATC GCC ATT 200 Ala Lys Asp Lys Ile Thr Ser Leu Lys Asn Leu Lys Val Ile Ala Ile ACC GGA AGC TTT GGG AAA ACC AGC ACC AAA AAT TTC TTG CTT CAA ATC 248 Thr Gly Ser Phe Gly Lys Thr Ser Thr Lys Asn Phe Leu Leu Gln Ile Leu Gln Thr Thr Phe Asn Ala His Ala Ser Pro Lys Ser Val Asn Thr CTT TTA GGG CTT GCG AAT GAT ATT AAT CAG AAT TTA GAC GAT AGG AGT 344 Leu Leu Gly Leu Ala Asn Asp Ile Asn Gln Asn Leu Asp Asp Arg Ser GAA ATC TAT ATC GCT GAA GCC GGG GCA AGG AAT AAG GGC GAT ATT AAA 392 Glu Ile Tyr Ile Ala Glu Ala Gly Ala Arg Asn Lys Gly Asp Ile Lys Glu Ile Thr Cys Leu Ile Glu Pro His Leu Val Val Val Ala Glu Val GGC GAA CAG CAT TTA GAA TAC TTT AAA ACT TTA GAA AAT ATT TGC GAG 488 Gly Glu Gln His Leu Glu Tyr Phe Lys Thr Leu Glu Asn Ile Cys Glu ACT AAA GCG GAA TTA TTG GAT TCC AAA CGC TTA GAA AAA GCC TTT TGT 536 Thr Lys Ala Glu Leu Leu Asp Ser Lys Arg Leu Glu Lys Ala Phe Cys Tyr Ser Val Glu Lys Ile Lys Pro Tyr Ala Pro Lys Asp Ser Pro Leu 165 170 175 Ile Asp Tyr Ser Ser Leu Val Lys Asn Ile Gln Ser Thr Leu Lys Gly Thr Ser Phe Glu Met Leu Ile Gly Ser Val Trp Glu Arg Phe Glu Thr Lys Val Leu Gly Glu Phe Ser Ala Tyr Asn Ile Ala Ser Ala Ile Leu Ile Ala Lys His Leu Gly Leu Glu Thr Glu Arg Ile Lys Arg Leu Val Leu Glu Leu Asn Pro Ile Ala His Arg Leu Gln Leu Leu Glu Val Asn Gln Lys Ile Ile Ile Asp Asp Ser Phe Asn Gly Asn Leu Lys Gly Met Leu Glu Gly Ile Arg Leu Ala Ser Leu His Lys Gly Arg Lys Val Ile Val Thr Pro Gly Leu Val Glu Ser Asn Thr Glu Ser Asn Glu Ala Leu Ala Gln Lys Ile Asp Gly Val Phe Asp Val Ala Ile Ile Thr Gly Glu Leu Asn Ser Lys Thr Ile Ala Ser Gln Leu Lys Thr Pro Gln Lys Ile Leu Leu Lys Asp Lys Ala Gln Leu Glu Asn Ile Leu Gln Ala Thr Thr Ile Gln Gly Asp Leu Ile Leu Phe Ala Asn Asp Ala Pro Asn Tyr Ile AGGAAATGAA CATGCAACAT TTATACGCTC CTTGGCGCGA AAGTTATTTG AA 1213 (2) INFORMATION FOR SEQ ID NO:140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 370 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:140:
Met Gly Phe Xaa Leu Ala Leu Gly Tyr Leu Cys Leu Phe Ile Phe Val Leu Ser Ala Ser Leu Ile Ser Glu Lys Ala Leu Ser Lys Gln Tyr Leu Gln Thr Ala Lys Asp Lys Ile Thr Ser Leu Lys Asn Leu Lys Val Ile Ala Ile Thr Gly Ser Phe Gly Lys Thr Ser Thr Lys Asn Phe Leu Leu Gln Ile Leu Gln Thr Thr Phe Asn Ala His Ala Ser Pro Lys Ser Val 65 70 75 80 Asn Thr Leu Leu Gly Leu Ala Asn Asp Ile Asn Gln Asn Leu Asp Asp Arg Ser Glu Ile Tyr Ile Ala Glu Ala Gly Ala Arg Asn Lys Gly Asp Ile Lys Glu Ile Thr Cys Leu Ile Glu Pro His Leu Val Val Val Ala 115 120 125 Glu Val Gly Glu Gln His Leu Glu Tyr Phe Lys Thr Leu Glu Asn Ile 130 135 140 Cys Glu Thr Lys Ala Glu Leu Leu Asp Ser Lys Arg Leu Glu Lys Ala 145 150 155 160 Phe Cys Tyr Ser Val Glu Lys Ile Lys Pro Tyr Ala Pro Lys Asp Ser Pro Leu Ile Asp Tyr Ser Ser Leu Val Lys Asn Ile Gln Ser Thr Leu Lys Gly Thr Ser Phe Glu Met Leu Ile Gly Ser Val Trp Glu Arg Phe Glu Thr Lys Val Leu Gly Glu Phe Ser Ala Tyr Asn Ile Ala Ser Ala Ile Leu Ile Ala Lys His Leu Gly Leu Glu Thr Glu Arg Ile Lys Arg Leu Val Leu Glu Leu Asn Pro Ile Ala His Arg Leu Gln Leu Leu Glu Val Asn Gln Lys Ile Ile Ile Asp Asp Ser Phe Asn Gly Asn Leu Lys Gly Met Leu Glu Gly Ile Arg Leu Ala Ser Leu His Lys Gly Arg Lys Val Ile Val Thr Pro Gly Leu Val Glu Ser Asn Thr Glu Ser Asn Glu Ala Leu Ala Gln Lys Ile Asp Gly Val Phe Asp Val Ala Ile Ile Thr 305 310 315 320 Gly Glu Leu Asn Ser Lys Thr Ile Ala Ser Gln Leu Lys Thr Pro Gln Lys Ile Leu Leu Lys Asp Lys Ala Gln Leu Glu Asn Ile Leu Gln Ala Thr Thr Ile Gln Gly Asp Leu Ile Leu Phe Ala Asn Asp Ala Pro Asn Tyr Ile (2) INFORMATION FOR SEQ ID NO:141:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 360 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 82...270 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:141:
AGACTTTTTT TGAATGAGTA A GGA GAA AAT ATT TTG TTC CAT AAA CTG ATC 111 Gly Glu Asn Ile Leu Phe His Lys Leu Ile Leu Thr Cys Phe Leu Ala Leu Val Ala Ile Thr Ile Gln Ala Cys Gly Tyr Lys Ala Pro Pro Phe Asn Glu Lys Pro Ala Lys Lys Thr Ser Asn Ser Ser Asn Ser Ser Met Gln Thr Pro Thr Asn Ser Thr Thr Pro Glu Phe Leu Asn Gln Pro (2) INFORMATION FOR SEQ ID NO:142:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 63 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:142:
Gly Glu Asn Ile Leu Phe His Lys Leu Ile Leu Thr Cys Phe Leu Ala 1 5 10 15 Leu Val Ala Ile Thr Ile Gln Ala Cys Gly Tyr Lys Ala Pro Pro Phe Asn Glu Lys Pro Ala Lys Lys Thr Ser Asn Ser Ser Asn Ser Ser Met Gln Thr Pro Thr Asn Ser Thr Thr Pro Glu Phe Leu Asn Gln Pro (2) INFORMATION FOR SEQ ID NO:143:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1024 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 115...921 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:143:
Met AAG AGA GTT AGA GAA CTT GTA AAA AAA CAT C'.CC GAG AAA AGC AGT GTG 165 Lys Arg Val Arg Glu Leu Val Lys Lys His Pro Glu Lys Ser Ser Val Ala Leu Val Val Leu Thr His Ala Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln F~la Glu Lys Glu Asn Gln ATC AAT TGG TGG AAA TAT TCA GGA TTA ACA F~TA GCG ACA AGT TTA TTA 309 Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr I:le Ala Thr Ser Leu Leu 50 55 6.0 65 Leu Ala Ala Cys Ser Val Gly Asp Ile Asp L~ys Gln Ile Glu Leu Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg P.sp Arg Ala Asn Lys Ser Gly Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ile Lys Glu Gln Lys l00 105 110 Asp Leu Val Lys Lys Ala Glu Gln Asn Cys Gln Glu Asn His Gly Gln Phe Phe Met Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala Ile Glu Val l30 135 140 145 Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro His Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu TCT CTG CCC TAT TCA CAA AAA GCT ATC GCT AAA CAA GTG AAT TTT TAC 74l Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asn Phe Tyr Arg Pro Ser Ser Val Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Gly Asn Glu Lys Pro Thr Ser Pro Pro Phe Asn Leu 5er Lys Pro Phe Val Arg Ser Lys Asn Ile Cys (2) INFORMATION FOR SEQ ID N0:144:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 269 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:144:
Met Lys Arg Val Arg Glu Leu Val Lys Lys His Pro Glu Lys Ser Ser Val Ala Leu Val Val Leu Thr His Ala Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser Leu Leu Leu Ala Ala Cys Ser Val Gly Asp Ile Asp Lys Gln Ile Glu Leu Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala Asn Lys Ser Gly Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ile Lys Glu Gln Lys Asp Leu Val Lys Lys Ala Glu Gln Asn Cys Gln Glu Asn His Gly 115 120 125 Gln Phe Phe Met Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro His Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asn Phe Tyr Arg Pro Ser Ser Val Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Gly Asn Glu Lys Pro Thr Ser Pro Pro Phe Asn Leu Ser Lys Pro Phe Val Arg Ser Lys Asn Ile Cys (2) INFORMATION FOR SEQ ID NO:145:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 669 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 88...603 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:145:
Met Phe Asp Lys Lys Leu Ser Ser Asn Asp Trp His Ile Gln Lys Val Glu Met Asn His Gln Val Tyr Asp Ile 15 20 25 Glu Thr Met Leu Ala Asp Ser Ala Phe Arg Glu His Glu Glu Glu Gln Asp Ser Ser Leu Asn Thr Ala Leu Pro Glu Asp Lys Thr Ala Ile Glu Ala Lys Glu Gln Glu Gln Lys Glu Lys Arg Lys Arg Trp Tyr Glu Leu Phe Lys Lys Lys Pro Lys Pro Lys Ser Ser Met Gly Glu Phe Val Phe Asp Gln Lys Glu Asn Arg Ile Tyr Gly Lys Gly Tyr Cys Asn Arg Tyr Phe Ala Ser Tyr Val Trp Gln Gly Asp Arg His Ile Gly Ile Glu Asp Ser Gly Ile Ser Arg Lys Val Cys Lys Asp Glu His Leu Met Ala Phe Glu Leu Glu Phe Met Glu Asn Phe Lys Gly Asn Phe Thr Val Thr Lys Gly Lys Asp Thr Leu Ile Leu Asp Asn Gln Lys Met Lys Ile Tyr Leu Lys Thr Pro 170 (2) INFORMATION FOR SEQ ID NO:146:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 172 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:146:
Met Phe Asp Lys Lys Leu Ser Ser Asn Asp Trp His Ile Gln Lys Val Glu Met Asn His Gln Val Tyr Asp Ile Glu Thr Met Leu Ala Asp Ser Ala Phe Arg Glu His Glu Glu Glu Gln Asp Ser Ser Leu Asn Thr Ala Leu Pro Glu Asp Lys Thr Ala Ile Glu Ala Lys Glu Gln Glu Gln Lys Glu Lys Arg Lys Arg Trp Tyr Glu Leu Phe Lys Lys Lys Pro Lys Pro Lys Ser Ser Met Gly Glu Phe Val Phe Asp Gln Lys Glu Asn Arg Ile Tyr Gly Lys Gly Tyr Cys Asn Arg Tyr Phe Ala Ser Tyr Val Trp Gln Gly Asp Arg His Ile Gly Ile Glu Asp Ser Gly Ile Ser Arg Lys Val 115 120 125 Cys Lys Asp Glu His Leu Met Ala Phe Glu Leu Glu Phe Met Glu Asn Phe Lys Gly Asn Phe Thr Val Thr Lys Gly Lys Asp Thr Leu Ile Leu Asp Asn Gln Lys Met Lys Ile Tyr Leu Lys Thr Pro (2) INFORMATION FOR SEQ ID NO:147:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1350 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 87...1280 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:147:
GATTGTTTTT TAAAAAAAGG TTGGTA ATG GAA TCA GTA AAA ACA GGA AAA ACA 1l3 Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Thr His Phe Lys Gln Val Ser Ala Ile Thr Asn Ile Ile Arg Ser Val Gly Gly Phe Phe Thr Lys Ile Ala Lys Arg Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Ala Ser Leu Leu Leu Ala Ala Cys Ser Ala Gly Asp Thr Asp Lys Gln Ile Glu Leu Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala Asn Lys Ser Gly Ile Glu Leu Glu Gln Glu Arg Gln Lys Thr Asn Lys Ser Gly Ile Glu Leu Ala Asn Ser Gln Ile Lys Ala Glu Gln Glu Arg Gln Lys Thr Glu Gln Glu Lys Gln Lys Ala l70 175 180 185 Asn Lys Ser Ala Ile Glu Leu Glu Gln G1n Lys Gln Lys Thr Ile Asn Thr Gln Arg Asp Leu Ile Lys Glu Gln Lys Asp Phe Ile Lys Glu Thr Glu Gln Asn Cys Gln Glu Asn His Asn Gln F~he Phe Ile Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala Ile Glu Val G'~lu Ala Glu Cys Lys Thr CCT AAA CCT GCA AAA ACC AAT CAA ACC CCT A,TC CAG CCA AAA CAC CTC 881 Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro His Ser Gln Arg Gly Ser Lys Ala Gln Glu Phe Ile Ala Tyr Leu Gln Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asn Phe Tyr Lys Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asp Leu Lys Pro Asp Pro G.Ln Ala His Leu Pro Thr Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met (2) INFORMATION FOR SEQ ID N0:148:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 398 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:148:
Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Thr His Phe Lys Gln Val Ser Ala Ile Thr Asn Ile Ile Arg Ser Val Gly Gly Phe Phe Thr Lys I1e Ala Lys Arg Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Tle Ala Ala Ser Leu Leu Leu Ala Ala Cys Ser Ala Gly Asp Thr Asp Lys Gln Ile Glu 115 l20 125 Leu Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala Asn Lys Ser Gly Ile Glu Leu Glu Gln Glu Arg Gln Lys Thr Asn Lys Ser 14S l50 155 160 Gly Ile Glu Leu Ala Asn Ser Gln Ile Lys Ala Glu Gln Glu Arg Gln Lys Thr Glu Gln Glu Lys Gln Lys Ala Asn Lys Ser Ala Ile Glu Leu Glu Gln Gln Lys Gln Lys Thr Ile Asn Thr Gln Arg Asp Leu Ile Lys Glu Gln Lys Asp Phe Ile Lys Glu Thr Glu Gln Asn Cys Gln Glu Asn 2l0 215 220 His Asn Gln Phe Phe Ile Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Glri Pro His Ser Gln Arg Gly Ser Lys Ala Gln Glu Phe Ile Ala Tyr Leu Gln Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asn Phe Tyr Lys Pro Ser Ser Tle Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asp Leu Lys Pro Asp Pro Gln Ala His Leu Pro Thr Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Vai Hsn Lys Glu Ile Glu A:la Val Ala Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met 385 390 3:95 (2) INFORMATION FOR SEQ ID N0:149:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 709 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 336...443 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:149:
TAAGGGATATTGCTAACGAT TAAGCTGTATTGGAAGAG'CTTATTTTGCAAGAATTAATCT60 TCTAAGATTACAAAGGGTAG CGTTTCTGTTTTTGGATT".CAGAGCGTTATTTTGATTGTTT240 ATG AAA
ACC ATT
Met. Lys Arg Asn Thr Ile AGC GTG TTT ATT GGA GCG TCT TTA CTC GGC GC~T TGC GCT AGC GTT GAG 401 Ser Val Phe Ile Gly Ala Ser Leu Leu Gly Gly Cys Ala Ser Val Glu GCT TAT TTT GAC GCT TTG CAT GTT GCT CGC G7.'T AAA GAC GCT TGTTTATAG 452 Ala Tyr Phe Asp Ala Leu His Val Ala Arg Val Lys Asp Ala AAAAAGAAGC ACACCACACG CCCAAAGACT TTGATAGCC:CTTACCACACT GACTAAACCG512 GCACTAGGTT TTAGTTGGGG GTTTTTAGGG GTGTTATT7.'TAGATACTCTC TGTTCCCTTA572 AAGAAAATAA ATTTCTACCA TAAAATAAAA TCTTAAAT7.'AAGGCGACTAA AACCCCACTT632 (2) INFORMATION FOR SEQ ID N0:15C1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:150:
Met Lys Thr Ile Arg Asn Ser Val Phe Ile Gly Ala Ser Leu Leu Gly Gly Cys Ala Ser Val Glu Ala Tyr Phe Asp Ala Leu His Val Ala Arg Val Lys Asp Ala (2) INFORMATION FOR SEQ ID NO:151:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 888 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 19...837 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:151:
Met Glu Phe Met Lys Lys Phe Val Ala Leu Gly Leu Leu Ser Ala Val Leu Ser Ser Ser Leu Leu Ala Glu Gly Asp Gly Val Tyr Ile Gly Thr Asn Tyr Gln Leu Gly Gln Ala Arg Leu Asn Ser Asn Ile Tyr Asn Thr Gly Asp Cys Thr Gly Ser Val Val Gly Cys Pro Pro Gly Leu Thr Ala Asn Lys His Asn Pro Gly Gly Thr Asn Ile Asn Trp His Ala Lys Tyr Ala Asn Gly Ala Leu Asn Gly Leu Gly Leu Asn Val Gly Tyr Lys Lys Phe Phe Gln Phe Lys Seer Phe Asp Met Thr Ser AAG TGG TTT GGT TTT AGA GTG TAT GGG CTT T'TT GAT TAT GGG CAT GCC 387 Lys Trp Phe Gly Phe Arg Val Tyr Gly Leu Plze Asp Tyr Gly His Ala Thr Leu Gly Lys Gln Val Tyr Ala Pro Asn L~Ts I1e Gln Leu Asp Met l25 130 135 GTC TCT TGG GGT GTG GGG AGC GAT TTG TTA GCT GAT ATT ATT GAT AAC 483 _ Val Ser Trp Gly Val Gly Ser Asp Leu Leu A:La Asp Ile Ile Asp Asn 140 145 1!i0 155 GAT AAC GCT TCT TTT GGT ATT TTT GGT GGG G'CC GCT ATC GGC GGT AAC 531 Asp Asn Ala Ser Phe Gly Ile Phe Gly Gly Val Ala Ile Gly Gly Asn ACT TGG AAA AGC TCA GCG GCA AAC TAT TGG AAP. GAG CAA ATC ATT GAA 579 Thr Trp Lys Ser Ser Ala Ala Asn Tyr Trp Lys Glu Gln Ile Ile Glu Ala Lys Gly Pro Asp Val Cys Thr Pro Thr T~rr Cys Asn Pro Asn Ala Pro Tyr Ser Thr Lys Thr Ser Thr Val Ala Phe Gln Val Trp Leu Asn Phe Gly Val Arg Ala Asn Ile Tyr Lys His Asn Gly Val Glu Phe Gly 220 225 2a0 235 GTG AGA GTG CCG CTA CTC ATC AAC AAG TTT T7.'G AGT GCG GGT CCT AAC 771 Val Arg Va1 Pro Leu Leu Ile Asn Lys Phe Leu Ser Ala Gly Pro Asn Ala Thr Asn Leu Tyr Tyr His Leu Lys Arg A:;p Tyr Ser Leu Tyr Leu Gly Tyr Asn Tyr Thr Phe CCTTATAAAA AGG ggg (2) INFORMATION FOR SEQ ID N0:15c.:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 273 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:152:
Met Glu Phe Met Lys Lys Phe Val Ala Leu Gly Leu Leu Ser Ala Val Leu Ser Ser Ser Leu Leu Ala Glu Gly Asp Gly Val Tyr Ile Gly Thr Asn Tyr Gln Leu Gly Gln Ala Arg Leu Asn Ser Asn Ile Tyr Asn Thr Gly Asp Cys Thr Gly Ser Val Val Gly Cys Pro Pro Gly Leu Thr Ala Asn Lys His Asn Pro Gly Gly Thr Asn Ile Asn Trp His Ala Lys Tyr Ala Asn Gly Ala Leu Asn Gly Leu Gly Leu Asn Val Gly Tyr Lys Lys Phe Phe Gln Phe Lys Ser Phe Asp Met Thr Ser Lys Trp Phe Gly Phe Arg Val Tyr Gly Leu Phe Asp Tyr Gly His Ala Thr Leu Gly Lys Gln Val Tyr Ala Pro Asn Lys Ile Gln Leu Asp Met Val Ser Trp Gly Val Gly Ser Asp Leu Leu Ala Asp Ile Ile Asp Asn Asp Asn Ala Ser Phe Gly Ile Phe Gly Gly Val Ala Ile Gly Gly Asn Thr Trp Lys Ser Ser Ala Ala Asn Tyr Trp Lys Glu Gln Ile Ile Glu Ala Lys Gly Pro Asp Val Cys Thr Pro Thr Tyr Cys Asn Pro Asn Ala Pro Tyr Ser Thr Lys Thr Ser Thr Val Ala Phe Gln Val Trp Leu Asn Phe Gly Val Arg Ala Asn Ile Tyr Lys His Asn Gly Val Glu Phe Gly Val Arg Val Pro Leu Leu Ile Asn Lys Phe Leu Ser Ala Gly Pro Asn Ala Thr Asn Leu Tyr Tyr His Leu Lys Arg Asp Tyr Ser Leu Tyr Leu Gly Tyr Asn Tyr Thr Phe (2) INFORMATION FOR SEQ ID N0:153:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 310 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 10...279 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:153:
Val Ala Val Lys Lys Ile Val Val Ser Trp Cys Val Ala Leu Ala Phe Leu Ser Ala Asp Ser Ala Gln Ala Asn Lys Ala Ile Ser Asn GCG GAT TTG ATT AAA GAG ATA AGG GAT TTA AAA AAA ATC ATC AGC GCG 147 Ala Asp Leu Ile Lys Glu Ile Arg Asp Leu Lys Lys Ile Ile Ser Ala Gln Asn Thr Glu Ile Asn Asn Leu Arg Lys Val Gln Glu Val Leu Ser GGG CAA TTA GGG GAC ATG CGT AAG GAT ATA TTA AGC ACT AGA GAT TAT 243 Gly Gln Leu Gly Asp Met Arg Lys Asp Ile Leu Ser Thr Arg Asp Tyr Cys Ile Ser Leu Arg Pro Tyr Ile Tyr Asn Trp Arg (2) INFORMATION FOR SEQ ID NO:154:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:154:
Val Ala Val Lys Lys Ile Val Val Ser Trp Cys Val Ala Leu Ala Phe Leu Ser Ala Asp Ser Ala Gln Ala Asn Lys Ala Ile Ser Asn Ala Asp Leu Ile Lys Glu Ile Arg Asp Leu Lys Lys Ile Ile Ser Ala Gln Asn Thr Glu Ile Asn Asn Leu Arg Lys Val Gln Glu Val Leu Ser Gly Gln Leu Gly Asp Met Arg Lys Asp Ile Leu Ser Thr Arg Asp Tyr Cys Ile Ser Leu Arg Pro Tyr Ile Tyr Asn Trp Arg (2) INFORMATION FOR SEQ ID NO:155:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 549 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 16...474 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:155:
Met Glu Gln Asn Ile Phe Ser Leu Leu Ile Gln Lys Lys Ser Tyr Lys Lys Leu Glu Thr Leu Leu Lys Leu Lys Lys Leu Lys GTT TTT ATG CCT TTA AGT TTA CAA GAA AAT TTG CTT TTT ATC TTC ATA 147 Val Phe Met Pro Leu Ser Leu Gln Glu Asn Leu Leu Phe Ile Phe Ile Lys Asp Ser Lys Leu Leu Phe Ala Phe Lys Asp Ile Trp Ala Ser Lys Glu Phe Asn Gln Arg Phe Ala Lys Glu Ile Ser His Phe Leu Asn Thr CAA GGG CAT GCT TAT GGG TTT GAC GGG TTG AAT GGG TTA GAA ATT TTA 291 Gln Gly His Ala Tyr Gly Phe Asp Gly Leu Asn Gly Leu Glu Ile Leu Gly Tyr Val Pro Lys Asp Ala Leu Lys Lys Ser Asn Phe Tyr Ala Pro 95 100 105 Ile Lys Lys Gln Ala Arg Phe Phe Arg Pro Ser Ala Leu Gly Leu Phe His Asn Pro Ile Lys Asp Ala Arg Leu His Glu Cys Phe Glu Lys Ala 125 130 135 140 Arg Ala Leu Ile His Tyr Gln Arg Ser Phe Phe Glu Glu TTATTGTCCA GTTTAAAAAA CCTTCCTAAC AGCAGTGGCG TGTATCAATA TTTTGATAAA 546 (2) INFORMATION FOR SEQ ID NO:156:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 153 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:156:
Met Glu Gln Asn Ile Phe Ser Leu Leu Ile Gln Lys Lys Ser Tyr Lys Lys Leu Glu Thr Leu Leu Lys Leu Lys Lys Leu Lys Val Phe Met Pro Leu Ser Leu Gln Glu Asn Leu Leu Phe Ile Phe Ile Lys Asp Ser Lys Leu Leu Phe Ala Phe Lys Asp Ile Trp Ala Ser Lys Glu Phe Asn Gln Arg Phe Ala Lys Glu Ile Ser His Phe Leu Asn Thr Gln Gly His Ala Tyr Gly Phe Asp Gly Leu Asn Gly Leu Glu Ile Leu Gly Tyr Val Pro Lys Asp Ala Leu Lys Lys Ser Asn Phe Tyr Ala Pro Ile Lys Lys Gln Ala Arg Phe Phe Arg Pro Ser Ala Leu Gly Leu Phe His Asn Pro Ile Lys Asp Ala Arg Leu His Glu Cys Phe Glu Lys Ala Arg Ala Leu Ile His Tyr Gln Arg Ser Phe Phe Glu Glu (2) INFORMATION FOR SEQ ID N0:157:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2627 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 18...2582 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:157:
Met Lys Ser Lys Lys Leu Tyr Leu Ala Leu Ile Ile Gly Val Leu Leu Ala Phe Leu Thr Leu Ser Ser Trp Leu Gly Asn AGC GGT TTA GTG GGG CGT TTT GGG GTG TGG TTT GCC GCA CTC AAT AAA 7.46 Ser Gly Leu Val Gly Arg Phe Gly Val Trp Phe Ala Ala Leu Asn Lys Lys Tyr Phe Gly His Leu Ser Phe Ile Asn Leu Pro Tyr Leu Ala Trp Val Leu Phe Leu Leu Tyr Lys Thr Lys Asn Pro Phe Thr Glu Ile Val Leu Glu Lys Thr Leu Gly His Leu Leu Gly Ile Leu Ser Leu Leu Phe Leu Gln Ser Ser Leu Leu Asn Gln Gly Glu Ile Gly Asn Ser Ala Arg Leu Phe Leu Arg Pro Phe Ile Gly Asp Phe Gly Leu Tyr Ala Leu Ile Thr Leu Met Val Val Ile Ser Tyr Leu Ile Leu Phe Lys Leu Pro Pro Lys Ser Val Phe Tyr Pro Tyr Met Asn Lys Thr Gln Asn Leu Leu Lys Glu Ile Tyr Lys Gln Cys Leu Gln Ala Phe Ser Pro Asn Phe Ser Pro Lys Lys Glu Gly Phe Glu Asn Thr Pro Ser Asp Ile Gln Lys Lys Glu ACC AAA AAC GAC AAt~ GAA AAA GAA AAC CGC AAF1 GAA AAC CCT ATT A&T 626 Thr Lys Asn Asp Lys Glu Lys Glu Asn Arg Lvs Glu Asn Pro Ile Asn GAA AAC CAC AAA ACC CCT AAC GAA GAA CCG T'.CT TTA GCG ATC CCT ACC 674 Glu Asn His Lys Thr Pro Asn Glu Glu Pro Phe Leu Ala Ile Pro Thr CCC TAT AAC ACG ACT TTA AAT GAT TCA GAG CC:G CAA GAA GGC TTA GTC 722 Pro Tyr Asn Thr Thr Leu Asn Asp Ser Glu Pro Gln Glu Gly Leu Val 220 225 2.l0 235 CAA ATT TCC TCC CAC CCC CCT ACC CAT TAC AC:C ATT TAC CCT AAA AGA 770 Gln Ile Ser Ser His Pro Pro Thr His Tyr Thr Ile Tyr Pro Lys Arg Asn Arg Phe Asp Asp Leu Thr Asn Pro Thr A:>n Pro Pro Leu Lys Glu ATT AAA CAA GAA ACT AAA GAA AGA GAA CCC AC:G CCT ACA AAA GAA ACT 866 Ile Lys Gln Glu Thr Lys Glu Arg Glu Pro Thr Pro Thr Lys Glu Thr CTT ACG CCC ACC ACG CCC AAA CCT ATC ATG CC,'C ACA CTT GCA CCC ATA 914 Leu Thr Pro Thr Thr Pro Lys Pro Ile Met Pro Thr Leu Ala Pro Ile Ile Glu Asn Asp Asn Lys Thr Glu Asn Gln Lys Thr Pro Asn His Pro 300 305 3l.0 315 Lys Lys Glu Glu Asn Pro Gln Glu Asn Thr Gl.n Glu Glu Met Ile Glu Gly Arg Ile Glu Glu Met Ile Lys Glu Asn Le:u Lys Lys Glu Glu Lys Glu Val Gln Asn Ala Pro Asn Phe Ser Pro Va.l Thr Pro Thr Ser Ala Lys Lys Pro Val Met Val 
Lys Glu Leu Ser Glu Asn Lys Glu Ile Leu GAC GGA TTG GAT TAT GGC GAA GTG CAA AAA CC'C AAA GAT TAT GAG CTT 1202 Asp Gly Leu Asp Tyr Gly Glu Val Gln Lys Pro Lys Asp Tyr Glu Leu 380 385 39'0 395 CCC ACC ACG CAA TTA TTG AAT GCG GTT TGT TT'G AAA GAC ACT TCT TTA 1250 Pro Thr Thr Gln Leu Leu Asn Ala Val Cys Leu Lys Asp Thr Ser Leu GAC GAA AAC GAG A'1"1' GAC CAA AAA ATC CAG GAT CTA TTG AGC AAA CTG 129B
Asp Glu Asn Glu Ile Asp Gln Lys Ile Gln Asp Leu Leu Ser Lys Leu Arg Thr Phe Lys Ile Asp Gly Asp Ile Ile Arg Thr Tyr Ser Gly Pro Ile Val Thr Thr Phe Glu Phe Arg Pro Ala Pro Asn Val Lys Val Ser Arg Ile Leu Gly Leu Ser Asp Asp Leu Ala Met Thr Leu Cys Ala Glu Ser Ile Arg Ile Gln Ala Pro Ile Lys Gly Lys Asp Val Val Gly Ile _ Glu Ile Pro Asn Ser Gln Ser Gln Ile Ile Tyr Leu Arg Glu Ile Leu Glu Ser Glu Leu Phe Gln Lys Ser Ser Ser Pro Leu Thr Leu Ala Leu Gly Lys Asp Ile Val Gly Asn Pro Phe Ile Thr Asp Leu Lys Lys Leu Pro His Leu Leu Ile Ala Gly Thr Thr Gly Ser Gly Lys Ser Val Gly Val Asn Ala Met Ile Leu Ser Leu Leu Tyr Lys Asn Pro Pro Asp Gln Leu Lys Leu Val Met Ile Asp Pro Lys Met Val Glu Phe Ser Ile Tyr Ala Asp Ile Pro His Leu Leu Thr Pro Ile Ile Thr Asp Pro Lys Lys Ala Ile Gly Ala Leu Gln Ser Val Ala Lys Glu Met Glu Arg Arg Tyr Ser Leu Met Ser Glu Tyr Lys Val Lys Thr Ile Asp Ser Tyr Asn Glu Gln Ala Pro Ser Asn Gly Val Glu Ala Phe Pro Tyr Leu Ile Val Val Ile Asp Glu Leu Ala Asp Leu Met Met Thr Gly Gly Lys Glu Ala Glu Phe Pro Ile Ala Arg Ile Ala Gln Met Gly Arg Ala Ser Gly Leu His CTC ATT GTA GCG ACC CAA CGC CCA AGC GTG G:AT GTC GTA ACC GGC TTG 2114 Leu Ile Val Ala Thr Gln Arg Pro Ser Val Asp Va1 Val Thr Gly Leu ATT AAA ACC AAC TTG CCT TCA AGG GTG AGT T'TT AGG GTA GGC ACT AAG 2162 Ile Lys Thr Asn Leu Pro Ser Arg Val Ser Phe Arg Val Gly Thr Lys 700 705 7'10 715 Ile Asp Ser Lys Val Tle Leu Asp Thr Asp G.ly Ala Gln Ser Leu Leu Gly Arg Gly Asp Met Leu Phe Thr Pro Pro G:ly Ala Asn Gly Leu Val Arg Leu His Ala Pro Phe Ala Thr Glu Asp G.lu Ile Lys Lys Ile Val GAT TTT ATT AAA GCC CAA AAA GAA GTA CAA Ti~C GAT AAA GAT TTC TTG 2354 Asp Phe Ile Lys Ala Gln Lys Glu Val Gln T,,rr Asp Lys Asp Phe Leu Leu Glu Glu Ser Arg Met Pro Leu Asp Thr Pro Asn Tyr Gln Gly Asp GAC ATT TTA GAA AGG GCT AAA GCG GTG ATT T'.CA GAA AAA AAG ATC ACT 2450 Asp Ile Leu Glu Arg Ala Lys Ala Val Ile Le.u Glu Lys Lys Ile Thr Ser Thr Ser Phe Leu Gln Arg Gln Leu Lys Ile Gly Tyr Asn Gln Ala GCT ACC ATT ACT GAC GAA TTA 
GAA GCT CAA GC~C TTT TTA TCC CCA AGA 2546 Ala Thr Ile Thr Asp Glu Leu Glu Ala Gln Gly Phe Leu Ser Pro Arg Asn Ala Lys G1y Rsn Arg Glu Ile Leu Gln Asn Phe TGGATATTGG CAAACAT'1'AU TTTTGATTT 2627 (2) INFORMATION FOR SEQ ID N0:158:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 855 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:158:
Met Lys Ser Lys Lys Leu Tyr Leu Ala Leu Ile Ile Gly Val Leu Leu Ala Phe Leu Thr Leu Ser Ser Trp Leu Gly Asn Ser Gly Leu Val Gly Arg Phe Gly Val Trp Phe Ala Ala Leu Asn Lys Lys Tyr Phe Gly His Leu Ser Phe Ile Asn Leu Pro Tyr Leu Ala Trp Val Leu Phe Leu Leu Tyr Lys Thr Lys Asn Pro Phe Thr Glu Ile Val Leu Glu Lys Thr Leu Gly His Leu Leu Gly Ile Leu Ser Leu Leu Phe Leu Gln Ser Ser Leu Leu Asn Gln Gly Glu Ile Gly Asn Ser Ala Arg Leu Phe Leu Arg Pro 100 l05 110 Phe Ile Gly Asp Phe Gly Leu Tyr Ala Leu Ile Thr Leu Met Val Val Ile Ser Tyr Leu Ile Leu Phe Lys Leu Pro Pro Lys Ser Val Phe Tyr l30 135 140 Pro Tyr Met Asn Lys Thr Gln Asn Leu Leu Lys Glu Ile Tyr Lys Gln Cys Leu Gln Ala Phe Ser Pro Asn Phe Ser Pro Lys Lys Glu Gly Phe Glu Asn Thr Pro Ser Asp Ile Gln Lys Lys Glu Thr Lys Asn Asp Lys Glu Lys Glu Asn Arg Lys Glu Asn Pro Ile Asn Glu Asn His Lys Thr Pro Asn Glu Glu Pro Phe Leu Ala Ile Pro Thr Pro Tyr Asn Thr Thr 210 2l5 220 Leu Asn Asp Ser Glu Pro Gln Glu Gly Leu Val Gln Ile Ser Ser His Pro Pro Thr His Tyr Thr Ile Tyr Pro Lys Arg Asn Arg Phe Asp Asp Leu Thr Asn Pro Thr Asn Pro Pro Leu Lys Glu Ile Lys Gln Glu Thr Lys Glu Arg Glu Pro Thr Pro Thr Lys Glu Thr Leu Thr Pro Thr Thr Pro Lys Pro Ile Met Pro Thr Leu Ala Pro Ile Ile Glu Asn Asp Asn Lys Thr Glu Asn Gln Lys Thr Pro Asn His Pro Lys Lys Glu Glu Asn Pro Gln Glu Asn Thr Gln Glu Glu Met Ile Glu Gly Arg Ile Glu Glu Met Ile Lys V1u tan Leu Lys Lys Glu Glu Lys Glu Val Gln Asn Ala Pro Asn Phe Ser Pro Val Thr Pro Thr Ser Ala Lys Lys Pro Val Met _ 355 360 365 Val Lys Glu Leu Ser Glu Asn Lys Glu Ile L~~u Asp Gly Leu Asp Tyr Gly Glu Val Gln Lys Pro Lys Asp Tyr Glu L~~u Pro Thr Thr Gln Leu 385 390 3.9S 400 Leu Asn Ala Val Cys Leu Lys Asp Thr Ser Leu Asp Glu Asn Glu Ile Asp Gln Lys Ile Gln Asp Leu Leu Ser Lys Leu Arg Thr Phe Lys Ile Asp Gly Asp Ile Ile Arg Thr Tyr Ser Gly P:ro Ile Val Thr Thr Phe Glu Phe Arg Pro Ala Pro Asn Val Lys Val Ser Arg Ile Leu Gly Leu Ser Asp Asp Leu Ala Met Thr Leu Cys Ala G.Lu Ser Ile Arg Ile Gln 465 470 
4'75 480 Ala Pro Ile Lys Gly Lys Asp Val Val Gly I:Le Glu Ile Pro Asn Ser Gln Ser Gln Ile Ile Tyr Leu Arg Glu Ile Lc=_u Glu Ser Glu Leu Phe Gln Lys Ser Ser Ser Pro Leu Thr Leu Ala L<~u Gly Lys Asp Ile Val Gly Asn Pro Phe Ile Thr Asp Leu Lys Lys Leu Pro His Leu Leu Ile Ala Gly Thr Thr Gly Ser Gly Lys Ser Val Gly Val Asn Ala Met Ile Leu Ser Leu Leu Tyr Lys Asn Pro Pro Asp Gln Leu Lys Leu Val Met Ile Asp Pro Lys Met Val Glu Phe Ser Ile Tyr Ala Asp Ile Pro His Leu Leu Thr Pro Ile Ile Thr Asp Pro Lys Lys Ala Ile Gly Ala Leu Gln Ser Val Ala Lys Glu Met Glu Arg Arg Tyr Ser Leu Met Ser Glu 6l0 615 620 Tyr Lys Val Lys Thr Ile Asp Ser Tyr Asn GJ_u Gln Ala Pro Ser Asn 625 630 6.S5 640 Gly Val Glu Ala Phe Pro Tyr Leu Ile Val Val Ile Asp Glu Leu Ala Asp Leu Met Met Thr Gly Gly Lys Glu Ala GJ.u Phe Pro Ile Ala Arg Ile Ala Gln Met Gly Arg Ala Ser Gly Leu His Leu Ile Val Ala Thr Gln Arg Pro Ser Val Asp Val Val Thr Gly Le'u Ile Lys Thr Asn Leu Pro Ser Arg Val Ser Phe Arg Val Gly Thr Lys Ile Asp Ser Lys Val 705 710 77.5 720 Ile Leu Asp Thr Asp Gly Ala Gln Ser Leu Leu Gly Arg Gly Asp Met Leu Phe Thr Pro Pro Gly Ala Asn Gly Leu Val Arg Leu His Ala Pro Phe Ala Thr Glu Asp Glu Ile Lys Lys Ile Val Asp Phe Ile Lys Ala Gln Lys Glu Val Gln Tyr Asp Lys Asp Phe Leu Leu Glu Glu Ser Arg Met Pro Leu Asp Thr Pro Asn Tyr Gln Gly Asp Asp Ile Leu Glu Arg Ala Lys Ala Val Ile Leu Glu Lys Lys Ile Thr Ser Thr Ser Phe Leu Gln Arg Gln Leu Lys Ile Gly Tyr Asn Gln Ala Ala Thr Ile Thr Asp Glu Leu Glu Ala Gln Gly Phe Leu Ser Pro Arg Asn Ala Lys Gly Asn Arg Glu Ile Leu Gln Asn Phe (2) INFORMATION FOR SEQ ID N0:159:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1956 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 56...1945 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:159:
Met Gln Asp Ser Leu His Phe Lys Val Asn Glu Val Gln Gly Val Leu Glu Asn Thr Tyr Thr Ser Met Gly Ile Val Lys Glu Met Leu Pro Lys Asp Thr Lys Arg Glu Ile Lys Ile Gly Leu Leu Lys Asn Phe Ile Leu Ala Asn Ser His Val Ala Gly Val Ser Met Phe Phe Lys Gly Arg Glu Asp Leu Arg Leu Thr Leu Leu Arg Asp Asn Asn Thr Ile Lys Leu Val Glu AAT CCG TCA T'1'A GAU hAT AGC CCT TTA GCG CA;4 AAA GCG ATG AAA AAT 346 Asn Pro Ser Leu Glu Asn Ser Pro Leu Ala Gl:n Lys Ala Met Lys Asn AAA GAA ATT TCT AAA AGT TTG GGT TAT TAT AGc3 AAA ATG CCT AAT GGG 394 Lys Glu Ile Ser Lys Ser Leu Gly Tyr Tyr Arg Lys Met Pro Asn Gly GCG GAA GTT TAT GGG GTG GAT ATT CTT TTA CC'T TTA TTG AAT GAG AAC 442 Ala Glu Val Tyr Gly Val Asp Ile Leu Leu Pro Leu Leu Asn Glu Asn GCT CAA GAG GTT GTA GGG GCT TTG ATG ATT TT'r ATT TCC ATT GAC AGC 490 Ala Gln Glu Val Val Gly Ala Leu Met Ile Phe Ile Ser Ile Asp Ser TTC AGC AAT GAA ATC ACT AAA AAC AGG AGC GA'r TTA TTT TTA ATT GGC 538 Phe Ser Asn Glu Ile Thr Lys Asn Arg Ser Asp Leu Phe Leu Ile Gly Thr Lys Gly Lys Val Leu Leu Ser Ala Asn Lys Ser Leu Gln Asp Lys CCT ATC GCA GAA ATT TAT AAG AGC GTG CCT AAi~ GCC ACC AAC GAA GTG 634 Pro Ile Ala Glu Ile Tyr Lys Ser Val Pro Ly:~ Ala Thr Asn Glu Val ATG GCT ATT TTA GAA AAC GGC TCT AAA GCG AC'C TTA GAA TAC TTA GAT 682 Met Ala Ile Leu Glu Asn Gly Ser Lys Ala Th=r Leu Glu Tyr Leu Asp CCC TTT AGC CAT AAG GAA AAT TTT TTA GCC GT'C GAA ACC TTT AAA ATG 730 Pro Phe Ser His Lys Glu Asn Phe Leu Ala Va.l Glu Thr Phe Lys Met CTA GGC AAA ACA GAA AGT AAA GAC AAT CTT AA'C TGG ATG ATC GCT TTA 778 Leu Gly Lys Thr Glu Ser Lys Asp Asn Leu Asn Trp Met Ile Ala Leu Ile Ile Glu Lys Asp Lys Val Tyr Glu Gln Va.L Gly Ser Val Arg Phe Val Val Ile Ile Ala Ser Ala Ile Met Val Leu Ala Leu Ile Ile Ala ATC ACT CTC TTA ATG CGA GCG ATC GTG AGC AG'C CGT TTG GAA GCC GTT 922 Ile Thr Leu Leu Met Arg Ala Tle Val Ser Ser Arg Leu Glu Ala Val Ser Ser Thr Leu Ser His Phe Phe Lys Leu Leu Asn Asn Gln Ala Asn 290 29S 30c) 305 Phe Ala Thr Glu Asp Glu Ile Lys Lys Ile V
TCT AGC GGT A'1"1' AAt~ 'i iG i-PTT GAA GCG AAA TCC AAT GAC GAG TTA GGC 1018 Ser Ser Gly Ile Lys Leu Ile Glu Ala Lys Ser Asn Asp Glu Leu Gly 310 3l5 320 Arg Met Gln Thr Ala Ile Asn Lys Asn Ile Leu Gln Thr Gln Lys Ile ATG CAA GAA GAC AGG CAA GCC GTC CAA GAC ACC ATT AAA GTG GTT TCA 11l4 Met Gln Glu Asp Arg Gln Ala Val Gln Asp Thr Ile Lys Val Val Ser GAT GTG AAA GCA GGG AAT TTT GCG GTG CGC ATC ACG GCT GAG CCC GCA l162 Asp Val Lys Ala Gly Asn Phe Ala Val Arg Ile Thr Ala Glu Pro Ala Ser Pro Asp Leu Lys Glu Leu Arg Asp Ala Leu Asn Gly Ile Met Asp Tyr Leu Gln Glu Ser Val Gly Thr His Met Pro Ser Ile Phe Lys Ile Phe Glu Ser Tyr Ser Gly Leu Asp Phe Arg Gly Arg Ile Gln Asn Ala Ser Gly Arg Val Glu Leu Val Thr Asn Ala Leu Gly Gln Glu Ile Gln Lys Met Leu Glu Thr Ser Ser Asn Phe Ala Lys Asp Leu Ala Asn Asp Ser Ala Asn Leu Lys Glu Cys Val Gln Asn Leu Glu Lys Ala Ser Asn Ser Gln His Lys Ser Leu Met Glu Thr Ser Lys Thr Ile Glu Asn Ile Thr Thr Ser Ile Gln Gly Val Ser Ser Gln Ser Glu Ala Met I1e Glu CAA GGG CAA GAC ATT AAA AGC ATT GTA GAA ATC ATT AGA GAT ATT GCT l594 Gln Gly Gln Asp Ile Lys Ser Ile Val Glu Ile Ile Arg Asp Ile Ala Asp Gln Thr Asn Leu Leu Ala Leu Asn Ala Ala Ile Glu Ala Ala Arg 5l5 520 525 GCC GGC GAG CAT GUC: huia GGC TTT GCG GTG G'TG GCT GAT GAG GTA AGA 1690 Ala Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg Lys Leu Ala Glu Arg Thr Gln Lys Ser Leu Ser Glu Ile Glu Ala Asn Ile Asn Tle Leu Val Gln Ser Ile Ser Asp T.hr Ser Glu Ser Ile Lys Asn Gln Val Lys G1u Val Glu Glu Ile Asn Ala Ser Ile Glu Ala Leu Arg Ser Val Thr Glu Gly Asn Leu Lys Ile Ala Ser Asp Ser Leu Glu ATC AGT CAA GAA ATT GAC AAA GTT TCT AAC G.~1T ATT TTA GAA GAT GTG 1930 Ile Ser Gln Glu Ile Asp Lys Val Ser Asn Asp Ile Leu Glu Asp Val 610 6l5 6:20 _ 625 Asn Lys Lys Gln Phe (2) INFORMATION FOR SEQ ID N0:160:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 630 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID N0:160:
Met Gln Asp Ser Leu His Phe Lys Val Asn G.lu Val Gln Gly Val Leu Glu Asn Thr Tyr Thr Ser Met Gly Ile Val Lys Glu Met Leu Pro Lys Asp Thr Lys Arg Glu Ile Lys Ile Gly Leu Leu Lys Asn Phe Ile Leu Ala Asn Ser His Val Ala Gly Val Ser Met Phe Phe Lys Gly Arg Glu Asp Leu Arg Leu Thr Leu Leu Arg Asp Asn Asn Thr Ile Lys Leu Val Glu Asn Pro Ser Leu Glu Asn Ser Pro Leu A.la Gln Lys Ala Met Lys Asn Lys Glu Ile Ser Lys Ser Leu Gly Tyr Tyr Arg Lys Met Pro Asn WO 98/21225 PCTlUS97121353 -Gly Ala Glu Val Tyr Gly Val Asp Ile Leu Leu Pro Leu Leu Asn Glu 1l5 120 125 Asn Ala Gln Glu Val Val Gly Ala Leu Met Ile Phe Ile Ser Ile Asp Ser Phe Ser Asn Glu Ile Thr Lys Asn Arg Ser Asp Leu Phe Leu Ile Gly Thr Lys Gly Lys Val Leu Leu Ser Ala Asn Lys Ser Leu Gln Asp Lys Pro Ile Ala Glu Ile Tyr Lys Ser Val Pro Lys Ala Thr Asn Glu Val Met Ala Ile Leu Glu Asn Gly Ser Lys Ala Thr Leu Glu Tyr Leu l95 200 205 _ Asp Pro Phe Ser His Lys Glu Asn Phe Leu Ala Val Glu Thr Phe Lys Met Leu Gly Lys Thr Glu Ser Lys Asp Asn Leu Asn Trp Met Ile Ala Leu Ile Ile Glu Lys Asp Lys Val Tyr Glu Gln Val Gly Ser Val Arg Phe Val Val Ile Ile Ala Ser Ala Ile Met Val Leu Ala Leu Ile Ile Ala Ile Thr Leu Leu Met Arg Ala Ile Val Ser Ser Arg Leu Glu Ala Val Ser 5er Thr Leu Ser His Phe Phe Lys Leu Leu Asn Asn Gln Ala Asn Ser Ser Gly Ile Lys Leu Ile Glu Ala Lys Ser Asn Asp Glu Leu Gly Arg Met Gln Thr Ala Ile Asn Lys Asn Ile Leu Gln Thr Gln Lys Ile Met Gln Glu Asp Arg Gln Ala Val Gln Asp Thr Ile Lys Val Val Ser Asp Val Lys Ala Gly Asn Phe Ala Val Arg lle Thr Ala Glu Pro Ala Ser Pro Asp Leu Lys Glu Leu Arg Asp Ala Leu Asn Gly Ile Met Asp Tyr Leu Gln Glu Ser Val Gly Thr His Met Pro Ser Ile Phe Lys Ile Phe Glu Ser Tyr Ser Gly Leu Asp Phe Arg Gly Arg Ile Gln Asn Ala Ser Gly Arg Val Glu Leu Val Thr Asn Ala Leu Gly Gln Glu Ile Gln Lys Met Leu Glu Thr Ser Ser Asn Phe Ala Lys Asp Leu Ala Asn Asp Ser Ala Asn Leu Lys Glu Cys Val Gln Asn Leu Glu Lys Ala Ser Asn Ser Gln His Lys Ser Leu Met Glu Thr Ser Lys Thr Ile Glu Asn Ile Thr Thr Ser Ile Gln 
Gly Val Ser Ser Gln Ser Glu Ala Met Ile Glu Gln Gly Gln Asp Ile Lys Ser Ile Val Glu Ile Ile Arg Asp Ile Ala Asp Gln Thr Asn Leu Leu Ala Leu Asn Ala Ala Ile Glu Ala Ala 515 520 525 Arg Ala Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg Lys Leu Ala Glu Arg Thr Gln Lys Ser Leu Ser Glu Ile Glu Ala Asn Ile Asn Ile Leu Val Gln Ser Ile Ser Asp Thr Ser Glu Ser Ile Lys Asn Gln Val Lys Glu Val Glu Glu Ile Asn Ala Ser Ile Glu Ala Leu Arg Ser Val Thr Glu Gly Asn Leu Lys Ile Ala Ser Asp Ser Leu Glu Ile Ser Gln Glu Ile Asp Lys Val Ser Asn Asp Ile Leu Glu Asp Val Asn Lys Lys Gln Phe (2) INFORMATION FOR SEQ ID NO:161:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1758 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 8...1702 (D) OTHER INFORMATION:
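The coding-sequence location above can be cross-checked against the length stated for the corresponding polypeptide entry (565 amino acids for SEQ ID NO:162). The following is a minimal sketch of that arithmetic only; the function name `codon_count` is hypothetical and is not part of the listing, and it assumes the location is 1-based, inclusive, and excludes the stop codon.

```python
def codon_count(start: int, end: int) -> int:
    """Return the number of complete codons in a 1-based, inclusive range.

    Assumption: the range covers whole codons only (no stop codon, no
    partial codon); a span that is not a multiple of 3 is rejected.
    """
    span = end - start + 1
    if span <= 0:
        raise ValueError("end must not precede start")
    if span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3


# Values taken from the listing: LOCATION 8...1702 for SEQ ID NO:161,
# LENGTH 565 amino acids for SEQ ID NO:162.
cds_start, cds_end, declared_residues = 8, 1702, 565
codons = codon_count(cds_start, cds_end)
status = "matches" if codons == declared_residues else "differs from"
print(f"{codons} codons {status} the declared {declared_residues} residues")
```

The same check can be applied to any other coding-sequence location in the listing by substituting its start, end, and declared polypeptide length.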
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:161:
GAGATAA ATG ATG TTT TCT TCA ATG TTT GCT T'CG TTG GGG ACT CGT ATC 49 Met Met Phe Ser Ser Met Phe Ala Ser Leu Gly Thr Arg Ile ATG CTG GTC GTG TTA GCC GCT CTT TTA GGT T'TA GGG GGG CTT TTT ATT 97 Met Leu Val Val Leu Ala Ala Leu Leu Gly Leu Gly Gly Leu Phe Ile GGT TTT GTA AAG GTT ATG CAA AAA GAT GTG T'TA GCG CAA CTC ATG GAG 145 Gly Phe Val Lys Val Met Gln Lys Asp Val Leu Ala Gln Leu Met Glu His Leu Glu Thr Gly Gln Tyr Lys Lys Arg Glu Lys Thr Leu Ala Tyr Met Thr Lys Ile Ile Glu Gln Gly Ile His Glu Tyr Tyr Lys Asn Phe Asp Asn Ala Thr Ala Arg Lys Met Ala Leu A.sp Tyr Phe Lys Arg Ile AAC GAC GAT AAG GGC A'I'G ATT TAT ATG GTG GTG GTG GAT AAA AAC GGG 337 Asn Asp Asp Lys Gly Met Ile Tyr Met Val Val Val Asp Lys Asn Gly 95 100 105 l10 Val Val Leu Phe Asp Pro Val Asn Pro Lys Thr Val Xaa Gln Ser Gly 115 l20 125 Leu Asp Ala Gln Ser Val Asp Gly Val Tyr Tyr Val Arg Gly Tyr Leu GAG GCG GCC AAA AAA GGG GGA GGC TAC ACT TAT TAT AAA ATG CCT AAA 48l Glu Ala Ala Lys Lys Gly Gly Gly Tyr Thr Tyr Tyr Lys Met Pro Lys Tyr Asp Gly Gly Val Pro Glu Lys Lys Phe Ala Tyr Ser His Tyr Asp Glu Val Ser Gln Met Val Ile Ala Thr Thr Ser Tyr Tyr Thr Asp Ile Asn Thr Glu Asn Lys Ala Ile Lys Glu Gly Val Asn Lys Val Phe Asp Glu Asn Thr Thr Lys Leu Phe Leu Trp Ile Leu Thr Ala Thr Ile Ala Leu Val Val Leu Thr Leu Ile Tyr Ala Lys Leu Arg Ile Val Lys Arg Ile Asp Glu Leu Val Leu Lys Ile Asn Ala Phe Ser Arg Gly Asp Lys Asp Leu Arg Ala Lys Ile Asp Val Gly Asp Arg Asn Asp Glu Ile Ser Gln Val Gly Arg Gly Ile Asn Leu Phe Val Glu Asn Ala Arg Leu Ile Met Glu Glu Ile Lys Gly Ile Ser Thr Leu Asn Lys Thr Ser Met Asp Lys Leu Val Gln Ile Thr Gln Glu Thr Gln Lys Ser Met Lys Asp Ser TCA ACC ACC C1'A AA'1' 'lw~ GTG AAA AAT AAA GCC ACT GAT ATA GCG AGC 1009 Ser Thr Thr Leu Asn Ser Val Lys Asn Lys Ala Thr Asp Ile Ala Ser Met Met Asn Ala Ser Ile Glu Gln Ser Gln Gly Leu Arg Lys Arg Leu Ile Glu Thr Gln Gly Leu Val Lys Glu Ser Lys Asp Ala Ile Gly Asp Leu Phe Ser Gln Ile Thr Glu Ser Ala His Thr Glu Glu Glu Leu Ser AGC AAA GTG GAG CAG CTA AGC 
CGT AAC GCT GAT GAT GTC AAA TCC ATT 120l Ser Lys Val Glu Gln Leu Ser Arg Asn Ala Asp Asp Val Lys Ser Ile Leu Asp Ile Ile Asn Asp Ile Ala Asp Gln Thr Asn Leu Leu Ala Leu Asn Ala Ala Ile Glu Ala Ala Arg Ala Gly G.Lu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg Asn Leu Ala Gly Arg Thr Gln Lys TCT TTA GCC GAA ATC AAT TCC ACT ATC ATG G'CG ATT GTC CAA GAA ATC 1393 Ser Leu Ala Glu Ile Asn Ser Thr Ile Met Val Ile Val Gln Glu Ile Asn Ala Val Ser Ser Gln Met Asn Leu Asn Ser Gln Lys Met Glu Arg Leu Ser Asp Met Ser Lys Ser Val Gln Glu Tlzr Tyr Glu Lys Met Ser TCT AAT TTA AGC TCA GTC GTG TCA GAC AGC Ai~T CAA AGC ATG GAC GAT 1537 Ser Asn Leu Ser Ser Val Val Ser Asp Ser A:an Gln Ser Met Asp Asp TAC GCC AAA TCC GGA CAC CAA ATT GAA GTT A'rG GTA AGC GAT TTT GCA 1585 Tyr Ala Lys Ser Gly His Gln Ile Glu Val M~a Val Ser Asp Phe Ala GAG GTG GAA AAA GTG GCT TCT AAG ACT TTA GCG GAT TCT TCA GAT ATT l633 Glu Val Glu Lys Val Ala Ser Lys Thr Leu A.la Asp Ser Ser Asp Ile TTA AAC ATC GC;T AC:U ~Wi GTG AGT GGA ACG ACC ATG AAT TTA GAC AAA 1681 Leu Asn Ile Ala Thr His Val Ser Gly Thr Thr Met Asn Leu Asp Lys Gln Val Asn Leu Phe Lys Thr (2) INFORMATION FOR SEQ ID N0:162:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 565 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:162:
Met Met Phe Ser Ser Met Phe Ala Ser Leu Gly Thr Arg Ile Met Leu Val Val Leu Ala Ala Leu Leu Gly Leu Gly Gly Leu Phe Ile Gly Phe Val Lys Val Met Gln Lys Asp Val Leu Ala Gln Leu Met Glu His Leu Glu Thr Gly Gln Tyr Lys Lys Arg Glu Lys Thr Leu Ala Tyr Met Thr Lys Ile Ile Glu Gln Gly Ile His Glu Tyr Tyr Lys Asn Phe Asp Asn Ala Thr Ala Arg Lys Met Ala Leu Asp Tyr Phe Lys Arg Ile Asn Asp Asp Lys Gly Met Ile Tyr Met Val Val Val Asp Lys Asn Gly Val Val Leu Phe Asp Pro Val Asn Pro Lys Thr Val Xaa Gln Ser Gly Leu Asp 115 120 125 Ala Gln Ser Val Asp Gly Val Tyr Tyr Val Arg Gly Tyr Leu Glu Ala Ala Lys Lys Gly Gly Gly Tyr Thr Tyr Tyr Lys Met Pro Lys Tyr Asp Gly Gly Val Pro Glu Lys Lys Phe Ala Tyr Ser His Tyr Asp Glu Val Ser Gln Met Val Ile Ala Thr Thr Ser Tyr Tyr Thr Asp Ile Asn Thr Glu Asn Lys Ala Ile Lys Glu Gly Val Asn Lys Val Phe Asp Glu Asn Thr Thr Lys Leu Phe Leu Trp Ile Leu Thr Ala Thr Ile Ala Leu Val 210 215 220 Val Leu Thr Leu Ile Tyr Ala Lys Leu Arg Ile Val Lys Arg Ile Asp Glu Leu Val Leu Lys Ile Asn Ala Phe Ser Arg Gly Asp Lys Asp Leu Arg Ala Lys Ile Asp Val Gly Asp Arg Asn Asp Glu Ile Ser Gln Val Gly Arg Gly Ile Asn Leu Phe Val Glu Asn Ala Arg Leu Ile Met Glu Glu Ile Lys Gly Ile Ser Thr Leu Asn Lys Thr Ser Met Asp Lys Leu Val Gln Ile Thr Gln Glu Thr Gln Lys Ser Met Lys Asp Ser Ser Thr Thr Leu Asn Ser Val Lys Asn Lys Ala Thr Asp Ile Ala Ser Met Met Asn Ala Ser Ile Glu Gln Ser Gln Gly Leu Arg Lys Arg Leu Ile Glu Thr Gln Gly Leu Val Lys Glu Ser Lys Asp Ala Ile Gly Asp Leu Phe Ser Gln Ile Thr Glu Ser Ala His Thr Glu Glu Glu Leu Ser Ser Lys Val Glu Gln Leu Ser Arg Asn Ala Asp Asp Val Lys Ser Ile Leu Asp Ile Ile Asn Asp Ile Ala Asp Gln Thr Asn Leu Leu Ala Leu Asn Ala Ala Ile Glu Ala Ala Arg Ala Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg Asn Leu Ala Gly Arg Thr Gln Lys Ser Leu Ala Glu Ile Asn Ser Thr Ile Met Val Ile Val Gln Glu Ile Asn Ala Val Ser Ser Gln Met Asn Leu Asn Ser Gln Lys Met Glu Arg Leu Ser Asp Met Ser Lys Ser Val Gln Glu Thr Tyr Glu Lys Met Ser
Ser Asn Leu Ser Ser Val Val Ser Asp Ser Asn Gln Ser Met Asp Asp Tyr Ala Lys Ser Gly His Gln Ile Glu Val Met Val Ser Asp Phe Ala Glu Val Glu Lys Val Ala Ser Lys Thr Leu Ala Asp Ser Ser Asp Ile Leu Asn Ile Ala Thr His Val Ser Gly Thr Thr Met Asn Leu Asp Lys Gln Val Asn Leu Phe Lys Thr (2) INFORMATION FOR SEQ ID NO:163:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 686 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 16...660 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:163:
Met Lys Lys Pro Tyr Arg Lys Ile Ser Asp Tyr Ala Ile Val Gly Gly Leu Ser Ala Leu Val Met Val Ser Ile Val Gly Cys Lys Ser Asn Ala Asp Asp Lys Pro Lys Glu Gln Ser Ser Leu Ser Gln Ser Val Gln Lys Gly Ala Phe Val Ile Leu Glu Glu Gln Lys Asp Lys Ser Tyr Lys Val Val Glu Glu Tyr Pro Ser Ser Arg Thr His Ile Ile Val Arg Asp Leu Gln Gly Asn Glu Arg Val Leu Ser Asn Glu Glu Ile Gln Lys Leu Ile Lys Glu Glu Glu Ala Lys Ile Asp Asn Gly Thr Ser Lys Leu Val Gln Pro Asn Asn Gly Gly Ser Asn Glu Gly Ser Gly Phe Gly Leu Gly Ser Ala Ile Leu Gly Ser Ala Ala Gly Ala Ile Leu Gly Ser Tyr Ile Gly Asn Lys Leu Phe Asn Asn Pro Asn Tyr Gln Gln Asn Ala Gln Arg Thr Tyr Lys Ser Pro Gln Ala Tyr Gln Arg Ser Gln Asn Ser Phe Ser Lys Ser Ala Pro Ser Ala Ser Ser Met Gly Gly Ala Ser Lys Gly Gln Ser Gly Phe Phe Gly Ser Ser Arg Pro Thr Ser Ser Pro GCG GTA AGC 1'L'1' (~c;t~ t.~;H AGG GGC TTT AAC TCA TAATTTAATT GATTCAAGGC 6 Ala Val Ser Ser Gly Thr Arg Gly Phe Asn Ser 205 210 215 (2) INFORMATION FOR SEQ ID NO:164:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 215 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:164:
Met Lys Lys Pro Tyr Arg Lys Ile Ser Asp Tyr Ala Ile Val Gly Gly Leu Ser Ala Leu Val Met Val Ser Ile Val Gly Cys Lys Ser Asn Ala Asp Asp Lys Pro Lys Glu Gln Ser Ser Leu Ser Gln Ser Val Gln Lys Gly Ala Phe Val Ile Leu Glu Glu Gln Lys Asp Lys Ser Tyr Lys Val Val Glu Glu Tyr Pro Ser Ser Arg Thr His Ile Ile Val Arg Asp Leu Gln Gly Asn Glu Arg Val Leu Ser Asn Glu Glu Ile Gln Lys Leu Ile Lys Glu Glu Glu Ala Lys Ile Asp Asn Gly Thr Ser Lys Leu Val Gln Pro Asn Asn Gly Gly Ser Asn Glu Gly Ser Gly Phe Gly Leu Gly Ser 115 120 125 Ala Ile Leu Gly Ser Ala Ala Gly Ala Ile Leu Gly Ser Tyr Ile Gly 130 135 140 Asn Lys Leu Phe Asn Asn Pro Asn Tyr Gln Gln Asn Ala Gln Arg Thr 145 150 155 160 Tyr Lys Ser Pro Gln Ala Tyr Gln Arg Ser Gln Asn Ser Phe Ser Lys Ser Ala Pro Ser Ala Ser Ser Met Gly Gly Ala Ser Lys Gly Gln Ser Gly Phe Phe Gly Ser Ser Arg Pro Thr Ser Ser Pro Ala Val Ser Ser Gly Thr Arg Gly Phe Asn Ser (2) INFORMATION FOR SEQ ID NO:165:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8748 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 16...8694 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:165:
Met Lys Lys Phe Lys Lys Lys Pro Lys Ser Ile Lys Arg Ser His Gln Asn Gln Lys Thr Ile Leu Lys Arg Pro Leu Trp Leu Met Pro Leu Leu Ile Ser Gly Phe Ala Ser Gly Val Tyr Ala Asn Asn Leu Trp Asp Leu Leu Asn Pro Lys Val Gly Gly Glu Tyr Val His Trp Val- Lys Gly Ser Gln Tyr Cys Ala Trp Trp Glu Phe Ala Gly Cys Leu ..
Lys Asn Val Trp Gly Ala Asn His Lys Gly Tyr Asp Ala Gly Asn Ala Ala Asn Tyr Leu Ser Ser Gln Asn Tyr Gln Ala Ile Ser Val Gly Ser Gly Asn Glu Thr Gly Thr Tyr Ser Leu Ser Gly Phe Thr Asn Tyr Val 1l0 115 120 Gly Gly Asn Leu Thr Ile Asn Leu Gly Asn Ser Val Val Leu Asp Leu Ser Gly Ser Asn Ser Phe Thr Ser Tyr Gln Gly Tyr Asn Gln Gly Lys Asp Asp Val Thr Phe Thr Val Gly Ala Ile Asn Leu Asn Gly Thr Leu l60 l65 170 GAA GTG GGT AAT CG'1' c~'lv GGA TCG GGA GCT GGC ACG CAC ACC GGC ACA 579 Glu Val Gly Asn Arg Val Gly Ser Gly Ala Gly Thr His Thr Gly Thr Ala Thr Leu Asn Leu Asn Ala Asn Lys Val Asn Ile Asn Ser Asn Ile Asn Ala Tyr Lys Thr Ser Gln Val Asn Ile Gly Asn Ala Asn Ser Val ATT ACC ATT GGT TCG GTT TCT TTG AGT GGG GA'T GTT TGC AGT TCT TTA 723 Ile Thr Ile Gly Ser Val Ser Leu Ser Gly As:p Val Cys Ser Ser Leu GCT AGC GTT GGG ATA GGG GCT AAT TGC TCC AC'T TCT GGG CCT AGC TAT 771 Ala Ser Val Gly Ile Gly Ala Asn Cys Ser Thr Ser Gly Pro Ser Tyr Ser Phe Lys Gly Thr Thr Asn Ala Thr Asn Th:r Ala Phe Ser Asn Ala AGC GGC AGT TTC ACT TTT GAA GAG AAC GCC AC'r TTT AGC GGG GCG AAA 867 Ser Gly Ser Phe Thr Phe Glu Glu Asn Ala Th:r Phe Ser Gly Ala Lys Trp Asn Gly Gly Thr Tyr Thr Phe Asn Lys Glu Phe Ser Ala Thr Asn 285 290 29!i 300 Asn Thr Ala Phe Ser Ser Gly Ser Phe Asn Ph<' Lys Gly Val Ser Ser TTT AAT GGT ACT TCG TTT AGT AAC GCT TCT TA'C ACT TTT GAC AAT CAA 101l Phe Asn Gly Thr Ser Phe Ser Asn Ala Ser Tyr Thr Phe Asp Asn Gln Ala Thr Phe Gln Asn Ser Ser Phe Asn Gly Gly Thr Phe Thr Phe Asn AAC CAA ACT AAT CCA ACT AAC AAC GCT CAG CAC. 
CCC CAA ATT CAA AAC 1107 Asn Gln Thr Asn Pro Thr Asn Asn Ala Gln His Pro Gln Ile Gln Asn Ser Ser Phe Ser Gly Asn Ala Thr Thr Leu Lys Gly Phe Val Asn Phe Gln Gln Ala Phe Asn Asn Ser Asn His Gln Leu Thr Ile Gln Asn Ala TCC TTT AAT AAC: GC:L ~.CU TTT AAC AAT ACC GGT AAA ATC ACT ATA GAA 1251 Ser Phe Asn Asn Ala Thr Phe Asn Asn Thr Gly Lys Ile Thr Ile Glu Lys Asp Ala Ser Phe Asn Asn Thr Thr Phe Asn Thr Ser Val Asp Thr Asn Asn Met Ser Val Thr Gly Gly Val Thr Leu Ser Gly Lys Asn Asp Leu Lys Asn Gly Ser Thr Leu Asp Phe Gly Ser Ser Lys Ile Thr Leu Ala Gln Gly Thr Thr Phe Asn Leu Thr Ser Leu Gly Ser Glu Lys Ser Val Thr Ile Leu Asn Ser Ser Gly Gly Ile Thr Tyr Ser Asn Leu Leu Asn His Ala Ile Asn Gly Leu Thr Ser Ala Leu Lys Thr Asn Glu Ser Leu Ser Asn Pro Gln Ser Phe Ala Gln Gly Leu Trp Asp Ile Ile Thr Tyr Asn Gly Val Thr Gly Gln Leu Leu Asn Glu Asn Ala Ala Thr Ser Lys Pro Thr Asp Ser Ser Pro Ser Lys Ser Ser Thr Asn Ser Thr Gln Val Tyr Gln Val Gly Tyr Lys Ile Gly Asp Thr Ile Tyr Lys Leu Gln Glu Thr Phe Ser His Asn Ser Ile Ile Ile Gln Ala Leu Glu Ser Gly Thr Tyr Thr Pro Pro Pro Val Ile Asn Gly Ser Lys Phe Asp Leu Ser Ala Ser Asn Tyr Ile Asn Ala Asp Met Pro Trp Tyr Asp His Lys Tyr WO 98/21225 PCTlUS97/21353 -TAC ATC CCT AAA TCC LI~ AAT TTT ACA GAG AGC GGG ACT TAT TAC TTG 1923 Tyr Ile Pro Lys Ser Gln Asn Phe Thr Glu Se r Gly Thr Tyr Tyr Leu Pro Ser Val Gln Ile Trp Gly Ser Tyr Thr Assn Ser Phe Lys Gln Thr TTT AGC GCA AAT GGT AGT AAT CTG GTG ATT G(iG TAT AAC TCA ACA TGG 2019 Phe Ser Ala Asn Gly Ser Asn Leu Val Ile G.ly Tyr Asn Ser Thr Trp ACT GAT CAT AAT GTC TCT TCT AGC GGC ACG G'rG TCT TTT GGG GAC ACT 2067 Thr Asp His Asn Val Ser Ser Ser Gly Thr V<~1 Ser Phe Gly Asp Thr Ser Gly Ser Ala Leu Asn Gly His Cys Gly Pro Trp Pro Tyr Tyr Gln Cys Thr Gly Thr Thr Asn Gly Thr Tyr Ser A:La Tyr His Val Tyr Ile Thr Ala Asn Leu Arg Ser Gly Asn Arg Ile Gly Thr Gly Gly Ala Ala Asn Leu Ile Phe Asn Gly Val Asp Ser Ile Asn Ile Ala Asn Ala Thr Ile Thr Gln His Asn Ala Gly Ile Tyr Ser Se.r Ser Met Thr Phe 
Ser Thr Gln Ser Met Asp Asn Ser Gln Asn Leu Asn Gly Leu Asn Ser Asn 765 770 7'75 780 Gly Lys Leu Ser Val Tyr Gly Thr Thr Phe Thr Asn Glu Ala Lys Asp Gly Lys Phe Ile Phe Asn Ala Gly Gln Ala Val Phe Glu Asn Thr Asn Phe Asn Gly Gly Ser Tyr Gln Phe Ser Gly Asp Ser Leu Asn Phe Ser Asn Asn Asn Gln Phe Asn Ser Gly Ser Phe G7_u Ile Ser Ala Lys Asn WO 98/21225 PCT/US97/21353 w GCT TCG TTC AA'C' AAC VCt AAC TTT AAC AAC AGC GCT TCT TTT AAT TTC 2595 Ala Ser Phe Asn Asn Ala Asn Phe Asn Asn Ser Ala Ser Phe Asn Phe Asn Asn Ser Asn Ala Thr Thr Ser Phe Val Gly Asp Phe Thr Asn Ala Asn Ser Asn Leu Gln Ile Ala Gly Asn Ala Val Phe Gly Asn Ser Thr Asn Gly Ser Gln Asn Thr Ala Asn Phe Asn Asn Thr Gly Ser Val Asn Ile Ser Gly Asn Ala Thr Phe Asp Asn Val Val Phe Asn Gly Pro Thr Asn Thr Ser Val Lys Gly Gln Val Thr Leu Asn Asn Ile Thr Leu Lys Asn Leu Asn Ala Pro Leu Ser Phe Gly Asp Gly Thr Ile Thr Phe Asn Ala His Ser Val Ile Asn Ile Ala Glu Ser Ile Thr Asn Gly Asn Pro Ile Thr Leu Val Ser Ser Ser Lys Glu Ile Glu Tyr Asn Asn Ala Phe Ser Lys Asn Leu Trp Gln Leu Ile Asn Tyr Gln Gly His Gly Ala Ser Ser Glu Lys Leu Val Ser Ser Ala Gly Asn Gly Val Tyr Asp Val Val Tyr Ser Phe Asn Asn Gln Thr Tyr Asn Phe Gln Glu Val Phe Ser Gln Asn Ser Ile Ser Ile Arg Arg Leu Gly Val Asn Met Val Phe Asp Tyr Val Asp Met Glu Lys Ser Asp His Leu Tyr Tyr Gln Asn Ala Leu Gly 1055 1060 l065 TTT ATG ACC TAI: Al ii l.tyT iii-~T AGC TAT AAC AFvT AAT TTA GGG AAT GCA 3 2 6 Phe Met Thr Tyr Met Pro Asn Ser Tyr Asn Assn Asn Leu Gly Asn Ala Asn Asn Thr Ile Tyr Tyr Tyr Asp Lys Ser Ii.e Asp Phe Tyr Ala Ser GGG AAA ACT CTA TTC ACT AAA GCG GAA TTT TC',T CAA ACA TTC ACC GGG 3363 Gly Lys Thr Leu Phe Thr Lys Ala Glu Phe Ser Gln Thr Phe Thr Gly Gln Asn Ser Ala Ile Val Phe Gly Ala Lys Se:r Ile Trp Thr Ser Leu Ser Asp Ala Pro Gln Ser Asn Thr Ile Ile Arg Phe Gly Asp Asn Lys Gly Ala Gly Ser Asn Asp Ala Ser Gly His Cys Trp Asn Leu Gln Cys 1150 1155 1l60 ATA GGC TTT ATT ACA GGG CAT TAT GAA GCG CP.A AAG ATT TAC ATC ACC 3555 Ile Gly Phe Ile Thr Gly 
His Tyr Glu Ala Gln Lys Ile Tyr Ile Thr l165 1170 1175 1180 Gly Ser Ile Glu Ser Gly Asn Arg Ile Ser Ser Gly Gly Gly Ala Ser Leu Asn Phe Asn Gly Leu Gln Gly Ile Leu Leu Thr Asn Ala Thr Leu 1200 l205 1210 TAT AAC CGC GCC GCT GGC ACG CAA AGC TCG TC'T ATG AAT TTT ATC TCT 3699 Tyr Asn Arg Ala Ala Gly Thr Gln Ser Ser Ser Met Asn Phe Ile Ser AAC AGC GCG AAC ATT CAG GCT CAA AAC TCC TA.T TTT ATA GAC GAT ACC 3747 Asn Ser Ala Asn Ile Gln Ala Gln Asn Ser Tyr Phe Ile Asp Asp Thr GCA CAA AAT GGC GGT AAC CCT AAT TTC AGT TT'C AAC GCT TTG AAT CTG 3795 Ala Gln Asn Gly Gly Asn Pro Asn Phe Ser Phe Asn Ala Leu Asn Leu GAT TTT TCT AAC AGC TCT TTT AGA GGC TAT GT'G GGG AAA ACG CAA TCT 3843 Asp Phe Ser Asn Ser Ser Phe Arg Gly Tyr Val Gly Lys Thr Gln Ser Val Phe Lys Phe Asn Ala Lys Asn Ala Ile Ser Phe Thr Asn Ser Thr AAT TTA AGC TC'1' (~U'1 T'ics ThI CAA ATG CAA GCT AAA AGC GTG TTG TTT 3939 Asn Leu Ser Ser Gly Leu Tyr Gln Met Gln Ala Lys Ser Val Leu Phe Asp Asn Ser Asn Leu Ser Val Ser Val Gly Thr Ser Ser Ile Lys Ala l310 1315 1320 Asn Ala Ile Asn Leu Ser Gln Asn Ala Ser Ile Asn Ala Ser Asn His l325 1330 1335 1340 TCA ACC TTA GAA CTT CAA GGC GAT TTG AAT GTG AAC GAC ACC AGC TCG 40B3 _ Ser Thr Leu Glu Leu Gln Gly Asp Leu Asn Val Asn Asp Thr Ser Ser Leu Asn Leu Asn Gln Ser Thr Ile Asn Val Ser Asn Asn Ala Thr Ile Asn Asp Tyr Ala Ser Leu Ile Ala Ser Asn Gly Ser His Leu Asn Phe 1375 l380 1385 Asn Gly Ala Val Asn Phe Asn Ser Ala Asn Ile Thr Thr Ser Leu Asn Asn Ser Ser Ile Val Phe Lys Gly Ala Val Ser Leu Gly Gly Gln Phe l405 1410 1415 1420 Asn Leu Ser Asn Asn Ser Ser Leu Asp Phe Gln Gly Ser Ser Ala Ile Thr Ser Asn Thr Ala Phe Asn Phe Tyr Asp Asn Ala Phe Ser Gln Ser 1440 1445 l450 CCC ATC ACT TTC CAT CAA GCC CTT GAC ATT AAA GCG CCC TTA AGT TTG 44l9 Pry Ile Thr Phe His Gln Ala Leu Asp Ile Lys Ala Pro Leu Ser Leu Gly Gly Asn Leu Leu Asn Pro Asn Asn Ser Ser Val Leu Asp Leu Lys l470 l475 1480 Asn Ser Gln Leu Val Phe Gly Asp Gln Gly Ser Leu Asn Ile Ala Asn l485 1490 1495 1500 Ile Asp Leu Leu Ser Asp Leu 
Asn Asp Asn Lys Asn Arg Val Tyr Asn WO 98l21225 PCT/US97/2I353 -ATC ATT CAA GC'.ti CiAI: H1V ~i AGT AAT TGG TAT GAG CGT ATC AGC TTC 4611 Ile Ile Gln Ala Asp Met Asn Ser Asn Trp T;rr Glu Arg Ile Ser Phe Phe Gly Met His Ile Asn Asp Gly Ile Tyr A:>p Ala Lys Asn Gln Thr Tyr Ser Phe Thr Asn Pro Leu Asn Asn Ala Leu Lys Ile Thr Glu Ser TTT AAA GAC AAC CAA CTA AGC GTT ACG CTC TC'_T CAA ATC CCG GGT ATT 4755 Phe Lys Asp Asn Gln Leu Ser Val Thr Leu SE:r Gln Ile Pro Gly Ile 1565 1570 1d75 1580 AAA AAC ACG CTC TAT AAC ATT GGC TCT GAA A7.'T TTT AAC TAC CAA AAA 4803 Lys Asn Thr Leu Tyr Asn Ile Gly Ser Glu Il.e Phe Asn Tyr Gln Lys 1585 l590 l595 Val Tyr Asn Asn Ala Asn Gly Val Tyr Ser Tyr Ser Asp Asp Ala Gln Gly Val Phe Tyr Leu Thr Ser Asn Val Lys Gl.y Tyr Tyr Asn Pro Asn Gln Ser Tyr Gln Ala Ser Gly Ser Asn Asn Thr Thr Lys Asn Asn Asn CTA ACC TCT GAA TCT TCT ATC ATC TCG CAA AC.'C TAT AAC GCG CAA GGC 4995 Leu Thr Ser Glu Ser Ser I1e Ile Ser Gln Thr Tyr Asn Ala Gln Gly 1645 1650 1E.55 1660 Asn Pro Ile Ser Ala Leu His Ile Tyr Asn Lys Gly Tyr Asn Phe Asn AAT ATC AAA GCG TTA GGG CAA ATG GCT CTC AF,A CTC TAC CCT GAA ATC S091 Asn Ile Lys Ala Leu Gly Gln Met Ala Leu Lys Leu Tyr Pro Glu Ile AAA AAG GTA TTA GGG AAT GAT TTT TCG CCC TC.'A AGT TTG AAC GCT TTA 5139 Lys Lys Val Leu Gly Asn Asp Phe Ser Pro Ser Ser Leu Asn Ala Leu Asn Ser Asn Ala Leu Asn Gln Leu Thr Lys Le:u Ile Thr Pro Asn Asp TGG AAA AAC ATT AAC GAG TTG ATT GAT AAC GC'A AAC AAT TCG GTG GTG 5235 Trp Lys Asn Ile Asn Glu Leu Ile Asp Asn Ala Asn Asn Ser Val Val CAA AAT TTC AA'1' AAL c~t3i: ACT TTG ATT GTG GGA GCG ACT CAA ATA GGG 5283 Gln Asn Phe Asn Asn Gly Thr Leu Ile Val Gly Ala Thr Gln Ile Gly 1745 1750 l755 CAA ACA GAC ACC AAT AGC GCG GTT GTT TTT GGG GGC TTG GGC TAT CAA 533l Gln Thr Asp Thr Asn Ser Ala Val Val Phe Gly Gly Leu Gly Tyr Gln Thr Pro Cys Asp Tyr Thr Asp Ile Val Cys Gln Lys Phe Arg Gly Thr l775 1780 1785 Tyr Leu Gly Gln Leu Leu Glu Ser Ser Ser Ala Asp Leu Gly Tyr Ile Asp Thr Thr Phe Asn Ala Lys Glu Ile Tyr Leu Thr Gly 
Thr Leu Gly Ser Gly Asn Ala Trp Gly Thr Gly Gly Ser Ala Ser Val Thr Phe Asn Ser Gln Thr Ser Leu Ile Leu Asn Gln Ala Asn Ile Val Ser Ser Gln 1840 l845 l850 Thr Asp Gly Ile Phe Ser Met Leu Gly Gln Glu Gly Ile Asn Lys Val 1B55 1860 l865 Phe Asn Gln Ala Gly Leu Ala Asn Ile Leu Gly Glu Val Ala Val G1n Ser Ile Asn Lys Ala Gly Gly Leu Gly Asn Leu Ile Val Asn Thr Leu 1885 1890 1B95 l900 Gly Ser Asn Ser Val Ile Gly Gly Tyr Leu Thr Pro Glu Gln Lys Asn l905 1910 1915 CAA ACC CTA AGC CAG CTT TTA GGG CAG AAT AAC TTT GAT AAT CTC ATG 581l Gln Thr Leu Ser Gln Leu Leu Gly Gln Asn Asn Phe Asp Asn Leu Met l920 1925 1930 Asn Asp Ser Gly Leu Asn Thr Ala Ile Lys Asp Leu Ile Arg Gln Lys Leu Gly Phe Trp Thr Gly Leu Val Gly Gly Leu Ala Gly Leu Gly Gly 1950 l955 1960 ATT GAT TTG CAA AAI: i:C-I G~ AAG CTT ATA GGC AGC ATG TCA ATC AAT 595S
Ile Asp Leu Gln Asn Pro Glu Lys Leu Ile G_Ly Ser Met Ser Ile Asn Asp Leu Leu Ser Lys Lys Gly Leu Phe Asn G7_n Ile Thr Gly Phe Ile TCC GCT AAC GAT ATA GGG CAA GTC ATA AGC G7-'A ATG TTG CAA GAT ATT 6052 Ser Ala Asn Asp Ile Gly Gln Val Ile Ser Val Met Leu Gln Asp Ile GTC AAA CCG AGC AAC GCT TTA AAA AAC GAT G7.'A GCG GCT TTA GGC AAG 6099 Val Lys Pro Ser Asn Ala Leu Lys Asn Asp Val Ala Ala Leu Gly Lys CAA ATG ATT GGC GAA TTT TTA GGC CAA GAC AC:G CTC AAT TCT TTA GAA 6147 Gln Met Ile Gly Glu Phe Leu Gly Gln Asp Thr Leu Asn Ser Leu Glu Ser Leu Leu Gln Asn Gln Gln Ile Lys Ser Va.l Leu Asp Lys Val Leu 2045 2050 20'55 2060 GCG GCT AAA GGT TTA GGG CCT ATT TAT GAA CF,A GGC TTG GGG GAT TTG 6243 Ala Ala Lys Gly Leu Gly Pro Tle Tyr Glu Gln Gly Leu Gly Asp Leu Ile Pro Asn Leu Gly Lys Lys Gly Leu Phe Ala Pro Tyr Gly Leu Ser -CAA GTG TGG CAA AAA GGG GAT TTT AGT TTC AA.C GCA CAA GGC AAT GTT 6339 Gln Val Trp Gln Lys Gly Asp Phe Ser Phe Asn Ala Gln Gly Asn Val Phe Val Gln Asn Ser Thr Phe Ser Asn Ala Asn Gly Gly Thr Leu Ser Phe Asn Ala Gly Asn Ser Leu Ile Phe Ala Gly Asn Asn His Ile Ala Phe Thr Asn His Ala Gly Thr Leu Gln Leu Leu Ser Asp Gln Val Ser Asn Ile Asn Ile Thr Thr Leu Asn Ala Ser Asn Gly Leu Lys Ile Asn Ala Ala Asn Asn Asn Val Ser Val Ser Gln Gly Asn Leu Phe Val Ser 2175 2l80 2l85 Ala Ser Cys Ala Gln Gln Ser Asp Pro Thr Thr Ala Asn Ile Ala Asn Pro Cys Ala Leu Ser Ala Gln Ser Thr Asn Gly Ala Ser Ser Asn Asn Ala Ser Asn Asn Ala Pro Ile Ala Leu Ser Asn Asn Asp Glu Ser Leu Met Val Ala Ala Asn Asp Phe Asn Phe Ser Gly Asn Ile Tyr Ala Asn Gly Val Val Asp Phe Ser Lys Ile Lys Gly Ser Ala Asn Ile Lys Asn Leu Tyr Leu Tyr Asn Asn Ala Gln Phe Gln Ala Asn Asn Leu Thr Ile Ser Asn Gln Ala Val Leu Glu Lys Asn Ala Ser Phe Val Thr Asn Asn Leu Asn Ile Gln Gly Ala Phe Asn Asn Asn Ala Thr Gln Lys Ile Glu Val Leu Gln Asn Leu Val Ile Ala Ser Asn Ala Ser Leu Ser Thr Gly Ile Tyr Gly Leu Glu Val Gly Gly Ala Leu Asn Asn Ser Gly Ala Ile CAT TTT AAT TTA GAA AAT ACC CAA ACG CCA ACG CCG CTC ATT CAA GCA 7l07 
His Phe Asn Leu Glu Asn Thr Gln Thr Pro Thr Pro Leu Ile Gln Ala Glu Gly Ile Ile Asn Leu Asn Thr Thr Gln Thr Pro Phe Met Asn Val Asn Asn Ser Met Ala Asn Asn Thr Thr Tyr Thr Leu Leu Lys Ser Ser Arg Tyr Ile Asp Tyr Asn Ile Asn Pro Asn Ser Leu Gln Ser Tyr Leu AAT CTC TAC: !1C'1' '1"1'A LTC -AAT ATC AAC GGG AAC CAC ATA GAG GAA AAA 7299 Asn Leu Tyr Thr Leu Ile Asn Ile Asn Gly Asn His Ile Glu Glu Lys 24l5 2420 2425 AAC GGC GCA TTG ACT TAT TTG GGC CAA CGG G7.'T TTG TTG CAA GAT AAG 7347 Asn Gly Ala Leu Thr Tyr Leu Gly Gln Arg Val Leu Leu Gln Asp Lys GGG TTA TTG TTA AGC GTA GCG CTG CCC AAC TC:A AAC AAC GCT TCT CAA 7395 Gly Leu Leu Leu Ser Val Ala Leu Pro Asri Se'r Asn Asn Ala Ser Gln Asn Asri Ile Leu Ser Leu Ser Val Leu Tyr A:~n Gln Val Lys Met Ser TGC GGC GAT AAA GCG ATG GAT TTT ACC CCC CC'T ACC TTA CAA GAT TAC 7491 Cys Gly Asp Lys Ala Met Asp Phe Thr Pro Pro Thr Leu Gln Asp Tyr Ile Val Gly Ile Gln Gly Gln Ser Ala Leu A:>n Gln Ile Glu Ala Val GGG GGG AAC GCT ATC AAG TGG CTT TCA ACA T7.'G ATG ATG GAG ACT AAA 7587 Gly Gly Asn Ala Ile Lys Trp Leu Ser Thr Leu Met Met Glu Thr Lys Glu Asn Pro Phe Phe Ala Pro Ile Tyr Leu Lys Asn His Ser Leu Asn Glu Ile Leu Gly Val Thr Lys Asp Leu Gln Aeon Thr Ala Ser Leu Ile Ser Asn Pro Asn Phe Arg Asp Asn Ala Thr A~~n Leu Leu Glu Leu Ala Ser Tyr Thr Gln Gln Thr Ser Arg Leu Thr Lys Leu Ser Asp Phe Arg TCT AGA GAG GGA GAG TCT GAT TTT TCT TTG T7.'A GAG CTT AAA AAC AAG 7827 Ser Arg Glu Gly Glu Ser Asp Phe Ser Leu Leu Glu Leu Lys Asn Lys Arg Phe Ser Asp Pro Asn Pro Glu Val Phe Val Lys Tyr Ser Gln Leu 2605 2610 2E.15 2620 Ser Lys His Pro Asn Asn Leu Trp Val Gln Gl.y Val Gly Gly Ala Ser TTT ATT TCT GGG GGC: IjAT-GGC ACG CTT TAT GGC TTG AAT GCG GGC TAT 797I
Phe Ile Ser Gly Gly Asn Gly Thr Leu Tyr Gly Leu Asn Ala Gly Tyr Asp Arg Leu Val Lys Asn Val Ile Leu Gly Gly Tyr Val Ala Tyr Gly Tyr Ser Asp Phe Asn Gly Asn Ile Met His Ser Leu Gly Asn Asn Val GAT GTG GGG ATG TAT GCG AGG GCT TTT TTA AAA AGG AAC GAA TTC ACT 81l5 Asp Val Gly Met Tyr Ala Arg Ala Phe Leu Lys Arg Asn Glu Phe Thr Leu Ser Ala Asn Glu Thr Tyr Gly Gly Asn Ala Thr Ser Ile Asn Ser 2705 2710 27l5 Ser Asn Ser Leu Leu Ser Val Leu Asn Gln Arg Tyr Asn Tyr Asn Thr Trp Thr Thr Ser Val Asn Gly Asn Tyr Gly Tyr Asp Phe Met Phe Lys Gln Lys Ser Val Val Leu Lys Pro Gln Val Gly Leu Ser Tyr His Phe Ile Gly Leu Ser Gly Met Lys Gly Asn Asp Ala Ala Tyr Lys Gln Phe Leu Met His Ser Asn Pro Ser Asn Glu Ser Val Leu Thr Leu Asn Met Gly Leu Glu Ser Arg Lys Tyr Phe Gly Lys Asn Ser Tyr Tyr Phe Val Thr Ala Arg Leu Gly Arg Asp Leu Leu Ile Lys Ser Lys Gly Ser Asn Thr Val Arg Phe Val Gly Glu Asn Thr Leu Leu Tyr Arg Lys Gly Glu Val Phe Asn Thr Phe Ala Ser Val Ile Thr Gly Gly Glu Met His Leu TGG CGT TTG G'i'G '1'A'i GTG -iiAT GCG GGG GTG GGG CTT AAG ATG GGC TTG 8 64 3 Trp Arg Leu Val Tyr Val Rsn Ala Gly Val Gl.y Leu Lys Met Gly Leu Gln Tyr Gln Asp Ile Asn Ile Thr Gly Asn Va.l Gly Met Arg Val Ala TTT TAGCTTTTTT GCTATAATGC TTCGTTCAAA TTTTp,TGGTT AGGTTTTTCT ATGT 8748 Phe (2) INFORMATION FOR SEQ ID N0:166:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2893 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:166:
Met Lys Lys Phe Lys Lys Lys Pro Lys Ser Ile Lys Arg Ser His Gln Asn Gln Lys Thr Ile Leu Lys Arg Pro Leu Trp Leu Met Pro Leu Leu Ile Ser Gly Phe Ala Ser Gly Val Tyr Ala Asn Asn Leu Trp Asp Leu Leu Asn Pro Lys Val Gly Gly Glu Tyr Val His Trp Val Lys Gly Ser Gln Tyr Cys Ala Trp Trp Glu Phe Ala Gly Cys Leu Lys Asn Val Trp Gly Ala Asn His Lys Gly Tyr Asp Ala Gly Asn Ala Ala Asn Tyr Leu Ser Ser Gln Asn Tyr Gln Ala Ile Ser Val Gly Ser Gly Asn Glu Thr Gly Thr Tyr Ser Leu Ser Gly Phe Thr Asn Tyr Val Gly Gly Asn Leu Thr Ile Asn Leu Gly Asn Ser Val Val Leu Asp Leu Ser Gly Ser Asn Ser Phe Thr Ser Tyr Gln Gly Tyr Asn Gln Gly Lys Asp Asp Val Thr Phe Thr Val Gly Ala Ile Asn Leu Asn Gly Thr Leu Glu Val Gly Asn Arg Val Gly Ser Gly Ala Gly Thr His Thr Gly Thr Ala Thr Leu Asn Leu Asn Ala Asn Lys Val Asn Ile Asn Ser Asn Ile Asn Ala Tyr Lys Thr Ser Gln Val Asn Ile Gly Asn Ala Asn Ser Val Ile Thr Ile Gly Ser Val Ser Leu Ser Gly Asp Val Cys Ser Ser Leu Ala Ser Val Gly Ile Gly Ala Asn Cys Ser Thr Ser Gly Pro Ser Tyr Ser Phe Lys Gly Thr Thr Asn Ala Thr Asn Thr Ala Phe Ser Asn Ala Ser Gly Ser Phe Thr Phe Glu Glu Asn Ala Thr Phe Ser Gly Ala Lys Trp Asn Gly Gly Thr Tyr Thr Phe Asn Lys Glu Phe Ser Ala Thr Asn Asn Thr Ala Phe Ser Ser Gly Ser Phe Asn Phe Lys Gly Val Ser Ser Phe Asn Gly Thr Ser Phe Ser Asn Ala Ser Tyr Thr Phe Asp Asn Gln Ala Thr Phe Gln Asn Ser Ser Phe Asn Gly Gly Thr Phe Thr Phe Asn Asn Gln Thr Asn Pro Thr Asn Asn Ala Gln His Pro Gln Ile Gln Asn Ser Ser Phe Ser Gly Asn Ala Thr Thr Leu Lys Gly Phe Val Asn Phe Gln Gln Ala Phe Asn Asn Ser Asn His Gln Leu Thr Ile Gln Asn Ala Ser Phe Asn Asn Ala Thr Phe Asn Asn Thr Gly Lys Ile Thr Ile Glu Lys Asp Ala Ser 405 410 415 Phe Asn Asn Thr Thr Phe Asn Thr Ser Val Asp Thr Asn Asn Met Ser Val Thr Gly Gly Val Thr Leu Ser Gly Lys Asn Asp Leu Lys Asn Gly Ser Thr Leu Asp Phe Gly Ser Ser Lys Ile Thr Leu Ala Gln Gly Thr Thr Phe Asn Leu Thr Ser Leu Gly Ser Glu Lys Ser Val Thr Ile Leu Asn Ser Ser Gly Gly Ile Thr Tyr Ser Asn Leu Leu Asn His Ala Ile
Asn Gly Leu Thr Ser Ala Leu Lys Thr Asn Glu Ser Leu Ser Asn Pro Gln Ser Phe Ala Gln Gly Leu Trp Asp Ile Ile Thr Tyr Asn Gly Val Thr Gly Gln Leu Leu Asn Glu Asn Ala Ala Thr Ser Lys Pro Thr Asp Ser Ser Pro Ser Lys Ser Ser Thr Asn Ser Thr Gln Val Tyr Gln Val Gly Tyr Lys Ile Gly Asp Thr Ile Tyr Lys Leu Gln Glu Thr Phe Ser His Asn Ser Ile Ile Ile Gln Ala Leu Glu Ser Gly Thr Tyr Thr Pro Pro Pro Val Ile Asn Gly Ser Lys Phe Asp Leu Ser Ala Ser Asn Tyr Ile Asn Ala Asp Met Pro Trp Tyr Asp His Lys Tyr Tyr Ile Pro Lys Ser Gln Asn Phe Thr Glu Ser Gly Thr Tyr Tyr Leu Pro Ser Val Gln Ile Trp Gly Ser Tyr Thr Asn Ser Phe Lys Gln Thr Phe Ser Ala Asn Gly Ser Asn Leu Val Ile Gly Tyr Asn Ser Thr Trp Thr Asp His Asn Val Ser Ser Ser Gly Thr Val Ser Phe Gly Asp Thr Ser Gly Ser Ala Leu Asn Gly His Cys Gly Pro Trp Pro Tyr Tyr Gln Cys Thr Gly Thr Thr Asn Gly Thr Tyr Ser Ala Tyr His Val Tyr Ile Thr Ala Asn Leu 705 710 715 720 Arg Ser Gly Asn Arg Ile Gly Thr Gly Gly Ala Ala Asn Leu Ile Phe Asn Gly Val Asp Ser Ile Asn Ile Ala Asn Ala Thr Ile Thr Gln His Asn Ala Gly Ile Tyr Ser Ser Ser Met Thr Phe Ser Thr Gln Ser Met Asp Asn Ser Gln Asn Leu Asn Gly Leu Asn Ser Asn Gly Lys Leu Ser Val Tyr Gly Thr Thr Phe Thr Asn Glu Ala Lys Asp Gly Lys Phe Ile 785 790 795 800 Phe Asn Ala Gly Gln Ala Val Phe Glu Asn Thr Asn Phe Asn Gly Gly Ser Tyr Gln Phe Ser Gly Asp Ser Leu Asn Phe Ser Asn Asn Asn Gln Phe Asn Ser Gly Ser Phe Glu Ile Ser Ala Lys Asn Ala Ser Phe Asn Asn Ala Asn Phe Asn Asn Ser Ala Ser Phe Asn Phe Asn Asn Ser Asn Ala Thr Thr Ser Phe Val Gly Asp Phe Thr Asn Ala Asn Ser Asn Leu Gln Ile Ala Gly Asn Ala Val Phe Gly Asn Ser Thr Asn Gly Ser Gln Asn Thr Ala Asn Phe Asn Asn Thr Gly Ser Val Asn Ile Ser Gly Asn Ala Thr Phe Asp Asn Val Val Phe Asn Gly Pro Thr Asn Thr Ser Val Lys Gly Gln Val Thr Leu Asn Asn Ile Thr Leu Lys Asn Leu Asn Ala Pro Leu Ser Phe Gly Asp Gly Thr Ile Thr Phe Asn Ala His Ser Val Ile Asn Ile Ala Glu Ser Ile Thr Asn Gly Asn Pro Ile Thr Leu Val Ser Ser Ser Lys Glu Ile Glu Tyr Asn
Asn Ala Phe Ser Lys Asn Leu Trp Gln Leu Ile Asn Tyr Gln Gly His Gly Ala Ser Ser Glu Lys Leu Val Ser Ser Ala Gly Asn Gly Val Tyr Asp Val Val Tyr Ser Phe Asn Asn Gln Thr Tyr Asn Phe Gln Glu Val Phe Ser Gln Asn Ser Ile Ser Ile Arg Arg Leu Gly Val Asn Met Val Phe Asp Tyr Val Asp Met Glu Lys Ser Asp His Leu Tyr Tyr Gln Asn Ala Leu Gly Phe Met Thr Tyr Met Pro Asn Ser Tyr Asn Asn Asn Leu Gly Asn Ala Asn Asn Thr Ile Tyr Tyr Tyr Asp Lys Ser Ile Asp Phe Tyr Ala Ser Gly Lys Thr Leu Phe Thr Lys Ala Glu Phe Ser Gln Thr Phe Thr Gly Gln Asn Ser Ala Ile Val Phe Gly Ala Lys Ser Ile Trp Thr Ser Leu Ser Asp Ala Pro 1125 1130 1135 Gln Ser Asn Thr Ile Ile Arg Phe Gly Asp Asn Lys Gly Ala Gly Ser 1140 1145 1150 Asn Asp Ala Ser Gly His Cys Trp Asn Leu Gln Cys Ile Gly Phe Ile 1155 1160 1165 Thr Gly His Tyr Glu Ala Gln Lys Ile Tyr Ile Thr Gly Ser Ile Glu Ser Gly Asn Arg Ile Ser Ser Gly Gly Gly Ala Ser Leu Asn Phe Asn Gly Leu Gln Gly Ile Leu Leu Thr Asn Ala Thr Leu Tyr Asn Arg Ala 1205 1210 1215 Ala Gly Thr Gln Ser Ser Ser Met Asn Phe Ile Ser Asn Ser Ala Asn Ile Gln Ala Gln Asn Ser Tyr Phe Ile Asp Asp Thr Ala Gln Asn Gly 1235 1240 1245 Gly Asn Pro Asn Phe Ser Phe Asn Ala Leu Asn Leu Asp Phe Ser Asn 1250 1255 1260 Ser Ser Phe Arg Gly Tyr Val Gly Lys Thr Gln Ser Val Phe Lys Phe Asn Ala Lys Asn Ala Ile Ser Phe Thr Asn Ser Thr Asn Leu Ser Ser 1285 1290 1295 Gly Leu Tyr Gln Met Gln Ala Lys Ser Val Leu Phe Asp Asn Ser Asn 1300 1305 1310 Leu Ser Val Ser Val Gly Thr Ser Ser Ile Lys Ala Asn Ala Ile Asn Leu Ser Gln Asn Ala Ser Ile Asn Ala Ser Asn His Ser Thr Leu Glu Leu Gln Gly Asp Leu Asn Val Asn Asp Thr Ser Ser Leu Asn Leu Asn Gln Ser Thr Ile Asn Val Ser Asn Asn Ala Thr Ile Asn Asp Tyr Ala 1365 1370 1375 Ser Leu Ile Ala Ser Asn Gly Ser His Leu Asn Phe Asn Gly Ala Val Asn Phe Asn Ser Ala Asn Ile Thr Thr Ser Leu Asn Asn Ser Ser Ile 1395 1400 1405 Val Phe Lys Gly Ala Val Ser Leu Gly Gly Gln Phe Asn Leu Ser Asn 1410 1415 1420 Asn Ser Ser Leu Asp Phe Gln Gly Ser Ser Ala Ile Thr Ser Asn Thr Ala Phe Asn
Phe Tyr Asp Asn Ala Phe Ser Gln Ser Pro Ile Thr Phe His Gln Ala Leu Asp Ile Lys Ala Pro Leu Ser Leu Gly Gly Asn Leu 1460 1465 1470 Leu Asn Pro Asn Asn Ser Ser Val Leu Asp Leu Lys Asn Ser Gln Leu 1475 1480 1485 Val Phe Gly Asp Gln Gly Ser Leu Asn Ile Ala Asn Ile Asp Leu Leu Ser Asp Leu Asn Asp Asn Lys Asn Arg Val Tyr Asn Ile Ile Gln Ala Asp Met Asn Ser Asn Trp Tyr Glu Arg Ile Ser Phe Phe Gly Met His Ile Asn Asp Gly Ile Tyr Asp Ala Lys Asn Gln Thr Tyr Ser Phe Thr Asn Pro Leu Asn Asn Ala Leu Lys Ile Thr Glu Ser Phe Lys Asp Asn Gln Leu Ser Val Thr Leu Ser Gln Ile Pro Gly Ile Lys Asn Thr Leu Tyr Asn Ile Gly Ser Glu Ile Phe Asn Tyr Gln Lys Val Tyr Asn Asn 1585 1590 1595 1600 Ala Asn Gly Val Tyr Ser Tyr Ser Asp Asp Ala Gln Gly Val Phe Tyr Leu Thr Ser Asn Val Lys Gly Tyr Tyr Asn Pro Asn Gln Ser Tyr Gln Ala Ser Gly Ser Asn Asn Thr Thr Lys Asn Asn Asn Leu Thr Ser Glu Ser Ser Ile Ile Ser Gln Thr Tyr Asn Ala Gln Gly Asn Pro Ile Ser Ala Leu His Ile Tyr Asn Lys Gly Tyr Asn Phe Asn Asn Ile Lys Ala 1665 1670 1675 1680 Leu Gly Gln Met Ala Leu Lys Leu Tyr Pro Glu Ile Lys Lys Val Leu Gly Asn Asp Phe Ser Pro Ser Ser Leu Asn Ala Leu Asn Ser Asn Ala 1700 1705 1710 Leu Asn Gln Leu Thr Lys Leu Ile Thr Pro Asn Asp Trp Lys Asn Ile Asn Glu Leu Ile Asp Asn Ala Asn Asn Ser Val Val Gln Asn Phe Asn Asn Gly Thr Leu Ile Val Gly Ala Thr Gln Ile Gly Gln Thr Asp Thr Asn Ser Ala Val Val Phe Gly Gly Leu Gly Tyr Gln Thr Pro Cys Asp 1765 1770 1775 Tyr Thr Asp Ile Val Cys Gln Lys Phe Arg Gly Thr Tyr Leu Gly Gln Leu Leu Glu Ser Ser Ser Ala Asp Leu Gly Tyr Ile Asp Thr Thr Phe 1795 1800 1805 Asn Ala Lys Glu Ile Tyr Leu Thr Gly Thr Leu Gly Ser Gly Asn Ala Trp Gly Thr Gly Gly Ser Ala Ser Val Thr Phe Asn Ser Gln Thr Ser Leu Ile Leu Asn Gln Ala Asn Ile Val Ser Ser Gln Thr Asp Gly Ile Phe Ser Met Leu Gly Gln Glu Gly Ile Asn Lys Val Phe Asn Gln Ala Gly Leu Ala Asn Ile Leu Gly Glu Val Ala Val Gln Ser Ile Asn Lys 1875 1880 1885 Ala Gly Gly Leu Gly Asn Leu Ile Val Asn Thr Leu Gly Ser Asn Ser Val Ile Gly Gly
Tyr Leu Thr Pro Glu Gln Lys Asn Gln Thr Leu Ser Gln Leu Leu Gly Gln Asn Asn Phe Asp Asn Lc,u Met Asn Asp Ser Gly Leu Asn Thr Ala Ile Lys Asp Leu Ile Arg Gln Lys Leu Gly Phe Trp 1940 l945 l950 Thr Gly Leu Val Gly Gly Leu Ala Gly Leu G.Ly Gly Ile Asp Leu Gln Asn Pro Glu Lys Leu Ile Gly Ser Met Ser I.Le Asn Asp Leu Leu Ser Lys Lys Gly Leu Phe Asn Gln Ile Thr Gly Plze Ile Ser Ala Asn Asp Ile Gly Gln vat m a Ser-Val Met Leu Gln Asp Ile Val Lys Pro Ser Asn Ala Leu Lys Asn Asp Val Ala Ala Leu Gly Lys Gln Met Ile Gly Glu Phe Leu Gly Gln Asp Thr Leu Asn Ser Leu Glu Ser Leu Leu Gln Asn Gln Gln Ile Lys Ser Val Leu Asp Lys Val Leu Ala Ala Lys Gly Leu Gly Pro Ile Tyr Glu Gln Gly Leu Gly Asp Leu Ile Pro Asn Leu Gly Lys Lys Gly Leu Phe Ala Pro Tyr Gly Leu Ser Gln Val Trp Gln Lys Gly Asp Phe Ser Phe Asn Ala Gln Gly Asn Val Phe Val Gln Asn Ser Thr Phe Ser Asn Ala Asn Gly Gly Thr Leu Ser Phe Asn Ala Gly 21l5 2120 2125 Asn Ser Leu Ile Phe Ala Gly Asn Asn His Ile Ala Phe Thr Asn His Ala Gly Thr Leu Gln Leu Leu Ser Asp Gln Val Ser Asn Ile Asn Ile 14S 2150 2l55 2160 Thr Thr Leu Asn Ala Ser Asn Gly Leu Lys Ile Asn Ala Ala Asn Asn 2165 2l70 2175 Asn Val Ser Val Ser Gln Gly Asn Leu Phe Val Ser Ala Ser Cys Ala Gln Gln Ser Asp Pro Thr Thr Ala Asn Ile Ala Asn Pro Cys Ala Leu Ser Ala Gln Ser Thr Asn Gly Ala Ser Ser Asn Asn Ala Ser Asn Asn Ala Pro Ile Ala Leu Ser Asn Asn Asp Glu Ser Leu Met Val Ala Ala Asn Asp Phe Asn Phe Ser Gly Asn Ile Tyr Ala Asn Gly Val Val Asp Phe Ser Lys Ile Lys Gly Ser Ala Asn Ile Lys Asn Leu Tyr Leu Tyr Asn Asn Ala Gln Phe Gln Ala Asn Asn Leu Thr Ile Ser Asn Gln Ala Val Leu Glu Lys Asn Ala Ser Phe Val Thr Asn Asn Leu Asn Ile Gln Gly Ala Phe Asn Asn Asn Ala Thr Gln Lys Ile Glu Val Leu G1n Asn Leu Val Ile Ala Ser Asn Ala Ser Leu Ser Thr Gly Ile Tyr Gly Leu Glu Val Gly Gly Ala Leu Asn Asn Ser Gly Ala Ile His Phe Asn Leu Glu Asn Thr Gln Thr Pro Thr Pro Leu Ile Gln Ala Glu Gly Ile Ile Asn Leu Asn Thr Thr Gln Thr Pro Phe Met Asn Val Asn Asn Ser Met A1a Asn Asn Thr Thr Tyr Thr 
Leu Leu Lys Ser Ser Arg Tyr Ile Asp Tyr Asn Ile Asn Pro Asn Ser Leu Gln Ser Tyr Leu Asn Leu Tyr Thr Leu Ile Asn Ile Asn Gly Asn His Ile Glu Glu Lys Asn Gly Ala Leu Thr Tyr Leu Gly Gln Arg Val Leu Leu Gln Asp Lys Gly Leu Leu Leu Ser Val Ala Leu Pro Asn Ser Asn Asn Ala Ser Gln Asn Asn Ile Leu Ser Leu Ser Val Leu Tyr Asn Gln Val Lys Met Ser Cys Gly Asp Lys 465 2470 24'75 2480 Ala Met Asp Phe Thr Pro Pro Thr Leu Gln Aap Tyr Ile Val Gly Ile Gln Gly Gln Ser Ala Leu Asn Gln Ile Glu A.La Val Gly Gly Asn Ala Ile Lys Trp Leu Ser Thr Leu Met Met Glu Thr Lys Glu Asn Pro Phe Phe Ala Pro Ile Tyr Leu Lys Asn His Ser Lc.u Asn Glu Ile Leu Gly Val Thr Lys Asp Leu G1n Asn Thr Ala Ser Lc~u Ile Ser Asn Pro Asn Phe Arg Asp Asn Ala Thr Asn Leu Leu Glu Leu Ala Ser Tyr Thr Gln Gln Thr Ser Arg Leu Thr Lys Leu Ser Asp Phe Arg Ser Arg Glu Gly Glu Ser Asp Phe Ser Leu Leu Glu Leu Lys Asn Lys Arg Phe Ser Asp Pro Asn Pro Glu Val Phe Val Lys Tyr Ser Gln Leu Ser Lys His Pro Asn Asn Leu Trp Val Gln Gly Val Gly Gly A7_a Ser Phe Ile 5er Gly 625 2630 26,S5 2640 Gly Asn Gly Thr Leu Tyr Gly Leu Asn Ala G7_y Tyr Asp Arg Leu Val Lys Asn Val Ile Leu Gly Gly Tyr Val Ala Tyr Gly Tyr Ser Asp Phe Asn Gly Asn Ile Met His Ser Leu Gly Asn A~~n Val Asp Val Gly Met Tyr Ala Arg Ala Phe Leu Lys Arg Asn Glu Phe Thr Leu Ser Ala Asn Glu Thr Tyr Gly Gly Asn Ala Thr Ser Ile A:cn Ser Ser Asn Ser Leu 705 2710 277.5 2720 Leu Ser Val Leu Asn Gln Arg Tyr Asn Tyr A~:n Thr Trp Thr Thr Ser Val Asn Gly Asn Tyr Gly Tyr Asp Phe Met Phe Lys Gln Lys Ser Val Val Leu Lys Pro Gln Val Gly Leu Ser Tyr Hi.s Phe Ile Gly Leu Ser Gly Met Lys Gly Asn Asp Ala Ala Tyr Lys Gl.n Phe Leu Met His Ser Asn Pro Ser Asn Glu Ser Val Leu Thr Leu As~n Met Gly Leu Glu Ser Arg Lys Tyr Phe Gly Lys Asn Ser Tyr Tyr Phe Val Thr Ala Arg Leu Gly Arg Asp Leu Leu Ile Lys Ser Lys Gly Ser Asn Thr Val Arg Phe Val Gly Glu Asn Thr Leu Leu Tyr Arg Lys Gly Glu Val Phe Asn Thr Phe Ala Ser Val Ile Thr Gly Gly Glu Met His Leu Trp Arg Leu Val Tyr Val Asn Ala Gly Val Gly Leu Lys 
Met Gly Leu Gln Tyr Gln Asp WO 98I21225 PCT/US97/2i353 -Ile Asn Ile Thr Gly Asn-Val Gly Met Arg Val Ala Phe (2) INFORMATION FOR SEQ ID N0:167:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1376 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 13...1338 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:167:
Met Gly Asn His Phe Ser Lys Leu Gly Phe Val Leu Ala Ala Leu Gly Ser Ala Ile Gly Leu Gly His Ile Trp Arg Phe Pro Tyr Met Thr Gly Val Ser Gly Gly Gly Ala Phe Val Leu Leu Phe Leu Phe Leu Ser Leu Ser Val Gly Ala Ala Met Phe Ile Ala Glu Met Leu Leu Gly Gln Ser Thr Gln Lys Asn Val Thr Glu Ala Phe Lys Glu Leu Asp Ile Asn Pro Lys Lys Arg Trp Lys Tyr Ala Gly Leu Leu Leu Val Ser Gly Pro Leu Ile Leu Thr Phe Tyr Gly Thr Ile Leu Gly Trp Val Leu Tyr Tyr Leu Val Ser Val Ser Phe Asn Leu Pro Asn Asn Ile Gln Glu 110 115 120 125 Ser Glu Gln Ile Phe Thr Gln Thr Leu Gln Ser Ile Gly Leu Gln Ser Ile Gly Leu Phe Ser Val Leu Leu Ile Thr Gly Trp Ile Val Ser Arg Gly Ile Lys Glu Gly Ile Glu Lys Leu Asn Leu Val Leu Met Pro Leu Leu Phe Ala Thr Phe Phe Gly Leu Leu Phe Tyr Ala Met Ser Met Asp 175 180 185 Ser Phe Ser Lys Ala Phe His Phe Met Phe Asp Phe Lys Pro Lys Asp Leu Thr Ser Gln Val Phe Thr Tyr Ser Leu Gly Gln Val Phe Phe Ser Leu Ser Ile Gly Leu Gly Ile Asn Ile Thr Tyr Ala Ala Val Thr Asp Lys Thr Gln Asn Leu Leu Lys Ser Thr Ile Trp Val Val Leu Ser Gly Ile Leu Ile Ser Leu Val Ala Gly Leu Met Ile Phe Thr Phe Val Phe Glu Tyr Gly Ala Asn Val Ser Gln Gly Thr Gly Leu Ile Phe Thr Ser TTA CCG GTG GTT TTT GGC CAA ATG GGA GCG ATA GGC ATT CTT GTT TCG 915 Leu Pro Val Val Phe Gly Gln Met Gly Ala Ile Gly Ile Leu Val Ser Ile Leu Phe Leu Leu Ala Leu Ala Phe Ala Gly Ile Thr Ser Thr Val Ala Leu Leu Glu Pro Ser Val Met Tyr Leu Thr Glu Arg Tyr Gln Tyr Ser Arg Phe Lys Val Thr Trp Gly Leu Val Ala Leu Ile Phe Val Val Gly Val Val Leu Ile Phe Ser Leu His Lys Asp Tyr Lys Asp Tyr Leu Thr Phe Phe Glu Lys Ser Leu Phe Asp Trp Leu Asp Phe Ala Ser Ser ACC ATT ATC ATG CCT TTA GGC GGG ATG GCA ACC TTT ATT TTT ATG GGT 1203 Thr Ile Ile Met Pro Leu Gly Gly Met Ala Thr Phe Ile Phe Met Gly TGG GTT TTG AAA AAA GAA AAA TTG CGT CTT TTG AGC GTG CAC TTT TTA 1251 Trp Val Leu Lys Lys Glu Lys Leu Arg Leu Leu Ser Val His Phe Leu Gly Pro Lys Leu Phe Ala Thr Trp Tyr Phe Leu Leu Lys Tyr Ile Thr 415 420 425 Pro Leu Ile Val Phe Ser Ile Trp Leu
Ser Lys Ile Tyr
(2) INFORMATION FOR SEQ ID NO:168:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 442 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:168:
Met Gly Asn His Phe Ser Lys Leu Gly Phe Val Leu Ala Ala Leu Gly Ser Ala Ile Gly Leu Gly His Ile Trp Arg Phe Pro Tyr Met Thr Gly Val Ser Gly Gly Gly Ala Phe Val Leu Leu Phe Leu Phe Leu Ser Leu Ser Val Gly Ala Ala Met Phe Ile Ala Glu Met Leu Leu Gly Gln Ser Thr Gln Lys Asn Val Thr Glu Ala Phe Lys Glu Leu Asp Ile Asn Pro Lys Lys Arg Trp Lys Tyr Ala Gly Leu Leu Leu Val Ser Gly Pro Leu Ile Leu Thr Phe Tyr Gly Thr Ile Leu Gly Trp Val Leu Tyr Tyr Leu Val Ser Val Ser Phe Asn Leu Pro Asn Asn Ile Gln Glu Ser Glu Gln Ile Phe Thr Gln Thr Leu Gln Ser Ile Gly Leu Gln Ser Ile Gly Leu 130 135 140 Phe Ser Val Leu Leu Ile Thr Gly Trp Ile Val Ser Arg Gly Ile Lys Glu Gly Ile Glu Lys Leu Asn Leu Val Leu Met Pro Leu Leu Phe Ala Thr Phe Phe Gly Leu Leu Phe Tyr Ala Met Ser Met Asp Ser Phe Ser Lys Ala Phe His Phe Met Phe Asp Phe Lys Pro Lys Asp Leu Thr Ser Gln Val Phe Thr Tyr Ser Leu Gly Gln Val Phe Phe Ser Leu Ser Ile Gly Leu Gly Ile Asn Ile Thr Tyr Ala Ala Val Thr Asp Lys Thr Gln Asn Leu Leu Lys Ser Thr Ile Trp Val Val Leu Ser Gly Ile Leu Ile Ser Leu Val Ala Gly Leu Met Ile Phe Thr Phe Val Phe Glu Tyr Gly Ala Asn Val Ser Gln Gly Thr Gly Leu Ile Phe Thr Ser Leu Pro Val Val Phe Gly Gln Met Gly Ala Ile Gly Ile Leu Val Ser Ile Leu Phe Leu Leu Ala Leu Ala Phe Ala Gly Ile Thr Ser Thr Val Ala Leu Leu 305 310 315 320 Glu Pro Ser Val Met Tyr Leu Thr Glu Arg Tyr Gln Tyr Ser Arg Phe Lys Val Thr Trp Gly Leu Val Ala Leu Ile Phe Val Val Gly Val Val Leu Ile Phe Ser Leu His Lys Asp Tyr Lys Asp Tyr Leu Thr Phe Phe Glu Lys Ser Leu Phe Asp Trp Leu Asp Phe Ala Ser Ser Thr Ile Ile Met Pro Leu Gly Gly Met Ala Thr Phe Ile Phe Met Gly Trp Val Leu Lys Lys Glu Lys Leu Arg Leu Leu Ser Val His Phe Leu Gly Pro Lys Leu Phe Ala Thr Trp Tyr Phe Leu Leu Lys Tyr Ile Thr Pro Leu Ile Val Phe Ser Ile Trp Leu Ser Lys Ile Tyr
(2) INFORMATION FOR SEQ ID NO:169:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1392 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 22...1356 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:169:
Met Lys Ile Phe Gly Thr Asp Gly Val Arg Gly Lys Ala Gly Val Lys Leu Thr Pro Met Phe Val Met Arg Leu Gly Ile Ala Ala Gly Leu Tyr Phe Lys Lys His Ser Gln Thr Asn Lys Ile CTA ATC GGT AAA GAC ACC AGA AAA AGC GGC TAT ATG GTA GAA AAC GCT 195 Leu Ile Gly Lys Asp Thr Arg Lys Ser Gly Tyr Met Val Glu Asn Ala Leu Val Ser Ala Leu Thr Ser Ile Gly Tyr Asn Val Ile Gln Ile Gly Pro Met Pro Thr Pro Ala Ile Ala Phe Leu Thr Glu Asp Met Arg Cys Asp Ala Gly Ile Met Ile Ser Ala Ser His Asn Pro Phe Glu Asp Asn Gly Ile Lys Phe Phe Asn Ser Tyr Gly Tyr Lys Leu Lys Glu Glu Glu Glu Lys Ala Ile Glu Glu Ile Phe His Asp Glu Glu Leu Leu His Ser 125 130 135 Ser Tyr Lys Val Gly Glu Ser Val Gly Ser Ala Lys Arg Ile Asp Asp 140 145 150 Val Ile Gly Arg Tyr Ile Ala His Leu Lys His Ser Phe Pro Lys His Leu Asn Leu Gln Ser Leu Arg Ile Val Leu Asp Thr Ala Asn Gly Ala 175 180 185 Ala Tyr Lys Val Ala Pro Val Val Phe Ser Glu Leu Gly Ala Asp Val Leu Val Ile Asn Asp Glu Pro Asn Gly Cys Asn Ile Asn Asp Gln Cys Gly Ala Leu His Pro Asn Gln Leu Ser Gln Glu Val Lys Lys Tyr Arg Ala Asp Leu Gly Phe Ala Phe Asp Gly Asp Ala Asp Arg Leu Val Val Val Asp Asn Leu Gly Asn Ile Val His Gly Asp Lys Leu Leu Gly Val TTA GGG GTT TAT CAA AAA TCT AAA AAC GCC CTT TCT TCT CAA GCG GTT 867 Leu Gly Val Tyr Gln Lys Ser Lys Asn Ala Leu Ser Ser Gln Ala Val Val Ala Thr Asn Met Ser Asn Leu Ala Leu Lys Glu Tyr Leu Lys Ser Gln Asp Leu Glu Leu Lys His Cys Ala Ile Gly Asp Lys Phe Val Ser GAA TGC ATG CAA TTG AAT AAA GCC AAT TTT GGA GGC GAG CAA AGC GGG 1011 Glu Cys Met Gln Leu Asn Lys Ala Asn Phe Gly Gly Glu Gln Ser Gly CAT ATC ATT TTT AGC GAT TAC GCT AAA ACA GGC GAT GGT TTG GTG TGC 1059 His Ile Ile Phe Ser Asp Tyr Ala Lys Thr Gly Asp Gly Leu Val Cys Ala Leu Gln Val Ser Ala Leu Val Leu Glu Ser Lys Gln Val Ser Ser GTT GCG TTA AAC CCC TTT GAA TTA TAC CCC CAA AGC CTA GTG AAT TTG 1155 Val Ala Leu Asn Pro Phe Glu Leu Tyr Pro Gln Ser Leu Val Asn Leu Asn Val Gln Lys Lys Pro Pro Leu Glu Ser Leu Lys Gly Tyr Ser Ala Leu Leu Lys Glu Leu Asp Lys Leu
Glu Ile Arg His Leu Ile Arg Tyr AGC GGC ACT GAA AAC AAA TTG CGA ATC CTT TTA GAA GCT AAA GAT GAA 1299 Ser Gly Thr Glu Asn Lys Leu Arg Ile Leu Leu Glu Ala Lys Asp Glu Lys Leu Leu Glu Ser Lys Met Gln Glu Leu Lys Glu Phe Phe Glu Gly CAT TTG TGC TAAAAACCAC TAAAAAAAGC CTGTTGGTTT TTATGG 1392 His Leu Cys
(2) INFORMATION FOR SEQ ID NO:170:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 445 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:170:
Met Lys Ile Phe Gly Thr Asp Gly Val Arg Gly Lys Ala Gly Val Lys Leu Thr Pro Met Phe Val Met Arg Leu Gly Ile Ala Ala Gly Leu Tyr Phe Lys Lys His Ser Gln Thr Asn Lys Ile Leu Ile Gly Lys Asp Thr Arg Lys Ser Gly Tyr Met Val Glu Asn Ala Leu Val Ser Ala Leu Thr Ser Ile Gly Tyr Asn Val Ile Gln Ile Gly Pro Met Pro Thr Pro Ala Ile Ala Phe Leu Thr Glu Asp Met Arg Cys Asp Ala Gly Ile Met Ile Ser Ala Ser His Asn Pro Phe Glu Asp Asn Gly Ile Lys Phe Phe Asn Ser Tyr Gly Tyr Lys Leu Lys Glu Glu Glu Glu Lys Ala Ile Glu Glu Ile Phe His Asp Glu Glu Leu Leu His Ser Ser Tyr Lys Val Gly Glu 130 135 140 Ser Val Gly Ser Ala Lys Arg Ile Asp Asp Val Ile Gly Arg Tyr Ile Ala His Leu Lys His Ser Phe Pro Lys His Leu Asn Leu Gln Ser Leu Arg Ile Val Leu Asp Thr Ala Asn Gly Ala Ala Tyr Lys Val Ala Pro Val Val Phe Ser Glu Leu Gly Ala Asp Val Leu Val Ile Asn Asp Glu Pro Asn Gly Cys Asn Ile Asn Asp Gln Cys Gly Ala Leu His Pro Asn Gln Leu Ser Gln Glu Val Lys Lys Tyr Arg Ala Asp Leu Gly Phe Ala Phe Asp Gly Asp Ala Asp Arg Leu Val Val Val Asp Asn Leu Gly Asn Ile Val His Gly Asp Lys Leu Leu Gly Val Leu Gly Val Tyr Gln Lys Ser Lys Asn Ala Leu Ser Ser Gln Ala Val Val Ala Thr Asn Met Ser Asn Leu Ala Leu Lys Glu Tyr Leu Lys Ser Gln Asp Leu Glu Leu Lys His Cys Ala Ile Gly Asp Lys Phe Val Ser Glu Cys Met Gln Leu Asn Lys Ala Asn Phe Gly Gly Glu Gln Ser Gly His Ile Ile Phe Ser Asp Tyr Ala Lys Thr Gly Asp Gly Leu Val Cys Ala Leu Gln Val Ser Ala Leu Val Leu Glu Ser Lys Gln Val Ser Ser Val Ala Leu Asn Pro Phe Glu Leu Tyr Pro Gln Ser Leu Val Asn Leu Asn Val Gln Lys Lys Pro Pro Leu Glu Ser Leu Lys Gly Tyr Ser Ala Leu Leu Lys Glu Leu Asp 385 390 395 400 Lys Leu Glu Ile Arg His Leu Ile Arg Tyr Ser Gly Thr Glu Asn Lys Leu Arg Ile Leu Leu Glu Ala Lys Asp Glu Lys Leu Leu Glu Ser Lys Met Gln Glu Leu Lys Glu Phe Phe Glu Gly His Leu Cys
(2) INFORMATION FOR SEQ ID NO:171:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:171:
(2) INFORMATION FOR SEQ ID NO:172:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:172:
(2) INFORMATION FOR SEQ ID NO:173:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:173:
(2) INFORMATION FOR SEQ ID NO:174:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:174:
(2) INFORMATION FOR SEQ ID NO:175:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:175:
(2) INFORMATION FOR SEQ ID NO:176:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:176:
(2) INFORMATION FOR SEQ ID NO:177:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:177:
(2) INFORMATION FOR SEQ ID NO:178:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:178:
(2) INFORMATION FOR SEQ ID NO:179:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:179:
(2) INFORMATION FOR SEQ ID NO:180:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:180:
GCCCTCGAGT CATTTTAAAC GACTCAAAAC AAA 33
(2) INFORMATION FOR SEQ ID NO:181:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:181:
(2) INFORMATION FOR SEQ ID NO:182:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:182:
(2) INFORMATION FOR SEQ ID NO:183:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:183:
(2) INFORMATION FOR SEQ ID NO:184:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:184:
(2) INFORMATION FOR SEQ ID NO:185:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:185:
(2) INFORMATION FOR SEQ ID NO:186:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:186:
(2) INFORMATION FOR SEQ ID NO:187:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:187:
(2) INFORMATION FOR SEQ ID NO:188:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:188:
(2) INFORMATION FOR SEQ ID NO:189:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:189:
(2) INFORMATION FOR SEQ ID NO:190:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:190:
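The coordinates given in the sequence listing above lend themselves to a quick consistency check. The following is a minimal Python sketch, not part of the original disclosure. It confirms that the coding-sequence locations stated for SEQ ID NO:167 (13...1338) and SEQ ID NO:169 (22...1356) span exactly the number of codons stated as the lengths of SEQ ID NO:168 (442 amino acids) and SEQ ID NO:170 (445 amino acids), and it translates one codon row of SEQ ID NO:167 for comparison with the residues printed beneath it. The function names `cds_codon_count` and `translate` are hypothetical, and the use of the standard genetic code is an assumption.

```python
from itertools import product

# Standard genetic code (an assumption), with codons enumerated over T, C, A, G.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def cds_codon_count(start: int, end: int) -> int:
    """Return the number of whole codons in an inclusive 1-based location."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"location {start}...{end} is not a whole number of codons")
    return length // 3


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into three-letter codes."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# Codon row of SEQ ID NO:167 ending at nucleotide 1203, as printed in the listing.
ROW_167 = "ACC ATT ATC ATG CCT TTA GGC GGG ATG GCA ACC TTT ATT TTT ATG GGT"

if __name__ == "__main__":
    print(cds_codon_count(13, 1338))   # codons in the SEQ ID NO:167 coding region
    print(cds_codon_count(22, 1356))   # codons in the SEQ ID NO:169 coding region
    print(translate(ROW_167))
```

Neither coding region includes a stop codon in its stated location, which is why the codon counts equal the polypeptide lengths directly.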
Claims (4)
1. An isolated polynucleotide that encodes:
(i) a polypeptide comprising an amino acid sequence that is homologous to the amino acid sequence of Helicobacter polypeptide, wherein said amino acid sequence of said Helicobacter polypeptide is selected from the group consisting of the amino acid sequences as shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO 73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO 136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID
NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO
474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO 699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID
NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO
888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO 1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ
ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO
1212), SEQ ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO 1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ
ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO
1378), SEQ ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO 1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ
ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO
1045), SEQ ID NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID
NO:102 (GHPO 1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO
1012), SEQ ID NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID
NO:112 (GHPO 1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO
1501), SEQ ID NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 1001), SEQ ID
NO:122 (GHPO 732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ ID NO:134 (GHPO 956), SEQ ID NO:136 (HPO 98), SEQ ID
NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ
ID NO:154 (GHPO 253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO
615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID
NO:164 (GHPO 1420), SEQ ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO
1719), and SEQ ID NO:170 (GHPO 1252); or (ii) a derivative of said polypeptide encoded by said polynucleotide.
2. The isolated polynucleotide of claim 1, which encodes a mature form of said polypeptide.
3. The isolated polynucleotide of claim 1 or 2, wherein the polynucleotide is a DNA molecule.
4. A compound, in a substantially purified form, that is the mature form or a derivative of a polypeptide comprising an amino acid sequence that is homologous to a Helicobacter amino acid sequence that is selected from the group consisting of the amino acid sequences as shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO
73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO
136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID
NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO 474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO
699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID
NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO 888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO
1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO 1212), SEQ
ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO
1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO 1378), SEQ
ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO
1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO 1045), SEQ ID
NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID NO:102 (GHPO
1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO 1012), SEQ ID
NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID NO:112 (GHPO
1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO 1501), SEQ ID
NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 1001), SEQ ID NO:122 (GHPO
732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ
ID NO:134 (GHPO 956), SEQ ID NO:136 (HPO 98), SEQ ID NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ
ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ ID NO:154 (GHPO
253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO 615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID NO:164 (GHPO 1420), SEQ
ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO 1719), and SEQ ID NO:170 (GHPO 1252); or (ii) a derivative of said polypeptide.
5. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of a compound of claim 4.
6. The method of claim 5, further comprising administering to said mammal an antibiotic, an antisecretory agent, a bismuth salt, or a combination thereof.
7. The method of claim 6, wherein said antibiotic is selected from the group consisting of amoxicillin, clarithromycin, tetracycline, metronidazole, and erythromycin, and said bismuth salt is selected from the group consisting of bismuth subcitrate and bismuth subsalicylate.
8. The method of claim 6, wherein said antisecretory agent is a proton pump inhibitor, an H2-receptor antagonist, or a prostaglandin analog.
9. The method of claim 8, wherein said proton pump inhibitor is selected from the group consisting of omeprazole, lansoprazole, and pantoprazole; said H2-receptor antagonist is selected from the group consisting of ranitidine, cimetidine, famotidine, nizatidine, and roxatidine; and said prostaglandin analog is selected from the group consisting of misoprostol and enprostil.
10. The method of claim 5, further comprising administering to said mammal a prophylactically or therapeutically effective amount of a second Helicobacter polypeptide or a derivative thereof.
11. The method of claim 10, wherein the second Helicobacter polypeptide is a Helicobacter urease, a subunit, or a derivative thereof.
12. A composition comprising a compound of claim 4, together with a physiologically acceptable diluent or carrier.
13. The composition of claim 12, further comprising an adjuvant.
14. The composition of claim 12, further comprising a second Helicobacter polypeptide or a derivative thereof.
15. The composition of claim 14, wherein said second Helicobacter polypeptide is a Helicobacter urease, a subunit, or a derivative thereof.
16. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of a polynucleotide of claim 1.
17. A composition comprising a viral vector, in the genome of which is inserted a DNA molecule of claim 1, said DNA molecule being placed under conditions for expression in a mammalian cell and said viral vector being admixed with a physiologically acceptable diluent or carrier.
18. A composition that comprises a bacterial vector comprising a DNA
molecule of claim 1, said DNA molecule being placed under conditions for expression and said bacterial vector being admixed with a physiologically acceptable diluent or carrier.
19. The composition of claim 18, wherein said vector is selected from the group consisting of Shigella, Salmonella, Vibrio cholerae, Lactobacillus, Bacille bilié de Calmette-Guérin, and Streptococcus.
20. A composition comprising a polynucleotide of claim 1, together with a physiologically acceptable diluent or carrier.
21. The composition of claim 20, wherein said polynucleotide is a DNA
molecule that is inserted in a plasmid that is unable to replicate and to substantially integrate in a mammalian genome and is placed under conditions for expression in a mammalian cell.
22. An expression cassette comprising a DNA molecule of claim 1, said DNA molecule being placed under conditions for expression in a procaryotic or eucaryotic cell.
23. A process for producing a compound of claim 4, which comprises culturing a procaryotic or eucaryotic cell transformed or transfected with an expression cassette of claim 22, and recovering said compound from the cell culture.
24. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of an antibody that binds to the compound of
4. A compound, in a substantially purified form, that is the mature form or a derivative of a polypeptide comprising an amino acid sequence that is homologous to a Helicobacter amino acid sequence that is selected from the group consisting of the amino acid sequences as shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO
73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO
136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID
NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO 474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO
699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID
NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO 888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO
1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO 1212), SEQ
ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO
1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO 1378), SEQ
ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO
1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO 1045), SEQ ID
NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID NO:102 (GHPO
1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO 1012), SEQ ID
NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID NO:112 (GHPO
1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO 1501), SEQ ID
NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 1001), SEQ ID NO:122 (GHPO
732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ
ID NO:134 (GHPO 956), SEQ ID NO:136 (HPO 98), SEQ ID NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ
ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ ID NO:154 (GHPO
253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO 615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID NO:164 (GHPO 1420), SEQ
ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO 1719), and SEQ ID NO:170 (GHPO 1252); or (ii) a derivative of said polypeptide.
5. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of a compound of claim 4.
6. The method of claim 5, further comprising administering to said mammal an antibiotic, an antisecretory agent, a bismuth salt, or a combination thereof.
7. The method of claim 6, wherein said antibiotic is selected from the group consisting of amoxicillin, clarithromycin, tetracycline, metronidazole, and erythromycin, and said bismuth salt is selected from the group consisting of bismuth subcitrate and bismuth subsalicylate.
8. The method of claim 6, wherein said antisecretory agent is a proton pump inhibitor, an H2-receptor antagonist, or a prostaglandin analog.
9. The method of claim 8, wherein said proton pump inhibitor is selected from the group consisting of omeprazole, lansoprazole, and pantoprazole; said H2-receptor antagonist is selected from the group consisting of ranitidine, cimetidine, famotidine, nizatidine, and roxatidine; and said prostaglandin analog is selected from the group consisting of misoprostol and enprostil.
10. The method of claim 5, further comprising administering to said mammal a prophylactically or therapeutically effective amount of a second Helicobacter polypeptide or a derivative thereof.
11. The method of claim 10, wherein the second Helicobacter polypeptide is a Helicobacter urease, a subunit, or a derivative thereof.
12. A composition comprising a compound of claim 4, together with a physiologically acceptable diluent or carrier.
13. The composition of claim 12, further comprising an adjuvant.
14. The composition of claim 12, further comprising a second Helicobacter polypeptide or a derivative thereof.
15. The composition of claim 14, wherein said second Helicobacter polypeptide is a Helicobacter urease, a subunit, or a derivative thereof.
16. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of a polynucleotide of claim 1.
17. A composition comprising a viral vector, in the genome of which is inserted a DNA molecule of claim 1, said DNA molecule being placed under conditions for expression in a mammalian cell and said viral vector being admixed with a physiologically acceptable diluent or carrier.
18. A composition that comprises a bacterial vector comprising a DNA molecule of claim 1, said DNA molecule being placed under conditions for expression and said bacterial vector being admixed with a physiologically acceptable diluent or carrier.
19. The composition of claim 18, wherein said vector is selected from the group consisting of Shigella, Salmonella, Vibrio cholerae, Lactobacillus, Bacille bilié de Calmette-Guérin, and Streptococcus.
20. A composition comprising a polynucleotide of claim 1, together with a physiologically acceptable diluent or carrier.
21. The composition of claim 20, wherein said polynucleotide is a DNA molecule that is inserted in a plasmid that is unable to replicate and to substantially integrate in a mammalian genome and is placed under conditions for expression in a mammalian cell.
22. An expression cassette comprising a DNA molecule of claim 1, said DNA molecule being placed under conditions for expression in a procaryotic or eucaryotic cell.
23. A process for producing a compound of claim 4, which comprises culturing a procaryotic or eucaryotic cell transformed or transfected with an expression cassette of claim 22, and recovering said compound from the cell culture.
24. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of an antibody that binds to the compound of claim 4.
Applications Claiming Priority (13)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US74905196A | 1996-11-14 | 1996-11-14 | |
US08/749,051 | 1996-11-14 | ||
US83345797A | 1997-04-01 | 1997-04-01 | |
US83130997A | 1997-04-01 | 1997-04-01 | |
US08/834,705 | 1997-04-01 | ||
US08/834,705 US20030023066A1 (en) | 1996-11-14 | 1997-04-01 | Helicobacter polypeptides and corresponding polynucleotide molecules |
US08/831,309 | 1997-04-01 | ||
US08/833,457 | 1997-04-01 | ||
US88122797A | 1997-06-24 | 1997-06-24 | |
US08/881,227 | 1997-06-24 | ||
US90261597A | 1997-07-29 | 1997-07-29 | |
US08/902,615 | 1997-07-29 | ||
PCT/US1997/021353 WO1998021225A1 (en) | 1996-11-14 | 1997-11-14 | Helicobacter polypeptides and corresponding polynucleotide molecules |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2271774A1 (en) | 1998-05-22 |
Family
ID=27560272
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002271774A Abandoned CA2271774A1 (en) | 1996-11-14 | 1997-11-14 | Helicobacter polypeptides and corresponding polynucleotide molecules |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1021458A4 (en) |
AU (1) | AU735391B2 (en) |
CA (1) | CA2271774A1 (en) |
WO (1) | WO1998021225A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6503747B2 (en) | 1998-07-14 | 2003-01-07 | University Of Hawaii | Serotype-specific probes for Listeria monocytogenes |
WO2000049044A1 (en) * | 1999-02-19 | 2000-08-24 | Astrazeneca Ab | Expression of helicobacter polypeptides in pichia pastoris |
CA2385822A1 (en) * | 1999-05-31 | 2000-12-07 | Max-Planck-Gesellschaft Zur Forderung Der Wissenschaften E.V. | Essential gene and gene products for identifying, developing and optimising immunological and pharmacological active ingredients for the treatment of microbial infections |
AU2002218985A1 (en) * | 2000-11-15 | 2002-05-27 | Ludwig Deml | Helicobacter cysteine rich protein a (hcpa) and uses thereof |
CN111793137A (en) * | 2019-12-12 | 2020-10-20 | 南京蛋球球生物医学技术合伙企业(有限合伙) | Hp tetravalent antigen and preparation method and application thereof |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB8928625D0 (en) * | 1989-12-19 | 1990-02-21 | 3I Res Expl Ltd | H.pylori dna probes |
US5733740A (en) * | 1992-10-13 | 1998-03-31 | Vanderbilt University | Taga gene and methods for detecting predisposition to peptic ulceration and gastric carcinoma |
BR9609871A (en) * | 1995-04-28 | 2000-03-28 | Ora Vax Inc | Multimeric recombinant urease vaccine |
SK165197A3 (en) * | 1995-06-07 | 1999-01-11 | Astra Ab | Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics |
AU1055497A (en) * | 1995-11-17 | 1997-06-11 | Astra Aktiebolag | Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics |
AU726892B2 (en) * | 1996-03-29 | 2000-11-23 | Astra Aktiebolag | Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof |
- 1997
- 1997-11-14 EP EP97947620A patent/EP1021458A4/en not_active Withdrawn
- 1997-11-14 CA CA002271774A patent/CA2271774A1/en not_active Abandoned
- 1997-11-14 AU AU52662/98A patent/AU735391B2/en not_active Ceased
- 1997-11-14 WO PCT/US1997/021353 patent/WO1998021225A1/en not_active Application Discontinuation
Also Published As
Publication number | Publication date |
---|---|
EP1021458A1 (en) | 2000-07-26 |
AU735391B2 (en) | 2001-07-05 |
WO1998021225A1 (en) | 1998-05-22 |
EP1021458A4 (en) | 2001-12-12 |
AU5266298A (en) | 1998-06-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7179448B2 (en) | Recombinant constructs of Borrelia burgdorferi | |
AU784193B2 (en) | Chlamydia antigens and corresponding DNA fragments and uses thereof | |
WO2002018595A9 (en) | Moraxella polypeptides and corresponding dna fragments and uses thereof | |
JP2001527393A (en) | Identification of a polynucleotide encoding a novel Helicobacter polypeptide in the Helicobacter genome | |
EP1311682B1 (en) | Recombinant constructs of borrelia burgdorferi | |
EP1163342B1 (en) | Chlamydia antigens and corresponding dna fragments and uses thereof | |
AU750792B2 (en) | 76 kDa, 32 kDa, and 50 kDa helicobacter polypeptides and corresponding polynucleotide molecules | |
CA2271774A1 (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
US20030158396A1 (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
US20030124141A1 (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
US20020115078A1 (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
US20030023066A1 (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
US20020107368A1 (en) | Helicobacter proteins, gene sequences and uses thereof | |
US20020160456A1 (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
US20020026035A1 (en) | Helicobacter ghpo 1360 and ghpo 750 polypeptides and corresponding polynucleotide molecules | |
US20030069404A1 (en) | Helicobacter antigens and corresponding DNA fragments | |
US20020044949A1 (en) | 76 kda helicobacter polypeptides and corresponding polynucleotide molecules | |
EP1535928B1 (en) | Vaccine compositions comprising Omp85 proteins of Neisseria gonorrhoeae and Neisseria meningitidis | |
JP2001503637A (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
WO1997003359A1 (en) | Helicobacter clpb | |
EP1939294A1 (en) | Recombinant constructs of borrelia burgdorferi | |
WO1997028264A1 (en) | A novel gene encoding an outer membrane protein of helicobacter pylori and a recombinant microorganism expressing the same | |
CA2354431A1 (en) | Chlamydia antigens and corresponding dna fragments and uses thereof | |
KR20060128894A (en) | Immunization against chlamydia infection | |
CA2223395A1 (en) | Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
FZDE | Discontinued |