MXPA00006576A - Toxins active against Ostrinia nubilalis
- Publication number
- MXPA00006576A (MXPA/A/2000/006576A)
- Authority
- MX
- Mexico
- Prior art keywords
- leu
- thr
- asn
- glu
- gly
- Prior art date
- 231100000765 Toxin Toxicity 0.000 title claims abstract description 274
- 239000003053 toxin Substances 0.000 title claims abstract description 274
- 241001147398 Ostrinia nubilalis Species 0.000 title claims abstract description 10
- 108020003112 toxins Proteins 0.000 title abstract description 261
- 229920001850 Nucleic acid sequence Polymers 0.000 claims abstract description 81
- 241000607479 Yersinia pestis Species 0.000 claims abstract description 49
- 230000000361 pesticidal Effects 0.000 claims abstract description 22
- 241000193388 Bacillus thuringiensis Species 0.000 claims abstract description 18
- 229940097012 Bacillus thuringiensis Drugs 0.000 claims abstract description 17
- 239000002773 nucleotide Substances 0.000 claims abstract description 13
- 125000003729 nucleotide group Chemical group 0.000 claims abstract description 13
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 103
- 239000000575 pesticide Substances 0.000 claims description 26
- 241000209149 Zea Species 0.000 claims description 10
- 229920000023 polynucleotide Polymers 0.000 claims description 10
- 239000002157 polynucleotide Substances 0.000 claims description 10
- 235000002017 Zea mays subsp mays Nutrition 0.000 claims description 9
- 102000004965 antibodies Human genes 0.000 claims description 9
- 108090001123 antibodies Proteins 0.000 claims description 9
- 235000005822 corn Nutrition 0.000 claims description 9
- 235000005824 corn Nutrition 0.000 claims description 9
- 230000000295 complement Effects 0.000 claims description 6
- 241001147397 Ostrinia Species 0.000 claims description 5
- 230000001276 controlling effect Effects 0.000 claims description 4
- 235000019622 astringency Nutrition 0.000 claims 1
- 235000019606 astringent taste Nutrition 0.000 claims 1
- 241000196324 Embryophyta Species 0.000 abstract description 47
- 238000000034 method Methods 0.000 abstract description 45
- 239000000463 material Substances 0.000 abstract description 11
- 229920003013 deoxyribonucleic acid Polymers 0.000 description 119
- 150000001413 amino acids Chemical class 0.000 description 89
- 108010061238 threonyl-glycine Proteins 0.000 description 88
- XGDCYUQSFDQISZ-BQBZGAKWSA-N Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O XGDCYUQSFDQISZ-BQBZGAKWSA-N 0.000 description 73
- LRKCBIUDWAXNEG-CSMHCCOUSA-N Leu-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRKCBIUDWAXNEG-CSMHCCOUSA-N 0.000 description 68
- WITCOKQIPFWQQD-FSPLSTOPSA-N Val-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O WITCOKQIPFWQQD-FSPLSTOPSA-N 0.000 description 61
- 108010034529 leucyl-lysine Proteins 0.000 description 60
- 102000004169 proteins and genes Human genes 0.000 description 59
- 108090000623 proteins and genes Proteins 0.000 description 59
- NFNVDJGXRFEYTK-YUMQZZPRSA-N Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O NFNVDJGXRFEYTK-YUMQZZPRSA-N 0.000 description 58
- 210000004027 cells Anatomy 0.000 description 57
- 150000007523 nucleic acids Chemical class 0.000 description 56
- 108010013835 arginine glutamate Proteins 0.000 description 55
- 108020004707 nucleic acids Proteins 0.000 description 55
- JYOAXOMPIXKMKK-UHFFFAOYSA-N Leucyl-Glutamine Chemical compound CC(C)CC(N)C(=O)NC(C(O)=O)CCC(N)=O JYOAXOMPIXKMKK-UHFFFAOYSA-N 0.000 description 54
- OTXBNHIUIHNGAO-UWVGGRQHSA-N Leu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN OTXBNHIUIHNGAO-UWVGGRQHSA-N 0.000 description 53
- 241000880493 Leptailurus serval Species 0.000 description 52
- BQBCIBCLXBKYHW-CSMHCCOUSA-N Thr-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])[C@@H](C)O BQBCIBCLXBKYHW-CSMHCCOUSA-N 0.000 description 51
- 108010089804 glycyl-threonine Proteins 0.000 description 51
- MLTRLIITQPXHBJ-BQBZGAKWSA-N Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O MLTRLIITQPXHBJ-BQBZGAKWSA-N 0.000 description 50
- XNSKSTRGQIPTSE-UHFFFAOYSA-N Arginyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CCCNC(N)=N XNSKSTRGQIPTSE-UHFFFAOYSA-N 0.000 description 49
- DSGIVWSDDRDJIO-ZXXMMSQZSA-N Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DSGIVWSDDRDJIO-ZXXMMSQZSA-N 0.000 description 49
- QXRNAOYBCYVZCD-BQBZGAKWSA-N (2S)-6-amino-2-[[(2S)-2-aminopropanoyl]amino]hexanoic acid Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN QXRNAOYBCYVZCD-BQBZGAKWSA-N 0.000 description 48
- WOUIMBGNEUWXQG-VKHMYHEASA-N Ser-Gly Chemical compound OC[C@H](N)C(=O)NCC(O)=O WOUIMBGNEUWXQG-VKHMYHEASA-N 0.000 description 47
- 108010047857 aspartylglycine Proteins 0.000 description 46
- 108010037850 glycylvaline Proteins 0.000 description 46
- 108010057821 leucylproline Proteins 0.000 description 46
- 239000000523 sample Substances 0.000 description 46
- 108010077245 asparaginyl-proline Proteins 0.000 description 45
- BIYXEUAFGLTAEM-WUJLRWPWSA-N Thr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(O)=O BIYXEUAFGLTAEM-WUJLRWPWSA-N 0.000 description 44
- YBAFDPFAUTYYRW-YUMQZZPRSA-N Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O YBAFDPFAUTYYRW-YUMQZZPRSA-N 0.000 description 43
- HFKJBCPRWWGPEY-BQBZGAKWSA-N L-arginyl-L-glutamic acid Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HFKJBCPRWWGPEY-BQBZGAKWSA-N 0.000 description 43
- 108010054155 lysyllysine Proteins 0.000 description 43
- BXNGIHFNNNSEOS-UWVGGRQHSA-N Phe-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 BXNGIHFNNNSEOS-UWVGGRQHSA-N 0.000 description 42
- KLKHFFMNGWULBN-VKHMYHEASA-N Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)NCC(O)=O KLKHFFMNGWULBN-VKHMYHEASA-N 0.000 description 41
- LAFKUZYWNCHOHT-WHFBIAKZSA-N Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O LAFKUZYWNCHOHT-WHFBIAKZSA-N 0.000 description 41
- VBKIFHUVGLOJKT-UHFFFAOYSA-N Asparaginyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC(N)=O VBKIFHUVGLOJKT-UHFFFAOYSA-N 0.000 description 40
- VTJUNIYRYIAIHF-IUCAKERBSA-N Leu-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O VTJUNIYRYIAIHF-IUCAKERBSA-N 0.000 description 40
- BECPPKYKPSRKCP-ZDLURKLDSA-N Thr-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O BECPPKYKPSRKCP-ZDLURKLDSA-N 0.000 description 40
- 108010051242 phenylalanylserine Proteins 0.000 description 40
- GVRKWABULJAONN-UHFFFAOYSA-N Valyl-Threonine Chemical compound CC(C)C(N)C(=O)NC(C(C)O)C(O)=O GVRKWABULJAONN-UHFFFAOYSA-N 0.000 description 39
- 108010038633 aspartylglutamate Proteins 0.000 description 38
- HXWUJJADFMXNKA-UHFFFAOYSA-N Asparaginyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CC(N)=O HXWUJJADFMXNKA-UHFFFAOYSA-N 0.000 description 37
- NYQBYASWHVRESG-MIMYLULJSA-N Phe-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 NYQBYASWHVRESG-MIMYLULJSA-N 0.000 description 37
- YMAWOPBAYDPSLA-UHFFFAOYSA-N glycylglycine zwitterion Chemical compound [NH3+]CC(=O)NCC([O-])=O YMAWOPBAYDPSLA-UHFFFAOYSA-N 0.000 description 37
- 108010081551 glycylphenylalanine Proteins 0.000 description 37
- 238000009396 hybridization Methods 0.000 description 37
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 36
- NFDYGNFETJVMSE-BQBZGAKWSA-N Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CO NFDYGNFETJVMSE-BQBZGAKWSA-N 0.000 description 36
- IPWKGIFRRBGCJO-IMJSIDKUSA-N Ala-Ser Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](CO)C([O-])=O IPWKGIFRRBGCJO-IMJSIDKUSA-N 0.000 description 35
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 35
- IQHUITKNHOKGFC-MIMYLULJSA-N Thr-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IQHUITKNHOKGFC-MIMYLULJSA-N 0.000 description 35
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 35
- 108010026333 seryl-proline Proteins 0.000 description 35
- RNKSNIBMTUYWSH-YFKPBYRVSA-N L-prolylglycine Chemical compound [O-]C(=O)CNC(=O)[C@@H]1CCC[NH2+]1 RNKSNIBMTUYWSH-YFKPBYRVSA-N 0.000 description 33
- 108010008355 arginyl-glutamine Proteins 0.000 description 33
- PMGDADKJMCOXHX-BQBZGAKWSA-N Arg-Gln Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O PMGDADKJMCOXHX-BQBZGAKWSA-N 0.000 description 32
- JHFNSBBHKSZXKB-VKHMYHEASA-N Asp-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(O)=O JHFNSBBHKSZXKB-VKHMYHEASA-N 0.000 description 32
- LTFSLKWFMWZEBD-IMJSIDKUSA-N Ser-Asn Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O LTFSLKWFMWZEBD-IMJSIDKUSA-N 0.000 description 32
- 108010049041 glutamylalanine Proteins 0.000 description 32
- 108010090894 prolylleucine Proteins 0.000 description 32
- RJUHZPRQRQLCFL-IMJSIDKUSA-N Asn-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O RJUHZPRQRQLCFL-IMJSIDKUSA-N 0.000 description 31
- GLUBLISJVJFHQS-VIFPVBQESA-N Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 GLUBLISJVJFHQS-VIFPVBQESA-N 0.000 description 31
- UBAQSAUDKMIEQZ-QWRGUYRKSA-N Tyr-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UBAQSAUDKMIEQZ-QWRGUYRKSA-N 0.000 description 31
- HPYDSVWYXXKHRD-VIFPVBQESA-N Tyr-Gly Chemical compound [O-]C(=O)CNC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 HPYDSVWYXXKHRD-VIFPVBQESA-N 0.000 description 31
- 108010029020 prolylglycine Proteins 0.000 description 31
- PDSLRCZINIDLMU-QWRGUYRKSA-N Tyr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PDSLRCZINIDLMU-QWRGUYRKSA-N 0.000 description 30
- 230000000694 effects Effects 0.000 description 30
- 108010078144 glutaminyl-glycine Proteins 0.000 description 30
- 108010025306 histidylleucine Proteins 0.000 description 30
- DXJZITDUDUPINW-UHFFFAOYSA-N γ-glutamyl-Asparagine Chemical compound NC(=O)CCC(N)C(=O)NC(CC(N)=O)C(O)=O DXJZITDUDUPINW-UHFFFAOYSA-N 0.000 description 30
- CKAJHWFHHFSCDT-WHFBIAKZSA-N Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O CKAJHWFHHFSCDT-WHFBIAKZSA-N 0.000 description 29
- STTYIMSDIYISRG-WDSKDSINSA-N Val-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(O)=O STTYIMSDIYISRG-WDSKDSINSA-N 0.000 description 29
- SITLTJHOQZFJGG-XPUUQOCRSA-N α-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SITLTJHOQZFJGG-XPUUQOCRSA-N 0.000 description 29
- KWBQPGIYEZKDEG-FSPLSTOPSA-N Asn-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O KWBQPGIYEZKDEG-FSPLSTOPSA-N 0.000 description 28
- LDEBVRIURYMKQS-UHFFFAOYSA-N Serinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CO LDEBVRIURYMKQS-UHFFFAOYSA-N 0.000 description 28
- UQTNIFUCMBFWEJ-UHFFFAOYSA-N Threoninyl-Asparagine Chemical compound CC(O)C(N)C(=O)NC(C(O)=O)CC(N)=O UQTNIFUCMBFWEJ-UHFFFAOYSA-N 0.000 description 28
- UPJONISHZRADBH-XPUUQOCRSA-N Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UPJONISHZRADBH-XPUUQOCRSA-N 0.000 description 28
- 108010068265 aspartyltyrosine Proteins 0.000 description 28
- 108010010147 glycylglutamine Proteins 0.000 description 28
- 108010017391 lysylvaline Proteins 0.000 description 28
- 238000003752 polymerase chain reaction Methods 0.000 description 28
- 108010073969 valyllysine Proteins 0.000 description 28
- XMBSYZWANAQXEV-UHFFFAOYSA-N 4-amino-5-[(1-carboxy-2-phenylethyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 27
- 108010009962 valyltyrosine Proteins 0.000 description 27
- OMSMPWHEGLNQOD-UHFFFAOYSA-N Asparaginyl-Phenylalanine Chemical compound NC(=O)CC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 OMSMPWHEGLNQOD-UHFFFAOYSA-N 0.000 description 26
- JSIQVRIXMINMTA-ZDLURKLDSA-N Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O JSIQVRIXMINMTA-ZDLURKLDSA-N 0.000 description 26
- IBIDRSSEHFLGSD-YUMQZZPRSA-N Val-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-YUMQZZPRSA-N 0.000 description 26
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 26
- IOUPEELXVYPCPG-UHFFFAOYSA-N val-gly Chemical compound CC(C)C(N)C(=O)NCC(O)=O IOUPEELXVYPCPG-UHFFFAOYSA-N 0.000 description 26
- LESXFEZIFXFIQR-LURJTMIESA-N Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(O)=O LESXFEZIFXFIQR-LURJTMIESA-N 0.000 description 25
- 108010092854 aspartyllysine Proteins 0.000 description 25
- 108010092114 histidylphenylalanine Proteins 0.000 description 25
- FYYSIASRLDJUNP-WHFBIAKZSA-N Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FYYSIASRLDJUNP-WHFBIAKZSA-N 0.000 description 24
- FUESBOMYALLFNI-VKHMYHEASA-N Gly-Asn Chemical compound NCC(=O)N[C@H](C(O)=O)CC(N)=O FUESBOMYALLFNI-VKHMYHEASA-N 0.000 description 24
- CIOWSLJGLSUOME-BQBZGAKWSA-N Lys-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O CIOWSLJGLSUOME-BQBZGAKWSA-N 0.000 description 24
- WBAXJMCUFIXCNI-WDSKDSINSA-N Ser-Pro Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WBAXJMCUFIXCNI-WDSKDSINSA-N 0.000 description 24
- GXDLGHLJTHMDII-WISUUJSJSA-N Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(O)=O GXDLGHLJTHMDII-WISUUJSJSA-N 0.000 description 24
- 239000000203 mixture Substances 0.000 description 24
- QCWJKJLNCFEVPQ-WHFBIAKZSA-N Asn-Gln Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O QCWJKJLNCFEVPQ-WHFBIAKZSA-N 0.000 description 23
- VGRHZPNRCLAHQA-UHFFFAOYSA-N Aspartyl-Asparagine Chemical compound OC(=O)CC(N)C(=O)NC(CC(N)=O)C(O)=O VGRHZPNRCLAHQA-UHFFFAOYSA-N 0.000 description 23
- SCCPDJAQCXWPTF-VKHMYHEASA-N Gly-Asp Chemical compound NCC(=O)N[C@H](C(O)=O)CC(O)=O SCCPDJAQCXWPTF-VKHMYHEASA-N 0.000 description 23
- NVGBPTNZLWRQSY-UWVGGRQHSA-N Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN NVGBPTNZLWRQSY-UWVGGRQHSA-N 0.000 description 23
- JXWLMUIXUXLIJR-QWRGUYRKSA-N Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JXWLMUIXUXLIJR-QWRGUYRKSA-N 0.000 description 23
- VBKBDLMWICBSCY-IMJSIDKUSA-N Ser-Asp Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O VBKBDLMWICBSCY-IMJSIDKUSA-N 0.000 description 23
- APIDTRXFGYOLLH-VQVTYTSYSA-N Thr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)[C@@H](C)O APIDTRXFGYOLLH-VQVTYTSYSA-N 0.000 description 23
- JKHXYJKMNSSFFL-IUCAKERBSA-N Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN JKHXYJKMNSSFFL-IUCAKERBSA-N 0.000 description 23
- 108010009298 lysylglutamic acid Proteins 0.000 description 23
- RZVAJINKPMORJF-UHFFFAOYSA-N p-acetaminophenol Chemical compound CC(=O)NC1=CC=C(O)C=C1 RZVAJINKPMORJF-UHFFFAOYSA-N 0.000 description 23
- 108010053725 prolylvaline Proteins 0.000 description 23
- SONUFGRSSMFHFN-IMJSIDKUSA-N Asn-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O SONUFGRSSMFHFN-IMJSIDKUSA-N 0.000 description 22
- UJTZHGHXJKIAOS-WHFBIAKZSA-N Ser-Gln Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O UJTZHGHXJKIAOS-WHFBIAKZSA-N 0.000 description 22
- PPQRSMGDOHLTBE-UWVGGRQHSA-N Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PPQRSMGDOHLTBE-UWVGGRQHSA-N 0.000 description 22
- YKRQRPFODDJQTC-UHFFFAOYSA-N Threoninyl-Lysine Chemical compound CC(O)C(N)C(=O)NC(C(O)=O)CCCCN YKRQRPFODDJQTC-UHFFFAOYSA-N 0.000 description 22
- MFEVVAXTBZELLL-UHFFFAOYSA-N Tyrosyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 MFEVVAXTBZELLL-UHFFFAOYSA-N 0.000 description 22
- XXDVDTMEVBYRPK-XPUUQOCRSA-N Val-Gln Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O XXDVDTMEVBYRPK-XPUUQOCRSA-N 0.000 description 22
- STKYPAFSDFAEPH-LURJTMIESA-N gly-val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CN STKYPAFSDFAEPH-LURJTMIESA-N 0.000 description 22
- NALWOULWGHTVDA-UWVGGRQHSA-N Asp-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NALWOULWGHTVDA-UWVGGRQHSA-N 0.000 description 21
- ARPVSMCNIDAQBO-UHFFFAOYSA-N Glutaminyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CCC(N)=O ARPVSMCNIDAQBO-UHFFFAOYSA-N 0.000 description 21
- IIFDPDVJAHQFSR-WHFBIAKZSA-N Asn-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O IIFDPDVJAHQFSR-WHFBIAKZSA-N 0.000 description 20
- PSZNHSNIGMJYOZ-WDSKDSINSA-N Asp-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PSZNHSNIGMJYOZ-WDSKDSINSA-N 0.000 description 20
- PNMUAGGSDZXTHX-BYPYZUCNSA-N Gly-Gln Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(N)=O PNMUAGGSDZXTHX-BYPYZUCNSA-N 0.000 description 20
- OLIFSFOFKGKIRH-WUJLRWPWSA-N Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CN OLIFSFOFKGKIRH-WUJLRWPWSA-N 0.000 description 20
- ATIPDCIQTUXABX-UWVGGRQHSA-N Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ATIPDCIQTUXABX-UWVGGRQHSA-N 0.000 description 20
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 20
- ZKQOUHVVXABNDG-IUCAKERBSA-N Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 ZKQOUHVVXABNDG-IUCAKERBSA-N 0.000 description 20
- GVUVRRPYYDHHGK-UHFFFAOYSA-N Prolyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C1CCCN1 GVUVRRPYYDHHGK-UHFFFAOYSA-N 0.000 description 20
- HYLXOQURIOCKIH-VQVTYTSYSA-N Thr-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N HYLXOQURIOCKIH-VQVTYTSYSA-N 0.000 description 20
- 125000000267 glycino group Chemical group [H]N([*])C([H])([H])C(=O)O[H] 0.000 description 20
- 239000000243 solution Substances 0.000 description 20
- JSLGXODUIAFWCF-UHFFFAOYSA-N Arginyl-Asparagine Chemical compound NC(N)=NCCCC(N)C(=O)NC(CC(N)=O)C(O)=O JSLGXODUIAFWCF-UHFFFAOYSA-N 0.000 description 19
- NPDLYUOYAGBHFB-UHFFFAOYSA-N Asparaginyl-Arginine Chemical compound NC(=O)CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N NPDLYUOYAGBHFB-UHFFFAOYSA-N 0.000 description 19
- JMEWFDUAFKVAAT-UHFFFAOYSA-N Methionyl-Asparagine Chemical compound CSCCC(N)C(=O)NC(C(O)=O)CC(N)=O JMEWFDUAFKVAAT-UHFFFAOYSA-N 0.000 description 19
- JQOHKCDMINQZRV-WDSKDSINSA-N Pro-Asn Chemical compound NC(=O)C[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 JQOHKCDMINQZRV-WDSKDSINSA-N 0.000 description 19
- ILVGMCVCQBJPSH-WDSKDSINSA-N Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO ILVGMCVCQBJPSH-WDSKDSINSA-N 0.000 description 19
- 108010018625 phenylalanylarginine Proteins 0.000 description 19
- HZYFHQOWCFUSOV-IMJSIDKUSA-N Asn-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O HZYFHQOWCFUSOV-IMJSIDKUSA-N 0.000 description 18
- YQAIUOWPSUOINN-IUCAKERBSA-N Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN YQAIUOWPSUOINN-IUCAKERBSA-N 0.000 description 18
- FSXRLASFHBWESK-HOTGVXAUSA-N Phe-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 FSXRLASFHBWESK-HOTGVXAUSA-N 0.000 description 18
- VEYJKJORLPYVLO-RYUDHWBXSA-N Val-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 VEYJKJORLPYVLO-RYUDHWBXSA-N 0.000 description 18
- 108010012581 phenylalanylglutamate Proteins 0.000 description 18
- MGHKSHCBDXNTHX-UHFFFAOYSA-N 4-amino-5-[(4-amino-1-carboxy-4-oxobutyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(CCC(N)=O)C(O)=O MGHKSHCBDXNTHX-UHFFFAOYSA-N 0.000 description 17
- GADKFYNESXNRLC-WDSKDSINSA-N Asn-Pro Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GADKFYNESXNRLC-WDSKDSINSA-N 0.000 description 17
- OAMLVOVXNKILLQ-BQBZGAKWSA-N Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O OAMLVOVXNKILLQ-BQBZGAKWSA-N 0.000 description 17
- DWBZEJHQQIURML-IMJSIDKUSA-N Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O DWBZEJHQQIURML-IMJSIDKUSA-N 0.000 description 17
- JEFZIKRIDLHOIF-BYPYZUCNSA-N Gln-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(O)=O JEFZIKRIDLHOIF-BYPYZUCNSA-N 0.000 description 17
- KOSRFJWDECSPRO-WDSKDSINSA-N Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O KOSRFJWDECSPRO-WDSKDSINSA-N 0.000 description 17
- UQHGAYSULGRWRG-WHFBIAKZSA-N Glu-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(O)=O UQHGAYSULGRWRG-WHFBIAKZSA-N 0.000 description 17
- SENJXOPIZNYLHU-IUCAKERBSA-N Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-IUCAKERBSA-N 0.000 description 17
- WEQJQNWXCSUVMA-RYUDHWBXSA-N Phe-Pro Chemical compound C([C@H]([NH3+])C(=O)N1[C@@H](CCC1)C([O-])=O)C1=CC=CC=C1 WEQJQNWXCSUVMA-RYUDHWBXSA-N 0.000 description 17
- ROHDXJUFQVRDAV-UWVGGRQHSA-N Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ROHDXJUFQVRDAV-UWVGGRQHSA-N 0.000 description 17
- WCRFXRIWBFRZBR-GGVZMXCHSA-N Thr-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WCRFXRIWBFRZBR-GGVZMXCHSA-N 0.000 description 17
- 108010000761 leucylarginine Proteins 0.000 description 17
- XUUXCWCKKCZEAW-YFKPBYRVSA-N 2-[[(2S)-2-amino-5-(diaminomethylideneamino)pentanoyl]amino]acetic acid Chemical compound OC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N XUUXCWCKKCZEAW-YFKPBYRVSA-N 0.000 description 16
- GSMPSRPMQQDRIB-WHFBIAKZSA-N Asp-Gln Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O GSMPSRPMQQDRIB-WHFBIAKZSA-N 0.000 description 16
- NTQDELBZOMWXRS-UHFFFAOYSA-N Aspartyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC(O)=O NTQDELBZOMWXRS-UHFFFAOYSA-N 0.000 description 16
- JLXVRFDTDUGQEE-YFKPBYRVSA-N Gly-Arg Chemical compound NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N JLXVRFDTDUGQEE-YFKPBYRVSA-N 0.000 description 16
- IEFJWDNGDZAYNZ-BYPYZUCNSA-N Gly-Glu Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(O)=O IEFJWDNGDZAYNZ-BYPYZUCNSA-N 0.000 description 16
- 240000001307 Myosotis scorpioides Species 0.000 description 16
- XZKQVQKUZMAADP-IMJSIDKUSA-N Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(O)=O XZKQVQKUZMAADP-IMJSIDKUSA-N 0.000 description 16
- KGNSGRRALVIRGR-UHFFFAOYSA-N gln-tyr Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 KGNSGRRALVIRGR-UHFFFAOYSA-N 0.000 description 16
- 238000003780 insertion Methods 0.000 description 16
- 108010031719 prolyl-serine Proteins 0.000 description 16
- OWOFCNWTMWOOJJ-WDSKDSINSA-N Gln-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O OWOFCNWTMWOOJJ-WDSKDSINSA-N 0.000 description 15
- YSWHPLCDIMUKFE-QWRGUYRKSA-N Glu-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YSWHPLCDIMUKFE-QWRGUYRKSA-N 0.000 description 15
- JBCLFWXMTIKCCB-VIFPVBQESA-N Gly-Phe Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-VIFPVBQESA-N 0.000 description 15
- VYZAGTDAHUIRQA-WHFBIAKZSA-N L-alanyl-L-glutamic acid Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O VYZAGTDAHUIRQA-WHFBIAKZSA-N 0.000 description 15
- QOOWRKBDDXQRHC-BQBZGAKWSA-N L-lysyl-L-alanine Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN QOOWRKBDDXQRHC-BQBZGAKWSA-N 0.000 description 15
- CGWAPUBOXJWXMS-HOTGVXAUSA-N Tyr-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 CGWAPUBOXJWXMS-HOTGVXAUSA-N 0.000 description 15
- 108010050848 glycylleucine Proteins 0.000 description 15
- 230000001131 transforming Effects 0.000 description 15
- HSPSXROIMXIJQW-BQBZGAKWSA-N Asp-His Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 HSPSXROIMXIJQW-BQBZGAKWSA-N 0.000 description 14
- 241000588724 Escherichia coli Species 0.000 description 14
- BBBXWRGITSUJPB-YUMQZZPRSA-N Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O BBBXWRGITSUJPB-YUMQZZPRSA-N 0.000 description 14
- 241000255777 Lepidoptera Species 0.000 description 14
- XWOBNBRUDDUEEY-UWVGGRQHSA-N Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 XWOBNBRUDDUEEY-UWVGGRQHSA-N 0.000 description 14
- BWUHENPAEMNGQJ-ZDLURKLDSA-N Thr-Gln Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O BWUHENPAEMNGQJ-ZDLURKLDSA-N 0.000 description 14
- BNQVUHQWZGTIBX-IUCAKERBSA-N Val-His Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CN=CN1 BNQVUHQWZGTIBX-IUCAKERBSA-N 0.000 description 14
- 108010060035 arginylproline Proteins 0.000 description 14
- 244000052616 bacterial pathogens Species 0.000 description 14
- 230000035772 mutation Effects 0.000 description 14
- 108010020532 tyrosyl-proline Proteins 0.000 description 14
- YZQCXOFQZKCETR-UWVGGRQHSA-N Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YZQCXOFQZKCETR-UWVGGRQHSA-N 0.000 description 13
- YSZNURNVYFUEHC-BQBZGAKWSA-N Lys-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O YSZNURNVYFUEHC-BQBZGAKWSA-N 0.000 description 13
- FADYJNXDPBKVCA-UHFFFAOYSA-N Phenylalanyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 13
- CKHWEVXPLJBEOZ-UHFFFAOYSA-N Threoninyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)C(C)O CKHWEVXPLJBEOZ-UHFFFAOYSA-N 0.000 description 13
- GIAZPLMMQOERPN-YUMQZZPRSA-N Val-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GIAZPLMMQOERPN-YUMQZZPRSA-N 0.000 description 13
- 108010044940 alanylglutamine Proteins 0.000 description 13
- 108010068380 arginylarginine Proteins 0.000 description 13
- 108010062796 arginyllysine Proteins 0.000 description 13
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 13
- 230000009089 cytolysis Effects 0.000 description 13
- 235000005911 diet Nutrition 0.000 description 13
- 230000002934 lysing Effects 0.000 description 13
- 239000006228 supernatant Substances 0.000 description 13
- 239000000725 suspension Substances 0.000 description 13
- LQJAALCCPOTJGB-YUMQZZPRSA-N (2S)-1-[(2S)-2-amino-5-(diaminomethylideneamino)pentanoyl]pyrrolidine-2-carboxylic acid Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O LQJAALCCPOTJGB-YUMQZZPRSA-N 0.000 description 12
- HJCMDXDYPOUFDY-WHFBIAKZSA-N Ala-Gln Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O HJCMDXDYPOUFDY-WHFBIAKZSA-N 0.000 description 12
- QJMCHPGWFZZRID-UHFFFAOYSA-N Asparaginyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CC(N)=O QJMCHPGWFZZRID-UHFFFAOYSA-N 0.000 description 12
- SSHIXEILTLPAQT-UHFFFAOYSA-N Glutaminyl-Aspartate Chemical class NC(=O)CCC(N)C(=O)NC(CC(O)=O)C(O)=O SSHIXEILTLPAQT-UHFFFAOYSA-N 0.000 description 12
- WRPDZHJNLYNFFT-UHFFFAOYSA-N Histidinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC1=CN=CN1 WRPDZHJNLYNFFT-UHFFFAOYSA-N 0.000 description 12
- HGNRJCINZYHNOU-LURJTMIESA-N Lys-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(O)=O HGNRJCINZYHNOU-LURJTMIESA-N 0.000 description 12
- XBZOQGHZGQLEQO-IUCAKERBSA-N Lys-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN XBZOQGHZGQLEQO-IUCAKERBSA-N 0.000 description 12
- MYTOTTSMVMWVJN-STQMWFEESA-N Lys-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 MYTOTTSMVMWVJN-STQMWFEESA-N 0.000 description 12
- JPNRPAJITHRXRH-UHFFFAOYSA-N Lysyl-Asparagine Chemical compound NCCCCC(N)C(=O)NC(C(O)=O)CC(N)=O JPNRPAJITHRXRH-UHFFFAOYSA-N 0.000 description 12
- IWIANZLCJVYEFX-RYUDHWBXSA-N Pro-Phe Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 IWIANZLCJVYEFX-RYUDHWBXSA-N 0.000 description 12
- NHUHCSRWZMLRLA-UHFFFAOYSA-N Sulfizole Chemical compound CC1=NOC(NS(=O)(=O)C=2C=CC(N)=CC=2)=C1C NHUHCSRWZMLRLA-UHFFFAOYSA-N 0.000 description 12
- CUTPSEKWUPZFLV-UHFFFAOYSA-N Threoninyl-Cysteine Chemical compound CC(O)C(N)C(=O)NC(CS)C(O)=O CUTPSEKWUPZFLV-UHFFFAOYSA-N 0.000 description 12
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 12
- 108010015792 glycyllysine Proteins 0.000 description 12
- 239000002609 media Substances 0.000 description 12
- TUTIHHSZKFBMHM-UHFFFAOYSA-N 4-amino-5-[(3-amino-1-carboxy-3-oxopropyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(CC(N)=O)C(O)=O TUTIHHSZKFBMHM-UHFFFAOYSA-N 0.000 description 11
- CCUAQNUWXLYFRA-IMJSIDKUSA-N Ala-Asn Chemical compound C[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC(N)=O CCUAQNUWXLYFRA-IMJSIDKUSA-N 0.000 description 11
- FYRVDDJMNISIKJ-UWVGGRQHSA-N Asn-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FYRVDDJMNISIKJ-UWVGGRQHSA-N 0.000 description 11
- VHLZDSUANXBJHW-UHFFFAOYSA-N Glutaminyl-Phenylalanine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 VHLZDSUANXBJHW-UHFFFAOYSA-N 0.000 description 11
- OZILORBBPKKGRI-RYUDHWBXSA-N Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 OZILORBBPKKGRI-RYUDHWBXSA-N 0.000 description 11
- IOWJRKAVLALBQB-IWGUZYHVSA-N Thr-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O IOWJRKAVLALBQB-IWGUZYHVSA-N 0.000 description 11
- 108090000631 Trypsin Proteins 0.000 description 11
- 102000004142 Trypsin Human genes 0.000 description 11
- 230000003321 amplification Effects 0.000 description 11
- 108010093581 aspartyl-proline Proteins 0.000 description 11
- 230000037213 diet Effects 0.000 description 11
- 108010028295 histidylhistidine Proteins 0.000 description 11
- 108010012058 leucyltyrosine Proteins 0.000 description 11
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 11
- 244000005700 microbiome Species 0.000 description 11
- 238000003199 nucleic acid amplification method Methods 0.000 description 11
- QLROSWPKSBORFJ-BQBZGAKWSA-N pro glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 QLROSWPKSBORFJ-BQBZGAKWSA-N 0.000 description 11
- 108010048818 seryl-histidine Proteins 0.000 description 11
- 108010003137 tyrosyltyrosine Proteins 0.000 description 11
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 11
- UKGGPJNBONZZCM-WDSKDSINSA-N Aspartyl-L-proline Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O UKGGPJNBONZZCM-WDSKDSINSA-N 0.000 description 10
- 241000894006 Bacteria Species 0.000 description 10
- LOJYQMFIIJVETK-WDSKDSINSA-N Gln-Gln Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LOJYQMFIIJVETK-WDSKDSINSA-N 0.000 description 10
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 10
- JXNRXNCCROJZFB-RYUDHWBXSA-N L-tyrosyl-L-arginine Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JXNRXNCCROJZFB-RYUDHWBXSA-N 0.000 description 10
- JAQGKXUEKGKTKX-HOTGVXAUSA-N L-tyrosyl-L-tyrosine Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 JAQGKXUEKGKTKX-HOTGVXAUSA-N 0.000 description 10
- NPBGTPKLVJEOBE-IUCAKERBSA-N Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N NPBGTPKLVJEOBE-IUCAKERBSA-N 0.000 description 10
- 229920000272 Oligonucleotide Polymers 0.000 description 10
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 10
- 229940014598 TAC Drugs 0.000 description 10
- 238000000855 fermentation Methods 0.000 description 10
- 230000004151 fermentation Effects 0.000 description 10
- 108010079547 glutamylmethionine Proteins 0.000 description 10
- KZNQNBZMBZJQJO-YFKPBYRVSA-N gly pro Chemical compound NCC(=O)N1CCC[C@H]1C(O)=O KZNQNBZMBZJQJO-YFKPBYRVSA-N 0.000 description 10
- 108010053037 kyotorphin Proteins 0.000 description 10
- 108010064235 lysylglycine Proteins 0.000 description 10
- 108010038320 lysylphenylalanine Proteins 0.000 description 10
- 229960001322 trypsin Drugs 0.000 description 10
- 239000012588 trypsin Substances 0.000 description 10
- UKKNTTCNGZLJEX-UHFFFAOYSA-N γ-glutamyl-Serine Chemical compound NC(=O)CCC(N)C(=O)NC(CO)C(O)=O UKKNTTCNGZLJEX-UHFFFAOYSA-N 0.000 description 10
- VNYDHJARLHNEGA-RYUDHWBXSA-N (2S)-1-[(2S)-2-azaniumyl-3-(4-hydroxyphenyl)propanoyl]pyrrolidine-2-carboxylate Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=C(O)C=C1 VNYDHJARLHNEGA-RYUDHWBXSA-N 0.000 description 9
- LCNASHSOFMRYFO-WDCWCFNPSA-N (2S)-5-amino-2-[[(2S,3R)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-hydroxybutanoyl]amino]-5-oxopentanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(N)=O LCNASHSOFMRYFO-WDCWCFNPSA-N 0.000 description 9
- ULXYQAJWJGLCNR-YUMQZZPRSA-N (3S)-3-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-(carboxymethylamino)-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 9
- HKTRDWYCAUTRRL-UHFFFAOYSA-N 4-amino-5-[[1-carboxy-2-(1H-imidazol-5-yl)ethyl]amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 HKTRDWYCAUTRRL-UHFFFAOYSA-N 0.000 description 9
- OMLWNBVRVJYMBQ-YUMQZZPRSA-N Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OMLWNBVRVJYMBQ-YUMQZZPRSA-N 0.000 description 9
- LSPKYLAFTPBWIL-BYPYZUCNSA-N Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(O)=O LSPKYLAFTPBWIL-BYPYZUCNSA-N 0.000 description 9
- DKEXFJVMVGETOO-LURJTMIESA-N Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CN DKEXFJVMVGETOO-LURJTMIESA-N 0.000 description 9
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 9
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 9
- UGTZHPSKYRIGRJ-YUMQZZPRSA-N Lys-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UGTZHPSKYRIGRJ-YUMQZZPRSA-N 0.000 description 9
- 206010042297 Subacute sclerosing panencephalitis Diseases 0.000 description 9
- QOLYAJSZHIJCTO-VQVTYTSYSA-N Thr-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O QOLYAJSZHIJCTO-VQVTYTSYSA-N 0.000 description 9
- ONWMQORSVZYVNH-UHFFFAOYSA-N Tyrosyl-Asparagine Chemical compound NC(=O)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 ONWMQORSVZYVNH-UHFFFAOYSA-N 0.000 description 9
- 238000005119 centrifugation Methods 0.000 description 9
- 239000002158 endotoxin Substances 0.000 description 9
- 108010077515 glycylproline Proteins 0.000 description 9
- 108010036413 histidylglycine Proteins 0.000 description 9
- 239000008188 pellet Substances 0.000 description 9
- 108010072637 phenylalanyl-arginyl-phenylalanine Proteins 0.000 description 9
- 230000002588 toxic Effects 0.000 description 9
- 108010051110 tyrosyl-lysine Proteins 0.000 description 9
- YRJOLUDFVAUXLI-GSSVUCPTSA-N (2S)-2-[[(2S,3R)-2-[[(2S,3R)-2-amino-3-hydroxybutanoyl]amino]-3-hydroxybutanoyl]amino]butanedioic acid Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O YRJOLUDFVAUXLI-GSSVUCPTSA-N 0.000 description 8
- RDIKFPRVLJLMER-BQBZGAKWSA-N Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)N RDIKFPRVLJLMER-BQBZGAKWSA-N 0.000 description 8
- XBGGUPMXALFZOT-VIFPVBQESA-N Gly-Tyr Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-VIFPVBQESA-N 0.000 description 8
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 8
- ZOKVLMBYDSIDKG-CSMHCCOUSA-N Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ZOKVLMBYDSIDKG-CSMHCCOUSA-N 0.000 description 8
- QTZXSYBVOSXBEJ-WDSKDSINSA-N Met-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O QTZXSYBVOSXBEJ-WDSKDSINSA-N 0.000 description 8
- 108010079364 N-glycylalanine Proteins 0.000 description 8
- AFWBWPCXSWUCLB-WDSKDSINSA-N Pro-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 AFWBWPCXSWUCLB-WDSKDSINSA-N 0.000 description 8
- RZEQTVHJZCIUBT-UHFFFAOYSA-N Serinyl-Arginine Chemical compound OCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N RZEQTVHJZCIUBT-UHFFFAOYSA-N 0.000 description 8
- VPZKQTYZIVOJDV-LMVFSUKVSA-N Thr-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(O)=O VPZKQTYZIVOJDV-LMVFSUKVSA-N 0.000 description 8
- NLKUJNGEGZDXGO-XVKPBYJWSA-N Tyr-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NLKUJNGEGZDXGO-XVKPBYJWSA-N 0.000 description 8
- ZSXJENBJGRHKIG-UHFFFAOYSA-N Tyrosyl-Serine Chemical compound OCC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 ZSXJENBJGRHKIG-UHFFFAOYSA-N 0.000 description 8
- YSGSDAIMSCVPHG-YUMQZZPRSA-N Val-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)C(C)C YSGSDAIMSCVPHG-YUMQZZPRSA-N 0.000 description 8
- 238000004458 analytical method Methods 0.000 description 8
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 8
- 108091006028 chimera Proteins 0.000 description 8
- 108010060199 cysteinylproline Proteins 0.000 description 8
- 108010008237 glutamyl-valyl-glycine Proteins 0.000 description 8
- 230000000749 insecticidal Effects 0.000 description 8
- 239000006166 lysate Substances 0.000 description 8
- 210000004215 spores Anatomy 0.000 description 8
- 238000006467 substitution reaction Methods 0.000 description 8
- 108010038745 tryptophylglycine Proteins 0.000 description 8
- ZJZNLRVCZWUONM-JXUBOQSCSA-N (2S)-2-[[(2S,3R)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-hydroxybutanoyl]amino]propanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 7
- PVMPDMIKUVNOBD-CIUDSAMLSA-N (3S)-3-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-[[(1S)-1-carboxy-2-hydroxyethyl]amino]-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 7
- MPZWMIIOPAPAKE-UHFFFAOYSA-N 4-amino-5-[[1-carboxy-4-(diaminomethylideneamino)butyl]amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CCCN=C(N)N MPZWMIIOPAPAKE-UHFFFAOYSA-N 0.000 description 7
- JQFZHHSQMKZLRU-IUCAKERBSA-N Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N JQFZHHSQMKZLRU-IUCAKERBSA-N 0.000 description 7
- XTWSWDJMIKUJDQ-RYUDHWBXSA-N Arg-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XTWSWDJMIKUJDQ-RYUDHWBXSA-N 0.000 description 7
- ZSRSLWKGWFFVCM-WDSKDSINSA-N Cys-Pro Chemical compound SC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O ZSRSLWKGWFFVCM-WDSKDSINSA-N 0.000 description 7
- YBTCBQBIJKGSJP-BQBZGAKWSA-N Glu-Pro Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O YBTCBQBIJKGSJP-BQBZGAKWSA-N 0.000 description 7
- MRVYVEQPNDSWLH-UHFFFAOYSA-N Glutaminyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)CCC(N)=O MRVYVEQPNDSWLH-UHFFFAOYSA-N 0.000 description 7
- 241000238631 Hexapoda Species 0.000 description 7
- MDCTVRUPVLZSPG-BQBZGAKWSA-N His-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 MDCTVRUPVLZSPG-BQBZGAKWSA-N 0.000 description 7
- 125000000393 L-methionino group Chemical group [H]OC(=O)[C@@]([H])(N([H])[*])C([H])([H])C(SC([H])([H])[H])([H])[H] 0.000 description 7
- LHSGPCFBGJHPCY-STQMWFEESA-N Leu-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-STQMWFEESA-N 0.000 description 7
- AIXUQKMMBQJZCU-IUCAKERBSA-N Lys-Pro Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O AIXUQKMMBQJZCU-IUCAKERBSA-N 0.000 description 7
- 241000589540 Pseudomonas fluorescens Species 0.000 description 7
- WXVIGTAUZBUDPZ-DTLFHODZSA-N Thr-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 WXVIGTAUZBUDPZ-DTLFHODZSA-N 0.000 description 7
- 108010011559 alanylphenylalanine Proteins 0.000 description 7
- 239000003795 chemical substances by application Substances 0.000 description 7
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 7
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 7
- 230000000422 nocturnal Effects 0.000 description 7
- 108010070643 prolylglutamic acid Proteins 0.000 description 7
- 239000000126 substance Substances 0.000 description 7
- 231100000331 toxic Toxicity 0.000 description 7
- OPINTGHFESTVAX-UHFFFAOYSA-N γ-glutamyl-Arginine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N OPINTGHFESTVAX-UHFFFAOYSA-N 0.000 description 7
- CCQOOWAONKGYKQ-BYPYZUCNSA-N (2S)-2-[[2-[(2-azaniumylacetyl)amino]acetyl]amino]propanoate Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 6
- YOKVEHGYYQEQOP-QWRGUYRKSA-N 2-[[(2S)-2-[[(2S)-2-azaniumyl-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]acetate Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 6
- OMNVYXHOSHNURL-WPRPVWTQSA-N Ala-Phe Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OMNVYXHOSHNURL-WPRPVWTQSA-N 0.000 description 6
- FFMIYIMKQIMDPK-BQBZGAKWSA-N Asn-His Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 FFMIYIMKQIMDPK-BQBZGAKWSA-N 0.000 description 6
- JZDHUJAFXGNDSB-WHFBIAKZSA-N Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O JZDHUJAFXGNDSB-WHFBIAKZSA-N 0.000 description 6
- HHSJMSCOLJVTCX-UHFFFAOYSA-N Glutaminyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CCC(N)=O HHSJMSCOLJVTCX-UHFFFAOYSA-N 0.000 description 6
- XUJNEKJLAYXESH-REOHCLBHSA-N L-cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 6
- ZYTPOUNUXRBYGW-YUMQZZPRSA-N Met-Met Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CCSC ZYTPOUNUXRBYGW-YUMQZZPRSA-N 0.000 description 6
- DZMGFGQBRYWJOR-YUMQZZPRSA-N Met-Pro Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O DZMGFGQBRYWJOR-YUMQZZPRSA-N 0.000 description 6
- OHUXOEXBXPZKPT-STQMWFEESA-N Phe-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=CC=C1 OHUXOEXBXPZKPT-STQMWFEESA-N 0.000 description 6
- SHAQGFGGJSLLHE-BQBZGAKWSA-N Pro-Gln Chemical compound NC(=O)CC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 SHAQGFGGJSLLHE-BQBZGAKWSA-N 0.000 description 6
- OIDKVWTWGDWMHY-RYUDHWBXSA-N Pro-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 OIDKVWTWGDWMHY-RYUDHWBXSA-N 0.000 description 6
- 241000589516 Pseudomonas Species 0.000 description 6
- SBMNPABNWKXNBJ-UHFFFAOYSA-N Serinyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CO SBMNPABNWKXNBJ-UHFFFAOYSA-N 0.000 description 6
- UYKREHOKELZSPB-JTQLQIEISA-N Trp-Gly Chemical compound C1=CC=C2C(C[C@H](N)C(=O)NCC(O)=O)=CNC2=C1 UYKREHOKELZSPB-JTQLQIEISA-N 0.000 description 6
- YBRHKUNWEYBZGT-UHFFFAOYSA-N Tryptophyl-Threonine Chemical compound C1=CC=C2C(CC(N)C(=O)NC(C(O)C)C(O)=O)=CNC2=C1 YBRHKUNWEYBZGT-UHFFFAOYSA-N 0.000 description 6
- 108010087924 alanylproline Proteins 0.000 description 6
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 6
- HEDRZPFGACZZDS-UHFFFAOYSA-N chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 6
- PEDCQBHIVMGVHV-UHFFFAOYSA-N glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 6
- 238000002955 isolation Methods 0.000 description 6
- 239000003550 marker Substances 0.000 description 6
- 108010085203 methionylmethionine Proteins 0.000 description 6
- 230000000813 microbial Effects 0.000 description 6
- 230000036961 partial Effects 0.000 description 6
- 238000002360 preparation method Methods 0.000 description 6
- 239000000047 product Substances 0.000 description 6
- DOFAQXCYFQKSHT-SRVKXCTJSA-N (2S)-1-[(2S)-1-[(2S)-2-amino-3-methylbutanoyl]pyrrolidine-2-carbonyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DOFAQXCYFQKSHT-SRVKXCTJSA-N 0.000 description 5
- XMAUFHMAAVTODF-STQMWFEESA-N (2S)-2-[[(2S)-2-amino-3-(1H-imidazol-5-yl)propanoyl]amino]-3-phenylpropanoic acid Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 XMAUFHMAAVTODF-STQMWFEESA-N 0.000 description 5
- KYPMKDGKAYQCHO-RYUDHWBXSA-N (2S)-2-[[(2S)-2-amino-3-(4-hydroxyphenyl)propanoyl]amino]-4-methylsulfanylbutanoic acid Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 KYPMKDGKAYQCHO-RYUDHWBXSA-N 0.000 description 5
- LZDNBBYBDGBADK-KBPBESRZSA-N (2S)-2-[[(2S)-2-amino-3-methylbutanoyl]amino]-3-(1H-indol-3-yl)propanoic acid Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-KBPBESRZSA-N 0.000 description 5
- DXQOQMCLWWADMU-ACZMJKKPSA-N (3S)-3-amino-4-[[(2S)-5-amino-1-[[(1S)-1-carboxy-2-hydroxyethyl]amino]-1,5-dioxopentan-2-yl]amino]-4-oxobutanoic acid Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DXQOQMCLWWADMU-ACZMJKKPSA-N 0.000 description 5
- GJSURZIOUXUGAL-UHFFFAOYSA-N 2-((2,6-Dichlorophenyl)imino)imidazolidine Chemical compound ClC1=CC=CC(Cl)=C1NC1=NCCN1 GJSURZIOUXUGAL-UHFFFAOYSA-N 0.000 description 5
- 241000218473 Agrotis Species 0.000 description 5
- SIFXMYAHXJGAFC-WDSKDSINSA-N Arg-Asp Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O SIFXMYAHXJGAFC-WDSKDSINSA-N 0.000 description 5
- IJYZHIOOBGIINM-WDSKDSINSA-N Arg-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N IJYZHIOOBGIINM-WDSKDSINSA-N 0.000 description 5
- OSASDIVHOSJVII-UHFFFAOYSA-N Arginyl-Cysteine Chemical compound SCC(C(O)=O)NC(=O)C(N)CCCNC(N)=N OSASDIVHOSJVII-UHFFFAOYSA-N 0.000 description 5
- FRYULLIZUDQONW-IMJSIDKUSA-N Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FRYULLIZUDQONW-IMJSIDKUSA-N 0.000 description 5
- IQTUDDBANZYMAR-UHFFFAOYSA-N Asparaginyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CC(N)=O IQTUDDBANZYMAR-UHFFFAOYSA-N 0.000 description 5
- DYDKXJWQCIVTMR-UHFFFAOYSA-N Aspartyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CC(O)=O DYDKXJWQCIVTMR-UHFFFAOYSA-N 0.000 description 5
- 241000255925 Diptera Species 0.000 description 5
- FBTYOQIYBULKEH-ZFWWWQNUSA-N His-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CNC=N1 FBTYOQIYBULKEH-ZFWWWQNUSA-N 0.000 description 5
- HTOOKGDPMXSJSY-STQMWFEESA-N His-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 HTOOKGDPMXSJSY-STQMWFEESA-N 0.000 description 5
- 206010061217 Infestation Diseases 0.000 description 5
- DVCSNHXRZUVYAM-BQBZGAKWSA-N Leu-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O DVCSNHXRZUVYAM-BQBZGAKWSA-N 0.000 description 5
- KFKWRHQBZQICHA-STQMWFEESA-N Leu-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 5
- 108010062166 Lys-Asn-Asp Proteins 0.000 description 5
- 241000219823 Medicago Species 0.000 description 5
- IMTUWVJPCQPJEE-IUCAKERBSA-N Met-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN IMTUWVJPCQPJEE-IUCAKERBSA-N 0.000 description 5
- 108010066427 N-valyltryptophan Proteins 0.000 description 5
- UEKYKRQIAQHOOZ-KBPBESRZSA-N Pro-Trp Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)[O-])C(=O)[C@@H]1CCC[NH2+]1 UEKYKRQIAQHOOZ-KBPBESRZSA-N 0.000 description 5
- GRQCSEWEPIHLBI-UHFFFAOYSA-N Tryptophyl-Asparagine Chemical compound C1=CC=C2C(CC(N)C(=O)NC(CC(N)=O)C(O)=O)=CNC2=C1 GRQCSEWEPIHLBI-UHFFFAOYSA-N 0.000 description 5
- AOLHUMAVONBBEZ-STQMWFEESA-N Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AOLHUMAVONBBEZ-STQMWFEESA-N 0.000 description 5
- OBTCMSPFOITUIJ-FSPLSTOPSA-N Val-Asp Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O OBTCMSPFOITUIJ-FSPLSTOPSA-N 0.000 description 5
- 235000017585 alfalfa Nutrition 0.000 description 5
- 235000017587 alfalfa Nutrition 0.000 description 5
- 108010072041 arginyl-glycyl-aspartic acid Proteins 0.000 description 5
- 235000021405 artificial diet Nutrition 0.000 description 5
- 238000004166 bioassay Methods 0.000 description 5
- 230000015572 biosynthetic process Effects 0.000 description 5
- 238000010367 cloning Methods 0.000 description 5
- 108010004073 cysteinylcysteine Proteins 0.000 description 5
- 230000029087 digestion Effects 0.000 description 5
- 230000002068 genetic Effects 0.000 description 5
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 5
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 5
- 108010077435 glycyl-phenylalanyl-glycine Proteins 0.000 description 5
- 108010074027 glycyl-seryl-phenylalanine Proteins 0.000 description 5
- 108010084389 glycyltryptophan Proteins 0.000 description 5
- 238000002844 melting Methods 0.000 description 5
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 5
- 229920001184 polypeptide Polymers 0.000 description 5
- 239000000843 powder Substances 0.000 description 5
- 238000007894 restriction fragment length polymorphism technique Methods 0.000 description 5
- 108010071207 serylmethionine Proteins 0.000 description 5
- 239000011780 sodium chloride Substances 0.000 description 5
- 239000002689 soil Substances 0.000 description 5
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 5
- 108010015385 valyl-prolyl-proline Proteins 0.000 description 5
- FAQVCWVVIYYWRR-WHFBIAKZSA-N (2S)-2-[[(2S)-2,5-diamino-5-oxopentanoyl]amino]propanoic acid Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O FAQVCWVVIYYWRR-WHFBIAKZSA-N 0.000 description 4
- RXGLHDWAZQECBI-SRVKXCTJSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 4
- BUXAPSQPMALTOY-UHFFFAOYSA-N 2-[(2-amino-3-sulfanylpropanoyl)amino]pentanedioic acid Chemical compound SCC(N)C(=O)NC(C(O)=O)CCC(O)=O BUXAPSQPMALTOY-UHFFFAOYSA-N 0.000 description 4
- AAKRWBIIGKPOKQ-ONGXEEELSA-N 2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-methylbutanoyl]amino]acetic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 4
- HIINQLBHPIQYHN-JTQLQIEISA-N 2-[[2-[[(2S)-2-amino-3-(4-hydroxyphenyl)propanoyl]amino]acetyl]amino]acetic acid Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HIINQLBHPIQYHN-JTQLQIEISA-N 0.000 description 4
- 241000589158 Agrobacterium Species 0.000 description 4
- 241000566547 Agrotis ipsilon Species 0.000 description 4
- BUQICHWNXBIBOG-LMVFSUKVSA-N Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)N BUQICHWNXBIBOG-LMVFSUKVSA-N 0.000 description 4
- PQBHGSGQZSOLIR-RYUDHWBXSA-N Arg-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PQBHGSGQZSOLIR-RYUDHWBXSA-N 0.000 description 4
- CPMKYMGGYUFOHS-FSPLSTOPSA-N Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O CPMKYMGGYUFOHS-FSPLSTOPSA-N 0.000 description 4
- 241000254173 Coleoptera Species 0.000 description 4
- RGTVXXNMOGHRAY-UHFFFAOYSA-N Cysteinyl-Arginine Chemical compound SCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N RGTVXXNMOGHRAY-UHFFFAOYSA-N 0.000 description 4
- WXOFKRKAHJQKLT-UHFFFAOYSA-N Cysteinyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CS WXOFKRKAHJQKLT-UHFFFAOYSA-N 0.000 description 4
- 108010090461 DFG peptide Proteins 0.000 description 4
- CLSDNFWKGFJIBZ-UHFFFAOYSA-N Glutaminyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CCC(N)=O CLSDNFWKGFJIBZ-UHFFFAOYSA-N 0.000 description 4
- 241000256244 Heliothis virescens Species 0.000 description 4
- CZVQSYNVUHAILZ-UWVGGRQHSA-N His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 CZVQSYNVUHAILZ-UWVGGRQHSA-N 0.000 description 4
- CTCFZNBRZBNKAX-UHFFFAOYSA-N Histidinyl-Glutamine Chemical compound NC(=O)CCC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 CTCFZNBRZBNKAX-UHFFFAOYSA-N 0.000 description 4
- 125000000510 L-tryptophano group Chemical group [H]C1=C([H])C([H])=C2N([H])C([H])=C(C([H])([H])[C@@]([H])(C(O[H])=O)N([H])[*])C2=C1[H] 0.000 description 4
- OAPNERBWQWUPTI-YUMQZZPRSA-N Lys-Gln Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O OAPNERBWQWUPTI-YUMQZZPRSA-N 0.000 description 4
- QCZYYEFXOBKCNQ-STQMWFEESA-N Lys-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QCZYYEFXOBKCNQ-STQMWFEESA-N 0.000 description 4
- RVKIPWVMZANZLI-ZFWWWQNUSA-N Lys-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-ZFWWWQNUSA-N 0.000 description 4
- HGCNKOLVKRAVHD-RYUDHWBXSA-N Met-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-RYUDHWBXSA-N 0.000 description 4
- XYVRXLDSCKEYES-JSGCOSHPSA-N Met-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCSC)C(O)=O)=CNC2=C1 XYVRXLDSCKEYES-JSGCOSHPSA-N 0.000 description 4
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 4
- 241000244206 Nematoda Species 0.000 description 4
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 4
- FFOKMZOAVHEWET-UHFFFAOYSA-N Serinyl-Cysteine Chemical compound OCC(N)C(=O)NC(CS)C(O)=O FFOKMZOAVHEWET-UHFFFAOYSA-N 0.000 description 4
- OYOQKMOWUDVWCR-RYUDHWBXSA-N Tyr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OYOQKMOWUDVWCR-RYUDHWBXSA-N 0.000 description 4
- QZOSVNLXLSNHQK-UHFFFAOYSA-N Tyrosyl-Aspartate Chemical compound OC(=O)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 QZOSVNLXLSNHQK-UHFFFAOYSA-N 0.000 description 4
- 108010036533 arginylvaline Proteins 0.000 description 4
- 230000001580 bacterial Effects 0.000 description 4
- 230000027455 binding Effects 0.000 description 4
- 230000003115 biocidal Effects 0.000 description 4
- 239000001110 calcium chloride Substances 0.000 description 4
- 235000011148 calcium chloride Nutrition 0.000 description 4
- 229910001628 calcium chloride Inorganic materials 0.000 description 4
- 230000001413 cellular Effects 0.000 description 4
- 230000000875 corresponding Effects 0.000 description 4
- 230000018109 developmental process Effects 0.000 description 4
- ZHNUHDYFZUAESO-UHFFFAOYSA-N formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 4
- 238000005755 formation reaction Methods 0.000 description 4
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 4
- 108010020688 glycylhistidine Proteins 0.000 description 4
- 238000010348 incorporation Methods 0.000 description 4
- 108010044655 lysylproline Proteins 0.000 description 4
- 238000004519 manufacturing process Methods 0.000 description 4
- 108010034507 methionyltryptophan Proteins 0.000 description 4
- 238000002156 mixing Methods 0.000 description 4
- 239000002751 oligonucleotide probe Substances 0.000 description 4
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 4
- 230000002285 radioactive Effects 0.000 description 4
- 238000000844 transformation Methods 0.000 description 4
- 108010080629 tryptophan-leucine Proteins 0.000 description 4
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 4
- OABOXRPGTFRBFZ-IMJSIDKUSA-N (2R)-2-[[(2R)-2-amino-3-sulfanylpropanoyl]amino]-3-sulfanylpropanoic acid Chemical compound SC[C@H](N)C(=O)N[C@@H](CS)C(O)=O OABOXRPGTFRBFZ-IMJSIDKUSA-N 0.000 description 3
- IDKGBVZGNTYYCC-QXEWZRGKSA-N (2S)-1-[(2S)-4-amino-2-[[(2S)-2-amino-3-methylbutanoyl]amino]-4-oxobutanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(O)=O IDKGBVZGNTYYCC-QXEWZRGKSA-N 0.000 description 3
- VKVDRTGWLVZJOM-DCAQKATOSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-methylbutanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 3
- JEDIEMIJYSRUBB-FOHZUACHSA-N (3S)-3-[[(2S,3R)-2-amino-3-hydroxybutanoyl]amino]-4-(carboxymethylamino)-4-oxobutanoic acid Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O JEDIEMIJYSRUBB-FOHZUACHSA-N 0.000 description 3
- NTBFKPBULZGXQL-KKUMJFAQSA-N (3S)-4-[[(1S)-1-carboxy-2-(4-hydroxyphenyl)ethyl]amino]-3-[[(2S)-2,6-diaminohexanoyl]amino]-4-oxobutanoic acid Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTBFKPBULZGXQL-KKUMJFAQSA-N 0.000 description 3
- VWHGTYCRDRBSFI-ZETCQYMHSA-N 2-[[2-[[(2S)-2-amino-4-methylpentanoyl]amino]acetyl]amino]acetic acid Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 3
- XZWXFWBHYRFLEF-FSPLSTOPSA-N Ala-His Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 XZWXFWBHYRFLEF-FSPLSTOPSA-N 0.000 description 3
- SITWEMZOJNKJCH-UHFFFAOYSA-N Alanyl-Arginine Chemical compound CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 3
- 240000002840 Allium cepa Species 0.000 description 3
- QADCERNTBWTXFV-JSGCOSHPSA-N Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCNC(N)=N)N)C(O)=O)=CNC2=C1 QADCERNTBWTXFV-JSGCOSHPSA-N 0.000 description 3
- DVUFTQLHHHJEMK-IMJSIDKUSA-N Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O DVUFTQLHHHJEMK-IMJSIDKUSA-N 0.000 description 3
- TWXZVVXRRRRSLT-UHFFFAOYSA-N Asparaginyl-Cysteine Chemical compound NC(=O)CC(N)C(=O)NC(CS)C(O)=O TWXZVVXRRRRSLT-UHFFFAOYSA-N 0.000 description 3
- ZVDPYSVOZFINEE-UHFFFAOYSA-N Aspartyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CC(O)=O ZVDPYSVOZFINEE-UHFFFAOYSA-N 0.000 description 3
- 229920002799 BoPET Polymers 0.000 description 3
- 240000007124 Brassica oleracea Species 0.000 description 3
- 235000011301 Brassica oleracea var capitata Nutrition 0.000 description 3
- 235000001169 Brassica oleracea var oleracea Nutrition 0.000 description 3
- YHDXIZKDOIWPBW-UHFFFAOYSA-N Cysteinyl-Glutamine Chemical compound SCC(N)C(=O)NC(C(O)=O)CCC(N)=O YHDXIZKDOIWPBW-UHFFFAOYSA-N 0.000 description 3
- NXTYATMDWQYLGJ-UHFFFAOYSA-N Cysteinyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CS NXTYATMDWQYLGJ-UHFFFAOYSA-N 0.000 description 3
- OELDIVRKHTYFNG-UHFFFAOYSA-N Cysteinyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)CS OELDIVRKHTYFNG-UHFFFAOYSA-N 0.000 description 3
- 239000003155 DNA primer Substances 0.000 description 3
- 101700011961 DPOM Proteins 0.000 description 3
- 229940110715 ENZYMES FOR TREATMENT OF WOUNDS AND ULCERS Drugs 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- 241000233866 Fungi Species 0.000 description 3
- LLEUXCDZPQOJMY-AAEUAGOBSA-N Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)N)C(O)=O)=CNC2=C1 LLEUXCDZPQOJMY-AAEUAGOBSA-N 0.000 description 3
- JZOYFBPIEHCDFV-UHFFFAOYSA-N Glutaminyl-Histidine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 JZOYFBPIEHCDFV-UHFFFAOYSA-N 0.000 description 3
- PFMUCCYYAAFKTH-YFKPBYRVSA-N Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CN PFMUCCYYAAFKTH-YFKPBYRVSA-N 0.000 description 3
- AJHCSUXXECOXOY-NSHDSACASA-N Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-NSHDSACASA-N 0.000 description 3
- MMFKFJORZBJVNF-UWVGGRQHSA-N His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 MMFKFJORZBJVNF-UWVGGRQHSA-N 0.000 description 3
- LNCFUHAPNTYMJB-IUCAKERBSA-N His-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CN=CN1 LNCFUHAPNTYMJB-IUCAKERBSA-N 0.000 description 3
- VLDVBZICYBVQHB-IUCAKERBSA-N His-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])CC1=CN=CN1 VLDVBZICYBVQHB-IUCAKERBSA-N 0.000 description 3
- NIKBMHGRNAPJFW-UHFFFAOYSA-N Histidinyl-Arginine Chemical compound NC(=N)NCCCC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 NIKBMHGRNAPJFW-UHFFFAOYSA-N 0.000 description 3
- 125000000998 L-alanino group Chemical group [H]N([*])[C@](C([H])([H])[H])([H])C(=O)O[H] 0.000 description 3
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 3
- 101710029649 MDV043 Proteins 0.000 description 3
- PBOUVYGPDSARIS-IUCAKERBSA-N Met-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(C)C PBOUVYGPDSARIS-IUCAKERBSA-N 0.000 description 3
- WEDDFMCSUNNZJR-WDSKDSINSA-N Met-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O WEDDFMCSUNNZJR-WDSKDSINSA-N 0.000 description 3
- KAKJTZWHIUWTTD-VQVTYTSYSA-N Met-Thr Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)O)C([O-])=O KAKJTZWHIUWTTD-VQVTYTSYSA-N 0.000 description 3
- 102000016943 Muramidase Human genes 0.000 description 3
- 108010014251 Muramidase Proteins 0.000 description 3
- 239000005041 Mylar™ Substances 0.000 description 3
- 240000007594 Oryza sativa Species 0.000 description 3
- 235000007164 Oryza sativa Nutrition 0.000 description 3
- 101700061424 POLB Proteins 0.000 description 3
- PYOHODCEOHCZBM-RYUDHWBXSA-N Phe-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 PYOHODCEOHCZBM-RYUDHWBXSA-N 0.000 description 3
- RWCOTTLHDJWHRS-YUMQZZPRSA-N Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 RWCOTTLHDJWHRS-YUMQZZPRSA-N 0.000 description 3
- 108010079005 RDV peptide Proteins 0.000 description 3
- 101700054624 RF1 Proteins 0.000 description 3
- 101710042981 SHMT1 Proteins 0.000 description 3
- 235000002595 Solanum tuberosum Nutrition 0.000 description 3
- 240000001016 Solanum tuberosum Species 0.000 description 3
- LWFWZRANSFAJDR-JSGCOSHPSA-N Trp-Val Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O)=CNC2=C1 LWFWZRANSFAJDR-JSGCOSHPSA-N 0.000 description 3
- ZQOOYCZQENFIMC-STQMWFEESA-N Tyr-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=C(O)C=C1 ZQOOYCZQENFIMC-STQMWFEESA-N 0.000 description 3
- AUEJLPRZGVVDNU-STQMWFEESA-N Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-STQMWFEESA-N 0.000 description 3
- 241000625014 Vir Species 0.000 description 3
- 235000005042 ornamental cabbage Nutrition 0.000 description 3
- 108010070944 alanylhistidine Proteins 0.000 description 3
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 3
- 229920002083 cellular DNA Polymers 0.000 description 3
- 238000010192 crystallographic characterization Methods 0.000 description 3
- 108010016616 cysteinylglycine Proteins 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 239000012153 distilled water Substances 0.000 description 3
- 230000002255 enzymatic Effects 0.000 description 3
- WSFSSNUMVMOOMR-UHFFFAOYSA-N formaldehyde Chemical compound O=C WSFSSNUMVMOOMR-UHFFFAOYSA-N 0.000 description 3
- 108020001507 fusion proteins Proteins 0.000 description 3
- 102000037240 fusion proteins Human genes 0.000 description 3
- 239000007789 gas Substances 0.000 description 3
- 108010087823 glycyltyrosine Proteins 0.000 description 3
- 239000008187 granular material Substances 0.000 description 3
- 229940020899 hematological Enzymes Drugs 0.000 description 3
- 108010085325 histidylproline Proteins 0.000 description 3
- 229910052739 hydrogen Inorganic materials 0.000 description 3
- 239000001257 hydrogen Substances 0.000 description 3
- 239000002917 insecticide Substances 0.000 description 3
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 3
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 3
- 108010091871 leucylmethionine Proteins 0.000 description 3
- 239000007788 liquid Substances 0.000 description 3
- 238000011068 load Methods 0.000 description 3
- 229960000274 lysozyme Drugs 0.000 description 3
- 239000004325 lysozyme Substances 0.000 description 3
- 235000010335 lysozyme Nutrition 0.000 description 3
- 239000012528 membrane Substances 0.000 description 3
- 108010056582 methionylglutamic acid Proteins 0.000 description 3
- 239000003094 microcapsule Substances 0.000 description 3
- 238000010369 molecular cloning Methods 0.000 description 3
- 235000015097 nutrients Nutrition 0.000 description 3
- 235000002732 onion Nutrition 0.000 description 3
- 239000006174 pH buffer Substances 0.000 description 3
- 238000004806 packaging method and process Methods 0.000 description 3
- 108010084572 phenylalanyl-valine Proteins 0.000 description 3
- 235000012015 potatoes Nutrition 0.000 description 3
- 108010079317 prolyl-tyrosine Proteins 0.000 description 3
- 210000001938 protoplasts Anatomy 0.000 description 3
- 108091007521 restriction endonucleases Proteins 0.000 description 3
- 235000009566 rice Nutrition 0.000 description 3
- 150000003839 salts Chemical class 0.000 description 3
- 238000000926 separation method Methods 0.000 description 3
- FAPWRFPIFSIZLT-UHFFFAOYSA-M sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 3
- 241000894007 species Species 0.000 description 3
- 239000004094 surface-active agent Substances 0.000 description 3
- 108010084932 tryptophyl-proline Proteins 0.000 description 3
- 108010078580 tyrosylleucine Proteins 0.000 description 3
- KAJAOGBVWCYGHZ-JTQLQIEISA-N (2S)-2-[[2-[(2-azaniumylacetyl)amino]acetyl]amino]-3-phenylpropanoate Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 2
- IAJFFZORSWOZPQ-SRVKXCTJSA-N (2S)-4-amino-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 2
- KXTAGESXNQEZKB-DZKIICNBSA-N (4S)-4-amino-5-[[(2S)-1-[[(1S)-1-carboxy-2-methylpropyl]amino]-1-oxo-3-phenylpropan-2-yl]amino]-5-oxopentanoic acid Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 KXTAGESXNQEZKB-DZKIICNBSA-N 0.000 description 2
- SOYWRINXUSUWEQ-DLOVCJGASA-N (4S)-4-amino-5-[[(2S)-1-[[(1S)-1-carboxy-2-methylpropyl]amino]-3-methyl-1-oxobutan-2-yl]amino]-5-oxopentanoic acid Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 2
- GGJOGFJIPPGNRK-JSGCOSHPSA-N (4S)-4-amino-5-[[2-[[(1S)-1-carboxy-2-(1H-indol-3-yl)ethyl]amino]-2-oxoethyl]amino]-5-oxopentanoic acid Chemical compound C1=CC=C2C(C[C@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)N)C(O)=O)=CNC2=C1 GGJOGFJIPPGNRK-JSGCOSHPSA-N 0.000 description 2
- 229920000160 (ribonucleotides)n+m Polymers 0.000 description 2
- 102100003861 ADRB1 Human genes 0.000 description 2
- 101700078529 ADRB1 Proteins 0.000 description 2
- 241000589156 Agrobacterium rhizogenes Species 0.000 description 2
- 241000589155 Agrobacterium tumefaciens Species 0.000 description 2
- CXISPYVYMQWFLE-VKHMYHEASA-N Ala-Gly Chemical compound C[C@H]([NH3+])C(=O)NCC([O-])=O CXISPYVYMQWFLE-VKHMYHEASA-N 0.000 description 2
- WPWUFUBLGADILS-WDSKDSINSA-N Ala-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WPWUFUBLGADILS-WDSKDSINSA-N 0.000 description 2
- 241000588986 Alcaligenes Species 0.000 description 2
- SJUXYGVRSGTPMC-UHFFFAOYSA-N Asparaginyl-Alanine Chemical compound OC(=O)C(C)NC(=O)C(N)CC(N)=O SJUXYGVRSGTPMC-UHFFFAOYSA-N 0.000 description 2
- 235000005340 Asparagus officinalis Nutrition 0.000 description 2
- 240000001498 Asparagus officinalis Species 0.000 description 2
- 206010003694 Atrophy Diseases 0.000 description 2
- 241000223651 Aureobasidium Species 0.000 description 2
- 241000589151 Azotobacter Species 0.000 description 2
- 241000193830 Bacillus <bacterium> Species 0.000 description 2
- 241000219310 Beta vulgaris subsp. vulgaris Species 0.000 description 2
- 235000011331 Brassica Nutrition 0.000 description 2
- 241000219198 Brassica Species 0.000 description 2
- 240000002791 Brassica napus Species 0.000 description 2
- 235000002566 Capsicum Nutrition 0.000 description 2
- 240000000464 Cicer arietinum Species 0.000 description 2
- 235000010523 Cicer arietinum Nutrition 0.000 description 2
- 108020004705 Codon Proteins 0.000 description 2
- YXQDRIRSAHTJKM-IMJSIDKUSA-N Cys-Ser Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(O)=O YXQDRIRSAHTJKM-IMJSIDKUSA-N 0.000 description 2
- AYKQJQVWUYEZNU-UHFFFAOYSA-N Cysteinyl-Asparagine Chemical compound SCC(N)C(=O)NC(C(O)=O)CC(N)=O AYKQJQVWUYEZNU-UHFFFAOYSA-N 0.000 description 2
- WQZGKKKJIJFFOK-GASJEMHNSA-N D-Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 2
- 238000001712 DNA sequencing Methods 0.000 description 2
- 102000016911 Deoxyribonucleases Human genes 0.000 description 2
- 108010053770 Deoxyribonucleases Proteins 0.000 description 2
- ZPWVASYFFYYZEW-UHFFFAOYSA-L Dipotassium phosphate Chemical compound [K+].[K+].OP([O-])([O-])=O ZPWVASYFFYYZEW-UHFFFAOYSA-L 0.000 description 2
- 238000002965 ELISA Methods 0.000 description 2
- 241000206602 Eukaryota Species 0.000 description 2
- 240000001441 Fragaria vesca Species 0.000 description 2
- 101710042240 GLUL Proteins 0.000 description 2
- YIWFXZNIBQBFHR-LURJTMIESA-N Gly-His Chemical compound [NH3+]CC(=O)N[C@H](C([O-])=O)CC1=CN=CN1 YIWFXZNIBQBFHR-LURJTMIESA-N 0.000 description 2
- 241000255967 Helicoverpa zea Species 0.000 description 2
- 239000007836 KH2PO4 Substances 0.000 description 2
- ZUKPVRWZDMRIEO-VKHMYHEASA-N L-cysteinylglycine zwitterion Chemical compound SC[C@H]([NH3+])C(=O)NCC([O-])=O ZUKPVRWZDMRIEO-VKHMYHEASA-N 0.000 description 2
- 101700021119 LEUC Proteins 0.000 description 2
- 241000258916 Leptinotarsa decemlineata Species 0.000 description 2
- HIZYETOZLYFUFF-BQBZGAKWSA-N Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(O)=O HIZYETOZLYFUFF-BQBZGAKWSA-N 0.000 description 2
- 235000007688 Lycopersicon esculentum Nutrition 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- ADHNYKZHPOEULM-BQBZGAKWSA-N Met-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O ADHNYKZHPOEULM-BQBZGAKWSA-N 0.000 description 2
- GNSKLFRGEWLPPA-UHFFFAOYSA-M Monopotassium phosphate Chemical compound [K+].OP(O)([O-])=O GNSKLFRGEWLPPA-UHFFFAOYSA-M 0.000 description 2
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 2
- 240000008962 Nicotiana tabacum Species 0.000 description 2
- 241000256259 Noctuidae Species 0.000 description 2
- 108060006775 PSBP Proteins 0.000 description 2
- IEHDJWSAXBGJIP-RYUDHWBXSA-N Phe-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 IEHDJWSAXBGJIP-RYUDHWBXSA-N 0.000 description 2
- 240000001203 Potentilla anserina Species 0.000 description 2
- 235000016594 Potentilla anserina Nutrition 0.000 description 2
- HMNSRTLZAJHSIK-YUMQZZPRSA-N Pro-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 HMNSRTLZAJHSIK-YUMQZZPRSA-N 0.000 description 2
- RVQDZELMXZRSSI-IUCAKERBSA-N Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 RVQDZELMXZRSSI-IUCAKERBSA-N 0.000 description 2
- BEPSGCXDIVACBU-UHFFFAOYSA-N Prolyl-Histidine Chemical compound C1CCNC1C(=O)NC(C(=O)O)CC1=CN=CN1 BEPSGCXDIVACBU-UHFFFAOYSA-N 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- YZMPDHTZJJCGEI-BQBZGAKWSA-N Ser-His Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 YZMPDHTZJJCGEI-BQBZGAKWSA-N 0.000 description 2
- LZLREEUGSYITMX-UHFFFAOYSA-N Serinyl-Tryptophan Chemical compound C1=CC=C2C(CC(NC(=O)C(CO)N)C(O)=O)=CNC2=C1 LZLREEUGSYITMX-UHFFFAOYSA-N 0.000 description 2
- 240000003768 Solanum lycopersicum Species 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N Thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- LWIHDJKSTIGBAC-UHFFFAOYSA-K Tripotassium phosphate Chemical compound [K+].[K+].[K+].[O-]P([O-])([O-])=O LWIHDJKSTIGBAC-UHFFFAOYSA-K 0.000 description 2
- DZHDVYLBNKMLMB-ZFWWWQNUSA-N Trp-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 DZHDVYLBNKMLMB-ZFWWWQNUSA-N 0.000 description 2
- IMMPMHKLUUZKAZ-WMZOPIPTSA-N Trp-Phe Chemical compound C([C@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=CC=C1 IMMPMHKLUUZKAZ-WMZOPIPTSA-N 0.000 description 2
- MYVYPSWUSKCCHG-JQWIXIFHSA-N Trp-Ser Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 MYVYPSWUSKCCHG-JQWIXIFHSA-N 0.000 description 2
- DXYQIGZZWYBXSD-UHFFFAOYSA-N Tryptophyl-Proline Chemical compound C=1NC2=CC=CC=C2C=1CC(N)C(=O)N1CCCC1C(O)=O DXYQIGZZWYBXSD-UHFFFAOYSA-N 0.000 description 2
- BMPPMAOOKQJYIP-WMZOPIPTSA-N Tyr-Trp Chemical compound C([C@H]([NH3+])C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C([O-])=O)C1=CC=C(O)C=C1 BMPPMAOOKQJYIP-WMZOPIPTSA-N 0.000 description 2
- GJNDXQBALKCYSZ-RYUDHWBXSA-N Val-Phe Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 GJNDXQBALKCYSZ-RYUDHWBXSA-N 0.000 description 2
- 238000009825 accumulation Methods 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 238000007792 addition Methods 0.000 description 2
- 238000009632 agar plate Methods 0.000 description 2
- 108010047495 alanylglycine Proteins 0.000 description 2
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 2
- 239000003242 anti bacterial agent Substances 0.000 description 2
- 108010029539 arginyl-prolyl-proline Proteins 0.000 description 2
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 2
- 239000003139 biocide Substances 0.000 description 2
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- 229960002685 biotin Drugs 0.000 description 2
- 235000020958 biotin Nutrition 0.000 description 2
- 239000011616 biotin Substances 0.000 description 2
- 238000009835 boiling Methods 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- 150000001875 compounds Chemical class 0.000 description 2
- 230000001809 detectable Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000000378 dietary Effects 0.000 description 2
- 229910000396 dipotassium phosphate Inorganic materials 0.000 description 2
- 235000019797 dipotassium phosphate Nutrition 0.000 description 2
- 239000002270 dispersing agent Substances 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N edta Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- 238000004520 electroporation Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 239000000834 fixative Substances 0.000 description 2
- 230000037406 food intake Effects 0.000 description 2
- 238000009472 formulation Methods 0.000 description 2
- 238000007710 freezing Methods 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 2
- 239000008103 glucose Substances 0.000 description 2
- -1 glutaraldehyde Chemical class 0.000 description 2
- VPZXBVLAVMBEQI-VKHMYHEASA-N gly ala Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 2
- 239000002596 immunotoxin Substances 0.000 description 2
- 238000002347 injection Methods 0.000 description 2
- 239000007924 injection Substances 0.000 description 2
- 230000003834 intracellular Effects 0.000 description 2
- PNDPGZBMCMUPRI-UHFFFAOYSA-N iodine Chemical compound II PNDPGZBMCMUPRI-UHFFFAOYSA-N 0.000 description 2
- 229910052740 iodine Inorganic materials 0.000 description 2
- 239000011630 iodine Substances 0.000 description 2
- 230000000974 larvacidal Effects 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000006011 modification reaction Methods 0.000 description 2
- 229910000402 monopotassium phosphate Inorganic materials 0.000 description 2
- 235000019796 monopotassium phosphate Nutrition 0.000 description 2
- 230000004899 motility Effects 0.000 description 2
- 108010073101 phenylalanylleucine Proteins 0.000 description 2
- 239000004033 plastic Substances 0.000 description 2
- 239000004417 polycarbonate Substances 0.000 description 2
- 238000005215 recombination Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 230000002623 sporogenic Effects 0.000 description 2
- 235000021012 strawberries Nutrition 0.000 description 2
- 230000002194 synthesizing Effects 0.000 description 2
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 2
- 210000001519 tissues Anatomy 0.000 description 2
- 230000001988 toxicity Effects 0.000 description 2
- 231100000419 toxicity Toxicity 0.000 description 2
- CEYYIKYYFSTQRU-UHFFFAOYSA-M trimethyl(tetradecyl)azanium;chloride Chemical compound [Cl-].CCCCCCCCCCCCCC[N+](C)(C)C CEYYIKYYFSTQRU-UHFFFAOYSA-M 0.000 description 2
- 108010015666 tryptophyl-leucyl-glutamic acid Proteins 0.000 description 2
- 108010044292 tryptophyltyrosine Proteins 0.000 description 2
- SIGGQAHUPUBWNF-UHFFFAOYSA-N γ-glutamyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CCC(N)=O SIGGQAHUPUBWNF-UHFFFAOYSA-N 0.000 description 2
- GHBSKQGCIYSCNS-NAKRPEOUSA-N (2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-aminopropanoyl]amino]-4-methylpentanoyl]amino]-3-carboxypropanoyl]amino]butanedioic acid Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O GHBSKQGCIYSCNS-NAKRPEOUSA-N 0.000 description 1
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-phenylpropanoyl]amino]butanedioic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 1
- BQSLGJHIAGOZCD-CIUDSAMLSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]propanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 1
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]propanoyl]amino]butanedioic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 1
- ZNGPROMGGGFOAA-JYJNAYRXSA-N (2S)-2-[[(2S)-2-[[(2S)-2-azaniumyl-3-methylbutanoyl]amino]-3-(4-hydroxyphenyl)propanoyl]amino]-3-methylbutanoate Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 ZNGPROMGGGFOAA-JYJNAYRXSA-N 0.000 description 1
- XPJBQTCXPJNIFE-ZETCQYMHSA-N (2S)-2-[[2-[(2-azaniumylacetyl)amino]acetyl]amino]-4-methylpentanoate Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 1
- BIZNDKMFQHDOIE-KKUMJFAQSA-N (2S)-4-amino-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-phenylpropanoyl]amino]-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 1
- QPBSRMDNJOTFAL-AICCOOGYSA-N (2S,3R)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-aminopropanoyl]amino]-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]-3-hydroxybutanoic acid Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QPBSRMDNJOTFAL-AICCOOGYSA-N 0.000 description 1
- QLQHWWCSCLZUMA-KKUMJFAQSA-N (3S)-3-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-[[(1S)-1-carboxy-2-(4-hydroxyphenyl)ethyl]amino]-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QLQHWWCSCLZUMA-KKUMJFAQSA-N 0.000 description 1
- VHJLVAABSRFDPM-UHFFFAOYSA-N 1,4-dimercaptobutane-2,3-diol Chemical compound SCC(O)C(O)CS VHJLVAABSRFDPM-UHFFFAOYSA-N 0.000 description 1
- RLMISHABBKUNFO-WHFBIAKZSA-N 2-[[(2S)-2-[[(2S)-2-azaniumylpropanoyl]amino]propanoyl]amino]acetate Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 1
- PABVKUJVLNMOJP-UHFFFAOYSA-N 4-amino-5-[(1-carboxy-2-sulfanylethyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(CS)C(O)=O PABVKUJVLNMOJP-UHFFFAOYSA-N 0.000 description 1
- 229940074728 ANTIINFECTIVE OPHTHALMOLOGICS Drugs 0.000 description 1
- 241000589220 Acetobacter Species 0.000 description 1
- 244000235858 Acetobacter xylinum Species 0.000 description 1
- 235000002837 Acetobacter xylinum Nutrition 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Natural products NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 229960000643 Adenine Drugs 0.000 description 1
- 229920001817 Agar Polymers 0.000 description 1
- 229920000936 Agarose Polymers 0.000 description 1
- 241000902874 Agelastica alni Species 0.000 description 1
- FSHURBQASBLAPO-WDSKDSINSA-N Ala-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C)N FSHURBQASBLAPO-WDSKDSINSA-N 0.000 description 1
- WUGMRIBZSVSJNP-UFBFGSQYSA-N Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UFBFGSQYSA-N 0.000 description 1
- 235000005254 Allium ampeloprasum Nutrition 0.000 description 1
- 240000006108 Allium ampeloprasum Species 0.000 description 1
- SLXKOJJOQWFEFD-UHFFFAOYSA-N Aminocaproic acid Chemical compound NCCCCCC(O)=O SLXKOJJOQWFEFD-UHFFFAOYSA-N 0.000 description 1
- 229940064005 Antibiotic throat preparations Drugs 0.000 description 1
- 229940083879 Antibiotics FOR TREATMENT OF HEMORRHOIDS AND ANAL FISSURES FOR TOPICAL USE Drugs 0.000 description 1
- 229940042052 Antibiotics for systemic use Drugs 0.000 description 1
- 229940021383 Antiinfective irrigating solutions Drugs 0.000 description 1
- 229960005475 Antiinfectives Drugs 0.000 description 1
- 229940042786 Antitubercular Antibiotics Drugs 0.000 description 1
- 240000007087 Apium graveolens Species 0.000 description 1
- 235000015849 Apium graveolens Dulce Group Nutrition 0.000 description 1
- 235000010591 Appio Nutrition 0.000 description 1
- WYBVBIHNJWOLCJ-IUCAKERBSA-N Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N WYBVBIHNJWOLCJ-IUCAKERBSA-N 0.000 description 1
- DAQIJMOLTMGJLO-YUMQZZPRSA-N Arg-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N DAQIJMOLTMGJLO-YUMQZZPRSA-N 0.000 description 1
- 241000186063 Arthrobacter Species 0.000 description 1
- 241000512259 Ascophyllum nodosum Species 0.000 description 1
- RGGVDKVXLBOLNS-UHFFFAOYSA-N Asparaginyl-Tryptophan Chemical compound C1=CC=C2C(CC(NC(=O)C(CC(N)=O)N)C(O)=O)=CNC2=C1 RGGVDKVXLBOLNS-UHFFFAOYSA-N 0.000 description 1
- FKBFDTRILNZGAI-UHFFFAOYSA-N Aspartyl-Cysteine Chemical compound OC(=O)CC(N)C(=O)NC(CS)C(O)=O FKBFDTRILNZGAI-UHFFFAOYSA-N 0.000 description 1
- 235000007319 Avena orientalis Nutrition 0.000 description 1
- 244000075850 Avena orientalis Species 0.000 description 1
- FTNJWQUOZFUQQJ-NDAWSKJSSA-N Azadirachtin Chemical compound C([C@@H]([C@]1(C=CO[C@H]1O1)O)[C@]2(C)O3)[C@H]1[C@]23[C@]1(C)[C@H](O)[C@H](OC[C@@]2([C@@H](C[C@@H]3OC(=O)C(\C)=C\C)OC(C)=O)C(=O)OC)[C@@H]2[C@]32CO[C@@](C(=O)OC)(O)[C@@H]12 FTNJWQUOZFUQQJ-NDAWSKJSSA-N 0.000 description 1
- 239000005878 Azadirachtin Substances 0.000 description 1
- 108010023063 Bacto-peptone Proteins 0.000 description 1
- 235000016068 Berberis vulgaris Nutrition 0.000 description 1
- 241000335053 Beta vulgaris Species 0.000 description 1
- 108010006654 Bleomycin Proteins 0.000 description 1
- 229960001561 Bleomycin Drugs 0.000 description 1
- 239000011547 Bouin solution Substances 0.000 description 1
- 235000011293 Brassica napus Nutrition 0.000 description 1
- 235000011299 Brassica oleracea var botrytis Nutrition 0.000 description 1
- 235000017647 Brassica oleracea var italica Nutrition 0.000 description 1
- 240000003259 Brassica oleracea var. botrytis Species 0.000 description 1
- 235000010149 Brassica rapa subsp chinensis Nutrition 0.000 description 1
- 235000000536 Brassica rapa subsp pekinensis Nutrition 0.000 description 1
- 235000000540 Brassica rapa subsp rapa Nutrition 0.000 description 1
- 241000499436 Brassica rapa subsp. pekinensis Species 0.000 description 1
- 235000004977 Brassica sinapistrum Nutrition 0.000 description 1
- 229940041514 Candida albicans extract Drugs 0.000 description 1
- 240000000218 Cannabis sativa Species 0.000 description 1
- 240000008574 Capsicum frutescens Species 0.000 description 1
- 210000002421 Cell Wall Anatomy 0.000 description 1
- 229960001927 Cetylpyridinium Chloride Drugs 0.000 description 1
- YMKDRGPMQRFJGP-UHFFFAOYSA-M Cetylpyridinium chloride Chemical compound [Cl-].CCCCCCCCCCCCCCCC[N+]1=CC=CC=C1 YMKDRGPMQRFJGP-UHFFFAOYSA-M 0.000 description 1
- 206010008531 Chills Diseases 0.000 description 1
- 229960005091 Chloramphenicol Drugs 0.000 description 1
- WIIZWVCIJKGZOK-RKDXNWHRSA-N Chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 1
- 235000007542 Cichorium intybus Nutrition 0.000 description 1
- 240000008051 Cichorium intybus Species 0.000 description 1
- 235000005979 Citrus limon Nutrition 0.000 description 1
- 240000002268 Citrus limon Species 0.000 description 1
- 241001429175 Colitis phage Species 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- 241001337994 Cryptococcus <scale insect> Species 0.000 description 1
- 241000195493 Cryptophyta Species 0.000 description 1
- WYVKPHCYMTWUCW-UHFFFAOYSA-N Cysteinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CS WYVKPHCYMTWUCW-UHFFFAOYSA-N 0.000 description 1
- 229940104302 Cytosine Drugs 0.000 description 1
- OPTASPLRGRRNAP-UHFFFAOYSA-N Cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 1
- IGXWBGJHJZYPQS-SSDOTTSWSA-N D-Luciferin Chemical compound OC(=O)[C@H]1CSC(C=2SC3=CC=C(O)C=C3N=2)=N1 IGXWBGJHJZYPQS-SSDOTTSWSA-N 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N D-sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- 238000007399 DNA isolation Methods 0.000 description 1
- 239000003298 DNA probe Substances 0.000 description 1
- 101710028159 DNTT Proteins 0.000 description 1
- 102100002445 DNTT Human genes 0.000 description 1
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 1
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 1
- 241000321929 Dothiorella pretoriensis Species 0.000 description 1
- 102000008422 EC 2.7.1.78 Human genes 0.000 description 1
- 108010021757 EC 2.7.1.78 Proteins 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- 108010067770 Endopeptidase K Proteins 0.000 description 1
- 241000588698 Erwinia Species 0.000 description 1
- 241001585293 Euxoa Species 0.000 description 1
- 241001658214 Euxoa detersa Species 0.000 description 1
- 241001368778 Euxoa messoria Species 0.000 description 1
- 241000444886 Euxoa ochrogaster Species 0.000 description 1
- 241001367768 Euxoa tessellata Species 0.000 description 1
- 102000013165 Exonucleases Human genes 0.000 description 1
- 108060002716 Exonucleases Proteins 0.000 description 1
- 231100000776 Exotoxin Toxicity 0.000 description 1
- 241000233488 Feltia Species 0.000 description 1
- MFBYPDKTAJXHNI-VKHMYHEASA-N Gly-Cys Chemical compound [NH3+]CC(=O)N[C@@H](CS)C([O-])=O MFBYPDKTAJXHNI-VKHMYHEASA-N 0.000 description 1
- IKAIKUBBJHFNBZ-LURJTMIESA-N Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CN IKAIKUBBJHFNBZ-LURJTMIESA-N 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- 240000007842 Glycine max Species 0.000 description 1
- 240000006962 Gossypium hirsutum Species 0.000 description 1
- UYTPUPDQBNUYGX-UHFFFAOYSA-N Guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 1
- 229940093922 Gynecological Antibiotics Drugs 0.000 description 1
- 101710017531 H4C15 Proteins 0.000 description 1
- 102100019126 HBB Human genes 0.000 description 1
- VEXZGXHMUGYJMC-UHFFFAOYSA-N HCl Chemical compound Cl VEXZGXHMUGYJMC-UHFFFAOYSA-N 0.000 description 1
- 102100013102 HDAC10 Human genes 0.000 description 1
- 101710042173 HDAC10 Proteins 0.000 description 1
- 241000208818 Helianthus Species 0.000 description 1
- 235000003222 Helianthus annuus Nutrition 0.000 description 1
- 108091005902 Hemoglobin subunit beta Proteins 0.000 description 1
- VHOLZZKNEBBHTH-YUMQZZPRSA-N His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 VHOLZZKNEBBHTH-YUMQZZPRSA-N 0.000 description 1
- KRBMQYPTDYSENE-BQBZGAKWSA-N His-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 KRBMQYPTDYSENE-BQBZGAKWSA-N 0.000 description 1
- 240000005979 Hordeum vulgare Species 0.000 description 1
- 235000007340 Hordeum vulgare Nutrition 0.000 description 1
- 102000004157 Hydrolases Human genes 0.000 description 1
- 108090000604 Hydrolases Proteins 0.000 description 1
- 241000108025 Indosylvirana aurantiaca Species 0.000 description 1
- 235000002678 Ipomoea batatas Nutrition 0.000 description 1
- 240000003613 Ipomoea batatas Species 0.000 description 1
- BAUYGSIQEAFULO-UHFFFAOYSA-L Iron(II) sulfate Chemical compound [Fe+2].[O-]S([O-])(=O)=O BAUYGSIQEAFULO-UHFFFAOYSA-L 0.000 description 1
- 240000007049 Juglans regia Species 0.000 description 1
- 235000009496 Juglans regia Nutrition 0.000 description 1
- 241000588748 Klebsiella Species 0.000 description 1
- 241000235649 Kluyveromyces Species 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- 125000000174 L-prolyl group Chemical group [H]N1C([H])([H])C([H])([H])C([H])([H])[C@@]1([H])C(*)=O 0.000 description 1
- MTCFGRXMJLQNBG-REOHCLBHSA-N L-serine Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 1
- XUIIKFGFIJCVMT-LBPRGKRZSA-N L-thyroxine zwitterion Chemical compound IC1=CC(C[C@H]([NH3+])C([O-])=O)=CC(I)=C1OC1=CC(I)=C(O)C(I)=C1 XUIIKFGFIJCVMT-LBPRGKRZSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 241000186660 Lactobacillus Species 0.000 description 1
- 229940039696 Lactobacillus Drugs 0.000 description 1
- 235000003228 Lactuca sativa Nutrition 0.000 description 1
- 240000008415 Lactuca sativa Species 0.000 description 1
- NTISAKGPIGTIJJ-IUCAKERBSA-N Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(C)C NTISAKGPIGTIJJ-IUCAKERBSA-N 0.000 description 1
- BQVUABVGYYSDCJ-ZFWWWQNUSA-N Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-ZFWWWQNUSA-N 0.000 description 1
- 241000192132 Leuconostoc Species 0.000 description 1
- 241000243662 Lumbricus terrestris Species 0.000 description 1
- IGRMTQMIDNDFAA-UHFFFAOYSA-N Lysyl-Histidine Chemical compound NCCCCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 IGRMTQMIDNDFAA-UHFFFAOYSA-N 0.000 description 1
- SQQMAOCOWKFBNP-UHFFFAOYSA-L Manganese(II) sulfate Chemical compound [Mn+2].[O-]S([O-])(=O)=O SQQMAOCOWKFBNP-UHFFFAOYSA-L 0.000 description 1
- 241000215495 Massilia timonae Species 0.000 description 1
- 235000000434 Melocanna baccifera Nutrition 0.000 description 1
- 241001497770 Melocanna baccifera Species 0.000 description 1
- PESQCPHRXOFIPX-RYUDHWBXSA-N Met-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-RYUDHWBXSA-N 0.000 description 1
- BJFJQOMZCSHBMY-YUMQZZPRSA-N Met-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O BJFJQOMZCSHBMY-YUMQZZPRSA-N 0.000 description 1
- 240000000711 Mikania scandens Species 0.000 description 1
- 108010047562 NGR peptide Proteins 0.000 description 1
- 241001443590 Naganishia albida Species 0.000 description 1
- 241000033319 Naganishia diffluens Species 0.000 description 1
- YBYRMVIVWMBXKQ-UHFFFAOYSA-N PMSF Chemical compound FS(=O)(=O)CC1=CC=CC=C1 YBYRMVIVWMBXKQ-UHFFFAOYSA-N 0.000 description 1
- 101710039569 POLM Proteins 0.000 description 1
- 241000222051 Papiliotrema laurentii Species 0.000 description 1
- 241000364057 Peoria Species 0.000 description 1
- 102000035443 Peptidases Human genes 0.000 description 1
- 108091005771 Peptidases Proteins 0.000 description 1
- 239000001888 Peptone Substances 0.000 description 1
- 108010080698 Peptones Proteins 0.000 description 1
- 102000003992 Peroxidases Human genes 0.000 description 1
- 108090000437 Peroxidases Proteins 0.000 description 1
- 235000010627 Phaseolus vulgaris Nutrition 0.000 description 1
- 240000005158 Phaseolus vulgaris Species 0.000 description 1
- HWMGTNOVUDIKRE-UWVGGRQHSA-N Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 HWMGTNOVUDIKRE-UWVGGRQHSA-N 0.000 description 1
- KNPVDQMEHSCAGX-UHFFFAOYSA-N Phenylalanyl-Cysteine Chemical compound SCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 KNPVDQMEHSCAGX-UHFFFAOYSA-N 0.000 description 1
- KLAONOISLHWJEE-UHFFFAOYSA-N Phenylalanyl-Glutamine Chemical compound NC(=O)CCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 KLAONOISLHWJEE-UHFFFAOYSA-N 0.000 description 1
- 206010062080 Pigmentation disease Diseases 0.000 description 1
- 241000758706 Piperaceae Species 0.000 description 1
- 235000010582 Pisum sativum Nutrition 0.000 description 1
- 240000004713 Pisum sativum Species 0.000 description 1
- 206010035148 Plague Diseases 0.000 description 1
- 231100000742 Plant toxin Toxicity 0.000 description 1
- 231100000614 Poison Toxicity 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 231100000654 Protein toxin Toxicity 0.000 description 1
- 241000589615 Pseudomonas syringae Species 0.000 description 1
- 108010052388 RGES peptide Proteins 0.000 description 1
- 240000007742 Raphanus sativus Species 0.000 description 1
- 235000006140 Raphanus sativus var sativus Nutrition 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 235000009411 Rheum rhabarbarum Nutrition 0.000 description 1
- 244000299790 Rheum rhabarbarum Species 0.000 description 1
- 241000589180 Rhizobium Species 0.000 description 1
- 241000158450 Rhodobacter sp. KYW73 Species 0.000 description 1
- 241000191043 Rhodobacter sphaeroides Species 0.000 description 1
- 241000223252 Rhodotorula Species 0.000 description 1
- 241000223253 Rhodotorula glutinis Species 0.000 description 1
- 241000223254 Rhodotorula mucilaginosa Species 0.000 description 1
- 235000016919 Ribes petraeum Nutrition 0.000 description 1
- 240000005505 Ribes rubrum Species 0.000 description 1
- 235000002355 Ribes spicatum Nutrition 0.000 description 1
- 102000006382 Ribonucleases Human genes 0.000 description 1
- 108010083644 Ribonucleases Proteins 0.000 description 1
- 240000003497 Rubus idaeus Species 0.000 description 1
- 241000235070 Saccharomyces Species 0.000 description 1
- 241001479507 Senecio odorus Species 0.000 description 1
- PBUXMVYWOSKHMF-WDSKDSINSA-N Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CO PBUXMVYWOSKHMF-WDSKDSINSA-N 0.000 description 1
- 241000607720 Serratia Species 0.000 description 1
- 241000607715 Serratia marcescens Species 0.000 description 1
- 229940098362 Serratia marcescens Drugs 0.000 description 1
- 235000003434 Sesamum indicum Nutrition 0.000 description 1
- 240000003670 Sesamum indicum Species 0.000 description 1
- 208000007056 Sickle Cell Anemia Diseases 0.000 description 1
- 238000002105 Southern blotting Methods 0.000 description 1
- 235000009337 Spinacia oleracea Nutrition 0.000 description 1
- 240000003453 Spinacia oleracea Species 0.000 description 1
- 241000222068 Sporobolomyces <Sporidiobolaceae> Species 0.000 description 1
- 241000123675 Sporobolomyces roseus Species 0.000 description 1
- 241000187747 Streptomyces Species 0.000 description 1
- CZMRCDWAGMRECN-GDQSFJPYSA-N Sucrose Natural products O([C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@H](CO)O1)[C@@]1(CO)[C@H](O)[C@@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-GDQSFJPYSA-N 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 241000589500 Thermus aquaticus Species 0.000 description 1
- KAFKKRJQHOECGW-JCOFBHIZSA-N Thr-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(O)=O)=CNC2=C1 KAFKKRJQHOECGW-JCOFBHIZSA-N 0.000 description 1
- 229940113082 Thymine Drugs 0.000 description 1
- 229940034208 Thyroxine Drugs 0.000 description 1
- 229940024982 Topical Antifungal Antibiotics Drugs 0.000 description 1
- 241000219793 Trifolium Species 0.000 description 1
- 241000288508 Trinia Species 0.000 description 1
- GSEJCLTVZPLZKY-UHFFFAOYSA-N Tris Chemical compound OCCN(CCO)CCO GSEJCLTVZPLZKY-UHFFFAOYSA-N 0.000 description 1
- 239000007984 Tris EDTA buffer Substances 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- 240000008529 Triticum aestivum Species 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- LCPVBXOHXMBLFW-JSGCOSHPSA-N Trp-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)=CNC2=C1 LCPVBXOHXMBLFW-JSGCOSHPSA-N 0.000 description 1
- LYMVXFSTACVOLP-ZFWWWQNUSA-N Trp-Leu Chemical compound C1=CC=C2C(C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C([O-])=O)=CNC2=C1 LYMVXFSTACVOLP-ZFWWWQNUSA-N 0.000 description 1
- NQIHMZLGCZNZBN-PXNSSMCTSA-N Trp-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)N)C(O)=O)=CNC2=C1 NQIHMZLGCZNZBN-PXNSSMCTSA-N 0.000 description 1
- WPSXZFTVLIAPCN-UHFFFAOYSA-N Valyl-Cysteine Chemical compound CC(C)C(N)C(=O)NC(CS)C(O)=O WPSXZFTVLIAPCN-UHFFFAOYSA-N 0.000 description 1
- 241000219094 Vitaceae Species 0.000 description 1
- 241000589634 Xanthomonas Species 0.000 description 1
- 241000589636 Xanthomonas campestris Species 0.000 description 1
- NWONKYPBYAMBJT-UHFFFAOYSA-L Zinc sulfate Chemical compound [Zn+2].[O-]S([O-])(=O)=O NWONKYPBYAMBJT-UHFFFAOYSA-L 0.000 description 1
- 241001520823 Zoysia Species 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- HRPVXLWXLXDGHG-UHFFFAOYSA-N acrylamide Chemical compound NC(=O)C=C HRPVXLWXLXDGHG-UHFFFAOYSA-N 0.000 description 1
- 239000000654 additive Substances 0.000 description 1
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 1
- 230000002411 adverse Effects 0.000 description 1
- 239000008272 agar Substances 0.000 description 1
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 1
- 108010045023 alanyl-prolyl-tyrosine Proteins 0.000 description 1
- 150000001298 alcohols Chemical class 0.000 description 1
- 150000001299 aldehydes Chemical class 0.000 description 1
- 230000002924 anti-infective Effects 0.000 description 1
- 239000008346 aqueous phase Substances 0.000 description 1
- 108010024668 arginyl-glutamyl-aspartyl-valine Proteins 0.000 description 1
- 108010091092 arginyl-glycyl-proline Proteins 0.000 description 1
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 1
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 1
- 230000000386 athletic Effects 0.000 description 1
- 238000000376 autoradiography Methods 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- 239000011230 binding agent Substances 0.000 description 1
- OYVAGSVQBOHSSS-UAPAGMARSA-O bleomycin A2 Chemical compound N([C@H](C(=O)N[C@H](C)[C@@H](O)[C@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)NCCC=1SC=C(N=1)C=1SC=C(N=1)C(=O)NCCC[S+](C)C)[C@@H](O[C@H]1[C@H]([C@@H](O)[C@H](O)[C@H](CO)O1)O[C@@H]1[C@H]([C@@H](OC(N)=O)[C@H](O)[C@@H](CO)O1)O)C=1N=CNC=1)C(=O)C1=NC([C@H](CC(N)=O)NC[C@H](N)C(N)=O)=NC(N)=C1C OYVAGSVQBOHSSS-UAPAGMARSA-O 0.000 description 1
- UIIMBOGNXHQVGW-UHFFFAOYSA-M buffer Substances [Na+].OC([O-])=O UIIMBOGNXHQVGW-UHFFFAOYSA-M 0.000 description 1
- 235000009120 camo Nutrition 0.000 description 1
- 239000001390 capsicum minimum Substances 0.000 description 1
- 150000004649 carbonic acid derivatives Chemical class 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 239000006285 cell suspension Substances 0.000 description 1
- 210000003850 cellular structures Anatomy 0.000 description 1
- 235000005607 chanvre indien Nutrition 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- VEXZGXHMUGYJMC-UHFFFAOYSA-M chloride anion Chemical compound [Cl-] VEXZGXHMUGYJMC-UHFFFAOYSA-M 0.000 description 1
- 238000004140 cleaning Methods 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 239000012141 concentrate Substances 0.000 description 1
- 235000008504 concentrate Nutrition 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000001816 cooling Methods 0.000 description 1
- 239000012228 culture supernatant Substances 0.000 description 1
- 235000001535 currant Nutrition 0.000 description 1
- 235000001537 currant Nutrition 0.000 description 1
- 238000005520 cutting process Methods 0.000 description 1
- 230000004059 degradation Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 238000004925 denaturation Methods 0.000 description 1
- 230000036425 denaturation Effects 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 108010054813 diprotin B Proteins 0.000 description 1
- ZGTMUACCHSMWAC-UHFFFAOYSA-L disodium;2-[2-[carboxylatomethyl(carboxymethyl)amino]ethyl-(carboxymethyl)amino]acetate Chemical compound [Na+].[Na+].OC(=O)CN(CC([O-])=O)CCN(CC(O)=O)CC([O-])=O ZGTMUACCHSMWAC-UHFFFAOYSA-L 0.000 description 1
- 238000010494 dissociation reaction Methods 0.000 description 1
- 230000005593 dissociations Effects 0.000 description 1
- 238000001035 drying Methods 0.000 description 1
- 238000010410 dusting Methods 0.000 description 1
- 230000020595 eating behavior Effects 0.000 description 1
- 235000013601 eggs Nutrition 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 239000004495 emulsifiable concentrate Substances 0.000 description 1
- 239000003995 emulsifying agent Substances 0.000 description 1
- 150000002148 esters Chemical class 0.000 description 1
- 239000002095 exotoxin Substances 0.000 description 1
- 230000002349 favourable Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- GNBHRKFJIUUOQI-UHFFFAOYSA-N fluorescein Chemical compound O1C(=O)C2=CC=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 GNBHRKFJIUUOQI-UHFFFAOYSA-N 0.000 description 1
- 239000006260 foam Substances 0.000 description 1
- 235000013305 food Nutrition 0.000 description 1
- 238000004108 freeze drying Methods 0.000 description 1
- 239000000499 gel Substances 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 238000010358 genetic engineering technique Methods 0.000 description 1
- 210000004602 germ cell Anatomy 0.000 description 1
- 108010080575 glutamyl-aspartyl-alanine Proteins 0.000 description 1
- SXRSQZLOMIGNAQ-UHFFFAOYSA-N glutaraldehyde Chemical compound O=CCCCC=O SXRSQZLOMIGNAQ-UHFFFAOYSA-N 0.000 description 1
- 108010051307 glycyl-glycyl-proline Proteins 0.000 description 1
- 108010028188 glycyl-histidyl-serine Proteins 0.000 description 1
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 1
- 235000021021 grapes Nutrition 0.000 description 1
- 229910052736 halogen Inorganic materials 0.000 description 1
- 230000002140 halogenating Effects 0.000 description 1
- 150000002367 halogens Chemical class 0.000 description 1
- 238000003306 harvesting Methods 0.000 description 1
- XLYOFNOQVPJJNP-ZSJDYOACSA-N heavy water Substances [2H]O[2H] XLYOFNOQVPJJNP-ZSJDYOACSA-N 0.000 description 1
- 239000011487 hemp Substances 0.000 description 1
- 235000012765 hemp Nutrition 0.000 description 1
- 235000008216 herbs Nutrition 0.000 description 1
- 108010040030 histidinoalanine Proteins 0.000 description 1
- 108010018006 histidylserine Proteins 0.000 description 1
- 238000002744 homologous recombination Methods 0.000 description 1
- 239000010903 husk Substances 0.000 description 1
- UFHFLCQGNIYNRP-UHFFFAOYSA-N hydrogen Chemical compound [H][H] UFHFLCQGNIYNRP-UHFFFAOYSA-N 0.000 description 1
- 229910000041 hydrogen chloride Inorganic materials 0.000 description 1
- 238000007654 immersion Methods 0.000 description 1
- 230000003100 immobilizing Effects 0.000 description 1
- 238000001114 immunoprecipitation Methods 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 230000000977 initiatory Effects 0.000 description 1
- 238000011081 inoculation Methods 0.000 description 1
- 229910052500 inorganic mineral Inorganic materials 0.000 description 1
- 229940079866 intestinal antibiotics Drugs 0.000 description 1
- 229910000359 iron(II) sulfate Inorganic materials 0.000 description 1
- 125000001449 isopropyl group Chemical group [H]C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 1
- 230000000155 isotopic Effects 0.000 description 1
- 230000002147 killing Effects 0.000 description 1
- 230000001418 larval Effects 0.000 description 1
- 108010077158 leucinyl-arginyl-tryptophan Proteins 0.000 description 1
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 1
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 1
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 1
- 229950008325 levothyroxine Drugs 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 230000000670 limiting Effects 0.000 description 1
- 235000014666 liquid concentrate Nutrition 0.000 description 1
- 238000009630 liquid culture Methods 0.000 description 1
- 239000007791 liquid phase Substances 0.000 description 1
- 238000005567 liquid scintillation counting Methods 0.000 description 1
- 229910052943 magnesium sulfate Inorganic materials 0.000 description 1
- CSNNHWWHGAXBCP-UHFFFAOYSA-L magnesium sulphate Substances [Mg+2].[O-][S+2]([O-])([O-])[O-] CSNNHWWHGAXBCP-UHFFFAOYSA-L 0.000 description 1
- 235000019341 magnesium sulphate Nutrition 0.000 description 1
- 229910000357 manganese(II) sulfate Inorganic materials 0.000 description 1
- 235000012766 marijuana Nutrition 0.000 description 1
- 230000002906 microbiologic Effects 0.000 description 1
- 239000011859 microparticle Substances 0.000 description 1
- 238000000386 microscopy Methods 0.000 description 1
- 238000002703 mutagenesis Methods 0.000 description 1
- 231100000350 mutagenesis Toxicity 0.000 description 1
- 239000005445 natural product Substances 0.000 description 1
- 229930014626 natural products Natural products 0.000 description 1
- 239000006916 nutrient agar Substances 0.000 description 1
- 229940005935 ophthalmologic Antibiotics Drugs 0.000 description 1
- 230000003287 optical Effects 0.000 description 1
- 230000001717 pathogenic Effects 0.000 description 1
- 244000052769 pathogens Species 0.000 description 1
- 235000019319 peptone Nutrition 0.000 description 1
- 230000002093 peripheral Effects 0.000 description 1
- 108010082795 phenylalanyl-arginyl-arginine Proteins 0.000 description 1
- 108010047079 phenylalanyl-leucyl-arginyl-phenylalanine Proteins 0.000 description 1
- 235000021317 phosphate Nutrition 0.000 description 1
- 150000003013 phosphoric acid derivatives Chemical class 0.000 description 1
- 229910052615 phyllosilicate Inorganic materials 0.000 description 1
- 230000019612 pigmentation Effects 0.000 description 1
- 239000003123 plant toxin Substances 0.000 description 1
- 229920000515 polycarbonate Polymers 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 229910000160 potassium phosphate Inorganic materials 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 108090000765 processed proteins & peptides Proteins 0.000 description 1
- 102000004196 processed proteins & peptides Human genes 0.000 description 1
- 230000002062 proliferating Effects 0.000 description 1
- 108010077112 prolyl-proline Proteins 0.000 description 1
- 239000011814 protection agent Substances 0.000 description 1
- 230000001681 protective Effects 0.000 description 1
- 239000011253 protective coating Substances 0.000 description 1
- 235000021013 raspberries Nutrition 0.000 description 1
- 230000003252 repetitive Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 239000006152 selective media Substances 0.000 description 1
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 1
- 239000011877 solvent mixture Substances 0.000 description 1
- 238000005507 spraying Methods 0.000 description 1
- 239000003381 stabilizer Substances 0.000 description 1
- 238000003756 stirring Methods 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 239000005720 sucrose Substances 0.000 description 1
- 150000003467 sulfuric acid derivatives Chemical class 0.000 description 1
- 230000004083 survival Effects 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 238000010257 thawing Methods 0.000 description 1
- 239000003440 toxic substance Substances 0.000 description 1
- 235000019798 tripotassium phosphate Nutrition 0.000 description 1
- 239000012137 tryptone Substances 0.000 description 1
- 108010045269 tryptophyltryptophan Proteins 0.000 description 1
- 108010035534 tyrosyl-leucyl-alanine Proteins 0.000 description 1
- 108010037335 tyrosyl-prolyl-glycyl-glycine Proteins 0.000 description 1
- 108010068794 tyrosyl-tyrosyl-glutamyl-glutamic acid Proteins 0.000 description 1
- 238000009281 ultraviolet germicidal irradiation Methods 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 108010003885 valyl-prolyl-glycyl-glycine Proteins 0.000 description 1
- 235000013311 vegetables Nutrition 0.000 description 1
- 235000020234 walnut Nutrition 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
- 239000004563 wettable powder Substances 0.000 description 1
- 235000021307 wheat Nutrition 0.000 description 1
- 239000012138 yeast extract Substances 0.000 description 1
- 229910000368 zinc sulfate Inorganic materials 0.000 description 1
- 239000011686 zinc sulphate Substances 0.000 description 1
- 235000009529 zinc sulphate Nutrition 0.000 description 1
Abstract
The subject invention concerns materials and methods useful in the control of non-mammalian pests and, particularly, plant pests. In a specific embodiment, the subject invention provides new Bacillus thuringiensis toxins useful for the control of lepidopterans. In preferred embodiments, the subject toxins are used to control Ostrinia nubilalis, the European corn borer. The subject invention further provides nucleotide sequences which encode the toxins of the subject invention. The nucleotide sequences of the subject invention can be used to transform hosts, such as plants, to express the pesticidal toxins of the subject invention. The subject invention further concerns novel nucleotide primers for the identification of genes encoding toxins active against pests. The primers are useful in PCR techniques to produce gene fragments which are characteristic of genes encoding these toxins. The primers are also useful as nucleotide probes to detect the toxin-encoding genes.
Description
TOXINS ACTIVE AGAINST OSTRINIA NUBILALIS
BACKGROUND OF THE INVENTION
The soil microbe Bacillus thuringiensis (B.t.) is a Gram-positive, spore-forming bacterium. Most strains of B.t. do not exhibit pesticidal activity. Some B.t. strains produce, and can be characterized by, parasporal crystalline protein inclusions. These "δ-endotoxins" are different from exotoxins, which have a non-specific host range. The inclusions often appear under the microscope as distinctively shaped crystals. The proteins can be highly toxic to pests and specific in their toxic activity. Certain B.t. toxin genes have been isolated and sequenced, and recombinant DNA-based B.t. products have been approved for use. In addition, with the use of genetic engineering techniques, new approaches for delivering B.t. toxins to agricultural environments are being developed, including the use of plants genetically engineered with B.t. toxin genes for insect resistance and the use of stabilized intact microbial cells as delivery vehicles for B.t. toxins (Gaertner, F.H., L. Kim [1988] TIBTECH 6: S4-S7). Thus, isolated B.t. endotoxin genes are becoming commercially valuable.
Until the last fifteen years, commercial use of B.t. pesticides was largely restricted to a narrow range of lepidopteran (caterpillar) pests. Preparations of the spores and crystals of B. thuringiensis subsp. kurstaki have been used for many years as commercial insecticides for lepidopteran pests. For example, B. thuringiensis var. kurstaki HD-1 produces a crystalline δ-endotoxin that is toxic to the larvae of a number of lepidopteran insects. In recent years, however, researchers have discovered B.t. pesticides with specificity for a much broader range of pests. For example, other species of B.t., such as israelensis and morrisoni (also known as tenebrionis, also known as B.t. M-8, also known as B.t. san diego), have been used commercially to control insects of the orders Diptera and Coleoptera, respectively (Gaertner, F.H. [1989] "Cellular Delivery Systems for Insecticidal Proteins: Living and Non-Living Microorganisms," in Controlled Delivery of Crop Protection Agents, R.M. Wilkins, ed., Taylor and Francis, New York and London, 1990, pp. 245-255). See also Couch, T.L. (1989) "Mosquito Pathogenicity of Bacillus thuringiensis var. israelensis," Developments in Industrial Microbiology 22: 61-76; and Beegle, C.C. (1978) "Use of Entomogenous Bacteria in Agroecosystems," Developments in Industrial Microbiology 20: 97-104. Krieg, A., A.M. Huger, G.A. Langenbruch, W. Schnetter (1983) Z. ang. Ent. 96: 500-508 describe Bacillus thuringiensis var. tenebrionis, which is reportedly active against two beetles of the order Coleoptera. These are the Colorado potato beetle, Leptinotarsa decemlineata, and Agelastica alni.
Recently, new subspecies of B.t. have been identified, and genes responsible for active δ-endotoxin proteins have been isolated (Höfte, J., H.R. Whiteley [1989] Microbiological Reviews 52(2): 242-255). Höfte and Whiteley classified the crystal protein genes into four main classes. The classes were CryI (Lepidoptera-specific), CryII (Lepidoptera- and Diptera-specific), CryIII (Coleoptera-specific) and CryIV (Diptera-specific). The discovery of strains specifically toxic to other pests has been reported (Feitelson, J.S., J. Payne, L. Kim [1992] Bio/Technology 10: 271-275). CryV has been proposed to designate a class of toxin genes that are nematode-specific. Lambert et al. (Lambert, N., L. Buysse, C. Decock, S. Jansens, C. Piens, B. Saey, J. Seurinck, K. van Audenhove, J. Van Rie, A. Van Vliet, M. Peferoen [1996] Appl. Environ. Microbiol. 62(1): 80-86) and Shevelev et al. ([1993] FEBS Lett. 336: 79-82) describe the characterization of Cry9 toxins active against lepidopterans. Published PCT applications WO 94/05771 and WO 94/24264 also describe B.t. isolates active against lepidopteran pests. Gleave et al. ([1991] JGM 138: 55-62) and Smulevitch et al. ([1991] FEBS Lett. 293: 25-26) also describe B.t. toxins. A number of other classes of B.t. genes have now been identified.
The cloning and expression of a B.t. crystal protein gene in Escherichia coli has been described in the published literature (Schnepf, H.E., H.R. Whiteley [1981] Proc. Natl. Acad. Sci. USA 78: 2893-2897). U.S. Patent No. 4,448,885 and U.S. Patent No. 4,467,036 describe the expression of B.t. crystal protein in E. coli. U.S. Patent Nos. 4,990,332, 5,039,523, 5,126,133, 5,164,180 and 5,169,629 are among those that describe B.t. toxins having activity against lepidopterans. PCT application WO 96/05314 describes PS86W1, PS86V1 and other B.t. isolates active against lepidopteran pests. The PCT patent applications published as WO 94/24264 and WO 94/05771 describe B.t. isolates and toxins active against lepidopteran pests. B.t. proteins with activity against members of the family Noctuidae are described by Lambert et al., cited above. U.S. Patent Nos. 4,797,276 and 4,853,331 describe the tenebrionis strain of B. thuringiensis, which can be used to control coleopteran pests in various environments. U.S. Patent No. 4,918,006 describes B.t. toxins with activity against dipterans. U.S. Patent No. 5,151,363 and U.S. Patent No. 4,948,734 describe certain B.t. isolates that have activity against nematodes. Other U.S. patents that describe activity against nematodes are 5,093,120, 4,236,843, 5,262,399, 5,270,448, 5,281,530, 5,322,932, 5,350,577, 5,426,049 and 5,439,881 (corresponding to Argentine patent applications Nos. 332,269, 329,575 and P 980103088).
As a result of extensive research and investment of resources, other patents have been granted for new B.t. isolates and new uses of B.t. isolates. See Feitelson et al., cited above, for a review. However, the discovery of new B.t. isolates and of new uses for known B.t. isolates remains an empirical and unpredictable art. The isolation of the genes responsible for the toxins has been a slow, empirical process. Carozzi et al. (Carozzi, N.B., V.C. Kramer, G.W. Warren, S. Evola, G. Koziel [1991] Appl. Env. Microbiol. 57(11): 3057-3061) describe methods for identifying new B.t. isolates. This report does not disclose or suggest the specific primers, probes, toxins and genes of the present invention for lepidopteran-active toxin genes. U.S. Patent No. 5,204,237 describes specific and universal probes for the isolation of B.t. toxin genes. However, this patent does not disclose the probes, primers, toxins and genes of the present invention. WO 94/21795 and Estruch, J.J. et al. ([1996] PNAS 93: 5389-5394) describe toxins obtained from Bacillus microbes. These toxins are reported to be produced during vegetative cell growth and are therefore called vegetative insecticidal proteins (VIP). These toxins have been reported to differ from the crystal-forming δ-endotoxins. Activity of these toxins against lepidopteran pests has been reported.
The black cutworm (Agrotis ipsilon (Hufnagel); Lepidoptera: Noctuidae) is a serious pest of many crops, including corn, cotton, cole crops (Brassica, broccoli, cabbage, Chinese cabbage) and turf. Secondary host plants include beets, Capsicum (peppers), chickpeas, faba beans, lettuce, alfalfa, onions, potatoes, radishes, rapeseed (canola), rice, soybeans, strawberries, sugar beets, tobacco, tomatoes and forest trees.
In North America, pests of the genus Agrotis feed on clover, corn, tobacco, hemp, onions, strawberries, currants, raspberries, alfalfa, barley, beans, cabbage, oats, peas, potatoes, sweet potatoes, tomatoes, garden flowers, grasses, asparagus, grapes, almost any type of leaf, weeds, and many other crops and ornamental plants. Other cutworms of the tribe Agrotini are also pests, especially those of the genera Feltia (for example, F. jaculifera (Guenée); equivalent to ducens subgothica) and Euxoa (for example, E. messoria (Harris), E. scandens (Riley), E. auxiliaris Smith, E. detersa (Walker), E. tessellata (Harris), E. ochrogaster (Guenée)). Host plants include various crops, including rapeseed. Cutworms are also pests outside North America, and the pests of greatest economic importance attack chickpeas, wheat, vegetables, sugar beets, alfalfa, corn, potatoes, turnips, sunflowers, Brassica, onions, leeks, celery, sesame, asparagus, rhubarb, chicory, greenhouse crops and spinach. The black cutworm, A. ipsilon, occurs as a pest outside North America, including in Central America, Europe, Asia, Australasia, India, Taiwan, Mexico, Egypt and New Zealand.
Cutworms pass through several instars as larvae. Although the cutting of seedlings by later-instar larvae produces the most obvious damage and economic loss, leaf feeding commonly results in yield loss in crops such as corn. Upon reaching the fourth instar, the larvae begin to cut plants and plant parts, especially seedlings. Because of this change in feeding behavior, economically damaging populations can build up suddenly with few warning signs. Their nocturnal habits and burrowing behavior also make detection difficult. Large cutworms can destroy several seedlings per day, and a heavy infestation can eliminate entire stands of a crop.
Cultural controls for A. ipsilon, such as peripheral weed control, can help prevent heavy infestations; however, these methods are not always feasible or effective. Infestations are very sporadic, and applying an insecticide before or at planting has not been effective to date. Some baits are available for the control of cutworms in crops. Chemical insecticides have been used to protect turfgrasses such as creeping bentgrass. The use of chemical pesticides is a particular concern on turf because of the close contact of the public with the treated areas (for example, golf courses, athletic fields, parks and other recreational areas, professional landscaping, home lawns). Natural products (for example, nematodes, azadirachtin) generally perform poorly. To date, Bacillus thuringiensis products have not been widely used to control black cutworms because highly effective toxins have not been available.
BRIEF DESCRIPTION OF THE INVENTION
The present invention relates to materials and methods useful for the control of non-mammalian pests and, especially, plant pests. In a specific embodiment, the present invention provides new toxins useful for the control of lepidopterans. In an especially preferred embodiment, the toxins of the present invention are used to control the black cutworm. The present invention also provides nucleotide sequences encoding the lepidopteran-active toxins of the present invention. The present invention further provides nucleotide sequences and methods useful for the identification and characterization of genes encoding pesticidal toxins. The present invention also provides new isolates of Bacillus thuringiensis that have pesticidal activity.
In one embodiment, the present invention relates to unique nucleotide sequences that serve as primers in PCR techniques. The primers produce characteristic gene fragments that can be used in the identification and isolation of specific toxin genes. The nucleotide sequences of the present invention encode toxins that differ from the δ-endotoxins described above.
In one embodiment of the present invention, B.t. isolates can be cultured under conditions that result in high multiplication of the microbe. After the microbe is treated to provide single-stranded genomic nucleic acid, the DNA can be contacted with the primers of the invention and subjected to PCR amplification. Characteristic fragments of toxin-encoding genes are amplified by this method, thereby identifying the presence of the toxin-encoding gene or genes.
Another aspect of the present invention is the use of the described nucleotide sequences as probes for the detection, identification and characterization of genes encoding B.t. toxins that are active against lepidopterans. Other aspects of the present invention include the genes and isolates identified by the use of the methods and nucleotide sequences described herein. The genes thus identified encode toxins active against lepidopterans. Likewise, the isolates have activity against these pests.
The new pesticidal B.t. isolates of the present invention include PS31G1, PS185U2, PS11B, PS218G2, PS213E5, PS28C, PS86BB1, PS89J3, PS94R1, PS27J2, PS101DD and PS202S.
As described herein, toxins useful in accordance with the present invention may be chimeric toxins produced by combining portions of multiple toxins.
In a preferred embodiment, the present invention relates to plant cells transformed with at least one polynucleotide sequence of the present invention such that the transformed plant cells express pesticidal toxins in tissues consumed by the target pests. Such plant transformation can be achieved using techniques well known to those skilled in the art and would typically involve modification of the gene to optimize expression of the toxin in plants.
Alternatively, the B.t. isolates of the present invention, or recombinant microbes expressing the toxins described herein, can be used to control pests. In this regard, the invention includes substantially intact B.t. cells, and/or recombinant cells containing the expressed toxins of the present invention, that have been treated to prolong pesticidal activity when the substantially intact cells are applied to the environment of a target pest. The treated cell acts as a protective coating for the pesticidal toxin. The toxin becomes active upon ingestion by a target insect.
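The PCR screening step described above can be illustrated in silico. The following Python sketch is only an illustration under simplifying assumptions, not the procedure of the invention: it assumes exact primer matches against a single template strand, and the template and primer strings are made-up placeholders rather than SEQ ID NO: 1-6 or any other sequence of this document. The function names are likewise hypothetical.

```python
# Hypothetical illustration of primer-based gene screening.
# All sequences here are invented placeholders, NOT sequences from the patent.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def predict_amplicon(template, forward, reverse):
    """Predict the fragment amplified from `template` by a primer pair.

    `template` is one strand written 5'->3'. The forward primer must match
    it exactly; the reverse primer anneals to this strand, so its reverse
    complement must occur downstream of the forward primer site.
    Returns the amplicon, or None if either primer site is absent.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    site = reverse_complement(reverse)
    hit = template.find(site, start + len(forward))
    if hit == -1:
        return None
    return template[start:hit + len(site)]


if __name__ == "__main__":
    genome = "TTTATGGCACCCCCGATTACAAA"   # placeholder template
    fwd = "ATGGCA"                       # placeholder forward primer
    rev = reverse_complement("GATTAC")   # placeholder reverse primer
    fragment = predict_amplicon(genome, fwd, rev)
    if fragment is None:
        print("no characteristic fragment predicted")
    else:
        print(f"predicted fragment of {len(fragment)} bp: {fragment}")
```

A returned fragment corresponds to the "characteristic fragment" whose presence indicates a candidate toxin-encoding gene; a result of None corresponds to a template lacking one or both primer sites. Real primers tolerate mismatches and may be degenerate, which this exact-match sketch does not model.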
BRIEF DESCRIPTION OF THE SEQUENCES
SEQ ID NO: 1 is a forward primer useful according to the present invention.
SEQ ID NO: 2 is a reverse primer useful according to the present invention.
SEQ ID NO: 3 is a forward primer useful according to the present invention.
SEQ ID NO: 4 is a reverse primer useful according to the present invention.
SEQ ID NO: 5 is a forward primer useful according to the present invention.
SEQ ID NO: 6 is a reverse primer useful according to the present invention.
SEQ ID NO: 7 is an amino acid sequence of the toxin designated 11B1AR.
SEQ ID NO: 8 is a nucleotide sequence that encodes an amino acid sequence of the 11B1AR toxin (SEQ ID NO: 7).
SEQ ID NO: 9 is an amino acid sequence of the toxin designated 11B1BR.
SEQ ID NO: 10 is a nucleotide sequence that encodes an amino acid sequence of the 11B1BR toxin (SEQ ID NO: 9).
SEQ ID NO: 11 is an amino acid sequence of the toxin designated 1291A.
SEQ ID NO: 12 is a nucleotide sequence that encodes an amino acid sequence of the 1291A toxin (SEQ ID NO: 11).
SEQ ID NO: 13 is an amino acid sequence of the toxin designated 1292A.
SEQ ID NO: 14 is a nucleotide sequence that encodes an amino acid sequence of the 1292A toxin (SEQ ID NO: 13).
SEQ ID NO: 15 is an amino acid sequence of the toxin designated 1292B.
SEQ ID NO: 16 is a nucleotide sequence that encodes an amino acid sequence of the 1292B toxin (SEQ ID NO: 15).
SEQ ID NO: 17 is an amino acid sequence of the toxin designated 31GA.
SEQ ID NO: 18 is a nucleotide sequence that encodes an amino acid sequence of the 31GA toxin (SEQ ID NO: 17).
SEQ ID NO: 19 is an amino acid sequence of the toxin designated 31GBR.
SEQ ID NO: 20 is a nucleotide sequence that encodes an amino acid sequence of the 31GBR toxin (SEQ ID NO: 19).
SEQ ID NO: 21 is an amino acid sequence of the toxin designated 85N1R identified by the method of the present invention.
SEQ ID NO: 22 is a nucleotide sequence that encodes an amino acid sequence of the 85N1R toxin (SEQ ID NO: 21).
SEQ ID NO: 23 is an amino acid sequence of the toxin designated 85N2.
SEQ ID NO: 24 is a nucleotide sequence that encodes an amino acid sequence of the 85N2 toxin (SEQ ID NO: 23).
SEQ ID NO: 25 is an amino acid sequence of the toxin designated 85N3.
SEQ ID NO: 26 is a nucleotide sequence that encodes an amino acid sequence of the 85N3 toxin (SEQ ID NO: 25).
SEQ ID NO: 27 is an amino acid sequence of the toxin designated 86V1C1.
SEQ ID NO: 28 is a nucleotide sequence that encodes an amino acid sequence of the 86V1C1 toxin (SEQ ID NO: 27).
SEQ ID NO: 29 is an amino acid sequence of the toxin designated 86V1C2.
SEQ ID NO: 30 is a nucleotide sequence that encodes an amino acid sequence of the 86V1C2 toxin (SEQ ID NO: 29).
SEQ ID NO: 31 is an amino acid sequence of the toxin designated 86V1C3R.
SEQ ID NO: 32 is a nucleotide sequence that encodes an amino acid sequence of the 86V1C3R toxin (SEQ ID NO: 31).
SEQ ID NO: 33 is an amino acid sequence of the toxin designated F525A.
SEQ ID NO: 34 is a nucleotide sequence that encodes an amino acid sequence of the F525A toxin (SEQ ID NO: 33).
SEQ ID NO: 35 is an amino acid sequence of the toxin designated F525B.
SEQ ID NO: 36 is a nucleotide sequence that encodes an amino acid sequence of the F525B toxin (SEQ ID NO: 35).
SEQ ID NO: 37 is an amino acid sequence of the toxin designated F525C.
SEQ ID NO: 38 is a nucleotide sequence that encodes an amino acid sequence of the F525C toxin (SEQ ID NO: 37).
SEQ ID NO: 39 is an amino acid sequence of the toxin designated F573A.
SEQ ID NO: 40 is a nucleotide sequence that encodes an amino acid sequence of the F573A toxin (SEQ ID NO: 39).
SEQ ID NO: 41 is an amino acid sequence of the toxin designated F573B.
SEQ ID NO: 42 is a nucleotide sequence that encodes an amino acid sequence of the F573B toxin (SEQ ID NO: 41).
SEQ ID NO: 43 is an amino acid sequence of the toxin designated F573C.
SEQ ID NO: 44 is a nucleotide sequence that encodes an amino acid sequence of the F573C toxin (SEQ ID NO: 43).
SEQ ID NO: 45 is an amino acid sequence of the toxin designated FBB1A.
SEQ ID NO: 46 is a nucleotide sequence that encodes an amino acid sequence of the FBB1A toxin (SEQ ID NO: 45).
SEQ ID NO: 47 is an amino acid sequence of the toxin designated FBB1BR.
SEQ ID NO: 48 is a nucleotide sequence that encodes an amino acid sequence of the FBB1BR toxin (SEQ ID NO: 47).
SEQ ID NO: 49 is an amino acid sequence of the toxin designated FBB1C.
SEQ ID NO: 50 is a nucleotide sequence that encodes an amino acid sequence of the FBB1C toxin (SEQ ID NO: 49).
SEQ ID NO: 51 is an amino acid sequence of the toxin designated FBB1D.
SEQ ID NO: 52 is a nucleotide sequence that encodes an amino acid sequence of the FBB1D toxin (SEQ ID NO: 51).
SEQ ID NO: 53 is an amino acid sequence of the toxin designated J31AR.
SEQ ID NO: 54 is a nucleotide sequence that encodes an amino acid sequence of the J31AR toxin (SEQ ID NO: 53).
SEQ ID NO: 55 is an amino acid sequence of the toxin designated J32AR.
SEQ ID NO: 56 is a nucleotide sequence that encodes an amino acid sequence of the J32AR toxin (SEQ ID NO: 55).
SEQ ID NO: 57 is an amino acid sequence of the toxin designated W1FAR.
SEQ ID NO: 58 is a nucleotide sequence that encodes an amino acid sequence of the W1FAR toxin (SEQ ID NO: 57).
SEQ ID NO: 59 is an amino acid sequence of the toxin designated W1FBR.
SEQ ID NO: 60 is a nucleotide sequence that encodes an amino acid sequence of the W1FBR toxin (SEQ ID NO: 59).
SEQ ID NO: 61 is an amino acid sequence of the toxin designated W1FC.
SEQ ID NO: 62 is a nucleotide sequence that encodes an amino acid sequence of the W1FC toxin (SEQ ID NO: 61).
SEQ ID NO: 63 is an oligonucleotide useful as a PCR primer or hybridization probe according to the present invention.
SEQ ID NO: 64 is an oligonucleotide useful as a PCR primer or hybridization probe according to the present invention.
SEQ ID NO: 65 is an oligonucleotide useful as a PCR primer or hybridization probe according to the present invention.
SEQ ID NO: 66 is an oligonucleotide useful as a PCR primer or hybridization probe according to the present invention.
SEQ ID NO: 67 is an oligonucleotide useful as a PCR primer or hybridization probe according to the present invention.
SEQ ID NO: 68 is an oligonucleotide useful as a PCR primer or hybridization probe according to the present invention.
SEQ ID NO: 69 is an oligonucleotide useful as a PCR primer or hybridization probe according to the present invention.
SEQ ID NO: 70 is an amino acid sequence of the toxin designated 86BB1(a).
SEQ ID NO: 71 is a nucleotide sequence that encodes an amino acid sequence of the 86BB1(a) toxin.
SEC. ID. NO: 72 is an amino acid sequence of the toxin designated 86BB1 (b) .1 SEC. ID. NO: 73 is a nucleotide sequence that encodes an amino acid sequence of the toxin 86BB1 (b). SEC. ID. NO: 74 is an amino acid sequence of the toxin designated 31 G1 (a). SEC. ID. NO: 75 is a nucleotide sequence that encodes an amino acid sequence of the 31 G1 (a) toxin. SEC. ID. NO: 76 is an amino acid sequence of the toxin designated chimeric 129HD. SEC. ID. NO: 77 is a nucleotide sequence that encodes an amino acid sequence of the chimeric 129HD toxin. SEC. ID. NO: 78 is an amino acid sequence of the toxin designated 11B (a). SEC. ID. NO: 79 is a nucleotide sequence that encodes an amino acid sequence of the 1 1 B (a) toxin. SEC. ID. NO: 80 is an amino acid sequence of the toxin designated 31G1 (b). SEC. ID. NO: 81 is a nucleotide sequence that encodes an amino acid sequence of the 31G1 (b) toxin. SEC. ID. NO: 82 is an amino acid sequence of the toxin designated 86BB1 (c). SEC. ID. NO: 83 is a nucleotide sequence that encodes an amino acid sequence of toxin 86BB1 (c). SEC. ID. NO: 84 is an amino acid sequence of the toxin designated 86V1 (a). SEC. ID. NO: 85 is a nucleotide sequence that encodes an amino acid sequence of the 86V1 (a) toxin. SEC. ID. NO: 86 is a sequence of amino acids of the toxin designated 86W1 (a). SEC. ID. NO: 87 is a nucleotide sequence that encodes an amino acid sequence of the toxin 86W1 (a). SEC. ID. NO: 88 is a partial amino acid sequence of the toxin designated 94R1 (a). SEC. ID. NO: 89 is a partial nucleotide sequence that encodes an amino acid sequence of the 94R1 toxin (a). SEC. ID. NO: 90 is an amino acid sequence of the toxin designated 185U2 (a). SEC. ID. NO: 91 is a nucleotide sequence that encodes an amino acid sequence of the 185U2 (a) toxin. SEC. ID. NO: 92 is an amino acid sequence of the toxin designated 202S (a). SEC. ID. NO: 93 is a nucleotide sequence that encodes an amino acid sequence of the 202S toxin (a). SEC. ID. 
NO: 94 is an amino acid sequence of the toxin designated 213E5 (a).
SEC. ID. NO: 95 is a nucleotide sequence that encodes an amino acid sequence of the 213E5 (a) toxin. SEC. ID. NO: 96 is an amino acid sequence of the toxin designated 218G2 (a). SEC. ID. NO: 97 is a nucleotide sequence that encodes an amino acid sequence of the 218G2 (a) toxin. SEC. ID. NO: 98 is an amino acid sequence of the toxin designated 29HD (a). SEC. ID. NO: 99 is a nucleotide sequence that encodes an amino acid sequence of the 29HD toxin (a). SEC. ID. NO: 100 is an amino acid sequence of the toxin designated 1 10HD (a). SEC. ID. NO: 101 is a nucleotide sequence that encodes an amino acid sequence of the 1HD (a) toxin. SEC. ID. NO: 102 is an amino acid sequence of the toxin designated 129HD (b). SEC. ID. NO: 103 is a nucleotide sequence that encodes an amino acid sequence of 129HD toxin (b). SEC. ID. NO: 104 is a partial amino acid sequence of the toxin designated 573HD (a). SEC. ID. NO: 105 is a partial nucleotide sequence that encodes an amino acid sequence of the 573HD toxin (a).
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to materials and methods for the control of non-mammalian pests. In specific embodiments, the present invention relates to new isolates and toxins of Bacillus thuringiensis that have activity against Lepidoptera. In an especially preferred embodiment, the toxins and methodologies described herein can be used to control the black night crawler. The present invention also relates to novel genes that encode pesticidal toxins and to novel methods for identifying and characterizing B.t. genes that encode toxins with beneficial properties. The present invention encompasses not only the polynucleotide sequences encoding these toxins but also the use of these polynucleotide sequences to produce recombinant hosts that express the toxins. Certain proteins of the present invention differ from the crystal or "Cry" proteins previously isolated from Bacillus thuringiensis.
Another aspect of the present invention relates to novel isolates and to the toxins and genes that can be obtained from these isolates. The novel B.t. isolates of the present invention have been designated PS31G1, PS185U2, PS11B, PS218G2, PS213E5, PS28C, PS86BB1, PS89J3, PS94R1, PS202S, PS101DD and PS27J2.
The new toxins and polynucleotide sequences presented here are defined according to several parameters. A critical feature of the toxins described herein is pesticidal activity. In a specific embodiment, these toxins have activity against lepidopteran pests. The toxins and genes of the present invention can also be defined by their amino acid and nucleotide sequences. The sequences of the molecules can be defined in terms of homology with certain exemplified sequences as well as in terms of the ability to hybridize with, or be amplified by, certain exemplified probes and primers. The toxins presented here can also be identified on the basis of their immunoreactivity with certain antibodies.
Methods have been developed for preparing useful chimeric toxins by combining portions of B.t. crystal proteins. The combined portions need not, in and of themselves, be pesticidal, as long as the combination of the portions generates a chimeric protein that is pesticidal. This can be achieved by employing restriction enzymes as described, for example, in European Patent 0 228 838; Ge, A.Z., N.L. Shivarova, D.H. Dean (1989) Proc. Natl. Acad. Sci. USA 86: 4037-4041; Ge, A.Z., D. Rivers, R. Milne, D.H. Dean (1991) J. Biol. Chem. 266: 17954-17958; Schnepf, H.E., K. Tomczak, J.P. Ortega, H.R. Whiteley (1990) J. Biol. Chem. 265: 20923-20930; Honee, G., D. Convents, J. Van Rie, S. Jansens, M. Peferoen, B. Visser (1991) Mol. Microbiol. 5: 2799-2806. Alternatively, recombination using cellular recombination mechanisms can be used to obtain similar results. See, for example, Caramori, T., A.M. Albertini, A. Galizzi (1991) Gene 98: 37-44; Widner, W.R., H.R. Whiteley (1990) J. Bacteriol. 172: 2826-2832; Bosch, D., B. Schipper, H. van der Kleij, R.A. de Maagd, W.J. Stiekema (1994) Biotechnology 12: 915-918. A number of other methods by which these chimeric DNAs can be prepared are known in the art. The present invention is intended to include chimeric proteins that use the novel sequences identified in the present application.
With the teachings presented in this application, a person skilled in the art could readily produce and use the various toxins and polynucleotide sequences presented herein.
B.t. isolates useful according to the present invention have been deposited in the permanent collection of the Agricultural Research Service Patent Culture Collection (NRRL), Northern Regional Research Center, 1815 North University Street, Peoria, Illinois 61604, United States. The culture deposit numbers of the B.t. strains are as follows:
Culture                              Deposit number    Deposit date
B.t. PS11B (MT274)                   NRRL B-21556      April 18, 1996
B.t. PS86BB1 (MT275)                 NRRL B-21557      April 18, 1996
B.t. PS86V1 (MT276)                  NRRL B-21558      April 18, 1996
B.t. PS86W1 (MT277)                  NRRL B-21559      April 18, 1996
B.t. PS36G1 (MT278)                  NRRL B-21560      April 18, 1996
B.t. PS89J3 (MT279)                  NRRL B-21561      April 18, 1996
B.t. PS185U2 (MT280)                 NRRL B-21562      October 18, 1996
B.t. PS27J2                          NRRL B-21799      July 1, 1997
B.t. PS28E                           NRRL B-21800      July 1, 1997
B.t. PS94R1                          NRRL B-21801      July 1, 1997
B.t. PS101DD                         NRRL B-21802      July 1, 1997
B.t. PS202S                          NRRL B-21803N     July 1, 1997
B.t. PS213E5                         NRRL B-21804      July 1, 1997
B.t. PS218G2                         NRRL B-21805      July 1, 1997
E. coli NM522 (MR 922) (pMYC2451)    NRRL B-21794      June 27, 1997
E. coli NM522 (MR 923) (pMYC2453)    NRRL B-21795      June 27, 1997
E. coli NM522 (MR 924) (pMYC2454)    NRRL B-21796      June 27, 1997
The cultures deposited for the purposes of the present patent application were deposited under conditions that assure that access to the cultures is available, during the pendency of this application, to one determined by the Commissioner of Patents and Trademarks to be entitled thereto under 37 CFR 1.14 and 35 USC 122. The deposits will be available as required by the patent laws of the countries in which counterparts of the present application, or its progeny, are filed. However, it should be understood that the availability of a deposit does not constitute a license to practice the present invention in derogation of patent rights granted by governmental action. In addition, the culture deposits are stored and made available to the public in accordance with the provisions of the Budapest Treaty for the Deposit of Microorganisms; that is, they are stored with all the care necessary to keep them viable and uncontaminated for a period of at least five years after the most recent request for the furnishing of a sample of the deposit and, in any case, for a period of at least thirty (30) years from the date of deposit or for the enforceable life of any patent that may issue disclosing the culture or cultures. The depositor acknowledges the duty to replace the deposit(s) should the depository be unable to furnish a sample when requested, due to the condition of the deposit. All restrictions on the availability to the public of the culture deposits herein will be irrevocably removed upon the granting of a patent disclosing them. The following is a table showing the characteristics of certain isolates useful according to the present invention.
TABLE 1
Description of B.t. strains toxic to lepidoptera

Culture    Crystal description      Approx. MW (kDa)                     Serotype
PS185U2    Small bipyramidal        130 kDa doublet, 70 kDa              ND
PS11B      Twisted bipyramidal      130 kDa, 70 kDa                      ND
PS218G2    Amorphous                135 kDa, 127 kDa                     ND
PS213E5    Amorphous                130 kDa                              ND
PS86W1     Multiple amorphous       130 kDa, doublet                     5a5b galleriae
PS28C      Amorphous                130 kDa, triplet                     5a5b galleriae
PS86BB1    BP ext.                  130 kDa, doublet                     5a5b galleriae
PS89J3     Spherical / amorphous    130 kDa, doublet                     ND
PS86V1     BP                       130 kDa, doublet                     ND
PS94R1     BP and amorphous         130 kDa, doublet                     ND
HD525      BP and amorphous         130 kDa                              Non-motile
HD573      Multiple amorphous       135 kDa, 79 kDa doublet, 72 kDa      Non-motile
PS27J2     Lemon-shaped             130 kDa, 50 kDa                      4 (sotto or kenyae)

ND = not determined
In one embodiment, the present invention relates to materials and methods that include nucleotide primers and probes for isolating and identifying Bacillus thuringiensis (B.t.) genes that encode protein toxins active against lepidopteran pests. The nucleotide sequences described herein may also be used to identify new pesticidal isolates. The invention also relates to the genes, isolates and toxins identified by the methods and materials described in the present application.
Genes and toxins
The genes and toxins useful according to the present invention include not only the complete sequences but also fragments of these sequences, variants, mutants and fusion proteins that retain the pesticidal activity characteristic of the toxins specifically exemplified herein. Chimeric genes and toxins, produced by combining portions of more than one B.t. toxin or gene, can also be used as described in the present invention. As used herein, the terms "variants" or "variations" of genes refer to nucleotide sequences that encode the same toxins or that encode equivalent toxins with pesticidal activity. The term "equivalent toxins" is used herein to refer to toxins that have the same or essentially the same biological activity against target pests as the exemplified toxins.
It should be apparent to those skilled in the art that genes encoding active toxins can be identified and obtained by various means. The specific genes exemplified herein can be obtained from the isolates deposited in a culture depository as described above. These genes, or portions or variants thereof, can also be constructed synthetically, for example by the use of a gene synthesizer. Variations of genes can be readily constructed using standard techniques for making point mutations. In addition, fragments of these genes can be prepared using commercially available exonucleases or endonucleases according to standard procedures. For example, enzymes such as Bal31, or site-directed mutagenesis, can be used to systematically trim nucleotides from the ends of these genes. In addition, genes encoding active fragments can be obtained using a variety of restriction enzymes. Proteases can be used to directly obtain active fragments of these toxins.
Equivalent toxins and/or genes encoding these equivalent toxins can be derived from B.t. isolates and/or DNA libraries using the teachings provided herein.
There are a number of methods for obtaining pesticidal toxins according to the present invention. For example, antibodies against the pesticidal toxins described and claimed herein can be used to identify and isolate other toxins from a mixture of proteins. Specifically, antibodies can be raised against the portions of the toxins that are most constant and most distinct from other B.t. toxins. These antibodies can then be used to identify equivalent toxins with the characteristic activity by immunoprecipitation, enzyme-linked immunosorbent assay (ELISA) or western blotting. Antibodies against the toxins described herein, against equivalent toxins, or against fragments of these toxins can be readily prepared using standard procedures in the art. The genes that encode these toxins can then be obtained from the microorganism.
Fragments and equivalents that retain the pesticidal activity of the exemplified toxins are within the scope of the present invention. In addition, because of the redundancy of the genetic code, a variety of different DNA sequences can encode the amino acid sequences described herein. It is well within the skill of a person trained in the art to generate these alternative DNA sequences encoding the same, or essentially the same, toxins. These variant DNA sequences are within the scope of the present invention. As used herein, the expression "essentially the same" refers to sequences that have amino acid substitutions, deletions, additions or insertions that do not materially affect pesticidal activity. Fragments that retain pesticidal activity are also included in this definition.
Another method for identifying the toxins and genes of the present invention is through the use of oligonucleotide probes. These probes are detectable nucleotide sequences. The probes provide a rapid method for identifying toxin-encoding genes according to the present invention. The nucleotide segments used as probes according to the present invention can be synthesized using a DNA synthesizer and standard procedures.
Certain toxins of the present invention have been specifically exemplified herein. Since these toxins are merely examples of the toxins of the present invention, it should be evident that the present invention encompasses variant or equivalent toxins (as well as nucleotide sequences encoding equivalent toxins) having the same or similar pesticidal activity as the exemplified toxin. Equivalent toxins have amino acid homology with an exemplified toxin. This amino acid identity is typically greater than 60%, preferably greater than 75%, more preferably greater than 80%, more preferably greater than 90%, and can be greater than 95%.
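As an illustration of the identity thresholds discussed above, the following is a minimal sketch in Python, not a method taken from this application. It assumes the two amino acid sequences have already been aligned to equal length with "-" marking gaps, and it assumes identity is counted over aligned, non-gap positions only; other definitions of percent identity use a different denominator. The function names and the example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap positions that carry the same residue.

    Assumption: both inputs come from a prior alignment step, so they have
    equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are skipped under this definition
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


def identity_tier(pct: float):
    """Return the highest threshold (95, 90, 80 or 60) that pct exceeds, else None."""
    for threshold in (95, 90, 80, 60):
        if pct > threshold:
            return threshold
    return None


# Hypothetical six-residue peptides differing at one position.
pct = percent_identity("MKTLLV", "MKTLIV")
print(round(pct, 1), identity_tier(pct))
```

Real toxin comparisons would first require a proper alignment step; this sketch only covers the counting that follows it.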
The amino acid homology is highest in the critical regions of the toxin that account for the biological activity or are involved in determining the three-dimensional configuration that is ultimately responsible for the biological activity. In this regard, certain amino acid substitutions are acceptable and can be expected if these substitutions take place in regions that are not critical to activity, or are conservative amino acid substitutions that do not affect the three-dimensional configuration of the molecule. For example, amino acids can be placed in the following classes: non-polar, uncharged polar, basic and acidic. Conservative substitutions, by which an amino acid of one class is replaced with another amino acid of the same class, fall within the scope of the present invention so long as the substitution does not materially alter the biological activity of the compound. Table 2 presents a list of examples of amino acids that belong to each class.
TABLE 2

Amino acid class     Examples of amino acids
Non-polar            Ala, Val, Leu, Ile, Pro, Met, Phe, Trp
Uncharged polar      Gly, Ser, Thr, Cys, Tyr, Asn, Gln
Acidic               Asp, Glu
Basic                Lys, Arg, His
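The class-based notion of a conservative substitution can be sketched as a simple lookup. The Python below is an illustrative sketch only: it encodes the four classes of Table 2 using the standard three-letter abbreviations (Ile for isoleucine, His for histidine), and the names AMINO_ACID_CLASSES and is_conservative are hypothetical. It checks class membership only; whether a given substitution preserves pesticidal activity still has to be determined experimentally.

```python
# Classes as listed in Table 2, keyed by class name.
AMINO_ACID_CLASSES = {
    "non-polar": {"Ala", "Val", "Leu", "Ile", "Pro", "Met", "Phe", "Trp"},
    "uncharged polar": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
}

# Reverse lookup: residue -> class name.
CLASS_OF = {
    residue: class_name
    for class_name, members in AMINO_ACID_CLASSES.items()
    for residue in members
}


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall in the same Table 2 class."""
    return CLASS_OF[original] == CLASS_OF[replacement]


print(is_conservative("Leu", "Ile"))  # both non-polar
print(is_conservative("Asp", "Lys"))  # acidic versus basic
```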
In some cases, non-conservative substitutions can also be made. The critical factor is that these substitutions must not significantly detract from the biological activity of the toxin. The toxins of the present invention can also be characterized in terms of the shape and location of the toxin inclusions, described above. As used herein, reference to "isolated" polynucleotides and/or "purified" toxins refers to these molecules when they are not associated with the other molecules with which they would be found in nature. Accordingly, "purified" toxins would include, for example, the toxins of interest expressed in plants. Reference to "isolated and purified" signifies the involvement of the "hand of man" as described herein. Chimeric toxins and genes also involve the "hand of man".
Recombinant hosts
The toxin-encoding genes harbored by the isolates of the present invention can be introduced into a wide variety of microbial or plant hosts. Expression of the toxin gene results, directly or indirectly, in the intracellular production and maintenance of the pesticide. With suitable microbial hosts, for example Pseudomonas, the microbes can be applied to the site of the pest, where they proliferate and are ingested. The result is control of the pest. Alternatively, the microbe hosting the toxin gene can be treated under conditions that prolong the activity of the toxin and stabilize the cell. The treated cell, which retains the toxic activity, can then be applied to the environment of the target pest.
Where the B.t. toxin gene is introduced by means of a suitable vector into a microbial host, and said host is applied to the environment in a living state, it is essential that certain host microbes be used. Microorganism hosts are selected that are known to occupy the "phytosphere" (phylloplane, phyllosphere, rhizosphere and/or rhizoplane) of one or more crops of interest. These microorganisms are selected so as to be capable of competing successfully in the particular environment (crop and other insect habitats) with the wild-type microorganisms, to provide for stable maintenance and expression of the gene expressing the polypeptide pesticide and, desirably, to provide improved protection of the pesticide from environmental degradation and inactivation.
A large number of microorganisms are known to inhabit the phylloplane (the surface of plant leaves) and/or the rhizosphere (the soil surrounding plant roots) of a wide variety of important crops. These microorganisms include bacteria, algae and fungi.
Of particular interest are microorganisms such as bacteria, for example the genera Pseudomonas, Erwinia, Serratia, Klebsiella, Xanthomonas, Streptomyces, Rhizobium, Rhodopseudomonas, Methylophilus, Agrobacterium, Acetobacter, Lactobacillus, Arthrobacter, Azotobacter, Leuconostoc and Alcaligenes; and fungi, especially yeast, for example the genera Saccharomyces, Cryptococcus, Kluyveromyces, Sporobolomyces, Rhodotorula and Aureobasidium. Of particular interest are phytosphere bacterial species such as Pseudomonas syringae, Pseudomonas fluorescens, Serratia marcescens, Acetobacter xylinum, Agrobacterium tumefaciens, Rhodopseudomonas spheroides, Xanthomonas campestris, Rhizobium meliloti, Alcaligenes eutrophus and Azotobacter vinelandii, as well as phytosphere yeast species such as Rhodotorula rubra, R. glutinis, R. marina, R. aurantiaca, Cryptococcus albidus, C. diffluens, C. laurentii, Saccharomyces rosei, S. pretoriensis, S. cerevisiae, Sporobolomyces roseus, S. odorus, Kluyveromyces veronae and Aureobasidium pullulans. Of particular interest are the pigmented microorganisms.
A wide variety of methods are available for introducing a B.t. gene encoding a toxin into a host microorganism under conditions that allow stable maintenance and expression of the gene. These methods are well known to those skilled in the art and have been described, for example, in U.S. Patent No. 5,135,867, which is incorporated herein by reference.
Control of lepidoptera, including the nocturnal black caterpillar, can be achieved using the isolates, toxins and genes of the present invention by a variety of methods known to those skilled in the art. These methods include, for example, the application of B.t. isolates to the pests (or their habitat), the application of recombinant microbes to the pests (or their habitats) and the transformation of plants with genes encoding the pesticidal toxins of the present invention. Recombinant microbes can be, for example, B.t., E. coli or Pseudomonas. The transformations can be carried out by those skilled in the art using standard techniques. The materials necessary for these transformations have been described herein or can otherwise be readily obtained by those skilled in the art.
Synthetic genes that are functionally equivalent to the toxins of the present invention can also be used to transform hosts. Methods for the production of synthetic genes can be found, for example, in U.S. Patent No. 5,380,831, corresponding to the pending Argentine patent application No. 314,873.
Treatment of cells
As mentioned above, B.t. cells or recombinant cells expressing a B.t. toxin can be treated to prolong the activity of the toxin and stabilize the cell. The pesticide microcapsule that is formed contains the B.t. toxin within a cellular structure that has been stabilized and that protects the toxin when the microcapsule is applied to the environment of the target pest. Suitable host cells can include prokaryotes or eukaryotes, and are typically limited to those cells that do not produce substances toxic to higher organisms such as mammals. However, organisms that produce substances toxic to higher organisms could be used where the toxic substances are unstable or the level of application is sufficiently low to avoid any possibility of toxicity to a mammalian host. As hosts, prokaryotes and lower eukaryotes, such as fungi, are of special interest.
When treated, the cell is generally intact and substantially in its proliferative form rather than in its spore form, although in some cases spores may be employed.
Treatment of the microbial cell, for example a microbe containing the B.t. toxin gene, can be carried out by chemical or physical means, or by a combination of chemical and/or physical means, provided that the technique does not adversely affect the properties of the toxin or reduce the capacity of the cell to protect the toxin. Examples of chemical reagents are halogenating agents, especially halogens of atomic number 17-80. More specifically, iodine can be used under mild conditions and for sufficient time to obtain the desired results. Other suitable techniques include treatment with aldehydes such as glutaraldehyde; anti-infectives such as zephiran chloride and cetylpyridinium chloride; alcohols such as isopropanol and ethanol; various histological fixatives such as Lugol's iodine, Bouin's fixative, various acids and Helly's fixative (see: Humason, Gretchen L., Animal Tissue Techniques, W. H. Freeman and Company, 1967); or a combination of physical (heat) and chemical agents that preserve and prolong the activity of the toxin produced in the cell when the cell is administered to the host environment. Examples of physical means are short-wavelength radiation, such as gamma radiation and X-radiation, freezing, UV irradiation, lyophilization and the like. Methods of treating microbial cells have been described in U.S. Patent Nos. 4,695,455 and 4,695,462, which are incorporated herein by reference.
The treated cells generally have enhanced structural stability, which increases resistance to environmental conditions. Where the pesticide is present in a proform, a method of cell treatment should be chosen that does not inhibit the processing of the proform to the mature form of the pesticide by the target pest pathogen. For example, formaldehyde crosslinks proteins and could inhibit the processing of the proform of a polypeptide pesticide. The treatment method must retain at least a substantial portion of the bioavailability or bioactivity of the toxin.
Characteristics of special interest in selecting a host cell for production purposes include the ease of introducing the B.t. gene into the host, the availability of expression systems, the efficiency of expression, the stability of the pesticide in the host and the presence of auxiliary genetic capabilities. Characteristics of interest for use as a pesticide microcapsule include qualities that protect the pesticide, such as thick cell walls, pigmentation, and intracellular packaging or the formation of inclusion bodies; survival in aqueous environments; lack of mammalian toxicity; attractiveness to pests for ingestion; ease of killing and fixing without damage to the toxin; and the like. Other factors to consider include ease of formulation and handling, economics, storage stability and the like.
Growth of cells
The cellular host containing the B.t. insecticidal gene can be grown in any convenient nutrient medium in which the DNA construct provides a selective advantage, providing a selective medium so that all or substantially all of the cells retain the B.t. gene. These cells can then be harvested according to conventional methods. Alternatively, the cells can be treated before harvesting.
The B.t. cells of the present invention can be cultured using standard media and fermentation techniques in the art. When the fermentation cycle is complete, the bacteria can be harvested by first separating the B.t. spores and crystals from the fermentation broth by means known in the art. The recovered B.t. spores and crystals can be used as a conventional δ-endotoxin preparation. For example, the spores and crystals can be formulated into a wettable powder, liquid concentrate, granules or other formulations by the addition of surfactants, dispersants, inert carriers and other components to facilitate handling and application for specific target pests. These formulations and methods of application are well known in the art.
Alternatively, the supernatant of the fermentation process can be used to obtain toxins according to the present invention. The secreted soluble toxins are then isolated and purified using generally known techniques.
Methods and formulations for pest control
Control of lepidoptera using the isolates, toxins and genes of the present invention can be achieved by a variety of methods known to those skilled in the art. These methods include, for example, the application of B.t. isolates to the pests (or their habitat), the application of recombinant microbes to the pests (or their habitats) and the transformation of plants with genes encoding the pesticidal toxins of the present invention. Recombinant microbes can be, for example, B.t., E. coli or Pseudomonas. The transformations can be carried out by those skilled in the art using standard techniques. The materials needed for these transformations have been described herein or can otherwise be readily obtained by those skilled in the art.
Formulated bait granules containing an attractant and toxins from the B.t. isolates, or recombinant microbes containing the genes obtained from the B.t. isolates described herein, may be applied to the soil. The formulated product can also be applied as a seed coating or root treatment, or as a whole-plant treatment at later stages of the crop cycle. Plant and soil treatments of B.t. cells can be used in the form of wettable powders, granules or dusts, by mixing with various inert materials such as inorganic minerals (phyllosilicates, carbonates, sulfates, phosphates and the like) or botanical materials (powdered corn cobs, rice husks, walnut shells and the like). The formulations may include spreader-stickers, stabilizers, other pesticidal additives or surfactants. The liquid formulations can be water-based or non-aqueous and used as foams, gels, suspensions, emulsifiable concentrates or the like. The ingredients may include rheological agents, surfactants, emulsifiers, dispersants or polymers.
As those skilled in the art will appreciate, the concentration of the pesticide can vary widely depending on the nature of the specific formulation, especially on whether it is a concentrate or is to be used directly.
The pesticide is present in at least 1% by weight and can reach 100% by weight. The dry formulations have from about 1-95% by weight of the pesticide, while the liquid formulations generally have from about 1-60% by weight of solids in the liquid phase. The formulations generally have from about 10^2 to about 10^4 cells/mg. These cell-containing formulations are administered at a rate of approximately 50 mg (liquid or dry) to 1 kg or more per hectare.
The formulations can be applied to the environment of the pest, for example to soil and foliage, by spraying, dusting, sprinkling or the like.
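The application figures above can be combined into a rough range of cells delivered per hectare. The Python sketch below is illustrative arithmetic only, under two stated assumptions: the cell density is read as 10^2 to 10^4 cells per mg of formulation, and the upper application rate is taken as exactly 1 kg (1,000,000 mg) per hectare even though the text allows "or more". The function name is hypothetical.

```python
def cells_per_hectare(cells_per_mg: int, rate_mg_per_hectare: int) -> int:
    """Cells delivered per hectare = cell density x application rate."""
    return cells_per_mg * rate_mg_per_hectare


MG_PER_KG = 1_000_000

# Assumed bounds, read from the formulation ranges in the text.
low = cells_per_hectare(10**2, 50)          # lowest density, lowest rate
high = cells_per_hectare(10**4, MG_PER_KG)  # highest density, 1 kg rate

print(f"{low:.1e} to {high:.1e} cells per hectare")
```

Under these assumptions the range spans roughly six orders of magnitude, which is consistent with the statement that the pesticide concentration can vary widely with the formulation.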
Mutants
Mutants of the novel isolates of the present invention can be prepared using methods well known in the art. For example, an asporogenous mutant can be obtained by ethyl methanesulfonate (EMS) mutagenesis of an isolate. Mutants can also be prepared using ultraviolet light and nitrosoguanidine by methods well known in the art.
A smaller percentage of the asporogenous mutants remain intact and do not lyse during prolonged periods of fermentation; these strains are designated lysis minus (-). Lysis minus strains can be identified by screening asporogenous mutants in shake-flask media and selecting the mutants that are still intact and contain toxin crystals at the end of fermentation. Lysis minus strains are suitable for a cell treatment process that produces an encapsulated, protected toxin protein.
To prepare a phage-resistant variant of said asporogenous mutant, an aliquot of the phage lysate is spread on nutrient agar and allowed to dry. An aliquot of the phage-sensitive bacterial strain is then plated directly over the dried lysate and allowed to dry. The plates are incubated at 30 °C for two days, after which numerous colonies can be seen growing on the agar. Some of these colonies are picked and subcultured on nutrient agar plates. These apparently resistant cultures are tested for resistance by cross-streaking with the phage lysate. A line of phage lysate is streaked on the plate and allowed to dry. The presumptive resistant cultures are then streaked across the phage line. Resistant bacterial cultures show no lysis anywhere in the streak across the phage line after overnight incubation at 30 °C. Phage resistance is then reconfirmed by plating a lawn of the resistant culture on a nutrient agar plate. The sensitive strain is plated in the same way to serve as a positive control. After drying, a drop of the phage lysate is placed in the center of the plate and allowed to dry. The resistant cultures show no lysis in the area where the phage lysate has been placed after incubation at 30 °C for 24 hours.
Polynucleotide probes

It is well known that DNA has a fundamental property called base complementarity. In nature, DNA ordinarily exists as pairs of antiparallel strands, with the bases of each strand projecting toward the opposite strand. The base adenine (A) on one strand always opposes the base thymine (T) on the other strand, and the base guanine (G) opposes the base cytosine (C). The bases are held in apposition by their ability to hydrogen bond in this specific way. Although each individual bond is relatively weak, the net effect of many adjacent hydrogen-bonded bases, together with base stacking effects, is a stable joining of the two complementary strands. These bonds can be broken by treatments such as high pH or high temperature, and these conditions result in the dissociation, or "denaturation", of the two strands. If the DNA is then placed under conditions that make hydrogen bonding of the bases thermodynamically favorable, the DNA strands anneal, or "hybridize", and reform the original double-stranded DNA. If carried out under appropriate conditions, this hybridization can be highly specific. That is, only strands with a high degree of base complementarity can form stable double-stranded structures. The relationship between hybridization specificity and reaction conditions is well known. Accordingly, hybridization can be used to test whether two pieces of DNA are complementary in their base sequences. It is this hybridization mechanism that facilitates the use of probes to readily detect and characterize DNA sequences of interest.

The probes can be RNA or DNA. The probe normally has at least about 10 bases, more usually at least about 18 bases, and can have up to about 50 bases or more, generally no more than about 200 bases if the probe is prepared synthetically. However, longer probes can readily be employed, and these can be, for example, several kilobases in length.
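The base-pairing rule described above (A opposite T, G opposite C, strands antiparallel) can be illustrated with a short Python sketch. The function names are hypothetical and chosen only for this illustration; the 20-base sequence is used purely as sample input, and the sketch handles only unambiguous A, C, G and T bases.

```python
# Watson-Crick pairing as stated in the text: A opposes T, and G opposes C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the antiparallel complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_fully_complementary(strand_a: str, strand_b: str) -> bool:
    """True when strand_b pairs with every base of strand_a in antiparallel orientation."""
    return strand_b.upper() == reverse_complement(strand_a)


if __name__ == "__main__":
    probe = "GTAAACAAGCTCGCCACCGC"  # sample 20-mer, illustration only
    target = reverse_complement(probe)
    print(target, is_fully_complementary(probe, target))
```

A probe need not be perfectly complementary to hybridize, so a strict equality test like this one is only the limiting case of the behavior described in the text.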
The sequence of the probe is designed to be at least substantially complementary to a portion of a gene encoding a toxin of interest. The probe need not have perfect complementarity to the sequence to which it hybridizes. The probes can be labeled using techniques well known to those skilled in the art. One approach to using the sequences of the present invention as probes is first to identify, by Southern blot analysis of a gene bank of the B.t. isolate, all DNA segments homologous to the disclosed nucleotide sequences. It is thus possible, without the aid of biological analysis, to know in advance the probable activity of many B.t. isolates. This type of probe analysis provides a rapid method for identifying potentially valuable insecticidal endotoxin genes within the various subspecies of B.t.
A hybridization procedure typically includes the initial steps of isolating the DNA sample of interest and purifying it chemically. Either lysed bacteria or total fractionated nucleic acid isolated from the bacteria can be used. Cells can be treated using known techniques to release their DNA (and/or RNA). The DNA sample can be cut into pieces with an appropriate restriction enzyme. The pieces can be separated by size through gel electrophoresis, usually in agarose or acrylamide. The pieces of interest can be transferred to an immobilizing membrane in a manner that retains the geometry of the pieces. The membrane can then be dried and prehybridized to equilibrate it for later immersion in a hybridization solution. The manner in which the nucleic acid is fixed to a solid support may vary. This fixing of the DNA for later processing is of great value for the use of this technique in field studies, remote from laboratory facilities. The particular hybridization technique is not essential to the present invention. As improvements are made in hybridization techniques, they can be readily applied.

As is known to those skilled in the art, if the probe molecule and the nucleic acid sample hybridize by forming a strong non-covalent bond between the two molecules, it can reasonably be assumed that the probe and the sample are essentially identical. The detectable label of the probe provides a means for determining in a known manner whether hybridization has occurred. The nucleotide segments of the present invention that are used as probes can be synthesized using DNA synthesizers by standard procedures. When the nucleotide segments are used as probes, the particular probe is labeled with any suitable label known to those skilled in the art, including radioactive and non-radioactive labels. Typical radioactive labels include 32P, 35S or the like.
A probe labeled with a radioactive isotope can be constructed from a nucleotide sequence complementary to the DNA sample by a conventional nick translation reaction using a DNase and DNA polymerase. The probe and sample can then be combined in a hybridization buffer and held at an appropriate temperature until annealing occurs. The membrane is then washed free of extraneous materials, leaving the sample and bound probe molecules, which are typically detected and quantified by autoradiography and/or liquid scintillation counting. For synthetic probes, it may be highly desirable to use enzymes such as polynucleotide kinase or terminal transferase to end-label the DNA for use as a probe. Non-radioactive labels include, for example, ligands such as biotin or thyroxine, enzymes such as hydrolases or peroxidases, the various chemiluminescers such as luciferin, and fluorescent compounds such as fluorescein and its derivatives. The probes can also be made inherently fluorescent as described in International Application No. WO 93/16094.
The probe can also be labeled at both ends with different types of labels for ease of separation, for example by using an isotopic label at the end mentioned above and a biotin label at the other end. The amount of labeled probe present in the hybridization solution varies widely, depending on the nature of the label, the amount of labeled probe that can reasonably bind to the filter, and the stringency of the hybridization. In general, substantial excesses of probe are used to enhance the rate of binding of the probe to the fixed DNA.

Various degrees of hybridization stringency can be employed. The more severe the conditions, the greater the complementarity required for duplex formation. Severity can be controlled by temperature, probe concentration, probe length, ionic strength, time and other factors. Preferably, hybridization is carried out under stringent conditions by techniques well known in the art, as described, for example, in Keller, G.H., M.M. Manak (1987) DNA Probes, Stockton Press, New York, NY, pp. 169-170. As used herein, "stringent" conditions for hybridization refers to conditions that achieve the same, or approximately the same, degree of hybridization specificity as the conditions employed by the present applicants. Specifically, hybridization of immobilized DNA on Southern blots with 32P-labeled gene-specific probes was performed by standard methods (Maniatis, T., E.F. Fritsch, J. Sambrook [1982] Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY). In general, hybridization and subsequent washes were carried out under stringent conditions that allowed detection of target sequences with homology to the toxin genes in question.
For double-stranded DNA gene probes, hybridization was carried out overnight at 20-25 °C below the melting temperature (Tm) of the DNA hybrid in 6X SSPE, 5X Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The melting temperature is described by the following formula (Beltz, G.A., K.A. Jacobs, T.H. Eickbush, P.T. Cherbas and F.C. Kafatos [1983] Methods in Enzymology, R. Wu, L. Grossman and K. Moldave [eds.], Academic Press, New York, 100:266-285):

Tm = 81.5 °C + 16.6 log[Na+] + 0.41(%G+C) - 0.61(%formamide) - 600/(length of duplex in base pairs)

Washes are typically carried out as follows:
(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low stringency wash).
(2) Once at Tm - 20 °C for 15 minutes in 0.2X SSPE, 0.1% SDS (moderate stringency wash).

For oligonucleotide probes, hybridization was carried out overnight at 10-20 °C below the melting temperature (Tm) of the hybrid in 6X SSPE, 5X Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The Tm for oligonucleotide probes was determined by the following formula (Suggs, S.V., T. Miyake, E.H. Kawashime, M.J. Johnson, K. Itakura and R.B. Wallace [1981] ICN-UCLA Symp. Dev. Biol. Using Purified Genes, D.D. Brown [ed.], Academic Press, New York, 23:683-693):

Tm (°C) = 2 (number of T/A base pairs) + 4 (number of G/C base pairs)

Washes were typically carried out as follows:
(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low stringency wash).
(2) Once at the hybridization temperature for 15 minutes in 1X SSPE, 0.1% SDS (moderate stringency wash).

Duplex formation and stability depend on substantial complementarity between the two strands of a hybrid and, as noted above, a certain degree of mismatch can be tolerated. Therefore, the nucleotide sequences of the present invention include mutations (both single and multiple), deletions, insertions and combinations thereof, wherein said mutations, insertions and deletions permit formation of stable hybrids with the target polynucleotide of interest. Mutations, insertions and deletions can be produced in a given polynucleotide sequence in many ways, and these methods are known to those skilled in the art. Other methods may become known in the future. Known methods include, but are not limited to: (1) synthesizing, chemically or otherwise, an artificial sequence that is a mutation, insertion or deletion of the known sequence; (2) using a nucleotide sequence of the present invention as a probe to obtain, by hybridization, a new sequence or a mutation, insertion or deletion of the probe sequence; and (3) mutating, inserting into or deleting from a test sequence in vitro or in vivo.

It is important to note that the mutational, insertional and deletional variants generated from a given probe may be more or less efficient than the original probe. Notwithstanding such differences in efficiency, these variants are within the scope of the present invention. Thus, mutational, insertional and deletional variants of the disclosed nucleotide sequences can be readily prepared by methods well known to those skilled in the art. These variants can be used in the same manner as the original primer sequences so long as the variants have substantial sequence homology with the original sequence.
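Returning to the oligonucleotide melting-temperature rule given earlier in this section (2 °C per A/T pair plus 4 °C per G/C pair, with hybridization 10-20 °C below Tm), the following Python sketch shows the arithmetic. The function names are hypothetical, the sample 20-mer is used only as input, and degenerate (ambiguous) bases are rejected because the rule as stated counts only A/T and G/C pairs.

```python
def oligo_tm_celsius(oligo: str) -> int:
    """Tm (deg C) = 2 x (number of A/T pairs) + 4 x (number of G/C pairs)."""
    seq = oligo.upper()
    at_pairs = seq.count("A") + seq.count("T")
    gc_pairs = seq.count("G") + seq.count("C")
    if at_pairs + gc_pairs != len(seq):
        raise ValueError("only unambiguous A, C, G and T bases are supported")
    return 2 * at_pairs + 4 * gc_pairs


def oligo_hybridization_window(tm_celsius: int) -> tuple:
    """Hybridization range stated in the text: 10 to 20 deg C below Tm (low, high)."""
    return (tm_celsius - 20, tm_celsius - 10)


if __name__ == "__main__":
    sample = "GTAAACAAGCTCGCCACCGC"  # sample 20-mer, illustration only
    tm = oligo_tm_celsius(sample)
    print(tm, oligo_hybridization_window(tm))
```

Because the moderate stringency wash for oligonucleotide probes is carried out at the hybridization temperature, the same window also bounds the wash temperature.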
As used herein, the term "substantial sequence homology" refers to homology that is sufficient to enable the variant to function in the same capacity as the original probe. Preferably, the variants have greater than 50% amino acid or nucleotide identity with the exemplified sequences; more preferably, greater than 75% identity; and most preferably, greater than 90% identity. The degree of homology needed for the variant to function in its intended capacity depends on the intended use of the sequence. It is well within the skill of a person trained in this art to make mutational, insertional and deletional changes designed to improve the function of the sequence or otherwise provide a methodological advantage.
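The identity thresholds above (greater than 50%, 75% and 90%) can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps; the alignment step itself is outside this sketch, and the function names and tier labels are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions carrying the same residue (gaps never match)."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same non-zero length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def preference_tier(identity: float) -> str:
    """Map an identity value onto the three thresholds stated in the text."""
    if identity > 90:
        return "most preferred"
    if identity > 75:
        return "more preferred"
    if identity > 50:
        return "preferred"
    return "below the stated thresholds"


if __name__ == "__main__":
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, preference_tier(identity))
```

The thresholds are strict ("greater than"), so a variant at exactly 90% identity falls in the "more preferred" tier in this sketch.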
PCR Technology

The Polymerase Chain Reaction (PCR) is a repetitive, enzymatic, primed synthesis of a nucleic acid sequence. This procedure is well known and commonly used by those skilled in the art (see Mullis, U.S. Patent Nos. 4,683,195, 4,683,202 and 4,800,159; Saiki, Randall K., Stephen Scharf, Fred Faloona, Kary B. Mullis, Glenn T. Horn, Henry A. Erlich, Norman Arnheim [1985] "Enzymatic Amplification of β-Globin Genomic Sequences and Restriction Site Analysis for Diagnosis of Sickle Cell Anemia," Science 230:1350-1354). PCR is based on the enzymatic amplification of a DNA fragment of interest that is flanked by two oligonucleotide primers that hybridize to opposite strands of the target sequence. The primers are oriented with their 3' ends pointing toward each other. Repeated cycles of heat denaturation of the template, annealing of the primers to their complementary sequences, and extension of the annealed primers with a DNA polymerase result in amplification of the segment defined by the 5' ends of the PCR primers. Since the extension product of each primer can serve as a template for the other primer, each cycle essentially doubles the amount of DNA fragment produced in the previous cycle. This results in the exponential accumulation of the specific target fragment, up to several million-fold in a few hours. By using a thermostable DNA polymerase such as Taq polymerase, which is isolated from the thermophilic bacterium Thermus aquaticus, the amplification process can be completely automated.

The DNA sequences obtained according to the present invention can be used as primers for PCR amplification. In performing PCR amplification, a certain degree of mismatch between primer and template can be tolerated. Therefore, mutations, deletions and insertions (especially additions of nucleotides to the 5' end) of the exemplified primers fall within the scope of the present invention.
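The doubling behavior described above can be made concrete with a small Python sketch. It assumes an idealized reaction; the `efficiency` parameter (1.0 meaning perfect doubling in every cycle) is an assumption introduced for illustration and does not come from the text, and the function names are hypothetical.

```python
import math


def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Idealized fold increase after a number of cycles: (1 + efficiency) ** cycles."""
    return (1.0 + efficiency) ** cycles


def cycles_needed(target_fold: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles giving at least target_fold amplification."""
    exact = math.log(target_fold) / math.log(1.0 + efficiency)
    # Small tolerance so exact powers are not pushed up a cycle by rounding error.
    return math.ceil(exact - 1e-9)


if __name__ == "__main__":
    print(fold_amplification(20))    # perfect doubling over 20 cycles
    print(cycles_needed(1_000_000))  # cycles for a million-fold increase
```

Under perfect doubling, about 20 cycles already give a million-fold increase, which is consistent with the "several million-fold in a few hours" statement; real reactions plateau and fall short of this idealized curve.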
Mutations, insertions and deletions can be produced in a given primer by methods known to a person skilled in the art. It is important to note that the mutational, insertional and deletional variants generated from a given primer sequence may be more or less efficient than the original sequences. Notwithstanding such differences in efficiency, these variants are within the scope of the present invention.

Following are examples that illustrate procedures for practicing the invention. These examples should not be construed as limiting. All percentages are by weight and all solvent mixture proportions are by volume unless otherwise noted.
EXAMPLE 1 Culturing of B.t. isolates useful according to the present invention

A subculture of the B.t. isolates, or mutants thereof, can be used to inoculate the following peptone, glucose, salts medium:

Bacto Peptone        7.5 g/l
Glucose              1.0 g/l
KH2PO4               3.4 g/l
K2HPO4               4.35 g/l
Salts solution       5.0 ml/l
CaCl2 solution       5.0 ml/l
pH                   7.2

Salts solution (100 ml)
MgSO4·7H2O           2.46 g
MnSO4·H2O            0.04 g
ZnSO4·7H2O           0.28 g
FeSO4·7H2O           0.40 g

CaCl2 solution (100 ml)
CaCl2·2H2O           3.66 g

The salts solution and the CaCl2 solution are filter-sterilized and added to the autoclaved and cooled broth at the time of inoculation. Flasks are incubated at 30 °C on a rotary shaker at 200 rpm for 64 hours. The procedure described can readily be scaled up to large fermentors by methods well known in the art. The B.t. spores and/or crystals obtained in the above fermentation can be isolated by methods well known in the art. A frequently used method is to subject the harvested fermentation broth to separation techniques, for example centrifugation.

Alternatively, a subculture of the B.t. isolates, or mutants thereof, can be used to inoculate the following medium, known as TB broth:

Tryptone             12 g/l
Yeast extract        24 g/l
Glycerol             4 g/l
KH2PO4               2.1 g/l
K2HPO4               14.7 g/l
pH                   7.4

The potassium phosphate was added to the autoclaved broth after cooling. Flasks were incubated at 30 °C on a rotary shaker at 250 rpm for 24-36 hours. The procedure described can readily be scaled up to large fermentors by methods well known in the art. The B.t. obtained in the above fermentation can be isolated by methods well known in the art. A frequently used method is to subject the harvested fermentation broth to separation techniques, for example centrifugation. In a specific embodiment, B.t. proteins useful according to the present invention can be obtained from the supernatant. The culture supernatant containing the active protein(s) was used in the bioassays described below.
EXAMPLE 2 Identification of genes encoding novel Bacillus thuringiensis toxins active against lepidopterans

Two primer pairs were designed for the identification and classification of novel toxin genes by PCR amplification of polymorphic DNA fragments near the 3' ends of the toxin genes. These oligonucleotide primers allow genes encoding toxins in the Cry7, Cry8 or Cry9 subfamilies to be distinguished from genes for the more common lepidopteran-active toxins in the Cry1 subfamily on the basis of size differences in the amplified DNA. The sequences of these primers are:

Forward 1   5' CGTGGCTATATCCTTCGTGTYAC 3' (SEQ ID NO: 1)
Reverse 1   5' ACRATRAATGTTCCTTCYGTTTC 3' (SEQ ID NO: 2)
Forward 2   5' GGATATGTMTTACGTGTAACWGC 3' (SEQ ID NO: 3)
Reverse 2   5' CTACACTTTCTATRTTGAATRYACCTTC 3' (SEQ ID NO: 4)

PCR amplification (Perkin Elmer, Foster City, CA) using primer pair 1 (SEQ ID NOS: 1 and 2) according to the present invention yields DNA fragments approximately 415-440 base pairs in length from B.t. toxin genes related to the cry1 subfamily. PCR amplification using primer pair 2 (SEQ ID NOS: 3 and 4) according to the present invention yields DNA fragments approximately 230-290 base pairs in length from B.t. toxin genes of the cry7, cry8 or cry9 subfamilies. These primers can be used, according to the present invention, to identify genes encoding novel toxins.

Crude DNA templates for PCR were prepared from B.t. strains. A loopful of cells was scraped from an overnight plate culture of Bacillus thuringiensis and resuspended in 300 µl of TE buffer (10 mM Tris-Cl, 1 mM EDTA, pH 8.0). Proteinase K was added to 0.1 mg/ml and the cell suspension was heated at 55 °C for 15 minutes. The solution was then boiled for 15 minutes. Cellular debris was pelleted in a microcentrifuge and the supernatant containing the DNA was transferred to a clean tube.

PCR was carried out using the primer pair consisting of the Forward 2 (SEQ ID NO: 3) and Reverse 2 (SEQ ID NO: 4) oligonucleotides described above. Strains containing genes characterized by amplification of DNA fragments approximately 230-290 bp in length were identified. Spore-crystal preparations from these strains were then tested for biological activity against Agrotis ipsilon and other lepidopteran targets. PS185U2 was examined using both primer pairs 1 and 2 (SEQ ID NOS: 1 and 2 and SEQ ID NOS: 3 and 4, respectively). In this strain, primer pair 1 (SEQ ID NOS: 1 and 2) yielded a DNA band of the size expected for toxin genes related to the cry1 subfamily.
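The four primers of this example contain degenerate positions (Y, R, M, W), and the amplified fragment size is what separates cry1-related genes from cry7, cry8 or cry9 genes. The following Python sketch, offered as an illustration only, expands each degenerate primer into its concrete sequences using the standard IUPAC nucleotide codes and classifies a fragment by the size ranges stated in the text. The function names and the returned labels are hypothetical.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as listed in this example (SEQ ID NOS: 1-4).
PRIMERS = {
    "Forward 1": "CGTGGCTATATCCTTCGTGTYAC",
    "Reverse 1": "ACRATRAATGTTCCTTCYGTTTC",
    "Forward 2": "GGATATGTMTTACGTGTAACWGC",
    "Reverse 2": "CTACACTTTCTATRTTGAATRYACCTTC",
}


def degeneracy(primer: str) -> int:
    """Number of distinct concrete sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def expand(primer: str) -> list:
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in primer.upper()))]


def classify_amplicon(size_bp: int):
    """Assign a fragment to a subfamily group by the size ranges given in the text."""
    if 415 <= size_bp <= 440:
        return "cry1-related"
    if 230 <= size_bp <= 290:
        return "cry7/cry8/cry9"
    return None  # size outside both stated ranges


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, degeneracy(seq))
    print(expand(PRIMERS["Forward 1"]))
```

The size ranges are described in the text as approximate, so a fragment just outside a range would in practice call for closer inspection rather than the hard cutoff used here.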
EXAMPLE 3 Restriction fragment length polymorphism (RFLP) analysis of Bacillus thuringiensis toxin genes present in lepidopteran-active strains

Total cellular DNA was prepared from Bacillus thuringiensis (B.t.) strains grown to an optical density, at 600 nm, of 1.0. Cells were pelleted by centrifugation and resuspended in protoplast buffer (20 mg/ml lysozyme in 0.3 M sucrose, 25 mM Tris-Cl [pH 8.0], 25 mM EDTA). After incubation at 37 °C for 1 hour, the protoplasts were lysed by two cycles of freezing and thawing. Nine volumes of a solution of 0.1 M NaCl, 0.1% SDS, 0.1 M Tris-Cl were added to complete lysis. The cleared lysate was extracted twice with phenol:chloroform (1:1). Nucleic acids were precipitated with two volumes of ethanol and pelleted by centrifugation. The pellet was resuspended in TE buffer and RNase was added to a final concentration of 50 µg/ml. After incubation at 37 °C for 1 hour, the solution was extracted once with phenol:chloroform (1:1) and once with TE-saturated chloroform. DNA was precipitated from the aqueous phase by the addition of one-tenth volume of 3 M NaOAc and two volumes of ethanol. The DNA was pelleted by centrifugation, washed with 70% ethanol, dried and resuspended in TE buffer.
Two types of PCR-amplified, 32P-labeled DNA probes were used in standard Southern hybridizations of total cellular B.t. DNA to characterize toxin genes by RFLP. The first probe (A) was a DNA fragment amplified using the following primers:

Forward 3: 5' CCAGWTTTATAGGAGG 3' (SEQ ID NO: 5)
Reverse 3: 5' GTAAACAAGCTCGCCACCGC 3' (SEQ ID NO: 6)

The second probe (B) was the 230-290 bp or 415-440 bp DNA fragment amplified with the primers described in the previous example.

Hybridization of immobilized DNA on Southern blots with the aforementioned 32P-labeled probes was performed by standard methods (Maniatis, T., E.F. Fritsch, J. Sambrook [1982] Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY). In general, hybridization and subsequent washes were carried out under moderate stringency. For double-stranded DNA gene probes, hybridization was carried out overnight at 20-25 °C below the melting temperature (Tm) of the DNA hybrid in 6X SSPE, 5X Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The melting temperature is described by the following formula (Beltz, G.A., K.A. Jacobs, T.H. Eickbush, P.T. Cherbas and F.C. Kafatos [1983] Methods in Enzymology, R. Wu, L. Grossman and K. Moldave [eds.], Academic Press, New York, 100:266-285):

Tm = 81.5 °C + 16.6 log[Na+] + 0.41(%G+C) - 0.61(%formamide) - 600/(length of duplex in base pairs)

Washes are typically carried out as follows:
(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low stringency wash).
(2) Once at Tm - 20 °C for 15 minutes in 0.2X SSPE, 0.1% SDS (moderate stringency wash).

RFLP data were obtained for the ten strains most active against Agrotis ipsilon (Tables 3 and 4). The hybridizing DNA bands described herein contain all or part of the novel toxin genes under investigation.
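The melting-temperature formula for double-stranded probes, together with the stated hybridization range (20-25 °C below Tm) and the moderate stringency wash (Tm - 20 °C), can be sketched in Python as follows. The function names are hypothetical, and the sodium concentration, G+C content and duplex length used in the demonstration are illustrative assumptions rather than values from this example.

```python
import math


def duplex_tm_celsius(na_molar: float, gc_percent: float,
                      formamide_percent: float, duplex_length_bp: int) -> float:
    """Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.61*(%formamide) - 600/length."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 600.0 / duplex_length_bp)


def duplex_hybridization_window(tm_celsius: float) -> tuple:
    """Hybridization range stated in the text: 20 to 25 deg C below Tm (low, high)."""
    return (tm_celsius - 25.0, tm_celsius - 20.0)


def moderate_wash_temperature(tm_celsius: float) -> float:
    """Moderate stringency wash temperature stated in the text: Tm - 20 deg C."""
    return tm_celsius - 20.0


if __name__ == "__main__":
    # Illustrative inputs only: 1.0 M Na+, 40% G+C, no formamide, 250 bp duplex.
    tm = duplex_tm_celsius(1.0, 40.0, 0.0, 250)
    print(round(tm, 1), duplex_hybridization_window(tm), moderate_wash_temperature(tm))
```

The length term matters most for short duplexes: at 250 bp it lowers Tm by 2.4 °C, whereas for a probe several kilobases long it becomes negligible.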
TABLE 3 RFLP data for Bacillus thuringiensis strains using probe A
Approximate fragment size (base pairs)

Bacillus thuringiensis strain: PS185U2 PS89J3 PS11B HD129 PS86BB1 PS86W1 PS86V1 PS31G1 HD573 HD525

EcoRI: 8410 11837 11168 11132 8267 8718 10356 11687 9816 9570 3631 9769 7347 5876 5585 5159 7105 7419 5908 5760 1900 7225 3684 3659 3838 3742 925 4921 628 1716 661 846 Ul 498
SacI: 8997 6326 10057 9165 12170 10564 6708 6216 5645 5450 5593 6046 6063 5204 5074 3741 4120 4758 5724 5993 7105 2670 1993 6129 1945 3868 3436 936 1190 3027
KpnI: 12852 4596 9878 4258 5802 8938 6300
XbaI: 2658 1596 5876 9312 763 3870 5911 630 3258 2827 2093 2636 1521 1760 1010 625
TABLE 4 RFLP data for Bacillus thuringiensis strains using probe B
Approximate fragment size (base pairs)

Bacillus thuringiensis strain: PS185U2 PS89J3 PS11B HD129 PS86BB1 PS86W1 PS86V1 PS31G1 HD573 HD525

Digestion
EcoRI: 10493 10838 9874 4922 8286 7334 9791 8603 9741 9741 4387 6217 7347 3048 5567 6638 6412 4228 6146 5840 3686 3685 3878
SacI: 10252 5177 9619 11487 11475 10646 5840 5840 6217 5297 6638 6081 6789 5486 or
HindIII: 7197 5880 7719 5187 5567 6316 6412 6475 5840 5840 5553 3985 6033 4022 3740 4239 4199 3183 4522 4522 2700 2882 2513 2845 3057
KpnI: 3548 12113 1446 10491 10624 12074 12756 1528 10791 10791 7345 1076 7884 8953 9286 4082 4296 1994 2099
XbaI: 5262 5048 4563 5716 4921 9684 5549 5840 2985 3048 3386 4455 3583 6630 3501 3685
EXAMPLE 4 DNA sequencing of toxin genes
PCR-amplified segments of toxin genes present in B.t. strains active against Agrotis ipsilon were sequenced. To accomplish this, amplified DNA fragments obtained using the Forward 3 (SEQ ID NO: 5) and Reverse 3 (SEQ ID NO: 6) primers were cloned into the PCR DNA TA-cloning plasmid vector, pCRII, as described by the supplier (Invitrogen, San Diego, CA). Several individual pCRII clones from the mixture of amplified DNA fragments of each B.t. strain were chosen for sequencing. Colonies were lysed by boiling to release crude plasmid DNA. DNA templates for automated sequencing were amplified by PCR using vector-specific primers flanking the multiple cloning site of the plasmid. These DNA templates were sequenced using Applied Biosystems (Foster City, CA) automated sequencing methodologies. By this method, the toxin gene sequences and their corresponding nucleotide sequences described below were identified (SEQ ID NO: 7 through SEQ ID NO: 62). These sequences are listed in Table 5. The polypeptide sequences deduced from these nucleotide sequences are also shown.

From these partial gene sequences, seven oligonucleotides useful as PCR primers or hybridization probes were designed. The sequences of these oligonucleotides are as follows:

5' GTTCATTGGTATAAGAGTTGGTG 3' (SEQ ID NO: 63)
5' CCACTGCAAGTCCGGACCAAATTCG 3' (SEQ ID NO: 64)
5' GAATATATTCCCGTCYATCTCTGG 3' (SEQ ID NO: 65)
5' GCACGAATTACTGTAGCGATAGG 3' (SEQ ID NO: 66)
5' GCTGGTAACTTTGGAGATATGCGTG 3' (SEQ ID NO: 67)
5' GATTTCTTTGTAACACGTGGAGG 3' (SEQ ID NO: 68)
5' CACTACTAATCAGAGCGATCTG 3' (SEQ ID NO: 69)

Table 5 lists the specific toxin gene sequences and the oligonucleotide probes that allow these genes to be identified by hybridization, or by PCR in combination with the Reverse 3 primer described above.
TABLE 5 Sequence ID reference numbers
EXAMPLE 5 Isolation and DNA sequencing of whole toxin genes
Total cellular DNA was extracted from B.t. strains using standard procedures known in the art. See, for example, Example 3 above. Gene libraries of size-fractionated Sau3A partial restriction fragments of total cellular DNA were constructed in the bacteriophage vector Lambda-Gem11. Recombinant phage were packaged and plated on E. coli KW251 cells. Plaques were screened by hybridization with radiolabeled gene-specific probes derived from DNA fragments amplified by PCR with the oligonucleotide primers of SEQ ID NOS: 5 and 6. Hybridizing phage were plaque-purified and used to infect liquid cultures of E. coli KW251 cells for isolation of DNA by standard procedures (Maniatis, T., E.F. Fritsch, J. Sambrook [1982] Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY). The toxin genes were then subcloned into pBluescript vectors (Stratagene) for DNA sequence analysis.

The complete toxin genes listed below were sequenced using Applied Biosystems (Foster City, CA) automated sequencing methodologies. The toxin gene sequences and the respective predicted polypeptide sequences are listed below.

Source strain   Nucleotide SEQ ID NO.   Peptide SEQ ID NO.   Toxin designation
PS86BB1         70                      71                   86BB1(a)
PS86BB1         72                      73                   86BB1(b)
PS31G1          74                      75                   31G1(a)
Recombinant E. coli NM522 strains containing plasmids encoding these toxins were deposited with the NRRL on June 27, 1997.

Strain   Plasmid     Toxin designation   NRRL number
MR922    pMYC2451    86BB1(a)            B-21794
MR923    pMYC2453    86BB1(b)            B-21795
MR924    pMYC2454    31G1(a)             B-21796
EXAMPLE 6 Heterologous expression of novel B.t. toxins in Pseudomonas fluorescens (P.f.)

These toxin genes were engineered into plasmid vectors by standard DNA cloning methods and transformed into Pseudomonas fluorescens for expression. The recombinant bacterial strains (Table 6) were grown in shake flasks for production of toxin for expression analysis and quantitative bioassay against a variety of lepidopteran insect pests.

TABLE 6 Recombinant Pseudomonas fluorescens strains for heterologous expression of novel toxins

Source strain   Plasmid     Toxin       Recombinant P.f. strain
PS86BB1         pMYC2804    86BB1(a)    MR1259
PS86BB1         pMYC2805    86BB1(b)    MR1260
PS31G1          pMYC2430    31G1(a)     MR1264

EXAMPLE 7 Processing of endotoxins with trypsin
Cultures of Pseudomonas fluorescens were grown for 48 hours by standard procedures. Cell pellets were harvested by centrifugation, washed three times with water and stored at -70 °C. Endotoxin inclusions were isolated from cells treated with lysozyme and DNase by differential centrifugation. The toxins thus isolated were then processed to limit peptides by trypsinolysis and used for bioassays on lepidopteran pests. Detailed protocols follow.

Toxin inclusion bodies were prepared from the washed crude cell pellets as follows:

4 L Lysis Buffer (prepared on the day of use)

Component             Amount
Tris base             24.22 g
NaCl                  46.75 g
Glycerol              252 g
Dithiothreitol        0.62 g
EDTA disodium salt    29.78 g
Triton X-100          20 ml

Adjust the pH to 7.5 with HCl and bring to the final volume (4 L) with distilled water.

1. Thaw the frozen cell pellet in a 37 °C water bath.
2. Add lysis buffer until the 500 ml polycarbonate bottles are as full as possible (about 400 ml total volume). Disperse by inverting the bottle or by using the Polytron at low rpm.
3. Centrifuge (10,000 x g) for 20 minutes at 4 °C.
4. Decant and discard the supernatant.
5. Resuspend the pellet in 5 ml of lysis buffer for every gram of pellet, using the Polytron at low rpm to disperse the pellet.
6. Add 25 mg/ml lysozyme solution to the suspension to a final concentration of 0.6 mg/ml.
7. Incubate at 37 °C for 4 minutes. Invert every 30 seconds.
8. Place the suspension on ice for 1 hour.
9. Add 2.5 M MgCl2·6H2O to the tubes to a final concentration of 60 mM. Add a 40 mg/ml deoxyribonuclease I (Sigma) solution to a final concentration of 0.5 mg/ml.
10. Incubate overnight at 4 °C.
11. Homogenize the lysate using the Polytron at low rpm.
12. Centrifuge at 10,000 x g at 4 °C for 20 minutes. Decant and discard the supernatant.
13. Resuspend the inclusion pellet in lysis buffer. Check microscopically for complete cell lysis.
14. Wash the inclusion pellet in lysis buffer 5 times (repeat steps 2-5).
15. Store as a suspension in 10 mM Tris-Cl pH 7.5, 0.1 mM PMSF at -70 °C in 1.5 ml Eppitubes.
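Steps 5, 6 and 9 of the protocol above each call for adding a concentrated stock until a stated final concentration is reached. A minimal Python sketch of that arithmetic follows. It assumes that the added stock volume counts toward the final volume and that the starting suspension volume is known; the 20 g pellet used in the demonstration is a hypothetical figure, the pellet's own volume is ignored, and the function names are illustrative only.

```python
def resuspension_volume_ml(pellet_grams: float, ml_per_gram: float = 5.0) -> float:
    """Lysis buffer volume for resuspension: 5 ml per gram of pellet (step 5)."""
    return pellet_grams * ml_per_gram


def stock_volume_to_add(start_volume_ml: float, stock_conc: float,
                        final_conc: float) -> float:
    """Volume of stock v such that stock_conc * v / (start_volume + v) == final_conc.

    stock_conc and final_conc must share the same units (e.g. mg/ml or M).
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final concentration must be positive and below the stock")
    return final_conc * start_volume_ml / (stock_conc - final_conc)


if __name__ == "__main__":
    volume = resuspension_volume_ml(20.0)             # hypothetical 20 g pellet
    lysozyme = stock_volume_to_add(volume, 25.0, 0.6)  # 25 mg/ml stock to 0.6 mg/ml
    volume += lysozyme
    mgcl2 = stock_volume_to_add(volume, 2.5, 0.060)    # 2.5 M stock to 60 mM
    volume += mgcl2
    dnase = stock_volume_to_add(volume, 40.0, 0.5)     # 40 mg/ml stock to 0.5 mg/ml
    print(round(lysozyme, 2), round(mgcl2, 2), round(dnase, 2))
```

Each later addition slightly dilutes the reagents added before it; the sketch does not correct for this, which is usually acceptable at the small volumes involved.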
Digestion of the inclusions with trypsin is carried out as follows:

Digestion solution:
1. 2 ml of 1 M NaCAPS, pH 10.5.
2. Inclusion preparation (up to 100 mg of protein).
3. Trypsin at a 1:100 ratio to the amount of protein to be cleaved (added during the procedure).
4. H2O to a final volume of 10 ml.

Trypsin treatment:
1. Incubate the digestion solution, without trypsin, at 37 °C for 15 minutes.
2. Add trypsin at a ratio of 1:100 (trypsin:toxin protein, w/w).
3. Incubate the solution for 2 hours at 37 °C with occasional mixing by inversion.
4. Centrifuge the digestion solution for 15 minutes at 15,000 x g at 4 °C.
5. Remove and save the supernatant.
6. Analyze the supernatant by SDS-PAGE and use it for the bioassay as described below.
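The quantities in the digestion recipe above follow from two simple ratios: trypsin at 1:100 (w/w) relative to toxin protein, and 2 ml of 1 M NaCAPS diluted into a 10 ml final volume. A short Python sketch of the arithmetic, with hypothetical function names, is given below.

```python
def trypsin_mass_mg(toxin_protein_mg: float, ratio: float = 100.0) -> float:
    """Trypsin mass for a 1:ratio (trypsin:toxin protein, w/w) digestion."""
    return toxin_protein_mg / ratio


def final_molarity(stock_molar: float, stock_volume_ml: float,
                   final_volume_ml: float) -> float:
    """Concentration after diluting a stock volume into the final volume."""
    return stock_molar * stock_volume_ml / final_volume_ml


if __name__ == "__main__":
    # Upper limit stated in the recipe: 100 mg of inclusion protein per 10 ml.
    print(trypsin_mass_mg(100.0))
    print(final_molarity(1.0, 2.0, 10.0))
```

At the stated upper limit of 100 mg of protein, the 1:100 ratio corresponds to 1 mg of trypsin, and the buffer ends up at 0.2 M NaCAPS.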
EXAMPLE 8 Expression of a gene from B.t. strain HD129 in a chimeric construct

A gene was isolated from B.t. strain HD129. This gene appears to be a pseudogene with no apparent translational initiation codon. To express this gene from HD129, we designed and constructed a gene fusion with the first 28 codons of cry1Ac in a Pseudomonas expression system. The nucleotide and peptide sequences of this chimeric toxin are shown in SEQ ID NOS: 76 and 77. Upon induction, recombinant P. fluorescens containing this novel chimeric toxin expressed a polypeptide of the predicted size.
EXAMPLE 9 Additional sequencing of toxin genes

DNA of soluble toxins from the isolates listed in Table 7 was sequenced. The SEQ ID NOS. of the sequences thus obtained are also shown in Table 7.

TABLE 7

Source isolate   Protein SEQ ID NO.   Nucleotide SEQ ID NO.   Toxin name
PS11B            78                   79                      11B(a)
PS31G1           80                   81                      31G1(b)
PS86BB1          82                   83                      86BB1(c)
PS86V1           84                   85                      86V1(a)
PS86W1           86                   87                      86W1(a)
PS94R1           88                   89                      94R1(a)
PS185U2          90                   91                      185U2(a)
PS202S           92                   93                      202S(a)
PS213E5          94                   95                      213E5(a)
PS218G2          96                   97                      218G2(a)
HD29             98                   99                      29HD(a)
HD110            100                  101                     110HD(a)
HD129            102                  103                     129HD(b)
HD573            104                  105                     573HD(a)
EXAMPLE 10 Black Cutworm Bioassay
Suspensions of powders containing B.t. isolates were prepared by mixing an appropriate amount of powder with distilled water and agitating vigorously. The suspensions were mixed with black cutworm artificial diet (BioServ, Frenchtown, NJ) amended with 28 grams of alfalfa powder (BioServ) and 1.2 ml of formalin per liter of finished diet. The suspensions were mixed with finished artificial diet at a rate of 3 ml of suspension plus 27 ml of diet. After centrifugation, this mixture was poured into plastic trays with compartmentalized 3 ml wells (Nutrend Container Corporation, Jacksonville, FL). A water blank containing no B.t. served as the control. First-instar Agrotis ipsilon larvae (French Agricultural Services, Lamberton, MN) were placed singly onto the diet mixture. The wells were then sealed with "MYLAR" sheeting (ClearLam Packaging, IL) using a stapler, and several pinholes were made in each well to allow gas exchange. The larvae were held at 29°C for four days in a 14:10 (light:dark) holding room. Mortality was recorded after four days. The following isolates were found to have activity against the black cutworm: PS185U2, PS11B, PS218G2, PS213E5, PS86W1, PS28C, PS86BB1, PS89J3, PS86R1, PS94R1, HD525, HD573, PS27J2, HD110, HD10, PS202S, HD29, PS101DD, HD129 and PS31G1. The bioassay results are shown in Table 8.
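The 3 ml suspension plus 27 ml diet mixing rate is a 1:10 dilution, so the suspension must be ten times more concentrated than the desired dose in the diet. A small sketch of that arithmetic, assuming the dose is expressed per ml of the final suspension-plus-diet mixture (an interpretation, not stated explicitly in the text); the dose values in the loop are illustrative:

```python
SUSPENSION_ML = 3.0   # volume of B.t. suspension per batch (from the text)
DIET_ML = 27.0        # volume of finished artificial diet per batch (from the text)


def suspension_conc_needed(target_ug_per_ml_diet):
    """Toxin concentration (ug/ml) the suspension needs so that the final
    mixture reaches the target dose."""
    total_ml = SUSPENSION_ML + DIET_ML
    return target_ug_per_ml_diet * total_ml / SUSPENSION_ML


for dose in (200, 100, 50, 25):  # illustrative two-fold dose series
    print(f"{dose} ug/ml diet -> {suspension_conc_needed(dose):.0f} ug/ml suspension")
```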
TABLE 8 Percent black cutworm mortality associated with B.t. isolates
Estimated toxin concentration (μg toxin/ml diet)
Sample          200    100    50     25
PS86BB1         51     25     9      1
PS31G1          30     20     7      5
PS11B           37     16     3      0
HD573           11     13     3      0
HD129           87     73     43     7
PS86V1          73     29     19     3
PS89J3          68     27     15     3
PS86W1          61     23     12     15
PS185U2         69     32     14     16
HD525           67     20     11     4
Water control   1
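Mortality figures such as those in Table 8 are often adjusted for background deaths in the control. The source does not say that any correction was applied; the sketch below simply illustrates Abbott's formula, a common convention in insect bioassays, using the HD129 row and the 1% water-control mortality from the table.

```python
def abbott_corrected(treated_pct, control_pct):
    """Abbott's formula: percent mortality corrected for control mortality."""
    if not 0 <= control_pct < 100:
        raise ValueError("control mortality must be in [0, 100)")
    return 100.0 * (treated_pct - control_pct) / (100.0 - control_pct)


WATER_CONTROL_PCT = 1.0                      # water control, Table 8
hd129 = {200: 87, 100: 73, 50: 43, 25: 7}    # HD129 row, Table 8

for dose, mortality in hd129.items():
    corrected = abbott_corrected(mortality, WATER_CONTROL_PCT)
    print(f"{dose} ug/ml: observed {mortality}%, corrected {corrected:.1f}%")
```

With a control mortality this low, the correction changes the values only marginally.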
EXAMPLE 11 Activity of B.t. Isolates Against Agrotis ipsilon
Samples were tested as culture supernatants. The samples were applied to black cutworm artificial diet (BioServ, Frenchtown, NJ) and allowed to air dry before larval infestation. A water blank containing no B.t. served as the control. Eggs were applied to each treated well, and the wells were then sealed with "MYLAR" sheeting (ClearLam Packaging, IL) using a stapler; several pinholes were made in each well to allow gas exchange. The bioassays were held at 25°C for 7 days in a 14:10 (light:dark) holding room. Mortality was recorded after seven days. Strains exhibiting mortality (greater than the water control) against A. ipsilon are reported in Table 9.
TABLE 9 Larvicidal activity of concentrated B.t. supernatants in a top-load bioassay on A. ipsilon neonates
Strain     Activity
PS86W1     +
PS28C      +
PS86BB1    +
PS89J3     +
PS86V1     +
PS94R1     +
HD573      +
EXAMPLE 12 Activity of B.t. Pseudomonas fluorescens Clones Against Heliothis virescens (Fabricius) and Helicoverpa zea (Boddie)
Strains were tested either as Pseudomonas fluorescens clones or as B.t. culture supernatant samples. Suspensions of clones were prepared by individually mixing the samples with distilled water and agitating vigorously. For diet-incorporation bioassays, the suspensions were mixed with the artificial diet at a rate of 6 ml of suspension plus 54 ml of diet. After centrifugation, this mixture was poured into plastic trays with compartmentalized 3 ml wells (Nutrend Container Corporation, Jacksonville, FL). Supernatant samples were mixed with the diet at a rate of 3-6 ml, as outlined above. In top-load bioassays, suspensions or supernatants were applied to the top of the artificial diet and allowed to air dry before larval infestation. A water blank served as the control. First-instar larvae (USDA-ARS, Stoneville, MS) were placed singly onto the diet mixture. The wells were then sealed with "MYLAR" sheeting (ClearLam Packaging, IL) using a stapler, and several pinholes were made in each well to allow gas exchange. The larvae were held at 25°C for 6 days in a 14:10 (light:dark) holding room. Mortality was recorded after six days. The results are as follows:
TABLE 10 Larvicidal activity of concentrated B.t. supernatants in a top-load bioassay
                      H. virescens              H. zea
Strain     μg/cm2     % mortality   Stunting    % mortality   Stunting
HD129      44.4       100           Yes         50            Yes
           44.4       81            Yes         50            Yes
           47.6       100           Yes         36            No
PS185U2    23.4       100           Yes         100           Yes
           23.4       100           Yes         95            Yes
           21.2       100           Yes         96            Yes
           21.2       100           Yes
PS31G1     8.3        70            Yes         39            Yes
           8.3        17            Yes         30            Yes
           3.6        29            Yes         30            Yes
           3.6        0             No
TABLE 11 Strains tested in diet-incorporation bioassays against H. virescens and H. zea
            H. virescens                       H. zea
Strain      Total protein     % mortality     Total protein     % mortality
            (μg/ml diet)                      (μg/ml diet)
PS11B       NA(1)             45              268               96
PS185U2     55                100             55                100
PS31G1      0                 50              43.4              13
PS86BB1     23.3              100             23.3              100
PS86V1      17                100             17                92
PS86W1      18                100             18                83
PS89J3      13                100             13                81
HD129       NA                100             138.3             13
HD525       3                 96              171.7             0
HD573A      3                 96              78.3              21
(1) NA: protein information not available.
TABLE 12 Dose response of H. virescens in diet-incorporation bioassays using frozen spore-crystal preparations
MR #            LC50 (μg/ml)
1259            13,461
1259 trypsin    1,974
1260            12,688
1260 trypsin    0.260
1264            95.0
1264 trypsin    2823
EXAMPLE 13 Activity against Ostrinia nubilalis (European corn borer)
The isolates and toxins of the present invention can be used to control Ostrinia nubilalis, the European corn borer (ECB). Activity against ECB can be readily determined using, for example, standard artificial-diet-incorporation insect bioassay procedures with first-instar larvae. In a specific embodiment, trypsin-treated clones expressing the 31G1(a) gene were shown to have an LC50 value of 0.284 μg/ml.
EXAMPLE 14 Insertion of Toxin Genes Into Plants
One aspect of the present invention is the transformation of plants with genes encoding the insecticidal toxins. The transformed plants are resistant to attack by the target pests. Genes encoding pesticidal toxins, as described herein, can be inserted into plant cells using a variety of techniques known in the art. For example, a large number of cloning vectors are available that comprise a replication system in E. coli and a marker that permits selection of the transformed cells, in preparation for the insertion of foreign genes into higher plants. The vectors comprise, for example, the pBR322 series, the pUC series, the M13mp series, pACYC184, etc. Accordingly, the sequence encoding the B.t. toxin can be inserted into the vector at a suitable restriction site. The resulting plasmid is used for transformation into E. coli. The E. coli cells are cultivated in a suitable nutrient medium, then harvested and lysed. The plasmid is recovered. Sequence analysis, restriction analysis, electrophoresis, and other biochemical and molecular biological methods are generally carried out as methods of analysis. After each manipulation, the DNA sequence used can be excised and joined to the next DNA sequence. Each plasmid sequence can be cloned in the same or other plasmids.
Depending on the method of inserting the desired genes into the plant, other DNA sequences may be necessary. If, for example, the Ti or Ri plasmid is used for the transformation of the plant cell, then at least the right border, but often both the right and the left border, of the Ti or Ri plasmid T-DNA has to be joined as the flanking region of the genes to be inserted. The use of T-DNA for the transformation of plant cells has been intensively researched and sufficiently described in EP 120 516; Hoekema (1985) in: The Binary Plant Vector System, Offset-drukkerij Kanters B.V., Alblasserdam, Chapter 4; Fraley et al., Crit. Rev. Plant Sci. 4:1-46; and An et al. (1985) EMBO J. 4:277-287. Once the inserted DNA has been integrated into the genome, it is relatively stable there and, as a rule, does not come out again. It normally contains a selection marker that confers on the transformed plant cells resistance to a biocide or an antibiotic, such as kanamycin, G418, bleomycin, hygromycin, or chloramphenicol, among others. The individually employed marker should accordingly permit the selection of transformed cells rather than cells that do not contain the inserted DNA. A large number of techniques are available for inserting DNA into a plant host cell. These techniques include transformation with T-DNA using Agrobacterium tumefaciens or Agrobacterium rhizogenes as the transformation agent, fusion, injection, biolistics (microparticle bombardment), or electroporation, as well as other possible methods. If Agrobacteria are used for the transformation, the DNA to be inserted has to be cloned into special plasmids, namely either an intermediate vector or a binary vector. Intermediate vectors can be integrated into the Ti or Ri plasmid by homologous recombination owing to sequences that are homologous to sequences in the T-DNA. The Ti or Ri plasmid also comprises the vir region necessary for the transfer of the T-DNA. Intermediate vectors cannot replicate in Agrobacteria.
The intermediate vector can be transferred into Agrobacterium tumefaciens by means of a helper plasmid (conjugation). Binary vectors can replicate in both E. coli and Agrobacteria. They comprise a selection marker gene and a linker or polylinker framed by the right and left T-DNA border regions. They can be transformed directly into Agrobacteria (Holsters et al. [1978] Mol. Gen. Genet. 163:181-187). The Agrobacterium used as the host cell must comprise a plasmid carrying a vir region. The vir region is necessary for the transfer of the T-DNA into the plant cell. Additional T-DNA may be present. The bacterium so transformed is used for the transformation of plant cells. Plant explants can advantageously be cultivated with Agrobacterium tumefaciens or Agrobacterium rhizogenes for the transfer of the DNA into the plant cell. Whole plants can then be regenerated from the infected plant material (for example, pieces of leaf, segments of stalk, roots, but also protoplasts or suspension-cultivated cells) in a suitable medium, which may contain antibiotics or biocides for selection. The plants so obtained can then be tested for the presence of the inserted DNA. No special demands are made of the plasmids in the case of injection and electroporation. It is possible to use ordinary plasmids, such as, for example, pUC derivatives. The transformed cells grow inside the plants in the usual manner. They can form germ cells and transmit the transformed trait(s) to progeny plants. Such plants can be grown in the normal manner and crossed with plants that have the same transformed hereditary factors or other hereditary factors. The resulting hybrid individuals have the corresponding phenotypic properties. In a preferred embodiment of the present invention, plants are transformed with genes in which the codon usage has been optimized for plants. See, for example, U.S. Patent No. 5,380,831.
In addition, plants encoding a truncated toxin are advantageously used. The truncated toxin typically comprises about 55% to about 80% of the full-length toxin. Methods for creating synthetic B.t. genes for use in plants are known in the art.
It should be understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and the scope of the appended claims.
SEQUENCE LISTING
(1) GENERAL INFORMATION: (i) APPLICANT: Schnepf, H. Ernest; Wicker, Carol; Narva, Kenneth E.; Walz, Michelle; Stockhoff, Brian; Muller-Cohn, Judy
(ii) TITLE OF THE INVENTION: Toxins active against Ostrinia nubilalis
(iii) NUMBER OF SEQUENCES: 105
(iv) CORRESPONDENCE ADDRESS: (A) ADDRESSEE: Saliwanchik, Lloyd & Saliwanchik (B) STREET: 2421 N.W. 41st Street, Suite A-1 (C) CITY: Gainesville (D) STATE: Florida (E) COUNTRY: USA (F) ZIP CODE: 32606
(v) COMPUTER READABLE FORM: (A) MEDIUM TYPE: Floppy disk (B) COMPUTER: IBM PC compatible (C) OPERATING SYSTEM: PC-DOS/MS-DOS (D) SOFTWARE: PatentIn
(vi) CURRENT APPLICATION DATA: (A) APPLICATION NUMBER: US (B) FILING DATE: (C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA: (A) APPLICATION NUMBER: US 08/886,615 (B) FILING DATE: 1-JUL-1997 (C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA: (A) APPLICATION NUMBER: US 08/674,002 (B) FILING DATE: 1-JUL-1996 (C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION: (A) NAME: Sanders, Jay M. (B) REGISTRATION NUMBER: 39,355 (C) REFERENCE/DOCKET NUMBER: MA-701C2
(ix) TELECOMMUNICATION INFORMATION: (A) TELEPHONE: (352) 375-8100 (B) TELEFAX: (352) 372-5800
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 1: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 23 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 1:
CGTGGCTATA TCCTTCGTGT YAC 23
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 2:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 23 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 2:
ACRATRAATG TTCCTTCYGT TTC 23
(2) SEQUENCE INFORMATION: ID. OF SEQUENCE NO: 3: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 23 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 3:
GGATATGTMT TACGTGTAAC WGC 23
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 4:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 28 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 4:
CTACACTTTC TATRTTGAAT RYACCTTC 28
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 5:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 16 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 5:
CCAGWTTTAY AGGAGG 16
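Several of the short oligonucleotide sequences above contain IUPAC ambiguity codes (Y, R, W, M), so each entry stands for a small pool of concrete sequences. The following sketch, with hypothetical helper names, counts and enumerates that pool for SEQ ID NOS. 1, 2 and 5 as printed in this listing.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in oligo.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count


def expand(oligo):
    """List every concrete sequence represented by a degenerate oligo."""
    bases = oligo.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


oligos = {
    1: "CGTGGCTATA TCCTTCGTGT YAC",
    2: "ACRATRAATG TTCCTTCYGT TTC",
    5: "CCAGWTTTAY AGGAGG",
}

for seq_id, oligo in oligos.items():
    print(f"SEQ ID NO. {seq_id}: {degeneracy(oligo)} variant(s)")
```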
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 6:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 6:
GTAAACAAGC TCGCCACCGC 20
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 7:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 7:
Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Xaa Gln 1 5 10 15
Ile Ser Xaa Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 20 25 30
Arg Val Arg Ile Xaa Xaa Ala Ser Thr Thr Xaa Xaa Gln Phe His Thr 35 40 45
Ser Ile Xaa Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Xaa Thr Met 50 55 60
Ser Ser Gly Ser Asn Leu Gln Ser Gly Xaa Phe Arg Thr Val Gly Phe 65 70 75 80
Thr Thr Pro Xaa Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 85 90 95
Xaa His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 100 105 110
Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 115 120 125
Ala Xaa Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 8:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 413 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 8:
CCAGGATTTA YAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGKSCAGAT TTCAWCCTTA 60
AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCR CWACGCTTCT 120
ACYACAWATT TWCAATTCCA TACATCAATT GRCGGAAGAC CTATTAATCA GGGKAATTTT 180
TCASCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA KCTTTAGGAC TGTAGGTTTT 240
ACTACTCCGT KTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTKC TCATGTCTTC 300
AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360
GAGGCAGAAT ATGATTTAGA AAGAGCACMA AAGGCGGTGG CGAGCTTGTT TAC 413
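Each nucleotide row in this listing ends with a running position counter (60, 120, ...), which makes it possible to detect rows that were merged, truncated or mis-read during extraction. The following is a minimal sketch of such a check; the function name is hypothetical and the two sample rows are copied from SEQ ID NO. 8 above.

```python
def check_listing(rows):
    """Compare the running base count with the counter printed at the end of
    each row. Returns a list of (row_index, counted, printed) mismatches."""
    total = 0
    problems = []
    for index, row in enumerate(rows):
        *blocks, counter = row.split()
        total += sum(len(block) for block in blocks)
        if total != int(counter):
            problems.append((index, total, int(counter)))
    return problems


seq8_rows = [
    "CCAGGATTTA YAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGKSCAGAT TTCAWCCTTA 60",
    "AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCR CWACGCTTCT 120",
]

print(check_listing(seq8_rows))  # an empty list means the counters are consistent
```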
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 9:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 136 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein (xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 9:
Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Asp Gly Gly Xaa 1 5 10 15
Val Gly Thr Ile Arg Ala Asn Val Asn Ala Pro Leu Thr Gln Gln Tyr 20 25 30
Arg Ile Arg Leu Arg Tyr Ala Ser Thr Thr Ser Phe Val Val Asn Leu 35 40 45
Phe Val Asn Asn Ser Ala Ala Gly Phe Thr Leu Pro Ser Thr Met Ala 50 55 60
Gln Asn Gly Ser Leu Thr Xaa Glu Ser Phe Asn Thr Leu Glu Val Thr 65 70 75 80
His Xaa Ile Arg Phe Ser Gln Ser Asp Thr Thr Leu Arg Leu Asn Ile 85 90 95
Phe Pro Ser Ile Ser Gly Gln Xaa Val Tyr Val Asp Lys Xaa Glu Ile 100 105 110
Val Pro Xaa Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Asp Xaa 115 120 125
Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 10:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 410 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 10:
CCAGGWTTTA CAGGAGGGGA TATACTTCGA AGAACGGACG GTGGTRCAGT TGGAACGATT 60
AGAGCTAATG TTAATGCCCC ATTAACACAA CAATATCGTA TAAGATTACG CTATGCTTCG 120
ACAACAAGTT TTGTTGTTAA TTTATTTGTT AATAATAGTG CGGCTGGCTT TACTTTACCG 180
AGTACAATGG CTCAAAATGG TTCTTTAACA YRCGAGTCGT TTAATACCTT AGAGGTAACT 240
CATWCTATTA GATTTTCACA GTCAGATACT ACACTTAGGT TGAATATATT CCCGTCYATC 300
TCTGGTCAAG RAGTGTATGT AGATAAACWT GAAATCGTTC CAWTTAACCC GACACGAGAA 360
GCGGAAGAAG ATTTAGAAGA TSCAAAGAAA GCGGTGGCGA GCTTGTTTAC 410
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 11:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 11:
Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr 1 5 10 15
Phe Gly Thr Ile Arg Val Arg Xaa Thr Ala Pro Leu Thr Gln Arg Tyr 20 25 30
Arg Ile Arg Phe Arg Phe Ala Xaa Thr Thr Asn Leu Phe Ile Gly Ile 35 40 45
Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met 50 55 60
Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe 65 70 75 80
Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe 85 90 95
Ala Asn Ala Phe Ala Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu 100 105 110
Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala 115 120 125
Ala Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 12:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 413 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 12:
CCAGGTTTTA YAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60
AGGGTAAGGA YTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTYT 120
ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180
GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240
ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300
AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360
GAGGCGAAAG AGGATYTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TAC 413
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 13
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 135 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 13:
Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser Leu 1 5 10 15
Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr Arg 20 25 30
Ile Xaa Val Arg Tyr Ala Xaa Thr Thr Asn Ile Arg Leu Ser Val Asn 35 40 45
Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg Leu 50 55 60
Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn Thr 65 70 75 80
Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile Glu 85 90 95
Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe Ile 100 105 110
Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala Lys 115 120 125
Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 14:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 407 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 14
GGMTTTATAG GAGGAGCTCT ACTTCAAAGG ACTGACCATG GTTCGCTTGG AGTATTGAGG 60
GTCCAATTTC CACTTCACTT AAGACAACAA TATCGTATTA SAGTCCGTTA TGCTTYTACA 120
ACAAATATTC GATTGAGTGT GAATGGCAGT TTCGGTACTA TTTCTCAAAA TCTCCCTAGT 180
ACAATGAGAT TAGGAGAGGA TTTAAGATAC GGATCTTTTG CTATAAGAGA GTTTAATACT 240
TCTATTAGAC CCACTGCAAG TCCGGACCAA ATTCGATTGA CAATAGAACC ATCTTTTATT 300
AGACAAGAGG TCTATGTAGA TAGAATTGAG TTCATTCCAG TTAATCCGAC GCGAGAGGCG 360
AAAGAGGATC TAGAAGCAGC AAAAAAAGCG GTGGCGAGCT TGTTTAC 407
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 15:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 15:
Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Gly Gln 1 5 10 15
Ile Ser Thr Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 20 25 30
Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn Leu Gln Phe His Thr 35 40 45
Ser Ile Asp Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Ala Thr Met 50 55 60
Ser Ser Gly Ser Asn Leu Gln Ser Gly Ser Phe Arg Thr Val Gly Phe 65 70 75 80
Thr Thr Pro Phe Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 85 90 95
Ala His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 100 105 110
Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 115 120 125
Ala Gln Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 16:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 413 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 16:
CCAGGATTTA CAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGGCCAGAT TTCAACCTTA 60
AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCG CTACGCTTCT 120
ACCACAAATT TACAATTCCA TACATCAATT GACGGAAGAC CTATTAATCA GGGGAATTTT 180
TCAGCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA GCTTTAGGAC TGTAGGTTTT 240
ACTACTCCGT TTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTGC TCATGTCTTC 300
AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360
GAGGCAGAAT ATGATTTAGA AAGAGCGCAA AAGGCGGTGG CGAGCTTGTT TAC 413
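The protein entries in this listing can be cross-checked against their nucleotide entries by translation. The sketch below assumes the standard genetic code (whose codon assignments are shared by the bacterial code) and reports `Xaa` whenever an IUPAC ambiguity code leaves the amino acid undetermined; the helper names are hypothetical. The example input is the first 36 bases of SEQ ID NO. 16, which should reproduce the first 12 residues of SEQ ID NO. 15.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "***",
}
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(dna):
    """Translate DNA (frame 1) into three-letter residues.

    A codon containing ambiguity codes yields a residue only if every
    possible reading encodes the same amino acid; otherwise 'Xaa'.
    Trailing bases that do not fill a codon are ignored.
    """
    seq = dna.replace(" ", "").upper()
    residues = []
    for start in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[start:start + 3]
        options = {CODON_TABLE["".join(c)] for c in product(*(IUPAC[b] for b in codon))}
        residues.append(THREE_LETTER[options.pop()] if len(options) == 1 else "Xaa")
    return residues


seq16_start = "CCAGGATTTA CAGGAGGAGA TATTCTTCGA AGAACT"
print(" ".join(translate(seq16_start)))
```

The same function explains the `Xaa` positions in the consensus entries: for example, the codon `AYA` can read as Thr or Ile and is therefore reported as `Xaa`, whereas `ACY` is Thr under either reading.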
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 17 (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 136 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 17:
Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Asp Gly Gly Ala 1 5 10 15
Val Gly Thr Ile Arg Ala Asn Val Asn Ala Pro Leu Thr Gln Gln Tyr 20 25 30
Arg Ile Arg Leu Arg Tyr Ala Ser Thr Thr Ser Phe Val Val Asn Leu 35 40 45
Phe Val Asn Asn Ser Ala Ala Gly Phe Thr Leu Pro Ser Thr Met Ala 50 55 60
Gln Asn Gly Ser Leu Thr Tyr Glu Ser Phe Asn Thr Leu Glu Val Thr 65 70 75 80
His Thr Ile Arg Phe Ser Gln Ser Asp Thr Thr Leu Arg Leu Asn Ile 85 90 95
Phe Pro Ser Ile Ser Gly Gln Glu Val Tyr Val Asp Lys Leu Glu Ile 100 105 110
Val Pro Ile Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Asp Ala 115 120 125
Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 18:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 410 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 18:
CCAGGWTTTA YAGGAGGGGA TATACTTCGA AGAACGGACG GTGGTGCAGT TGGAACGATT 60
AGAGCTAATG TTAATGCCCC ATTAACACAA CAATATCGTA TAAGATTACG CTATGCTTCG 120
ACAACAAGTT TTGTTGTTAA TTTATTTGTT AATAATAGTG CGGCTGGCTT TACTTTACCG 180
AGTACAATGG CTCAAAATGG TTCTTTAACA TACGAGTCGT TTAATACCTT AGAGGTAACT 240
CATACTATTA GATTTTCACA GTCAGATACT ACACTTAGGT TGAATATATT CCCGTCTATC 300
TCTGGTCAAG AAGTGTATGT AGATAAACTT GAAATCGTTC CAATTAACCC GACACGAGAA 360
GCGGAAGAAG ATTTAGAAGA TGCAAAGAAA GCGGTGGCGA GCTTGTTTAC 410
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 19:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 19:
Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Gly Gln 1 5 10 15
Ile Ser Thr Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 20 25 30
Arg Val Arg Ile Arg Tyr Ala Xaa Thr Thr Asn Leu Gln Phe His Thr 35 40 45
Ser Ile Asp Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Ala Thr Met 50 55 60
Ser Ser Gly Ser Asn Leu Gln Ser Gly Ser Phe Arg Thr Val Gly Phe 65 70 75 80
Thr Thr Pro Phe Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 85 90 95
Ala His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 100 105 110
Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 115 120 125
Ala Gln Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 20:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 413 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 20:
CCAGGWTTTA YAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGGCCAGAT TTCAACCTTA 60
AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCG CTACGCTTYT 120
ACYACAAATT TACAATTCCA TACATCAATT GACGGAAGAC CTATTAATCA GGGKAATTTT 180
TCAGCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA GCTTTAGGAC TGTAGGTTTT 240
ACTACTCCGT TTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTGC TCATGTCTTC 300
AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360
GAGGCAGAAT ATGATTTAGA AAGAGCACAA AAGGCGGTGG CGAGCTTGTT TAC 413
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 21:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 106 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 21:
Phe Thr Gly Gly Asp Ile Leu Arg Arg Asn Thr Ile Gly Glu Phe Val 1 5 10 15
Ser Leu Gln Val Asn Ile Asn Ser Pro Ile Thr Gln Arg Tyr Arg Leu 20 25 30
Arg Phe Arg Tyr Ala Ser Ser Arg Asp Ala Arg Ile Thr Val Ala Ile 35 40 45
Gly Gly Gln Ile Arg Val Asp Met Thr Leu Glu Lys Thr Met Glu Ile 50 55 60
Gly Glu Ser Leu Thr Xaa Arg Thr Phe Ser Tyr Thr Asn Phe Ser Asn 65 70 75 80
Pro Phe Ser Phe Arg Ala Asn Pro Asp Ile Ile Arg Ile Ala Glu Glu 85 90 95
Leu Pro Ile Arg Gly Gly Glu Leu Val Tyr 100 105
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 22:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 318 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 22:
TTTACAGGAG GGGATATCCT TCGAAGAAAT ACCATTGGTG AGTTTGTGTC TTTACAAGTC 60
AATATTAACT CACCAATTAC CCAAAGATAC CGTTTAAGAT TTCGTTATGC TTCCAGTAGG 120
GATGCACGAA TTACTGTAGC GATAGGAGGA CAAATTAGAG TAGATATGAC CCTTGAAAAA 180
ACCATGGAAA TTGGGGAGAG CTTAACATYT AGAACATTTA GCTATACCAA TTTTAGTAAT 240
CCTTTTTCAT TTAGGGCTAA TCCAGATATA ATTAGAATAG CTGAAGAACT TCCTATTCGC 300
GGTGGCGAGC TTGTTTAC 318
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 23:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 96 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein (xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 23:
Ile Pro Leu Val Ser Leu Cys Leu Tyr Lys Ser Ile Leu Thr His Gln 1 5 10 15
Leu Pro Lys Asp Thr Val Xaa Xaa Phe Val Met Leu Pro Val Gly Met 20 25 30
His Glu Leu Leu Xaa Arg Xaa Glu Asp Lys Leu Glu Xaa Ile Xaa Pro 35 40 45
Leu Lys Lys Pro Trp Lys Leu Gly Arg Ala Xaa His Leu Glu His Leu 50 55 60
Ala Ile Pro Ile Leu Val Ile Leu Phe His Leu Gly Leu Ile Gln Ile 65 70 75 80
Xaa Leu Glu Xaa Leu Lys Asn Phe Leu Phe Ala Val Ala Ser Leu Phe 85 90 95
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 24:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 292 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 24:
AAATACCATT GGTGAGTTTG TGTCTTTACA AGTCAATATT AACTCACCAA TTACCCAAAG 60
ATACCGTTTA ARATTTCGTT ATGCTTCCAG TAGGGATGCA CGAATTACTG TAGCGATAGG 120
AGGACAAATT AGAGTAGATA TGACCCTTGA AAAAACCATG GAAATTGGGG AGAGCTTAAC 180
ATCTAGAACA TTTAGCTATA CCAATTTTAG TAATCCTTTT TCATTTAGGG CTAATCCAGA 240
TATAATTAGA ATAGCTGAAG AACTTCCTAT TCGCGGTGGC GAGCTTGTTT AC 292
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 25:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 108 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 25:
Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Asn Thr Ile Gly Glu 1 5 10 15
Phe Val Ser Leu Gln Val Asn Ile Asn Ser Pro Ile Thr Gln Arg Tyr 20 25 30
Arg Leu Arg Phe Arg Tyr Ala Ser Ser Arg Asp Ala Arg Ile Thr Val 35 40 45
Ala Ile Gly Gly Gln Ile Arg Val Xaa Met Thr Leu Glu Lys Thr Met 50 55 60
Glu Ile Gly Glu Ser Leu Thr Ser Arg Thr Phe Ser Tyr Thr Asn Phe 65 70 75 80
Ser Asn Pro Phe Ser Phe Arg Ala Asn Pro Asp Ile Ile Arg Ile Ala 85 90 95
Glu Glu Leu Pro Ile Arg Gly Gly Glu Leu Val Tyr 100 105
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 26: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 324 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 26:
CCAGGWTTTA YAGGAGGGGA TATCCTTCGA AGAAATACCA TTGGTGAGTT TGTGTCTTTA 60
CAAGTCAATA TTAACTCACC AATTACCCAA AGATACCGTT TAAGATTTCG TTATGCTTCC 120
AGTAGGGATG CACGAATTAC TGTAGCGATA GGAGGACAAA TTAGAGTAKA TATGACCCTT 180
GAAAAAACCA TGGAAATTGG GGAGAGCTTA ACATCTAGAA CATTTAGCTA TACCAATTTT 240
AGTAATCCTT TTTCATTTAG GGCTAATCCA GATATAATTA GAATAGCTGA AGAACTTCCT 300
ATTCGCGGTG GCGAGCTTGT TTAC 324
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 27:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 136 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 27:
Gly Phe Xaa Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly Phe 1 5 10 15
Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr Arg 20 25 30
Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val Thr 35 40 45
Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met Asn 50 55 60
Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe Thr 65 70 75 80
Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Xaa Ile 85 90 95
Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu Ile 100 105 110
Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Ala Ala 115 120 125
Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 28:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 411 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 28:
AGGATTTAYA GGAGGAGATG TAATCCGAAG AACAAATACT GGTGGATTCG GAGCAATAAG 60
GGTGTCGGTC ACTGGACCGC TAACACAACG ATATCGCATA AGGTTCCGTT ATGCTTCGAC 120
AATAGATTTT GATTTCTTTG TAACACGTGG AGGAACTACT ATAAATAATT TTAGATTTAC 180
ACGTACAATG AACAGGGGAC AGGAATCAAG ATATGAATCC TATCGTACTG TAGAGTTTAC 240
AACTCCTTTT AACTTTACAC AAAGTCAAGA TATAATTCGA ACAYCTATCC AGGGACTTAG 300
TGGAAATGGG GAAGTATACC TTGATAGAAT TGAAATCATC CCTGTAAATC CAACACGAGA 360
AGCGGAAGAR GATTTAGAAG CGGCGAAGAA AGCGGTGGCG AGCTTGTTTA C 411
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 29:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 136 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 29:
Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser 1 5 10 15
Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr 20 25 30
Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val 35 40 45
Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg 50 55 60
Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn 65 70 75 80
Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile 85 90 95
Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe 100 105 110
Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala 115 120 125
Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 30:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 410 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 30:
CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGSEXATTG 60
AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTK? BCZTCT 120
ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAA3C CCCT 180
AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGA £ S TTAAT 240
ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACC3CTCT * TTT 300
ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACSCCSiGAG
[The remainder of SEQ ID NO: 30, and the header and residues 1 to 64 of SEQ ID NO: 31, are illegible in the source.]
Asn Arg Gly Gln Glu Leu Thr Tyr Glu Ser Phe Val Thr Ser Glu Phe 65 70 75 80
Thr Thr Asn Gln Ser Asp Leu Pro Phe Thr Phe Thr Gln Ala Gln Glu 85 90 95
Asn Leu Thr Ile Leu Ala Glu Gly Val Ser Thr Gly Ser Glu Tyr Phe 100 105 110
Ile Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Glu 115 120 125
Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe 130 135 140
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 32:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 428 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 32:
CCAGGWTTTA YAGGAGGGGG TATACTCCGA AGAACAACTA ATGGCACATT TGGAACGTTA 60
AGAGTAACAG TTAATTCACC ATTAACACAA AGATATCGCG TAAGAGTTCG TTTTGCTTCA 120
TCAGGAAATT TCAGCATAAG GATACTGCGT GGAAATACCT CTATAGCTTA TCAAAGATTT 180
GGGAGTACAA TGAACAGAGG ACAGGAACTA ACTTACGAAT CATTTGTCAC AAGTGAGTTC 240
ACTACTAATC AGAGCGATCT GCCTTTTACA TTTACACAAG CTCAAGAAAA TTTAACAATC 300
CTTGCAGAAG GTGTTAGCAC CGGTAGTGAA TATTTTATAG ATAGAATTGA AATCATCCCT 360
GTGAACCCGG CACGAGAAGC AGAAGAGGAT TTAGAAGCRG CGAAGAAAGC GGTGGCGAGC 420
TTGTTTAC 428
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 33:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 136 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 33:
Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser 1 5 10 15
Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr 20 25 30
Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val 35 40 45
Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg 50 55 60
Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn 65 70 75 80
Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile 85 90 95
Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe 100 105 110
Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala 115 120 125
Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 34:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 410 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 34:
CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60
AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120
ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180
AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240
ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300
ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360
GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 35:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 35:
Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr 1 5 10 15
Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro Leu Thr Gln Arg Tyr 20 25 30
Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn Leu Phe Ile Gly Ile 35 40 45
Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met 50 55 60
Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe 65 70 75 80
Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe 85 90 95
Ala Asn Ala Phe Ala Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu 100 105 110
Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala 115 120 125
Ala Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 36:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 413 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 36:
CCAGGTTTTA CAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60
AGGGTAAGGA CTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTCT 120
ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180
GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240
ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300
AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360
GAGGCGAAAG AGGATCTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TAC 413
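The nucleotide and amino acid listings above can be cross-checked by translating a DNA listing in reading frame 1 with the standard genetic code. The following is a minimal Python sketch under that assumption. The function name `translate` is hypothetical, and reporting every codon that contains an IUPAC ambiguity code (for example W, Y, K or R) as Xaa is a deliberately conservative simplification, so a codon such as GGW is printed as Xaa even though both of its expansions encode Gly.

```python
# Illustrative sketch only (not part of the specification): translate a DNA
# listing in reading frame 1 and print three-letter residue names.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) starting at its first base."""
    seq = "".join(dna.split()).upper()
    residues = []
    # Step through complete codons; a trailing partial codon is ignored.
    for pos in range(0, len(seq) - 2, 3):
        one_letter = CODON_TABLE.get(seq[pos:pos + 3])
        # Assumption: any codon with a non-ACGT symbol is reported as Xaa.
        residues.append(THREE_LETTER[one_letter] if one_letter else "Xaa")
    return " ".join(residues)


# First 36 bases of the nucleotide listing of SEQ ID NO: 36.
print(translate("CCAGGTTTTA CAGGAGGGGA TATACTCCGA AGAACA"))
# Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr
```

The printed residues agree with the first twelve residues listed for SEQ ID NO: 35.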
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 37:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 37:
Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Gly Gln 1 5 10 15
Ile Ser Thr Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 20 25 30
Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn Leu Gln Phe His Thr 35 40 45
Ser Ile Asp Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Ala Thr Met 50 55 60
Ser Ser Gly Ser Asn Leu Gln Ser Gly Ser Phe Arg Thr Val Gly Phe 65 70 75 80
Thr Thr Pro Phe Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 85 90 95
Ala His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 100 105 110
Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 115 120 125
Ala Gln Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 38:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 413 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 38:
CCAGGWTTTA CAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGGCCAGAT TTCAACCTTA 60
AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCG CTACGCTTCT 120
ACCACAAATT TACAATTCCA TACATCAATT GACGGAAGAC CTATTAATCA GGGGAATTTT 180
TCAGCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA GCTTTAGGAC TGTAGGTTTT 240
ACTACTCCGT TTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTGC TCATGTCTTC 300
AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360
GAGGCAGAAT ATGATTTAGA AAGAGCACAR AAGGCGGTGG CGAGCTTGTT TAC 413
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 39:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 39:
Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr 1 5 10 15
Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro Leu Thr Gln Arg Tyr 20 25 30
Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn Leu Phe Ile Gly Ile 35 40 45
Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met 50 55 60
Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe 65 70 75 80
Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe 85 90 95
Ala Asn Ala Phe Ala Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu 100 105 110
Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala 115 120 125
Ala Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 40:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 413 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 40:
CCAGGTTTTA CAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60
AGGGTAAGGA CTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTCT 120
ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180
GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240
ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300
AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360
GAGGCGAAAG AGGATCTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TAC 413
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 41:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 41:
Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Asn Ala Gly Asn 1 5 10 15
Phe Gly Asp Met Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 20 25 30
Arg Val Arg Ile Arg Tyr Ala Ser Thr Ala Asn Leu Gln Phe His Thr 35 40 45
Ser Ile Asn Gly Arg Ala Ile Asn Gln Ala Asn Phe Pro Ala Thr Met 50 55 60
Asn Ser Gly Glu Asn Leu Gln Ser Gly Ser Phe Arg Val Ala Gly Phe 65 70 75 80
Thr Thr Pro Phe Thr Phe Ser Asp Ala Leu Ser Thr Phe Thr Ile Gly 85 90 95
Ala Phe Ser Phe Ser Ser Asn Asn Glu Val Tyr Ile Asp Arg Ile Glu 100 105 110
Phe Val Pro Ala Glu Val Thr Phe Ala Thr Glu Ser Asp Gln Asp Arg 115 120 125
Ala Gln Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 42:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 413 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 42:
CCAGGWTTTA CAGGAGGGGA TATCCTTCGA AGAACGAATG CTGGTAACTT TGGAGATATG 60
CGTGTAAACA TTACTGCACC ACTATCACAA AGATATCGCG TAAGGATTCG TTATGCTTCT 120
ACTGCAAATT TACAATTCCA TACATCAATT AACGGAAGAG CCATTAATCA GGCGAATTTC 180
CCAGCAACTA TGAACAGTGG GGAGAATTTA CAGTCCGGAA GCTTCAGGGT TGCAGGTTTT 240
ACTACTCCAT TTACCTTTTC AGATGCACTA AGCACATTCA CAATAGGTGC TTTTAGCTTC 300
TCTTCAAACA ACGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACATTT 360
GCAACAGAAT CTGATCAGGA TAGAGCACAA AAGGCGGTGG CGAGCTTGTT TAC 413
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 43:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 136 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 43:
Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser 1 5 10 15
Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr 20 25 30
Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val 35 40 45
Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg 50 55 60
Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn 65 70 75 80
Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile 85 90 95
Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe 100 105 110
Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Xaa Ala Ala 115 120 125
Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 44:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 410 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 44:
CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60
AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120
ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180
AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240
ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300
ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360
GCGAAAGAGG ATCTAKAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 45:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 45:
Gln Xaa Leu Ser Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly 1 5 10 15
Phe Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr 20 25 30
Arg Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val 35 40 45
Thr Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met 50 55 60
Asn Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe 65 70 75 80
Thr Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Ser 85 90 95
Ile Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu 100 105 110
Ile Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Ala 115 120 125
Ala Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 46:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 414 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 46:
CCAGGWTTTA tCAGGAGGAG ATGTAATCCG AAGAACAAAT ACTGGTGGAT TCGGAGCAAT 60
AAGGGTGTCG GTCACTGGAC CGCTAACACA ACGATATCGC ATAAGGTTCC GTTATGCTTC 120
GACAATAGAT TTTGATTTCT TTGTAACACG TGGAGGAACT ACTATAAATA ATTTTAGATT 180
TACACGTACA ATGAACAGGG GACAGGAATC AAGATATGAA TCCTATCGTA CTGTAGAGTT 240
TACAACTCCT TTTAACTTTA CACAAAGTCA AGATATAATT CGAACATCTA TCCAGGGACT 300
TAGTGGAAAT GGGGAAGTAT ACCTTGATAG AATTGAAATC ATCCCTGTAA ATCCAACACG 360
AGAAGCGGAA GARGATTTAG AAGCGGCGAA GAAAGCGGTG GCGAGCTTGT TTAC 414
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 47:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 142 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 47:
Pro Gly Phe Thr Gly Gly Gly Ile Leu Arg Arg Thr Thr Asn Gly Thr 1 5 10 15
Phe Gly Thr Leu Arg Val Thr Val Asn Ser Pro Leu Thr Gln Arg Tyr 20 25 30
Arg Val Arg Val Arg Phe Ala Ser Ser Gly Asn Phe Ser Ile Arg Ile 35 40 45
Leu Arg Gly Asn Thr Ser Ile Ala Tyr Gln Arg Phe Gly Ser Thr Met 50 55 60
Asn Arg Gly Gln Glu Leu Thr Tyr Glu Ser Phe Val Thr Ser Glu Phe 65 70 75 80
Thr Thr Asn Gln Ser Asp Leu Pro Phe Thr Phe Thr Gln Ala Gln Glu 85 90 95
Asn Leu Thr Ile Leu Ala Glu Gly Val Ser Thr Gly Ser Glu Tyr Phe 100 105 110
Ile Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Glu 115 120 125
Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe 130 135 140
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 48:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 428 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 48:
CCAGGWTTTA CAGGAGGGGG TATACTCCGA AGAACAACTA ATGGCACATT TGGAACGTTA 60
AGAGTAACAG TTAATTCACC ATTAACACAA AGATATCGCG TAAGAGTTCG TTTTGCTTCA 120
TCAGGAAATT TCAGCATAAG GATACTGCGT GGAAATACCT CTATAGCTTA TCAAAGATTT 180
GGGAGTACAA TGAACAGAGG ACAGGAACTA ACTTACGAAT CATTTGTCAC AAGTGAGTTC 240
ACTACTAATC AGAGCGATCT GCCTTTTACA TTTACACAAG CTCAAGAAAA TTTAACAATC 300
CTTGCAGAAG GTGTTAGCAC CGGTAGTGAA TATTTTATAG ATAGAATTGA AATCATCCCT 360
GTGAACCCGG CACGAGAAGC AGAAGAGGAT TTAGAAGCAG CGAAGAAAGC GGTGGCGAGC 420
TTGTTTAC 428
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 49:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 136 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 49:
Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser 1 5 10 15
Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr 20 25 30
Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val 35 40 45
Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg 50 55 60
Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn 65 70 75 80
Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile 85 90 95
Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe 100 105 110
Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala 115 120 125
Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 50:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 410 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 50:
CCAGGWTTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60
AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120
ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180
AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240
ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300
ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360
GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 51:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 51:
Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr 1 5 10 15
Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro Leu Thr Gln Arg Tyr 20 25 30
Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn Leu Phe Ile Gly Ile 35 40 45
Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met 50 55 60
Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe 65 70 75 80
Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe 85 90 95
Ala Asn Ala Phe Ala Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu 100 105 110
Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala 115 120 125
Ala Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 52:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 412 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 52:
CCAGGTTTTA CAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60
AGGGTAAGGA CTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTCT 120
ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180
GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240
ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300
AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360
GAGGCGAAAG AGGATCTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TA 412
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 53:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 53:
Pro Gly Phe Thr Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly 1 5 10 15
Phe Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr 20 25 30
Arg Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val 35 40 45
Thr Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met 50 55 60
Asn Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe 65 70 75 80
Thr Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Ser 85 90 95
Ile Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu 100 105 110
Ile Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Xaa Glu Ala 115 120 125
Ala Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 54:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 413 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 54:
CCAGGATTTA CAGGAGGAGA TGTAATCCGA AGAACAAATA CTGGTGGATT CGGAGCAATA 60
AGGGTGTCGG TCACTGGACC GCTAACACAA CGATATCGCA TAAGGTTCCG TTATGCTTCG 120
ACAATAGATT TTGATTTCTT TGTAACACGT GGAGGAACTA CTATAAATAA TTTTAGATTT 180
ACACGTACAA TGAACAGGGG ACAGGAATCA AGATATGAAT CCTATCGTAC TGTAGAGTTT 240
ACAACTCCTT TTAACTTTAC ACAAAGTCAA GATATAATTC GAACATCTAT CCAGGGACTT 300
AGTGGAAATG GGGAAGTATA CCTTGATAGA ATTGAAATCA TCCCTGTAAA TCCAACACGA 360
GAAGCGGAAG AGGATTTWGA AGCGGCGAAG AAAGCGGTGG CGAGCTTGTT TAC 413
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 55:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 136 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 55:
Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser 1 5 10 15
Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr 20 25 30
Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val 35 40 45
Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg 50 55 60
Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn 65 70 75 80
Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile 85 90 95
Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe 100 105 110
Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Xaa Asp Leu Xaa Ala Ala 115 120 125
Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 56:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 410 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 56:
CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60
AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120
ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180
AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240
ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300
ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360
GCGAAAGAKG ATCTABAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 57:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 137 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 57:
Pro Gly Phe Thr Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly 1 5 10 15
Phe Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr 20 25 30
Arg Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val 35 40 45
Thr Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met 50 55 60
Asn Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe 65 70 75 80
Thr Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Ser 85 90 95
Ile Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu 100 105 110
Ile Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Ala 115 120 125
Ala Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 58:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 413 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 58:
CCAGGWTTTA CAGGAGGAGA TGTAATCCGA AGAACAAATA CTGGTGGATT CGGAGCAATA 60
AGGGTGTCGG TCACTGGACC GCTAACACAA CGATATCGCA TAAGGTTCCG TTATGCTTCG 120
ACAATAGATT TTGATTTCTT TGTAACACGT GGAGGAACTA CTATAAATAA TTTTAGATTT 180
ACACGTACAA TGAACAGGGG ACAGGAATCA AGATATGAAT CCTATCGTAC TGTAGAGTTT 240
ACAACTCCTT TTAACTTTAC ACAAAGTCAA GATATAATTC GAACATCTAT CCAGGGACTT 300
AGTGGAAATG GGGAAGTATA CCTTGATAGA ATTGAAATCA TCCCTGTAAA TCCAACACGA 360
GAAGCGGAAG AGGATTTAGA AGCGGCGAAG AAAGCGGTGG CGAGCTTGTT TAC 413
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 59:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 142 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 59:
Pro Gly Phe Xaa Gly Gly Gly Ile Leu Arg Arg Thr Thr Asn Gly Thr 1 5 10 15
Phe Gly Thr Leu Arg Val Thr Val Asn Ser Pro Leu Thr Gln Arg Tyr 20 25 30
Arg Val Arg Val Arg Phe Ala Ser Ser Gly Asn Phe Ser Ile Arg Ile 35 40 45
Leu Arg Gly Asn Thr Ser Ile Ala Tyr Gln Arg Phe Gly Ser Thr Met 50 55 60
Asn Arg Gly Gln Glu Leu Thr Tyr Glu Ser Phe Val Thr Ser Glu Phe 65 70 75 80
Thr Thr Asn Gln Ser Asp Leu Pro Phe Thr Phe Thr Gln Ala Gln Glu 85 90 95
Asn Leu Thr Ile Leu Ala Glu Gly Val Ser Thr Gly Ser Glu Tyr Phe 100 105 110
Ile Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Glu 115 120 125
Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe 130 135 140
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 60:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 428 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 60:
CCAGGWTTTA YAGGAGGGGG TATACTCCGA AGAACAACTA ATGGCACATT TGGAACGTTA 60
AGAGTAACAG TTAATTCACC ATTAACACAA AGATATCGCG TAAGAGTTCG TTTTGCTTCA 120
TCAGGAAATT TCAGCATAAG GATACTGCGT GGAAATACCT CTATAGCTTA TCAAAGATTT 180
GGGAGTACAA TGAACAGAGG ACAGGAACTA ACTTACGAAT CATTTGTCAC AAGTGAGTTC 240
ACTACTAATC AGAGCGATCT GCCTTTTACA TTTACACAAG CTCAAGAAAA TTTAACAATC 300
CTTGCAGAAG GTGTTAGCAC CGGTAGTGAA TATTTTATAG ATAGAATTGA AATCATCCCT 360
GTGAACCCGG CACGAGAAGC AGAAGAGGAT TTAGAAGCAG CGAAGAAAGC GGTGGCGAGC 420
TTGTTTAC 428
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 61:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 136 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 61:
Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser 1 5 10 15
Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr 20 25 30
Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val 35 40 45
Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg 50 55 60
Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn 65 70 75 80
Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile 85 90 95
Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe 100 105 110
Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala 115 120 125
Lys Lys Ala Val Ala Ser Leu Phe 130 135
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 62:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 410 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 62:
CCAGGTTTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60
AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120
ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180
AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240
ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300
ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360
GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 63:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 23 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 63:
GTTCATTGGT ATAAGAGTTG GTG 23
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 64:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 25 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 64:
CCACTGCAAG TCCGGACCAA ATTCG 25
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 65:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 24 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 65:
GAATATATTC CCGTCYATCT CTGG 24
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 66:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 23 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 66:
GCACGAATTA CTGTAGCGAT AGG 23
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 67:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 25 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 67:
GCTGGTAACT TTGGAGATAT GCGTG 25
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 68:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 23 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 68:
GATTTCTTTG TAACACGTGG AGG 23
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 69:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 22 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 69:
CACTACTAAT CAGAGCGATC TG 22
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 70:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1156 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 70:
Met Asn Gln Asn Lys His Gly Ile Ile Gly Ala Ser Asn Cys Gly Cys 1 5 10 15
Ala Ser Asp Asp Val Ala Lys Tyr Pro Leu Ala Asn Asn Pro Tyr Ser 20 25 30
Ser Ala Leu Asn Leu Asn Ser Cys Gln Asn Ser Ser Ile Leu Asn Trp 35 40 45
Ile Asn Ile Ile Gly Asp Ala Ala Lys Glu Ala Val Ser Ile Gly Thr 50 55 60
Thr Ile Val Ser Leu Ile Thr Ala Pro Ser Leu Thr Gly Leu Ile Ser 65 70 75 80
Ile Val Tyr Asp Leu Ile Gly Lys Val Leu Gly Gly Ser Gly Gln 85 90 95
Ser Ile Ser Asp Leu Ser Ile Cys Asp Leu Leu Ser Ile Ile Asp Leu 100 105 110
Arg Val Ser Gln Ser Val Leu Asn Asp Gly Ile Ala Asp Phe Asn Gly 115 120 125
Ser Val Leu Leu Tyr Arg Asn Tyr Leu Glu Ala Leu Asp Ser Trp Asn 130 135 140
Lys Asn Pro Asn Ser Ala Ser Ala Glu Glu Leu Arg Thr Arg Phe Arg 145 150 155 160
Ile Ala Asp Ser Glu Phe Asp Arg Ile Leu Thr Arg Gly Ser Leu Thr 165 170 175
Asn Gly Gly Ser Leu Ala Arg Gln Asn Ala Gln Ile Leu Leu Leu Pro 180 185 190
Ser Phe Ala Ser Ala Ala Phe Phe His Leu Leu Leu Leu Arg Asp Ala 195 200 205
Thr Arg Tyr Gly Thr Asn Trp Gly Leu Tyr Asn Ala Thr Pro Phe Ile 210 215 220
Asn Tyr Gln Ser Lys Leu Val Glu Leu Ile Glu Leu Tyr Thr Asp Tyr 225 230 235 240
Cys Val His Trp Tyr Asn Arg Gly Phe Asn Glu Leu Arg Gln Arg Gly 245 250 255
Thr Ser Ala Thr Ala Trp Leu Glu Phe His Arg Tyr Arg Arg Glu Met 260 265 270
Thr Leu Met Val Leu Asp Ile Val Ala Ser Phe Ser Ser Leu Asp Ile 275 280 285
Thr Asn Tyr Pro Ile Glu Thr Asp Phe Gln Leu Ser Arg Val Ile Tyr 290 295 300
Thr Asp Pro Ile Gly Phe Val His Arg Ser Ser Leu Arg Gly Glu Ser 305 310 315 320
Trp Phe Ser Phe Val Asn Arg Ala Asn Phe Ser Asp Leu Glu Asn Ala 325 330 335
Ile Pro Asn Pro Arg Pro Ser Trp Phe Leu Asn Asn Met Ile Ile Ser 340 345 350
Thr Gly Ser Leu Thr Leu Pro Val Ser Pro Ser Thr Asp Arg Ala Arg 355 360 365
Val Trp Tyr Gly Ser Arg Asp Arg Ile Ser Pro Ala Asn Ser Gln Phe 370 375 380
Ile Thr Glu Leu Ile Ser Gly Gln His Thr Thr Ala Thr Gln Thr Ile 385 390 395 400
Leu Gly Arg Asn Ile Phe Arg Val Asp Ser Gln Ala Cys Asn Leu Asn 405 410 415
Asp Thr Thr Tyr Gly Val Asn Arg Ala Val Phe Tyr His Asp Ala Ser 420 425 430
Glu Gly Ser Gln Arg Ser Val Tyr Glu Gly Tyr Ile Arg Thr Thr Gly 435 440 445
Ile Asp Asn Pro Arg Val Gln Asn Ile Asn Thr Tyr Leu Pro Gly Glu 450 455 460
Asn Ser Asp Ile Pro Thr Pro Glu Asp Tyr Thr His Ile Leu Ser Thr 465 470 475 480
Thr Ile Asn Leu Thr Gly Gly Leu Arg Gln Val Ala Ser Asn Arg Arg 485 490 495
Ser Ser Leu Val Met Tyr Gly Trp Thr His Lys Ser Leu Ala Arg Asn 500 505 510
Asn Thr Ile Asn Pro Asp Arg Ile Thr Gln Ile Pro Leu Thr Lys Val 515 520 525
Asp Thr Arg Gly Thr Gly Val Ser Tyr Val Asn Asp Pro Gly Phe Ile 530 535 540
Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser Leu Gly Val Leu 545 550 555 560
Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr Arg Ile Arg Val 565 570 575
Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val Asn Gly Ser Phe 580 585 590
Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg Leu Gly Glu Asp 595 600 605
Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn Thr Ser Ile Arg 610 615 620
Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile Glu Pro Ser Phe 625 630 635 640
Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe Ile Pro Val Asn 645 650 655
Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala Lys Lys Ala Val 660 665 670
Ala Ser Leu Phe Thr Arg Thr Arg Asp Gly Leu Gln Val Asn Val Lys 675 680 685
Asp Tyr Gln Val Asp Gln Ala Ala Asn Leu Val Ser Cys Leu Ser Asp 690 695 700
Glu Gln Tyr Gly Tyr Asp Lys Lys Met Leu Leu Glu Ala Val Arg Ala 705 710 715 720
Ala Lys Arg Leu Ser Arg Glu Arg Asn Leu Leu Gln Asp Pro Asp Phe 725 730 735
Asn Thr Ile Asn Ser Thr Glu Glu Asn Gly Trp Lys Ala Ser Asn Gly 740 745 750
Val Thr Ile Ser Glu Gly Gly Pro Phe Tyr Lys Gly Arg Ala Ile Gln 755 760 765
Leu Ala Ser Ala Arg Glu Asn Tyr Pro Thr Tyr Ile Tyr Gln Lys Val 770 775 780
Asp Ala Ser Glu Leu Lys Pro Tyr Thr Arg Tyr Arg Leu Asp Gly Phe 785 790 795 800
Val Lys Ser Ser Gln Asp Leu Glu Ile Asp Leu Ile His His His Lys 805 810 815
Val His Leu Val Lys Asn Val Pro Asp Asn Leu Val Ser Asp Thr Tyr 820 825 830
Pro Asp Asp Ser Cys Ser Gly Ile Asn Arg Cys Gln Glu Gln Gln Met 835 840 845
Val Asn Ala Gln Leu Glu Thr Glu His His His Pro Met Asp Cys Cys 850 855 860
Glu Ala Ala Gln Thr His Glu Phe Ser Ser Tyr Ile Asp Thr Gly Asp 865 870 875 880
Leu Asn Ser Ser Val Asp Gln Gly Ile Trp Ala Ile Phe Lys Val Arg 885 890 895
Thr Thr Asp Gly Tyr Ala Thr Leu Gly Asn Leu Glu Leu Val Glu Val 900 905 910
Gly Pro Leu Ser Gly Glu Ser Leu Glu Arg Glu Gln Arg Asp Asn Thr 915 920 925
Lys Trp Ser Ala Glu Leu Gly Arg Lys Arg Ala Glu Thr Asp Arg Val 930 935 940
Tyr Gln Asp Ala Lys Gln Ser Ile Asn His Leu Phe Val Asp Tyr Gln 945 950 955 960
Asp Gln Gln Leu Asn Pro Glu Ile Gly Met Ala Asp Ile Met Asp Ala 965 970 975
Gln Asn Leu Val Ala Ser Ile Ser Asp Val Tyr Ser Asp Ala Val Leu 980 985 990
Gln Ile Pro Gly Ile Asn Tyr Glu Ile Tyr Thr Glu Leu Ser Asn Arg 995 1000 1005
Leu Gln Gln Ala Ser Tyr Leu Tyr Thr Ser Arg Asn Ala Val Gln Asn 1010 1015 1020
Gly Asp Phe Asn Asn Gly Leu Asp Ser Trp Asn Ala Thr Ala Gly Ala 1025 1030 1035 1040
Ser Val Gln Gln Asp Gly Asn Thr His Phe Leu Val Leu Ser His Trp 1045 1050 1055
Asp Ala Gln Val Ser Gln Gln Phe Arg Val Gln Pro Asn Cys Lys Tyr 1060 1065 1070
Val Leu Arg Val Thr Ala Glu Lys Val Gly Gly Gly Asp Gly Tyr Val 1075 1080 1085
Thr Ile Arg Asp Asp Ala His His Thr Glu Thr Leu Thr Phe Asn Ala 1090 1095 1100
Cys Asp Tyr Asp Ile Asn Gly Thr Tyr Val Thr Asp Asn Thr Tyr Leu 1105 1110 1115 1120
Thr Lys Glu Val Val Phe His Pro Glu Thr Gln His Met Trp Val Glu 1125 1130 1135
Val Asn Glu Thr Glu Gly Ala Phe His Ile Asp Ser Ile Glu Phe Val 1140 1145 1150
Glu Thr Glu Lys 1155
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 71:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 3471 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 71:
ATGAATCAAA ATAAACACGG AATTATTGGC GCTTCCAATT GTGGTTGTGC ATCTGATGAT 60
GTTGCGAAAT ATCCTTTAGC CAACAATCCA TATTCATCTG CTTTAAATTT AAATTCTTGT 120
CAAAATAGTA GTATTCTCAA CTGGATTAAC ATAATAGGCG ATGCAGCAAA AGAAGCAGTA 180
TCTATTGGGA CAACCATAGT CTCTCTTATC ACAGCACCTT CTCTTACTGG ATTAATTTCA 240
ATAGTATATG ACCTTATAGG TAAAGTACTA GGAGGTAGTA GTGGACAATC CATATCAGAT 300
TTGTCTATAT GTGACTTATT ATCTATTATT GATTTACGGG TAAGTCAGAG TGTTTTAAAT 360
GATGGGATTG CAGATTTTAA TGGTTCTGTA CTCTTATACA GGAACTATTT AGAGGCTCTG 420
GATAGCTGGA ATAAGAATCC TAATTCTGCT TCTGCTGAAG AACTCCGTAC TCGTTTTAGA 480
ATCGCCGACT CAGAATTTGA TAGAATTTTA ACCCGAGGGT CTTTAACGAA TGGTGGCTCG 540
TTAGCTAGAC AAAATGCCCA AATATTATTA TTACCTTCTT TTGCGAGCGC TGCATTTTTC 600
CATTTATTAC TACTAAGGGA TGCTACTAGA TATGGCACTA ATTGGGGGCT ATACAATGCT 660
ACACCTTTTA TAAATTATCA ATCAAAACTA GTAGAGCTTA TTGAACTATA TACTGATTAT 720
TGCGTACATT GGTATAATCG AGGTTTCAAC GAACTAAGAC AACGAGGCAC TAGTGCTACA 780
GCTTGGTTAG AATTTCATAG ATATCGTAGA GAGATGACAT TGATGGTATT AGATATAGTA 840
GCATCATTTT CAAGTCTTGA TATTACTAAT TACCCAATAG AAACAGATTT TCAGTTGAGT 900
AGGGTCATTT ATACAGATCC AATTGGTTTT GTACATCGTA GTAGTCTTAG GGGAGAAAGT 960
TGGTTTAGCT TTGTTAATAG AGCTAATTTC TCAGATTTAG AAAATGCAAT ACCTAATCCT 1020
AGACCGTCTT GGTTTTTAAA TAATATGATT ATATCTACTG GTTCACTTAC ATTGCCGGTT 1080
AGCCCAAGTA CTGATAGAGC GAGGGTATGG TATGGAAGTC GAGATCGAAT TTCCCCTGCT 1140
AATTCACAAT TTATTACTGA ACTAATCTCT GGACAACATA CGACTGCTAC ACAAACTATT 1200
TTAGGGCGAA ATATATTTAG AGTAGATTCT CAAGCTTGTA ATTTAAATGA TACCACATAT 1260
GGAGTGAATA GGGCGGTATT TTATCATGAT GCGAGTGAAG GTTCTCAAAG ATCCGTGTAC 1320
GAGGGGTATA TTCGAACAAC TGGGATAGAT AACCCTAGAG TTCAAAATAT TAACACTTAT 1380
TTACCTGGAG AAAATTCAGA TATCCCAACT CCAGAAGACT ATACTCATAT ATTAAGCACA 1440
ACAATAAATT TAACAGGAGG ACTTAGACAA GTAGCATCTA ATCGCCGTTC ATCTTTAGTA 1500
ATGTATGGTT GGACACATAA AAGTCTGGCT CGTAACAATA CCATTAATCC AGATAGAATT 1560
ACACAGATAC CATTGACGAA GGTTGATACC CGAGGCACAG GTGTTTCTTA TGTGAATGAT 1620
CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 1680
AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 1740
ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 1800
AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 1860
ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 1920
ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 1980
GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC ACGCACAAGG 2040
GACGGATTAC AAGTAAATGT GAAAGATTAT CAAGTCGATC AAGCGGCAAA TTTAGTGTCA 2100
TGCTTATCAG ATGAACAATA TGGGTATGAC AAAAAGATGT TATTGGAAGC GGTACGTGCG 2160
GCAAAACGAC TTAGCCGAGA ACGCAACTTA CTTCAGGATC CAGATTTTAA TACAATCAAT 2220
AGTACAGAAG AAAATGGATG GAAAGCAAGT AACGGCGTTA CTATTAGTGA GGGCGGGCCA 2280
TTCTATAAAG GCCGTGCAAT TCAGCTAGCA AGTGCACGAG AAAATTACCC AACATACATC 2340
TATCAAAAAG TAGATGCATC GGAGTTAAAG CCGTATACAC GTTATAGACT GGATGGGTTC 2400
GTGAAGAGTA GTCAAGATTT AGAAATTGAT CTCATTCACC ATCATAAAGT CCATCTTGTG 2460
AAAAATGTAC CAGATAATTT AGTATCTGAT ACTTACCCAG ATGATTCTTG TAGTGGAATC 2520
AATCGATGTC AGGAACAACA GATGGTAAAT GCGCAACTGG AAACAGAGCA TCATCATCCG 2580
ATGGATTGCT GTGAAGCAGC TCAAACACAT GAGTTTTCTT CCTATATTGA TACAGGGGAT 2640
TTAAATTCGA GTGTAGACCA GGGAATCTGG GCGATCTTTA AAGTTCGAAC AACCGATGGT 2700
TATGCGACGT TAGGAAATCT TGAATTGGTA GAGGTCGGAC CGTTATCGGG TGAATCTTTA 2760
GAACGTGAAC AAAGGGATAA TACAAAATGG AGTGCAGAGC TAGGAAGAAA GCGTGCAGAA 2820
ACAGATCGCG TGTATCAAGA TGCCAAACAA TCCATCAATC ATTTATTTGT GGATTATCAA 2880
GATCAACAAT TAAATCCAGA AATAGGGATG GCAGATATTA TGGACGCTCA AAATCTTGTC 2940
GCATCAATTT CAGATGTATA TAGCGATGCC GTACTGCAAA TCCCTGGAAT TAACTATGAG 3000
ATTTACACAG AGCTGTCCAA TCGCTTACAA CAAGCATCGT ATCTGTATAC GTCTCGAAAT 3060
GCGGTGCAAA ATGGGGACTT TAACAACGGG CTAGATAGCT GGAATGCAAC AGCGGGTGCA 3120
TCGGTACAAC AGGATGGCAA TACGCATTTC TTAGTTCTTT CTCATTGGGA TGCACAAGTT 3180
TCTCAACAAT TTAGAGTGCA GCCGAATTGT AAATATGTAT TACGTGTAAC AGCAGAGAAA 3240
GTAGGCGGCG GAGACGGATA CGTGACTATC CGGGATGATG CTCATCATAC AGAAACGCTT 3300
ACATTTAATG CATGTGATTA TGATATAAAT GGCACGTACG TGACTGATAA TACGTATCTA 3360
ACAAAAGAAG TGGTATTCCA TCCGGAGACA CAACACATGT GGGTAGAGGT AAATGAAACA 3420
GAAGGTGCAT TTCATATAGA TAGTATTGAA TTCGTTGAAA CAGAAAAGTA 3471
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 72:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1156 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 72:
Met Asn Arg Asn Asn Gln Asn Glu Tyr Glu Ile Ile Asp Ala Pro His 1 5 10 15
Cys Gly Cys Pro Ser Asp Asp Asp Val Arg Tyr Pro Leu Ala Ser Asp 20 25 30
Pro Asn Ala Ala Leu Gln Asn Met Asn Tyr Lys Asp Tyr Leu Gln Met 35 40 45
Thr Asp Glu Asp Tyr Thr Asp Ser Tyr Ile Asn Pro Ser Leu Ser Ile 50 55 60
Ser Gly Arg Asp Ala Val Gln Thr Ala Leu Thr Val Val Gly Arg Ile 65 70 75 80
Leu Gly Ala Leu Gly Val Pro Phe Ser Gly Gln Ile Val Ser Phe Tyr 85 90 95
Gln Phe Leu Leu Asn Thr Leu Trp Pro Val Asn Asp Thr Ala Ile Trp 100 105 110
Glu Ala Phe Met Arg Gln Val Glu Glu Leu Val Asn Gln Gln Ile Thr 115 120 125
Glu Phe Ala Arg Asn Gln Ala Leu Ala Arg Leu Gln Gly Leu Gly Asp 130 135 140
Ser Phe Asn Val Tyr Gln Arg Ser Leu Gln Asn Trp Leu Ala Asp Arg 145 150 155 160
Asn Asp Thr Arg Asn Leu Ser Val Val Arg Ala Gln Phe Ile Ala Leu 165 170 175
Asp Leu Asp Phe Val Asn Ala Ile Pro Leu Phe Ala Val Asn Gly Gln 180 185 190
Gln Val Pro Leu Leu Ser Val Tyr Ala Gln Ala Val Asn Leu His Leu 195 200 205
Leu Leu Leu Lys Asp Ala Ser Leu Phe Gly Glu Gly Trp Gly Phe Thr 210 215 220
Gln Gly Glu Ile Ser Thr Tyr Tyr Asp Arg Gln Leu Glu Leu Thr Ala 225 230 235 240
Lys Tyr Thr Asn Tyr Cys Glu Thr Trp Tyr Asn Thr Gly Leu Asp Arg 245 250 255
Leu Arg Gly Thr Asn Thr Glu Ser Trp Leu Arg Tyr His Gln Phe Arg 260 265 270
Arg Glu Met Thr Leu Val Val Leu Asp Val Val Ala Leu Phe Pro Tyr 275 280 285
Tyr Asp Val Arg Leu Tyr Pro Thr Gly Ser Asn Pro Gln Leu Thr Arg 290 295 300
Glu Val Tyr Thr Asp Pro Ile Val Phe Asn Pro Pro Ala Asn Val Gly 305 310 315 320
Leu Cys Arg Arg Trp Gly Thr Asn Pro Tyr Asn Thr Phe Ser Glu Leu 325 330 335
Glu Asn Ala Phe Ile Arg Pro Pro His Leu Phe Asp Arg Leu Asn Ser 340 345 350
Leu Thr Ile Ser Ser Asn Arg Phe Pro Val Ser Ser Asn Phe Met Asp 355 360 365
Tyr Trp Ser Gly His Thr Leu Arg Arg Ser Tyr Leu Asn Asp Ser Ala 370 375 380
Val Gln Glu Asp Ser Tyr Gly Leu Ile Thr Thr Thr Arg Ala Thr Ile 385 390 395 400
Asn Pro Gly Val Asp Gly Thr Asn Arg Ile Glu Ser Thr Ala Val Asp 405 410 415
Phe Arg Ser Ala Leu Ile Gly Ile Tyr Gly Val Asn Arg Ala Ser Phe 420 425 430
Val Pro Gly Gly Leu Phe Asn Gly Thr Thr Ser Pro Ala Asn Gly Gly 435 440 445
Cys Arg Asp Leu Tyr Asp Thr Asn Asp Glu Leu Pro Pro Asp Glu Ser 450 455 460
Thr Gly Ser Ser Thr His Arg Leu Ser His Val Thr Phe Phe Ser Phe 465 470 475 480
Gln Thr Asn Gln Ala Gly Ser Ile Ala Asn Ala Gly Ser Val Pro Thr 485 490 495
Tyr Val Trp Thr Arg Arg Asp Val Asp Leu Asn Asn Thr Ile Thr Pro 500 505 510
Asn Arg Ile Thr Gln Leu Pro Leu Val Lys Ala Ser Ala Pro Val Ser 515 520 525
Gly Thr Thr Val Leu Lys Gly Pro Gly Phe Thr Gly Gly Gly Ile Leu 530 535 540
Arg Arg Thr Thr Asn Gly Thr Phe Gly Thr Leu Arg Val Thr Val Asn 545 550 555 560
Ser Pro Leu Thr Gln Arg Tyr Arg Val Arg Val Arg Phe Ala Ser Ser 565 570 575
Gly Asn Phe Ser Ile Arg Ile Leu Arg Gly Asn Thr Ser Ile Ala Tyr 580 585 590
Gln Arg Phe Gly Ser Thr Met Asn Arg Gly Gln Glu Leu Thr Tyr Glu 595 600 605
Ser Phe Val Thr Ser Glu Phe Thr Thr Asn Gln Ser Asp Leu Pro Phe 610 615 620
Thr Phe Thr Gln Ala Gln Glu Asn Leu Thr Ile Leu Ala Glu Gly Val 625 630 635 640
Ser Thr Gly Ser Glu Tyr Phe Ile Asp Arg Ile Glu Ile Ile Pro Val 645 650 655
Asn Pro Ala Arg Glu Ala Glu Glu Asp Leu Glu Ala Ala Lys Lys Ala 660 665 670
Val Ala Asn Leu Phe Thr Arg Thr Arg Asp Gly Leu Gln Val Asn Val 675 680 685
Thr Asp Tyr Gln Val Asp Gln Ala Ala Asn Leu Val Ser Cys Leu Ser 690 695 700
Asp Glu Gln Tyr Gly His Asp Lys Lys Met Leu Leu Glu Ala Val Arg 705 710 715 720
Ala Ala Lys Arg Leu Ser Arg Glu Arg Asn Leu Leu Gln Asp Pro Asp 725 730 735
Phe Asn Thr Ile Asn Ser Thr Glu Glu Asn Gly Trp Lys Ala Ser Asn 740 745 750
Gly Val Thr Ile Ser Glu Gly Gly Pro Phe Phe Lys Gly Arg Ala Leu 755 760 765
Gln Leu Ala Ser Ala Arg Glu Asn Tyr Pro Thr Tyr Ile Tyr Gln Lys 770 775 780
Val Asp Ala Ser Val Leu Lys Pro Tyr Thr Arg Tyr Arg Leu Asp Gly 785 790 795 800
Phe Val Lys Ser Ser Gln Asp Leu Glu Ile Asp Leu Ile His His His 805 810 815
Lys Val His Leu Val Lys Asn Val Pro Asp Asn Leu Val Ser Asp Thr 820 825 830
Tyr Ser Asp Gly Ser Cys Ser Gly Ile Asn Arg Cys Asp Glu Gln His 835 840 845
Gln Val Asp Met Gln Leu Asp Ala Glu His His Pro Met Asp Cys Cys 850 855 860
Glu Ala Ala Gln Thr His Glu Phe Ser Ser Tyr Ile Asn Thr Gly Asp 865 870 875 880
Leu Asn Ala Ser Val Asp Gln Gly Ile Trp Val Val Leu Lys Val Arg 885 890 895
Thr Thr Asp Gly Tyr Ala Thr Leu Gly Asn Leu Glu Leu Val Glu Val 900 905 910
Gly Pro Leu Ser Gly Glu Ser Leu Glu Arg Glu Gln Arg Asp Asn Ala 915 920 925
Lys Trp Asn Ala Glu Leu Gly Arg Lys Arg Ala Glu Ile Asp Arg Val 930 935 940
Tyr Leu Ala Ala Lys Gln Ala Ile Asn His Leu Phe Val Asp Tyr Gln 945 950 955 960
Asp Gln Gln Leu Asn Pro Glu Ile Gly Leu Ala Glu Ile Asn Glu Ala 965 970 975
Ser Asn Leu Val Glu Ser Ile Ser Gly Val Tyr Ser Asp Thr Leu Leu 980 985 990
Gln Ile Pro Gly Ile Asn Tyr Glu Ile Tyr Thr Glu Leu Ser Asp Arg 995 1000 1005
Leu Gln Gln Ala Ser Tyr Leu Tyr Thr Ser Arg Asn Ala Val Gln Asn 1010 1015 1020
Gly Asp Phe Asn Ser Gly Leu Asp Ser Trp Asn Thr Thr Met Asp Ala 1025 1030 1035 1040
Ser Val Gln Gln Asp Gly Asn Met His Phe Leu Val Leu Ser His Trp 1045 1050 1055
Asp Ala Gln Val Ser Gln Gln Leu Arg Val Asn Pro Asn Cys Lys Tyr 1060 1065 1070
Val Leu Arg Val Thr Ala Arg Lys Val Gly Gly Gly Asp Gly Tyr Val 1075 1080 1085
Thr Ile Arg Asp Gly Ala His His Gln Glu Thr Leu Thr Phe Asn Ala 1090 1095 1100
Cys Asp Tyr Asp Val Asn Gly Thr Tyr Val Asn Asp Asn Ser Tyr Ile 1105 1110 1115 1120
Thr Glu Glu Val Val Phe Tyr Pro Glu Thr Lys His Met Trp Val Glu 1125 1130 1135
Val Ser Glu Ser Glu Gly Ser Phe Tyr Ile Asp Ser Ile Glu Phe Ile 1140 1145 1150
Glu Thr Gln Glu 1155
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 73:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 3471 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 73:
ATGAATCGAA ATAATCAAAA TGAATATGAA ATTATTGATG CCCCCCATTG TGGGTGTCCA 60
TCAGATGACG ATGTGAGGTA TCCTTTGGCA AGTGACCCAA ATGCAGCGTT ACAAAATATG 120
AACTATAAAG ATTACTTACA AATGACAGAT GAGGACTACA CTGATTCTTA TATAAATCCT 180
AGTTTATCTA TTAGTGGTAG AGATGCAGTT CAGACTGCGC TTACTGTTGT TGGGAGAATA 240
CTCGGGGCTT TAGGTGTTCC GTTTTCTGGA CAAATAGTGA GTTTTTATCA ATTCCTTTTA 300
AATACACTGT GGCCAGTTAA TGATACAGCT ATATGGGAAG CTTTCATGCG ACAGGTGGAG 360
GAACTTGTCA ATCAACAAAT AACAGAATTT GCAAGAAATC AGGCACTTGC AAGATTGCAA 420
GGATTAGGAG ACTCTTTTAA TGTATATCAA CGTTCCCTTC AAAATTGGTT GGCTGATCGA 480
AATGATACAC GAAATTTAAG TGTTGTTCGT GCTCAATTTA TAGCTTTAGA CCTTGATTTT 540
GTTAATGCTA TTCCATTGTT TGCAGTAAAT GGACAGCAGG TTCCATTACT GTCAGTATAT 600
GCACAAGCTG TGAATTTACA TTTGTTATTA TTAAAAGATG CATCTCTTTT TGGAGAAGGA 660
TGGGGATTCA CACAGGGGGA AATTTCCACA TATTATGACC GTCAATTGGA ACTAACCGCT 720
AAGTACACTA ATTACTGTGA AACTTGGTAT AATACAGGTT TAGATCGTTT AAGAGGAACA 780
AATACTGAAA GTTGGTTAAG ATATCATCAA TTCCGTAGAG AAATGACTTT AGTGGTATTA 840
GATGTTGTGG CGCTATTTCC ATATTATGAT GTACGACTTT ATCCAACGGG ATCAAACCCA 900
CAGCTTACAC GTGAGGTATA TACAGATCCG ATTGTATTTA ATCCACCAGC TAATGTTGGA 960
CTTTGCCGAC GTTGGGGTAC TAATCCCTAT AATACTTTTT CTGAGCTCGA AAATGCCTTC 1020
ATTCGCCCAC CACATCTTTT TGATAGGCTG AATAGCTTAA CAATCAGCAG TAATCGATTT 1080
CCAGTTTCAT CTAATTTTAT GGATTATTGG TCAGGACATA CGTTACGCCG TAGTTATCTO 1140
AACGATTCAG CAGTACAAGA AGATAGTTAT GGCCTAATTA CAACCACAAG AGCAACAATT 1200
AATCCTGGAG TTGATGGAAC AAACCGCATA GAGTCAACGG CAGTAGATTT TCGTTCTGCA 1260
TTGATAGGTA TATATGGCGT GAATAGAGCT TCTTTTGTCC CAGGAGGCTT GTTTAATGGT 1320
ACGACTTCTC CTGCTAATGG AGGATGTAGA GATCTCTATG ATACAAATGA TGAATTACCA 1380
CCAGATGAAA GTACCGGAAG TTCTACCCAT AGACTATCTC ATGTTACCTT TTTTAGTTTT 1440
CAAACTAATC AGGCTGGATC TATAGCTAAT GCAGGAAGTG TACCTACTTA TGTTTGGACC 1500
CGTCGTGATG TGGACCTTAA TAATACGATT ACCCCAAATA GAATTACACA ATTACCATTG 1560
GTAAAGGCAT CTGCACCTGT TTCGGGTACT ACGGTCTTAA AAGGTCCAGG ATTTACAGGA 1620
GGGGGTATAC TCCGAAGAAC AACTAATGGC ACATTTGGAA CGTTAAGAGT AACAGTTAAT 1680
TCACCATTAA CACAAAGATA TCGCGTAAGA GTTCGTTTTG CTTCATCAGG AAATTTCAGC 1740
ATAAGGATAC TGCGTGGAAA TACCTCTATA GCTTATCAAA GATTTGGGAG TACAATGAAC 1800
AGAGGACAGG AACTAACTTA CGAATCATTT GTCACAAGTG AGTTCACTAC TAATCAGAGC 1860
GATCTGCCTT TTACATTTAC ACAAGCTCAA GAAAATTTAA CAATCCTTGC AGAAGGTGTT 1920
AGCACCGGTA GTGAATATTT TATAGATAGA ATTGAAATCA TCCCTGTGAA CCCGGCACGA 1980
GAAGCAGAAG AGGATTTAGA AGCAGCGAAG AAAGCGGTGG CGAACTTGTT TACACGTACA 2040
AGGGACGGAT TACAGGTAAA TGTGACAGAT TATCAAGTGG ACCAAGCGGC AAATTTAGTG 2100
TCATGCTTAT CCGATGAACA ATATGGGCAT GACAAAAAGA TGTTATTGGA AGCGGTAAGA 2160
GCGGCAAAAC GCCTCAGCCG CGAACGCAAC TTACTTCAAG ATCCAGATTT TAATACAATC 2220
AATAGTACAG AAGAGAATGG CTGGAAGGCA AGTAACGGTG TTACTATTAG CGAGGGCGGT 2280
CCATTCTTTA AAGGTCGTGC ACTTCAGTTA GCAAGCGCAA GAGAAAATTA TCCAACATAC 2340
ATTTATCAAA AAGTAGATGC ATCGGTGTTA AAGCCTTATA CACGCTATAG ACTAGATGGA 2400
TTTGTGAAGA GTAGTCAAGA TTTAGAAATT GATCTCATCC ACCATCATAA AGTCCATCTT 2460
GTAAAAAATG TACCAGATAA TTTAGTATCT GATACTTACT CAGATGGTTC TTGCAGCGGA 2520
ATCAACCGTT GTGATGAACA GCATCAGGTA GATATGCAGC TAGATGCGGA GCATCATCCA 2580
ATGGATTGCT GTGAAGCGGC TCAAACACAT GAGTTTTCTT CCTATATTAA TACAGGGGAT 2640
CTAAATGCAA GTGTAGATCA GGGCATTTGG GTTGTATTAA AAGTTCGAAC AACAGATGGG 2700
TATGCGACGT TAGGAAATCT TGAATTGGTA GAGGTTGGGC CATTATCGGG TGAATCTCTA 2760
GAACGGGAAC AAAGAGATAA TGCGAAATGG AATGCAGAGC TAGGAAGAAA ACGTGCAGAA 2820
ATAGATCGTG TGTATTTAGC TGCGAAACAA GCAATTAATC ATCTGTTTGT AGACTATCAA 2880
GATCAACAAT TAAATCCAGA AATTGGGCTA GCAGAAATTA ATGAAGCTTC AAATCTTGTA 2940
GAGTCAATTT CGGGTGTATA TAGTGATACA CTATTACAGA TTCCTGGGAT TAACTACGAA 3000
ATTTACACAG AGTTATCCGA TCGCTTACAA CAAGCATCGT ATCTGTATAC GTCTAGAAAT 3060
GCGGTGCAAA ATGGAGACTT TAACAGTGGT CTAGATAGTT GGAATACAAC TATGGATGCA 3120
TCGGTTCAGC AAGATGGCAA TATGCATTTC TTAGTTCTTT CGCATTGGGA TGCACAAGTT 3180
TCCCAACAAT TGAGAGTAAA TCCGAATTGT AAGTATGTCT TACGTGTGAC AGCAAGAAAA 3240
GTAGGAGGCG GAGATGGATA CGTCACAATC CGAGATGGCG CTCATCACCA AGAAACTCTT 3300
ACATTTAATG CATGTGACTA CGATGTAAAT GGTACGTATG TCAATGACAA TTCGTATATA 3360
ACAGAAGAAG TGGTATTCTA CCCAGAGACA AAACATATGT GGGTAGAGGT GAGTGAATCC 3420
GAAGGTTCAT TCTATATAGA CAGTATTGAG TTTATTGAAA CACAAGAGTA G 3471
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 74:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1150 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 74:
Met Asn Arg Asn Asn Pro Asn Glu Tyr Glu Ile Ile Asp Ala Pro Tyr 1 5 10 15
Cys Gly Cys Pro Ser Asp Asp Asp Val Arg Tyr Pro Leu Ala Ser Asp 20 25 30
Pro Asn Ala Ala Phe Gln Asn Met Asn Tyr Lys Glu Tyr Leu Gln Thr 35 40 45
Tyr Asp Gly Asp Tyr Thr Gly Ser Leu Ile Asn Pro Asn Leu Ser Ile 50 55 60
Asn Pro Arg Asp Val Leu Gln Thr Gly Ile Asn Ile Val Gly Arg Ile 65 70 75 80
Leu Gly Phe Leu Gly Val Pro Phe Ala Gly Gln Leu Val Thr Phe Tyr 85 90 95
Thr Phe Leu Leu Asn Gln Leu Trp Pro Thr Asn Asp Asn Ala Val Trp 100 105 110
Glu Ala Phe Met Ala Gln Ile Glu Glu Leu Ile Asp Gln Lys Ile Ser 115 120 125
Ala Gln Val Val Arg Asn Ala Leu Asp Asp Leu Thr Gly Leu His Asp 130 135 140
Tyr Tyr Glu Glu Tyr Leu Ala Ala Leu Glu Glu Trp Leu Glu Arg Pro 145 150 155 160
Asn Gly Ala Arg Ala Asn Leu Val Thr Gln Arg Phe Glu Asn Leu His 165 170 175
Thr Ala Phe Val Thr Arg Met Pro Ser Phe Gly Thr Gly Pro Gly Ser 180 185 190
Gln Arg Asp Ala Val Ala Leu Leu Thr Val Tyr Ala Gln Ala Ala Asn 195 200 205
Leu His Leu Leu Leu Leu Lys Asp Ala Glu Ile Tyr Gly Ala Arg Trp 210 215 220
Gly Leu Gln Gln Gly Gln Ile Asn Leu Tyr Phe Asn Ala Gln Gln Glu 225 230 235 240
Arg Thr Arg Ile Tyr Thr Asn His Cys Val Glu Thr Tyr Asn Arg Gly 245 250 255
Leu Glu Asp Val Arg Gly Thr Asn Thr Glu Ser Trp Leu Asn Tyr His 260 265 270
Arg Phe Arg Arg Glu Met Thr Leu Met Ala Met Asp Leu Val Ala Leu 275 280 285
Phe Pro Phe Tyr Asn Val Arg Gln Tyr Pro Asn Gly Ala Asn Pro Gln 290 295 300
Leu Thr Arg Glu Ile Tyr Thr Asp Pro Ile Val Tyr Asn Pro Pro Ala 305 310 315 320
Asn Gln Gly Ile Cys Arg Arg Trp Gly Asn Asn Pro Tyr Asn Thr Phe 325 330 335
Ser Glu Leu Glu Asn Ala Phe Ile Arg Pro Pro His Leu Phe Glu Arg 340 345 350
Leu Asn Arg Leu Thr Ile Ser Arg Asn Arg Tyr Thr Ala Pro Thr Thr 355 360 365
Asn Ser Phe Leu Asp Tyr Trp Ser Gly His Thr Leu Gln Ser Gln His 370 375 380
Ala Asn Asn Pro Thr Thr Tyr Glu Thr Ser Tyr Gly Gln Ile Thr Ser 385 390 395 400
Asn Thr Arg Leu Phe Asn Thr Thr Asn Gly Ala Arg Ala Ile Asp Ser 405 410 415
Arg Ala Arg Asn Phe Gly Asn Leu Tyr Ala Asn Leu Tyr Gly Val Ser 420 425 430
Ser Leu Asn Ile Phe Pro Thr Gly Val Met Ser Glu Ile Thr Asn Ala 435 440 445
Ala Asn Thr Cys Arg Gln Asp Leu Thr Thr Thr Glu Glu Leu Pro Leu 450 455 460
Glu Asn Asn Asn Phe Asn Leu Leu Ser His Val Thr Phe Leu Arg Phe 465 470 475 480
Asn Thr Thr Gln Gly Gly Pro Leu Ala Thr Leu Gly Phe Val Pro Thr 485 490 495
Tyr Val Trp Thr Arg Glu Asp Val Asp Phe Thr Asn Thr Ile Thr Ala 500 505 510
Asp Arg Ile Thr Gln Leu Pro Trp Val Lys Ala Ser Glu Ile Gly Gly 515 520 525
Gly Thr Thr Val Val Lys Gly Pro Gly Phe Thr Gly Gly Asp Ile Leu 530 535 540
Arg Arg Thr Asp Gly Gly Ala Val Gly Thr Ile Arg Ala Asn Val Asn 545 550 555 560
Ala Pro Leu Thr Gln Gln Tyr Arg Ile Arg Leu Arg Tyr Ala Ser Thr 565 570 575
Thr Ser Phe Val Val Asn Leu Phe Val Asn Asn Ser Ala Ala Gly Phe 580 585 590
Thr Leu Pro Ser Thr Met Ala Gln Asn Gly Ser Leu Thr Tyr Glu Ser 595 600 605
Phe Asn Thr Leu Glu Val Thr His Thr Ile Arg Phe Ser Gln Ser Asp 610 615 620
Thr Thr Leu Arg Leu Asn Ile Phe Pro Ser Ile Ser Gly Gln Glu Val 625 630 635 640
Tyr Val Asp Lys Leu Glu Ile Val Pro Ile Asn Pro Thr Arg Glu Ala 645 650 655
Glu Glu Asp Leu Glu Asp Ala Lys Lys Ala Val Ala Ser Leu Phe Thr 660 665 670
Arg Thr Arg Asp Gly Leu Gln Val Asn Val Thr Asp Tyr Gln Val Asp 675 680 685
Gln Ala Ala Asn Leu Val Ser Cys Leu Ser Asp Glu Gln Tyr Gly His 690 695 700
Asp Lys Lys Met Leu Leu Glu Ala Val Arg Ala Ala Lys Arg Leu Ser 705 710 715 720
Arg Glu Arg Asn Leu Leu Gln Asp Pro Asp Phe Asn Glu Ile Asn Ser 725 730 735
Thr Glu Glu Asn Gly Trp Lys Ala Ser Asn Gly Val Thr Ile Ser Glu 740 745 750
Gly Gly Pro Phe Phe Lys Gly Arg Ala Leu Gln Leu Ala Ser Ala Arg 755 760 765
Glu Asn Tyr Pro Thr Tyr Ile Tyr Gln Lys Val Asp Ala Ser Thr Leu 770 775 780
Lys Pro Tyr Thr Arg Tyr Lys Leu Asp Gly Phe Val Gln Ser Ser Gln 785 790 795 800
Asp Leu Glu Ile Asp Leu Ile His His His Lys Val His Leu Val Lys 805 810 815
Asn Val Pro Asp Asn Leu Val Ser Asp Thr Tyr Ser Asp Gly Ser Cys 820 825 830
Ser Gly Ile Asn Arg Cys Glu Glu Gln His Gln Val Asp Val Gln Leu 835 840 845
Asp Ala Glu Asp His Pro Lys Asp Cys Cys Glu Ala Ala Gln Thr His 850 855 860
Glu Phe Ser Ser Tyr Ile His Thr Gly Asp Leu Asn Ala Ser Val Asp 865 870 875 880
Gln Gly Ile Trp Val Val Leu Gln Val Arg Thr Thr Asp Gly Tyr Ala 885 890 895
Thr Leu Gly Asn Leu Glu Leu Val Glu Val Gly Pro Leu Ser Gly Glu 900 905 910
Ser Leu Glu Arg Glu Gln Arg Asp Asn Ala Lys Trp Asn Glu Glu Val 915 920 925
Gly Arg Lys Arg Ala Glu Thr Asp Arg Ile Tyr Gln Asp Ala Lys Gln 930 935 940
Ala Ile Asn His Leu Phe Val Asp Tyr Gln Asp Gln Gln Leu Ser Pro 945 950 955 960
Glu Val Gly Met Ala Asp Ile Ile Asp Ala Gln Asn Leu Ile Ala Ser 965 970 975
Ile Ser Asp Val Tyr Ser Asp Ala Val Leu Gln Ile Pro Gly Ile Asn 980 985 990
Tyr Glu Met Tyr Thr Glu Leu Ser Asn Arg Leu Gln Gln Ala Ser Tyr 995 1000 1005
Leu Tyr Thr Ser Arg Asn Val Val Gln Asn Gly Asp Phe Asn Ser Gly 1010 1015 1020
Leu Asp Ser Trp Asn Ala Thr Thr Asp Thr Ala Val Gln Gln Asp Gly 1025 1030 1035 1040
Asn Met His Phe Leu Val Leu Ser His Trp Asp Ala Gln Val Ser Gln 1045 1050 1055
Gln Phe Arg Val Gln Pro Asn Cys Lys Tyr Val Leu Arg Val Thr Ala 1060 1065 1070
Lys Lys Val Gly Asn Gly Asp Gly Tyr Val Thr Ile Gln Asp Gly Ala 1075 1080 1085
His His Arg Glu Thr Leu Thr Phe Asn Ala Cys Asp Tyr Asp Val Asn 1090 1095 1100
Gly Thr His Val Asn Asp Asn Ser Tyr Ile Thr Lys Glu Leu Val Phe 1105 1110 1115 1120
Tyr Pro Lys Thr Glu His Met Trp Val Glu Val Ser Glu Thr Glu Gly 1125 1130 1135
Thr Phe Tyr Ile Asp Ser Ile Glu Phe Ile Glu Thr Gln Glu 1140 1145 1150
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 75:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 3453 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 75:
ATGAATCGAA ATAATCCAAA TGAATATGAA ATTATTGATG CCCCCTATTG TGGGTGTCCG 60
TCAGATGATG ATGTGAGGTA TCCTTTGGCA AGTGACCCAA ATGCAGCGTT CCAAAATATG 120
AACTATAAAG AGTATTTACA AACGTATGAT GGAGACTACA CAGGTTCTCT TATCAATCCT 180
AACTTATCTA TTAATCCTAG AGATGTACTA CAAACAGGTA TTAATATTGT GGGAAGAATA 240
CTAGGGTTTT TAGGTGTTCC ATTTGCGGGT CAACTAGTTA CTTTCTATAC CTTTCTCTTA 300
AATCAGTTGT GGCCAACTAA TGATAATGCA GTATGGGAAG CTTTTATGGC GCAAATAGAA 360
GAGCTAATCG ATCAAAAAAT ATCGGCGCAA GTAGTAAGGA ATGCACTCGA TGACTTAACT 420
GGATTACACG ATTATTATGA GGAGTATTTA GCAGCATTAG AGGAGTGGCT GGAAAGACCG 480
AACGGAGCAA GAGCTAACTT AGTTACACAG AGGTTTGAAA ACCTGCATAC TGCATTTGTA 540
ACTAGAATGC CAAGCTTTGG TACGGGTCCT GGTAGTCAAA GAGATGCGGT AGCGTTGTTG 600
ACGGTATATG CACAAGCAGC GAATTTGCAT TTGTTATTAT TAAAAGATGC AGAAATCTAT 660
GGGGCAAGAT GGGGACTTCA ACAAGGGCAA ATTAACTTAT ATTTTAATGC TCAACAAGAA 720
CGTACTCGAA TTTATACCAA TCATTGCGTG GAAACATATA ATAGAGGATT AGAAGATGTA 780
AGAGGAACAA ATACAGAAAG TTGGTTAAAT TACCATCGAT TCCGTAGAGA GATGACATTA 840
ATGGCAATGG ATTTAGTGGC CCTATTCCCA TTCTATAATG TGCGACAATA TCCAAATGGG 900
GCAAATCCAC AGCTTACACG TGAAATATAT ACAGATCCAA TCGTATATAA TCCACCAGCT 960
AATCAGGGAA TTTGCCGACG TTGGGGGAAT AATCCGTATA ATACATTTTC TGAACTTGAA 1020
AATGCTTTTA TTCGCCCGCC ACATCTTTTT GAAAGGTTGA ACAGATTAAC TATTTCTAGA 1080
AACCGATATA CAGCTCCAAC AACTAATAGC TTCCTAGACT ATTGGTCAGG TCATACTTTA 1140
CAAAGCCAAC ATGCAAATAA CCCGACGACA TATGAAACTA GTTACGGTCA GATTACCTCT 1200
AACACACGTT TATTCAATAC GACTAATGGA GCCCGTGCAA TAGATTCAAG GGCAAGAAAT 1260
TTTGGTAACT TATACGCTAA TTTGTATGGC GTTAGCAGCT TGAACATTTT CCCAACAGGT 1320
GTGATGAGTG AAATCACCAA TGCAGCTAAT ACGTGTCGGC AAGACCTTAC TACAACTGAA 1380
GAACTACCAC TAGAGAATAA TAATTTTAAT CTTTTATCTC ATGTTACTTT CTTACGCTTC 1440
AATACTACTC AGGGTGGCCC CCTTGCAACT CTAGGGTTTG TACCCACATA TGTGTGGACA 1500
CGTGAAGATG TAGATTTTAC GAACACAATT ACTGCGGATA GAATTACACA ACTACCATGG 1560
GTAAAGGCAT CTGAAATAGG TGGGGGTACT ACTGTCGTGA AAGGTCCAGG ATTTACAGGA 1620
GGGGATATAC TTCGAAGAAC GGACGGTGGT GCAGTTGGAA CGATTAGAGC TAATGTTAAT 1680
GCCCCATTAA CACAACAATA TCGTATAAGA TTACGCTATG CTTCGACAAC AAGTTTTGTT 1740
GTTAATTTAT TTGTTAATAA TAGTGCGGCT GGCTTTACTT TACCGAGTAC AATGGCTCAA 1800
AATGGTTCTT TAACATACGA GTCGTTTAAT ACCTTAGAGG TAACTCATAC TATTAGATTT 1860
TCACAGTCAG ATACTACACT TAGGTTGAAT ATATTCCCGT CTATCTCTGG TCAAGAAGTG 1920
TATGTAGATA AACTTGAAAT CGTTCCAATT AACCCGACAC GAGAAGCGGA AGAAGATTTA 1980
GAAGATGCAA AGAAAGCGGT GGCGAGCTTG TTTACACGTA CAAGGGATGG ATTACAGGTA 2040
AATGTGACAG ATTACCAAGT CGATCAGGCG GCAAATTTAG TGTCGTGCTT ATCAGATGAA 2100
CAATATGGGC ATGATAAAAA GATGTTATTG GAAGCCGTAC GCGCAGCAAA ACGCCTCAGC 2160
CGCGAACGCA ACTTACTTCA AGATCCAGAT TTTAATGAAA TAAATAGCAC AGAAGAAAAT 2220
GGCTGGAAGG CAAGTAACGG TGTTACTATT AGCGAGGGCG GTCCATTCTT TAAAGGTCGT 2280
GCACTTCAGT TAGCAAGCGC ACGTGAAAAT TACCCAACAT ACATCTATCA AAAGGTAGAT 2340
GCATCGACGT TAAAACCTTA TACACGATAT AAACTAGATG GATTTGTGCA AAGTAGTCAA 2400
GATTTAGAAA TTGACCTCAT TCATCATCAT AAAGTCCACC TCGTGAAAAA TGTACCAGAT 2460
AATTTAGTAT CTGATACTTA TTCTGATGGC TCATGTAGTG GAATTAACCG TTGTGAGGAA 2520
CAACATCAGG TAGATGTGCA GCTAGATGCG GAGGATCATC CAAAGGATTG TTGTGAAGCG 2580
GCTCAAACAC ATGAGTTTTC TTCCTATATT CATACAGGTG ATCTAAATGC AAGTGTAGAT 2640
CAAGGCATTT GGGTTGTATT GCAGGTTCGA ACAACAGATG GTTATGCGAC GTTAGGAAAT 2700
CTTGAATTGG TAGAGGTTGG TCCATTATCG GGTGAATCTT TAGAACGAGA ACAAAGAGAT 2760
AATGCGAAAT GGAATGAAGA GGTAGGAAGA AAGCGTGCAG AAACAGATCG CATATATCAA 2820
GATGCGAAAC AAGCAATTAA CCATCTATTT GTAGACTATC AAGATCAACA ATTAAGTCCA 2880
GAGGTAGGGA TGGCGGATAT TATTGATGCT CAAAATCTTA TCGCATCAAT TTCAGATGTA 2940
TATAGCGATG CAGTACTGCA AATCCCTGGG ATTAACTACG AGATGTATAC AGAGTTATCC 3000
AATCGATTAC AACAAGCATC GTATCTGTAT ACGTCTCGAA ATGTCGTGCA AAATGGGGAC 3060
TTTAACAGTG GTTTAGATAG TTGGAATGCA ACAACTGATA CAGCTGTTCA GCAGGATGGC 3120
AATATGCATT TCTTAGTTCT TTCCCATTGG GATGCACAAG TTTCTCAACA ATTTAGAGTA 3180
CAGCCGAATT GTAAATATGT GTTACGTGTG ACAGCGAAGA AAGTAGGGAA CGGAGATGGA 3240
TATGTTACGA TCCAAGATGG CGCTCATCAC CGAGAAACAC TGACATTCAA TGCATGTGAC 3300
TACGATGTAA ATGGTACGCA TGTAAATGAT AATTCGTATA TTACAAAAGA ATTGGTGTTC 3360
CGGAACATAT GTGGGTAGAG GTAAGTGAAA CAGAAGGTAC CTTCTATATA 3420
GACAGCATTG AGTTCATTGA AACACAAGAG TAG 3453
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 76:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1134 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 76:
Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu 1 5 10 15
Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Gly Asn Val Arg 20 25 30
Thr Gly Leu Gln Thr Gly Ile Asp Ile Val Ala Val Val Val Gly Ala 35 40 45
Leu Gly Gly Pro Val Gly Gly Ile Leu Thr Gly Phe Leu Ser Thr Leu 50 55 60
Phe Gly Phe Leu Trp Pro Ser Asn Asp Gln Ala Val Trp Glu Ala Phe 65 70 75 80
Ile Glu Gln Met Glu Glu Leu Ile Glu Gln Arg Ile Ser Asp Gln Val 85 90 95
Val Arg Thr Ala Leu Asp Asp Leu Thr Gly Ile Gln Asn Tyr Tyr Asn 100 105 110
Gln Tyr Leu Ile Ala Leu Lys Glu Trp Glu Glu Arg Pro Asn Gly Val 115 120 125
Arg Ala Asn Leu Val Leu Gln Arg Phe Glu Ile Leu His Ala Leu Phe 130 135 140
Val Ser Ser Met Pro Ser Phe Gly Ser Gly Pro Gly Ser Gln Arg Phe 145 150 155 160
Gln Ala Gln Leu Leu Val Val Tyr Ala Gln Ala Ala Asn Leu His Leu 165 170 175
Leu Leu Leu Ala Asp Ala Glu Lys Tyr Gly Ala Arg Trp Gly Leu Arg 180 185 190
Glu Ser Gln Ile Gly Asn Leu Tyr Phe Asn Glu Leu Gln Thr Arg Thr 195 200 205
Arg Asp Tyr Thr Asn His Cys Val Asn Ala Tyr Asn Asn Gly Leu Ala 210 215 220
Gly Leu Arg Gly Thr Ser Ala Glu Ser Trp Leu Lys Tyr His Gln Phe 225 230 235 240
Arg Arg Glu Ala Thr Leu Met Ala Met Asp Leu Ile Ala Leu Phe Pro 245 250 255
Tyr Tyr Asn Thr Arg Arg Tyr Pro Ile Ala Val Asn Pro Gln Leu Thr 260 265 270
Arg Glu Val Tyr Thr Asp Pro Leu Gly Val Pro Ser Glu Glu Ser Ser 275 280 285
Leu Phe Pro Glu Leu Arg Cys Leu Arg Trp Gln Glu Thr Ser Ala Met 290 295 300
Thr Phe Ser Asn Leu Glu Asn Ala Ile Ile Ser Ser Pro His Leu Phe 305 310 315 320
Asp Thr Ile Asn Asn Leu Met Ile Tyr Thr Gly Ser Phe Ser Val His 325 330 335
Leu Thr Asn Gln Leu Ile Glu Gly Trp Ile Gly His Ser Val Thr Ser 340 345 350
Ser Leu Leu Ala Ser Gly Pro Thr Thr Val Leu Arg Arg Asn Tyr Gly 355 360 365
Ser Thr Thr Ser Ile Val Asn Tyr Phe Ser Phe Asn Asp Arg Asp Val 370 375 380
Tyr Gln Ile Asn Thr Arg Ser His Thr Gly Leu Gly Phe Gln Asn Ala 385 390 395 400
Pro Leu Phe Gly Ile Thr Arg Ala Gln Phe Tyr Pro Gly Gly Thr Tyr 405 410 415
Ser Val Thr Gln Arg Asn Ala Leu Thr Cys Glu Gln Asn Tyr Asn Ser 420 425 430
Ile Asp Glu Leu Pro Ser Leu Asp Pro Asn Glu Pro Ile Ser Arg Ser 435 440 445
Tyr Ser His Arg Leu Ser His Ile Thr Ser Tyr Leu His Arg Val Leu 450 455 460
Thr Ile Asp Gly Ile Asn Ile Tyr Ser Gly Asn Leu Pro Thr Tyr Val 465 470 475 480
Trp Thr His Arg Asp Val Asp Leu Thr Asn Thr Ile Thr Ala Asp Arg 485 490 495
Ile Thr Gln Leu Pro Leu Val Lys Ser Phe Glu Ile Pro Ala Gly Thr 500 505 510
Thr Val Val Arg Gly Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg 515 520 525
Thr Gly Val Gly Thr Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro 530 535 540
Leu Thr Gln Arg Tyr Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn 545 550 555 560
Leu Phe Ile Gly Ile Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp 565 570 575
Phe Gly Arg Thr Met Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe 580 585 590
Ala Thr Arg Glu Phe Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu 595 600 605
Leu Ile Ser Val Phe Ala Asn Ala Phe Ser Ala Gly Gln Glu Val Tyr 610 615 620
Phe Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys 625 630 635 640
Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe Thr Arg 645 650 655
Thr Arg Asp Gly Leu Gln Val Asn Val Lys Asp Tyr Gln Val Asp Gln 660 665 670
Ala Ala Asn Leu Val Ser Cys Leu Ser Asp Glu Gln Tyr Gly Tyr Asp 675 680 685
Lys Lys Met Leu Leu Glu Ala Val Arg Ala Ala Lys Arg Leu Ser Arg 690 695 700
Glu Arg Asn Leu Leu Gln Asp Pro Asp Phe Asn Thr Ile Asn Ser Thr 705 710 715 720
Glu Glu Asn Gly Trp Lys Ala Ser Asn Gly Val Thr Ile Ser Glu Gly 725 730 735
Gly Pro Phe Tyr Lys Gly Arg Ala Leu Gln Leu Ala Ser Ala Arg Glu 740 745 750
Asn Tyr Pro Thr Tyr Ile Tyr Gln Lys Val Asp Ala Ser Glu Leu Lys 755 760 765
Pro Tyr Thr Arg Tyr Arg Ser Asp Gly Phe Val Lys Ser Ser Gln Asp 770 775 780
Leu Glu Ile Asp Leu Ile His His His Lys Val His Leu Val Lys Asn 785 790 795 800
Val Pro Asp Asn Leu Val Ser Asp Thr Tyr Pro Asp Asp Ser Cys Ser 805 810 815
Gly Ile Asn Arg Cys Gln Glu Gln Gln Met Val Asn Ala Gln Leu Glu 820 825 830
Thr Glu His His His Pro Met Asp Cys Cys Glu Ala Ala Gln Thr His 835 840 845
Glu Phe Ser Ser Tyr Ile Asp Thr Gly Asp Leu Asn Ser Ser Val Asp 850 855 860
Gln Gly Ile Trp Ala Ile Phe Lys Val Arg Thr Thr Asp Gly Tyr Ala 865 870 875 880
Thr Leu Gly Asn Leu Glu Leu Val Glu Val Gly Pro Leu Ser Gly Glu 885 890 895
Ser Leu Glu Arg Glu Gln Arg Asp Asn Thr Lys Trp Ser Ala Glu Leu 900 905 910
Gly Arg Lys Arg Ala Glu Thr Asp Arg Val Tyr Gln Asp Ala Lys Gln 915 920 925
Ser Ile Asn His Leu Phe Val Asp Tyr Gln Asp Gln Gln Leu Asn Pro 930 935 940
Glu Ile Gly Met Ala Asp Ile Met Asp Ala Gln Asn Leu Val Ala Ser 945 950 955 960
Ile Ser Asp Val Tyr Ser Asp Ala Val Leu Gln Ile Pro Gly Ile Asn 965 970 975
Tyr Glu Ile Tyr Thr Glu Leu Ser Asn Arg Leu Gln Gln Ala Ser Tyr 980 985 990
Leu Tyr Thr Ser Arg Asn Ala Val Gln Asn Gly Asp Phe Asn Asn Gly 995 1000 1005
Leu Asp Ser Trp Asn Ala Thr Ala Gly Ala Ser Val Gln Gln Asp Gly 1010 1015 1020
Asn Thr His Phe Leu Val Leu Ser His Trp Asp Ala Gln Val Ser Gln 1025 1030 1035 1040
Gln Phe Arg Val Gln Pro Asn Cys Lys Tyr Val Leu Arg Val Thr Ala 1045 1050 1055
Glu Lys Val Gly Gly Gly Asp Gly Tyr Val Thr Ile Arg Asp Gly Ala 1060 1065 1070
His His Thr Glu Thr Leu Thr Phe Asn Ala Cys Asp Tyr Asp Ile Asn 1075 1080 1085
Gly Thr Tyr Val Thr Asp Asn Thr Tyr Leu Thr Lys Glu Val Ile Phe 1090 1095 1100
Tyr Ser His Thr Glu His Met Trp Val Glu Val Asn Glu Thr Glu Gly 1105 1110 1115 1120
Ala Phe His Ile Asp Ser Ile Glu Phe Val Glu Thr Glu Lys 1125 1130
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 77:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 3411 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 77:
ATGGATAACA ATCCGAACAT CAATGAATGC ATTCCTTATA ATTGTTTAAG TAACCCTGAA 60
GTAGAAGTAT TAGGTGGAGA AAGAGGAAAT GTTAGAACTG OACTACAAAC TGGAATTGAT 120
ATTGTTGCAG TAGTAGTAGG TGCTTTAGGT GGACCAGTTG GTGGCATACT CACTGGTTTT 180
CTTTCTACTC TTTTTGGTTT TCTTTGGCCA TCTAATGATC AAGCAGTATG GGAAGCTTTT 240
ATAGAACAAA TGGAAGAACT GATTGAACAA AGGATATCAG ATCAAGTAGT AAGGACTGCA 300
CTCGATGACT TAACTGGAAT TCAAAATTAT TATAATCAAT ATCTAATAGC ATTAAAGGAA 360
TGGGAGGAAA GACCAAACGG CGTAAGAGCA AACTTAGTTT TGCAAAGATT TGAAATCTTG 420
CACGCGCTAT TTGTAAGTAG TATGCCAAGT TTTGGTAGTG GCCCTGGAAG TCAAAGGTTT 480
CAGGCACAAT TGTTGGTTGT TTATGCGCAA GCAGCAAATC TTCATTTACT ATTATTAGCT 540
GATGCTGAAA AGTATGGGGC AAGATGGGGA CTCCGTGAAT CCCAGATAGG AAATTTATAT 600
TTTAATGAAC TACAAACTCG TACTCGAGAT TACACCAACC ATTGTGTAAA CGCGTATAAT 660
AACGGGTTAG CCGGGTTACG AGGAACGAGC GCTGAAAGTT GGTTAAAGTA CCATCAATTC 720
CGCAGAGAAG CAACCTTAAT GGCAATGGAT TTGATAGCTT TATTTCCATA TTATAACACC 780
CGGCGATATC CAATCGCAGT AAATCCTCAG CTTACACGTG AGGTATATAC AGATCCATTA 840
GGCGTTCCTT CTGAAGAATC AAGTTTATTT CCAGAATTGA GATGCTTAAG ATGGCAAGAG 900
ACTTCTGCCA TGACTTTTTC AAATTTGGAA AATGCAATAA TTTCGTCACC ACATCTATTT 960
GACACAATAA ACAATTTAAT GATTTATACC GGTTCCTTTT CCGTTCACCT AACCAATCAA 1020
TTAATTGAAG GGTGGATTGG ACATTCTGTA ACTAGTAGTT TGTTGGCCAG TGGACCAACA 1080
ACAGTACTGA GAAGAAATTA CGGTAGCACG ACATCTATTG TAAACTATTT TAGTTTTAAT 1140
GATCGTGATG TTTATCAGAT TAATACGAGA TCACATACTG GGTTGGGATT CCAGAACGCA 1200
CCTTTATTTG GAATCACTAG AGCTCAATTT TACCCAGGTG GGACTTATTC AGTAACTCAA 1260
CGAAATGCAT TAACATGTGA ACAAAATTAT AATTCAATTG ATGAGTTACC GAGCCTAGAC 1320
CCAAATGAAC CTATCAGTAG AAGTTATAGT CATAGATTAT CTCATATTAC CTCCTATTTG 1380
CATCGTGTAT TGACTATTGA TGGTATTAAT ATATATTCAG GAAATCTCCC TACTTATGTA 1440
TGGACCCATC GCGATGTGGA CCTTACAAAC ACGATTACCG CAGATAGAAT TACACAACTA 1500
CCATTGGTAA AGTCATTTGA AATACCTGCG GGTACTACTG TCGTAAGAGG ACCAGGTTTT 1560
ACAGGAGGGG ATATACTCCG AAGAACAGGG GTTGGTACAT TTGGAACAAT AAGGGTAAGG 1620
ACTACTGCCC CCTTAACACA AAGATATCGC ATAAGATTCC GTTTCGCTTC TACCACAAAT 1680
TTGTTCATTG GTATAAGAGT TGGTGATAGA CAAGTAAATT ATTTTGACTT CGGAAGAACA 1740
ATGAACAGAG GAGATGAATT AAGGTACGAA TCTTTTGCTA CAAGGGAGTT TACTACTGAT 1800
TTTAATTTTA GACAACCTCA AGAATTAATC TCAGTGTTTG CAAATGCATT TAGCGCTGGT 1860
CAAGAAGTTT ATTTTGATAG AATTGAGATT ATCCCCGTTA ATCCCGCACG AGAGGCGAAA 1920
GAGGATCTAG AAGCAGCAAA GAAAGCGGTG GCGAGCTTGT TTACACGCAC AAGGGACGGA 1980
TTACAAGTAA ATGTGAAAGA TTATCAAGTC GATCAAGCGG CAAATTTAGT GTCATGCTTA 2040
TCAGATGAAC AATATGGGTA TGACAAAAAG ATGTTATTGG AAGCGGTACG CGCGGCAAAA 2100
CGCCTCAGCC GAGAACGTAA CTTACTTCAG GATCCAGATT TTAATACAAT CAATAGTACA 2160
GAAGAAAATG GATGGAAAGC AAGTAACGGC GTTACTATTA GTGAGGGCGG TCCATTCTAT 2220
AAAGGCCGTG CACTTCAGCT AGCAAGTGCA CGAGAAAATT ATCCAACATA CATTTATCAA 2280
AAAGTAGATG CATCGGAGTT AAAACCTTAT ACACGTTATA GATCAGATGG GTTCGTGAAG 2340
AGTAGTCAAG ATTTAGAAAT TGATCTCATT CACCATCATA AAGTCCATCT TGTGAAAAAT 2400
GTACCAGATA ATTTAGTATC TGATACTTAC CCAGATGATT CTTGTAGTGG AATCAATCGA 2460
TGTCAGGAAC AACAGATGGT AAATGCGCAA CTGGAAACAG AGCATCATCA TCCGATGGAT 2520
TGCTGTGAAG CAGCTCAAAC ACATGAGTTT TCTTCCTATA TTGATACAGG GGATTTAAAT 2580
TCGAGTGTAG ACCAGGGAAT CTGGGCGATC TTTAAAGTTC GAACAACCGA TGGTTATGCG 2640
ACGTTAGGAA ATCTTGAATT GGTAGAGGTC GGACCGTTAT CGGGTGAATC TTTAGAACGT 2700
GAACAAAGGG ATAATACAAA ATGGAGTGCA GAGCTAGGAA GAAAGCGTGC AGAAACAGAT 2760
CGCGTGTATC AAGATGCCAA ACAATCCATC AATCATTTAT TTGTGGATTA TCAAGATCAA 2820
CAATTAAATC CAGAAATAGG GATGGCAGAT ATTATGGACG CTCAAAATCT TGTCGCATCA 2880
ATTTCAGATG TATATAGCGA TGCCGTACTG CAAATCCCTG GAATTAACTA TGAGATTTAC 2940
ACAGAGCTGT CCAATCGCTT ACAACAAGCA TCGTATCTGT ATACGTCTCG AAATGCGGTG 3000
CAAAATGGGG ACTTTAACAA CGGGCTAGAT AGCTGGAATG CAACAGCGGG TGCATCGGTA 3060
CAACAGGATG GCAATACGCA TTTCTTAGTT CTTTCTCATT GGGATGCACA AGTTTCTCAA 3120
CAATTTAGAG TGCAGCCGAA TTGTAAATAT GTATTACGTG TAACAGCAGA GAAAGTAGGC 3180
GGCGGAGACG GATACGTGAC TATCCGGGAT GGTGCTCATC ATACAGAAAC GCTTACATTT 3240
AATGCATGTG ATTATGATAT AAATGGCACG TACGTGACTG ATAATACGTA TCTAACAAAA 3300
GAAGTGATAT TCTATTCACA TACAGAACAC ATGTGGGTAG AGGTAAATGA AACAGAAGGT 3360
GCATTTCATA TAGATAGTAT TGAATTCGTT GAAACAGAAA AGTAAGGTAC C 3411
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 78:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 789 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 78:
Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Thr Lys Glu Asn Val Lys Ala Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490 495
Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Glu Leu Leu Leu Thr Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 740 745 750
Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 770 775 780
Asp Val Ser Ile Lys 785
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 79:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 2370 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 79:
ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120
GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180
ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300
AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360
ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420
TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480
CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540
GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600
TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660
AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780
GTGAAAGCAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840
CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960
AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020
AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080
GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140
TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200
TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440
ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500
AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560
AGCAATAAAG AAACTAAATT GATTGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620
AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680
GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740
ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800
GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860
GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920
GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980
AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040
ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100
CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160
GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220
TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280
GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340
GTACATTTTT ACGATGTCTC TATTAAGTAA 2370
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 80:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 789 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 80:
Met Asn Lys Asp Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Glu Val Asn Asn Lys Leu Glu Ala Ile Ser Thr 100 105 110
Ile Phe Arg Val Tyr Leu Pro Lys Asn Thr Ser Arg Gly Gly Gly Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Met Glu Asn Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Val Lys Trp Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490 495
Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Glu Leu Leu Leu Thr Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Pro Pro Ser Gly Phe Ile Ser Xaa Ile Val Glu Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 740 745 750
Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 770 775 780
Asp Val Ser Ile Lys 785
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 81:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 2375 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 81:
ATGAACAAGG ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120
GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180
ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300
AATGAGGTTA ATAACAAACT CGAGGCGATA AGTACGATTT TTCGGGTATA TTTACCTAAA 360
AATACCTCTA GGGGGGGGGG GGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATGGAA 420
AACTTGAGTA AACAATTACA AGAGATTTCT GTTAAGTGGG ATATTATTAA TGTAAATGTA 480
CTTATTAACT CTACACTTAC CGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540
GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600
TCTCCCGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660
AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780
GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840
CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960
AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020
AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080
GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140
TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200
TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440
ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500
AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACCGACTTA 1560
AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCSA TATTGTAGAG 1620
AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680
GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740
ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800
GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860
GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920
GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980
AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040
ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100
CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160
GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220
TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280
GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340
GTTCATTTTT ACGATGTCTC TATTAAGTAA CCCAA 2375
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 82:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 789 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 82:
Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490 495
Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Leu Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Phe Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 740 745 750
Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Asn 770 775 780
Asp Val Ser Ile Lys 785
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 83:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 2375 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 83:
ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120
GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180
ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAA TCAAGTCTTA 300
AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360
ATTACATCTA TGTTAAGTGA TGTAATGAAG CAAAATTATG CGCTAAGTCT GCAAATAGAA 420
TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTT 480
CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 540
GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACCACTT TAAAAGTAAA AAAGGATAGC 600
TCGCCTGCTG ATATTCTTGA TGAGTTAACT GAATTAACTG AACTAGCGAA AAGTGTTACA 660
AAAAATGACG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 780
GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840
CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960
AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020
AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGGTTT 1080
GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140
TATCAAGTTG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATACGGA TAAATTATTG 1200
TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 1440
ATCAGTGAAA CATTTTTGAC TCCGATAAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 1500
AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560
AGCAATAAAG AAACTAAATT GATCGTCCTG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620
AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680
GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740
TTTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800
GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860
GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1920
GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980
AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040
ACAAATAATT GGACGAGTAC GGGATCAACT CATATTAGCG GTAATACACT CACTCTTTAT 2100
CAGGGAGGAC GAGGAATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160
GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220
TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280
GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340
GTACATTTTA ACGATGTCTC TATTAAGTAA CCCAA 2375
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 84:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 789 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 84:
Met Asn Lys Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Ser Ser Lys Val Lys Lys Asp Ser Pro Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490 495
Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Phe Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Gly Tyr Met Ser Gly Ala Lys Asp Val 740 745 750
Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 770 775 780
Asp Val Ser Ile Lys 785
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 85:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 2375 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 85:
ATGAACAAGA ATAATACTAA ATTAAGCGCA AGGGCCCTAC CGAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120
GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180
ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAA TCAAGTCTTA 300
AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360
ATTACATCTA TGTTAAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420
TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTC 480
CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 540
GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATAGC 600
CCCCCTGCTG ATATTCTTGA TGAGTTAACT GAATTAACTG AACTAGCGAA AAGTGTAACA 660
AAAAATGACG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 780
GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAATT TCTTAATTGT ATTAACAGCT 840
CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960
AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020
AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 1080
GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140
TATCAAGTTG ATAAGGATTC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 1200
TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 1440
ATCAGTGAAA CATTTTTGAC TCCGATAAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 1500
AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560
AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620
AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680
GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740
TTTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800
GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860
GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1920
GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980
AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040
ACAAATAATT GGACGAGTAC GGGATCAACT CATATTAGCG GTAATACACT CACTCTTTAT 2100
CAGGGAGGAC GAGGAATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160
GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220
TTTGAAAAAG GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280
GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340
GTACATTTTT ACGATGTCTC TATTAAGTAA CCAAG 2375
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 86:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 759 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 86:
Met Asn Lys Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu Arg Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 115 120 125
Met Asn Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Xaa Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Ile Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Ile Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Xaa Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490 495
Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Glu Leu Leu Leu Thr Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser His 530 535 540
Arg Arg Gly Gln Phe Arg Ala Val Glu Ser Lys Glu Cys Val Cys Arg 545 550 555 560
Ser Tyr Arg Arg Ser Glu Trp Asn Ser Phe Ile Cys Ser Gly Arg Arg 565 570 575
Asn Phe Thr Ile Tyr Trp Arg Val Lys Thr Glu Asn Val Cys Asn Pro 580 585 590
Ile Tyr Cys Arg Lys Thr Phe Tyr Ser Phe Lys Arg Lys Tyr Trp Ile 595 600 605
Tyr Ser Leu Arg Tyr Lys Phe Lys Arg Leu Ser Asn Tyr Tyr Thr Phe 610 615 620
Tyr Tyr Arg Asn Phe Lys Gly Ser Val Phe Asn Phe Lys Lys Ser Lys 625 630 635 640
Trp Arg Ser Leu Gly Arg Leu Tyr Tyr Phe Gly Asn Ser Phe Lys Val 645 650 655
Ile Lys Ser Arg Ile Asn Tyr Lys Leu Asp Glu Tyr Gly Ile Asn Ser 660 665 670
Tyr Arg Tyr Thr His Ser Leu Ser Gly Arg Thr Arg Asn Ser Lys Thr 675 680 685
Lys Pro Ser Ile Arg Phe Phe Asn Leu Ser Val Phe Phe Cys Val Arg 690 695 700
Arg Cys Cys Lys Asp Lys Phe Gly Ser Val Ile Lys Lys Ile Tyr Glu 705 710 715 720
Arg Cys Arg Cys Phe Asn Val His Tyr Lys Ile Glu Arg Leu Leu Tyr 725 730 735
Arg Ala Phe Ser Arg Glu Phe Ile Trp Trp Ser Tyr Cys Thr Phe Leu 740 745 750
Arg Cys Leu Tyr Val Thr Gln 755
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 87:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 2376 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 87:
ATGAACAAGA ATAATACTAA ATTAAGCGCA AGAGCCCTAC CGAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120
GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180
ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAAAA TCAAGTCTTA 300
AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGATATA TCTACCTAAA 360
ATTACATCTA TGTTAAGTGA TGTAATGAAC CAAAATTATG CGCTAAGTCT GCAAATAGAA 420
AACAATTGCA AGAAATTTCT GATAAATTGG ATATTATTAA TGTAAATGTA 480
CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540
GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAKTT CAAA?GTAAA AAAGGATGGC 600
TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660
AAAAATGATG TGGATGGTTT TGAAATTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720
AATAATTTAA TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAS TAAAGAAAAT 780
GTGAAAACAA GTGGCAGTGA GGTAGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840
CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960
AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020
AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 1080
GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140
TATCAAGTTG ATAAGGATTC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 1200
TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC GTTAGGTGTC 1440
ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500
AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560
AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620
AACGGGTCCC ATAGAAGAGG ACAATTTAGA GCCGTGGAAA GCAAATAATA AGAATGCGTA 1680
TGTAGATCAT ACAGGCGGAG TGAATGGAAC TAAAGCTTTA TATGTTCATA AGGACGGAGG 1740
AATTTCACAA TTTATTGGAG ATAAGTTAAA ACCGAAAACT GAGTATGTAA TCCAATATAC 1800
TGTTAAAGGA AAACCTTCTA TTCATTTAAA AGATGAAAAT ACTGGATATA TTCATTATGA 1860
AGATACAAAT AATAATTTAA AAGATTATCA AACTATTACT AAACGTTTTA CTACAGGAAC 1920
TGATTTAAAG GGAGTGTATT TAATTTTAAA AAGTCAAAAT GGAGATGAAG CTTGGGGAGA 1980
TAACTTTATT ATTTTGGAAA TTAGTCCTTC TGAAAAGTTA TTAAGTCCAG AATTAATTAA 2040
TACAAATAAT TGGACGAGTA CGGGATCAAC TCATATTAGC GGTAATACAC TCACTCTTTA 2100
TCAGGGAGGA CGAGGAATTC TAAAACAAAA CCTTCAATTA GATAGTTTTT CAACTTATAG 2160
AGTGTATTTT TCTGTGTCCG GAGATGCTAA TGTAAGGATT AGAAATTCTA GGGAAGTGTT 2220
ATTTGAAAAA AGATATATGA GCGGTGCTAA AGATGTTTCT GAAATGTTCA CTACAAAATT 2280
TGAGAAAGAT AACTTTTATA TAGAGCTTTC TCAAGGGAAT AATTTATATG GTGGTCCTAT 2340
TGTACATTTT TACGATGTCT CTATTAAGTA ACCCAA 2376
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 88:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 511 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 88:
Tyr Leu Ser Lys Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile 1 5 10 15
Asn Val Asn Val Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala 20 25 30
Tyr Gln Arg Ile Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe 35 40 45
Ala Thr Glu Thr Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp 50 55 60
Ile Leu Asp Glu Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr 65 70 75 80
Lys Asn Asp Val Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp 85 90 95
Val Met Val Gly Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala 100 105 110
Ser Glu Leu Ile Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val 115 120 125
Gly Asn Val Tyr Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys 130 135 140
Ala Phe Leu Thr Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp 145 150 155 160
Ile Asp Tyr Thr Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu 165 170 175
Glu Phe Arg Val Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn 180 185 190
Pro Asn Tyr Ala Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile 195 200 205
Val Glu Ala Lys Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn 210 215 220
Asp Ser Ile Thr Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn 225 230 235 240
Tyr Gln Val Asp Lys Asp Pro Leu Ser Glu Val Ile Tyr Gly Asp Thr 245 250 255
Asp Lys Leu Leu Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn 260 265 270
Asn Ile Val Phe Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr 275 280 285
Lys Lys Met Lys Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp 290 295 300
Ser Ser Thr Gly Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser 305 310 315 320
Glu Ala Glu Tyr Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met 325 330 335
Pro Leu Gly Val Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe 340 345 350
Gly Leu Gln Ala Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys 355 360 365
Ser Tyr Leu Arg Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu 370 375 380
Thr Lys Leu Ile Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu 385 390 395 400
Asn Gly Ser Ile Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn 405 410 415
Lys Asn Ala Tyr Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala 420 425 430
Leu Tyr Val His Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys 435 440 445
Leu Lys Pro Lys Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys 450 455 460
Pro Ser Ile His Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu 465 470 475 480
Asp Thr Asn Asn Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe 485 490 495
Thr Thr Gly Thr Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser 500 505 510
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 89:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1533 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 89:
TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTT 60
CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 120
GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACCACTT TAAAAGTAAA AAAGGATAGC 180
TCGCCTGCTG ATATTCTTGA TGAGTTAACT GAATTAACTG AACTAGCGAA AAGTGTTACA 240
AAAAATGACG TTGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 300
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 360
GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAATT TCTTAATTGT ATTAACAGCT 420
CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 480
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 540
AACATCCTYC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 600
AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 660
GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 720
TATCAAGTTG ATAAGGATCC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 780
TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 840
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 900
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 960
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 1020
ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 1080
AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1140
AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1200
AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1260
GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1320
ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1380
GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1440
GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1500
GATTTAAAGG GAGTGTATTT AATTTTAAAA AGT 1533
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 90:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 789 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 90:
Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Asp Val Asp Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Xaa Met Leu Ser Asp Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly His Ala Leu Val Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Pro Gln Ala 485 490 495
Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Lys Leu Leu Leu Ala Thr Asp Phe Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Leu Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Xaa Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Gly Lys Ala Asn Asn Arg Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Glu Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 740 745 750
Ser Glu Ile Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Asn Gly Gly Pro Ile Val His Phe Tyr 770 775 780
Asp Val Ser Ile Lys 785
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 91:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 2367 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 91:
ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120
GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180
ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA AGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300
AATGATGTTG ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360
ATTACCCTAT GTTGAGTGAT GTAATGAAAC AAAATTATGC GCTAAGTCTG CAAATAGAAT 420
ACTTAAGTAA ACAATTGCAA GAGATTTCTG ATAAGTTGGA TATTATTAAT GTAAATGTAC 480
TTATTAACTC TACACTTACT GAAATTACAC CTGCGTATCA AAGGATTAAA TATGTGAACG 540
AAAAATTTGA GGAATTAACT TTTGCTACAG AAACTAGTTC AAAAGTAAAA AAGGATGGCT 600
CTCCTGCAGA TATTCTTGAT GAGTTAACTG AGTTAACTGA ACTAGCGAAA AGTGTAACAA 660
AAAATGATGT GGATGGTTTT GAATTTTACC TTAATACATT CCACGATGTA ATGGTAGGAA 720
ATAATTTATT CGGGCGTTCA GCTTTAAAAA CTGCATCGGA ATTAATTACT AAAGAAAATG 780
TGAAAACAAG TGGCAGTGAG GTCGGAAATG TTTATAACTT CTTAATTGTA TTAACAGCTC 840
TGCAAGCAAA AGCTTTTCTT ACTTTAACAA CATGCCGAAA ATTATTAGGC TTAGCAGATA 900
TTGATTATAC TTCTATTATG AATGAACATT TAAATAAGGA AAAAGAGGAA TTTAGAGTAA 960
ACATCCTCCC TACACTTTCT AATACTTTTT CTAATCCTAA TTATGCAAAA GTTAAAGGAA 1020
GTGATGAAGA TGCAAAGATG ATTGTGGAAG CTAAACCAGG ACATGCATTG GTTGGGTTTG 1080
AAATTAGTAA TGATTCAATT ACAGTATTAA AAGTATATGA GGCTAAGCTA AAACAAAATT 1140
ATCAAGTTGA TAAGGATTCC TTATCGGAAG TTATTTATGG TGATATGGAT AAATTATTGT 1200
GCCCAGATCA ATCTGAACAA ATCTATTATA CAAATAACAT AGTATTTCCA AATGAATATG 1260
TAATTACTAA AATTGATTTT ACTAAAAAAA TGAAAACTTT AAGATATGAG GTAACAGCGA 1320
ATTTTTATGA TTCTTCTACA GGAGAAATTG ACTTAAATAA GAAAAAAGTA GAATCAAGTG 1380
AAGCGGAGTA TAGAACGTTA AGTGCTAATG ATGATGGAGT GTATATGCCG TTAGGTGTCA 1440
TCAGTGAAAC ATTTTTGACT CCGATTAATG GGTTTGGCCC CCAAGCTGAT GAAAATTCAA 1500
GATTAATTAC TTTAACATGT AAATCATATT TAAGAAAACT ACTGCTAGCA ACAGACTTTA 1560
GCAATAAAGA AACTAAATTG ATCCTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAAA 1620
CGGGTCCATA GAAGAGGACA ATTTAGAGCC GGGGAAAGCA AATAATAGGA ATGCGTATGT 1680
AGATCATACA GGCGGAGTGA ATGGAACTAA AGCTTTATAT GTTCATAAGG ACGGAGGAAT 1740
TTCACAATTT ATTGGAGATA AGTTAAAACC GAAAACTGAG TATGTAATCC AATATACTGT 1800
TAAAGGAAAA CCTTCTATTC ATTTAAAAGA TGAAAATACT GGATATATTC ATTATGAAGA 1860
TACAAATAAT AATTTAGAAG ATTATCAAAC TATTACTAAA CGTTTTACTA CAGGAACTGA 1920
TTTAAAGGGA GTGTATTTAA TTTTAAAAAG TCAAAATGGA GATGAAGCTT GGGGAGATAA 1980
CTTTATTATT TTGGAAATTA GTCCTTCTGA AAAGTTATTA AGTCCAGAAT TAATTAATAC 2040
AAATAATTGG ACGAGTACGG GATCAACTAA TATTAGCGGT AATACACTCA CTCTTTATCA 2100
GGGAGGACGA GGAATTCTAA AACAAAACCT TCAATTAGAT AGTTTTTCAA CTTATAGAGT 2160
GTATTTTTCT GTGTCCGGAG ATGCTAATGT AAGGATTAGA AATTCTAGGG AAGTGTTATT 2220
TGAAAAAAGA TATATGAGCG GTGCTAAAGA TGTTTCTGAA ATTTTCACTA CAAAATTTGA 2280
GAAAGATAAC TTTTATATAG AGCTTTCTCA AGGGAATAAT TTAAATGGTG GCCCTATTGT 2340
ACATTTTTAC GATGTCTCTA TTAAGTA 2367
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 92:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 789 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 92:
Met Asn Lys Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys 50 55 60
Leu Gly Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490 495
Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Glu Leu Leu Leu Thr Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 740 745 750
Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 770 775 780
Asp Val Ser Ile Lys 785
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 93:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 2369 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 93:
ATGAACAAGA ATAATACTAA ATTAAGCGCA AGGGCCCTAC CGAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120
GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180
ATTTCTGGTA AATTGGGGGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAAAT CAAGTCTTAA 300
ATGATGTTAA TAACAAACTC GATGCGATAA ATACGATGCT TCATATATAT CTACCTAAAA 360
TTACATCTAT GTTAAGTGAT GTAATGAAGC AAAATTATGC GCTAAGTCTG CAAATAGAAT 420
ACTTAAGTAA ACAATTGCAA GAAATTTCTG ATAAATTAGA TATTATTAAC GTAAATGTTC 480
TTATTAACTC TACACTTACT GAAATTACAC CTGCATATCA ACGGATTAAA TATGTGAATG 540
AAAAATTTGA AGAATTAACT TTTGCTACAG AAACCACTTT AAAAGTAAAA AAGGATAGCT 600
CGCCTGCTGA TATTCTTGAT GAGTTAACTG AATTAACTGA ACTAGCGAAA AGTGTTACAA 660
AAAATGACGT TGATGGTTTT GAATTTTACC TTAATACATT CCACGATGTA ATGGTAGGAA 720
ATAATTTATT CGGGCGTTCA GCTTTAAAAA CTGCTTCAGA ATTAATTGCT AAAGAAAATG 780
TGAAAACAAG TGGCAGTGAA GTAGGAAATG TTTATAATTT CTTAATTGTA TTAACAGCTC 840
TACAAGCAAA AGCTTTTCTT ACTTTAACAA CATGCCGAAA ATTATTAGGC TTAGCAGATA 900
TTGATTATAC TTCTATTATG AATGAACATT TAAATAAGGA AAAAGAGGAA TTTAGAGTAA 960
ACATCCTTCC TACACTTTCT AATACTTTTT CTAATCCTAA TTATGCAAAA GTTAAAGGAA 1020
GTGATGAAGA TGCAAAGATG ATTGTGGAAG CTAAACCAGG ATATGCATTG GTTGGTTTTG 1080
AAATGAGCAA TGATTCAATC ACAGTATTAA AAGTATATGA GGCTAAGCTA AAACAAAATT 1140
ATCAAGTTGA TAAGGATTCC TTATCGGAGG TTATTTATGG TGATACGGAT AAATTATTGT 1200
GTCCAGATCA ATCTGAACAA ATATATTATA CAAATAACAT AGTATTTCCA AATGAATATG 1260
TAATTACTAA AATTGATTTC ACTAAAAAAA TGAAAACTTT AAGATATGAG GTAACAGCGA 1320
ATTTTTATGA TTCTTCTACA GGAGAAATTG ACTTAAATAA GAAAAAAGTA GAATCAAGTG 1380
AAGCGGAGTA TAGAACGTTA AGTGCTAATG ATGATGGAGT GTATATGCCA TTAGGTGTCA 1440
TCAGTGAAAC ATTTTTGACT CCGATAAATG GGTTTGGCCT CCAAGCTGAT GGAAATTCAA 1500
GATTAATTAC TTTAACATGT AAATCATATT TAAGAGAACT ACTGCTAGCA ACAGACTTAA 1560
GCAATAAAGA AACTAAATTG ATTGTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAGA 1620
ACGGGTCCAT AGAAGAGGAC AATTTAGAGC CGTGGAAAGC AAATAATAAG AATGCGTATG 1680
TAGATCATAC AGGCGGAGTG AATGGAACTA AAGCTTTATA TGTTCATAAG GACGGAGGAA 1740
TTTCACAATT TATTGGAGAT AAGTTAAAAC CGAAAACTGA GTATGTAATC CAATATACTG 1800
TTAAAGGAAA ACCTTCTATT CATTTAAAAG ATGAAAATAC TGGATATATT CATTATGAAG 1860
ATACAAATAA TAATTTAAAA GATTATCAAA CTATTACTAA ACGTTTTACT ACAGGAACTG 1920
ATTTAAAGGG AGTGTATTTA ATTTTAAAAA GTCAAAATGG AGATGAAGCT TGGGGAGATA 1980
ACTTTATTAT TTTGGAAATT AGTCCTTCTG AAAAGTTATT AAGTCCAGAA TTAATTAATA 2040
CAAATAATTG GACGAGTACG GGATCAACTC ATATTAGCGG TAATACACTC ACTCTTTATC 2100
AGGGAGGACG AGGAATTCTA AAACAAAACC TTCAATTAGA TAGTTTTTCA ACTTATAGAG 2160
TGTATTTTTC TGTGTCCGGA GATGCTAATG TAAGGATTAG AAATTCTAGG GAAGTGTTAT 2220
TTGAAAAAAG ATATATGAGC GGTGCTAAAG ATGTTTCTGA AATGTTCACT ACAAAATTTG 2280
AGAAAGATAA CTTTTATATA GAGCTTTCTC AAGGGAATAA TTTATATGGT GGTCCTATTG 2340
TACATTTTTA CGATGTCTCT ATTAAGTAA 2369
(2) SEQUENCE IDENTIFICATION INFORMATION NO. 94:
(i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 789 amino acids (B) TYPE: amino acid (C) TYPE OF CHAIN: individual (D) TOPOLOGY: linear
(ii) TYPE OF MOLECULE: protein
(xi) SEQUENCE DESCRIPTION: ID. OF SEQUENCE NO: 94:
Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Ala Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490 495
Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Glu Leu Leu Leu Thr Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 740 745 750
Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 770 775 780
Asp Val Ser Ile Lys 785
(2) INFORMATION FOR SEQ ID NO: 95:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2370 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
TTGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120
GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180
ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300
AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360
ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420
TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480
CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540
GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600
TCTCCTGCAG ATATTCTTGA TGAGTTAGCT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660
AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780
GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840
CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960
AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020
AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080
GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140
TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200
TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440
ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500
AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560
AGCAATAAAG AAACTAAATT GATTGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620
AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680
GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740
ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800
GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860
GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920
GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980
AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040
ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100
CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160
GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220
TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280
GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340
GTACATTTTT ACGATGTCTC TATTAAGTAA 2370
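The row layout of the nucleotide listings (groups of ten bases followed by a running position count) can be checked mechanically. The following is a minimal Python sketch and is not part of the original filing: the helper name `parse_rows` is hypothetical, and only the first two rows printed for SEQ ID NO: 95 are used as input. It also checks that the declared 2370 bases are consistent with a 789-codon reading frame plus one three-base stop codon, which is an assumption based on the 789-amino-acid lengths declared in this listing and on the final TAA of the sequence.

```python
def parse_rows(rows):
    """Join listing rows into one sequence, checking each running count."""
    sequence = ""
    for row in rows:
        # Every token but the last is a group of bases; the last is the count.
        *groups, count = row.split()
        sequence += "".join(groups)
        if len(sequence) != int(count):
            raise ValueError(
                f"row ends at base {len(sequence)}, listing says {count}"
            )
    return sequence


# First two rows of SEQ ID NO: 95 as printed in the listing.
rows = [
    "TTGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60",
    "AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120",
]
sequence = parse_rows(rows)

# Declared bases minus 789 three-base codons should leave one stop codon.
declared_bases = 2370
assumed_residues = 789
leftover_bases = declared_bases - 3 * assumed_residues
print(len(sequence), leftover_bases)
```

A row whose running count disagrees with the number of bases actually printed raises `ValueError`, which is one way to spot rows that lost or gained a base during extraction.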
(2) INFORMATION FOR SEQ ID NO: 96:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 789 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Asn Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490 495
Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Glu Leu Leu Leu Thr Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 740 745 750
Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 770 775 780
Asp Val Ser Ile Lys 785
(2) INFORMATION FOR SEQ ID NO: 97:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2374 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:
ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120
GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180
ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300
AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360
ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420
TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480
CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540
GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600
TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660
AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780
GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840
CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960
AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020
AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080
GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140
TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200
TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAACGT CGAATCAAGT 1380
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440
ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500
AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560
AGCAATAAAG AAACTAAATT GATGTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAGA 1620
ACGGGTCCAT AGAAGAGGAC AATTTAGAGC CGTGGAAAGC AAATAATAAG AATGCGTATG 1680
TAGATCATAC AGGCGGAGTG AATGGAACTA AAGCTTTATA TGTTCATAAG GACGGAGGAA 1740
TTTCACAATT TATTGGAGAT AAGTTAAAAC CGAAAACTGA GTATGTAATC CAATATACTG 1800
TTAAAGGAAA ACCTTCTATT CATTTAAAAG ATGAAAATAC TGGATATATT CATTATGAAG 1860
ATACAAATAA TAATTTAGAA GATTATCAAA CTATTAATAA ACGTTTTACT ACAGGAACTG 1920
ATTTAAAGGG AGTGTATTTA ATTTTAAAAA GTCAAAATGG AGATGAAGCT TGGGGAGATA 1980
ACTTTATTAT TTTGGAAATT AGTCCTTCTG AAAAGTTATT AAGTCCAGAA TTAATTAATA 2040
CAAATAATTG GACGAGTACG GGATCAACTA ATATTAGCGG TAATACACTC ACTCTTTATC 2100
AGGGAGGACG AGGGATTCTA AAACAAAACC TTCAATTAGA TAGTTTTTCA ACTTATAGAG 2160
TGTATTTTTC TGTGTCCGGA GATGCTAATG TAAGGATTAG AAATTCTAGG GAAGTGTTAT 2220
TTGAAAAAAG ATATATGAGC GGTGCTAAAG ATGTTTCTGA AATGTTCACT ACAAAATTTG 2280
AGAAAGATAA CTTTTATATA GAGCTTTCTC AAGGGAATAA TTTATATGGT GGTCCTATTG 2340
TACATTTTTA CGATGTCTCT ATTAAGTAAC CCAA 2374
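To illustrate how a nucleotide row relates to the three-letter protein rows, the sketch below translates the first row printed for SEQ ID NO: 97 using the standard genetic code. It is an illustrative check only, not the procedure used in the patent; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are assumptions introduced here.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate complete codons of an unambiguous DNA string, frame 1."""
    dna = dna.replace(" ", "").upper()
    return [
        THREE_LETTER[CODON_TABLE[dna[i:i + 3]]]
        for i in range(0, len(dna) - 2, 3)
    ]


# First row (bases 1-60) of SEQ ID NO: 97 as printed in the listing.
row = "ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT"
residues = translate(row)
print(" ".join(residues[:16]))
```

The first sixteen residues produced this way can be compared with the first row printed for SEQ ID NO: 96 (Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe).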
(2) INFORMATION FOR SEQ ID NO: 98:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 789 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Xaa Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Xaa Lys Leu Leu Gly Leu Ala Asn Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Xaa Ile Xaa Gly Phe Gly Leu Gln Ala 485 490 495
Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Glu Leu Leu Leu Thr Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Phe Ser Gln Phe Ile Gly Asp Xaa Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Xaa Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 740 745 750
Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 770 775 780
Asp Val Ser Ile Lys 785
(2) INFORMATION FOR SEQ ID NO: 99:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2366 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CGAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120
GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180
ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAA TCAAGTCTTA 300
AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360
ATTACATCTA TGTTAAGTGA TGTAATGAAG CAAAATTATG CGCTAAGTCT GCAAATAGAA 420
TACTTAAGTA AACAATTGCA GAATTTCTGA TAAATTAGAT ATTATTAACG TAAATGTTCT 480
TATTAACTCT ACACTTACTG AAATTACACC TGCATATCAA CGGATTAAAT ATGTGAAGAA 540
AAATTTGAAG AATTAACTTT TGCTACAGAA ACCACTTTAA AAGTAAAAAA GGATAGCTCG 600
CCTGCTGATA TTCTTGATGA GTTAACTGAA TTAACTGAAC TAGCGAAAAG TGTTACAAAA 660
AATGACGTTG ATGGTTTTGA ATTTTACCTT AATACATTCC ACGATGTAAT GGTAGGAAAT 720
AATTTATTCG GGCGTTCAGC TTTAAAAACT GCTTCAGAAT TAATTGCTAA AGAAAATGTG 780
AAAACAAGTG GCAGTGAAGT AGGAAATGTT TATAATTTCT TAATTGTATT AACAGCTCTA 840
CAAGCAAAAG CTTTTCTTAC TTTAACAACA TGCCAAAATT ATTAGGCTTA GCAAATATTG 900
ATTATACTTC TATTATGAAT GAACATTTAA ATAAGGAAAA AGAGGAATTT AGAGTAAACA 960
TCCTTCCTAC ACTTTCTAAT ACTTTTTCTA ATCCTAATTA TGCAAAAGTT AAAGGAAGTG 1020
ATGAAGATGC AAAGATGATT GTGGAAGCTA AACCAGGATA TGCATTGGTT GGTTTTGAAA 1080
TGAGCAATGA TTCAATCACA GTATTAAAAG TATATGAGGC TAAGCTAAAA CAAAATTATC 1140
AAGTTGATAA GGATTCCTTA TCGGAGGTTA TTTATGGTGA TACGGATAAA TTATTGTGTC 1200
CAGATCAATC TGAACAAATA TATTATACAA ATAACATAGT ATTTCCAAAT GAATATGTAA 1260
TTACTAAAAT TGATTTCACT AAAAAAATGA AAACTTTAAG ATATGAGGTA ACAGCGAATT 1320
TTTATGATTC TTCTACAGGA GAAATTGACT TAAATAAGAA AAAAGTAGAA TCAAGTGAAG 1380
CGGAGTATAG AACGTTAAGT GCTAATGATG ATGGAGTGTA TATGCCATTA GGTGTCATCA 1440
GTGAAACATT TTTGACTCGA TTATGGGTTT GGCCTCCAAG CTGATGGAAA TTCAAGATTA 1500
ATTACTTTAA CATGTAAATC ATATTTAAGA GAACTACTGC TAGCAACAGA CTTAAGCAAT 1560
AAAGAAACTA AATTGATTGT CCCCCAAGTG GTTTTATTAG CAATATTGTA GAGAACGGGT 1620
CCATAGAAGA GGACAATTTA GAGCCGTGGA AAGCAAATAA TAAGAATGCG TATGTAGATC 1680
ATACAGGCGG AGTGAATGGA ACTAAAGCTT TATATGTTCA TAAGGACGGA GGATTTTCAC 1740
AATTTATTGG AGATAATTAA AACCGAAAAC TGAGTATTAA TCCAATATAC TGTTAAAGGA 1800
AAACCTTCTA TTCATTTAAA AGATGAAAAT ACTGGATATA TTCATTATGA AGATACAAAT 1860
AATAATTTAA AAGATTATCA AACTATTACT AAACGTTTTA CTACAGGAAC TGATTTAAAG 1920
GGAGTGTATT TAATTTTAAA AAGTCAAAAT GGAGATGAAG CTTGGGGAGA TAACTTTATT 1980
ATTTTGGAAA TTAGTCCTTC TGAAAAGTTA TTAAGTCCAG AATTAATTAA TACAAATAAT 2040
TGGACGAGTA CGGGATCAAC TCATATTAGC GGTAATACAC TCACTCTTTA TCAGGGAGGA 2100
CGAGGAATTC TAAAACAAAA CCTTCAATTA GATAGTTTTT CAACTTATAG AGTGTATTTT 2160
TCTGTGTCCG GAGATGCTAA TGTAAGGATT AGAAATTCTA GGGAAGTGTT ATTTGAAAAA 2220
AGATATATGA GCGGTGCTAA AGATGTTTCT GAAATGTTCA CTACAAAATT TGAGAAAGAT 2280
AACTTTTATA TAGAGCTTTC TCAAGGGAAT AATTTATATG GTGGTCCTAT TGTACATTTT 2340
TACGATGTCT CTATTAAGTA ACCCAA 2366
(2) INFORMATION FOR SEQ ID NO: 100:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 789 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Phe Met Leu Ser Asp Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490 495
Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Glu Leu Leu Leu Thr Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Xaa Asn Xaa Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Xaa Xaa Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Xaa Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Xaa Leu Ile Asn Thr Xaa Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Xaa Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Xaa Val 740 745 750
Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 770 775 780
Asp Val Ser Ile Lys 785
(2) INFORMATION FOR SEQ ID NO: 101:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2362 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120
GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180
ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300
AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360
ATTACCTTTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420
TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480
CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540
GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600
TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660
AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780
GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840
CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG GTTAGCAGAT 900
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960
AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020
AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080
GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140
TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200
TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440
ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCT CCAAGCTGAT GAAAATTCAA 1500
GATTAATTAC TTTAACATGT AAATCATATT TAAGAGAACT ACTGCTAGCA ACAGACTTAA 1560
GCAATAAAGA AACTAAATTG ATCGTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAGA 1620
ACGGGTCCAT AGAAGAGGAC AATTTAGAGC CCTGGAAAGC AATAATAGAA TGCGTATGTA 1680
GATCATACAG GCGGAGTGAA TGGAACTAAA GCTTTATATG TTCATAAGGA CGGAGGAATT 1740
TCACAATTTA TTGGAGATAA GTTAAAACCG AAAACTGAGT ATGTAATCCA ATATACTGTT 1800
AAAGGAAAAC CTTCTATTCA TTTAAAAGAT GAAAATACTG GATATATTCA TTATGAAGAT 1860
ACAAATAATA ATTTAAATTA TCAAACTATT AATAAACGTT TTACTACAGG AACTGATTTA 1920
AAGGGAGTGT ATTTAATTTT AAAAAGTCAA AATGGAATGA AGCTTGGGGA GATAACTTTA 1980
TTATTTTGGA AATTAGTCCT TCTGAAAAGT TATTAAGTCC AAATTAATTA ATACAATAAT 2040
TGGACAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT CAGGGAGGAC 2100
GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTCA ACTTATAGAG TGTATTTTTC 2160
TGTGTCCGGA GATGCTAATG TAAGGATTAG AAATTCTAGG GAAGTGTTAT TTGAAAAAAG 2220
ATATATGAGC GGTGCTAAAA TGTTTCTGAA ATGTTCACAC AAAATTTGAG AAAGATAACT 2280
TTTATATAGA GCTTTCTCAA GGGAATAATT TATATGGTGG TCCTATTGTA CATTTTTACG 2340
ATGTCTCTAT TAAGTAACCC AA 2362
(2) INFORMATION FOR SEQ ID NO: 102:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 790 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
Met His Glu Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe 1 5 10 15
Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 20 25 30
Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 35 40 45
Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys 50 55 60
Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 65 70 75 80
Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 85 90 95
Ser Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 100 105 110
Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 115 120 125
Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 130 135 140
Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 155 160
Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 165 170 175
Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 180 185 190
Thr Leu Lys Val Lys Lys Asp Xaa Ser Pro Ala Asp Ile Leu Asp Glu 195 200 205
Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 210 215 220
Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 225 230 235 240
Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 245 250 255
Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 260 265 270
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 290 295 300
Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 305 310 315 320
Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 325 330 335
Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 340 345 350
Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr 355 360 365
Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu 385 390 395 400
Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 405 410 415
Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 430
Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 440 445
Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 450 455 460
Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 485 490 495
Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510
Lys Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 515 520 525
Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 530 535 540
Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 545 550 555 560
Val Asp His Thr Gly Gly Val Lys Gly Thr Lys Ala Leu Tyr Val His 565 570 575
Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Xaa Leu Lys Pro Lys 580 585 590
Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 595 600 605
Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 610 615 620
Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr 625 630 635 640
Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 645 650 655
Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Pro Glu Lys 660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 675 680 685
Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 690 695 700
Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 705 710 715 720
Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 725 730 735
Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 740 745 750
Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 755 760 765
Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 770 775 780
Asp Val Xaa Ile Lys Pro 785 790
(2) INFORMATION FOR SEQ ID NO: 103:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2375 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
ATGCACGAGA ATAATACTAA ATTAAGCGCA AGGGCCTTAC CGAGTTTTAT TGATTATTTT 60
AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120
GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180
ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240
TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAG TCAAGTTTTA 300
AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360
ATTACATCTA TGTTAAGTGA TGTAATGAAG CAAAATTATG CGCTAAGTCT GCAAATAGAA 420
TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTT 480
CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 540
GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACCACTT TAAAAGTAAA AAAGGATRAC 600
TCGCCTGCTG ATATTCTTGA TGAATTAACT GAATTAACTG AACTAGCGAA AAGTGTTACA 660
AAAAATGACG TTGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 780
GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAATT TCTTAATTGT ATTAACAGCT 840
CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960
AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020
AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 1080
GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140
TATCAAGTTG ATAAGGATTC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 1200
TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 1440
ATCAGTGAAA CATTTTTGAC TCCGATAAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 1500
AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAAAAC TACTGCTAGC AACAGACTTA 1560
AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620
AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680
GTAGATCATA CAGGCGGAGT GAAAGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740
ATTTCACAAT TTATTGGAGA TAAKTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800
GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860
GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1920
GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980
AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040
ACAAATAATT GGACGAGTAC GGGATCAACT CATATTAGCG GTAATACACT CACTCTTTAT 2100
CAGGGAGGAC GAGGAATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160
GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220
TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280
GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340
GTGCATTTTT ACGATGTCYC TATTAAGTAA CCCAA 2375
(2) INFORMATION FOR SEQ ID NO: 104:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 554 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
Thr Leu His Leu Leu Lys Leu His Leu Arg Ile Lys Gly Leu Asn Met 1 5 10 15
Thr Lys Asn Leu Arg Asn Leu Leu Leu Xaa Xaa Leu Xaa Gln Lys Lys 20 25 30
Arg Met Ala Leu Leu Gln Ile Phe Xaa Met Ser Leu Ser Xaa Asn Arg 35 40 45
Lys Val Gln Lys Met Met Trp Met Val Leu Asn Phe Thr Leu Ile His 50 55 60
Ser Thr Met Xaa Glu Ile Ile Tyr Ser Gly Val Gln Leu Lys Leu Xaa 65 70 75 80
Arg Asn Leu Leu Lys Lys Met Lys Gln Val Ala Val Xaa Xaa Glu Met 85 90 95
Phe Ile Xaa Ser Leu Tyr Gln Leu Xaa Lys Gln Lys Leu Phe Leu Leu 100 105 110
Gln His Ala Glu Asn Tyr Xaa Gln Ile Leu Ile Ile Leu Leu Leu Met 115 120 125
Asn Ile Ile Arg Lys Lys Arg Asn Leu Glu Thr Ser Xaa Leu His Phe 130 135 140
Leu Ile Leu Phe Leu Ile Leu Ile Met Gln Lys Leu Lys Glu Val Met 145 150 155 160
Lys Met Gln Arg Leu Trp Lys Leu Asn Gln Asp Met His Trp Leu Val 165 170 175
Leu Lys Ala Met Ile Gln Ser Gln Tyr Lys Tyr Met Arg Leu Ser Asn 180 185 190
Lys Ile Ile Lys Leu Ile Arg Ile Pro Tyr Arg Arg Leu Phe Met Val 195 200 205
Ile Arg Ile Asn Tyr Cys Val Gln Ile Asn Leu Asn Lys Tyr Ile Ile 210 215 220
Gln Ile Thr Tyr Phe Gln Met Asn Met Leu Leu Lys Leu Ile Ser Leu 225 230 235 240
Lys Lys Lys Leu Asp Met Arg Gln Arg Ile Phe Met Ile Leu Leu Gln 245 250 255
Glu Lys Leu Thr Ile Arg Lys Lys Asn Gln Val Lys Arg Ser Ile Glu 260 265 270
Arg Val Leu Met Met Met Xaa Cys Ile Cys His Val Ser Ser Val Lys 275 280 285
His Phe Leu Arg Met Gly Leu Ala Ser Lys Leu Arg Gln Ile Gln Asp 290 295 300
Leu Leu His Val Asn His Ile Glu Asn Tyr Cys Gln Gln Thr Ala Ile 305 310 315 320
Arg Lys Leu Asn Ser Ser Arg Gln Val Phe Tyr Gln Tyr Cys Arg Glu 325 330 335
Arg Val Leu Arg Arg Gly Gln Phe Arg Ala Val Glu Ser Lys Glu Cys 340 345 350
Val Cys Arg Ser Tyr Arg Arg Ser Glu Trp Asn Ser Phe Ile Cys Ser 355 360 365
Gly Arg Arg Asn Phe Thr Ile Tyr Trp Arg Val Lys Thr Glu Asn Val 370 375 380
Cys Asn Pro Ile Tyr Cys Arg Lys Thr Phe Tyr Ser Phe Lys Arg Lys 385 390 395 400
Tyr Trp Ile Tyr Ser Leu Arg Tyr Lys Phe Lys Arg Leu Ser Asn Tyr 405 410 415
Tyr Thr Phe Tyr Tyr Arg Asn Phe Lys Gly Ser Val Phe Asn Phe Lys 420 425 430
Lys Ser Lys Trp Arg Ser Leu Gly Arg Leu Tyr Tyr Phe Gly Asn Ser 435 440 445
Phe Lys Val Ile Lys Ser Arg Ile Asn Tyr Lys Leu Asp Glu Tyr Gly 450 455 460
Ile Asn Ser Tyr Arg Tyr Thr His Ser Leu Ser Gly Arg Thr Arg Asn 465 470 475 480
Ser Lys Thr Lys Pro Ser Ile Arg Phe Phe Asn Leu Ser Val Phe Phe 485 490 495
Cys Val Arg Arg Cys Cys Lys Asp Lys Phe Gly Ser Val Ile Lys Lys 500 505 510
Ile Tyr Glu Arg Cys Arg Cys Phe Asn Val His Tyr Lys Ile Glu Arg 515 520 525
Leu Leu Tyr Arg Ala Phe Ser Arg Glu Phe Ile Trp Trp Ser Tyr Cys 530 535 540
Thr Phe Leu Arg Cys Leu Tyr Val Thr Gln 545 550
(2) INFORMATION FOR SEQ ID NO: 105:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1888 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
ACTCTACACT TACTGAAATT ACACCTGCGT ATCAAAGGAT TAAATATGTG AACGAAAAAT 60
TTGAGGAATT AACTTTTGCT ACRGAMACTA KTTCAAAAGT AAAAAMGGAT GGCTCTCCTS 120
CAGATATTCT KGATGAGTTA ACTGAGTTAA CWGAACTAGC GAAAAGTGTA ACAAAAAATG 180
ATGTGGATGG TTTTRAATTT TACCTTAATA CATTCCACGA TGTAAKGGTA GGAAATAATT 240
TATTCGGGCG TTCAGCTTTA AAAACTGCWT CGGAATTAAT TRCTAAAGAA AATGTGAAAA 300
CAAGTGGCAG TGARGTMGGA AATGTTTATA AYTTCTTAAT TGTATTAACA GCTCTRCAAG 360
CAAAAGCTTT TCTTACTTTA ACAACATGCC GAAAATTATT AGGSTTAGCA GATATTGATT 420
ATACTTCTAT TATGAATGAA CATTTAAATA AGGAAAAAGA GGAATTTAGA GTAAACATCC 480
TYCCTACACT TTCTAATACT TTTTCTAATC CTAATTATGC AAAAGTTAAA GGAAGTGATG 540
AAGATGCAAA GATGATTGTG GAAGCTAAAC CAGGATATGC ATTGGTTGGT TTTGAAATGA 600
GCAATGATTC AATCACAGTA TTAAAAGTAT ATGAGGCTAA GCTAAAACAA AATTATCAAG 660
TTGATAAGGA TTCCTTATCG GAGGTTATTT ATGGTGATAC GGATAAATTA TTGTGTCCAG 720
ATCAATCTGA ACAAATATAT TATACAAATA ACATAGTATT TCCAAATGAA TATGTAATTA 780
CTAAAATTGA TTTCACTAAA AAAATGAAAA CTTTAAGATA TGAGGTAACA GCGAATTTTT 840
ATGATTCTTC TACAGGAGAA ATTGACTTAA ATAAGAAAAA AGTAGAATCA AGTGAAGCGG 900
AGTATAGAAC GTTAAGTGCT AATGATGATG GRGTGTATAT GCCATTAGGT GTCATCAGTG 960
AAACATTTTT GACTCCGATA AATGGGTTTG GCCTCCAAGC TGAGGCAAAT TCAAGATTAA 1020
TTACTTTAAC ATGTAAATCA TATTTAAGAG AACTACTGCT AGCAACAGAC TTAAGCAATW 1080
AGGAAACTAA ATTGATCTTC CCGCCAAGTG TTTTATTAGC AATATTGTAG AGAACGGGTC 1140
CTTAGAAGAG GACAATTTAG AGCCGTGGAA AGCAAATAAT AAGAATGCGT ATGTAGATCA 1200
TACAGGCGGA GTGAATGGAA CTAAAGCTTT ATATGTTCAT AAGGACGGAG GAATTTCACA 1260
ATTTATTGGA GATAAGTTAA AACCGAAAAC TGAGTATGTA ATCCAATATA CTGTTAAAGG 1320
AAAACCTTCT ATTCATTTAA AAGATGAAAA TACTGGATAT ATTCATTATG AAGATACAAA 1380
TAATAATTTA AAAGATTATC AAACTATTAC TAAACGTTTT ACTACAGGAA CTGATTTAAA 1440
GGGAGTGTAT TTAATTTTAA AAAGTCAAAA TGGAGATGAA GCTTGGGGAG ATAACTTTAT 1500
TATTTTGGAA ATTAGTCCTT CTGAAAAGTT ATTAAGTCCA GAATTAATTA ATACAAATAA 1560
TTGGACGAGT ACGGGATCAA CTCATATTAG CGGTAATACA CTCACTCTTT ATCAGGGAGG 1620
ACGAGGAATT CTAAAACAAA ACCTTCAATT AGATAGTTTT TCAACTTATA GAGTGTATTT 1680
TTCTGTGTCC GGAGATGCTA ATGTAAGGAT TAGAAATTCT AGGGAAGTGT TATTTGAAAA 1740
AAGATATATG AGCGGTGCTA AAGATGTTTC TGAAATGTTC ACTACAAAAT TTGAGAAAGA 1800
TAACTTTTAT ATAGAGCTTT CTCAAGGGAA TAATTTATAT GGTGGTCCTA TTGTACATTT 1860
TTACGATGTC TCTATTAAGT AACCCAAA 1888
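The nucleotide listings above use IUPAC ambiguity codes (R, Y, K, M, S, W) alongside A, C, G and T, and the protein listings use Xaa for undetermined residues. As an illustration only, the following minimal Python sketch translates a DNA string codon by codon with the standard genetic code and reports Xaa whenever an ambiguous base leaves the residue undetermined. The function name `translate`, the reading frame (starting at the first base), and the use of "Ter" for stop codons are assumptions made for this sketch; the patent does not describe how its translations were produced.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide codes mapped to the bases they may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# One-letter to three-letter residue names; "Ter" marks a stop codon (assumed label).
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def translate(dna):
    """Translate DNA (whitespace ignored) in frame 1 into three-letter codes."""
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        # Expand ambiguity codes and collect every residue the codon could encode.
        options = {
            CODON_TABLE["".join(p)]
            for p in product(*(IUPAC[base] for base in codon))
        }
        # A single possible residue is reported; anything else becomes Xaa.
        residues.append(THREE[options.pop()] if len(options) == 1 else "Xaa")
    return residues


# First 50 bases listed for SEQ ID NO: 105 (the trailing partial codon is dropped).
print(" ".join(translate("ACTCTACACT TACTGAAATT ACACCTGCGT ATCAAAGGAT TAAATATGTG")))
```

The printed residues can be compared with the opening residues listed for SEQ ID NO: 104. A codon such as TTR still resolves to Leu because both TTA and TTG encode leucine, whereas KGG (GGG or TGG) is reported as Xaa.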
Claims (10)
1. A method for controlling the European corn borer (Ostrinia nubilalis), which comprises contacting the pest with a toxin that includes an amino acid sequence having at least 75% identity with SEQ ID NO: 74 or a pesticidal fragment thereof.
2. A method according to claim 1, further characterized in that the amino acid sequence has at least 75% identity with a pesticidal fragment of SEQ ID NO: 74.
3. A method according to claim 1, further characterized in that the amino acid sequence has at least 75% identity with SEQ ID NO: 74.
4. A method according to claim 1, further characterized in that the toxin comprises the amino acid sequence appearing in SEQ ID NO: 74 or a pesticidal fragment thereof.
5. A method according to claim 1, further characterized in that the toxin comprises the amino acid sequence appearing in SEQ ID NO: 74.
6. A method for controlling the European corn borer (Ostrinia nubilalis), which comprises contacting the pest with a toxin that includes an amino acid sequence encoded by a polynucleotide having a complement that hybridizes under high-stringency conditions with a nucleotide sequence encoding SEQ ID NO: 74 or a pesticidal fragment thereof.
7. A method according to claim 6, further characterized in that the nucleotide sequence encodes a pesticidal fragment of SEQ ID NO: 74.
8. A method according to claim 6, further characterized in that the nucleotide sequence encodes SEQ ID NO: 74.
9. A method for controlling the European corn borer (Ostrinia nubilalis), which comprises contacting the pest with a toxin that immunoreacts with an antibody to SEQ ID NO: 74 or a pesticidal fragment thereof.
10. A method according to claim 9, further characterized in that the toxin immunoreacts with an antibody to a pesticidal fragment of SEQ ID NO: 74.
11. A method according to claim 9, further characterized in that the toxin immunoreacts with an antibody to SEQ ID NO: 74.
12. A method for controlling the European corn borer (Ostrinia nubilalis), wherein said method comprises contacting said pest with a pesticidal amount of a Bacillus thuringiensis toxin, wherein said toxin has a characteristic selected from the group consisting of: a) said toxin comprises an amino acid sequence having at least about 75% homology with a sequence selected from the group consisting of SEQ ID NO: 70, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 76, SEQ ID NO: 78, SEQ ID NO: 80, SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104; b) said toxin comprises an amino acid sequence that is encoded by a nucleotide sequence that hybridizes to a nucleotide sequence that encodes an amino acid sequence selected from the group consisting of SEQ ID NO: 70, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 76, SEQ ID NO: 78, SEQ ID NO: 80, SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104; and c) said toxin immunoreacts with an antibody to a toxin selected from the group consisting of SEQ ID NO: 70, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 76, SEQ ID NO: 78, SEQ ID NO: 80, SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
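The claims recite a threshold of at least 75% sequence identity but this section does not state how identity is to be calculated. Purely as an illustration, the Python sketch below computes percent identity from a simple global (Needleman-Wunsch style) alignment. The scoring values (match +1, mismatch -1, gap -1), the choice of the full alignment length including gaps as the denominator, the function names, and the toy sequences are all assumptions of this sketch and are not taken from the patent.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two equal-length lists."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover the aligned columns.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return top[::-1], bottom[::-1]


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    top, bottom = global_align(a, b)
    if not top:
        return 0.0
    identical = sum(1 for x, y in zip(top, bottom) if x == y)
    return 100.0 * identical / len(top)


# Toy one-letter sequences (hypothetical, not from the sequence listing).
identity = percent_identity("AAAA", "AAAT")
print(identity, identity >= 75.0)
```

Other conventions (local alignment, substitution matrices, or dividing by the length of the shorter sequence) give different percentages, so any real comparison against the 75% threshold would depend on the method actually intended.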
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09002285 | 1997-12-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
MXPA00006576A (en) | 2001-06-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU741036B2 (en) | | Toxins active against pests |
US6083499A (en) | | Pesticidal toxins |
KR101841296B1 (en) | | Use of cry1da in combination with cry1be for management of resistant insects |
AU713376B2 (en) | | Controlling hemipteran insect pests with bacillus thuringiensis |
US7355003B2 (en) | | Pesticidal proteins |
US5632987A (en) | | Bacillus thuringiensis toxins active against corn rootworm larvae |
US6752992B2 (en) | | Toxins active against pests |
US6570005B1 (en) | | Toxins active against pests |
KR20000053001A (en) | | Novel pesticidal toxins and nucleotide sequences which encode these toxins |
KR20010043334A (en) | | Pesticidal toxins and nucleotide sequences which encode these toxins |
US20040128716A1 (en) | | Polynucleotides, pesticidal proteins, and novel methods of using them |
US7790961B2 (en) | | Pesticidal proteins |
MXPA00006576A (en) | | Toxins active against ostrinia nubilalis |
AU745617B2 (en) | | Materials and methods for controlling homopteran pests |
AU2003203829B2 (en) | | Pesticidal Toxins |
MXPA00001353A (en) | | Materials and methods for controlling homopteran pests |