MXPA99011066A - Plant amino acid biosynthetic enzymes - Google Patents
Plant amino acid biosynthetic enzymes
- Publication number
- MXPA99011066A (also written MXPA/A/1999/011066A, MX9911066A)
- Authority
- MX
- Mexico
- Prior art keywords
- nucleic acid
- seq
- val
- leu
- gly
- Prior art date
- 102000004190 Enzymes Human genes 0.000 title claims abstract description 98
- 108090000790 Enzymes Proteins 0.000 title claims abstract description 98
- 150000001413 amino acids Chemical class 0.000 title claims description 210
- 230000001851 biosynthetic Effects 0.000 title claims description 63
- 150000007523 nucleic acids Chemical group 0.000 claims abstract description 163
- 108091006028 chimera Proteins 0.000 claims abstract description 63
- 230000014509 gene expression Effects 0.000 claims abstract description 62
- 108010022394 EC 4.2.3.1 Proteins 0.000 claims abstract description 38
- 102000006843 EC 4.2.3.1 Human genes 0.000 claims abstract description 38
- 108010007784 EC 2.5.1.6 Proteins 0.000 claims abstract description 36
- 108010006873 Threonine Dehydratase Proteins 0.000 claims abstract description 35
- 108010014468 Dihydrodipicolinate reductase Proteins 0.000 claims abstract description 29
- 108010001625 diaminopimelate epimerase Proteins 0.000 claims abstract description 28
- 238000004519 manufacturing process Methods 0.000 claims abstract description 21
- 239000004473 Threonine Substances 0.000 claims abstract description 16
- 102000007357 EC 2.5.1.6 Human genes 0.000 claims abstract 6
- 229920002676 Complementary DNA Polymers 0.000 claims description 127
- 239000002299 complementary DNA Substances 0.000 claims description 121
- 229920001850 Nucleic acid sequence Polymers 0.000 claims description 112
- 229920003013 deoxyribonucleic acid Polymers 0.000 claims description 45
- 108020004707 nucleic acids Proteins 0.000 claims description 33
- 230000001105 regulatory Effects 0.000 claims description 26
- 229920001184 polypeptide Polymers 0.000 claims description 22
- 230000001131 transforming Effects 0.000 claims description 19
- 230000000694 effects Effects 0.000 claims description 14
- 230000000295 complement Effects 0.000 claims description 13
- 150000001875 compounds Chemical class 0.000 claims description 11
- 229920000272 Oligonucleotide Polymers 0.000 claims description 8
- 102000003960 Ligases Human genes 0.000 claims description 7
- 108090000364 Ligases Proteins 0.000 claims description 7
- 230000002401 inhibitory effect Effects 0.000 claims description 7
- 238000010367 cloning Methods 0.000 claims description 6
- 239000003999 initiator Substances 0.000 claims description 4
- 108090000854 Oxidoreductases Proteins 0.000 claims description 3
- 102000004316 Oxidoreductases Human genes 0.000 claims description 3
- 230000000875 corresponding Effects 0.000 claims description 3
- 230000002194 synthesizing Effects 0.000 claims description 2
- 229940088598 Enzyme Drugs 0.000 abstract description 32
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 abstract description 25
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 abstract description 22
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 abstract description 18
- XUJNEKJLAYXESH-REOHCLBHSA-N L-cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 abstract description 15
- 239000004472 Lysine Substances 0.000 abstract description 15
- 229960000310 ISOLEUCINE Drugs 0.000 abstract description 8
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 abstract description 8
- 235000018417 cysteine Nutrition 0.000 abstract description 8
- 230000015572 biosynthetic process Effects 0.000 abstract description 7
- 230000037348 biosynthesis Effects 0.000 abstract description 6
- 229940009098 Aspartate Drugs 0.000 abstract description 4
- CKLJMWTZIZZHCS-UHFFFAOYSA-N DL-aspartic acid Chemical compound OC(=O)C(N)CC(O)=O CKLJMWTZIZZHCS-UHFFFAOYSA-N 0.000 abstract description 4
- 238000010276 construction Methods 0.000 abstract description 3
- 230000000692 anti-sense Effects 0.000 abstract 1
- 235000001014 amino acid Nutrition 0.000 description 106
- 241000196324 Embryophyta Species 0.000 description 63
- 102000004169 proteins and genes Human genes 0.000 description 55
- 108090000623 proteins and genes Proteins 0.000 description 55
- 235000018102 proteins Nutrition 0.000 description 52
- 210000004027 cells Anatomy 0.000 description 48
- 240000008042 Zea mays Species 0.000 description 41
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 40
- 108010034529 leucyl-lysine Proteins 0.000 description 34
- 235000010469 Glycine max Nutrition 0.000 description 33
- OTXBNHIUIHNGAO-UWVGGRQHSA-N Leu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN OTXBNHIUIHNGAO-UWVGGRQHSA-N 0.000 description 32
- 102100006971 MAT1A Human genes 0.000 description 32
- 235000005822 corn Nutrition 0.000 description 30
- 235000005824 corn Nutrition 0.000 description 30
- 108010050848 glycylleucine Proteins 0.000 description 28
- 229920001405 Coding region Polymers 0.000 description 27
- 108010047857 aspartylglycine Proteins 0.000 description 27
- 125000000267 glycino group Chemical group [H]N([*])C([H])([H])C(=O)O[H] 0.000 description 26
- 108010037850 glycylvaline Proteins 0.000 description 26
- 239000002773 nucleotide Substances 0.000 description 26
- 125000003729 nucleotide group Chemical group 0.000 description 26
- 108010061238 threonyl-glycine Proteins 0.000 description 26
- 239000002585 base Substances 0.000 description 25
- 108010073969 valyllysine Proteins 0.000 description 25
- 108010057821 leucylproline Proteins 0.000 description 24
- JHFNSBBHKSZXKB-VKHMYHEASA-N Asp-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(O)=O JHFNSBBHKSZXKB-VKHMYHEASA-N 0.000 description 22
- 241000880493 Leptailurus serval Species 0.000 description 22
- STKYPAFSDFAEPH-LURJTMIESA-N gly-val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CN STKYPAFSDFAEPH-LURJTMIESA-N 0.000 description 22
- 229920000160 (ribonucleotides)n+m Polymers 0.000 description 21
- VTJUNIYRYIAIHF-IUCAKERBSA-N Leu-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O VTJUNIYRYIAIHF-IUCAKERBSA-N 0.000 description 21
- JKHXYJKMNSSFFL-IUCAKERBSA-N Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN JKHXYJKMNSSFFL-IUCAKERBSA-N 0.000 description 21
- 108010038633 aspartylglutamate Proteins 0.000 description 21
- 239000002609 media Substances 0.000 description 21
- YBAFDPFAUTYYRW-YUMQZZPRSA-N Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O YBAFDPFAUTYYRW-YUMQZZPRSA-N 0.000 description 20
- LESXFEZIFXFIQR-LURJTMIESA-N Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(O)=O LESXFEZIFXFIQR-LURJTMIESA-N 0.000 description 20
- XGDCYUQSFDQISZ-BQBZGAKWSA-N Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O XGDCYUQSFDQISZ-BQBZGAKWSA-N 0.000 description 20
- IOUPEELXVYPCPG-UHFFFAOYSA-N val-gly Chemical compound CC(C)C(N)C(=O)NCC(O)=O IOUPEELXVYPCPG-UHFFFAOYSA-N 0.000 description 20
- DKEXFJVMVGETOO-LURJTMIESA-N Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CN DKEXFJVMVGETOO-LURJTMIESA-N 0.000 description 19
- 108010077245 asparaginyl-proline Proteins 0.000 description 19
- 108010017391 lysylvaline Proteins 0.000 description 19
- CKAJHWFHHFSCDT-WHFBIAKZSA-N Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O CKAJHWFHHFSCDT-WHFBIAKZSA-N 0.000 description 18
- 240000007842 Glycine max Species 0.000 description 18
- NFNVDJGXRFEYTK-YUMQZZPRSA-N Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O NFNVDJGXRFEYTK-YUMQZZPRSA-N 0.000 description 18
- 108010078144 glutaminyl-glycine Proteins 0.000 description 18
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 17
- JYOAXOMPIXKMKK-UHFFFAOYSA-N Leucyl-Glutamine Chemical compound CC(C)CC(N)C(=O)NC(C(O)=O)CCC(N)=O JYOAXOMPIXKMKK-UHFFFAOYSA-N 0.000 description 17
- GIAZPLMMQOERPN-YUMQZZPRSA-N Val-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GIAZPLMMQOERPN-YUMQZZPRSA-N 0.000 description 17
- STTYIMSDIYISRG-WDSKDSINSA-N Val-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(O)=O STTYIMSDIYISRG-WDSKDSINSA-N 0.000 description 17
- 108010070643 prolylglutamic acid Proteins 0.000 description 17
- 210000001519 tissues Anatomy 0.000 description 17
- GJSURZIOUXUGAL-UHFFFAOYSA-N 2-((2,6-Dichlorophenyl)imino)imidazolidine Chemical compound ClC1=CC=CC(Cl)=C1NC1=NCCN1 GJSURZIOUXUGAL-UHFFFAOYSA-N 0.000 description 16
- GVRKWABULJAONN-UHFFFAOYSA-N Valyl-Threonine Chemical compound CC(C)C(N)C(=O)NC(C(C)O)C(O)=O GVRKWABULJAONN-UHFFFAOYSA-N 0.000 description 16
- 108010060035 arginylproline Proteins 0.000 description 16
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 16
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 16
- MRVYVEQPNDSWLH-UHFFFAOYSA-N Glutaminyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)CCC(N)=O MRVYVEQPNDSWLH-UHFFFAOYSA-N 0.000 description 15
- JLXVRFDTDUGQEE-YFKPBYRVSA-N Gly-Arg Chemical compound NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N JLXVRFDTDUGQEE-YFKPBYRVSA-N 0.000 description 15
- BIYXEUAFGLTAEM-WUJLRWPWSA-N Thr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(O)=O BIYXEUAFGLTAEM-WUJLRWPWSA-N 0.000 description 15
- 241000209140 Triticum Species 0.000 description 15
- 108010013835 arginine glutamate Proteins 0.000 description 15
- 230000002068 genetic Effects 0.000 description 15
- 108010010147 glycylglutamine Proteins 0.000 description 15
- 239000002245 particle Substances 0.000 description 15
- 235000021307 wheat Nutrition 0.000 description 15
- NTQDELBZOMWXRS-UHFFFAOYSA-N Aspartyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC(O)=O NTQDELBZOMWXRS-UHFFFAOYSA-N 0.000 description 14
- PNMUAGGSDZXTHX-BYPYZUCNSA-N Gly-Gln Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(N)=O PNMUAGGSDZXTHX-BYPYZUCNSA-N 0.000 description 14
- ZOKVLMBYDSIDKG-CSMHCCOUSA-N Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ZOKVLMBYDSIDKG-CSMHCCOUSA-N 0.000 description 14
- 108020004999 Messenger RNA Proteins 0.000 description 14
- FADYJNXDPBKVCA-UHFFFAOYSA-N Phenylalanyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 14
- 108010092114 histidylphenylalanine Proteins 0.000 description 14
- 229920002106 messenger RNA Polymers 0.000 description 14
- 238000003752 polymerase chain reaction Methods 0.000 description 14
- JSLGXODUIAFWCF-UHFFFAOYSA-N Arginyl-Asparagine Chemical compound NC(N)=NCCCC(N)C(=O)NC(CC(N)=O)C(O)=O JSLGXODUIAFWCF-UHFFFAOYSA-N 0.000 description 13
- KLKHFFMNGWULBN-VKHMYHEASA-N Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)NCC(O)=O KLKHFFMNGWULBN-VKHMYHEASA-N 0.000 description 13
- HXWUJJADFMXNKA-UHFFFAOYSA-N Asparaginyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CC(N)=O HXWUJJADFMXNKA-UHFFFAOYSA-N 0.000 description 13
- MLTRLIITQPXHBJ-BQBZGAKWSA-N Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O MLTRLIITQPXHBJ-BQBZGAKWSA-N 0.000 description 13
- 235000007164 Oryza sativa Nutrition 0.000 description 13
- LDEBVRIURYMKQS-UHFFFAOYSA-N Serinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CO LDEBVRIURYMKQS-UHFFFAOYSA-N 0.000 description 13
- 241000192581 Synechocystis sp. Species 0.000 description 13
- IQHUITKNHOKGFC-MIMYLULJSA-N Thr-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IQHUITKNHOKGFC-MIMYLULJSA-N 0.000 description 13
- XXDVDTMEVBYRPK-XPUUQOCRSA-N Val-Gln Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O XXDVDTMEVBYRPK-XPUUQOCRSA-N 0.000 description 13
- UPJONISHZRADBH-XPUUQOCRSA-N Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UPJONISHZRADBH-XPUUQOCRSA-N 0.000 description 13
- 108010068265 aspartyltyrosine Proteins 0.000 description 13
- 210000002257 embryonic structures Anatomy 0.000 description 13
- 108010015792 glycyllysine Proteins 0.000 description 13
- 239000000203 mixture Substances 0.000 description 13
- 108010053725 prolylvaline Proteins 0.000 description 13
- UKKNTTCNGZLJEX-UHFFFAOYSA-N γ-glutamyl-Serine Chemical compound NC(=O)CCC(N)C(=O)NC(CO)C(O)=O UKKNTTCNGZLJEX-UHFFFAOYSA-N 0.000 description 13
- MPZWMIIOPAPAKE-UHFFFAOYSA-N 4-amino-5-[[1-carboxy-4-(diaminomethylideneamino)butyl]amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CCCN=C(N)N MPZWMIIOPAPAKE-UHFFFAOYSA-N 0.000 description 12
- PSZNHSNIGMJYOZ-WDSKDSINSA-N Asp-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PSZNHSNIGMJYOZ-WDSKDSINSA-N 0.000 description 12
- JEFZIKRIDLHOIF-BYPYZUCNSA-N Gln-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(O)=O JEFZIKRIDLHOIF-BYPYZUCNSA-N 0.000 description 12
- YBTCBQBIJKGSJP-BQBZGAKWSA-N Glu-Pro Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O YBTCBQBIJKGSJP-BQBZGAKWSA-N 0.000 description 12
- SCCPDJAQCXWPTF-VKHMYHEASA-N Gly-Asp Chemical compound NCC(=O)N[C@H](C(O)=O)CC(O)=O SCCPDJAQCXWPTF-VKHMYHEASA-N 0.000 description 12
- ATIPDCIQTUXABX-UWVGGRQHSA-N Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ATIPDCIQTUXABX-UWVGGRQHSA-N 0.000 description 12
- NFDYGNFETJVMSE-BQBZGAKWSA-N Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CO NFDYGNFETJVMSE-BQBZGAKWSA-N 0.000 description 12
- ILVGMCVCQBJPSH-WDSKDSINSA-N Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO ILVGMCVCQBJPSH-WDSKDSINSA-N 0.000 description 12
- BECPPKYKPSRKCP-ZDLURKLDSA-N Thr-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O BECPPKYKPSRKCP-ZDLURKLDSA-N 0.000 description 12
- CKHWEVXPLJBEOZ-UHFFFAOYSA-N Threoninyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)C(C)O CKHWEVXPLJBEOZ-UHFFFAOYSA-N 0.000 description 12
- 230000003321 amplification Effects 0.000 description 12
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 12
- 238000000034 method Methods 0.000 description 12
- 238000003199 nucleic acid amplification method Methods 0.000 description 12
- FUESBOMYALLFNI-VKHMYHEASA-N Gly-Asn Chemical compound NCC(=O)N[C@H](C(O)=O)CC(N)=O FUESBOMYALLFNI-VKHMYHEASA-N 0.000 description 11
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 11
- HFKJBCPRWWGPEY-BQBZGAKWSA-N L-arginyl-L-glutamic acid Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HFKJBCPRWWGPEY-BQBZGAKWSA-N 0.000 description 11
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 11
- ROHDXJUFQVRDAV-UWVGGRQHSA-N Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ROHDXJUFQVRDAV-UWVGGRQHSA-N 0.000 description 11
- NYQBYASWHVRESG-MIMYLULJSA-N Phe-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 NYQBYASWHVRESG-MIMYLULJSA-N 0.000 description 11
- WOUIMBGNEUWXQG-VKHMYHEASA-N Ser-Gly Chemical compound OC[C@H](N)C(=O)NCC(O)=O WOUIMBGNEUWXQG-VKHMYHEASA-N 0.000 description 11
- YKRQRPFODDJQTC-UHFFFAOYSA-N Threoninyl-Lysine Chemical compound CC(O)C(N)C(=O)NC(C(O)=O)CCCCN YKRQRPFODDJQTC-UHFFFAOYSA-N 0.000 description 11
- 108010036413 histidylglycine Proteins 0.000 description 11
- 108010051242 phenylalanylserine Proteins 0.000 description 11
- 239000000047 product Substances 0.000 description 11
- 108010026333 seryl-proline Proteins 0.000 description 11
- LWTDZKXXJRRKDG-KXBFYZLASA-N (-)-Phaseollin Natural products C1OC2=CC(O)=CC=C2[C@H]2[C@@H]1C1=CC=C3OC(C)(C)C=CC3=C1O2 LWTDZKXXJRRKDG-KXBFYZLASA-N 0.000 description 10
- NALWOULWGHTVDA-UWVGGRQHSA-N Asp-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NALWOULWGHTVDA-UWVGGRQHSA-N 0.000 description 10
- VGRHZPNRCLAHQA-UHFFFAOYSA-N Aspartyl-Asparagine Chemical compound OC(=O)CC(N)C(=O)NC(CC(N)=O)C(O)=O VGRHZPNRCLAHQA-UHFFFAOYSA-N 0.000 description 10
- KOSRFJWDECSPRO-WDSKDSINSA-N Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O KOSRFJWDECSPRO-WDSKDSINSA-N 0.000 description 10
- LRKCBIUDWAXNEG-CSMHCCOUSA-N Leu-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRKCBIUDWAXNEG-CSMHCCOUSA-N 0.000 description 10
- CIOWSLJGLSUOME-BQBZGAKWSA-N Lys-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O CIOWSLJGLSUOME-BQBZGAKWSA-N 0.000 description 10
- 241000209094 Oryza Species 0.000 description 10
- GLUBLISJVJFHQS-VIFPVBQESA-N Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 GLUBLISJVJFHQS-VIFPVBQESA-N 0.000 description 10
- ZKQOUHVVXABNDG-IUCAKERBSA-N Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 ZKQOUHVVXABNDG-IUCAKERBSA-N 0.000 description 10
- AFWBWPCXSWUCLB-WDSKDSINSA-N Pro-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 AFWBWPCXSWUCLB-WDSKDSINSA-N 0.000 description 10
- OBTCMSPFOITUIJ-FSPLSTOPSA-N Val-Asp Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O OBTCMSPFOITUIJ-FSPLSTOPSA-N 0.000 description 10
- 235000016383 Zea mays subsp huehuetenangensis Nutrition 0.000 description 10
- 108010062796 arginyllysine Proteins 0.000 description 10
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 10
- 230000004927 fusion Effects 0.000 description 10
- KZNQNBZMBZJQJO-YFKPBYRVSA-N gly pro Chemical compound NCC(=O)N1CCC[C@H]1C(O)=O KZNQNBZMBZJQJO-YFKPBYRVSA-N 0.000 description 10
- 108010089804 glycyl-threonine Proteins 0.000 description 10
- 108010038320 lysylphenylalanine Proteins 0.000 description 10
- 235000009973 maize Nutrition 0.000 description 10
- 108010090894 prolylleucine Proteins 0.000 description 10
- 235000009566 rice Nutrition 0.000 description 10
- HZYFHQOWCFUSOV-IMJSIDKUSA-N Asn-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O HZYFHQOWCFUSOV-IMJSIDKUSA-N 0.000 description 9
- GADKFYNESXNRLC-WDSKDSINSA-N Asn-Pro Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GADKFYNESXNRLC-WDSKDSINSA-N 0.000 description 9
- BBBXWRGITSUJPB-YUMQZZPRSA-N Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O BBBXWRGITSUJPB-YUMQZZPRSA-N 0.000 description 9
- JSIQVRIXMINMTA-ZDLURKLDSA-N Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O JSIQVRIXMINMTA-ZDLURKLDSA-N 0.000 description 9
- LYCVKHSJGDMDLM-LURJTMIESA-N His-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 LYCVKHSJGDMDLM-LURJTMIESA-N 0.000 description 9
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 9
- YQAIUOWPSUOINN-IUCAKERBSA-N Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN YQAIUOWPSUOINN-IUCAKERBSA-N 0.000 description 9
- 108010079364 N-glycylalanine Proteins 0.000 description 9
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 9
- HPYDSVWYXXKHRD-VIFPVBQESA-N Tyr-Gly Chemical compound [O-]C(=O)CNC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 HPYDSVWYXXKHRD-VIFPVBQESA-N 0.000 description 9
- YSGSDAIMSCVPHG-YUMQZZPRSA-N Val-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)C(C)C YSGSDAIMSCVPHG-YUMQZZPRSA-N 0.000 description 9
- 108010092854 aspartyllysine Proteins 0.000 description 9
- 238000006243 chemical reaction Methods 0.000 description 9
- 238000003780 insertion Methods 0.000 description 9
- RZVAJINKPMORJF-UHFFFAOYSA-N p-acetaminophenol Chemical compound CC(=O)NC1=CC=C(O)C=C1 RZVAJINKPMORJF-UHFFFAOYSA-N 0.000 description 9
- QLROSWPKSBORFJ-BQBZGAKWSA-N pro glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 QLROSWPKSBORFJ-BQBZGAKWSA-N 0.000 description 9
- 230000035897 transcription Effects 0.000 description 9
- LQJAALCCPOTJGB-YUMQZZPRSA-N (2S)-1-[(2S)-2-amino-5-(diaminomethylideneamino)pentanoyl]pyrrolidine-2-carboxylic acid Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O LQJAALCCPOTJGB-YUMQZZPRSA-N 0.000 description 8
- CCUAQNUWXLYFRA-IMJSIDKUSA-N Ala-Asn Chemical compound C[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC(N)=O CCUAQNUWXLYFRA-IMJSIDKUSA-N 0.000 description 8
- JQFZHHSQMKZLRU-IUCAKERBSA-N Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N JQFZHHSQMKZLRU-IUCAKERBSA-N 0.000 description 8
- IJYZHIOOBGIINM-WDSKDSINSA-N Arg-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N IJYZHIOOBGIINM-WDSKDSINSA-N 0.000 description 8
- XNSKSTRGQIPTSE-UHFFFAOYSA-N Arginyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CCCNC(N)=N XNSKSTRGQIPTSE-UHFFFAOYSA-N 0.000 description 8
- FRYULLIZUDQONW-IMJSIDKUSA-N Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FRYULLIZUDQONW-IMJSIDKUSA-N 0.000 description 8
- YZQCXOFQZKCETR-UWVGGRQHSA-N Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YZQCXOFQZKCETR-UWVGGRQHSA-N 0.000 description 8
- JBCLFWXMTIKCCB-VIFPVBQESA-N Gly-Phe Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-VIFPVBQESA-N 0.000 description 8
- OHUXOEXBXPZKPT-STQMWFEESA-N Phe-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=CC=C1 OHUXOEXBXPZKPT-STQMWFEESA-N 0.000 description 8
- LTFSLKWFMWZEBD-IMJSIDKUSA-N Ser-Asn Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O LTFSLKWFMWZEBD-IMJSIDKUSA-N 0.000 description 8
- VBKBDLMWICBSCY-IMJSIDKUSA-N Ser-Asp Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O VBKBDLMWICBSCY-IMJSIDKUSA-N 0.000 description 8
- LAFKUZYWNCHOHT-WHFBIAKZSA-N Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O LAFKUZYWNCHOHT-WHFBIAKZSA-N 0.000 description 8
- BQBCIBCLXBKYHW-CSMHCCOUSA-N Thr-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])[C@@H](C)O BQBCIBCLXBKYHW-CSMHCCOUSA-N 0.000 description 8
- BNQVUHQWZGTIBX-IUCAKERBSA-N Val-His Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CN=CN1 BNQVUHQWZGTIBX-IUCAKERBSA-N 0.000 description 8
- 108010068380 arginylarginine Proteins 0.000 description 8
- 108010093581 aspartyl-proline Proteins 0.000 description 8
- 108010081551 glycylphenylalanine Proteins 0.000 description 8
- 108010005942 methionylglycine Proteins 0.000 description 8
- 108010068488 methionylphenylalanine Proteins 0.000 description 8
- 108010029020 prolylglycine Proteins 0.000 description 8
- 108010048818 seryl-histidine Proteins 0.000 description 8
- SITLTJHOQZFJGG-XPUUQOCRSA-N α-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SITLTJHOQZFJGG-XPUUQOCRSA-N 0.000 description 8
- XMBSYZWANAQXEV-UHFFFAOYSA-N 4-amino-5-[(1-carboxy-2-phenylethyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 7
- 241000219195 Arabidopsis thaliana Species 0.000 description 7
- SONUFGRSSMFHFN-IMJSIDKUSA-N Asn-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O SONUFGRSSMFHFN-IMJSIDKUSA-N 0.000 description 7
- KWBQPGIYEZKDEG-FSPLSTOPSA-N Asn-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O KWBQPGIYEZKDEG-FSPLSTOPSA-N 0.000 description 7
- DWBZEJHQQIURML-IMJSIDKUSA-N Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O DWBZEJHQQIURML-IMJSIDKUSA-N 0.000 description 7
- 210000003763 Chloroplasts Anatomy 0.000 description 7
- IKAIKUBBJHFNBZ-LURJTMIESA-N Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CN IKAIKUBBJHFNBZ-LURJTMIESA-N 0.000 description 7
- RNKSNIBMTUYWSH-YFKPBYRVSA-N L-prolylglycine Chemical compound [O-]C(=O)CNC(=O)[C@@H]1CCC[NH2+]1 RNKSNIBMTUYWSH-YFKPBYRVSA-N 0.000 description 7
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 7
- SENJXOPIZNYLHU-IUCAKERBSA-N Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-IUCAKERBSA-N 0.000 description 7
- HIZYETOZLYFUFF-BQBZGAKWSA-N Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(O)=O HIZYETOZLYFUFF-BQBZGAKWSA-N 0.000 description 7
- HGNRJCINZYHNOU-LURJTMIESA-N Lys-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(O)=O HGNRJCINZYHNOU-LURJTMIESA-N 0.000 description 7
- QCZYYEFXOBKCNQ-STQMWFEESA-N Lys-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QCZYYEFXOBKCNQ-STQMWFEESA-N 0.000 description 7
- QTZXSYBVOSXBEJ-WDSKDSINSA-N Met-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O QTZXSYBVOSXBEJ-WDSKDSINSA-N 0.000 description 7
- JXWLMUIXUXLIJR-QWRGUYRKSA-N Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JXWLMUIXUXLIJR-QWRGUYRKSA-N 0.000 description 7
- GVUVRRPYYDHHGK-UHFFFAOYSA-N Prolyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C1CCCN1 GVUVRRPYYDHHGK-UHFFFAOYSA-N 0.000 description 7
- PBUXMVYWOSKHMF-WDSKDSINSA-N Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CO PBUXMVYWOSKHMF-WDSKDSINSA-N 0.000 description 7
- AUEJLPRZGVVDNU-STQMWFEESA-N Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-STQMWFEESA-N 0.000 description 7
- 230000000408 embryogenic Effects 0.000 description 7
- 108010077515 glycylproline Proteins 0.000 description 7
- 108010000761 leucylarginine Proteins 0.000 description 7
- 108010031719 prolyl-serine Proteins 0.000 description 7
- 108010004914 prolylarginine Proteins 0.000 description 7
- 239000011347 resin Substances 0.000 description 7
- 229920005989 resin Polymers 0.000 description 7
- 108010078580 tyrosylleucine Proteins 0.000 description 7
- TUTIHHSZKFBMHM-UHFFFAOYSA-N 4-amino-5-[(3-amino-1-carboxy-3-oxopropyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(CC(N)=O)C(O)=O TUTIHHSZKFBMHM-UHFFFAOYSA-N 0.000 description 6
- HKTRDWYCAUTRRL-UHFFFAOYSA-N 4-amino-5-[[1-carboxy-2-(1H-imidazol-5-yl)ethyl]amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 HKTRDWYCAUTRRL-UHFFFAOYSA-N 0.000 description 6
- BUQICHWNXBIBOG-LMVFSUKVSA-N Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)N BUQICHWNXBIBOG-LMVFSUKVSA-N 0.000 description 6
- OSASDIVHOSJVII-UHFFFAOYSA-N Arginyl-Cysteine Chemical compound SCC(C(O)=O)NC(=O)C(N)CCCNC(N)=N OSASDIVHOSJVII-UHFFFAOYSA-N 0.000 description 6
- 101700056065 CHP2 Proteins 0.000 description 6
- 241001510512 Chlamydia phage 2 Species 0.000 description 6
- 108020004705 Codon Proteins 0.000 description 6
- NXTYATMDWQYLGJ-UHFFFAOYSA-N Cysteinyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CS NXTYATMDWQYLGJ-UHFFFAOYSA-N 0.000 description 6
- OELDIVRKHTYFNG-UHFFFAOYSA-N Cysteinyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)CS OELDIVRKHTYFNG-UHFFFAOYSA-N 0.000 description 6
- ARPVSMCNIDAQBO-UHFFFAOYSA-N Glutaminyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CCC(N)=O ARPVSMCNIDAQBO-UHFFFAOYSA-N 0.000 description 6
- PFMUCCYYAAFKTH-YFKPBYRVSA-N Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CN PFMUCCYYAAFKTH-YFKPBYRVSA-N 0.000 description 6
- OLIFSFOFKGKIRH-WUJLRWPWSA-N Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CN OLIFSFOFKGKIRH-WUJLRWPWSA-N 0.000 description 6
- MDCTVRUPVLZSPG-BQBZGAKWSA-N His-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 MDCTVRUPVLZSPG-BQBZGAKWSA-N 0.000 description 6
- WRPDZHJNLYNFFT-UHFFFAOYSA-N Histidinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC1=CN=CN1 WRPDZHJNLYNFFT-UHFFFAOYSA-N 0.000 description 6
- 240000005979 Hordeum vulgare Species 0.000 description 6
- 235000007340 Hordeum vulgare Nutrition 0.000 description 6
- 206010020649 Hyperkeratosis Diseases 0.000 description 6
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 6
- NPBGTPKLVJEOBE-IUCAKERBSA-N Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N NPBGTPKLVJEOBE-IUCAKERBSA-N 0.000 description 6
- HGCNKOLVKRAVHD-RYUDHWBXSA-N Met-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-RYUDHWBXSA-N 0.000 description 6
- DZMGFGQBRYWJOR-YUMQZZPRSA-N Met-Pro Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O DZMGFGQBRYWJOR-YUMQZZPRSA-N 0.000 description 6
- HMNSRTLZAJHSIK-YUMQZZPRSA-N Pro-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 HMNSRTLZAJHSIK-YUMQZZPRSA-N 0.000 description 6
- UJTZHGHXJKIAOS-WHFBIAKZSA-N Ser-Gln Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O UJTZHGHXJKIAOS-WHFBIAKZSA-N 0.000 description 6
- PPQRSMGDOHLTBE-UWVGGRQHSA-N Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PPQRSMGDOHLTBE-UWVGGRQHSA-N 0.000 description 6
- RZEQTVHJZCIUBT-UHFFFAOYSA-N Serinyl-Arginine Chemical compound OCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N RZEQTVHJZCIUBT-UHFFFAOYSA-N 0.000 description 6
- SBMNPABNWKXNBJ-UHFFFAOYSA-N Serinyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CO SBMNPABNWKXNBJ-UHFFFAOYSA-N 0.000 description 6
- QOLYAJSZHIJCTO-VQVTYTSYSA-N Thr-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O QOLYAJSZHIJCTO-VQVTYTSYSA-N 0.000 description 6
- GXDLGHLJTHMDII-WISUUJSJSA-N Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(O)=O GXDLGHLJTHMDII-WISUUJSJSA-N 0.000 description 6
- ZSXJENBJGRHKIG-UHFFFAOYSA-N Tyrosyl-Serine Chemical compound OCC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 ZSXJENBJGRHKIG-UHFFFAOYSA-N 0.000 description 6
- WITCOKQIPFWQQD-FSPLSTOPSA-N Val-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O WITCOKQIPFWQQD-FSPLSTOPSA-N 0.000 description 6
- VEYJKJORLPYVLO-RYUDHWBXSA-N Val-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 VEYJKJORLPYVLO-RYUDHWBXSA-N 0.000 description 6
- 229940093612 Zein Drugs 0.000 description 6
- 229920002494 Zein Polymers 0.000 description 6
- 108010055615 Zein Proteins 0.000 description 6
- 230000004075 alteration Effects 0.000 description 6
- 238000004458 analytical method Methods 0.000 description 6
- 108010008355 arginyl-glutamine Proteins 0.000 description 6
- 230000001580 bacterial Effects 0.000 description 6
- HEDRZPFGACZZDS-UHFFFAOYSA-N chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 6
- 238000010192 crystallographic characterization Methods 0.000 description 6
- 108010016616 cysteinylglycine Proteins 0.000 description 6
- 108010060199 cysteinylproline Proteins 0.000 description 6
- 108010020688 glycylhistidine Proteins 0.000 description 6
- 239000007788 liquid Substances 0.000 description 6
- 108010009298 lysylglutamic acid Proteins 0.000 description 6
- 108010056582 methionylglutamic acid Proteins 0.000 description 6
- 108010012581 phenylalanylglutamate Proteins 0.000 description 6
- 108010071207 serylmethionine Proteins 0.000 description 6
- 239000006228 supernatant Substances 0.000 description 6
- 239000000725 suspension Substances 0.000 description 6
- 108010009962 valyltyrosine Proteins 0.000 description 6
- 239000005019 zein Substances 0.000 description 6
- RFCVXVPWSPOMFJ-UHFFFAOYSA-N 2-[(2-azaniumyl-3-phenylpropanoyl)amino]-4-methylpentanoate Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 RFCVXVPWSPOMFJ-UHFFFAOYSA-N 0.000 description 5
- XUUXCWCKKCZEAW-YFKPBYRVSA-N 2-[[(2S)-2-amino-5-(diaminomethylideneamino)pentanoyl]amino]acetic acid Chemical compound OC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N XUUXCWCKKCZEAW-YFKPBYRVSA-N 0.000 description 5
- JQDFGZKKXBEANU-UHFFFAOYSA-N Alanyl-Cysteine Chemical compound CC(N)C(=O)NC(CS)C(O)=O JQDFGZKKXBEANU-UHFFFAOYSA-N 0.000 description 5
- PMGDADKJMCOXHX-BQBZGAKWSA-N Arg-Gln Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O PMGDADKJMCOXHX-BQBZGAKWSA-N 0.000 description 5
- IIFDPDVJAHQFSR-WHFBIAKZSA-N Asn-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O IIFDPDVJAHQFSR-WHFBIAKZSA-N 0.000 description 5
- NPDLYUOYAGBHFB-UHFFFAOYSA-N Asparaginyl-Arginine Chemical compound NC(=O)CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N NPDLYUOYAGBHFB-UHFFFAOYSA-N 0.000 description 5
- UKGGPJNBONZZCM-WDSKDSINSA-N Aspartyl-L-proline Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O UKGGPJNBONZZCM-WDSKDSINSA-N 0.000 description 5
- WXOFKRKAHJQKLT-UHFFFAOYSA-N Cysteinyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CS WXOFKRKAHJQKLT-UHFFFAOYSA-N 0.000 description 5
- 229920002760 Expressed sequence tag Polymers 0.000 description 5
- LOJYQMFIIJVETK-WDSKDSINSA-N Gln-Gln Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LOJYQMFIIJVETK-WDSKDSINSA-N 0.000 description 5
- LSPKYLAFTPBWIL-BYPYZUCNSA-N Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(O)=O LSPKYLAFTPBWIL-BYPYZUCNSA-N 0.000 description 5
- SSHIXEILTLPAQT-UHFFFAOYSA-N Glutaminyl-Aspartate Chemical compound NC(=O)CCC(N)C(=O)NC(CC(O)=O)C(O)=O SSHIXEILTLPAQT-UHFFFAOYSA-N 0.000 description 5
- IEFJWDNGDZAYNZ-BYPYZUCNSA-N Gly-Glu Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(O)=O IEFJWDNGDZAYNZ-BYPYZUCNSA-N 0.000 description 5
- YIWFXZNIBQBFHR-LURJTMIESA-N Gly-His Chemical compound [NH3+]CC(=O)N[C@H](C([O-])=O)CC1=CN=CN1 YIWFXZNIBQBFHR-LURJTMIESA-N 0.000 description 5
- PWWVAXIEGOYWEE-UHFFFAOYSA-N Isophenergan Chemical compound C1=CC=C2N(CC(C)N(C)C)C3=CC=CC=C3SC2=C1 PWWVAXIEGOYWEE-UHFFFAOYSA-N 0.000 description 5
- VYZAGTDAHUIRQA-WHFBIAKZSA-N L-alanyl-L-glutamic acid Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O VYZAGTDAHUIRQA-WHFBIAKZSA-N 0.000 description 5
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 5
- JXNRXNCCROJZFB-RYUDHWBXSA-N L-tyrosyl-L-arginine Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JXNRXNCCROJZFB-RYUDHWBXSA-N 0.000 description 5
- DVCSNHXRZUVYAM-BQBZGAKWSA-N Leu-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O DVCSNHXRZUVYAM-BQBZGAKWSA-N 0.000 description 5
- XWOBNBRUDDUEEY-UWVGGRQHSA-N Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 XWOBNBRUDDUEEY-UWVGGRQHSA-N 0.000 description 5
- UGTZHPSKYRIGRJ-YUMQZZPRSA-N Lys-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UGTZHPSKYRIGRJ-YUMQZZPRSA-N 0.000 description 5
- YSZNURNVYFUEHC-BQBZGAKWSA-N Lys-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O YSZNURNVYFUEHC-BQBZGAKWSA-N 0.000 description 5
- MYTOTTSMVMWVJN-STQMWFEESA-N Lys-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 MYTOTTSMVMWVJN-STQMWFEESA-N 0.000 description 5
- KAKJTZWHIUWTTD-VQVTYTSYSA-N Met-Thr Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)O)C([O-])=O KAKJTZWHIUWTTD-VQVTYTSYSA-N 0.000 description 5
- 101710010904 PCBD2 Proteins 0.000 description 5
- BXNGIHFNNNSEOS-UWVGGRQHSA-N Phe-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 BXNGIHFNNNSEOS-UWVGGRQHSA-N 0.000 description 5
- IEHDJWSAXBGJIP-RYUDHWBXSA-N Phe-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 IEHDJWSAXBGJIP-RYUDHWBXSA-N 0.000 description 5
- 101710027506 Rv0998 Proteins 0.000 description 5
- WBAXJMCUFIXCNI-WDSKDSINSA-N Ser-Pro Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WBAXJMCUFIXCNI-WDSKDSINSA-N 0.000 description 5
- DSGIVWSDDRDJIO-ZXXMMSQZSA-N Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DSGIVWSDDRDJIO-ZXXMMSQZSA-N 0.000 description 5
- UQTNIFUCMBFWEJ-UHFFFAOYSA-N Threoninyl-Asparagine Chemical compound CC(O)C(N)C(=O)NC(C(O)=O)CC(N)=O UQTNIFUCMBFWEJ-UHFFFAOYSA-N 0.000 description 5
- GJNDXQBALKCYSZ-RYUDHWBXSA-N Val-Phe Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 GJNDXQBALKCYSZ-RYUDHWBXSA-N 0.000 description 5
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 5
- 230000027455 binding Effects 0.000 description 5
- 230000018109 developmental process Effects 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 5
- 235000013312 flour Nutrition 0.000 description 5
- 101710010895 glpV Proteins 0.000 description 5
- 108010079547 glutamylmethionine Proteins 0.000 description 5
- VPZXBVLAVMBEQI-VKHMYHEASA-N Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 5
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 5
- 229910052737 gold Inorganic materials 0.000 description 5
- 239000010931 gold Substances 0.000 description 5
- 108010018006 histidylserine Proteins 0.000 description 5
- 238000002955 isolation Methods 0.000 description 5
- 108010053037 kyotorphin Proteins 0.000 description 5
- 108010044655 lysylproline Proteins 0.000 description 5
- 230000000813 microbial Effects 0.000 description 5
- 230000035772 mutation Effects 0.000 description 5
- 108010084572 phenylalanyl-valine Proteins 0.000 description 5
- 108010073101 phenylalanylleucine Proteins 0.000 description 5
- 102000002933 thioredoxin family Human genes 0.000 description 5
- 108060008226 thioredoxin family Proteins 0.000 description 5
- XMAUFHMAAVTODF-STQMWFEESA-N (2S)-2-[[(2S)-2-amino-3-(1H-imidazol-5-yl)propanoyl]amino]-3-phenylpropanoic acid Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 XMAUFHMAAVTODF-STQMWFEESA-N 0.000 description 4
- LZDNBBYBDGBADK-KBPBESRZSA-N (2S)-2-[[(2S)-2-amino-3-methylbutanoyl]amino]-3-(1H-indol-3-yl)propanoic acid Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-KBPBESRZSA-N 0.000 description 4
- GMKMEZVLHJARHF-UHFFFAOYSA-L 2,6-diaminopimelate(2-) Chemical compound [O-]C(=O)C(N)CCCC(N)C([O-])=O GMKMEZVLHJARHF-UHFFFAOYSA-L 0.000 description 4
- AAKRWBIIGKPOKQ-ONGXEEELSA-N 2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-methylbutanoyl]amino]acetic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 4
- 229920000936 Agarose Polymers 0.000 description 4
- BNODVYXZAAXSHW-UHFFFAOYSA-N Arginyl-Histidine Chemical compound NC(=N)NCCCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 BNODVYXZAAXSHW-UHFFFAOYSA-N 0.000 description 4
- RJUHZPRQRQLCFL-IMJSIDKUSA-N Asn-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O RJUHZPRQRQLCFL-IMJSIDKUSA-N 0.000 description 4
- GSMPSRPMQQDRIB-WHFBIAKZSA-N Asp-Gln Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O GSMPSRPMQQDRIB-WHFBIAKZSA-N 0.000 description 4
- CPMKYMGGYUFOHS-FSPLSTOPSA-N Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O CPMKYMGGYUFOHS-FSPLSTOPSA-N 0.000 description 4
- IQTUDDBANZYMAR-UHFFFAOYSA-N Asparaginyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CC(N)=O IQTUDDBANZYMAR-UHFFFAOYSA-N 0.000 description 4
- OMSMPWHEGLNQOD-UHFFFAOYSA-N Asparaginyl-Phenylalanine Chemical compound NC(=O)CC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 OMSMPWHEGLNQOD-UHFFFAOYSA-N 0.000 description 4
- VBKIFHUVGLOJKT-UHFFFAOYSA-N Asparaginyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC(N)=O VBKIFHUVGLOJKT-UHFFFAOYSA-N 0.000 description 4
- DYDKXJWQCIVTMR-UHFFFAOYSA-N Aspartyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CC(O)=O DYDKXJWQCIVTMR-UHFFFAOYSA-N 0.000 description 4
- ZSRSLWKGWFFVCM-WDSKDSINSA-N Cys-Pro Chemical compound SC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O ZSRSLWKGWFFVCM-WDSKDSINSA-N 0.000 description 4
- RGTVXXNMOGHRAY-UHFFFAOYSA-N Cysteinyl-Arginine Chemical compound SCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N RGTVXXNMOGHRAY-UHFFFAOYSA-N 0.000 description 4
- 108010090461 DFG peptide Proteins 0.000 description 4
- 241000588724 Escherichia coli Species 0.000 description 4
- OWOFCNWTMWOOJJ-WDSKDSINSA-N Gln-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O OWOFCNWTMWOOJJ-WDSKDSINSA-N 0.000 description 4
- SXGAGTVDWKQYCX-BQBZGAKWSA-N Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SXGAGTVDWKQYCX-BQBZGAKWSA-N 0.000 description 4
- LLEUXCDZPQOJMY-AAEUAGOBSA-N Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)N)C(O)=O)=CNC2=C1 LLEUXCDZPQOJMY-AAEUAGOBSA-N 0.000 description 4
- YSWHPLCDIMUKFE-QWRGUYRKSA-N Glu-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YSWHPLCDIMUKFE-QWRGUYRKSA-N 0.000 description 4
- IAJOBQBIJHVGMQ-UHFFFAOYSA-N Glufosinate Chemical compound CP(O)(=O)CCC(N)C(O)=O IAJOBQBIJHVGMQ-UHFFFAOYSA-N 0.000 description 4
- HHSJMSCOLJVTCX-UHFFFAOYSA-N Glutaminyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CCC(N)=O HHSJMSCOLJVTCX-UHFFFAOYSA-N 0.000 description 4
- KRBMQYPTDYSENE-BQBZGAKWSA-N His-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 KRBMQYPTDYSENE-BQBZGAKWSA-N 0.000 description 4
- NIKBMHGRNAPJFW-UHFFFAOYSA-N Histidinyl-Arginine Chemical compound NC(=N)NCCCC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 NIKBMHGRNAPJFW-UHFFFAOYSA-N 0.000 description 4
- 229920002459 Intron Polymers 0.000 description 4
- 108020004391 Introns Proteins 0.000 description 4
- ZUKPVRWZDMRIEO-VKHMYHEASA-N L-cysteinylglycine zwitterion Chemical compound SC[C@H]([NH3+])C(=O)NCC([O-])=O ZUKPVRWZDMRIEO-VKHMYHEASA-N 0.000 description 4
- JAQGKXUEKGKTKX-HOTGVXAUSA-N L-tyrosyl-L-tyrosine Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 JAQGKXUEKGKTKX-HOTGVXAUSA-N 0.000 description 4
- OAPNERBWQWUPTI-YUMQZZPRSA-N Lys-Gln Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O OAPNERBWQWUPTI-YUMQZZPRSA-N 0.000 description 4
- XBZOQGHZGQLEQO-IUCAKERBSA-N Lys-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN XBZOQGHZGQLEQO-IUCAKERBSA-N 0.000 description 4
- UASDAHIAHBRZQV-YUMQZZPRSA-N Met-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N UASDAHIAHBRZQV-YUMQZZPRSA-N 0.000 description 4
- ADHNYKZHPOEULM-BQBZGAKWSA-N Met-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O ADHNYKZHPOEULM-BQBZGAKWSA-N 0.000 description 4
- QXOHLNCNYLGICT-YFKPBYRVSA-N Met-Gly Chemical compound CSCC[C@H](N)C(=O)NCC(O)=O QXOHLNCNYLGICT-YFKPBYRVSA-N 0.000 description 4
- 108010066427 N-valyltryptophan Proteins 0.000 description 4
- 108091005771 Peptidases Proteins 0.000 description 4
- HWMGTNOVUDIKRE-UWVGGRQHSA-N Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 HWMGTNOVUDIKRE-UWVGGRQHSA-N 0.000 description 4
- RWCOTTLHDJWHRS-YUMQZZPRSA-N Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 RWCOTTLHDJWHRS-YUMQZZPRSA-N 0.000 description 4
- BEPSGCXDIVACBU-UHFFFAOYSA-N Prolyl-Histidine Chemical compound C1CCNC1C(=O)NC(C(=O)O)CC1=CN=CN1 BEPSGCXDIVACBU-UHFFFAOYSA-N 0.000 description 4
- 239000004365 Protease Substances 0.000 description 4
- YZMPDHTZJJCGEI-BQBZGAKWSA-N Ser-His Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 YZMPDHTZJJCGEI-BQBZGAKWSA-N 0.000 description 4
- ATHGHQPFGPMSJY-UHFFFAOYSA-N Spermidine Chemical compound NCCCCNCCCN ATHGHQPFGPMSJY-UHFFFAOYSA-N 0.000 description 4
- NHUHCSRWZMLRLA-UHFFFAOYSA-N Sulfizole Chemical compound CC1=NOC(NS(=O)(=O)C=2C=CC(N)=CC=2)=C1C NHUHCSRWZMLRLA-UHFFFAOYSA-N 0.000 description 4
- 229940094937 Thioredoxin Drugs 0.000 description 4
- KAFKKRJQHOECGW-JCOFBHIZSA-N Thr-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(O)=O)=CNC2=C1 KAFKKRJQHOECGW-JCOFBHIZSA-N 0.000 description 4
- CUTPSEKWUPZFLV-UHFFFAOYSA-N Threoninyl-Cysteine Chemical compound CC(O)C(N)C(=O)NC(CS)C(O)=O CUTPSEKWUPZFLV-UHFFFAOYSA-N 0.000 description 4
- LWFWZRANSFAJDR-JSGCOSHPSA-N Trp-Val Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O)=CNC2=C1 LWFWZRANSFAJDR-JSGCOSHPSA-N 0.000 description 4
- IBIDRSSEHFLGSD-YUMQZZPRSA-N Val-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-YUMQZZPRSA-N 0.000 description 4
- WPSXZFTVLIAPCN-UHFFFAOYSA-N Valyl-Cysteine Chemical compound CC(C)C(N)C(=O)NC(CS)C(O)=O WPSXZFTVLIAPCN-UHFFFAOYSA-N 0.000 description 4
- 238000007792 addition Methods 0.000 description 4
- 230000001419 dependent Effects 0.000 description 4
- 235000013305 food Nutrition 0.000 description 4
- 108010051307 glycyl-glycyl-proline Proteins 0.000 description 4
- YMAWOPBAYDPSLA-UHFFFAOYSA-N glycylglycine zwitterion Chemical compound [NH3+]CC(=O)NCC([O-])=O YMAWOPBAYDPSLA-UHFFFAOYSA-N 0.000 description 4
- 239000001307 helium Substances 0.000 description 4
- 229910052734 helium Inorganic materials 0.000 description 4
- SWQJXJOGLNCZEY-UHFFFAOYSA-N helium(0) Chemical compound [He] SWQJXJOGLNCZEY-UHFFFAOYSA-N 0.000 description 4
- 108010028295 histidylhistidine Proteins 0.000 description 4
- 238000009396 hybridization Methods 0.000 description 4
- 238000009114 investigational therapy Methods 0.000 description 4
- 238000002372 labelling Methods 0.000 description 4
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 4
- 108010091871 leucylmethionine Proteins 0.000 description 4
- 108010012058 leucyltyrosine Proteins 0.000 description 4
- 239000003550 marker Substances 0.000 description 4
- 238000000746 purification Methods 0.000 description 4
- 108091007521 restriction endonucleases Proteins 0.000 description 4
- 238000004114 suspension culture Methods 0.000 description 4
- 108010051110 tyrosyl-lysine Proteins 0.000 description 4
- 108010003137 tyrosyltyrosine Proteins 0.000 description 4
- VNYDHJARLHNEGA-RYUDHWBXSA-N (2S)-1-[(2S)-2-azaniumyl-3-(4-hydroxyphenyl)propanoyl]pyrrolidine-2-carboxylate Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=C(O)C=C1 VNYDHJARLHNEGA-RYUDHWBXSA-N 0.000 description 3
- FAQVCWVVIYYWRR-WHFBIAKZSA-N (2S)-2-[[(2S)-2,5-diamino-5-oxopentanoyl]amino]propanoic acid Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O FAQVCWVVIYYWRR-WHFBIAKZSA-N 0.000 description 3
- XPJBQTCXPJNIFE-ZETCQYMHSA-N (2S)-2-[[2-[(2-azaniumylacetyl)amino]acetyl]amino]-4-methylpentanoate Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 3
- MGHKSHCBDXNTHX-UHFFFAOYSA-N 4-amino-5-[(4-amino-1-carboxy-4-oxobutyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(CCC(N)=O)C(O)=O MGHKSHCBDXNTHX-UHFFFAOYSA-N 0.000 description 3
- HJCMDXDYPOUFDY-WHFBIAKZSA-N Ala-Gln Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O HJCMDXDYPOUFDY-WHFBIAKZSA-N 0.000 description 3
- CXISPYVYMQWFLE-VKHMYHEASA-N Ala-Gly Chemical compound C[C@H]([NH3+])C(=O)NCC([O-])=O CXISPYVYMQWFLE-VKHMYHEASA-N 0.000 description 3
- SITWEMZOJNKJCH-UHFFFAOYSA-N Alanyl-Arginine Chemical compound CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 3
- OMLWNBVRVJYMBQ-YUMQZZPRSA-N Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OMLWNBVRVJYMBQ-YUMQZZPRSA-N 0.000 description 3
- PQBHGSGQZSOLIR-RYUDHWBXSA-N Arg-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PQBHGSGQZSOLIR-RYUDHWBXSA-N 0.000 description 3
- XTWSWDJMIKUJDQ-RYUDHWBXSA-N Arg-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XTWSWDJMIKUJDQ-RYUDHWBXSA-N 0.000 description 3
- HSPSXROIMXIJQW-BQBZGAKWSA-N Asp-His Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 HSPSXROIMXIJQW-BQBZGAKWSA-N 0.000 description 3
- OAMLVOVXNKILLQ-BQBZGAKWSA-N Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O OAMLVOVXNKILLQ-BQBZGAKWSA-N 0.000 description 3
- QJMCHPGWFZZRID-UHFFFAOYSA-N Asparaginyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CC(N)=O QJMCHPGWFZZRID-UHFFFAOYSA-N 0.000 description 3
- OOULJWDSSVOMHX-UHFFFAOYSA-N Cysteinyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CS OOULJWDSSVOMHX-UHFFFAOYSA-N 0.000 description 3
- WYVKPHCYMTWUCW-UHFFFAOYSA-N Cysteinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CS WYVKPHCYMTWUCW-UHFFFAOYSA-N 0.000 description 3
- QIVBCDIJIAJPQS-SECBINFHSA-N D-tryptophane Chemical compound C1=CC=C2C(C[C@@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-SECBINFHSA-N 0.000 description 3
- 210000002472 Endoplasmic Reticulum Anatomy 0.000 description 3
- JZDHUJAFXGNDSB-WHFBIAKZSA-N Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O JZDHUJAFXGNDSB-WHFBIAKZSA-N 0.000 description 3
- 239000005561 Glufosinate Substances 0.000 description 3
- JZOYFBPIEHCDFV-UHFFFAOYSA-N Glutaminyl-Histidine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 JZOYFBPIEHCDFV-UHFFFAOYSA-N 0.000 description 3
- VHLZDSUANXBJHW-UHFFFAOYSA-N Glutaminyl-Phenylalanine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 VHLZDSUANXBJHW-UHFFFAOYSA-N 0.000 description 3
- MFBYPDKTAJXHNI-VKHMYHEASA-N Gly-Cys Chemical compound [NH3+]CC(=O)N[C@@H](CS)C([O-])=O MFBYPDKTAJXHNI-VKHMYHEASA-N 0.000 description 3
- MMFKFJORZBJVNF-UWVGGRQHSA-N His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 MMFKFJORZBJVNF-UWVGGRQHSA-N 0.000 description 3
- LNCFUHAPNTYMJB-IUCAKERBSA-N His-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CN=CN1 LNCFUHAPNTYMJB-IUCAKERBSA-N 0.000 description 3
- VLDVBZICYBVQHB-IUCAKERBSA-N His-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])CC1=CN=CN1 VLDVBZICYBVQHB-IUCAKERBSA-N 0.000 description 3
- SBUJHOSQTJFQJX-NOAMYHISSA-N Kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 3
- 125000000998 L-alanino group Chemical group [H]N([*])[C@](C([H])([H])[H])([H])C(=O)O[H] 0.000 description 3
- LHSGPCFBGJHPCY-STQMWFEESA-N Leu-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-STQMWFEESA-N 0.000 description 3
- 235000007688 Lycopersicon esculentum Nutrition 0.000 description 3
- AIXUQKMMBQJZCU-IUCAKERBSA-N Lys-Pro Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O AIXUQKMMBQJZCU-IUCAKERBSA-N 0.000 description 3
- 108010060534 MSH (11-13) Proteins 0.000 description 3
- JHKXZYLNVJRAAJ-WDSKDSINSA-N Met-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(O)=O JHKXZYLNVJRAAJ-WDSKDSINSA-N 0.000 description 3
- IMTUWVJPCQPJEE-IUCAKERBSA-N Met-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN IMTUWVJPCQPJEE-IUCAKERBSA-N 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- JMEWFDUAFKVAAT-UHFFFAOYSA-N Methionyl-Asparagine Chemical compound CSCCC(N)C(=O)NC(C(O)=O)CC(N)=O JMEWFDUAFKVAAT-UHFFFAOYSA-N 0.000 description 3
- NDYNTQWSJLPEMK-UHFFFAOYSA-N Methionyl-Cysteine Chemical compound CSCCC(N)C(=O)NC(CS)C(O)=O NDYNTQWSJLPEMK-UHFFFAOYSA-N 0.000 description 3
- 240000007594 Oryza sativa Species 0.000 description 3
- WEQJQNWXCSUVMA-RYUDHWBXSA-N Phe-Pro Chemical compound C([C@H]([NH3+])C(=O)N1[C@@H](CCC1)C([O-])=O)C1=CC=CC=C1 WEQJQNWXCSUVMA-RYUDHWBXSA-N 0.000 description 3
- FSXRLASFHBWESK-HOTGVXAUSA-N Phe-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 FSXRLASFHBWESK-HOTGVXAUSA-N 0.000 description 3
- KNPVDQMEHSCAGX-UHFFFAOYSA-N Phenylalanyl-Cysteine Chemical compound SCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 KNPVDQMEHSCAGX-UHFFFAOYSA-N 0.000 description 3
- SHAQGFGGJSLLHE-BQBZGAKWSA-N Pro-Gln Chemical compound NC(=O)CC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 SHAQGFGGJSLLHE-BQBZGAKWSA-N 0.000 description 3
- RVQDZELMXZRSSI-IUCAKERBSA-N Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 RVQDZELMXZRSSI-IUCAKERBSA-N 0.000 description 3
- OIDKVWTWGDWMHY-RYUDHWBXSA-N Pro-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 OIDKVWTWGDWMHY-RYUDHWBXSA-N 0.000 description 3
- XZKQVQKUZMAADP-IMJSIDKUSA-N Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(O)=O XZKQVQKUZMAADP-IMJSIDKUSA-N 0.000 description 3
- FFOKMZOAVHEWET-UHFFFAOYSA-N Serinyl-Cysteine Chemical compound OCC(N)C(=O)NC(CS)C(O)=O FFOKMZOAVHEWET-UHFFFAOYSA-N 0.000 description 3
- 240000003768 Solanum lycopersicum Species 0.000 description 3
- VPZKQTYZIVOJDV-LMVFSUKVSA-N Thr-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(O)=O VPZKQTYZIVOJDV-LMVFSUKVSA-N 0.000 description 3
- HYLXOQURIOCKIH-VQVTYTSYSA-N Thr-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N HYLXOQURIOCKIH-VQVTYTSYSA-N 0.000 description 3
- IOWJRKAVLALBQB-IWGUZYHVSA-N Thr-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O IOWJRKAVLALBQB-IWGUZYHVSA-N 0.000 description 3
- APIDTRXFGYOLLH-VQVTYTSYSA-N Thr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)[C@@H](C)O APIDTRXFGYOLLH-VQVTYTSYSA-N 0.000 description 3
- WCRFXRIWBFRZBR-GGVZMXCHSA-N Thr-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WCRFXRIWBFRZBR-GGVZMXCHSA-N 0.000 description 3
- ONWMQORSVZYVNH-UHFFFAOYSA-N Tyrosyl-Asparagine Chemical compound NC(=O)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 ONWMQORSVZYVNH-UHFFFAOYSA-N 0.000 description 3
- MFEVVAXTBZELLL-UHFFFAOYSA-N Tyrosyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 MFEVVAXTBZELLL-UHFFFAOYSA-N 0.000 description 3
- 239000011543 agarose gel Substances 0.000 description 3
- 108010044940 alanylglutamine Proteins 0.000 description 3
- 108010047495 alanylglycine Proteins 0.000 description 3
- 108010087924 alanylproline Proteins 0.000 description 3
- 108090001123 antibodies Proteins 0.000 description 3
- 102000004965 antibodies Human genes 0.000 description 3
- 238000004166 bioassay Methods 0.000 description 3
- 238000004364 calculation method Methods 0.000 description 3
- 230000002255 enzymatic Effects 0.000 description 3
- 239000003797 essential amino acid Substances 0.000 description 3
- 235000020776 essential amino acid Nutrition 0.000 description 3
- 238000011156 evaluation Methods 0.000 description 3
- 239000004744 fabric Substances 0.000 description 3
- 108020001507 fusion proteins Proteins 0.000 description 3
- 102000037240 fusion proteins Human genes 0.000 description 3
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 3
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 3
- 108010084389 glycyltryptophan Proteins 0.000 description 3
- 108010087823 glycyltyrosine Proteins 0.000 description 3
- 230000002363 herbicidal Effects 0.000 description 3
- 239000004009 herbicide Substances 0.000 description 3
- 108010025306 histidylleucine Proteins 0.000 description 3
- 229960000318 kanamycin Drugs 0.000 description 3
- 239000003446 ligand Substances 0.000 description 3
- 108010064235 lysylglycine Proteins 0.000 description 3
- 230000014759 maintenance of location Effects 0.000 description 3
- 235000012054 meals Nutrition 0.000 description 3
- 238000002844 melting Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000006011 modification reaction Methods 0.000 description 3
- 230000002018 overexpression Effects 0.000 description 3
- 230000037361 pathway Effects 0.000 description 3
- 108010082527 phosphinothricin N-acetyltransferase Proteins 0.000 description 3
- 230000001402 polyadenylating Effects 0.000 description 3
- 238000001556 precipitation Methods 0.000 description 3
- OZAIFHULBGXAKX-UHFFFAOYSA-N precursor Substances N#CC(C)(C)N=NC(C)(C)C#N OZAIFHULBGXAKX-UHFFFAOYSA-N 0.000 description 3
- 108090000765 processed proteins & peptides Proteins 0.000 description 3
- 102000004196 processed proteins & peptides Human genes 0.000 description 3
- 239000000243 solution Substances 0.000 description 3
- 229910052717 sulfur Inorganic materials 0.000 description 3
- 239000011593 sulfur Substances 0.000 description 3
- sulfur amino acids Chemical class 0.000 description 3
- 230000001629 suppression Effects 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 3
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 3
- NJMYZEJORPYOTO-UHFFFAOYSA-N γ-glutamyl-Proline Chemical compound NC(=O)CCC(N)C(=O)N1CCCC1C(O)=O NJMYZEJORPYOTO-UHFFFAOYSA-N 0.000 description 3
- VKVDRTGWLVZJOM-DCAQKATOSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-methylbutanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 2
- ILDSIMPXNFWKLH-KATARQTJSA-N (2S)-2-[[(2S,3R)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-hydroxybutanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ILDSIMPXNFWKLH-KATARQTJSA-N 0.000 description 2
- QXRNAOYBCYVZCD-BQBZGAKWSA-N (2S)-6-amino-2-[[(2S)-2-aminopropanoyl]amino]hexanoic acid Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN QXRNAOYBCYVZCD-BQBZGAKWSA-N 0.000 description 2
- KXTAGESXNQEZKB-DZKIICNBSA-N (4S)-4-amino-5-[[(2S)-1-[[(1S)-1-carboxy-2-methylpropyl]amino]-1-oxo-3-phenylpropan-2-yl]amino]-5-oxopentanoic acid Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 KXTAGESXNQEZKB-DZKIICNBSA-N 0.000 description 2
- KRHRBKYBJXMYBB-WHFBIAKZSA-N 2-[[(2R)-2-[[(2S)-2-aminopropanoyl]amino]-3-sulfanylpropanoyl]amino]acetic acid Chemical compound C[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O KRHRBKYBJXMYBB-WHFBIAKZSA-N 0.000 description 2
- HIINQLBHPIQYHN-JTQLQIEISA-N 2-[[2-[[(2S)-2-amino-3-(4-hydroxyphenyl)propanoyl]amino]acetyl]amino]acetic acid Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HIINQLBHPIQYHN-JTQLQIEISA-N 0.000 description 2
- DGVVWUTYPXICAM-UHFFFAOYSA-N 2-mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 2
- 241000589155 Agrobacterium tumefaciens Species 0.000 description 2
- RDIKFPRVLJLMER-BQBZGAKWSA-N Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)N RDIKFPRVLJLMER-BQBZGAKWSA-N 0.000 description 2
- FSHURBQASBLAPO-WDSKDSINSA-N Ala-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C)N FSHURBQASBLAPO-WDSKDSINSA-N 0.000 description 2
- WPWUFUBLGADILS-WDSKDSINSA-N Ala-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WPWUFUBLGADILS-WDSKDSINSA-N 0.000 description 2
- FFMIYIMKQIMDPK-BQBZGAKWSA-N Asn-His Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 FFMIYIMKQIMDPK-BQBZGAKWSA-N 0.000 description 2
- FYRVDDJMNISIKJ-UWVGGRQHSA-N Asn-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FYRVDDJMNISIKJ-UWVGGRQHSA-N 0.000 description 2
- DVUFTQLHHHJEMK-IMJSIDKUSA-N Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O DVUFTQLHHHJEMK-IMJSIDKUSA-N 0.000 description 2
- ZARXTZFGQZBYFO-JQWIXIFHSA-N Asp-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(O)=O)N)C(O)=O)=CNC2=C1 ZARXTZFGQZBYFO-JQWIXIFHSA-N 0.000 description 2
- RGGVDKVXLBOLNS-UHFFFAOYSA-N Asparaginyl-Tryptophan Chemical compound C1=CC=C2C(CC(NC(=O)C(CC(N)=O)N)C(O)=O)=CNC2=C1 RGGVDKVXLBOLNS-UHFFFAOYSA-N 0.000 description 2
- ZVDPYSVOZFINEE-UHFFFAOYSA-N Aspartyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CC(O)=O ZVDPYSVOZFINEE-UHFFFAOYSA-N 0.000 description 2
- 206010003664 Atrial septal defect Diseases 0.000 description 2
- 241000701489 Cauliflower mosaic virus Species 0.000 description 2
- HAYVTMHUNMMXCV-UHFFFAOYSA-N Cysteinyl-Alanine Chemical compound OC(=O)C(C)NC(=O)C(N)CS HAYVTMHUNMMXCV-UHFFFAOYSA-N 0.000 description 2
- 239000003155 DNA primer Substances 0.000 description 2
- 102000033147 ERVK-25 Human genes 0.000 description 2
- FYYSIASRLDJUNP-WHFBIAKZSA-N Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FYYSIASRLDJUNP-WHFBIAKZSA-N 0.000 description 2
- UQHGAYSULGRWRG-WHFBIAKZSA-N Glu-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(O)=O UQHGAYSULGRWRG-WHFBIAKZSA-N 0.000 description 2
- XIPZDANNDPMZGQ-UHFFFAOYSA-N Glutaminyl-Cysteine Chemical compound NC(=O)CCC(N)C(=O)NC(CS)C(O)=O XIPZDANNDPMZGQ-UHFFFAOYSA-N 0.000 description 2
- CLSDNFWKGFJIBZ-UHFFFAOYSA-N Glutaminyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CCC(N)=O CLSDNFWKGFJIBZ-UHFFFAOYSA-N 0.000 description 2
- AJHCSUXXECOXOY-NSHDSACASA-N Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-NSHDSACASA-N 0.000 description 2
- VEXZGXHMUGYJMC-UHFFFAOYSA-N HCl Chemical compound Cl VEXZGXHMUGYJMC-UHFFFAOYSA-N 0.000 description 2
- 108010093488 His-His-His-His-His-His Proteins 0.000 description 2
- WSDOHRLQDGAOGU-UHFFFAOYSA-N Histidinyl-Asparagine Chemical compound NC(=O)CC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 WSDOHRLQDGAOGU-UHFFFAOYSA-N 0.000 description 2
- QOOWRKBDDXQRHC-BQBZGAKWSA-N L-lysyl-L-alanine Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN QOOWRKBDDXQRHC-BQBZGAKWSA-N 0.000 description 2
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 2
- BQVUABVGYYSDCJ-ZFWWWQNUSA-N Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-ZFWWWQNUSA-N 0.000 description 2
- NVGBPTNZLWRQSY-UWVGGRQHSA-N Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN NVGBPTNZLWRQSY-UWVGGRQHSA-N 0.000 description 2
- JPNRPAJITHRXRH-UHFFFAOYSA-N Lysyl-Asparagine Chemical compound NCCCCC(N)C(=O)NC(C(O)=O)CC(N)=O JPNRPAJITHRXRH-UHFFFAOYSA-N 0.000 description 2
- GUBGYTABKSRVRQ-YOLKTULGSA-N Maltose Natural products O([C@@H]1[C@H](O)[C@@H](O)[C@H](O)O[C@H]1CO)[C@@H]1[C@H](O)[C@@H](O)[C@@H](O)[C@@H](CO)O1 GUBGYTABKSRVRQ-YOLKTULGSA-N 0.000 description 2
- MUMXFARPYQTTSL-BQBZGAKWSA-N Met-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O MUMXFARPYQTTSL-BQBZGAKWSA-N 0.000 description 2
- BJFJQOMZCSHBMY-YUMQZZPRSA-N Met-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O BJFJQOMZCSHBMY-YUMQZZPRSA-N 0.000 description 2
- 102000035443 Peptidases Human genes 0.000 description 2
- MIDZLCFIAINOQN-WPRPVWTQSA-N Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 MIDZLCFIAINOQN-WPRPVWTQSA-N 0.000 description 2
- JMCOUWKXLXDERB-WMZOPIPTSA-N Phe-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 JMCOUWKXLXDERB-WMZOPIPTSA-N 0.000 description 2
- KLAONOISLHWJEE-UHFFFAOYSA-N Phenylalanyl-Glutamine Chemical compound NC(=O)CCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 KLAONOISLHWJEE-UHFFFAOYSA-N 0.000 description 2
- JQOHKCDMINQZRV-WDSKDSINSA-N Pro-Asn Chemical compound NC(=O)C[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 JQOHKCDMINQZRV-WDSKDSINSA-N 0.000 description 2
- UEKYKRQIAQHOOZ-KBPBESRZSA-N Pro-Trp Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)[O-])C(=O)[C@@H]1CCC[NH2+]1 UEKYKRQIAQHOOZ-KBPBESRZSA-N 0.000 description 2
- 229920000320 RNA (poly(A)) Polymers 0.000 description 2
- 229940081973 S-Adenosylmethionine Drugs 0.000 description 2
- MEFKEPWMEQBLKI-AIRLBKTGSA-O S-adenosyl-L-methionine zwitterion Chemical compound O[C@@H]1[C@H](O)[C@@H](C[S+](CC[C@H]([NH3+])C([O-])=O)C)O[C@H]1N1C2=NC=NC(N)=C2N=C1 MEFKEPWMEQBLKI-AIRLBKTGSA-O 0.000 description 2
- SSJMZMUVNKEENT-IMJSIDKUSA-N Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CO SSJMZMUVNKEENT-IMJSIDKUSA-N 0.000 description 2
- 240000001016 Solanum tuberosum Species 0.000 description 2
- 235000002595 Solanum tuberosum Nutrition 0.000 description 2
- 229920000978 Start codon Polymers 0.000 description 2
- PWIQCLSQVQBOQV-AAEUAGOBSA-N Trp-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 PWIQCLSQVQBOQV-AAEUAGOBSA-N 0.000 description 2
- DZHDVYLBNKMLMB-ZFWWWQNUSA-N Trp-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 DZHDVYLBNKMLMB-ZFWWWQNUSA-N 0.000 description 2
- MYVYPSWUSKCCHG-JQWIXIFHSA-N Trp-Ser Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 MYVYPSWUSKCCHG-JQWIXIFHSA-N 0.000 description 2
- GRQCSEWEPIHLBI-UHFFFAOYSA-N Tryptophyl-Asparagine Chemical compound C1=CC=C2C(CC(N)C(=O)NC(CC(N)=O)C(O)=O)=CNC2=C1 GRQCSEWEPIHLBI-UHFFFAOYSA-N 0.000 description 2
- NZCPCJCJZHKFGZ-UHFFFAOYSA-N Tryptophyl-Glutamine Chemical compound C1=CC=C2C(CC(N)C(=O)NC(CCC(N)=O)C(O)=O)=CNC2=C1 NZCPCJCJZHKFGZ-UHFFFAOYSA-N 0.000 description 2
- KBUBZAMBIVEFEI-UHFFFAOYSA-N Tryptophyl-Histidine Chemical compound C=1NC2=CC=CC=C2C=1CC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 KBUBZAMBIVEFEI-UHFFFAOYSA-N 0.000 description 2
- AOLHUMAVONBBEZ-STQMWFEESA-N Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AOLHUMAVONBBEZ-STQMWFEESA-N 0.000 description 2
- QZOSVNLXLSNHQK-UHFFFAOYSA-N Tyrosyl-Aspartate Chemical compound OC(=O)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 QZOSVNLXLSNHQK-UHFFFAOYSA-N 0.000 description 2
- 238000009825 accumulation Methods 0.000 description 2
- 229960001570 ademetionine Drugs 0.000 description 2
- 238000009632 agar plate Methods 0.000 description 2
- 108010072041 arginyl-glycyl-aspartic acid Proteins 0.000 description 2
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 2
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 2
- 239000001110 calcium chloride Substances 0.000 description 2
- 229910001628 calcium chloride Inorganic materials 0.000 description 2
- 235000011148 calcium chloride Nutrition 0.000 description 2
- 238000004113 cell culture Methods 0.000 description 2
- 238000005119 centrifugation Methods 0.000 description 2
- 235000013339 cereals Nutrition 0.000 description 2
- 239000003153 chemical reaction reagent Substances 0.000 description 2
- 244000038559 crop plants Species 0.000 description 2
- 235000020930 dietary requirements Nutrition 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 108010054813 diprotin B Proteins 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 230000002708 enhancing Effects 0.000 description 2
- 239000000284 extract Substances 0.000 description 2
- 239000000499 gel Substances 0.000 description 2
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 2
- 108010008237 glutamyl-valyl-glycine Proteins 0.000 description 2
- 108010073628 glutamyl-valyl-phenylalanine Proteins 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 2
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 2
- VLKZOEOYAKHREP-UHFFFAOYSA-N hexane Chemical compound CCCCCC VLKZOEOYAKHREP-UHFFFAOYSA-N 0.000 description 2
- 108010085325 histidylproline Proteins 0.000 description 2
- 125000001165 hydrophobic group Chemical group 0.000 description 2
- 239000003112 inhibitor Substances 0.000 description 2
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 2
- 108010054155 lysyllysine Proteins 0.000 description 2
- 239000012528 membrane Substances 0.000 description 2
- OKKJLVBELUTLKV-UHFFFAOYSA-N methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 2
- 244000005700 microbiome Species 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 229910052757 nitrogen Inorganic materials 0.000 description 2
- IJGRMHOSHXDMSA-UHFFFAOYSA-N nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 2
- 108010058731 nopaline synthase Proteins 0.000 description 2
- 238000007826 nucleic acid assay Methods 0.000 description 2
- 235000016709 nutrition Nutrition 0.000 description 2
- 230000036961 partial Effects 0.000 description 2
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 2
- 108010058453 phenylalanyl-glutamyl-glycine Proteins 0.000 description 2
- 108010024607 phenylalanylalanine Proteins 0.000 description 2
- 108010018625 phenylalanylarginine Proteins 0.000 description 2
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 2
- 108010077112 prolyl-proline Proteins 0.000 description 2
- 108010079317 prolyl-tyrosine Proteins 0.000 description 2
- 230000000644 propagated Effects 0.000 description 2
- 230000002829 reduced Effects 0.000 description 2
- 239000006152 selective media Substances 0.000 description 2
- 238000000926 separation method Methods 0.000 description 2
- 238000002864 sequence alignment Methods 0.000 description 2
- 230000035939 shock Effects 0.000 description 2
- 229940063673 spermidine Drugs 0.000 description 2
- 238000006467 substitution reaction Methods 0.000 description 2
- 239000000758 substrate Substances 0.000 description 2
- 238000000844 transformation Methods 0.000 description 2
- 230000014621 translational initiation Effects 0.000 description 2
- 108010080629 tryptophan-leucine Proteins 0.000 description 2
- 108010029384 tryptophyl-histidine Proteins 0.000 description 2
- 108010084932 tryptophyl-proline Proteins 0.000 description 2
- 108010038745 tryptophylglycine Proteins 0.000 description 2
- 101700075735 tyr-1 Proteins 0.000 description 2
- 108010020532 tyrosyl-proline Proteins 0.000 description 2
- 108010003885 valyl-prolyl-glycyl-glycine Proteins 0.000 description 2
- 108010015385 valyl-prolyl-proline Proteins 0.000 description 2
- DXJZITDUDUPINW-UHFFFAOYSA-N γ-glutamyl-Asparagine Chemical compound NC(=O)CCC(N)C(=O)NC(CC(N)=O)C(O)=O DXJZITDUDUPINW-UHFFFAOYSA-N 0.000 description 2
- SIGGQAHUPUBWNF-UHFFFAOYSA-N γ-glutamyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CCC(N)=O SIGGQAHUPUBWNF-UHFFFAOYSA-N 0.000 description 2
- DOFAQXCYFQKSHT-SRVKXCTJSA-N (2S)-1-[(2S)-1-[(2S)-2-amino-3-methylbutanoyl]pyrrolidine-2-carbonyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DOFAQXCYFQKSHT-SRVKXCTJSA-N 0.000 description 1
- YSMPVONNIWLJML-FXQIFTODSA-N (2S)-1-[(2S)-2-[[(2S)-2-aminopropanoyl]amino]-3-carboxypropanoyl]pyrrolidine-2-carboxylic acid Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(O)=O YSMPVONNIWLJML-FXQIFTODSA-N 0.000 description 1
- IDKGBVZGNTYYCC-QXEWZRGKSA-N (2S)-1-[(2S)-4-amino-2-[[(2S)-2-amino-3-methylbutanoyl]amino]-4-oxobutanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(O)=O IDKGBVZGNTYYCC-QXEWZRGKSA-N 0.000 description 1
- DSTWKJOBKSMVCV-UWVGGRQHSA-N (2S)-2-[[(2R)-2-amino-3-sulfanylpropanoyl]amino]-3-(4-hydroxyphenyl)propanoic acid Chemical compound SC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DSTWKJOBKSMVCV-UWVGGRQHSA-N 0.000 description 1
- UCXQIIIFOOGYEM-ULQDDVLXSA-N (2S)-2-[[(2S)-1-[(2S)-2-amino-4-methylpentanoyl]pyrrolidine-2-carbonyl]amino]-3-(4-hydroxyphenyl)propanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UCXQIIIFOOGYEM-ULQDDVLXSA-N 0.000 description 1
- NKCXQMYPWXSLIZ-PSRDDEIFSA-N (2S)-2-[[(2S)-1-[(2S)-5-amino-2-[[2-[[(2S)-6-amino-2-[[2-[[(2S)-2-[[(2S)-4-amino-2-[[(2S)-2-[[(2S,3R)-2-[[(2S)-2-[[(2S,3R)-2-amino-3-hydroxybutanoyl]amino]-3-(1H-indol-3-yl)propanoyl]amino]-3-hydroxybutanoyl]amino]propanoyl]amino]-4-oxobutanoyl]amino]-3-m Chemical compound O=C([C@H](CCC(N)=O)NC(=O)CNC(=O)[C@H](CCCCN)NC(=O)CNC(=O)[C@@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@@H](N)[C@@H](C)O)[C@@H](C)O)C(C)C)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O NKCXQMYPWXSLIZ-PSRDDEIFSA-N 0.000 description 1
- MHBUWPFQNPJTAS-QAETUUGQSA-N (2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2,4-diamino-4-oxobutanoyl]amino]-4-methylpentanoyl]amino]-3-phenylpropanoyl]amino]butanedioic acid Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 MHBUWPFQNPJTAS-QAETUUGQSA-N 0.000 description 1
- BUZMZDDKFCSKOT-CIUDSAMLSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-carboxybutanoyl]amino]-4-carboxybutanoyl]amino]pentanedioic acid Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BUZMZDDKFCSKOT-CIUDSAMLSA-N 0.000 description 1
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-phenylpropanoyl]amino]butanedioic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 1
- PDQDCFBVYXEFSD-SRVKXCTJSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]butanedioic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PDQDCFBVYXEFSD-SRVKXCTJSA-N 0.000 description 1
- ZNGPROMGGGFOAA-JYJNAYRXSA-N (2S)-2-[[(2S)-2-[[(2S)-2-azaniumyl-3-methylbutanoyl]amino]-3-(4-hydroxyphenyl)propanoyl]amino]-3-methylbutanoate Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 ZNGPROMGGGFOAA-JYJNAYRXSA-N 0.000 description 1
- ICYRCNICGBJLGM-HJGDQZAQSA-N (2S)-2-[[(2S,3R)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-hydroxybutanoyl]amino]butanedioic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O ICYRCNICGBJLGM-HJGDQZAQSA-N 0.000 description 1
- ZJZNLRVCZWUONM-JXUBOQSCSA-N (2S)-2-[[(2S,3R)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-hydroxybutanoyl]amino]propanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 1
- CCQOOWAONKGYKQ-BYPYZUCNSA-N (2S)-2-[[2-[(2-azaniumylacetyl)amino]acetyl]amino]propanoate Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 1
- BIZNDKMFQHDOIE-KKUMJFAQSA-N (2S)-4-amino-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-phenylpropanoyl]amino]-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 1
- BUDNAJYVCUHLSV-ZLUOBGJFSA-N (3S)-3-[[(2S)-2-aminopropanoyl]amino]-4-[[(1S)-1-carboxy-2-hydroxyethyl]amino]-4-oxobutanoic acid Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O BUDNAJYVCUHLSV-ZLUOBGJFSA-N 0.000 description 1
- JEDIEMIJYSRUBB-FOHZUACHSA-N (3S)-3-[[(2S,3R)-2-amino-3-hydroxybutanoyl]amino]-4-(carboxymethylamino)-4-oxobutanoic acid Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O JEDIEMIJYSRUBB-FOHZUACHSA-N 0.000 description 1
- SOYWRINXUSUWEQ-DLOVCJGASA-N (4S)-4-amino-5-[[(2S)-1-[[(1S)-1-carboxy-2-methylpropyl]amino]-3-methyl-1-oxobutan-2-yl]amino]-5-oxopentanoic acid Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 1
- VHJLVAABSRFDPM-UHFFFAOYSA-N 1,4-dimercaptobutane-2,3-diol Chemical compound SCC(O)C(O)CS VHJLVAABSRFDPM-UHFFFAOYSA-N 0.000 description 1
- UWOCFOFVIBZJGH-UHFFFAOYSA-N 2,3-dihydrodipicolinic acid Chemical compound OC(=O)C1CC=CC(C(O)=O)=N1 UWOCFOFVIBZJGH-UHFFFAOYSA-N 0.000 description 1
- 239000005631 2,4-D Substances 0.000 description 1
- BUXAPSQPMALTOY-UHFFFAOYSA-N 2-[(2-amino-3-sulfanylpropanoyl)amino]pentanedioic acid Chemical compound SCC(N)C(=O)NC(C(O)=O)CCC(O)=O BUXAPSQPMALTOY-UHFFFAOYSA-N 0.000 description 1
- LENZDBCJOHFCAS-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 1
- LVTKHGUGBGNBPL-UHFFFAOYSA-N 3-Amino-1,4-dimethyl-5H-pyrido[4,3-b]indole Chemical compound N1C2=CC=CC=C2C2=C1C(C)=C(N)N=C2C LVTKHGUGBGNBPL-UHFFFAOYSA-N 0.000 description 1
- 102000011848 5-Methyltetrahydrofolate-Homocysteine S-Methyltransferase Human genes 0.000 description 1
- 108010075604 5-Methyltetrahydrofolate-Homocysteine S-Methyltransferase Proteins 0.000 description 1
- 101710028178 ANPEP Proteins 0.000 description 1
- 229920001817 Agar Polymers 0.000 description 1
- 241000589158 Agrobacterium Species 0.000 description 1
- IPWKGIFRRBGCJO-IMJSIDKUSA-N Ala-Ser Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](CO)C([O-])=O IPWKGIFRRBGCJO-IMJSIDKUSA-N 0.000 description 1
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 1
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 1
- BFNBIHQBYMNNAN-UHFFFAOYSA-N Ammonium sulfate Chemical compound N.N.OS(O)(=O)=O BFNBIHQBYMNNAN-UHFFFAOYSA-N 0.000 description 1
- 241000219194 Arabidopsis Species 0.000 description 1
- SIFXMYAHXJGAFC-WDSKDSINSA-N Arg-Asp Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O SIFXMYAHXJGAFC-WDSKDSINSA-N 0.000 description 1
- ROWCTNFEMKOIFQ-YUMQZZPRSA-N Arg-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N ROWCTNFEMKOIFQ-YUMQZZPRSA-N 0.000 description 1
- QADCERNTBWTXFV-JSGCOSHPSA-N Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCNC(N)=N)N)C(O)=O)=CNC2=C1 QADCERNTBWTXFV-JSGCOSHPSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- 108010083946 Asp-Tyr-Leu-Lys Proteins 0.000 description 1
- SJUXYGVRSGTPMC-UHFFFAOYSA-N Asparaginyl-Alanine Chemical compound OC(=O)C(C)NC(=O)C(N)CC(N)=O SJUXYGVRSGTPMC-UHFFFAOYSA-N 0.000 description 1
- 229960005261 Aspartic Acid Drugs 0.000 description 1
- BHELIUBJHYAEDK-OAIUPTLZSA-N Aspoxicillin Chemical compound C1([C@H](C(=O)N[C@@H]2C(N3[C@H](C(C)(C)S[C@@H]32)C(O)=O)=O)NC(=O)[C@H](N)CC(=O)NC)=CC=C(O)C=C1 BHELIUBJHYAEDK-OAIUPTLZSA-N 0.000 description 1
- 241001453380 Burkholderia Species 0.000 description 1
- 241000589513 Burkholderia cepacia Species 0.000 description 1
- 239000008001 CAPS buffer Substances 0.000 description 1
- 108091022177 Cysteine synthases Proteins 0.000 description 1
- AYKQJQVWUYEZNU-UHFFFAOYSA-N Cysteinyl-Asparagine Chemical compound SCC(N)C(=O)NC(C(O)=O)CC(N)=O AYKQJQVWUYEZNU-UHFFFAOYSA-N 0.000 description 1
- YHDXIZKDOIWPBW-UHFFFAOYSA-N Cysteinyl-Glutamine Chemical compound SCC(N)C(=O)NC(C(O)=O)CCC(N)=O YHDXIZKDOIWPBW-UHFFFAOYSA-N 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 108010064711 EC 1.1.1.3 Proteins 0.000 description 1
- 108010071598 EC 2.7.1.39 Proteins 0.000 description 1
- 108010013369 EC 3.4.21.9 Proteins 0.000 description 1
- 108030003594 EC 4.1.1.20 Proteins 0.000 description 1
- 108091000044 EC 4.3.3.7 Proteins 0.000 description 1
- 108010076010 EC 4.4.1.8 Proteins 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- ZMMJGEGLRURXTF-UHFFFAOYSA-N Ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 1
- 108010074860 Factor Xa Proteins 0.000 description 1
- 101710030624 GALNT5 Proteins 0.000 description 1
- 101710007252 GG11003 Proteins 0.000 description 1
- 229960002989 Glutamic Acid Drugs 0.000 description 1
- 108010070675 Glutathione Transferase family Proteins 0.000 description 1
- 102000005720 Glutathione Transferase family Human genes 0.000 description 1
- XBGGUPMXALFZOT-VIFPVBQESA-N Gly-Tyr Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-VIFPVBQESA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 101710017531 H4C15 Proteins 0.000 description 1
- VHOLZZKNEBBHTH-YUMQZZPRSA-N His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 VHOLZZKNEBBHTH-YUMQZZPRSA-N 0.000 description 1
- MAJYPBAJPNUFPV-UHFFFAOYSA-N Histidinyl-Cysteine Chemical compound SCC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 MAJYPBAJPNUFPV-UHFFFAOYSA-N 0.000 description 1
- CTCFZNBRZBNKAX-UHFFFAOYSA-N Histidinyl-Glutamine Chemical compound NC(=O)CCC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 CTCFZNBRZBNKAX-UHFFFAOYSA-N 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- ILRYLPWNYFXEMH-WHFBIAKZSA-N L-cystathionine dizwitterion Chemical compound [O-]C(=O)[C@@H]([NH3+])CCSC[C@H]([NH3+])C([O-])=O ILRYLPWNYFXEMH-WHFBIAKZSA-N 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- 125000000773 L-serino group Chemical group [H]OC(=O)[C@@]([H])(N([H])*)C([H])([H])O[H] 0.000 description 1
- 101700021119 LEUC Proteins 0.000 description 1
- NTISAKGPIGTIJJ-IUCAKERBSA-N Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(C)C NTISAKGPIGTIJJ-IUCAKERBSA-N 0.000 description 1
- KFKWRHQBZQICHA-STQMWFEESA-N Leu-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 1
- 241000209510 Liliopsida Species 0.000 description 1
- 229920001776 Mature messenger RNA Polymers 0.000 description 1
- ZYTPOUNUXRBYGW-YUMQZZPRSA-N Met-Met Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CCSC ZYTPOUNUXRBYGW-YUMQZZPRSA-N 0.000 description 1
- WEDDFMCSUNNZJR-WDSKDSINSA-N Met-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O WEDDFMCSUNNZJR-WDSKDSINSA-N 0.000 description 1
- PESQCPHRXOFIPX-RYUDHWBXSA-N Met-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-RYUDHWBXSA-N 0.000 description 1
- FEMOMIGRRWSMCU-UHFFFAOYSA-N Ninhydrin Chemical compound C1=CC=C2C(=O)C(O)(O)C(=O)C2=C1 FEMOMIGRRWSMCU-UHFFFAOYSA-N 0.000 description 1
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 1
- 210000004940 Nucleus Anatomy 0.000 description 1
- YBYRMVIVWMBXKQ-UHFFFAOYSA-N PMSF Chemical compound FS(=O)(=O)CC1=CC=CC=C1 YBYRMVIVWMBXKQ-UHFFFAOYSA-N 0.000 description 1
- 240000005158 Phaseolus vulgaris Species 0.000 description 1
- OZILORBBPKKGRI-RYUDHWBXSA-N Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 OZILORBBPKKGRI-RYUDHWBXSA-N 0.000 description 1
- 102000030951 Phosphotransferases Human genes 0.000 description 1
- 108091000081 Phosphotransferases Proteins 0.000 description 1
- 108010064851 Plant Proteins Proteins 0.000 description 1
- 102000026947 Plant Proteins Human genes 0.000 description 1
- 210000002706 Plastids Anatomy 0.000 description 1
- FELJDCNGZFDUNR-WDSKDSINSA-N Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 FELJDCNGZFDUNR-WDSKDSINSA-N 0.000 description 1
- IWIANZLCJVYEFX-RYUDHWBXSA-N Pro-Phe Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 IWIANZLCJVYEFX-RYUDHWBXSA-N 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 108020004418 Ribosomal RNA Proteins 0.000 description 1
- YAHZABJORDUQGO-NQXXGFSBSA-N Ribulose-1,5-bisphosphate Chemical compound OP(=O)(O)OC[C@@H](O)[C@@H](O)C(=O)COP(O)(O)=O YAHZABJORDUQGO-NQXXGFSBSA-N 0.000 description 1
- 108050008511 S-adenosylmethionine synthase Proteins 0.000 description 1
- 101710042981 SHMT1 Proteins 0.000 description 1
- LZLREEUGSYITMX-UHFFFAOYSA-N Serinyl-Tryptophan Chemical compound C1=CC=C2C(CC(NC(=O)C(CO)N)C(O)=O)=CNC2=C1 LZLREEUGSYITMX-UHFFFAOYSA-N 0.000 description 1
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 1
- 238000002105 Southern blotting Methods 0.000 description 1
- 102100009508 TMPRSS15 Human genes 0.000 description 1
- 108090000190 Thrombin Proteins 0.000 description 1
- LCPVBXOHXMBLFW-JSGCOSHPSA-N Trp-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)=CNC2=C1 LCPVBXOHXMBLFW-JSGCOSHPSA-N 0.000 description 1
- UYKREHOKELZSPB-JTQLQIEISA-N Trp-Gly Chemical compound C1=CC=C2C(C[C@H](N)C(=O)NCC(O)=O)=CNC2=C1 UYKREHOKELZSPB-JTQLQIEISA-N 0.000 description 1
- 101710042194 Trpgamma Proteins 0.000 description 1
- UBAQSAUDKMIEQZ-QWRGUYRKSA-N Tyr-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UBAQSAUDKMIEQZ-QWRGUYRKSA-N 0.000 description 1
- PDSLRCZINIDLMU-QWRGUYRKSA-N Tyr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PDSLRCZINIDLMU-QWRGUYRKSA-N 0.000 description 1
- CGWAPUBOXJWXMS-HOTGVXAUSA-N Tyr-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 CGWAPUBOXJWXMS-HOTGVXAUSA-N 0.000 description 1
- BMPPMAOOKQJYIP-WMZOPIPTSA-N Tyr-Trp Chemical compound C([C@H]([NH3+])C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C([O-])=O)C1=CC=C(O)C=C1 BMPPMAOOKQJYIP-WMZOPIPTSA-N 0.000 description 1
- 210000003934 Vacuoles Anatomy 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 235000007244 Zea mays Nutrition 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 239000000654 additive Substances 0.000 description 1
- 230000000996 additive Effects 0.000 description 1
- 238000001042 affinity chromatography Methods 0.000 description 1
- 238000001261 affinity purification Methods 0.000 description 1
- 239000008272 agar Substances 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 1
- 108010070944 alanylhistidine Proteins 0.000 description 1
- 229910052921 ammonium sulfate Inorganic materials 0.000 description 1
- 235000011130 ammonium sulphate Nutrition 0.000 description 1
- 229960000070 antineoplastic Monoclonal antibodies Drugs 0.000 description 1
- 239000008346 aqueous phase Substances 0.000 description 1
- 108010052670 arginyl-glutamyl-glutamic acid Proteins 0.000 description 1
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 1
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 1
- 108010036533 arginylvaline Proteins 0.000 description 1
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 230000001486 biosynthesis of amino acids Effects 0.000 description 1
- UIIMBOGNXHQVGW-UHFFFAOYSA-M buffer Substances [Na+].OC([O-])=O UIIMBOGNXHQVGW-UHFFFAOYSA-M 0.000 description 1
- UXVMQQNJUSDDNG-UHFFFAOYSA-L cacl2 Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 1
- 239000005018 casein Substances 0.000 description 1
- 235000021240 caseins Nutrition 0.000 description 1
- 230000024881 catalytic activity Effects 0.000 description 1
- 230000001413 cellular Effects 0.000 description 1
- 230000030570 cellular localization Effects 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 108010031100 chloroplast transit peptides Proteins 0.000 description 1
- 230000023298 conjugation with cellular fusion Effects 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 108010069495 cysteinyltyrosine Proteins 0.000 description 1
- 230000003247 decreasing Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 150000004985 diamines Chemical class 0.000 description 1
- 235000005911 diet Nutrition 0.000 description 1
- 230000037213 diet Effects 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N edta Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 230000001214 effect on cellular process Effects 0.000 description 1
- 238000010828 elution Methods 0.000 description 1
- 238000009585 enzyme analysis Methods 0.000 description 1
- 238000001976 enzyme digestion Methods 0.000 description 1
- 239000002532 enzyme inhibitor Substances 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000000855 fermentation Methods 0.000 description 1
- 230000004151 fermentation Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 239000012458 free base Substances 0.000 description 1
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 230000035784 germination Effects 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- KGNSGRRALVIRGR-UHFFFAOYSA-N gln-tyr Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 KGNSGRRALVIRGR-UHFFFAOYSA-N 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- 102000005396 glutamine synthetase family Human genes 0.000 description 1
- 108020002326 glutamine synthetase family Proteins 0.000 description 1
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 1
- 108010013768 glutamyl-aspartyl-proline Proteins 0.000 description 1
- 108010049041 glutamylalanine Proteins 0.000 description 1
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 1
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 1
- 108010001064 glycyl-glycyl-glycyl-glycine Proteins 0.000 description 1
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 1
- 108010074027 glycyl-seryl-phenylalanine Proteins 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 238000003505 heat denaturation Methods 0.000 description 1
- 239000008079 hexane Substances 0.000 description 1
- 108010041601 histidyl-aspartyl-glutamyl-leucine Proteins 0.000 description 1
- 238000000265 homogenisation Methods 0.000 description 1
- 230000002209 hydrophobic Effects 0.000 description 1
- 238000004191 hydrophobic interaction chromatography Methods 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 230000000977 initiatory Effects 0.000 description 1
- 230000003834 intracellular Effects 0.000 description 1
- 238000004255 ion exchange chromatography Methods 0.000 description 1
- 238000005304 joining Methods 0.000 description 1
- 235000021374 legumes Nutrition 0.000 description 1
- 101710030587 ligN Proteins 0.000 description 1
- 238000007834 ligase chain reaction Methods 0.000 description 1
- 101700077585 ligd Proteins 0.000 description 1
- 230000000670 limiting Effects 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 101700072735 lys-1 Proteins 0.000 description 1
- 108010089256 lysyl-aspartyl-glutamyl-leucine Proteins 0.000 description 1
- 108010076718 lysyl-glutamyl-tryptophan Proteins 0.000 description 1
- 108009000345 mRNA Processing Proteins 0.000 description 1
- 230000013011 mating Effects 0.000 description 1
- 230000035800 maturation Effects 0.000 description 1
- 230000001404 mediated Effects 0.000 description 1
- QSHDDOUJBYECFT-UHFFFAOYSA-N mercury Chemical compound [Hg] QSHDDOUJBYECFT-UHFFFAOYSA-N 0.000 description 1
- 229910052753 mercury Inorganic materials 0.000 description 1
- 108010085203 methionylmethionine Proteins 0.000 description 1
- 108010045030 monoclonal antibodies Proteins 0.000 description 1
- 229960000060 monoclonal antibodies Drugs 0.000 description 1
- 102000005614 monoclonal antibodies Human genes 0.000 description 1
- 230000001338 necrotic Effects 0.000 description 1
- 238000007899 nucleic acid hybridization Methods 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 108010074082 phenylalanyl-alanyl-lysine Proteins 0.000 description 1
- 108010072637 phenylalanyl-arginyl-phenylalanine Proteins 0.000 description 1
- 235000021118 plant-derived protein Nutrition 0.000 description 1
- 230000006308 pollination Effects 0.000 description 1
- 108091008117 polyclonal antibodies Proteins 0.000 description 1
- 230000001124 posttranscriptional Effects 0.000 description 1
- 230000032361 posttranscriptional gene silencing Effects 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- FLVBXVXXXMLMOX-UHFFFAOYSA-N proquinazid Chemical compound C1=C(I)C=C2C(=O)N(CCC)C(OCCC)=NC2=C1 FLVBXVXXXMLMOX-UHFFFAOYSA-N 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 230000002797 proteolythic Effects 0.000 description 1
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 229920002973 ribosomal RNA Polymers 0.000 description 1
- 229920002477 rna polymer Polymers 0.000 description 1
- 150000003839 salts Chemical class 0.000 description 1
- 230000003248 secreting Effects 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 238000005204 segregation Methods 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 1
- 230000000392 somatic Effects 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 108010005652 splenotritin Proteins 0.000 description 1
- 238000003892 spreading Methods 0.000 description 1
- 239000007858 starting material Substances 0.000 description 1
- 230000001502 supplementation Effects 0.000 description 1
- 150000003573 thiols Chemical class 0.000 description 1
- 229960004072 thrombin Drugs 0.000 description 1
- 230000001052 transient Effects 0.000 description 1
- 230000014616 translation Effects 0.000 description 1
- PIEPQKCYPFFYMG-UHFFFAOYSA-N tris acetate Chemical compound CC(O)=O.OCC(N)(CO)CO PIEPQKCYPFFYMG-UHFFFAOYSA-N 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 230000021037 unidirectional conjugation Effects 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 230000003612 virological Effects 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
- OPINTGHFESTVAX-UHFFFAOYSA-N γ-glutamyl-Arginine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N OPINTGHFESTVAX-UHFFFAOYSA-N 0.000 description 1
Abstract
This invention relates to an isolated nucleic acid fragment encoding a plant enzyme that catalyzes steps in the biosynthesis of lysine, threonine, methionine, cysteine and isoleucine from aspartate, the enzyme being a member selected from the group consisting of: dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase. The invention also relates to the construction of a chimeric gene encoding all or a portion of the enzyme, in sense or antisense orientation, wherein expression of the chimeric gene results in production of altered levels of the enzyme in a transformed host cell.
Description
PLANT AMINO ACID BIOSYNTHETIC ENZYMES
This application claims the benefit of U.S. Provisional Application No. 60/048,771, filed June 6, 1997, and U.S. Provisional Application No. 60/049,443, filed June 12, 1997.
FIELD OF THE INVENTION
This invention is in the field of plant molecular biology. More specifically, this invention relates to nucleic acid fragments that encode enzymes involved in the biosynthesis of amino acids in plants and seeds.
BACKGROUND OF THE INVENTION
Many vertebrates, including man, lack the ability to manufacture a number of amino acids and therefore require these amino acids preformed in their diet. These are called essential amino acids. Human foods and animal feeds derived from many grains are deficient in essential amino acids, such as lysine, the sulfur amino acids methionine and cysteine, threonine and tryptophan. For example, in corn (Zea mays L.) lysine is the most limiting amino acid for the dietary requirements of many animals. Soybean (Glycine max L.) meal is used as an additive to corn-based animal feeds, primarily as a lysine supplement. Thus, an increase in the lysine content of either corn or soybean would reduce or eliminate the need to supplement mixed grain feeds with lysine produced via microbial fermentation. In addition, in corn the sulfur amino acids are the third most limiting amino acids, after lysine and tryptophan, for the dietary requirements of many animals. The use of soybean meal, which is rich in lysine and tryptophan, to supplement corn in animal feed is limited by the low sulfur amino acid content of the legume. Thus, an increase in the sulfur amino acid content of either corn or soybean would improve the nutritional quality of the mixtures and reduce the need for further supplementation through the addition of more expensive methionine.
Lysine, threonine, methionine, cysteine and isoleucine are amino acids derived from aspartate. The regulation of the biosynthesis of each member of this family is interconnected (see Figure 1). One approach to increasing the nutritional quality of human foods and animal feeds is to increase production and accumulation of specific free amino acids via genetic engineering of this biosynthetic pathway.
Alteration of enzyme activity in this pathway could lead to altered levels of lysine, threonine, methionine, cysteine and isoleucine. However, few of the genes that encode enzymes that regulate this pathway in plants, especially corn, soybeans and wheat, are available.
The organization of the pathway leading to the biosynthesis of lysine, threonine, methionine, cysteine and isoleucine indicates that overexpression or reduced expression of genes encoding, inter alia, threonine synthase, dihydrodipicolinate reductase, diaminopimelate epimerase, threonine deaminase and S-adenosylmethionine synthetase in corn, soybean, wheat and other crop plants could be used to alter the levels of these amino acids in human food and animal feed. Accordingly, the availability of nucleic acid sequences encoding all or a portion of these enzymes would facilitate the development of nutritionally enhanced crop plants.
BRIEF DESCRIPTION OF THE INVENTION
The present invention relates to isolated nucleic acid fragments that encode plant enzymes involved in amino acid biosynthesis. Specifically, this invention concerns isolated nucleic acid fragments encoding the following plant enzymes that catalyze steps in the biosynthesis of lysine, threonine, methionine, cysteine and isoleucine from aspartate: dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase, and S-adenosylmethionine synthetase. In addition, this invention relates to nucleic acid fragments that are complementary to nucleic acid fragments encoding the listed plant biosynthetic enzymes.
In another embodiment, the present invention relates to chimeric genes encoding the amino acid biosynthetic enzymes listed above, or to chimeric genes comprising nucleic acid fragments that are complementary to the nucleic acid fragments encoding the enzymes, operably linked to suitable regulatory sequences, wherein expression of the chimeric genes results in production of levels of the encoded enzymes in transformed host cells that are altered (i.e., increased or decreased) from the levels produced in untransformed host cells.
In a further embodiment, the present invention relates to a transformed host cell comprising in its genome a chimeric gene encoding a plant amino acid biosynthetic enzyme, operably linked to suitable regulatory sequences, the enzyme being selected from the group consisting of: dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase. Expression of the chimeric gene results in production of altered levels of the biosynthetic enzyme in the transformed host cell. The transformed host cells may be of eukaryotic or prokaryotic origin, and include cells derived from higher plants and microorganisms. The invention also includes transformed plants that arise from transformed host cells of higher plants, and seeds derived from such transformed plants.
In a further embodiment, the present invention relates to a method of altering the level of expression of a plant biosynthetic enzyme in a transformed host cell, comprising: a) transforming a host cell with a chimeric gene comprising a nucleic acid fragment encoding a plant biosynthetic enzyme selected from the group consisting of dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase, operably linked to suitable regulatory sequences; and b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene, wherein expression of the chimeric gene results in production of altered levels of the biosynthetic enzyme in the transformed host cell.
A further embodiment of the present invention relates to a method for obtaining a nucleic acid fragment encoding all or substantially all of the amino acid sequence of a plant dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase or S-adenosylmethionine synthetase.
A further embodiment of the present invention is a method for evaluating at least one compound for its ability to inhibit the activity of a plant biosynthetic enzyme selected from the group consisting of dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase, the method comprising the steps of: (a) transforming a host cell with a chimeric gene comprising a nucleic acid fragment encoding a plant biosynthetic enzyme selected from the group consisting of dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase, operably linked to suitable regulatory sequences; (b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene, wherein expression of the chimeric gene results in production of the biosynthetic enzyme in the transformed host cell; (c) optionally purifying the biosynthetic enzyme expressed by the transformed host cell; (d) treating the biosynthetic enzyme with a compound to be tested; and (e) comparing the activity of the biosynthetic enzyme that has been treated with the test compound to the activity of an untreated biosynthetic enzyme, thereby selecting compounds with potential inhibitory activity.
BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE DESCRIPTIONS
The invention can be fully understood from the following detailed description and accompanying drawings and sequence descriptions which are part of this application.
Figure 1 depicts the biosynthetic pathway for the aspartate family of amino acids. The following abbreviations are used: AK = aspartokinase; ASADH = aspartic semialdehyde dehydrogenase; DHDPS = dihydrodipicolinate synthase; DHDPR = dihydrodipicolinate reductase; DAPEP = diaminopimelate epimerase; DAPDC = diaminopimelate decarboxylase; HDH = homoserine dehydrogenase; HK = homoserine kinase; TS = threonine synthase; TD = threonine deaminase; CγS = cystathionine γ-synthase; CβL = cystathionine β-lyase; MS = methionine synthase; CS = cysteine synthase; and SAMS = S-adenosylmethionine synthetase.
Figure 2 shows a multiple alignment of amino acid sequence fragments reported here encoding dihydrodipicolinate reductase (SEQ ID NOs: 2 and 4) and the dihydrodipicolinate reductase sequence of Synechocystis sp. declared in DDBJ Accession No. D90899 (SEQ ID NO: 5).
Figure 3 shows a multiple alignment of the amino acid sequence fragments reported here encoding diaminopimelate epimerase (SEQ ID NOs: 7, 9, 11 and 13) and the diaminopimelate epimerase sequence of Synechocystis sp. set forth in DDBJ Accession No. D90917 (SEQ ID NO: 14).
Figure 4 shows the multiple alignment of the amino acid sequence fragments reported here encoding threonine synthase (SEQ ID Nos: 16, 18, 20, 22, 24 and 26) and the threonine synthase sequence of Arabidopsis thaliana reported in GenBank Accession No L41666 (SEQ ID NO: 27).
Figure 5 shows a multiple alignment of the amino acid sequence fragments reported herein encoding threonine deaminase (SEQ ID NOs: 29, 31 and 33) and the Burkholderia cepacia threonine deaminase sequence set forth in GenBank Accession No. U40630 (SEQ ID NO: 34).
Figure 6 shows the alignment of the nucleotide sequence of S-adenosylmethionine synthetase reported here for corn (SEQ ID NO: 35) with the nucleotide sequence of S-adenosylmethionine synthetase from Oryza sativa set forth in EMBL Accession No. Z26867 (SEQ ID NO: 37).
Figure 7 shows the alignment of the nucleotide sequence of S-adenosylmethionine synthetase reported here for soybean (SEQ ID NO: 38) with the nucleotide sequence of S-adenosylmethionine synthetase from Lycopersicon esculentum set forth in EMBL Accession No. Z24741 (SEQ ID NO: 40).
Figure 8 shows the alignment of the nucleotide sequence of S-adenosylmethionine synthetase reported here for wheat (SEQ ID NO: 41) with the nucleotide sequence of S-adenosylmethionine synthetase from Hordeum vulgare set forth in DDBJ Accession No. D63835 (SEQ ID NO: 43).
The amino acid sequence alignments were performed using the Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) with the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). The nucleotide sequence alignments were the result of BLASTN searches performed with each individual S-adenosylmethionine synthetase sequence.
The following sequence descriptions and the sequence listing attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825.
SEQ ID NO: 1 is the nucleotide sequence comprising the entire cDNA insert in clone csiln.pk0042.a3 encoding a corn dihydrodipicolinate reductase.
SEQ ID NO: 2 is the deduced amino acid sequence of a portion of a corn dihydrodipicolinate reductase derived from the nucleotide sequence of SEQ ID NO: 1.
SEQ ID NO: 3 is the nucleotide sequence comprising a portion of the cDNA insert in clone rls2.pk0017.d3 encoding a rice dihydrodipicolinate reductase.
SEQ ID NO: 4 is the deduced amino acid sequence of a portion of a rice dihydrodipicolinate reductase derived from the nucleotide sequence of SEQ ID NO: 3.
SEQ ID NO: 5 is the entire amino acid sequence of dihydrodipicolinate reductase from Synechocystis sp. DDBJ Accession No. D90899.
SEQ ID NO: 6 is the nucleotide sequence comprising the entire cDNA insert in clone chp2.pk0008.h4 encoding a corn diaminopimelate epimerase.
SEQ ID NO: 7 is the deduced amino acid sequence of a portion of a corn diaminopimelate epimerase derived from the nucleotide sequence of SEQ ID NO: 6.
SEQ ID NO: 8 is the nucleotide sequence comprising a portion of the cDNA insert in clone rls48.pk0036.hl0 encoding a diaminopimelate epimerase from rice.
SEQ ID NO: 9 is the deduced amino acid sequence of a portion of a rice diaminopimelate epimerase derived from the nucleotide sequence of SEQ ID NO: 8.
SEQ ID NO: 10 is the nucleotide sequence comprising a contig formed from portions of the cDNA inserts in clones sfll.pk0031.h3 and sgslc.pk002.kl2 and the entire cDNA inserts in clones se2.pk0005.fl and ses8w.pk0010.hll encoding a soybean diaminopimelate epimerase.
SEQ ID NO: 11 is the deduced amino acid sequence of soybean diaminopimelate epimerase derived from the nucleotide sequence of SEQ ID NO: 10.
SEQ ID NO: 12 is the nucleotide sequence comprising a portion of the cDNA insert in clone wlm24.pk0030.g4 encoding a wheat diaminopimelate epimerase.
SEQ ID NO: 13 is the deduced amino acid sequence of a portion of a wheat diaminopimelate epimerase derived from the nucleotide sequence of SEQ ID NO: 12.
SEQ ID NO: 14 is the nucleotide sequence comprising the entire diaminopimelate epimerase of Synechocystis sp. found in DDBJ Accession No. D90917.
SEQ ID NO: 15 is the nucleotide sequence comprising the entire cDNA insert in clone cc2.pk0031.c9 encoding a corn threonine synthase.
SEQ ID NO: 16 is the deduced amino acid sequence of a portion of a corn threonine synthase derived from the nucleotide sequence of SEQ ID NO: 15.
SEQ ID NO: 17 is the nucleotide sequence comprising a portion of the cDNA insert in clone csl.pk0058.g5 encoding a corn threonine synthase.
SEQ ID NO: 18 is the deduced amino acid sequence of a portion of a corn threonine synthase derived from the nucleotide sequence of SEQ ID NO: 17.
SEQ ID NO: 19 is the nucleotide sequence comprising a portion of the cDNA insert in clone rls72.pk0018.e7 encoding a rice threonine synthase.
SEQ ID NO: 20 is the deduced amino acid sequence of a portion of a rice threonine synthase derived from the nucleotide sequence of SEQ ID NO: 19.
SEQ ID NO: 21 is the nucleotide sequence comprising a portion of the cDNA insert in clone sel.06a03 encoding a soybean threonine synthase.
SEQ ID NO: 22 is the amino acid sequence deduced from a portion of a threonine synthase of soybean, derived from the nucleotide sequence declared in SEQ ID NO: 21.
SEQ ID NO: 23 is the nucleotide sequence comprising the entire cDNA insert in clone srl.pk0003.f6 encoding a soybean threonine synthase.
SEQ ID NO: 24 is the amino acid sequence deduced from a portion of a threonine synthase of soybean derived from the nucleotide sequence declared in SEQ ID NO: 23.
SEQ ID NO: 25 is the nucleotide sequence comprising a portion of the cDNA insert in clone wrl.pk0085.h2 encoding a wheat threonine synthase.
SEQ ID NO: 26 is the amino acid sequence deduced from a portion of a wheat threonine synthase derived from the nucleotide sequence declared in SEQ ID NO: 25.
SEQ ID NO: 27 is the entire amino acid sequence of an Arabidopsis thaliana threonine synthase, found in GenBank Accession No. L41666.
SEQ ID NO: 28 is the nucleotide sequence comprising the insertion of the entire cDNA in the clone above .pk0064. f4 encoding a threonine deaminase from corn.
SEQ ID NO: 29 is the amino acid sequence deduced from a portion of a threonine deaminase of corn derived from the nucleotide sequence declared in SEQ ID NO: 28.
SEQ ID NO: 30 is the nucleotide sequence comprising a portion of the cDNA insert in clone sfll.pk0055.h7 encoding a threonine deaminase of soy.
SEQ ID NO: 31 is the amino acid sequence deduced from a portion of a threonine deaminase of soybean derived from the nucleotide sequence declared in SEQ ID NO: 30.
SEQ ID NO: 32 is the nucleotide sequence comprising the entire cDNA insert in clone sre.pk0044.f3 encoding a soybean threonine deaminase.
SEQ ID NO: 33 is the deduced amino acid sequence of a portion of a soybean threonine deaminase derived from the nucleotide sequence of SEQ ID NO: 32.
SEQ ID NO: 34 is the entire amino acid sequence of a threonine deaminase from Burkholderia cepacia found in GenBank Accession No. U49630.
SEQ ID NO: 35 is the nucleotide sequence comprising the entire cDNA insert in clone cc3.mn0002.d2 encoding an entire corn S-adenosylmethionine synthetase.
SEQ ID NO: 36 is the deduced amino acid sequence of a corn S-adenosylmethionine synthetase derived from the nucleotide sequence of SEQ ID NO: 35.
SEQ ID NO: 37 is the entire nucleotide sequence of an S-adenosylmethionine synthetase from Oryza sativa found in EMBL Accession No. Z26867.
SEQ ID NO: 38 is the nucleotide sequence of the entire cDNA insert in clone s2.12b206 encoding an entire soybean S-adenosylmethionine synthetase.
SEQ ID NO: 39 is the deduced amino acid sequence of an entire soybean S-adenosylmethionine synthetase derived from the nucleotide sequence of SEQ ID NO: 38.
SEQ ID NO: 40 is the entire nucleotide sequence of an S-adenosylmethionine synthetase from Lycopersicon esculentum found in EMBL Accession No. Z24741.
SEQ ID NO: 41 is the nucleotide sequence comprising a contig formed from portions of the cDNA inserts in clones wrel.pk0002.cl2, wleln.pk0070.b8, wkmlc.pk0003.g4, wlkl.pk0028.d3, wreln.pkl70.d8, wrl.pk0086.d5, wrl.pk0103.h8 and wreln.pk0082.b2 encoding a portion of a wheat S-adenosylmethionine synthetase.
SEQ ID NO: 42 is the deduced amino acid sequence of a wheat S-adenosylmethionine synthetase derived from the nucleotide sequence declared in SEQ ID NO: 41.
SEQ ID NO: 43 is the entire nucleotide sequence of an S-adenosylmethionine synthetase from Hordeum vulgare found in DDBJ Accession No. D63835.
The Sequence Descriptions contain the one-letter code for nucleotide sequence characters and the three-letter codes for amino acids, as defined in conformity with the IUPAC-IUB standards described in Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical Journal 219 (No. 2):345-373 (1984), which are incorporated herein by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.
DETAILED DESCRIPTION OF THE INVENTION
In the context of this disclosure, a number of terms will be used. As used herein, an "isolated nucleic acid fragment" is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may comprise one or more segments of cDNA, genomic DNA or synthetic DNA. As used herein, "contig" refers to an assemblage of overlapping nucleic acid sequences that form one contiguous nucleotide sequence. For example, several DNA sequences can be compared and aligned to identify common or overlapping regions. The individual sequences can then be assembled into a single contiguous nucleotide sequence.
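The assembly of overlapping sequences into a single contiguous sequence, as just described, can be illustrated with a minimal sketch. It assumes exact (error-free) overlaps, fragments supplied in left-to-right order, and an arbitrary minimum overlap; the function names and the `min_overlap` parameter are hypothetical and are not taken from this specification or from any assembly program.

```python
from typing import List


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble_contig(fragments: List[str], min_overlap: int = 3) -> str:
    """Merge ordered, exactly overlapping fragments into one contiguous sequence.

    Illustrative assumption: fragments are already ordered 5' to 3' and overlap
    without mismatches; real cDNA assembly must tolerate sequencing errors.
    """
    contig = fragments[0]
    for fragment in fragments[1:]:
        size = overlap_length(contig, fragment, min_overlap)
        if size == 0:
            raise ValueError("fragment does not overlap the growing contig")
        contig += fragment[size:]  # append only the non-overlapping tail
    return contig
```

For instance, the three toy fragments ATGGCTA, GCTAGGT and GGTCCA share overlaps of four and three bases and merge into the single sequence ATGGCTAGGTCCA.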
As used herein, "substantially similar" refers to nucleic acid fragments wherein changes in one or more nucleotide bases result in substitution of one or more amino acids, but do not affect the functional properties of the protein encoded by the DNA sequence. "Substantially similar" also refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration of gene expression by antisense or co-suppression technology. "Substantially similar" also refers to modifications of the nucleic acid fragments of the present invention, such as deletion or insertion of one or more nucleotides, that do not substantially affect the functional properties of the resulting transcript vis-à-vis the ability to mediate alteration of gene expression by antisense or co-suppression technology, or alteration of the functional properties of the resulting protein molecule. It is therefore understood that the invention encompasses more than the specific exemplary sequences.
For example, it is well known in the art that antisense suppression and co-suppression of gene expression may be accomplished using nucleic acid fragments representing less than the entire coding region of a gene, and by nucleic acid fragments that do not share 100% identity with the gene to be suppressed. Moreover, alterations in a gene which result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded protein, are well known in the art. Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a functionally equivalent product. Nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. Moreover, the skilled artisan recognizes that substantially similar sequences encompassed by this invention are also defined by their ability to hybridize, under stringent conditions (0.1X SSC, 0.1% SDS, 65 °C), with the sequences exemplified herein. Preferred substantially similar nucleic acid fragments of the present invention are those nucleic acid fragments whose DNA sequences are 80% identical to the DNA sequence of the nucleic acid fragments reported herein. More preferred nucleic acid fragments are 90% identical to the DNA sequence of the nucleic acid fragments reported herein. Most preferred are nucleic acid fragments that are 95% identical to the DNA sequence of the nucleic acid fragments reported herein. The Clustal multiple alignment algorithm (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) was used here with a GAP PENALTY of 10 and a GAP LENGTH PENALTY of 10.
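The identity thresholds stated above (80%, 90% and 95%) can be illustrated with a short sketch. It assumes two sequences that have already been aligned (gaps written as "-"), treats each threshold as a minimum, and counts identity over all aligned columns; that counting convention, like the function names, is an assumption made for illustration and is not necessarily how the Clustal/Megalign software reports identity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumption: inputs are pre-aligned and of equal length; columns where both
    sequences carry a gap are ignored, and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def preference_tier(identity: float) -> str:
    """Map a percent identity onto the tiers named in the text (as minimums)."""
    if identity >= 95.0:
        return "most preferred"
    if identity >= 90.0:
        return "more preferred"
    if identity >= 80.0:
        return "preferred"
    return "below the stated thresholds"
```

Two aligned ten-base sequences differing at one position score 90% under this convention and fall in the "more preferred" tier.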
A "substantial portion" of an amino acid or nucleotide sequence comprises sufficient of the amino acid sequence of a polypeptide or of the nucleotide sequence of a gene to achieve a putative identification of that polypeptide or gene, either by manual evaluation of the sequence by an expert, or by comparison of automated computer sequences and identification using algorithms such as BLAST (Basic Local Alignment Seach Tool; Altchul, SF, et al., (1993) J.Mol.Biol.215: 403410; see also www.ncbi.nlm.nih.gov/BLAST/). In general, the sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to putatively identify the polypeptide or nucleic acid sequence as a homolog of a known protein or gene. In addition, with respect to the nucleotide sequences, the specific oligonucleotide gene examined, comprising 20-30 contiguous nucleotides, can be used in methods of genetic identification of dependent chain (eg, Southern hybridization) and isolation (eg, in if your hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases can be used as amplification primers in PCR in order to obtain a specific fragment of particular nucleic acid comprising primers. Accordingly, a "substantial portion" of a nucleotide sequence comprises sufficient of the sequence to provide specific identification and / or isolation of the nucleic acid fragment comprising the sequence. The current specification teaches the partial or complete sequences of amino acids and nucleotides encoding one or more particular plant proteins. The skilled artisan, having the benefit of the sequences as reported here, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in the art. Accordingly, the present invention comprises the complete sequences as reported in the Sequence List, as well as the substantial portions of these sequences as defined above.
"Degenerate codon" refers to divergence in the genetic code allowing variation of the nucleotide sequence without affecting the amino acid sequence of a coded polypeptide. Accordingly, the present invention relates to any nucleic acid fragment that encodes all or a substantial portion of the amino acid sequence encoding biosynthetic amino acid enzymes as set forth in the sequences SEQ ID NOS: 2, 4, 7, 9, 11 , 13, 16, 18, 20,
22, 24, 26, 29, 31 and 33. The skilled artisan is well aware of the "codon-bias" displayed by a specific host cell in use of nucleotide codons to specify a given amino acid. Therefore, when a gene is synthesized to improve expression in the host cell, it is desirable to design such a gene that its frequency of codon usage approaches the frequency of use of preferred codons of the host cell.
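As an illustration of designing a gene around host codon preference, the following sketch surveys a set of host coding sequences, records the most frequently used codon for each amino acid, and back-translates a protein using those codons. It assumes the standard genetic code and in-frame coding sequences; the function names are hypothetical, and choosing only the single most frequent codon is a simplification rather than a method described in this specification.

```python
from collections import Counter, defaultdict
from itertools import product
from typing import Dict, Iterable

# Standard genetic code in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def preferred_codons(host_coding_sequences: Iterable[str]) -> Dict[str, str]:
    """Return {amino acid: most frequently used codon} from in-frame host CDSs."""
    counts = defaultdict(Counter)
    for cds in host_coding_sequences:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            amino_acid = CODON_TABLE.get(codon)  # skips codons with ambiguous bases
            if amino_acid is not None:
                counts[amino_acid][codon] += 1
    return {aa: codon_counts.most_common(1)[0][0]
            for aa, codon_counts in counts.items()}


def back_translate(protein: str, preferences: Dict[str, str]) -> str:
    """Encode a protein (one-letter code) with the host's preferred codons.

    Raises KeyError for any residue absent from the surveyed host sequences.
    """
    return "".join(preferences[aa] for aa in protein)
```

In practice the survey would draw on many host genes; a single short sequence is enough only to show the mechanics.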
"Synthetic genes" can be assembled from blocks constructed of oligonucleotides that are chemically synthesized using methods known to those skilled in the art. These constructed blocks are linked and hardened to form segments of genes that are then enzymatically assembled to build the entire gene. "Chemically synthesized", in relation to the DNA sequence, means that the nucleotide components were assembled in vi tro. Manual chemical synthesis of DNA can be completed using well-established procedures, or automatic chemical synthesis of DNA can be performed using one of a considerable number of commercial machines available. Therefore, genes can be adapted for optimal gene expression based on optimization of the nucleotide sequence to reflect the codon-bias of the host cell. Trained artisans appreciate the likelihood of successful gene expression if the use of the codon is predisposed to those codons favored by the host. The determination of the preferred codons can be based on a study of the genes derived from the host cell where the sequence of information is available.
"Gene" refers to the nucleic acid fragment that expresses a specific protein, including the regulatory sequences preceding (5 'non-coding sequence) and following (3' non-coding sequence) the coding sequence. "Native gene" refers to the gene as it is found in nature with its own regulatory sequences. "Chimeric gene" is. refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene can comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a different way from that found in nature. "Endogenous gene" refers to the native gene in its original position in the genome of an organism. "Foreign" gene refers to a gene not normally found in the host organism, but which is introduced into the host organism by gene transfer. Foreign genes may comprise native genes inserted into a non-native organism, or chimeric genes. A "transgene" is a gene that has been introduced into the genome by a transformation procedure.
"Coding sequence" refers to a DNA sequence that codes for a specific amino acid sequence "Regulatory sequences" refers to nucleotide sequences located at the upper end (non-coding sequences), middle, or end lower (3 'non-coding sequences) of a coding sequence, and which influence the transcription, processing or stability of RNA, or translation of an associated coding sequence Regulatory sequences may include promoters, leader translation sequences, introns and polyadenylation sequences recognizing
"Promoter" refers to a DNA sequence capable of controlling the expression of an RNA coding or functional sequence. In general, a coding sequence is located 3 'to a promoter sequence. The promoter sequence consists of proximal and more distal elements at the upper end as well as the last elements are referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence which can stimulate promoter activity and can be an innate element of a promoter or a heterologous element inserted to improve the level or tissue specificity of a promoter. The promoters can be derived in their entirety from a native gene, or they can be composed of different elements derived from different promoters found in nature, or they can still comprise synthetic segments of DNA. It is understood by those skilled in the art that different promoters can direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause a gene to be expressed in most cell types are most often referred to as "constitutive promoters". New promoters of various useful types in plant cells are constantly being discovered; several examples can be found in the compilation of Okamuro and Goldberg, (1989) Biochemistry of Plants 15: 1 -82. It is further recognized that, in most cases, the exact boundaries of the regulatory sequences have not been fully defined, DNA fragments of different lengths may have identical promoter activity.
The "leader translation sequence" refers to a DNA sequence located between the promoter sequence of a gene and the coding sequence. The leader translation sequence is presented at the fully processed upper end of messenger RNA of the initial translation sequence. The leader translation sequence may affect the processing of the primary translation of messenger RNA, the stability of the messenger RNA or the translation efficiency. Examples of leading translation sequences have been described (Turner, R. and Foster, G.D. (1985) Molecular Biotechnology 3: 225).
The "3 'non-coding sequences" refer to DNA sequences located at the lower end of a coding sequence and include polyadenylation recognition sequences and other coding regulatory signal signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized in that it affects the addition of polyadenylic acid channels to the 3 'end of the mRNA precursor. The use of different 3 'non-coding sequences is exemplified by Ingelbrecht et. al., (1989) Plant Cell 1: 671-680.
"RNA transcription" refers to the product resulting from transcription catalyzed by an RNA polymerase to a DNA sequence. When the RNA transcript is a perfect complementary copy of a DNA sequence, it is referred to as a primary transcript or it can be an RNA sequence derived from a post-transcriptional process of the primary transcript and is referred to as mature RNA. "Messenger RNA (mRNA)" refers to RNA without introns and can be translated into proteins by the cell. "cDNA" refers to a double strand of DNA that is complementary and derived from mRNA.
"Sense of RNA" refers to the transcription of RNA that includes mRNA and can then be translated into proteins by the cell. "RNA derivative" refers to the transcription of RNA that is complementary to all or part of a primary primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat No. 5,107,065). The complementarity of an RNA in contradiction can be with any part of the transcription of the specific gene, i.e., in the 5 'non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to an RNA in contradiction, ribosomal RNA or other RNA that is not translated and still has an effect on cellular processes.
The term "operably linked" refers to the association of nucleic acid sequences in a single fragment of nucleic acid such that the function of one is affected by the other. For example, a promoter is operably linked to a coding sequence when it is capable of affecting the expression of the coding sequence (i.e., that the coding sequence is under transcriptional control of the promoter). Coding sequences can be operatively linked to regulatory sequences in normal orientation or in contradiction.
The term "expression" as used herein, refers to the transcription and stable accumulation of normal RNA (mRNA) or nonsense RNA derived from the nucleic acid fragment of the invention. Expression can also refer to the translation of mRNA into a polypeptide. "Inconsensus inhibition" refers to the production of transcripts of RNA in contradiction capable of suppressing the expression of the target protein. "Over-expression" refers to the production of a gene product in transgenic organisms that exceeds production levels in normal or non-transformed organisms. "Cosuppression" refers to the production of normal RNA transcripts capable of suppressing the expression of identical or substantially similar foreign endogenous genes (U.S. Patent No. 5,231,020).
"Altered levels" refers to the production of gene products in transgenic organisms in amounts or proportions that differ from those of normal or non-transformed organisms. "Mature protein" refers to a post-translationally processed polypeptide; i.e., one from which any pre- or propeptides present in the primary translation product have been removed.
"Precursor protein" refers to the primary product of mRNA translation; i.e., with pre and propeptides still present. Pre and propeptides are not limited to intracellular localization signals.
A "chloroplast transit peptide" is an amino acid sequence which is translated in conjunction with a protein and directs the protein to the chloroplast or other types of plastids present in the cell in which the protein was made. "Chloroplast transit sequence" refers to a nucleotide sequence that encodes a transit peptide in chloroplasts. A "peptide signal" is an amino acid sequence which is translated in conjunction with a protein and directs the protein to the secretory system (Chrispeels, JJ, (1991) Ann. Rev. Plant Phys. Plant, Mol. Biol. : 21-53). If the protein is directed to a vacuole, a specific vacuolar signal (supra) can be added later, or if it is directed to the endoplasmic reticulum, an endoplasmic reticulum retention signal (supra) can be added. If the protein is directed to the nucleus, any present peptide signal must be removed and instead include a nuclear localization signal (Raikhel (1992) Plant Phys.100: 1627-1632).
"Transformation" refers to the transfer of a nucleic acid fragment to the genome of the host organism, resulting in genetically stable hereditary characters. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" organisms. Examples of methods for plant transformation include Agrobacterium-mediated transformation (De Blaere et al (1987) Meth. Enzymol 143: 277) and accelerated particles or "bombardment of gene" transformation technology (Klein et.al. 1987) Nature (London) 327: 70-73; US Pat No. 4,945,050).
Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter "Maniatis").
Nucleic acid fragments encoding at least a portion of several biosynthetic amino acid enzymes in plants have been isolated and identified by comparison of cDNA sequences in randomly selected plants, to publish databases containing nucleotide and protein sequences using BLAST algorithms well known to those skilled in the art. Table 1 lists the biosynthetic amino acid enzymes described herein, and the designation of cDNA clones comprising the nucleic acid fragments encoding these enzymes.
TABLE 1 Amino acid biosynthetic enzymes
The nucleic acid fragments of the instant invention may be used to isolate cDNAs and genes encoding homologous enzymes from the same or other plant species. Isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization, and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction, ligase chain reaction).
For example, genes encoding other amino acid biosynthetic enzymes, either as cDNAs or genomic DNAs, could be isolated directly by using all or a portion of the instant nucleic acid fragments as DNA hybridization probes to screen libraries from any desired plant, employing methodology well known to those skilled in the art. Specific oligonucleotide probes based upon the instant nucleic acid sequences can be designed and synthesized by methods known in the art (Maniatis). Moreover, the entire sequences can be used directly to synthesize DNA probes by methods known to the skilled artisan, such as random-primer DNA labeling, nick translation, or end-labeling techniques, or RNA probes using suitable in vitro transcription systems. In addition, specific primers can be designed and used to amplify a part or all of the instant sequences. The resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full-length cDNA or genomic fragments under conditions of appropriate stringency.
In addition, two short segments of the instant nucleic acid fragments may be used in polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. The polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts at the 3' end of the mRNA precursor encoding plant genes. Alternatively, the second primer sequence may be based upon sequences derived from the cloning vector. For example, the skilled artisan can follow the RACE protocol (Frohman et al., (1988) PNAS USA 85:8998) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3' or 5' end. Primers oriented in the 3' and 5' directions can be designed from the instant sequences.
Using commercially available 3' RACE or 5' RACE systems (BRL), specific 3' or 5' cDNA fragments can be isolated (Ohara et al., (1989) PNAS USA 86:5673; Loh et al., (1989) Science 243:217). Products generated by the 3' and 5' RACE procedures can be combined to generate full-length cDNAs (Frohman, M.A. and Martin, G.R., (1989) Techniques 1:165).
Availability of the instant nucleotide and deduced amino acid sequences facilitates immunological screening of cDNA expression libraries. Synthetic peptides representing portions of the instant amino acid sequences may be synthesized. These peptides can be used to immunize animals to produce polyclonal or monoclonal antibodies with specificity for peptides or proteins comprising the amino acid sequences. These antibodies can then be used to screen cDNA expression libraries to isolate full-length cDNA clones of interest (Lerner, R.A. (1984) Adv. Immunol. 36:1; Maniatis).
The nucleic acid fragments of the instant invention may be used to create transgenic plants in which the disclosed biosynthetic enzymes are present at higher or lower levels than normal, or in cell types or developmental stages in which they are not normally found. This would have the effect of altering the level of free amino acids in those cells.
Overexpression of the biosynthetic enzymes of the instant invention may be accomplished by first constructing chimeric genes in which the coding regions are operably linked to promoters capable of directing expression of a gene in the desired tissues at the desired stage of development. For reasons of convenience, the chimeric genes may comprise promoter sequences and translation leader sequences derived from the same genes. 3' non-coding sequences encoding transcription termination signals may also be provided. The instant chimeric genes may also comprise one or more introns in order to facilitate gene expression.
Plasmid vectors comprising the instant chimeric genes can then be constructed. The choice of plasmid vector depends upon the method that will be used to transform host plants. The skilled artisan is well aware of the genetic elements that must be present on the plasmid vector in order to successfully transform, select and propagate host cells containing the chimeric genes. The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression (Jones et al. (1985) EMBO J. 4:2411-2418; De Almeida et al., (1989) Mol. Gen. Genetics 218:78-86), and thus that multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by Southern analysis of DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.
For some applications it may be useful to direct the instant biosynthetic enzymes to different cellular compartments, or to facilitate their secretion from the cell. It is thus envisioned that the chimeric genes described above may be further supplemented by altering the coding sequences to encode enzymes with appropriate intracellular targeting sequences such as transit sequences (Keegstra, K. (1989) Cell 56:247-253), signal sequences or sequences encoding endoplasmic reticulum localization (Chrispeels, J.J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53), or nuclear localization signals (Raikhel, N. (1992) Plant Phys. 100:1627-1632) added, and/or with targeting sequences that are already present removed. While the references cited give examples of each of these, the list is not exhaustive and more useful targeting signals may be discovered in the future.
It may also be desirable to reduce or eliminate expression of genes encoding the instant biosynthetic enzymes in plants for some applications. In order to accomplish this, chimeric genes designed for co-suppression of the instant biosynthetic enzymes can be constructed by linking the genes or gene fragments encoding the enzymes to plant promoter sequences. Alternatively, chimeric genes designed to express antisense RNA for all or part of the instant nucleic acid fragments can be constructed by linking the genes or gene fragments in reverse orientation to plant promoter sequences. Either the co-suppression or the antisense chimeric genes can be introduced into plants via transformation, whereby expression of the corresponding endogenous genes is reduced or eliminated.
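As an illustration of what "reverse orientation" means at the sequence level, the following minimal Python sketch computes the reverse complement of a coding fragment, which is the strand that would be read as sense when the fragment is linked backwards to a promoter. The function name and the example sequence are hypothetical and are not taken from the sequence listing.

```python
# Minimal sketch (illustrative only): the reverse complement of a DNA fragment.
# The example sequence below is invented for demonstration purposes.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense_fragment = "ATGGCTTCAGGA"  # hypothetical coding fragment, 5' to 3'
    antisense_fragment = reverse_complement(sense_fragment)
    print(antisense_fragment)
```

Applying the function twice returns the original fragment, which is a convenient sanity check when preparing antisense constructs in silico.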
The instant amino acid biosynthetic enzymes (or portions of the enzymes) may be produced in heterologous host cells, particularly in the cells of microbial hosts, and can be used to prepare antibodies to the enzymes by methods well known to those skilled in the art. The antibodies are useful for detecting the enzymes in situ in cells or in vitro in cell extracts. Preferred heterologous host cells for production of the instant amino acid biosynthetic enzymes are microbial hosts. Microbial expression systems and expression vectors containing regulatory sequences that direct high-level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct chimeric genes for production of the instant amino acid biosynthetic enzymes. These chimeric genes could then be introduced into appropriate microorganisms via transformation to provide high-level expression of the enzymes. An example of a vector for high-level expression of the instant amino acid biosynthetic enzymes in a bacterial host is provided (Example 11).
Additionally, the instant plant amino acid biosynthetic enzymes can be used as targets to facilitate the design and/or identification of inhibitors of the enzymes that may be useful as herbicides. This is desirable because the enzymes described herein catalyze various steps in a pathway leading to production of several essential amino acids. Accordingly, inhibition of the activity of one or more of the enzymes described herein could lead to inhibition of amino acid biosynthesis sufficient to inhibit plant growth. Thus, the instant plant amino acid biosynthetic enzymes could be appropriate targets for new herbicide discovery and design.
All or a substantial portion of the nucleic acid fragments of the instant invention may also be used as probes for genetically and physically mapping the genes of which they are a part, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes. For example, the instant nucleic acid fragments may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Maniatis) of restriction-digested plant genomic DNA may be probed with the nucleic acid fragments of the instant invention. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al., (1987) Genomics 1:174-181) in order to construct a genetic map. In addition, the nucleic acid fragments of the instant invention may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing the parents and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the instant nucleic acid sequence in the genetic map previously obtained using this population (Botstein, D. et al., (1980) Am. J. Hum. Genet. 32:314-331).
The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky, R. and Tanksley, S.D. (1986) Plant Mol. Biol. Reporter 4(1):37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near-isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.
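To make the idea of calculating a map position from segregation data more concrete, the following Python sketch estimates a two-point recombination fraction from counts of recombinant progeny and converts it to a map distance with the Haldane mapping function. This is a textbook simplification offered as an assumption for illustration; it is not the multipoint analysis performed by MapMaker, and the progeny counts are invented.

```python
import math

# Minimal two-point linkage sketch (illustrative assumption, not the MapMaker
# algorithm). The counts used in the example are invented.


def recombination_fraction(recombinants: int, total: int) -> float:
    """Fraction of scored progeny that are recombinant between two markers."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("counts must satisfy 0 <= recombinants <= total, total > 0")
    return recombinants / total


def haldane_cm(r: float) -> float:
    """Haldane map distance in centimorgans for a recombination fraction r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


if __name__ == "__main__":
    r = recombination_fraction(recombinants=10, total=100)
    print(f"r = {r:.3f}, distance = {haldane_cm(r):.2f} cM")
```

The Haldane function assumes no crossover interference; other mapping functions could be substituted without changing the overall procedure.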
Nucleic acid probes derived from the instant nucleic acid sequences may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel, J.D., et al., In: Nonmammalian Genomic Analysis: A Practical Guide, Academic Press 1996, pp. 319-346, and references cited therein).
In another embodiment, nucleic acid probes derived from the instant nucleic acid sequences may be used in direct fluorescence in situ hybridization (FISH) mapping (Trask, B.J. (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favor the use of large clones (several to several hundred kb; see Laan, M. et al. (1995) Genome Research 5:13-20), improvements in sensitivity may allow FISH mapping to be performed using shorter probes.
A variety of nucleic acid amplification-based methods of genetic and physical mapping may be carried out using the instant nucleic acid sequences.
Examples include allele-specific amplification (Kazazian, H.H. (1989) J. Lab. Clin. Med. 114(2):95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield, V.C. et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren, U. et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov, B.P. (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter, M.A. et al. (1997) Nature Genetics 7:22-28) and Happy Mapping (Dear, P.H. and Cook, P.R. (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid fragment is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art. In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.
Loss-of-function mutant phenotypes may be identified for the instant cDNA clones either by targeted gene disruption protocols or by identifying specific mutants for these genes contained in a maize population carrying mutations in all possible genes (Ballinger and Benzer, (1989) Proc. Natl. Acad. Sci. USA 86:9402; Koes et al., (1995) Proc. Natl. Acad. Sci. USA 92:8149; Bensen et al., (1995) Plant Cell 7:75). The latter approach may be accomplished in two ways. First, short segments of the instant nucleic acid fragments may be used in polymerase chain reaction protocols in conjunction with a mutation tag sequence primer on DNAs prepared from a population of plants in which Mutator transposons or some other mutation-causing DNA element has been introduced (see Bensen, supra). Amplification of a specific DNA fragment with these primers indicates insertion of the mutation tag element in or near the plant gene encoding dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase, or S-adenosylmethionine synthetase. Alternatively, the instant nucleic acid fragment may be used as a hybridization probe against PCR amplification products generated from the mutated population using the mutation tag sequence primer in conjunction with an arbitrary genomic site primer, such as that for a restriction enzyme site-anchored synthetic adaptor. With either method, a plant containing a mutation in the endogenous gene encoding a dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase, or S-adenosylmethionine synthetase can be identified and obtained. This mutant plant can then be used to determine or confirm the natural function of the dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase, or S-adenosylmethionine synthetase gene product.
EXAMPLES
The present invention is further defined in the following Examples, in which all parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention and, without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various uses and conditions.
EXAMPLE 1 Composition of cDNA libraries; isolation and sequencing of cDNA clones
cDNA libraries representing mRNAs from various corn, rice, soybean and wheat tissues were prepared. The characteristics of the libraries are described below.
TABLE 2
* These libraries were normalized essentially as described in U.S. Pat. No. 5,482,845.
** Application of 6-iodo-2-propoxy-3-propyl-4(3H)-quinazolinone; synthesis and methods of using this compound are described in USSN 08/545,827, incorporated herein by reference.
The cDNA libraries were prepared in UNI-ZAP™ XR vectors according to the manufacturer's protocol (Stratagene Cloning Systems, La Jolla, CA). Conversion of the UNI-ZAP™ XR libraries into plasmid libraries was carried out according to the protocol provided by Stratagene. Upon conversion, the cDNA inserts were contained in the plasmid vector pBluescript. The cDNA inserts from randomly picked bacterial colonies containing recombinant pBluescript plasmids were amplified via the polymerase chain reaction using primers specific for vector sequences flanking the inserted cDNA sequences, or plasmid DNA was prepared from cultured bacterial cells. Amplified insert DNAs or plasmid DNAs were sequenced in labeled-primer sequencing reactions to generate partial cDNA sequences (expressed sequence tags or "ESTs"; see Adams, M.D. et al., (1991) Science 252:1651). The resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent sequencer.
EXAMPLE 2 Identification and characterization of cDNA clones
ESTs encoding plant amino acid biosynthetic enzymes were identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul, S.F., et al., (1993) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences contained in the BLAST "nr" database (comprising all non-redundant GenBank CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data Bank, the major release of the SWISS-PROT protein sequence database, and the EMBL and DDBJ databases). The cDNA sequences obtained in Example 1 were analyzed for similarity to all publicly available DNA sequences contained in the "nr" database using the BLASTN algorithm provided by the National Center for Biotechnology Information (NCBI). The DNA sequences were translated in all reading frames and compared for similarity to all publicly available protein sequences contained in the "nr" database using the BLASTX algorithm (Gish, W. and States, D.J. (1993) Nature Genetics 3:266-272) provided by the NCBI. For convenience, the P-value (probability) of observing a match of a cDNA sequence to a sequence contained in the searched databases merely by chance, as calculated by BLAST, is reported herein as a "pLog" value, which represents the negative of the logarithm of the reported P-value. Accordingly, the greater the pLog value, the greater the likelihood that the cDNA sequence and the BLAST "hit" represent homologous proteins.
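The pLog transformation described above can be sketched in a few lines of Python. The text does not state the base of the logarithm; base 10 is assumed here for illustration, and the function name is hypothetical.

```python
import math

# Minimal sketch of the pLog transformation. ASSUMPTION: base-10 logarithm;
# the description only says "the negative of the logarithm" of the P-value.


def plog(p_value: float) -> float:
    """Return the negative base-10 logarithm of a BLAST P-value."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must be in the interval (0, 1]")
    return -math.log10(p_value)


if __name__ == "__main__":
    # A smaller chance probability gives a larger pLog value.
    for p in (1e-5, 1e-12, 1e-99):
        print(f"P = {p:g} -> pLog = {plog(p):.2f}")
```

Under this assumption, a reported pLog of 12.60 would correspond to a chance probability of roughly 10 to the power of -12.6.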
EXAMPLE 3 Characterization of cDNA clones encoding polypeptides homologous to dihydrodipicolinate reductase
The BLASTX search using the nucleotide sequences from clones csin.pk0042.a3 and rls2.pk0017.d3 revealed similarity of the proteins encoded by the cDNAs to dihydrodipicolinate reductase from Synechocystis sp. (DDBJ Accession No. D90899). The BLAST pLog values were 12.60 and 11.68 for csin.pk0042.a3 and rls2.pk0017.d3, respectively.
The sequence of the entire cDNA insert in clone csin.pk0042.a3 was determined and is shown in SEQ ID NO: 1; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 2. The amino acid sequence set forth in SEQ ID NO: 2 was evaluated by BLASTP, yielding a pLog value of 36.72 versus the Synechocystis sp. dihydrodipicolinate reductase sequence. The sequence of a portion of the cDNA insert from clone rls2.pk0017.d3 is shown in SEQ ID NO: 3; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 4. Figure 2 presents an alignment of the amino acid sequence set forth in SEQ ID NO: 2 and the Synechocystis sp. dihydrodipicolinate reductase sequence (SEQ ID NO: 5). SEQ ID NO: 2 is 40% identical to the Synechocystis sp. dihydrodipicolinate reductase sequence (SEQ ID NO: 5). Sequence alignments were performed by the Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). Sequence percent identity calculations were performed by the Jotun Hein method (Hein, J.J. (1990) Meth. Enz. 183:626-645) using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI).
The sequence alignments and the BLAST scores and probabilities indicate that the instant nucleic acid fragments encode a nearly entire corn dihydrodipicolinate reductase and a portion of a rice dihydrodipicolinate reductase. These sequences represent the first plant sequences encoding dihydrodipicolinate reductase.
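For readers unfamiliar with percent identity figures such as the 40% reported above, the following Python sketch shows one simple way to compute identity from two already-aligned sequences. It is an illustrative simplification and does not reproduce the Jotun Hein method or the Megalign program; in particular, the choice to skip gapped columns is an assumption, and the example sequences are invented.

```python
# Minimal percent-identity sketch over a pairwise alignment (illustrative only;
# NOT the Jotun Hein method). ASSUMPTION: columns containing a gap ("-") in
# either sequence are excluded from the comparison.


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Invented six-column alignment with one gap and one mismatch.
    print(percent_identity("MKV-LA", "MRVGLA"))
```

Different programs treat gaps and alignment length differently, so values from this sketch are not expected to match those produced by commercial alignment software.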
EXAMPLE 4 Characterization of cDNA clones encoding diaminopimelate epimerase
The BLASTX search using the nucleotide sequences from clones chp2.pk0008.h4, rls48.pk0036.h10, wlm24.pk0030.g4, and the contig sequence assembled from clones se2.pk0005.f1, ses8w.pk0010.h11, sfll.pk0031.h3, and sgslc.pk002.k12 revealed similarity of the proteins encoded by the cDNAs to diaminopimelate epimerase from Synechocystis sp. (DDBJ Accession No. D90917). The BLAST results for each of these ESTs are shown in Table 3:
TABLE 3 BLAST results for clones encoding polypeptides homologous to diaminopimelate epimerase
The sequence of the entire cDNA insert in clone chp2.pk0008.h4 was determined and is shown in SEQ ID NO: 6; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 7. The amino acid sequence set forth in SEQ ID NO: 7 was evaluated by BLASTP, yielding a pLog value of 75.66 versus the Synechocystis sp. sequence. The sequence of a portion of the cDNA insert from clone rls48.pk0036.h10 is shown in SEQ ID NO: 8; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 9. The nucleotide sequence of the contig assembled from clones se2.pk0005.f1, ses8w.pk0010.h11, sfll.pk0031.h3, and sgslc.pk002.k12 was determined and is shown in SEQ ID NO: 10; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 11. The amino acid sequence set forth in SEQ ID NO: 11 was evaluated by BLASTP, yielding a pLog value of 98.57 versus the Synechocystis sp. sequence. The sequence of a portion of the cDNA insert from clone wlm24.pk0030.g4 is shown in SEQ ID NO: 12; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 13. Figure 3 presents an alignment of the amino acid sequences set forth in SEQ ID NOs: 7, 9, 11 and 13 and the Synechocystis sp. sequence (SEQ ID NO: 14). The data in Table 4 represent a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs: 7, 9, 11 and 13 and the Synechocystis sp. sequence.
TABLE 4 Percent identity of amino acid sequences deduced from the nucleotide sequences of cDNA clones encoding polypeptides homologous to diaminopimelate epimerase
Sequence alignments were performed by the Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). Sequence percent identity calculations were performed by the Jotun Hein method (Hein, J.J. (1990) Meth. Enz. 183:626-645) using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI).
The sequence alignments and the BLAST scores and probabilities indicate that the instant nucleic acid fragments encode a nearly entire corn diaminopimelate epimerase (chp2.pk0008.h4), a portion of a rice diaminopimelate epimerase (rls48.pk0036.h10), an entire soybean diaminopimelate epimerase (se2.pk0005.f1, ses8w.pk0010.h11, sfll.pk0031.h3, and sgslc.pk002.k12), and a portion of a wheat diaminopimelate epimerase (wlm24.pk0030.g4). These sequences represent the first plant sequences encoding the enzyme diaminopimelate epimerase.
EXAMPLE 5 Characterization of cDNA clones encoding threonine synthase
The BLASTX search using the EST sequences from clones cc2.pk0031.c9, csl.pk0058.g5, rls72.pk0018.e7, sel.06a03, srl.pk0003.f6, and wrl.pk0085.h2 revealed similarity of the proteins encoded by the cDNAs to threonine synthase from Arabidopsis thaliana (GenBank Accession No. L41666). The BLAST results for each of these ESTs are shown in Table 5:
TABLE 5 BLAST results for clones encoding polypeptides homologous to threonine synthase
The sequence of the entire cDNA insert in clone cc2.pk0031.c9 was determined and is shown in SEQ ID NO: 15; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 16. The amino acid sequence set forth in SEQ ID NO: 16 was evaluated by BLASTP, yielding a pLog value of 166.11 versus the Arabidopsis thaliana sequence. BLASTN against dbEST indicated identity of nucleotides 520 through 684 from cc2.pk0031.c9 with nucleotides 1 through 162 of a corn EST (GenBank Accession No. T18847). The sequence of a portion of the cDNA insert from clone csl.pk0058.g5 is shown in SEQ ID NO: 17; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 18. The sequence of a portion of the cDNA insert from clone rls72.pk0018.e7 is shown in SEQ ID NO: 19; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 20. The sequence of a portion of the cDNA insert from clone sel.06a03 is shown in SEQ ID NO: 21; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 22. The sequence of the entire cDNA insert in clone srl.pk0003.f6 was determined and is shown in SEQ ID NO: 23; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 24. The amino acid sequence set forth in SEQ ID NO: 24 was evaluated by BLASTP, yielding a pLog value of 275.06 versus the Arabidopsis thaliana sequence. The sequence of a portion of the cDNA insert from clone wrl.pk0085.h2 is shown in SEQ ID NO: 25; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 26. Figure 4 presents an alignment of the amino acid sequences set forth in SEQ ID NOs: 16, 18, 20, 22, 24 and 26 and the Arabidopsis thaliana sequence. The data in Table 6 represent a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs: 16, 18, 20, 22, 24 and 26 and the Arabidopsis thaliana sequence (SEQ ID NO: 27).
TABLE 6 Percent identity of amino acid sequences deduced from the nucleotide sequences of cDNA clones encoding polypeptides homologous to threonine synthase
Sequence alignments were performed by the Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). Sequence percent identity calculations were performed by the Jotun Hein method (Hein, J.J. (1990) Meth. Enz. 183:626-645) using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI).
The sequence alignments and the BLAST scores and probabilities indicate that the instant nucleic acid fragments encode portions of corn threonine synthase (cc2.pk0031.c9 and csl.pk0058.g5), a portion of rice threonine synthase (rls72.pk0018.e7), portions of soybean threonine synthase (sel.06a03 and srl.pk0003.f6), and a portion of wheat threonine synthase (wrl.pk0085.h2). These sequences represent the first corn, rice, soybean and wheat sequences encoding threonine synthase.
EXAMPLE 6 Characterization of cDNA Clones that encode Threonine Deaminase
The BLASTX search using the EST sequence from clone cenl.pk0064.f4 revealed similarity of the protein encoded by the cDNA to threonine deaminase from Burkholderia cepacia (GenBank Accession No. U40630; pLog = 31.38). The BLASTX search using the EST sequences from clones sfll.pk0055.h7 and sre.pk0044.f3 revealed similarity of the proteins encoded by the cDNAs to threonine deaminase from Solanum tuberosum and Burkholderia cepacia (EMBL Accession No. X67846 and GenBank Accession No. U40630, respectively). The BLAST pLog values were 36.55 and 31.79 for sfll.pk0055.h7, and 19.47 and 14.51 for sre.pk0044.f3.
The sequence of the entire cDNA insert in clone cenl.pk0064.f4 was determined and is shown in SEQ ID NO: 28; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 29. The amino acid sequence set forth in SEQ ID NO: 29 was evaluated by BLASTP, yielding a pLog value of 134.85 versus the Burkholderia cepacia sequence. The sequence of a portion of the cDNA insert in clone sfll.pk0055.h7 is shown in SEQ ID NO: 30; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 31. The sequence of the entire cDNA insert in clone sre.pk0044.f3 was determined and is shown in SEQ ID NO: 32; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 33. The amino acid sequence set forth in SEQ ID NO: 33 was evaluated by BLASTP, yielding a pLog value of 19.24 versus the Solanum tuberosum sequence and 15.19 versus the Burkholderia cepacia threonine deaminase sequence. Figure 5 presents an alignment of the amino acid sequences set forth in SEQ ID NOs: 29, 31 and 33 and the Burkholderia cepacia sequence (SEQ ID NO: 34). The data in Table 7 represent a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs: 29, 31 and 33 and the Burkholderia cepacia sequence.
TABLE 7 Percent identity of amino acid sequences deduced from the nucleotide sequences of cDNA clones encoding polypeptides homologous to threonine deaminase
Sequence alignments were performed by the Clustal method of alignment (Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153) using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). Sequence percent identity calculations were performed by the Jotun Hein method (Hein, J.J. (1990) Meth. Enz. 183:626-645) using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI).
The sequence alignments and the BLAST scores and probabilities indicate that the instant nucleic acid fragments encode an entire or nearly entire corn threonine deaminase (cenl.pk0064.f4) and portions of soybean threonine deaminase (sfll.pk0055.h7 and sre.pk0044.f3). These sequences represent the first corn and soybean sequences encoding threonine deaminase.
EXAMPLE 7 Characterization of cDNA Clones Encoding S-adenosylmethionine synthetase.
The BLASTX investigation using the nucleotide sequence of the clone cc3.mn0002.d2 revealed similarity of the protein encoded by the cDNA to S-adenosylmethionine synthetase from Oriza sativa (EMBL Accession No. Z26867; pLog = 99.03). The sequence of the entire cDNA insert in clone cc3.mn0002.d2 was determined and is shown in SEQ ID NO: 35; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 36. The nucleotide sequence declared in SEQ ID NO: 35 was evaluated by BLASTN, yielding a pLog value greater than 200 versus the Oriza sativa sequence. Figure 6 presents a sequence alignment of 6
nucleotides declared in SEQ ID NO: 35 and the sequence of Oriza sativa (SEQ ID NO: 37). The nucleotide sequence in SEQ ID NO: 35 is 88.5% identical over 1216 nucleotides to the nucleotide sequence of the Oriza sativa S-adenisylmethionine synthetase.
The BLASTX investigation using the nucleotide sequence of clone s2.12b06 revealed similarity of the protein encoded by the cDNA to S-adenosylmethionine synthetase from Lycopersicom esculentum (EMBL Accession No. Z24741; pLog = 62.62). The sequence of the whole cDNA insert in clone s2.12b06 was determined and is shown in SEQ ID NO: 38; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 39. The nucleotide sequence declared in SEQ ID NO: 38 was evaluated by BLASTN, yielding a pLog value greater than 200 versus the sequence of Lycopersicom esculentum. Figure 7 shows an alignment of nucleotide sequences declared in SEQ ID NO: 38 and the sequence of Lycopersicom esculentum (SEQ ID NO: 40). The nucleotide sequence declared in SEQ ID NO: 38 is 82% identical over 1210 nucleotides to the sequence of Lycopersicom esculentum.
The BLASTX search using the nucleotide sequence of the contig assembled from clones wrel.pk0002.cl2, wleln.pk0070.b8, wkmlc.pk0003.g4, wlkl.pk0028.d3, wrelnpkl70.d8, wrl.pk0086.d5, wrl.pk0103.h8, and wreln.pk0082.b2 revealed similarity of the protein encoded by the contig to S-adenosylmethionine synthetase from Hordeum vulgare (DDBJ Accession No. 63835), with a pLog value greater than 200. The nucleotide sequence of this contig is shown in SEQ ID NO: 41; the deduced amino acid sequence of this cDNA is shown in SEQ ID NO: 42. Figure 8 presents an alignment of the nucleotide sequence set forth in SEQ ID NO: 41 and the Hordeum vulgare sequence (SEQ ID NO: 43). SEQ ID NO: 41 is 92% identical to the Hordeum vulgare sequence.
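The identity figures reported in this example (for instance 88.5% over 1216 nucleotides) can be reproduced from any pairwise alignment. The following is a minimal sketch, assuming the two sequences are already aligned to equal length with "-" as the gap character and that gapped columns are excluded from the count; other conventions for gaps exist and would give slightly different numbers.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Return (percent identity, number of compared columns) for two aligned strings.

    Columns in which either sequence has a gap are skipped (an assumption;
    some tools count them as mismatches instead).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared, compared


# Toy alignment: 4 matches in 5 ungapped columns.
identity, columns = percent_identity("ACGT-A", "ACCTTA")
print(f"{identity:.1f}% identical over {columns} nucleotides")

# Number of matching positions implied by 88.5% identity over 1216 nucleotides.
print(round(0.885 * 1216))
```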
The sequence alignments and the BLAST scores and probabilities indicate that the instant nucleic acid fragments encode all or nearly all of the S-adenosylmethionine synthetase from corn, soybean and wheat. These sequences represent the first corn, soybean and wheat sequences encoding S-adenosylmethionine synthetase.
EXAMPLE 8 Expression of Chimeric Genes in Monocotyledon Cells.
A chimeric gene can be constructed comprising a cDNA encoding a plant amino acid biosynthetic enzyme in sense orientation with respect to the corn 27 kD zein promoter located 5' to the cDNA fragment, and the 10 kD zein 3' end located 3' to the cDNA fragment. The cDNA fragment of this gene can be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers under appropriate experimental conditions. Cloning sites (NcoI or SmaI) can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the digested vector pML103 as described below. The amplified DNA can then be digested with the restriction enzymes NcoI and SmaI and fractionated on a 0.7% low-melting-point agarose gel in 40 mM Tris-acetate, pH 8.5, 1 mM EDTA. The appropriate band can be excised from the gel, melted at 68 °C and combined with a 4.9 kb NcoI-SmaI fragment of plasmid pML103. Plasmid pML103 has been deposited under the terms of the Budapest Treaty at the ATCC (American Type Culture Collection, 10801 University Boulevard, Manassas, VA 20110-2209) and bears accession number ATCC 97366. The DNA segment from pML103 contains a 1.05 kb NcoI-SmaI promoter fragment of the corn 27 kD zein gene and a 0.96 kb SmaI-SalI fragment from the 3' end of the corn 10 kD zein gene in the vector pGem9Zf(+) (Promega). Vector and insert DNA can be ligated at 15 °C overnight, essentially as described (Maniatis). The ligated DNA can then be used to transform E. coli XL1-Blue (Epicurian Coli XL-1 Blue; Stratagene). Bacterial transformants can be screened by restriction enzyme digestion of plasmid DNA and by limited nucleotide sequence analysis using the dideoxy chain termination method (Sequenase™ DNA Sequencing Kit; U.S. Biochemical).
The resulting plasmid construct comprises a chimeric gene encoding, in the 5' to 3' direction, the corn 27 kD zein promoter, a cDNA fragment encoding a plant amino acid biosynthetic enzyme, and the 10 kD zein 3' region.
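Because the amplified fragment is digested with NcoI and SmaI before ligation, any internal copy of either recognition site in the cDNA would also be cut. The sketch below is one hedged way to check for that in advance; the helper name is hypothetical, and the recognition sequences used are the standard ones for these enzymes (NcoI: CCATGG, SmaI: CCCGGG). Both sites are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences of the two cloning enzymes named in this example.
SITES = {"NcoI": "CCATGG", "SmaI": "CCCGGG"}


def find_sites(seq: str, sites: dict = SITES) -> dict:
    """Return 1-based start positions of each recognition site in seq."""
    clean = "".join(seq.split()).upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = clean.find(motif)
        while start != -1:
            positions.append(start + 1)  # report 1-based coordinates
            start = clean.find(motif, start + 1)
        hits[enzyme] = positions
    return hits


# Nucleotides 51-80 of SEQ ID NO: 1, which span an internal NcoI site.
fragment = "AACGACCTCA CCATGGTTCT GGGCTCCATA"
print(find_sites(fragment))
```

Where an internal site is found, a different cloning site or a partial digest would have to be considered; the source does not specify how such cases are handled.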
The chimeric gene described above can then be introduced into corn cells by the following procedure. Immature corn embryos can be dissected from developing caryopses derived from crosses of the inbred corn lines H99 and LH132. The embryos are isolated 10 to 11 days after pollination, when they are 1.0 to 1.5 mm long. The embryos are then placed with the axis side facing down and in contact with agarose-solidified N6 medium (Chu et al. (1975) Sci. Sin. Peking 16: 659-668). The embryos are kept in the dark at 27 °C. Friable embryogenic callus, consisting of undifferentiated masses of cells with proembryoids and embryoids borne on suspensor structures, proliferates from the scutellum of these immature embryos. The embryogenic callus isolated from the primary explant can be cultured on N6 medium and subcultured on this medium every 2 to 3 weeks.
Plasmid p35S/Ac (obtained from Peter Eckes, Hoechst AG, Frankfurt, Germany) can be used in transformation experiments to provide a selectable marker. This plasmid contains the pat gene (see European Patent Publication 0 242 236), which encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT confers resistance to herbicidal glutamine synthetase inhibitors such as phosphinothricin. The pat gene in p35S/Ac is under the control of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313: 810-812) and the 3' region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens.
The particle bombardment method (Klein et al. (1987) Nature 327: 70-73) can be used to transfer genes to the callus culture cells. According to this method, gold particles (1 μm in diameter) are coated with DNA using the following technique. Ten μg of plasmid DNA are added to 50 μL of a suspension of gold particles (60 mg per mL). Calcium chloride (50 μL of a 2.5 M solution) and spermidine free base (20 μL of a 1.0 M solution) are added to the particles. The suspension is vortexed during the addition of these solutions. After 10 minutes, the tubes are briefly centrifuged (5 sec at 15,000 rpm) and the supernatant removed. The particles are resuspended in 200 μL of absolute ethanol, centrifuged again and the supernatant removed. The ethanol rinse is performed again and the particles are resuspended in a final volume of 30 μL of ethanol. An aliquot (5 μL) of the DNA-coated gold particles can be placed in the center of a Kapton™ disc (Bio-Rad Labs). The particles are then accelerated into the corn tissue with a Biolistic™ PDS-1000/He (Bio-Rad Instruments, Hercules CA), using a helium pressure of 1000 psi, a gap distance of 0.5 cm and a flying distance of 1.0 cm.
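The quantities in the coating protocol above determine how much gold and DNA each 5 μL aliquot can carry. The sketch below works through that arithmetic. It assumes no particles are lost during the ethanol rinses and that all of the DNA binds to the gold, so the DNA figure is an upper bound rather than a measured value.

```python
def per_shot(gold_ul, gold_mg_per_ml, dna_ug, final_ul, aliquot_ul):
    """Return (mg gold, ug DNA) carried by one aliquot of the final suspension.

    Assumes no loss of particles in the washes and complete DNA binding.
    """
    gold_mg = gold_ul * gold_mg_per_ml / 1000.0  # uL x mg/mL -> mg
    fraction = aliquot_ul / final_ul             # share of the prep per shot
    return gold_mg * fraction, dna_ug * fraction


# Values from the corn protocol: 50 uL of 60 mg/mL gold, 10 ug DNA,
# resuspended in 30 uL ethanol, 5 uL loaded per disc.
gold_mg, dna_ug = per_shot(50, 60, 10, 30, 5)
print(f"{gold_mg:.2f} mg gold and at most {dna_ug:.2f} ug DNA per shot")
```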
For bombardment, the embryogenic tissue is placed on filter paper over agarose-solidified N6 medium. The tissue is arranged as a thin lawn covering a circular area about 5 cm in diameter. The petri dish containing the tissue can be placed in the chamber of the PDS-1000/He approximately 8 cm from the stopping screen. The air in the chamber is then evacuated to a vacuum of 28 inches of Hg. The macrocarrier is accelerated with a helium shock wave using a rupture membrane that bursts when the helium pressure in the shock tube reaches 1000 psi.
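The bombardment settings are given in psi and inches of mercury. For readers working in SI units, the following sketch converts them to kilopascals; the conversion factors are the standard ones (1 psi = 6.894757 kPa, 1 inch Hg = 3.386389 kPa), and the function names are illustrative.

```python
KPA_PER_PSI = 6.894757   # standard conversion factor
KPA_PER_INHG = 3.386389  # conventional inch of mercury


def psi_to_kpa(psi: float) -> float:
    """Convert pounds per square inch to kilopascals."""
    return psi * KPA_PER_PSI


def inhg_to_kpa(inhg: float) -> float:
    """Convert inches of mercury to kilopascals."""
    return inhg * KPA_PER_INHG


print(f"Rupture pressure: 1000 psi = {psi_to_kpa(1000):.0f} kPa")
print(f"Chamber vacuum:   28 in Hg = {inhg_to_kpa(28):.1f} kPa")
```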
Seven days after bombardment the tissue can be transferred to N6 medium that contains glufosinate (2 mg per liter) and lacks casein or proline. The tissue continues to grow slowly on this medium. After an additional 2 weeks the tissue can be transferred to fresh N6 medium containing glufosinate. After 6 weeks, areas of actively growing callus about 1 cm in diameter can be identified on some of the plates containing the glufosinate-supplemented medium. These calli may continue to grow when subcultured on the selective medium.
Plants can be regenerated from the transgenic callus by first transferring clusters of tissue to N6 medium supplemented with 0.2 mg per liter of 2,4-D. After two weeks the tissue can be transferred to regeneration medium (Fromm et al. (1990) Bio/Technology 8: 833-839).
EXAMPLE 9 Expression of the Chimeric Genes in Dicotyledonous Cells
A seed-specific expression cassette composed of the promoter and transcription terminator from the gene encoding the β subunit of the seed storage protein phaseolin from the bean Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem. 261: 9228-9238) can be used for expression of the instant amino acid biosynthetic enzymes in transformed soybean. The phaseolin cassette includes about 500 nucleotides upstream (5') from the translation initiation codon and about 1650 nucleotides downstream (3') from the translation stop codon of phaseolin. Between the 5' and 3' regions are the unique restriction endonuclease sites NcoI (which includes the ATG translation initiation codon), SmaI, KpnI and XbaI. The entire cassette is flanked by HindIII sites.
The cDNA fragment of this gene can be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the expression vector. Amplification is then performed, and the isolated fragment is inserted into a pUC18 vector carrying the seed expression cassette.
Plant amino acid biosynthetic enzymes are known to be localized in chloroplasts. Accordingly, for those enzymes (or polypeptides representing part of the instant amino acid biosynthetic enzymes) that lack a chloroplast targeting signal, the DNA fragment to be inserted into the expression vector can be synthesized by PCR with primers encoding a chloroplast targeting signal. For example, a chloroplast transit sequence equivalent to the cts of the small subunit of ribulose 1,5-bisphosphate carboxylase from soybean (Berry-Lowe et al. (1982) J. Mol. Appl. Genet. 1: 483-498) can be used.
Soybean embryos can then be transformed with the expression vector comprising sequences encoding a plant amino acid biosynthetic enzyme. To induce somatic embryos, cotyledons 3-5 mm in length, dissected from surface-sterilized immature seeds of the soybean cultivar A2872, can be cultured in the light or dark at 26 °C on an appropriate agar medium for 6-10 weeks. Somatic embryos which produce secondary embryos are then excised and placed into a suitable liquid medium. After repeated selection for clusters of somatic embryos which multiply as early, globular-stage embryos, the suspensions are maintained as described below.
Soybean embryogenic suspension cultures can be maintained in 35 mL of liquid medium on a rotary shaker at 150 rpm and 26 °C, with fluorescent lights on a 16:8 hour day/night schedule. The cultures are subcultured every two weeks by inoculating approximately 35 mg of tissue into 35 mL of liquid medium.
Soybean embryogenic suspension cultures can then be transformed by the particle bombardment method (Klein et al. (1987) Nature (London) 327: 70; U.S. Patent No. 4,945,050). A Du Pont Biolistic™ PDS1000/HE instrument (helium retrofit) can be used for these transformations.
A selectable marker gene which can be used to facilitate soybean transformation is a chimeric gene composed of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313: 810-812), the hygromycin phosphotransferase gene from plasmid pJR225 (from E. coli; Gritz et al. (1983) Gene 25: 179-188) and the 3' region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. The seed expression cassette comprising the phaseolin 5' region, the fragment encoding the biosynthetic enzyme and the phaseolin 3' region can be isolated as a restriction fragment. This fragment can then be inserted into a unique restriction site of the vector carrying the marker gene.
To 50 μL of a 60 mg/mL gold particle suspension are added (in order): 5 μL of DNA (1 μg/μL), 20 μL of spermidine (0.1 M), and 50 μL of CaCl2 (2.5 M). The particle preparation is then agitated for 3 minutes, spun in a microfuge for 10 seconds and the supernatant removed. The DNA-coated particles are then washed once in 400 μL of 70% ethanol and resuspended in 40 μL of anhydrous ethanol. The DNA/particle suspension can be sonicated three times for one second each. Five μL of the DNA-coated gold particles are then loaded onto each macrocarrier disc.
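The final concentration of each reagent in the coating mixture follows from the stock concentrations and volumes listed above (C1 x V1 = C2 x V2). The sketch below works this out, assuming the volumes are simply additive; the helper name is hypothetical.

```python
def final_concentration(stock_conc: float, stock_ul: float, total_ul: float) -> float:
    """Dilution arithmetic: stock concentration scaled by its share of the total volume."""
    return stock_conc * stock_ul / total_ul


# Volumes (uL) added in order in the soybean protocol.
volumes_ul = {"gold": 50.0, "DNA": 5.0, "spermidine": 20.0, "CaCl2": 50.0}
total_ul = sum(volumes_ul.values())

gold_mg_per_ml = final_concentration(60.0, volumes_ul["gold"], total_ul)
dna_ug_per_ul = final_concentration(1.0, volumes_ul["DNA"], total_ul)
spermidine_mm = final_concentration(100.0, volumes_ul["spermidine"], total_ul)  # 0.1 M = 100 mM
cacl2_m = final_concentration(2.5, volumes_ul["CaCl2"], total_ul)

print(f"Total volume: {total_ul:.0f} uL")
print(f"Gold: {gold_mg_per_ml:.0f} mg/mL, DNA: {dna_ug_per_ul:.2f} ug/uL")
print(f"Spermidine: {spermidine_mm:.0f} mM, CaCl2: {cacl2_m:.1f} M")
```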
Approximately 300-400 mg of a two-week-old suspension culture is placed in an empty 60 x 15 mm petri dish and the residual liquid is removed from the tissue with a pipette. For each transformation experiment, approximately 5-10 plates of tissue are normally bombarded. The membrane rupture pressure is set at 1100 psi and the chamber is evacuated to a vacuum of 28 inches of mercury. The tissue is placed approximately 3.5 inches away from the retaining screen and bombarded three times. Following bombardment, the tissue can be divided in half, placed back into liquid and cultured as described above.
Five to seven days after bombardment, the liquid medium can be exchanged with fresh medium, and eleven to twelve days after bombardment with fresh medium containing 50 mg/mL hygromycin. This selective medium can be refreshed weekly. Seven to eight weeks after bombardment, green, transformed tissue can be observed growing from untransformed, necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Each new line can be treated as an independent transformation event. These suspensions can then be subcultured and maintained as clusters of immature embryos or regenerated into whole plants by maturation and germination of individual somatic embryos.
EXAMPLE 10 Analysis of the Amino Acid Content of the Transformed Plant Seeds
To analyze for expression of the chimeric genes in seeds and for the consequences of expression on the amino acid content of the seeds, a seed meal can be prepared by any of a number of suitable methods known to those skilled in the art. The seed meal can be partially or completely defatted, for example by hexane extraction, if desired. Protein extracts can be prepared from the meal and analyzed for enzyme activity. Alternatively, the presence of any of the expressed enzymes can be tested immunologically by methods well known to those skilled in the art. To measure the free amino acid composition of the seeds, free amino acids can be extracted from the meal and analyzed by methods known to those skilled in the art (Bieleski et al. (1966) Anal. Biochem. 17: 278-293). The amino acid composition can then be determined using any commercially available amino acid analyzer. To measure the total amino acid composition of the seeds, meal containing both protein-bound and free amino acids can be acid hydrolyzed to release the protein-bound amino acids, and the composition can then be determined using any commercially available amino acid analyzer. Seeds expressing the instant amino acid biosynthetic enzymes and having an altered lysine, threonine, methionine, cysteine and/or isoleucine content as compared with wild-type seeds can then be identified and propagated.
To measure the free amino acid composition of the seeds, free amino acids can be extracted from 8-10 milligrams of seed meal in 1.0 mL of methanol/chloroform/water mixed in a ratio of 12v/5v/3v (MCW) at room temperature. The mixture can be vortexed and then centrifuged in an Eppendorf microcentrifuge for about 3 minutes; approximately 0.8 mL of the supernatant is then decanted. To this supernatant, 0.2 mL of chloroform is added, followed by 0.3 mL of water. The mixture is then vortexed and centrifuged in an Eppendorf microcentrifuge for about 3 minutes. The upper aqueous phase, approximately 1.0 mL, can then be removed and dried down in a Savant Speed Vac concentrator. The samples are then hydrolyzed in 6N hydrochloric acid, 0.4% β-mercaptoethanol under nitrogen for 24 h at 110-120 °C. Ten percent of the sample can then be analyzed using a Beckman Model 6300 amino acid analyzer with post-column ninhydrin detection. Relative free amino acid levels in the seeds are then compared as ratios of lysine, threonine, methionine, cysteine and/or isoleucine to leucine, thus using leucine as an internal standard.
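The leucine-normalized comparison described above can be expressed compactly. The sketch below computes the ratio of each target amino acid to leucine for one sample and then the fold change between a transformed and a wild-type sample. The numeric readings are hypothetical placeholders, not data from this document, and the function names are illustrative.

```python
TARGETS = ("Lys", "Thr", "Met", "Cys", "Ile")


def ratios_to_leucine(levels: dict, targets=TARGETS) -> dict:
    """Express each target amino acid as a ratio to leucine (the internal standard)."""
    leu = levels["Leu"]
    if leu <= 0:
        raise ValueError("leucine level must be positive")
    return {aa: levels[aa] / leu for aa in targets if aa in levels}


def fold_change(transformed: dict, wild_type: dict) -> dict:
    """Ratio of leucine-normalized levels, transformed over wild type."""
    t = ratios_to_leucine(transformed)
    w = ratios_to_leucine(wild_type)
    return {aa: t[aa] / w[aa] for aa in t if aa in w and w[aa] > 0}


# Hypothetical analyzer readings (arbitrary units) for illustration only.
wild_type = {"Leu": 10.0, "Lys": 5.0, "Thr": 4.0}
transformed = {"Leu": 8.0, "Lys": 12.0, "Thr": 4.0}

for aa, change in fold_change(transformed, wild_type).items():
    print(f"{aa}/Leu changed {change:.2f}-fold relative to wild type")
```

Normalizing to leucine compensates for differences in the amount of meal extracted or loaded, on the assumption that leucine itself is unaffected by the transgene.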
EXAMPLE 11 Expression of Chimeric Genes in Microbial Cells
The cDNAs encoding the instant plant amino acid biosynthetic enzymes can be inserted into the T7 E. coli expression vector pET24d (Novagen). Plasmid DNA containing a cDNA can be appropriately digested to release a nucleic acid fragment encoding the enzyme. This fragment can then be purified on a 1% NuSieve GTG™ low-melting agarose gel (FMC). Buffer and agarose contain 10 μg/mL ethidium bromide for visualization of the DNA fragment. The fragment can then be purified from the agarose gel by digestion with GELase™ (Epicentre Technologies) according to the manufacturer's instructions, ethanol precipitated, dried and resuspended in 20 μL of water. Appropriate oligonucleotide adapters can be ligated to the fragment using T4 DNA ligase (New England Biolabs, Beverly, MA). The fragment containing the ligated adapters can be purified from the excess adapters using low-melting agarose as described above. The vector pET24d is digested, dephosphorylated with alkaline phosphatase (NEB) and deproteinized with phenol/chloroform as described above. The prepared vector pET24d and the fragment can then be ligated at 16 °C for 15 hours, followed by transformation into DH5 electrocompetent cells (GIBCO BRL). Transformants can be selected on agar plates containing 2xYT medium and 50 μg/mL kanamycin. Transformants containing the gene encoding the enzyme are then screened for the correct orientation with respect to the pET24d T7 promoter by restriction enzyme analysis.
Clones in the correct orientation with respect to the T7 promoter can be transformed into BL21(DE3) competent cells (Novagen) and selected on 2xYT agar plates containing 50 μg/mL kanamycin. A colony arising from this transformation can be grown overnight at 30 °C in 2xYT medium containing 50 μg/mL kanamycin. The culture is then diluted two-fold with fresh medium, allowed to re-grow for 1 h, and induced by adding isopropyl-thiogalactopyranoside to a final concentration of 1 mM. The cells are then harvested by centrifugation after 3 h and resuspended in 50 μL of 50 mM Tris-HCl at pH 8.0 containing 0.1 mM DTT and 0.2 mM phenyl methylsulfonyl fluoride. A small amount of 1 mm glass beads can be added and the mixture sonicated 3 times for about 5 seconds each time with a microprobe sonicator. The mixture is centrifuged and the protein concentration of the supernatant determined. One μg of protein from the soluble fraction of the culture can be separated by SDS-polyacrylamide gel electrophoresis. Gels can be observed for protein bands migrating at the expected molecular weight.
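The expected molecular weight of an expressed polypeptide can be estimated from its deduced amino acid sequence. The sketch below sums average residue masses and adds one water for the free termini. It takes three-letter residue names as used in the sequence listing, ignores any vector-encoded residues, fusion partners and post-translational modifications, and is only an approximation of where a band should migrate on SDS-PAGE.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "Gly": 57.0519, "Ala": 71.0788, "Ser": 87.0782, "Pro": 97.1167,
    "Val": 99.1326, "Thr": 101.1051, "Cys": 103.1388, "Leu": 113.1594,
    "Ile": 113.1594, "Asn": 114.1038, "Asp": 115.0886, "Gln": 128.1307,
    "Lys": 128.1741, "Glu": 129.1155, "Met": 131.1926, "His": 137.1411,
    "Phe": 147.1766, "Arg": 156.1875, "Tyr": 163.1760, "Trp": 186.2132,
}
WATER = 18.0153  # one water added for the free N- and C-termini


def expected_mass_kda(residues) -> float:
    """Approximate molecular weight (kDa) of an unmodified polypeptide."""
    total = sum(RESIDUE_MASS[r] for r in residues) + WATER
    return total / 1000.0


# First 16 residues of SEQ ID NO: 2 as a short worked input.
peptide = "Ala Gly Gln Ile Ser Gly Met Asp Glu Pro Leu Glu Ile Pro Val Leu".split()
print(f"{len(peptide)} residues, approx. {expected_mass_kda(peptide):.2f} kDa")
```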
EXAMPLE 12 Evaluation of Compounds for their Ability to Inhibit the Activity of a Plant Amino Acid Biosynthetic Enzyme
The plant amino acid biosynthetic enzymes described herein can be produced using any number of methods known to those skilled in the art. Such methods include, but are not limited to, expression in bacteria as described in Example 11, or expression in eukaryotic cell culture, in planta, and using viral expression systems in suitably infected organisms or cell lines. The instant enzymes may be expressed separately as mature proteins, or may be co-expressed in E. coli or another suitable expression background. In addition, whether expressed separately or in combination, the instant enzymes may be expressed either as mature forms of the proteins as observed in vivo or as fusion proteins covalently attached to a variety of enzymes, proteins or affinity tags. Common fusion protein partners include glutathione S-transferase ("GST"), thioredoxin ("Trx"), maltose binding protein, and C- and/or N-terminal hexahistidine polypeptide ("(His)6").
The fusion proteins may be engineered with a protease recognition site at the fusion point so that the fusion partners can be separated by protease digestion to yield intact mature enzymes. Examples of such proteases include thrombin, enterokinase and factor Xa. However, any protease can be used which specifically cleaves the peptide connecting the fusion partner and the biosynthetic enzyme.
Purification of the instant enzymes, if desired, may utilize any number of separation technologies familiar to those skilled in the art of protein purification. Examples of such methods include, but are not limited to, homogenization, filtration, centrifugation, heat denaturation, ammonium sulfate precipitation, salt precipitation, pH precipitation, ion exchange chromatography, hydrophobic interaction chromatography and affinity chromatography, wherein the affinity ligand represents a substrate, substrate analog or inhibitor. When the enzymes are expressed as fusion proteins, the purification protocol may include the use of an affinity resin which is specific for the fusion protein tag attached to the expressed enzyme, or an affinity resin containing ligands which are specific for the enzyme. For example, an enzyme may be expressed as a fusion protein coupled to the C-terminus of thioredoxin. In addition, a (His)6 peptide may be engineered into the N-terminus of the fused thioredoxin moiety to afford additional opportunities for affinity purification. Other suitable affinity resins could be synthesized by linking the appropriate ligands to any suitable resin such as Sepharose-4B. In an alternate embodiment, a thioredoxin fusion protein may be eluted using dithiothreitol; however, elution may also be accomplished using other reagents which interact to displace the thioredoxin from the resin. These reagents include β-mercaptoethanol or other reduced thiols.
The eluted fusion protein may be subjected to further purification by the traditional means stated above, if desired. Proteolytic cleavage of the thioredoxin fusion protein and the biosynthetic enzyme may be accomplished after the fusion protein is purified or while the protein is still bound to the ThioBond™ affinity resin or another resin.
Crude, partially purified or purified enzyme, either alone or as a fusion protein, may be utilized in assays for the evaluation of compounds for their ability to inhibit the enzymatic activity of the plant amino acid biosynthetic enzymes described herein. Assays may be conducted under well-known experimental conditions which permit optimal enzymatic activity. Examples of assays for many of these enzymes can be found in Methods in Enzymology Vol. V, (Colowick and Kaplan eds.) Academic Press, New York or Methods in Enzymology Vol. XVII, (Tabor and Tabor eds.) Academic Press, New York. Specific examples may be found in the following references, each of which is incorporated herein by reference: dihydrodipicolinate reductase may be assayed as described in Farkas et al. (1965) J. Biol. Chem. 240: 4717-4722, or Cremer et al. (1988) J. Gen. Microbiol. 134: 3221-3229; diaminopimelate epimerase may be assayed as described in Work (1962) Methods in Enzymology Vol. V, (Colowick and Kaplan eds.) 858-864, Academic Press, New York; threonine synthase may be assayed as described in Giovanelli et al. (1984) Plant Physiol. 76: 285-292 or Curien et al. (1996) FEBS Lett. 390: 85-90; threonine deaminase may be assayed as described in Tomova et al. (1968) Biochemistry (USSR) 33: 200-208 or Dougal (1970) Phytochemistry 5: 959-964; and S-adenosylmethionine synthetase may be assayed as described in Mudd (1960) Biochim. Biophys. Acta 38: 354-355 or Boerjan et al. (1994) Plant Cell 5: 1401-1414.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
  (i) APPLICANT:
    (A) ADDRESSEE: E. I. DU PONT DE NEMOURS AND COMPANY
    (B) STREET: 1007 MARKET STREET
    (C) CITY: WILMINGTON
    (D) STATE: DELAWARE
    (E) COUNTRY: USA
    (F) ZIP: 19898
    (G) TELEPHONE: 302-992-4926
    (H) TELEFAX: 302-773-0164
    (I) TELEX: 6717325
  (ii) TITLE OF THE INVENTION: PLANT AMINO ACID BIOSYNTHETIC ENZYMES
  (iii) NUMBER OF SEQUENCES: 43
  (iv) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: DISKETTE, 3.5 INCH
    (B) COMPUTER: IBM COMPATIBLE PC
    (C) OPERATING SYSTEM: MICROSOFT WINDOWS 95
    (D) SOFTWARE: MICROSOFT WORD VERSION 7.0A
  (v) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
  (vi) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: 60/048,771
    (B) FILING DATE: JUNE 6, 1997
  (vii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: MAJARIAN, WILLIAM R.
    (B) REGISTRATION NUMBER: 41,173
    (C) REFERENCE/DOCKET NUMBER: BB-1087
(2) INFORMATION FOR SEQ ID NO: 1: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 908 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: csiln.pk0042.a3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ACGCGGGACA GATAAGTGGC ATGGACGAGC CGCTGGAGAT CCCTGTGCTG AACGACCTCA 60
CCATGGTTCT GGGCTCCATA GCGCAGTCGA GAGCAACCGG CGTGGTGGTC GACTTCAGCG 120
AGCCTTCAGC TGTTTACGAC AATGTCAAGC AGGCAGCGGC GTTTGGTCTG AGCAGCGTCG 180
TCTACGTTCC GAAAATCGAG CTAGAGACAG TGACTGAACT GTCAGCGTTC TGCGAGAAGG 240
CAAGCGGCTG CTTGGTTGCG CCAACGCTGT CGATTGGGTC CGTGCTCCTT CAGCAAGCGG 300
CTATACAGGC CTCGTTCCAC TACAGCAACG TTGAGATTGT GGAATCGAGA CCAAACCCAT 360
CGGATCTTCC ATCGCAAGAT GCAATCCAGA TTGCAAACAA CATATCAGAC CTTGGTCAGA 420
TATACAACAG GGAAGATATG GATTCCAGCA GTCCAGCCAG AGGCCAGCTG CTCGGGGAAG 480
ACGGAGTGCG CGTGCACAGC ATGGTTCTCC CTGGTCTCGT CTCCAGCACG TCGATCAACT 540
TCTCTGGCCC AGGAGAGATG TACACCTTAC GGCATGACGT TGCGAATGTT CAGTGCCTGA 600
TGCCAGGACT GATCCTGGCG ATACGGAAGG TGGTGCGGTT CAAGAACTTG ATTTATGGGC 660
TAGAGAAGTT CTTGTAGTGA ACAACAAACA ACCAATGCAA AACATCGACA GGCAACAGGC 720
AAGGCAGATA TCATCTGACG TCGCAACAAC CAAAACGACA GAGATTTGGA AAATAAAGGC 780
TGCACAGAAG ACGTCTGGGG TTTTGTGTGC ACCAGGCTGC GCAGAGAACG TCTGTCATTT 840
TGTGTGCACC ACTACGGCAC TACCTGCTGA GCGCGATTTT TATAAAAAAG GCATGGGAGG 900
GAGATCAT 908
(2) INFORMATION FOR SEQ ID NO: 2: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 224 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: csiln.pk0042.a3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Ala Gly Gln Ile Ser Gly Met Asp Glu Pro Leu Glu Ile Pro Val Leu
1 5 10 15
Asn Asp Leu Thr Met Val Leu Gly Ser Ile Ala Gln Ser Arg Ala Thr
20 25 30
Gly Val Val Val Asp Phe Ser Glu Pro Ser Ala Val Tyr Asp Asn Val
35 40 45
Lys Gln Ala Ala Ala Phe Gly Leu Ser Ser Val Val Tyr Val Pro Lys
50 55 60
Ile Glu Leu Glu Thr Val Thr Glu Leu Ser Ala Phe Cys Glu Lys Ala
65 70 75 80
Ser Gly Cys Leu Val Ala Pro Thr Leu Ser Ile Gly Ser Val Leu Leu
85 90 95
Gln Gln Ala Ala Ile Gln Ala Ser Phe His Tyr Ser Asn Val Glu Ile
100 105 110
Val Glu Ser Arg Pro Asn Pro Ser Asp Leu Pro Ser Gln Asp Ala Ile
115 120 125
Gln Ile Ala Asn Asn Ile Ser Asp Leu Gly Gln Ile Tyr Asn Arg Glu
130 135 140
Asp Met Asp Ser Ser Ser Pro Ala Arg Gly Gln Leu Leu Gly Glu Asp
145 150 155 160
Gly Val Arg Val His Ser Met Val Leu Pro Gly Leu Val Ser Ser Thr
165 170 175
Ser Ile Asn Phe Ser Gly Pro Gly Glu Met Tyr Thr Leu Arg His Asp
180 185 190
Val Ala Asn Val Gln Cys Leu Met Pro Gly Leu Ile Leu Ala Ile Arg
195 200 205
Lys Val Val Arg Phe Lys Asn Leu Ile Tyr Gly Leu Glu Lys Phe Leu
210 215 220
(2) INFORMATION FOR SEQ ID NO: 3: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 339 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: rls2.pk0017.d3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GGAGAAATGC AGCAAAGGTC CTCTGCTCAA CGCAGATGCC GCCATCTCAG 60
AGCACAATCA AGGTTGTTAT CATTGGGGCG ACAAAAGAGA TTGGAAGAAC GGCAATAGCG 120
GCAGTAAGTA AAGCAAGGGG AATGGAGCTT GCAGGGGCCA TAGATTCTCA GTGTATAGGC 180
CTAGATGCAG GAGAGATAAG TGGCATGGGA AGAACCCTGG AAATTCCGGT GCTCAATGAT 240
CTCACAATGG TTCTGGGCTC AATTGCACAA ACCAGAGCAA CTGGAGTGGT GGTTGATTTT 300
AGTGAACCTT CAACTGTTTA TGATAATGTC AAACAGGCA 339
(2) INFORMATION FOR SEQ ID NO: 4: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 113 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: rls2.pk0017.d3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Lys Ile Gly Arg Arg Asn Ala Ala Lys Val Leu Cys Ser Thr Gln Met
1 5 10 15
Pro Pro Ser Gln Ser Thr Ile Lys Val Val Ile Ile Gly Ala Thr Lys
20 25 30
Glu Ile Gly Arg Thr Ala Ile Ala Ala Val Ser Lys Ala Arg Gly Met
35 40 45
Glu Leu Ala Gly Ala Ile Asp Ser Gln Cys Ile Gly Leu Asp Ala Gly
50 55 60
Glu Ile Ser Gly Met Gly Arg Thr Leu Glu Ile Pro Val Leu Asn Asp
65 70 75 80
Leu Thr Met Val Leu Gly Ser Ile Ala Gln Thr Arg Ala Thr Gly Val
85 90 95
Val Val Asp Phe Ser Glu Pro Ser Thr Val Tyr Asp Asn Val Lys Gln
100 105 110
Ala
(2) INFORMATION FOR SEQ ID NO: 5: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 275 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: (A) ORGANISM: Synechocystis sp. (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
Met Ala Asn Gln Asp Leu Ile Pro Val Val Val Asn Gly Ala Ala Gly
1 5 10 15
Lys Met Gly Arg Glu Val Ile Lys Ala Val Ala Gln Ala Pro Asp Leu
20 25 30
Gln Leu Val Gly Ala Val Asp His Asn Pro Ser Leu Gln Gly Gln Asp
35 40 45
Ile Gly Glu Val Val Gly Ile Ala Pro Leu Glu Val Pro Val Leu Ala
50 55 60
Asp Leu Gln Ser Val Leu Val Leu Ala Thr Gln Glu Lys Ile Gln Gly
65 70 75 80
Val Met Val Asp Phe Thr His Pro Ser Gly Val Tyr Asp Asn Val Arg
85 90 95
Ser Ala Ile Ala Tyr Gly Val Arg Pro Val Val Gly Thr Thr Gly Leu
100 105 110
Ser Glu Gln Gln Ile Gln Asp Leu Gly Asp Phe Ala Glu Lys Ala Ser
115 120 125
Thr Gly Cys Leu Ile Ala Pro Asn Phe Ala Ile Gly Val Leu Leu Met
130 135 140
Gln Gln Ala Ala Val Gln Ala Cys Gln Tyr Phe Asp His Val Glu Ile
145 150 155 160
Ile Glu Leu His His Asn Gln Lys Ala Asp Ala Pro Ser Gly Thr Ala
165 170 175
Ile Lys Thr Ala Gln Met Leu Ala Glu Met Gly Lys Thr Phe Asn Pro
180 185 190
Pro Ala Val Glu Glu Lys Glu Thr Ile Ala Gly Ala Lys Gly Gly Leu
195 200 205
Gly Pro Gly Gln Ile Pro Ile His Ser Ile Arg Leu Pro Gly Leu Ile
210 215 220
Ala His Gln Glu Val Leu Phe Gly Ser Pro Gly Gln Leu Tyr Thr Ile
225 230 235 240
Arg His Asp Thr Thr Asp Arg Ala Cys Tyr Met Pro Gly Val Leu Leu
245 250 255
Gly Ile Arg Lys Val Val Glu Leu Lys Gly Leu Val Tyr Gly Leu Glu
260 265 270
Lys Leu Leu
275
(2) INFORMATION FOR SEQ ID NO: 6: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1012 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: chp2.pk0008.h4 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
TATTGCCAGA GATGTGTGGT AATGGAGTCC GTTGCTTCGC TCGGTTTATA GCCGAGATTG 60
AAAATCTGCA GGGGACAAAT AGATTCACTA TTCATACTGG TGCTGGAAAG ATCGTTCCTG 120
AAATACAAAG TGATGGGCAG GTAAAGGTTG ATATGGGCGA GCCTATCCTT TCTGGACTAG 180
ACATCCCCAC AAAACTGCTA GCTACCAAGA ACAAAGCTGT TGTTCAAGCT GAATTGGCAG 240
TTGAGGGCTT AACATGGCAT GTCACATGTG TTAGCATGGG AAACCCTCAC TGTGTCACAT 300
TTGGTGCAAA TGAGTTAAAG GTATTGCAGG TCGACGATTT AAAACTTAGC GAAATTGGGC 360
CTAAATTTGA GCATCATGAA ATGTTTCCTG CTCGCACAAA CACAGAATTC GTACAGGTTT 420
TGTCTCGCTC ACACCTCAAA ATGCGGGTCT GGGAACGTGG TGCTGGAGCA ACTCTTGCCT 480
GTGGTACTGG TGCTTGTGCA GTGGTTGTTG CAGCTGTTCT TGAGGGTCGA GCTGAGCGGA 540
AATGTGTAGT TGATTTGCCT GGCGGGCCAT TGGAAATTGA GTGGAGGGAG GATGACAATC 600
ATGTTTACAT GACTGGTCCT GCAGAGGTCG TCTTTTATGG ATCTGTTGTT CACTAGGTAC 660
TGGGGACCAA GATAGAAGGG TTGGCTGCCA CTCAGAGCTT GTGAGATTGG TTATAGTATC 720
CATGAAACAG AGTGTTCTGG TACCAGTACA CTTGTXCAGA TATTCTTAAT TATGATTGCT 780
TGATTTGGGT AGCMGTAGAG GCTTCCTTTT GAAGCATTCT AGTGTTCMCC TTTTGTACTC 840
CTTTAGTTTG TCAGGTTTGA ACACTACATG GGTAACATGT CYTTCCCACC ATTTTCYGTT 900
TCTTTTCTTT GTAAGTGAAC GCCAATGCAG TTTTAGTATT GTTTTCTATA GATTTGTCTT 960
GATGCACTGG GCTTACTACT TATTTTCTGG TATGAATGCT GCCTATTTCC TG 1012
(2) INFORMATION FOR SEQ ID NO: 7: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 217 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: chp2.pk0008.h4 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
Leu Pro Glu Met Cys Gly Asn Gly Val Arg Cys Phe Ala Arg Phe Ile
1 5 10 15
Ala Glu Ile Glu Asn Leu Gln Gly Thr Asn Arg Phe Thr Ile His Thr
20 25 30
Gly Ala Gly Lys Ile Val Pro Glu Ile Gln Ser Asp Gly Gln Val Lys
35 40 45
Val Asp Met Gly Glu Pro Ile Leu Ser Gly Leu Asp Ile Pro Thr Lys
50 55 60
Leu Leu Ala Thr Lys Asn Lys Ala Val Val Gln Ala Glu Leu Ala Val
65 70 75 80
Glu Gly Leu Thr Trp His Val Thr Cys Val Ser Met Gly Asn Pro His
85 90 95
Cys Val Thr Phe Gly Ala Asn Glu Leu Lys Val Leu Gln Val Asp Asp
100 105 110
Leu Lys Leu Ser Glu Ile Gly Pro Lys Phe Glu His His Glu Met Phe
115 120 125
Pro Ala Arg Thr Asn Thr Glu Phe Val Gln Val Leu Ser Arg Ser His
130 135 140
Leu Lys Met Arg Val Trp Glu Arg Gly Ala Gly Ala Thr Leu Ala Cys
145 150 155 160
Gly Thr Gly Ala Cys Ala Val Val Val Ala Ala Val Leu Glu Gly Arg
165 170 175
Ala Glu Arg Lys Cys Val Val Asp Leu Pro Gly Gly Pro Leu Glu Ile
180 185 190
Glu Trp Arg Glu Asp Asp Asn His Val Tyr Met Thr Gly Pro Ala Glu
195 200 205
Val Val Phe Tyr Gly Ser Val Val His
210 215
(2) INFORMATION FOR SEQ ID NO: 8: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 481 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: rls48.pk0036.h10 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TGTATCCGGC GCCGACGGTG TGATCTTCGT CATGCCGGGG GTCAATGGCG CGGACTACAC 60
CATGAGGATC TTCAACTCGG ACGGCAGTGA GCCGGAGATG TGTGGCAATG GAGTCCGTTG 120
CTTTGCCCGG TTTATAGCTG AGCTTGAAAA CCTACAGGGA ACACATAGCT TCAAAATTCA 180
CACTGGCGCT GGGCTAATCA TTCCTGAAAT ACAAAATGAT GGCAAGGTAA AGGTTGATAT 240
GGGCCAGCCC ATTCTCTCTG GACCAGATAT TCCAACAAAA CTGCCATCCA CCAAGAATGA 300
AGCCGTTGTC CAAGCTGATT TGGGCAGTTG ATGGCTCAAC ATGGCAAGTA ACCTGTGTTA 360
GCATGGGCAA TCCACATTGT GTCACATTTG GCACAAAGGA GCTCAAGGTT TTGCATGTTG 420
ATGATTAAAG CTTAATGATA TTGGGGCCTA AATTCAGCAT CATGAAATGT TCCTGCCCCA 480
C 481
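The relationship between a cDNA entry and the peptide entry that follows it can be checked by translating the nucleotide rows. The sketch below is a minimal illustration, not part of the listing: it translates the first row of SEQ ID NO: 8 with the standard genetic code so that the result can be compared with the start of SEQ ID NO: 9. The function name `translate`, the constant names, and the reading-frame offset of one nucleotide are assumptions made for this example; the offset was chosen by inspection and is not stated in the listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna, offset=0):
    """Translate a DNA string (spaces allowed) into three-letter residues.

    Translation starts at `offset`, stops at the first stop codon, and
    reports any codon containing an ambiguity code as "Xaa".
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(offset, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        residues.append(THREE_LETTER.get(aa, "Xaa"))
    return residues


# First row (positions 1-60) of SEQ ID NO: 8, copied from the listing.
SEQ8_FIRST_ROW = "TGTATCCGGC GCCGACGGTG TGATCTTCGT CATGCCGGGG GTCAATGGCG CGGACTACAC"

# Offset 1 is an assumption: the second nucleotide starts the first codon.
print(" ".join(translate(SEQ8_FIRST_ROW, offset=1)))
# Val Ser Gly Ala Asp Gly Val Ile Phe Val Met Pro Gly Val Asn Gly Ala Asp Tyr
```

The same helper can be applied to the other cDNA entries once a reading frame is chosen; codons containing `N` or other ambiguity codes come out as `Xaa`, as in the peptide entries.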
(2) INFORMATION FOR SEQ ID NO: 9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 85 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: rls48.pk0036.h10 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
Val Ser Gly Ala Asp Gly Val Ile Phe Val Met Pro Gly Val Asn Gly 1 5 10 15
Ala Asp Tyr Thr Met Arg Ile Phe Asn Ser Asp Gly Ser Glu Pro Glu 20 25 30
Met Cys Gly Asn Gly Val Arg Cys Phe Ala Arg Phe Ile Ala Glu Leu 35 40 45
Glu Asn Leu Gln Gly Thr His Ser Phe Lys Ile His Thr Gly Ala Gly 50 55 60
Leu Ile Ile Pro Glu Ile Gln Asn Asp Gly Lys Val Lys Val Asp Met 65 70 75 80
Gly Gln Pro Ile Leu 85
(2) INFORMATION FOR SEQ ID NO: 10: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1301 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
ATCCCTTATT AAGCAGGGGT TTCGCGGCGC GAGACGGTGA CACTGGCAGA GTGGAATTTC 60
CGCCGCCATT CGAAGCTACA GCGATGGCCA TAACCGCCAC CATTTCCGTT CCCCTCACAT 120
CCCCCAGTCG CCGCACTCTC ACCTCCGTCA ATAGCCTCTC TCCCCTTTCT ACCCGATCCA 180
CTTTGCCCAC ACCGCAACGC ACTTTCAAAT ACCCTAATTC GCGCCTCGTC GTGTCTTCCA 240
TGAGCACCGA AACAGCCGTC AAAACTTCAT CCGCCTCCTT CCTCAACCGC AAGGAGTCCG 300
GCTTCCTCCA TTTCGCCAAG TACCACGGCC TCGGAAACGA CTTCGTTTTG ATTGACAATA 360
CTC CGAGCCCAAG ATCAGTGCTG AGAAAGCGGT GCAACTGTGT GATCGGAACT 420
TCGGCGTTGG AGCTGACGGA GTTATCTTTG TCTTGCCTGG CATCAGTGGC ACCGATTATA 480
CCATGAGGAT TTTTAACTCT GATGGTAGTG AGCCTGAGAT GTGTGGCAAT GGAGTTCGAT 540
GCTTTGCCAA ATTTGTTTCT CAGCTTGAGA ATTTACATGG GAGGCATAGT TTTACCATTC 600
ATACTGGTGC TGGTCTGATT ATTCCTGAAG TCTTGGAGGA TGGAAATGTC AGAGTTGATA 660
TGGGGGAGCC AGTTCTTAAA GCCTTGGATG TGCCTACTAA ATTACCTGCA AATAAGGATA 720
ATGCTGTTGT TAAATCACAG CTAGTTGTAG ATGGAGTTAT TTGGCATGTG ACCTGTGTTA 780
GCATGGGGAA TCCACACTGT GTAACTTTCA GTAGAGAAGG AAGCCAGAAT TTGCTTGTTG 840
ATGAA TGAA GCTAGCAGAA ATTGGGCCAA AATTTGAACA TCATGAGGTG TTCCCTGCAC 900
GAACTAACAC AGAGTTTGTG CAAGTATTAT CTAACTCTCA CTTGAAAATG CGTGTTTGGG 960
AGCGGGGAGC AGGAGCAACC CTAGCCTGTG GAACTGGAGC TTGTGCTACT GTTGTTGCAG 1020
CAGTTCTTGA GGGTCGTGCT GGGAGGAATT GCACGGTTGA TCTACCTGGA GGGCCTCTTC 1080
AGATTGAGTG GAGGGAGGAA GATAATCATG TTTATATGAC AGGCTCAGCC GATGTAGTTT 1140
ATTATGGTTC TTTGCCCCTT TGATATGTTG CCCCCATTGT TAAACCCAAT ATGGAATTAG 1200
GAATTGGTGA ATAATATTTG TATGAGAGGT GGACTTTCTG CTTGTTCCTA ATATTTTGCC 1260
ACGTCTTTAT AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA A 1301
(2) INFORMATION FOR SEQ ID NO: 11: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 359 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
Met Ala Ile Thr Ala Thr Ile Ser Val Pro Leu Thr Ser Pro Ser Arg 1 5 10 15
Arg Thr Leu Thr Ser Val Asn Ser Leu Ser Pro Leu Ser Thr Arg Ser 20 25 30
Thr Leu Pro Thr Pro Gln Arg Thr Phe Lys Tyr Pro Asn Ser Arg Leu 35 40 45
Val Val Ser Ser Met Ser Thr Glu Thr Ala Val Lys Thr Ser Ser Ala 50 55 60
Ser Phe Leu Asn Arg Lys Glu Ser Gly Phe Leu His Phe Ala Lys Tyr 65 70 75 80
His Gly Leu Gly Asn Asp Phe Val Leu Ile Asp Asn Arg Asp Ser Ser 85 90 95
Glu Pro Lys Ile Ser Ala Glu Lys Ala Val Gln Leu Cys Asp Arg Asn 100 105 110
Phe Gly Val Gly Ala Asp Gly Val Ile Phe Val Leu Pro Gly Ile Ser 115 120 125
Gly Thr Asp Tyr Thr Met Arg Ile Phe Asn Ser Asp Gly Ser Glu Pro 130 135 140
Glu Met Cys Gly Asn Gly Val Arg Cys Phe Ala Lys Phe Val Ser Gln 145 150 155 160
Leu Glu Asn Leu His Gly Arg His Ser Phe Thr Ile His Thr Gly Ala 165 170 175
Gly Leu Ile Ile Pro Glu Val Leu Glu Asp Gly Asn Val Arg Val Asp 180 185 190
Met Gly Glu Pro Val Leu Lys Ala Leu Asp Val Pro Thr Lys Leu Pro 195 200 205
Ala Asn Lys Asp Asn Ala Val Val Lys Ser Gln Leu Val Val Asp Gly 210 215 220
Val Ile Trp His Val Thr Cys Val Ser Met Gly Asn Pro His Cys Val 225 230 235 240
Thr Phe Ser Arg Glu Gly Ser Gln Asn Leu Leu Val Asp Glu Leu Lys 245 250 255
Leu Ala Glu Ile Gly Pro Lys Phe Glu His His Glu Val Phe Pro Ala 260 265 270
Arg Thr Asn Thr Glu Phe Val Gln Val Leu Ser Asn Ser His Leu Lys 275 280 285
Met Arg Val Trp Glu Arg Gly Ala Gly Ala Thr Leu Ala Cys Gly Thr 290 295 300
Gly Ala Cys Ala Thr Val Val Ala Ala Val Leu Glu Gly Arg Ala Gly 305 310 315 320
Arg Asn Cys Thr Val Asp Leu Pro Gly Gly Pro Leu Gln Ile Glu Trp 325 330 335
Arg Glu Glu Asp Asn His Val Tyr Met Thr Gly Ser Ala Asp Val Val 340 345 350
Tyr Tyr Gly Ser Leu Pro Leu 355
(2) INFORMATION FOR SEQ ID NO: 12: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 602 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: wlm24.pk0030.g4 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CTCCACCGCC CCCTCCTCGG GCGGTCGCCT CCTCCGTCCG TTCTGTGGGA ATCCGCGCCC 60
CCGCCGCGCC GTCGCCTCGA TGGCCGTGTC CGCTCCCAAG TCGCCAGCCG CCGCCTCGTT 120
CCTCGAGCGC CGCGAGTCCG AGCGCGCGCT CCACTTCGTG AAGTACCAGG GCCTCGGCAA 180
CGACTTCATA ATGGTCGACA ACAGGGATTC GGCCGTACCG AAGGTGACAC CGGAGGAGGC 240
GGCGAAGCTA TGCGACCGAA ACTTTGGGTA TTGGGTGCTG ATGGCGTCAT CTTCGTCCTG 300
CCGGGGGTCA ACGGCGCGGA CTACACTATG AGGATATTCA ACTCCGATGG CAGCAACCGG 360
AATGTNTGGN ATGGATTCGT TGCTTGCTCG CTTTATACGG AGTTGAAATC TACANGGAAA 420
CATACTTCAA AACAANAGGG GGCTGGATTA ATATCCTGAA ATANAHACAT GNAAGTTANG 480
TNATATGGGC AACAATCTTA TGGCANATTT CA AAAATGC ATCACAAGAT AACTTNTAAA 540
ACGATTGAAT TAGGCAANAG AANTACCGTT ATAGGAACCC ATGAAMCTTG TNAAATTAAG 600
GT 602
(2) INFORMATION FOR SEQ ID NO: 13: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 80 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: wlm24.pk0030.g4 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
Ala Leu His Phe Val Lys Tyr Gln Gly Leu Gly Asn Asp Phe Ile Met 1 5 10 15
Val Asp Asn Arg Asp Ser Ala Val Pro Lys Val Thr Pro Glu Glu Ala 20 25 30
Ala Lys Leu Cys Asp Arg Asn Phe Gly Xaa Gly Ala Asp Gly Val Ile 35 40 45
Phe Val Leu Pro Gly Val Asn Gly Ala Asp Tyr Thr Met Arg Ile Phe 50 55 60
Asn Ser Asp Gly Ser Asn Arg Asn Val Trp Xaa Gly Phe Val Ala Cys 65 70 75 80
(2) INFORMATION FOR SEQ ID NO: 14: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 279 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: (A) ORGANISM: Synechocystis sp. (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
Met Ala Leu Ser Phe Ser Lys Tyr His Gly Leu Gly Asn Asp Phe Ile 1 5 10 15
Leu Val Asp Asn Arg Gln Ser Thr Glu Pro Cys Leu Thr Pro Asp Gln 20 25 30
Ala Gln Gln Leu Cys Asp Arg His Phe Gly Ile Gly Ala Asp Gly Val 35 40 45
Ile Phe Ala Leu Pro Gly Gln Gly Gly Thr Asp Tyr Thr Met Arg Ile 50 55 60
Phe Asn Ser Asp Gly Ser Glu Pro Glu Met Cys Gly Asn Gly Ile Arg 65 70 75 80
Cys Leu Ala Lys Phe Leu Ala Asp Leu Glu Gly Val Glu Glu Lys Thr 85 90 95
Tyr Arg Ile His Thr Leu Ala Gly Val Ile Thr Pro Gln Leu Leu Ala 100 105 110
Asp Gly Gln Val Lys Val Asp Met Gly Glu Pro Gln Leu Leu Ala Glu 115 120 125
Leu Ile Pro Thr Thr Leu Ala Pro Ala Gly Glu Lys Val Val Asp Leu 130 135 140
Pro Leu Ala Val Ala Gly Gln Thr Trp Ala Val Thr Cys Val Ser Met 145 150 155 160
Gly Asn Pro His Cys Leu Thr Phe Val Asp Asp Val Asp Ser Leu Asn 165 170 175
Leu Thr Glu Ile Gly Pro Leu Phe Glu His His Pro Gln Phe Ser Gln 180 185 190
Arg Thr Asn Thr Glu Phe Ile Gln Val Leu Gly Ser Asp Arg Leu Lys 195 200 205
Met Arg Val Trp Glu Arg Gly Ala Gly Ile Thr Leu Ala Cys Gly Thr 210 215 220
Gly Ala Cys Ala Thr Val Val Ala Ala Val Leu Thr Gly Arg Gly Asp 225 230 235 240
Arg Arg Cys Thr Val Glu Leu Pro Gly Gly Asn Leu Glu Ile Glu Trp 245 250 255
Ser Ala Gln Asp Asn Arg Leu Tyr Met Thr Gly Pro Ala Gln Arg Val 260 265 270
Phe Ser Gly Gln Ala Glu Ile 275
(2) INFORMATION FOR SEQ ID NO: 15: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1160 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: cc2.pk0031.c9 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GTCGGCTGCG CGTCCACGGG AGACACCTCC GCCGCGCTCT CGGCCTACTG CGCAGCCGCG 60
GGAATCCCCG CCATCGTGTT CCTGCCAGCG GACCGCATCT CGCTGCAGCA GCTCATCCAG 120
CCGATCGCCA ACGGCGCCAC CGTGCTCTCT CTAGACACTG ATTTTGATGG CTGCATGCGG 180
CTCATTCGCG AGGTCACTGC AGAGCTGCCA ATCTACCTTG CCAATTCGCT CAACCCGCTC 240
CGCCTTGAGG GGCAGAAGAC AGCGGCCATC GAGATATTGC AGCAGTTCAA TTGGCAGGTG 300
CCAGATTGGG TCATTGTTCC AGGAGGCAAT CTTGGGAATA TCTATGCATT CTACAAGGGG 360
TTTGAGATGT GCCGCGTTCT TGGACTTGTT GATCGCGTGC CACGGCTTGT CTGCGCACAG 420
GCTGCAAATG CAAATCCATT GTACCGGTAC TACAAGTCAG GTTGGACTGA GTTTGAGCCA 480
CAAACTGCCG AGACTACATT TGCATCTGCG ATACAGATTG GTGATCCTGT ATCTGTTGAC 540
CGTGCGGTGG TCGCGCTGAA GGCCACTGAC GGTATTGTGG AGGAGGCTAC AGAGGAGGAG 600
CTAATGGATG CAACGGCGCT TGCTGACCGC ACTGGGATGT TTGCTTGCCC ACATACTGGG 660
GTTGCACTTG CTGCTTTGTT TAAGCTTCAG GGTCAGCGTA TAATTGGCCC TAATGACCGC 720
ACTGTGGTTG TTAGCACAGC TCATGGGCTG AAGTTCACGC AGTCAAAGAT TGACTACCAT 780
GACAAAAACA TCAAAGACAT GGTTTGCCAG TATGCTAATC CACCGATCAG TGTGAAGGCT 840
GACTTTGGTT CTGTGATGGA TGTTCTCCAG AAAAATCTCA ATGGTAAGAT ATAAAGTTAT 900
ATGATTAATT AACCCTCCAA ACTGTTTTTT TTTGTTTTTT CGTTCCAGGA ATTTTATTCC 960
TGAGTC TTC AACTTTGTTT GGTGAACATG GTATGGTGCT AAAATCTAGA CCTAATACCT 1020
TGTAGTACTA GTTCTGGAGG TCTTTTTGGT TGTAGGTCGA AGTGGATAGA GCTGTTCCTT 1080
GTACTTTATC TGTTTCATGT AATATGAATA ATAAATTATG GTCTAAATAT TTGAATAAAA 1140
AATCGTTTGG AATGACCCAC 1160
(2) INFORMATION FOR SEQ ID NO: 16: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 297 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: cc2.pk0031.c9 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
Val Gly Cys Ala Ser Thr Gly Asp Thr Ser Ala Ala Leu Ser Ala Tyr 1 5 10 15
Cys Ala Ala Ala Gly Ile Pro Ala Ile Val Phe Leu Pro Ala Asp Arg 20 25 30
Ile Ser Leu Gln Gln Leu Ile Gln Pro Ile Ala Asn Gly Ala Thr Val 35 40 45
Leu Ser Leu Asp Thr Asp Phe Asp Gly Cys Met Arg Leu Ile Arg Glu 50 55 60
Val Thr Ala Glu Leu Pro Ile Tyr Leu Ala Asn Ser Leu Asn Pro Leu 65 70 75 80
Arg Leu Glu Gly Gln Lys Thr Ala Ala Ile Glu Ile Leu Gln Gln Phe 85 90 95
Asn Trp Gln Val Pro Asp Trp Val Ile Val Pro Gly Gly Asn Leu Gly 100 105 110
Asn Ile Tyr Ala Phe Tyr Lys Gly Phe Glu Met Cys Arg Val Leu Gly 115 120 125
Leu Val Asp Arg Val Pro Arg Leu Val Cys Ala Gln Ala Ala Asn Ala 130 135 140
Asn Pro Leu Tyr Arg Tyr Tyr Lys Ser Gly Trp Thr Glu Phe Glu Pro 145 150 155 160
Gln Thr Ala Glu Thr Thr Phe Ala Ser Ala Ile Gln Ile Gly Asp Pro 165 170 175
Val Ser Val Asp Arg Ala Val Val Ala Leu Lys Ala Thr Asp Gly Ile 180 185 190
Val Glu Glu Ala Thr Glu Glu Glu Leu Met Asp Ala Thr Ala Leu Ala 195 200 205
Asp Arg Thr Gly Met Phe Ala Cys Pro His Thr Gly Val Ala Leu Ala 210 215 220
Ala Leu Phe Lys Leu Gln Gly Gln Arg Ile Ile Gly Pro Asn Asp Arg 225 230 235 240
Thr Val Val Val Ser Thr Ala His Gly Leu Lys Phe Thr Gln Ser Lys 245 250 255
Ile Asp Tyr His Asp Lys Asn Ile Lys Asp Met Val Cys Gln Tyr Ala 260 265 270
Asn Pro Pro Ile Ser Val Lys Ala Asp Phe Gly Ser Val Met Asp Val 275 280 285
Leu Gln Lys Asn Leu Asn Gly Lys Ile 290 295
(2) INFORMATION FOR SEQ ID NO: 17: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 325 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: csl.pk0058.g5 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
ATGGCTTGCA AGTACTCCAA CCCGCCTCTG AGCGTGAAGG CTGACTTTGG CGCCGTGATG 60
GATGTGCTGA AGAAGAGGCT CAAGGGCAAG CTCTGAGCGC CTGTGCCTGG CTAATGCAAT 120
CAACTGATTG GAATGCAGTG GTTTCGTCGG TATCGGGGGG TCTTTTAGGC TTCAGAAATT 180
CTGTCTGGGT TAGACTATTT GTTTGTGGAG TTTAGCAGGA GAATGGCTAT CTCTCCTGCA 240
AGACTGGCGC TCTTTCTTGT GCTACGAATG TGTTACCATG GATAATAAGT GTAGTCGCTG 300
TCGGATTGAA TAATCAAAAA AAAAR 325
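Each entry declares a LENGTH and ends every nucleotide row with a running position count, so the rows can be checked against the declared length. The sketch below is an illustrative consistency check only, not part of the listing; the function name `count_residues` and the variable `SEQ17_ROWS` are hypothetical. It counts the letters in the six rows of SEQ ID NO: 17 above, treating the trailing number on each row as a position counter and the final `R` as an IUPAC ambiguity code that occupies one position.

```python
import re


def count_residues(rows):
    """Count sequence letters in numbered rows, ignoring the position counters."""
    total = 0
    for row in rows:
        # Letter runs are sequence blocks; digits are running position counts.
        total += sum(len(block) for block in re.findall(r"[A-Za-z]+", row))
    return total


# Rows of SEQ ID NO: 17, copied from the listing above.
SEQ17_ROWS = [
    "ATGGCTTGCA AGTACTCCAA CCCGCCTCTG AGCGTGAAGG CTGACTTTGG CGCCGTGATG 60",
    "GATGTGCTGA AGAAGAGGCT CAAGGGCAAG CTCTGAGCGC CTGTGCCTGG CTAATGCAAT 120",
    "CAACTGATTG GAATGCAGTG GTTTCGTCGG TATCGGGGGG TCTTTTAGGC TTCAGAAATT 180",
    "CTGTCTGGGT TAGACTATTT GTTTGTGGAG TTTAGCAGGA GAATGGCTAT CTCTCCTGCA 240",
    "AGACTGGCGC TCTTTCTTGT GCTACGAATG TGTTACCATG GATAATAAGT GTAGTCGCTG 300",
    "TCGGATTGAA TAATCAAAAA AAAAR 325",
]

print(count_residues(SEQ17_ROWS))  # 325, matching the declared LENGTH
```

A row whose letter count disagrees with its position counter usually points to a dropped or duplicated character introduced during text extraction rather than to the sequence itself.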
(2) INFORMATION FOR SEQ ID NO: 18: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 31 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: csl.pk0058.g5 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
Met Ala Cys Lys Tyr Ser Asn Pro Pro Leu Ser Val Lys Ala Asp Phe 1 5 10 15
Gly Ala Val Met Asp Val Leu Lys Lys Arg Leu Lys Gly Lys Leu 20 25 30
(2) INFORMATION FOR SEQ ID NO: 19: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 528 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: rls72.pk0018.e7 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
ACACCCAACA CGCAGACTTG ACAGATTCTG CTACTACAAA TCCTGCATAT TTAACAGCGC 60
TGCAACTCGA CGATGGAGAA CGGTGCTGCA ACCAACGGGG CGTCGGAGAA GTCGCACTCT 120
CCTTCACAGA CCTACCTCTC CACAAGGGGA GACGATTATG GGCTCTCATT CGAGACCGTC 180
GTCCTCAAAG GTCTTGCGGC TGACGGGGGT CTTTTCCTGC CCGAGGAAGT GCCCGCGGCA 240
ACCGAGTGGC AAAGCTGGAA AGACCTGCCC TACACCGAGC TTGCCGTCAA GGTTCTCAGC 300
TTGTACATCT CCCCCGCCGA GGTGCCGACG GAAGACCTCA GGGCGCTCGT CGAGCGCAGC 360
TACTCGACCT TCCGATCCAA GGAGGTTGTG CCGCTGGTGA AGCTGGAGGA CAACCTTCAC 420
CTGCTGGAGC TATTCCACGG CCCCAACTAC TCGTTCAAGG ACTGCGCGCT GCAATTCCTT 480
GG AACCTCN TCGAGTACTT TTGACTCNCA AGAACAAGGG AAAGGAGG 528
(2) INFORMATION FOR SEQ ID NO: 20: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 143 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: rls72.pk0018.e7 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Met Glu Asn Gly Ala Ala Thr Asn Gly Ala Ser Glu Lys Ser His Ser 1 5 10 15
Pro Ser Gln Thr Tyr Leu Ser Thr Arg Gly Asp Asp Tyr Gly Leu Ser 20 25 30
Phe Glu Thr Val Val Leu Lys Gly Leu Ala Ala Asp Gly Gly Leu Phe 35 40 45
Leu Pro Glu Glu Val Pro Ala Ala Thr Glu Trp Gln Ser Trp Lys Asp 50 55 60
Leu Pro Tyr Thr Glu Leu Ala Val Lys Val Leu Ser Leu Tyr Ile Ser 65 70 75 80
Pro Ala Glu Val Pro Thr Glu Asp Leu Arg Ala Leu Val Glu Arg Ser 85 90 95
Tyr Ser Thr Phe Arg Ser Lys Glu Val Val Pro Leu Val Lys Leu Glu 100 105 110
Asp Asn Leu His Leu Leu Glu Leu Phe His Gly Pro Asn Tyr Ser Phe 115 120 125
Lys Asp Cys Ala Leu Gln Phe Leu Gly Asn Leu Xaa Glu Tyr Phe 130 135 140
(2) INFORMATION FOR SEQ ID NO: 21: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 571 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: sel.06a03 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GGATGCAATG GTGCAGGCTG ATTCCACTGG AATGTTCATA TGTCCACACA CTGGGGTGGC 60
TCTGGCGGCG CTTATTAAGC TGAGGAATCG TGGGGTTATC GGTGCCGGTG AGAGGGTTGT 120
GGTGGTGAGC ACTGCACATG GATTGAAGTT TGCACAGAGC AAGATTGATT ATCATTCTGG 180
GCTCATTCCT GGAATGGGCC GCTATGCTAA CCCGCTGGTT TCGGTTAAGG CGGATTTTGG 240
ATCGGTCATG GATGTTCTCA AGGATTCTTG CACAACAAGT CCCCCGACTT TAACAAGTCT 300
TGACGTTGCC AAGTAAGTTT TAGTTCGGGG TTTTTTCTGA TTAAAGATGT TTTTAAACAT 360
GTTTGTGTNC ACTTTCGGTC GTTATTATGG ATTTGTAAGA TTGGGCCCAA GTATTCGAGG 420
GTTTGATTTC AAACAACATG CTTCTGGTGA CGCAATGCAA ATTTCGGNGC ATAACATCAT 480
TGTCGAAGAT GGATCNCGAC CGATGAAACT GTGTGGCAAG TAATGAGAAG AAAATAGGGC 540
ACTTGTACAG AGATTTAAA GNTTAATTTC N 571
(2) INFORMATION FOR SEQ ID NO: 22: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 104 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: sel.06a03 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
Asp Ala Met Val Gln Ala Asp Ser Thr Gly Met Phe Ile Cys Pro His 1 5 10 15
Thr Gly Val Ala Leu Ala Ala Leu Ile Lys Leu Arg Asn Arg Gly Val 20 25 30
Ile Gly Ala Gly Glu Arg Val Val Val Val Ser Thr Ala His Gly Leu 35 40 45
Lys Phe Ala Gln Ser Lys Ile Asp Tyr His Ser Gly Leu Ile Pro Gly 50 55 60
Met Gly Arg Tyr Ala Asn Pro Leu Val Ser Val Lys Ala Asp Phe Gly 65 70 75 80
Ser Val Met Asp Val Leu Lys Asp Ser Cys Thr Thr Ser Pro Pro Thr 85 90 95
Leu Thr Ser Leu Asp Val Ala Lys 100
(2) INFORMATION FOR SEQ ID NO: 23: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2191 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: srl.pk0003.f6 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
GCTTCCTCTT CTCTGTTTCA GTCTCTCCCT TTCTCTCTCC AAACCTCTAA ACCCTACGCG 60
CCTCCCAAAC CCGCCGCCCA CTTCGTTGTC CGCGCCCAATENTOTCAC TCAGAACAAC 120
AACTCCTCCT CCAAGCATCG CCGCCCCGCC GACGAGAACA TCCGCGACGA GGCCCGCCGC 180
ATCAATGCGC CCCACGACCA CCACCTCTTC TCGGCCAAGT ACGTCCCCTT CAACGCCGAC 240
TCCTCCTCCT CCTCCTCCAC GGAGTCCTAC TCGCTCGACG AGATCGTCTA CCGCTCCCAA 300
TCCGGCGGCC TCCTGGACGT CCAGCACGAC ATGGATGCCC TCAAGCGTTT CGACGGCGAG 360
TACTGGCGCA ACCTCTTCGA CTCGCGCGTG GGCAAAACCA CCTGGCCTTA CGGCTCCGGC 420
GTCTGGAGCA AAAAAGAATG GGTCCTCCCC GAGATCCACG ACGACGATAT CGTCTCCGCC 480
TTCGAGGGTA ACTCCAACCT CTTCTGGGCC GAGCGTTTCG GCAAACAGTT CCTCGGCATG 540
AACGATTTGT GGGTCAAACA CTGCGGAATC AGCCACACGG GCAGCTTCAA GGATCTCGGC 600
ATGACCGTCC TCGTCAGCCA GGTCAATCGC TTGAGAAAAA TGAACCGCCC CGTCGTCGGT 660
GTTGGTTGCG CCTCCACCGG TGACACATCG GCCGCTTTAT CCGCCTATTG CGCTTCCGCT 720
GCCATTCCTT CCATTGTGTT TTTGCCTGCT AATAAAATCT CTCTTGCCCA ACTTGTTCAG 780
CCTATTGCCA ATGGAGCCTT TGTGTTGAGT ATCGACACTG ATTTTGATGG TTGCATGCAG 840
TTGATCAGAG AAGTCACTGC TGAATTGCCT ATTTATTTGG CTAACTCTCT CAACAGTTTG 900
AAGTTGGAAG GGCAGAAAAC TGCTGCTATT GAGATTCTGC AGCAGTTTGA TTGGCAGGTT 960
CCTGATTGGG TCATTGTGCC TGGAAGCAAC CTTGGCAACA TTTATGCCTT TTACAAAGGG 1020
TTTAAGATGT TTCAAGAGCT TGGGCTTGTG GATAAGATTC CAAGGCTTGT TTGTGCTCAG 1080
GCTGCCAATG CTGATCCTTT GTATTTGTAC TTTAAATCCG GGTGGAAGGA GTTTAAGCCT 1140
GTGAAGTCGA GCACTACATT TGCTTCTGCC ATTCAAATTG GTGATCCTGT TTCCATTGAC 1200
AGGGCGGTTC ACGCGCTAAA GAGTTGCGAT GGGATTGTGG AGGAGGCCAC GGAGGAGGAG 1260
TTGATGGATG CTACAGCGCA GGCGGATTCT ACTGGGATGT TTATTTGCCC CCACACCGGG 1320
GTTGCTTTAA CTGCATTGTT TAAGCTCAGG AACAGCGGGG TTATTAAGGC CACTGATAGG 1380
ACTGTGGTGG TTAGCACTGC TCATGGCTTG AAGTTCACTC AGTCCAAGAT TGATTACCAT 1440
TCTAAGGACA TCAAGGACAT GGCTTGCCGC TATGCTAACC CGCCCATGCA AGTGAAGGCA 1500
GACTTTGGCT CGGTTATGGA TGTTTTGAAG ACGTATTTGC AGAGTAAGGC TCATTAGGTT 1560
AGCATTGCAA GTTTTGCTCC TCCTGAGTTT GCTCATTATT TACTTACTTT TAGGCACTAC 1620
TGCTGTATTG TCTTTTCTAT GAGCTAGGTT TGAGTGTTGT AATAATTTGC TTGCTGCATT 1680
ATGTATGCCG TCTAGTGTTC CATATTGGGC ATCATCCTTA GTATTTGTTG TAGATTTTCT 1740
TTGCTGAGCA TTTGATATAA TAGCTCAAGT AGGAAAATGA ATTGGGTACT ATGAGGAATG 1800
CATATCATTG GCTTGTTATT ACTGGATTCC AGACCACCCC AAAAGAAAAT AATTCCAAAA 1860
AATATAATTA GAACAAATTT CGTCCTTGTT ATGCTGTTGG CATTAAGCTC AGTGTGGGTA 1920
TTACCAAGCA ACTCGAAATC AAGAGAAAAA AAAATTGACA GCAAAGGAGC TGCATTGTTG 1980
GACTGAGTCA CATCACTTCA TTGCTATGTC GTCATATTTC GTTGAATTAC GGGAAGGCAG 2040
CATGCACAGC AATATGCAGC GATTAACTGA AGCCACACCG CACACATTGA AGTAGTAGTC 2100
AATTTAGACA CTCCATCTTG TACTTTCTAC AAAAATGAAT TTTTCTTAGC CATTAAGTAT 2160
AATATTTTAT TCTAAAAAAA AAAAAAAAAA A 2191
(2) INFORMATION FOR SEQ ID NO: 24: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 518 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: srl.pk0003.f6 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
Ala Ser Ser Ser Leu Phe Gln Ser Leu Pro Phe Ser Leu Gln Thr Ser 1 5 10 15
Lys Pro Tyr Ala Pro Pro Lys Pro Ala Ala His Phe Val Val Arg Ala 20 25 30
Gln Ser Pro Leu Thr Gln Asn Asn Asn Ser Ser Ser Lys His Arg Arg 35 40 45
Pro Ala Asp Glu Asn Ile Arg Asp Glu Ala Arg Arg Ile Asn Ala Pro 50 55 60
His Asp His His Leu Phe Ser Ala Lys Tyr Val Pro Phe Asn Ala Asp 65 70 75 80
Ser Ser Ser Ser Ser Ser Thr Glu Ser Tyr Ser Leu Asp Glu Ile Val 85 90 95
Tyr Arg Ser Gln Ser Gly Gly Leu Leu Asp Val Gln His Asp Met Asp 100 105 110
Ala Leu Lys Arg Phe Asp Gly Glu Tyr Trp Arg Asn Leu Phe Asp Ser 115 120 125
Arg Val Gly Lys Thr Thr Trp Pro Tyr Gly Ser Gly Val Trp Ser Lys 130 135 140
Lys Glu Trp Val Leu Pro Glu Ile His Asp Asp Asp Ile Val Ser Ala 145 150 155 160
Phe Glu Gly Asn Ser Asn Leu Phe Trp Ala Glu Arg Phe Gly Lys Gln 165 170 175
Phe Leu Gly Met Asn Asp Leu Trp Val Lys His Cys Gly Ile Ser His 180 185 190
Thr Gly Ser Phe Lys Asp Leu Gly Met Thr Val Leu Val Ser Gln Val 195 200 205
Asn Arg Leu Arg Lys Met Asn Arg Pro Val Val Gly Val Gly Cys Ala 210 215 220
Ser Thr Gly Asp Thr Ser Ala Ala Leu Ser Ala Tyr Cys Ala Ser Ala 225 230 235 240
Ala Ile Pro Ser Ile Val Phe Leu Pro Ala Asn Lys Ile Ser Leu Ala 245 250 255
Gln Leu Val Gln Pro Ile Ala Asn Gly Ala Phe Val Leu Ser Ile Asp 260 265 270
Thr Asp Phe Asp Gly Cys Met Gln Leu Ile Arg Glu Val Thr Ala Glu 275 280 285
Leu Pro Ile Tyr Leu Ala Asn Ser Leu Asn Ser Leu Lys Leu Glu Gly 290 295 300
Gln Lys Thr Ala Ala Ile Glu Ile Leu Gln Gln Phe Asp Trp Gln Val 305 310 315 320
Pro Asp Trp Val Ile Val Pro Gly Ser Asn Leu Gly Asn Ile Tyr Ala 325 330 335
Phe Tyr Lys Gly Phe Lys Met Phe Gln Glu Leu Gly Leu Val Asp Lys 340 345 350
Ile Pro Arg Leu Val Cys Ala Gln Ala Ala Asn Ala Asp Pro Leu Tyr 355 360 365
Leu Tyr Phe Lys Ser Gly Trp Lys Glu Phe Lys Pro Val Lys Ser Ser 370 375 380
Thr Thr Phe Ala Ser Ala Ile Gln Ile Gly Asp Pro Val Ser Ile Asp 385 390 395 400
Arg Ala Val His Ala Leu Lys Ser Cys Asp Gly Ile Val Glu Glu Ala 405 410 415
Thr Glu Glu Glu Leu Met Asp Ala Thr Ala Gln Ala Asp Ser Thr Gly 420 425 430
Met Phe Ile Cys Pro His Thr Gly Val Ala Leu Thr Ala Leu Phe Lys 435 440 445
Leu Arg Asn Ser Gly Val Ile Lys Ala Thr Asp Arg Thr Val Val Val 450 455 460
Ser Thr Ala His Gly Leu Lys Phe Thr Gln Ser Lys Ile Asp Tyr His 465 470 475 480
Ser Lys Asp Ile Lys Asp Met Ala Cys Arg Tyr Ala Asn Pro Pro Met 485 490 495
Gln Val Lys Ala Asp Phe Gly Ser Val Met Asp Val Leu Lys Thr Tyr 500 505 510
Leu Gln Ser Lys Ala His 515
(2) INFORMATION FOR SEQ ID NO: 25: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 643 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: wrl.pk0085.h2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
GCTCATCCAG CCCATCGCCA ACGGCGCCAC GGTGCTCTCG CTTGACACGG ATTTCGACGG 60
ATGCATGCGG CTÍATCAGGG AGGTGACAGC TGAGCTGCCC ATATACCTCG CAAACTCACT 120
CAACTCGCTT CCGGCTGGAG GGGCAGAAGA CTGCAGCCAT CCGAGATATT GCAACANTCA 180
ATTGGCAGGT GCCCGGACTG GGTCACATCC CAAGGAGGCA ATCTGGGGGA ACATTTTATG 240
CTTTCCTACA AGGATTTNAA TTTCCGTGTC CTTNGCTAGT TGATTNCCTT CCNACTCCTT 300
GTTANTNCAA AGGCCGCCA ACGCAAACCC ACTGTACCCG TACTACAATC CTGGGGTGAC 360
TGATTTCCAT CCACTTGNTT GCCGGGACAA TTTNCATCCK GCAACAATTT GGGGATTCCA 420
TATCNATTAC CNTCGGTTTT TTCNCCCTNA AAGGACNNAT GATTNTCCNA GGAACTCCNN 480
AGGNGGATCA AGGATCCAAA GGCTTTCTAC TCACTGGAAN TTGCTTCCCA ANACGGGGTT 540
CACTNCCGCC CGTTAAACCC NTGACAAGTA TAATGGACAA CACNCCGGGG TNTATKACAA 600
CGGCAANTTN AAANCAAGTT NATCATTAGA ACNGGAANTT NCC 643
(2) INFORMATION FOR SEQ ID NO: 26: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 84 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: wrl.pk0085.h2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
Leu Ile Gln Pro Ile Ala Asn Gly Ala Thr Val Leu Ser Leu Asp Thr 1 5 10 15
Asp Phe Asp Gly Cys Met Arg Leu Ile Arg Glu Val Thr Ala Glu Leu 20 25 30
Pro Ile Tyr Leu Ala Asn Ser Leu Asn Ser Leu Xaa Leu Glu Gly Gln 35 40 45
Lys Thr Ala Ala Ile Arg Asp Ile Ala Thr Xaa Asn Trp Gln Val Pro 50 55 60
Gly Leu Gly His Ile Pro Arg Arg Gln Ser Xaa Thr Phe Tyr Ala Phe 65 70 75 80
Leu Gln Gly Phe
(2) INFORMATION FOR SEQ ID NO: 27: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 525 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: (A) ORGANISM: Arabidopsis thaliana (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
Leu Ser Ser Cys Leu Phe Asn Ala Ser Val Ser Ser Leu Asn Pro Lys 1 5 10 15
Gln Asp Pro Ile Arg Arg His Arg Ser Thr Ser Leu Leu Arg His Arg 20 25 30
Pro Val Val Ile Ser Cys Thr Ala Asp Gly Asn Asn Ile Lys Ala Pro 35 40 45
Ile Glu Thr Ala Val Lys Pro Pro His Arg Thr Glu Asp Asn Ile Arg 50 55 60
Asp Glu Ala Arg Arg Asn Arg Ser Asn Ala Val Asn Pro Phe Ser Ala 65 70 75 80
Lys Tyr Val Pro Phe Asn Ala Ala Pro Gly Ser Thr Glu Ser Tyr Ser 85 90 95
Leu Asp Glu Ile Val Tyr Arg Ser Arg Ser Gly Gly Leu Leu Asp Val 100 105 110
Glu His Asp Met Glu Ala Leu Lys Arg Phe Asp Gly Ala Tyr Trp Arg 115 120 125
Asp Leu Phe Asp Ser Arg Val Gly Lys Ser Thr Trp Pro Tyr Gly Ser 130 135 140
Gly Val Trp Ser Lys Lys Glu Trp Val Leu Pro Glu Ile Asp Asp Asp 145 150 155 160
Asp Ile Val Ser Ala Phe Glu Gly Asn Ser Asn Leu Phe Trp Ala Glu 165 170 175
Arg Phe Gly Lys Gln Phe Leu Gly Met Asn Asp Leu Trp Val Lys His 180 185 190
Cys Gly Ile Ser His Thr Gly Ser Phe Lys Asp Leu Gly Met Thr Val 195 200 205
Leu Val Ser Gln Val Asn Arg Leu Arg Lys Met Lys Arg Pro Val Val 210 215 220
Gly Val Gly Cys Ala Ser Thr Gly Asp Thr Ser Ala Ala Leu Ser Ala 225 230 235 240
Tyr Cys Ala Ser Ala Gly Ile Pro Ser Ile Val Phe Leu Pro Ala Asn 245 250 255
Lys Ile Ser Met Ala Gln Leu Val Gln Pro Ile Ala Asn Gly Ala Phe 260 265 270
Val Leu Ser Ile Asp Thr Asp Phe Asp Gly Cys Met Lys Leu Ile Arg 275 280 285
Glu Ile Thr Ala Glu Leu Pro Ile Tyr Leu Ala Asn Ser Leu Asn Ser 290 295 300
Leu Arg Leu Glu Gly Gln Lys Thr Ala Ala Ile Glu Ile Leu Gln Gln 305 310 315 320
Phe Asp Trp Gln Val Pro Asp Trp Val Ile Val Pro Gly Gly Asn Leu 325 330 335
Gly Asn Ile Tyr Ala Phe Tyr Lys Gly Phe Lys Met Cys Gln Glu Leu 340 345 350
Gly Leu Val Asp Arg Ile Pro Arg Met Val Cys Ala Gln Ala Ala Asn 355 360 365
Ala Asn Pro Leu Tyr Leu His Tyr Lys Ser Gly Trp Lys Asp Phe Lys 370 375 380
Pro Met Thr Ala Ser Thr Thr Phe Ala Ser Ala Ile Gln Ile Gly Asp 385 390 395 400
Pro Val Ser Ile Asp Arg Ala Val Tyr Ala Leu Lys Lys Cys Asn Gly 405 410 415
Ile Val Glu Glu Ala Thr Glu Glu Glu Leu Met Asp Ala Met Ala Gln 420 425 430
Ala Asp Ser Thr Gly Met Phe Ile Cys Pro His Thr Gly Val Ala Leu 435 440 445
Thr Ala Leu Phe Lys Leu Arg Asn Gln Gly Val Ile Ala Pro Thr Asp 450 455 460
Arg Thr Val Val Val Ser Thr Ala His Gly Leu Lys Phe Thr Gln Ser 465 470 475 480
Lys Ile Asp Tyr His Ser Asn Ala Ile Pro Asp Met Ala Cys Arg Phe 485 490 495
Ser Asn Pro Pro Val Asp Val Lys Ala Asp Phe Gly Ala Val Met Asp 500 505 510
Val Leu Lys Ser Tyr Leu Gly Ser Asn Thr Leu Thr Ser 515 520 525
(2) INFORMATION FOR SEQ ID NO: 28: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1478 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: cenl.pk0064.f4 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CAACAGTGGT CCTTGAGGGG GACTCATATG ATGAAGCTCA GTCATATGCA AAATTGCGTT 60
GCCAGCAGGA AGGCCGCACA TTTGTACCTC CTTTTGACCA TCCTGATGTC ATCACTGGAC 120
AAGGAACTAT CGGCATGGAA ATTGTTAGGC AGCTGCAAGG TCCACTGCAT GCAATATTTG 180
TACCTGTTGG AGGTGGTGGA TTAATTGCTG GAATTGCTGC CTATGTAAAA CGGGTTCGCC 240
CAGAGGTGAA AATAATTGGA GTGGAACCCT CAGATGCAAA TGCAATGGCA TTATCCTTGT 300
GTCATGGTAA GAGGGTCATG TTGGAGCATG TTGGTGGGTT TGCTGATGGT GTAGCTGTCA 360
AAGCTGTTGG GGAAGAAACA TTTCGCCTGT GCAGAGAGCT AGTAGATGGC ATTGTTATGG 420
TCAGTCGAGA TGCTATTTGT GCTTCAATAA AGGATATGTT TGAGGAGAAA AGAAGTATCC 480
TTGAACCTGC TGGTGCCCTT GCATTGGCTG GGGCTGAAGC CTACTGCAAA TACTATAACT 540
TGAAAGGAGA AACTGTGGTT GCAATAACTA GTGGGGCAAA TATGAACTTT GATCGACTTA 600
GACTAGTAAC CGAGCTAGCT GATGTTGGCC GAAAACGGGA AGCAGTGTTA GCTACATTTC 660
TGCCAGAGCG GCAGGGAAGC TTCAAAAAAT TCACAGAATT GGTTGGCAGG ATGAATATTA 720
CTGAATTCAA ATACAGATAC GATTCTAATG CAAAAGATGC CCTTGTTCTT TACAGTGTTG 780
GCATCTACAC TGACAATGAG CTTGGAGCAA TGATGGATCG CATGGAATCT GCGAAACTGA 840
GGACTGTTAA CCTTACTGAC AATGATTTGG CAAAGGACCA CCTTAGATAC TTTATTGGAG 900
GAAGATCAGA AATAAAAGAT GAACTGGTTT ACCGGTTCAT TTTCCCGGAA AGGCCTGGGG 960
CCCTTATGAA ATTTTTGGAC ACGTTTAGTC CTCGTTGGAA CATCAGCCTT TTCCATTACC 1020
GTGCACAGGG TGAAGCTGGA GCAAATGTAT TAGTTGGTAT ACAAGTGCCG CCAGCAGAAT 1080
TTGATGAATT CAAGAGTCAT GCCAACAATC TTGGGTACGA GTACATGTCA GAGCACAACA 1140
ATGAGATATA CCGGTTGCTG TTGCGTGACC CAAAGGTCTA ATGTATATGC CTTTGCTCCC 1200
ATAATAAGTT GGTGACACTT TTCAAGGAAG ATTTTGCTCC AAGGTAGAAG TTGCGAGTTT 1260
CTTCAAGTTG AAATGAAGCC ATCACCAAAT GTAGCTTCGG TGTGCCATCT GTTTACTCAG 1320
TTAGATCATG TAGTGTATCA GTTGTGTATC TTTGTTGTTG TGCTTCGTGA TCTCAATTTA 1380
TTGCTTTGTG CACCTAGAGG TTGTCAAATA ATGATAACCG ATATGTTATC TAAATATCTA 1440
ATAATGATTA TGTGATTGTG ATTAAAAAGG GGGGGCCC 1478
(2) INFORMATION FOR SEQ ID NO: 29: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 392 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: cenl.pk0064.f4 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
Thr Val Val Leu Glu Gly Asp Ser Tyr Asp Glu Ala Gln Ser Tyr Ala 1 5 10 15
Lys Leu Arg Cys Gln Gln Glu Gly Arg Thr Phe Val Pro Pro Phe Asp 20 25 30
His Pro Asp Val Ile Thr Gly Gln Gly Thr Ile Gly Met Glu Ile Val 35 40 45
Arg Gln Leu Gln Gly Pro Leu His Ala Ile Phe Val Pro Val Gly Gly 50 55 60
Gly Gly Leu Ile Ala Gly Ile Ala Ala Tyr Val Lys Arg Val Arg Pro 65 70 75 80
Glu Val Lys Ile Ile Gly Val Glu Pro Ser Asp Ala Asn Ala Met Ala 85 90 95
Leu Ser Leu Cys His Gly Lys Arg Val Met Leu Glu His Val Gly Gly 100 105 110
Phe Ala Asp Gly Val Ala Val Lys Ala Val Gly Glu Glu Thr Phe Arg 115 120 125
Leu Cys Arg Glu Leu Val Asp Gly Ile Val Met Val Ser Arg Asp Ala 130 135 140
Ile Cys Ala Ser Ile Lys Asp Met Phe Glu Glu Lys Arg Ser Ile Leu 145 150 155 160
Glu Pro Ala Gly Ala Leu Ala Leu Ala Gly Ala Glu Ala Tyr Cys Lys 165 170 175
Tyr Tyr Asn Leu Lys Gly Glu Thr Val Val Ala Ile Thr Ser Gly Ala 180 185 190
Asn Met Asn Phe Asp Arg Leu Arg Leu Val Thr Glu Leu Ala Asp Val 195 200 205
Gly Arg Lys Arg Glu Ala Val Leu Ala Thr Phe Leu Pro Glu Arg Gln 210 215 220
Gly Ser Phe Lys Lys Phe Thr Glu Leu Val Gly Arg Met Asn Ile Thr 225 230 235 240
Glu Phe Lys Tyr Arg Tyr Asp Ser Asn Ala Lys Asp Ala Leu Val Leu 245 250 255
Tyr Ser Val Gly Ile Tyr Thr Asp Asn Glu Leu Gly Ala Met Met Asp 260 265 270
Arg Met Glu Ser Ala Lys Leu Arg Thr Val Asn Leu Thr Asp Asn Asp 275 280 285
Leu Ala Lys Asp His Leu Arg Tyr Phe Ile Gly Gly Arg Ser Glu Ile 290 295 300
Lys Asp Glu Leu Val Tyr Arg Phe Ile Phe Pro Glu Arg Pro Gly Ala 305 310 315 320
Leu Met Lys Phe Leu Asp Thr Phe Ser Pro Arg Trp Asn Ile Ser Leu 325 330 335
Phe His Tyr Arg Ala Gln Gly Glu Ala Gly Ala Asn Val Leu Val Gly 340 345 350
Ile Gln Val Pro Pro Ala Glu Phe Asp Glu Phe Lys Ser His Ala Asn 355 360 365
Asn Leu Gly Tyr Glu Tyr Met Ser Glu His Asn Asn Glu Ile Tyr Arg 370 375 380
Leu Leu Leu Arg Asp Pro Lys Val 385 390
(2) INFORMATION FOR SEQ ID NO: 30: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 728 base pairs (B) TYPE: nucleic acid (C) CHAIN: simple (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (B) CLON: sfll.pk0055.h7
(xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 30:
AAAATATTGT AGCAATAACC AGTGGAGCAA ACATGAATTT TGATAAACTT CGGGTTGTAA 60 CTGAACTTGC TAATGTTGGT CGTAAACAAG AGGCTGTGCT GGCAACTGTT ATGGCAGAGG 120 AGCCTGGCAG TTTCAAACAA TTTTGTGAAT TGGTGGGGCA GATGAACATA ACAGAATTCA 180 AATACAGATA TAACTCAAAT GAGAAGGCAG TTGTCCTTTA CAGTGTTGGG GTTCACACAA 240 TCTCCGAACT AAGAGCAATG CAGGAGAGGA TGGAATCTTC TCAGCTCAAA ACTTACAATC 300 TCACAGAAAG TGACTTGGTG AAAGACCACT TGCGTTACTT GATGGGAGGC CGATCAAACG 360 TTCAGAATGA GGTCTTTGTC GTCTCACCTT TCCAAGAAAG ACTGGTGCTT TGATGAAATT 420 TTTGGACCCT TCAGTCCACG TTGGGATATT AGTTTATCCA TTACCGAGGG GAGGTGAAAC 480 TGGAGCAAAC TGCTAGTTGG NTACAGGTAC CAAAATGAGA TAGATGAGTC CATGATCGTG 540 CTAACAAACT GGATATGATT ATAAGTGGNA ATATGTGATG NCTCAGCTCA ATCNCGATGG 600 GGNTTAAGCA CTGCATATGG GNATTAGGGG NAGNTACANT TAAATTCACG GCCTCAAGNT 660 AAGCATANTN TAGGAACTAG CTTTACAGGG GGCTACNANT TAACCGNGTA TTTTTTTTGA GATGANNG 720 728
(2) INFORMATION FOR SEQ ID NO: 31: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 152 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: sfll.pk0055.h7 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 31:
Asn Ile Val Ala Ile Thr Ser Gly Ala Asn Met Asn Phe Asp Lys Leu 1 5 10 15 Arg Val Val Thr Glu Leu Ala Asn Val Gly Arg Lys Gln Glu Ala Val 20 25 30 Leu Ala Thr Val Met Ala Glu Glu Pro Gly Ser Phe Lys Gln Phe Cys 35 40 45 Glu Leu Val Gly Gln Met Asn Ile Thr Glu Phe Lys Tyr Arg Tyr Asn 50 55 60 Ser Asn Glu Lys Ala Val Val Leu Tyr Ser Val Gly Val His Thr Ile 65 70 75 80 Ser Glu Leu Arg Ala Met Gln Glu Arg Met Glu Ser Ser Gln Leu Lys 85 90 95 Thr Tyr Asn Leu Thr Glu Ser Asp Leu Val Lys Asp His Leu Arg Tyr 100 105 110 Leu Met Gly Gly Arg Ser Asn Val Gln Asn Glu Val Phe Val Val Ser 115 120 125 Pro Xaa Pro Arg Lys Thr Gly Ala Leu Met Lys Phe Leu Asp Xaa Phe 130 135 140 Ser Pro Arg Trp Asp Ile Ser Leu 145 150
(2) INFORMATION FOR SEQ ID NO: 32: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 572 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: sre.pk0044.f3 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 32:
AAAGACCTGG TGCTTTGATG AAATTTTTGG ACCCCTTCAG TCCACGTTGG AATATCAGTT 60
TATTCCATTA CCGAGGGGAG GGTGAAACTG GAGCAAATGT GCTAGTTGGA ATACAGGTAC 120
CCAAAAGTGA GATGGATGAG TTCCACGATC GTGCCAACAA ACTTGGATAT GATTATAAAG 180
TGGTGAATAA TGATGATGAC TTCCAGCTTC TAATGCACTG ATGATGGTTT TAGGCACTTG 240
CCATTATTGT GTATTTTAGT CAACAAGTTT GCCATATTTA ATATTTCCAC GGTCGTTTCT 300
AAAAGTTGGA TGGGGAAAAA AGGTGGAAAG GAAGTGGCCT TCAGACATGT CATTAGTTGA 360
TTAGAGGAAC AACTAGTTCT TTTTACCTAA TGCGGCGTCT TATTACATTT TTTATAATCT 420
GTAATTTATG TTTTTTTGTT GTTGTTAACA TTGGAATCTT ATAATGTTGT TGCCTGGTCT 480
TTTGTGTCTG TAATATAAGT GTCTTCAAAA GGTTGTTTGC TAAATTTCAG CAGCCTAAAA 540
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AA 572
(2) INFORMATION FOR SEQ ID NO: 33: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 72 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: sre.pk0044.f3 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 33:
Arg Pro Gly Ala Leu Met Lys Phe Leu Asp Pro Phe Ser Pro Arg Trp 1 5 10 15 Asn Ile Ser Leu Phe His Tyr Arg Gly Glu Glu Glu Thr Gly Ala Asn 20 25 30 Val Leu Val Gly Ile Gln Val Pro Lys Ser Glu Met Asp Glu Phe His 35 40 45 Asp Arg Ala Asn Lys Leu Gly Tyr Asp Tyr Lys Val Val Asn Asn Asp 50 55 60 Asp Asp Phe Gln Leu Leu Met His 65 70
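The relationship between a cDNA listing and its deduced peptide can be checked mechanically. The following is a minimal sketch, not part of the original filing: it assumes the standard genetic code, and the helper name `translate` and the reading-frame offset of 2 are illustrative choices made here. It translates the first 60-nucleotide row of SEQ ID NO: 32 as printed in this listing and prints one-letter codes that correspond to the first 19 residues of SEQ ID NO: 33 (Arg Pro Gly Ala Leu Met Lys Phe Leu Asp Pro Phe Ser Pro Arg Trp Asn Ile Ser).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (third base varies fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate a DNA string to one-letter amino acid codes.

    Whitespace is ignored. Codons containing anything other than
    A, C, G or T (for example the ambiguity code N) are reported as 'X'.
    A trailing partial codon is dropped.
    """
    dna = "".join(dna.split()).upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First row of SEQ ID NO: 32, copied from the listing above.
FIRST_ROW = "AAAGACCTGG TGCTTTGATG AAATTTTTGG ACCCCTTCAG TCCACGTTGG AATATCAGTT"

# The reading frame starts at the third nucleotide (offset 2, an assumption
# made here by matching against the listed peptide).
print(translate(FIRST_ROW, offset=2))  # RPGALMKFLDPFSPRWNIS
```

Trying offsets 0, 1 and 2 in turn and comparing each result with the listed peptide is a simple way to find the reading frame of a partial cDNA clone such as this one.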
(2) INFORMATION FOR SEQ ID NO: 34: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 507 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) ORIGINAL SOURCE: (A) ORGANISM: Burkholderia cepacia (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 34:
Met Ala His Asp Tyr Leu Lys Lys Ile Leu Thr Ala Arg Val Tyr 1 5 10 15 Asp Val Ala Phe Glu Thr Glu Leu Glu Pro Ala Arg Asn Leu Ala 20 25 30 Arg Leu Arg Asn Pro Val Tyr Leu Lys Arg Glu Asp Asn Gln Pro Val 35 40 45 Phe Ser Phe Lys Leu Arg Gly Ala Tyr Asn Lys Met Ala His Ile Pro 50 55 60 Ala Asp Ala Leu Ala Arg Gly Val Ile Thr Ala Ser Ala Gly Asn His 65 70 75 80 Ala Gln Gly Val Ala Phe Ala Ala Arg Met Gly Val Lys Ala Val 85 90 95 Ile Val Val Val Pro Val Thr Pro Gln Val Val Val Val Val W Val Val 100 Valve No Ala Ala Gly Gly Pro Gly Val Glu Val Ile Gln Ala Gly Val Ser Tyr 115 120 125 Ser Asp Ala Tyr Ala His Ala Leu Lys Val Gln Glu Glu Arg Gly Leu 130 135 140 Thr Phe Val His Pro Phe Asp Asp Pro Tyr Val Ile Ala Gly Gln Gly 145 150 155 160 Thr Ile Ala Met Glu Ile Leu Arg Gln His Gln Gly Pro Ile His Ala 165 170 175 Ile Phe Val Pro Ile Gly Gly Gly Gly Leu Ala Ala Gly Val Ala Ala 180 185 190 Tyr Val Lys Ala Val Arg Pro Glu Ile Lys Val Ile Gly Val Gln Ala 195 200 205 Glu Asp Ser Cys Ala Met Ala Gln Ser Leu Gln Ala Gly Lys Arg Val 210 215 220 Glu Leu Ala Glu Val Gly Leu Phe Ala Asp Gly Thr Ala Val Lys Leu 225 230 235 240
Val Gly Glu Glu Thr Phe Arg Leu Cys Lys Glu Tyr Leu Asp Gly Val 245 250 255
Val Thr Val Asp Thr Asp Ala Leu Cys Ala Ala Ile Lys Asp Val Phe 260 265 270 Gln Asp Thr Arg Ser Val Leu Glu Pro Ser Gly Ala Leu Ala Val Ala 275 280 285 Gly Ala Lys Leu Tyr Ala Glu Arg Glu Gly Ile Glu Asn Gln Thr Leu 290 295 300 Val Ala Val Thr Ser Gly Ala Asn Met Asn Phe Asp Arg Met Arg Phe 305 310 315 320
Val Ala Glu Arg Ala Glu Val Gly Glu Ala Arg Glu Ala Val Phe Ala 325 330 335
Val Thr Ile Pro Glu Glu Arg Gly Ser Phe Lys Arg Phe Cys Ser Leu 340 345 350 Val Gly Asp Arg Asn Val Thr Glu Phe Asn Tyr Arg Ile Ala Asp Ala 355 360 365 Gln Ser Ala His Ile Phe Val Gly Val Gln Ile Arg Arg Arg Gly Glu 370 375 380 Ser Ala Asp Ile Ala Ala Asn Phe Glu Ser His Gly Phe Lys Thr Ala 385 390 395 400
Asp Leu Thr His Asp Glu Leu Ser Lys Glu His Ile Arg Tyr Met Val 405 410 415
Gly Gly Arg Ser Pro Leu Ala Leu Asp Glu Arg Leu Phe Arg Phe Glu 420 425 430 Phe Pro Glu Arg Pro Gly Ala Leu Met Lys Phe Leu Ser Ser Met Ala 435 440 445 Pro Asp Trp Asn Ile Ser Leu Phe His Tyr Arg Asn Gln Gly Ala Asp 450 455 460 Tyr Ser Ser Ile Leu Val Gly Leu Gln Val Pro Gln Ala Asp His Ala 465 470 475 480
Glu Phe Glu Arg Phe Leu Ala Ala Leu Gly Tyr Pro Tyr Val Glu Glu 485 490 495
Ser Ala Asn Pro Ala -r Arg Leu Phe Leu Ser 500 505
(2) INFORMATION FOR SEQ ID NO: 35: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1582 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: cc3.mn0002.d2 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 35:
ACGAGACGAG TCCCCTCCCC CCACCTCGCC TCACCCAACC GGAACGAACA AGTTACCATC 60
TCATCCCAAC CCCGCCTCGA CCGGATCTCG TCGGACTCGG ATCCGCCCGA CCACCCCGCG 120
CCGCCGCAGA TCAAAGAAGA TGGCAGCTCT CGACACCTTC CTCTTCACCT CGGAGTCTGT 180
GAACGAGGGA CACCCTGACA AGCTCTGCGA CCAGGTCTCA GATGCCGTTC TTGACGCTTG 240
CCTTGCTGAG GACCCTGACA GCAAGGTTGC TTGTGAGACC TGCACCAAGA CCAACATGGT 300
CATGGTCTTT GGTGAGATCA CCACCAAGGC CAATGTCGAC TACGAGAAGA TTGTCAGGGA 360
GACCTGCCGC AACATTGGTT TTGTGTCAAA CGATGTCGGG CTTGACGCTG ACCACTGCAA 420
GGTGCTCGTG AACATTGAGC AGCAGTCCCC TGATATTGCT CAGGGTGTGC ATGGCCACTT 480
CACCAAGCGC CCCGAGGAGA TTGGAGCTGG TGACCAGGGA CACATGTTCG GGTATGCGAC 540 CGATGAGACC CCTGAGTTGA TGCCCCTCAG CCATGTCCTT GCCACCAAGC TAGGTGCTCG 600
TCTCACCGAG GTCCGCAAGA ACGGAACCTG CCCCTGGCTC AGGCCTGATG GGAAGACCCA 660
GGTGACAGTC GAGTACCGCA ATGAGGGTGG TGCCATGGTC CCCATCCGTG TCCACACCGT 720
CCTCATCTCC ACCCAGCACG ACGAGACAGT GACCAATGAT GAGATCGCTG CTGACCTGAA 780
GGAGCATGTC ATCAAGCCTA TCATCCCTGA GCAGTACCTT GACGAGAAGA CCATCTTCCA 840
CCTTAACCCA TCCGGCCGCT TTGTCATTGG TGGACCTCAC GGCGATGCTG GCCTCACTGG 900
CCGCAAGATC ATCATTGACA CCTACGGTGG CTGGGGAGCC CATGGCGGTG GCGCTTTCTC 960
CGGCAAGGAC CCAACCAAGG TTGACCGCAG CGGAGCCTAT GTCGCGAGGC AGGCTGCCAA 1020
GAGCATCGTC GCCAGCGGCC TTGCTCGCCG CGCCATCGTC CAGGTGTCCT ACGCCATCGG 1080
CGTGCCCGAG CCTCTCTCCG TGTTTGTCGA CACGTACGGC ACCGGCGCGA TCCCCGACAA 1140
GGAGATCCTC AAGATTGTCA AGGAGAACTT CGATTTCAGG CCTGGCATGA TTATCATCAA 1200
CCTTGACCTC AAGAAAGGCG GCAACGGGCG CTACCTCAAG ACGGCAGCCT ACGGCCACTT 1260 CGGAAGGGAC GACCCTGACT TCACCTGGGA GGTGGTGAAG CCACTCAAGT CGGAGAAACC 1320
TTCTGCCTAA GGCGGCCTTT TTTTCAGTAA GAAGCTTTTG GTGGTCTGCT GTGCTTAATC 1380
ATGCTTTTAT ATGGCTTCTA CATGTTGTGG TTCTTTCTTG ATCTGCACCG CGCTTATCGT 1440
TTGTGTTGTA CTGCCCTAAT AAGTGGTGCT TATGAGGACT GTTTCTGGTT TTGCTGCTTA 1500
TGTTGTAATG CTTTGAAACA ATGAAAGAAG CTACAGGCCA CAGCTATTTT GAGAAGTAAT 1560
GGAACCTCGT GCCGTTTTGA TT 1582
(2) INFORMATION FOR SEQ ID NO: 36: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 396 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: cc3.mn0002.d2 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 36:
Met Ala Ala Leu Asp Thr Phe Leu Phe Thr Ser Glu Ser Val Asn Glu 1 5 10 15 Gly His Pro Asp Lys Leu Cys Asp Gln Val Ser Asp Ala Val Leu Asp 20 25 30 Ala Cys Leu Ala Glu Asp Pro Asp Ser Lys Val Ala Cys Glu Thr Cys 35 40 45 Thr Lys Thr Asn Met Val Met Val Phe Gly Glu Ile Thr Thr Lys Ala 50 55 60 Asn Val Asp Tyr Glu Lys Ile Val Arg Glu Thr Cys Arg Asn Ile Gly 65 70 75 80 Phe Val Ser Asn Asp Val Gly Leu Asp Ala Asp His Cys Lys Val Leu 85 90 95 Val Asn Ile Glu Gln Gln Ser Pro Asp Ile Ala Gln Gly Val His Gly 100 105 110 His Phe Thr Lys Arg Pro Glu Glu Ile Gly Ala Gly Asp Gln Gly His 115 120 125 Met Phe Gly Tyr Ala Thr Asp Glu Thr Pro Glu Leu Met Pro Leu Ser 130 135 140 His Val Leu Ala Thr Lys Leu Gly Ala Arg Leu Thr Glu Val Arg Lys 145 150 155 160
Asn Gly Thr Cys Pro Trp Leu Arg Pro Asp Gly Lys Thr Gln Val Thr 165 170 175
Val Glu Tyr Arg Asn Glu Gly Gly Ala Met Val Pro Ile Arg Val His 180 185 190 Thr Val Leu Ile Ser Thr Gln His Asp Glu Thr Val Thr Asn Asp Glu 195 200 205 Ile Ala Ala Asp Leu Lys Glu His Val Ile Lys Pro Ile Ile Pro Glu 210 215 220 Gln Tyr Leu Asp Glu Lys Thr Ile Phe His Leu Asn Pro Ser Gly Arg 225 230 235 240
Phe Val Ile Gly Gly Pro His Gly Asp Ala Gly Leu Thr Gly Arg Lys 245 250 255
Ile Ile Ile Asp Thr Tyr Gly Gly Trp Gly Ala His Gly Gly Gly Ala 260 265 270 Phe Ser Gly Lys Asp Pro Thr Lys Val Asp Arg Ser Gly Ala Tyr Val 275 280 285 Ala Arg Gln Ala Ala Lys Ser Ile Val Ala Ser Gly Leu Ala Arg Arg 290 295 300 Ala Ile Val Gln Val Ser Tyr Ala Ile Gly Val Pro Glu Pro Leu Ser 305 310 315 320
Val Phe Val Asp Thr Tyr Gly Thr Gly Ala He Pro Asp Lys Glu He 325 330 335
Leu Lys Ile Val Lys Glu Asn Phe Asp Phe Arg Pro Gly Met Ile Ile 340 345 350 Ile Asn Leu Asp Leu Lys Lys Gly Gly Asn Gly Arg Tyr Leu Lys Thr 355 360 365 Ala Ala Tyr Gly His Phe Gly Arg Asp Asp Pro Asp Phe Thr Trp Glu 370 375 380 Val Val Lys Pro Leu Lys Ser Glu Lys Pro Ser Ala 385 390 395
(2) INFORMATION FOR SEQ ID NO: 37: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 2183 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) ORIGINAL SOURCE: (A) ORGANISM: Oryza sativa (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 37:
AAATGAACGG AAAATGGAAA AAAAAATTGA TTGGTGCCAC TTCAAAGTTA 60
AATATGCCAA GACGAATTGA TATGTTTCTG CTGTTGTTTT ATGCTCTTGA TTAGTTGATG 120
CGCATGTTCA ATGATTTATG ATGTTTGTCT TTGTGGAAAG ATTACATGTA AAGAGTATAG 180
TAGAACCCCT AAAAGCTAGC CAGCGATTTC GCTCTTTTTT TCCAGGTCTC CATGATATGT 240
TTACCCCTAA AAGTGGTATA TTTATGTGAT AGTTACAATA CATAGTGGAC CACGATTGAT 300
TATGCGTTTA TGCTGATTCC GGCAGAAAAT TGTTAGATTC CTTGTGCTCT ATACCTGCTT 360
GTTGCGCTTG TAGAGAATAT TACAAATACC TAACACTTGC CCAAGGAACT TAGGAACTTA 420
GTCAACTCTT TGTAGGGACA ACTATTTTAG CCCAAAATTG TGGTCTTGTC AGGTGCCAAC 480
AAAACAGCAT CTTGGCGTAC ATAAGCTATA TAGAGGATTA AAAGGAATGT TTTGTTCCTT 540
GCTACTGTTT TTTTAACCTG TTTACTCAGG ACAAATTTTG TTGCATAAAC CATTTGTTCT 600
AGGGATCAGT ATTGTCCTCT CAGTGTGTTA TGTAAGCATT TCCAGAAATC AATTGTCGCT 660
ATCAGCTTCC CTCACATTAG CTATCACTTA TACCCCTTTT TTTCTCATAG GCTCACCATG 720
TCCATTTTAT TCATGATATT TCTTTGTCTA AAGTATGTGA AATACCATTT TATGCAGATA 780
GGAGAAGATG GCCGCACTTG ATACCTTCCT CTTTACCTCG GAGTCTGTGA ACGAGGGCCA 840
CCCTGACAAG CTCTGCGACC AAGTCTCAGA TGCTGTGCTT GATGCCTGCC TCGCCGAGGA 900
CCCTGACAGC AAGGTCGCTT GTGAGACCTG CACCAAGACA AACATGGTCA TGGTCTTTGG 960
TGAGATCACC ACCAAGGCTA ACGTTGACTA TGAGAAGATT GTCAGGGAGA CATGCCGTAA 1020
CATCGGTTTT GTGTCAGCTG ATGTCGGTCT CGATGCTGAC CACTGCAAGG TGCTTGTGAA 1080
CATCGAGCAG CAGTCCCCTG ACATTGCACA GGGTGTGCAC GGGCACTTCA CCAAGCGCCC 1140
TGAGGAGATT GGTGCTGGTG ACCAGGGACA CATGTTTGGA TATGCAACTG ATGAGACCCC 1200
TGAGTTGATG CCCCTCAGCC ATGTCCTTGC TACCAAGCTT GGCGCTCGTC TTACGGAGGT 1260
TCGCAAGAAT GGGACCTGCG CATGGCTCAG GCCTGACGGG AAGACCCAAG TGACTGTTGA 1320
GTACCGCAAT GAGAGCGGTG CCAGGGTCCC TGTCCGTGTC CACACCGTCC TCATCTCTAC 1380
CCAGCATGAT GAGACAGTCA CCAACGATGA GATTGCTGCT GACCTGAAGG AGCATGTCAT 1440
CAAGCCTGTC ATTCCCGAGC AGTACCTTGA TGAGAAGACA ATCTTCCATC TTAACCCATC 1500
TGGTCGCTTC GTCATTGGCG GACCTCATGG TGATGCTGGT CTCACTGGCC GGAAGATCAT 1560
CATTGACACT TATGGTGGCT GGGGAGCTCA CGGTGGTGGT GCCTTCTCTG GCAAGGACCC 1620
AACCAAGGTT GACCGCAGTG GAGCATACGT CGCAAGGCAA GCTGCCAAGA GCATTGTTGC 1680
TAGTGGCCTT GCTCGCCGCT GCATTGTCCA AGTATCATAC GCCATCGGTG TCCCAGAGCC 1740
ACTGTCCGTA TTCGTCGACA CATACGGCAC TGGCAGGATC CCTGACAAGG AGATCCTCAA 1800
GATTGTGAAG GAGAACTTCG ACTTCAGGCC TGGCATGATC ATCATCAACC TTGACCTCAA 1860
GAAAGGCGGC AACGGACGCT ACCTCAAGAC GGCGGCTTAC GGTCACTTCG GAAGGGACGA 1920
CCCAGACTTC ACCTGGGAGG TGGTGAAGCC CCTCAAGTGG GAGAAGCCTT CTGCCTAAAA 1980
GCTCCCTTTC GGAGGCTTTT GCTCTGTCCC ATTATGGTGT TTTGTTTCCT CGCTGCTCAG 2040
CATTGTGATT CTTAACCTGC CCCCCGCTGC CATTTATGCC CATGCACGCT ACTTTCCTAA 2100
TAATAAGTAC TTATAAGGGT ATTGTGTTTG AATATTTTAC CTAGAGGAGG AGGAGGATTT 2160
GTTATCTGTT ATTGCTTAAG CTT 2183
(2) INFORMATION FOR SEQ ID NO: 38: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1484 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) IMMEDIATE SOURCE: (B) CLONE: s2.12b06 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 38:
AGCGAAGCCC CACTCAACCA CCACACCACT CTCTCTGCTC TTCTTCTACC TTTCAAGTTT 60
TTAAAGTATT AAGATGGCAG AGACATTCCT ATTTACCTCA GAGTCAGTGA ACGAGGGACA 120
CCCTGACAAG CTCTGCGACC AAATCTCCGA TGCTGTCCTC GACGCTTGCC TTGAACAGGA 180
CCCAGACAGC AAGGTTGCCT GCGAAACATG CACCAAGACC AACTTGGTCA TGGTCTTCGG 240
AGAGATCACC ACCAAGGCCA ACGTTGACTA CGAGAAGATC GTGCGTGACA CCTGCAGGAA 300
CATCGGCTTC GTCTCAAACG ATGTGGGACT TGATGCTGAC AACTGCAAGG TCCTTGTAAA 360
CATTGAGCAG CAGAGCCCTG ATATTGCCCA GGGTGTGCAC GGCCACCTTA CCAAAAGACC 420
CGAGGAAATC GGTGCTGGAG ACCAGGGTCA CATGTTTGGC TATGCCACGG ACGAAACCCC 480
AGAATTGATG CCATTGAGTC ATGTTCTTGC AACTAAACTC GGTGCTCGTC TCACCGAGGT 540
TCGCAAGAAC GGAACCTGCC CATGGTTGAG GCCTGATGGG AAAACCCAAG TGACTGTTGA 600
GTATTACAAT GACAACGGTG CCATGGTTCC AGTTCGTGTC CACACTGTGC TTATCTCCAC 660
CCAACATGAT GAGACTGTGA CCAACGACGA AATTGCAGCT GACCTCAAGG AGCATGTGAT 720
CAAGCCGGTG ATCCCGGAGA AGTACCTTGA TGAGAAGACC ATTTTCCACT TGAACCCCTC 780
TGGCCGTTTT GTCATTGGAG GTCCTCACGG TGATGCTGGT CTCACCGGCC GCAAGATCAT 840
CATCGATACT TACGGAGGAT GGGGTGCTCA TGGTGGTGGT GCTTTCTCCG GGAAGGATCC 900
CACCAAGGTT GATAGGAGTG GTGCTTACAT TGTGAGACAG GCTGCTAAGA GCATTGTGGC 960
AAGTGGACTA GCCAGAAGGT GCATTGTGCA AGTGTCTTAT GCCATTGGTG TGCCCGAGCC 1020 TTTGTCTGTC TTTGTTGACA CCTATGGCAC CGGGAAGATC CATGATAAGG AGATTCTCAA 1080
CATTGTGAAG GAGAACTTTG ATTTCAGGCC CGGTATGATC TCCATCAACC TTGATCTCAA 1140
GAGGGGTGGG AATAACAGGT TCTTGAAGAC TGCTGCATAT GGACACTTCG GCAGAGAGGA 1200
CCCTGACTTC ACATGGGAAG TGGTCAAGCC CCTCAAGTGG GAGAAGGCCT AAGGCCATTC 1260
ATTCCACTGC AATGTGCTGG GAGTTTTTTA GCGTTGCCCT TATAATGTCT ATTATCCATA 1320
ACTTTCCACG TCCCTTGCTC TGTGTTTTTC TCTCGTCGTC CTCCTCCTAT TTTGTTTCTC 1380
CTGCCTTTCA TTTGTAATTT TTTACATGAT CAACTAAAAA ATGTACTCTC TGTTTTCCGA 1440
CCATTGTGTC TCTTAATATC AGTATCAAAA AGAATGTTCC AAGTT 1485
(2) INFORMATION FOR SEQ ID NO: 39: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 392 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (vii) IMMEDIATE SOURCE: (B) CLONE: s2.12b06 (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 39:
Met Ala Glu Thr Phe Leu Phe Thr Ser Glu Ser Val Asn Glu Gly His 1 5 10 15
Pro Asp Lys Leu Cys Asp Gln Ile Ser Asp Ala Val Leu Asp Ala Cys 20 25 30 Leu Glu Gln Asp Pro Asp Ser Lys Val Ala Cys Glu Thr Cys Thr Lys 35 40 45 Thr Asn Leu Val Met Val Phe Gly Glu Ile Thr Thr Lys Ala Asn Val 50 55 60 Asp Tyr Glu Lys Ile Val Arg Asp Thr Cys Arg Asn Ile Gly Phe Val 65 70 75 80
Ser Asn Asp Val Gly Leu Asp Ala Asp Asn Cys Lys Val Leu Val Asn 85 90 95
Ile Glu Gln Gln Ser Pro Asp Ile Ala Gln Gly Val His Gly His Leu 100 105 110 Thr Lys Arg Pro Glu Glu Ile Gly Ala Gly Asp Gln Gly His Met Phe 115 120 125 Gly Tyr Ala Thr Asp Glu Thr Pro Glu Leu Met Pro Leu Ser His Val 130 135 140
Leu Ala Thr Lys Leu Gly Ala Arg Leu Thr Glu Val Arg Lys Asn Gly 145 150 155 160
Thr Cys Pro Trp Leu Arg Pro Asp Gly Lys Thr Gln Val Thr Val Glu 165 170 175
Tyr Tyr Asn Asp Asn Gly Ala Met Val Pro Val Arg Val His Thr Val 180 185 190 Leu Ile Ser Thr Gln His Asp Glu Thr Val Thr Asn Asp Glu Ile Ala 195 200 205 Ala Asp Leu Lys Glu His Val Ile Lys Pro Val Ile Pro Glu Lys Tyr 210 215 220 Leu Asp Glu Lys Thr Ile Phe His Leu Asn Pro Ser Gly Arg Phe Val 225 230 235 240
Ile Gly Gly Pro His Gly Asp Ala Gly Leu Thr Gly Arg Lys Ile Ile 245 250 255
Ile Asp Thr Tyr Gly Gly Trp Gly Ala His Gly Gly Gly Ala Phe Ser 260 265 270 Gly Lys Asp Pro Thr Lys Val Asp Arg Ser Gly Ala Tyr Ile Val Arg 275 280 285 Gln Ala Ala Lys Ser Ile Val Ala Ser Gly Leu Ala Arg Arg Cys Ile 290 295 300 Val Gln Val Ser Tyr Ala Ile Gly Val Pro Glu Pro Leu Ser Val Phe 305 310 315 320
Val Asp Thr Tyr Gly Thr Gly Lys Ile His Asp Lys Glu Ile Leu Asn 325 330 335
Ile Val Lys Glu Asn Phe Asp Phe Arg Pro Gly Met Ile Ser Ile Asn 340 345 350 Leu Asp Leu Lys Arg Gly Gly Asn Asn Arg Phe Leu Lys Thr Ala Ala 355 360 365 Tyr Gly His Phe Gly Arg Glu Asp Pro Asp Phe Thr Trp Glu Val Val 370 375 380 Lys Pro Leu Lys Trp Glu Lys Ala 385 390
(2) INFORMATION FOR SEQ ID NO: 40: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1479 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) ORIGINAL SOURCE: (A) ORGANISM: Lycopersicon esculentum
(xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 40:
GAATTCCTAC AAAGAGGTTA TTTCTCTCAA GGGGTAAAAA GATTGCCCCT TTTCGACATT 60
TATAATCCTC TTTTTCTCTT TGTTCGCCGT TGGGTTCTTC ACTTTCCTGT TTCTTGAGAA 120
TGGAAACTTT CTTATTCACC TCCGAGTCTG TGAACGAGGG TCACCCAGAC AAGCTCTGTG 180
ATCAGATCTC TGATGCAGTT CTTGATGCCT GCCTTGAGCA AGATCCCGAG AGCAAAGTTG 240
CATGTGAAAC TTGCACCAAG ACCAACTTGG TCATGGTCTT TGGTGAGATC ACAACCAAGG 300 CTATTGTAGA CTATGAGAAG ATTGTGCGTG ACACATGCCG TAATATTGGA TTTGTTTCTG 360
ATGATGTTGG TCTTGATGCT GACAACTGCA AGGTCCTTGT TTACATTGAG CAGCAAAGTC 420
CTGATATTGC TCAAGGTGTC CACGGCCATC TGACCAAACG CCCCGAGGAG ATTGGTGCTG 480
GTGACCAGGG CCACATGTTT GGCTATGCAA CAGATGAGAC CCCTGAATTA ATGCCTCTCA 540
GTCACGTGCT TGCAACTAAA CTTGGTGCCC GTCTTACAGA AGTCCGCAAG AATGGCACCT 600
GCGCCTGGTT GAGGCCTGAT GGCAAGACCC AAGTTACTGT TGAGTATAGC AATGACAATG 660
GTGCCATGGT TCCAATTAGG GTACACACTG TTCTTATCTC CACCCAACAC GATGAGACCG 720
TTACCAATGA TGAGATTGCC CGCGACCTTA AGGAGCATGT CATCAAACCA GTCATCCCAG 780
AGAAGTACCT TGATGAGAAT ACTATTTTCC ACCTTAACCC ATCTGGCCGA TTCGTTATTG 840
GTGGACCTCA TGGTGATGCT GGTCTCACTG GTCGTAAAAT CATCATCGAC ACTTATGGTG 900
GTTGGGGTGC TCATGGTGGT GGTGCTTTCT CGGGCAAAGA CCCAACCAAG GTCGACAGGA 960
GTGGTGCATA CATTGTAAGG CAGGCTGCAA AGAGTATCGT AGCTAGTGGA CTTGCTCGTA 1020 GATGCATCGT GCAGGTATCT TATGCCATCG GTGTGCCTGA GCCATTGTCT GTATTCGTTG 1080
ACACCTATGG CACTGGAAAG ATCCCTGACA GGGAAATTTT GAAGATCGTT AAGGAGAACT 1140
TTGACTTCAG ACCTGGAATG ATGTCCATTA ACTTGGATTT GAAGAGGGGT GGCAATAGAA 1200
GATTCTTGAA AACTGCTGCC TATGGTCACT TTGGACGTGA TGACCCCGAT TCACATGGG 1260
AAGTTGTCAA GCCCCTCAAG TGGGAAAAGC CCCAAGACTA ATAAGTGCTT GCCTATGTTT 1320
TTGTTCTTTG TTGTTTGCTT GTGGCTTTAG AATCTCCCCC GTGTTTGCTT GTTTGTCTTT 1380
GTATTTTCTC TTTTGACCCT TTATTTTGTT ATTGTCCTGT TTCCATTGTG TTGGATGGAT 1440
ATCTTAGGCC TTGGAATATT AAGGAAAGAA AAGGAATTC 1479
(2) INFORMATION FOR SEQ ID NO: 41: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1380 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA
(xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 41:
CCCTCCCTTC GGTTCATCGG CCTCCCGATC GAGCAGTAGA AGCAGCGCAA GGGCATCGCT 60
AGCACTAAAG AAATGGCAGC CGAGACGTTC CTCTTCACGT CCGAGTCTGT GAACGAGGGC 120
CATCCCGACA AGCTCTGTGA CCAAGTCTCC GACGCCGTCT TGGATGCCTG CTTGGCCCAG 180
GATGCCGACA GCAAGGTCGC CTGCGAGACC GTCACCAAGA CCAACATGGT CATGGTCTTG 240
GGCGAGATCA CCACCAAGGC CACCGTCGAC TATGAGAAGA TCGTG GTGA CACCTGCCGC 300
AACATCGGTT TCATCTCTGA TGACGTTGGT CTCGACGCCG ACCGTTGCAA RGTGCTCGTC 360
AACATCGAGC AGCAGTCCCC TGACATTGCC CAGGGTGTTC ATGGACACTT CACCAAGCGT 420
CCCGAAGAAG TCGGCGCCGG TGACCAGGGC ATCATGTTCG GCTATGCCAC CGATGAGACC 480
CCTGAGCTGA TGCCCCTCAA GCACGTGCTT GCCACCAAGC TYGGAGCTCG CCTCACSGAG 540
GTCCGCAAGA ATGGCACCTG CGCCTGGGTC AGGCCTGACG GAAAGACCCA GGTCACAGTC 600
GAGTACCTAA ACGAGGATGG TGCCATGGTA CCTGTTCGTG TGCACACCGT CCTCATCTCC 660
ACCCAGCACG ACGAGACCGT CACCAACGAC GAGATTGCTG CGGACCTCAA GGAGCATGTC 720
ATCAAGCCGG TGATCCCCGC AAAGTACCTC GATGAGAACA CCATCTTCCA CCTGAACCCG 780
TCTGGCCGCT TCGTCATCGG CGGCCCCCAC GGTGACGCCG GTCTCACCGG CCGCAAGATC 840
ATCATCGACA CCTATGGTGG CTGGGGAGCC CACGGCGGCG GTGCCTTCTC TGGCAAGGAC 900
CCAACCAAGG TCGACCGYAG TGGCGCCTAC ATTGCCAGGC ARGCCGCCAA GAGCATCATC 960
GCCAGCGGCC TCGCACGCCG CTGCATTGTG CAGATCTCAT ACGCCATCGG TGTGCCTGAG 1020
CCTTTGTCTG TGTTCGTCGA CTCCTACGGC ACCGGCAAGA TCCCCGACAG GGAGATCCTC 1080
AAGCTCGTGA AGGAGAACTT TGACTTCAGG CCCGGGATGA TCAGCATCAA CCTGGACTTG 1140
AAGAAAGGTG GAAACAGGTT CATCAAGACC GCTGCTTACG GTCACTTTGG CCGTGATGAT 1200
GCCGACTTCA CCTGGGAGGT GGTGAAGCCC CTCAAGTTCG ACAAGGCATC TGCCTAAGAG 1260
CATGGCATTC TCTTGGTCTG CCGCCTCTCA AGTTCGTCAA GACGGGATCA TGTTGCTCCT 1320
GGGAAGTGGG AAGAAGCATT AGACATTGAA GCGACGCTCT ACACTGGTCT TGTTGTATGG 1380
(2) INFORMATION FOR SEQ ID NO: 42: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 394 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: peptide (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 42:
Met Ala Ala Glu Thr Phe Leu Phe Thr Ser Glu Ser Val Asn Glu Gly 1 5 10 15 His Pro Asp Lys Leu Cys Asp Gln Val Ser Asp Ala Val Leu Asp Ala 20 25 30 Cys Leu Ala Gln Asp Ala Asp Ser Lys Val Ala Cys Glu Thr Val Thr 35 40 45 Lys Thr Asn Met Val Met Val Leu Gly Glu Ile Thr Thr Lys Ala Thr 50 55 60 Val Asp Tyr Glu Lys Ile Val Arg Asp Thr Cys Arg Asn Ile Gly Phe 65 70 75 80 Ile Ser Asp Asp Val Gly Leu Asp Ala Asp Arg Cys Lys Val Leu Val 85 90 95 Asn Ile Glu Gln Gln Ser Pro Asp Ile Ala Gln Gly Val His Gly His 100 105 110 Phe Thr Lys Arg Pro Glu Glu Val Gly Ala Gly Asp Gln Gly Ile Met 115 120 125 Phe Gly Tyr Ala Thr Asp Glu Thr Pro Glu Leu Met Pro Leu Lys His 130 135 140 Val Leu Ala Thr Lys Leu Gly Ala Arg Leu Thr Glu Val Arg Lys Asn 145 150 155 160
Gly Thr Cys Ala Trp Val Arg Pro Asp Gly Lys Thr Gln Val Thr Val 165 170 175
Glu Tyr Leu Asn Glu Asp Gly Ala Met Val Pro Val Arg Val His Thr 180 185 190 Val Leu Ile Ser Thr Gln His Asp Glu Thr Val Thr Asn Asp Glu Ile 195 200 205 Ala Ala Asp Leu Lys Glu His Val Ile Lys Pro Val Ile Pro Ala Lys 210 215 220 Tyr Leu Asp Glu Asn Thr Ile Phe His Leu Asn Pro Ser Gly Arg Phe 225 230 235 240
Val Ile Gly Gly Pro His Gly Asp Ala Gly Leu Thr Gly Arg Lys Ile 245 250 255
Ile Ile Asp Thr Tyr Gly Gly Trp Gly Ala His Gly Gly Gly Ala Phe 260 265 270 Ser Gly Lys Asp Pro Thr Lys Val Asp Arg Ser Gly Ala Tyr Ile Ala 275 280 285 Arg Gln Ala Ala Lys Ser Ile Ile Ala Ser Gly Leu Ala Arg Arg Cys 290 295 300 Ile Val Gln Ile Ser Tyr Ala Ile Gly Val Pro Glu Pro Leu Ser Val 305 310 315 320 Phe Val Asp Ser Tyr Gly Thr Gly Lys Ile Pro Asp Arg Glu Ile Leu 325 330 335
Lys Leu Val Lys Glu Asn Phe Asp Phe Arg Pro Gly Met Ile Ser Ile 340 345 350 Asn Leu Asp Leu Lys Lys Gly Gly Asn Arg Phe Ile Lys Thr Ala Ala 355 360 365 Tyr Gly His Phe Gly Arg Asp Asp Ala Asp Phe Thr Trp Glu Val Val 370 375 380 Lys Pro Leu Lys Phe Asp Lys Ala Ser Ala 385 390
(2) INFORMATION FOR SEQ ID NO: 43: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1353 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) TYPE OF MOLECULE: cDNA (vii) ORIGINAL SOURCE: (A) ORGANISM: Hordeum vulgare (xi) DESCRIPTION OF THE SEQUENCE: SEQ ID NO: 43:
GAATTCCGGA TAGCATCAGC ACAACTGCAC GAGAGCATCT CTACCACCAA AGAAATGGCG 60
GCCGAGACGT TCCTCTTCAC GTCCGAGTCC GTGAACGAGG GCCATCCCGA CAAGCTGTGC 120
GACCAGGTCT CTGACGCCGT CTTGGACGCC TGCTTGGCCC AGGATCCTGA CAGCAAGGTT 180
GCTTGCGAGA CCTGCACCAA GACCAACATG GTCATGGTCT TCGGCGAGAT CACCACCAAG 240
GCCACCGTTG ACTATGAGAA GATTGTGCGC GACACCTGCC GTGACATCGG CTTCATCTCT 300
GACGACGTCG GTCTCGATGC CGACCATTGC AAGGTGCTCG TCAACATCGA GCAGCAATCC 360
CCTGACATTG CCCAGGGTGT TCACGGACAC TTCACCAAGC GTCCAGAAGA GGTCGGCGCC 420
GGTGACCAGG GCATCATGTT TGGCTACGCC ACTGATGAGA CCCCTGAGCT GATGCCCCTC 480
ACCCACATGC TTGCCACCAA GCTCGGAGCT CGCCTCACCG AGGTCCGCAA GAATGGCACC 540
TGCGCCTGGC TCAGGCCTGA TGGAAAGACC CAGGTCACCA TTGAGTACCT AAACGAGGGT 600
GGTGCCATGG TGCCCGTTCG TGTGCACACC GTCCTCATCT CCACCCAGCA TGATGAGACC 660
GTCACCAACG ATGAGATCGC TGCAGACCTC AAGGAGCATG TCATCAAGCC GGTGATTCCC 720
GGGAAGTACC TCGATGAGAA CACCATCTTC CACCTGAACC CATCGGGCCCC CTTTGTCATC 780
GGTGGCCCTC ACGGCGATGC CGGTCTCACC GCCCGCAAGA TCATCATCGA CACCTATGGT 840
GGCTGGGGAG CCCACGGCGG CGGTGCCTTC TCTGGCAAGG ACCCTACCAA GGTCGACCGC 900
AGTGGCGCCT ACATTGCCAG GCAGGCTGCC AAGAGCATCA TCGCCAGCGG CCTCGCACGC 960
CGGTGCATTG TGCAGATCTC ATATGCCATC GGTGTACCTG AGCCTTTGTC TGTGTTCGTC 1020
GACTCCTACG GCACTGGCAA GATCCCTGAC AGGGAGATCC TCAAGCTCGT GAAGGAGAAC 1080
TTTGACTTCA GACCCGGGAT GATCACGATC AACCTCGACT TGAAGAAAGG TGGAAACAGG 1140
TTCATCAAGA CAGCTGCTTA CGGTCACTTT GGCCGCGATG ATGCTGACTT CACCTGGGAG 1200
GTGGTGAAGC CCCTCAAGTT CGACAAGGCA TCTGCTTAAG AAGAAGACAT CACATTGAGG 1260
GTTCTTCTTG GTCTGATGCC TCTCAAGTTC GGCAAGGCGG GATCCTTTTG CTCCTCGGAA 1320
GTAAGAAGAA GCATTCAACA TCGCCCGGAA TTC 1353
It is noted that, as of this date, the best method known to the applicant for carrying out the aforementioned invention is that which is clear from the present description of the invention. Having described the invention as above, the content of the following claims is claimed as property:
Claims (52)
- 1. An isolated nucleic acid fragment encoding all or a substantial portion of a plant dihydrodipicolinate reductase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 2 and 4; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 2 and 4; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
- 2. The isolated nucleic acid fragment of Claim 1, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in a member selected from the group consisting of SEQ ID NO: 1 and 3.
- 3. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 1 operatively linked to suitable regulatory sequences.
- 4. A transformed host cell characterized in that it comprises the chimeric gene of Claim 3.
- 5. A dihydrodipicolinate reductase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 2 and 4.
- 6. An isolated nucleic acid fragment encoding all or a substantial portion of a plant diaminopimelate epimerase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 7, 9, 11 and 13; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 7, 9, 11 and 13; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
- 7. The isolated nucleic acid fragment of Claim 6, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in a member selected from the group consisting of SEQ ID NO: 6, 8, 10 and 12.
- 8. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 6 operatively linked to suitable regulatory sequences.
- 9. A transformed host cell characterized in that it comprises the chimeric gene of Claim 8.
- 10. A diaminopimelate epimerase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 7, 9, 11 and 13.
- 11. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine synthase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 16 and 18; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 16 and 18; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
- 12. The isolated nucleic acid fragment of Claim 11, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in a member selected from the group consisting of SEQ ID NO: 15 and 17.
- 13. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 11 operatively linked to suitable regulatory sequences.
- 14. A transformed host cell characterized in that it comprises the chimeric gene of Claim 13.
- 15. A threonine synthase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in a member selected from the group consisting of SEQ ID NO: 16 and 18.
- 16. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine synthase, characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 20; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 20; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
- 17. The isolated nucleic acid fragment of Claim 16, characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence set forth in SEQ ID NO: 19.
- 18. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 16 operatively linked to suitable regulatory sequences.
- 19. A transformed host cell characterized in that it comprises the chimeric gene of Claim 18.
- 20. A threonine synthase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence set forth in SEQ ID NO: 20.
- An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine synthase characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence declared in a member selected from the group consisting of SEQ ID NO: 22 and 24. b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence declared in a member selected from the group consisting of SEQ ID NO: 22 and 24; Y c) an isolated nucleic acid fragment that is complementary to (a) or (b).
- The nucleic acid fragment isolated from Claim 21 characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence declared in a member selected from the group consisting of SEQ ID NO: 21 and 23.
- 23. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 21 operatively linked to suitable regulatory sequences.
- 24. A transformed host cell characterized in that it comprises the chimeric gene of Claim 23.
- 25. A threonine synthase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence declared in a member selected from the group consisting of SEQ ID NO: 22 and 24.
- 26. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine synthase characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence declared in SEQ ID NO: 26; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence declared in SEQ ID NO: 26; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
- 27. The isolated nucleic acid fragment of Claim 26 characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence declared in SEQ ID NO: 25.
- 28. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 26 operatively linked to suitable regulatory sequences.
- 29. A transformed host cell characterized in that it comprises the chimeric gene of Claim 28.
- 30. A threonine synthase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence declared in SEQ ID NO: 26.
- 31. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine deaminase characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence declared in SEQ ID NO: 29; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence declared in SEQ ID NO: 29; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
- 32. The isolated nucleic acid fragment of Claim 31 characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence declared in SEQ ID NO: 28.
- 33. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 31 operably linked to suitable regulatory sequences.
- 34. A transformed host cell characterized in that it comprises the chimeric gene of Claim 33.
- 35. A threonine deaminase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence declared in SEQ ID NO: 29.
- 36. An isolated nucleic acid fragment encoding all or a substantial portion of a plant threonine deaminase characterized in that it comprises a member selected from the group consisting of: a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence declared in a member selected from the group consisting of SEQ ID NO: 31 and 33; b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence declared in a member selected from the group consisting of SEQ ID NO: 31 and 33; and c) an isolated nucleic acid fragment that is complementary to (a) or (b).
- 37. The isolated nucleic acid fragment of Claim 36 characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence declared in a member selected from the group consisting of SEQ ID NO: 30 and 32.
- 38. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 36 operably linked to suitable regulatory sequences.
- 39. A transformed host cell characterized in that it comprises the chimeric gene of Claim 38.
- 40. A threonine deaminase polypeptide characterized in that it comprises all or a substantial portion of the amino acid sequence declared in a member selected from the group consisting of SEQ ID NO: 31 and 33.
- 41. An isolated nucleic acid fragment encoding all or a substantial portion of a plant S-adenosylmethionine synthetase characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence reported in SEQ ID NO: 35.
- 42. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 41 operably linked to suitable regulatory sequences.
- 43. A transformed host cell characterized in that it comprises the chimeric gene of Claim 42.
- 44. An isolated nucleic acid fragment encoding all or a substantial portion of a plant S-adenosylmethionine synthetase characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence reported in SEQ ID NO: 38.
- 45. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 44 operably linked to suitable regulatory sequences.
- 46. A transformed host cell characterized in that it comprises the chimeric gene of Claim 45.
- 47. An isolated nucleic acid fragment encoding all or a substantial portion of a plant S-adenosylmethionine synthetase characterized in that the nucleotide sequence of the fragment comprises all or a portion of the sequence reported in SEQ ID NO: 41.
- 48. A chimeric gene characterized in that it comprises the nucleic acid fragment of Claim 47 operably linked to suitable regulatory sequences.
- 49. A transformed host cell characterized in that it comprises the chimeric gene of Claim 48.
- 50. A method of altering the level of expression of a plant amino acid biosynthetic enzyme in a host cell characterized in that it comprises: a) transforming a host cell with the chimeric gene of any of Claims 3, 8, 13, 18, 23, 28, 33, 38, 42, 45, and 48; and b) growing the transformed host cell produced in step (a) under conditions that are suitable for the expression of the chimeric gene, wherein the expression of the chimeric gene leads to the production of altered levels of a plant amino acid biosynthetic enzyme in the transformed host cell.
- 51. A method of obtaining a nucleic acid fragment encoding all or substantially all of the amino acid sequence of a plant amino acid biosynthetic enzyme characterized in that it comprises: a) probing a cDNA or genomic library with the nucleic acid fragment of any of Claims 1, 6, 11, 16, 21, 26, 31, 36, 41, 44, and 47; b) identifying a DNA clone that hybridizes with the nucleic acid fragment of any of Claims 1, 6, 11, 16, 21, 26, 31, 36, 41, 44, and 47; c) isolating the DNA clone identified in step (b); and d) sequencing the cDNA or genomic fragment that comprises the clone isolated in step (c); wherein the sequenced nucleic acid fragment encodes all or substantially all of the amino acid sequence of a plant amino acid biosynthetic enzyme.
- 52. A method of obtaining a nucleic acid fragment encoding a portion of the amino acid sequence of a plant amino acid biosynthetic enzyme characterized in that it comprises: a) synthesizing an oligonucleotide primer corresponding to a portion of the sequence declared in any of SEQ ID NOs: 1, 3, 6, 8, 10, 12, 15, 17, 19, 21, 23, 25, 28, 30, 32, 35, 38, and 41; and b) amplifying a cDNA insert present in a cloning vector using the oligonucleotide primer of step (a) and a primer representing sequences of the cloning vector; wherein the amplified nucleic acid fragment encodes a portion of the amino acid sequence of a plant amino acid biosynthetic enzyme.
- 53. A product, characterized in that it is produced by the method of Claim 51.
- 54. A product, characterized in that it is produced by the method of Claim 52.
- 55. A method for evaluating at least one compound for its ability to inhibit the activity of a plant biosynthetic enzyme selected from the group consisting of dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase, characterized in that it comprises the steps of: a) transforming a host cell with a chimeric gene comprising a nucleic acid fragment encoding a plant biosynthetic enzyme selected from the group consisting of dihydrodipicolinate reductase, diaminopimelate epimerase, threonine synthase, threonine deaminase and S-adenosylmethionine synthetase, operatively linked to suitable regulatory sequences; b) growing the transformed host cell under conditions that are suitable for the expression of the chimeric gene, wherein the expression of the chimeric gene results in the production of the biosynthetic enzyme encoded by the operatively linked nucleic acid fragment in the transformed host cell; c) optionally purifying the plant biosynthetic enzyme expressed by the transformed host cell; d) treating the biosynthetic enzyme with a compound to be tested; and e) comparing the activity of the biosynthetic enzyme that has been treated with a test compound to the activity of an untreated plant biosynthetic enzyme, thereby selecting compounds with potential for inhibitory activity.
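The comparison in step (e) of the compound-evaluation method above can be illustrated with a minimal Python sketch. It is not part of the claims: the percent-inhibition formula is the conventional one, while the function names, the example readings, and the 50% selection threshold are hypothetical choices made only for illustration.

```python
# Illustrative sketch only: names, readings, and the threshold are assumptions,
# not values taken from the patent.

def percent_inhibition(treated_activity: float, untreated_activity: float) -> float:
    """Return inhibition (%) of a treated enzyme relative to the untreated control."""
    if untreated_activity <= 0:
        raise ValueError("untreated activity must be positive")
    return (1.0 - treated_activity / untreated_activity) * 100.0


def select_candidates(treated: dict, untreated_activity: float,
                      threshold: float = 50.0) -> list:
    """Keep compounds whose inhibition meets the (assumed) threshold."""
    return [
        name
        for name, activity in treated.items()
        if percent_inhibition(activity, untreated_activity) >= threshold
    ]


# Hypothetical activity readings in arbitrary units.
readings = {"compound_A": 2.0, "compound_B": 9.5, "compound_C": 4.0}
candidates = select_candidates(readings, untreated_activity=10.0)
print(candidates)
```

Any real screen would set the threshold and the activity assay according to the enzyme being tested; the sketch only shows the treated-versus-untreated comparison.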
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US60/048771 | 1997-06-06 | | |
US048771 | 1997-06-06 | | |
US60/049443 | 1997-06-12 | | |
US049443 | 1997-06-12 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
MXPA99011066A (en) | 2000-09-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6664445B1 (en) | | Plant amino acid biosynthetic enzymes |
EP1002113B1 (en) | | Plant amino acid biosynthetic enzymes |
US6346403B1 (en) | | Methionine metabolic enzymes |
US5912414A (en) | | Nucleic acid fragments, chimeric genes and methods for increasing the methionine content of the seeds of plants |
WO1999055887A2 (en) | | Carotenoid biosynthesis enzymes |
US7135622B2 (en) | | Mevalonate synthesis enzymes |
WO1999049013A2 (en) | | Tryptophan biosynthetic enzymes |
US20030088886A1 (en) | | Plant methionine synthase gene and methods for increasing the methionine content of the seeds of plants |
US7439420B2 (en) | | Plant amino acid biosynthetic enzymes |
US6204039B1 (en) | | Plant isocitrate dehydrogenase homologs |
US20040064848A1 (en) | | Chorismate biosynthesis enzymes |
MXPA99011066A (en) | | Plant amino acid biosynthetic enzymes |
US20060156441A1 (en) | | Aspartate kinase |
US20060026705A1 (en) | | Plant amino acid biosynthetic enzymes |
US7368633B2 (en) | | Plant amino acid biosynthetic enzymes |
US6297055B1 (en) | | Amino acid decarboxylases |
US20020119546A1 (en) | | Squalene synthesis enzymes |
WO1999021880A2 (en) | | Plant branched-chain amino acid biosynthetic enzymes |
US6403859B1 (en) | | Vitamin B metabolism proteins |
US20020157132A1 (en) | | Plant amino acid biosynthetic enzymes |
WO2000004168A1 (en) | | Ornithine biosynthesis enzymes |
US7192758B2 (en) | | Polynucleotides encoding phosphoribosylanthranilate isomerase |