MXPA00009108A - Genes controlling diseases - Google Patents
Genes controlling diseases
- Publication number
- MXPA00009108A MXPA/A/2000/009108A MX PA00009108 A
- Authority
- MX
- Mexico
- Prior art keywords
- leu
- val
- ser
- phe
- lys
- Prior art date
- 230000001276 controlling effect Effects 0.000 title abstract description 5
- 201000010099 disease Diseases 0.000 title description 18
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 146
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 146
- 244000053095 fungal pathogens Species 0.000 claims abstract description 65
- 229920003013 deoxyribonucleic acid Polymers 0.000 claims description 257
- 241000196324 Embryophyta Species 0.000 claims description 229
- 230000014509 gene expression Effects 0.000 claims description 163
- 229920001850 Nucleic acid sequence Polymers 0.000 claims description 96
- 229920000272 Oligonucleotide Polymers 0.000 claims description 76
- 150000001413 amino acids Chemical group 0.000 claims description 73
- 230000000875 corresponding Effects 0.000 claims description 35
- 229920002676 Complementary DNA Polymers 0.000 claims description 34
- 101710010344 MLXIPL Proteins 0.000 claims description 31
- 101710031513 mml-1 Proteins 0.000 claims description 31
- IMMPMHKLUUZKAZ-WMZOPIPTSA-N Trp-Phe Chemical compound C([C@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=CC=C1 IMMPMHKLUUZKAZ-WMZOPIPTSA-N 0.000 claims description 25
- 239000002299 complementary DNA Substances 0.000 claims description 24
- 230000000692 anti-sense Effects 0.000 claims description 21
- 230000001717 pathogenic Effects 0.000 claims description 21
- 244000052769 pathogens Species 0.000 claims description 21
- 239000000203 mixture Substances 0.000 claims description 20
- 235000007340 Hordeum vulgare Nutrition 0.000 claims description 18
- 125000000267 glycino group Chemical group [H]N([*])C([H])([H])C(=O)O[H] 0.000 claims description 17
- 229920000023 polynucleotide Polymers 0.000 claims description 17
- 239000002157 polynucleotide Substances 0.000 claims description 17
- 230000000694 effects Effects 0.000 claims description 15
- 230000002538 fungal Effects 0.000 claims description 13
- 241000221787 Erysiphe Species 0.000 claims description 11
- UQHGAYSULGRWRG-WHFBIAKZSA-N Glu-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(O)=O UQHGAYSULGRWRG-WHFBIAKZSA-N 0.000 claims description 11
- 241001480061 Blumeria graminis Species 0.000 claims description 10
- 241000221785 Erysiphales Species 0.000 claims description 10
- 108020004999 Messenger RNA Proteins 0.000 claims description 9
- 238000002744 homologous recombination Methods 0.000 claims description 9
- 229920002106 messenger RNA Polymers 0.000 claims description 9
- DYDKXJWQCIVTMR-UHFFFAOYSA-N Aspartyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CC(O)=O DYDKXJWQCIVTMR-UHFFFAOYSA-N 0.000 claims description 8
- 230000002759 chromosomal Effects 0.000 claims description 8
- 229920002033 ribozyme Polymers 0.000 claims description 6
- 229920002395 Aptamer Polymers 0.000 claims description 5
- 238000000137 annealing Methods 0.000 claims description 5
- 125000000510 L-tryptophano group Chemical group [H]C1=C([H])C([H])=C2N([H])C([H])=C(C([H])([H])[C@@]([H])(C(O[H])=O)N([H])[*])C2=C1[H] 0.000 claims description 4
- 238000009396 hybridization Methods 0.000 claims description 4
- 230000000295 complement Effects 0.000 claims description 3
- 238000002156 mixing Methods 0.000 claims description 3
- 108020004491 Antisense DNA Proteins 0.000 claims description 2
- 239000003816 antisense DNA Substances 0.000 claims description 2
- 230000023298 conjugation with cellular fusion Effects 0.000 claims description 2
- 230000013011 mating Effects 0.000 claims description 2
- 230000021037 unidirectional conjugation Effects 0.000 claims description 2
- 241000209219 Hordeum Species 0.000 claims 2
- 230000008836 DNA modification Effects 0.000 claims 1
- 230000003111 delayed Effects 0.000 claims 1
- 235000018102 proteins Nutrition 0.000 description 123
- 210000004027 cells Anatomy 0.000 description 107
- 241000282326 Felis catus Species 0.000 description 91
- 108010092114 histidylphenylalanine Proteins 0.000 description 60
- 241000880493 Leptailurus serval Species 0.000 description 56
- NFDYGNFETJVMSE-BQBZGAKWSA-N Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CO NFDYGNFETJVMSE-BQBZGAKWSA-N 0.000 description 52
- GVRKWABULJAONN-UHFFFAOYSA-N Valyl-Threonine Chemical compound CC(C)C(N)C(=O)NC(C(C)O)C(O)=O GVRKWABULJAONN-UHFFFAOYSA-N 0.000 description 51
- LRKCBIUDWAXNEG-CSMHCCOUSA-N Leu-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRKCBIUDWAXNEG-CSMHCCOUSA-N 0.000 description 50
- 108010017391 lysylvaline Proteins 0.000 description 43
- 230000001131 transforming Effects 0.000 description 43
- 108010054155 lysyllysine Proteins 0.000 description 42
- 241001122767 Theaceae Species 0.000 description 40
- IOUPEELXVYPCPG-UHFFFAOYSA-N val-gly Chemical compound CC(C)C(N)C(=O)NCC(O)=O IOUPEELXVYPCPG-UHFFFAOYSA-N 0.000 description 40
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 38
- NFNVDJGXRFEYTK-YUMQZZPRSA-N Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O NFNVDJGXRFEYTK-YUMQZZPRSA-N 0.000 description 38
- 241000209140 Triticum Species 0.000 description 38
- 235000021307 wheat Nutrition 0.000 description 38
- XWOBNBRUDDUEEY-UWVGGRQHSA-N Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 XWOBNBRUDDUEEY-UWVGGRQHSA-N 0.000 description 37
- 108010025306 histidylleucine Proteins 0.000 description 37
- IJYZHIOOBGIINM-WDSKDSINSA-N Arg-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N IJYZHIOOBGIINM-WDSKDSINSA-N 0.000 description 36
- JKHXYJKMNSSFFL-IUCAKERBSA-N Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN JKHXYJKMNSSFFL-IUCAKERBSA-N 0.000 description 35
- 108010089804 glycyl-threonine Proteins 0.000 description 35
- 108010073969 valyllysine Proteins 0.000 description 35
- 108010050848 glycylleucine Proteins 0.000 description 34
- 108010057821 leucylproline Proteins 0.000 description 34
- 108010031719 prolyl-serine Proteins 0.000 description 34
- 108010026333 seryl-proline Proteins 0.000 description 34
- NVGBPTNZLWRQSY-UWVGGRQHSA-N Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN NVGBPTNZLWRQSY-UWVGGRQHSA-N 0.000 description 33
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 32
- YQAIUOWPSUOINN-IUCAKERBSA-N Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN YQAIUOWPSUOINN-IUCAKERBSA-N 0.000 description 31
- KLAONOISLHWJEE-UHFFFAOYSA-N Phenylalanyl-Glutamine Chemical compound NC(=O)CCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 KLAONOISLHWJEE-UHFFFAOYSA-N 0.000 description 31
- ZKQOUHVVXABNDG-IUCAKERBSA-N Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 ZKQOUHVVXABNDG-IUCAKERBSA-N 0.000 description 29
- GXDLGHLJTHMDII-WISUUJSJSA-N Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(O)=O GXDLGHLJTHMDII-WISUUJSJSA-N 0.000 description 29
- 108010005942 methionylglycine Proteins 0.000 description 28
- RDIKFPRVLJLMER-BQBZGAKWSA-N Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)N RDIKFPRVLJLMER-BQBZGAKWSA-N 0.000 description 27
- XNSKSTRGQIPTSE-UHFFFAOYSA-N Arginyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CCCNC(N)=N XNSKSTRGQIPTSE-UHFFFAOYSA-N 0.000 description 27
- JSIQVRIXMINMTA-ZDLURKLDSA-N Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O JSIQVRIXMINMTA-ZDLURKLDSA-N 0.000 description 27
- MMFKFJORZBJVNF-UWVGGRQHSA-N His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 MMFKFJORZBJVNF-UWVGGRQHSA-N 0.000 description 27
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 27
- 108010038633 aspartylglutamate Proteins 0.000 description 27
- 108010034529 leucyl-lysine Proteins 0.000 description 27
- AFWBWPCXSWUCLB-WDSKDSINSA-N Pro-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 AFWBWPCXSWUCLB-WDSKDSINSA-N 0.000 description 26
- 240000008042 Zea mays Species 0.000 description 26
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 26
- 238000000034 method Methods 0.000 description 26
- DKEXFJVMVGETOO-LURJTMIESA-N Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CN DKEXFJVMVGETOO-LURJTMIESA-N 0.000 description 25
- LESXFEZIFXFIQR-LURJTMIESA-N Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(O)=O LESXFEZIFXFIQR-LURJTMIESA-N 0.000 description 25
- OTXBNHIUIHNGAO-UWVGGRQHSA-N Leu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN OTXBNHIUIHNGAO-UWVGGRQHSA-N 0.000 description 25
- QOLYAJSZHIJCTO-VQVTYTSYSA-N Thr-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O QOLYAJSZHIJCTO-VQVTYTSYSA-N 0.000 description 25
- YKRQRPFODDJQTC-UHFFFAOYSA-N Threoninyl-Lysine Chemical compound CC(O)C(N)C(=O)NC(C(O)=O)CCCCN YKRQRPFODDJQTC-UHFFFAOYSA-N 0.000 description 25
- OHUXOEXBXPZKPT-STQMWFEESA-N Phe-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=CC=C1 OHUXOEXBXPZKPT-STQMWFEESA-N 0.000 description 24
- PPQRSMGDOHLTBE-UWVGGRQHSA-N Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PPQRSMGDOHLTBE-UWVGGRQHSA-N 0.000 description 24
- 238000010276 construction Methods 0.000 description 24
- 108010064235 lysylglycine Proteins 0.000 description 24
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 24
- XUJNEKJLAYXESH-REOHCLBHSA-N L-cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 23
- VTJUNIYRYIAIHF-IUCAKERBSA-N Leu-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O VTJUNIYRYIAIHF-IUCAKERBSA-N 0.000 description 23
- XZKQVQKUZMAADP-IMJSIDKUSA-N Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(O)=O XZKQVQKUZMAADP-IMJSIDKUSA-N 0.000 description 23
- BQBCIBCLXBKYHW-CSMHCCOUSA-N Thr-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])[C@@H](C)O BQBCIBCLXBKYHW-CSMHCCOUSA-N 0.000 description 23
- STTYIMSDIYISRG-WDSKDSINSA-N Val-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(O)=O STTYIMSDIYISRG-WDSKDSINSA-N 0.000 description 23
- 238000003752 polymerase chain reaction Methods 0.000 description 23
- 108010029020 prolylglycine Proteins 0.000 description 22
- CKAJHWFHHFSCDT-WHFBIAKZSA-N Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O CKAJHWFHHFSCDT-WHFBIAKZSA-N 0.000 description 21
- 229920001405 Coding region Polymers 0.000 description 21
- BBBXWRGITSUJPB-YUMQZZPRSA-N Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O BBBXWRGITSUJPB-YUMQZZPRSA-N 0.000 description 21
- JBCLFWXMTIKCCB-VIFPVBQESA-N Gly-Phe Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-VIFPVBQESA-N 0.000 description 21
- WBAXJMCUFIXCNI-WDSKDSINSA-N Ser-Pro Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WBAXJMCUFIXCNI-WDSKDSINSA-N 0.000 description 21
- 108010068265 aspartyltyrosine Proteins 0.000 description 21
- 108010015792 glycyllysine Proteins 0.000 description 21
- 108010009298 lysylglutamic acid Proteins 0.000 description 21
- 108010090894 prolylleucine Proteins 0.000 description 21
- 230000001105 regulatory Effects 0.000 description 21
- SIGGQAHUPUBWNF-UHFFFAOYSA-N γ-glutamyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CCC(N)=O SIGGQAHUPUBWNF-UHFFFAOYSA-N 0.000 description 21
- IKAIKUBBJHFNBZ-LURJTMIESA-N Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CN IKAIKUBBJHFNBZ-LURJTMIESA-N 0.000 description 20
- HIZYETOZLYFUFF-BQBZGAKWSA-N Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(O)=O HIZYETOZLYFUFF-BQBZGAKWSA-N 0.000 description 20
- JMCOUWKXLXDERB-WMZOPIPTSA-N Phe-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 JMCOUWKXLXDERB-WMZOPIPTSA-N 0.000 description 20
- UPJONISHZRADBH-XPUUQOCRSA-N Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UPJONISHZRADBH-XPUUQOCRSA-N 0.000 description 20
- 108010081551 glycylphenylalanine Proteins 0.000 description 20
- BNODVYXZAAXSHW-UHFFFAOYSA-N Arginyl-Histidine Chemical compound NC(=N)NCCCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 BNODVYXZAAXSHW-UHFFFAOYSA-N 0.000 description 19
- CZVQSYNVUHAILZ-UWVGGRQHSA-N His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 CZVQSYNVUHAILZ-UWVGGRQHSA-N 0.000 description 19
- HGNRJCINZYHNOU-LURJTMIESA-N Lys-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(O)=O HGNRJCINZYHNOU-LURJTMIESA-N 0.000 description 19
- GKZIWHRNKRBEOH-HOTGVXAUSA-N Phe-Phe Chemical compound C([C@H]([NH3+])C(=O)N[C@@H](CC=1C=CC=CC=1)C([O-])=O)C1=CC=CC=C1 GKZIWHRNKRBEOH-HOTGVXAUSA-N 0.000 description 19
- 108010061238 threonyl-glycine Proteins 0.000 description 19
- 230000035897 transcription Effects 0.000 description 19
- 229920000160 (ribonucleotides)n+m Polymers 0.000 description 18
- YZQCXOFQZKCETR-UWVGGRQHSA-N Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YZQCXOFQZKCETR-UWVGGRQHSA-N 0.000 description 18
- FYYSIASRLDJUNP-WHFBIAKZSA-N Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FYYSIASRLDJUNP-WHFBIAKZSA-N 0.000 description 18
- MLTRLIITQPXHBJ-BQBZGAKWSA-N Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O MLTRLIITQPXHBJ-BQBZGAKWSA-N 0.000 description 18
- NPBGTPKLVJEOBE-IUCAKERBSA-N Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N NPBGTPKLVJEOBE-IUCAKERBSA-N 0.000 description 18
- 235000005822 corn Nutrition 0.000 description 18
- 235000005824 corn Nutrition 0.000 description 18
- 108010091871 leucylmethionine Proteins 0.000 description 18
- XMAUFHMAAVTODF-STQMWFEESA-N (2S)-2-[[(2S)-2-amino-3-(1H-imidazol-5-yl)propanoyl]amino]-3-phenylpropanoic acid Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 XMAUFHMAAVTODF-STQMWFEESA-N 0.000 description 17
- 241000219194 Arabidopsis Species 0.000 description 17
- 210000003763 Chloroplasts Anatomy 0.000 description 17
- 229920002459 Intron Polymers 0.000 description 17
- QOOWRKBDDXQRHC-BQBZGAKWSA-N L-lysyl-L-alanine Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN QOOWRKBDDXQRHC-BQBZGAKWSA-N 0.000 description 17
- BQVUABVGYYSDCJ-ZFWWWQNUSA-N Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-ZFWWWQNUSA-N 0.000 description 17
- 108010038320 lysylphenylalanine Proteins 0.000 description 17
- 108010051242 phenylalanylserine Proteins 0.000 description 17
- MGHKSHCBDXNTHX-UHFFFAOYSA-N 4-amino-5-[(4-amino-1-carboxy-4-oxobutyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(CCC(N)=O)C(O)=O MGHKSHCBDXNTHX-UHFFFAOYSA-N 0.000 description 16
- SONUFGRSSMFHFN-IMJSIDKUSA-N Asn-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O SONUFGRSSMFHFN-IMJSIDKUSA-N 0.000 description 16
- NTQDELBZOMWXRS-UHFFFAOYSA-N Aspartyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC(O)=O NTQDELBZOMWXRS-UHFFFAOYSA-N 0.000 description 16
- SSHIXEILTLPAQT-UHFFFAOYSA-N Glutaminyl-Aspartate Chemical compound NC(=O)CCC(N)C(=O)NC(CC(O)=O)C(O)=O SSHIXEILTLPAQT-UHFFFAOYSA-N 0.000 description 16
- CTCFZNBRZBNKAX-UHFFFAOYSA-N Histidinyl-Glutamine Chemical compound NC(=O)CCC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 CTCFZNBRZBNKAX-UHFFFAOYSA-N 0.000 description 16
- 240000005979 Hordeum vulgare Species 0.000 description 16
- UGTZHPSKYRIGRJ-YUMQZZPRSA-N Lys-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UGTZHPSKYRIGRJ-YUMQZZPRSA-N 0.000 description 16
- BNQVUHQWZGTIBX-IUCAKERBSA-N Val-His Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CN=CN1 BNQVUHQWZGTIBX-IUCAKERBSA-N 0.000 description 16
- 108010068380 arginylarginine Proteins 0.000 description 16
- 108010093581 aspartyl-proline Proteins 0.000 description 16
- 108010037850 glycylvaline Proteins 0.000 description 16
- RZVAJINKPMORJF-UHFFFAOYSA-N p-acetaminophenol Chemical compound CC(=O)NC1=CC=C(O)C=C1 RZVAJINKPMORJF-UHFFFAOYSA-N 0.000 description 16
- JHFNSBBHKSZXKB-VKHMYHEASA-N Asp-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(O)=O JHFNSBBHKSZXKB-VKHMYHEASA-N 0.000 description 15
- YXQDRIRSAHTJKM-IMJSIDKUSA-N Cys-Ser Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(O)=O YXQDRIRSAHTJKM-IMJSIDKUSA-N 0.000 description 15
- YSWHPLCDIMUKFE-QWRGUYRKSA-N Glu-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YSWHPLCDIMUKFE-QWRGUYRKSA-N 0.000 description 15
- OLIFSFOFKGKIRH-WUJLRWPWSA-N Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CN OLIFSFOFKGKIRH-WUJLRWPWSA-N 0.000 description 15
- JYOAXOMPIXKMKK-UHFFFAOYSA-N Leucyl-Glutamine Chemical compound CC(C)CC(N)C(=O)NC(C(O)=O)CCC(N)=O JYOAXOMPIXKMKK-UHFFFAOYSA-N 0.000 description 15
- GLUBLISJVJFHQS-VIFPVBQESA-N Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 GLUBLISJVJFHQS-VIFPVBQESA-N 0.000 description 15
- UJTZHGHXJKIAOS-WHFBIAKZSA-N Ser-Gln Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O UJTZHGHXJKIAOS-WHFBIAKZSA-N 0.000 description 15
- RZEQTVHJZCIUBT-UHFFFAOYSA-N Serinyl-Arginine Chemical compound OCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N RZEQTVHJZCIUBT-UHFFFAOYSA-N 0.000 description 15
- LDEBVRIURYMKQS-UHFFFAOYSA-N Serinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CO LDEBVRIURYMKQS-UHFFFAOYSA-N 0.000 description 15
- ZSXJENBJGRHKIG-UHFFFAOYSA-N Tyrosyl-Serine Chemical compound OCC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 ZSXJENBJGRHKIG-UHFFFAOYSA-N 0.000 description 15
- 108090000848 Ubiquitin Proteins 0.000 description 15
- 102400000757 Ubiquitin Human genes 0.000 description 15
- XXDVDTMEVBYRPK-XPUUQOCRSA-N Val-Gln Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O XXDVDTMEVBYRPK-XPUUQOCRSA-N 0.000 description 15
- 108010049041 glutamylalanine Proteins 0.000 description 15
- 108010085203 methionylmethionine Proteins 0.000 description 15
- 230000004048 modification Effects 0.000 description 15
- 238000006011 modification reaction Methods 0.000 description 15
- 108010084572 phenylalanyl-valine Proteins 0.000 description 15
- NALWOULWGHTVDA-UWVGGRQHSA-N Asp-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NALWOULWGHTVDA-UWVGGRQHSA-N 0.000 description 14
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 14
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 14
- NTISAKGPIGTIJJ-IUCAKERBSA-N Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(C)C NTISAKGPIGTIJJ-IUCAKERBSA-N 0.000 description 14
- XGDCYUQSFDQISZ-BQBZGAKWSA-N Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O XGDCYUQSFDQISZ-BQBZGAKWSA-N 0.000 description 14
- IMTUWVJPCQPJEE-IUCAKERBSA-N Met-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN IMTUWVJPCQPJEE-IUCAKERBSA-N 0.000 description 14
- IEHDJWSAXBGJIP-RYUDHWBXSA-N Phe-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 IEHDJWSAXBGJIP-RYUDHWBXSA-N 0.000 description 14
- NHUHCSRWZMLRLA-UHFFFAOYSA-N Sulfizole Chemical compound CC1=NOC(NS(=O)(=O)C=2C=CC(N)=CC=2)=C1C NHUHCSRWZMLRLA-UHFFFAOYSA-N 0.000 description 14
- 230000004927 fusion Effects 0.000 description 14
- 108010078144 glutaminyl-glycine Proteins 0.000 description 14
- 108010036413 histidylglycine Proteins 0.000 description 14
- FRYULLIZUDQONW-IMJSIDKUSA-N Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FRYULLIZUDQONW-IMJSIDKUSA-N 0.000 description 13
- UKGGPJNBONZZCM-WDSKDSINSA-N Aspartyl-L-proline Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O UKGGPJNBONZZCM-WDSKDSINSA-N 0.000 description 13
- RNKSNIBMTUYWSH-YFKPBYRVSA-N L-prolylglycine Chemical compound [O-]C(=O)CNC(=O)[C@@H]1CCC[NH2+]1 RNKSNIBMTUYWSH-YFKPBYRVSA-N 0.000 description 13
- KFKWRHQBZQICHA-STQMWFEESA-N Leu-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 13
- NYQBYASWHVRESG-MIMYLULJSA-N Phe-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 NYQBYASWHVRESG-MIMYLULJSA-N 0.000 description 13
- RWCOTTLHDJWHRS-YUMQZZPRSA-N Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 RWCOTTLHDJWHRS-YUMQZZPRSA-N 0.000 description 13
- GVUVRRPYYDHHGK-UHFFFAOYSA-N Prolyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C1CCCN1 GVUVRRPYYDHHGK-UHFFFAOYSA-N 0.000 description 13
- CKHWEVXPLJBEOZ-UHFFFAOYSA-N Threoninyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)C(C)O CKHWEVXPLJBEOZ-UHFFFAOYSA-N 0.000 description 13
- 238000010367 cloning Methods 0.000 description 13
- STKYPAFSDFAEPH-LURJTMIESA-N gly-val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CN STKYPAFSDFAEPH-LURJTMIESA-N 0.000 description 13
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 13
- 108010012581 phenylalanylglutamate Proteins 0.000 description 13
- 108010048818 seryl-histidine Proteins 0.000 description 13
- 241000894007 species Species 0.000 description 13
- VHLZDSUANXBJHW-UHFFFAOYSA-N Glutaminyl-Phenylalanine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 VHLZDSUANXBJHW-UHFFFAOYSA-N 0.000 description 12
- YIWFXZNIBQBFHR-LURJTMIESA-N Gly-His Chemical compound [NH3+]CC(=O)N[C@H](C([O-])=O)CC1=CN=CN1 YIWFXZNIBQBFHR-LURJTMIESA-N 0.000 description 12
- NIKBMHGRNAPJFW-UHFFFAOYSA-N Histidinyl-Arginine Chemical compound NC(=N)NCCCC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 NIKBMHGRNAPJFW-UHFFFAOYSA-N 0.000 description 12
- VYZAGTDAHUIRQA-WHFBIAKZSA-N L-alanyl-L-glutamic acid Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O VYZAGTDAHUIRQA-WHFBIAKZSA-N 0.000 description 12
- SENJXOPIZNYLHU-IUCAKERBSA-N Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-IUCAKERBSA-N 0.000 description 12
- ATIPDCIQTUXABX-UWVGGRQHSA-N Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ATIPDCIQTUXABX-UWVGGRQHSA-N 0.000 description 12
- ZOKVLMBYDSIDKG-CSMHCCOUSA-N Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ZOKVLMBYDSIDKG-CSMHCCOUSA-N 0.000 description 12
- RVKIPWVMZANZLI-ZFWWWQNUSA-N Lys-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-ZFWWWQNUSA-N 0.000 description 12
- QXOHLNCNYLGICT-YFKPBYRVSA-N Met-Gly Chemical compound CSCC[C@H](N)C(=O)NCC(O)=O QXOHLNCNYLGICT-YFKPBYRVSA-N 0.000 description 12
- ZYTPOUNUXRBYGW-YUMQZZPRSA-N Met-Met Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CCSC ZYTPOUNUXRBYGW-YUMQZZPRSA-N 0.000 description 12
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 12
- JXWLMUIXUXLIJR-QWRGUYRKSA-N Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JXWLMUIXUXLIJR-QWRGUYRKSA-N 0.000 description 12
- ILVGMCVCQBJPSH-WDSKDSINSA-N Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO ILVGMCVCQBJPSH-WDSKDSINSA-N 0.000 description 12
- IQHUITKNHOKGFC-MIMYLULJSA-N Thr-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IQHUITKNHOKGFC-MIMYLULJSA-N 0.000 description 12
- GJNDXQBALKCYSZ-RYUDHWBXSA-N Val-Phe Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 GJNDXQBALKCYSZ-RYUDHWBXSA-N 0.000 description 12
- WPSXZFTVLIAPCN-UHFFFAOYSA-N Valyl-Cysteine Chemical compound CC(C)C(N)C(=O)NC(CS)C(O)=O WPSXZFTVLIAPCN-UHFFFAOYSA-N 0.000 description 12
- 108010060035 arginylproline Proteins 0.000 description 12
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 12
- 108010047857 aspartylglycine Proteins 0.000 description 12
- 230000002068 genetic Effects 0.000 description 12
- 108010000761 leucylarginine Proteins 0.000 description 12
- 239000003550 marker Substances 0.000 description 12
- 239000002773 nucleotide Substances 0.000 description 12
- 125000003729 nucleotide group Chemical group 0.000 description 12
- 108010053725 prolylvaline Proteins 0.000 description 12
- 108010020532 tyrosyl-proline Proteins 0.000 description 12
- 241000219195 Arabidopsis thaliana Species 0.000 description 11
- PMGDADKJMCOXHX-BQBZGAKWSA-N Arg-Gln Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O PMGDADKJMCOXHX-BQBZGAKWSA-N 0.000 description 11
- DWBZEJHQQIURML-IMJSIDKUSA-N Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O DWBZEJHQQIURML-IMJSIDKUSA-N 0.000 description 11
- JLXVRFDTDUGQEE-YFKPBYRVSA-N Gly-Arg Chemical compound NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N JLXVRFDTDUGQEE-YFKPBYRVSA-N 0.000 description 11
- SCCPDJAQCXWPTF-VKHMYHEASA-N Gly-Asp Chemical compound NCC(=O)N[C@H](C(O)=O)CC(O)=O SCCPDJAQCXWPTF-VKHMYHEASA-N 0.000 description 11
- VHOLZZKNEBBHTH-YUMQZZPRSA-N His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 VHOLZZKNEBBHTH-YUMQZZPRSA-N 0.000 description 11
- VLDVBZICYBVQHB-IUCAKERBSA-N His-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])CC1=CN=CN1 VLDVBZICYBVQHB-IUCAKERBSA-N 0.000 description 11
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 11
- 241000209510 Liliopsida Species 0.000 description 11
- QCZYYEFXOBKCNQ-STQMWFEESA-N Lys-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QCZYYEFXOBKCNQ-STQMWFEESA-N 0.000 description 11
- SHAQGFGGJSLLHE-BQBZGAKWSA-N Pro-Gln Chemical compound NC(=O)CC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 SHAQGFGGJSLLHE-BQBZGAKWSA-N 0.000 description 11
- WOUIMBGNEUWXQG-VKHMYHEASA-N Ser-Gly Chemical compound OC[C@H](N)C(=O)NCC(O)=O WOUIMBGNEUWXQG-VKHMYHEASA-N 0.000 description 11
- LWFWZRANSFAJDR-JSGCOSHPSA-N Trp-Val Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O)=CNC2=C1 LWFWZRANSFAJDR-JSGCOSHPSA-N 0.000 description 11
- PDSLRCZINIDLMU-QWRGUYRKSA-N Tyr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PDSLRCZINIDLMU-QWRGUYRKSA-N 0.000 description 11
- 230000003321 amplification Effects 0.000 description 11
- 108010008355 arginyl-glutamine Proteins 0.000 description 11
- 108010062796 arginyllysine Proteins 0.000 description 11
- 238000003199 nucleic acid amplification method Methods 0.000 description 11
- 210000001519 tissues Anatomy 0.000 description 11
- JQFZHHSQMKZLRU-IUCAKERBSA-N Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N JQFZHHSQMKZLRU-IUCAKERBSA-N 0.000 description 10
- SJUXYGVRSGTPMC-UHFFFAOYSA-N Asparaginyl-Alanine Chemical compound OC(=O)C(C)NC(=O)C(N)CC(N)=O SJUXYGVRSGTPMC-UHFFFAOYSA-N 0.000 description 10
- KOSRFJWDECSPRO-WDSKDSINSA-N Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O KOSRFJWDECSPRO-WDSKDSINSA-N 0.000 description 10
- ARPVSMCNIDAQBO-UHFFFAOYSA-N Glutaminyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CCC(N)=O ARPVSMCNIDAQBO-UHFFFAOYSA-N 0.000 description 10
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 10
- ADHNYKZHPOEULM-BQBZGAKWSA-N Met-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O ADHNYKZHPOEULM-BQBZGAKWSA-N 0.000 description 10
- WEDDFMCSUNNZJR-WDSKDSINSA-N Met-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O WEDDFMCSUNNZJR-WDSKDSINSA-N 0.000 description 10
- PESQCPHRXOFIPX-RYUDHWBXSA-N Met-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-RYUDHWBXSA-N 0.000 description 10
- YZMPDHTZJJCGEI-BQBZGAKWSA-N Ser-His Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 YZMPDHTZJJCGEI-BQBZGAKWSA-N 0.000 description 10
- PBUXMVYWOSKHMF-WDSKDSINSA-N Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CO PBUXMVYWOSKHMF-WDSKDSINSA-N 0.000 description 10
- LZLREEUGSYITMX-UHFFFAOYSA-N Serinyl-Tryptophan Chemical compound C1=CC=C2C(CC(NC(=O)C(CO)N)C(O)=O)=CNC2=C1 LZLREEUGSYITMX-UHFFFAOYSA-N 0.000 description 10
- 240000006394 Sorghum bicolor Species 0.000 description 10
- BIYXEUAFGLTAEM-WUJLRWPWSA-N Thr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(O)=O BIYXEUAFGLTAEM-WUJLRWPWSA-N 0.000 description 10
- NLKUJNGEGZDXGO-XVKPBYJWSA-N Tyr-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NLKUJNGEGZDXGO-XVKPBYJWSA-N 0.000 description 10
- 108010084389 glycyltryptophan Proteins 0.000 description 10
- 108010018006 histidylserine Proteins 0.000 description 10
- 108010056582 methionylglutamic acid Proteins 0.000 description 10
- 230000035772 mutation Effects 0.000 description 10
- 108010077112 prolyl-proline Proteins 0.000 description 10
- 108010071207 serylmethionine Proteins 0.000 description 10
- 241000589158 Agrobacterium Species 0.000 description 9
- JSLGXODUIAFWCF-UHFFFAOYSA-N Arginyl-Asparagine Chemical compound NC(N)=NCCCC(N)C(=O)NC(CC(N)=O)C(O)=O JSLGXODUIAFWCF-UHFFFAOYSA-N 0.000 description 9
- XZFYRXDAULDNFX-UHFFFAOYSA-N Cysteinyl-Phenylalanine Chemical compound SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 9
- 241000588724 Escherichia coli Species 0.000 description 9
- HHSJMSCOLJVTCX-UHFFFAOYSA-N Glutaminyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CCC(N)=O HHSJMSCOLJVTCX-UHFFFAOYSA-N 0.000 description 9
- IEFJWDNGDZAYNZ-BYPYZUCNSA-N Gly-Glu Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(O)=O IEFJWDNGDZAYNZ-BYPYZUCNSA-N 0.000 description 9
- MYTOTTSMVMWVJN-STQMWFEESA-N Lys-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 MYTOTTSMVMWVJN-STQMWFEESA-N 0.000 description 9
- KAKJTZWHIUWTTD-VQVTYTSYSA-N Met-Thr Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)O)C([O-])=O KAKJTZWHIUWTTD-VQVTYTSYSA-N 0.000 description 9
- 210000002706 Plastids Anatomy 0.000 description 9
- LAFKUZYWNCHOHT-WHFBIAKZSA-N Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O LAFKUZYWNCHOHT-WHFBIAKZSA-N 0.000 description 9
- SBMNPABNWKXNBJ-UHFFFAOYSA-N Serinyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CO SBMNPABNWKXNBJ-UHFFFAOYSA-N 0.000 description 9
- DSGIVWSDDRDJIO-ZXXMMSQZSA-N Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DSGIVWSDDRDJIO-ZXXMMSQZSA-N 0.000 description 9
- DZHDVYLBNKMLMB-ZFWWWQNUSA-N Trp-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 DZHDVYLBNKMLMB-ZFWWWQNUSA-N 0.000 description 9
- CGWAPUBOXJWXMS-HOTGVXAUSA-N Tyr-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 CGWAPUBOXJWXMS-HOTGVXAUSA-N 0.000 description 9
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 9
- 108010079547 glutamylmethionine Proteins 0.000 description 9
- 108010020688 glycylhistidine Proteins 0.000 description 9
- 230000001939 inductive effect Effects 0.000 description 9
- 150000007523 nucleic acids Chemical class 0.000 description 9
- LZDNBBYBDGBADK-KBPBESRZSA-N (2S)-2-[[(2S)-2-amino-3-methylbutanoyl]amino]-3-(1H-indol-3-yl)propanoic acid Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-KBPBESRZSA-N 0.000 description 8
- IPWKGIFRRBGCJO-IMJSIDKUSA-N Ala-Ser Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](CO)C([O-])=O IPWKGIFRRBGCJO-IMJSIDKUSA-N 0.000 description 8
- OMLWNBVRVJYMBQ-YUMQZZPRSA-N Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OMLWNBVRVJYMBQ-YUMQZZPRSA-N 0.000 description 8
- IIFDPDVJAHQFSR-WHFBIAKZSA-N Asn-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O IIFDPDVJAHQFSR-WHFBIAKZSA-N 0.000 description 8
- 102100004985 GUSB Human genes 0.000 description 8
- YBAFDPFAUTYYRW-YUMQZZPRSA-N Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O YBAFDPFAUTYYRW-YUMQZZPRSA-N 0.000 description 8
- YBTCBQBIJKGSJP-BQBZGAKWSA-N Glu-Pro Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O YBTCBQBIJKGSJP-BQBZGAKWSA-N 0.000 description 8
- 108010060309 Glucuronidase Proteins 0.000 description 8
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 8
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 8
- OAPNERBWQWUPTI-YUMQZZPRSA-N Lys-Gln Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O OAPNERBWQWUPTI-YUMQZZPRSA-N 0.000 description 8
- JPNRPAJITHRXRH-UHFFFAOYSA-N Lysyl-Asparagine Chemical compound NCCCCC(N)C(=O)NC(C(O)=O)CC(N)=O JPNRPAJITHRXRH-UHFFFAOYSA-N 0.000 description 8
- 108010066427 N-valyltryptophan Proteins 0.000 description 8
- 240000007594 Oryza sativa Species 0.000 description 8
- 235000007164 Oryza sativa Nutrition 0.000 description 8
- HWMGTNOVUDIKRE-UWVGGRQHSA-N Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 HWMGTNOVUDIKRE-UWVGGRQHSA-N 0.000 description 8
- WXVIGTAUZBUDPZ-DTLFHODZSA-N Thr-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 WXVIGTAUZBUDPZ-DTLFHODZSA-N 0.000 description 8
- 235000016383 Zea mays subsp huehuetenangensis Nutrition 0.000 description 8
- 238000006243 chemical reaction Methods 0.000 description 8
- 241001233957 eudicotyledons Species 0.000 description 8
- 108010087823 glycyltyrosine Proteins 0.000 description 8
- 108010028295 histidylhistidine Proteins 0.000 description 8
- 239000003999 initiator Substances 0.000 description 8
- 238000002955 isolation Methods 0.000 description 8
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 8
- 235000009973 maize Nutrition 0.000 description 8
- 108020004707 nucleic acids Proteins 0.000 description 8
- 108010072637 phenylalanyl-arginyl-phenylalanine Proteins 0.000 description 8
- 235000009566 rice Nutrition 0.000 description 8
- AAKRWBIIGKPOKQ-ONGXEEELSA-N 2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-methylbutanoyl]amino]acetic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 7
- XUUXCWCKKCZEAW-YFKPBYRVSA-N 2-[[(2S)-2-amino-5-(diaminomethylideneamino)pentanoyl]amino]acetic acid Chemical compound OC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N XUUXCWCKKCZEAW-YFKPBYRVSA-N 0.000 description 7
- TUTIHHSZKFBMHM-UHFFFAOYSA-N 4-amino-5-[(3-amino-1-carboxy-3-oxopropyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(CC(N)=O)C(O)=O TUTIHHSZKFBMHM-UHFFFAOYSA-N 0.000 description 7
- MPZWMIIOPAPAKE-UHFFFAOYSA-N 4-amino-5-[[1-carboxy-4-(diaminomethylideneamino)butyl]amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CCCN=C(N)N MPZWMIIOPAPAKE-UHFFFAOYSA-N 0.000 description 7
- SIFXMYAHXJGAFC-WDSKDSINSA-N Arg-Asp Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O SIFXMYAHXJGAFC-WDSKDSINSA-N 0.000 description 7
- QADCERNTBWTXFV-JSGCOSHPSA-N Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCNC(N)=N)N)C(O)=O)=CNC2=C1 QADCERNTBWTXFV-JSGCOSHPSA-N 0.000 description 7
- RJUHZPRQRQLCFL-IMJSIDKUSA-N Asn-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O RJUHZPRQRQLCFL-IMJSIDKUSA-N 0.000 description 7
- KLKHFFMNGWULBN-VKHMYHEASA-N Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)NCC(O)=O KLKHFFMNGWULBN-VKHMYHEASA-N 0.000 description 7
- KWBQPGIYEZKDEG-FSPLSTOPSA-N Asn-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O KWBQPGIYEZKDEG-FSPLSTOPSA-N 0.000 description 7
- PSZNHSNIGMJYOZ-WDSKDSINSA-N Asp-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PSZNHSNIGMJYOZ-WDSKDSINSA-N 0.000 description 7
- GSMPSRPMQQDRIB-WHFBIAKZSA-N Asp-Gln Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O GSMPSRPMQQDRIB-WHFBIAKZSA-N 0.000 description 7
- CLSDNFWKGFJIBZ-UHFFFAOYSA-N Glutaminyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CCC(N)=O CLSDNFWKGFJIBZ-UHFFFAOYSA-N 0.000 description 7
- MRVYVEQPNDSWLH-UHFFFAOYSA-N Glutaminyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)CCC(N)=O MRVYVEQPNDSWLH-UHFFFAOYSA-N 0.000 description 7
- LYCVKHSJGDMDLM-LURJTMIESA-N His-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 LYCVKHSJGDMDLM-LURJTMIESA-N 0.000 description 7
- WRPDZHJNLYNFFT-UHFFFAOYSA-N Histidinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC1=CN=CN1 WRPDZHJNLYNFFT-UHFFFAOYSA-N 0.000 description 7
- YSZNURNVYFUEHC-BQBZGAKWSA-N Lys-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O YSZNURNVYFUEHC-BQBZGAKWSA-N 0.000 description 7
- IGRMTQMIDNDFAA-UHFFFAOYSA-N Lysyl-Histidine Chemical compound NCCCCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 IGRMTQMIDNDFAA-UHFFFAOYSA-N 0.000 description 7
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 7
- ROHDXJUFQVRDAV-UWVGGRQHSA-N Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ROHDXJUFQVRDAV-UWVGGRQHSA-N 0.000 description 7
- FADYJNXDPBKVCA-UHFFFAOYSA-N Phenylalanyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 7
- 235000011684 Sorghum saccharatum Nutrition 0.000 description 7
- BECPPKYKPSRKCP-ZDLURKLDSA-N Thr-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O BECPPKYKPSRKCP-ZDLURKLDSA-N 0.000 description 7
- KAFKKRJQHOECGW-JCOFBHIZSA-N Thr-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(O)=O)=CNC2=C1 KAFKKRJQHOECGW-JCOFBHIZSA-N 0.000 description 7
- WCRFXRIWBFRZBR-GGVZMXCHSA-N Thr-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WCRFXRIWBFRZBR-GGVZMXCHSA-N 0.000 description 7
- PWIQCLSQVQBOQV-AAEUAGOBSA-N Trp-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 PWIQCLSQVQBOQV-AAEUAGOBSA-N 0.000 description 7
- YBRHKUNWEYBZGT-UHFFFAOYSA-N Tryptophyl-Threonine Chemical compound C1=CC=C2C(CC(N)C(=O)NC(C(O)C)C(O)=O)=CNC2=C1 YBRHKUNWEYBZGT-UHFFFAOYSA-N 0.000 description 7
- HPYDSVWYXXKHRD-VIFPVBQESA-N Tyr-Gly Chemical compound [O-]C(=O)CNC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 HPYDSVWYXXKHRD-VIFPVBQESA-N 0.000 description 7
- 238000004458 analytical method Methods 0.000 description 7
- 108010077245 asparaginyl-proline Proteins 0.000 description 7
- 108010092854 aspartyllysine Proteins 0.000 description 7
- 238000010494 dissociation reaction Methods 0.000 description 7
- 230000005593 dissociations Effects 0.000 description 7
- 238000003780 insertion Methods 0.000 description 7
- 108010012058 leucyltyrosine Proteins 0.000 description 7
- QLROSWPKSBORFJ-BQBZGAKWSA-N Pro-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 QLROSWPKSBORFJ-BQBZGAKWSA-N 0.000 description 7
- 108010004914 prolylarginine Proteins 0.000 description 7
- 239000000126 substance Substances 0.000 description 7
- 230000001629 suppression Effects 0.000 description 7
- VNYDHJARLHNEGA-RYUDHWBXSA-N (2S)-1-[(2S)-2-azaniumyl-3-(4-hydroxyphenyl)propanoyl]pyrrolidine-2-carboxylate Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=C(O)C=C1 VNYDHJARLHNEGA-RYUDHWBXSA-N 0.000 description 6
- GJSURZIOUXUGAL-UHFFFAOYSA-N 2-((2,6-Dichlorophenyl)imino)imidazolidine Chemical compound ClC1=CC=CC(Cl)=C1NC1=NCCN1 GJSURZIOUXUGAL-UHFFFAOYSA-N 0.000 description 6
- WPWUFUBLGADILS-WDSKDSINSA-N Ala-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WPWUFUBLGADILS-WDSKDSINSA-N 0.000 description 6
- 108010037365 Arabidopsis Proteins Proteins 0.000 description 6
- ROWCTNFEMKOIFQ-YUMQZZPRSA-N Arg-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N ROWCTNFEMKOIFQ-YUMQZZPRSA-N 0.000 description 6
- QJMCHPGWFZZRID-UHFFFAOYSA-N Asparaginyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CC(N)=O QJMCHPGWFZZRID-UHFFFAOYSA-N 0.000 description 6
- IQTUDDBANZYMAR-UHFFFAOYSA-N Asparaginyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CC(N)=O IQTUDDBANZYMAR-UHFFFAOYSA-N 0.000 description 6
- JZDHUJAFXGNDSB-WHFBIAKZSA-N Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O JZDHUJAFXGNDSB-WHFBIAKZSA-N 0.000 description 6
- LSPKYLAFTPBWIL-BYPYZUCNSA-N Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(O)=O LSPKYLAFTPBWIL-BYPYZUCNSA-N 0.000 description 6
- XBGGUPMXALFZOT-VIFPVBQESA-N Gly-Tyr Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-VIFPVBQESA-N 0.000 description 6
- WSDOHRLQDGAOGU-UHFFFAOYSA-N Histidinyl-Asparagine Chemical compound NC(=O)CC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 WSDOHRLQDGAOGU-UHFFFAOYSA-N 0.000 description 6
- SBUJHOSQTJFQJX-NOAMYHISSA-N Kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 6
- XBZOQGHZGQLEQO-IUCAKERBSA-N Lys-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN XBZOQGHZGQLEQO-IUCAKERBSA-N 0.000 description 6
- AIXUQKMMBQJZCU-IUCAKERBSA-N Lys-Pro Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O AIXUQKMMBQJZCU-IUCAKERBSA-N 0.000 description 6
- JHKXZYLNVJRAAJ-WDSKDSINSA-N Met-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(O)=O JHKXZYLNVJRAAJ-WDSKDSINSA-N 0.000 description 6
- JQOHKCDMINQZRV-WDSKDSINSA-N Pro-Asn Chemical compound NC(=O)C[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 JQOHKCDMINQZRV-WDSKDSINSA-N 0.000 description 6
- FFOKMZOAVHEWET-UHFFFAOYSA-N Serinyl-Cysteine Chemical compound OCC(N)C(=O)NC(CS)C(O)=O FFOKMZOAVHEWET-UHFFFAOYSA-N 0.000 description 6
- VPZKQTYZIVOJDV-LMVFSUKVSA-N Thr-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(O)=O VPZKQTYZIVOJDV-LMVFSUKVSA-N 0.000 description 6
- IOWJRKAVLALBQB-IWGUZYHVSA-N Thr-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O IOWJRKAVLALBQB-IWGUZYHVSA-N 0.000 description 6
- 241000209146 Triticum sp. Species 0.000 description 6
- 239000002253 acid Substances 0.000 description 6
- 150000007513 acids Chemical class 0.000 description 6
- 108010070944 alanylhistidine Proteins 0.000 description 6
- 108010087924 alanylproline Proteins 0.000 description 6
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 6
- 230000003115 biocidal Effects 0.000 description 6
- 230000002363 herbicidal Effects 0.000 description 6
- 239000004009 herbicide Substances 0.000 description 6
- 229960000318 kanamycin Drugs 0.000 description 6
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 6
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 6
- 108010065135 phenylalanyl-phenylalanyl-phenylalanine Proteins 0.000 description 6
- 108010080629 tryptophan-leucine Proteins 0.000 description 6
- RFCVXVPWSPOMFJ-UHFFFAOYSA-N 2-[(2-azaniumyl-3-phenylpropanoyl)amino]-4-methylpentanoate Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 RFCVXVPWSPOMFJ-UHFFFAOYSA-N 0.000 description 5
- XMBSYZWANAQXEV-UHFFFAOYSA-N 4-amino-5-[(1-carboxy-2-phenylethyl)amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 5
- 102000007469 Actins Human genes 0.000 description 5
- 108010085238 Actins Proteins 0.000 description 5
- XZWXFWBHYRFLEF-FSPLSTOPSA-N Ala-His Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 XZWXFWBHYRFLEF-FSPLSTOPSA-N 0.000 description 5
- BUQICHWNXBIBOG-LMVFSUKVSA-N Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)N BUQICHWNXBIBOG-LMVFSUKVSA-N 0.000 description 5
- HZYFHQOWCFUSOV-IMJSIDKUSA-N Asn-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O HZYFHQOWCFUSOV-IMJSIDKUSA-N 0.000 description 5
- OAMLVOVXNKILLQ-BQBZGAKWSA-N Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O OAMLVOVXNKILLQ-BQBZGAKWSA-N 0.000 description 5
- OMSMPWHEGLNQOD-UHFFFAOYSA-N Asparaginyl-Phenylalanine Chemical compound NC(=O)CC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 OMSMPWHEGLNQOD-UHFFFAOYSA-N 0.000 description 5
- VBKIFHUVGLOJKT-UHFFFAOYSA-N Asparaginyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CC(N)=O VBKIFHUVGLOJKT-UHFFFAOYSA-N 0.000 description 5
- RGGVDKVXLBOLNS-UHFFFAOYSA-N Asparaginyl-Tryptophan Chemical compound C1=CC=C2C(CC(NC(=O)C(CC(N)=O)N)C(O)=O)=CNC2=C1 RGGVDKVXLBOLNS-UHFFFAOYSA-N 0.000 description 5
- SXGAGTVDWKQYCX-BQBZGAKWSA-N Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SXGAGTVDWKQYCX-BQBZGAKWSA-N 0.000 description 5
- PFMUCCYYAAFKTH-YFKPBYRVSA-N Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CN PFMUCCYYAAFKTH-YFKPBYRVSA-N 0.000 description 5
- KRBMQYPTDYSENE-BQBZGAKWSA-N His-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 KRBMQYPTDYSENE-BQBZGAKWSA-N 0.000 description 5
- 108020004391 Introns Proteins 0.000 description 5
- LHSGPCFBGJHPCY-STQMWFEESA-N Leu-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-STQMWFEESA-N 0.000 description 5
- CIOWSLJGLSUOME-BQBZGAKWSA-N Lys-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O CIOWSLJGLSUOME-BQBZGAKWSA-N 0.000 description 5
- 210000003463 Organelles Anatomy 0.000 description 5
- KNPVDQMEHSCAGX-UHFFFAOYSA-N Phenylalanyl-Cysteine Chemical compound SCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 KNPVDQMEHSCAGX-UHFFFAOYSA-N 0.000 description 5
- LTFSLKWFMWZEBD-IMJSIDKUSA-N Ser-Asn Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O LTFSLKWFMWZEBD-IMJSIDKUSA-N 0.000 description 5
- VBKBDLMWICBSCY-IMJSIDKUSA-N Ser-Asp Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O VBKBDLMWICBSCY-IMJSIDKUSA-N 0.000 description 5
- SMDQRGAERNMJJF-UHFFFAOYSA-N Tryptophyl-Cysteine Chemical compound C1=CC=C2C(CC(N)C(=O)NC(CS)C(O)=O)=CNC2=C1 SMDQRGAERNMJJF-UHFFFAOYSA-N 0.000 description 5
- GIAZPLMMQOERPN-YUMQZZPRSA-N Val-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GIAZPLMMQOERPN-YUMQZZPRSA-N 0.000 description 5
- 201000009910 diseases by infectious agent Diseases 0.000 description 5
- 210000000056 organs Anatomy 0.000 description 5
- 108010024607 phenylalanylalanine Proteins 0.000 description 5
- 108010073101 phenylalanylleucine Proteins 0.000 description 5
- 229920001184 polypeptide Polymers 0.000 description 5
- 210000001938 protoplasts Anatomy 0.000 description 5
- 230000028070 sporulation Effects 0.000 description 5
- QXRNAOYBCYVZCD-BQBZGAKWSA-N (2S)-6-amino-2-[[(2S)-2-aminopropanoyl]amino]hexanoic acid Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN QXRNAOYBCYVZCD-BQBZGAKWSA-N 0.000 description 4
- HKTRDWYCAUTRRL-UHFFFAOYSA-N 4-amino-5-[[1-carboxy-2-(1H-imidazol-5-yl)ethyl]amino]-5-oxopentanoic acid Chemical compound OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 HKTRDWYCAUTRRL-UHFFFAOYSA-N 0.000 description 4
- 108020005544 Antisense RNA Proteins 0.000 description 4
- QCWJKJLNCFEVPQ-WHFBIAKZSA-N Asn-Gln Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O QCWJKJLNCFEVPQ-WHFBIAKZSA-N 0.000 description 4
- CPMKYMGGYUFOHS-FSPLSTOPSA-N Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O CPMKYMGGYUFOHS-FSPLSTOPSA-N 0.000 description 4
- HXWUJJADFMXNKA-UHFFFAOYSA-N Asparaginyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CC(N)=O HXWUJJADFMXNKA-UHFFFAOYSA-N 0.000 description 4
- VGRHZPNRCLAHQA-UHFFFAOYSA-N Aspartyl-Asparagine Chemical compound OC(=O)CC(N)C(=O)NC(CC(N)=O)C(O)=O VGRHZPNRCLAHQA-UHFFFAOYSA-N 0.000 description 4
- 108020004705 Codon Proteins 0.000 description 4
- ZSRSLWKGWFFVCM-WDSKDSINSA-N Cys-Pro Chemical compound SC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O ZSRSLWKGWFFVCM-WDSKDSINSA-N 0.000 description 4
- HAYVTMHUNMMXCV-UHFFFAOYSA-N Cysteinyl-Alanine Chemical compound OC(=O)C(C)NC(=O)C(N)CS HAYVTMHUNMMXCV-UHFFFAOYSA-N 0.000 description 4
- WXOFKRKAHJQKLT-UHFFFAOYSA-N Cysteinyl-Lysine Chemical compound NCCCCC(C(O)=O)NC(=O)C(N)CS WXOFKRKAHJQKLT-UHFFFAOYSA-N 0.000 description 4
- WYVKPHCYMTWUCW-UHFFFAOYSA-N Cysteinyl-Threonine Chemical compound CC(O)C(C(O)=O)NC(=O)C(N)CS WYVKPHCYMTWUCW-UHFFFAOYSA-N 0.000 description 4
- OELDIVRKHTYFNG-UHFFFAOYSA-N Cysteinyl-Valine Chemical compound CC(C)C(C(O)=O)NC(=O)C(N)CS OELDIVRKHTYFNG-UHFFFAOYSA-N 0.000 description 4
- 241000233866 Fungi Species 0.000 description 4
- OWOFCNWTMWOOJJ-WDSKDSINSA-N Gln-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O OWOFCNWTMWOOJJ-WDSKDSINSA-N 0.000 description 4
- JEFZIKRIDLHOIF-BYPYZUCNSA-N Gln-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(O)=O JEFZIKRIDLHOIF-BYPYZUCNSA-N 0.000 description 4
- AJHCSUXXECOXOY-NSHDSACASA-N Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-NSHDSACASA-N 0.000 description 4
- 125000000998 L-alanino group Chemical group [H]N([*])[C@](C([H])([H])[H])([H])C(=O)O[H] 0.000 description 4
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 4
- DZMGFGQBRYWJOR-YUMQZZPRSA-N Met-Pro Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O DZMGFGQBRYWJOR-YUMQZZPRSA-N 0.000 description 4
- 210000003470 Mitochondria Anatomy 0.000 description 4
- MIDZLCFIAINOQN-WPRPVWTQSA-N Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 MIDZLCFIAINOQN-WPRPVWTQSA-N 0.000 description 4
- OZILORBBPKKGRI-RYUDHWBXSA-N Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 OZILORBBPKKGRI-RYUDHWBXSA-N 0.000 description 4
- BXNGIHFNNNSEOS-UWVGGRQHSA-N Phe-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 BXNGIHFNNNSEOS-UWVGGRQHSA-N 0.000 description 4
- IWIANZLCJVYEFX-RYUDHWBXSA-N Pro-Phe Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 IWIANZLCJVYEFX-RYUDHWBXSA-N 0.000 description 4
- HXNYBZQLBWIADP-UHFFFAOYSA-N Prolyl-Cysteine Chemical compound OC(=O)C(CS)NC(=O)C1CCCN1 HXNYBZQLBWIADP-UHFFFAOYSA-N 0.000 description 4
- 240000000111 Saccharum officinarum Species 0.000 description 4
- 235000007201 Saccharum officinarum Nutrition 0.000 description 4
- 235000007238 Secale cereale Nutrition 0.000 description 4
- 240000002057 Secale cereale Species 0.000 description 4
- APIDTRXFGYOLLH-VQVTYTSYSA-N Thr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)[C@@H](C)O APIDTRXFGYOLLH-VQVTYTSYSA-N 0.000 description 4
- KBUBZAMBIVEFEI-UHFFFAOYSA-N Tryptophyl-Histidine Chemical compound C=1NC2=CC=CC=C2C=1CC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 KBUBZAMBIVEFEI-UHFFFAOYSA-N 0.000 description 4
- ONWMQORSVZYVNH-UHFFFAOYSA-N Tyrosyl-Asparagine Chemical compound NC(=O)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 ONWMQORSVZYVNH-UHFFFAOYSA-N 0.000 description 4
- 230000004075 alteration Effects 0.000 description 4
- 229920002847 antisense RNA Polymers 0.000 description 4
- 230000033228 biological regulation Effects 0.000 description 4
- 230000001413 cellular Effects 0.000 description 4
- 239000003184 complementary RNA Substances 0.000 description 4
- 108010060199 cysteinylproline Proteins 0.000 description 4
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 4
- 108010010147 glycylglutamine Proteins 0.000 description 4
- YMAWOPBAYDPSLA-UHFFFAOYSA-N glycylglycine zwitterion Chemical compound [NH3+]CC(=O)NCC([O-])=O YMAWOPBAYDPSLA-UHFFFAOYSA-N 0.000 description 4
- 108010085325 histidylproline Proteins 0.000 description 4
- 238000000338 in vitro Methods 0.000 description 4
- 108010044655 lysylproline Proteins 0.000 description 4
- 230000001404 mediated Effects 0.000 description 4
- 108010018625 phenylalanylarginine Proteins 0.000 description 4
- 108010070643 prolylglutamic acid Proteins 0.000 description 4
- 230000002829 reduced Effects 0.000 description 4
- 239000000523 sample Substances 0.000 description 4
- 238000000926 separation method Methods 0.000 description 4
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 4
- 108010038745 tryptophylglycine Proteins 0.000 description 4
- 108010044292 tryptophyltyrosine Proteins 0.000 description 4
- LQJAALCCPOTJGB-YUMQZZPRSA-N (2S)-1-[(2S)-2-amino-5-(diaminomethylideneamino)pentanoyl]pyrrolidine-2-carboxylic acid Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O LQJAALCCPOTJGB-YUMQZZPRSA-N 0.000 description 3
- RXGLHDWAZQECBI-SRVKXCTJSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 3
- IAJFFZORSWOZPQ-SRVKXCTJSA-N (2S)-4-amino-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 3
- 108020005345 3' Untranslated Regions Proteins 0.000 description 3
- UNFWWIHTNXNPBV-WXKVUWSESA-N Actinospectacin Chemical compound O([C@@H]1[C@@H](NC)[C@@H](O)[C@H]([C@@H]([C@H]1O1)O)NC)[C@]2(O)[C@H]1O[C@H](C)CC2=O UNFWWIHTNXNPBV-WXKVUWSESA-N 0.000 description 3
- 241000844498 Agatea Species 0.000 description 3
- 241000724328 Alfalfa mosaic virus Species 0.000 description 3
- OSASDIVHOSJVII-UHFFFAOYSA-N Arginyl-Cysteine Chemical compound SCC(C(O)=O)NC(=O)C(N)CCCNC(N)=N OSASDIVHOSJVII-UHFFFAOYSA-N 0.000 description 3
- NPDLYUOYAGBHFB-UHFFFAOYSA-N Asparaginyl-Arginine Chemical compound NC(=O)CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N NPDLYUOYAGBHFB-UHFFFAOYSA-N 0.000 description 3
- 235000007319 Avena orientalis Nutrition 0.000 description 3
- 244000075850 Avena orientalis Species 0.000 description 3
- 210000002421 Cell Wall Anatomy 0.000 description 3
- NXTYATMDWQYLGJ-UHFFFAOYSA-N Cysteinyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CS NXTYATMDWQYLGJ-UHFFFAOYSA-N 0.000 description 3
- 108010020183 EC 2.5.1.19 Proteins 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- 229920002760 Expressed sequence tag Polymers 0.000 description 3
- 101710026072 GS2 Proteins 0.000 description 3
- IAJOBQBIJHVGMQ-UHFFFAOYSA-N Glufosinate Chemical compound CP(O)(=O)CCC(N)C(O)=O IAJOBQBIJHVGMQ-UHFFFAOYSA-N 0.000 description 3
- FUESBOMYALLFNI-VKHMYHEASA-N Gly-Asn Chemical compound NCC(=O)N[C@H](C(O)=O)CC(N)=O FUESBOMYALLFNI-VKHMYHEASA-N 0.000 description 3
- FBTYOQIYBULKEH-ZFWWWQNUSA-N His-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CNC=N1 FBTYOQIYBULKEH-ZFWWWQNUSA-N 0.000 description 3
- FBOZXECLQNJBKD-ZDUSSCGKSA-N L-methotrexate Chemical compound C=1N=C2N=C(N)N=C(N)C2=NC=1CN(C)C1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 FBOZXECLQNJBKD-ZDUSSCGKSA-N 0.000 description 3
- 241000710118 Maize chlorotic mottle virus Species 0.000 description 3
- PBOUVYGPDSARIS-IUCAKERBSA-N Met-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(C)C PBOUVYGPDSARIS-IUCAKERBSA-N 0.000 description 3
- 241000208125 Nicotiana Species 0.000 description 3
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 3
- PYOHODCEOHCZBM-RYUDHWBXSA-N Phe-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 PYOHODCEOHCZBM-RYUDHWBXSA-N 0.000 description 3
- BEPSGCXDIVACBU-UHFFFAOYSA-N Prolyl-Histidine Chemical compound C1CCNC1C(=O)NC(C(=O)O)CC1=CN=CN1 BEPSGCXDIVACBU-UHFFFAOYSA-N 0.000 description 3
- 108020004412 RNA 3' Polyadenylation Signals Proteins 0.000 description 3
- 240000001016 Solanum tuberosum Species 0.000 description 3
- 235000002595 Solanum tuberosum Nutrition 0.000 description 3
- HYLXOQURIOCKIH-VQVTYTSYSA-N Thr-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N HYLXOQURIOCKIH-VQVTYTSYSA-N 0.000 description 3
- 229920000401 Three prime untranslated region Polymers 0.000 description 3
- 241000723873 Tobacco mosaic virus Species 0.000 description 3
- MYVYPSWUSKCCHG-JQWIXIFHSA-N Trp-Ser Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 MYVYPSWUSKCCHG-JQWIXIFHSA-N 0.000 description 3
- 101700014863 UBQ3 Proteins 0.000 description 3
- IBIDRSSEHFLGSD-YUMQZZPRSA-N Val-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-YUMQZZPRSA-N 0.000 description 3
- WITCOKQIPFWQQD-FSPLSTOPSA-N Val-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O WITCOKQIPFWQQD-FSPLSTOPSA-N 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- 239000003242 anti bacterial agent Substances 0.000 description 3
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 3
- 108091006028 chimera Proteins 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000018109 developmental process Effects 0.000 description 3
- 230000029087 digestion Effects 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 235000013305 food Nutrition 0.000 description 3
- 108010033719 glycyl-histidyl-glycine Proteins 0.000 description 3
- 108010077515 glycylproline Proteins 0.000 description 3
- 230000012010 growth Effects 0.000 description 3
- 230000028644 hyphal growth Effects 0.000 description 3
- 238000011081 inoculation Methods 0.000 description 3
- 229960000485 methotrexate Drugs 0.000 description 3
- 244000005700 microbiome Species 0.000 description 3
- 235000019713 millet Nutrition 0.000 description 3
- 108010058731 nopaline synthase Proteins 0.000 description 3
- 238000005457 optimization Methods 0.000 description 3
- 230000003032 phytopathogenic Effects 0.000 description 3
- 230000001402 polyadenylating Effects 0.000 description 3
- 238000011084 recovery Methods 0.000 description 3
- 229960000268 spectinomycin Drugs 0.000 description 3
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 3
- 238000011144 upstream manufacturing Methods 0.000 description 3
- 230000003612 virological Effects 0.000 description 3
- UKKNTTCNGZLJEX-UHFFFAOYSA-N γ-glutamyl-Serine Chemical compound NC(=O)CCC(N)C(=O)NC(CO)C(O)=O UKKNTTCNGZLJEX-UHFFFAOYSA-N 0.000 description 3
- FAQVCWVVIYYWRR-WHFBIAKZSA-N (2S)-2-[[(2S)-2,5-diamino-5-oxopentanoyl]amino]propanoic acid Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O FAQVCWVVIYYWRR-WHFBIAKZSA-N 0.000 description 2
- VKVDRTGWLVZJOM-DCAQKATOSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-3-methylbutanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 2
- KYPMKDGKAYQCHO-RYUDHWBXSA-N (2S)-2-[[(2S)-2-amino-3-(4-hydroxyphenyl)propanoyl]amino]-4-methylsulfanylbutanoic acid Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 KYPMKDGKAYQCHO-RYUDHWBXSA-N 0.000 description 2
- NTBFKPBULZGXQL-KKUMJFAQSA-N (3S)-4-[[(1S)-1-carboxy-2-(4-hydroxyphenyl)ethyl]amino]-3-[[(2S)-2,6-diaminohexanoyl]amino]-4-oxobutanoic acid Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTBFKPBULZGXQL-KKUMJFAQSA-N 0.000 description 2
- UCSJYZPVAKXKNQ-HZYVHMACSA-N 1-[(1S,2R,3R,4S,5R,6R)-3-carbamimidamido-6-{[(2R,3R,4R,5S)-3-{[(2S,3S,4S,5R,6S)-4,5-dihydroxy-6-(hydroxymethyl)-3-(methylamino)oxan-2-yl]oxy}-4-formyl-4-hydroxy-5-methyloxolan-2-yl]oxy}-2,4,5-trihydroxycyclohexyl]guanidine Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 2
- YOKVEHGYYQEQOP-QWRGUYRKSA-N 2-[[(2S)-2-[[(2S)-2-azaniumyl-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]acetate Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 2
- VWHGTYCRDRBSFI-ZETCQYMHSA-N 2-[[2-[[(2S)-2-amino-4-methylpentanoyl]amino]acetyl]amino]acetic acid Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 2
- 108020003589 5' Untranslated Regions Proteins 0.000 description 2
- CCUAQNUWXLYFRA-IMJSIDKUSA-N Ala-Asn Chemical compound C[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC(N)=O CCUAQNUWXLYFRA-IMJSIDKUSA-N 0.000 description 2
- XAEWTDMGFGHWFK-IMJSIDKUSA-N Ala-Asp Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O XAEWTDMGFGHWFK-IMJSIDKUSA-N 0.000 description 2
- AVKUERGKIZMTKX-NJBDSQKTSA-N Ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 2
- WYBVBIHNJWOLCJ-IUCAKERBSA-N Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N WYBVBIHNJWOLCJ-IUCAKERBSA-N 0.000 description 2
- PQBHGSGQZSOLIR-RYUDHWBXSA-N Arg-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PQBHGSGQZSOLIR-RYUDHWBXSA-N 0.000 description 2
- GADKFYNESXNRLC-WDSKDSINSA-N Asn-Pro Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GADKFYNESXNRLC-WDSKDSINSA-N 0.000 description 2
- DVUFTQLHHHJEMK-IMJSIDKUSA-N Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O DVUFTQLHHHJEMK-IMJSIDKUSA-N 0.000 description 2
- 240000001498 Asparagus officinalis Species 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- 241000219310 Beta vulgaris subsp. vulgaris Species 0.000 description 2
- 240000002791 Brassica napus Species 0.000 description 2
- 235000011299 Brassica oleracea var botrytis Nutrition 0.000 description 2
- 240000003259 Brassica oleracea var. botrytis Species 0.000 description 2
- 101700011445 CENPJ Proteins 0.000 description 2
- 102100013052 CENPJ Human genes 0.000 description 2
- 108090000994 Catalytic RNA Proteins 0.000 description 2
- 229920000453 Consensus sequence Polymers 0.000 description 2
- 102000005135 EC 1.3.3.4 Human genes 0.000 description 2
- 108020001991 EC 1.3.3.4 Proteins 0.000 description 2
- 108010000700 EC 2.2.1.6 Proteins 0.000 description 2
- 210000002472 Endoplasmic Reticulum Anatomy 0.000 description 2
- XIPZDANNDPMZGQ-UHFFFAOYSA-N Glutaminyl-Cysteine Chemical compound NC(=O)CCC(N)C(=O)NC(CS)C(O)=O XIPZDANNDPMZGQ-UHFFFAOYSA-N 0.000 description 2
- JZOYFBPIEHCDFV-UHFFFAOYSA-N Glutaminyl-Histidine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CN=CN1 JZOYFBPIEHCDFV-UHFFFAOYSA-N 0.000 description 2
- MFBYPDKTAJXHNI-VKHMYHEASA-N Gly-Cys Chemical compound [NH3+]CC(=O)N[C@@H](CS)C([O-])=O MFBYPDKTAJXHNI-VKHMYHEASA-N 0.000 description 2
- PNMUAGGSDZXTHX-BYPYZUCNSA-N Gly-Gln Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(N)=O PNMUAGGSDZXTHX-BYPYZUCNSA-N 0.000 description 2
- 239000005562 Glyphosate Substances 0.000 description 2
- XDDAORKBJWWYJS-UHFFFAOYSA-O Glyphosate Chemical compound OC(=O)C[NH2+]CP(O)(O)=O XDDAORKBJWWYJS-UHFFFAOYSA-O 0.000 description 2
- 240000006669 Helianthus annuus Species 0.000 description 2
- 235000003222 Helianthus annuus Nutrition 0.000 description 2
- 241000238631 Hexapoda Species 0.000 description 2
- MDCTVRUPVLZSPG-BQBZGAKWSA-N His-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CNC=N1 MDCTVRUPVLZSPG-BQBZGAKWSA-N 0.000 description 2
- LNCFUHAPNTYMJB-IUCAKERBSA-N His-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CN=CN1 LNCFUHAPNTYMJB-IUCAKERBSA-N 0.000 description 2
- MAJYPBAJPNUFPV-UHFFFAOYSA-N Histidinyl-Cysteine Chemical compound SCC(C(O)=O)NC(=O)C(N)CC1=CN=CN1 MAJYPBAJPNUFPV-UHFFFAOYSA-N 0.000 description 2
- JXNRXNCCROJZFB-RYUDHWBXSA-N L-tyrosyl-L-arginine Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JXNRXNCCROJZFB-RYUDHWBXSA-N 0.000 description 2
- 229920001320 Leader sequence (mRNA) Polymers 0.000 description 2
- DVCSNHXRZUVYAM-BQBZGAKWSA-N Leu-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O DVCSNHXRZUVYAM-BQBZGAKWSA-N 0.000 description 2
- 102000020347 Mannose-6-Phosphate Isomerase Human genes 0.000 description 2
- 108091022068 Mannose-6-Phosphate Isomerase Proteins 0.000 description 2
- HGCNKOLVKRAVHD-RYUDHWBXSA-N Met-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-RYUDHWBXSA-N 0.000 description 2
- BJFJQOMZCSHBMY-YUMQZZPRSA-N Met-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O BJFJQOMZCSHBMY-YUMQZZPRSA-N 0.000 description 2
- JMEWFDUAFKVAAT-UHFFFAOYSA-N Methionyl-Asparagine Chemical compound CSCCC(N)C(=O)NC(C(O)=O)CC(N)=O JMEWFDUAFKVAAT-UHFFFAOYSA-N 0.000 description 2
- 231100000678 Mycotoxin Toxicity 0.000 description 2
- 108010079364 N-glycylalanine Proteins 0.000 description 2
- 101700076891 PAPA Proteins 0.000 description 2
- 101710041747 PPC1 Proteins 0.000 description 2
- 101710008481 PPC3 Proteins 0.000 description 2
- 101710008513 PPCA Proteins 0.000 description 2
- 101710008475 PPCA1 Proteins 0.000 description 2
- 210000002824 Peroxisome Anatomy 0.000 description 2
- NONJJLVGHLVQQM-JHXYUMNGSA-N Pheneticillin Chemical compound N([C@@H]1C(N2[C@H](C(C)(C)S[C@@H]21)C(O)=O)=O)C(=O)C(C)OC1=CC=CC=C1 NONJJLVGHLVQQM-JHXYUMNGSA-N 0.000 description 2
- 235000010582 Pisum sativum Nutrition 0.000 description 2
- 240000004713 Pisum sativum Species 0.000 description 2
- HMNSRTLZAJHSIK-YUMQZZPRSA-N Pro-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 HMNSRTLZAJHSIK-YUMQZZPRSA-N 0.000 description 2
- RVQDZELMXZRSSI-IUCAKERBSA-N Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 RVQDZELMXZRSSI-IUCAKERBSA-N 0.000 description 2
- 108010029485 Protein Isoforms Proteins 0.000 description 2
- 102000001708 Protein Isoforms Human genes 0.000 description 2
- 102100001882 SASS6 Human genes 0.000 description 2
- 101710040031 SASS6 Proteins 0.000 description 2
- 229920000978 Start codon Polymers 0.000 description 2
- 108060008646 TRPA Proteins 0.000 description 2
- UQTNIFUCMBFWEJ-UHFFFAOYSA-N Threoninyl-Asparagine Chemical compound CC(O)C(N)C(=O)NC(C(O)=O)CC(N)=O UQTNIFUCMBFWEJ-UHFFFAOYSA-N 0.000 description 2
- OHGNSVACHBZKSS-KWQFWETISA-N Trp-Ala Chemical compound C1=CC=C2C(C[C@H]([NH3+])C(=O)N[C@@H](C)C([O-])=O)=CNC2=C1 OHGNSVACHBZKSS-KWQFWETISA-N 0.000 description 2
- LCPVBXOHXMBLFW-JSGCOSHPSA-N Trp-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)=CNC2=C1 LCPVBXOHXMBLFW-JSGCOSHPSA-N 0.000 description 2
- TYYLDKGBCJGJGW-WMZOPIPTSA-N Trp-Tyr Chemical compound C([C@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=C(O)C=C1 TYYLDKGBCJGJGW-WMZOPIPTSA-N 0.000 description 2
- 101710042194 Trpgamma Proteins 0.000 description 2
- ZQOOYCZQENFIMC-STQMWFEESA-N Tyr-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=C(O)C=C1 ZQOOYCZQENFIMC-STQMWFEESA-N 0.000 description 2
- AOLHUMAVONBBEZ-STQMWFEESA-N Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AOLHUMAVONBBEZ-STQMWFEESA-N 0.000 description 2
- YSGSDAIMSCVPHG-YUMQZZPRSA-N Val-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)C(C)C YSGSDAIMSCVPHG-YUMQZZPRSA-N 0.000 description 2
- 241001672648 Vieira Species 0.000 description 2
- 241000607479 Yersinia pestis Species 0.000 description 2
- 101710039743 acu-6 Proteins 0.000 description 2
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 108010041407 alanylaspartic acid Proteins 0.000 description 2
- 229960000723 ampicillin Drugs 0.000 description 2
- 108010013835 arginine glutamate Proteins 0.000 description 2
- 230000001580 bacterial Effects 0.000 description 2
- 230000027455 binding Effects 0.000 description 2
- 238000004166 bioassay Methods 0.000 description 2
- 230000015572 biosynthetic process Effects 0.000 description 2
- 108010025267 calcium-dependent protein kinase Proteins 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 229920000407 conserved sequence Polymers 0.000 description 2
- 108010016616 cysteinylglycine Proteins 0.000 description 2
- 108010069495 cysteinyltyrosine Proteins 0.000 description 2
- 230000000855 fungicidal Effects 0.000 description 2
- 239000000417 fungicide Substances 0.000 description 2
- KGNSGRRALVIRGR-UHFFFAOYSA-N gln-tyr Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 KGNSGRRALVIRGR-UHFFFAOYSA-N 0.000 description 2
- KZNQNBZMBZJQJO-YFKPBYRVSA-N gly pro Chemical compound NCC(=O)N1CCC[C@H]1C(O)=O KZNQNBZMBZJQJO-YFKPBYRVSA-N 0.000 description 2
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 2
- 229940097068 glyphosate Drugs 0.000 description 2
- 101710024922 hyg Proteins 0.000 description 2
- 230000002401 inhibitory effect Effects 0.000 description 2
- 230000000977 initiatory Effects 0.000 description 2
- 108010053037 kyotorphin Proteins 0.000 description 2
- 108010073093 leucyl-glycyl-glycyl-glycine Proteins 0.000 description 2
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 2
- 239000003446 ligand Substances 0.000 description 2
- 230000000670 limiting Effects 0.000 description 2
- 230000004807 localization Effects 0.000 description 2
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 2
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 108010022588 methionyl-lysyl-proline Proteins 0.000 description 2
- 108010068488 methionylphenylalanine Proteins 0.000 description 2
- 239000002636 mycotoxin Substances 0.000 description 2
- 229920001894 non-coding RNA Polymers 0.000 description 2
- 239000002245 particle Substances 0.000 description 2
- 230000037361 pathway Effects 0.000 description 2
- 108010082795 phenylalanyl-arginyl-arginine Proteins 0.000 description 2
- 108010084525 phenylalanyl-phenylalanyl-glycine Proteins 0.000 description 2
- 229920001223 polyethylene glycol Polymers 0.000 description 2
- 230000002028 premature Effects 0.000 description 2
- 238000005215 recombination Methods 0.000 description 2
- 238000009877 rendering Methods 0.000 description 2
- 230000003362 replicative Effects 0.000 description 2
- 108091007521 restriction endonucleases Proteins 0.000 description 2
- 229920002973 ribosomal RNA Polymers 0.000 description 2
- 101700061866 sas-4 Proteins 0.000 description 2
- 101700068875 sas-5 Proteins 0.000 description 2
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 2
- 210000004215 spores Anatomy 0.000 description 2
- 230000002123 temporal effect Effects 0.000 description 2
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 2
- 108010084932 tryptophyl-proline Proteins 0.000 description 2
- 108010051110 tyrosyl-lysine Proteins 0.000 description 2
- 108010009962 valyltyrosine Proteins 0.000 description 2
- 101700083326 yjbM Proteins 0.000 description 2
- 101700073788 ywaC Proteins 0.000 description 2
- OPINTGHFESTVAX-UHFFFAOYSA-N γ-glutamyl-Arginine Chemical compound NC(=O)CCC(N)C(=O)NC(C(O)=O)CCCNC(N)=N OPINTGHFESTVAX-UHFFFAOYSA-N 0.000 description 2
- BRZYSWJRSDMWLG-DJWUNRQOSA-N (2R,3R,4R,5R)-2-[(1S,2S,3R,4S,6R)-4,6-diamino-3-[(2S,3R,4R,5S,6R)-3-amino-4,5-dihydroxy-6-[(1R)-1-hydroxyethyl]oxan-2-yl]oxy-2-hydroxycyclohexyl]oxy-5-methyl-4-(methylamino)oxane-3,5-diol Chemical compound O1C[C@@](O)(C)[C@H](NC)[C@@H](O)[C@H]1O[C@@H]1[C@@H](O)[C@H](O[C@@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H]([C@@H](C)O)O2)N)[C@@H](N)C[C@H]1N BRZYSWJRSDMWLG-DJWUNRQOSA-N 0.000 description 1
- SBVPYBFMIGDIDX-SRVKXCTJSA-N (2S)-1-[(2S)-1-[(2S)-pyrrolidin-1-ium-2-carbonyl]pyrrolidine-2-carbonyl]pyrrolidine-2-carboxylate Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1N(C(=O)[C@H]2NCCC2)CCC1 SBVPYBFMIGDIDX-SRVKXCTJSA-N 0.000 description 1
- VGMNWQOPSFBBBG-XUXIUFHCSA-N (2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-aminopropanoyl]amino]-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]butanedioic acid Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O VGMNWQOPSFBBBG-XUXIUFHCSA-N 0.000 description 1
- BAONJAHBAUDJKA-BZSNNMDCSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-3-phenylpropanoyl]amino]-3-(4-hydroxyphenyl)propanoyl]amino]butanedioic acid Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 BAONJAHBAUDJKA-BZSNNMDCSA-N 0.000 description 1
- BUZMZDDKFCSKOT-CIUDSAMLSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-carboxybutanoyl]amino]-4-carboxybutanoyl]amino]pentanedioic acid Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BUZMZDDKFCSKOT-CIUDSAMLSA-N 0.000 description 1
- PDQDCFBVYXEFSD-SRVKXCTJSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]butanedioic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PDQDCFBVYXEFSD-SRVKXCTJSA-N 0.000 description 1
- BQSLGJHIAGOZCD-CIUDSAMLSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylpentanoyl]amino]propanoyl]amino]-3-hydroxypropanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 1
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N (2S)-2-[[(2S)-2-[[(2S)-2-aminopropanoyl]amino]propanoyl]amino]-3-hydroxypropanoic acid Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 1
- DKJPOZOEBONHFS-ZLUOBGJFSA-N (2S)-2-[[(2S)-2-[[(2S)-2-aminopropanoyl]amino]propanoyl]amino]butanedioic acid Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O DKJPOZOEBONHFS-ZLUOBGJFSA-N 0.000 description 1
- PVMPDMIKUVNOBD-CIUDSAMLSA-N (3S)-3-[[(2S)-2-amino-4-methylpentanoyl]amino]-4-[[(1S)-1-carboxy-2-hydroxyethyl]amino]-4-oxobutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 1
- CAAMSDWKXXPUJR-UHFFFAOYSA-N 1,5-dihydro-4H-imidazol-4-one Chemical compound O=C1CNC=N1 CAAMSDWKXXPUJR-UHFFFAOYSA-N 0.000 description 1
- 108020004465 16S Ribosomal RNA Proteins 0.000 description 1
- FCKYPQBAHLOOJQ-UHFFFAOYSA-N 2-[[2-[bis(carboxymethyl)amino]cyclohexyl]-(carboxymethyl)amino]acetic acid Chemical compound OC(=O)CN(CC(O)=O)C1CCCCC1N(CC(O)=O)CC(O)=O FCKYPQBAHLOOJQ-UHFFFAOYSA-N 0.000 description 1
- 102000034451 ATPases Human genes 0.000 description 1
- 108091006096 ATPases Proteins 0.000 description 1
- 241000589155 Agrobacterium tumefaciens Species 0.000 description 1
- HJCMDXDYPOUFDY-WHFBIAKZSA-N Ala-Gln Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O HJCMDXDYPOUFDY-WHFBIAKZSA-N 0.000 description 1
- SITWEMZOJNKJCH-UHFFFAOYSA-N Alanyl-Arginine Chemical compound CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 1
- 240000002840 Allium cepa Species 0.000 description 1
- 240000002234 Allium sativum Species 0.000 description 1
- 244000144730 Amygdalus persica Species 0.000 description 1
- 235000007119 Ananas comosus Nutrition 0.000 description 1
- 240000002254 Ananas comosus Species 0.000 description 1
- 229940064005 Antibiotic throat preparations Drugs 0.000 description 1
- 229940083879 Antibiotics FOR TREATMENT OF HEMORRHOIDS AND ANAL FISSURES FOR TOPICAL USE Drugs 0.000 description 1
- 229940042052 Antibiotics for systemic use Drugs 0.000 description 1
- 229940042786 Antitubercular Antibiotics Drugs 0.000 description 1
- 240000007087 Apium graveolens Species 0.000 description 1
- 235000015849 Apium graveolens Dulce Group Nutrition 0.000 description 1
- 235000010591 Appio Nutrition 0.000 description 1
- 241000233788 Arecaceae Species 0.000 description 1
- HSPSXROIMXIJQW-BQBZGAKWSA-N Asp-His Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 HSPSXROIMXIJQW-BQBZGAKWSA-N 0.000 description 1
- ZARXTZFGQZBYFO-JQWIXIFHSA-N Asp-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(O)=O)N)C(O)=O)=CNC2=C1 ZARXTZFGQZBYFO-JQWIXIFHSA-N 0.000 description 1
- 235000005340 Asparagus officinalis Nutrition 0.000 description 1
- 102000004625 Aspartate Aminotransferases Human genes 0.000 description 1
- 108010003415 Aspartate Aminotransferases Proteins 0.000 description 1
- ZVDPYSVOZFINEE-UHFFFAOYSA-N Aspartyl-Leucine Chemical compound CC(C)CC(C(O)=O)NC(=O)C(N)CC(O)=O ZVDPYSVOZFINEE-UHFFFAOYSA-N 0.000 description 1
- MXWJVTOOROXGIU-UHFFFAOYSA-N Atrazine Chemical compound CCNC1=NC(Cl)=NC(NC(C)C)=N1 MXWJVTOOROXGIU-UHFFFAOYSA-N 0.000 description 1
- 102000016614 Autophagy-Related Protein 5 Human genes 0.000 description 1
- 108010092776 Autophagy-Related Protein 5 Proteins 0.000 description 1
- 235000000832 Ayote Nutrition 0.000 description 1
- 235000011293 Brassica napus Nutrition 0.000 description 1
- 240000007124 Brassica oleracea Species 0.000 description 1
- 235000011301 Brassica oleracea var capitata Nutrition 0.000 description 1
- 235000017647 Brassica oleracea var italica Nutrition 0.000 description 1
- 235000001169 Brassica oleracea var oleracea Nutrition 0.000 description 1
- 235000000540 Brassica rapa subsp rapa Nutrition 0.000 description 1
- 235000004977 Brassica sinapistrum Nutrition 0.000 description 1
- 235000004936 Bromus mango Nutrition 0.000 description 1
- 240000001358 Bromus mango Species 0.000 description 1
- 229920000018 Callose Polymers 0.000 description 1
- 240000000218 Cannabis sativa Species 0.000 description 1
- 235000002566 Capsicum Nutrition 0.000 description 1
- 235000009467 Carica papaya Nutrition 0.000 description 1
- 240000006432 Carica papaya Species 0.000 description 1
- 241000195597 Chlamydomonas reinhardtii Species 0.000 description 1
- WIIZWVCIJKGZOK-RKDXNWHRSA-N Chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 1
- 108010035563 Chloramphenicol O-Acetyltransferase Proteins 0.000 description 1
- 235000007542 Cichorium intybus Nutrition 0.000 description 1
- 240000008051 Cichorium intybus Species 0.000 description 1
- 229920000062 Coding strand Polymers 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- 241000219112 Cucumis Species 0.000 description 1
- 235000015510 Cucumis melo subsp melo Nutrition 0.000 description 1
- 240000008067 Cucumis sativus Species 0.000 description 1
- 235000010799 Cucumis sativus var sativus Nutrition 0.000 description 1
- 235000009854 Cucurbita moschata Nutrition 0.000 description 1
- 240000001980 Cucurbita pepo Species 0.000 description 1
- 244000302526 Cucurbita pepo subsp pepo Species 0.000 description 1
- 235000009804 Cucurbita pepo subsp pepo Nutrition 0.000 description 1
- 235000003954 Cucurbita pepo var melopepo Nutrition 0.000 description 1
- 235000017788 Cydonia oblonga Nutrition 0.000 description 1
- 240000000590 Cydonia oblonga Species 0.000 description 1
- 210000000172 Cytosol Anatomy 0.000 description 1
- 101710007887 DHFR Proteins 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 101700011961 DPOM Proteins 0.000 description 1
- 240000002860 Daucus carota Species 0.000 description 1
- 235000002243 Daucus carota subsp sativus Nutrition 0.000 description 1
- 108010054576 Deoxyribonuclease EcoRI Proteins 0.000 description 1
- 102000007698 EC 1.1.1.1 Human genes 0.000 description 1
- 108010021809 EC 1.1.1.1 Proteins 0.000 description 1
- 210000001161 Embryo, Mammalian Anatomy 0.000 description 1
- 235000016623 Fragaria vesca Nutrition 0.000 description 1
- 240000009088 Fragaria x ananassa Species 0.000 description 1
- 235000011363 Fragaria x ananassa Nutrition 0.000 description 1
- 206010017533 Fungal infection Diseases 0.000 description 1
- 101700032567 GATM Proteins 0.000 description 1
- 101710023886 GUSB Proteins 0.000 description 1
- 241001123946 Gaga Species 0.000 description 1
- 241001200922 Gagata Species 0.000 description 1
- 239000005561 Glufosinate Substances 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- 240000007842 Glycine max Species 0.000 description 1
- 240000006962 Gossypium hirsutum Species 0.000 description 1
- 229940093922 Gynecological Antibiotics Drugs 0.000 description 1
- FRJIAZKQGSCKPQ-FSPLSTOPSA-N His-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CN=CN1 FRJIAZKQGSCKPQ-FSPLSTOPSA-N 0.000 description 1
- 206010020649 Hyperkeratosis Diseases 0.000 description 1
- PWWVAXIEGOYWEE-UHFFFAOYSA-N Isophenergan Chemical compound C1=CC=C2N(CC(C)N(C)C)C3=CC=CC=C3SC2=C1 PWWVAXIEGOYWEE-UHFFFAOYSA-N 0.000 description 1
- MTCFGRXMJLQNBG-REOHCLBHSA-N L-serine Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- 125000000205 L-threonino group Chemical group [H]OC(=O)[C@@]([H])(N([H])[*])[C@](C([H])([H])[H])([H])O[H] 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 235000003228 Lactuca sativa Nutrition 0.000 description 1
- 240000008415 Lactuca sativa Species 0.000 description 1
- 239000005089 Luciferase Substances 0.000 description 1
- 108060001084 Luciferase family Proteins 0.000 description 1
- 235000007688 Lycopersicon esculentum Nutrition 0.000 description 1
- 101710029649 MDV043 Proteins 0.000 description 1
- 240000007119 Malus pumila Species 0.000 description 1
- 235000011430 Malus pumila Nutrition 0.000 description 1
- 235000015103 Malus silvestris Nutrition 0.000 description 1
- 235000014826 Mangifera indica Nutrition 0.000 description 1
- 240000004658 Medicago sativa Species 0.000 description 1
- UASDAHIAHBRZQV-YUMQZZPRSA-N Met-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N UASDAHIAHBRZQV-YUMQZZPRSA-N 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 240000005561 Musa balbisiana Species 0.000 description 1
- 235000018290 Musa x paradisiaca Nutrition 0.000 description 1
- 241000208133 Nicotiana plumbaginifolia Species 0.000 description 1
- 101700061424 POLB Proteins 0.000 description 1
- 241000364057 Peoria Species 0.000 description 1
- 239000006002 Pepper Substances 0.000 description 1
- 241001223281 Peronospora Species 0.000 description 1
- 240000008426 Persea americana Species 0.000 description 1
- 235000010627 Phaseolus vulgaris Nutrition 0.000 description 1
- 240000005158 Phaseolus vulgaris Species 0.000 description 1
- FSXRLASFHBWESK-HOTGVXAUSA-N Phe-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 FSXRLASFHBWESK-HOTGVXAUSA-N 0.000 description 1
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Natural products OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 1
- 235000016761 Piper aduncum Nutrition 0.000 description 1
- 235000017804 Piper guineense Nutrition 0.000 description 1
- 240000000129 Piper nigrum Species 0.000 description 1
- 235000008184 Piper nigrum Nutrition 0.000 description 1
- 108010064851 Plant Proteins Proteins 0.000 description 1
- 102000026947 Plant Proteins Human genes 0.000 description 1
- 210000002381 Plasma Anatomy 0.000 description 1
- 241000209504 Poaceae Species 0.000 description 1
- 241000276498 Pollachius virens Species 0.000 description 1
- FELJDCNGZFDUNR-WDSKDSINSA-N Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 FELJDCNGZFDUNR-WDSKDSINSA-N 0.000 description 1
- 235000009827 Prunus armeniaca Nutrition 0.000 description 1
- 240000005204 Prunus armeniaca Species 0.000 description 1
- 240000002799 Prunus avium Species 0.000 description 1
- 235000006029 Prunus persica var nucipersica Nutrition 0.000 description 1
- 235000006040 Prunus persica var persica Nutrition 0.000 description 1
- 240000005866 Prunus persica var. nucipersica Species 0.000 description 1
- 241000589615 Pseudomonas syringae Species 0.000 description 1
- 235000014443 Pyrus communis Nutrition 0.000 description 1
- 240000001987 Pyrus communis Species 0.000 description 1
- 101700054624 RF1 Proteins 0.000 description 1
- 240000007742 Raphanus sativus Species 0.000 description 1
- 235000006140 Raphanus sativus var sativus Nutrition 0.000 description 1
- 210000003705 Ribosomes Anatomy 0.000 description 1
- 108010003581 Ribulose-Bisphosphate Carboxylase Proteins 0.000 description 1
- 241001092459 Rubus Species 0.000 description 1
- 235000017848 Rubus fruticosus Nutrition 0.000 description 1
- 240000007651 Rubus glaucus Species 0.000 description 1
- 235000011034 Rubus glaucus Nutrition 0.000 description 1
- 235000009122 Rubus idaeus Nutrition 0.000 description 1
- 101700061430 SAT19 Proteins 0.000 description 1
- 101710028729 SLC25A20 Proteins 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 235000019095 Sechium edule Nutrition 0.000 description 1
- 240000007660 Sechium edule Species 0.000 description 1
- SSJMZMUVNKEENT-IMJSIDKUSA-N Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CO SSJMZMUVNKEENT-IMJSIDKUSA-N 0.000 description 1
- 108020004688 Small Nuclear RNA Proteins 0.000 description 1
- 240000003768 Solanum lycopersicum Species 0.000 description 1
- 235000002597 Solanum melongena Nutrition 0.000 description 1
- 240000002686 Solanum melongena Species 0.000 description 1
- 241000592344 Spermatophyta Species 0.000 description 1
- 235000009337 Spinacia oleracea Nutrition 0.000 description 1
- 240000003453 Spinacia oleracea Species 0.000 description 1
- 235000009184 Spondias indica Nutrition 0.000 description 1
- 241000187191 Streptomyces viridochromogenes Species 0.000 description 1
- 229960005322 Streptomycin Drugs 0.000 description 1
- 101700081234 TTR Proteins 0.000 description 1
- 229960002180 Tetracycline Drugs 0.000 description 1
- OFVLGDICTFRJMM-WESIUVDSSA-N Tetracycline Chemical compound C1=CC=C2[C@](O)(C)[C@H]3C[C@H]4[C@H](N(C)C)C(O)=C(C(N)=O)C(=O)[C@@]4(O)C(O)=C3C(=O)C2=C1O OFVLGDICTFRJMM-WESIUVDSSA-N 0.000 description 1
- 239000004098 Tetracycline Substances 0.000 description 1
- BWUHENPAEMNGQJ-ZDLURKLDSA-N Thr-Gln Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O BWUHENPAEMNGQJ-ZDLURKLDSA-N 0.000 description 1
- 241000536399 Tina Species 0.000 description 1
- 229940024982 Topical Antifungal Antibiotics Drugs 0.000 description 1
- 231100000765 Toxin Toxicity 0.000 description 1
- 229920001949 Transfer RNA Polymers 0.000 description 1
- 241000219793 Trifolium Species 0.000 description 1
- LYMVXFSTACVOLP-ZFWWWQNUSA-N Trp-Leu Chemical compound C1=CC=C2C(C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C([O-])=O)=CNC2=C1 LYMVXFSTACVOLP-ZFWWWQNUSA-N 0.000 description 1
- GLNADSQYFUSGOU-GPTZEZBUSA-J Trypan blue Chemical compound [Na+].[Na+].[Na+].[Na+].C1=C(S([O-])(=O)=O)C=C2C=C(S([O-])(=O)=O)C(/N=N/C3=CC=C(C=C3C)C=3C=C(C(=CC=3)\N=N\C=3C(=CC4=CC(=CC(N)=C4C=3O)S([O-])(=O)=O)S([O-])(=O)=O)C)=C(O)C2=C1N GLNADSQYFUSGOU-GPTZEZBUSA-J 0.000 description 1
- UBAQSAUDKMIEQZ-QWRGUYRKSA-N Tyr-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UBAQSAUDKMIEQZ-QWRGUYRKSA-N 0.000 description 1
- 108020004417 Untranslated RNA Proteins 0.000 description 1
- 210000003934 Vacuoles Anatomy 0.000 description 1
- 240000006365 Vitis vinifera Species 0.000 description 1
- 235000014787 Vitis vinifera Nutrition 0.000 description 1
- 235000005042 Zier Kohl Nutrition 0.000 description 1
- UOZODPSAJZTQNH-LSWIJEOBSA-N Zygomycin A1 Chemical compound N[C@@H]1[C@@H](O)[C@H](O)[C@H](CN)O[C@@H]1O[C@H]1[C@@H](O)[C@H](O[C@H]2[C@@H]([C@@H](N)C[C@@H](N)[C@@H]2O)O[C@@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H](CO)O2)N)O[C@@H]1CO UOZODPSAJZTQNH-LSWIJEOBSA-N 0.000 description 1
- 101710029159 aadA Proteins 0.000 description 1
- 238000009825 accumulation Methods 0.000 description 1
- 239000012190 activator Substances 0.000 description 1
- 238000007792 addition Methods 0.000 description 1
- 239000005409 aflatoxin Substances 0.000 description 1
- 108010044940 alanylglutamine Proteins 0.000 description 1
- 108010050181 aleurone Proteins 0.000 description 1
- 235000017585 alfalfa Nutrition 0.000 description 1
- 235000017587 alfalfa Nutrition 0.000 description 1
- 230000003466 anti-cipated Effects 0.000 description 1
- 108010036533 arginylvaline Proteins 0.000 description 1
- 244000052616 bacterial pathogens Species 0.000 description 1
- 238000002869 basic local alignment search tool Methods 0.000 description 1
- 230000037348 biosynthesis Effects 0.000 description 1
- 235000021029 blackberry Nutrition 0.000 description 1
- 230000001488 breeding Effects 0.000 description 1
- UXVMQQNJUSDDNG-UHFFFAOYSA-L cacl2 Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 1
- 101700006045 cact Proteins 0.000 description 1
- 239000001110 calcium chloride Substances 0.000 description 1
- 229910001628 calcium chloride Inorganic materials 0.000 description 1
- 235000009120 camo Nutrition 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 150000001720 carbohydrates Chemical class 0.000 description 1
- 235000002354 carica papaya Nutrition 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000032823 cell division Effects 0.000 description 1
- 230000022534 cell killing Effects 0.000 description 1
- 239000006285 cell suspension Substances 0.000 description 1
- 229920002678 cellulose Polymers 0.000 description 1
- 239000001913 cellulose Substances 0.000 description 1
- 235000013339 cereals Nutrition 0.000 description 1
- 235000005607 chanvre indien Nutrition 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 235000019693 cherries Nutrition 0.000 description 1
- 108010031100 chloroplast transit peptides Proteins 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- PYKZINFEKIIUQO-XURFAINESA-K cobalt(3+);2-methyl-1H-imidazole;(Z)-4-[2-[[(Z)-4-oxidopent-3-en-2-ylidene]amino]ethylimino]pent-2-en-2-olate;bromide Chemical compound [Co+3].[Br-].CC1=NC=CN1.CC1=NC=CN1.C\C([O-])=C\C(C)=NCCN=C(C)\C=C(\C)[O-] PYKZINFEKIIUQO-XURFAINESA-K 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 244000038559 crop plants Species 0.000 description 1
- 238000010192 crystallographic characterization Methods 0.000 description 1
- 210000004748 cultured cells Anatomy 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 230000001086 cytosolic Effects 0.000 description 1
- 230000001687 destabilization Effects 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 108020001096 dihydrofolate reductase family Proteins 0.000 description 1
- 102000004419 dihydrofolate reductase family Human genes 0.000 description 1
- 108010056535 dihydrofolate reductase type II Proteins 0.000 description 1
- 108010054813 diprotin B Proteins 0.000 description 1
- 231100000676 disease causative agent Toxicity 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 238000004043 dyeing Methods 0.000 description 1
- 235000013601 eggs Nutrition 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000002708 enhancing Effects 0.000 description 1
- 230000029578 entry into host Effects 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 108091006031 fluorescent proteins Proteins 0.000 description 1
- 102000034387 fluorescent proteins Human genes 0.000 description 1
- 238000005755 formation reaction Methods 0.000 description 1
- 238000005194 fractionation Methods 0.000 description 1
- 235000004611 garlic Nutrition 0.000 description 1
- 238000010358 genetic engineering technique Methods 0.000 description 1
- VPZXBVLAVMBEQI-VKHMYHEASA-N gly ala Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 1
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 1
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 1
- 235000009754 grape Nutrition 0.000 description 1
- 235000012333 grape Nutrition 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- 239000011487 hemp Substances 0.000 description 1
- 235000012765 hemp Nutrition 0.000 description 1
- 108010040030 histidinoalanine Proteins 0.000 description 1
- -1 hygromycin Chemical compound 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 239000000411 inducer Substances 0.000 description 1
- 238000001764 infiltration Methods 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 230000000749 insecticidal Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 229940079866 intestinal antibiotics Drugs 0.000 description 1
- 108010076756 leucyl-alanyl-phenylalanine Proteins 0.000 description 1
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 1
- 235000012766 marijuana Nutrition 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 230000000813 microbial Effects 0.000 description 1
- 238000000520 microinjection Methods 0.000 description 1
- 230000002438 mitochondrial Effects 0.000 description 1
- 230000017074 necrotic cell death Effects 0.000 description 1
- 235000015097 nutrients Nutrition 0.000 description 1
- 235000002732 oignon Nutrition 0.000 description 1
- 229940005935 ophthalmologic Antibiotics Drugs 0.000 description 1
- 229960001914 paromomycin Drugs 0.000 description 1
- 230000035515 penetration Effects 0.000 description 1
- 230000000858 peroxisomal Effects 0.000 description 1
- 235000005426 persea americana Nutrition 0.000 description 1
- 150000002989 phenols Chemical class 0.000 description 1
- 108010082527 phosphinothricin N-acetyltransferase Proteins 0.000 description 1
- 230000029553 photosynthesis Effects 0.000 description 1
- 238000010672 photosynthesis Methods 0.000 description 1
- 101700016709 pin-2 Proteins 0.000 description 1
- 235000021118 plant-derived protein Nutrition 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 238000003825 pressing Methods 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 1
- 230000000644 propagated Effects 0.000 description 1
- 101710017571 psbA-A Proteins 0.000 description 1
- 235000015136 pumpkin Nutrition 0.000 description 1
- 230000001718 repressive Effects 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 238000003757 reverse transcription PCR Methods 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 239000004460 silage Substances 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 229920001255 small nuclear ribonucleic acid Polymers 0.000 description 1
- 230000004960 subcellular localization Effects 0.000 description 1
- YROXIXLRRCOBKF-UHFFFAOYSA-N sulfonylurea Chemical compound OC(=N)N=S(=O)=O YROXIXLRRCOBKF-UHFFFAOYSA-N 0.000 description 1
- 229920000511 telomere Polymers 0.000 description 1
- 235000019364 tetracycline Nutrition 0.000 description 1
- 239000003053 toxin Substances 0.000 description 1
- 230000001052 transient Effects 0.000 description 1
- 230000014616 translation Effects 0.000 description 1
- 108010036387 trimethionine Proteins 0.000 description 1
- 108010029384 tryptophyl-histidine Proteins 0.000 description 1
- 231100000925 very toxic Toxicity 0.000 description 1
- 235000005765 wild carrot Nutrition 0.000 description 1
- SITLTJHOQZFJGG-XPUUQOCRSA-N α-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SITLTJHOQZFJGG-XPUUQOCRSA-N 0.000 description 1
- WQZGKKKJIJFFOK-VFUOTHLCSA-N β-D-glucose Chemical compound OC[C@H]1O[C@@H](O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-VFUOTHLCSA-N 0.000 description 1
- DXJZITDUDUPINW-UHFFFAOYSA-N γ-glutamyl-Asparagine Chemical compound NC(=O)CCC(N)C(=O)NC(CC(N)=O)C(O)=O DXJZITDUDUPINW-UHFFFAOYSA-N 0.000 description 1
Abstract
This invention describes genes encoding proteins which control resistance of plants to fungal pathogens. The invention also describes transgenic plants resistant to fungal pathogens and methods for making plants resistant to fungal pathogens. The invention further discloses a method to isolate additional genes coding for additional proteins controlling the resistance of plants to fungal pathogens.
Description
GENES THAT ENCODE MLO PROTEINS AND CONFER RESISTANCE TO FUNGI IN PLANTS
The invention describes nucleotide sequences that encode proteins controlling the resistance of plants to fungal diseases. The invention also relates to plants resistant to fungal diseases, and to methods for rendering plants resistant to fungal diseases.
Fungal diseases are responsible for annual losses of approximately $9.1 billion in agricultural crops in the United States, and are caused by a wide variety of biologically diverse pathogens. Traditionally, different strategies have been used to control them. Resistance traits have been bred into agriculturally important varieties, providing various levels of resistance either against a narrow range of pathogen isolates or races, or against a broader range. However, this involves a long and labor-intensive process of introducing desirable traits into commercial lines through genetic crosses and, because of the risk that pests evolve to overcome the natural resistance of the plant, a constant effort to breed new resistance traits into the commercial lines. Alternatively, fungal diseases have been controlled by the application of chemical fungicides. This strategy usually results in efficient control, but it is also associated with the possible development of resistant pathogens and can have a negative impact on the environment. Moreover, in certain crops, such as barley and wheat, the control of fungal pathogens by chemical fungicides is difficult or impractical.
Recent techniques have allowed a better understanding of the interactions between plants and their pathogens at the molecular level, and the mechanisms of resistance have been partially unraveled. Although a large portion of this molecular characterization has been conducted in the model plant Arabidopsis, resistance mechanisms in economically important crops have also begun to be elucidated.
Powdery mildews are a major disease that affects most plant species, and they have been extensively studied.
They are characterized by spots or patches of a white to grayish growth on the tissues of the plant, which correspond to the mycelium and the cleistothecia of the fungus. Powdery mildews are caused by several species of fungi of the order Erysiphales. For example, Erysiphe graminis causes powdery mildew of cereals and grasses. Although powdery mildews are difficult to control in most crops, barley lines that are resistant to most known isolates of the pathogen are available. It was shown that mutations in a single locus, the Mlo locus, are responsible for the resistance phenotype. The Mlo resistance mechanism has been partially elucidated; it involves the formation of large cell wall appositions, called papillae, at the contact sites with the pathogen, which contain mainly callose, but also carbohydrates, phenols, and proteins. In mlo mutant plants, cell wall apposition prevents penetration of the pathogen, thus providing resistance. Unfortunately, this powerful tool for controlling powdery mildews is restricted to barley. In view of the problems caused by fungal diseases in agriculture, in particular by the powdery mildews, there remains an unmet need for new and effective strategies to control these types of pathogens in other crops, strategies that are economically attractive to farmers and acceptable for the environment. The present invention addresses the need for novel disease control strategies in plants through the application of genetic engineering techniques. In particular, this invention relates to control strategies against powdery mildew, preferably in economically important crops.
The present invention relates to isolated DNA molecules that encode Mlo proteins, wherein these Mlo proteins confer on plants resistance to fungal pathogens. In particular, the invention relates to Mlo proteins that contain conserved amino acid sequences that the inventors of the present invention are the first to discover, and to the isolated DNA molecules that encode these Mlo proteins. The present invention also relates to vectors for the expression of the DNA molecules of the present invention in plants. The present invention also relates to transgenic plants comprising any of the DNA molecules of the present invention. The present invention also describes agricultural products with improved phytosanitary properties comprising transgenic plants made resistant to fungal pathogens by expression of any of the DNA molecules of the present invention. The present invention further relates to methods for rendering plants resistant to fungal diseases, by altering the expression, in transgenic plants, of proteins encoded by the endogenous copies of the genes corresponding to any of the DNA molecules of the present invention, or by altering the activity or stability of the proteins encoded by those endogenous copies. These transgenic plants are desirably resistant to pathogens that infect the living epidermal cells of the plant, most desirably to fungi of the order Erysiphales, also known as powdery mildews, preferably of the genus Erysiphe, the causative agent of powdery mildew; more preferably, the plants are resistant to Erysiphe graminis. The present invention further discloses a method for isolating DNA molecules encoding proteins that have the same function as, or a function similar to, that of the proteins encoded by the DNA molecules of the present invention, and which encode the conserved amino acid sequences stipulated in the present invention.
Accordingly, the present invention provides new and effective strategies for controlling fungal diseases in economically important crops, potentially reducing the amounts of chemicals applied to the crops, and reducing the risk of the emergence of pathogens resistant to the control agents.
Accordingly, the invention provides: A DNA molecule that encodes a Mlo protein that confers on a plant resistance to fungal pathogens, wherein this protein comprises at least one amino acid sequence identical or substantially similar to an amino acid sequence stipulated in SEQ ID NO: 1 or SEQ ID NO: 2, wherein this DNA molecule is preferably a cDNA molecule. In a preferred embodiment, the DNA molecule is not derived from barley, and is derived from a plant that is a dicot, or from a group of plants consisting of wheat, corn, rice, oats, rye, sorghum, sugarcane, millet, sorghum, and the palm family. In a preferred embodiment, the DNA molecule of the present invention is identical or substantially similar to any of the nucleotide sequences stipulated in SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7, or encodes a Mlo protein identical or substantially similar to a Mlo protein stipulated in SEQ ID NO: 4, SEQ ID NO: 6, or SEQ ID NO: 8. In a more preferred embodiment, the DNA molecule comprising the nucleotide sequences stipulated in SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7 is derived from wheat. In another preferred embodiment, the DNA molecule of the present invention is identical or substantially similar to any of the nucleotide sequences stipulated in SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, or SEQ ID NO: 17, or encodes a Mlo protein identical or substantially similar to a Mlo protein stipulated in SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18. In a more preferred embodiment, the DNA molecule comprising the nucleotide sequences stipulated in SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, or SEQ ID NO: 17 is derived from Arabidopsis thaliana. In another preferred embodiment, the DNA molecules mentioned above are modified such that the activity of the endogenous protein is lost.
In a particular embodiment of the present invention, the modification of the DNA results in one, all, or a combination of the following changes in the amino acid sequence of the corresponding protein:
- Trp (163) to Arg
- frame shift after Pro (396)
- frame shift after Trp (160)
- Met (1) to Ile
- Gly (227) to Asp
- Met (1) to Val
- Arg (11) to Trp
- missing Phe (183), Thr (184)
- Val (31) to Glu
- Ser (32) to Phe
- Leu (271) to His
In a further preferred embodiment, the fungal pathogens desirably infect living epidermal cells; more desirably, the fungal pathogens are of the order Erysiphales, also known as powdery mildews, preferably from the genus Erysiphe, and more preferably the fungal pathogen is Erysiphe graminis. In a further embodiment, the isolated DNA molecule is anti-sense to an isolated DNA molecule as described above, for example anti-sense to a DNA molecule, for example a cDNA molecule, which encodes a Mlo protein comprising at least one amino acid sequence identical or substantially similar to an amino acid sequence stipulated in SEQ ID NO: 1 or SEQ ID NO: 2, especially anti-sense to a DNA molecule identical or substantially similar to a DNA molecule stipulated in SEQ ID NOs: 3, 5, 7, 9, 11, 13, 15, or 17, and which encodes a Mlo protein identical or substantially similar to a Mlo protein stipulated in SEQ ID NOs: 4, 6, 8, 10, 12, 14, 16, or 18.
The invention further provides: A protein comprising at least one amino acid sequence identical or substantially similar to an amino acid sequence stipulated in SEQ ID NO: 1 or SEQ ID NO: 2, wherein this protein is a Mlo protein and confers on a plant resistance to fungal pathogens. The protein is preferably not derived from barley, and is derived from a plant that is a dicotyledon, or from a group of plants consisting of wheat, corn, rice, oats, rye, sorghum, sugar cane, millet, sorghum, and the palm family. In a preferred embodiment, the protein of the present invention is encoded by a nucleotide sequence identical or substantially similar to any of the nucleotide sequences stipulated in SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7, or is identical or substantially similar to any of the Mlo proteins stipulated in SEQ ID NO: 4, SEQ ID NO: 6, or SEQ ID NO: 8. In a more preferred embodiment, the protein is derived from wheat. In another preferred embodiment, the protein of the present invention is encoded by a nucleotide sequence identical or substantially similar to any of the nucleotide sequences stipulated in SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, or SEQ ID NO: 17, or is identical or substantially similar to any of the Mlo proteins stipulated in SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18. In a more preferred embodiment, the protein is derived from A. thaliana. In another preferred embodiment, the fungal pathogens desirably infect living epidermal cells; most desirably, the fungal pathogens are of the order Erysiphales, also known as powdery mildews, preferably from the genus Erysiphe, and more preferably the fungal pathogen is Erysiphe graminis. In a further embodiment, the present invention also encompasses mutated forms or truncated forms of proteins encoded by any of the DNA molecules described above.
The invention further provides: An expression cassette comprising any of the DNA molecules described above, for example a cDNA as described above, wherein the DNA molecule is operably linked to a promoter and termination signals capable of expressing the DNA molecule in plants. In a preferred embodiment, the expression cassette is heterologous. In a further preferred embodiment, the promoter and the termination signals are eukaryotic. In a further preferred embodiment, the promoter and the termination signals are heterologous with respect to the coding region.
The invention further provides: A vector comprising any of the expression cassettes described above. In a preferred embodiment, the vector is used for the transformation of the expression cassette into plants. In another preferred embodiment, the vector of the present invention is used for the amplification of any of the DNA molecules described above.
The invention further provides: A cell comprising an expression cassette or parts thereof, comprising an isolated DNA molecule of the present invention, wherein this DNA molecule in the expression cassette can be expressed in the cell. In a preferred embodiment, the DNA molecule is not derived from barley. In another embodiment, the cell is a plant cell. In a further preferred embodiment, the expression cassette is stably integrated into the genome of the cell, or is included in a self-replicating vector and remains in the cell as an extrachromosomal molecule.
The invention further provides: A plant comprising an expression cassette, or parts thereof, comprising an isolated DNA molecule of the present invention. In a preferred embodiment, the DNA molecule is not derived from barley. In another preferred embodiment, the DNA molecule comprised in the expression cassette can be expressed in the plant. In another preferred embodiment, the DNA molecule is stably integrated into the genome of the plant, or is included in a self-replicating vector and remains in the cell as an extrachromosomal molecule. In another preferred embodiment, the plant is resistant to fungal pathogens, desirably fungal pathogens that infect living epidermal cells; most desirably, the fungal pathogens are of the order Erysiphales, also known as powdery mildews, preferably from the genus Erysiphe, and more preferably the fungal pathogen is Erysiphe graminis. The invention also relates to the seed of this plant, which seed is optionally treated (e.g., primed or coated) and/or packaged, for example placed in a bag with instructions for use.
The invention further provides: Agricultural products comprising a plant comprising an isolated DNA molecule of the present invention. In a preferred embodiment, the agricultural product is used, for example, as food, feed, or silage, and does not contain mycotoxins produced by fungal pathogens, such as, for example, aflatoxins. Therefore, the agricultural product has better phytosanitary properties.
The invention further provides: A method for making a plant resistant to a fungal pathogen, which comprises the steps of:
a) expressing in a plant an RNA transcript encoded by any of the DNA molecules described above, in a "sense" orientation; or
b) expressing in a plant an RNA transcript encoded by any of the DNA molecules described above, in an "anti-sense" orientation; or
c) expressing in a plant a ribozyme capable of specifically cleaving a messenger RNA transcript encoded by an endogenous gene corresponding to any of the DNA molecules described above; or
d) expressing in a plant an aptamer specifically targeting an endogenous protein encoded by a gene corresponding to any of the DNA molecules described above; or
e) expressing in a plant a mutated or truncated form of any of the DNA molecules described above, so that it can act as a dominant negative mutant; or
f) modifying, by homologous recombination in a plant, at least one chromosomal copy of the gene corresponding to any of the DNA molecules described above; or
g) modifying, by homologous recombination in a plant, at least one chromosomal copy of the regulatory elements of a gene corresponding to any of the DNA molecules described above.
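The difference between the "sense" and "anti-sense" orientations in steps a) and b) can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the nine-nucleotide fragment and the function names are hypothetical and do not come from the sequence listing. The sketch only shows that an anti-sense transcript corresponds to the reverse complement of the coding strand.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string containing only A, C, G, T."""
    return dna.translate(_COMPLEMENT)[::-1]


def transcript(coding_strand: str, orientation: str = "sense") -> str:
    """Return the RNA expected when a fragment is expressed in the given orientation.

    'sense' gives RNA with the same sequence as the coding strand;
    'antisense' gives RNA complementary to it (hypothetical helper for illustration).
    """
    if orientation == "sense":
        dna = coding_strand
    elif orientation == "antisense":
        dna = reverse_complement(coding_strand)
    else:
        raise ValueError("orientation must be 'sense' or 'antisense'")
    return dna.upper().replace("T", "U")


# Hypothetical fragment, NOT taken from SEQ ID NOs: 3 to 17.
fragment = "ATGGCTGAC"
sense_rna = transcript(fragment, "sense")
antisense_rna = transcript(fragment, "antisense")
print(sense_rna, antisense_rna)
```

An anti-sense transcript of this kind can base-pair with the endogenous messenger RNA, which is the rationale usually given for anti-sense suppression.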
The invention further provides: A plant obtained by any of the methods described immediately above, including seed of this plant, which seed is optionally treated (e.g., primed or coated) and/or packaged, for example placed in a bag with instructions for use. In another preferred embodiment, the obtained plant is resistant to fungal pathogens, desirably to fungal pathogens that infect living epidermal cells; most desirably, the fungal pathogens are of the order Erysiphales, also known as powdery mildews, preferably from the genus Erysiphe, and more preferably the fungal pathogen is Erysiphe graminis.
The invention further provides: An agricultural product with improved phytosanitary properties, obtained by any of the methods described immediately above.
The invention further provides: A method for isolating DNA molecules encoding Mlo proteins, which comprises the steps of:
a) mixing a degenerate oligonucleotide encoding at least six amino acids of SEQ ID NO: 1, and a degenerate oligonucleotide complementary to a sequence encoding at least six amino acids of SEQ ID NO: 2, with DNA extracted from a plant, under conditions that allow the hybridization of these degenerate oligonucleotides to the DNA; and
b) amplifying a DNA fragment from the DNA of this plant, wherein the DNA fragment comprises, at its left and right ends, nucleotide sequences that can anneal to the degenerate oligonucleotides of step a); and
c) obtaining a full-length cDNA clone comprising the DNA fragment from step b).
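To make the notion of a degenerate oligonucleotide in step a) more concrete, the following Python sketch counts how many distinct nucleotide sequences can encode a given peptide under the standard genetic code, and finds the six-residue window with the lowest degeneracy. It is an illustration only: the eight-residue peptide is hypothetical (the residues of SEQ ID NO: 1 and SEQ ID NO: 2 are not reproduced in this section), and the helper names are assumptions.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G at each codon position;
# '*' marks the three stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each one-letter amino acid code to the list of codons that encode it.
CODONS = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append(b1 + b2 + b3)


def degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the peptide."""
    total = 1
    for aa in peptide:
        total *= len(CODONS[aa])  # KeyError for a non-standard residue
    return total


def least_degenerate_window(peptide: str, width: int = 6):
    """Return (degeneracy, start index, window) for the best window of the given width."""
    windows = [
        (degeneracy(peptide[i:i + width]), i, peptide[i:i + width])
        for i in range(len(peptide) - width + 1)
    ]
    return min(windows)


# Hypothetical peptide, NOT one of the conserved sequences of the invention.
best = least_degenerate_window("LSMWKFDR")
print(best)
```

Windows rich in Met and Trp (one codon each) give the least degenerate primers, which is one common reason for choosing which six or more residues of a conserved region to target.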
The invention further provides: A method for producing mutated copies of the nucleotide sequences of the present invention by "in vitro recombination" or "DNA shuffling". Mutated copies of the nucleotide sequences of the present invention are used to confer better resistance to fungal pathogens. In a preferred embodiment, the mutated copies of the nucleotide sequences of the present invention are used to confer resistance to a wider range of pathogens. One of these methods is described below: a method for mutating a DNA molecule according to the present invention, wherein the DNA molecule has been cleaved into random double-stranded fragments of a desired size, which comprises the steps of: a) adding to the resulting population of random double-stranded fragments one or more single-stranded or double-stranded oligonucleotides, wherein these oligonucleotides comprise an area of identity and an area of heterology to the double-stranded polynucleotide; b) denaturing the resulting mixture of random double-stranded fragments and oligonucleotides into single-stranded fragments; c) incubating the resulting population of single-stranded fragments with a polymerase, under conditions that result in the annealing of these single-stranded fragments at the areas of identity, to form pairs of annealed fragments, these areas of identity being sufficient for one member of a pair to prime replication of the other, thereby forming a mutated double-stranded polynucleotide; and d) repeating the second and third steps for at least two additional cycles, wherein the resulting mixture in the second step of an additional cycle includes the mutated double-stranded polynucleotide from the third step of the previous cycle, and the additional cycle forms an additional mutated double-stranded polynucleotide.
DEFINITIONS
An "isolated DNA molecule" is a nucleotide sequence that, by the hand of man, exists apart from its native environment, and therefore is not a product of nature. An isolated nucleotide sequence can exist in a purified form, or it can exist in a non-native environment, such as, for example, a transgenic host cell. A "protein", as defined herein, is the entire protein encoded by the corresponding nucleotide sequence, or is a portion of the protein encoded by the corresponding portion of the nucleotide sequence. An "isolated protein" is a protein that is encoded by an isolated nucleotide sequence, and therefore is not a product of nature. An isolated protein can exist in a purified form, or it can exist in a non-native environment, such as a transgenic host cell, where the protein would not normally be expressed, or would be expressed in a different form or in a different amount than in an isogenic non-transgenic host cell. A plant "resistant to a fungal pathogen" has no symptoms, or has minor symptoms, of an infection caused by the fungus, because the ability of the fungal pathogen to grow in the plant is inhibited or limited. As a consequence, the plant grows better, has higher yields, and produces more seeds. "A protein that confers on a plant resistance to a fungal pathogen" means that the protein is involved in the regulation of the genetic pathways of the plant responsible for the resistance of the plant to the fungal pathogen. The protein can be a positive regulator, in that it improves the resistance of the plant to the fungal pathogen, or the protein can be a negative regulator, in that it represses the resistance of the plant to the fungal pathogen. A particular example of a protein that confers on a plant resistance to a fungal pathogen is a Mlo protein.
A "Mlo protein" means herein a member of a protein family (the Mlo family) whose members have a substantially similar function in a disease resistance pathway and share some structural homology. The structural homology may be, for example, that family members share at least one conserved region.
In its broadest sense, the term "substantially similar", when used herein with respect to a nucleotide sequence, means a nucleotide sequence corresponding to a reference nucleotide sequence, wherein the corresponding sequence encodes a polypeptide having substantially the same structure and function as the polypeptide encoded by the reference nucleotide sequence, for example, wherein only changes occur in amino acids that do not affect the function of the polypeptide. Desirably, the substantially similar nucleotide sequence encodes the polypeptide encoded by the reference nucleotide sequence. The percent identity between the substantially similar nucleotide sequence and the reference nucleotide sequence is desirably at least 80 percent, more desirably at least 85 percent, preferably at least 90 percent, more preferably at least 95 percent, and still more preferably at least 99 percent. The term "substantially similar", when used herein with respect to a protein, means a protein corresponding to a reference protein, wherein the protein has substantially the same structure and function as the reference protein, e.g., where only changes occur in amino acids that do not affect the function of the polypeptide. When used for a protein or amino acid sequence, the percent identity between the substantially similar and the reference protein or amino acid sequence is desirably at least 80 percent, more desirably at least 85 percent, preferably at least 90 percent, more preferably at least 95 percent, and still more preferably at least 99 percent. The percentage of sequence identity is determined using computer programs that are based on dynamic programming algorithms. Preferred computer programs within the scope of the present invention include the BLAST (Basic Local Alignment Search Tool) search programs, designed to explore all available sequence databases, regardless of whether the query is a protein or a DNA.
The BLAST 2.0 (Gapped BLAST) version of this search tool has been made publicly available on the Internet (currently http://www.ncbi.nlm.nih.gov/BLAST/). It uses a heuristic algorithm that looks for local alignments, as opposed to global alignments, and can therefore detect relationships between sequences that share only isolated regions. The scores assigned in a BLAST search have a well-defined statistical interpretation. These programs are preferably run with optional parameters set to the default values.
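As a rough illustration of the identity thresholds given in the definition of "substantially similar", the Python sketch below computes percent identity over a pair of sequences that have already been aligned (gaps written as "-") and reports the highest threshold that is met. It is not BLAST and does not reproduce its scoring or statistics. The function names, the choice of alignment length as denominator, and the example sequences are all assumptions made for illustration.

```python
# Identity thresholds named in the definition, from most to least stringent.
TIERS = (99, 95, 90, 85, 80)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Assumption: identity is counted over all alignment columns, including
    columns where one sequence has a gap. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")  # ignore columns that are gaps in both
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_tier(identity: float):
    """Return the most stringent threshold met, or None if below 80 percent."""
    for tier in TIERS:
        if identity >= tier:
            return tier
    return None


# Hypothetical aligned sequences, not taken from the sequence listing.
identity = percent_identity("ACGT-ACGTA", "ACGTTACGTA")
print(identity, highest_tier(identity))
```

In practice the alignment itself would come from an alignment program, and the reported identity depends on that program's parameters; the sketch only covers the final comparison against the thresholds.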
The term "gene" refers to a coding sequence and associated regulatory sequences, wherein the coding sequence is transcribed into RNA, such as mRNA, rRNA, tRNA, snRNA, sense RNA, or anti-sense RNA. Examples of regulatory sequences are promoter sequences, 5' and 3' untranslated sequences, and termination sequences. Additional elements that may be present are, for example, introns. "Expression" refers to the transcription and/or translation of an endogenous gene or a transgene in plants. In the case of anti-sense constructs, for example, expression may refer to the transcription of the anti-sense DNA only. "Expression cassette", as used herein, means a DNA sequence capable of directing the expression of a particular nucleotide sequence in an appropriate host cell, comprising a promoter operably linked to the nucleotide sequence of interest, which is operably linked to termination signals. It also typically comprises the sequences required for proper translation of the nucleotide sequence. The coding region normally encodes a protein of interest, but can also encode a functional RNA of interest, for example anti-sense RNA, or an untranslated RNA which, in the sense or anti-sense direction, inhibits the expression of a particular gene. The expression cassette comprising the nucleotide sequence of interest can be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components. The expression cassette may also be one that occurs naturally, but has been obtained in a recombinant form useful for heterologous expression. Normally, however, the expression cassette is heterologous with respect to the host, i.e., the particular DNA sequence of the expression cassette does not occur naturally in the host cell, and must have been introduced into the host cell or into an ancestor of the host cell through a transformation event.
The expression of the nucleotide sequence in the expression cassette may be under the control of a constitutive promoter or of an inducible promoter that initiates transcription only when the host cell is exposed to some particular external stimulus. In the case of a multicellular organism, such as a plant, the promoter may also be specific for a particular tissue, organ, or stage of development. "Heterologous", as used herein, means "of different natural or synthetic origin", or represents a non-natural state. For example, if a host cell is transformed with a nucleic acid sequence derived from another organism, particularly from another species, that gene is heterologous with respect to that host cell, and also with respect to the descendants of the host cell that carry that gene. The transforming nucleic acid may comprise a heterologous promoter, a heterologous coding sequence, or a heterologous termination sequence. Alternatively, the transforming nucleic acid can be completely heterologous, or it can comprise any possible combination of heterologous and endogenous nucleic acid sequences. Similarly, heterologous refers to a nucleotide sequence derived from, and inserted into, the same natural, original cell type, but which is present in a non-natural state, e.g., in a different copy number, or under the control of different regulatory elements. The term "promoter" refers to a DNA sequence that initiates the transcription of an associated DNA sequence. The promoter region may also include elements that act as regulators of gene expression, such as activators, enhancers, and/or repressors. "Synthetic nucleotide sequence", as used herein, means a nucleotide sequence that comprises structural features that are not present in the natural sequence. For example, an artificial sequence that more closely resembles the G+C content and the normal codon distribution of dicotyledonous and/or monocotyledonous genes is said to be synthetic.
A regulatory DNA sequence is said to be "operably linked to" or "associated with" a DNA sequence encoding an RNA or a protein if the two sequences are located in such a way that the regulatory DNA sequence affects the expression of the coding DNA sequence. "Regulatory elements" refer to the sequences involved in conferring the expression of a nucleotide sequence. The regulatory elements comprise a promoter operably linked to the nucleotide sequence of interest and termination signals. They also normally encompass the sequences required for proper translation of the nucleotide sequence. A "plant" refers to any plant or part of a plant, and particularly to seed plants, at any stage of development. It also includes cuttings, cell or tissue cultures, and seeds. As used in conjunction with the present invention, the term "plant tissue" includes, but is not limited to, whole plants, plant organs, plant seeds, protoplasts, callus, cell cultures, and any groups of plant cells organized into structural and/or functional units. A "plant cell" refers to the structural and physiological unit of the plant, which comprises a protoplast and a cell wall. The plant cell can be in the form of a single isolated cell, or a cultured cell, or a part of a higher organized unit, such as, for example, a plant tissue or a plant organ. "Transformation", as used herein, means the introduction of a nucleic acid into a cell, in particular the stable integration of a DNA molecule into the genome of an organism of interest. A "selectable marker" is conferred by a gene whose expression in a plant cell gives the cell a selective advantage. The selective advantage possessed by cells transformed with the selectable marker gene may be due to their ability to grow in the presence of a negative selective agent, such as an antibiotic or a herbicide, compared to the growth of untransformed cells.
The selective advantage possessed by the transformed cells, compared to non-transformed cells, may also be due to their improved or novel ability to use an added compound as a nutrient, growth factor, or energy source. The selectable marker gene also refers to a gene or a combination of genes whose expression in a plant cell gives the cell both a negative and a positive selective advantage. A "screenable marker" is conferred by a gene whose expression does not give a selective advantage to a transformed cell, but whose expression renders the transformed cell phenotypically distinct from non-transformed cells.
BRIEF DESCRIPTION OF THE SEQUENCES IN THE SEQUENCE LISTING
SEQ ID NO: 1 conserved amino acid sequence 1
SEQ ID NO: 2 conserved amino acid sequence 2
SEQ ID NO: 3 nucleotide sequence of the wheat Mlo protein TrMlo1
SEQ ID NO: 4 sequence of the TrMlo1 protein
SEQ ID NO: 5 nucleotide sequence of the wheat Mlo protein TrMlo2
SEQ ID NO: 6 sequence of the TrMlo2 protein
SEQ ID NO: 7 nucleotide sequence of the wheat Mlo protein TrMlo3
SEQ ID NO: 8 sequence of the TrMlo3 protein
SEQ ID NO: 9 nucleotide sequence of the Arabidopsis Mlo protein CIB10259
SEQ ID NO: 10 sequence of the CIB10259 protein
SEQ ID NO: 11 nucleotide sequence of the Arabidopsis Mlo protein CIB10295
SEQ ID NO: 12 sequence of the CIB10295 protein
SEQ ID NO: 13 nucleotide sequence of the Arabidopsis Mlo protein CIB10296
SEQ ID NO: 14 sequence of the CIB10296 protein
SEQ ID NO: 15 nucleotide sequence of the Arabidopsis Mlo protein F19850
SEQ ID NO: 16 sequence of the F19850 protein
SEQ ID NO: 17 nucleotide sequence of the Arabidopsis Mlo protein U95973
SEQ ID NO: 18 sequence of the U95973 protein
SEQ ID NO: 19 oligonucleotide MLO-1
SEQ ID NO: 20 oligonucleotide MLO-3
SEQ ID NO: 21 oligonucleotide MLO-5
SEQ ID NO: 22 oligonucleotide MLO-7
SEQ ID NO: 23 oligonucleotide MLO-10
SEQ ID NO: 24 oligonucleotide MLO-15
SEQ ID NO: 25 oligonucleotide MLO-26
SEQ ID NO: 26 oligonucleotide MLO-GSP1
SEQ ID NO: 27 oligonucleotide MLO-GSP2
SEQ ID NO: 28 oligonucleotide ST27
SEQ ID NO: 29 oligonucleotide N37544-1
SEQ ID NO: 30 oligonucleotide N37544-2
SEQ ID NO: 31 oligonucleotide T22146-1
SEQ ID NO: 32 oligonucleotide T22146-2
SEQ ID NO: 33 oligonucleotide H76041-1
SEQ ID NO: 34 oligonucleotide H76041-2
SEQ ID NO: 35 oligonucleotide SAS-1
SEQ ID NO: 36 oligonucleotide SAS-2
SEQ ID NO: 37 oligonucleotide SAS-3
SEQ ID NO: 38 oligonucleotide SAS-4
SEQ ID NO: 39 oligonucleotide SAS-5
SEQ ID NO: 40 oligonucleotide SAS-6
SEQ ID NO: 41 oligonucleotide SAS-7
SEQ ID NO: 42 oligonucleotide SAS-8
Deposits
All deposits were made at the Northern Regional Research Center, 1815 North University Street, Peoria, Illinois 61604, USA.
The present invention relates to DNA molecules that encode Mlo proteins, which confer on a plant resistance to fungal pathogens. The inventors of the present invention are the first to identify conserved amino acid sequences among the Mlo proteins. The conserved amino acid sequences of the present invention are conserved among three Mlo proteins derived from wheat and three Mlo proteins derived from A. thaliana. These amino acid sequences are also conserved in two predicted Arabidopsis Mlo proteins. The first conserved amino acid sequence, which is stipulated in SEQ ID NO: 1, comprises thirteen amino acids. The fourth amino acid of SEQ ID NO: 1 is Leu, Val, or Ile, its fifth amino acid is Val or Leu, and its seventh amino acid is Phe or Leu. The thirteenth amino acid in SEQ ID NO: 1 is not Ile, and preferably is Thr, Ser, or Ala. The second conserved amino acid sequence, which is stipulated in SEQ ID NO: 2, comprises fourteen amino acids. The first amino acid in SEQ ID NO: 2 is not Met, and preferably is Ile, Val, Ser, or Gly. Its third amino acid is Phe, Leu, or Val, its sixth amino acid is Tyr or Asn, its seventh amino acid is Ala or Val, its eighth amino acid is Leu or Ile, and its tenth amino acid is Thr or Ser. The invention covers isolated Mlo proteins comprising at least one of the conserved amino acid sequences described above, and the isolated DNA molecules that encode these Mlo proteins. The invention also encompasses isolated Mlo proteins comprising both conserved sequences stipulated in SEQ ID NO: 1 and SEQ ID NO: 2. In a preferred embodiment, the isolated DNA molecules encoding the Mlo proteins of the present invention are cDNA molecules. In a further embodiment, DNA molecules that encode Mlo proteins comprising at least one of the conserved amino acid sequences are not derived from barley.
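The positional constraints stated for the two conserved sequences can be written down as a simple check, sketched here in Python. This is an illustration under explicit assumptions: the residue names are read as the standard amino acids (Ile, Met, Tyr), only the variable positions described in this section are tested because the remaining residues of SEQ ID NO: 1 and SEQ ID NO: 2 are not reproduced here, and the test peptides (with "X" as a placeholder) are hypothetical.

```python
# Constraints as stated in the text, using one-letter amino acid codes.
# Positions are 1-based. Residues at positions not listed are NOT checked,
# because this section does not give them.
SEQ1_RULES = {
    "length": 13,
    "allowed": {4: "LVI", 5: "VL", 7: "FL"},
    "forbidden": {13: "I"},
}
SEQ2_RULES = {
    "length": 14,
    "allowed": {3: "FLV", 6: "YN", 7: "AV", 8: "LI", 10: "TS"},
    "forbidden": {1: "M"},
}


def satisfies(peptide: str, rules: dict) -> bool:
    """True if the peptide meets the length and positional constraints in rules."""
    if len(peptide) != rules["length"]:
        return False
    for position, residues in rules["allowed"].items():
        if peptide[position - 1] not in residues:
            return False
    for position, residues in rules["forbidden"].items():
        if peptide[position - 1] in residues:
            return False
    return True


# Hypothetical peptides; 'X' stands for positions this section does not specify.
ok_seq1 = satisfies("XXXLVXFXXXXXT", SEQ1_RULES)
bad_seq1 = satisfies("XXXLVXFXXXXXI", SEQ1_RULES)  # Ile at position 13
ok_seq2 = satisfies("IXFXXYALXTXXXX", SEQ2_RULES)
bad_seq2 = satisfies("MXFXXYALXTXXXX", SEQ2_RULES)  # Met at position 1
print(ok_seq1, bad_seq1, ok_seq2, bad_seq2)
```

A real screen would also have to match the invariant residues given in the sequence listing, so passing this check is necessary but not sufficient for a peptide to correspond to either conserved sequence.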
In another embodiment, these DNA molecules are derived from a dicot, or from wheat, corn, rice, oats, rye, sorghum, sugar cane, millet, sorghum, or the palm family. In a preferred embodiment, the DNA molecule of the present invention is identical or substantially similar to any of the DNA molecules stipulated in SEQ ID NOs: 3, 5, or 7, or in SEQ ID NOs: 9, 11, 13, 15, or 17, or encodes a Mlo protein that is identical or substantially similar to any of the Mlo proteins stipulated in SEQ ID NOs: 4, 6, or 8, or in SEQ ID NOs: 10, 12, 14, 16, or 18. The DNA molecules stipulated in SEQ ID NOs: 3, 5, and 7 are derived from wheat, and encode the Mlo proteins stipulated in SEQ ID NOs: 4, 6, and 8, respectively. The isolation of these DNA molecules is further illustrated in Example 1. The DNA molecules stipulated in SEQ ID NOs: 9, 11, 13, 15, and 17 are derived from Arabidopsis, and encode the Mlo proteins stipulated in SEQ ID NOs: 10, 12, 14, 16, and 18, respectively. The isolation of these DNA molecules is further illustrated in Example 2. The DNA molecule of SEQ ID NO: 3, which encodes a wheat Mlo protein called TrMlo1, is deposited as strains TrMlo1 and TrMlo1-5, with accession numbers NRRL B-21948 and NRRL B-21949, respectively. The DNA molecule of SEQ ID NO: 5, which encodes a wheat Mlo protein called TrMlo2, is deposited as strains TrMlo2 and TrMlo2-5, with accession numbers NRRL B-21950 and NRRL B-21951, respectively. The DNA molecule of SEQ ID NO: 7, which encodes a wheat Mlo protein called TrMlo3, is deposited as strains TrMlo3 and TrMlo3-5, with accession numbers NRRL B-21952 and NRRL B-21953, respectively. TrMlo1 and TrMlo3 comprise the full-length cDNAs of the corresponding Mlo genes, and also comprise some of the corresponding 5' and 3' untranslated regions. TrMlo2 is the longest cDNA clone of the corresponding gene that was recovered.
It comprises the entire coding region with the exception of the first methionine (start codon), as deduced from a comparison with TrMlo1 and TrMlo3. TrMlo2 also comprises some of the 3' untranslated region of the corresponding gene. The DNA molecule of SEQ ID NO: 9, which encodes an Arabidopsis Mlo protein named CIB10259, is deposited as strain pCIB10259, with accession number NRRL B-21946. The DNA molecule of SEQ ID NO: 11, which encodes an Arabidopsis Mlo protein named CIB10295, is deposited as strain pCIB10295, with accession number NRRL B-21946. The DNA molecule of SEQ ID NO: 13, which encodes an Arabidopsis Mlo protein named CIB10296, is deposited as strain pCIB1029, with accession number NRRL B-21947. CIB10259, CIB10295, and CIB10296 comprise the full-length cDNAs of the corresponding Mlo genes, and also comprise some of the corresponding 5' and 3' untranslated regions. The nucleotide sequences encoding members F19850 and U95973 of the Arabidopsis Mlo family are obtained from GenBank. However, for both clones, a predicted amino acid sequence is determined and found not to match the amino acid sequence predicted in the GenBank annotation. The Mlo proteins determined by the inventors of the present invention are therefore novel and not obvious. Both of the above-mentioned proteins contain the conserved amino acid sequences set forth in SEQ ID NOs: 1 and 2, and are therefore encompassed by the present invention, as are the isolated cDNAs encoding them. An Mlo protein encoded by a DNA molecule of the present invention confers on a plant resistance to fungal pathogens, desirably fungal pathogens that infect living epidermal cells of the plant, most desirably fungal pathogens of the order Erysiphales, also known as powdery mildews (Agrios, G. (1988) Plant Pathology, Third Edition, Academic Press Inc., in particular page 271).
Preferably, an Mlo protein encoded by a DNA molecule of the present invention confers on a plant resistance to the genus Erysiphe; more preferably, the fungal pathogen is Erysiphe graminis. The present invention also encompasses recombinant vectors comprising any of the DNA molecules of this invention. In these vectors, the DNA molecules are preferably comprised in an expression cassette comprising regulatory elements for the expression of the DNA molecules in a host cell capable of expressing them. These regulatory elements are usually a promoter and termination signals, and preferably also include elements that allow efficient translation of a protein encoded by a DNA molecule of the present invention. In a preferred embodiment, the expression cassette is heterologous. These vectors are used to transform the expression cassette comprising any of the DNA molecules of this invention into a host cell. In a preferred embodiment, the expression cassette is stably integrated into the host cell DNA. In another preferred embodiment, the expression cassette is comprised in a vector that is capable of replicating in a host cell and remains in the host cell as an extrachromosomal molecule. In a further preferred embodiment, this extrachromosomally replicating molecule is used to amplify the DNA molecules of this invention in a host cell. In a preferred embodiment, this host cell is a microorganism, such as a bacterium, in particular E. coli. In another preferred embodiment, the host cell is a eukaryotic cell, such as a yeast cell, an insect cell, or a plant cell.
In a further embodiment, a DNA molecule of the present invention is modified by the incorporation of random mutations in a technique known as in vitro recombination or DNA shuffling. This technique is described in Stemmer et al., Nature 370: 389-391 (1994), and in U.S. Patent No. 5,605,793, incorporated herein by reference. Millions of mutant copies are produced based on the original nucleotide sequence described herein, and variants with improved properties, such as increased resistance to fungal pathogens or resistance against a wider range of pathogens, are recovered. The method comprises forming a mutagenized double-stranded polynucleotide from a double-stranded template polynucleotide comprising the nucleotide sequence of this invention, wherein the double-stranded template polynucleotide has been cleaved into double-stranded random fragments of a desired size, and comprises the steps of adding to the resulting population of double-stranded random fragments one or more single-stranded or double-stranded oligonucleotides, wherein these oligonucleotides comprise an area of identity and an area of heterology to the double-stranded template polynucleotide; denaturing the resulting mixture of double-stranded random fragments and oligonucleotides into single-stranded fragments; incubating the resulting population of single-stranded fragments with a polymerase, under conditions that result in the annealing of the single-stranded fragments at the areas of identity to form pairs of annealed fragments, these areas of identity being sufficient for one member of a pair to prime replication of the other, thereby forming a mutagenized double-stranded polynucleotide; and repeating the second and third steps for at least two further cycles, wherein the resulting mixture in the second step of a further cycle includes the mutagenized double-stranded polynucleotide from the third step of the previous cycle, and the further cycle forms a further mutagenized double-stranded polynucleotide. In a preferred embodiment, the concentration of a single species of double-stranded random fragment in the population of double-stranded random fragments is less than 1 percent by weight of the total DNA. In a further preferred embodiment, the double-stranded template polynucleotide comprises at least about 100 species of polynucleotides. In another embodiment, the size of the double-stranded random fragments is from about five base pairs to five kilobases. In a further embodiment, the fourth step of the method comprises repeating the second and third steps for at least ten cycles. The present invention also encompasses cells comprising a DNA molecule of the present invention, wherein the DNA molecule is not in its native cellular environment. In a preferred embodiment, these cells are plant cells. In another preferred embodiment, a DNA molecule of the present invention can be expressed in these cells, and is comprised in an expression cassette that allows its expression in those cells. In a preferred embodiment, the expression cassette is stably integrated into the host cell DNA. In another preferred embodiment, the expression cassette is comprised in a vector that is capable of replicating in the cell and remains in the cell as an extrachromosomal molecule. The present invention also encompasses a plant comprising the plant cells described above. In a further embodiment, the DNA molecules of the present invention can be expressed in the plant, and the expression of any of the DNA molecules of the present invention, or of a portion thereof, in transgenic plants confers on the transgenic plant resistance to fungal pathogens. In a preferred embodiment, the fungal pathogens desirably infect living epidermal cells; more desirably, the fungal pathogens are of the order Erysiphales, also known as powdery mildews, preferably of the genus Erysiphe, and more preferably the fungal pathogen is Erysiphe graminis.
Accordingly, the present invention also encompasses transgenic plants made resistant to fungal pathogens by the expression of any of the DNA molecules of the present invention, or of a portion thereof. The plants transformed according to the present invention can be monocots or dicots, and include, but are not limited to, corn, wheat, barley, rye, sweet potato, beans, peas, chicory, lettuce, cabbage, cauliflower, broccoli, turnip, radish, spinach, asparagus, onion, garlic, pepper, celery, chayote, pumpkin, hemp, zucchini, apple, pear, quince, melon, plum, cherry, peach, nectarine, apricot, strawberry, grape, raspberry, blackberry, pineapple, avocado, papaya, mango, banana, soybean, tomato, sorghum, sugar cane, sugar beet, sunflower, rapeseed, clover, tobacco, carrot, cotton, alfalfa, rice, potato, eggplant, cucumber, Arabidopsis thaliana, and woody plants such as coniferous and deciduous trees, especially corn, wheat, or sugar beet. Once the desired nucleotide sequence has been transformed into a particular plant species, it can be propagated in that species, or it can be moved into other varieties of the same species, particularly including commercial varieties, using traditional breeding techniques. For their expression in transgenic plants, the DNA molecules may require modification and optimization. It is known in the art that all organisms have specific preferences for codon usage, and the codons in the nucleotide sequence comprised in the DNA molecules of the present invention can be changed to conform to the specific preferences of the plant, as long as the amino acids encoded by them are maintained.
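The codon adjustment described above, changing codons while keeping the encoded amino acids, can be sketched as follows. This is a minimal sketch under stated assumptions: the `PREFERRED` table is purely hypothetical and does not represent measured codon usage for any plant, and the function names are illustrative. Only the standard genetic code table is factual.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical preferred codons, for illustration only (not real usage data).
PREFERRED = {"L": "CTG", "V": "GTG", "S": "TCC", "F": "TTC", "K": "AAG"}


def translate(dna):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, preferred):
    """Swap each codon for the preferred synonymous codon, if one is given."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        # Guard against a table entry that would change the protein.
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        out.append(new_codon)
    return "".join(out)
```

Comparing `translate(original)` with `translate(recode(original, PREFERRED))` confirms that the encoded protein is unchanged, which is the condition the text imposes.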
In addition, high expression in plants is best achieved from coding sequences having a GC content of at least 35 percent, and preferably more than 45 percent. Nucleotide sequences that have low GC contents may be expressed poorly, due to the existence of ATTTA motifs, which can destabilize messages, and AATAAA motifs, which can cause inappropriate polyadenylation. Although preferred gene sequences may be adequately expressed in both monocotyledonous and dicotyledonous plant species, the sequences can be modified to take into account the specific codon preferences and GC content preferences of monocots or dicots, as these preferences have been shown to differ (Murray et al., Nucl. Acids Res. 17: 477-498 (1989)). In addition, the nucleotide sequences are screened for the existence of illegitimate splice sites that may cause message truncation. All the changes that need to be made within the nucleotide sequences, such as those described above, are made using well-known techniques of site-directed mutagenesis, polymerase chain reaction, and synthetic gene construction, using the methods disclosed in published Patent Applications Nos. EP 0,385,962, EP 0,359,472, and WO 93/07278. For efficient initiation of translation, the sequences adjacent to the initiating methionine may require modification. For example, they can be modified by including sequences that are known to be effective in plants. Joshi has suggested an appropriate consensus for plants (NAR 15: 6643-6653 (1987)), and Clontech suggests an additional consensus translation initiator (1993/1994 catalog, page 210). These consensus sequences are suitable for use with the nucleotide sequences of this invention.
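The screening described above can be illustrated with a short check for GC content and for the two motifs named in the text. This is a sketch only: the function names are hypothetical, the 35 and 45 percent thresholds are the ones stated above, and the example sequence in the usage note is invented for demonstration.

```python
def gc_percent(dna):
    """Return the GC content of a sequence as a percentage."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def find_motif(dna, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    dna, motif = dna.upper(), motif.upper()
    positions = []
    start = dna.find(motif)
    while start != -1:
        positions.append(start)
        start = dna.find(motif, start + 1)
    return positions


def screen(dna):
    """Summarize the features the text flags as relevant to expression."""
    gc = gc_percent(dna)
    return {
        "gc_percent": gc,
        "meets_minimum_gc": gc >= 35.0,    # "at least 35 percent"
        "meets_preferred_gc": gc > 45.0,   # "preferably more than 45 percent"
        "ATTTA": find_motif(dna, "ATTTA"),   # potential message destabilization
        "AATAAA": find_motif(dna, "AATAAA"), # potential inappropriate polyadenylation
    }
```

For instance, `screen("CCATTTAGGAATAAACC")` would report one ATTTA motif and one AATAAA motif; such a report only flags candidate sites for modification and does not predict expression levels.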
The sequences are incorporated into constructs that comprise the nucleotide sequence, up to and including the ATG (while leaving the second amino acid unmodified), or alternatively up to and including the GTC subsequent to the ATG (with the possibility of modifying the second amino acid of the transgene). Expression of the DNA molecules in transgenic plants is driven by a promoter shown to be functional in plants. The choice of promoter will vary depending on the temporal and spatial requirements for expression, and also depending on the target species. For the protection of plants against foliar pathogens, expression in the leaves is preferred; for the protection of plants against ear pathogens, expression in inflorescences (e.g., spikes, panicles, ears, etc.) is preferred; for the protection of plants against root pathogens, expression in the roots is preferred; for the protection of seedlings against soil-borne pathogens, expression in the roots and/or in the seedlings is preferred. However, in many cases protection against more than one type of phytopathogen is sought, and therefore expression in multiple tissues is desirable. Although many promoters from dicotyledons have been shown to be operative in monocotyledons, and vice versa, dicotyledonous promoters are ideally selected for expression in dicotyledons, and monocotyledonous promoters for expression in monocotyledons. However, there is no restriction on the origin of the selected promoters; it is sufficient that they are operative in driving the expression of the DNA molecules in the desired cell. Preferred promoters that are expressed constitutively include promoters derived from Agrobacterium opine synthase genes, for example the nos promoter, or a dual promoter from the Ti plasmid of Agrobacterium (Velten et al. (1984) EMBO J. 3: 2723-2730), or viral promoters operative in plants, for example the CaMV 35S and 19S promoters, and promoters from genes encoding actin or ubiquitin.
Another preferred promoter is a synthetic promoter, such as the Gelvin Super MAS promoter (Ni et al. (1995) Plant J. 7: 661-676). The DNA molecules of this invention can also be expressed under the regulation of promoters that are chemically regulated. This makes it possible for the protein that controls fungal diseases to be synthesized only when the crop plants are treated with the inducing chemicals. The preferred technology for the chemical induction of gene expression is detailed in published application EP 0,332,104 and in U.S. Patent No. 5,614,395. A preferred promoter for chemical induction is the tobacco PR-1a promoter. A preferred category of promoters is that which is wound inducible. Numerous promoters have been described that are expressed at wound sites, and also at sites of infection by phytopathogens. Ideally, such a promoter should be active only locally at the sites of infection, so that the protein that controls fungal diseases accumulates only in the cells that need to synthesize it to kill the invading insect pest. Preferred promoters of this class include those described by Stanford et al., Mol. Gen. Genet. 215: 200-208 (1989), Xu et al., Plant Molec. Biol. 22: 573-588 (1993), Logemann et al., Plant Cell 1: 151-158 (1989), Rohrmeier & Lehle, Plant Molec. Biol. 22: 783-792 (1993), Firek et al., Plant Molec. Biol. 22: 129-142 (1993), and Warner et al., Plant J. 3: 191-201 (1993).
Preferred tissue-specific expression patterns include green tissue-specific, root-specific, stem-specific, and flower-specific expression. Promoters suitable for expression in green tissue include many that regulate genes involved in photosynthesis, and many of these have been cloned from both monocotyledons and dicotyledons. A preferred promoter is the maize PEPC promoter from the phosphoenolpyruvate carboxylase gene (Hudspeth and Grula, Plant Molec. Biol. 12: 579-589 (1989)). A preferred promoter for root-specific expression is that described by de Framond (FEBS 290: 103-106 (1991); European Patent Number EP 0,452,269), and an additional preferred root-specific promoter is that of the Tl gene provided by this invention. A preferred stem-specific promoter is that described in U.S. Patent No. 5,625,136, which drives expression of the maize trpA gene. Preferred embodiments of the invention are transgenic plants that express a DNA molecule in a root-specific manner. Additional preferred embodiments are transgenic plants that express the DNA molecule in a manner inducible by wounding or by pathogen infection. In addition to the selection of a suitable promoter, constructs for the expression of the protein in plants require an appropriate transcription terminator attached downstream of the heterologous nucleotide sequence. Several such terminators are available and known in the art (e.g., tml from CaMV, E9 from rbcS). Any available terminator known to function in plants can be used in the context of this invention. Numerous other sequences can be incorporated into the expression cassettes for the DNA molecules of this invention. These include sequences that have been shown to enhance expression, such as intron sequences (e.g., from Adh1 and bronze1) and viral leader sequences (e.g., from TMV, MCMV, and AMV). It may be preferable to target the expression of the DNA molecules to different cellular locations in the plant.
In some cases, localization in the cytosol may be desirable, while in other cases, localization in some subcellular organelle may be preferred. The subcellular localization of the enzymes encoded by the transgene can be achieved using techniques well known in the art. Normally, DNA encoding the targeting peptide from a known organelle-targeted gene product is manipulated and fused upstream of the nucleotide sequence. Many such targeting sequences are known for the chloroplast, and their functioning in heterologous constructs has been demonstrated. Suitable vectors for plant transformation are described elsewhere in this specification. For Agrobacterium-mediated transformation, binary vectors or vectors carrying at least one T-DNA border sequence are suitable, while for direct gene transfer any vector is suitable, and linear DNA containing only the construct of interest may be preferred. In the case of direct gene transfer, transformation with a single DNA species, or co-transformation, can be used (Schocher et al., Biotechnology 4: 1093-1096 (1986)). For both direct gene transfer and Agrobacterium-mediated transfer, transformation is usually (but not necessarily) undertaken with a selectable marker that can provide resistance to an antibiotic (kanamycin, hygromycin, or methotrexate) or to a herbicide (glufosinate, glyphosate, or a protoporphyrinogen oxidase inhibitor), or with a selectable marker that can confer a selective advantage on transformed cells, such as a phosphomannose isomerase gene. However, the choice of the selectable marker is not critical to the invention. In another preferred embodiment, the DNA molecules of this invention are directly transformed into the plastid genome. Plastid transformation technology is extensively described in U.S. Patent Nos. 5,451,513, 5,545,817, and 5,545,818, in PCT Application Number WO 95/16783, and in McBride et al. (1994) Proc. Natl. Acad. Sci. USA 91, 7301-7305.
The basic technique for chloroplast transformation involves introducing regions of cloned plastid DNA flanking a selectable marker, together with the DNA molecule of interest, into a suitable target tissue, for example using biolistics or protoplast transformation (e.g., transformation mediated by calcium chloride or PEG). The flanking regions of 1 to 1.5 kb, referred to as targeting sequences, facilitate homologous recombination with the plastid genome, and therefore allow the replacement or modification of specific regions of the plastome. Initially, point mutations in the chloroplast 16S rRNA and rps12 genes that confer resistance to spectinomycin and/or streptomycin are used as selectable markers for transformation (Svab, Z., Hajdukiewicz, P., and Maliga, P. (1990) Proc. Natl. Acad. Sci. USA 87, 8526-8530; Staub, J.M., and Maliga, P. (1992) Plant Cell 4, 39-45). This results in stable homoplasmic transformants at a frequency of approximately one per 100 bombardments of target tissue. The presence of cloning sites between these markers allowed the creation of a plastid targeting vector for the introduction of foreign genes (Staub, J.M., and Maliga, P. (1993) EMBO J. 12, 601-606, incorporated herein by reference). Substantial increases in transformation frequency are obtained by replacing the recessive rRNA or r-protein antibiotic resistance genes with a dominant selectable marker, the bacterial aadA gene encoding the spectinomycin-detoxifying enzyme aminoglycoside-3'-adenyltransferase (Svab, Z., and Maliga, P. (1993) Proc. Natl. Acad. Sci. USA 90, 913-917). Previously, this marker had been used successfully for high-frequency transformation of the plastid genome of the green alga Chlamydomonas reinhardtii (Goldschmidt-Clermont, M. (1991) Nucl. Acids Res. 19, 4083-4089). Other selectable markers useful for plastid transformation are known in the art, and are encompassed within the scope of the invention.
Normally, approximately 15 to 20 cell division cycles following transformation are required to reach a homoplastidic state. Plastid expression, in which genes are inserted by homologous recombination into all of the several thousand copies of the circular plastid genome present in each plant cell, takes advantage of the enormous copy number advantage over nuclear-expressed genes to permit expression levels that can readily exceed 10 percent of the total soluble plant protein. The present invention also encompasses agricultural products comprising transgenic plants made resistant to fungal pathogens by the expression of any of the DNA molecules of the present invention, or made resistant to fungal pathogens by any of the methods described above. Because these plants are resistant to fungal pathogens, the growth of pathogens in them is suppressed. These plants, and the agricultural products derived from them, are therefore less likely to contain mycotoxins, which are naturally produced by many fungal pathogens and which can be very toxic to humans and animals. These agricultural products therefore have better phytosanitary properties. In a preferred embodiment, these agricultural products are used as feed, as silage, or as food. It is a further object of the present invention to provide methods for making plants resistant to fungal pathogens. The Mlo proteins encoded by the DNA molecules of the present invention confer on a plant resistance to a fungal pathogen, and it is a preferred object of the present invention to alter the expression of these proteins in their natural host environment. It is a further preferred object of the present invention to alter the stability or activity of these proteins in their natural environment. These alterations of the expression, stability, or activity of the proteins encoded by the DNA molecules of the present invention in a plant result in a greater resistance of the plant to fungal pathogens.
Preferably, a protein encoded by a DNA molecule of the present invention is a negative regulator of a plant's resistance to fungal pathogens, because it represses the genetic pathways in the plant that are responsible for the resistance of the plant to fungal pathogens. Accordingly, it is a preferred object of the present invention to reduce the expression of the Mlo proteins encoded by the DNA molecules of the present invention in their natural host environment, or to reduce the stability or activity of these proteins in their natural host environment.
"Sense" suppression
In a preferred embodiment, the reduction of the expression of a protein encoded by a DNA molecule of the present invention is obtained by "sense" suppression (referenced, for example, in Jorgensen et al. (1996) Plant Mol. Biol. 31, 957-973). In this case, all or a portion of a DNA molecule of the present invention is comprised in an expression cassette, which is introduced into a host cell, preferably a plant cell, where the DNA molecule can be expressed. The DNA molecule is inserted into the expression cassette in the "sense orientation", meaning that the 5' end of the DNA molecule is adjacent to the promoter in the expression cassette, and that the coding strand of the DNA molecule can be transcribed. In a preferred embodiment, the DNA molecule can be fully translated, and all the genetic information comprised in the DNA molecule, or portion thereof, is translated into a protein. In another preferred embodiment, the DNA molecule can be partially translated, and a short peptide is translated. In a preferred embodiment, this is achieved by inserting at least one premature stop codon into the DNA molecule, which brings translation to a halt. In another, more preferred embodiment, the DNA molecule is transcribed, but no translation product is made. This is usually accomplished by removing the start codon, for example the "ATG", of the protein encoded by the DNA molecule. In a further preferred embodiment, the expression cassette comprising the DNA molecule, or a portion thereof, is stably integrated into the genome of the host cell. In another preferred embodiment, the expression cassette comprising the DNA molecule, or a portion thereof, is comprised in an extrachromosomally replicating molecule.
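The two sequence modifications mentioned above, inserting a premature stop codon and removing the start codon, can be sketched at the sequence level as follows. This is an illustration only: the function names and the choice of TAA as the stop codon are assumptions, and the sketch does not address downstream in-frame ATG codons, which could in principle still initiate translation.

```python
STOP_CODON = "TAA"  # one of the three standard stop codons; chosen arbitrarily


def insert_premature_stop(cds, codon_index):
    """Insert an in-frame stop codon before the codon at codon_index (0-based)."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    offset = 3 * codon_index
    if not 0 < offset < len(cds):
        raise ValueError("codon_index must fall inside the coding sequence")
    return cds[:offset] + STOP_CODON + cds[offset:]


def remove_start_codon(cds):
    """Return the coding sequence without its leading ATG."""
    if not cds.upper().startswith("ATG"):
        raise ValueError("sequence does not begin with ATG")
    return cds[3:]
```

With the invented input `"ATGGCTGCTTAA"`, `insert_premature_stop(cds, 2)` yields a sequence whose translation would terminate after two codons, while `remove_start_codon(cds)` yields a transcribable sequence lacking the initiating ATG.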
In transgenic plants containing one of the expression cassettes described immediately above, the expression of the gene corresponding to the DNA molecule comprised in the expression cassette is reduced or eliminated, leading to reduced levels of the protein, or to its absence, in the transgenic plants. As a result, the transgenic plants are resistant to fungal pathogens.
"Anti-sense" suppression
In another preferred embodiment, the reduction of the expression of a protein encoded by a DNA molecule of the present invention is obtained by "anti-sense" suppression. All or a portion of a DNA molecule of the present invention is comprised in an expression cassette, which is introduced into a host cell, preferably a plant cell, where the DNA molecule can be expressed. The DNA molecule is inserted into the expression cassette in the "anti-sense orientation", meaning that the 3' end of the DNA molecule is adjacent to the promoter in the expression cassette, and that the non-coding strand of the DNA molecule can be transcribed. In a preferred embodiment, the expression cassette comprising the DNA molecule, or a portion thereof, is stably integrated into the genome of the host cell. In another preferred embodiment, the expression cassette comprising the DNA molecule, or a portion thereof, is comprised in an extrachromosomally replicating molecule. Several publications describing this approach are cited for further illustration (Green, P.J. et al., Ann. Rev. Biochem. 55: 569-597 (1986); van der Krol, A.R. et al., Antisense Nuc. Acids & Proteins, pages 125-141 (1991); Abel, P.P. et al., Proc. Natl. Acad. Sci. USA 86: 6949-6952 (1989); Ecker, J.R. et al., Proc. Natl. Acad. Sci. USA 83: 5372-5376 (August 1986)).
Homologous recombination
In another preferred embodiment, at least one genomic copy corresponding to a DNA molecule of the present invention is modified in the genome of the plant by homologous recombination, as further illustrated in Paszkowski et al., EMBO Journal 7: 4021-26 (1988). This technique uses the property of homologous sequences to recognize each other and to exchange nucleotide sequences with each other, by a process known in the art as homologous recombination. Homologous recombination can occur between the chromosomal copy of a nucleotide sequence in a cell and an incoming copy of the nucleotide sequence introduced into the cell by transformation. Specific modifications are thereby accurately introduced into the chromosomal copy of the nucleotide sequence. In one embodiment, the regulatory elements of the gene encoding a protein of the present invention are modified. The existing regulatory elements are replaced by different regulatory elements, thereby reducing the expression of the protein, or they are mutated or deleted, thereby abolishing the expression of the protein. In another embodiment, the coding region of the protein is modified by deleting a part of the coding sequence or the entire coding sequence, or by mutation. The expression of a mutated protein can also confer on the plant greater resistance to fungal pathogens. In another preferred embodiment, a mutation is introduced into the chromosomal copy of a DNA molecule by transforming a cell with a chimeric oligonucleotide composed of a contiguous stretch of RNA and DNA residues in a duplex conformation, with double hairpin caps on the ends. An additional feature of the oligonucleotide is the presence of 2'-O-methylation at the RNA residues. The RNA/DNA sequence is designed to align with the sequence of a chromosomal copy of a DNA molecule of the present invention, and to contain the desired nucleotide change. This technique is further illustrated in U.S. Patent No. 5,501,967.
Ribozymes
In a further embodiment, the RNA encoding a protein of the present invention is cleaved by a catalytic RNA, or ribozyme, specific for that RNA. The ribozyme is expressed in transgenic plants, and results in reduced amounts of RNA encoding the protein of the present invention in plant cells, thereby leading to reduced amounts of protein accumulated in the cells, and to a greater resistance of the plant to fungal pathogens.
This method is further illustrated in U.S. Patent Number 4,987,071.
Dominant-Negative Mutants
In another preferred embodiment, the activity of the proteins encoded by the nucleotide sequences of this invention is changed. This is achieved by the expression of dominant-negative mutants of the proteins in transgenic plants, leading to the loss of activity of the endogenous protein.
Aptamers
In a further embodiment, the activity of a protein encoded by a DNA molecule of the present invention is inhibited by the expression, in transgenic plants, of nucleic acid ligands, called aptamers, which bind specifically to the protein. The aptamers are preferably obtained by the SELEX method (Systematic Evolution of Ligands by Exponential Enrichment). In the SELEX method, a candidate mixture of single-stranded nucleic acids having regions of randomized sequence is contacted with the protein, and the nucleic acids having a greater affinity for the target are partitioned from the rest of the candidate mixture. The partitioned nucleic acids are amplified to produce a ligand-enriched mixture. After several iterations, a nucleic acid with an optimal affinity for the protein is obtained, and it is used for expression in transgenic plants. This method is further illustrated in U.S. Patent Number 5,270,163.
Methods for Isolating Nucleotide Sequences Comprising Conserved Sequences
The conserved sequences comprised in the Mlo proteins encoded by the DNA molecules of the present invention are used for the isolation of other DNA molecules encoding these sequences. In a preferred embodiment, a mixture of degenerate oligonucleotides containing at least one possible oligonucleotide encoding a sequence set forth in SEQ ID NO: 1 or SEQ ID NO: 2 is produced. The mixture of oligonucleotides encoding the sequences set forth in SEQ ID NO: 1, and the mixture of oligonucleotides complementary to the sequences encoding the sequences set forth in SEQ ID NO: 2, are used in a PCR amplification reaction with a template DNA of choice. Mixtures of degenerate oligonucleotides are well known in the art, and the degree of degeneracy is varied as necessary. In a preferred embodiment, the template DNA is a sample of total DNA from a plant, obtained by methods well known in the art. The amplified fragments resulting from the PCR reactions described above are isolated by methods well known in the art, and are used to isolate the corresponding full-length cDNAs, by screening a cDNA library or by using a RACE protocol, both well known in the art. This method represents a novel and useful strategy for isolating new genes that confer on plants resistance to fungal pathogens.
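The notion of a degenerate oligonucleotide mixture can be illustrated by back-translating a short peptide into every oligonucleotide that encodes it under the standard genetic code. This is a minimal sketch: the function names and test peptides are hypothetical and are not the conserved sequences themselves, and a practical primer design would normally restrict the mixture to a short, low-degeneracy stretch.

```python
from itertools import product
from math import prod

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}
for index, (a, b, c) in enumerate(product(BASES, repeat=3)):
    CODONS_FOR.setdefault(AMINO[index], []).append(a + b + c)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def degeneracy(peptide):
    """Number of distinct oligonucleotides that encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)


def oligos(peptide):
    """Yield every oligonucleotide (5' to 3') that encodes the peptide."""
    for codons in product(*(CODONS_FOR[aa] for aa in peptide)):
        yield "".join(codons)


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]
```

In the scheme described above, the forward mixture would come from `oligos(...)` applied to part of one conserved sequence, and the reverse mixture from `reverse_complement` applied to each oligonucleotide encoding part of the other; `degeneracy` shows how quickly the mixture grows with residues such as Leu or Ser, which have six codons each.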
The invention will be further described with reference to the following detailed examples. These examples are provided for purposes of illustration only, and are not intended to be limiting, unless otherwise specified.
EXAMPLES
Example 1: Cloning and Sequencing of Mlo Genes from Wheat
Mlo genes are cloned from wheat using a reverse transcription polymerase chain reaction approach. RNA is prepared from the leaves of the wheat cultivar UC703, and is used for reverse transcription with a Stratagene reverse transcription polymerase chain reaction kit. The resulting cDNA is used in polymerase chain reactions with the following primers:
MLO-26 5' TTC CAG CAC CGG CAC AAG AA 3' (SEQ ID NO: 25)
MLO-10 5' AAG AAC TGC CTG AAG AAG GC 3' (SEQ ID NO: 23)
MLO-7 5' CAG AAA CTT GTC TCA TCC CTG G 3' (SEQ ID NO: 22)
MLO-5 5' ACA GAG ACC ACC TCC TTG GAA 3' (SEQ ID NO: 21)
MLO-15 5' CAC CAC CTT CAT GAT GCT CA 3' (SEQ ID NO: 24)
The polymerase chain reaction is performed using the primer pairs listed below; the reactions result in the amplification of fragments of the indicated sizes:
MLO-26 and MLO-10: 503 bp
MLO-26 and MLO-7: 1481 bp
MLO-5 and MLO-15: 650 bp
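As a quick check on primers such as those listed above, their length, GC fraction, and a rough melting temperature can be tabulated. The sketch below is illustrative only: the helper names are hypothetical, and the melting temperature uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is an assumption of this sketch and not a method stated in the example.

```python
# Primer sequences as listed in Example 1 (spaces are only for readability).
PRIMERS = {
    "MLO-26": "TTC CAG CAC CGG CAC AAG AA",
    "MLO-10": "AAG AAC TGC CTG AAG AAG GC",
    "MLO-7": "CAG AAA CTT GTC TCA TCC CTG G",
    "MLO-5": "ACA GAG ACC ACC TCC TTG GAA",
    "MLO-15": "CAC CAC CTT CAT GAT GCT CA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove spacing and normalise to upper case."""
    return seq.replace(" ", "").upper()


def gc_fraction(seq):
    """Fraction of G and C bases in the primer."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq):
    """Rough melting temperature in degrees C (Wallace rule; an assumption here)."""
    s = clean(seq)
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement, useful for checking primer orientation."""
    return clean(seq).translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(clean(seq)), round(gc_fraction(seq), 2), wallace_tm(seq))
```

The Wallace rule is only a coarse guide for short oligonucleotides; annealing temperatures would still be chosen according to the reagent supplier's recommendations.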
The fragments are cloned into pCR2.1 or pCR2.1-TOPO (Invitrogen). Plasmid DNA is prepared from the transformants and subjected to DNA sequencing. The sequencing reveals the existence of three different cDNA sequences with very high similarity to each other. These wheat Mlo genes are called TrMlo1, TrMlo2, and TrMlo3. An additional wheat Mlo clone is isolated by screening a wheat cDNA library constructed in the Lambda-ZAPII vector. To screen this library, mass excision is performed to convert the library into a Bluescript-based collection of cDNA clones. These are grown separately in pools of 80,000 independent clones, and plasmid DNA is prepared. Polymerase chain reactions are performed on the pooled DNAs using the oligonucleotide primers MLO-5 and MLO-15 (see above). Three pools produce a band of the expected size of 650 base pairs. These pools are subsequently fractionated by serial culture of bacterial clones at increasingly lower density, followed at each step by preparation of plasmid DNA and polymerase chain reaction on each subpool using the MLO-5 and MLO-15 primers. After several rounds of fractionation, a single clone containing an insert with Mlo sequence is isolated. Sequencing of the inserts of two of these clones reveals that they contain identical inserts. The sequence of the third clone reveals an insert identical to that of the other two clones, but with 40 additional bases at the 5' end. The cloning of the remaining portions of wheat Mlo is carried out by rapid amplification of cDNA ends (RACE). The RACE reactions are performed on poly-A+ RNA of wheat UC703, using a Marathon cDNA Amplification Kit (Clontech). The poly-A+ RNA is prepared from total wheat RNA using an oligo-dT cellulose column (Gibco BRL). The oligonucleotides used for the RACE reactions are:
MLO-GSP1 5' TGG ACC TCT TCA TGT TCG ATC CCA TCT G 3' (SEQ ID NO: 26)
MLO-GSP2 5' CCT GAC GCT GTT CCA GAA TGC GTT TCA 3' (SEQ ID NO: 27)
Amplification using the MLO-GSP1 primer and the 5' adapter primer provided in the kit results in a DNA fragment of approximately 1,300 nucleotides. Amplification using the MLO-GSP2 primer and the 3' adapter primer results in a DNA fragment of approximately 600 nucleotides. The fragments are cloned into pCR2.1-TOPO, and are designated TrMlo1-5, TrMlo2-5, and TrMlo3-5; they comprise the 5' ends of the wheat Mlo genes TrMlo1, TrMlo2, and TrMlo3, respectively. Plasmid DNA is prepared from the clones, and the plasmid inserts are sequenced.
Example 2: Cloning of Mlo cDNAs from Arabidopsis
Comparison of the Mlo protein sequence with translations of database entries using the TBLASTN program reveals a number of entries similar to Mlo. Of these, the full-length cDNAs corresponding to three Arabidopsis EST entries, with accession numbers H76041, N37544, and T22146, are cloned. For each EST, oligonucleotides are designed to amplify a sequence corresponding to the EST from an Arabidopsis cDNA library constructed in the plasmid pFL61 (Minet et al. (1992) Gene 121(2): 393-396). The oligonucleotides used are:
N37544-1 5' AAG ATC AAG ATG AGG ACG TGG AAG TCG TGG 3' (SEQ ID NO: 29)
N37544-2 5' AGG CTG AAC CAC TGG GGC GCC TCT CAC CAC 3' (SEQ ID NO: 30)
T22146-1 5' CAA GTA TAT GAT GCG CGTA TCTAGA GGATGA 3' (SEQ ID NO: 31)
T22146-2 5' AGG TTT CAC CAC TAA GTC TCC TTC AAT GGC 3' (SEQ ID NO: 32)
H76041-1 5' GAT CAT TCAAGA CTTAGG CTC ACT CAT GAG 3' (SEQ ID NO: 33)
H76041-2 5' AAC AGC AAG GAA GATTAC AAA TGA TGC CCA 3' (SEQ ID NO: 34)
The primers N37544-1 and N37544-2 amplify a fragment of approximately 500 base pairs from DNA prepared from the cDNA library, and the primers T22146-1 and T22146-2 amplify a fragment of approximately 250 base pairs. The primers H76041-1 and H76041-2 amplify two fragments of approximately 350 and approximately 300 base pairs. The fragment of approximately 300 base pairs is of the size predicted from the EST sequence, and is subsequently used to diagnose the presence of a cDNA corresponding to EST H76041 in the cDNA library. The library DNA is transformed into E. coli, and the clones are organized into pools of approximately 20,000 clones each. The DNA of the individual pools is screened by polymerase chain reaction using the different primer pairs, and the positive pools are subsequently subfractionated into smaller and smaller numbers of clones. Individual positive clones are isolated either by carrying this process through to completion or, in some cases, by colony hybridization using the EST sequences as probes once the pool sizes have reached 200 clones or fewer. For ESTs N37544 and T22146, a clone corresponding to the EST is successfully isolated and the insert is sequenced. The cDNA corresponding to EST N37544 is designated CIB10295 in plasmid pCIB10295, and the plasmid containing the cDNA corresponding to EST T22146 (CIB10296) is designated pCIB10296. For these two ESTs, corresponding genomic sequences are available as portions of sequences from Arabidopsis BAC clones recently deposited in GenBank. Not surprisingly, the protein sequence predicted by GenBank as the translation of these genomic sequences does not correspond to the sequence determined by direct sequencing of the cDNAs. Accordingly, the amino acid sequence of the genes corresponding to ESTs N37544 and T22146 is not obvious from the GenBank entry, and is only elucidated through cloning and sequencing of the cDNA clones.
The clone isolated using the primers for EST H76041 is found not to contain the gene for H76041, but instead a novel member of the Mlo gene family as an insert. The insert is completely sequenced, and this member of the Mlo gene family is designated CIB10259 in the plasmid pCIB10259.
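The pooled screening described in this example narrows a positive pool step by step. As a rough illustration, the number of subfractionation rounds can be estimated once a split factor is chosen. The example does not state how many subpools are made per round, so the factor of 10 used below is purely an assumption, and the function name is hypothetical.

```python
import math


def rounds_to_target(pool_size, target_size, split):
    """Count subfractionation rounds until a positive pool holds at most
    `target_size` clones, dividing the pool into `split` subpools per round.

    `split` is an assumed parameter; the example does not specify it.
    """
    if pool_size < 1 or target_size < 1 or split < 2:
        raise ValueError("pool_size and target_size must be >= 1, split >= 2")
    rounds = 0
    size = pool_size
    while size > target_size:
        size = math.ceil(size / split)  # size of each subpool after splitting
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Pools of about 20,000 clones reduced to 200 clones or fewer,
    # the point at which colony hybridization is used in some cases.
    print(rounds_to_target(20000, 200, 10))
    # Carrying the process through to a single clone instead.
    print(rounds_to_target(20000, 1, 10))
```

Each round costs roughly `split` plasmid preparations and polymerase chain reactions per positive pool, which is why switching to colony hybridization at small pool sizes can save work.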
Example 3: Construction of Vectors for the Expression of Mlo Genes in Wheat
Two vectors are constructed for the antisense expression of the barley Mlo gene in wheat. Polymerase chain reaction is performed using barley cDNA and the primer pair MLO-5 and MLO-7 (see Example 1 above), and the reaction results in the amplification of a fragment of 1,124 base pairs, which is cloned in pGEM-T (Promega). This fragment is excised from pGEM-T using the SacII and NotI enzymes. The fragment of 1,124 base pairs is cloned into pBluescript-SK(+). The insert is then excised with the restriction enzymes BamHI and SacI, and is cloned into pCIB9806 digested with BamHI and SacI (described in Patent Application Number 08/838,219), in an orientation in which the Mlo coding sequence runs opposite to the maize ubiquitin promoter. This plasmid is designated pCK01. To construct a vector for the antisense expression of the whole Mlo gene in wheat, polymerase chain reaction is performed using the primer pair MLO-1 (5' ATG TCG GAC AAA AAA GGG GT 3' (SEQ ID NO: 19)) and MLO-10 (see Example 1 above), and the reaction results in the amplification of a fragment of 635 base pairs, which is cloned in pCR2.1 (Invitrogen). This fragment is excised from pCR2.1 as an EcoRI fragment, and inserted into pGEM-9Zf(-) (Promega). A fragment of 320 nucleotides, extending from the SacI site that occurs naturally in Mlo to the site of the MLO-10 primer, is excised with SacI and BstXI; pCK01 is digested with SacI and BstXI, and the 320-base fragment is inserted. To complete the construction of the Mlo gene in the monocot expression vector, a SacI fragment of 210 nucleotides is excised from the pGEM-9Zf(-) derivative. This fragment contains the 5' end of the Mlo coding sequence, from the MLO-1 primer site to the SacI site that occurs naturally in the Mlo gene. The pCK01 derivative is digested with SacI, and the 210-base fragment is inserted.
The clones are analyzed by polymerase chain reaction, using primers MLO-1 and MLO-10, to determine the orientation of the 210-base fragment in the newly constructed vector. Only the clones in which the 210-base fragment is inserted in the antisense orientation relative to the ubiquitin promoter produce a 530 base pair product corresponding to the 5' end of the Mlo coding sequence. The resulting plasmid contains the entire Mlo coding sequence in the antisense orientation relative to the ubiquitin promoter, and is designated pCK02. To construct a vector for the expression of the Mlo gene in the sense orientation, the plasmid pCK02 is digested with BamHI to release the Mlo coding sequence as the insert. The BamHI fragment is ligated back into the pCK02 base vector. Colonies with the Mlo coding sequence in the reverse orientation relative to pCK02 are identified by digestion with SacI, which produces a 1.8 kb fragment in these clones, as opposed to a 210-base fragment in clones with the same configuration as pCK02. A clone with the Mlo coding sequence in the sense orientation relative to the maize ubiquitin promoter is selected, and is designated pCK03.
Example 4: Construction of Vectors for Expression of Mlo Genes in Arabidopsis
The Mlo clones in pCIB10259, pCIB10295, and pCIB10296, together with pCK02 (for the barley Mlo gene), are used in polymerase chain reactions, yielding bands that carry the full-length gene sequences flanked by BamHI restriction sites. The sequences of the primers used are:
SAS-1: 5' GGATTAAGATCTAAT GGC 3' (SEQ ID NO: 35, for pCIB10295)
SAS-2: 5' CAAAGATCT TCA TTT CTTAAAAG 3' (SEQ ID NO: 36, for pCIB10295)
SAS-3: 5' GCG GAT CCATGT CGG ACAAAAAAG G 3' (SEQ ID NO: 37, for barley Mlo)
SAS-4: 5' GCG GAT CCT CAT CCC TGG CTG AAG G 3' (SEQ ID NO: 38, for barley Mlo)
SAS-5: 5' GGATCC ACC ATG GCC ACAAGA TG 3' (SEQ ID NO: 39, for pCIB10259)
SAS-6: 5' GGA TCC TTC GTC AATATC ATT AGC 3' (SEQ ID NO: 40, for pCIB10259)
SAS-7: 5' GCG GAT CCATGG GTC ACG GAG GAG AAG 3' (SEQ ID NO: 41, for pCIB10296)
SAS-8: 5' GCG GAT CCT CAG TTG TTATGA TCA GGA 3' (SEQ ID NO: 42, for pCIB10296)
The bands are cloned into pCR2.1-TOPO, and the inserts of the resulting plasmids are sequenced to confirm the absence of mutations introduced by the polymerase chain reaction. The plasmids are digested with BamHI, and the inserts are purified and cloned into BamHI-digested pPEH28, a shuttle vector containing a copy of the promoter of the Arabidopsis ubiquitin gene UBQ3 (Norris et al. (1993) Plant Molecular Biology 21: 895-906), immediately downstream of the BamHI site. Clones containing Mlo sequences fused with UBQ3 are identified, and restriction analysis is performed to identify the clones with the inserts in the sense and antisense orientations relative to UBQ3. For each Mlo gene, one clone with the insert in the sense orientation and one clone with the insert in the antisense orientation are digested with XbaI, and the insert is purified and cloned into XbaI-digested pCIB200. This places the UBQ3-Mlo gene fusion between the T-DNA borders.
Example 5: Wheat Transformation and Identification of Expressors
Wheat is transformed by particle bombardment of immature embryos, as described in detail in Patent Application Number WO 94/13822. The seedlings are regenerated on a medium containing BASTA, and subjected to polymerase chain reaction analysis. To diagnose the presence of Mlo transgenes by polymerase chain reaction, the following primers are used:
MLO-3: 5' ATG CTA CCA CAC GCA GAT CG 3'
ST27: 5' ACT TCT GCA GGT CGA CTC TA 3'
The MLO-3 primer corresponds to a region of the Mlo transgene, while the ST27 primer falls within the maize ubiquitin promoter sequence. Using one primer from the Mlo gene and one from the ubiquitin promoter in the polymerase chain reaction eliminates the false positives that would arise from the use of two Mlo primers, which could prime from the chromosomal copy of the Mlo gene present in wheat. Plants confirmed to contain Mlo transgenes are subjected to RNA gel blot analysis, to determine whether they contain altered levels of the chromosomally encoded wheat Mlo mRNAs. Poly-A+ RNAs are prepared from the individual transgenic lines, and blotted onto Hybond-N+ filters. The blots are probed with a fragment of 530 bases corresponding to the 5' end of the Mlo gene. This region is absent from the pCK01 clone; therefore, no hybridization to the antisense RNA expressed from the transgene is anticipated in the transgenic lines containing pCK01. For the pCK02 transgenic lines, where the transgene contains this 5' end fragment, the probe hybridizes to two bands of different size. An mRNA of approximately 2.5 kb corresponding to the antisense transgene is distinguished from the 2.0 kb mRNA derived from the chromosomally encoded wheat Mlo genes. The abundance of the 2.0 kb mRNA is monitored as a measure of the efficiency of the gene suppression achieved by the transgene in the individual lines.
Example 6: Disease Testing of Transgenic Wheat Lines
Transgenic plants and non-transformed plants of the wheat line UC703 (control) are grown in the greenhouse until they are two weeks old. The plants are moved to a Percival growth chamber (cycle of 8 hours of darkness at 16°C and 16 hours of light at 20°C), and inoculated with Erysiphe graminis f. sp. tritici by liberal application of spores. The degree of fungal sporulation is scored two weeks after inoculation. The plants are classified as 1 (little or no hyphal growth and no visible sporulation), 2 (some hyphal growth and sporulation, but less than in control plants), or 3 (hyphal growth and sporulation comparable with controls). The transgenic wheat lines that express the Mlo constructs show greater resistance to the pathogen. Below is an example of the results obtained with the barley Mlo antisense construct.
Screening of siblings from the transgenic lines R1 and R2 for disease resistance
Sibling plants of Mlo antisense transformants (T2 seeds) are planted, inoculated with E. graminis, and scored for disease resistance.
The fact that only a small percentage of the R1 and R2 plants exhibit resistance may be because the populations tested were T2 populations, which are still segregating for the transgene.
Example 7: Analysis of Arabidopsis Lines Expressing Mlo Genes
pCIB200 derivatives containing Mlo genes are used to transform the Ws-0 ecotype of Arabidopsis by vacuum infiltration (Bechtold, N., Ellis, J. and Pelletier, G. (1993) C. R. Acad. Sci. Paris 316, 1194-1199). The progeny are screened by kanamycin selection to identify the transformants. For the Mlo transgenic lines, plants expressing Mlo are identified by RNA gel blot analysis. For the Mlo genes, the transformants are analyzed by RNA gel blot analysis to determine alterations in the steady-state level of mRNA accumulation. Transformants that exhibit sense or antisense suppression of the target genes are tested for altered reactions to the phytopathogenic fungi Erysiphe cichoracearum and Peronospora parasitica, and to the bacterial pathogen Pseudomonas syringae pv. tomato. The leaves of the transgenic plants are inspected both macroscopically and microscopically, using trypan blue staining, to test for the presence of necrosis. For inoculation with Erysiphe, spores are liberally applied to Arabidopsis rosettes, and the plants are kept in a Percival growth chamber at 25°C. The degree of fungal sporulation is scored 10 days after inoculation.
Example 8: Use of Regions of Similarity between the Mlo Sequences to Isolate Additional Members of the Mlo Gene Family
Alignment of the predicted amino acid sequences encoded by the Mlo genes reveals a number of short regions of high amino acid similarity among all the gene products. Degenerate primers are designed in these regions, and polymerase chain reactions are performed with these primers according to the recommendations of the supplier of the polymerase chain reaction reagents. The amplified fragments are used as probes to isolate full-length cDNA or genomic clones of the novel Mlo genes. The amino acid sequences conserved among the Mlo proteins of the present invention (in bold), and the degenerate oligonucleotides used for the isolation of Mlo genes, are shown below:
M X1 X2 X3 X4
WHEAT
TrMlo1     GAG CTC ATG CTG GTG GGC TTC ATC
TrMlo2     GAG CTG ATG CTG GTG GGG TTC ATC
TrMlo3     GAG CTG ATG CTG GTG GGA TTC ATC
ARABIDOPSIS
CIB10259   GAG CTG ATG ATT CTA GGA TTC ATT
CIB10295   GAG CTT ATG CTG TTG GGA TTC ATA
CIB10296   GAG CTG ATG TTG TTA GGG TTT ATA
F19850     GAG CTG ATG GTT CTT GGA TTC ATC
U95973     GAG TTG ATG TTG CTG GGA CTT ATA
Degenerate oligonucleotide: 5' GAG CTB ATG MTB BTR GGM TTC AT 3'
X5 T X6 P L X7 X8 X9 X X10 Q M
WHEAT
TrMlo1     GCG CTC GTC ACA CAG ATG GGA TCA
TrMlo2     GCG CTC GTC ACA CAG ATG GGA TCG
TrMlo3     GCG CTA GTC ACA CAG ATG GGA TCA
ARABIDOPSIS
CIB10259   GCA CTA GTT ACT CAG ATG GGT TCA
CIB10295   GCA CTT GTT ACT CAG ATG GGT AGT
CIB10296   GCC ATC GTC TCA CAG ATG GGA AGT
F19850     GCA CTC GTA ACT CAG ATG GGT TCT
U95973     GTA ATC GTT ACT CAG ATG GGA TCT
Degenerate oligonucleotide: 5' WCC CAT CTG AGT GAC DAG BGC RTA 3'
X1 = L, V or I; X2 = V or L; X3 = F or L; X4 = T, S or A; X5 = I, V, S or G; X6 = F, L or V; X7 = Y or N; X8 = A or V; X9 = L or I; X10 = T or S.
R = A, G; Y = C, T; M = A, C; K = G, T; S = C, G; W = A, T; H = A, C, T; B = C, G, T; V = A, C, G; D = A, G, T; N = A, C, G, T.
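The degeneracy of the oligonucleotide mixtures above follows directly from the nucleotide code in the legend. The following sketch, with hypothetical function names, counts and enumerates the individual sequences represented by a degenerate primer; it is an illustration of the code table, not part of the described method.

```python
from itertools import product

# Degenerate nucleotide code, as given in the legend of Example 8.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}

FORWARD = "GAG CTB ATG MTB BTR GGM TTC AT"
REVERSE = "WCC CAT CTG AGT GAC DAG BGC RTA"


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.replace(" ", "").upper():
        total *= len(IUPAC[base])
    return total


def expand(primer):
    """List every individual sequence in the degenerate mixture."""
    choices = [IUPAC[base] for base in primer.replace(" ", "").upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(degeneracy(FORWARD), degeneracy(REVERSE))
    # The first 23 nucleotides of the TrMlo1 sequence shown above
    # should be one member of the forward mixture.
    print("GAGCTCATGCTGGTGGGCTTCAT" in expand(FORWARD))
```

Enumerating the mixture is practical only for modest degeneracy; for highly degenerate primers, `degeneracy` alone is usually enough to judge whether a design is workable.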
Example 9: Modification of the Coding Sequences and the Adjacent Sequences
The DNA molecules described in this application can be modified for expression in transgenic host plants, in order to achieve and optimize, or to reduce, their expression. The following problems may be encountered, and modification of these DNA molecules can be undertaken using techniques well known in the art.
(1) Codon Usage. The preferred codon usage in some plants differs from that preferred in other plant species. Typically, plant evolution has tended toward a strong preference for C and G nucleotides at the third base position in monocotyledons, whereas dicotyledons often use A or T nucleotides at this position. Modifying a gene to incorporate the preferred codon usage of a particular target transgenic species will overcome many of the problems described below for GC/AT content and illegitimate splicing.
(2) GC/AT Content. Plant genes usually have a GC content greater than 35 percent. DNA molecules that are rich in A and T nucleotides can cause several problems in plants. First, ATTTA motifs are believed to cause message destabilization, and are found at the 3' end of many short-lived mRNAs. Second, the occurrence of polyadenylation signals such as AATAAA at inappropriate positions within the message is believed to cause premature truncation of transcription. In addition, monocotyledons may recognize AT-rich sequences as splice sites (see below).
(3) Sequences Adjacent to the Initiating Methionine. It is believed that ribosomes bind to the 5' end of the message and scan for the first available ATG at which to initiate translation. However, it is believed that there is a preference for certain nucleotides adjacent to the ATG, and that the expression of the DNA molecules of the present invention can be enhanced by the inclusion of a consensus translation initiator at the ATG.
Clontech (1993/1994 catalog, page 210) has suggested a sequence as a consensus translation initiator for the expression of the E. coli uidA gene in plants. In addition, Joshi (NAR 15: 6643-6653 (1987)) has compared many plant sequences adjacent to the ATG, and suggests a consensus sequence. In situations where difficulties are encountered in the expression of DNA molecules in plants, the inclusion of one of these sequences at the initiating ATG can improve translation. In such cases, the last three nucleotides of the consensus may not be appropriate for inclusion in the modified sequence, because they would modify the second amino acid residue. Preferred sequences adjacent to the initiating methionine may differ between plant species. A survey of 14 maize genes located in the GenBank database provided the following results:
Position before the Initiating ATG in 14 Maize Genes:
Position   -10   -9   -8   -7   -6   -5   -4   -3   -2   -1
C            3    8    4    6    2    5    6    0   10    7
T            3    0    3    4    3    2    1    1    1    0
A            2    3    1    4    3    2    3    7    2    3
G: 6, 0 (row incomplete in the source)
This analysis can be done for the desired plant species in which the nucleotide sequence is being incorporated, and the sequence adjacent to the ATG is modified to incorporate the preferred nucleotides.
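A per-position tally of this kind can be produced for any plant species from a set of sequences upstream of the initiating ATG. The sketch below is a minimal illustration: the function names are hypothetical and the three input sequences are invented placeholders, not data from the maize survey.

```python
from collections import Counter


def tally_upstream(upstream_seqs, width=10):
    """Count nucleotides at each of the `width` positions preceding the ATG.

    Each input is the sequence immediately 5' of the ATG; index 0 of the
    result corresponds to position -width and the last index to position -1.
    """
    counts = [Counter() for _ in range(width)]
    for seq in upstream_seqs:
        window = seq.upper()[-width:]
        if len(window) != width:
            raise ValueError("upstream sequence shorter than window")
        for i, base in enumerate(window):
            counts[i][base] += 1
    return counts


def preferred(counts):
    """Most frequent nucleotide per position (ties resolved in A, C, G, T order)."""
    return "".join(max("ACGT", key=lambda b: c[b]) for c in counts)


if __name__ == "__main__":
    # Hypothetical upstream sequences, for illustration only.
    example = ["AAAAAAAACC", "AAAAAAAAGC", "CCCCCCCCGC"]
    table = tally_upstream(example)
    for offset, column in zip(range(-10, 0), table):
        print(offset, dict(column))
    print(preferred(table))
```

With real data, the preferred nucleotides found this way would be the candidates for modifying the sequence adjacent to the ATG.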
(4) Removal of Illegitimate Splice Sites. The DNA molecules of the present invention may also contain motifs that are recognized in plants as 5' or 3' splice sites and are cleaved, thereby generating truncated or deleted messages. These sites can be removed using techniques well known in the art.
(5) Creation of Dominant-Negative Mutants. In addition, the DNA molecules of the present invention also include molecules that are modified in such a way that the activity of the proteins encoded by the nucleotide sequences of this invention is changed. This is achieved by expressing dominant-negative mutants of the proteins in transgenic plants, leading to the loss of activity of the endogenous protein. The locations of mutations in the Mlo nucleotide sequence that lead to the production of these dominant-negative mutants are listed below. A single mutation, or a combination of the different mutations listed below, can be introduced.
Techniques for modifying coding sequences and adjacent sequences are well known in the art. In cases where the initial expression of a DNA molecule of the present invention is low, and it is considered appropriate to make alterations to the sequence as described above, then the construction of synthetic genes can be performed according to the well-known methods in this field. These are described, for example, in the disclosures of Published Patent Numbers EP 0,385,962, EP 0,359,472, and WO 93/07278. In most cases, it is preferable to assay the expression of the genetic constructs using transient assay protocols (which are well known in the art) prior to their transfer to the transgenic plants.
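Before deciding whether a coding sequence needs the modifications discussed in this example, it can be screened for the features named above: overall GC content, ATTTA motifs, and internal AATAAA polyadenylation signals. The following is a minimal sketch with hypothetical function names; the 35 percent threshold is taken from the GC content statement above, and the test sequence is an invented placeholder.

```python
import re


def gc_content(seq):
    """Fraction of G and C nucleotides in the sequence."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def find_motif(seq, motif):
    """0-based start positions of every (possibly overlapping) motif occurrence."""
    pattern = "(?=" + re.escape(motif.upper()) + ")"
    return [m.start() for m in re.finditer(pattern, seq.upper())]


def screen(seq, min_gc=0.35):
    """Summarise features that may reduce expression in plants."""
    gc = gc_content(seq)
    return {
        "gc_content": gc,
        "low_gc": gc < min_gc,               # below the typical plant-gene level
        "ATTTA": find_motif(seq, "ATTTA"),    # putative destabilising motifs
        "AATAAA": find_motif(seq, "AATAAA"),  # putative polyadenylation signals
    }


if __name__ == "__main__":
    # Hypothetical sequence, for illustration only.
    report = screen("ATGGCATTTAGGCAATAAACC")
    for key, value in report.items():
        print(key, value)
```

A hit does not by itself show that expression will be impaired; it only marks positions that may merit attention if initial expression turns out to be low.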
Example 10: Construction of Plant Transformation Vectors
Numerous transformation vectors are available for plant transformation, and the DNA molecules of this invention can be used in conjunction with any of these vectors. The selection of the vector to be used will depend on the preferred transformation technique and the target species for transformation. For certain target species, different antibiotic or herbicide selection markers may be preferred. The selection markers routinely used in transformation include the nptII gene, which confers resistance to kanamycin, paromomycin, geneticin, and related antibiotics (Vieira and Messing, 1982, Gene 19: 259-268; Bevan et al., 1983, Nature 304: 184-187); the bacterial aadA gene (Goldschmidt-Clermont, 1991, Nucl. Acids Res. 19: 4083-4089), which encodes aminoglycoside-3'-adenylyltransferase and confers resistance to streptomycin or spectinomycin; the hph gene, which confers resistance to the antibiotic hygromycin (Blochlinger and Diggelmann, 1984, Mol. Cell. Biol. 4: 2929-2931); and the dhfr gene, which confers resistance to methotrexate (Bourouis and Jarry, 1983, EMBO J. 2: 1099-1104). Other markers that can be used include a phosphinothricin acetyltransferase gene, which confers resistance to the herbicide phosphinothricin (White et al., 1990, Nucl. Acids Res. 18: 1062; Spencer et al., 1990, Theor. Appl. Genet. : 625-631); a mutant EPSP synthase gene that confers glyphosate resistance (Hinchee et al., 1988, Bio/Technology 6: 915-922); a mutant acetolactate synthase (ALS) gene that confers resistance to imidazolinone or sulfonylurea (Lee et al., 1988, EMBO J. 2: 1241-1248); a mutant psbA gene that confers resistance to atrazine (Smeda et al., 1993, Plant Physiol. 103: 911-917); or a mutant protoporphyrinogen oxidase gene, as described in U.S. Patent No. 5,767,373. Selection markers that result in positive selection are also used, such as a phosphomannose isomerase gene, as described in U.S. Patent No. 5,767,378. The identification of transformed cells can also be accomplished through the expression of screenable marker genes, such as the genes encoding chloramphenicol acetyltransferase (CAT), β-glucuronidase (GUS), luciferase, and green fluorescent protein (GFP), or any other protein that confers a phenotypically distinct trait on the transformed cell.
(1) Construction of Vectors Suitable for Transformation with Agrobacterium
Many vectors are available for transformation using Agrobacterium tumefaciens. These normally carry at least one T-DNA border sequence, and include vectors such as pCIB19 (Bevan, Nucl. Acids Res. (1984)) and pXYZ. The construction of two typical vectors is described below.
Construction of pCIB200 and pCIB2001
The binary vectors pCIB200 and pCIB2001 are used for the construction of recombinant vectors for use with Agrobacterium, and are constructed in the following manner. pTJS75kan is created by NarI digestion of pTJS75 (Schmidhauser and Helinski, J. Bacteriol. 164: 446-455 (1985)), allowing excision of the tetracycline resistance gene, followed by insertion of an AccI fragment from pUC4K carrying an NPTII gene (Vieira and Messing, Gene 19: 259-268 (1982); Bevan et al., Nature 304: 184-187 (1983); McBride et al., Plant Molecular Biology 14: 266-276 (1990)). XhoI linkers are ligated to the EcoRV fragment of pCIB7, which contains the left and right T-DNA borders, a plant-selectable nos/nptII chimeric gene, and the pUC polylinker (Rothstein et al., Gene 53: 153-161 (1987)), and the XhoI-digested fragment is cloned into SalI-digested pTJS75kan to create pCIB200 (see also European Patent Number EP 0,332,104, Example 19). pCIB200 contains the following unique polylinker restriction sites: EcoRI, SstI, KpnI, BglII, XbaI, and SalI. pCIB2001 is a derivative of pCIB200 that was created by inserting additional restriction sites into the polylinker. The unique restriction sites in the polylinker of pCIB2001 are EcoRI, SstI, KpnI, BglII, XbaI, SalI, MluI, BclI, AvrII, ApaI, HpaI, and StuI. In addition to containing these unique restriction sites, pCIB2001 also has plant and bacterial kanamycin selection, left and right T-DNA borders for Agrobacterium-mediated transformation, the RK2-derived trfA function for mobilization between E. coli and other hosts, and the OriT and OriV functions, also from RK2. The pCIB2001 polylinker is suitable for the cloning of plant expression cassettes containing their own regulatory signals.
Construction of pCIB10 and Hygromycin Selection Derivatives Thereof
The binary vector pCIB10 contains a gene encoding kanamycin resistance for selection in plants, and right and left T-DNA border sequences, and incorporates sequences from the broad-host-range plasmid pRK252, which allow it to replicate in both E. coli and Agrobacterium. Its construction is described by Rothstein et al. (Gene 53: 153-161 (1987)). Various derivatives of pCIB10 have been constructed that incorporate the gene for hygromycin B phosphotransferase described by Gritz et al. (Gene 25: 179-188 (1983)). These derivatives enable selection of transgenic plant cells on hygromycin only (pCIB743), or on hygromycin and kanamycin (pCIB715, pCIB717).
(2) Construction of Vectors Suitable for Transformation without Agrobacterium
Transformation without the use of Agrobacterium tumefaciens circumvents the requirement for T-DNA sequences in the chosen transformation vector; consequently, vectors lacking these sequences can be used in addition to vectors such as those described above, which contain T-DNA sequences. Transformation techniques that do not rely on Agrobacterium include transformation by particle bombardment, protoplast uptake (e.g., PEG and electroporation), and microinjection. The choice of vector depends largely on the preferred selection for the species being transformed. The construction of some typical vectors is described below.
Construction of pCIB3064
pCIB3064 is a pUC-derived vector suitable for direct gene transfer techniques in combination with selection by the herbicide BASTA (or phosphinothricin). Plasmid pCIB246 comprises the CaMV 35S promoter in operable fusion with the E. coli GUS gene, and the CaMV 35S transcription terminator, and is described in published PCT Application No. WO 93/07278. The 35S promoter of this vector contains two ATG sequences 5' of the start site. These sites are mutated using conventional polymerase chain reaction techniques, in such a way that the ATGs are removed and SspI and PvuII restriction sites are generated. The new restriction sites are 96 and 37 base pairs from the unique SalI site, and 101 and 42 base pairs from the actual start site. The resulting derivative of pCIB246 is designated pCIB3025. The GUS gene is then excised from pCIB3025 by digestion with SalI and SacI, the termini are rendered blunt, and the plasmid is re-ligated to generate pCIB3060. Plasmid pJIT82 is obtained from the John Innes Centre, Norwich, and the 400 base pair SmaI fragment containing the bar gene from Streptomyces viridochromogenes is excised and inserted into the HpaI site of pCIB3060 (Thompson et al., EMBO J. 6: 2519-2523 (1987)).
This generates pCIB3064, which comprises the bar gene under the control of the CaMV 35S promoter and terminator for herbicide selection, a gene for ampicillin resistance (for selection in E. coli), and a polylinker with the unique sites SphI, PstI, HindIII, and BamHI. This vector is suitable for the cloning of plant expression cassettes that contain their own regulatory signals.
Construction of pSOG19 and pSOG35
pSOG35 is a transformation vector that uses the E. coli dihydrofolate reductase (DHFR) gene as a selectable marker conferring resistance to methotrexate. Polymerase chain reaction is used to amplify the 35S promoter (approximately 800 base pairs), intron 6 of the maize Adh1 gene (approximately 550 base pairs), and 18 base pairs of the GUS untranslated leader sequence from pSOG10. A fragment of 250 base pairs encoding the E. coli type II dihydrofolate reductase is also amplified by polymerase chain reaction, and these two polymerase chain reaction fragments are assembled with a SacI-PstI fragment from pBI221 (Clontech), which comprises the pUC19 vector backbone and the nopaline synthase terminator. The assembly of these fragments generates pSOG19, which contains the 35S promoter in fusion with the intron 6 sequence, the GUS leader, the DHFR gene, and the nopaline synthase terminator. Replacement of the GUS leader in pSOG19 with the leader sequence from Maize Chlorotic Mottle Virus (MCMV) generates the vector pSOG35. pSOG19 and pSOG35 carry the pUC gene for ampicillin resistance, and have HindIII, SphI, PstI, and EcoRI sites available for the cloning of foreign sequences.
Example 11: Requirements for the Construction of Plant Expression Cassettes
Gene sequences intended for expression in transgenic plants are first assembled in expression cassettes behind a suitable promoter and upstream of a suitable transcription terminator.
Promoter Selection
The selection of the promoter used in the expression cassettes will determine the spatial and temporal expression pattern of the transgene in the transgenic plant. Selected promoters will express transgenes in specific cell types (such as leaf epidermal cells, mesophyll cells, or root cortex cells) or in specific tissues or organs (roots, leaves, or flowers, for example), and this selection will reflect the desired location of biosynthesis of a DNA molecule of the present invention. Alternatively, the selected promoter can drive expression of the gene under a light-induced or other temporally regulated promoter. A further alternative is that the selected promoter is chemically regulated. This provides the possibility of inducing expression of the nucleotide sequence only when desired, by treatment with a chemical inducer.
Transcription Terminators
A variety of transcription terminators are available for use in expression cassettes. These are responsible for the termination of transcription beyond the transgene and for its correct polyadenylation. Suitable transcription terminators, known to function in plants, include the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator, and the pea rbcS E9 terminator. These can be used in both monocotyledons and dicotyledons.
Sequences for the Enhancement or Regulation of Expression
Numerous sequences have been found to enhance gene expression from within the transcription unit, and these sequences can be used in conjunction with the genes of this invention to increase their expression in transgenic plants.
Various intron sequences have been shown to improve expression, particularly in monocotyledonous cells. For example, the introns of the maize Adh1 gene have been found to significantly improve the expression of the wild-type gene under its cognate promoter when introduced into maize cells. Intron 1 was found to be particularly effective and improved expression in fusion constructs with the chloramphenicol acetyltransferase gene (Callis et al., Genes Develop. 1: 1183-1200 (1987)). In the same experimental system, the intron of the maize bronze1 gene had a similar effect in improving expression (Callis et al., supra). Intron sequences have been routinely incorporated into plant transformation vectors, usually within the non-translated leader. A number of untranslated leader sequences derived from viruses are also known to improve expression, and these are particularly effective in dicotyledonous cells. Specifically, the leader sequences of Tobacco Mosaic Virus (TMV, the Ω sequence), Maize Chlorotic Mottle Virus (MCMV), and Alfalfa Mosaic Virus (AMV) have been shown to be effective in improving expression (for example, Gallie et al., Nucl. Acids Res. 15: 8693-8711 (1987); Skuzeski et al., Plant Molec. Biol. 15: 65-79 (1990)).

Targeting of the Gene Product Within the Cell

Various mechanisms for targeting gene products are known to exist in plants, and the sequences that control the functioning of these mechanisms have been characterized in some detail. For example, the targeting of gene products to the chloroplast is controlled by a signal sequence that is found at the amino-terminal end of various proteins and that is cleaved during chloroplast import, yielding the mature protein (e.g., Comai et al., J. Biol. Chem. 263: 15104-15109 (1988)).
These signal sequences can be fused to heterologous gene products to effect the import of heterologous products into the chloroplast (van den Broeck et al., Nature 313: 358-363 (1985)). DNA encoding appropriate signal sequences can be isolated from the 5' end of the cDNAs encoding the RUBISCO protein, the CAB protein, the EPSP synthase enzyme, the GS2 protein, and many other proteins known to be localized in the chloroplast. Other gene products are localized to other organelles, such as the mitochondrion and the peroxisome (for example, Unger et al., Plant Molec. Biol. 13: 411-418 (1989)). The cDNAs encoding these products can also be manipulated to target heterologous gene products to these organelles. Examples of such sequences are the nuclear-encoded ATPases and the specific aspartate aminotransferase isoforms for mitochondria. Targeting to cellular protein bodies has been described by Rogers et al. (Proc. Natl. Acad. Sci. USA 82: 6512-6516 (1985)). In addition, sequences have been characterized that cause the targeting of gene products to other cellular compartments. Amino-terminal sequences are responsible for targeting to the endoplasmic reticulum, the apoplast, and extracellular secretion from aleurone cells (Koehler and Ho, Plant Cell 2: 769-783 (1990)). Additionally, amino-terminal sequences, in conjunction with carboxy-terminal sequences, are responsible for vacuolar targeting of gene products (Shinshi et al., Plant Molec. Biol. 14: 357-368 (1990)). By fusing the appropriate targeting sequences described above to the transgene sequences of interest, it is possible to direct the transgene product to any organelle or cell compartment. For chloroplast targeting, for example, the chloroplast signal sequence from the RUBISCO gene, the CAB gene, the EPSP synthase gene, or the GS2 gene is fused in frame to the amino-terminal ATG of the transgene.
The selected signal sequence must include the known cleavage site, and the constructed fusion must take into account any amino acids after the cleavage site that are required for cleavage. In some cases, this requirement can be satisfied by adding a small number of amino acids between the cleavage site and the ATG of the transgene, or alternatively by replacing some amino acids within the transgene sequence. Fusions constructed for chloroplast import can be tested for efficiency of chloroplast uptake by in vitro translation of in vitro transcribed constructs, followed by in vitro chloroplast uptake, using the techniques described by Bartlett et al. (In: Edelmann et al. (Editors) Methods in Chloroplast Molecular Biology, Elsevier, pages 1081-1091 (1982)) and Wasmann et al. (Mol. Gen. Genet. 205: 446-453 (1986)). These construction techniques are well known in this field, and are equally applicable to mitochondria and peroxisomes. The targeting that may be required for the insecticidal toxin will normally be cytosolic or chloroplastic, although in some cases it may be mitochondrial or peroxisomal. Expression of the nucleotide sequence may also require targeting to the endoplasmic reticulum, the apoplast, or the vacuole. The mechanisms for cell targeting described above can be used not only in conjunction with their cognate promoters, but also in conjunction with heterologous promoters, to achieve a specific cell targeting goal under the transcriptional regulation of a promoter having an expression pattern different from that of the promoter from which the targeting signal is derived.
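The in-frame requirement for signal-sequence fusions described above can be illustrated with a minimal Python sketch. It joins a signal-peptide coding sequence, an optional linker, and a transgene coding sequence, and rejects the construct if the reading frame is broken or a stop codon appears before the end. The function name `fuse_in_frame` and the short placeholder sequences are hypothetical and are not taken from this application.

```python
STOP_CODONS = {"taa", "tag", "tga"}


def fuse_in_frame(signal_cds: str, linker: str, transgene_cds: str) -> str:
    """Join signal peptide, optional linker, and transgene coding sequences.

    Raises ValueError if any part would shift the reading frame or if a
    stop codon occurs before the final codon of the fusion.
    """
    signal_cds = signal_cds.lower()
    linker = linker.lower()
    transgene_cds = transgene_cds.lower()

    if not signal_cds.startswith("atg"):
        raise ValueError("signal sequence must begin with its own ATG")
    if not transgene_cds.startswith("atg"):
        raise ValueError("transgene must begin with its ATG")
    # Every part must be a whole number of codons to keep the frame.
    for name, part in (("signal", signal_cds), ("linker", linker),
                       ("transgene", transgene_cds)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} length {len(part)} is not a multiple of 3")

    fusion = signal_cds + linker + transgene_cds
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    # A stop codon is only acceptable as the very last codon.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature stop codon in fusion")
    return fusion


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    print(fuse_in_frame("atggcttcc", "", "atgaaatga"))
```

A check of this kind only confirms the reading frame; whether the fusion is actually imported and cleaved still has to be tested experimentally, as the text describes.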
Example 12: Examples of Construction of Expression Cassettes

The present invention encompasses the expression of a DNA molecule under the regulation of any promoter that can be expressed in plants, regardless of the origin of the promoter. In addition, the invention encompasses the use of any plant-expressible promoter in conjunction with any additional sequences required or selected for the expression of the DNA molecule. These sequences include, but are not restricted to, transcription terminators, extraneous sequences to improve expression (such as introns [e.g., the Adh1 intron] and viral sequences [e.g., TMV-Ω]), and sequences intended for targeting the gene product to specific cell organelles and compartments.

Constitutive Expression: The CaMV 35S Promoter

The construction of plasmid pCGN1761 is described in Published Patent Application Number EP 0,392,225. pCGN1761 contains the 'double' 35S promoter and the tml transcription terminator, with a unique EcoRI site between the promoter and the terminator, and has a pUC-type backbone. A derivative of pCGN1761 is constructed that has a modified polylinker including NotI and XhoI sites in addition to the existing EcoRI site. This derivative is designated pCGN1761ENX. pCGN1761ENX is useful for cloning cDNA sequences or gene sequences (including microbial open reading frame sequences) into its polylinker, for the purpose of their expression under the control of the 35S promoter in transgenic plants. The entire 35S promoter-gene sequence-tml terminator cassette of such a construct can be excised using the HindIII, SphI, SalI, and XbaI sites 5' to the promoter, and the XbaI, BamHI, and BglI sites 3' to the terminator, for transfer to transformation vectors, such as those described above in Example 35.
In addition, the double 35S promoter fragment can be removed by 5' excision with HindIII, SphI, SalI, XbaI, or PstI, and 3' excision with any of the polylinker restriction sites (EcoRI, NotI, or XhoI), to be replaced with another promoter.

Modification of pCGN1761ENX by Optimization of the Translation Initiation Site

For any of the constructs described in this section, modifications can be made around the cloning sites by introducing sequences that can improve translation. This is particularly useful when genes derived from microorganisms are introduced into plant expression cassettes, because these genes may not contain sequences adjacent to their initiating methionine that are suitable for the initiation of translation in plants. In cases where genes derived from microorganisms are to be cloned into plant expression cassettes at their ATG, it may be useful to modify the site of their insertion to optimize their expression. The modification of pCGN1761ENX is described, by way of example, to incorporate one or more sequences optimized for expression in plants (e.g., Joshi, supra).

Expression Under a Chemically Regulatable Promoter

This section describes the replacement of the double 35S promoter in pCGN1761ENX with any promoter of choice; by way of example, the chemically regulated PR-1a promoter is described. The promoter of choice is preferably excised from its source with restriction enzymes, but alternatively it can be amplified by polymerase chain reaction, using primers carrying appropriate terminal restriction sites. If the promoter is amplified by polymerase chain reaction, it must be re-sequenced to check for amplification errors after the amplified promoter has been cloned in the target vector. The chemically regulatable tobacco PR-1a promoter is cleaved from the plasmid pCIB1004 (see European Patent Number EP 0,332,104, Example 21, for construction), and transferred to the plasmid pCGN1761ENX.
pCIB1004 is cleaved with NcoI, and the resulting 3' overhang of the linearized fragment is rendered blunt by treatment with T4 DNA polymerase. The fragment is then cleaved with HindIII, and the resulting fragment containing the PR-1a promoter is gel purified and cloned into pCGN1761ENX, from which the double 35S promoter has been removed. This is done by cleavage with XhoI and blunting with T4 polymerase, followed by cleavage with HindIII and isolation of the larger fragment containing the vector and terminator, into which the pCIB1004 promoter fragment is cloned. This generates a derivative of pCGN1761ENX with the PR-1a promoter and the tml terminator, and an intervening polylinker with unique EcoRI and NotI sites. The DNA molecule of the present invention can be inserted into this vector, and the fusion products (i.e., promoter-gene-terminator) can subsequently be transferred to any selected transformation vector, including those described in this application.

Constitutive Expression: The Actin Promoter

Several actin isoforms are known to be expressed in most cell types, and consequently the actin promoter is a good choice for a constitutive promoter. In particular, the promoter of the rice Act1 gene has been cloned and characterized (McElroy et al., Plant Cell 2: 163-171 (1990)). A 1.3 kb fragment of the promoter was found to contain all the regulatory elements required for expression in rice protoplasts. In addition, numerous expression vectors based on the Act1 promoter have been constructed, specifically for use in monocots (McElroy et al., Mol. Gen. Genet. 231: 150-160 (1991)). These incorporate Act1-intron 1, the Adh1 5' flanking sequence, and Adh1-intron 1 (from the maize alcohol dehydrogenase gene), and sequence from the CaMV 35S promoter. The vectors showing the highest expression were fusions of 35S and the Act1 intron, or of the Act1 5' flanking sequence and the Act1 intron.
Optimization of the sequences around the initiating ATG (of the GUS reporter gene) also improved expression. The promoter expression cassettes described by McElroy et al. (Mol. Gen. Genet. 231: 150-160 (1991)) can be easily modified for expressing the DNA molecules of the present invention, and are particularly suitable for use in monocotyledonous hosts. For example, fragments containing the promoter can be removed from the McElroy constructs and used to replace the double 35S promoter in pCGN1761ENX, which is then available for the insertion of specific gene sequences. The fusion genes thus constructed can then be transferred to appropriate transformation vectors. In a separate report, the rice Act1 promoter with its first intron has also been found to direct high expression in cultured barley cells (Chibbar et al., Plant Cell Rep. 12: 506-509 (1993)).

Constitutive Expression: The Ubiquitin Promoter

Ubiquitin is another gene product known to accumulate in many cell types, and its promoter has been cloned from several species for use in transgenic plants (e.g., sunflower: Binet et al., Plant Science 22: 87-94 (1991); maize: Christensen et al., Plant Molec. Biol. 12: 619-632 (1989)). The maize ubiquitin promoter has been developed in transgenic monocotyledonous systems, and its sequence and vectors constructed for monocot transformation are disclosed in Patent Publication Number EP 0,342,926. In addition, Taylor et al. (Plant Cell Rep. 12: 491-495 (1993)) describe a vector (pAHC25) comprising the maize ubiquitin promoter and first intron, and its high activity in cell suspensions of numerous monocotyledons when introduced by microprojectile bombardment. The ubiquitin promoter is clearly suitable for the expression of a DNA molecule of the present invention in transgenic plants, especially monocotyledons.
Suitable vectors are derivatives of pAHC25 or any of the transformation vectors described in this application, modified by the introduction of the ubiquitin promoter and/or the appropriate intron sequences.

Root-Specific Expression

A preferred expression pattern for the nucleotide sequence of the present invention is root expression. Root expression is particularly useful for the control of soil-borne fungal pathogens. A suitable root promoter is that described by de Framond (FEBS 290: 103-106 (1991)), and also in published Patent Application Number EP 0,452,269. This promoter is transferred to a suitable vector, such as pCGN1761ENX, for the insertion of the nucleotide sequence and the subsequent transfer of the entire promoter-gene-terminator cassette to a transformation vector of interest.

Wound-Inducible Promoters

Wound-inducible promoters are particularly suitable for the expression of DNA molecules of the present invention, because they are normally active not only on wound induction, but also at sites of phytopathogen infection. Numerous such promoters have been described (e.g., Xu et al., Plant Molec. Biol. 22: 573-588 (1993); Logemann et al., Plant Cell 1: 151-158 (1989); Rohrmeier and Lehle, Plant Molec. Biol. 22: 783-792 (1993); Firek et al., Plant Molec. Biol. 22: 129-142 (1993); Warner et al., Plant J. 3: 191-201
(1993)), and all are suitable for use with the present invention. Logemann et al. (supra) describe the 5' upstream sequences of the wun1 gene of the dicotyledon potato. Xu et al. (supra) demonstrate that a wound-inducible promoter from the dicotyledon potato (pin2) is active in the monocotyledon rice. In addition, Rohrmeier and Lehle (supra) describe the cloning of the maize WipI cDNA, which is wound induced, and which can be used to isolate the cognate promoter using conventional techniques. Similarly, Firek et al. (supra) and Warner et al. (supra) have described a wound-induced gene from the monocot Asparagus officinalis, which is expressed at local wound and pathogen invasion sites. Using cloning techniques well known in the art, these promoters can be transferred to suitable vectors, fused to the DNA molecule of this invention, and used to express these genes at the sites of phytopathogen infection.

Pith-Preferred Expression

Patent Application Number WO 93/07278 (to Ciba-Geigy) describes the isolation of the maize trpA gene, which is preferentially expressed in pith cells. The gene sequence and the promoter extending up to -1726 from the start of transcription are presented. Using conventional molecular biological techniques, this promoter, or parts thereof, can be transferred to a vector such as pCGN1761, where it can replace the 35S promoter and be used to drive the expression of a DNA molecule of this invention in a pith-preferred manner. In fact, fragments containing the pith-preferred promoter, or parts thereof, can be transferred to any vector and modified for use in transgenic plants.
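The promoter-replacement approach used throughout this example (keeping the gene and terminator while exchanging the promoter) can be pictured with a small Python sketch. It models a cassette as an ordered set of named elements and returns a new cassette when the promoter is exchanged. The class `ExpressionCassette`, its field names, and the label "gene of interest" are hypothetical illustrations; the element names are only labels, not sequences.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ExpressionCassette:
    """Ordered elements of a plant expression cassette (labels only)."""
    promoter: str
    coding_sequence: str
    terminator: str
    intron: Optional[str] = None   # optional expression-enhancing intron
    leader: Optional[str] = None   # optional untranslated leader

    def with_promoter(self, promoter: str) -> "ExpressionCassette":
        # Return a copy with a different promoter; other elements unchanged.
        return replace(self, promoter=promoter)

    def layout(self) -> str:
        # Elements in 5' to 3' order, skipping any that are absent.
        parts = [self.promoter, self.intron, self.leader,
                 self.coding_sequence, self.terminator]
        return " - ".join(part for part in parts if part)


if __name__ == "__main__":
    base = ExpressionCassette("double 35S", "gene of interest", "tml")
    print(base.layout())
    print(base.with_promoter("PR-1a").layout())
    print(base.with_promoter("maize trpA").layout())
```

Treating the cassette as immutable mirrors the cloning logic in the text: each promoter exchange yields a new derivative while the starting construct remains available.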
Pollen-Specific Expression

Patent Application Number WO 93/07278 (to Ciba-Geigy) further describes the isolation of the maize calcium-dependent protein kinase (CDPK) gene, which is expressed in pollen cells. The gene sequence and the promoter extend up to 1,400 base pairs from the start of transcription. Using conventional molecular biological techniques, this promoter, or parts thereof, can be transferred to a vector such as pCGN1761, where it can replace the 35S promoter and be used to drive the expression of a DNA molecule of this invention. In fact, fragments containing the pollen-specific promoter, or parts thereof, can be transferred to any vector and modified for use in transgenic plants.

Leaf-Specific Expression

A maize gene encoding phosphoenolpyruvate carboxylase (PEPC) has been described by Hudspeth and Grula (Plant Molec. Biol. 12: 579-589 (1989)). Using conventional molecular biological techniques, the promoter of this gene can be used to drive the expression of any gene in a leaf-specific manner in transgenic plants.

Expression with Chloroplast Targeting

Chen and Jagendorf (J. Biol. Chem. 268: 2363-2367 (1993)) have described the successful use of a chloroplast transit peptide for the import of a heterologous transgene.
The peptide used is the transit peptide from the rbcS gene of Nicotiana plumbaginifolia (Poulsen et al., Mol. Gen. Genet. 205: 193-200 (1986)). Using the restriction enzymes DraI and SphI, or Tsp509I and SphI, the DNA sequence encoding this transit peptide can be excised from the plasmid prbcS-8B (Poulsen et al., supra) and manipulated for use with any of the constructs described above. The DraI-SphI fragment extends from -58 relative to the initiating rbcS ATG up to, and including, the first amino acid (also a methionine) of the mature peptide immediately after the import cleavage site, while the Tsp509I-SphI fragment extends from -8 relative to the initiating rbcS ATG up to, and including, the first amino acid of the mature peptide. Accordingly, these fragments can be appropriately inserted into the polylinker of any selected expression cassette, generating a transcriptional fusion with the untranslated leader of the selected promoter (e.g., 35S, PR-1a, actin, ubiquitin, etc.), while allowing insertion of the DNA molecule of this invention in correct fusion downstream of the transit peptide. Constructions of this kind are routine in the art. For example, although the DraI end is already blunt, the 5' Tsp509I site can be blunted by treatment with T4 polymerase, or alternatively it can be ligated to a linker or adapter sequence to facilitate its fusion with the selected promoter. The 3' SphI site can be maintained as such, or alternatively ligated to adapter or linker sequences to facilitate its insertion into the selected vector, so that appropriate restriction sites are made available for subsequent insertion of the DNA molecule of this invention. Ideally, the ATG of the SphI site is maintained and comprises the first ATG of the DNA molecule of this invention.
Chen and Jagendorf (supra) provide consensus sequences for ideal cleavage for chloroplast import, and in each case a methionine is preferred in the first position of the mature protein. In the following positions there is more variation, and the amino acid may not be as critical. In any case, fusion constructs can be evaluated for efficiency of in vitro import, using the methods described by Bartlett et al. (In: Edelmann et al. (Editors) Methods in Chloroplast Molecular Biology, Elsevier, pages 1081-1091 (1982)) and Wasmann et al. (Mol. Gen. Genet. 205: 446-453 (1986)). Typically, the best approach may be to generate fusions using the DNA molecule of this invention without modifications to the amino terminus, and to incorporate modifications only when it is apparent that these fusions are not imported into the chloroplast with high efficiency, in which case modifications can be made according to the established literature (Chen and Jagendorf, supra; Wasmann et al., supra; Ko and Ko, J. Biol. Chem. 267: 13910-13916 (1992)). Similar manipulations can be undertaken to use other GS2 chloroplast transit peptide coding sequences from other sources (monocotyledonous and dicotyledonous), and from other genes. In addition, similar procedures can be followed to achieve targeting to other subcellular compartments, such as mitochondria.
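Because the cloning steps in this example rely on restriction sites in the polylinker, it is often useful to check that an insert does not itself contain a site for the enzymes chosen for cloning. The following Python sketch scans a sequence for a few of the enzymes named above. The recognition sequences are the standard published ones and are not stated in this application; the function names and the demonstration sequence are hypothetical.

```python
# Standard recognition sequences (5'->3') for some enzymes named in this example.
# Assumption: these are supplied from general knowledge, not from the text.
# All of these sites are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "NotI": "GCGGCCGC",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
    "NcoI": "CCATGG",
}


def find_sites(sequence, enzymes=None):
    """Return {enzyme: [0-based start positions]} for each recognition site."""
    if enzymes is None:
        enzymes = RECOGNITION_SITES
    seq = sequence.upper()
    hits = {}
    for name, site in enzymes.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping matches
        hits[name] = positions
    return hits


def internal_cutters(insert, cloning_enzymes):
    """List the chosen cloning enzymes that also cut inside the insert."""
    chosen = {name: RECOGNITION_SITES[name] for name in cloning_enzymes}
    return [name for name, pos in find_sites(insert, chosen).items() if pos]


if __name__ == "__main__":
    demo_insert = "aaGAATTCttGCGGCCGCaa"  # made-up sequence for illustration
    print(find_sites(demo_insert))
    print(internal_cutters(demo_insert, ["EcoRI", "XhoI"]))
```

If an insert is cut by one of the intended enzymes, another polylinker site or a PCR-based strategy with different terminal sites would be needed.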
SEQUENCE LISTING
<110> Novartis
<120> Genes controlling diseases
<130> S-30431/A
<140> CGC1989
<141> 1998-03-17
<150> US 09/042763
<151> 1998-03-17
<160> 42
<170> PatentIn Ver. 2.0
<210> 1
<211> 13
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Conserved amino acid sequence
<400> 1
Glu Leu Met Xaa Xaa Gly Xaa Ile Ser Leu Leu Leu Xaa
1 5 10
<210> 2
<211> 14
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Conserved amino acid sequence
<400> 2
Xaa Thr Xaa Pro Leu Xaa Xaa Xaa Val Xaa Gln Met Gly Ser
1 5 10
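The two conserved sequences above contain Xaa positions, which stand for any amino acid. As a minimal sketch of how such a motif might be searched for in a protein, the Python code below converts a three-letter motif into a regular expression in which each Xaa becomes a single-character wildcard. The helper names are hypothetical, and the test fragment is assumed to correspond to residues 49 to 80 of SEQ ID NO: 4.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert space-separated three-letter codes to one-letter codes (Xaa -> X)."""
    return "".join("X" if res == "Xaa" else THREE_TO_ONE[res]
                   for res in three_letter.split())


def motif_regex(three_letter_motif: str) -> re.Pattern:
    """Build a regex in which every Xaa position matches any single residue."""
    return re.compile(to_one_letter(three_letter_motif).replace("X", "."))


SEQ_ID_1 = "Glu Leu Met Xaa Xaa Gly Xaa Ile Ser Leu Leu Leu Xaa"
SEQ_ID_2 = "Xaa Thr Xaa Pro Leu Xaa Xaa Xaa Val Xaa Gln Met Gly Ser"

if __name__ == "__main__":
    # Assumed fragment: residues 49-80 of SEQ ID NO: 4, in one-letter code.
    fragment = to_one_letter(
        "Arg His Lys Asn Ala Leu Ala Glu Ala Leu Glu Lys Ile Lys Ala Glu "
        "Leu Met Leu Val Gly Phe Ile Ser Leu Leu Leu Ala Val Thr Gln Asp"
    )
    match = motif_regex(SEQ_ID_1).search(fragment)
    if match:
        # Offset 0 of the fragment is residue 49 of the full protein.
        print(match.group(), "starts at residue", 49 + match.start())
```

A plain wildcard regex treats every Xaa as unrestricted; any narrower constraint on those positions would need to come from the description, not from the listing alone.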
< 210 > 3 < 211 > 1868 < 212 > DNA < 213 > Triticum sp. < 220 > < 221 > CDS < 222 > (176) .. (1777) < 220 > < 221 > various characteristics < 222 > (365) .. (403) < 223 > location of the amino acid sequence stipulated in SEQ ID No: 1 < 220 > < 221 > various_ characteristics < 222 > (1352) .. (1393) < 223 > location of the amino acid sequence stipulated in SEQ ID No: 2 < 400 > 3 cacgcccaca cttcgccaac acacaacgta cctgcgtacg tacgctttcc atttcctttc 60 ttgctccggc cggccggcca cgtagaatag atacccggcc aggtaggtac ctcgttggct 120 cagacgaccg gcggctgggt ctccggacaa ggaaagaggt tgcgctcggg gaccg atg 178 Met 1 gcg gac gac gac gag tac ecc cea gcg agg acg ctg ceg gag acg ceg 226 Ala Asp Asp Asp Glu Tyr Pro Pro Wing Arg Thr Leu Pro Glu Thr Pro 5 10 15 tec tgg gcg gtg gcc gtc gtc gtc gtc gtc atg ate gtg tech gtc 274 Ser Trp Wing Val Wing Leu Val Phe Wing Val Met lie lie Val Val 20 25 30 etc ctg gag falls geg etc cat aag etc ggc cat tgg ttc falls aag cgg 322 Leu Leu Glu His Wing Leu His Lys Leu Gly His Trp Phe His Lys Arg 35 40 45 falls aag aac gcg ctg gcg gag gcg ctg gag aag ate aag gcg gag etc 370
His Lys Asn Ala Leu Ala Glu Ala Leu Glu Lys Ile Lys Ala Glu Leu
50 55 60 65 atg ctg gtg ggc ttc ate teg ctg ctg etc gcc gtg acg cag gac ecc 418 Met Leu Val Gly Phe lie Ser Leu Leu Leu Ala Val Thr Gln Asp Pro 70 75 80 ate tec ggg ata tgc ate tec gag aag gcc gcc age ate atg cgg ecc 466 lie Be Gly lie Cys He Ser Glu Lys Ala Ala Be He Met Arg Pro 85 90 95 tgc aag ctg ecc cct ggc tec gtc aag age aag tac aaa gac tac tac 514 Cys Lys Leu Pro Pro Gly Ser Val Lys Ser Lys Tyr Lys Asp Tyr Tyr 100 105 110 tgc gcc aaa cag ggc aag gtg teg etc atg tec acg ggc age ttg drops 562 Cys Wing Lys Gln Gly Lys Val Ser Leu Met Ser Thr Gly Ser Leu His 115 120 125 cag ctg ata ata ttc ate ttc gtg etc gcc gtc ttc cat gtc acc tac 610 Gln Leu His He Phe He Phe Val Leu Ala Val Phe His Val Thr Tyr 130 135 140 145 age gtc ate ate atg gct cta age cgt etc aaa atg aga aec tgg aag 658 Ser Val He He Met Ala Leu Ser Arg Leu Lys Met Arg Thr Trp Lys 150 155 160 aaa tgg gag here gag acc gcc tec ctg gaa tac cag ttc gca aat gat 706 Lys Trp Glu Thr Glu Thr Wing Ser Leu Glu Tyr Gln Phe Ala Asn Asp 165 170 175 cct gcg cgg ttc cgc ttc acg falls cag acg teg ttc gtg aag cgg drops 754 Pro Wing Arg Phe Arg Phe Thr His Gln Thr Ser Phe Val Lys Arg His 180 185 190 ct_J 99c ctc tec age acc ecc ggc gtc aga tgg gtg gtg gtg gtg ttc ttc 802 Leu Gly Leu Ser Ser Thr Pro Gly Val Arg Trp Val Val Wing Phe Phe 195 200 205 agg cag ttc ttc agg teg gtc acc aag gtg gac tac etc acc ttg agg 850 Arg Gln Phe Phe Arg Ser Val Thr Lys Val Asp Tyr Leu Thr Leu Arg 210 215 220 225 gca ggc ttc ate aac gcg cat ttg teg cat aac age aag ttc gac ttc 898 Wing Gly Phe He Asn Wing His Leu Ser His Asn Ser Lys Phe Asp Phe 230 235 240 falls aag tac ate aag agg tec atg gag gac gac ttc aaa gtc gtc gtt 946 His Lys Tyr He Lys Arg Ser Met Glu Asp Asp Phe Lys Val Val Val 245 250 255 ggc ate age etc ceg ctg tgg tgt gtg gcg ate etc acc etc ttc ctt 994 Gly He Ser Leu Pro Leu Trp Cys Val Wing He Leu Thr Leu Phe Leu 260 265 270 gac att gac ggg ate ggc acg etc acc tgg att tet tcc ate cct etc 1Q42 Asp He Asp Gly He Gly Thr Leu Thr 
Trp He Be Phe He Pro Leu 275 280 285 gtc ate etc ttg tgt gtt gga acc aag ctg gag atg ate ate atg 1090 Val He Leu Leu Cys Val Gly Thr Lys Leu Glu Met He He Met Met Glu 290 295 300 305 atg gcc ctg gag ate cag gac cgg gcg age gtc ate aag ggg gcg ecc 1138 Met Ala Leu Glu He Gln Asp Arg Ala Ser Val He Lys Gly Ala Pro 310 315 320 gtg gtt gag ecc age aac aag ttc ttc tgg ttc cae cgc ecc gac tgg 1186 Val Val Glu Pro Ser Asn Lys Phe Phe Trp Phe His Arg Pro Asp Trp 325 330 335 gtc etc ttc ttc ata falls ctg acg cta ttc cag aac gcg ttt cag atg 1234 Val Leu Phe Phe He His Leu Thr Leu Phe Gln Asn Wing Phe Gln Met 340 340 350 gca cat ttc gtg tgg ggg gcg gcg acg ecc ggc ttg aag aa tgc ttc 1282 Wing His Phe Val Trp Thr Val Wing Thr Pro Gly Leu Lys Lys Cys Phe 355 360 365 cat atg cae ggg ctg age ate atg aag gtc gtg ctg ggg ctg gct 1330 His Met His He Gly Leu Ser He Met Lys Val Val Leu Gly Leu Wing 370 375 380 385 ctt cag ttc etc tgc age tat ate acc ttc ceg etc tac gcg etc gtc 1378 Leu Gln Phe Leu Cys Ser Tyr He Thr Phe Pro Leu Tyr Ala Leu Val 390 395 400 here cag atg gga tea aac atg aag agg tec ate ttc gac gag cag acg 1426 Thr Gln Met Gly Ser Asn Met Lys Arg Ser He Phe Asp Glu Gln Thr 405 410 415 gcc aag gcg ctg aac tgg cgg aac acg gcc aag gag aag aag aag 1474
Ala Lys Ala Leu Thr Asn Trp Arg Asn Thr Ala Lys Glu Lys Lys Lys 420 425 430 gtc cga gac acg gac atg ctg atg gcg cag atg ate ggc gac gcg acg 1522 Val Arg Asp Thr Asp Met Leu Met Ala Gln Met He Gly Asp Wing Thr 435 440 445 ecc age cg ggg gcg teg ecc atg cct age cgg ggc teg teg ceg gtg 1570 Pro Ser Arg Gly Wing Pro Pro Met Ser Arg Gly Ser Ser Pro Val 450 455 460 465 falls ctg ctt falls aag ggc atg gga cgg tec gac gat ecc cag age acg 1618 His Leu Leu His Lys Gly Met Gly Arg Ser Asp Asp Pro Gln Ser Thr 470 475 480 cea acc ceg agg gcc atg gag gag gct agg gac atg tac ceg gtt 1666 Pro Thr Ser Pro Arg Wing Met Glu Glu Wing Arg Asp Met Tyr Pro Val 485 490 495 gtg gtg gcg cat cea gtg falls aga cta aat cct gct gac agg aga agg 1714 Val Val Ala His Pro Val His Arg Leu Asn Pro Wing Asp Arg Arg Arg 500 505 510 teg gtc teg teg te gca etc gat gtc gac att ecc age gca gat ttt 1762 Ser Val Ser Be Ala Ala Asp Val Asp He Pro Be Ala Asp Phe 515 520 525 tec ttc age cag gga tgagacaagt ttctgtattg atgtt agtcc aatgtatagc 1817
Ser Phe Ser Gln Gly 530 caacatagga tgtcatgatt cgtacaataa gaaatacaaa tttttactga g 1868
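The feature table of SEQ ID NO: 3 gives nucleotide coordinates for the coding sequence, (176)..(1777), and for the two conserved regions, (365)..(403) and (1352)..(1393). As a worked check, the following Python sketch converts such nucleotide coordinates into residue positions within the encoded protein. The function name `feature_to_residues` is hypothetical; the coordinates are those given in the listing.

```python
def feature_to_residues(cds_start: int, feat_start: int, feat_end: int):
    """Map 1-based nucleotide feature coordinates to 1-based residue positions.

    The feature must start on a codon boundary of the coding sequence and
    span a whole number of codons; otherwise ValueError is raised.
    """
    offset = feat_start - cds_start
    length = feat_end - feat_start + 1
    if offset < 0 or offset % 3 != 0 or length <= 0 or length % 3 != 0:
        raise ValueError("feature is not aligned to the reading frame")
    first = offset // 3 + 1
    last = first + length // 3 - 1
    return first, last


if __name__ == "__main__":
    CDS_START, CDS_END = 176, 1777
    print("codons in CDS:", (CDS_END - CDS_START + 1) // 3)
    print("SEQ ID NO: 1 region:", feature_to_residues(CDS_START, 365, 403))
    print("SEQ ID NO: 2 region:", feature_to_residues(CDS_START, 1352, 1393))
```

Under these coordinates the coding sequence spans 534 codons, matching the 534-residue length given for SEQ ID NO: 4, and the two regions span 13 and 14 residues, matching the lengths of SEQ ID NO: 1 and SEQ ID NO: 2.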
<210> 4
<211> 534
<212> PRT
<213> Triticum sp.
<400> 4
Met Ala Asp Asp Asp Glu Tyr Pro Pro Ala Arg Thr Leu Pro Glu Thr
1 5 10 15
Pro Ser Trp Ala Val Ala Leu Val Phe Ala Val Met Ile Ile Val Val
20 25 30
Val Leu Leu Glu His Ala Leu His Lys Leu Gly His Trp Phe His Lys
35 40 45
Arg His Lys Asn Ala Leu Ala Glu Ala Leu Glu Lys Ile Lys Ala Glu
50 55 60
Leu Met Leu Val Gly Phe Ile Ser Leu Leu Leu Ala Val Thr Gln Asp
65 70 75 80
Pro Ile Ser Gly Ile Cys Ile Ser Glu Lys Ala Ala Ser Ile Met Arg
85 90 95
Pro Cys Lys Leu Pro Pro Gly Ser Val Lys Ser Lys Tyr Lys Asp Tyr
100 105 110
Tyr Cys Ala Lys Gln Gly Lys Val Ser Leu Met Ser Thr Gly Ser Leu
115 120 125
His Gln Leu His Ile Phe Ile Phe Val Leu Ala Val Phe His Val Thr
130 135 140
Tyr Ser Val Ile Ile Met Ala Leu Ser Arg Leu Lys Met Arg Thr Trp
145 150 155 160
Lys Lys Trp Glu Thr Glu Thr Ala Ser Leu Glu Tyr Gln Phe Ala Asn
165 170 175
Asp Pro Ala Arg Phe Arg Phe Thr His Gln Thr Ser Phe Val Lys Arg
180 185 190
His Leu Gly Leu Ser Ser Thr Pro Gly Val Arg Trp Val Val Ala Phe
195 200 205
Phe Arg Gln Phe Phe Arg Ser Val Thr Lys Val Asp Tyr Leu Thr Leu
210 215 220
Arg Ala Gly Phe Ile Asn Ala His Leu Ser His Asn Ser Lys Phe Asp
225 230 235 240
Phe His Lys Tyr Ile Lys Arg Ser Met Glu Asp Asp Phe Lys Val Val
245 250 255
Val Gly Ile Ser Leu Pro Leu Trp Cys Val Ala Ile Leu Thr Leu Phe
260 265 270
Leu Asp Ile Asp Gly Ile Gly Thr Leu Thr Trp Ile Ser Phe Ile Pro
275 280 285
Leu Val Ile Leu Leu Cys Val Gly Thr Lys Leu Glu Met Ile Ile Met
290 295 300
Glu Met Ala Leu Glu Ile Gln Asp Arg Ala Ser Val Ile Lys Gly Ala
305 310 315 320
Pro Val Val Glu Pro Ser Asn Lys Phe Phe Trp Phe His Arg Pro Asp
325 330 335
Trp Val Leu Phe Phe Ile His Leu Thr Leu Phe Gln Asn Ala Phe Gln
340 345 350
Met Ala His Phe Val Trp Thr Val Ala Thr Pro Gly Leu Lys Lys Cys
355 360 365
Phe His Met His Ile Gly Leu Ser Ile Met Lys Val Val Leu Gly Leu
370 375 380
Ala Leu Gln Phe Leu Cys Ser Tyr Ile Thr Phe Pro Leu Tyr Ala Leu
385 390 395 400
Val Thr Gln Met Gly Ser Asn Met Lys Arg Ser Ile Phe Asp Glu Gln
405 410 415
Thr Ala Lys Ala Leu Thr Asn Trp Arg Asn Thr Ala Lys Glu Lys Lys
420 425 430
Lys Val Arg Asp Thr Asp Met Leu Met Ala Gln Met Ile Gly Asp Ala
435 440 445
Thr Pro Ser Arg Gly Ala Ser Pro Met Pro Ser Arg Gly Ser Ser Pro
450 455 460
Val His Leu Leu His Lys Gly Met Gly Arg Ser Asp Asp Pro Gln Ser
465 470 475 480
Thr Pro Thr Pro Pro Arg Ala Met Glu Glu Ala Arg Asp Met Tyr Pro
485 490 495
Val Val Val Ala His Pro Val His Arg Leu Asn Pro Ala Asp Arg Arg
500 505 510
Arg Ser Val Ser Ser Ser Ala Leu Asp Val Asp Ile Pro Ser Ala Asp
515 520 525
Phe Ser Phe Ser Gln Gly
530
< 210 > 5 < 211 > 1693 < 212 > DNA < 213 > Triticum sp. < 220 > < 221 > CDS < 222 > (1) .. (1602) < 220 > < 221 > various_ characteristics < 222 > (190) .. (228) < 223 > location of the amino acid sequence stipulated in SEQ ID N?: l < 220 > < 221 > various_ characteristics < 222 > (1177) .. (1218) < 223 > location of the amino acid sequence stipulated in SEQ ID No: 2 < 400 > 5 atg gcg gac gac tac gag tac ecc cec gcg cgg acg ctg ceg gag acg 48
Met Ala Glu Asp Tyr Glu Tyr Pro Pro Wing Arg Thr Leu Pro Glu Thr 1 5 10 15 ceg tec tgg gcg gtg gcg etc gtc ttc gcc gtc atg ate ate gtg tec 96 Pro Ser Trp Ala Val Ala Leu Val Phe Ala Val Met He He Val Val 20 25 30 gtc etc ctg gag falls gcg etc falls aag cfcc ggc cat tgg ttc falls aag 144 Val Leu Leu Glu His Wing Leu His Lys Leu Gly His Trp Phe His Lys • 35 40 45 cgg falls aag aac gcg ctg gcg gag gcg ctg gag aag ate aaa gcg gag 192
Arg His Lys Asn Ala Leu Ala Glu Ala Leu Glu Lys Ile Lys Ala Glu 50 55 60 ctg atg ctg gtg ggg ttc atc tcg ctg ctg ctc gcc gtg acg cag gac 240
Leu Met Leu Val Gly Phe Ile Ser Leu Leu Leu Ala Val Thr Gln Asp
65 70 75 80 cea ate tec ggg ata tgc ate tec gag aag gcc gcc age ate atg cgg 288 Pro He Ser Gly He Cys He Ser Glu Lys Ala Ala Ser He Met Arg 85 90 95 ecc tgc age ctg ecc cct ggt tec gtc aag age aag tac aaa gac tac 336 Pro Cys Ser Leu Pro Pro Gly Ser Val Lys Ser Lys Tyr Lys Asp Tyr 100 105 110 tac tgc gcc aaa aag ggc aag gtg teg cta atg tec acg ggc age ttg 384 Tyr Cys Ala Lys Lys Gly Lys Val Ser Leu Met Ser Thr Gly Ser Leu 115 120 125 falls cag etc drops atg ttc ate ttc gtg etc gcc gtc ttc cat gtc acc 432 His Gln Leu His Met Phe He Phe Val Leu Wing Val Phe His Val Thr 130 135 140 tac age gtc ate ate atg gct cta age cgt etc aaa atg agg here tgg 480 Tyr Ser Val He Met Met Ala Leu Ser Arg Leu Met Met Arg Thr Trp 145 150 155 160 aag aaa tgg gag here gag acc gyc tec ttg gaa tac cag ttc gca aat 528 Lys Lys Trp Glu Thr Glu Thr Xaa Ser Leu Glu Tyr Gln Phe Ala Asn 165 170 175 gat cct gcg cgg ttc cgc ttc acg falls cag acg teg ttc gtg aag cgt 576 Asp Pro Ala Arg Phe Arg Phe Thr His Gln Thr Ser Phe Val Lys Arg 180 185 190 falls ctg ggc etc tec age acc ecc ggc ate aga tgg gtg gtg gcc ttc 624 His Leu Gly Leu Ser Ser Thr Pro Gly He Arg Trp Val Val Ala Phe 195 200 205 ttc agg cag ttc ttc agg teg gtc acc aag gtg gac tac etc acc ctg 672 Phe Arg Gln Phe Phe Arg Ser Val Thr Lys Val Asp Tyr Leu Thr Leu 210 215 220 agg gca ggc ttc ate aac gcg cat ttg teg cat aac age aag ttc gac 720 Arg Wing Gly Phe He Asn Ala His Leu Ser His Asn Ser Lys Phe Asp 225 230 235 240 ttc falls aag tac ate agg tec atg gac gac gac ttc aaa gtc gtc 768 Phe His Lys Tyr He Lys Arg Ser Met Glu Asp Asp Phe Lys Val Val 245 250 255 gtt ggc ate age etc ceg ctg tgg tgt gtg gcg ate etc etc etc 816 Val Gly He Ser Leu Pro Leu Trp Cys Val Ala He Leu Thr Leu Phe 260 265 270 ctt gat att gac ggg ate ggc acg etc acc tgg att tet ttc ate cct 864 Leu Asp He Asp Gly He Gly Thr Leu Thr Trp He Ser Phe He Pro 275 280 285 etc gtc ate etc ttg tgt gtt gga acc aag ctg gag atg ate ate atg 912 Leu Val Leu Leu Le Cys Val Gly Thr Lys Leu Glu Met 
Met He Met 290 295 300 gag atg gcc ctg gag ate cag gac cgg gcg age gtc ate aag ggg gcg 960 Glu Met Ala Leu Glu He Gln Asp Arg Ala Ser Val He Lys Gly Ala 305 310 315 320 ecc gtg gtt gag ecc age aac aag ttc ttc tgg ttc falls cgc ecc gac 1008 Pro Val Val Glu Pro Ser Asn Lys Phe Phe Trp Phe His Arg Pro Asp 325 330 335 t-IST 9tc tc ttc ttc ata cac ct3 9 CT9 FCTC a9 aat 9 9 TFCT ca9 1056 Trp Val Leu Phe Phe He His Leu Thr Leu Phe Gln Asn Ala Phe Gln 340 345 350 atg gca cat ttc gtc tgg here gtg gcc acg ecc ggc ttg aag aaa tgc 1104 Met Ala His Phe Val Trp Thr Val Ala Thr Pro Gly Leu Lys Lys Cys 355 360 365 ttc cat atg cac tie GGT CTG age tie atg AAG gtc GTG CTG GGG CTG 1152 Phe His Met His've Gly Leu Ser I Met Lys Val Val Leu Gly Leu 370 375 380 GCT ctt cag ttc ctc TGC age tat ate acc ttc ecc ctc tac GCG ctc 1200 Ala Leu Gln Phe Leu Cys Ser Tyr He Thr Phe Pro Leu Tyr Ala Leu 385 390 395 400 gtc here cag atg gga TEG aac atg AAG agg tec tie ttc gac gag cag 1248 Val Thr Gln Met Gly Ser As n Met Lys Arg Ser He Phe Asp Glu Gln 405 '410 415 acg gcc aag gcg ctg acc aac tgg cgg aac acg gcc aag gag aag aag 1296 Thr Ala Lys Ala Leu Thr Asn Trp Arg Asn Thr Ala Lys Glu Lys Lys 420 425 430 aag gtc cga gac acg gac atg ctg atg gcg cag atg ate ggc gac gcg 1344 Lys Val Arg Asp Thr Asp Met Leu Met Wing Gln Met He Gly Asp Wing 435 440 445 acg ecc age cg ggc acg teg ceg atg cct age cgg gct teg tea ceg 1392 Thr Pro Ser Arg Gly Thr Pro Pro Met Pro Ser Arg Wing Ser Ser Pro 450 455 460 gtg cac ctg ctt cac aag ggc atg gga cgg tec gac gat ecc cag age 1440 Val His Leu Leu His Lys Gly Met Gly Arg Ser Asp Asp Pro Gln Ser 465 470 475 480 gcg ceg ceg acc agg acc atg gag gag gct agg gac atg tac ceg 1488 Wing Pro Thr Ser Pro Arg Thr Met Glu Glu Wing Arg Asp Met Tyr Pro 485 490 495 gtt gtg gtg gcg cat ecc gtg cac aga cta aat cct gct gac agg cgg 1536 Val Val Val Ala Pro Val His Arg Leu Asn Pro Wing Asp Arg Arg 500 505 510 agg teg gtc tet teg teg gca ctc gat gcc gac ate ecc age gca gat 1584 Arg Ser Val Being 
Ser Ala Leu Asp Ala Asp Ile Pro Ser Ala Asp 515 520 525 ttt tcc ttc agc cag gga tgagacaagt ttctgtattg atgttagtcc 1632
Phe Ser Phe Ser Gln Gly 530 aatgtatagc caacatagga tgtcatgatt cgtacaataa gaaatacaaa tttttactga 1692 g 1693
<210> 6 <211> 534 <212> PRT <213> Triticum sp. <400> 6 Met Ala Glu Asp Tyr Glu Tyr Pro Pro Ala Arg Thr Leu Pro Glu Thr 1 5 10 15 Pro Ser Trp Ala Val Ala Leu Val Phe Ala Val Met Ile Ile Val Val 20 25 30 Val Leu Leu Glu His Ala Leu His Lys Leu Gly His Trp Phe His Lys 35 40 45 Arg His Lys Asn Ala Leu Ala Glu Ala Leu Glu Lys Ile Lys Ala Glu 50 55 60 Leu Met Leu Val Gly Phe Ile Ser Leu Leu Leu Ala Val Thr Gln Asp 65 70 75 80 Pro Ile Ser Gly Ile Cys Ile Ser Glu Lys Ala Ala Ser Ile Met Arg 85 90 95 Pro Cys Ser Leu Pro Pro Gly Ser Val Lys Ser Lys Tyr Lys Asp Tyr 100 105 110 Tyr Cys Ala Lys Lys Gly Lys Val Ser Leu Met Ser Thr Gly Ser Leu 115 120 125 His Gln Leu His Met Phe Ile Phe Val Leu Ala Val Phe His Val Thr 130 135 140 Tyr Ser Val Ile Ile Met Ala Leu Ser Arg Leu Lys Met Arg Thr Trp 145 150 155 160
Lys Lys Trp Glu Thr Glu Thr Xaa Ser Leu Glu Tyr Gln Phe Ala Asn 165 170 175
Asp Pro Ala Arg Phe Arg Phe Thr His Gln Thr Ser Phe Val Lys Arg 180 185 190 His Leu Gly Leu Ser Ser Thr Pro Gly Ile Arg Trp Val Val Ala Phe 195 200 205 Phe Arg Gln Phe Phe Arg Ser Val Thr Lys Val Asp Tyr Leu Thr Leu 210 215 220 Arg Ala Gly Phe Ile Asn Ala His Leu Ser His Asn Ser Lys Phe Asp 225 230 235 240
Phe His Lys Tyr Ile Lys Arg Ser Met Glu Asp Asp Phe Lys Val Val 245 250 255
Val Gly Ile Ser Leu Pro Leu Trp Cys Val Ala Ile Leu Thr Leu Phe 260 265 270 Leu Asp Ile Asp Gly Ile Gly Thr Leu Thr Trp Ile Ser Phe Ile Pro 275 280 285 Leu Val Ile Leu Leu Cys Val Gly Thr Lys Leu Glu Met Ile Ile Met 290 295 300 Glu Met Ala Leu Glu Ile Gln Asp Arg Ala Ser Val Ile Lys Gly Ala 305 310 315 320
Pro Val Val Glu Pro Ser Asn Lys Phe Phe Trp Phe His Arg Pro Asp 325 330 335
Trp Val Leu Phe Phe Ile His Leu Thr Leu Phe Gln Asn Ala Phe Gln 340 345 350 Met Ala His Phe Val Trp Thr Val Ala Thr Pro Gly Leu Lys Lys Cys 355 360 365 Phe His Met His Ile Gly Leu Ser Ile Met Lys Val Val Leu Gly Leu 370 375 380 Ala Leu Gln Phe Leu Cys Ser Tyr Ile Thr Phe Pro Leu Tyr Ala Leu 385 390 395 400
Val Thr Gln Met Gly Ser Asn Met Lys Arg Ser Ile Phe Asp Glu Gln 405 410 415 Thr Ala Lys Ala Leu Thr Asn Trp Arg Asn Thr Ala Lys Glu Lys Lys 420 425 430 Lys Val Arg Asp Thr Asp Met Leu Met Ala Gln Met Ile Gly Asp Ala 435 440 445 Thr Pro Ser Arg Gly Thr Pro Pro Met Pro Ser Arg Ala Ser Ser Pro 450 455 460 Val His Leu Leu His Lys Gly Met Gly Arg Ser Asp Asp Pro Gln Ser 465 470 475 480
Ala Pro Thr Ser Pro Arg Thr Met Glu Glu Ala Arg Asp Met Tyr Pro 485 490 495 Val Val Val Ala His Pro Val His Arg Leu Asn Pro Ala Asp Arg Arg 500 505 510 Arg Ser Val Ser Ser Ser Ala Leu Asp Ala Asp Ile Pro Ser Ala Asp 515 520 525 Phe Ser Phe Ser Gln Gly 530
<210> 7 <211> 1886 <212> DNA <213> Triticum sp. <220> <221> CDS <222> (198)..(1799) <220> <221> misc_feature <222> (387)..(425) <223> location of the amino acid sequence set forth in SEQ ID NO: 1 <220> <221> misc_feature <222> (1374)..(1415) <223> location of the amino acid sequence set forth in SEQ ID NO: 2 <400> 7 gctagacata gcagcaacaa cctgcgtgcg tacgtacgtt ttcgttttcc tttcttgctc 60 cggccggccg gccggccacg tagaatagat acctgcccag gtacgtacct cgttggctca 120 gacgatcggc ggttggactt gggtgcgcgc cctgccctgc tccggccaag gaaagaggtt 180 gcgctaaaga cgggcgg atg gca aag gac gac ggg tac ccc ccg gcg cgg 230 Met Ala Lys Asp Asp Gly Tyr Pro Pro Ala Arg 1 5 10 acg ctg ccg gag acg ccg tcc tgg gcg gtg gcg ctg gtc ttc gcc gtc 278 Thr Leu Pro Glu Thr Pro Ser Trp Ala Val Ala Leu Val Phe Ala Val 15 20 25 atg atc atc gtc tcc gtc ctc ctg gag cac gcg ctc cac aag ctc ggc 326 Met Ile Ile Val Ser Val Leu Leu Glu His Ala Leu His Lys Leu Gly 30 35 40 cat tgg ttc cac aag cgg cac aag aac gcg ctg gcg gag gcg ctg gag 374
His Trp Phe His Lys Arg His Lys Asn Ala Leu Ala Glu Ala Leu Glu 45 50 55 aag atg aag gcg gag ctg atg ctg gtg gga ttc atc tcg ctg ctg ctc 422
Lys Met Lys Ala Glu Leu Met Leu Val Gly Phe Ile Ser Leu Leu Leu
60 65 70 75 gcc gtc acg cag gac cea ate tec ggg ata tgc ate tec cag aag gcc 470 Wing Val Thr Gln Asp Pro He Ser Gly He Cys He Ser Gln Lys Wing 80 85 90 gcc age ate atg cgc ecc tgc aag gtg gaa ecc ggt tec gtc aag age 518 Wing He Met Arg Pro Cys Lys Val Glu Pro Gly Ser Val Lys Ser 95 100 105 aag tac aag gac tac tac tcc gcc aaa gag ggc aag gtg gcg ctc atg 566 Lys Tyr Lys Asp Tyr Tyr Cys Wing Lys Glu Gly Lys Val Wing Leu Met 110 115 120 tec acg ggc age ctg cac cag ctc cac ata ttc ate ttc gtg cta gcc 614 Ser Thr Gly Ser Leu His Gln Leu His lie Phe lie Phe Val Leu Ala 125 130 135 gtc ttc cat gtc acc age gtc ate ate atg gct cta age cgt ctc 662 Val Phe His Val Thr Tyr Ser Val He He Met Ala Leu Ser Arg Leu 140 145 150 155 aag atg aga here tgg aag aaa tgg gag here gaa acc gcc tec ttg Gaa 710 Lys Met Arg Thr Trp Lys Lys Trp Glu Thr Glu Thr Wing Ser Leu Glu 160 165 170 tac cag ttc gca aat gat cct gcg cgg ttc cgc ttc acg cac cag acg 758 Tyr Gln Phe Wing Asn Asp Pro Wing Arg Phe Arg Phe Thr His Gln Thr 175 180 185 teg ttc gtg aag cgg cac ctg ggc ctg tec age acc ecc ggc gtc aga 806 Ser Phe Val Lys Arg His Leu Gly Leu Ser Ser Thr Pro Gly Val Arg 190 195 200 tgg gtg gtg gcc ttc ttc agg cag ttc ttc agg teg gtc acc aag gtg 854 Trp Val Val Wing Phe Phe Arg Gln Phe Phe Arg Ser Val Thr Lys Val 205 210 215 gac tac ctc acc ttg agg gca ggc ttc ate aac gcg cac ttg teg cag 902 Asp Tyr Leu Thr Leu Arg Wing Gly Phe He Asn Wing His Leu Ser Gln 220 225 230 235 aac age aag ttc gac ttc cac aag tac ate aag agg tec atg gag gac 950 Asn Ser Lys Phe Asp Phe His Lys Tyr He Lys Arg Ser Met Glu Asp 240 245 250 gac ttc aaa gtc gtc gtc gtt gcc ate age ctc ceg ctg tgg gct gcg gcg 998 Asp Phe Lys Val Val Val Gly He Ser Leu Pro Leu Trp Ala Val Ala 255 260 265 ate ctc acc ctc ttc ctt gat ate gc ate gc gc gcc here ctc acc tgg 1046 He Leu Thr Leu Phe Leu Asp He Asp Gly He Gly Thr Leu Thr Trp 270 275 280 gtt tet ttc ate cct ctc ate ate ctc ttg tgt gtt gga acc aag cta 1094
Val Ser Phe Ile Pro Leu Ile Ile Leu Leu Cys Val Gly Thr Lys Leu
285 290 295 gag atg ate ate atg ggg atg gcc ctg gag ate cag gac cgg teg age 1142 Glu Met He Met Met Gly Met Ala Leu Glu He Gln Asp Arg Ser Ser 300 305 310 315 gtc ate aag ggg gca ecc gtg gtc gag ecc age aac aag ttc ttc tgg tcc 1190 Val He Lys Gly Pro Wing Val Val Glu Pro Ser Asn Lys Phe Phe Trp 320 325 330 tcc ccc ccc ccc ccc ccc gtc tc ctc ctc tc ccc acc cccc tcc 1238 Phe His Arg Pro Asp Trp Val Leu Phe Phe He His Leu Thr Leu Phe 335 340 345 cag aac gcg ttt cag atg gca cat ttc gtg tgg here gtg gcc acg ecc 1286
Gln Asn Ala Phe Gln Met Ala His Phe Val Trp Thr Val Ala Thr Pro 350 355 360 ggc ttg aag gac tgc ttc cat atg aac atc ggg ctg agc atc atg aag 1334
Gly Leu Lys Asp Cys Phe His Met Asn Ile Gly Leu Ser Ile Met Lys
365 370 375 gtc gtg ctg ggg ctg gct ctc cag ttc ctg tgc age tac ate acc ttc 1382 Val Val Leu Gly Leu Ala Leu Gln Phe Leu Cys Ser Tyr He Thr Phe 380 385 390 395 ecc ctc tac gcg cta gtc here cag atg gga tea aac atg aag agg tec 1430 Pro Leu Tyr Ala Leu Val Thr Gln Met Gly Ser Asn Met Lys Arg Ser 400 405 410 ate ttc gac gag cag gcc aag gcg ctg acc aac tg cgg aac acg 1478 He Phe Asp Glu Gln Thr Ala Lys Ala Leu Thr Asn Trp Arg Asn Thr 415 420 425 gcc aag gag aag aag aag aga gga cg gac acg gac atg cg atg gcg cag 1526 Wing Lys Glu Lys Lys Lys Val Arg Asp Thr Asp Met Leu Met Wing Gln 430 435 440 atg ate ggc gac gca here ecc age ca ggc ac tcc ceg atg cct age 1574
Met Ile Gly Asp Ala Thr Pro Ser Arg Gly Thr Ser Pro Met Pro Ser
445 450 455 cgg ggc tca tcg ccg gtg cac ctg ctt cag aag ggc atg gga cgg tct 1622 Arg Gly Ser Ser Pro Val His Leu Leu Gln Lys Gly Met Gly Arg Ser 460 465 470 475 gac gat ccc cag agc gca ccg acc tcg cca agg acc atg gag gag gct 1670 Asp Asp Pro Gln Ser Ala Pro Thr Ser Pro Arg Thr Met Glu Glu Ala 480 485 490 agg gac atg tac ce gtg gtg gtg gcg cat cct gta cac aga cta aat 1718 Arg Asp Met Tyr Pro Val Val Val Ala His Pro Val His Arg Leu Asn 495 500 505 cct gct gac agg cgg agg tcg gtc tct tca gcc gcc cc gc gat gac 1766 Pro Ala Asp Arg Arg Arg Ser Val Ser Ser Ala Leu Asp Ala Asp 510 515 520 atc ccc agc gca gat ttt tcc agc cag gga tgagacaagt ttctgtattg 1819 Ile Pro Ser Ala Asp Phe Ser Phe Ser Gln Gly 525 530 atgttagtcc aatgtatagc caacatagga tgtgatgatt cgtacaataa gaaatacaat 1879 tttttac 1886
<210> 8 <211> 534 <212> PRT <213> Triticum sp. <400> 8 Met Ala Lys Asp Asp Gly Tyr Pro Pro Ala Arg Thr Leu Pro Glu Thr 1 5 10 15 Pro Ser Trp Ala Val Ala Leu Val Phe Ala Val Met Ile Ile Val Ser 20 25 30 Val Leu Leu Glu His Ala Leu His Lys Leu Gly His Trp Phe His Lys 35 40 45 Arg His Lys Asn Ala Leu Ala Glu Ala Leu Glu Lys Met Lys Ala Glu 50 55 60 Leu Met Leu Val Gly Phe Ile Ser Leu Leu Leu Ala Val Thr Gln Asp 65 70 75 80 Pro Ile Ser Gly Ile Cys Ile Ser Gln Lys Ala Ala Ser Ile Met Arg 85 90 95 Pro Cys Lys Val Glu Pro Gly Ser Val Lys Ser Lys Tyr Lys Asp Tyr 100 105 110 Tyr Cys Ala Lys Glu Gly Lys Val Ala Leu Met Ser Thr Gly Ser Leu 115 120 125 His Gln Leu His Ile Phe Ile Phe Val Leu Ala Val Phe His Val Thr 130 135 140 Tyr Ser Val Ile Ile Met Ala Leu Ser Arg Leu Lys Met Arg Thr Trp 145 150 155 160
Lys Lys Trp Glu Thr Glu Thr Ala Ser Leu Glu Tyr Gln Phe Ala Asn 165 170 175
Asp Pro Ala Arg Phe Arg Phe Thr His Gln Thr Ser Phe Val Lys Arg 180 185 190 His Leu Gly Leu Ser Ser Thr Pro Gly Val Arg Trp Val Val Ala Phe 195 200 205 Phe Arg Gln Phe Phe Arg Ser Val Thr Lys Val Asp Tyr Leu Thr Leu 210 215 220 Arg Ala Gly Phe Ile Asn Ala His Leu Ser Gln Asn Ser Lys Phe Asp 225 230 235 240
Phe His Lys Tyr Ile Lys Arg Ser Met Glu Asp Asp Phe Lys Val Val 245 250 255
Val Gly Ile Ser Leu Pro Leu Trp Ala Val Ala Ile Leu Thr Leu Phe 260 265 270 Leu Asp Ile Asp Gly Ile Gly Thr Leu Thr Trp Val Ser Phe Ile Pro 275 280 285 Leu Ile Ile Leu Leu Cys Val Gly Thr Lys Leu Glu Met Ile Met Met 290 295 300 Gly Met Ala Leu Glu Ile Gln Asp Arg Ser Ser Val Ile Lys Gly Ala 305 310 315 320
Pro Val Val Glu Pro Ser Asn Lys Phe Phe Trp Phe His Arg Pro Asp 325 330 335
Trp Val Leu Phe Phe Ile His Leu Thr Leu Phe Gln Asn Ala Phe Gln 340 345 350 Met Ala His Phe Val Trp Thr Val Ala Thr Pro Gly Leu Lys Asp Cys 355 360 365 Phe His Met Asn Ile Gly Leu Ser Ile Met Lys Val Val Leu Gly Leu 370 375 380 Ala Leu Gln Phe Leu Cys Ser Tyr Ile Thr Phe Pro Leu Tyr Ala Leu 385 390 395 400
Val Thr Gln Met Gly Ser Asn Met Lys Arg Ser Ile Phe Asp Glu Gln 405 410 415
Thr Ala Lys Ala Leu Thr Asn Trp Arg Asn Thr Ala Lys Glu Lys Lys 420 425 430 Lys Val Arg Asp Thr Asp Met Leu Met Ala Gln Met Ile Gly Asp Ala 435 440 445 Thr Pro Ser Arg Gly Thr Ser Pro Met Pro Ser Arg Gly Ser Ser Pro 450 455 460 Val His Leu Leu Gln Lys Gly Met Gly Arg Ser Asp Asp Pro Gln Ser 465 470 475 480
Ala Pro Thr Ser Pro Arg Thr Met Glu Glu Ala Arg Asp Met Tyr Pro 485 490 495 Val Val Val Ala His Pro Val His Arg Leu Asn Pro Ala Asp Arg Arg 500 505 510 Arg Ser Val Ser Ser Ser Ala Leu Asp Ala Asp Ile Pro Ser Ala Asp 515 520 525 Phe Ser Phe Ser Gln Gly 530
< 210 > 9 < 211 > 2197 < 212 > DNA < 213 > Arabidopsis thaliana < 220 > < 221 > CDS < 222 > (331) .. (2037) < 220 > < 221 > various_ characteristics < 222 > (589) .. (627) < 223 > location of the amino acid sequence stipulated in SEQ ID N?: l < 220 > < 221 > various_ characteristics < 222 > (1603) .. (1644) < 223 > location of the amino acid sequence stipulated in SEQ ID No: 2 < 400 > 9 agtaatttag ctgttcttct acccctctga tctctcacag gggatcaaat agttttgata 60 catagagcca caacagtgac attagtgtgt tgttgactac tgtaagggtt gggttttgaa 120 aagagacatg aaggagtgtt attaggttga ttgtcttcaa gtacctccag tgtcaaacaa 180 acattgaega ttgattctct teccataatt tattgttatg cattacatat cacagtaaac 240 ggactttcaa gtcaacaccg catttatttg ccctcttcat tgtttcacgt aegtaatcaa 300 ggaccaaggg attttgttct tttggctacc atg gcc here aga tgc ttt tgg tgt 354 Met Ala Thr Arg Cys Phe Trp Cys l 5 tgg aec act ttg ctc ttc tgc tet cag ctg ctt acc ggc ttt gcc cga 402 Trp Thr Leu Leu Phe Cys Ser Gln Leu Leu Thr Gly Phe Ala Arg 10 15 20 gct tcc tet gca ggc ggc gcc aaa gag aaa gga ctc tcc caa act ecc 450
Ala Ser Ser Ala Gly Gly Ala Lys Glu Lys Gly Leu Ser Gln Thr Pro
30 35 40 acc tgg gcc gtt gcc ctc gtc tgt acc ttt ttc att ctt gtc tcc gtc 498 Thr Trp Wing Val Wing Leu Val Cys Thr Phe Phe He Leu Val Ser Val 45 50 55 ctt ctc gag aag gct ctt cac aga gtt gcc acg tgg ttg tgg gag aaa 546 Leu Leu Glu Lys Ala Leu His Arg Val Ala Thr Trp Leu Trp Glu Lys 60 65 70 cat aag aac tet ctg ctt gaa gcc ttg gaa aaa ata aag gcc gag ctg 594 His Lys Asn Ser Leu Leu Glu Ala Leu Glu Lys He Lys Ala Glu Leu 75 80 85 atg att cta gga ttc att tcc ttg ttg cta acc ttc gga gag cag tac 642 Met He Leu Gly Phe He Ser Leu Leu Leu Thr Phe Gly Glu Gln Tyr 90 95 100 att ctc aag att tgt att cct gaa aag gct gc gcc tet atg tta cct 690 He Leu Lys He Cys He Pro Glu Lys Ala Wing Ala Ser Met Leu Pro 105 110 115 120 tgt cea gct cct tet act cat gac caa gac aag acc cac cgc aga cgt 738 Cys Pro Wing Pro Ser Thr His Asp Gln Asp Lys Thr His Arg Arg Arg 125 130 135 cta gct gct acg acc tet tcc cgc tgc gat gag ggt cat gaa cea 786 Leu Wing Wing Thr Thr Ser Ser Arg Cys Asp Glu Gly His Glu Pro 140 145 150 ctc ata cct gcc acg ggt ttg cac cag cta cac att cta ttg ttc ttc 834 Leu He Pro Wing Thr Gly Leu His Gln Leu His He Leu Leu Phe Phe 155 160 165 atg gct gcc ttt cat ate ctc tac agt ttc ate acc atg ctt ggc 882 Met Wing Ala Phe His He Leu Tyr Ser Phe He Thr Met Met Leu Gly 170 175 180 aga ctc aag ate cgt ggc tgg aaa aag tgg gag cag gag here tgt tet 930 Arg Leu Lys He Arg Gly Trp Lys Lys Trp Glu Gln Glu Thr Cys Ser 185 190 195 200 cat gat tac gag ttt tea ate gat cea tea aga ttc aga ctc act cat 978 His Asp Tyr Glu Phe Ser He Asp Pro Be Arg Phe Arg Leu Thr His 205 210 215 gag acg tcc ttt gtt aga caa cat tcc agt ttc tgg here aaa ate ecc 1026 Glu Thr Ser Phe Val Arg Gln His Ser Ser Phe Trp Thr Lys He Pro 220 225 230 ttc ttc ttt tat gct ggg tgc ttc cta cag cag ttt ttc cga tet gtc 1074 Phe Phe Phe Tyr Wing Gly Cys Phe Leu Gln Gln Phe Phe Arg Ser Val 235 240 245 ggg ag act gac tac tta act ctg cgc cat ggc ttc ate gct gcc cat 1122 Gly Arg Thr Asp Tyr Leu Thr Leu Arg His Gly Phe He Wing Wing His 250 
255 260 tta gct cea gga aga aga ttc gac ttc cag aag tat ate aaa tea aga tea 1170 Leu Wing Pro Gly Arg Ly s Phe Asp Phe Gln Lys Tyr He Lys Arg Ser 265 270 275 280 ttg gaa gac gat ttc aag gtg gta gt gga ata agt cct ctt ttg tgg 1218 Leu Glu Asp Asp Phe Lys Val Val Val Gly He Ser Pro Leu Leu Trp 285 290 295 gca tea ttt gta att ttc cta ctt ctg aat gtt aat ggc tgg gaa gca 1266 Wing Being Phe Val He Phe Leu Leu Leu Asn Val Asn Gly Trp Glu Wing 300 305 310 ttg ttt tgg gcg tea ate cta cct gta ctt ate att cta gct gtc gtc agt 1314 Leu Phe Trp Wing Be He Leu Pro Val Leu He He Leu Wing Val Ser 315 320 325 acg aag ctt ca gcg ate cta here aga atg gct ctg gga ate acg gag 1362 Thr Lys Leu Gln Ala He Leu Thr Arg Met Wing Leu Gly He Thr Glu 330 335 340 aga cac gca gtt gtt ca ggg ata cct ctc gtg cat ggt tea gat aag 1410 Arg His Wing Val Val Gln Gly He Pro Leu Val His Gly Ser Asp Lys 345 350 355 360 tac ttt tgg ttt aat cgc cct cag ttg cta ctt cat ctt ctt cac ttc 1458 Tyr Phe Trp Phe Asn Arg Pro Gln Leu Leu Leu His Leu His Phe 365 370 375 gcc tta ttt cag aat gct tcc cag cta here tac tc ttc tgg gtc tgg 1506 Ala Leu Phe Gln Asn Ala Phe Gln Leu Thr Tyr Phe Phe Trp Val Trp 380 385 390 tat tcc ttt ggg cta aaa tet tgc ttt cac acg gat ttc aaa cta gtc 1554 Tyr Ser Phe Gly Leu Lys Ser Cys Phe His Thr Asp Phe Lys Leu Val 395 400 405 ate gta aaa ctc tet cta ggc gtt gga gct ttg att ttg tgc age tac 1602 He Val Lys Leu Ser Leu Gly Val Gly Ala Leu He Leu Cys Ser Tyr 410 415 420 ate here ctt cct ttg tat gca cta gtt act cag atg ggt tea aac atg 1650 He Thr Leu Pro Leu Tyr Ala Leu Val Thr Gln Met Gly Ser Asn Met 425 430 435 440 aag aaa gct gtg ttt gat gag caa atg gca aaa gcg ttg aag aaa tgg 1698 Lys Lys Ala Val Phe Asp Glu Gln Met Wing Lys Wing Leu Lys Lys Trp 445 450 455 cac atg act gtg aag aag aag aa ggc aaa gcg aga aag cea cea cea 1746 His Met Thr Val Lys Lys Lys Lys Gly Lys Wing Arg Lys Pro Pro Thr 460 465 470 gag acc ctt ggt gtt tet gac act gtc age acc tet acc tea tcc ttt 1794 Glu Thr Leu Gly Val Ser 
Asp Thr Val Ser Thr Ser Thr Ser Ser Phe 475 480 485 cac gcc tet gga gcc act cta ctc cgc tcc aag acc act ggt cac teg 1842 His Wing Ser Gly Wing Thr Leu Leu Arg Ser Lys Thr Thr Gly His Ser 490 495 500 here gcc tet tat atg agt aat ttc gag gac ca age atg tet gat ctt 1890 Thr Wing Being Tyr Met Being Asn Phe Glu Asp Gln Be Met As Asp Leu 505 510 515 520 gaa gct gag cea tta tcc cct gaa cea ata gag ggg cac act ctc gtc 1938 Glu Ala Glu Pro Leu Ser Pro Glu Pro He Glu Gly His Thr Leu Val 525 530 535 agg gtt ggt gat cag aac here gag ata gaa tat act gga gat att agt 1986 Arg Val Gly Asp Gln Asn Thr Glu He Glu Tyr Thr Gly Asp He Ser 540 545 550 cct gga aac cat ttc tcc ttt gtg aag aac gtt cct gct aat gat att 2034 Pro Gly Asn Gln Phe Ser Phe Val Lys Asn Val Pro Wing Asn Asp He 555 560 565 gac taatattcaa aatgaatgca gaacaaatcc atcatccggt ctttattttc 2087 Asp tattacatgt atgecaacaa ttgcttcgcc aagtgttacc aactaggttt tetgtataag 2147 gctgtatttt agagctaaaa aaaaaaaaaa aaaaaaaaaa ctaaattact 2197
<210> 10 <211> 569 <212> PRT <213> Arabidopsis thaliana <400> 10 Met Ala Thr Arg Cys Phe Trp Cys Trp Thr Thr Leu Leu Phe Cys Ser 1 5 10 15 Gln Leu Leu Thr Gly Phe Ala Arg Ala Ser Ser Ala Gly Gly Ala Lys 20 25 30 Glu Lys Gly Leu Ser Gln Thr Pro Thr Trp Ala Val Ala Leu Val Cys 35 40 45 Thr Phe Phe Ile Leu Val Ser Val Leu Leu Glu Lys Ala Leu His Arg 50 55 60 Val Ala Thr Trp Leu Trp Glu Lys His Lys Asn Ser Leu Leu Glu Ala 65 70 75 80
Leu Glu Lys Ile Lys Ala Glu Leu Met Ile Leu Gly Phe Ile Ser Leu 85 90 95 Leu Leu Thr Phe Gly Glu Gln Tyr Ile Leu Lys Ile Cys Ile Pro Glu 100 105 110 Lys Ala Ala Ala Ser Met Leu Pro Cys Pro Ala Pro Ser Thr His Asp 115 120 125 Gln Asp Lys Thr His Arg Arg Arg Leu Ala Ala Thr Thr Ser Ser 130 135 140 Arg Cys Asp Glu Gly His Glu Pro Leu Ile Pro Ala Thr Gly Leu His 145 150 155 160
Gln Leu His Ile Leu Leu Phe Phe Met Ala Ala Phe His Ile Leu Tyr 165 170 175
Ser Phe Ile Thr Met Met Leu Gly Arg Leu Lys Ile Arg Gly Trp Lys 180 185 190 Lys Trp Glu Gln Glu Thr Cys Ser His Asp Tyr Glu Phe Ser Ile Asp 195 200 205 Pro Ser Arg Phe Arg Leu Thr His Glu Thr Ser Phe Val Arg Gln His 210 215 220 Ser Ser Phe Trp Thr Lys Ile Pro Phe Phe Phe Tyr Ala Gly Cys Phe 225 230 235 240
Leu Gln Gln Phe Phe Arg Ser Val Gly Arg Thr Asp Tyr Leu Thr Leu 245 250 255
Arg His Gly Phe Ile Ala Ala His Leu Ala Pro Gly Arg Lys Phe Asp 260 265 270 Phe Gln Lys Tyr Ile Lys Arg Ser Leu Glu Asp Asp Phe Lys Val Val 275 280 285 Val Gly Ile Ser Pro Leu Leu Trp Ala Ser Phe Val Ile Phe Leu Leu 290 295 300 Leu Asn Val Asn Gly Trp Glu Ala Leu Phe Trp Ala Ser Ile Leu Pro 305 310 315 320
Val Leu Ile Ile Leu Ala Val Ser Thr Lys Leu Gln Ala Ile Leu Thr 325 330 335
Arg Met Ala Leu Gly Ile Thr Glu Arg His Ala Val Val Gln Gly Ile 340 345 350 Pro Leu Val His Gly Ser Asp Lys Tyr Phe Trp Phe Asn Arg Pro Gln 355 360 365 Leu Leu Leu His Leu Leu His Phe Ala Leu Phe Gln Asn Ala Phe Gln 370 375 380 Leu Thr Tyr Phe Phe Trp Val Trp Tyr Ser Phe Gly Leu Lys Ser Cys 385 390 395 400
Phe His Thr Asp Phe Lys Leu Val Ile Val Lys Leu Ser Leu Gly Val 405 410 415 Gly Ala Leu Ile Leu Cys Ser Tyr Ile Thr Leu Pro Leu Tyr Ala Leu 420 425 430 Val Thr Gln Met Gly Ser Asn Met Lys Lys Ala Val Phe Asp Glu Gln 435 440 445 Met Ala Lys Ala Leu Lys Lys Trp His Met Thr Val Lys Lys Lys Lys 450 455 460 Gly Lys Ala Arg Lys Pro Pro Thr Glu Thr Leu Gly Val Ser Asp Thr 465 470 475 480
Val Ser Thr Ser Thr Ser Ser Phe His Ala Ser Gly Ala Thr Leu Leu 485 490 495 Arg Ser Lys Thr Thr Gly His Ser Thr Ala Ser Tyr Met Ser Asn Phe 500 505 510 Glu Asp Gln Ser Met Ser Asp Leu Glu Ala Glu Pro Leu Ser Pro Glu 515 520 525 Pro Ile Glu Gly His Thr Leu Val Arg Val Gly Asp Gln Asn Thr Glu 530 535 540 Ile Glu Tyr Thr Gly Asp Ile Ser Pro Gly Asn Gln Phe Ser Phe Val 545 550 555 560
Lys Asn Val Pro Ala Asn Asp Ile Asp 565
<210> 11 <211> 1935 <212> DNA <213> Arabidopsis thaliana <220> <221> CDS <222> (89)..(1807) <220> <221> misc_feature <222> (269)..(307) <223> location of the amino acid sequence set forth in SEQ ID NO: 1 <220> <221> misc_feature <222> (1370)..(1411) <223> location of the amino acid sequence set forth in SEQ ID NO: 2 <400> 11 cagtgtgagt aatttagtaa aaagacaaga tctctggtct ggaattagaa gaatcttatt 60 tgggtttttt tcttaggatt aagctcta atg gca gat ca gta aaa gag cgg 112 Met Ala Asp Gln Val Lys Glu Arg 1 5 act tta gag gag acc tct acg tgg gca gtt gtt gtt gtt tgc ttt gtc 160 Thr Leu Glu Glu Thr Ser Thr Trp Ala Val Ala Val Val Cys Phe Val 10 15 20 tta ctc ttt att tcg att gtc ctc gaa cat tct att cac aaa att gga 208
Leu Leu Phe Ile Ser Ile Val Leu Glu His Ser Ile His Lys Ile Gly
30 35 40 acc tgg ttt aaa aag aag cag aag cag gct ctt ttt gaa gct ctt gaa 256 Thr Trp Phe Lys Lys Lys His Lys Gln Ala Leu Phe Glu Ala Leu Glu 45 50 55 aag gtc aaa gca gag ctt atg ctg ttg gga ttc ata tea cta cta ctc 304 Lys Val Lys Wing Glu Leu Met Leu Leu Gly Phe He Being Leu Leu Leu 60 65 70 here att gga caá here cea ate ate aat ate tgc ate tcc cag aaa gtt 352 Thr He Gly Gln Thr Pro He As Asn He Cys He Ser Gln Lys Val 75 80 85 gcg tea here atg cac cct tgc agt gct gct gaa gaa gct aaa aaa tac 400 Wing Ser Thr Met His Pro Cys Ser Wing Wing Glu Glu Wing Lys Lys Tyr 90 95 100 ggc aag aaa gac gcc gga aag aaa gat gat gga gat gga gat aaa ecc 448 Gly Lys Asp Wing Gly Lys Lys Asp Asp Gly Asp Gly Asp Lys Pro 105 110 115 120 ggt cga aga ctt ctt ctt gag tta gct gaa tet tat ate cat aga aga 496 Gly Arg Arg Leu Leu Leu Glu Leu Ala Glu Be Tyr He His Arg Arg 125 130 135 agt tta gcc acc aaa ggc tat gac aaa tgt gca gag aag ggg aaa gtg 544 Ser Leu Wing Thr Lys Gly Tyr Asp Lys Cys Wing Glu Lys Gly Lys Val 140 145 150 gct ttt gta tet gt t gt t gt tat gga ate cag cag ctg cat ata ttc ate ttc 592 Wing Phe Val Being Wing Tyr Gly He His Gln Leu His He Phe He Phe 155 160 165 gtg ctc gcg gtt gtt cat gtt gtt tac tgc att gtt act tat gct ttc 640 Val Leu Ala Val Val His Val Val Tyr Cys He Val Thr Tyr Ala Phe 170 175 180 gga aag ate aag agg acg tgg aag teg tgg gag gaa gag aa aag 688 Gly Lys He Lys Met Arg Thr Trp Lys Ser Trp Glu Glu Glu Thr Lys 185 190 195 200 here ata gag tat cag tat tcc aac gat cct gag agg ttc agg ttt gcg 736 Thr He Glu Tyr Gln Tyr Ser Asn Asp Pro Glu Arg Phe Arg Phe Wing 205 210 215 agg gac here tet ttt ggg aga aga cat ctc aat ttc tgg age aag acg 784 Arg Asp Thr Ser Phe Gly Arg Arg His Leu Asn Phe Trp Ser Lys Thr 220 225 230 aga gtc here cta tgg att gtt tgt ttt ttt aga cag ttc ttt gga tet 832
Arg Val Thr Leu Trp Ile Val Cys Phe Phe Arg Gln Phe Phe Gly Ser 235 240 245 gtc acc aaa gtt gat tac tta gca cta aga cat ggt ttc atc atg gcg 880
Val Thr Lys Val Asp Tyr Leu Ala Leu Arg His Gly Phe Ile Met Ala 250 255 260 cat ttt gct ccc ggt aac gaa tca aga ttc gat ttc cgc aag tat att 928
His Phe Ala Pro Gly Asn Glu Ser Arg Phe Asp Phe Arg Lys Tyr Ile
265 270 275 280 cag aga tca tta gag aaa gac ttc aaa acc gtt gtt gaa atc agt ccg 976
Gln Arg Ser Leu Glu Lys Asp Phe Lys Thr Val Val Glu Ile Ser Pro 285 290 295 gtt atc tgg ttt gtc gct gtg cta ttc ctc ttg acc aat tca tat gga 1024
Val Ile Trp Phe Val Ala Val Leu Phe Leu Leu Thr Asn Ser Tyr Gly 300 305 310 tta cgt tct tac ctc tgg tta cca ttc att cca cta gtc gta att cta 1072
Leu Arg Ser Tyr Leu Trp Leu Pro Phe Ile Pro Leu Val Val Ile Leu 315 320 325 ata gtt gga aca aag ctt gaa gtc ata ata ata aaa ttg ggt cta aga 1120
Ile Val Gly Thr Lys Leu Glu Val Ile Ile Thr Lys Leu Gly Leu Arg 330 335 340 atc caa gag aaa ggt gat gtg gtg aga ggc gcc cca gtg gtt cag cct 1168
Ile Gln Glu Lys Gly Asp Val Val Arg Gly Ala Pro Val Val Gln Pro
345 350 355 360 ggt gat gac ctc ttc tgg ttt ggc aag cca cgc ttc att ctt ttc ctt 1216
Gly Asp Asp Leu Phe Trp Phe Gly Lys Pro Arg Phe Ile Leu Phe Leu 365 370 375 att cac ttg gtc ctt ttt acg aat gca ttt caa ctt gca ttc ttt gcc 1264
Ile His Leu Val Leu Phe Thr Asn Ala Phe Gln Leu Ala Phe Phe Ala 380 385 390 tgg agt acg tat gaa ttc aat ctc aat aat tat ttc cat gaa agc act 1312
Trp Ser Thr Tyr Glu Phe Asn Leu Asn Asn Cys Phe His Glu Ser Thr 395 400 405 gca gat gtg gtc att aga ctt gta gtt gga gct gtt gtg cag ata ctt 1360
Ala Asp Val Val Ile Arg Leu Val Val Gly Ala Val Val Gln Ile Leu 410 415 420 tgc agc tat gtg act ctt cca ctc tat gca ctt gtt act cag atg ggt 1408
Cys Ser Tyr Val Thr Leu Pro Leu Tyr Ala Leu Val Thr Gln Met Gly 425 430 435 440 agt aaa atg aag cca gta tta aac aga gat aga gta gcc acg gca tta 1456
Be Lys Met Lys Pro Thr Val Phe Asn Asp Arg Val Wing Thr Wing Leu 445 450 455 aag aag tgg cat cact act gca aag aac gag acg aaa cac gga aga cac 1504 Lys Lys Trp His His Thr Wing Lys Asn Glu Thr Lys His Gly Arg His 460 465 470 teg gga tcc aat here cct ttc tet age cgt cea act here cea cat 1552 Ser Gly Ser Asn Thr Pro Phe Be Ser Arg Pro Thr Pro Thr His - 475 480 485 ggc tea tet cea ate cat ctc ctt cac aat ttc aat aac cgg age gtt 1600 Gly Ser Ser Pro He His Leu Leu His Asn Phe Asn Asn Arg Ser Val 490 495 500 gaa aat tac cea agt tet cct tet cct aga tac tet ggt cat ggt cat 1648 Glu Asn Tyr Pro Ser Ser Pro Pro Arg Tyr Ser Gly His Gly His 505 510 515 520 cat gaa cac cata ttt tgg gat cct gag tet caa cac caa gaa gct gaa 1696 His Glu His Gln Phe Trp Asp Pro Glu Ser Gln His Gln Glu Ala Glu 525 530 535 act tcc here cat cat tet ctt gcg cat gaa age tea gaa cct gtt ctt 1744 Thr Ser Thr His His Ser Leu Ala His Glu Ser Ser Glu Pro Val Leu 540 545 550 gca tet gtg gaa ctt cct cct ata agg a ct age aaa age tta aga gat 1792 Wing Being Val Glu Leu Pro Pro He Arg Thr Being Lys Ser Leu Arg Asp 555 560 565 ttt tet ttt t a t a t a t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t t ctatatggat acaaaaaaaa aaaaaaaa 1935
<210> 12 <211> 573 <212> PRT <213> Arabidopsis thaliana <400> 12 Met Ala Asp Gln Val Lys Glu Arg Thr Leu Glu Glu Thr Ser Thr Trp 1 5 10 15 Ala Ala Ala Val Val Cys Phe Val Leu Leu Phe Ile Ser Ile Val Val Leu 20 25 30 Glu His Ser Ile His Lys Ile Gly Thr Trp Phe Lys Lys Lys His Lys 35 40 45 Gln Ala Leu Phe Glu Ala Leu Glu Lys Val Lys Ala Glu Leu Met Leu 50 55 60 Leu Gly Phe Ile Ser Leu Leu Leu Thr Ile Gly Gln Thr Pro Ile Ser 65 70 75 80
Asn Ile Cys Ile Ser Gln Lys Val Ala Ser Thr Met His Pro Cys Ser 85 90 95
Ala Ala Glu Glu Ala Lys Lys Tyr Gly Lys Lys Asp Ala Gly Lys Lys 100 105 110 Asp Asp Gly Asp Gly Asp Lys Pro Gly Arg Arg Leu Leu Leu Glu Leu 115 120 125 Ala Glu Ser Tyr Ile His Arg Arg Ser Leu Ala Thr Lys Gly Tyr Asp 130 135 140 Lys Cys Ala Glu Lys Gly Lys Val Ala Phe Val Ser Ala Tyr Gly Ile 145 150 155 160
His Gln Leu His Ile Phe Ile Phe Val Leu Ala Val Val His Val Val 165 170 175
Tyr Cys Ile Val Thr Tyr Ala Phe Gly Lys Ile Lys Met Arg Thr Trp 180 185 190 Lys Ser Trp Glu Glu Glu Thr Lys Thr Ile Glu Tyr Gln Tyr Ser Asn 195 200 205 Asp Pro Glu Arg Phe Arg Phe Ala Arg Asp Thr Ser Phe Gly Arg Arg 210 215 220 His Leu Asn Phe Trp Ser Lys Thr Arg Val Thr Leu Trp Ile Val Cys 225 230 235 240
Phe Phe Arg Gln Phe Phe Gly Ser Val Thr Lys Val Asp Tyr Leu Ala 245 250 255
Leu Arg His Gly Phe Ile Met Ala His Phe Ala Pro Gly Asn Glu Ser 260 265 270 Arg Phe Asp Phe Arg Lys Tyr Ile Gln Arg Ser Leu Glu Lys Asp Phe 275 280 285 Lys Thr Val Val Glu Ile Ser Pro Val Ile Trp Phe Val Ala Val Leu 290 295 300 Phe Leu Leu Thr Asn Ser Tyr Gly Leu Arg Ser Tyr Leu Trp Leu Pro 305 310 315 320
Phe Ile Pro Leu Val Val Ile Leu Ile Val Gly Thr Lys Leu Glu Val 325 330 335
Ile Ile Thr Lys Leu Gly Leu Arg Ile Gln Glu Lys Gly Asp Val Val 340 345 350 Arg Gly Ala Pro Val Val Gln Pro Gly Asp Asp Leu Phe Trp Phe Gly 355 360 365 Lys Pro Arg Phe Ile Leu Phe Leu Ile His Leu Val Leu Phe Thr Asn 370 375 380 Ala Phe Gln Leu Ala Phe Phe Ala Trp Ser Thr Tyr Glu Phe Asn Leu 385 390 395 400
Asn Asn Cys Phe His Glu Ser Thr Ala Asp Val Val Ile Arg Leu Val 405 410 415
Val Gly Ala Val Val Gln Ile Leu Cys Ser Tyr Val Thr Leu Pro Leu 420 425 430 Tyr Ala Leu Val Thr Gln Met Gly Ser Lys Met Lys Pro Thr Val Phe 435 440 445 Asn Asp Arg Val Ala Thr Ala Leu Lys Lys Trp His His Thr Ala Lys 450 455 460 Asn Glu Thr Lys His Gly Arg His Ser Gly Ser Asn Thr Pro Phe Ser 465 470 475 480
Ser Arg Pro Thr Thr Pro Thr His Gly Ser Pro Pro Ile His Leu Leu 485 490 495
His Asn Phe Asn Asn Arg Ser Val Glu Asn Tyr Pro Ser Ser Pro Pro 500 505 510 Pro Arg Tyr Ser Gly His Gly His His Glu His Gln Phe Trp Asp Pro 515 520 525 Glu Ser Gln His Gln Glu Ala Glu Thr Ser Thr His His Ser Leu Ala 530 535 540 His Glu Ser Ser Glu Pro Val Leu Ala Ser Val Glu Leu Pro Pro Ile 545 550 555 560
Arg Thr Ser Lys Ser Leu Arg Asp Phe Ser Phe Lys Lys 565 570
< 210 > 13 < 211 > 1811 < 212 > DNA < 213 > Arabidopsis thaliana < 220 > < 221 > CDS < 222 > (56) .. (1633) < 220 > < 221 > various_characteristics < 222 > (236) .. (274) < 223 > location of the amino acid sequence stipulated in SEQ ID N?: l < 220 > < 221 > various characteristics < 222 > (1328) .. (1369) < 223 > location of the amino acid sequence stipulated in SEQ ID No: 2 < 400 > 13 gttcccagat teatetttac ttattgteta aattctctct ggtgtgagaa gtaaa atg 58 Met 1 ggt cac gga gga gaga ggg atg teg ctt gaa ttc act ceg acg tgg gtc 106 Gly His Gly Gly Glu Gly Met Ser Leu Glu Phe Thr Pro Thr Trp Val 5 10 15 gtc gcc gga gtt tgt acg gtc ate gtc gcg att tea ctg gcg gtg gag 154 Val Wing Gly Val Cys Thr Val He Val Wing He Ser Leu Wing Val Glu 20. 25 30 cgt ttg ctt cac tat ttc ggt act gtt ctt aaag aag aag aag caa aaa 202 Arg Leu Leu His Tyr Phe Gly Thr Val Leu Lys Lys Lys Gln Lys 35 40 45 ecc ctt tac gaa gcc ctt caa aag gtt aaa gaa gag ctg atg ttg tta 250
Pro Leu Tyr Glu Ala Leu Gln Lys Val Lys Glu Glu Leu Met Leu Leu
50 55 60 65 ggg ttt ata teg ctg tta ctg acg gta ttc ca ggg ctc att tcc aaa 298 Gly Phe He Ser Leu Leu Leu Thr Val Phe Gln Gly Leu He Ser Lys 70 75 80 ttc tgt gtg aaa gaa aat gtg ctt atg cat atg ctt cea tgt tet ctc 346 Phe Cys Val Lys Glu Asn Val Leu Met His Met Leu Pro Cys Ser Leu 85 90 95 gat tea aga cga gaa gct ggg gca agt gaa cat aaa aac gtt here gca 394 Asp Ser Arg Arg Glu Ala Gly Wing Ser Glu His Lys Asn Val Thr Wing 100 105 110 aaa gaa cat ttt cag act ttt tta cct att gtt gga acc act agg cgt 442 Lys Glu His Phe Gln Thr Phe Leu Pro He Val Gly Thr Thr Arg Arg 115 120 125 cta ctt gct gaa cat gct gct gct gtg ca gtt ggt tac tgt age gaa aag 490 Leu Leu Wing Glu His Wing Wing Val Gln Val Gly Tyr Cys Ser Glu Lys 130 135 140 145 ggt aaa gta cea ttg ctt teg ctt gag gca ttg cac cat cta cat att 538 Gly Lys Val Pro Leu Leu Ser Leu Glu Ala Leu His His Leu His He 150 155 160 ttc ate ttc gtc gtc ctc gcc atacc cat gtg here ttc tgt gtc ctt acc 586 Phe He Phe Val Leu Ala He Ser His Val Thr Phe Cys Val Leu Thr 165 170 175 gtg att ttt gga age here agg att cac tgg aag aaa tgg gag gat 634 Val He Phe Gly Ser Thr Arg He His Gln Trp Lys Lys Trp Glu Asp 180 185 190 teg ate gca gat gag aag ttt gac ecc gaa here gct ctc agg aaa aga 682
Ser He Wing Asp Glu Lys Phe Asp Pro Glu Thr Wing Leu Arg Lys Arg 195 200 205 agg gtc act cat gta cac aac cat gct ttt att aaa gag cat ttt "ctt 730
Arg Val Thr His Val His Asn His Wing Phe He Lys Glu His Phe Leu
210 215 220 225 ggt att ggc aaa gat tea gtc ate ctc gga tgg acg ca tc ttt ctc 778
Gly He Gly Lys Asp Ser Val He Leu Gly Trp Thr Gln Ser Phe Leu 230 235 240 aag cata ttc tat gat tet gtg acg aaa tea gat tac gtg act tta cgt 826
Lys Gln Phe Tyr Asp Ser Val Thr Lys Ser Asp Tyr Val Thr Leu Arg 245 250 255 ctt ggt ttc att atg here cat tgt aag gga aac CCC aag ctt aat ttc 874
Leu Gly Phe He Met Thr His Cys Lys Gly Asn Pro Lys Leu Asn Phe 260 265 270 cac aag tat atg cgc gct gta yta gag gat gat ttc aaa ca gtt gtt 922
His Lys Tyr Met Met Arg Ala Xaa Glu Asp Asp Phe Lys Gln Val Val 275 280 285 ggt att agt tgg tat ctt tgg ate ttt gtc gtc ate ttt ttg ctg cta 970
Gly He Ser Trp Tyr Leu Trp He Phe Val Val He Phe Leu Leu Leu 290 295 300 305 aat gtt aac gga tgg cac here tat ttc tgg ata gca ttt att CCC ttt 1018
Asn Val Asn Gly Trp His Thr Tyr Phe Trp He Wing Phe He Pro Phe 310 315 320 gct ttg ctt ctt gct gtg gga here aag ttg gag cat gtg att gca cag 1066
Wing Leu Leu Leu Wing Val Gly Thr Lys Leu Glu His Val He Wing Gln 325 330 335 tta gct cat gaa gtt gca gag aaa cat gta gcc att gaa gga gac tta 1114
Leu Ala His Glu Val Ala Glu Lys His Val Ala He Glu Gly Asp Leu 340 345 350 gtg gtg aaa CCC tea gat gag cat ttc tgg ttc age aaa cct ca ata att 1162
Val Val Lys Pro Ser Asp Glu His Phe Trp Phe Ser Lys Pro Gln He 355 360 365 gtt ctc tac tt ate ate cat ttt ate ctc ttc cag aat gct ttt gag att 1210
Val Leu Tyr Leu He His Phe He Leu Phe Gln Asn Wing Phe Glu He 370 375 380 385 gcg ttt ttc ttt tgg tgg ttt gtt here tac ggc tcc gac teg tgc att 1258
Wing Phe Phe Phe Trp He Trp Val Thr Tyr Gly Phe Asp Ser Cys He 390 395 400 atg gga cag gtg aga tac att gtt cea aga ttg gtt ate ggg gtc ttc 1306
Met Gly Gln Val Arg Tyr He Val Pro Arg Leu Val He Gly Val Phe 405 410 415 att ca gtg ctt tgc agt tac agt here ctg cct ctt tac gcc ate gtc 1354 He Gln Val Leu Cys Ser Tyr Ser Thr Leu Pro Leu Tyr Ala He Val 420 425 430 tea cag atg gga agt age ttc aag aaa gct ata ttc gag gag aat gtg 1402 Being Gln Met Gly Being Ser Phe Lys Lys Wing He Phe Glu Glu Asn Val 435 440 445 cag gtt gtt ttt gtt tg ggt cag ggt aaa gtg aaa caa aag aga gac 1450 Gln Val Gly Leu Val Gly Trp Wing Gln Lys Val Lys Gln Lys Arg Asp 450 455 460 465 cta aaa gct gca gct agat aat gga aac gaa gga age tet cag gct ggt 1498 Leu Lys Ala Ala Ala Be Asn Gly Asn Glu Gly Be Ser Gln Wing Gly 470 475 480 cct ggt cct gat tet ggt tet ggt tet gct cct gct gct gct gct gtt 1546 Pro Gly Pro Asp Ser Gly Be Gly Ser Wing Pro Pro Wing Gly Pro Gly 485 490 495 gca ggt ttt gca gga att cag ctc age aga gta here aga aac aac aac gca 1594 Wing Gly Phe Wing Gly He Gln Leu Ser Arg Val Thr Arg Asn Asn Wing 500 505 510 ggg gac here aac aat gag att here cet gat cat aac aac Asp Thr Gly 1643 tgageagaga Asn Asn Glu Thr Pro He His Asn Asn Asp 515 520 525 tattatettt tecatttaga ggatcatcat cagattttag cttcaaggtc cggttttgtg 1703 gtttatacat aagttatagt gacttgattt ttttgttttg ttacaaagtt accatctttg gattagaatt 1763 gggaaattga aaaaaaaaaa aaaaaaaa atctgtttgt 1811
< 210 > 14 < 211 > 526 < 212 > PRT < 213 > Arabidopsis thaliana < 400 > 14
Met Gly His Gly Gly Glu Gly Met Ser Leu Glu Phe Thr Pro Thr Trp 1 5 10 15
Val Val Ala Gly Val Cys Thr Val Ile Val Ala Ile Ser Leu Ala Val 20 25 30
Glu Arg Leu Leu His Tyr Phe Gly Thr Val Leu Lys Lys Lys Lys Gln 35 40 45
Lys Pro Leu Tyr Glu Ala Leu Gln Lys Val Lys Glu Glu Leu Met Leu 50 55 60
Leu Gly Phe Ile Ser Leu Leu Leu Thr Val Phe Gln Gly Leu Ile Ser 65 70 75 80
Lys Phe Cys Val Lys Glu Asn Val Leu Met His Met Leu Pro Cys Ser 85 90 95
Leu Asp Ser Arg Arg Glu Ala Gly Ala Ser Glu His Lys Asn Val Thr 100 105 110
Ala Lys Glu His Phe Gln Thr Phe Leu Pro Ile Val Gly Thr Thr Arg 115 120 125
Arg Leu Leu Ala Glu His Ala Ala Val Gln Val Gly Tyr Cys Ser Glu 130 135 140
Lys Gly Lys Val Pro Leu Leu Ser Leu Glu Ala Leu His His Leu His 145 150 155 160
Ile Phe Ile Phe Val Leu Ala Ile Ser His Val Thr Phe Cys Val Leu 165 170 175
Thr Val Ile Phe Gly Ser Thr Arg Ile His Gln Trp Lys Lys Trp Glu 180 185 190
Asp Ser Ile Ala Asp Glu Lys Phe Asp Pro Glu Thr Ala Leu Arg Lys 195 200 205
Arg Arg Val Thr His Val His Asn His Ala Phe Ile Lys Glu His Phe 210 215 220
Leu Gly Ile Gly Lys Asp Ser Val Ile Leu Gly Trp Thr Gln Ser Phe 225 230 235 240
Leu Lys Gln Phe Tyr Asp Ser Val Thr Lys Ser Asp Tyr Val Thr Leu 245 250 255
Arg Leu Gly Phe Ile Met Thr His Cys Lys Gly Asn Pro Lys Leu Asn 260 265 270
Phe His Lys Tyr Met Met Arg Ala Xaa Glu Asp Asp Phe Lys Gln Val 275 280 285
Val Gly Ile Ser Trp Tyr Leu Trp Ile Phe Val Val Ile Phe Leu Leu 290 295 300
Leu Asn Val Asn Gly Trp His Thr Tyr Phe Trp Ile Ala Phe Ile Pro 305 310 315 320
Phe Ala Leu Leu Leu Ala Val Gly Thr Lys Leu Glu His Val Ile Ala 325 330 335
Gln Leu Ala His Glu Val Ala Glu Lys His Val Ala Ile Glu Gly Asp 340 345 350
Leu Val Val Lys Pro Ser Asp Glu His Phe Trp Phe Ser Lys Pro Gln 355 360 365
Ile Val Leu Tyr Leu Ile His Phe Ile Leu Phe Gln Asn Ala Phe Glu 370 375 380
Ile Ala Phe Phe Phe Trp Ile Trp Val Thr Tyr Gly Phe Asp Ser Cys 385 390 395 400
Ile Met Gly Gln Val Arg Tyr Ile Val Pro Arg Leu Val Ile Gly Val 405 410 415
Phe Ile Gln Val Leu Cys Ser Tyr Ser Thr Leu Pro Leu Tyr Ala Ile 420 425 430
Val Ser Gln Met Gly Ser Ser Phe Lys Lys Ala Ile Phe Glu Glu Asn 435 440 445
Val Gln Val Gly Leu Val Gly Trp Ala Gln Lys Val Lys Gln Lys Arg 450 455 460
Asp Leu Lys Ala Ala Ala Ser Asn Gly Asn Glu Gly Ser Ser Gln Ala 465 470 475 480
Gly Pro Gly Pro Asp Ser Gly Ser Gly Ser Ala Pro Ala Ala Gly Pro 485 490 495
Gly Ala Gly Ala Gly Ala Ile Gln Leu Ser Arg Val Thr Arg Asn Asn 500 505 510
Ala Gly Asp Thr Asn Asn Glu Ile Thr Pro Asp His Asn Asn 515 520 525
< 210 > 15 < 211 > 1782 < 212 > DNA < 213 > Arabidopsis thaliana < 220 > < 221 > CDS < 222 > (1) .. (1779) < 220 > < 221 > various_ characteristics < 222 > (274) .. (313) < 223 > location of the amino acid sequence stipulated in SEQ ID No: l < 220 > < 221 > various_ characteristics < 222 > (1327) .. (1370) < 223 > location of the amino acid sequence stipulated in SEQ ID No: 2 < 400 > 15 atg gga ate ate gac ggt tet ttg ctt cgg cgg ctc att tgt ctc tgt 48 Met Gly He lie Asp Gly Ser Leu Leu Arg Arg Leu He Cys Leu Cys 1 5 10 15 ctc tgg tgt ctt ctc ggt gga gga gtg acg gtg gtt acg gcg gag gat 96 Leu Trp Cys Leu Leu Gly Gly Gly Val Thr Val Val Thr Wing Glu Asp 20 25 30 gag aa aaa gtg gta cat aaa cag ctt aat caa act ceg act tgg gct 144 Glu Lys Lys Val Val His Lys Gln Leu Asn Gln Thr Pro Thr Trp Wing 35 40 45 gtt gct gct gtt tgt act ttc ttc ate gtt gtt ttt gtt ctt ctt gaa 192
Val Ala Ala Val Cys Thr Phe Phe He Val Val Ser Val Leu Leu Glu 50 55 60 aaa ctt ctt cac aaa gtt gga aag gtt cta tgg gat cgg cac aag a 240
Lys Leu Leu His Lys Val Gly Lys Val Leu Trp Asp Arg His Lys Thr
65 70 75 80 gct ctt ctt gac gct ttg gag aag ate aaa gca g g gt ctg atg gtt ctt 288
Wing Leu Leu Asp Wing Leu Glu Lys He Lys Wing Glu Leu Met Val Leu 85 90 95 gga ttc ate tet ttg Ctt ctg here ttt gga ca w acc tta att ttg gat 336
Gly Phe He Be Leu Leu Leu Thr Phe Gly Gln Thr Tyr He Leu Asp 100 105 110 att tgt ate cct tea cat gtt gct cgt acg atg ctc ceg tgt cct gct 384
He Cys He Pro Ser His Val Wing Arg Thr Met Leu Pro Cys Pro Wing 115 120 125 cct aac ttg aaa aag gag gat gat gac aat ggt gaa agt cac agg aga 432
Pro Asn Leu Lys Lys Glu Asp Asp Asp Asn Gly Glu Ser His Arg Arg 130 135 140 ctc ttg teg ttt gag drops aga ttt tta tet gga ggt gaa gca tet CCC 480
Leu Leu Be Phe Glu His Arg Phe Leu Be Gly Gly Glu Ala Be Pro
145 150 155 160 act aaa tgc acg aag gag ggt tat gta gag ctt ate tet gcc gag gca 528
Thr Lys Cys Thr Lys Glu Gly Tyr Val Glu Leu He Ser Wing Glu Wing 165 170 175 ctc cat cag ttg "cac ate ctt ata ttc ttc tta gcc att ttc cac gtt 576
Leu His Gln Leu His He Leu He Phe Phe Leu Wing He Phe His Val 180 185 190 ctt tac age ttc tta act atg att ctt gga agg ttg aag att cgc gga 624
Leu Tyr Be Phe Leu Thr Met Met Leu Gly Arg Leu Lys He Arg Gly 195 200 205 tgg aag cat tgg gag aat gag here tea cat cat aat tac gag ttt tea 672
Trp Lys His Trp Glu Asn Glu Thr Ser Ser His Asn Tyr Glu Phe Ser 210 215 220 here gac act tc aga tct agg cta act cat gaa here tet ttt gtg aga 720
Thr Asp Thr Ser Arg Phe Arg Leu Thr His Glu Thr Ser Phe Val Arg
225 230 235 240 gcg cac acc agt ttc tgg acc cgg att cea ttc ttt ttc tat gtt gga 768
Wing His Thr Ser Phe Trp Thr Arg He Pro Phe Phe Phe Tyr Val Gly 245 250 255 tgc ttt ttc aga cag ttt ttc aga tcc gtt ggg aga act gac tat ttg 816
Cys Phe Phe Arg Gln Phe Phe Arg Ser Val Gly Arg Thr Asp Tyr Leu 260 265 270 here ttg aga aat ggt ttc ate gct gtt cat tta gct cea gga agt caa 864 Thr Leu Arg Asn Gly Phe He Wing Val His Leu Wing Pro Gly Ser Gln 275 280 285 ttt aac ttc caa aaa tac att aaa aga teg ttg gag gat gat ttc aag 912 Phe Asn Phe Gln Lys Tyr He Lys Arg Ser Leu Glu Asp Asp Phe Lys 290 295 300 gta gtc gtt gga gtc age cct gtc ttg tgg gga tet ttt gtg cta ttc 960 Val Val Val Gly Val Ser Pro Val Leu Trp Gly Ser Phe Val Leu Phe 305 310 315 320 ctc ctc cta aat att gac ggt gag tat atg tgc ate ggc act gca 1008 Leu Leu Leu Asn He Asp Gly Glu Tyr Met Met Phe He Gly Thr Wing 325 330 335 ata ce gtt att ate att tta gct gta ggg here aag ctt ca gcg att 1056 He Pro Val He He He Leu Wing Val Gly Thr Lys Leu Gln Wing He 340 345 350 atg here agg atg gct ctt ggt ate here gat aga cat gcg gta gta cato 1104 Met Thr Arg Met Ala Leu Gly He Thr Asp Arg His Wing Val Val Gln 355 360 -v 365 gga atg ceg ctt gta ca ggc aac gat gag tat tt c tgg ttc ggt cgt 1152 Gly Met Pro Leu Val Gln Gly Asn Asp Glu Tyr Phe Trp Phe Gly Arg 370 375 380 ecc cat ttg att ctc cat ctc atg cat ttc gcc ttg ttt cag aac gca 1200 Pro Hís Leu He Leu His Leu Met His Phe Wing Leu Phe Gln Asn Wing 385 390 395 400 ttt cag ate act tat ttc tgg tgg tgg tat tcc ttt gga tea gat 1248 Phe Gln He Thr Tyr Phe Phe Trp He Trp Tyr Ser Phe Gly Ser Asp 405 410 415 tet tgc tac cat cct aat ttc aag att gca ctt gta aaa gta gcg att 1296 Ser Cys Tyr His Pro Asn Phe Lys He Ala Leu Val Lys Val Wing 420 425 430 gct tta gga gta ttg tgt ctt tgc age tac ate here ctt cct ctt tac 1344 Ala Leu Gly Val Leu Cys Leu Cys Ser Tyr He Thr Leu Pro Leu Tyr 435 440 445 gca ctc gta act cag atg ggt tet cgg atg aaa aaa teg gta ttc gat 1392 Ala Leu Val Thr Gln Met Gly Ser Arg Met Lys Lys Ser Val Phe Asp 450 455 460 gaa ca ac g tea aaa gca ctc aag aaa tgg aga atg gca gtg aag aag 1440 Glu Gln Thr Ser Lys Ala Leu Lys Lys Trp Arg Met Wing Val Lys Lys 465 470 475 480 aag aaa ggt gtg aaa gcc act act aag aga 
cta ggt gga gat gga agt 1488 Lys Lys Gly Val Lys Wing Thr Thr Lys Arg Leu Gly Gly Asp Gly Ser 485 490 495 gcg age cct acg gca teg here gtt agg tet act teg tet gta cgt tea 1536 Ala Ser Pro Thr Wing Ser Thr Val Arg Ser Thr Ser Ser Val Arg Ser 500 505 510 ttg cag cgt tac aaa ac cea cat cat teg atg aga tac gaa gga ctt 1584 Leu Gln Arg Tyr Lys Thr Thr Pro His Ser Met Arg Tyr Glu Gly Leu 515 520 525 gac cct gaa here teg gat ctc gac here gat aat gaa gct ttg act cct 1632
Asp Pro Glu Thr Ser Asp Leu Asp Thr Asp Asn Glu Wing Leu Thr Pro 530 535 540 ecc aaa tet cct cea age ttc gag ctt gtt gtg aaa gtt gaa cea aat 1680 Pro Lys Ser Pro Pro Ser Phe Glu Leu Val Val Lys Val Glu Pro Asn 545 550 555 560 aaag acc aat ac ggt gag act age cgt gac act gaa act gat tet aaa 1728 Lys Thr Asn Thr Gly Glu Thr Ser Arg Asp Thr Glu Thr Asp Ser Lys 565 570 575 gag ttc tet ttc gtc aag cct gct ceg agt aat gaa tea tet caa gac 1776 Glu Phe Ser Phe Val Lys Pro Pro Pro Asn Glu Ser Ser Gln Asp 580 585 590 cgg tga 1782 Arg
< 210 > 16 < 211 > 593 < 212 > PRT < 213 > Arabidopsis thaliana < 400 > 16
Met Gly Ile Ile Asp Gly Ser Leu Leu Arg Arg Leu Ile Cys Leu Cys 1 5 10 15
Leu Trp Cys Leu Leu Gly Gly Gly Val Thr Val Val Thr Ala Glu Asp 20 25 30
Glu Lys Lys Val Val His Lys Gln Leu Asn Gln Thr Pro Thr Trp Ala 35 40 45
Val Ala Ala Val Cys Thr Phe Phe Ile Val Val Ser Val Leu Leu Glu 50 55 60
Lys Leu Leu His Lys Val Gly Lys Val Leu Trp Asp Arg His Lys Thr 65 70 75 80
Ala Leu Leu Asp Ala Leu Glu Lys Ile Lys Ala Glu Leu Met Val Leu 85 90 95
Gly Phe Ile Ser Leu Leu Leu Thr Phe Gly Gln Thr Tyr Ile Leu Asp 100 105 110
Ile Cys Ile Pro Ser His Val Ala Arg Thr Met Leu Pro Cys Pro Ala 115 120 125
Pro Asn Leu Lys Lys Glu Asp Asp Asp Asn Gly Glu Ser His Arg Arg 130 135 140
Leu Leu Ser Phe Glu His Arg Phe Leu Ser Gly Gly Glu Ala Ser Pro 145 150 155 160
Thr Lys Cys Thr Lys Glu Gly Tyr Val Glu Leu Ile Ser Ala Glu Ala 165 170 175
Leu His Gln Leu His Ile Leu Ile Phe Phe Leu Ala Ile Phe His Val 180 185 190
Leu Tyr Ser Phe Leu Thr Met Met Leu Gly Arg Leu Lys Ile Arg Gly 195 200 205
Trp Lys His Trp Glu Asn Glu Thr Ser Ser His Asn Tyr Glu Phe Ser 210 215 220
Thr Asp Thr Ser Arg Phe Arg Leu Thr His Glu Thr Ser Phe Val Arg 225 230 235 240
Ala His Thr Ser Phe Trp Thr Arg Ile Pro Phe Phe Phe Tyr Val Gly 245 250 255
Cys Phe Phe Arg Gln Phe Phe Arg Ser Val Gly Arg Thr Asp Tyr Leu 260 265 270
Thr Leu Arg Asn Gly Phe Ile Ala Val His Leu Ala Pro Gly Ser Gln 275 280 285
Phe Asn Phe Gln Lys Tyr Ile Lys Arg Ser Leu Glu Asp Asp Phe Lys 290 295 300
Val Val Val Gly Val Ser Pro Val Leu Trp Gly Ser Phe Val Leu Phe 305 310 315 320
Leu Leu Leu Asn Ile Asp Gly Glu Tyr Met Met Phe Ile Gly Thr Ala 325 330 335
Ile Pro Val Ile Ile Ile Leu Ala Val Gly Thr Lys Leu Gln Ala Ile 340 345 350
Met Thr Arg Met Ala Leu Gly Ile Thr Asp Arg His Ala Val Val Gln 355 360 365
Gly Met Pro Leu Val Gln Gly Asn Asp Glu Tyr Phe Trp Phe Gly Arg 370 375 380
Pro His Leu Ile Leu His Leu Met His Phe Ala Leu Phe Gln Asn Ala 385 390 395 400
Phe Gln Ile Thr Tyr Phe Phe Trp Ile Trp Tyr Ser Phe Gly Ser Asp 405 410 415
Ser Cys Tyr His Pro Asn Phe Lys Ile Ala Leu Val Lys Val Ala Ile 420 425 430
Ala Leu Gly Val Leu Cys Leu Cys Ser Tyr Ile Thr Leu Pro Leu Tyr 435 440 445
Ala Leu Val Thr Gln Met Gly Ser Arg Met Lys Lys Ser Val Phe Asp 450 455 460
Glu Gln Thr Ser Lys Ala Leu Lys Lys Trp Arg Met Ala Val Lys Lys 465 470 475 480
Lys Lys Gly Val Lys Ala Thr Thr Lys Arg Leu Gly Gly Asp Gly Ser 485 490 495
Ala Ser Pro Thr Ala Ser Thr Val Arg Ser Thr Ser Ser Val Arg Ser 500 505 510
Leu Gln Arg Tyr Lys Thr Thr Pro His Ser Met Arg Tyr Glu Gly Leu 515 520 525
Asp Pro Glu Thr Ser Asp Leu Asp Thr Asp Asn Glu Ala Leu Thr Pro 530 535 540
Pro Lys Ser Pro Pro Ser Phe Glu Leu Val Val Lys Val Glu Pro Asn 545 550 555 560
Lys Thr Asn Thr Gly Glu Thr Ser Arg Asp Thr Glu Thr Asp Ser Lys 565 570 575
Glu Phe Ser Phe Val Lys Pro Ala Pro Ser Asn Glu Ser Ser Gln Asp 580 585 590
Arg
< 210 > 17 < 211 > 1629 < 212 > DNA < 213 > Arabidopsis thaliana < 220 > < 221 > CDS < 222 > (1) .. (1626) < 220 > < 221 > various_characteristics < 222 > (184) .. (223) < 223 > location of the amino acid sequence stipulated in SEQ ID N?: l < 220 > < 221 > various_ characteristics < 222 > (1141) .. (1183) < 223 > location of the amino acid sequence stipulated in SEQ ID No: 2 < 400 > 17 atg gag cat atg atg aaa gaa gga agg tet ctt gca gag acg ceg act 48 Met Glu His Met Met Lys Glu Gly Arg Ser Leu Wing Glu Thr Pro Thr 1 5 10 15 tac tet gtt gct teg gtt gtt act gtt ttg gtc ttt gtt tgc ttt ctc 96 Tyr Ser Val Wing Val Val Thr Val Leu Val Phe Val Cys Phe Leu 20 25 30 gtt gaa cgc gcc att tac aga ttt gga aga tgg tta aag aag aga act 144 Val Glu Arg Ala He Tyr Arg Phe Gly Lys Trp Leu Lys Lys Thr Arg 35 40 45 aga aag gca ctt ttt act tea ctt gag aaa atg aaa gag gag tg atg 192 Arg Lys Ala Leu Phe Thr Ser Leu Glu Lys Met Lys Glu Glu Leu Met 50 55 60 ttg ctg gga ctt ata tea ctt ctg ttg tea caa age gcg aga tgg att 240
Leu Leu Gly Leu Be Ser Leu Leu Leu Ser Gln Ser Ala Arg Trp He
65 70 75 80 tea gaa ate tgt gtt aac tet tcc ctt ttc aat agt aaa ttc tac att 288 Ser Glu He Cys Val Asn Ser Ser Leu Phe Asn Ser Lys Phe Tyr He 85 90 95 tgc tet gaa gag gac tat gga ate cat aag aaa gtt ctt ctg gaa cac 336 Cys Ser Glu Glu Asp Tyr Gly He His Lys Lys Val Leu Leu Glu His 100 105 110 acc tet tet here aac cag age tcc tta cct cat cat gga ata cat gaa 384 Thr Ser Ser Thr Asn Gln Ser Ser Leu Pro His His Gly He His Glu 115 120 125 gcc tet cat ca g tt ggt cat ggc ggc cgt gaa cea ttt gtg teg tat gag 432 Wing Ser His Gln Cys Gly His Gly Arg Glu Pro Phe Val Ser Tyr Glu 130 135 140 gga ctc gag caà ctc cta aga ttc tta ttc gtc ctg ggt ate act 480 Gly Leu Glu Gln Leu Leu Arg Phe Leu Phe Val Leu Gly He Thr His 145 150 155 160 gtt cta tac agt ggc att gcc att ggt tta gcc atg age aag att tac 528 Val Leu Tyr Ser Gly He Ala He Gly Leu Ala Met Ser Lys He Tyr 165 170 175 agt tgg aga aaa tgg gaa gcc ca gcg Att Ata Gct Gaa Tea Gat 576 Ser Trp Arg Lys Trp Glu Wing Gln Wing He Met Met Wing Glu Ser Asp 180 185 190 ate cac ctt tgt ttc ctg cgg ca t tt aga ggc tcc ata cga aag tet 624 He His Leu Cys Phe Leu Arg Gln Phe Arg Gly Ser He Arg Lys Ser 195 200 205 gac tac tcc gca ctt cgg tta ggt ttc ctc act aaa cat aat ttg cea 672 Asp Tyr Phe Ala Leu Arg Leu Gly Phe Leu Thr Lys His Asn Leu Pro 210 215 220 ttt here tac aac ttc cat atg tat atg gta cgg acg atg gaa gat gag 720
Phe Thr Tyr Asn Phe His Met Tyr Met Val Arg Thr Met Glu Asp Glu
225 230 2§5 240 ttt cat ggc att gtt gga att age tgg cea ctt tgg gtt tac gct ata 768
Phe His Gly He Val Gly He Ser Trp Pro Leu Trp Val Tyr Ala He 245 250 255 gta tgc ate tgc ata aat gtt cat ggc ctg aat atg tac ttt tgg ata 816
Val Cys He Cys He Asn Val His Gly Leu Asn Met Tyr Phe Trp He 260 265 270 tea ttc gtt cct gcc att ctt gtc atg ttg gtt gga acc aaa ctt gag 864
Be Phe Val Pro Ala He Leu Val Met Leu Val Gly Thr Lys Leu Glu 275 280 285 cat gtt gtc tcc aag ctt gct ctc gag gtt aag gag cag cag here ggc 912
His Val Val Ser Lys Leu Ala Leu Glu Val Lys Glu Gln Gln Thr Gly 290 295 300 here tet aat ggg gct ca gtc aaa cea cgt gat ggg ctc ttc tgg ttt 960
Thr Ser Asn Gly Wing Gln Val Lys Pro Arg Asp Gly Leu Phe Trp Phe
305 310 315 320 ggg aaa cea gaa att ctg cta cgg ttg ata caa ttt ate att ttt cag 1008
Gly Lys Pro Glu He Leu Leu Arg Leu He Gln Phe He He Phe Gln 325 330 335 aat gca ttt gaa atg gca here ttc ate tgg ttc ttg tgg gga ate aag 1056
Asn Ala Phe Glu Met Wing Thr Phe He Trp Phe Leu Trp Gly He Lys 340 345 350 gaa aga tet tgc ttc atg aag aac cat gtg atg ata tea age cgg cta 1104
Glu Arg Ser Cys Phe Met Lys Asn His Val Met He Ser Ser Arg Leu 355 360 365 att tet ggg gtt ctc gtt cag ttc tgg tgt agt tat ggc act gtg cct 1152
He Ser Gly Val Leu Val Gln Phe Trp Cys Ser Tyr Gly Thr Val Pro 370 375 380 ctc aat gta ate gtt act cag atg gga tet cgg cat aag aaa gct gtg 1200
Leu Asn Val He Val Thr Gln Met Gly Ser Arg His Lys Lys Ala Val
385 390 395 400 ata gca gag age gta aga gac tea ctt cac agt tgg tgc aag aga gtg 1248
He Wing Glu Ser Val Arg Asp Ser Leu His Ser Trp Cys Lys Arg Val 405 410 415 aaa gag agg tet aag cac acg aga tea gtg tgt tcc ctt gac ac gca 1296
Lys Glu Arg Ser Lys His Thr Arg Ser Val Cys Ser Leu Asp Thr Wing 420 425 430 here ata gac gag aga gac gag atg here gtg ggg here ttg tet agg age 1344
Thr He Asp Glu Arg Asp Glu Met Thr Val Gly Thr Leu Ser Arg Ser 435 440 445 tca teg atg act tea ctg aat cag att ata atac cc gac caca 1392 Being Ser Met Thr Being Leu Asn Gln He Thr He Asn Ser He Asp Gln 450 455 460 gca gag tet ata ttc gga gca gca gct te tcc age agt cct caa gat 1440 Wing Glu Ser He Phe Gly Wing Wing Wing Being Being Pro Gln Asp 465 470 475 480 gga tac acg teg agg gtg gaa gaa tat ctg tet gaa here tac aat aac 1488
Gly Tyr Thr Ser Arg Val Glu Glu Tyr Leu Ser Glu Thr Tyr Asn Asn 485 490 495 ate ggt teg ata ceg cct tta aac gat gag att gag att gag att gaa 1536 He Gly Ser He Pro Pro Leu Asn Asp Glu He Glu He Glu He Glu 500 505 510 ggt gaa gat gat aat gga ggg aga gga agg ggg agt gat gag aat aac 1584 Gly Glu Glu Asp Asn Gly Gly Ar'g Gly Ser Gly Ser Asp Glu Asn Asn 515 520 525 ggt gat gct gga gaa here ctt ctt gag ttg ttt agg agg act tga 1629
Gly Asp Wing Gly Glu Thr Leu Leu Glu Leu Phe Arg Arg Thr 530 535 540
< 210 > 18 < 211 > 542 < 212 > PRT < 213 > Arabidopsis thaliana < 400 > 18
Met Glu Met Met Met Lys Glu Gly Arg Ser Leu Ala Glu Thr Pro Thr 1 5 10 15
Tyr Ser Val Ala Ser Val Val Thr Val Leu Val Phe Val Cys Phe Leu 20 25 30
Val Glu Arg Ala Ile Tyr Arg Phe Gly Lys Trp Leu Lys Lys Thr Arg 35 40 45
Arg Lys Ala Leu Phe Thr Ser Leu Glu Lys Met Lys Glu Glu Leu Met 50 55 60
Leu Leu Gly Leu Ile Ser Leu Leu Leu Ser Gln Ser Ala Arg Trp Ile 65 70 75 80
Ser Glu Ile Cys Val Asn Ser Ser Leu Phe Asn Ser Lys Phe Tyr Ile 85 90 95
Cys Ser Glu Glu Asp Tyr Gly Ile His Lys Lys Val Leu Leu Glu His 100 105 110
Thr Ser Ser Thr Asn Gln Ser Ser Leu Pro His His Gly Ile His Glu 115 120 125
Ala Ser His Gln Cys Gly His Gly Arg Glu Pro Phe Val Ser Tyr Glu 130 135 140
Gly Leu Glu Gln Leu Leu Arg Phe Leu Phe Val Leu Gly Ile Thr His 145 150 155 160
Val Leu Tyr Ser Gly Ile Ala Ile Gly Leu Ala Met Ser Lys Ile Tyr 165 170 175
Ser Trp Arg Lys Trp Glu Ala Gln Ala Ile Met Met Ala Glu Ser Asp 180 185 190
Ile His Leu Cys Phe Leu Arg Gln Phe Arg Gly Ser Ile Arg Lys Ser 195 200 205
Asp Tyr Phe Ala Leu Arg Leu Gly Phe Leu Thr Lys His Asn Leu Pro 210 215 220
Phe Thr Tyr Asn Phe His Met Tyr Met Val Arg Thr Met Glu Asp Glu 225 230 235 240
Phe His Gly Ile Val Gly Ile Ser Trp Pro Leu Trp Val Tyr Ala Ile 245 250 255
Val Cys Ile Cys Ile Asn Val His Gly Leu Asn Met Tyr Phe Trp Ile 260 265 270
Ser Phe Val Pro Ala Ile Leu Val Met Leu Val Gly Thr Lys Leu Glu 275 280 285
His Val Val Ser Lys Leu Ala Leu Glu Val Lys Glu Gln Gln Thr Gly 290 295 300
Thr Ser Asn Gly Ala Gln Val Lys Pro Arg Asp Gly Leu Phe Trp Phe 305 310 315 320
Gly Lys Pro Glu Ile Leu Leu Arg Leu Ile Gln Phe Ile Ile Phe Gln 325 330 335
Asn Ala Phe Glu Met Ala Thr Phe Ile Trp Phe Leu Trp Gly Ile Lys 340 345 350
Glu Arg Ser Cys Phe Met Lys Asn His Val Met Ile Ser Ser Arg Leu 355 360 365
Ile Ser Gly Val Leu Val Gln Phe Trp Cys Ser Tyr Gly Thr Val Pro 370 375 380
Leu Asn Val Ile Val Thr Gln Met Gly Ser Arg His Lys Lys Ala Val 385 390 395 400
Ile Ala Glu Ser Val Arg Asp Ser Leu His Ser Trp Cys Lys Arg Val 405 410 415
Lys Glu Arg Ser Lys His Thr Arg Ser Val Cys Ser Leu Asp Thr Ala 420 425 430
Thr Ile Asp Glu Arg Asp Glu Met Thr Val Gly Thr Leu Ser Arg Ser 435 440 445
Ser Ser Met Thr Ser Leu Asn Gln Ile Thr Ile Asn Ser Ile Asp Gln 450 455 460
Ala Glu Ser Ile Phe Gly Ala Ala Ala Ser Ser Ser Ser Pro Gln Asp 465 470 475 480
Gly Tyr Thr Ser Arg Val Glu Glu Tyr Leu Ser Glu Thr Tyr Asn Asn 485 490 495
Ile Gly Ser Ile Pro Pro Leu Asn Asp Glu Ile Glu Ile Glu Ile Glu 500 505 510
Gly Glu Glu Asp Asn Gly Gly Arg Gly Ser Gly Ser Asp Glu Asn Asn 515 520 525
Gly Asp Ala Gly Glu Thr Leu Leu Glu Leu Phe Arg Arg Thr 530 535 540
< 210 > 19 < 211 > 20 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 19 atgtcggaca aaaaaggggt 20
< 210 > 20 < 211 > 20 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 20 atgctaccac acgcagatcg 20
< 210 > 21 < 211 > 21 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 21 acagagacca cctccttgga a 21
< 210 > 22 < 211 > 22 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 22 cagaaacttg tctcatccct gg 22
< 210 > 23 < 211 > 20 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 23 aagaactgcc tgaagaaggc 20
< 210 > 24 < 211 > 20 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 24 caccaccttc atgatgctca 20
< 210 > 25 < 211 > 20 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 25 ttccagcacc ggcacaagaa 20
< 210 > 26 < 211 > 28 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 26 tggacctctt catgttcgat cccatctg 28
< 210 > 27 < 211 > 27 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 27 cctgacgctg ttccagaatg cgtttca 27
< 210 > 28 < 211 > 20 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 28 acttctgcag gtcgactcta 20
< 210 > 29 < 211 > 30 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 29 aagatcaaga tgaggacgtg gaagtcgtgg 30
< 210 > 30 < 211 > 30 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 30 aggctgaacc actggggcgc ctctcaccac 30
< 210 > 31 < 211 > 30 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 31 caagtatatg atgcgcgctc tagaggatga 30
< 210 > 32 < 211 > 30 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 32 aggtttcacc actaagtctc cttcaatggc 30
< 210 > 33 < 211 > 30 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 33 gatcattcaa gacttaggct cactcatgag 30
< 210 > 34 < 211 > 30 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 34 aacagcaagg aagattacaa atgatgccca 30
< 210 > 35 < 211 > 18 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 35 ggattaagat ctaatggc 18
< 210 > 36 < 211 > 23 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 36 caaagatctt catttcttaa aag 23
< 210 > 37 < 211 > 25 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 37 gcggatccat gtcggacaaa aaagg 25
< 210 > 38 < 211 > 25 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 38 gcggatcctc atccctggct gaagg 25
< 210 > 39 < 211 > 23 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 39 ggatccacca tggccacaag atg 23
< 210 > 40 < 211 > 24 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 40 ggatccttag tcaatatcat tagc 24
< 210 > 41 < 211 > 27 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 41 gcggatccat gggtcacgga ggagaag 27
< 210 > 42 < 211 > 27 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Oligonucleotide < 400 > 42 gcggatcctc agttgttatg atcagga 27
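As a hedged illustration of how the last two oligonucleotides relate to the listed proteins, the following Python sketch compares SEQ ID NO: 41 and SEQ ID NO: 42 with the ends of SEQ ID NO: 14. It assumes, although the listing does not say so, that the leading gcggatcc of each primer is a linker carrying a BamHI recognition site (ggatcc) and that the standard genetic code applies. All helper names are hypothetical.

```python
# Standard genetic code in the conventional t, c, a, g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}
BAMHI = "ggatcc"  # assumed linker site, not stated in the listing


def clean(seq):
    """Remove the blank spacing used in the listing and lowercase the bases."""
    return "".join(seq.lower().split())


def translate(seq):
    """Translate complete codons from the first base; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def after_site(primer):
    """Return the part of the primer that follows the assumed BamHI site."""
    start = primer.find(BAMHI)
    return primer[start + len(BAMHI):]


seq41 = clean("gcggatccat gggtcacgga ggagaag")  # SEQ ID NO: 41
seq42 = clean("gcggatcctc agttgttatg atcagga")  # SEQ ID NO: 42

forward_peptide = translate(after_site(seq41))

# The reverse primer is read on the opposite strand; drop leading bases so
# that translation ends exactly on the final codon.
sense = reverse_complement(after_site(seq42))
reverse_peptide = translate(sense[len(sense) % 3:])

print(len(seq41), len(seq42), forward_peptide, reverse_peptide)
```

Under these assumptions the forward primer reads Met Gly His Gly Gly Glu, the first six residues of SEQ ID NO: 14, and the reverse primer reads Pro Asp His Asn Asn followed by a stop codon, the last five residues of the same protein.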
Claims (28)
1. A DNA molecule that encodes an Mlo protein, wherein this Mlo protein comprises at least one of the sequences set forth in SEQ ID NO: 1 or SEQ ID NO: 2, and wherein this Mlo protein confers on a plant resistance to a fungal pathogen.
2. The DNA molecule of claim 1, wherein this DNA molecule is identical or substantially similar to any of the nucleotide sequences set forth in SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7, or encodes an Mlo protein identical or substantially similar to an Mlo protein set forth in SEQ ID NO: 4, SEQ ID NO: 6, or SEQ ID NO: 8.
3. The DNA molecule of claim 1, wherein this DNA molecule is identical or substantially similar to any of the nucleotide sequences set forth in SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, or SEQ ID NO: 17, or encodes an Mlo protein identical or substantially similar to an Mlo protein set forth in SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18.
4. The DNA molecule of any of claims 1 to 3, wherein this DNA molecule is not derived from barley.
5. The DNA molecule of any one of claims 1 to 4, wherein this DNA is modified, such that the activity of the endogenous protein is lost.
6. The DNA molecule of claim 5, wherein the DNA modification results in one, all, or a combination of the following changes in the amino acid sequence of the corresponding protein: Trp (163) to Arg; frameshift after Pro (396); frameshift after Trp (160); Met (1) to Ile; Gly (227) to Asp; Met (1) to Val; Arg (11) to Trp; Phe (183) and Thr (184) missing; Val (31) to Glu; Ser (32) to Phe; Leu (271) to His.
7. An anti-sense DNA molecule for a DNA molecule of any one of claims 1 to 6.
8. A protein comprising at least one of the sequences set forth in SEQ ID NO: 1 or SEQ ID NO: 2, wherein this protein is an Mlo protein and confers on a plant resistance to a fungal pathogen.
9. The protein of claim 8, wherein this protein is encoded by a nucleotide sequence identical or substantially similar to any of the sequences set forth in SEQ ID NO: 3, SEQ ID NO: 5, or SEQ ID NO: 7, or is identical or substantially similar to any of the proteins set forth in SEQ ID NO: 4, SEQ ID NO: 6, or SEQ ID NO: 8.
10. The protein of claim 8, wherein this protein is encoded by a nucleotide sequence identical or substantially similar to any of the sequences set forth in SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, or SEQ ID NO: 17, or is identical or substantially similar to any of the proteins set forth in SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18.
11. The protein of any of claims 8 to 10, wherein this protein is not derived from barley.
12. An expression cassette comprising a DNA molecule of any of claims 1 to 7.
13. A vector comprising an expression cassette of claim 12.
14. A cell comprising an expression cassette, or parts thereof, comprising a DNA molecule of any of claims 1 to 7, wherein this DNA molecule in the expression cassette can be expressed in that cell.
15. The cell of claim 14, wherein this DNA molecule is stably integrated into the genome of this cell.
16. The cell of any of claims 14 or 15, wherein this cell is a plant cell.
17. A plant comprising an expression cassette, or parts thereof, comprising a DNA molecule of any of claims 1 to 7, wherein this DNA molecule in the expression cassette can be expressed in that plant.
18. The plant of claim 17, wherein the DNA molecule is stably integrated into the genome of this plant.
19. An agricultural product comprising a plant comprising an isolated DNA molecule of any of claims 1 to 7, wherein this agricultural product has improved phytosanitary properties.
20. A method for making a plant resistant to a fungal pathogen, which comprises the steps of: a) expressing in the plant a DNA molecule of any of claims 1 to 6 in a "sense" orientation; or b) expressing in the plant a DNA molecule of any of claims 1 to 6 in an "anti-sense" orientation; or c) expressing in the plant a ribozyme capable of specifically cleaving a messenger RNA transcript encoded by an endogenous gene corresponding to a DNA molecule of any of claims 1 to 6; or d) expressing in the plant an aptamer specifically targeted to a protein or part of a protein encoded by a DNA molecule of any of claims 1 to 6; or e) expressing in the plant a mutated or truncated form of a DNA molecule of any one of claims 1 to 6; or f) modifying, by homologous recombination in the plant, at least one chromosomal copy of the gene corresponding to a DNA molecule of any of claims 1 to 6.
21. A plant made resistant to a fungal pathogen by the method of claim 20.
22. The plant of claim 21, wherein this fungal pathogen infects living epidermal cells.
23. The plant of claim 21, wherein the fungal pathogen is of the order Erysiphales.
24. The plant of claim 21, wherein the fungal pathogen is of the genus Erysiphe.
25. The plant of claim 21, wherein the fungal pathogen is Erysiphe graminis.
26. An agricultural product with improved phytosanitary properties obtained using the method of claim 20.
27. A method for isolating DNA molecules that encode Mio proteins, which comprises the steps of: a) mixing a degenerate oligonucleotide that codes when minus six amino acids of SEQ ID NO: 1, and a degenerate oligonucleotide complementary to a sequence encoding at least six amino acids of SEQ ID NO: 2, with DNA extracted from a plant, under conditions that allow the hybridization of these degenerate oligonucleotides in the DNA; and b) amplifying a DNA fragment of the DNA of this plant, wherein the DNA fragment comprises, at its left and right ends, nucleotide sequences that can be delayed with the degenerate oligonucleotides of step a); and c) obtaining a full-length cDNA clone comprising the DNA fragment from step b).
28. A method for mutating a DNA molecule of claim 1, wherein the DNA molecule has been cleaved into random double-stranded fragments of a desired size, and which comprises the steps of: a) adding to the resulting population of random double-stranded fragments one or more single-stranded or double-stranded oligonucleotides, wherein these oligonucleotides comprise an area of identity and an area of heterology to the double-stranded template polynucleotide; b) denaturing the resulting mixture of random double-stranded fragments and oligonucleotides into single-stranded fragments; c) incubating the resulting population of single-stranded fragments with a polymerase, under conditions that result in the annealing of these single-stranded fragments at the areas of identity to form pairs of annealed fragments, these areas of identity being sufficient for one member of a pair to prime replication of the other, thereby forming a mutated double-stranded polynucleotide; and d) repeating the second and third steps for at least two additional cycles, wherein the resulting mixture in the second step of an additional cycle includes the mutated double-stranded polynucleotide from the third step of the previous cycle, and the additional cycle forms a further mutated double-stranded polynucleotide.
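The degenerate oligonucleotides recited in claim 27 can be illustrated with a minimal Python sketch. It is not part of the claims and is not the applicant's procedure; it only shows one common way to back-translate a six-amino-acid peptide into an IUPAC-coded degenerate oligonucleotide, count the size of the resulting primer pool, and form the complementary (reverse) primer. The standard genetic code is assumed, the function names are hypothetical, and the peptide `WFMKDE` is an invented example, not a segment of SEQ ID NO: 1 or SEQ ID NO: 2.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
SET_TO_CODE = {frozenset(bases): code for code, bases in IUPAC_BASES.items()}

# Complement of each IUPAC symbol (e.g. R = A/G pairs with Y = T/C).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def degenerate_codon(aa: str) -> str:
    """Collapse all codons for one amino acid into a single IUPAC triplet.

    Collapsing position by position is a simplification: for the six-codon
    amino acids (Leu, Ser, Arg) the triplet also matches some codons that
    encode other amino acids.
    """
    codons = [codon for codon, a in CODON_TABLE.items() if a == aa]
    if not codons:
        raise ValueError(f"unknown amino acid: {aa!r}")
    return "".join(
        SET_TO_CODE[frozenset(codon[i] for codon in codons)] for i in range(3)
    )


def degenerate_oligo(peptide: str) -> str:
    """Back-translate a peptide (one-letter codes) into a degenerate oligo."""
    return "".join(degenerate_codon(aa) for aa in peptide)


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for symbol in oligo:
        count *= len(IUPAC_BASES[symbol])
    return count


def reverse_complement(oligo: str) -> str:
    """Reverse complement of a (possibly degenerate) oligo."""
    return oligo.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    forward = degenerate_oligo("WFMKDE")  # hypothetical peptide
    print(forward, degeneracy(forward))
    print(reverse_complement(forward))
```

For the invented peptide the forward oligonucleotide is `TGGTTYATGAARGAYGAR`, a pool of 16 sequences. Peptides rich in Trp and Met, which have a single codon each, keep the pool small, which is one reason such segments are often preferred when choosing primer sites.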
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/042,763 | 1998-03-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
MXPA00009108A (en) | 2001-07-09 |