CA2280894A1 - Production of mature proteins in plants
- Publication number
- CA2280894A1
- Authority
- CA
- Canada
- Prior art keywords
- mature
- seq
- bpn
- sequence
- amino acid
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Links
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 200
- 102000004169 proteins and genes Human genes 0.000 title claims abstract description 109
- 238000004519 manufacturing process Methods 0.000 title abstract description 26
- 108010076504 Protein Sorting Signals Proteins 0.000 claims abstract description 46
- 108091026890 Coding region Proteins 0.000 claims abstract description 43
- 108091006905 Human Serum Albumin Proteins 0.000 claims abstract description 40
- 102000008100 Human Serum Albumin Human genes 0.000 claims abstract description 40
- 102000004411 Antithrombin III Human genes 0.000 claims abstract description 39
- 108090000935 Antithrombin III Proteins 0.000 claims abstract description 38
- 229960005348 antithrombin iii Drugs 0.000 claims abstract description 37
- 235000007164 Oryza sativa Nutrition 0.000 claims abstract description 34
- 235000009566 rice Nutrition 0.000 claims abstract description 34
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 28
- 241000209510 Liliopsida Species 0.000 claims abstract description 25
- BPYKTIZUTYGOLE-IFADSCNNSA-N Bilirubin Chemical compound N1C(=O)C(C)=C(C=C)\C1=C\C1=C(C)C(CCC(O)=O)=C(CC2=C(C(C)=C(\C=C/3C(=C(C=C)C(=O)N\3)C)N2)CCC(O)=O)N1 BPYKTIZUTYGOLE-IFADSCNNSA-N 0.000 claims abstract description 22
- 108020001507 fusion proteins Proteins 0.000 claims abstract description 22
- 125000001429 N-terminal alpha-amino-acid group Chemical group 0.000 claims abstract description 19
- 102000037865 fusion proteins Human genes 0.000 claims abstract description 19
- 239000004382 Amylase Substances 0.000 claims abstract description 17
- 239000002753 trypsin inhibitor Substances 0.000 claims abstract description 17
- 241000282412 Homo Species 0.000 claims abstract description 13
- 230000013595 glycosylation Effects 0.000 claims abstract description 12
- 238000006206 glycosylation reaction Methods 0.000 claims abstract description 12
- 102000005158 Subtilisins Human genes 0.000 claims abstract description 11
- 108010056079 Subtilisins Proteins 0.000 claims abstract description 10
- 210000002966 serum Anatomy 0.000 claims abstract description 10
- 241000193830 Bacillus <bacterium> Species 0.000 claims abstract description 6
- 240000007594 Oryza sativa Species 0.000 claims abstract 6
- 241000196324 Embryophyta Species 0.000 claims description 61
- 238000000034 method Methods 0.000 claims description 48
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 36
- 230000001105 regulatory effect Effects 0.000 claims description 32
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 27
- 235000007340 Hordeum vulgare Nutrition 0.000 claims description 21
- 230000002103 transcriptional effect Effects 0.000 claims description 20
- 230000001939 inductive effect Effects 0.000 claims description 18
- 150000003384 small molecules Chemical class 0.000 claims description 15
- 235000000346 sugar Nutrition 0.000 claims description 14
- 229920001184 polypeptide Polymers 0.000 claims description 12
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 12
- IXORZMNAPKEEDV-UHFFFAOYSA-N gibberellic acid GA3 Natural products OC(=O)C1C2(C3)CC(=C)C3(O)CCC2C2(C=CC3O)C1C3(C)C(=O)O2 IXORZMNAPKEEDV-UHFFFAOYSA-N 0.000 claims description 10
- 230000028327 secretion Effects 0.000 claims description 10
- 239000005980 Gibberellic acid Substances 0.000 claims description 9
- IXORZMNAPKEEDV-OBDJNFEBSA-N gibberellin A3 Chemical group C([C@@]1(O)C(=C)C[C@@]2(C1)[C@H]1C(O)=O)C[C@H]2[C@]2(C=C[C@@H]3O)[C@H]1[C@]3(C)C(=O)O2 IXORZMNAPKEEDV-OBDJNFEBSA-N 0.000 claims description 9
- 230000005562 seed maturation Effects 0.000 claims description 9
- 108091036066 Three prime untranslated region Proteins 0.000 claims description 7
- 230000007226 seed germination Effects 0.000 claims description 7
- 108091026898 Leader sequence (mRNA) Proteins 0.000 claims description 5
- 241000209056 Secale Species 0.000 claims description 4
- 235000007319 Avena orientalis Nutrition 0.000 claims description 3
- 235000007238 Secale cereale Nutrition 0.000 claims description 3
- 244000062793 Sorghum vulgare Species 0.000 claims description 3
- 235000021307 Triticum Nutrition 0.000 claims description 3
- 240000008042 Zea mays Species 0.000 claims description 3
- 235000002017 Zea mays subsp mays Nutrition 0.000 claims description 3
- 108010050181 aleurone Proteins 0.000 claims description 3
- 238000006243 chemical reaction Methods 0.000 claims description 3
- 235000019713 millet Nutrition 0.000 claims description 3
- 230000001131 transforming effect Effects 0.000 claims description 3
- 235000007558 Avena sp Nutrition 0.000 claims description 2
- 240000006394 Sorghum bicolor Species 0.000 claims description 2
- 235000011684 Sorghum saccharatum Nutrition 0.000 claims description 2
- 235000005824 Zea mays ssp. parviglumis Nutrition 0.000 claims description 2
- 235000005822 corn Nutrition 0.000 claims description 2
- 230000001737 promoting effect Effects 0.000 claims description 2
- 240000005979 Hordeum vulgare Species 0.000 claims 5
- 238000012258 culturing Methods 0.000 claims 2
- 241000209763 Avena sativa Species 0.000 claims 1
- 244000098338 Triticum aestivum Species 0.000 claims 1
- 239000002773 nucleotide Substances 0.000 claims 1
- 125000003729 nucleotide group Chemical group 0.000 claims 1
- 210000004027 cell Anatomy 0.000 abstract description 65
- 230000009261 transgenic effect Effects 0.000 abstract description 7
- 125000001433 C-terminal amino-acid group Chemical group 0.000 abstract description 2
- 235000018102 proteins Nutrition 0.000 description 91
- 235000001014 amino acid Nutrition 0.000 description 38
- 229940024606 amino acid Drugs 0.000 description 37
- 150000001413 amino acids Chemical class 0.000 description 37
- 241000209094 Oryza Species 0.000 description 30
- 150000007523 nucleic acids Chemical group 0.000 description 26
- 108020004707 nucleic acids Proteins 0.000 description 25
- 102000039446 nucleic acids Human genes 0.000 description 25
- 239000013598 vector Substances 0.000 description 24
- 238000004113 cell culture Methods 0.000 description 19
- 241000209219 Hordeum Species 0.000 description 17
- 108020004705 Codon Proteins 0.000 description 16
- 230000009466 transformation Effects 0.000 description 16
- 108020004414 DNA Proteins 0.000 description 14
- 102000004190 Enzymes Human genes 0.000 description 13
- 108090000790 Enzymes Proteins 0.000 description 13
- 229940088598 enzyme Drugs 0.000 description 13
- 238000001262 western blot Methods 0.000 description 13
- 206010020649 Hyperkeratosis Diseases 0.000 description 12
- 239000012634 fragment Substances 0.000 description 12
- 230000035784 germination Effects 0.000 description 12
- 238000013518 transcription Methods 0.000 description 11
- 230000035897 transcription Effects 0.000 description 11
- 230000000694 effects Effects 0.000 description 10
- 239000002609 medium Substances 0.000 description 10
- 239000013612 plasmid Substances 0.000 description 10
- 108091034117 Oligonucleotide Proteins 0.000 description 9
- 235000013339 cereals Nutrition 0.000 description 9
- 230000001225 therapeutic effect Effects 0.000 description 9
- 210000001519 tissue Anatomy 0.000 description 9
- 108020003589 5' Untranslated Regions Proteins 0.000 description 8
- 239000013604 expression vector Substances 0.000 description 8
- 239000001963 growth medium Substances 0.000 description 8
- 230000000813 microbial effect Effects 0.000 description 8
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 7
- 108010050848 glycylleucine Proteins 0.000 description 7
- 230000006698 induction Effects 0.000 description 7
- 238000003780 insertion Methods 0.000 description 7
- 230000037431 insertion Effects 0.000 description 7
- 239000002207 metabolite Substances 0.000 description 7
- 239000006228 supernatant Substances 0.000 description 7
- 108020005345 3' Untranslated Regions Proteins 0.000 description 6
- HLXHCNWEVQNNKA-UHFFFAOYSA-N 5-methoxy-2,3-dihydro-1h-inden-2-amine Chemical compound COC1=CC=C2CC(N)CC2=C1 HLXHCNWEVQNNKA-UHFFFAOYSA-N 0.000 description 6
- MDSUKZSLOATHMH-UHFFFAOYSA-N N-L-leucyl-L-valine Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(O)=O MDSUKZSLOATHMH-UHFFFAOYSA-N 0.000 description 6
- 230000004988 N-glycosylation Effects 0.000 description 6
- 108010067372 Pancreatic elastase Proteins 0.000 description 6
- 102000016387 Pancreatic elastase Human genes 0.000 description 6
- 210000004899 c-terminal region Anatomy 0.000 description 6
- 108010015792 glycyllysine Proteins 0.000 description 6
- 230000014616 translation Effects 0.000 description 6
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 5
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 5
- 108010068370 Glutens Proteins 0.000 description 5
- CQZDZKRHFWJXDF-WDSKDSINSA-N Gly-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CN CQZDZKRHFWJXDF-WDSKDSINSA-N 0.000 description 5
- 101000823116 Homo sapiens Alpha-1-antitrypsin Proteins 0.000 description 5
- DEFJQIDDEAULHB-IMJSIDKUSA-N L-alanyl-L-alanine Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(O)=O DEFJQIDDEAULHB-IMJSIDKUSA-N 0.000 description 5
- MDSUKZSLOATHMH-IUCAKERBSA-N Leu-Val Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C([O-])=O MDSUKZSLOATHMH-IUCAKERBSA-N 0.000 description 5
- 229930006000 Sucrose Natural products 0.000 description 5
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 5
- 108010056243 alanylalanine Proteins 0.000 description 5
- 108010047495 alanylglycine Proteins 0.000 description 5
- 238000013459 approach Methods 0.000 description 5
- 230000000295 complement effect Effects 0.000 description 5
- 230000006870 function Effects 0.000 description 5
- 238000004890 malting Methods 0.000 description 5
- 239000003550 marker Substances 0.000 description 5
- 108010091311 prosubtilisin Proteins 0.000 description 5
- 238000010561 standard procedure Methods 0.000 description 5
- 238000006467 substitution reaction Methods 0.000 description 5
- 239000005720 sucrose Substances 0.000 description 5
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 5
- BUANFPRKJKJSRR-ACZMJKKPSA-N Ala-Ala-Gln Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CCC(N)=O BUANFPRKJKJSRR-ACZMJKKPSA-N 0.000 description 4
- UXRVDHVARNBOIO-QSFUFRPTSA-N Asp-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(=O)O)N UXRVDHVARNBOIO-QSFUFRPTSA-N 0.000 description 4
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 4
- NVGBPTNZLWRQSY-UWVGGRQHSA-N Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN NVGBPTNZLWRQSY-UWVGGRQHSA-N 0.000 description 4
- YQAIUOWPSUOINN-IUCAKERBSA-N Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN YQAIUOWPSUOINN-IUCAKERBSA-N 0.000 description 4
- 108010079364 N-glycylalanine Proteins 0.000 description 4
- 108090000787 Subtilisin Proteins 0.000 description 4
- OBTCMSPFOITUIJ-FSPLSTOPSA-N Val-Asp Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O OBTCMSPFOITUIJ-FSPLSTOPSA-N 0.000 description 4
- GJNDXQBALKCYSZ-RYUDHWBXSA-N Val-Phe Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 GJNDXQBALKCYSZ-RYUDHWBXSA-N 0.000 description 4
- 239000002253 acid Substances 0.000 description 4
- ZVDPYSVOZFINEE-BQBZGAKWSA-N alpha-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O ZVDPYSVOZFINEE-BQBZGAKWSA-N 0.000 description 4
- 230000015572 biosynthetic process Effects 0.000 description 4
- 230000007812 deficiency Effects 0.000 description 4
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 4
- 108010054155 lysyllysine Proteins 0.000 description 4
- 108010017391 lysylvaline Proteins 0.000 description 4
- 230000008488 polyadenylation Effects 0.000 description 4
- 239000000047 product Substances 0.000 description 4
- 238000000746 purification Methods 0.000 description 4
- 108010026333 seryl-proline Proteins 0.000 description 4
- 150000008163 sugars Chemical class 0.000 description 4
- 238000003786 synthesis reaction Methods 0.000 description 4
- 108010073969 valyllysine Proteins 0.000 description 4
- 241000589155 Agrobacterium tumefaciens Species 0.000 description 3
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 3
- CXISPYVYMQWFLE-VKHMYHEASA-N Ala-Gly Chemical compound C[C@H]([NH3+])C(=O)NCC([O-])=O CXISPYVYMQWFLE-VKHMYHEASA-N 0.000 description 3
- IPWKGIFRRBGCJO-IMJSIDKUSA-N Ala-Ser Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](CO)C([O-])=O IPWKGIFRRBGCJO-IMJSIDKUSA-N 0.000 description 3
- KXFCBAHYSLJCCY-ZLUOBGJFSA-N Asn-Asn-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O KXFCBAHYSLJCCY-ZLUOBGJFSA-N 0.000 description 3
- IQTUDDBANZYMAR-WDSKDSINSA-N Asn-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O IQTUDDBANZYMAR-WDSKDSINSA-N 0.000 description 3
- 244000075850 Avena orientalis Species 0.000 description 3
- 241000894006 Bacteria Species 0.000 description 3
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 3
- SSHIXEILTLPAQT-WHFBIAKZSA-N Gln-Asp Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O SSHIXEILTLPAQT-WHFBIAKZSA-N 0.000 description 3
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 3
- SCCPDJAQCXWPTF-VKHMYHEASA-N Gly-Asp Chemical compound NCC(=O)N[C@H](C(O)=O)CC(O)=O SCCPDJAQCXWPTF-VKHMYHEASA-N 0.000 description 3
- IKAIKUBBJHFNBZ-LURJTMIESA-N Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CN IKAIKUBBJHFNBZ-LURJTMIESA-N 0.000 description 3
- OLIFSFOFKGKIRH-WUJLRWPWSA-N Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CN OLIFSFOFKGKIRH-WUJLRWPWSA-N 0.000 description 3
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 3
- HTTJABKRGRZYRN-UHFFFAOYSA-N Heparin Chemical compound OC1C(NC(=O)C)C(O)OC(COS(O)(=O)=O)C1OC1C(OS(O)(=O)=O)C(O)C(OC2C(C(OS(O)(=O)=O)C(OC3C(C(O)C(O)C(O3)C(O)=O)OS(O)(=O)=O)C(CO)O2)NS(O)(=O)=O)C(C(O)=O)O1 HTTJABKRGRZYRN-UHFFFAOYSA-N 0.000 description 3
- 101000757319 Homo sapiens Antithrombin-III Proteins 0.000 description 3
- BCVIOZZGJNOEQS-XKNYDFJKSA-N Ile-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)[C@@H](C)CC BCVIOZZGJNOEQS-XKNYDFJKSA-N 0.000 description 3
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 3
- 241000880493 Leptailurus serval Species 0.000 description 3
- FZIJIFCXUCZHOL-CIUDSAMLSA-N Lys-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN FZIJIFCXUCZHOL-CIUDSAMLSA-N 0.000 description 3
- OAPNERBWQWUPTI-YUMQZZPRSA-N Lys-Gln Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O OAPNERBWQWUPTI-YUMQZZPRSA-N 0.000 description 3
- ATIPDCIQTUXABX-UWVGGRQHSA-N Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ATIPDCIQTUXABX-UWVGGRQHSA-N 0.000 description 3
- VMTYLUGCXIEDMV-QWRGUYRKSA-N Lys-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN VMTYLUGCXIEDMV-QWRGUYRKSA-N 0.000 description 3
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 3
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 3
- 229910019142 PO4 Inorganic materials 0.000 description 3
- 241000209504 Poaceae Species 0.000 description 3
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 3
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 3
- LAFKUZYWNCHOHT-WHFBIAKZSA-N Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O LAFKUZYWNCHOHT-WHFBIAKZSA-N 0.000 description 3
- NFDYGNFETJVMSE-BQBZGAKWSA-N Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CO NFDYGNFETJVMSE-BQBZGAKWSA-N 0.000 description 3
- WBAXJMCUFIXCNI-WDSKDSINSA-N Ser-Pro Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WBAXJMCUFIXCNI-WDSKDSINSA-N 0.000 description 3
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 3
- DKDHTRVDOUZZTP-IFFSRLJSSA-N Thr-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DKDHTRVDOUZZTP-IFFSRLJSSA-N 0.000 description 3
- IQHUITKNHOKGFC-MIMYLULJSA-N Thr-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IQHUITKNHOKGFC-MIMYLULJSA-N 0.000 description 3
- HPYDSVWYXXKHRD-VIFPVBQESA-N Tyr-Gly Chemical compound [O-]C(=O)CNC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 HPYDSVWYXXKHRD-VIFPVBQESA-N 0.000 description 3
- VNYDHJARLHNEGA-RYUDHWBXSA-N Tyr-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=C(O)C=C1 VNYDHJARLHNEGA-RYUDHWBXSA-N 0.000 description 3
- 108091023045 Untranslated Region Proteins 0.000 description 3
- JKHXYJKMNSSFFL-IUCAKERBSA-N Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN JKHXYJKMNSSFFL-IUCAKERBSA-N 0.000 description 3
- KRNYOVHEKOBTEF-YUMQZZPRSA-N Val-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O KRNYOVHEKOBTEF-YUMQZZPRSA-N 0.000 description 3
- 108010044940 alanylglutamine Proteins 0.000 description 3
- 235000009582 asparagine Nutrition 0.000 description 3
- 238000003556 assay Methods 0.000 description 3
- 238000010367 cloning Methods 0.000 description 3
- 238000010276 construction Methods 0.000 description 3
- 108010004073 cysteinylcysteine Proteins 0.000 description 3
- 108010089804 glycyl-threonine Proteins 0.000 description 3
- 229960002897 heparin Drugs 0.000 description 3
- 229920000669 heparin Polymers 0.000 description 3
- 239000005556 hormone Substances 0.000 description 3
- 229940088597 hormone Drugs 0.000 description 3
- 238000005213 imbibition Methods 0.000 description 3
- 238000011534 incubation Methods 0.000 description 3
- 239000003262 industrial enzyme Substances 0.000 description 3
- 239000003112 inhibitor Substances 0.000 description 3
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 3
- 108020004999 messenger RNA Proteins 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical class [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 3
- 239000010452 phosphate Substances 0.000 description 3
- 239000008363 phosphate buffer Substances 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 108010031719 prolyl-serine Proteins 0.000 description 3
- 210000001938 protoplast Anatomy 0.000 description 3
- 238000010188 recombinant method Methods 0.000 description 3
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 3
- 230000005030 transcription termination Effects 0.000 description 3
- 238000013519 translation Methods 0.000 description 3
- 108010020532 tyrosyl-proline Proteins 0.000 description 3
- 108010021889 valylvaline Proteins 0.000 description 3
- LSLXWOCIIFUZCQ-SRVKXCTJSA-N (2S)-2-[[(2S)-2-[[(2S)-2-amino-3-methyl-1-oxobutyl]amino]-3-methyl-1-oxobutyl]amino]-3-methylbutanoic acid Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O LSLXWOCIIFUZCQ-SRVKXCTJSA-N 0.000 description 2
- AUXMWYRZQPIXCC-KNIFDHDWSA-N (2s)-2-amino-4-methylpentanoic acid;(2s)-2-aminopropanoic acid Chemical compound C[C@H](N)C(O)=O.CC(C)C[C@H](N)C(O)=O AUXMWYRZQPIXCC-KNIFDHDWSA-N 0.000 description 2
- RVLOMLVNNBWRSR-KNIFDHDWSA-N (2s)-2-aminopropanoic acid;(2s)-2,6-diaminohexanoic acid Chemical compound C[C@H](N)C(O)=O.NCCCC[C@H](N)C(O)=O RVLOMLVNNBWRSR-KNIFDHDWSA-N 0.000 description 2
- 241000589158 Agrobacterium Species 0.000 description 2
- XAEWTDMGFGHWFK-IMJSIDKUSA-N Ala-Asp Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O XAEWTDMGFGHWFK-IMJSIDKUSA-N 0.000 description 2
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 2
- HJCMDXDYPOUFDY-WHFBIAKZSA-N Ala-Gln Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O HJCMDXDYPOUFDY-WHFBIAKZSA-N 0.000 description 2
- FBHOPGDGELNWRH-DRZSPHRISA-N Ala-Glu-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O FBHOPGDGELNWRH-DRZSPHRISA-N 0.000 description 2
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 2
- XZWXFWBHYRFLEF-FSPLSTOPSA-N Ala-His Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 XZWXFWBHYRFLEF-FSPLSTOPSA-N 0.000 description 2
- ZSOICJZJSRWNHX-ACZMJKKPSA-N Ala-Ile Chemical compound CC[C@H](C)[C@@H](C([O-])=O)NC(=O)[C@H](C)[NH3+] ZSOICJZJSRWNHX-ACZMJKKPSA-N 0.000 description 2
- WUHJHHGYVVJMQE-BJDJZHNGSA-N Ala-Leu-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WUHJHHGYVVJMQE-BJDJZHNGSA-N 0.000 description 2
- OINVDEKBKBCPLX-JXUBOQSCSA-N Ala-Lys-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OINVDEKBKBCPLX-JXUBOQSCSA-N 0.000 description 2
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 2
- XRUJOVRWNMBAAA-NHCYSSNCSA-N Ala-Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 XRUJOVRWNMBAAA-NHCYSSNCSA-N 0.000 description 2
- IPZQNYYAYVRKKK-FXQIFTODSA-N Ala-Pro-Ala Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IPZQNYYAYVRKKK-FXQIFTODSA-N 0.000 description 2
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 2
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 2
- AOAKQKVICDWCLB-UWJYBYFXSA-N Ala-Tyr-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AOAKQKVICDWCLB-UWJYBYFXSA-N 0.000 description 2
- WYBVBIHNJWOLCJ-IUCAKERBSA-N Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N WYBVBIHNJWOLCJ-IUCAKERBSA-N 0.000 description 2
- LQJAALCCPOTJGB-YUMQZZPRSA-N Arg-Pro Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O LQJAALCCPOTJGB-YUMQZZPRSA-N 0.000 description 2
- IJYZHIOOBGIINM-WDSKDSINSA-N Arg-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N IJYZHIOOBGIINM-WDSKDSINSA-N 0.000 description 2
- SJUXYGVRSGTPMC-IMJSIDKUSA-N Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O SJUXYGVRSGTPMC-IMJSIDKUSA-N 0.000 description 2
- HZYFHQOWCFUSOV-IMJSIDKUSA-N Asn-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O HZYFHQOWCFUSOV-IMJSIDKUSA-N 0.000 description 2
- TWXZVVXRRRRSLT-IMJSIDKUSA-N Asn-Cys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CS)C(O)=O TWXZVVXRRRRSLT-IMJSIDKUSA-N 0.000 description 2
- IIFDPDVJAHQFSR-WHFBIAKZSA-N Asn-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O IIFDPDVJAHQFSR-WHFBIAKZSA-N 0.000 description 2
- SZNGQSBRHFMZLT-IHRRRGAJSA-N Asn-Pro-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SZNGQSBRHFMZLT-IHRRRGAJSA-N 0.000 description 2
- VBKIFHUVGLOJKT-FKZODXBYSA-N Asn-Thr Chemical compound C[C@@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)N)O VBKIFHUVGLOJKT-FKZODXBYSA-N 0.000 description 2
- ULZOQOKFYMXHPZ-AQZXSJQPSA-N Asn-Trp-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ULZOQOKFYMXHPZ-AQZXSJQPSA-N 0.000 description 2
- CKAJHWFHHFSCDT-WHFBIAKZSA-N Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O CKAJHWFHHFSCDT-WHFBIAKZSA-N 0.000 description 2
- NTQDELBZOMWXRS-IWGUZYHVSA-N Asp-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O NTQDELBZOMWXRS-IWGUZYHVSA-N 0.000 description 2
- 241000193744 Bacillus amyloliquefaciens Species 0.000 description 2
- OABOXRPGTFRBFZ-IMJSIDKUSA-N Cys-Cys Chemical compound SC[C@H](N)C(=O)N[C@@H](CS)C(O)=O OABOXRPGTFRBFZ-IMJSIDKUSA-N 0.000 description 2
- XZFYRXDAULDNFX-UWVGGRQHSA-N Cys-Phe Chemical compound SC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UWVGGRQHSA-N 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- 108010074860 Factor Xa Proteins 0.000 description 2
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 2
- YJIUYQKQBBQYHZ-ACZMJKKPSA-N Gln-Ala-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YJIUYQKQBBQYHZ-ACZMJKKPSA-N 0.000 description 2
- OPINTGHFESTVAX-BQBZGAKWSA-N Gln-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N OPINTGHFESTVAX-BQBZGAKWSA-N 0.000 description 2
- KGNSGRRALVIRGR-QWRGUYRKSA-N Gln-Tyr Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KGNSGRRALVIRGR-QWRGUYRKSA-N 0.000 description 2
- LSPKYLAFTPBWIL-BYPYZUCNSA-N Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(O)=O LSPKYLAFTPBWIL-BYPYZUCNSA-N 0.000 description 2
- OQXDUSZKISQQSS-GUBZILKMSA-N Glu-Lys-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OQXDUSZKISQQSS-GUBZILKMSA-N 0.000 description 2
- OCJRHJZKGGSPRW-IUCAKERBSA-N Glu-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O OCJRHJZKGGSPRW-IUCAKERBSA-N 0.000 description 2
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 2
- SITLTJHOQZFJGG-XPUUQOCRSA-N Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SITLTJHOQZFJGG-XPUUQOCRSA-N 0.000 description 2
- VPZXBVLAVMBEQI-VKHMYHEASA-N Glycyl-alanine Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 2
- DCQMJRSOGCYKTR-GHCJXIJMSA-N Ile-Asp-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O DCQMJRSOGCYKTR-GHCJXIJMSA-N 0.000 description 2
- UWBDLNOCIDGPQE-GUBZILKMSA-N Ile-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN UWBDLNOCIDGPQE-GUBZILKMSA-N 0.000 description 2
- CIDLJWVDMNDKPT-FIRPJDEBSA-N Ile-Phe-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N CIDLJWVDMNDKPT-FIRPJDEBSA-N 0.000 description 2
- VYZAGTDAHUIRQA-WHFBIAKZSA-N L-alanyl-L-glutamic acid Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O VYZAGTDAHUIRQA-WHFBIAKZSA-N 0.000 description 2
- QOOWRKBDDXQRHC-BQBZGAKWSA-N L-lysyl-L-alanine Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN QOOWRKBDDXQRHC-BQBZGAKWSA-N 0.000 description 2
- MLTRLIITQPXHBJ-BQBZGAKWSA-N Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O MLTRLIITQPXHBJ-BQBZGAKWSA-N 0.000 description 2
- RFUBXQQFJFGJFV-GUBZILKMSA-N Leu-Asn-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RFUBXQQFJFGJFV-GUBZILKMSA-N 0.000 description 2
- NFNVDJGXRFEYTK-YUMQZZPRSA-N Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O NFNVDJGXRFEYTK-YUMQZZPRSA-N 0.000 description 2
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 2
- LESXFEZIFXFIQR-LURJTMIESA-N Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(O)=O LESXFEZIFXFIQR-LURJTMIESA-N 0.000 description 2
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 2
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 2
- XGDCYUQSFDQISZ-BQBZGAKWSA-N Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O XGDCYUQSFDQISZ-BQBZGAKWSA-N 0.000 description 2
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 2
- IRNSXVOWSXSULE-DCAQKATOSA-N Lys-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN IRNSXVOWSXSULE-DCAQKATOSA-N 0.000 description 2
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 2
- QCZYYEFXOBKCNQ-STQMWFEESA-N Lys-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QCZYYEFXOBKCNQ-STQMWFEESA-N 0.000 description 2
- VWPJQIHBBOJWDN-DCAQKATOSA-N Lys-Val-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O VWPJQIHBBOJWDN-DCAQKATOSA-N 0.000 description 2
- 239000007987 MES buffer Substances 0.000 description 2
- WXHHTBVYQOSYSL-FXQIFTODSA-N Met-Ala-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O WXHHTBVYQOSYSL-FXQIFTODSA-N 0.000 description 2
- 241001465754 Metazoa Species 0.000 description 2
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 2
- WYBVBIHNJWOLCJ-UHFFFAOYSA-N N-L-arginyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCCN=C(N)N WYBVBIHNJWOLCJ-UHFFFAOYSA-N 0.000 description 2
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 2
- 125000000729 N-terminal amino-acid group Chemical group 0.000 description 2
- UXQFHEKRGHYJRA-STQMWFEESA-N Phe-Met-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O UXQFHEKRGHYJRA-STQMWFEESA-N 0.000 description 2
- MCIXMYKSPQUMJG-SRVKXCTJSA-N Phe-Ser-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MCIXMYKSPQUMJG-SRVKXCTJSA-N 0.000 description 2
- IAJOBQBIJHVGMQ-UHFFFAOYSA-N Phosphinothricin Natural products CP(O)(=O)CCC(N)C(O)=O IAJOBQBIJHVGMQ-UHFFFAOYSA-N 0.000 description 2
- GLEOIKLQBZNKJZ-WDSKDSINSA-N Pro-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 GLEOIKLQBZNKJZ-WDSKDSINSA-N 0.000 description 2
- KPDRZQUWJKTMBP-DCAQKATOSA-N Pro-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 KPDRZQUWJKTMBP-DCAQKATOSA-N 0.000 description 2
- UIMCLYYSUCIUJM-UWVGGRQHSA-N Pro-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 UIMCLYYSUCIUJM-UWVGGRQHSA-N 0.000 description 2
- LPGSNRSLPHRNBW-AVGNSLFASA-N Pro-His-Val Chemical compound C([C@@H](C(=O)N[C@@H](C(C)C)C([O-])=O)NC(=O)[C@H]1[NH2+]CCC1)C1=CN=CN1 LPGSNRSLPHRNBW-AVGNSLFASA-N 0.000 description 2
- IWIANZLCJVYEFX-RYUDHWBXSA-N Pro-Phe Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 IWIANZLCJVYEFX-RYUDHWBXSA-N 0.000 description 2
- AFWBWPCXSWUCLB-WDSKDSINSA-N Pro-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 AFWBWPCXSWUCLB-WDSKDSINSA-N 0.000 description 2
- FNGOXVQBBCMFKV-CIUDSAMLSA-N Pro-Ser-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O FNGOXVQBBCMFKV-CIUDSAMLSA-N 0.000 description 2
- GVUVRRPYYDHHGK-VQVTYTSYSA-N Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 GVUVRRPYYDHHGK-VQVTYTSYSA-N 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- ZUGXSSFMTXKHJS-ZLUOBGJFSA-N Ser-Ala-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O ZUGXSSFMTXKHJS-ZLUOBGJFSA-N 0.000 description 2
- VAUMZJHYZQXZBQ-WHFBIAKZSA-N Ser-Asn-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O VAUMZJHYZQXZBQ-WHFBIAKZSA-N 0.000 description 2
- TYYBJUYSTWJHGO-ZKWXMUAHSA-N Ser-Asn-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TYYBJUYSTWJHGO-ZKWXMUAHSA-N 0.000 description 2
- WOUIMBGNEUWXQG-VKHMYHEASA-N Ser-Gly Chemical compound OC[C@H](N)C(=O)NCC(O)=O WOUIMBGNEUWXQG-VKHMYHEASA-N 0.000 description 2
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 2
- XZKQVQKUZMAADP-IMJSIDKUSA-N Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(O)=O XZKQVQKUZMAADP-IMJSIDKUSA-N 0.000 description 2
- YEDSOSIKVUMIJE-DCAQKATOSA-N Ser-Val-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O YEDSOSIKVUMIJE-DCAQKATOSA-N 0.000 description 2
- 229920002472 Starch Polymers 0.000 description 2
- 108010043934 Sucrose synthase Proteins 0.000 description 2
- 108020005038 Terminator Codon Proteins 0.000 description 2
- OJRNZRROAIAHDL-LKXGYXEUSA-N Thr-Asn-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OJRNZRROAIAHDL-LKXGYXEUSA-N 0.000 description 2
- IMDMLDSVUSMAEJ-HJGDQZAQSA-N Thr-Leu-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IMDMLDSVUSMAEJ-HJGDQZAQSA-N 0.000 description 2
- YKRQRPFODDJQTC-CSMHCCOUSA-N Thr-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN YKRQRPFODDJQTC-CSMHCCOUSA-N 0.000 description 2
- WRUWXBBEFUTJOU-XGEHTFHBSA-N Thr-Met-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N)O WRUWXBBEFUTJOU-XGEHTFHBSA-N 0.000 description 2
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 2
- 108090000190 Thrombin Proteins 0.000 description 2
- 241000209140 Triticum Species 0.000 description 2
- HHPSUFUXXBOFQY-AQZXSJQPSA-N Trp-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O HHPSUFUXXBOFQY-AQZXSJQPSA-N 0.000 description 2
- 108090000631 Trypsin Proteins 0.000 description 2
- 102000004142 Trypsin Human genes 0.000 description 2
- NLKUJNGEGZDXGO-XVKPBYJWSA-N Tyr-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NLKUJNGEGZDXGO-XVKPBYJWSA-N 0.000 description 2
- ZSXJENBJGRHKIG-UWVGGRQHSA-N Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZSXJENBJGRHKIG-UWVGGRQHSA-N 0.000 description 2
- PLVVHGFEMSDRET-IHPCNDPISA-N Tyr-Ser-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC3=CC=C(C=C3)O)N PLVVHGFEMSDRET-IHPCNDPISA-N 0.000 description 2
- HSRXSKHRSXRCFC-WDSKDSINSA-N Val-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(O)=O HSRXSKHRSXRCFC-WDSKDSINSA-N 0.000 description 2
- IJBTVYLICXHDRI-FXQIFTODSA-N Val-Ala-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IJBTVYLICXHDRI-FXQIFTODSA-N 0.000 description 2
- IJBTVYLICXHDRI-UHFFFAOYSA-N Val-Ala-Ala Natural products CC(C)C(N)C(=O)NC(C)C(=O)NC(C)C(O)=O IJBTVYLICXHDRI-UHFFFAOYSA-N 0.000 description 2
- ASQFIHTXXMFENG-XPUUQOCRSA-N Val-Ala-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O ASQFIHTXXMFENG-XPUUQOCRSA-N 0.000 description 2
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 2
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 2
- YTPLVNUZZOBFFC-SCZZXKLOSA-N Val-Gly-Pro Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N1CCC[C@@H]1C(O)=O YTPLVNUZZOBFFC-SCZZXKLOSA-N 0.000 description 2
- PNVLWFYAPWAQMU-CIUDSAMLSA-N Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)C(C)C PNVLWFYAPWAQMU-CIUDSAMLSA-N 0.000 description 2
- RYHUIHUOYRNNIE-NRPADANISA-N Val-Ser-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RYHUIHUOYRNNIE-NRPADANISA-N 0.000 description 2
- STTYIMSDIYISRG-UHFFFAOYSA-N Valyl-Serine Chemical compound CC(C)C(N)C(=O)NC(CO)C(O)=O STTYIMSDIYISRG-UHFFFAOYSA-N 0.000 description 2
- 230000004913 activation Effects 0.000 description 2
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 2
- 108010041407 alanylaspartic acid Proteins 0.000 description 2
- 108010005233 alanylglutamic acid Proteins 0.000 description 2
- 108010070944 alanylhistidine Proteins 0.000 description 2
- 108010060035 arginylproline Proteins 0.000 description 2
- 150000001508 asparagines Chemical class 0.000 description 2
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 2
- 108010038633 aspartylglutamate Proteins 0.000 description 2
- 230000033228 biological regulation Effects 0.000 description 2
- 239000006143 cell culture medium Substances 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 2
- 238000001962 electrophoresis Methods 0.000 description 2
- 238000000855 fermentation Methods 0.000 description 2
- 230000004151 fermentation Effects 0.000 description 2
- IAJOBQBIJHVGMQ-BYPYZUCNSA-N glufosinate-P Chemical compound CP(O)(=O)CC[C@H](N)C(O)=O IAJOBQBIJHVGMQ-BYPYZUCNSA-N 0.000 description 2
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 2
- 108010079413 glycyl-prolyl-glutamic acid Proteins 0.000 description 2
- 108010087823 glycyltyrosine Proteins 0.000 description 2
- 108010037850 glycylvaline Proteins 0.000 description 2
- 238000000227 grinding Methods 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- 208000015181 infectious disease Diseases 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 229930027917 kanamycin Natural products 0.000 description 2
- 229960000318 kanamycin Drugs 0.000 description 2
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 2
- 229930182823 kanamycin A Natural products 0.000 description 2
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 2
- 108010034529 leucyl-lysine Proteins 0.000 description 2
- 108010057821 leucylproline Proteins 0.000 description 2
- 108010003700 lysyl aspartic acid Proteins 0.000 description 2
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 2
- 108010009298 lysylglutamic acid Proteins 0.000 description 2
- 108010064235 lysylglycine Proteins 0.000 description 2
- 108010038320 lysylphenylalanine Proteins 0.000 description 2
- 210000001161 mammalian embryo Anatomy 0.000 description 2
- 230000002503 metabolic effect Effects 0.000 description 2
- 239000002184 metal Substances 0.000 description 2
- 229910052751 metal Inorganic materials 0.000 description 2
- 239000011859 microparticle Substances 0.000 description 2
- 238000013508 migration Methods 0.000 description 2
- 230000005012 migration Effects 0.000 description 2
- 210000004897 n-terminal region Anatomy 0.000 description 2
- 150000002482 oligosaccharides Polymers 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000003204 osmotic effect Effects 0.000 description 2
- 230000003647 oxidation Effects 0.000 description 2
- 238000007254 oxidation reaction Methods 0.000 description 2
- 239000002245 particle Substances 0.000 description 2
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 2
- 108010024607 phenylalanylalanine Proteins 0.000 description 2
- 108010051242 phenylalanylserine Proteins 0.000 description 2
- 239000003375 plant hormone Substances 0.000 description 2
- 108010029020 prolylglycine Proteins 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 230000010076 replication Effects 0.000 description 2
- 229920006395 saturated elastomer Polymers 0.000 description 2
- 241000894007 species Species 0.000 description 2
- 235000019698 starch Nutrition 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 108010061238 threonyl-glycine Proteins 0.000 description 2
- 229960004072 thrombin Drugs 0.000 description 2
- 239000012588 trypsin Substances 0.000 description 2
- 108010051110 tyrosyl-lysine Proteins 0.000 description 2
- 108010021199 valyl-valyl-valine Proteins 0.000 description 2
- 230000003612 virological effect Effects 0.000 description 2
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 1
- AXFMEGAFCUULFV-BLFANLJRSA-N (2s)-2-[[(2s)-1-[(2s,3r)-2-amino-3-methylpentanoyl]pyrrolidine-2-carbonyl]amino]pentanedioic acid Chemical compound CC[C@@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O AXFMEGAFCUULFV-BLFANLJRSA-N 0.000 description 1
- JFOWDKWFHZIMTR-RUCXOUQFSA-N (2s)-2-aminopentanedioic acid;(2s)-2,5-diamino-5-oxopentanoic acid Chemical compound OC(=O)[C@@H](N)CCC(N)=O.OC(=O)[C@@H](N)CCC(O)=O JFOWDKWFHZIMTR-RUCXOUQFSA-N 0.000 description 1
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 1
- GGLXDYDRZKRTDI-UHFFFAOYSA-N 3-(hydroxymethyl)-9-methoxy-5-methylbenzo[f][1]benzofuran-4-carbaldehyde Chemical compound C1=CC=C2C(OC)=C(OC=C3CO)C3=C(C=O)C2=C1C GGLXDYDRZKRTDI-UHFFFAOYSA-N 0.000 description 1
- HZWWPUTXBJEENE-UHFFFAOYSA-N 5-amino-2-[[1-[5-amino-2-[[1-[2-amino-3-(4-hydroxyphenyl)propanoyl]pyrrolidine-2-carbonyl]amino]-5-oxopentanoyl]pyrrolidine-2-carbonyl]amino]-5-oxopentanoic acid Chemical compound C1CCC(C(=O)NC(CCC(N)=O)C(=O)N2C(CCC2)C(=O)NC(CCC(N)=O)C(O)=O)N1C(=O)C(N)CC1=CC=C(O)C=C1 HZWWPUTXBJEENE-UHFFFAOYSA-N 0.000 description 1
- 229920001817 Agar Polymers 0.000 description 1
- BYXHQQCXAJARLQ-ZLUOBGJFSA-N Ala-Ala-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O BYXHQQCXAJARLQ-ZLUOBGJFSA-N 0.000 description 1
- DKJPOZOEBONHFS-ZLUOBGJFSA-N Ala-Ala-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O DKJPOZOEBONHFS-ZLUOBGJFSA-N 0.000 description 1
- UWQJHXKARZWDIJ-ZLUOBGJFSA-N Ala-Ala-Cys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(O)=O UWQJHXKARZWDIJ-ZLUOBGJFSA-N 0.000 description 1
- LJTZPXOCBZRFBH-CIUDSAMLSA-N Ala-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N LJTZPXOCBZRFBH-CIUDSAMLSA-N 0.000 description 1
- SDMAQFGBPOJFOM-GUBZILKMSA-N Ala-Arg-Arg Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SDMAQFGBPOJFOM-GUBZILKMSA-N 0.000 description 1
- CCUAQNUWXLYFRA-IMJSIDKUSA-N Ala-Asn Chemical compound C[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC(N)=O CCUAQNUWXLYFRA-IMJSIDKUSA-N 0.000 description 1
- LBJYAILUMSUTAM-ZLUOBGJFSA-N Ala-Asn-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O LBJYAILUMSUTAM-ZLUOBGJFSA-N 0.000 description 1
- XQGIRPGAVLFKBJ-CIUDSAMLSA-N Ala-Asn-Lys Chemical compound N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)O XQGIRPGAVLFKBJ-CIUDSAMLSA-N 0.000 description 1
- XCVRVWZTXPCYJT-BIIVOSGPSA-N Ala-Asn-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N XCVRVWZTXPCYJT-BIIVOSGPSA-N 0.000 description 1
- FXKNPWNXPQZLES-ZLUOBGJFSA-N Ala-Asn-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FXKNPWNXPQZLES-ZLUOBGJFSA-N 0.000 description 1
- WDIYWDJLXOCGRW-ACZMJKKPSA-N Ala-Asp-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WDIYWDJLXOCGRW-ACZMJKKPSA-N 0.000 description 1
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 1
- HFBFSOAKPUZCCO-ZLUOBGJFSA-N Ala-Cys-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HFBFSOAKPUZCCO-ZLUOBGJFSA-N 0.000 description 1
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 1
- VGPWRRFOPXVGOH-BYPYZUCNSA-N Ala-Gly-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)NCC(O)=O VGPWRRFOPXVGOH-BYPYZUCNSA-N 0.000 description 1
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 1
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 1
- PNALXAODQKTNLV-JBDRJPRFSA-N Ala-Ile-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O PNALXAODQKTNLV-JBDRJPRFSA-N 0.000 description 1
- IFKQPMZRDQZSHI-GHCJXIJMSA-N Ala-Ile-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O IFKQPMZRDQZSHI-GHCJXIJMSA-N 0.000 description 1
- DPNZTBKGAUAZQU-DLOVCJGASA-N Ala-Leu-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N DPNZTBKGAUAZQU-DLOVCJGASA-N 0.000 description 1
- UJJUHXAJSRHWFZ-DCAQKATOSA-N Ala-Leu-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O UJJUHXAJSRHWFZ-DCAQKATOSA-N 0.000 description 1
- SDZRIBWEVVRDQI-CIUDSAMLSA-N Ala-Lys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O SDZRIBWEVVRDQI-CIUDSAMLSA-N 0.000 description 1
- PIXQDIGKDNNOOV-GUBZILKMSA-N Ala-Lys-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O PIXQDIGKDNNOOV-GUBZILKMSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 1
- WPWUFUBLGADILS-WDSKDSINSA-N Ala-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WPWUFUBLGADILS-WDSKDSINSA-N 0.000 description 1
- IORKCNUBHNIMKY-CIUDSAMLSA-N Ala-Pro-Glu Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O IORKCNUBHNIMKY-CIUDSAMLSA-N 0.000 description 1
- ADSGHMXEAZJJNF-DCAQKATOSA-N Ala-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N ADSGHMXEAZJJNF-DCAQKATOSA-N 0.000 description 1
- XWFWAXPOLRTDFZ-FXQIFTODSA-N Ala-Pro-Ser Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O XWFWAXPOLRTDFZ-FXQIFTODSA-N 0.000 description 1
- YYAVDNKUWLAFCV-ACZMJKKPSA-N Ala-Ser-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYAVDNKUWLAFCV-ACZMJKKPSA-N 0.000 description 1
- OMCKWYSDUQBYCN-FXQIFTODSA-N Ala-Ser-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O OMCKWYSDUQBYCN-FXQIFTODSA-N 0.000 description 1
- PEEYDECOOVQKRZ-DLOVCJGASA-N Ala-Ser-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PEEYDECOOVQKRZ-DLOVCJGASA-N 0.000 description 1
- NCQMBSJGJMYKCK-ZLUOBGJFSA-N Ala-Ser-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O NCQMBSJGJMYKCK-ZLUOBGJFSA-N 0.000 description 1
- BUQICHWNXBIBOG-LMVFSUKVSA-N Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)N BUQICHWNXBIBOG-LMVFSUKVSA-N 0.000 description 1
- XQNRANMFRPCFFW-GCJQMDKQSA-N Ala-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C)N)O XQNRANMFRPCFFW-GCJQMDKQSA-N 0.000 description 1
- LSMDIAAALJJLRO-XQXXSGGOSA-N Ala-Thr-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O LSMDIAAALJJLRO-XQXXSGGOSA-N 0.000 description 1
- KUFVXLQLDHJVOG-SHGPDSBTSA-N Ala-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C)N)O KUFVXLQLDHJVOG-SHGPDSBTSA-N 0.000 description 1
- WUGMRIBZSVSJNP-UFBFGSQYSA-N Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UFBFGSQYSA-N 0.000 description 1
- AENHOIXXHKNIQL-AUTRQRHGSA-N Ala-Tyr-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H]([NH3+])C)CC1=CC=C(O)C=C1 AENHOIXXHKNIQL-AUTRQRHGSA-N 0.000 description 1
- YEBZNKPPOHFZJM-BPNCWPANSA-N Ala-Tyr-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O YEBZNKPPOHFZJM-BPNCWPANSA-N 0.000 description 1
- LIWMQSWFLXEGMA-WDSKDSINSA-N Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)N LIWMQSWFLXEGMA-WDSKDSINSA-N 0.000 description 1
- SOTXLXCVCZAKFI-FXQIFTODSA-N Ala-Val-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O SOTXLXCVCZAKFI-FXQIFTODSA-N 0.000 description 1
- BVLPIIBTWIYOML-ZKWXMUAHSA-N Ala-Val-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BVLPIIBTWIYOML-ZKWXMUAHSA-N 0.000 description 1
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 1
- LYILPUNCKACNGF-NAKRPEOUSA-N Ala-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N LYILPUNCKACNGF-NAKRPEOUSA-N 0.000 description 1
- 102000009027 Albumins Human genes 0.000 description 1
- 108010088751 Albumins Proteins 0.000 description 1
- 101710081722 Antitrypsin Proteins 0.000 description 1
- GIVATXIGCXFQQA-FXQIFTODSA-N Arg-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N GIVATXIGCXFQQA-FXQIFTODSA-N 0.000 description 1
- WOPFJPHVBWKZJH-SRVKXCTJSA-N Arg-Arg-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O WOPFJPHVBWKZJH-SRVKXCTJSA-N 0.000 description 1
- JSLGXODUIAFWCF-WDSKDSINSA-N Arg-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O JSLGXODUIAFWCF-WDSKDSINSA-N 0.000 description 1
- RVDVDRUZWZIBJQ-CIUDSAMLSA-N Arg-Asn-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O RVDVDRUZWZIBJQ-CIUDSAMLSA-N 0.000 description 1
- OZNSCVPYWZRQPY-CIUDSAMLSA-N Arg-Asp-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OZNSCVPYWZRQPY-CIUDSAMLSA-N 0.000 description 1
- YSUVMPICYVWRBX-VEVYYDQMSA-N Arg-Asp-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YSUVMPICYVWRBX-VEVYYDQMSA-N 0.000 description 1
- JCAISGGAOQXEHJ-ZPFDUUQYSA-N Arg-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N JCAISGGAOQXEHJ-ZPFDUUQYSA-N 0.000 description 1
- VNFWDYWTSHFRRG-SRVKXCTJSA-N Arg-Gln-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O VNFWDYWTSHFRRG-SRVKXCTJSA-N 0.000 description 1
- QYLJIYOGHRGUIH-CIUDSAMLSA-N Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N QYLJIYOGHRGUIH-CIUDSAMLSA-N 0.000 description 1
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 1
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 1
- PZBSKYJGKNNYNK-ULQDDVLXSA-N Arg-Leu-Tyr Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O PZBSKYJGKNNYNK-ULQDDVLXSA-N 0.000 description 1
- JQFZHHSQMKZLRU-IUCAKERBSA-N Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N JQFZHHSQMKZLRU-IUCAKERBSA-N 0.000 description 1
- CVXXSWQORBZAAA-SRVKXCTJSA-N Arg-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N CVXXSWQORBZAAA-SRVKXCTJSA-N 0.000 description 1
- PQBHGSGQZSOLIR-RYUDHWBXSA-N Arg-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PQBHGSGQZSOLIR-RYUDHWBXSA-N 0.000 description 1
- KZXPVYVSHUJCEO-ULQDDVLXSA-N Arg-Phe-Lys Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 KZXPVYVSHUJCEO-ULQDDVLXSA-N 0.000 description 1
- YFHATWYGAAXQCF-JYJNAYRXSA-N Arg-Pro-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YFHATWYGAAXQCF-JYJNAYRXSA-N 0.000 description 1
- KXOPYFNQLVUOAQ-FXQIFTODSA-N Arg-Ser-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KXOPYFNQLVUOAQ-FXQIFTODSA-N 0.000 description 1
- FRBAHXABMQXSJQ-FXQIFTODSA-N Arg-Ser-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O FRBAHXABMQXSJQ-FXQIFTODSA-N 0.000 description 1
- XNSKSTRGQIPTSE-ACZMJKKPSA-N Arg-Thr Chemical compound C[C@@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O XNSKSTRGQIPTSE-ACZMJKKPSA-N 0.000 description 1
- DAQIJMOLTMGJLO-YUMQZZPRSA-N Arg-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N DAQIJMOLTMGJLO-YUMQZZPRSA-N 0.000 description 1
- KEZVOBAKAXHMOF-GUBZILKMSA-N Arg-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N KEZVOBAKAXHMOF-GUBZILKMSA-N 0.000 description 1
- IARGXWMWRFOQPG-GCJQMDKQSA-N Asn-Ala-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IARGXWMWRFOQPG-GCJQMDKQSA-N 0.000 description 1
- NPDLYUOYAGBHFB-WDSKDSINSA-N Asn-Arg Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NPDLYUOYAGBHFB-WDSKDSINSA-N 0.000 description 1
- JEPNYDRDYNSFIU-QXEWZRGKSA-N Asn-Arg-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(N)=O)C(O)=O JEPNYDRDYNSFIU-QXEWZRGKSA-N 0.000 description 1
- RJUHZPRQRQLCFL-IMJSIDKUSA-N Asn-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O RJUHZPRQRQLCFL-IMJSIDKUSA-N 0.000 description 1
- DXZNJWFECGJCQR-FXQIFTODSA-N Asn-Asn-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N DXZNJWFECGJCQR-FXQIFTODSA-N 0.000 description 1
- XWFPGQVLOVGSLU-CIUDSAMLSA-N Asn-Gln-Arg Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XWFPGQVLOVGSLU-CIUDSAMLSA-N 0.000 description 1
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 1
- GNKVBRYFXYWXAB-WDSKDSINSA-N Asn-Glu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O GNKVBRYFXYWXAB-WDSKDSINSA-N 0.000 description 1
- GFFRWIJAFFMQGM-NUMRIWBASA-N Asn-Glu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GFFRWIJAFFMQGM-NUMRIWBASA-N 0.000 description 1
- KLKHFFMNGWULBN-VKHMYHEASA-N Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)NCC(O)=O KLKHFFMNGWULBN-VKHMYHEASA-N 0.000 description 1
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 1
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 1
- RAQMSGVCGSJKCL-FOHZUACHSA-N Asn-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(N)=O RAQMSGVCGSJKCL-FOHZUACHSA-N 0.000 description 1
- XLZCLJRGGMBKLR-PCBIJLKTSA-N Asn-Ile-Phe Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XLZCLJRGGMBKLR-PCBIJLKTSA-N 0.000 description 1
- HXWUJJADFMXNKA-BQBZGAKWSA-N Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O HXWUJJADFMXNKA-BQBZGAKWSA-N 0.000 description 1
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 1
- XFJKRRCWLTZIQA-XIRDDKMYSA-N Asn-Lys-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N XFJKRRCWLTZIQA-XIRDDKMYSA-N 0.000 description 1
- NTWOPSIUJBMNRI-KKUMJFAQSA-N Asn-Lys-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTWOPSIUJBMNRI-KKUMJFAQSA-N 0.000 description 1
- KEUNWIXNKVWCFL-FXQIFTODSA-N Asn-Met-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O KEUNWIXNKVWCFL-FXQIFTODSA-N 0.000 description 1
- PPCORQFLAZWUNO-QWRGUYRKSA-N Asn-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N PPCORQFLAZWUNO-QWRGUYRKSA-N 0.000 description 1
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 1
- GADKFYNESXNRLC-WDSKDSINSA-N Asn-Pro Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GADKFYNESXNRLC-WDSKDSINSA-N 0.000 description 1
- IDUUACUJKUXKKD-VEVYYDQMSA-N Asn-Pro-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O IDUUACUJKUXKKD-VEVYYDQMSA-N 0.000 description 1
- SONUFGRSSMFHFN-IMJSIDKUSA-N Asn-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O SONUFGRSSMFHFN-IMJSIDKUSA-N 0.000 description 1
- DOURAOODTFJRIC-CIUDSAMLSA-N Asn-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N DOURAOODTFJRIC-CIUDSAMLSA-N 0.000 description 1
- NPZJLGMWMDNQDD-GHCJXIJMSA-N Asn-Ser-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NPZJLGMWMDNQDD-GHCJXIJMSA-N 0.000 description 1
- QUMKPKWYDVMGNT-NUMRIWBASA-N Asn-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QUMKPKWYDVMGNT-NUMRIWBASA-N 0.000 description 1
- PUUPMDXIHCOPJU-HJGDQZAQSA-N Asn-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O PUUPMDXIHCOPJU-HJGDQZAQSA-N 0.000 description 1
- FYRVDDJMNISIKJ-UWVGGRQHSA-N Asn-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FYRVDDJMNISIKJ-UWVGGRQHSA-N 0.000 description 1
- JPPLRQVZMZFOSX-UWJYBYFXSA-N Asn-Tyr-Ala Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=C(O)C=C1 JPPLRQVZMZFOSX-UWJYBYFXSA-N 0.000 description 1
- KWBQPGIYEZKDEG-FSPLSTOPSA-N Asn-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O KWBQPGIYEZKDEG-FSPLSTOPSA-N 0.000 description 1
- LTDGPJKGJDIBQD-LAEOZQHASA-N Asn-Val-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LTDGPJKGJDIBQD-LAEOZQHASA-N 0.000 description 1
- PSZNHSNIGMJYOZ-WDSKDSINSA-N Asp-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PSZNHSNIGMJYOZ-WDSKDSINSA-N 0.000 description 1
- FRYULLIZUDQONW-IMJSIDKUSA-N Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FRYULLIZUDQONW-IMJSIDKUSA-N 0.000 description 1
- VFUXXFVCYZPOQG-WDSKDSINSA-N Asp-Glu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VFUXXFVCYZPOQG-WDSKDSINSA-N 0.000 description 1
- PDECQIHABNQRHN-GUBZILKMSA-N Asp-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(O)=O PDECQIHABNQRHN-GUBZILKMSA-N 0.000 description 1
- XDGBFDYXZCMYEX-NUMRIWBASA-N Asp-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)O XDGBFDYXZCMYEX-NUMRIWBASA-N 0.000 description 1
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 1
- LBFYTUPYYZENIR-GHCJXIJMSA-N Asp-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N LBFYTUPYYZENIR-GHCJXIJMSA-N 0.000 description 1
- NHSDEZURHWEZPN-SXTJYALSSA-N Asp-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CC(=O)O)N NHSDEZURHWEZPN-SXTJYALSSA-N 0.000 description 1
- SPWXXPFDTMYTRI-IUKAMOBKSA-N Asp-Ile-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SPWXXPFDTMYTRI-IUKAMOBKSA-N 0.000 description 1
- CJUKAWUWBZCTDQ-SRVKXCTJSA-N Asp-Leu-Lys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O CJUKAWUWBZCTDQ-SRVKXCTJSA-N 0.000 description 1
- HKEZZWQWXWGASX-KKUMJFAQSA-N Asp-Leu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HKEZZWQWXWGASX-KKUMJFAQSA-N 0.000 description 1
- OAMLVOVXNKILLQ-BQBZGAKWSA-N Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O OAMLVOVXNKILLQ-BQBZGAKWSA-N 0.000 description 1
- DPNWSMBUYCLEDG-CIUDSAMLSA-N Asp-Lys-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O DPNWSMBUYCLEDG-CIUDSAMLSA-N 0.000 description 1
- SAKCBXNPWDRWPE-BQBZGAKWSA-N Asp-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)O)N SAKCBXNPWDRWPE-BQBZGAKWSA-N 0.000 description 1
- KRQFMDNIUOVRIF-KKUMJFAQSA-N Asp-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC(=O)O)N KRQFMDNIUOVRIF-KKUMJFAQSA-N 0.000 description 1
- JUWISGAGWSDGDH-KKUMJFAQSA-N Asp-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(O)=O)CC1=CC=CC=C1 JUWISGAGWSDGDH-KKUMJFAQSA-N 0.000 description 1
- HJZLUGQGJWXJCJ-CIUDSAMLSA-N Asp-Pro-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O HJZLUGQGJWXJCJ-CIUDSAMLSA-N 0.000 description 1
- FAUPLTGRUBTXNU-FXQIFTODSA-N Asp-Pro-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O FAUPLTGRUBTXNU-FXQIFTODSA-N 0.000 description 1
- DWBZEJHQQIURML-IMJSIDKUSA-N Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O DWBZEJHQQIURML-IMJSIDKUSA-N 0.000 description 1
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 1
- ZQFRDAZBTSFGGW-SRVKXCTJSA-N Asp-Ser-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZQFRDAZBTSFGGW-SRVKXCTJSA-N 0.000 description 1
- MGSVBZIBCCKGCY-ZLUOBGJFSA-N Asp-Ser-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MGSVBZIBCCKGCY-ZLUOBGJFSA-N 0.000 description 1
- XAPPCWUWHNWCPQ-PBCZWWQYSA-N Asp-Thr-His Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O XAPPCWUWHNWCPQ-PBCZWWQYSA-N 0.000 description 1
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 1
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 1
- BYLPQJAWXJWUCJ-YDHLFZDLSA-N Asp-Tyr-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O BYLPQJAWXJWUCJ-YDHLFZDLSA-N 0.000 description 1
- CPMKYMGGYUFOHS-FSPLSTOPSA-N Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O CPMKYMGGYUFOHS-FSPLSTOPSA-N 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 235000005781 Avena Nutrition 0.000 description 1
- 101710137053 B1-hordein Proteins 0.000 description 1
- 101710125089 Bindin Proteins 0.000 description 1
- 102000004506 Blood Proteins Human genes 0.000 description 1
- 108010017384 Blood Proteins Proteins 0.000 description 1
- UXVMQQNJUSDDNG-UHFFFAOYSA-L Calcium chloride Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 1
- 241000701489 Cauliflower mosaic virus Species 0.000 description 1
- 108090000317 Chymotrypsin Proteins 0.000 description 1
- HAYVTMHUNMMXCV-IMJSIDKUSA-N Cys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CS HAYVTMHUNMMXCV-IMJSIDKUSA-N 0.000 description 1
- TVYMKYUSZSVOAG-ZLUOBGJFSA-N Cys-Ala-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O TVYMKYUSZSVOAG-ZLUOBGJFSA-N 0.000 description 1
- CPTUXCUWQIBZIF-ZLUOBGJFSA-N Cys-Asn-Ser Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O CPTUXCUWQIBZIF-ZLUOBGJFSA-N 0.000 description 1
- HYKFOHGZGLOCAY-ZLUOBGJFSA-N Cys-Cys-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O HYKFOHGZGLOCAY-ZLUOBGJFSA-N 0.000 description 1
- BUXAPSQPMALTOY-WHFBIAKZSA-N Cys-Glu Chemical compound SC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O BUXAPSQPMALTOY-WHFBIAKZSA-N 0.000 description 1
- RWGDABDXVXRLLH-ACZMJKKPSA-N Cys-Glu-Asn Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CS)N RWGDABDXVXRLLH-ACZMJKKPSA-N 0.000 description 1
- WXOFKRKAHJQKLT-BQBZGAKWSA-N Cys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CS WXOFKRKAHJQKLT-BQBZGAKWSA-N 0.000 description 1
- CNAMJJOZGXPDHW-IHRRRGAJSA-N Cys-Pro-Phe Chemical compound N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccccc1)C(O)=O CNAMJJOZGXPDHW-IHRRRGAJSA-N 0.000 description 1
- SAEVTQWAYDPXMU-KATARQTJSA-N Cys-Thr-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O SAEVTQWAYDPXMU-KATARQTJSA-N 0.000 description 1
- OELDIVRKHTYFNG-WDSKDSINSA-N Cys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CS OELDIVRKHTYFNG-WDSKDSINSA-N 0.000 description 1
- YQEHNIKPAOPBNH-DCAQKATOSA-N Cys-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CS)N YQEHNIKPAOPBNH-DCAQKATOSA-N 0.000 description 1
- IMXSCCDUAFEIOE-UHFFFAOYSA-N D-Octopin Natural products OC(=O)C(C)NC(C(O)=O)CCCN=C(N)N IMXSCCDUAFEIOE-UHFFFAOYSA-N 0.000 description 1
- 102000053602 DNA Human genes 0.000 description 1
- 101100411708 Danio rerio rarga gene Proteins 0.000 description 1
- 206010051055 Deep vein thrombosis Diseases 0.000 description 1
- 229940122858 Elastase inhibitor Drugs 0.000 description 1
- 206010014561 Emphysema Diseases 0.000 description 1
- 241001331845 Equus asinus x caballus Species 0.000 description 1
- 206010016654 Fibrosis Diseases 0.000 description 1
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 1
- 229930191978 Gibberellin Natural products 0.000 description 1
- 108010061711 Gliadin Proteins 0.000 description 1
- SXIJQMBEVYWAQT-GUBZILKMSA-N Gln-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N SXIJQMBEVYWAQT-GUBZILKMSA-N 0.000 description 1
- UICOTGULOUGGLC-NUMRIWBASA-N Gln-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O UICOTGULOUGGLC-NUMRIWBASA-N 0.000 description 1
- LOJYQMFIIJVETK-WDSKDSINSA-N Gln-Gln Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LOJYQMFIIJVETK-WDSKDSINSA-N 0.000 description 1
- KCJJFESQRXGTGC-BQBZGAKWSA-N Gln-Glu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O KCJJFESQRXGTGC-BQBZGAKWSA-N 0.000 description 1
- WVUZERSNWGUKJY-BPUTZDHNSA-N Gln-Glu-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N WVUZERSNWGUKJY-BPUTZDHNSA-N 0.000 description 1
- NSNUZSPSADIMJQ-WDSKDSINSA-N Gln-Gly-Asp Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NSNUZSPSADIMJQ-WDSKDSINSA-N 0.000 description 1
- YXQCLIVLWCKCRS-RYUDHWBXSA-N Gln-Gly-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N)O YXQCLIVLWCKCRS-RYUDHWBXSA-N 0.000 description 1
- IWUFOVSLWADEJC-AVGNSLFASA-N Gln-His-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O IWUFOVSLWADEJC-AVGNSLFASA-N 0.000 description 1
- KHGGWBRVRPHFMH-PEFMBERDSA-N Gln-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N KHGGWBRVRPHFMH-PEFMBERDSA-N 0.000 description 1
- GIVHPCWYVWUUSG-HVTMNAMFSA-N Gln-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N GIVHPCWYVWUUSG-HVTMNAMFSA-N 0.000 description 1
- MTCXQQINVAFZKW-MNXVOIDGSA-N Gln-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MTCXQQINVAFZKW-MNXVOIDGSA-N 0.000 description 1
- LGIKBBLQVSWUGK-DCAQKATOSA-N Gln-Leu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGIKBBLQVSWUGK-DCAQKATOSA-N 0.000 description 1
- MLSKFHLRFVGNLL-WDCWCFNPSA-N Gln-Leu-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MLSKFHLRFVGNLL-WDCWCFNPSA-N 0.000 description 1
- UWKPRVKWEKEMSY-DCAQKATOSA-N Gln-Lys-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWKPRVKWEKEMSY-DCAQKATOSA-N 0.000 description 1
- LURQDGKYBFWWJA-MNXVOIDGSA-N Gln-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)N)N LURQDGKYBFWWJA-MNXVOIDGSA-N 0.000 description 1
- VHLZDSUANXBJHW-QWRGUYRKSA-N Gln-Phe Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VHLZDSUANXBJHW-QWRGUYRKSA-N 0.000 description 1
- XQDGOJPVMSWZSO-SRVKXCTJSA-N Gln-Pro-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N XQDGOJPVMSWZSO-SRVKXCTJSA-N 0.000 description 1
- OKARHJKJTKFQBM-ACZMJKKPSA-N Gln-Ser-Asn Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OKARHJKJTKFQBM-ACZMJKKPSA-N 0.000 description 1
- BYKZWDGMJLNFJY-XKBZYTNZSA-N Gln-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N)O BYKZWDGMJLNFJY-XKBZYTNZSA-N 0.000 description 1
- OTQSTOXRUBVWAP-NRPADANISA-N Gln-Ser-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O OTQSTOXRUBVWAP-NRPADANISA-N 0.000 description 1
- PAOHIZNRJNIXQY-XQXXSGGOSA-N Gln-Thr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PAOHIZNRJNIXQY-XQXXSGGOSA-N 0.000 description 1
- XFHMVFKCQSHLKW-HJGDQZAQSA-N Gln-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O XFHMVFKCQSHLKW-HJGDQZAQSA-N 0.000 description 1
- RUFHOVYUYSNDNY-ACZMJKKPSA-N Glu-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O RUFHOVYUYSNDNY-ACZMJKKPSA-N 0.000 description 1
- TUTIHHSZKFBMHM-WHFBIAKZSA-N Glu-Asn Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O TUTIHHSZKFBMHM-WHFBIAKZSA-N 0.000 description 1
- AKJRHDMTEJXTPV-ACZMJKKPSA-N Glu-Asn-Ala Chemical compound C[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O AKJRHDMTEJXTPV-ACZMJKKPSA-N 0.000 description 1
- YYOBUPFZLKQUAX-FXQIFTODSA-N Glu-Asn-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YYOBUPFZLKQUAX-FXQIFTODSA-N 0.000 description 1
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 1
- FYYSIASRLDJUNP-WHFBIAKZSA-N Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FYYSIASRLDJUNP-WHFBIAKZSA-N 0.000 description 1
- QPRZKNOOOBWXSU-CIUDSAMLSA-N Glu-Asp-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N QPRZKNOOOBWXSU-CIUDSAMLSA-N 0.000 description 1
- CKOFNWCLWRYUHK-XHNCKOQMSA-N Glu-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O CKOFNWCLWRYUHK-XHNCKOQMSA-N 0.000 description 1
- CYHBMLHCQXXCCT-AVGNSLFASA-N Glu-Asp-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CYHBMLHCQXXCCT-AVGNSLFASA-N 0.000 description 1
- PABVKUJVLNMOJP-WHFBIAKZSA-N Glu-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(O)=O PABVKUJVLNMOJP-WHFBIAKZSA-N 0.000 description 1
- MXPBQDFWIMBACQ-ACZMJKKPSA-N Glu-Cys-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CS)C(O)=O MXPBQDFWIMBACQ-ACZMJKKPSA-N 0.000 description 1
- PVBBEKPHARMPHX-DCAQKATOSA-N Glu-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O PVBBEKPHARMPHX-DCAQKATOSA-N 0.000 description 1
- KOSRFJWDECSPRO-WDSKDSINSA-N Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O KOSRFJWDECSPRO-WDSKDSINSA-N 0.000 description 1
- CGOHAEBMDSEKFB-FXQIFTODSA-N Glu-Glu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O CGOHAEBMDSEKFB-FXQIFTODSA-N 0.000 description 1
- NKLRYVLERDYDBI-FXQIFTODSA-N Glu-Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKLRYVLERDYDBI-FXQIFTODSA-N 0.000 description 1
- BUZMZDDKFCSKOT-CIUDSAMLSA-N Glu-Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BUZMZDDKFCSKOT-CIUDSAMLSA-N 0.000 description 1
- KUTPGXNAAOQSPD-LPEHRKFASA-N Glu-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O KUTPGXNAAOQSPD-LPEHRKFASA-N 0.000 description 1
- YBAFDPFAUTYYRW-YUMQZZPRSA-N Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O YBAFDPFAUTYYRW-YUMQZZPRSA-N 0.000 description 1
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 1
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 1
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 1
- RBXSZQRSEGYDFG-GUBZILKMSA-N Glu-Lys-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O RBXSZQRSEGYDFG-GUBZILKMSA-N 0.000 description 1
- QDMVXRNLOPTPIE-WDCWCFNPSA-N Glu-Lys-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QDMVXRNLOPTPIE-WDCWCFNPSA-N 0.000 description 1
- SXGAGTVDWKQYCX-BQBZGAKWSA-N Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SXGAGTVDWKQYCX-BQBZGAKWSA-N 0.000 description 1
- UMHRCVCZUPBBQW-GARJFASQSA-N Glu-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N UMHRCVCZUPBBQW-GARJFASQSA-N 0.000 description 1
- XMBSYZWANAQXEV-QWRGUYRKSA-N Glu-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-QWRGUYRKSA-N 0.000 description 1
- DXVOKNVIKORTHQ-GUBZILKMSA-N Glu-Pro-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O DXVOKNVIKORTHQ-GUBZILKMSA-N 0.000 description 1
- UQHGAYSULGRWRG-WHFBIAKZSA-N Glu-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(O)=O UQHGAYSULGRWRG-WHFBIAKZSA-N 0.000 description 1
- JSIQVRIXMINMTA-ZDLURKLDSA-N Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O JSIQVRIXMINMTA-ZDLURKLDSA-N 0.000 description 1
- CQGBSALYGOXQPE-HTUGSXCWSA-N Glu-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O CQGBSALYGOXQPE-HTUGSXCWSA-N 0.000 description 1
- CAQXJMUDOLSBPF-SUSMZKCASA-N Glu-Thr-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAQXJMUDOLSBPF-SUSMZKCASA-N 0.000 description 1
- VHPVBPCCWVDGJL-IRIUXVKKSA-N Glu-Thr-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VHPVBPCCWVDGJL-IRIUXVKKSA-N 0.000 description 1
- ZQNCUVODKOBSSO-XEGUGMAKSA-N Glu-Trp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O ZQNCUVODKOBSSO-XEGUGMAKSA-N 0.000 description 1
- UZWUBBRJWFTHTD-LAEOZQHASA-N Glu-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O UZWUBBRJWFTHTD-LAEOZQHASA-N 0.000 description 1
- YPHPEHMXOYTEQG-LAEOZQHASA-N Glu-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O YPHPEHMXOYTEQG-LAEOZQHASA-N 0.000 description 1
- VIPDPMHGICREIS-GVXVVHGQSA-N Glu-Val-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VIPDPMHGICREIS-GVXVVHGQSA-N 0.000 description 1
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 1
- RMWAOBGCZZSJHE-UMNHJUIQSA-N Glu-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N RMWAOBGCZZSJHE-UMNHJUIQSA-N 0.000 description 1
- QXUPRMQJDWJDFR-NRPADANISA-N Glu-Val-Ser Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXUPRMQJDWJDFR-NRPADANISA-N 0.000 description 1
- RLFSBAPJTYKSLG-WHFBIAKZSA-N Gly-Ala-Asp Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O RLFSBAPJTYKSLG-WHFBIAKZSA-N 0.000 description 1
- PHONXOACARQMPM-BQBZGAKWSA-N Gly-Ala-Met Chemical compound [H]NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O PHONXOACARQMPM-BQBZGAKWSA-N 0.000 description 1
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 1
- JLXVRFDTDUGQEE-YFKPBYRVSA-N Gly-Arg Chemical compound NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N JLXVRFDTDUGQEE-YFKPBYRVSA-N 0.000 description 1
- CLODWIOAKCSBAN-BQBZGAKWSA-N Gly-Arg-Asp Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O CLODWIOAKCSBAN-BQBZGAKWSA-N 0.000 description 1
- OCDLPQDYTJPWNG-YUMQZZPRSA-N Gly-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN OCDLPQDYTJPWNG-YUMQZZPRSA-N 0.000 description 1
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 description 1
- PNMUAGGSDZXTHX-BYPYZUCNSA-N Gly-Gln Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(N)=O PNMUAGGSDZXTHX-BYPYZUCNSA-N 0.000 description 1
- LXXANCRPFBSSKS-IUCAKERBSA-N Gly-Gln-Leu Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LXXANCRPFBSSKS-IUCAKERBSA-N 0.000 description 1
- IEFJWDNGDZAYNZ-BYPYZUCNSA-N Gly-Glu Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(O)=O IEFJWDNGDZAYNZ-BYPYZUCNSA-N 0.000 description 1
- DHDOADIPGZTAHT-YUMQZZPRSA-N Gly-Glu-Arg Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DHDOADIPGZTAHT-YUMQZZPRSA-N 0.000 description 1
- SOEATRRYCIPEHA-BQBZGAKWSA-N Gly-Glu-Glu Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SOEATRRYCIPEHA-BQBZGAKWSA-N 0.000 description 1
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 1
- QITBQGJOXQYMOA-ZETCQYMHSA-N Gly-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QITBQGJOXQYMOA-ZETCQYMHSA-N 0.000 description 1
- BUEFQXUHTUZXHR-LURJTMIESA-N Gly-Gly-Pro zwitterion Chemical compound NCC(=O)NCC(=O)N1CCC[C@H]1C(O)=O BUEFQXUHTUZXHR-LURJTMIESA-N 0.000 description 1
- UTYGDAHJBBDPBA-BYULHYEWSA-N Gly-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN UTYGDAHJBBDPBA-BYULHYEWSA-N 0.000 description 1
- HKSNHPVETYYJBK-LAEOZQHASA-N Gly-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)CN HKSNHPVETYYJBK-LAEOZQHASA-N 0.000 description 1
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 1
- COVXELOAORHTND-LSJOCFKGSA-N Gly-Ile-Val Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O COVXELOAORHTND-LSJOCFKGSA-N 0.000 description 1
- DKEXFJVMVGETOO-LURJTMIESA-N Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CN DKEXFJVMVGETOO-LURJTMIESA-N 0.000 description 1
- IUZGUFAJDBHQQV-YUMQZZPRSA-N Gly-Leu-Asn Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IUZGUFAJDBHQQV-YUMQZZPRSA-N 0.000 description 1
- LRQXRHGQEVWGPV-NHCYSSNCSA-N Gly-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN LRQXRHGQEVWGPV-NHCYSSNCSA-N 0.000 description 1
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 1
- OQQKUTVULYLCDG-ONGXEEELSA-N Gly-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)CN)C(O)=O OQQKUTVULYLCDG-ONGXEEELSA-N 0.000 description 1
- YYXJFBMCOUSYSF-RYUDHWBXSA-N Gly-Phe-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYXJFBMCOUSYSF-RYUDHWBXSA-N 0.000 description 1
- YLEIWGJJBFBFHC-KBPBESRZSA-N Gly-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 YLEIWGJJBFBFHC-KBPBESRZSA-N 0.000 description 1
- JJGBXTYGTKWGAT-YUMQZZPRSA-N Gly-Pro-Glu Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O JJGBXTYGTKWGAT-YUMQZZPRSA-N 0.000 description 1
- HAOUOFNNJJLVNS-BQBZGAKWSA-N Gly-Pro-Ser Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O HAOUOFNNJJLVNS-BQBZGAKWSA-N 0.000 description 1
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 1
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 1
- FGPLUIQCSKGLTI-WDSKDSINSA-N Gly-Ser-Glu Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O FGPLUIQCSKGLTI-WDSKDSINSA-N 0.000 description 1
- SOEGEPHNZOISMT-BYPYZUCNSA-N Gly-Ser-Gly Chemical compound NCC(=O)N[C@@H](CO)C(=O)NCC(O)=O SOEGEPHNZOISMT-BYPYZUCNSA-N 0.000 description 1
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 1
- RHRLHXQWHCNJKR-PMVVWTBXSA-N Gly-Thr-His Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 RHRLHXQWHCNJKR-PMVVWTBXSA-N 0.000 description 1
- FOKISINOENBSDM-WLTAIBSBSA-N Gly-Thr-Tyr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O FOKISINOENBSDM-WLTAIBSBSA-N 0.000 description 1
- XBGGUPMXALFZOT-VIFPVBQESA-N Gly-Tyr Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-VIFPVBQESA-N 0.000 description 1
- GBYYQVBXFVDJPJ-WLTAIBSBSA-N Gly-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)CN)O GBYYQVBXFVDJPJ-WLTAIBSBSA-N 0.000 description 1
- GWCJMBNBFYBQCV-XPUUQOCRSA-N Gly-Val-Ala Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O GWCJMBNBFYBQCV-XPUUQOCRSA-N 0.000 description 1
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 1
- SBVMXEZQJVUARN-XPUUQOCRSA-N Gly-Val-Ser Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O SBVMXEZQJVUARN-XPUUQOCRSA-N 0.000 description 1
- KSOBNUBCYHGUKH-UWVGGRQHSA-N Gly-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN KSOBNUBCYHGUKH-UWVGGRQHSA-N 0.000 description 1
- 108090000288 Glycoproteins Proteins 0.000 description 1
- 102000003886 Glycoproteins Human genes 0.000 description 1
- KZTLOHBDLMIFSH-XVYDVKMFSA-N His-Ala-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O KZTLOHBDLMIFSH-XVYDVKMFSA-N 0.000 description 1
- ZNPRMNDAFQKATM-LKTVYLICSA-N His-Ala-Tyr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZNPRMNDAFQKATM-LKTVYLICSA-N 0.000 description 1
- QSLKWWDKIXMWJV-SRVKXCTJSA-N His-Cys-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N QSLKWWDKIXMWJV-SRVKXCTJSA-N 0.000 description 1
- LYCVKHSJGDMDLM-LURJTMIESA-N His-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 LYCVKHSJGDMDLM-LURJTMIESA-N 0.000 description 1
- OEROYDLRVAYIMQ-YUMQZZPRSA-N His-Gly-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O OEROYDLRVAYIMQ-YUMQZZPRSA-N 0.000 description 1
- BDFCIKANUNMFGB-PMVVWTBXSA-N His-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 BDFCIKANUNMFGB-PMVVWTBXSA-N 0.000 description 1
- KAFZDWMZKGQDEE-SRVKXCTJSA-N His-His-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KAFZDWMZKGQDEE-SRVKXCTJSA-N 0.000 description 1
- VYUXYMRNGALHEA-DLOVCJGASA-N His-Leu-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O VYUXYMRNGALHEA-DLOVCJGASA-N 0.000 description 1
- CZVQSYNVUHAILZ-UWVGGRQHSA-N His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 CZVQSYNVUHAILZ-UWVGGRQHSA-N 0.000 description 1
- PGRPSOUCWRBWKZ-DLOVCJGASA-N His-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CN=CN1 PGRPSOUCWRBWKZ-DLOVCJGASA-N 0.000 description 1
- LNCFUHAPNTYMJB-IUCAKERBSA-N His-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CN=CN1 LNCFUHAPNTYMJB-IUCAKERBSA-N 0.000 description 1
- GNBHSMFBUNEWCJ-DCAQKATOSA-N His-Pro-Asn Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O GNBHSMFBUNEWCJ-DCAQKATOSA-N 0.000 description 1
- BZAQOPHNBFOOJS-DCAQKATOSA-N His-Pro-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O BZAQOPHNBFOOJS-DCAQKATOSA-N 0.000 description 1
- OWYIDJCNRWRSJY-QTKMDUPCSA-N His-Pro-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O OWYIDJCNRWRSJY-QTKMDUPCSA-N 0.000 description 1
- CHIAUHSHDARFBD-ULQDDVLXSA-N His-Pro-Tyr Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 CHIAUHSHDARFBD-ULQDDVLXSA-N 0.000 description 1
- PLCAEMGSYOYIPP-GUBZILKMSA-N His-Ser-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 PLCAEMGSYOYIPP-GUBZILKMSA-N 0.000 description 1
- DQZCEKQPSOBNMJ-NKIYYHGXSA-N His-Thr-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DQZCEKQPSOBNMJ-NKIYYHGXSA-N 0.000 description 1
- VLDVBZICYBVQHB-IUCAKERBSA-N His-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])CC1=CN=CN1 VLDVBZICYBVQHB-IUCAKERBSA-N 0.000 description 1
- KFQDSSNYWKZFOO-LSJOCFKGSA-N His-Val-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KFQDSSNYWKZFOO-LSJOCFKGSA-N 0.000 description 1
- PUFNQIPSRXVLQJ-IHRRRGAJSA-N His-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N PUFNQIPSRXVLQJ-IHRRRGAJSA-N 0.000 description 1
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 1
- LQSBBHNVAVNZSX-GHCJXIJMSA-N Ile-Ala-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N LQSBBHNVAVNZSX-GHCJXIJMSA-N 0.000 description 1
- JRHFQUPIZOYKQP-KBIXCLLPSA-N Ile-Ala-Glu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O JRHFQUPIZOYKQP-KBIXCLLPSA-N 0.000 description 1
- QIHJTGSVGIPHIW-QSFUFRPTSA-N Ile-Asn-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N QIHJTGSVGIPHIW-QSFUFRPTSA-N 0.000 description 1
- IDAHFEPYTJJZFD-PEFMBERDSA-N Ile-Asp-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N IDAHFEPYTJJZFD-PEFMBERDSA-N 0.000 description 1
- CNPNWGHRMBQHBZ-ZKWXMUAHSA-N Ile-Gln Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O CNPNWGHRMBQHBZ-ZKWXMUAHSA-N 0.000 description 1
- YBJWJQQBWRARLT-KBIXCLLPSA-N Ile-Gln-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O YBJWJQQBWRARLT-KBIXCLLPSA-N 0.000 description 1
- LGMUPVWZEYYUMU-YVNDNENWSA-N Ile-Glu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N LGMUPVWZEYYUMU-YVNDNENWSA-N 0.000 description 1
- LEHPJMKVGFPSSP-ZQINRCPSSA-N Ile-Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 LEHPJMKVGFPSSP-ZQINRCPSSA-N 0.000 description 1
- VOBYAKCXGQQFLR-LSJOCFKGSA-N Ile-Gly-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O VOBYAKCXGQQFLR-LSJOCFKGSA-N 0.000 description 1
- YKLOMBNBQUTJDT-HVTMNAMFSA-N Ile-His-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N YKLOMBNBQUTJDT-HVTMNAMFSA-N 0.000 description 1
- SVBAHOMTJRFSIC-SXTJYALSSA-N Ile-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SVBAHOMTJRFSIC-SXTJYALSSA-N 0.000 description 1
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 1
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 1
- DRCKHKZYDLJYFQ-YWIQKCBGSA-N Ile-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DRCKHKZYDLJYFQ-YWIQKCBGSA-N 0.000 description 1
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 1
- DZMWFIRHFFVBHS-ZEWNOJEFSA-N Ile-Tyr-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N DZMWFIRHFFVBHS-ZEWNOJEFSA-N 0.000 description 1
- UYODHPPSCXBNCS-XUXIUFHCSA-N Ile-Val-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C UYODHPPSCXBNCS-XUXIUFHCSA-N 0.000 description 1
- 208000026350 Inborn Genetic disease Diseases 0.000 description 1
- 108010065920 Insulin Lispro Proteins 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- QLROSWPKSBORFJ-BQBZGAKWSA-N L-Prolyl-L-glutamic acid Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 QLROSWPKSBORFJ-BQBZGAKWSA-N 0.000 description 1
- UGTHTQWIQKEDEH-BQBZGAKWSA-N L-alanyl-L-prolylglycine zwitterion Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UGTHTQWIQKEDEH-BQBZGAKWSA-N 0.000 description 1
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 1
- RNKSNIBMTUYWSH-YFKPBYRVSA-N L-prolylglycine Chemical compound [O-]C(=O)CNC(=O)[C@@H]1CCC[NH2+]1 RNKSNIBMTUYWSH-YFKPBYRVSA-N 0.000 description 1
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 1
- HSQGMTRYSIHDAC-BQBZGAKWSA-N Leu-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(O)=O HSQGMTRYSIHDAC-BQBZGAKWSA-N 0.000 description 1
- BAJIJEGGUYXZGC-CIUDSAMLSA-N Leu-Asn-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N BAJIJEGGUYXZGC-CIUDSAMLSA-N 0.000 description 1
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 1
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 1
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 1
- HIZYETOZLYFUFF-BQBZGAKWSA-N Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(O)=O HIZYETOZLYFUFF-BQBZGAKWSA-N 0.000 description 1
- PPBKJAQJAUHZKX-SRVKXCTJSA-N Leu-Cys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC(C)C PPBKJAQJAUHZKX-SRVKXCTJSA-N 0.000 description 1
- FOEHRHOBWFQSNW-KATARQTJSA-N Leu-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(C)C)N)O FOEHRHOBWFQSNW-KATARQTJSA-N 0.000 description 1
- KAFOIVJDVSZUMD-DCAQKATOSA-N Leu-Gln-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-DCAQKATOSA-N 0.000 description 1
- KAFOIVJDVSZUMD-UHFFFAOYSA-N Leu-Gln-Gln Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)NC(CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-UHFFFAOYSA-N 0.000 description 1
- BOFAFKVZQUMTID-AVGNSLFASA-N Leu-Gln-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N BOFAFKVZQUMTID-AVGNSLFASA-N 0.000 description 1
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 1
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 1
- LAPSXOAUPNOINL-YUMQZZPRSA-N Leu-Gly-Asp Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O LAPSXOAUPNOINL-YUMQZZPRSA-N 0.000 description 1
- KGCLIYGPQXUNLO-IUCAKERBSA-N Leu-Gly-Glu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O KGCLIYGPQXUNLO-IUCAKERBSA-N 0.000 description 1
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 1
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 1
- HYMLKESRWLZDBR-WEDXCCLWSA-N Leu-Gly-Thr Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HYMLKESRWLZDBR-WEDXCCLWSA-N 0.000 description 1
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 1
- OYQUOLRTJHWVSQ-SRVKXCTJSA-N Leu-His-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O OYQUOLRTJHWVSQ-SRVKXCTJSA-N 0.000 description 1
- OHZIZVWQXJPBJS-IXOXFDKPSA-N Leu-His-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OHZIZVWQXJPBJS-IXOXFDKPSA-N 0.000 description 1
- AZLASBBHHSLQDB-GUBZILKMSA-N Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(C)C AZLASBBHHSLQDB-GUBZILKMSA-N 0.000 description 1
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 1
- KUIDCYNIEJBZBU-AJNGGQMLSA-N Leu-Ile-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O KUIDCYNIEJBZBU-AJNGGQMLSA-N 0.000 description 1
- LCPYQJIKPJDLLB-UWVGGRQHSA-N Leu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(C)C LCPYQJIKPJDLLB-UWVGGRQHSA-N 0.000 description 1
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 1
- DNDWZFHLZVYOGF-KKUMJFAQSA-N Leu-Leu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O DNDWZFHLZVYOGF-KKUMJFAQSA-N 0.000 description 1
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 1
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 1
- OTXBNHIUIHNGAO-UWVGGRQHSA-N Leu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN OTXBNHIUIHNGAO-UWVGGRQHSA-N 0.000 description 1
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 1
- BJWKOATWNQJPSK-SRVKXCTJSA-N Leu-Met-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N BJWKOATWNQJPSK-SRVKXCTJSA-N 0.000 description 1
- MVVSHHJKJRZVNY-ACRUOGEOSA-N Leu-Phe-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MVVSHHJKJRZVNY-ACRUOGEOSA-N 0.000 description 1
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 1
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 1
- MVHXGBZUJLWZOH-BJDJZHNGSA-N Leu-Ser-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVHXGBZUJLWZOH-BJDJZHNGSA-N 0.000 description 1
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 1
- SBANPBVRHYIMRR-GARJFASQSA-N Leu-Ser-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N SBANPBVRHYIMRR-GARJFASQSA-N 0.000 description 1
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 1
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 1
- AEDWWMMHUGYIFD-HJGDQZAQSA-N Leu-Thr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O AEDWWMMHUGYIFD-HJGDQZAQSA-N 0.000 description 1
- FGZVGOAAROXFAB-IXOXFDKPSA-N Leu-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N)O FGZVGOAAROXFAB-IXOXFDKPSA-N 0.000 description 1
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 1
- LHSGPCFBGJHPCY-STQMWFEESA-N Leu-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-STQMWFEESA-N 0.000 description 1
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 1
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 1
- QESXLSQLQHHTIX-RHYQMDGZSA-N Leu-Val-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QESXLSQLQHHTIX-RHYQMDGZSA-N 0.000 description 1
- 108010028275 Leukocyte Elastase Proteins 0.000 description 1
- 102000016799 Leukocyte elastase Human genes 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- MPGHETGWWWUHPY-CIUDSAMLSA-N Lys-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN MPGHETGWWWUHPY-CIUDSAMLSA-N 0.000 description 1
- GGAPIOORBXHMNY-ULQDDVLXSA-N Lys-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)O GGAPIOORBXHMNY-ULQDDVLXSA-N 0.000 description 1
- 108010062166 Lys-Asn-Asp Proteins 0.000 description 1
- BYPMOIFBQPEWOH-CIUDSAMLSA-N Lys-Asn-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N BYPMOIFBQPEWOH-CIUDSAMLSA-N 0.000 description 1
- QUYCUALODHJQLK-CIUDSAMLSA-N Lys-Asp-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUYCUALODHJQLK-CIUDSAMLSA-N 0.000 description 1
- GKFNXYMAMKJSKD-NHCYSSNCSA-N Lys-Asp-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GKFNXYMAMKJSKD-NHCYSSNCSA-N 0.000 description 1
- QBGPXOGXCVKULO-BQBZGAKWSA-N Lys-Cys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CS)C(O)=O QBGPXOGXCVKULO-BQBZGAKWSA-N 0.000 description 1
- ZAWOJFFMBANLGE-CIUDSAMLSA-N Lys-Cys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCCN)N ZAWOJFFMBANLGE-CIUDSAMLSA-N 0.000 description 1
- HEWWNLVEWBJBKA-WDCWCFNPSA-N Lys-Gln-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN HEWWNLVEWBJBKA-WDCWCFNPSA-N 0.000 description 1
- UGTZHPSKYRIGRJ-YUMQZZPRSA-N Lys-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UGTZHPSKYRIGRJ-YUMQZZPRSA-N 0.000 description 1
- HGNRJCINZYHNOU-LURJTMIESA-N Lys-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(O)=O HGNRJCINZYHNOU-LURJTMIESA-N 0.000 description 1
- NKKFVJRLCCUJNA-QWRGUYRKSA-N Lys-Gly-Lys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN NKKFVJRLCCUJNA-QWRGUYRKSA-N 0.000 description 1
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 1
- ZMMDPRTXLAEMOD-BZSNNMDCSA-N Lys-His-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZMMDPRTXLAEMOD-BZSNNMDCSA-N 0.000 description 1
- GNLJXWBNLAIPEP-MELADBBJSA-N Lys-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCCN)N)C(=O)O GNLJXWBNLAIPEP-MELADBBJSA-N 0.000 description 1
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 1
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 1
- WRODMZBHNNPRLN-SRVKXCTJSA-N Lys-Leu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O WRODMZBHNNPRLN-SRVKXCTJSA-N 0.000 description 1
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 1
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 1
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 1
- WBSCNDJQPKSPII-KKUMJFAQSA-N Lys-Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O WBSCNDJQPKSPII-KKUMJFAQSA-N 0.000 description 1
- XFOAWKDQMRMCDN-ULQDDVLXSA-N Lys-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)CC1=CC=CC=C1 XFOAWKDQMRMCDN-ULQDDVLXSA-N 0.000 description 1
- ODTZHNZPINULEU-KKUMJFAQSA-N Lys-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N ODTZHNZPINULEU-KKUMJFAQSA-N 0.000 description 1
- TWPCWKVOZDUYAA-KKUMJFAQSA-N Lys-Phe-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O TWPCWKVOZDUYAA-KKUMJFAQSA-N 0.000 description 1
- PIXVFCBYEGPZPA-JYJNAYRXSA-N Lys-Phe-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N PIXVFCBYEGPZPA-JYJNAYRXSA-N 0.000 description 1
- HYSVGEAWTGPMOA-IHRRRGAJSA-N Lys-Pro-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O HYSVGEAWTGPMOA-IHRRRGAJSA-N 0.000 description 1
- LECIJRIRMVOFMH-ULQDDVLXSA-N Lys-Pro-Phe Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 LECIJRIRMVOFMH-ULQDDVLXSA-N 0.000 description 1
- LKDXINHHSWFFJC-SRVKXCTJSA-N Lys-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCCN)N LKDXINHHSWFFJC-SRVKXCTJSA-N 0.000 description 1
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 1
- ZOKVLMBYDSIDKG-CSMHCCOUSA-N Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ZOKVLMBYDSIDKG-CSMHCCOUSA-N 0.000 description 1
- GIKFNMZSGYAPEJ-HJGDQZAQSA-N Lys-Thr-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O GIKFNMZSGYAPEJ-HJGDQZAQSA-N 0.000 description 1
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 1
- BDFHWFUAQLIMJO-KXNHARMFSA-N Lys-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N)O BDFHWFUAQLIMJO-KXNHARMFSA-N 0.000 description 1
- MYTOTTSMVMWVJN-STQMWFEESA-N Lys-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 MYTOTTSMVMWVJN-STQMWFEESA-N 0.000 description 1
- RQILLQOQXLZTCK-KBPBESRZSA-N Lys-Tyr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O RQILLQOQXLZTCK-KBPBESRZSA-N 0.000 description 1
- UGCIQUYEJIEHKX-GVXVVHGQSA-N Lys-Val-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O UGCIQUYEJIEHKX-GVXVVHGQSA-N 0.000 description 1
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 1
- TXTZMVNJIRZABH-ULQDDVLXSA-N Lys-Val-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TXTZMVNJIRZABH-ULQDDVLXSA-N 0.000 description 1
- OZVXDDFYCQOPFD-XQQFMLRXSA-N Lys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N OZVXDDFYCQOPFD-XQQFMLRXSA-N 0.000 description 1
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 1
- 102100022448 Maturin Human genes 0.000 description 1
- OMMCXYHUVDGUJQ-UHFFFAOYSA-N Maturin Natural products COc1c2occ(CO)c2c(C=O)c3cccc(C)c13 OMMCXYHUVDGUJQ-UHFFFAOYSA-N 0.000 description 1
- 101710152298 Maturin Proteins 0.000 description 1
- JHKXZYLNVJRAAJ-WDSKDSINSA-N Met-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(O)=O JHKXZYLNVJRAAJ-WDSKDSINSA-N 0.000 description 1
- VHGIWFGJIHTASW-FXQIFTODSA-N Met-Ala-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O VHGIWFGJIHTASW-FXQIFTODSA-N 0.000 description 1
- IHITVQKJXQQGLJ-LPEHRKFASA-N Met-Asn-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N IHITVQKJXQQGLJ-LPEHRKFASA-N 0.000 description 1
- QTZXSYBVOSXBEJ-WDSKDSINSA-N Met-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O QTZXSYBVOSXBEJ-WDSKDSINSA-N 0.000 description 1
- NDYNTQWSJLPEMK-WDSKDSINSA-N Met-Cys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CS)C(O)=O NDYNTQWSJLPEMK-WDSKDSINSA-N 0.000 description 1
- DBXMFHGGHMXYHY-DCAQKATOSA-N Met-Leu-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O DBXMFHGGHMXYHY-DCAQKATOSA-N 0.000 description 1
- USBFEVBHEQBWDD-AVGNSLFASA-N Met-Leu-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O USBFEVBHEQBWDD-AVGNSLFASA-N 0.000 description 1
- JCMMNFZUKMMECJ-DCAQKATOSA-N Met-Lys-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JCMMNFZUKMMECJ-DCAQKATOSA-N 0.000 description 1
- AOFZWWDTTJLHOU-ULQDDVLXSA-N Met-Lys-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AOFZWWDTTJLHOU-ULQDDVLXSA-N 0.000 description 1
- CRVSHEPROQHVQT-AVGNSLFASA-N Met-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)O)N CRVSHEPROQHVQT-AVGNSLFASA-N 0.000 description 1
- QTMIXEQWGNIPBL-JYJNAYRXSA-N Met-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N QTMIXEQWGNIPBL-JYJNAYRXSA-N 0.000 description 1
- CNAGWYQWQDMUGC-IHRRRGAJSA-N Met-Phe-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CNAGWYQWQDMUGC-IHRRRGAJSA-N 0.000 description 1
- OIFHHODAXVWKJN-ULQDDVLXSA-N Met-Phe-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 OIFHHODAXVWKJN-ULQDDVLXSA-N 0.000 description 1
- CIDICGYKRUTYLE-FXQIFTODSA-N Met-Ser-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O CIDICGYKRUTYLE-FXQIFTODSA-N 0.000 description 1
- DBMLDOWSVHMQQN-XGEHTFHBSA-N Met-Ser-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DBMLDOWSVHMQQN-XGEHTFHBSA-N 0.000 description 1
- KAKJTZWHIUWTTD-VQVTYTSYSA-N Met-Thr Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)O)C([O-])=O KAKJTZWHIUWTTD-VQVTYTSYSA-N 0.000 description 1
- BJFJQOMZCSHBMY-YUMQZZPRSA-N Met-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(O)=O BJFJQOMZCSHBMY-YUMQZZPRSA-N 0.000 description 1
- VWFHWJGVLVZVIS-QXEWZRGKSA-N Met-Val-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O VWFHWJGVLVZVIS-QXEWZRGKSA-N 0.000 description 1
- VYDLZDRMOFYOGV-TUAOUCFPSA-N Met-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N VYDLZDRMOFYOGV-TUAOUCFPSA-N 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 1
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 1
- 108010066427 N-valyltryptophan Proteins 0.000 description 1
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- MIDZLCFIAINOQN-WPRPVWTQSA-N Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 MIDZLCFIAINOQN-WPRPVWTQSA-N 0.000 description 1
- ULECEJGNDHWSKD-QEJZJMRPSA-N Phe-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 ULECEJGNDHWSKD-QEJZJMRPSA-N 0.000 description 1
- MECSIDWUTYRHRJ-KKUMJFAQSA-N Phe-Asn-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O MECSIDWUTYRHRJ-KKUMJFAQSA-N 0.000 description 1
- KAHUBGWSIQNZQQ-KKUMJFAQSA-N Phe-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KAHUBGWSIQNZQQ-KKUMJFAQSA-N 0.000 description 1
- OHUXOEXBXPZKPT-STQMWFEESA-N Phe-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=CC=C1 OHUXOEXBXPZKPT-STQMWFEESA-N 0.000 description 1
- FXYXBEZMRACDDR-KKUMJFAQSA-N Phe-His-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O FXYXBEZMRACDDR-KKUMJFAQSA-N 0.000 description 1
- FXPZZKBHNOMLGA-HJWJTTGWSA-N Phe-Ile-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N FXPZZKBHNOMLGA-HJWJTTGWSA-N 0.000 description 1
- RFCVXVPWSPOMFJ-STQMWFEESA-N Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 RFCVXVPWSPOMFJ-STQMWFEESA-N 0.000 description 1
- KDYPMIZMXDECSU-JYJNAYRXSA-N Phe-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KDYPMIZMXDECSU-JYJNAYRXSA-N 0.000 description 1
- KPEIBEPEUAZWNS-ULQDDVLXSA-N Phe-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KPEIBEPEUAZWNS-ULQDDVLXSA-N 0.000 description 1
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 1
- BNRFQGLWLQESBG-YESZJQIVSA-N Phe-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O BNRFQGLWLQESBG-YESZJQIVSA-N 0.000 description 1
- BSHMIVKDJQGLNT-ACRUOGEOSA-N Phe-Lys-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 BSHMIVKDJQGLNT-ACRUOGEOSA-N 0.000 description 1
- GKZIWHRNKRBEOH-HOTGVXAUSA-N Phe-Phe Chemical compound C([C@H]([NH3+])C(=O)N[C@@H](CC=1C=CC=CC=1)C([O-])=O)C1=CC=CC=C1 GKZIWHRNKRBEOH-HOTGVXAUSA-N 0.000 description 1
- AXIOGMQCDYVTNY-ACRUOGEOSA-N Phe-Phe-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 AXIOGMQCDYVTNY-ACRUOGEOSA-N 0.000 description 1
- ZVRJWDUPIDMHDN-ULQDDVLXSA-N Phe-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 ZVRJWDUPIDMHDN-ULQDDVLXSA-N 0.000 description 1
- ROHDXJUFQVRDAV-UWVGGRQHSA-N Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ROHDXJUFQVRDAV-UWVGGRQHSA-N 0.000 description 1
- NYQBYASWHVRESG-MIMYLULJSA-N Phe-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 NYQBYASWHVRESG-MIMYLULJSA-N 0.000 description 1
- MSSXKZBDKZAHCX-UNQGMJICSA-N Phe-Thr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O MSSXKZBDKZAHCX-UNQGMJICSA-N 0.000 description 1
- FSXRLASFHBWESK-HOTGVXAUSA-N Phe-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 FSXRLASFHBWESK-HOTGVXAUSA-N 0.000 description 1
- QUUCAHIYARMNBL-FHWLQOOXSA-N Phe-Tyr-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N QUUCAHIYARMNBL-FHWLQOOXSA-N 0.000 description 1
- IEHDJWSAXBGJIP-RYUDHWBXSA-N Phe-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 IEHDJWSAXBGJIP-RYUDHWBXSA-N 0.000 description 1
- 235000014676 Phragmites communis Nutrition 0.000 description 1
- HMNSRTLZAJHSIK-YUMQZZPRSA-N Pro-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 HMNSRTLZAJHSIK-YUMQZZPRSA-N 0.000 description 1
- SSSFPISOZOLQNP-GUBZILKMSA-N Pro-Arg-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SSSFPISOZOLQNP-GUBZILKMSA-N 0.000 description 1
- KDIIENQUNVNWHR-JYJNAYRXSA-N Pro-Arg-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KDIIENQUNVNWHR-JYJNAYRXSA-N 0.000 description 1
- AMBLXEMWFARNNQ-DCAQKATOSA-N Pro-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@@H]1CCCN1 AMBLXEMWFARNNQ-DCAQKATOSA-N 0.000 description 1
- VPEVBAUSTBWQHN-NHCYSSNCSA-N Pro-Glu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O VPEVBAUSTBWQHN-NHCYSSNCSA-N 0.000 description 1
- UUHXBJHVTVGSKM-BQBZGAKWSA-N Pro-Gly-Asn Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UUHXBJHVTVGSKM-BQBZGAKWSA-N 0.000 description 1
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 1
- AJCRQOHDLCBHFA-SRVKXCTJSA-N Pro-His-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AJCRQOHDLCBHFA-SRVKXCTJSA-N 0.000 description 1
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 1
- RMODQFBNDDENCP-IHRRRGAJSA-N Pro-Lys-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O RMODQFBNDDENCP-IHRRRGAJSA-N 0.000 description 1
- ULWBBFKQBDNGOY-RWMBFGLXSA-N Pro-Lys-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N2CCC[C@@H]2C(=O)O ULWBBFKQBDNGOY-RWMBFGLXSA-N 0.000 description 1
- SWRNSCMUXRLHCR-ULQDDVLXSA-N Pro-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 SWRNSCMUXRLHCR-ULQDDVLXSA-N 0.000 description 1
- BGWKULMLUIUPKY-BQBZGAKWSA-N Pro-Ser-Gly Chemical compound OC(=O)CNC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BGWKULMLUIUPKY-BQBZGAKWSA-N 0.000 description 1
- LZHHZYDPMZEMRX-STQMWFEESA-N Pro-Tyr-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O LZHHZYDPMZEMRX-STQMWFEESA-N 0.000 description 1
- 229940124158 Protease/peptidase inhibitor Drugs 0.000 description 1
- 208000010378 Pulmonary Embolism Diseases 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- SSJMZMUVNKEENT-IMJSIDKUSA-N Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CO SSJMZMUVNKEENT-IMJSIDKUSA-N 0.000 description 1
- SRTCFKGBYBZRHA-ACZMJKKPSA-N Ser-Ala-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SRTCFKGBYBZRHA-ACZMJKKPSA-N 0.000 description 1
- HRNQLKCLPVKZNE-CIUDSAMLSA-N Ser-Ala-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O HRNQLKCLPVKZNE-CIUDSAMLSA-N 0.000 description 1
- GXXTUIUYTWGPMV-FXQIFTODSA-N Ser-Arg-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O GXXTUIUYTWGPMV-FXQIFTODSA-N 0.000 description 1
- LTFSLKWFMWZEBD-IMJSIDKUSA-N Ser-Asn Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O LTFSLKWFMWZEBD-IMJSIDKUSA-N 0.000 description 1
- BCKYYTVFBXHPOG-ACZMJKKPSA-N Ser-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N BCKYYTVFBXHPOG-ACZMJKKPSA-N 0.000 description 1
- VBKBDLMWICBSCY-IMJSIDKUSA-N Ser-Asp Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O VBKBDLMWICBSCY-IMJSIDKUSA-N 0.000 description 1
- XWCYBVBLJRWOFR-WDSKDSINSA-N Ser-Gln-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O XWCYBVBLJRWOFR-WDSKDSINSA-N 0.000 description 1
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 1
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 1
- GZBKRJVCRMZAST-XKBZYTNZSA-N Ser-Glu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZBKRJVCRMZAST-XKBZYTNZSA-N 0.000 description 1
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 1
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 1
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 1
- YZMPDHTZJJCGEI-BQBZGAKWSA-N Ser-His Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 YZMPDHTZJJCGEI-BQBZGAKWSA-N 0.000 description 1
- QGAHMVHBORDHDC-YUMQZZPRSA-N Ser-His-Gly Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CN=CN1 QGAHMVHBORDHDC-YUMQZZPRSA-N 0.000 description 1
- CJINPXGSKSZQNE-KBIXCLLPSA-N Ser-Ile-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O CJINPXGSKSZQNE-KBIXCLLPSA-N 0.000 description 1
- MOINZPRHJGTCHZ-MMWGEVLESA-N Ser-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N MOINZPRHJGTCHZ-MMWGEVLESA-N 0.000 description 1
- UIPXCLNLUUAMJU-JBDRJPRFSA-N Ser-Ile-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UIPXCLNLUUAMJU-JBDRJPRFSA-N 0.000 description 1
- ZOPISOXXPQNOCO-SVSWQMSJSA-N Ser-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CO)N ZOPISOXXPQNOCO-SVSWQMSJSA-N 0.000 description 1
- GJFYFGOEWLDQGW-GUBZILKMSA-N Ser-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GJFYFGOEWLDQGW-GUBZILKMSA-N 0.000 description 1
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 1
- IUXGJEIKJBYKOO-SRVKXCTJSA-N Ser-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N IUXGJEIKJBYKOO-SRVKXCTJSA-N 0.000 description 1
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 1
- SBMNPABNWKXNBJ-BQBZGAKWSA-N Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CO SBMNPABNWKXNBJ-BQBZGAKWSA-N 0.000 description 1
- UGGWCAFQPKANMW-FXQIFTODSA-N Ser-Met-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O UGGWCAFQPKANMW-FXQIFTODSA-N 0.000 description 1
- ASGYVPAVFNDZMA-GUBZILKMSA-N Ser-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CO)N ASGYVPAVFNDZMA-GUBZILKMSA-N 0.000 description 1
- PPQRSMGDOHLTBE-UWVGGRQHSA-N Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PPQRSMGDOHLTBE-UWVGGRQHSA-N 0.000 description 1
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 1
- BSXKBOUZDAZXHE-CIUDSAMLSA-N Ser-Pro-Glu Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O BSXKBOUZDAZXHE-CIUDSAMLSA-N 0.000 description 1
- QPPYAWVLAVXISR-DCAQKATOSA-N Ser-Pro-His Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CO)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O QPPYAWVLAVXISR-DCAQKATOSA-N 0.000 description 1
- FKYWFUYPVKLJLP-DCAQKATOSA-N Ser-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FKYWFUYPVKLJLP-DCAQKATOSA-N 0.000 description 1
- WLJPJRGQRNCIQS-ZLUOBGJFSA-N Ser-Ser-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O WLJPJRGQRNCIQS-ZLUOBGJFSA-N 0.000 description 1
- NVNPWELENFJOHH-CIUDSAMLSA-N Ser-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CO)N NVNPWELENFJOHH-CIUDSAMLSA-N 0.000 description 1
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 1
- XQJCEKXQUJQNNK-ZLUOBGJFSA-N Ser-Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O XQJCEKXQUJQNNK-ZLUOBGJFSA-N 0.000 description 1
- PYTKULIABVRXSC-BWBBJGPYSA-N Ser-Ser-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PYTKULIABVRXSC-BWBBJGPYSA-N 0.000 description 1
- DKGRNFUXVTYRAS-UBHSHLNASA-N Ser-Ser-Trp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O DKGRNFUXVTYRAS-UBHSHLNASA-N 0.000 description 1
- VGQVAVQWKJLIRM-FXQIFTODSA-N Ser-Ser-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O VGQVAVQWKJLIRM-FXQIFTODSA-N 0.000 description 1
- LDEBVRIURYMKQS-WISUUJSJSA-N Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO LDEBVRIURYMKQS-WISUUJSJSA-N 0.000 description 1
- XJDMUQCLVSCRSJ-VZFHVOOUSA-N Ser-Thr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O XJDMUQCLVSCRSJ-VZFHVOOUSA-N 0.000 description 1
- SQHKXWODKJDZRC-LKXGYXEUSA-N Ser-Thr-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQHKXWODKJDZRC-LKXGYXEUSA-N 0.000 description 1
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 1
- UYLKOSODXYSWMQ-XGEHTFHBSA-N Ser-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CO)N)O UYLKOSODXYSWMQ-XGEHTFHBSA-N 0.000 description 1
- BDMWLJLPPUCLNV-XGEHTFHBSA-N Ser-Thr-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BDMWLJLPPUCLNV-XGEHTFHBSA-N 0.000 description 1
- LZLREEUGSYITMX-JQWIXIFHSA-N Ser-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)N)C(O)=O)=CNC2=C1 LZLREEUGSYITMX-JQWIXIFHSA-N 0.000 description 1
- ILVGMCVCQBJPSH-WDSKDSINSA-N Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO ILVGMCVCQBJPSH-WDSKDSINSA-N 0.000 description 1
- MFQMZDPAZRZAPV-NAKRPEOUSA-N Ser-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CO)N MFQMZDPAZRZAPV-NAKRPEOUSA-N 0.000 description 1
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 1
- 102000012479 Serine Proteases Human genes 0.000 description 1
- 108010022999 Serine Proteases Proteins 0.000 description 1
- 102000007562 Serum Albumin Human genes 0.000 description 1
- 108010071390 Serum Albumin Proteins 0.000 description 1
- VPZKQTYZIVOJDV-LMVFSUKVSA-N Thr-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(O)=O VPZKQTYZIVOJDV-LMVFSUKVSA-N 0.000 description 1
- ZUXQFMVPAYGPFJ-JXUBOQSCSA-N Thr-Ala-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN ZUXQFMVPAYGPFJ-JXUBOQSCSA-N 0.000 description 1
- UQTNIFUCMBFWEJ-IWGUZYHVSA-N Thr-Asn Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O UQTNIFUCMBFWEJ-IWGUZYHVSA-N 0.000 description 1
- BWUHENPAEMNGQJ-ZDLURKLDSA-N Thr-Gln Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O BWUHENPAEMNGQJ-ZDLURKLDSA-N 0.000 description 1
- BECPPKYKPSRKCP-ZDLURKLDSA-N Thr-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O BECPPKYKPSRKCP-ZDLURKLDSA-N 0.000 description 1
- UDQBCBUXAQIZAK-GLLZPBPUSA-N Thr-Glu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDQBCBUXAQIZAK-GLLZPBPUSA-N 0.000 description 1
- OQCXTUQTKQFDCX-HTUGSXCWSA-N Thr-Glu-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O OQCXTUQTKQFDCX-HTUGSXCWSA-N 0.000 description 1
- ONNSECRQFSTMCC-XKBZYTNZSA-N Thr-Glu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ONNSECRQFSTMCC-XKBZYTNZSA-N 0.000 description 1
- BIYXEUAFGLTAEM-WUJLRWPWSA-N Thr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(O)=O BIYXEUAFGLTAEM-WUJLRWPWSA-N 0.000 description 1
- XFTYVCHLARBHBQ-FOHZUACHSA-N Thr-Gly-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XFTYVCHLARBHBQ-FOHZUACHSA-N 0.000 description 1
- WXVIGTAUZBUDPZ-DTLFHODZSA-N Thr-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 WXVIGTAUZBUDPZ-DTLFHODZSA-N 0.000 description 1
- FQPDRTDDEZXCEC-SVSWQMSJSA-N Thr-Ile-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O FQPDRTDDEZXCEC-SVSWQMSJSA-N 0.000 description 1
- AMXMBCAXAZUCFA-RHYQMDGZSA-N Thr-Leu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AMXMBCAXAZUCFA-RHYQMDGZSA-N 0.000 description 1
- NCXVJIQMWSGRHY-KXNHARMFSA-N Thr-Leu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O NCXVJIQMWSGRHY-KXNHARMFSA-N 0.000 description 1
- QNCFWHZVRNXAKW-OEAJRASXSA-N Thr-Lys-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNCFWHZVRNXAKW-OEAJRASXSA-N 0.000 description 1
- DXPURPNJDFCKKO-RHYQMDGZSA-N Thr-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DXPURPNJDFCKKO-RHYQMDGZSA-N 0.000 description 1
- VGYVVSQFSSKZRJ-OEAJRASXSA-N Thr-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=CC=C1 VGYVVSQFSSKZRJ-OEAJRASXSA-N 0.000 description 1
- GXDLGHLJTHMDII-WISUUJSJSA-N Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(O)=O GXDLGHLJTHMDII-WISUUJSJSA-N 0.000 description 1
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 1
- CSNBWOJOEOPYIJ-UVOCVTCTSA-N Thr-Thr-Lys Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O CSNBWOJOEOPYIJ-UVOCVTCTSA-N 0.000 description 1
- ZEJBJDHSQPOVJV-UAXMHLISSA-N Thr-Trp-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZEJBJDHSQPOVJV-UAXMHLISSA-N 0.000 description 1
- CKHWEVXPLJBEOZ-VQVTYTSYSA-N Thr-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])[C@@H](C)O CKHWEVXPLJBEOZ-VQVTYTSYSA-N 0.000 description 1
- AKHDFZHUPGVFEJ-YEPSODPASA-N Thr-Val-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AKHDFZHUPGVFEJ-YEPSODPASA-N 0.000 description 1
- PWONLXBUSVIZPH-RHYQMDGZSA-N Thr-Val-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O PWONLXBUSVIZPH-RHYQMDGZSA-N 0.000 description 1
- 208000007536 Thrombosis Diseases 0.000 description 1
- 238000008050 Total Bilirubin Reagent Methods 0.000 description 1
- MJBBMTOGSOSAKJ-HJXMPXNTSA-N Trp-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MJBBMTOGSOSAKJ-HJXMPXNTSA-N 0.000 description 1
- PKUJMYZNJMRHEZ-XIRDDKMYSA-N Trp-Glu-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PKUJMYZNJMRHEZ-XIRDDKMYSA-N 0.000 description 1
- YXONONCLMLHWJX-SZMVWBNQSA-N Trp-Glu-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O)=CNC2=C1 YXONONCLMLHWJX-SZMVWBNQSA-N 0.000 description 1
- ULHASJWZGUEUNN-XIRDDKMYSA-N Trp-Lys-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O ULHASJWZGUEUNN-XIRDDKMYSA-N 0.000 description 1
- NIHNMOSRSAYZIT-BPNCWPANSA-N Tyr-Ala-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NIHNMOSRSAYZIT-BPNCWPANSA-N 0.000 description 1
- ZWZOCUWOXSDYFZ-CQDKDKBSSA-N Tyr-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZWZOCUWOXSDYFZ-CQDKDKBSSA-N 0.000 description 1
- DXYWRYQRKPIGGU-BPNCWPANSA-N Tyr-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DXYWRYQRKPIGGU-BPNCWPANSA-N 0.000 description 1
- NGALWFGCOMHUSN-AVGNSLFASA-N Tyr-Gln-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NGALWFGCOMHUSN-AVGNSLFASA-N 0.000 description 1
- PDSLRCZINIDLMU-QWRGUYRKSA-N Tyr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PDSLRCZINIDLMU-QWRGUYRKSA-N 0.000 description 1
- CNLKDWSAORJEMW-KWQFWETISA-N Tyr-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O CNLKDWSAORJEMW-KWQFWETISA-N 0.000 description 1
- ZQOOYCZQENFIMC-STQMWFEESA-N Tyr-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=C(O)C=C1 ZQOOYCZQENFIMC-STQMWFEESA-N 0.000 description 1
- QJKMCQRFHJRIPU-XDTLVQLUSA-N Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QJKMCQRFHJRIPU-XDTLVQLUSA-N 0.000 description 1
- FJBCEFPCVPHPPM-STECZYCISA-N Tyr-Ile-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O FJBCEFPCVPHPPM-STECZYCISA-N 0.000 description 1
- YYLHVUCSTXXKBS-IHRRRGAJSA-N Tyr-Pro-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YYLHVUCSTXXKBS-IHRRRGAJSA-N 0.000 description 1
- JAQGKXUEKGKTKX-HOTGVXAUSA-N Tyr-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 JAQGKXUEKGKTKX-HOTGVXAUSA-N 0.000 description 1
- WYOBRXPIZVKNMF-IRXDYDNUSA-N Tyr-Tyr-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)NCC(O)=O)C1=CC=C(O)C=C1 WYOBRXPIZVKNMF-IRXDYDNUSA-N 0.000 description 1
- RVGVIWNHABGIFH-IHRRRGAJSA-N Tyr-Val-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O RVGVIWNHABGIFH-IHRRRGAJSA-N 0.000 description 1
- 108010064997 VPY tripeptide Proteins 0.000 description 1
- REJBPZVUHYNMEN-LSJOCFKGSA-N Val-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](C(C)C)N REJBPZVUHYNMEN-LSJOCFKGSA-N 0.000 description 1
- ZLFHAAGHGQBQQN-AEJSXWLSSA-N Val-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZLFHAAGHGQBQQN-AEJSXWLSSA-N 0.000 description 1
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 1
- AZSHAZJLOZQYAY-FXQIFTODSA-N Val-Ala-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O AZSHAZJLOZQYAY-FXQIFTODSA-N 0.000 description 1
- LABUITCFCAABSV-BPNCWPANSA-N Val-Ala-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LABUITCFCAABSV-BPNCWPANSA-N 0.000 description 1
- LABUITCFCAABSV-UHFFFAOYSA-N Val-Ala-Tyr Natural products CC(C)C(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LABUITCFCAABSV-UHFFFAOYSA-N 0.000 description 1
- IBIDRSSEHFLGSD-YUMQZZPRSA-N Val-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-YUMQZZPRSA-N 0.000 description 1
- WKWJJQZZZBBWKV-JYJNAYRXSA-N Val-Arg-Tyr Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WKWJJQZZZBBWKV-JYJNAYRXSA-N 0.000 description 1
- VUTHNLMCXKLLFI-LAEOZQHASA-N Val-Asp-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VUTHNLMCXKLLFI-LAEOZQHASA-N 0.000 description 1
- CFSSLXZJEMERJY-NRPADANISA-N Val-Gln-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CFSSLXZJEMERJY-NRPADANISA-N 0.000 description 1
- ZEVNVXYRZRIRCH-GVXVVHGQSA-N Val-Gln-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N ZEVNVXYRZRIRCH-GVXVVHGQSA-N 0.000 description 1
- UPJONISHZRADBH-XPUUQOCRSA-N Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UPJONISHZRADBH-XPUUQOCRSA-N 0.000 description 1
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 1
- ROLGIBMFNMZANA-GVXVVHGQSA-N Val-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N ROLGIBMFNMZANA-GVXVVHGQSA-N 0.000 description 1
- CELJCNRXKZPTCX-XPUUQOCRSA-N Val-Gly-Ala Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O CELJCNRXKZPTCX-XPUUQOCRSA-N 0.000 description 1
- MDYSKHBSPXUOPV-JSGCOSHPSA-N Val-Gly-Phe Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N MDYSKHBSPXUOPV-JSGCOSHPSA-N 0.000 description 1
- BVWPHWLFGRCECJ-JSGCOSHPSA-N Val-Gly-Tyr Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N BVWPHWLFGRCECJ-JSGCOSHPSA-N 0.000 description 1
- HLBHFAWNMAQGNO-AVGNSLFASA-N Val-His-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCSC)C(=O)O)N HLBHFAWNMAQGNO-AVGNSLFASA-N 0.000 description 1
- CPGJELLYDQEDRK-NAKRPEOUSA-N Val-Ile-Ala Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](C)C(O)=O CPGJELLYDQEDRK-NAKRPEOUSA-N 0.000 description 1
- LKUDRJSNRWVGMS-QSFUFRPTSA-N Val-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LKUDRJSNRWVGMS-QSFUFRPTSA-N 0.000 description 1
- XCTHZFGSVQBHBW-IUCAKERBSA-N Val-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])C(C)C XCTHZFGSVQBHBW-IUCAKERBSA-N 0.000 description 1
- BMOFUVHDBROBSE-DCAQKATOSA-N Val-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N BMOFUVHDBROBSE-DCAQKATOSA-N 0.000 description 1
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 1
- XTDDIVQWDXMRJL-IHRRRGAJSA-N Val-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](C(C)C)N XTDDIVQWDXMRJL-IHRRRGAJSA-N 0.000 description 1
- DAVNYIUELQBTAP-XUXIUFHCSA-N Val-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N DAVNYIUELQBTAP-XUXIUFHCSA-N 0.000 description 1
- WLHIIWDIDLQTKP-IHRRRGAJSA-N Val-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)C(C)C WLHIIWDIDLQTKP-IHRRRGAJSA-N 0.000 description 1
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 1
- KANQPJDDXIYZJS-AVGNSLFASA-N Val-Leu-Val Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](C(C)C)N KANQPJDDXIYZJS-AVGNSLFASA-N 0.000 description 1
- IJGPOONOTBNTFS-GVXVVHGQSA-N Val-Lys-Glu Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O IJGPOONOTBNTFS-GVXVVHGQSA-N 0.000 description 1
- VPGCVZRRBYOGCD-AVGNSLFASA-N Val-Lys-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O VPGCVZRRBYOGCD-AVGNSLFASA-N 0.000 description 1
- MBGFDZDWMDLXHQ-GUBZILKMSA-N Val-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](C(C)C)N MBGFDZDWMDLXHQ-GUBZILKMSA-N 0.000 description 1
- IOETTZIEIBVWBZ-GUBZILKMSA-N Val-Met-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CS)C(=O)O)N IOETTZIEIBVWBZ-GUBZILKMSA-N 0.000 description 1
- GIAZPLMMQOERPN-YUMQZZPRSA-N Val-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GIAZPLMMQOERPN-YUMQZZPRSA-N 0.000 description 1
- BGXVHVMJZCSOCA-AVGNSLFASA-N Val-Pro-Lys Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N BGXVHVMJZCSOCA-AVGNSLFASA-N 0.000 description 1
- SSYBNWFXCFNRFN-GUBZILKMSA-N Val-Pro-Ser Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SSYBNWFXCFNRFN-GUBZILKMSA-N 0.000 description 1
- QWCZXKIFPWPQHR-JYJNAYRXSA-N Val-Pro-Tyr Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QWCZXKIFPWPQHR-JYJNAYRXSA-N 0.000 description 1
- KSFXWENSJABBFI-ZKWXMUAHSA-N Val-Ser-Asn Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KSFXWENSJABBFI-ZKWXMUAHSA-N 0.000 description 1
- PGQUDQYHWICSAB-NAKRPEOUSA-N Val-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N PGQUDQYHWICSAB-NAKRPEOUSA-N 0.000 description 1
- QTPQHINADBYBNA-DCAQKATOSA-N Val-Ser-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN QTPQHINADBYBNA-DCAQKATOSA-N 0.000 description 1
- UJMCYJKPDFQLHX-XGEHTFHBSA-N Val-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N)O UJMCYJKPDFQLHX-XGEHTFHBSA-N 0.000 description 1
- GVRKWABULJAONN-VQVTYTSYSA-N Val-Thr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GVRKWABULJAONN-VQVTYTSYSA-N 0.000 description 1
- GVNLOVJNNDZUHS-RHYQMDGZSA-N Val-Thr-Lys Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O GVNLOVJNNDZUHS-RHYQMDGZSA-N 0.000 description 1
- IWADHXDXSQONEL-GUBZILKMSA-N Val-Val-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O IWADHXDXSQONEL-GUBZILKMSA-N 0.000 description 1
- 206010047249 Venous thrombosis Diseases 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 241000209149 Zea Species 0.000 description 1
- 235000016383 Zea mays subsp huehuetenangensis Nutrition 0.000 description 1
- 238000002835 absorbance Methods 0.000 description 1
- 238000010521 absorption reaction Methods 0.000 description 1
- 238000000862 absorption spectrum Methods 0.000 description 1
- 108010011164 acein 1 Proteins 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 230000001154 acute effect Effects 0.000 description 1
- 239000008272 agar Substances 0.000 description 1
- 108010017893 alanyl-alanyl-alanine Proteins 0.000 description 1
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 1
- 108010045350 alanyl-tyrosyl-alanine Proteins 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 1
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 1
- 230000002785 anti-thrombosis Effects 0.000 description 1
- 230000001475 anti-trypsic effect Effects 0.000 description 1
- 239000003146 anticoagulant agent Substances 0.000 description 1
- 229960004676 antithrombotic agent Drugs 0.000 description 1
- 239000012736 aqueous medium Substances 0.000 description 1
- 108010013835 arginine glutamate Proteins 0.000 description 1
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 108010036533 arginylvaline Proteins 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 125000000613 asparagine group Chemical group N[C@@H](CC(N)=O)C(=O)* 0.000 description 1
- 108010077245 asparaginyl-proline Proteins 0.000 description 1
- 108010036999 aspartyl-alanyl-histidyl-lysine Proteins 0.000 description 1
- 108010092854 aspartyllysine Proteins 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 230000036772 blood pressure Effects 0.000 description 1
- 239000000872 buffer Substances 0.000 description 1
- 239000001110 calcium chloride Substances 0.000 description 1
- 229910001628 calcium chloride Inorganic materials 0.000 description 1
- 235000011148 calcium chloride Nutrition 0.000 description 1
- 150000001720 carbohydrates Chemical group 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 150000001768 cations Chemical class 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 239000002738 chelating agent Substances 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 230000001684 chronic effect Effects 0.000 description 1
- 229960002376 chymotrypsin Drugs 0.000 description 1
- 230000007882 cirrhosis Effects 0.000 description 1
- 208000019425 cirrhosis of liver Diseases 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 230000015271 coagulation Effects 0.000 description 1
- 238000005345 coagulation Methods 0.000 description 1
- 210000001072 colon Anatomy 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 230000009918 complex formation Effects 0.000 description 1
- 238000011109 contamination Methods 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 239000003599 detergent Substances 0.000 description 1
- 235000014113 dietary fatty acids Nutrition 0.000 description 1
- 230000005059 dormancy Effects 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 239000003602 elastase inhibitor Substances 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 210000000981 epithelium Anatomy 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 239000000194 fatty acid Substances 0.000 description 1
- 229930195729 fatty acid Natural products 0.000 description 1
- 150000004665 fatty acids Chemical class 0.000 description 1
- 230000004720 fertilization Effects 0.000 description 1
- 238000011049 filling Methods 0.000 description 1
- 102000035175 foldases Human genes 0.000 description 1
- 108091005749 foldases Proteins 0.000 description 1
- 239000012737 fresh medium Substances 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 1
- 238000007429 general method Methods 0.000 description 1
- 208000016361 genetic disease Diseases 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 230000007614 genetic variation Effects 0.000 description 1
- 239000003448 gibberellin Substances 0.000 description 1
- 108010013768 glutamyl-aspartyl-proline Proteins 0.000 description 1
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 1
- 108010049041 glutamylalanine Proteins 0.000 description 1
- 108010079547 glutamylmethionine Proteins 0.000 description 1
- 108091005608 glycosylated proteins Proteins 0.000 description 1
- 102000035122 glycosylated proteins Human genes 0.000 description 1
- 108010051307 glycyl-glycyl-proline Proteins 0.000 description 1
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 1
- 108010010147 glycylglutamine Proteins 0.000 description 1
- 108010033706 glycylserine Proteins 0.000 description 1
- STKYPAFSDFAEPH-LURJTMIESA-N glycylvaline Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CN STKYPAFSDFAEPH-LURJTMIESA-N 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 230000002363 herbicidal effect Effects 0.000 description 1
- 239000004009 herbicide Substances 0.000 description 1
- 108010025306 histidylleucine Proteins 0.000 description 1
- 108010092114 histidylphenylalanine Proteins 0.000 description 1
- 108010085325 histidylproline Proteins 0.000 description 1
- 101150026546 hsa gene Proteins 0.000 description 1
- 102000052834 human SERPINC1 Human genes 0.000 description 1
- ILHIHKRJJMKBEE-UHFFFAOYSA-N hydroperoxyethane Chemical compound CCOO ILHIHKRJJMKBEE-UHFFFAOYSA-N 0.000 description 1
- 230000001976 improved effect Effects 0.000 description 1
- 239000000411 inducer Substances 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 238000002347 injection Methods 0.000 description 1
- 239000007924 injection Substances 0.000 description 1
- 208000014674 injury Diseases 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 108010031424 isoleucyl-prolyl-proline Proteins 0.000 description 1
- 238000011031 large-scale manufacturing process Methods 0.000 description 1
- DVCSNHXRZUVYAM-BQBZGAKWSA-N leu-asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O DVCSNHXRZUVYAM-BQBZGAKWSA-N 0.000 description 1
- 108010071185 leucyl-alanine Proteins 0.000 description 1
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 1
- 108010049589 leucyl-leucyl-leucine Proteins 0.000 description 1
- 108010091798 leucylleucine Proteins 0.000 description 1
- 108010012058 leucyltyrosine Proteins 0.000 description 1
- 238000009630 liquid culture Methods 0.000 description 1
- 210000004072 lung Anatomy 0.000 description 1
- 235000009973 maize Nutrition 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 150000002739 metals Chemical class 0.000 description 1
- 108010016686 methionyl-alanyl-serine Proteins 0.000 description 1
- 108010068488 methionylphenylalanine Proteins 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 108010058731 nopaline synthase Proteins 0.000 description 1
- 230000031787 nutrient reservoir activity Effects 0.000 description 1
- 235000016709 nutrition Nutrition 0.000 description 1
- 229920001542 oligosaccharide Polymers 0.000 description 1
- 239000003960 organic solvent Substances 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- 108010074082 phenylalanyl-alanyl-lysine Proteins 0.000 description 1
- 108010084572 phenylalanyl-valine Proteins 0.000 description 1
- 108010073101 phenylalanylleucine Proteins 0.000 description 1
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 1
- BULVZWIRKLYCBC-UHFFFAOYSA-N phorate Chemical compound CCOP(=S)(OCC)SCSCC BULVZWIRKLYCBC-UHFFFAOYSA-N 0.000 description 1
- 108010082527 phosphinothricin N-acetyltransferase Proteins 0.000 description 1
- 230000004962 physiological condition Effects 0.000 description 1
- 229930195732 phytohormone Natural products 0.000 description 1
- 108010025488 pinealon Proteins 0.000 description 1
- 235000013824 polyphenols Nutrition 0.000 description 1
- 230000003389 potentiating effect Effects 0.000 description 1
- 230000002265 prevention Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 108010004914 prolylarginine Proteins 0.000 description 1
- 108010070643 prolylglutamic acid Proteins 0.000 description 1
- 108010090894 prolylleucine Proteins 0.000 description 1
- 235000004252 protein component Nutrition 0.000 description 1
- 230000012846 protein folding Effects 0.000 description 1
- 238000000164 protein isolation Methods 0.000 description 1
- 230000020978 protein processing Effects 0.000 description 1
- 230000012743 protein tagging Effects 0.000 description 1
- 230000002797 proteolythic effect Effects 0.000 description 1
- 238000003259 recombinant expression Methods 0.000 description 1
- 238000011084 recovery Methods 0.000 description 1
- 230000002829 reductive effect Effects 0.000 description 1
- 230000008929 regeneration Effects 0.000 description 1
- 238000011069 regeneration method Methods 0.000 description 1
- 238000009256 replacement therapy Methods 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 230000007281 self degradation Effects 0.000 description 1
- 108010048818 seryl-histidine Proteins 0.000 description 1
- 108010071207 serylmethionine Proteins 0.000 description 1
- 230000035939 shock Effects 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 108010005652 splenotritin Proteins 0.000 description 1
- 239000003381 stabilizer Substances 0.000 description 1
- 239000008107 starch Substances 0.000 description 1
- 239000011550 stock solution Substances 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 238000001356 surgical procedure Methods 0.000 description 1
- 238000004114 suspension culture Methods 0.000 description 1
- 230000008685 targeting Effects 0.000 description 1
- 238000002560 therapeutic procedure Methods 0.000 description 1
- 230000001732 thrombotic effect Effects 0.000 description 1
- 230000005945 translocation Effects 0.000 description 1
- IMRYETFJNLKUHK-UHFFFAOYSA-N traseolide Chemical compound CC1=C(C(C)=O)C=C2C(C(C)C)C(C)C(C)(C)C2=C1 IMRYETFJNLKUHK-UHFFFAOYSA-N 0.000 description 1
- 230000008733 trauma Effects 0.000 description 1
- 108010087967 type I signal peptidase Proteins 0.000 description 1
- 108010003137 tyrosyltyrosine Proteins 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- 210000003934 vacuole Anatomy 0.000 description 1
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 1
- YSGSDAIMSCVPHG-UHFFFAOYSA-N valyl-methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)C(C)C YSGSDAIMSCVPHG-UHFFFAOYSA-N 0.000 description 1
- 108010036320 valylleucine Proteins 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
Classifications
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/81—Protease inhibitors
- C07K14/8107—Endopeptidase (E.C. 3.4.21-99) inhibitors
- C07K14/811—Serine protease (E.C. 3.4.21) inhibitors
- C07K14/8121—Serpins
- C07K14/8128—Antithrombin III
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/76—Albumins
- C07K14/765—Serum albumin, e.g. HSA
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/81—Protease inhibitors
- C07K14/8107—Endopeptidase (E.C. 3.4.21-99) inhibitors
- C07K14/811—Serine protease (E.C. 3.4.21) inhibitors
- C07K14/8121—Serpins
- C07K14/8125—Alpha-1-antitrypsin
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
- C12N15/8221—Transit peptides
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
- C12N15/8222—Developmentally regulated expression systems, tissue, organ specific, temporal or spatial regulation
- C12N15/823—Reproductive tissue-specific promoters
- C12N15/8234—Seed-specific, e.g. embryo, endosperm
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
- C12N15/8222—Developmentally regulated expression systems, tissue, organ specific, temporal or spatial regulation
- C12N15/823—Reproductive tissue-specific promoters
- C12N15/8235—Fruit-specific
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
- C12N15/8237—Externally regulated expression systems
- C12N15/8238—Externally regulated expression systems chemically inducible, e.g. tetracycline
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
- C12N15/8242—Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
- C12N15/8257—Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits for the production of primary gene products, e.g. pharmaceutical products, interferon
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/48—Hydrolases (3) acting on peptide bonds (3.4)
- C12N9/50—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
- C12N9/52—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from bacteria or Archaea
- C12N9/54—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from bacteria or Archaea bacteria being Bacillus
Landscapes
- Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Biomedical Technology (AREA)
- Zoology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Wood Science & Technology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Biochemistry (AREA)
- Biophysics (AREA)
- Microbiology (AREA)
- Cell Biology (AREA)
- Physics & Mathematics (AREA)
- Medicinal Chemistry (AREA)
- Plant Pathology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Gastroenterology & Hepatology (AREA)
- Pregnancy & Childbirth (AREA)
- Reproductive Health (AREA)
- Toxicology (AREA)
- Developmental Biology & Embryology (AREA)
- Pharmacology & Pharmacy (AREA)
- General Chemical & Material Sciences (AREA)
- Peptides Or Proteins (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
- Fertilizers (AREA)
Abstract
A method for producing one of the following proteins in transgenic monocot plant cells is disclosed: (i) mature, glycosylated α1-antitrypsin (AAT) having the same N-terminal amino acid sequence as mature AAT produced in humans and a glycosylation pattern which increases serum half-life substantially over that of mature non-glycosylated AAT; (ii) mature, glycosylated antithrombin III (ATIII) having the same N-terminal amino acid sequence as mature ATIII produced in humans; (iii) mature human serum albumin (HSA) having the same N-terminal amino acid sequence as mature HSA produced in humans and having the folding pattern of native mature HSA as evidenced by its bilirubin-binding characteristics; and (iv) mature, active subtilisin BPN' (BPN') having the same N-terminal amino acid sequence as BPN' produced in Bacillus. Monocot plant cells are transformed with a chimeric gene which includes a DNA coding sequence encoding a fusion protein having (i) an N-terminal moiety corresponding to a rice α-amylase signal sequence peptide and (ii) immediately adjacent the C-terminal amino acid of said peptide, a protein moiety corresponding to the mature protein to be produced.
Description
WO 98/36085 PCT/US98/03068
Production of Mature Proteins in Plants
Field of the Invention
The present invention relates to the production of mature proteins in plant cells, and in particular, to the production of proteins in mature secreted form.
Background of the Invention
A major commercial focus of biotechnology is the recombinant production of proteins, including both industrial enzymes and proteins that have important therapeutic uses.
Therapeutic proteins are commonly produced recombinantly by microbial expression systems, such as in E. coli and the yeast system S. cerevisiae. To date, the cost of recombinant proteins produced in a microbial host has limited the availability of a variety of therapeutically important proteins, such as human serum albumin (HSA) and α1-antitrypsin (AAT), to the extent that the proteins are in short supply.
Some therapeutic proteins appear to rely on glycosylation for optimal activity or stability, and the general inability of microbial systems to glycosylate or properly glycosylate mammalian proteins has also limited the usefulness of these recombinant expression systems. In some cases, proper protein folding cannot take place, because of the need for mammalian-specific foldases or other folding conditions. To some extent, protein expression in cultured mammalian cells, or in transgenic animals, may overcome the limitations of microbial expression systems. However, the cost per weight ratio of the protein is still high in mammalian expression systems, and the risk of protein contamination by mammalian viruses may be a significant regulatory problem. Protein production by transgenic animals also carries the risk of genetic variation from one generation to another. The attendant risk is variation in the recombinant protein produced, for example, variation in protein processing to yield a mature active protein with a different N-terminal residue.
It would therefore be desirable to produce selected therapeutic and industrial proteins in a protein expression system that largely overcomes problems associated with microbial and mammalian-cell systems. In particular, production of the proteins should allow large volume production at low cost, and yield properly processed and glycosylated proteins. The production system should also have a relatively stable genotype from generation to generation. These aims are achieved, in the present invention, for the therapeutic proteins AAT, HSA, and antithrombin III (ATIII), and the industrial enzyme subtilisin BPN'.
SUBSTITUTE SHEET (RULE 26)
Human α1-antitrypsin
Human α1-antitrypsin (AAT) is a monomer with a molecular weight of about 52 kD. Normal AAT contains 394 residues, with three complex oligosaccharide units exposed to the surface of the molecule, linked to asparagines 46, 83, and 247 (Carrell, P., et al., Nature (1982) 298:329).
AAT is the major plasma proteinase inhibitor whose primary function is to control the proteolytic activity of trypsin, elastase, and chymotrypsin in plasma. In particular, the protein is a potent inhibitor of neutrophil elastase, and a deficiency of AAT has been observed in a number of patients with chronic emphysema of the lungs. A proportion of individuals with serum deficiency of AAT may progress to cirrhosis and liver failure (e.g., Wu, Y., et al., BioEssays 13(4):163 (1991)).
Because of the key role of AAT as an elastase inhibitor, and because of the prevalence of genetic diseases resulting in deficient serum levels of AAT, there has been an active interest in recombinant synthesis of AAT, for human therapeutic use. To date, this approach has not been satisfactory for AAT produced by recombinant methods, for the reasons discussed above.
Human Antithrombin III
Antithrombin III (ATIII) is the major inhibitor of thrombin and factor Xa, and to a lesser extent, other serine proteases generated during the coagulation process, e.g., factors IXa, XIa, and XIIa. The inhibitory effect of ATIII is accelerated dramatically by heparin.
In patients with a history of deep vein thrombosis and pulmonary embolism, the prevalence of ATIII deficiency is 2-3%.
ATIII protein has been useful in treating hereditary ATIII deficiency and has wide clinical applications for the prevention of thrombosis in high risk situations, such as surgery and delivery, and for treating acute thrombotic episodes, when used in combination with heparin.
ATIII is a glycoprotein with a molecular weight of 58,200, having 432 amino acids and containing three disulfide linkages and four asparagine-linked biantennary carbohydrate chains.
Because of the key role of ATIII as an anti-thrombotic agent, and because of the broad clinical potential in anti-thrombosis therapy, there has been an active interest in recombinant synthesis of ATIII for human therapeutic use. To date, this approach has not been satisfactory for ATIII produced by microbial or mammalian recombinant methods, for the reasons discussed above.
Human Serum Albumin
Serum albumin is the main protein component of plasma. Its main function is regulation of colloidal osmotic pressure in the bloodstream. Serum albumin binds numerous ions and small molecules, including Ca2+, Na+, K+, fatty acids, hormones, bilirubin and certain drugs.
Human serum albumin (HSA) is expressed as a 609 amino acid prepro-protein which is further processed by removal of an amino-terminal peptide and an additional six amino acid residues to form the mature protein. The mature protein found in human serum is a monomeric, unglycosylated protein 585 amino acids in length (66 kDa), with a globular structure maintained by 17 disulfide bonds. The pattern of disulfide links forms a structural unit of one small and two large disulfide-linked double loops (Geisow, M.J. et al. (1977) Biochem. J. 163:477-484) which forms a high-affinity bilirubin binding site.
HSA is used to expand blood volume and raise low blood protein levels in cases of shock, trauma, and post-surgical recovery. HSA is often administered in emergency situations to stabilize blood pressure.
Because of the key role of HSA as an osmotic stabilizing agent, and because of its broad clinical potential in, e.g., plasma replacement therapy, there has been an active interest in recombinant synthesis of HSA for human therapeutic use. This approach has not been satisfactory for HSA produced by microbial or mammalian recombinant methods, for the reasons discussed above.
Subtilisin BPN'
Subtilisin BPN' (BPN') is an important industrial enzyme, particularly for use as a detergent enzyme. Several groups have reported amino acid substitution modifications of the enzyme that are effective in enhancing the activity, pH optimum, stability and/or therapeutic use of the enzyme.
BPN' is expressed as a 381 amino acid preproenzyme, including a 35 amino acid sequence required for secretion and a 77 amino acid moiety which serves as a chaperone to facilitate folding. Studies indicate that the pro moiety acts in trans outside of cells.
To date, large-scale production of BPN' is predominantly by microbial fermentation, which has relatively high costs associated with it. In addition, the enzyme tends to auto-degrade at optimal fermentation growth-medium conditions.
Summary of the Invention
In one aspect, the invention includes a method of producing, in monocot plant cells, a mature heterologous protein selected from the group consisting of (i) mature, glycosylated α1-antitrypsin (AAT) having the same N-terminal amino acid sequence as mature AAT produced in humans and a glycosylation pattern which increases serum half-life substantially over that of non-glycosylated mature AAT; (ii) mature, glycosylated antithrombin III (ATIII) having the same N-terminal amino acid sequence as mature ATIII produced in humans; (iii) mature human serum albumin (HSA) having the same N-terminal amino acid sequence as mature HSA produced in humans and having the folding pattern of native mature HSA as evidenced by its bilirubin-binding characteristics; and (iv) mature, active subtilisin BPN' (BPN'), glycosylated or non-glycosylated, having the same N-terminal amino acid sequence as BPN' produced in Bacillus.
The method includes obtaining monocot cells transformed with a chimeric gene having (i) a monocot transcriptional regulatory region, inducible by addition or removal of a small molecule, or during seed maturation, (ii) a first DNA sequence encoding the heterologous protein, and (iii) a second DNA sequence encoding a signal peptide. The second DNA sequence is operably linked to the transcriptional regulatory region and to the first DNA sequence. The first DNA sequence is in translation-frame with the second DNA sequence, and the two sequences encode a fusion protein.
The transformed cells are cultivated under conditions effective to induce the transcriptional regulatory region, thereby promoting expression of the fusion protein and secretion of the mature heterologous protein from the transformed cells. The mature heterologous protein produced by the transformed cells is then isolated.
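The arrangement described above (regulatory region, then a signal peptide coding sequence joined in translation-frame to the coding sequence of the mature protein) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the class name, field names, and the short nucleotide strings are hypothetical placeholders, not the sequences of this document.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list[str]:
    """Split a coding sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


@dataclass
class ChimericGene:
    # All three fields hold placeholder DNA strings (assumption for illustration).
    promoter: str    # inducible monocot transcriptional regulatory region
    signal_cds: str  # "second DNA sequence": encodes the signal peptide
    mature_cds: str  # "first DNA sequence": encodes the mature heterologous protein

    def fusion_cds(self) -> str:
        """Join the two coding sequences so they share one reading frame."""
        if not self.signal_cds.startswith("ATG"):
            raise ValueError("signal peptide coding sequence must begin with ATG")
        if len(self.signal_cds) % 3 or len(self.mature_cds) % 3:
            raise ValueError("each coding sequence must be a whole number of codons")
        if any(c in STOP_CODONS for c in codons(self.signal_cds)):
            raise ValueError("stop codon inside the signal peptide would truncate the fusion")
        # The mature coding sequence starts immediately after the last signal codon.
        return self.signal_cds + self.mature_cds


gene = ChimericGene(promoter="TATAAA", signal_cds="ATGGCTGCT", mature_cds="GAAGATTAA")
print(codons(gene.fusion_cds()))
```

The checks reflect only the in-frame requirement stated in the text; whether the signal peptide is actually cleaved at the junction is a biological property that a string check like this cannot establish.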
In one embodiment of the method, the first DNA sequence encodes pro-subtilisin BPN' (proBPN'), the cultivating includes cultivating the transformed cells at a pH between 5 and 6, and the isolating step includes incubating the proBPN' under conditions effective to allow its autoconversion to active mature BPN'. In another embodiment, the first DNA sequence encodes mature BPN', and the cells are transformed with a second chimeric gene containing (i) a transcriptional regulatory region inducible by addition or removal of a small molecule, (ii) a third DNA sequence encoding the pro-peptide moiety of BPN', and (iii) a fourth DNA sequence encoding a signal polypeptide. The fourth DNA sequence is operably linked to the transcriptional regulatory region and to the third DNA sequence, and the signal polypeptide is in translation-frame with the pro-peptide moiety and is effective to facilitate secretion of expressed pro-peptide moiety from the transformed cells. The cultivating step includes cultivating the transformed cells at a pH between 5 and 6, and the isolating step includes incubating the mature BPN' and the pro-moiety under conditions effective to allow the conversion of BPN' by the pro-moiety to active mature BPN'.
In another embodiment of the method, the signal peptide is the RAmy3D signal peptide (SEQ ID NO:1) or the RAmy1A signal peptide (SEQ ID NO:4). The coding sequence of the signal peptide may be a codon-optimized sequence, such as the codon-optimized RAmy3D sequence identified as SEQ ID NO:3. The first DNA sequence may also be codon-optimized.
Exemplary codon-optimized signal peptide-heterologous protein fusion protein coding sequences include 3D-AAT (SEQ ID NO:18), 3D-ATIII (SEQ ID NO:19), and 3D-HSA (SEQ ID NO:20). The first DNA sequence may further contain codon substitutions which eliminate one or more potential glycosylation sites present in the native amino acid sequence of the heterologous protein, such as the codon-optimized sequence encoding 3D-proBPN' (SEQ ID NO:21).
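The two sequence-level operations mentioned above, codon optimization and removal of potential glycosylation sites, can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration: the preferred-codon table is a hypothetical stand-in for a real host codon-usage table, the test peptide is invented, and replacing asparagine with glutamine is only one possible substitution, not one the text specifies. The only general fact relied on is that N-linked glycosylation sites follow the pattern Asn-X-Ser/Thr, where X is any residue except proline.

```python
import re

# Hypothetical preferred-codon table (assumption): one codon per residue,
# covering only the residues used in the example below.
PREFERRED_CODON = {
    "M": "ATG", "A": "GCC", "N": "AAC", "S": "TCC",
    "T": "ACC", "Q": "CAG", "K": "AAG", "P": "CCC",
}

# N-linked glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported individually.
SEQUON = re.compile(r"(?=N[^P][ST])")


def codon_optimize(protein: str) -> str:
    """Back-translate a protein using one preferred codon per residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


def find_sequons(protein: str) -> list[int]:
    """Return 0-based positions of the Asn in each potential N-glycosylation site."""
    return [m.start() for m in SEQUON.finditer(protein)]


def remove_sequons(protein: str, replacement: str = "Q") -> str:
    """Replace the Asn of every sequon (Asn to Gln by default, an assumed choice)."""
    residues = list(protein)
    for pos in find_sequons(protein):
        residues[pos] = replacement
    return "".join(residues)


peptide = "MANASKNPT"           # invented example, not a sequence from this document
print(find_sequons(peptide))    # the N at index 6 is followed by P, so it is skipped
edited = remove_sequons(peptide)
print(edited, codon_optimize(edited))
```

In practice the substitution would be chosen so that folding and activity are preserved, which this sketch does not attempt to assess.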
In other embodiments of the method, the transcriptional regulatory region may be a promoter derived from a rice or barley α-amylase gene, including RAmy1A, RAmy1B, RAmy2A, RAmy3A, RAmy3B, RAmy3C, RAmy3D, RAmy3E, pM/C, gKAmy141, gKAmy155, Amy32b, or HV18. The chimeric gene may further include, between the transcriptional regulatory region and the fusion protein coding sequence, the 5' untranslated region (5' UTR) of an inducible monocot gene such as one of the rice or barley α-amylase genes described above. One preferred 5' UTR is that from the RAmy1A gene, which is effective to enhance the stability of the gene transcript. The chimeric gene may further include, downstream of the coding sequence, the 3' untranslated region (3' UTR) from an inducible monocot gene, such as one of the rice or barley α-amylase genes mentioned above. One preferred 3' UTR is from the RAmy1A gene.
Where the method is employed in protein production in a monocot cell culture, preferred promoters are the RAmy3D and RAmy3E gene promoters, which are upregulated by sugar depletion in cell culture. Where the gene is employed in protein production in germinating seeds, a preferred promoter is the RAmy1A gene promoter, which is upregulated by gibberellic acid during seed germination. Where the gene is upregulated during seed maturation, a preferred promoter is the barley endosperm-specific B1-hordein promoter.
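The promoter preferences in the paragraph above amount to a small lookup from production system to promoter and inducing condition. A minimal Python sketch follows; the dictionary keys and the function name are hypothetical labels chosen for illustration, while the promoter names and inducing conditions are taken from the text.

```python
# Lookup table transcribed from the paragraph above. The key names
# ("cell_culture", etc.) are illustrative assumptions, not source terms.
PREFERRED_PROMOTERS = {
    "cell_culture": {
        "promoters": ["RAmy3D", "RAmy3E"],
        "induced_by": "sugar depletion",
    },
    "germinating_seed": {
        "promoters": ["RAmy1A"],
        "induced_by": "gibberellic acid",
    },
    "maturing_seed": {
        "promoters": ["B1-hordein (barley, endosperm-specific)"],
        "induced_by": "seed maturation",
    },
}


def preferred_promoters(system: str) -> dict:
    """Return the preferred promoters and inducing condition for a production system."""
    try:
        return PREFERRED_PROMOTERS[system]
    except KeyError:
        raise ValueError(f"unknown production system: {system!r}") from None


choice = preferred_promoters("cell_culture")
print(choice["promoters"], "induced by", choice["induced_by"])
```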
The invention also includes a mature heterologous protein produced by the above method. The protein has a glycosylation pattern characteristic of the monocot plant in which the protein is produced. The glycosylated protein is selected from the group consisting of (i) mature glycosylated α1-antitrypsin (AAT) having the same N-terminal amino acid sequence as mature AAT produced in humans and having a glycosylation pattern which increases serum half-life substantially over that of non-glycosylated mature AAT; (ii) mature glycosylated antithrombin III (ATIII) having the same N-terminal amino acid sequence as mature ATIII produced in humans; and (iii) mature glycosylated subtilisin BPN' (BPN') having the same N-terminal amino acid sequence as BPN' produced in Bacillus.
The invention also includes plant cells and seeds capable of producing the mature heterologous proteins according to the above method.
These and other objects and features of the invention will be more fully understood when the following detailed description of the invention is read in conjunction with the accompanying drawings.
Brief Description of the Figures
Fig. 1 shows, in the lower row, the amino acid sequence of a RAmy3D signal sequence portion employed in the invention, identified as SEQ ID NO:1; in the middle row, the corresponding native coding sequence, identified as SEQ ID NO:2; and in the upper row, a corresponding codon-optimized sequence, identified as SEQ ID NO:3;
Fig. 2 illustrates the components of a chimeric gene constructed in accordance with an embodiment of the invention;
Figs. 3A and 3B illustrate the construction of an exemplary transformation vector for use in transforming a monocot plant, for production of a mature protein in cell culture in accordance with one embodiment of the invention (native mature AAT coding sequence under control of the RAmy3D promoter and signal sequence);
Fig. 4 illustrates factors in the metabolic regulation of AAT production in rice cell culture;
Fig. 5 shows immunodetection of AAT using antibody raised against the C-terminal region of AAT;
Fig. 6 shows Western blot analysis of AAT produced by transformed rice cell lines 18F, 11B, and 27F;
Fig. 7 shows the time course of elastase:AAT complex formation in human and rice-produced forms of AAT;
Fig. 8 shows an N-terminal sequence for mature α1-antitrypsin (AAT) produced in accordance with the invention, identified herein as SEQ ID NO:22;
Fig. 9 shows a Western blot of ATIII produced in accordance with the invention;
Fig. 10 shows a Western blot of plant-produced BPN', comparing expression from codon-optimized and native coding sequences;
Fig. 11 compares the specific activity of BPN' codon-optimized (AP106) vs. BPN' native (AP101) expression in rice callus cell culture; and
Fig. 12 shows a Western blot of HSA produced in germinating seeds in accordance with the invention.
Brief Description of the Sequences
SEQ ID NO:1 is the amino acid sequence of the RAmy3D signal peptide;
SEQ ID NO:2 is the native sequence encoding the RAmy3D signal peptide;
SEQ ID NO:3 is a codon-optimized sequence encoding the RAmy3D signal peptide;
SEQ ID NO:4 is the amino acid sequence of the RAmy1A signal peptide;
SEQ ID NO:5 is the 5' UTR derived from the RAmy1A gene;
SEQ ID NO:6 is the 3' UTR derived from the RAmy1A gene;
SEQ ID NO:7 is the amino acid sequence of mature α1-antitrypsin (AAT);
SEQ ID NO:8 is the native DNA coding sequence of mature AAT;
SEQ ID NO:9 is the amino acid sequence of mature antithrombin III (ATIII);
SEQ ID NO:10 is the native DNA coding sequence of mature ATIII;
SEQ ID NO:11 is the amino acid sequence of mature human serum albumin (HSA);
Where the method is employed in protein production in a monocot cell culture, preferred promoters are the RAmy3D and RAmy3E gene promoters, which are upregulated by sugar depletion in cell culture. Where the gene is employed in grotein production in germinating seeds, a preferred promoter is the RAmyIA gene promoter, which is upregulated by gibbereIlic acid during seed germination. Where gene is upregulated during seed maturation, a preferred promoter is the barley endosperm-specific B1-hordein promoter.
The invention also includes a mature heterologous protein produced by the above method.
The protein has a glycosylation pattern characteristic of the monocot plant in which the protein is produced. The glycosyated protein is selected from the group consisting of (i) mature glycosylated o,1-antitrypsin (AAT) having the same N-terminal amino acid sequence as mature AAT produced in humans and having a glycosylation pattern which increases serum halflife substantially over that of non-glycosylated mature AAT; (ii) mature glycosylated antithrombin III (ATIII) having the same N-terminal amino acid sequence as mature ATIII produced in humans; and (iii}
mature glycosylated subtilisin BPN' {BPN') having the same N-terminal amino acid sequence as BPN' produced in Bacillus.
The invention also includes plant cells and seeds capable of producing the mature heterologous proteins according to the above method. ' These and other objects and features of the invention will be more fully understood when the following detailed description of the invention is read in conjunction with the accompanying 3o drawings.
Brief Description of the Figures
Fig. 1 shows, in the lower row, the amino acid sequence of a RAmy3D signal sequence portion employed in the invention, identified as SEQ ID NO:1; in the middle row, the corresponding native coding sequence, identified as SEQ ID NO:2; and in the upper row, a corresponding codon-optimized sequence, identified as SEQ ID NO:3;
Fig. 2 illustrates the components of a chimeric gene constructed in accordance with an embodiment of the invention;
Figs. 3A and 3B illustrate the construction of an exemplary transformation vector for use in transforming a monocot plant, for production of a mature protein in cell culture in accordance with one embodiment of the invention (native mature AAT coding sequence under control of the RAmy3D promoter and signal sequence);
Fig. 4 illustrates factors in the metabolic regulation of AAT production in rice cell culture;
Fig. 5 shows immunodetection of AAT using antibody raised against the C-terminal region of AAT;
Fig. 6 shows Western blot analysis of AAT produced by transformed rice cell lines 18F, 11B, and 27F;
Fig. 7 shows the time course of elastase:AAT complex formation in human and rice-produced forms of AAT;
Fig. 8 shows an N-terminal sequence for mature α1-antitrypsin (AAT) produced in accordance with the invention, identified herein as SEQ ID NO:22;
Fig. 9 shows a Western blot of ATIII produced in accordance with the invention;
Fig. 10 shows a Western blot of plant-produced BPN', comparing expression from codon-optimized and native coding sequences;
Fig. 11 compares the specific activity of BPN' codon-optimized (AP106) vs. BPN' native (AP101) expression in rice callus cell culture; and
Fig. 12 shows a Western blot of HSA produced in germinating seeds in accordance with the invention.
Brief Description of the Sequences
SEQ ID NO:1 is the amino acid sequence of the RAmy3D signal peptide;
SEQ ID NO:2 is the native sequence encoding the RAmy3D signal peptide;
SEQ ID NO:3 is a codon-optimized sequence encoding the RAmy3D signal peptide;
SEQ ID NO:4 is the amino acid sequence of the RAmy1A signal peptide;
SEQ ID NO:5 is the 5' UTR derived from the RAmy1A gene;
SEQ ID NO:6 is the 3' UTR derived from the RAmy1A gene;
SEQ ID NO:7 is the amino acid sequence of mature α1-antitrypsin (AAT);
SEQ ID NO:8 is the native DNA coding sequence of mature AAT;
SEQ ID NO:9 is the amino acid sequence of mature antithrombin III (ATIII);
SEQ ID NO:10 is the native DNA coding sequence of mature ATIII;
SEQ ID NO:11 is the amino acid sequence of mature human serum albumin (HSA);
SEQ ID NO:12 is the native DNA coding sequence of mature HSA;
SEQ ID NO:13 is the amino acid sequence of native proBPN';
SEQ ID NO:14 is the native DNA coding sequence of proBPN';
SEQ ID NO:15 is the amino acid sequence of the "pro" moiety of BPN';
SEQ ID NO:16 is the amino acid sequence of native mature BPN';
SEQ ID NO:17 is the amino acid sequence of a mature BPN' variant in which all potential N-glycosylation sites are removed according to Table 2;
SEQ ID NO:18 is a codon-optimized sequence encoding the RAmy3D signal sequence/mature α1-antitrypsin fusion protein;
SEQ ID NO:19 is a sequence encoding the RAmy3D signal sequence/mature antithrombin III fusion protein, with a codon-optimized RAmy3D coding sequence fused to the native mature ATIII coding sequence;
SEQ ID NO:20 is a sequence encoding the RAmy3D signal sequence/mature human serum albumin fusion protein, with a codon-optimized RAmy3D coding sequence fused to the native mature HSA coding sequence;
SEQ ID NO:21 is a codon-optimized sequence encoding the RAmy3D signal sequence/prosubtilisin BPN' fusion protein;
SEQ ID NO:22 is the N-terminal sequence of mature α1-antitrypsin produced in accordance with the invention;
SEQ ID NO:23 is an oligonucleotide used to prepare the intermediate p3DProSig construct of Example 1;
SEQ ID NO:24 is the complement of SEQ ID NO:23;
SEQ ID NO:25 is an oligonucleotide used to prepare the intermediate p3DProSigENDlink construct of Example 1;
SEQ ID NO:26 is the complement of SEQ ID NO:25;
SEQ ID NO:27 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:28 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:29 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:30 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:31 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:32 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:33 is the N-terminal primer used to PCR-amplify the AAT coding sequence according to Example 1; and
SEQ ID NO:34 is the C-terminal primer used to PCR-amplify the AAT coding sequence according to Example 1.
Detailed Description of the Invention
I. Definitions:
The terms below have the following meaning, unless indicated otherwise in the specification.
"Cell culture" refers to cells and cell clusters, typically callus cells, growing on or suspended in a suitable growth medium.
"Germination" refers to the breaking of dormancy in a seed and the resumption of metabolic activity in the seed, including the production of enzymes effective to break down starches in the seed endosperm.
"Inducible" means a promoter that is upregulated by the presence or absence of a small molecule. It includes both indirect and direct inducement.
"Inducible during germination" refers to promoters which are substantially silent, but not totally silent, prior to germination but are turned on substantially (greater than 25%) during germination and development in the seed. Examples of promoters that are inducible during germination are presented below.
"Small molecules", in the context of promoter induction, are typically small organic or bioorganic molecules less than about 1 kDal. Examples of such small molecules include sugars, sugar derivatives (including phosphate derivatives), and plant hormones (such as gibberellic or abscisic acid).
"Specifically regulatable" refers to the ability of a small molecule to preferentially affect transcription from one promoter or group of promoters (e.g., the α-amylase gene family), as opposed to non-specific effects, such as enhancement or reduction of global transcription within a cell by a small molecule.
"Seed maturation" or "grain development" refers to the period starting with fertilization in which metabolizable reserves, e.g., sugars, oligosaccharides, starch, phenolics, amino acids, and proteins, are deposited, with and without vacuole targeting, to various tissues in the seed (grain), e.g., endosperm, testa, aleurone layer, and scutellar epithelium, leading to grain enlargement, grain filling, and ending with grain desiccation.
"Inducible during seed maturation" refers to promoters which are turned on substantially (greater than 25%) during seed maturation.
"Heterologous DNA" or "foreign DNA" refers to DNA which has been introduced into plant cells from another source, or which is from a plant source, including the same plant source, but which is under the control of a promoter or terminator that does not normally regulate expression of the heterologous DNA.
"Heterologous protein" is a protein, including a polypeptide, encoded by a heterologous DNA. A "transcription regulatory region" or "promoter" refers to nucleic acid sequences that influence and/or promote initiation of transcription. Promoters are typically considered to include regulatory regions, such as enhancer or inducer elements.
A "chimeric gene," in the context of the present invention, typically comprises a promoter sequence operably linked to a DNA sequence that encodes a heterologous gene product, e.g., a selectable marker gene or a fusion protein gene. A chimeric gene may also contain further transcription regulatory elements, such as transcription termination signals, as well as translation regulatory signals, such as termination codons.
"Operably linked" refers to components of a chimeric gene or an expression cassette that function as a unit to express a heterologous protein. For example, a promoter operably linked to a heterologous DNA, which encodes a protein, promotes the production of functional mRNA corresponding to the heterologous DNA.
A "product" encoded by a DNA molecule includes, for example, RNA molecules and polypeptides.
"Removal" in the context of a metabolite includes both physical removal, as by washing, and the depletion of the metabolite through the absorption and metabolizing of the metabolite by the cells.
"Substantially isolated" is used in several contexts and typically refers to the at least partial purification of a protein or polypeptide away from unrelated or contaminating components. Methods and procedures for the isolation or purification of proteins or polypeptides are known in the art.
"Stably transformed" as used herein refers to a cereal cell or plant that has foreign nucleic acid stably integrated into its genome which is transmitted through multiple generations.
"α1-antitrypsin" or "AAT" refers to the protease inhibitor which has an amino acid sequence substantially identical or homologous to the AAT protein identified by SEQ ID NO:7.
"Antithrombin III" or "ATIII" refers to the heparin-activated inhibitor of thrombin and factor Xa, and which has an amino acid sequence substantially identical or homologous to the ATIII protein identified by SEQ ID NO:9.
"Human serum albumin" or "HSA" refers to a protein which has an amino acid sequence substantially identical or homologous to the mature HSA protein identified by SEQ ID NO:11.
"Subtilisin" or "subtilisin BPN'" or "BPN'" refers to the protease enzyme produced naturally by B. amyloliquefaciens, and having the sequence of SEQ ID NO:16, or a sequence homologous therewith.
"proBPN'" refers to a form of BPN' having an approximately 78 amino-acid "pro" moiety that functions as a chaperone polypeptide to assist in folding and activation of the BPN', and having the sequence in SEQ ID NO:13, or a sequence homologous therewith.
"Codon optimization" refers to changes in the coding sequence of a gene to replace native codons with those corresponding to optimal codons in the host plant.
A DNA sequence is "derived from" a gene, such as a rice or barley α-amylase gene, if it corresponds in sequence to a segment or region of that gene. Segments of genes which may be derived from a gene include the promoter region, the 5' untranslated region, and the 3' untranslated region of the gene.
II. Transformed plant cell
The plants used in the process of the present invention are derived from monocots, particularly the members of the taxonomic family known as the Gramineae. This family includes all members of the grass family of which the edible varieties are known as cereals. The cereals include a wide variety of species such as wheat (Triticum sps.), rice (Oryza sps.), barley (Hordeum sps.), oats (Avena sps.), rye (Secale sps.), corn (Zea sps.) and millet (Pennisetum sps.). In the present invention, preferred family members are rice and barley.
Plant cells or tissues derived from the members of the family are transformed with expression constructs (i.e., plasmid DNA into which the gene of interest has been inserted) using a variety of standard techniques (e.g., electroporation, protoplast fusion or microparticle bombardment). The expression construct includes a transcription regulatory region (promoter) whose transcription is specifically upregulated by the presence or absence of a small molecule, such as the reduction or depletion of sugar, e.g., sucrose, in culture medium, or in plant tissues, e.g., germinating seeds. In the present invention, particle bombardment is the preferred transformation procedure.
The construct also includes a gene encoding a mature heterologous protein in a form suitable for secretion from plant cells. The gene encoding the recombinant heterologous protein is placed under the control of a metabolically regulated promoter. Metabolically regulated promoters are those in which mRNA synthesis, or transcription, is repressed or upregulated by a small metabolite or hormone molecule, such as the rice RAmy3D and RAmy3E promoters, which are upregulated by sugar depletion in cell culture. For protein production in germinating seeds from regenerated transgenic plants, a preferred promoter is the RAmy1A promoter, which is upregulated by gibberellic acid during seed germination. The expression construct also utilizes additional regulatory DNA sequences, e.g., preferred codons and termination sequences, to promote efficient translation of AAT, as will be described.
A. Plant Expression Vector
Expression vectors for use in the present invention comprise a chimeric gene (or expression cassette), designed for operation in plants, with companion sequences upstream and downstream from the expression cassette. The companion sequences will be of plasmid or viral origin and provide necessary characteristics to the vector to permit the vectors to move DNA from bacteria to the desired plant host. Suitable transformation vectors are described in related application PCT WO 95/14099, published May 25, 1995, which is incorporated by reference herein.
Suitable components of the expression vector, including an inducible promoter, coding sequence for a signal peptide, coding sequence for a mature heterologous protein, and suitable termination sequences, are discussed below. One exemplary vector is the p3D(AAT)v1.0 vector illustrated in Figs. 3A and 3B.
A1. Promoters
The transcription regulatory or promoter region is chosen to be regulated in a manner allowing for induction under selected cultivation conditions, e.g., sugar depletion in culture or water uptake followed by gibberellic acid production in germinating seeds. Suitable promoters, and their method of selection, are detailed in above-cited PCT application WO 95/14099. Examples of such promoters include those that transcribe the cereal α-amylase genes and sucrose synthase genes, and are repressed or induced by small molecules, like sugars, sugar depletion or phytohormones such as gibberellic acid or abscisic acid. Representative promoters include the promoters from the rice α-amylase RAmy1A, RAmy1B, RAmy2A, RAmy3A, RAmy3B, RAmy3C, RAmy3D, and RAmy3E genes, and from the pM/C, gKAmy141, gKAmy155, Amy32b, and HV18 barley α-amylase genes. These promoters are described, for example, in ADVANCES IN PLANT BIOTECHNOLOGY, Ryu, D.D.Y., et al., Eds., Elsevier, Amsterdam, 1994, p. 37, and references cited therein. Other suitable promoters include the sucrose synthase and sucrose-6-phosphate-synthetase (SPS) promoters from rice and barley.
Other suitable promoters include promoters which are regulated in a manner allowing for induction under seed-maturation conditions. Examples of such promoters include those associated with the following monocot storage proteins: rice glutelins, oryzins, and prolamines, barley hordeins, wheat gliadins and glutelins, maize zeins and glutelins, oat glutelins, sorghum kafirins, millet pennisetins, and rye secalins.
A preferred promoter for expression in germinating seeds is the rice α-amylase RAmy1A promoter, which is upregulated by gibberellic acid. Preferred promoters for expression in cell culture are the rice α-amylase RAmy3D and RAmy3E promoters, which are strongly upregulated by sugar depletion in the culture. These promoters are also active during seed germination. A preferred promoter for expression in maturing seeds is the barley endosperm-specific B1-hordein promoter (Brandt, A., et al., (1985) Carlsberg Res. Commun. 50:333-345).
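The promoter preferences stated above can be summarized as a small lookup, shown here as a minimal Python sketch. The dictionary keys and the function name are hypothetical labels chosen for illustration; only the promoter names and their inducing conditions come from the text.

```python
# Hypothetical lookup of the preferred promoters named in the text,
# keyed by production setting. Key names are illustrative only.
PREFERRED_PROMOTERS = {
    "cell_culture": ("RAmy3D", "RAmy3E"),   # upregulated by sugar depletion
    "germinating_seed": ("RAmy1A",),        # upregulated by gibberellic acid
    "maturing_seed": ("B1-hordein",),       # barley endosperm-specific
}


def preferred_promoters(setting: str) -> tuple:
    """Return the preferred promoter names for a production setting."""
    try:
        return PREFERRED_PROMOTERS[setting]
    except KeyError:
        raise ValueError(f"unknown production setting: {setting!r}") from None
```

For example, `preferred_promoters("germinating_seed")` returns the one-element tuple containing `"RAmy1A"`.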
The chimeric gene may further include, between the promoter and coding sequences, the 5' untranslated region (5' UTR) of an inducible monocot gene, such as the 5' UTR derived from one of the rice or barley α-amylase genes mentioned above. One preferred 5' UTR is that derived from the RAmy1A gene, which is effective to enhance the stability of the gene transcript. This 5' UTR has the sequence given by SEQ ID NO:5 herein.
A2. Signal Sequences
In addition to encoding the protein of interest, the chimeric gene encodes a signal sequence (or signal peptide) that allows processing and translocation of the protein, as appropriate. Suitable signal sequences are described in above-referenced PCT application WO 95/14099. One preferred signal sequence is identified as SEQ ID NO:1 and is derived from the RAmy3D promoter. Another preferred signal sequence is identified as SEQ ID NO:4 and is derived from the RAmy1A promoter.
The plant signal sequence is placed in frame with a heterologous nucleic acid encoding a mature protein, forming a construct which encodes a fusion protein having an N-terminal region corresponding to the signal peptide and, immediately adjacent to the C-terminal amino acid of the signal peptide, the N-terminal amino acid of the mature heterologous protein.
The expressed fusion protein is subsequently secreted and processed by signal peptidase cleavage precisely at the junction of the signal peptide and the mature protein, to yield the mature heterologous protein.
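The in-frame junction described above can be illustrated with a minimal Python sketch. It assumes both coding sequences are supplied as DNA strings; the function names and the short toy sequences in the usage note are placeholders for illustration and are not the RAmy3D signal sequence or any sequence from the sequence listing.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(signal_cds: str, mature_cds: str) -> str:
    """Join a signal-peptide coding sequence directly to a mature-protein
    coding sequence so that both are read in the same frame."""
    signal_cds = signal_cds.upper()
    mature_cds = mature_cds.upper()
    for name, seq in (("signal", signal_cds), ("mature", mature_cds)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} coding sequence is not a whole number of codons")
    if not signal_cds.startswith("ATG"):
        raise ValueError("signal coding sequence should begin with a start codon")
    fusion = signal_cds + mature_cds
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    # A stop codon before the final codon would truncate the fusion protein.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature stop codon in fusion")
    return fusion


def junction_codon_index(signal_cds: str) -> int:
    """0-based index of the first mature-protein codon within the fusion."""
    return len(signal_cds) // 3
```

With the toy inputs `"ATGGCCAAG"` (three codons) and `"GAGGACCCGTAA"`, the fusion is the simple concatenation, and the first codon of the mature protein sits at codon index 3, immediately after the last signal-sequence codon. Signal peptidase cleavage itself is a cellular event and is not modeled here.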
In another embodiment of the invention, the coding sequence in the fusion protein gene, in at least the coding region for the signal sequence, may be codon-optimized for optimal expression in plant cells, e.g., rice cells, as described below. The upper row in Fig. 1 shows one codon-optimized coding sequence for the RAmy3D signal sequence, identified herein as SEQ ID NO:3.
A3. Naturally-Occurring Heterologous Protein Coding Sequences
(i) α1-Antitrypsin: Mature human AAT is composed of 394 amino acids, having the sequence identified herein as SEQ ID NO:7. The protein has N-glycosylation sites at asparagines 46, 83 and 247. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:8.
(ii) Antithrombin III: Mature human ATIII is composed of 432 amino acids, having the sequence identified herein as SEQ ID NO:9. The protein has N-glycosylation sites at the four asparagine residues 96, 135, 155, and 192. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:10.
(iii) Human serum albumin: Mature HSA as found in human serum is composed of amino acids, having the sequence identified herein as SEQ ID NO:11. The protein has no N-linked glycosylation sites. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:12.
(iv) Subtilisin BPN': Native proBPN' as produced in B. amyloliquefaciens is composed of 352 amino acids, having the sequence identified herein as SEQ ID NO:13. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:14. The proBPN' polypeptide contains a 77 amino acid "pro" moiety which is identified herein as SEQ ID NO:15. The remainder of the polypeptide, which forms the mature active BPN', is a 275 amino acid sequence identified herein by SEQ ID NO:16. Native BPN' as produced in Bacillus is not glycosylated.
A4. Codon-Optimized Coding Sequences
In accordance with one aspect of the invention, it has been discovered that a severalfold enhancement of expression level can be achieved in plant cell culture by modifying the native coding sequence of a heterologous gene to contain predominantly or exclusively the highest-frequency codons found in the plant cell host.
The method will be illustrated for expression of a heterologous gene in rice plant cells, it being recognized that the method is generally applicable to any monocot. As a first step, a representative set of known coding gene sequences from rice is assembled. The sequences are then analyzed for codon frequency for each amino acid, and the most frequent codon is selected for each amino acid. This approach differs from earlier reported codon matching methods, in which more than one frequent codon is selected for at least some of the amino acids. The optimal codons selected in this manner for rice and barley are shown in Table 1.
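The codon-frequency step described above might be sketched in Python as follows. This is an illustrative sketch, not the procedure actually used to derive Table 1: the function name is hypothetical, the input is assumed to be a list of in-frame coding sequences, and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict

# Standard genetic code (DNA alphabet), built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def most_frequent_codons(coding_sequences):
    """Tally codon usage per amino acid across in-frame coding sequences and
    return the single most frequent codon for each amino acid ('*' = stop)."""
    counts = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper().replace("U", "T")
        usable = len(cds) - len(cds) % 3      # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            amino_acid = CODON_TABLE.get(codon)
            if amino_acid is not None:        # skip codons with ambiguous bases
                counts[amino_acid][codon] += 1
    return {aa: tally.most_common(1)[0][0] for aa, tally in counts.items()}
```

Selecting exactly one codon per amino acid mirrors the distinction the text draws from codon matching methods that keep more than one frequent codon. How ties are broken is not specified in the text; this sketch simply keeps whichever tied codon `Counter` reports first.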
Table 1

Amino Acid    Rice Preferred Codon    Barley Preferred Codon
Ala  A        GCC
Arg  R        CGC
Asn  N        AAC
Asp  D        GAC
Cys  C        UGC
Gln  Q        CAG
Glu  E        GAG
Gly  G        GGC
His  H        CAC
Ile  I        AUC
Leu  L        CUC
Lys  K        AAG
Phe  F        UUC
Pro  P        CCG                     CCC
Ser  S        AGC                     UCC
Thr  T        ACC
Tyr  Y        UAC
Val  V        GUC                     GUG
stop          UAA                     UGA
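As an illustration of how Table 1 could be applied, the following Python sketch back-translates an amino acid sequence using one codon per residue. It rests on stated assumptions: where Table 1 lists two codons, the first is taken as the rice codon (following the column order), and the codons for Met and Trp, which Table 1 omits, are taken from the standard genetic code. The names are hypothetical.

```python
# Rice codons as read from Table 1 (RNA alphabet, one-letter amino acid keys).
RICE_PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "UGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "AUC",
    "L": "CUC", "K": "AAG", "F": "UUC", "P": "CCG", "S": "AGC",
    "T": "ACC", "Y": "UAC", "V": "GUC",
    # Assumption: Met and Trp are not in Table 1; each has a single codon
    # in the standard genetic code.
    "M": "AUG", "W": "UGG",
}
RICE_STOP = "UAA"


def back_translate(protein: str, as_dna: bool = True) -> str:
    """Encode a protein with one preferred codon per residue plus a stop codon."""
    codons = []
    for residue in protein.upper():
        if residue not in RICE_PREFERRED:
            raise ValueError(f"no preferred codon for residue {residue!r}")
        codons.append(RICE_PREFERRED[residue])
    codons.append(RICE_STOP)
    rna = "".join(codons)
    return rna.replace("U", "T") if as_dna else rna
```

For the toy peptide `"MAS"`, the sketch yields the codons AUG, GCC, AGC and the stop codon UAA, returned as DNA by default. A real design would also need to screen the result for unwanted features, which is outside the scope of this sketch.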
As indicated above, the fusion protein coding sequence in the chimeric gene is constructed such that the final (C-terminal) codon in the signal sequence is immediately followed by the codon for the N-terminal amino acid in the mature form of the heterologous protein.
Exemplary fusion protein genes, in accordance with the present invention, are identified herein as follows:
SEQ ID NO:18, corresponding to codon-optimized coding sequences of the fusion protein consisting of RAmy3D signal sequence/mature α1-antitrypsin;
SEQ ID NO:19, corresponding to the fusion protein coding sequence consisting of the codon-optimized RAmy3D signal sequence and the native mature antithrombin III sequence;
SEQ ID NO:20, corresponding to the fusion protein coding sequence consisting of the codon-optimized RAmy3D signal sequence and the native mature human serum albumin sequence;
SEQ ID NO:21, corresponding to codon-optimized coding sequence of the fusion protein RAmy3D signal sequence/prosubtilisin BPN'. In this instance, prosubtilisin is considered the "mature" protein, in that secreted prosubtilisin can autocatalyze to active, mature subtilisin.
In a preferred embodiment, the BPN' coding sequence is further modified to eliminate potential N-glycosylation sites, as native BPN' is not glycosylated. Table 2 illustrates preferred codon substitutions, which eliminate all potential N-glycosylation sites in subtilisin BPN'. SEQ ID NO:17 corresponds to a mature BPN' amino acid sequence containing the substitutions presented in Table 2.

Table 2

N-Glycosylation Sites    Location (Asn) (in mature protein)    Amino Acid Substitution
Asn Asn Ser              61                                    Thr Asn Ser
Asn Asn Ser              76                                    Thr Asn Ser
Asn Met Ser              123                                   Thr Met Ser
Asn Gly Thr              218                                   Ser Gly Thr†
Asn Trp Thr              240                                   Thr Trp Thr

†Improved thermostability; Bryan, et al., Proteins: Structure, Function, and Genetics 1:326 (1986).
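The sites in Table 2 all follow the Asn-X-Ser/Thr pattern. A minimal Python sketch for locating such potential N-glycosylation sites, and for checking that a substitution at the Asn removes them, is given below. The exclusion of proline at the middle position is the general sequon rule and is an addition not stated in the text; the function names and the toy peptide in the usage note are hypothetical, not the BPN' sequence.

```python
import re

# Potential N-linked sequon: Asn-X-Ser/Thr, with X being any residue except Pro.
# The lookahead allows overlapping candidates to be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list:
    """Return 1-based positions of Asn residues that begin an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


def substitute(protein: str, position: int, new_residue: str) -> str:
    """Replace the residue at a 1-based position with a new residue."""
    index = position - 1
    return protein[:index] + new_residue + protein[index + 1:]
```

On the toy peptide `"GANNSAKNGTP"`, `find_sequons` reports positions 3 and 8. Replacing the Asn at position 3 with Thr and the Asn at position 8 with Ser, in the manner of Table 2, leaves no sequon in the peptide.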
A5. Transcription and Translation Terminators
The chimeric gene may also include, downstream of the coding sequence, the 3' untranslated region (3' UTR) from an inducible monocot gene, such as one of the rice or barley α-amylase genes mentioned above. One preferred 3' UTR is that derived from the RAmy1A gene, whose sequence is given by SEQ ID NO:6. This sequence includes non-coding sequence 5' to the polyadenylation site, the polyadenylation site, and the transcription termination sequence. The transcriptional termination region may be selected, particularly for stability of the mRNA to enhance expression. Polyadenylation tails (Alber and Kawasaki, 1982, Mol. and Appl. Genet. 1:419-434) are also commonly added to the expression cassette to optimize high levels of transcription and proper transcription termination, respectively.
Polyadenylation sequences include but are not limited to the Agrobacterium octopine synthetase signal (Gielen, et al., EMBO J. 3:835-846 (1984)) or the nopaline synthase of the same species (Depicker, et al., Mol. Appl. Genet. 1:561-573 (1982)).
Since the ultimate expression of the heterologous protein will be in a eukaryotic cell (in this case, a member of the grass family), it is desirable to determine whether any portion of the cloned gene contains sequences which will be processed out as introns by the host's splicing machinery. If so, site-directed mutagenesis of the "intron" region may be conducted to prevent losing a portion of the genetic message as a false intron code (Reed and Maniatis, Cell 41:95-105 (1985)).
Fig. 2 shows the elements of one preferred chimeric gene constructed in accordance with the invention, and intended particularly for use in protein expression in a rice cell suspension culture. The gene includes, in a 5' to 3' direction, the promoter from the RAmy3D gene, which is inducible in cell culture with sugar depletion, the 5' UTR from the RAmy1A gene, which confers enhanced stability on the gene transcript, the RAmy3D signal sequence coding region, as identified above, the coding region of a heterologous protein to be produced, and a 3' UTR region from the RAmy1A gene.
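The 5' to 3' order of elements just described can be captured in a small data structure. The Python sketch below is illustrative only: the class and function names are hypothetical, and the elements are represented by their names rather than by actual sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    role: str     # e.g. "promoter", "5' UTR"
    source: str   # gene (or protein) the element is taken from


def fig2_cassette(heterologous_protein: str) -> tuple:
    """Elements of the chimeric gene described for Fig. 2, listed 5' to 3'."""
    return (
        Element("promoter", "RAmy3D"),         # induced by sugar depletion
        Element("5' UTR", "RAmy1A"),           # enhances transcript stability
        Element("signal sequence", "RAmy3D"),
        Element("coding region", heterologous_protein),
        Element("3' UTR", "RAmy1A"),
    )


def describe(cassette) -> str:
    """One-line summary of a cassette in 5' to 3' order."""
    return " - ".join(f"{element.source} {element.role}" for element in cassette)
```

Calling `describe(fig2_cassette("AAT"))` lists the five elements in order, from the RAmy3D promoter through the RAmy1A 3' UTR. Keeping the elements in an ordered, immutable tuple reflects that their relative positions matter for the gene to function as a unit.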
BI. Plant Transformation For transformation of plants, the chimeric gene is placed in a suitable expression vector designed for operation in plants. The vector includes suitable elements of plasmid or viral origin that provide necessary characteristics to the vector to permit the vectors to move DNA from bacteria to the desired plant host. Suitable transformation vectors are described in related application PCT
WO 95/14099, published May 25, 1995, which is incorporated by reference herein. Suitable components of the expression vector, inciuding the chimeric gene described above, are discussed below. One exemplary vector is the p3Dv1.0 vector described in Example 1.
A. Transformation Vector Vectors containing a chimeric gene of the present invention may also include selectable markers for use in plant cells (such as the nptIl kanamycin resistance gene, for selection in kanamycin-containing or the phosphinothricin acetyltransferase gene, for selection in medium containing phosphinothricin (PPT).
The vectors may also include sequences that allow their selection and propagation in a secondary host, such as sequences containing an origin of replication and a selectable marker such as antibiotic or herbicide resistance genes, e.g., HPH (Hagio et al., Plant Cell Reports [volume illegible]:329 (1995); van der Elzer, Plant Mol. Biol. [volume illegible]:299-302 (1985)). Typical secondary hosts include bacteria and yeast. In one embodiment, the secondary host is Escherichia coli, the origin of replication is a ColE1-type, and the selectable marker is a gene encoding ampicillin resistance. Such sequences are well known in the art and are commercially available as well (e.g., Clontech, Palo Alto, CA; Stratagene, La Jolla, CA).
The vectors of the present invention may also be modified to intermediate plant transformation plasmids that contain a region of homology to an Agrobacterium tumefaciens vector, a T-DNA border region from Agrobacterium tumefaciens, and chimeric genes or expression cassettes (described above). Further, the vectors of the invention may comprise a disarmed plant tumor inducing plasmid of Agrobacterium tumefaciens.
The vector described in Example 1, and having a promoter from the RAmy3D gene, is suitable for use in a method of mature protein production in cell culture, where the RAmy3D promoter is induced by sugar depletion in cell culture medium. Other promoters may be selected for other applications, as indicated above. For example, for mature protein expression in germinating seeds, the coding sequence may be placed under the control of the rice α-amylase RAmy1A promoter, which is inducible by gibberellic acid during seed germination.
B. Transformation of Plant Cells
Various methods for direct or vectored transformation of plant cells, e.g., plant protoplast cells, have been described, e.g., in above-cited PCT application WO 95/14099.
As noted in that reference, promoters directing expression of selectable markers used for plant transformation (e.g., nptII) should operate effectively in plant hosts. One such promoter is the nos promoter from native Ti plasmids (Herrera-Estrella, et al., Nature 303:209-213 (1983)). Others include the 35S and 19S promoters of cauliflower mosaic virus (Odell, et al., Nature 313:810-812 (1985)) and the 2' promoter (Velten, et al., EMBO J. 3:2723-2730 (1984)).
In one preferred embodiment, the embryo and endosperm of mature seeds are removed to expose scutellum tissue cells. The cells may be transformed by DNA bombardment or injection, or by vectored transformation, e.g., by Agrobacterium infection after bombarding the scutellar cells with microparticles to make them susceptible to Agrobacterium infection (Bidney et al., Plant Mol. Biol. 18:301-313, 1992).
One preferred transformation follows the methods detailed generally in Sivamani, E. et al., Plant Cell Reports [volume illegible]:465 (1996); Zhang, S., et al., Plant Cell Reports 15:465 (1996); and Li, L., et al., Plant Cell Reports 12:250 (1993). Briefly, rice seeds are sterilized by standard methods, and callus induction from the seeds is carried out on NB media with 2,4-D. During a first incubation period, callus tissue forms around the embryo of the seed. By the end of the incubation period (e.g., 14 days at 28°C), the calli are about 0.25 to 0.5 cm in diameter. Callus mass is then detached from the seed, placed on fresh NB media, and incubated again for about 14 days at 28°C. After the second incubation period, satellite calli developed around the original "mother" callus mass.
These satellite calli were slightly smaller, more compact and defined than the original tissue. It was these calli that were transferred to fresh media. The "mother" callus was not transferred. The goal was to select only the strongest, most vigorously growing tissue for further culture.
Calli to be bombarded are selected from 14-day-old subcultures. The size, shape, color and density are all important in selecting calli in the optimal physiological condition for transformation.
The calli should be between 0.8 and 1.1 mm in diameter. The calli should appear as spherical masses with a rough exterior.
Transformation is by particle bombardment, as detailed in the references cited above. After the transformation steps, the cells are typically grown under conditions that permit expression of the selectable marker gene. In a preferred embodiment, the selectable marker gene is HPH. It is preferred to culture the transformed cells under multiple rounds of selection to produce a uniformly stable transformed cell line.
IV. Cell Culture Production of Mature Heterologous Protein
Transgenic cells, typically callus cells, are cultured under conditions that favor plant cell growth, until the cells reach a desired cell density, then under conditions that favor expression of the mature protein under the control of the given promoter. Preferred culture conditions are described below and in Example 2. Purification of the mature protein secreted into the medium is by standard techniques known by those of skill in the art.
Production of mature AAT: In a preferred embodiment, the culture medium contains a phosphate buffer, e. g., the 20 mM phosphate buffer, pH 6.8 described in Example 2, to reduce AAT degradation catalyzed by metals. Alternatively, or in addition, a metal chelating agent, such as EDTA, may be added to the medium.
Following the cell culture method described in Example 2, cell culture media was partially purified and the fraction containing AAT was analyzed by Western blot, as shown in Fig. 4. The first two lanes ("phosphate") show AAT bands both in the presence and absence of elastase ("+E" and "-E"), where the higher molecular weight bands in the presence of elastase correspond roughly to a 58-59 kdal AAT/elastase complex. Also as seen in the figure, expression was high in the absence of sucrose, but nearly undetectable in the presence of sucrose.
To ascertain the degree of glycosylation (as determined by apparent molecular weight by SDS-PAGE), the protein produced in culture was fractionated by SDS-PAGE and immunodetected with a labeled antibody raised against the C-terminal portion of AAT, as shown in Fig. 5. Lane 4 contains human AAT, and its migration position corresponds to about 52 kdal. In lane 3 is the plant-produced AAT, having an apparent molecular weight of about 49-50 kdal, indicating an extent of glycosylation of up to 60-80% of the glycosylation found in human AAT (non-glycosylated AAT has a molecular weight of 45 kdal).
Similar results are shown in the Western blots in Fig. 6. Lanes 1-3 in this figure correspond to decreasing amounts (15, 10, and 5 ng) of human AAT; lane 4, to 10 µl supernatant from a non-expressing plant cell line; lanes 5 and 6, to 10 µl supernatant from AAT-expressing plant cell lines 11B and 27F, respectively; and lane 7, to 10 µl supernatant from cell line 27F plus 250 ng trypsin. The upward mobility shift in lane 7 is indicative of association between trypsin and the plant-produced AAT.
The ability of plant-produced AAT to bind to elastase is demonstrated in Fig. 7, which shows the shift in molecular weight over a 30 minute binding interval for the 52 kdal human AAT (lanes 1-4) and the 49-50 kdal plant-produced AAT.
To demonstrate that the mature protein is produced in secreted form, with the desired N-terminus, a chimeric gene constructed as above, and having the coding sequence for mature α1-antitrypsin, was expressed and secreted in cell culture as described in Example 2. The isolated protein was then sequenced at its N-terminal region, yielding the N-terminal sequence shown in Fig. 8. This sequence, which is identified herein as SEQ ID NO:22, has the same N-terminal residues as native mature α1-antitrypsin.
Production of mature ATIII: In a preferred embodiment, the culture medium contains a MES buffer, pH 6.8. Western blot analysis of the ATIII protein produced, shown in lanes 4 and 6 in Fig. 9, shows a band corresponding to ATIII (lane 1) in cell lines 42 and 46, when grown in the absence (but not in the presence) of sucrose.
Production of mature BPN': In one embodiment of the invention, in which BPN' is secreted as the proBPN' form of the enzyme, the chaperon "pro" moiety of the enzyme facilitates enzyme folding and is cleaved from the enzyme, leaving the active mature form of BPN'. In another embodiment, the mature enzyme is co-expressed and co-secreted with the "pro" chaperon moiety, with conversion of the enzyme to active form occurring in the presence of the free chaperon (Eder et al., Biochem. (1993) 32:18-26; Eder et al., (1993) J. Mol. Biol. 223:293-304).
In yet another embodiment of the invention, the BPN' is secreted in inactive form at a pH
that may be in the 6-8 range, with subsequent activation of the inactive form, e.g., after enzyme isolation, by exposure to the "pro" chaperon moiety, e. g., immobilized to a solid support.
In both of these embodiments, the culture medium is maintained at a pH of between 5 and 6, preferably about 5.5 during the period of active expression and secretion of BPN', to keep the BPN', which is normally active at alkaline pH, at a pH below optimal activity.
Codon optimization to the host plant's most frequent codons yielded a severalfold enhancement in the level of expressed heterologous protein in cell culture, as shown in Fig. 11. The extent of enhancement is seen from the Western blot analysis shown in Fig. 10 for two cell lines and further substantiated in Fig. 11. Lane 2 (second from left) in Fig. 10 shows a Western blot of BPN' obtained in culture from cells transformed with a native proBPN' coding sequence. Two bands are observed. The lower band is a protein whose approximately 35 kdal molecular weight corresponds to that of proBPN'. The upper band corresponds to a somewhat higher molecular weight species, possibly glycosylated.
The first lane in the figure shows BPN' polypeptides produced in culture by plant cells transformed with the codon-optimized proBPN' sequence identified by SEQ ID NO:21. For comparative purposes, the same volume of culture medium, adjusted for cell density, was applied in both lanes 1 and 2. As seen, the amount of BPN' enzyme produced with a codon-optimized sequence was severalfold higher than for subtilisin BPN' produced with the native coding sequence.
Further, a dark band or bands corresponding to mature peptide (molecular weight 27.5 kdal) was observed. However, it should be noted that directly above the band at 35 kdal is a more pronounced band which may be pro-mature product yet to be cleaved into active form.
Fig. 11 compares the specific activity of BPN' codon-optimized (AP106) versus BPN' native (AP101) expression in rice callus cell culture, assayed using the chromogenic peptide substrate suc-Ala-Ala-Pro-Phe-pNA as described by DelMar, E.G. et al. (1979; Anal. Biochem. 99:316-320). As shown in Fig. 11, several of the cell lines transformed with codon-optimized chimeric genes produced levels of BPN', as evidenced by measured specific activity in culture medium, that were 2-5 times the highest levels observed for plant cells transformed with native proBPN' sequence.
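The idea of recoding a protein with the host's most frequent codons can be sketched as follows. This is a minimal illustration under stated assumptions: the table `TOY_USAGE` holds made-up placeholder frequencies for four symbols only (it is not measured rice codon usage), and `optimize_codons` is a hypothetical helper name. The optimized sequence actually used is the one identified as SEQ ID NO:21.

```python
# Placeholder codon frequencies per amino acid (one-letter code; "*" = stop).
# These numbers are illustrative assumptions, not real host data.
TOY_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAG": 0.7, "AAA": 0.3},
    "A": {"GCC": 0.4, "GCG": 0.3, "GCT": 0.2, "GCA": 0.1},
    "*": {"TGA": 0.5, "TAA": 0.3, "TAG": 0.2},
}


def optimize_codons(protein, usage):
    """Back-translate a protein using the most frequent codon for each residue."""
    codons = []
    for residue in protein:
        table = usage[residue]  # KeyError if the residue is missing from the table
        codons.append(max(table, key=table.get))
    return "".join(codons)


print(optimize_codons("MKA*", TOY_USAGE))
```

A practical scheme would typically also screen the result for unwanted restriction sites or cryptic intron signals; that step is outside this sketch.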
In accordance with another aspect of the invention, it has been found that the transformed plant cell culture is able to express and secrete BPN' at a cell culture pH, pH 5.5, which largely inhibits self-degradation of mature, active BPN'. To assay for optimal pH conditions, the assay disclosed in DelMar, et al. (supra) is used to test the media derived from BPN' transformed cell lines under various pH conditions. Transformed rice callus cells are cultured in a MES medium under similar conditions as disclosed in Example 2, but where the pH of the medium is maintained at a selected pH between 5 and 8.0. At each pH, the total amount of expressed and secreted BPN' is determined by Western blot analysis. BPN' activity can be tested in the assay described by DelMar (supra).
V. Production of Mature Heterologous Protein in Germinating Seeds
In this embodiment, monocot cells transformed as above are used to regenerate plants, seeds from the plants are harvested and then germinated, and the mature protein is isolated from the germinated seeds.
Plant regeneration from cultured protoplasts or callus tissue is carried out by standard methods, e.g., as described in Evans et al., HANDBOOK OF PLANT CELL CULTURES, Vol. 1 (MacMillan Publishing Co., New York, 1983); and Vasil I.R. (ed.), CELL CULTURE AND SOMATIC CELL GENETICS OF PLANTS, Acad. Press, Orlando, Vol. I, 1984, and Vol. III, 1986, and as described in the above-cited PCT application.
A. Seed Germination Conditions
The transgenic seeds obtained from the regenerated plants are harvested, and prepared for germination by an initial steeping step, in which the seeds are immersed in or sprayed with water to increase the moisture content of the seed to between 35-45%. This initiates germination. Steeping typically takes place in a steep tank which is typically fitted with a conical end to allow the seed to flow freely out. The addition of compressed air to oxygenate the steeping process is an option. The temperature is controlled at approximately 22°C depending on the seed.
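The steeping target of 35-45% moisture translates into a water uptake per batch of seed by simple mass balance. The sketch below assumes moisture content is expressed on a wet-weight basis (water mass divided by total mass), which the text does not state; the function name and the 12% starting moisture in the usage line are illustrative assumptions.

```python
def water_to_add(seed_mass_kg, initial_moisture, target_moisture):
    """Mass of water (kg) the seed must take up to reach the target moisture.

    Moisture values are wet-basis fractions (0-1). Dry matter is conserved:
    dry = seed_mass * (1 - initial); final_mass = dry / (1 - target).
    """
    if not 0.0 <= initial_moisture < target_moisture < 1.0:
        raise ValueError("require 0 <= initial < target < 1")
    return seed_mass_kg * (target_moisture - initial_moisture) / (1.0 - target_moisture)


# Example: 100 kg of seed at an assumed 12% moisture, steeped to 40%.
print(round(water_to_add(100.0, 0.12, 0.40), 1))
```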
After steeping, the seeds are transferred to a germination compartment which contains air saturated with water and is under controlled temperature and air flows. The typical temperatures are between 12-25°C and germination is permitted to continue for from 3 to 7 days.
Where the heterologous protein coding gene is operably linked to an inducible promoter requiring a metabolite such as sugar or plant hormone, e.g., 2 to 100 µM gibberellic acid, this metabolite is added, removed or depleted from the steeping water medium and/or is added to the water saturated air used during germination. The seed absorbs the aqueous medium and begins to germinate, expressing the heterologous protein. The medium may then be withdrawn and the malting begun, by maintaining the seeds in a moist temperature controlled aerated environment. In this way, the seeds may begin growth prior to expression, so that the expressed product is less likely to be partially degraded or denatured during the process.
More specifically, the temperature during the imbibition or steeping phase will be maintained in the range of about 15-25°C, while the temperature during the germination will usually be about 20°C. The time for the imbibition will usually be from about 1 to 4 days, while the germination time will usually be an additional 1 to 10 days, more usually 3 to 7 days. Usually, the time for the malting does not exceed about ten days. The period for the malting can be reduced by using plant hormones during the imbibition, particularly gibberellic acid.
To achieve maximum production of recombinant protein from malting, the malting procedure may be modified to accommodate de-hulled and de-embryonated seeds, as described in above-cited PCT application WO 95/14099. In the absence of sugars from the endosperm, there is expected to be a 5 to 10 fold increase in RAmy3D promoter activity and thus expression of heterologous protein. Alternatively, when embryoless half seeds are incubated in 10 mM CaCl2 and 5 µM gibberellic acid, there is a 50 fold increase in RAmy1A promoter activity.
Production of mature HSA: Following the germination conditions as outlined above and further detailed in Example 3, supernatant was analyzed by Western blot.
Western blot analysis shows production of HSA in germinating rice seeds, with seed samples taken 24, 72, and 120 hours after induction with gibberellin. HSA production was highest approximately 24 hours post-induction (lanes 3 and 4, Fig. 12). Bilirubin binding, a measure of correct folding of plant-produced HSA, is assayed according to the method presented in Example 3.
VI. Production of Mature Heterologous Protein in Maturing Seeds
In this embodiment, monocot cells transformed as above are used to regenerate plants, and seeds from the plants are allowed to mature, typically in the field, with consequent production of heterologous protein in the seeds.
Following seed maturation, the seeds and their heterologous proteins may be used directly, that is, without protein isolation, where for example, the heterologous protein is intended to confer a benefit on the seed as a whole, for example, to enrich the seed in the selected protein.
Alternatively, the seeds may be fractionated by standard methods to obtain the heterologous protein in enriched or purified form. In one general approach, the seed is first milled, then suspended in a suitable extraction medium, e.g., an aqueous or an organic solvent, to extract the protein or metabolite of interest. If desired, the heterologous protein can be further fractionated and purified, using standard purification methods.
The following examples are provided by way of illustration only and not by way of limitation. Those of skill will readily recognize a variety of noncritical parameters which could be changed or modified to yield essentially similar results.
General Methods
Generally, the nomenclature and laboratory procedures with respect to standard recombinant DNA technology can be found in Sambrook, et al., MOLECULAR CLONING - A LABORATORY MANUAL, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 1989 and in S.B. Gelvin and R.A. Schilperoort, PLANT MOLECULAR BIOLOGY, 1988. Other general references are provided throughout this document. The procedures therein are known in the art and are provided for the convenience of the reader.
Example 1
Construction of a Transforming Vector Containing a Codon-Optimized α1-antitrypsin Sequence
A. Hygromycin Resistance Gene Insertion:
The 3 kb BamHI fragment containing the 35S promoter-Hph-NOS was removed from the plasmid pMON410 (Monsanto, St. Louis, MO) and placed into a BglII site, created by site-directed mutagenesis, in pUC18 at position 1463 to form the plasmid pUCH18+.
B. Terminator Insertion:
pOSg1ABKS is a 5 kb BamHI-KpnI fragment from lambda clone λOSg1A (Huang, N., et al., (1990) Nuc. Acids Res. 18:7007) cloned into pBluescript KS- (Stratagene, San Diego, CA).
Plasmid pOSg1ABKS was digested with MspI and blunted with T4 DNA polymerase followed by SpeI digestion. The 350 bp terminator fragment was subcloned into pUC19 (New England BioLabs, Beverly, MA), which had been digested with BamHI, blunted with T4 DNA polymerase and digested with XbaI, to form pUC19/terminator.
C. RAmy3D Promoter Insertion:
A 1.1 kb NheI-PstI fragment derived from p1AS1.5 (Huang, N. et al. (1993) Plant Mol. Biol. 23:737-747), was cloned into the vector pGEM5zf [multiple cloning site (MCS) (Promega, Madison, WI): ApaI, AatII, SphI, NcoI, SstII, EcoRV, SpeI, NotI, PstI, SalI, NdeI, SacI, MluI, NsiI] at the SpeI and PstI sites to form pGEM5zf(3D/NheI-PstI). pGEM5zf(3D/NheI-PstI) was then digested with PstI and SacI, and two non-kinased 30mers having the complementary sequences 5' GCTTG ACCTG TAACT CGGGC CAGGC GAGCT 3' (SEQ ID NO:23) and 5' CGCCT AGCCC GAGTT ACAGG TCAAG CAGCT 3' (SEQ ID NO:24) were ligated in to form p3DProSig. The promoter fragment prepared by digesting p3DProSig with NcoI, blunting with T4 DNA polymerase, and digesting with SstI was subcloned into pUC19/terminator which had been digested with EcoRI, blunted with T4 DNA polymerase and digested with SstI, to form p3DProSigEND.
D. Multiple Cloning Site Insertion:
p3DProSigEND was digested with SstI and SmaI followed by the ligation of a new synthetic linker fragment constructed with the non-kinased complementary oligonucleotides 5' AGCTC CATGG CCGTG GCTCG AGTCT AGACG CGTCC CC 3' (SEQ ID NO:25) and 5' GGGGA CGCGT CTAGA CTCGA GCCAC GGCCA TGG 3' (SEQ ID NO:26) to form p3DProSigENDlink.
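The complementarity of the two linker oligonucleotides can be checked with a short reverse-complement routine. This is a sketch for illustration; the helper name `reverse_complement` and the variable names are assumptions, and the sequences are copied from SEQ ID NO:25 and SEQ ID NO:26 as printed above.

```python
# Translation table for Watson-Crick complements of unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Linker oligonucleotides as given in the text (spaces removed).
SEQ_ID_25 = "AGCTCCATGGCCGTGGCTCGAGTCTAGACGCGTCCCC"  # 37-mer
SEQ_ID_26 = "GGGGACGCGTCTAGACTCGAGCCACGGCCATGG"      # 33-mer

# The 33-mer pairs with the 3'-most 33 bases of the 37-mer, leaving the first
# four bases of the 37-mer (AGCT) unpaired at one end and a flush end at the other.
overhang = SEQ_ID_25[: len(SEQ_ID_25) - len(SEQ_ID_26)]
paired = SEQ_ID_25[len(overhang):]
print(overhang, paired == reverse_complement(SEQ_ID_26))
```

The duplex region contains NcoI (CCATGG), XhoI (CTCGAG), XbaI (TCTAGA), and MluI (ACGCGT) recognition sequences, consistent with its role as a multiple cloning site.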
E. p3DProSigENDlink Flanking Site Modification:
p3DProSigENDlink was digested with SalI and blunted with T4 DNA polymerase followed by EcoRV digestion. The blunt fragment was then inserted into pBluescript KS+ (Stratagene) in the EcoRV site so that the HindIII site is proximal to the promoter and the EcoRI is proximal to the terminator sequence. The HindIII-EcoRI fragment was then moved into the polylinker of pUCH18+ to form the p3Dv1.0 expression vector.
F. RAmy1A Promoter Insertion:
A 1.9 kb NheI-PstI fragment derived from subclone pOSG2CA2.3 from lambda clone λOSg2 (Huang et al. (1990) Plant Mol. Biol. 14:655-668), was cloned into the vector pGEM5zf at the SpeI and PstI sites to form pGEM5zf(1A/NheI-PstI). pGEM5zf(1A/NheI-PstI) was digested with PstI and SacI and two non-kinased 35mers and four kinased 32mers were ligated in, with the complementary sequences as follows: 5' GCATG CAGGT GCTGA ACACC ATGGT GAACA AACAC 3' (SEQ ID NO:27); 5' TTCTT GTCCC TTTCG GTCCT CATCG TCCTC CT 3' (SEQ ID NO:28); 5' TGGCC TCTCC TCCAA CTTGA CAGCC GGGAG CT 3' (SEQ ID NO:29); 5' TTCAC CATGG TGTTC AGCAC CTGCA TGCTG CA 3' (SEQ ID NO:30); 5' CGATG AGGAC CGAAA GGGAC AAGAA GTGTT TG 3' (SEQ ID NO:31); 5' CCCGG CTGTC AAGTT GGAGG AGAGG CCAAG GAGGA 3' (SEQ ID NO:32) to form p1AProSig. The HindIII-SacI 0.8 kb promoter fragment was subcloned from p1AProSig into the p3Dv1.0 vector digested with HindIII-SacI to yield the p1Av1.0 expression vector.
G. Construction of p3D-AAT Plasmid
Two PCR primers were used to amplify a fragment encoding AAT according to the sequence disclosed as Genbank Accession No. K01396: N-terminal primer 5' GAGGA TCCCC AGGGA GATGC TGCCC AGAR 3' (SEQ ID NO:33) and C-terminal primer 5' CGCGC TCGAG TTATT TTTGG GTGGG ATTCA CCAC 3' (SEQ ID NO:34). The N-terminal primer amplifies to a blunt site for in-frame insertion with the end of the p3D signal peptide and the C-terminal primer contains a XhoI site for cloning the fragment into the vector as shown in Figs. 3A and 3B.
Alternatively, the sequence encoding mature AAT (SEQ ID NO:8) or codon-optimized AAT may be chemically synthesized using techniques known in the art, incorporating a XhoI restriction site 3' of the termination codon for insertion into the expression vector as described above.
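How the two primers frame the mature AAT coding sequence can be illustrated by translating them. The sketch below is an assumption-laden check, not part of the disclosed method: it uses only the first 27 bases of the N-terminal primer (nine complete codons), includes only the standard-genetic-code codons that occur in these primers, and the helper names are hypothetical.

```python
# Subset of the standard genetic code covering the codons in these primers.
CODONS = {
    "GAG": "E", "GAT": "D", "CCC": "P", "CAG": "Q", "GGA": "G",
    "GCT": "A", "GCC": "A", "GTG": "V", "AAT": "N", "ACC": "T",
    "CAA": "Q", "AAA": "K", "TAA": "*",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, usable, 3))


N_PRIMER_27 = "GAGGATCCCCAGGGAGATGCTGCCCAG"       # first 27 nt of SEQ ID NO:33
C_PRIMER = "CGCGCTCGAGTTATTTTTGGGTGGGATTCACCAC"   # SEQ ID NO:34

coding_end = reverse_complement(C_PRIMER)         # sense-strand view of the 3' end
print(translate(N_PRIMER_27))      # N-terminal residues encoded by the primer
print(translate(coding_end[:24]))  # C-terminal residues followed by the stop codon
print(coding_end[24:30])           # XhoI recognition sequence after the stop codon
```

The translated residues agree with the first nine and last seven residues of the mature AAT sequence listed as SEQ ID NO:7 (Glu Asp Pro Gln Gly Asp Ala Ala Gln ... Val Val Asn Pro Thr Gln Lys).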
Example 2
Production of mature α1-antitrypsin in cell culture
After selection of transgenic callus, callus cells were suspended in liquid culture containing AA2 media (Thompson, J.A., et al., Plant Science 47:123 (1986)), at 3% sucrose, pH 5.8.
Thereafter, the cells were shifted to phosphate-buffered media (20 mM phosphate buffer, pH 6.8) using 10 mL multi-well tissue culture plates and shaken at 120 rpm in the dark for 48 hours. The supernatant was then removed and stored at -80°C prior to Western blot analysis.
Supernatants were concentrated using Centricon-10 filters (Amicon cat. #4207) and washed with induction media to remove substances interfering with electrophoretic migration. Samples were concentrated approximately 10 fold, and mature AAT was purified by SDS-PAGE electrophoresis. The purified protein was extracted from the electrophoresis medium, and sequenced at its N-terminus, giving the sequence shown in Fig. 8, identified herein as SEQ ID NO:22.
Example 3
HSA Induction in Germinating Seeds
After selection of transgenic plants which tested positive for the presence of a codon-optimized HSA gene driven by the GA3-responsive RAmy1A promoter, seeds were harvested and imbibed for 24 hours with 100 rpm orbital shaking in the dark at 25°C. GA3 was added to a final concentration of 5 µM and incubated for an additional 24-120 hours. Total soluble protein was isolated by double grinding each seed in 120 µl grinding buffer and centrifuging at 23,000 x g for 1 minute at 4°C. The clear supernatant was carefully removed from the pellet and transferred to a fresh tube.
Bilirubin binding assay
Bilirubin binding to its high-affinity site on mature HSA is assayed using the method described by Jacobsen, J. et al. (1974; Clin. Chem. 20:783) and Reed, R.G. et al. (1975; Biochemistry 14:4578-4583). Briefly, the concentration of free bilirubin in equilibrium with protein-bound bilirubin is determined by the rate of peroxide-peroxidase catalyzed oxidation of free bilirubin. Stock solutions of bilirubin (Nutritional Biochemicals Corp.) are prepared fresh daily in 5 mM NaOH containing 1 mM EDTA and the concentration determined using a molar absorptivity of 47,500 M⁻¹ cm⁻¹ at 440 nm. An aliquot containing between 5 and 30 nmol bilirubin is added to a 1 cm cuvette containing 1 ml PBS and approximately 30 nmol HSA at 37°C. An absorbance spectrum between 500 and 350 nm is recorded. Aliquots of horseradish peroxidase (Sigma), 0.05 mg/ml in PBS, and 0.05% ethyl hydrogen peroxide (Ferrosan; Malmo, Sweden) are added and the change in absorbance at λmax is recorded for 3-5 minutes. The concentrations of free and bound bilirubin calculated from the oxidation rate observed using varying concentrations of total bilirubin are used to construct a Scatchard plot from which the association constant for a single binding site is determined.
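The two calculations in the assay above, the stock concentration from absorbance and the Scatchard analysis, can be sketched as follows. This is a minimal illustration assuming a single class of independent binding sites and an ordinary least-squares line; the function names are hypothetical, the data in the usage lines are synthetic, and the cited methods of Jacobsen et al. and Reed et al. are not reproduced here.

```python
EPSILON_440 = 47_500.0  # molar absorptivity of bilirubin at 440 nm, M^-1 cm^-1 (from the text)


def bilirubin_molar(absorbance_440, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = A / (epsilon * path length)."""
    return absorbance_440 / (EPSILON_440 * path_cm)


def scatchard_fit(free_molar, bound_molar, protein_molar):
    """Fit nu/[free] = n*K - K*nu, where nu = bound / protein.

    Returns (K, n): association constant (M^-1) and apparent number of sites.
    """
    nu = [b / protein_molar for b in bound_molar]
    y = [v / f for v, f in zip(nu, free_molar)]
    count = len(nu)
    mean_x = sum(nu) / count
    mean_y = sum(y) / count
    sxx = sum((x - mean_x) ** 2 for x in nu)
    sxy = sum((x - mean_x) * (yy - mean_y) for x, yy in zip(nu, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k_assoc = -slope
    return k_assoc, intercept / k_assoc


# Synthetic example: one site with an assumed K of 1e8 M^-1, 30 µM protein.
protein = 30e-6
free = [1e-9, 5e-9, 2e-8, 1e-7]
bound = [protein * (1e8 * f) / (1 + 1e8 * f) for f in free]
k_fit, n_fit = scatchard_fit(free, bound, protein)
print(f"K = {k_fit:.3g} M^-1, n = {n_fit:.2f}")
```

On a Scatchard plot the slope is the negative of the association constant and the x-intercept is the number of sites, which is why a straight line is expected only when a single class of sites dominates.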
Although the invention has been described with reference to particular embodiments, it will be appreciated that a variety of changes and modifications can be made without departing from the invention.
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: Applied Phytologics, Inc.
(ii) TITLE OF THE INVENTION: Production of Mature Proteins in Plants
(iii) NUMBER OF SEQUENCES: 34
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Dehlinger & Associates
(B) STREET: P.O. Box 60850
(C) CITY: Palo Alto
(D) STATE: CA
(E) COUNTRY: USA
(F) ZIP: 94306
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: PCT/US98/03068
(B) FILING DATE: 13-FEB-1998
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 60/038,169
(B) FILING DATE: 13-FEB-1997
(A) APPLICATION NUMBER: 60/037,991
(B) FILING DATE: 13-FEB-1997
(A) APPLICATION NUMBER: 60/038,170
(B) FILING DATE: 13-FEB-1997
(A) APPLICATION NUMBER: 60/038,168
(B) FILING DATE: 13-FEB-1997
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Petithory, Joanne R
(B) REGISTRATION NUMBER: P42,995
(C) REFERENCE/DOCKET NUMBER: 0665-0007.41
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 650-324-0880
(B) TELEFAX: 650-324-0960

(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
(B) CLONE: 3D signal peptide sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Met Lys Asn Thr Ser Ser Leu Cys Leu Leu Leu Leu Val Val Leu Cys
Ser Leu Thr Cys Asn Ser Gly Gln Ala

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 75 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: native 3D signal peptide DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 75 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: codon-optimized 3D signal peptide DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
(B) CLONE: RAmy1A signal peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Val Asn Lys His Phe Leu Ser Leu Ser Val Leu Ile Val Leu Leu
Gly Leu Ser Ser Asn Leu Thr Ala Gly

(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: RAmy1A 5' untranslated region (UTR)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 321 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: RAmy1A 3' untranslated region (UTR)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CTACGAAAAT TTGATGCGTA G 321

(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 394 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vii) IMMEDIATE SOURCE:
(B) CLONE: mature AAT amino acid sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Glu Asp Pro Gln Gly Asp Ala Ala Gln Lys Thr Asp Thr Ser His His
Asp Gln Asp His Pro Thr Phe Asn Lys Ile Thr Pro Asn Leu Ala Glu
Phe Ala Phe Ser Leu Tyr Arg Gln Leu Ala His Gln Ser Asn Ser Thr
Asn Ile Phe Phe Ser Pro Val Ser Ile Ala Thr Ala Phe Ala Met Leu
Ser Leu Gly Thr Lys Ala Asp Thr His Asp Glu Ile Leu Glu Gly Leu
Asn Phe Asn Leu Thr Glu Ile Pro Glu Ala Gln Ile His Glu Gly Phe
Gln Glu Leu Leu Arg Thr Leu Asn Gln Pro Asp Ser Gln Leu Gln Leu
Thr Thr Gly Asn Gly Leu Phe Leu Ser Glu Gly Leu Lys Leu Val Asp
Lys Phe Leu Glu Asp Val Lys Lys Leu Tyr His Ser Glu Ala Phe Thr
Val Asn Phe Gly Asp Thr Glu Glu Ala Lys Lys Gln Ile Asn Asp Tyr
Val Glu Lys Gly Thr Gln Gly Lys Ile Val Asp Leu Val Lys Glu Leu
Asp Arg Asp Thr Val Phe Ala Leu Val Asn Tyr Ile Phe Phe Lys Gly
Lys Trp Glu Arg Pro Phe Glu Val Lys Asp Thr Glu Glu Glu Asp Phe
His Val Asp Gln Val Thr Thr Val Lys Val Pro Met Met Lys Arg Leu
Gly Met Phe Asn Ile Gln His Cys Lys Lys Leu Ser Ser Trp Val Leu
Leu Met Lys Tyr Leu Gly Asn Ala Thr Ala Ile Phe Phe Leu Pro Asp
Glu Gly Lys Leu Gln His Leu Glu Asn Glu Leu Thr His Asp Ile Ile
Thr Lys Phe Leu Glu Asn Glu Asp Arg Arg Ser Ala Ser Leu His Leu
Pro Lys Leu Ser Ile Thr Gly Thr Tyr Asp Leu Lys Ser Val Leu Gly
Gln Leu Gly Ile Thr Lys Val Phe Ser Asn Gly Ala Asp Leu Ser Gly
Val Thr Glu Glu Ala Pro Leu Lys Leu Ser Lys Ala Val His Lys Ala
Val Leu Thr Ile Asp Glu Lys Gly Thr Glu Ala Ala Gly Ala Met Phe
Leu Glu Ala Ile Pro Met Ser Ile Pro Pro Glu Val Lys Phe Asn Lys
Pro Phe Val Phe Leu Met Ile Glu Gln Asn Thr Lys Ser Pro Leu Phe
Met Gly Lys Val Val Asn Pro Thr Gln Lys

(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1185 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: native coding sequence of mature AAT
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
TCAGGATCAC
ATACCGCCAG
CGCTACAGCC
GGAGGGCCTG
GGAACTCCTC
CCTGTTCCTC
GTACCACTCA
CAACGATTAC
CAGAGACACA
CTTTGAAGTC
GGTGCCTATG
CTGGGTGCTG
GGGGAAACTA
AAATGAAGAC
TGATCTGAAG
CCTCTCCGGG
GCTGACCATC
CATGTCTATC
AAATACCAAG
(2) INFORMATION FOR SEQ ID N0:9:
(i) SEQUENCE CHARACTERISTTCS:
(A) LENGTH: 432 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE:
(B) CLONE: mature ATIII as sequence (xi) SEQUENCE DESCRIPTION: SEQ ID N0:9:
His Gly Sex Pro Va1 Asp Ile Cys Thr Ala Lys Pro Arg Asp Ile Pro SUBSTITUTE SHEET (RULE 26) WO 98/36085 PC'd'/CTS98/03068 Met AsnPro MetCys TleTyrArg SerPro GluLysLys AlaThr Glu 20 25 30 _ Asp GluGly SerGlu GlnLysIle ProGlu AlaThrAsn ArgArg Val Trp GluLeu SerLys AlaAsnSer ArgPhe AlaThrThr PheTyr Gln His LeuAla AspSer LysAsnAsp AsnAsp AsnIlePhe LeuSer Pro 65 70 75 g0 Leu SerIle SerThr AlaPheAla MetThr LysLeuGly AlaCys Asn 85 9p g5 Asp ThrLeu GlnGln LeuMetGlu ValPhe LysPheAsp ThrIle Ser Glu LysThr SerAsp GlnIleHis PhePhe PheAlaLys LeuAsn Cys Arg LeuTyr ArgLys AlaAsnLys SerSer LysLeuVal SerA1a Asn Arg LeuPhe GlyAsp LysSerLeu ThrPhe AsnGluThr TyrGln Asp Ile SerGlu LeuVal TyrGlyAla LysLeu GlnProLeu AspPhe Lys Glu AsnAla GluGln SerArgAla AlaIle AsnLysTrp ValSer Asn Lys ThrGlu GlyArg IleThrAsp ValIle ProSerGlu AlaIle Asn Glu LeuThr ValLeu ValLeuVal AsnThr IleTyrPhe LysGly Leu Trp LysSer LysPhe SerProGlu AsnThr ArgLysGlu LeuPhe Tyr _ Lys AlaAsp Gly'GluSerCysSer AlaSer MetMetTyr GlnGlu Gly Lys PheArg T~rArg ArgValAla GluGly ThrGlnVal LeuGlu Leu Pro PheLys GlyAsp AspIleThr MetVal LeuIleLeu ProLys Pro 275 280 2g5 Glu LysSer LeuAla LysValGlu LysGlu LeuThrPro GluVal Leu Gln GluTrp LeuAsp GluLeuGlu GluMet MetLeuVal ValHis Met Pro ArgPhe ArgIle GluAspG1y PheSer LeuLysG1u GlnLeu Gln Asp MetGly LeuVal AspLeuPhe SerPro G1uLysSer LysLeu Pro Gly IleVal AlaGlu GlyArgAsp AspLeu TyrValSer AspA1a Phe His LysAla PheLeu GluValAsn GluGlu GlySerGlu AlaAla Ala Ser ThrAla ValVal TleAlaGly ArgSer LeuAsnPro AsnArg Val 3B5 390 . 395 400 Thr PheLys AlaAsn ArgProPhe LeuVal PheIleArg GluVal Pro Leu AsriThr IleIle PheMetGly ArgVal AlaAsnPro CysVal Lys (2) INFORMATION FORSEQ ID N0:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1299 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: native ATIII DNA sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CACGGAAGCC CTGTGGACAT CTGCACAGCC AAGCCGCGGG ACATTCCCAT GAATCCCATG 60
AATCACCGAT
GGAACTCACC
CAACAGGGTG
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 585 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE:
(B) CLONE: mature HSA amino acid sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
Asp Ala HisLys SerGlu ValAlaHis ArgPheLys AspLeu GlyGlu Glu Asn PheLys AlaLeu ValLeuIle AlaPheAla GlnTyr LeuGln Gln Cys ProPhe GluAsp HisValLys LeuValAsn GluVal ThrGlu Phe Ala LysThr CysVal AlaAspGlu SerAlaGlu AsnCys AspLys Ser Leu HisThr LeuPhe GlyAspLys LeuCysThr ValAla ThrLeu Arg Glu ThrTyr GlyGlu MetAlaAsp CysCysAla LysGln GluPro Glu Arg AsnGlu CysPhe LeuGlnHis LysAspAsp AsnPro AsnLeu 100 lOS' I10 Pro Arg LeuVal ArgPro GluValAsp ValMetCys ThrAla PheHis Asp Asn GluGlu ThrPhe LeuLysLys TyrLeuTyr GluIle AlaArg Arg His ProTyr PheTyr AlaProGlu LeuLeuPhe PheAla LysArg Tyr Lys AlaAla PheThr GluCysCys GlnAlaAla AspLys AlaAla Cys Leu LeuPro LysLeu AspGluLeu ArgAspGlu GlyLys AlaSer Ser Ala LysGln ArgLeu LysCysAla SerLeuGln LysPhe GlyGlu Arg Ala PheLys AlaTrp AlaValAla ArgLeuSer GlnArg PhePro Lys Ala GluPhe AlaGlu ValSerLys LeuValThr AspLeu ThrLys Val His ThrGlu CysCys HisGlyAsp LeuLeuGlu CysAla AspAsp SUBSTfTUTE SHEET (RULE 26) Arg AlaAspLeu AlaLys TyrIle CysGluAsn Gln-Asp SerIleSer Ser LysLeuLys GluCys CysGlu LysProLeu LeuGlu Lys-SerHis Cys IleAlaGlu ValGlu AsnAsp GluMetPro AlaAsp LeuProSer Leu AlaAlaAsp PheVal GluSer LysAspVal CysLys AsnTyrAla 305 310 315 - .- 320 Glu AlaLysAsp ValPhe LeuGly MetPheLeu TyrGlu TyrAlaArg Arg HisProAsp TyrSer ValVal LeuLeuLeu ArgLeu AlaLysThr Tyr GluThrThr LeuGlu LysCys CysAlaAla AlaAsp Pro-HisGlu Cys TyrAlaLys ValPhe AspGlu PheLysPro LeuVal GluGluPro Gln AsnLeuIle LysGln AsnCys G1uLeuPhe LysGln LeuGlyGlu Tyr LysPheGln AsnAla LeuLeu ValArgTyr ThrLys LysValPro G1n ValSerThr ProThr LeuVal GluValSer ArgAsn LeuGlyLys Val GlySerLys CysCys LysHis ProGluAla LysArg MetProCys Ala GluAspTyr LeuSer ValVal LeuAsnGln LeuCys ValLeuHis Glu LysThrPro ValSer AspArg ValThrLys CysCys ThrGluSer Leu -ValAsnArg ArgPro CysPhe SerAlaLeu GluVal AspGluThr Tyr ValProLys GluPhe AsnAla GluThrPhe ThrPhe HisAlaAsp 500 505 510.
Ile CysThrLeu SerGlu LysG1u ArgGlnIle LysLys GlnThrAla Leu ValGluLeu ValLys HisLys ProLysAla ThrLys GluGlnLeu Lys AlaValMet AspAsp PheAla AlaPheVal GluLys CysCysLys Ala AspAspLys GluThr CysPhe AlaGluGlu GlyLys LysLeuVal Ala AlaSerGln AlaAla LeuGly Leu 580 585 - . -(2) INFORMATION FOR SEQ ID N0:12: -(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1865 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: native coding sequence of mature HSA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:12:
AATGAAGTAA
CTGAATTTGC
AAAAACATGT
GTAGCTGATG
TGCCAGTCTC CAAAAATTTG GAGAAAGAGC TTTCAAAGCA TGGGCAGTGG CTCGCCTGAG 660
AAAACCTCTG TTGGAAAAAT CCCACTGCAT TGCCGAAGTG GAAAATGATG AGATGCCTGC 900
TCCAACTCTT GTAGAGGTCT CAAGAAACCT AGGAAAAGTG GGCAGCAAAT GTTGTAAACA 1320
AACAC
(2) INFORMATION FOR SEQ ID N0:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 352 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE:
(B) CLONE: native proBPN' amino acid sequence (xi) SEQUENCE DESCRIPTION: SEQ ID N0:13:
Ala Gly LysSer AsnGlyGlu LysLys TyrIIe ValGlyPhe LysGln Thr Met SerThr MetSerAla AlaLys LysLys AspValIle SerGlu Lys Gly GlyLys ValGlnLys GlnPhe LysTyr ValAspAla AlaSer Ala Thr LeuAsn GluLysAla ValLys GluLeu LysLysAsp ProSer 45 Val Ala TyrVal GluGluAsp HisVal AlaHis AlaTyrAla GlnSer Val Pro TyrGly ValSerGln IleLys AlaPro AlaLeuHis SerGln Gly Tyr ThrGIy SerAsnVal LysVal AlaVal IleAspSer GlyIle Asp Ser SerHis ProAspLeu LysVal AlaGly GlyAlaSer MetVal Pro Ser GluThr AsnProPhe GlnAsp AsnAsn SerHisGly ThrHis !55Val Ala GlyThr ValAlaAla LeuAsn AsnSer IleGlyVal LeuGly Val Ala ProSer AlaSerLeu TyrAla ValLys ValLeuGly AlaAsp Gly Ser GlyGln TyrSerTrp IleIle AsnGly IleGluTrp AlaIle Ala Asn AsnMet AspValIle AsnMet SerLeu GlyGlyPro SerGly Ser Ala AlaLeu LysAlaAla ValAsp LysAla ValAlaSer GlyVal Val Val ValAla AlaAlaGly AsnGlu GlyThr SerGlySer SerSer Thr Val GlyTyr ProGlyLys TyrPro SerVal IIeAlaVal GlyAla SUBSTITUTE SHEET (RULE 26j Val Asp Ser Ser Asn Gln Arg Ala Ser Phe Ser Ser Val Gly Pro Glu Leu Asp Val Met Ala Pro Gly Val Ser Ile Gln Ser Thr Leu Pro Gly Asn Lys Tyr Gly Ala Tyr Asn Gly Thr Ser Met Ala Ser Pro His Val Ala Gly Ala Ala Ala Leu Ile Leu Ser Lys His Pro Asn Trp Thr Asn Thr Gln Val Arg Ser Ser Leu Glu Asn Thr Thr Thr Lys Leu Gly Asp Ser Phe Tyr Tyr Gly Lys Gly Leu Ile Asn Val Gln Ala Ala Ala Gln (2) INFORMATION FOR SEQ ID N0:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1056 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: native proBPN' coding sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
AATGAGCACG
GCAAAAGCAA
AGAATTGAAA
CGCGCAGTCC
CTACACTGGA
TGATTTAAAG
CAACAACTCT
TGTATTAGGC
TTCCGGCCAA
CGTTATTAAC
TAAAGCCGTT
CAGCTCAAGC
TGACAGCAGC
ACCTGGCGTA
GTCAATGGCA
CTGGACAAAC
TTTC
TACTAT
GGAAAAGGGC TGATCAACGT ACAGGCGGCA GCTCAG 1056
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 77 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE:
(B) CLONE: subtilisin BPN' pro-peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
Ala Gly Lys Ser Asn Gly Glu Lys Lys Tyr Ile Val Gly Phe Lys Gln Thr Met Ser Thr Met Ser Ala Ala Lys Lys Lys Asp Val Ile Ser Glu Lys Gly Gly Lys Val Gln Lys Gln Phe Lys Tyr Val Asp Ala Ala Ser Ala Thr Leu Asn Glu Lys Ala Val Lys Glu Leu Lys Lys Asp Pro Ser Val Ala Tyr Val Glu Glu Asp His Val Ala His Ala Tyr
(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 275 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE:
(B) CLONE: native mature BPN' amino acid sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
Ala Gln SerValPro TyrGly ValSerGln IleLys AlaProAla Leu His Ser GlnGlyTyr ThrGly SerAsnVal LysVal AlaValIle Asp Ser Gly IleAspSer SerHis ProAspLeu LysVal AlaGlyGly Ala Ser Met ValProSer GluThr AsnProPhe GlnAsp AsnAsnSer His Gly Thr HisValAla GlyThr ValAlaAla LeuAsn AsnSerIle Gly Val Leu GlyValAla ProSer AlaSerLeu TyrAla ValLysVal Leu Gly Ala AspGlySer GlyGln TyrSerTrp IleIle AsnGlyIle Glu Trp Ala IleAlaAsn AsnMet AspValIle AsnMet SerLeuGly Gly Pro Ser GlySerAla AlaLeu LysAlaAla ValAsp LysAlaVal A1a Ser Gly ValValVal ValAla AlaAlaGly AsnGlu GlyThrSer Gly Ser Ser SerThrVal GlyTyr ProGlyLys TyrPro SerValIle Ala Val Gly AlaValAsp SerSer AsnGlnArg AlaSer PheSerSer Val Gly Pro GluLeuAsp ValMet AlaProGly ValSer IleGlnSer Thr Leu Pro GlyAsnLys TyrGly AlaTyrAsn GlyThr SerMetAla Ser Pro His ValAlaGly AlaAla AlaLeuIle LeuSer LysHisPro Asn Trp Thr AsnThrGln ValArg SerSerLeu GluAsn ThrThrThr Lys Leu Gly Asp.SerPhe TyrTyr GlyLys~Gly LeuIle AsnValGln Ala Ala Ala Gln !55 (2) INFORMATION FOR SEQ ID N0:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 275 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE:
(B) CLONE: amino acid sequence of mature BPN' variant (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
Ala Gln Ser Val Pro Tyr Gly Val Ser Gln Ile Lys Ala Pro Ala Leu His SerGln GlyTyrThr GlySer AsnVal LysValAla ValIle Asp Ser GlyIle AspSerSer HisPro AspLeu LysValAla GlyGly Ala 35 40 45 _ Ser MetVal ProSerGlu ThrAsn ProPhe GlnAspThr AsnSer His Gly ThrHis ValAlaGly ThrVal AlaAla LeuThrAsn SexIle Gly Val LeuGly ValAlaPro SerAla SerLeu TyrAlaVal LysVal Leu Gly AlaAsp GlySerGly GlnTyr SerTrp IleIleAsn GlyIle Glu Trp AlaIle AlaAsnAsn MetAsp ValIle ThrMetSer LeuGly Gly Pro SerGly SerAlaAla LeuLys AlaAla ValAsplaysAlaVal Ala Ser GlyVal ValValVal A1aAla AlaGly AsnGluGly ThrSer G1y Ser SerSer ThrValGly TyrPro GlyLys TyrProSer ValIle Ala Val GlyAla ValAspSer SerAsn GlnArg AlaSerPhe SerSer Val Gly ProGlu LeuAspVal MetAla ProGly ValSerLle GlnSer Thr Leu ProGly AsnLysTyr GlyAla TyrSer GlyThrSer MetAla Ser 210 215 ~ 220 Pro HisVal AlaGlyAla AlaAla LeuIle LeuSerLys HisPro Thr Trp ThrAsn ThrGlnVal ArgSer SerLeu GluAsnThr ThrThr Lys Leu GlyAsp SerPheTyr TyrGly LysGly LeuIleAsn ValGln Ala Ala AlaGln (2) INFORMATION FOR SEQ ID N0:18: -(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1260 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: codon-optimized 3D signal peptide-AAT DNA sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1382 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: codon-optimized 3D signal peptide-ATIII DNA sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1940 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: codon-optimized 3D signal peptide-HSA DNA sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
GAAAAAATAC TTATATGAAA TTGCCAGAAG ACATCCTTAC TTTTATGCCC CGGAACTCCT 540
ATGAATATGC
AAGTGGGCAG
TTGTGAAACA
TGTTGCTGCA AGTCAAGCTG CCTTAGGCTT ATAACATCTA CATTTAAAAG CATCTCAGCC 1860
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1140 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: codon-optimized 3D signal peptide-BPN' DNA sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE:
(B) CLONE: N-terminus of mature AAT
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
Glu Asp Pro Gln Gly Asp Ala Ala Gln Lys Thr Asp Thr
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
TTCTTGTCCC TTTCGGTCCT CATCGTCCTC CT 32
(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
GAGGATCCCC AGGGAGATGC TGCCCAGAA 29
(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
"Subtilisin" or "subtilisin BPN'" or "BPN'" refers to the protease enzyme produced naturally by B. amyloliquefaciens, and having the sequence of SEQ ID NO:16, or a sequence homologous therewith.
"proBPN'" refers to a form of BPN' having an approximately 78 amino-acid "pro" moiety that functions as a chaperone polypeptide to assist in folding and activation of the BPN', and having the sequence in SEQ ID NO:13, or a sequence homologous therewith.
"Codon optimization" refers to changes in the coding sequence of a gene to replace native codons with those corresponding to optimal codons in the host plant.
A DNA sequence is "derived from" a gene, such as a rice or barley α-amylase gene, if it corresponds in sequence to a segment or region of that gene. Segments of genes which may be derived from a gene include the promoter region, the 5' untranslated region, and the 3' untranslated region of the gene.
II. Transformed plant cell
The plants used in the process of the present invention are derived from monocots, particularly the members of the taxonomic family known as the Gramineae. This family includes all members of the grass family of which the edible varieties are known as cereals. The cereals include a wide variety of species such as wheat (Triticum sps.), rice (Oryza sps.), barley (Hordeum sps.), oats (Avena sps.), rye (Secale sps.), corn (Zea sps.) and millet (Pennisetum sps.). In the present invention, preferred family members are rice and barley.
Plant cells or tissues derived from the members of the family are transformed with expression constructs (i.e., plasmid DNA into which the gene of interest has been inserted) using a variety of standard techniques (e.g., electroporation, protoplast fusion or microparticle bombardment). The expression construct includes a transcription regulatory region (promoter) whose transcription is specifically upregulated by the presence or absence of a small molecule, such as the reduction or depletion of sugar, e.g., sucrose, in culture medium, or in plant tissues, e.g., germinating seeds. In the present invention, particle bombardment is the preferred transformation procedure.
The construct also includes a gene encoding a mature heterologous protein in a form suitable for secretion from plant cells. The gene encoding the recombinant heterologous protein is placed under the control of a metabolically regulated promoter. Metabolically regulated promoters are those in which mRNA synthesis, or transcription, is repressed or upregulated by a small metabolite or hormone molecule, such as the rice RAmy3D and RAmy3E promoters, which are upregulated by sugar depletion in cell culture. For protein production in germinating seeds from regenerated transgenic plants, a preferred promoter is the RAmy1A promoter, which is upregulated by gibberellic acid during seed germination. The expression construct also utilizes additional regulatory DNA sequences, e.g., preferred codons and termination sequences, to promote efficient translation of AAT, as will be described.
A. Plant Expression Vector
Expression vectors for use in the present invention comprise a chimeric gene (or expression cassette), designed for operation in plants, with companion sequences upstream and downstream from the expression cassette. The companion sequences will be of plasmid or viral origin and provide necessary characteristics to the vector to permit the vectors to move DNA from bacteria to the desired plant host. Suitable transformation vectors are described in related application PCT WO 95/14099, published May 25, 1995, which is incorporated by reference herein.
Suitable components of the expression vector, including an inducible promoter, coding sequence for a signal peptide, coding sequence for a mature heterologous protein, and suitable termination sequences are discussed below. One exemplary vector is the p3D(AAT)v1.0 vector illustrated in Figs 3A and 3B.
A1. Promoters
The transcription regulatory or promoter region is chosen to be regulated in a manner allowing for induction under selected cultivation conditions, e.g., sugar depletion in culture or water uptake followed by gibberellic acid production in germinating seeds.
Suitable promoters, and their method of selection, are detailed in above-cited PCT application WO 95/14099. Examples of such promoters include those that transcribe the cereal α-amylase genes and sucrose synthase genes, and are repressed or induced by small molecules, like sugars, sugar depletion or phytohormones such as gibberellic acid or abscisic acid. Representative promoters include the promoters from the rice α-amylase RAmy1A, RAmy1B, RAmy2A, RAmy3A, RAmy3B, RAmy3C, RAmy3D, and RAmy3E genes, and from the pM/C, gKAmy141, gKAmy155, Amy32b, and HV18 barley α-amylase genes. These promoters are described, for example, in ADVANCES IN PLANT BIOTECHNOLOGY, Ryu, D.D.Y., et al., Eds., Elsevier, Amsterdam, 1994, p. 37, and references cited therein. Other suitable promoters include the sucrose synthase and sucrose-6-phosphate-synthetase (SPS) promoters from rice and barley.
Other suitable promoters include promoters which are regulated in a manner allowing for induction under seed-maturation conditions. Examples of such promoters include those associated with the following monocot storage proteins: rice glutelins, oryzins, and prolamines, barley hordeins, wheat gliadins and glutelins, maize zeins and glutelins, oat glutelins, sorghum kafirins, millet pennisetins, and rye secalins.
A preferred promoter for expression in germinating seeds is the rice α-amylase RAmy1A promoter, which is upregulated by gibberellic acid. Preferred promoters for expression in cell culture are the rice α-amylase RAmy3D and RAmy3E promoters, which are strongly upregulated by sugar depletion in the culture. These promoters are also active during seed germination. A preferred promoter for expression in maturing seeds is the barley endosperm-specific B1-hordein promoter (Brandt, A., et al., (1985) Carlsberg Res. Commun. 50:333-345).
The chimeric gene may further include, between the promoter and coding sequences, the 5' untranslated region (5' UTR) of an inducible monocot gene, such as the 5' UTR derived from one of the rice or barley α-amylase genes mentioned above. One preferred 5' UTR is that derived from the RAmy1A gene, which is effective to enhance the stability of the gene transcript. This 5' UTR has the sequence given by SEQ ID NO:5 herein.
A2. Signal Sequences
In addition to encoding the protein of interest, the chimeric gene encodes a signal sequence (or signal peptide) that allows processing and translocation of the protein, as appropriate. Suitable signal sequences are described in above-referenced PCT application WO 95/14099. One preferred signal sequence is identified as SEQ ID NO:1 and is derived from the RAmy3D promoter. Another preferred signal sequence is identified as SEQ ID NO:4 and is derived from the RAmy1A promoter.
The plant signal sequence is placed in frame with a heterologous nucleic acid encoding a mature protein, forming a construct which encodes a fusion protein having an N-terminal region corresponding to the signal peptide and, immediately adjacent to the C-terminal amino acid of the signal peptide, the N-terminal amino acid of the mature heterologous protein.
The expressed fusion protein is subsequently secreted and processed by signal peptidase cleavage precisely at the junction of the signal peptide and the mature protein, to yield the mature heterologous protein.
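By way of illustration only, the in-frame junction described above may be sketched in Python as follows. The function names and the 12-nucleotide signal fragment are hypothetical placeholders and are not sequences of the invention; the mature fragment is the first 18 nucleotides of SEQ ID NO:33 as listed herein. The sketch assumes that both inputs are complete coding sequences whose lengths are multiples of three.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split a coding sequence into codons; RNA 'U' is read as 'T'."""
    seq = seq.upper().replace("U", "T")
    if len(seq) % 3:
        raise ValueError("coding sequence length is not a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(signal_cds, mature_cds):
    """Join signal and mature coding sequences with no intervening codons.

    Returns the fusion sequence and the 0-based codon index at which the
    mature protein begins, i.e. the codon immediately after the junction.
    """
    signal = split_codons(signal_cds)
    mature = split_codons(mature_cds)
    if any(codon in STOP_CODONS for codon in signal):
        raise ValueError("signal sequence contains an in-frame stop codon")
    return "".join(signal + mature), len(signal)


if __name__ == "__main__":
    # Hypothetical 4-codon signal fragment, for illustration only.
    fusion, first_mature = fuse_in_frame("ATGAAGAACACC", "GAGGATCCCCAGGGAGAT")
    print(fusion, first_mature, split_codons(fusion)[first_mature])
```

The returned codon index marks the position at which signal peptidase cleavage would be expected to release the mature N-terminus, so a construct can be checked before synthesis.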
In another embodiment of the invention, the coding sequence in the fusion protein gene, in at least the coding region for the signal sequence, may be codon-optimized for optimal expression in plant cells, e.g., rice cells, as described below. The upper row in Fig. 1 shows one codon-optimized coding sequence for the RAmy3D signal sequence, identified herein as SEQ ID NO:3.
A3. Naturally-Occurring Heterologous Protein Coding Sequences
(i) α1-Antitrypsin: Mature human AAT is composed of 394 amino acids, having the sequence identified herein as SEQ ID NO:7. The protein has N-glycosylation sites at asparagines 46, 83 and 247. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:8.
(ii) Antithrombin III: Mature human ATIII is composed of 432 amino acids, having the sequence identified herein as SEQ ID NO:9. The protein has N-glycosylation sites at the four asparagine residues 96, 135, 155, and 192. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:10.
(iii) Human serum albumin: Mature HSA as found in human serum is composed of 585 amino acids, having the sequence identified herein as SEQ ID NO:11. The protein has no N-linked glycosylation sites. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:12.
(iv) Subtilisin BPN': Native proBPN' as produced in B. amyloliquefaciens is composed of 352 amino acids, having the sequence identified herein as SEQ ID NO:13. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:14. The proBPN' polypeptide contains a 77 amino acid "pro" moiety which is identified herein as SEQ ID NO:15. The remainder of the polypeptide, which forms the mature active BPN', is a 275 amino acid sequence identified herein by SEQ ID NO:16. Native BPN' as produced in Bacillus is not glycosylated.
A4. Codon-Optimized Coding Sequences
In accordance with one aspect of the invention, it has been discovered that a severalfold enhancement of expression level can be achieved in plant cell culture by modifying the native coding sequence of a heterologous gene to contain predominantly or exclusively the highest-frequency codons found in the plant cell host.
The method will be illustrated for expression of a heterologous gene in rice plant cells, it being recognized that the method is generally applicable to any monocot. As a first step, a representative set of known gene coding sequences from rice is assembled. The sequences are then analyzed for codon frequency for each amino acid, and the most frequent codon is selected for each amino acid. This approach differs from earlier reported codon matching methods, in which more than one frequent codon is selected for at least some of the amino acids. The optimal codons selected in this manner for rice and barley are shown in Table 1.
Table 1
Amino Acid    Rice Preferred Codon    Barley Preferred Codon
Ala  A        GCC
Arg  R        CGC
Asn  N        AAC
Asp  D        GAC
Cys  C        UGC
Gln  Q        CAG
Glu  E        GAG
Gly  G        GGC
His  H        CAC
Ile  I        AUC
Leu  L        CUC
Lys  K        AAG
Phe  F        UUC
Pro  P        CCG                     CCC
Ser  S        AGC                     UCC
Thr  T        ACC
Tyr  Y        UAC
Val  V        GUC                     GUG
stop          UAA                     UGA
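The codon selection step described in this section may be sketched, by way of example only, in Python as set out below. The function names and the short reference sequence used in the usage lines are hypothetical and serve only to illustrate the principle of choosing a single most-frequent codon per amino acid; a practical implementation would be run against a representative set of host coding sequences. The sketch assumes the standard genetic code and reads RNA "U" as DNA "T".

```python
from collections import Counter, defaultdict

# Standard genetic code in DNA letters, built from the canonical TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(seq):
    """Split a coding sequence into complete codons (RNA 'U' read as 'T')."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def preferred_codons(reference_cds):
    """Return the single most frequent codon observed for each amino acid."""
    counts = defaultdict(Counter)
    for cds in reference_cds:
        for codon in codons(cds):
            aa = CODON_TABLE.get(codon)  # codons with ambiguous bases are skipped
            if aa is not None:
                counts[aa][codon] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


def optimize(cds, preferred):
    """Swap each codon for the preferred synonym; the encoded protein is unchanged.

    Amino acids absent from `preferred` keep their native codon.
    """
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons(cds))


def translate(cds):
    """Translate a coding sequence to one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[c] for c in codons(cds))


if __name__ == "__main__":
    # Hypothetical reference set: Ala is mostly GCC, Lys is mostly AAG.
    pref = preferred_codons(["GCCGCCGCAAAGAAGAAA"])
    native = "GCAAAAGCT"
    print(optimize(native, pref), translate(native))
```

Selecting exactly one codon per amino acid mirrors the approach described above; a frequency-weighted choice among several codons would correspond instead to the earlier codon matching methods from which the present approach is distinguished.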
As indicated above, the fusion protein coding sequence in the chimeric gene is constructed such that the final (C-terminal) codon in the signal sequence is immediately followed by the codon for the N-terminal amino acid in the mature form of the heterologous protein.
Exemplary fusion protein genes, in accordance with the present invention, are identified herein as follows:
SEQ ID NO:18, corresponding to codon-optimized coding sequences of the fusion protein consisting of RAmy3D signal sequence/mature α1-antitrypsin;
SEQ ID NO:19, corresponding to the fusion protein coding sequence consisting of the codon-optimized RAmy3D signal sequence and the native mature antithrombin III sequence;
SEQ ID NO:20, corresponding to the fusion protein coding sequence consisting of the codon-optimized RAmy3D signal sequence and the native mature human serum albumin sequence;
SEQ ID NO:21, corresponding to codon-optimized coding sequence of the fusion protein RAmy3D signal sequence/prosubtilisin BPN'. In this instance, prosubtilisin is considered the "mature" protein, in that secreted prosubtilisin can autocatalyze to active, mature subtilisin.
In a preferred embodiment, the BPN' coding sequence is further modified to eliminate potential N-glycosylation sites, as native BPN' is not glycosylated. Table 2 illustrates preferred codon substitutions, which eliminate all potential N-glycosylation sites in subtilisin BPN'. SEQ ID NO:17 corresponds to a mature BPN' amino acid sequence containing the substitutions presented in Table 2.
Table 2
N-Glycosylation Site    Location (Asn) in mature protein    Amino Acid Substitution
Asn Asn Ser             61                                  Thr Asn Ser
Asn Asn Ser             76                                  Thr Asn Ser
Asn Met Ser             123                                 Thr Met Ser
Asn Gly Thr             218                                 Ser Gly Thr†
Asn Trp Thr             240                                 Thr Trp Thr
†Improved thermostability; Bryan, et al., Proteins: Structure, Function, and Genetics 1:326 (1986).
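The identification of potential N-glycosylation sites of the kind listed in Table 2 may be sketched as follows, purely as an illustration. The function name is hypothetical. The sketch assumes the conventional Asn-X-Ser/Thr sequon rule with X being any residue other than Pro (the exclusion of Pro is an assumption of this sketch and is not stated herein), and uses one-letter amino acid codes. The test fragment is residues 59 to 64 of SEQ ID NO:16 (Gln Asp Asn Asn Ser His).

```python
import re

# Asn-X-Ser/Thr sequon; X is any residue except Pro (conventional rule, assumed).
# A zero-width lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein, start=1):
    """Return residue numbers of Asn residues that begin an N-X-S/T sequon.

    `protein` is a one-letter amino acid string; `start` is the residue
    number of its first letter, so a fragment keeps its original numbering.
    """
    return [m.start() + start for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Residues 59-64 of mature BPN' (SEQ ID NO:16): Gln Asp Asn Asn Ser His.
    print(find_sequons("QDNNSH", start=59))
    # The same fragment after the Asn61 -> Thr substitution of Table 2.
    print(find_sequons("QDTNSH", start=59))
```

Running the same scan on a substituted sequence gives a simple check that a proposed substitution removes the targeted site without creating a new one.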
A5. Transcription and Translation Terminators
The chimeric gene may also include, downstream of the coding sequence, the 3' untranslated region (3' UTR) from an inducible monocot gene, such as one of the rice or barley α-amylase genes mentioned above. One preferred 3' UTR is that derived from the RAmy1A gene, whose sequence is given by SEQ ID NO:6. This sequence includes non-coding sequence 5' to the polyadenylation site, the polyadenylation site, and the transcription termination sequence. The transcriptional termination region may be selected, particularly for stability of the mRNA to enhance expression. Polyadenylation tails (Alber and Kawasaki, 1982, Mol. and Appl. Genet. 1:419-434) are also commonly added to the expression cassette to optimize high levels of transcription and proper transcription termination, respectively.
Polyadenylation sequences include but are not limited to the Agrobacterium octopine synthetase signal (Gielen, et al., EMBO J. 3:835-846 (1984)) or the nopaline synthase of the same species (Depicker, et al., Mol. Appl. Genet. 1:561-573 (1982)).
Since the ultimate expression of the heterologous protein will be in a eukaryotic cell (in this case, a member of the grass family), it is desirable to determine whether any portion of the cloned gene contains sequences which will be processed out as introns by the host's splicing machinery. If so, site-directed mutagenesis of the "intron" region may be conducted to prevent losing a portion of the genetic message as a false intron code (Reed and Maniatis, Cell 41:95-105 (1985)).
Fig. 2 shows the elements of one preferred chimeric gene constructed in accordance with the invention, and intended particularly for use in protein expression in a rice cell suspension culture. The gene includes, in a 5' to 3' direction, the promoter from the RAmy3D gene, which is inducible in cell culture with sugar depletion, the 5' UTR from the RAmy1A gene, which confers enhanced stability on the gene transcript, the RAmy3D signal sequence coding region, as identified above, the coding region of a heterologous protein to be produced, and a 3' UTR region from the RAmy1A gene.
III. Plant Transformation
For transformation of plants, the chimeric gene is placed in a suitable expression vector designed for operation in plants. The vector includes suitable elements of plasmid or viral origin that provide necessary characteristics to the vector to permit the vectors to move DNA from bacteria to the desired plant host. Suitable transformation vectors are described in related application PCT WO 95/14099, published May 25, 1995, which is incorporated by reference herein. Suitable components of the expression vector, including the chimeric gene described above, are discussed below. One exemplary vector is the p3Dv1.0 vector described in Example 1.
A. Transformation Vector
Vectors containing a chimeric gene of the present invention may also include selectable markers for use in plant cells (such as the nptII kanamycin resistance gene, for selection in kanamycin-containing medium, or the phosphinothricin acetyltransferase gene, for selection in medium containing phosphinothricin (PPT)).
The vectors may also include sequences that allow their selection and propagation in a secondary host, such as sequences containing an origin of replication and a selectable marker such as antibiotic or herbicide resistance genes, e.g., HPH (Hagio et al., Plant Cell Reports 14:329 (1995); van den Elzen et al., Plant Mol. Biol. 5:299-302 (1985)). Typical secondary hosts include bacteria and yeast. In one embodiment, the secondary host is Escherichia coli, the origin of replication is a ColE1-type, and the selectable marker is a gene encoding ampicillin resistance. Such sequences are well known in the art and are commercially available as well (e.g., Clontech, Palo Alto, CA; Stratagene, La Jolla, CA).
The vectors of the present invention may also be modified to intermediate plant transformation plasmids that contain a region of homology to an Agrobacterium tumefaciens vector, a T-DNA border region from Agrobacterium tumefaciens, and chimeric genes or expression cassettes (described above). Further, the vectors of the invention may comprise a disarmed plant tumor inducing plasmid of Agrobacterium tumefaciens.
The vector described in Example 1, and having a promoter from the RAmy3D gene, is suitable for use in a method of mature protein production in cell culture, where the RAmy3D promoter is induced by sugar depletion in cell culture medium. Other promoters may be selected for other applications, as indicated above. For example, for mature protein expression in germinating seeds, the coding sequence may be placed under the control of the rice α-amylase RAmy1A promoter, which is inducible by gibberellic acid during seed germination.
B. Transformation of plant cells
Various methods for direct or vectored transformation of plant cells, e.g., plant protoplast cells, have been described, e.g., in above-cited PCT application WO 95/14099. As noted in that reference, promoters directing expression of selectable markers used for plant transformation (e.g., nptII) should operate effectively in plant hosts. One such promoter is the nos promoter from native Ti plasmids (Herrera-Estrella, et al., Nature 303:209-213 (1983)). Others include the 35S and 19S promoters of cauliflower mosaic virus (Odell, et al., Nature 313:810-812 (1985)) and the 2' promoter (Velten, et al., EMBO J. 3:2723-2730 (1984)).
In one preferred embodiment, the embryo and endosperm of mature seeds are removed to expose scutellum tissue cells. The cells may be transformed by DNA bombardment or injection, or by vectored transformation, e.g., by Agrobacterium infection after bombarding the scutellar cells with microparticles to make them susceptible to Agrobacterium infection (Bidney et al., Plant Mol. Biol. 18:301-313, 1992).
One preferred transformation follows the methods detailed generally in Sivamani, E. et al., Plant Cell Reports x:465 (1996); Zhang, S., et al., Plant Cell Reports 15:465 (1996); and Li, L., et al., Plant Cell Reports 12:250 (1993). Briefly, rice seeds are sterilized by standard methods, and callus induction from the seeds is carried out on NB media with 2,4-D. During a first incubation period, callus tissue forms around the embryo of the seed. By the end of the incubation period (e.g., 14 days at 28°C), the calli are about 0.25 to 0.5 cm in diameter. Callus mass is then detached from the seed, placed on fresh NB media, and incubated again for about 14 days at 28°C. After the second incubation period, satellite calli developed around the original "mother" callus mass.
These satellite calli were slightly smaller, more compact and defined than the original tissue. It was these calli that were transferred to fresh media. The "mother" callus was not transferred. The goal was to select only the strongest, most vigorous growing tissue for further culture.
Calli to be bombarded are selected from 14-day-old subcultures. The size, shape, color and density are all important in selecting calli in the optimal physiological condition for transformation.
The calli should be between 0.8 and 1.1 mm in diameter. The calli should appear as spherical masses with a rough exterior.
Transformation is by particle bombardment, as detailed in the references cited above. After the transformation steps, the cells are typically grown under conditions that permit expression of the selectable marker gene. In a preferred embodiment, the selectable marker gene is HPH. It is preferred to culture the transformed cells under multiple rounds of selection to produce a uniformly stable transformed cell line.
IV. Cell Culture Production of Mature Heterologous Protein
Transgenic cells, typically callus cells, are cultured under conditions that favor plant cell growth, until the cells reach a desired cell density, then under conditions that favor expression of the mature protein under the control of the given promoter. Preferred culture conditions are described below and in Example 2. Purification of the mature protein secreted into the medium is by standard techniques known by those of skill in the art.
Production of mature AAT: In a preferred embodiment, the culture medium contains a phosphate buffer, e. g., the 20 mM phosphate buffer, pH 6.8 described in Example 2, to reduce AAT degradation catalyzed by metals. Alternatively, or in addition, a metal chelating agent, such as EDTA, may be added to the medium.
Following the cell culture method described in Example 2, cell culture media was partially purified and the fraction containing AAT was analyzed by Western blot, as shown in Fig. 4. The first two lanes ("phosphate") show AAT bands both in the presence and absence of elastase ("+E"
and "-E"), where the higher molecular weight bands in the presence of elastase correspond roughly to a 58-59 kdal AAT/elastase complex. Also as seen in the figure, expression was high in the absence of sucrose, but nearly undetectable in the presence of sucrose.
To ascertain the degree of glycosylation (as determined by apparent molecular weight by SDS-PAGE), the protein produced in culture was fractionated by SDS-PAGE and immunodetected with a labeled antibody raised against the C-terminal portion of AAT, as shown in Fig. 5. Lane 4 contains human AAT, and its migration position corresponds to about 52 kdal.
In lane 3 is the plant-produced AAT, having an apparent molecular weight of about 49-50 kdal, indicating an extent of glycosylation of up to 60-80% of the glycosylation found in human AAT (non-glycosylated AAT has a molecular weight of 45 kdal).
Similar results are shown in the Western blots in Fig. 6. Lanes 1-3 in this figure correspond to decreasing amounts (15, 10, and 5 ng) of human AAT; lane 4, to 10 µl supernatant from a non-expressing plant cell line; lanes 5 and 6, to 10 µl supernatant from AAT-expressing plant cell lines 11B and 27F, respectively; and lane 7, to 10 µl supernatant from cell line 27F plus 250 ng trypsin. The upward mobility shift in lane 7 is indicative of association between trypsin and the plant-produced AAT.
The ability of plant-produced AAT to bind to elastase is demonstrated in Fig. 7, which shows the shift in molecular weight over a 30 minute binding interval for the 52 kdal human AAT (lanes 1-4) and the 49-50 kdal plant-produced AAT.
To demonstrate that the mature protein is produced in secreted form, with the desired N-terminus, a chimeric gene constructed as above, and having the coding sequence for mature α1-antitrypsin, was expressed and secreted in cell culture as described in Example 2. The isolated protein was then sequenced at its N-terminal region, yielding the N-terminal sequence shown in Fig. 8. This sequence, which is identified herein as SEQ ID NO:22, has the same N-terminal residues as native mature α1-antitrypsin.
Production of mature ATIII: In a preferred embodiment, the culture medium contains a MES buffer, pH 6.8. Western blot analysis of the ATIII protein produced, shown in lanes 4 and 6 in Fig. 9, shows a band corresponding to ATIII (lane 1) in cell lines 42 and 46, when grown in the absence (but not in the presence) of sucrose.
Production of mature BPN': In one embodiment of the invention, in which BPN' is secreted as the proBPN' form of the enzyme, the chaperon "pro" moiety of the enzyme facilitates enzyme folding and is cleaved from the enzyme, leaving the active mature form of BPN'. In another embodiment, the mature enzyme is co-expressed and co-secreted with the "pro" chaperon moiety, with conversion of the enzyme to active form occurring in the presence of the free chaperon (Eder et al., Biochem. (1993) 32:18-26; Eder et al., (1993) J. Mol. Biol. 223:293-304).
In yet another embodiment of the invention, the BPN' is secreted in inactive form at a pH
that may be in the 6-8 range, with subsequent activation of the inactive form, e.g., after enzyme isolation, by exposure to the "pro" chaperon moiety, e. g., immobilized to a solid support.
In both of these embodiments, the culture medium is maintained at a pH of between 5 and 6, preferably about 5.5 during the period of active expression and secretion of BPN', to keep the BPN', which is normally active at alkaline pH, at a pH below optimal activity.
Codon optimization to the host plant's most frequent codons yielded a severalfold enhancement in the level of expressed heterologous protein in cell culture, as shown in Fig. 11. The extent of enhancement is seen from the Western blot analysis shown in Fig. 10 for two cell lines and is further substantiated in Fig. 11. Lane 2 (second from left) in Fig. 10 shows a Western blot of BPN' obtained in culture from cells transformed with a native proBPN' coding sequence. Two bands are observed. The lower band corresponds to a protein whose molecular weight of approximately 35 kdal matches that of proBPN'. The upper band corresponds to a somewhat higher molecular weight species, possibly glycosylated.
The first lane in the figure shows BPN' polypeptides produced in culture by plant cells transformed with the codon-optimized proBPN' sequence identified by SEQ ID NO:21. For comparative purposes, the same volume of culture medium, adjusted for cell density, was applied in both lanes 1 and 2. As seen, the amount of BPN' enzyme produced with a codon-optimized sequence was severalfold higher than for subtilisin BPN' produced with the native coding sequence.
Further, a dark band or bands corresponding to mature peptide (molecular weight 27.5 kdal) was observed. However, it should be noted that directly above the band at 35 kdal is a more pronounced band which may be pro-mature product yet to be cleaved into active form.
Fig. 11 compares the specific activity of BPN' codon-optimized (AP106) versus BPN' native (AP101) expression in rice callus cell culture, assayed using the chromogenic peptide substrate suc-Ala-Ala-Pro-Phe-pNA as described by DelMar, E.G. et al. (1979; Anal. Biochem. 99:316-320). As shown in Fig. 11, several of the cell lines transformed with codon-optimized chimeric genes produced levels of BPN', as evidenced by measured specific activity in culture medium, that were 2-5 times the highest levels observed for plant cells transformed with the native proBPN' sequence.
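By way of illustration only, the general idea of recoding a sequence to a host's most frequent codons may be sketched as follows. This is a minimal sketch and not the procedure used to obtain SEQ ID NO:21. The function names and the toy host and native sequences are hypothetical, and the preferred codons are simply counted from whatever host coding sequences are supplied, rather than taken from any published rice codon usage table.

```python
from collections import Counter, defaultdict

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(seq):
    """Split a coding sequence into codons (length must be a multiple of 3)."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def preferred_codons(host_cds):
    """For each amino acid, pick the codon seen most often in the host set."""
    counts = defaultdict(Counter)
    for cds in host_cds:
        for codon in codons(cds):
            counts[CODON_TABLE[codon]][codon] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


def codon_optimize(seq, preferred):
    """Recode each codon to the host-preferred synonym; keep it if none is known."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons(seq))


# Hypothetical toy data, for illustration only (not rice or BPN' sequences).
host = ["ATGGCCAAGCTCGCCAAGCTCTGA"]
native = "ATGGCTAAATTATAA"
optimized = codon_optimize(native, preferred_codons(host))
print(optimized, translate(optimized))
```

The recoded sequence encodes the same polypeptide as the native sequence. Only the synonymous codon choice changes.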
In accordance with another aspect of the invention, it has been found that the transformed plant cell culture is able to express and secrete BPN' at a cell culture pH, pH 5.5, which largely inhibits self-degradation of mature, active BPN'. To assay for optimal pH conditions, the assay disclosed in DelMar, et al. (supra) is used to test the media derived from BPN' transformed cell lines under various pH conditions. Transformed rice callus cells are cultured in a MES medium under conditions similar to those disclosed in Example 2, but where the pH of the medium is maintained at a selected pH between 5 and 8.0. At each pH, the total amount of expressed and secreted BPN' is determined by Western blot analysis. BPN' activity can be tested in the assay described by DelMar (supra).
V. Production of Mature Heterologous Protein in Germinating Seeds
In this embodiment, monocot cells transformed as above are used to regenerate plants, seeds from the plants are harvested and then germinated, and the mature protein is isolated from the germinated seeds.
Plant regeneration from cultured protoplasts or callus tissue is carried out by standard methods, e.g., as described in Evans et al., HANDBOOK OF PLANT CELL CULTURES Vol. 1 (MacMillan Publishing Co., New York, 1983); and Vasil I.R. (ed.), CELL CULTURE AND SOMATIC CELL GENETICS OF PLANTS, Acad. Press, Orlando, Vol. I, 1984, and Vol. III, 1986, and as described in the above-cited PCT application.
A. Seed Germination Conditions
The transgenic seeds obtained from the regenerated plants are harvested, and prepared for germination by an initial steeping step, in which the seeds are immersed in or sprayed with water to increase the moisture content of the seed to between 35-45%. This initiates germination. Steeping typically takes place in a steep tank which is typically fitted with a conical end to allow the seed to flow freely out. The addition of compressed air to oxygenate the steeping process is an option.
The temperature is controlled at approximately 22°C depending on the seed.
After steeping, the seeds are transferred to a germination compartment which contains air saturated with water and is under controlled temperature and air flows. The typical temperatures are between 12-25°C and germination is permitted to continue for from 3 to 7 days.
Where the heterologous protein coding gene is operably linked to an inducible promoter requiring a metabolite such as sugar or plant hormone, e.g., 2 to 100 µM gibberellic acid, this metabolite is added, removed or depleted from the steeping water medium and/or is added to the water saturated air used during germination. The seed absorbs the aqueous medium and begins to germinate, expressing the heterologous protein. The medium may then be withdrawn and the malting begun, by maintaining the seeds in a moist temperature controlled aerated environment. In this way, the seeds may begin growth prior to expression, so that the expressed product is less likely to be partially degraded or denatured during the process.
More specifically, the temperature during the imbibition or steeping phase will be maintained in the range of about 15-25°C, while the temperature during the germination will usually be about 20°C. The time for the imbibition will usually be from about 1 to 4 days, while the germination time will usually be an additional 1 to 10 days, more usually 3 to 7 days. Usually, the time for the malting does not exceed about ten days. The period for the malting can be reduced by using plant hormones during the imbibition, particularly gibberellic acid.
To achieve maximum production of recombinant protein from malting, the malting procedure may be modified to accommodate de-hulled and de-embryonated seeds, as described in above-cited PCT application WO 95/14099. In the absence of sugars from the endosperm, there is expected to be a 5 to 10 fold increase in RAmy3D promoter activity and thus expression of heterologous protein. Alternatively, when embryoless half seeds are incubated in 10 mM CaCl2 and 5 µM gibberellic acid, there is a 50 fold increase in RAmy1A promoter activity.
Production of mature HSA: Following the germination conditions as outlined above and further detailed in Example 3, supernatant was analyzed by Western blot.
Western blot analysis shows production of HSA in germinating rice seeds, with seed samples taken 24, 72, and 120 hours after induction with gibberellin. HSA production was highest approximately 24 hours post-induction (lanes 3 and 4, Fig. 12). Bilirubin binding, a measure of correct folding of plant-produced HSA, is assayed according to the method presented in Example 3.
VI. Production of Mature Heterologous Protein in Maturing Seeds
In this embodiment, monocot cells transformed as above are used to regenerate plants, and seeds from the plants are allowed to mature, typically in the field, with consequent production of heterologous protein in the seeds.
Following seed maturation, the seeds and their heterologous proteins may be used directly, that is, without protein isolation, where for example, the heterologous protein is intended to confer a benefit on the seed as a whole, for example, to enrich the seed in the selected protein.
Alternatively, the seeds may be fractionated by standard methods to obtain the heterologous protein in enriched or purified form. In one general approach, the seed is first milled, then suspended in a suitable extraction medium, e.g., an aqueous or an organic solvent, to extract the protein or metabolite of interest. If desired, the heterologous protein can be further fractionated and purified, using standard purification methods.
The following examples are provided by way of illustration only and not by way of limitation. Those of skill will readily recognize a variety of noncritical parameters which could be changed or modified to yield essentially similar results.
General Methods
Generally, the nomenclature and laboratory procedures with respect to standard recombinant DNA technology can be found in Sambrook, et al., MOLECULAR CLONING - A LABORATORY MANUAL, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 1989 and in S.B. Gelvin and R.A. Schilperoort, PLANT MOLECULAR BIOLOGY, 1988. Other general references are provided throughout this document. The procedures therein are known in the art and are provided for the convenience of the reader.
Example 1
Construction of a Transforming Vector Containing a Codon-Optimized α1-antitrypsin Sequence
A. Hygromycin Resistance Gene Insertion:
The 3 kb BamHI fragment containing the 35S promoter-Hph-NOS was removed from the plasmid pMON410 (Monsanto, St. Louis, MO) and placed into a site-directed mutagenized BglII site in pUC18 at 1463 to form the plasmid pUCH18+.
B. Terminator Insertion:
pOSg1ABKS is a 5 kb BamHI-KpnI fragment from lambda clone λOSg1A (Huang, N., et al., (1990) Nuc. Acids Res. 18:7007) cloned into pBluescript KS- (Stratagene, San Diego, CA).
Plasmid pOSg1ABKS was digested with MspI and blunted with T4 DNA polymerase followed by SpeI digestion. The 350 bp terminator fragment was subcloned into pUC19 (New England BioLabs, Beverly, MA), which had been digested with BamHI, blunted with T4 DNA polymerase and digested with XbaI, to form pUC19/terminator.
C. RAmy 3D Promoter Insertion:
A 1.1 kb NheI-PstI fragment derived from p1AS1.5 (Huang, N. et al. (1993) Plant Mol. Biol. 23:737-747) was cloned into the vector pGEM5zf [multiple cloning site (MCS) (Promega, Madison, WI): ApaI, AatII, SphI, NcoI, SstII, EcoRV, SpeI, NotI, PstI, SalI, NdeI, SacI, MluI, NsiI] at the SpeI and PstI sites to form pGEM5zf(3D/NheI-PstI). pGEM5zf(3D/NheI-PstI) was then digested with PstI and SacI, and two non-kinased 30mers having the complementary sequences 5' GCTTG ACCTG TAACT CGGGC CAGGC GAGCT 3' (SEQ ID NO:23) and 5' CGCCT AGCCC GAGTT ACAGG TCAAG CAGCT 3' (SEQ ID NO:24) were ligated in to form p3DProSig. The promoter fragment prepared by digesting p3DProSig with NcoI, blunting with T4 DNA polymerase, and digesting with SstI was subcloned into pUC19/terminator which had been digested with EcoRI, blunted with T4 DNA polymerase and digested with SstI, to form p3DProSigEND.
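As an illustrative check only, the 30mer of SEQ ID NO:23 may be translated to confirm that it encodes the C-terminal residues of the 3D signal peptide of SEQ ID NO:1 (Leu Thr Cys Asn Ser Gly Gln Ala). The sketch below assumes a reading frame beginning at the third nucleotide of the oligonucleotide. That frame is inferred here from the translation and is not stated in the text, and the function name is hypothetical.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate complete codons starting at `offset`; trailing bases are ignored."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - (len(dna) - offset) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, usable, 3))


SEQ_ID_23 = "GCTTG ACCTG TAACT CGGGC CAGGC GAGCT"  # as given in the text
SIGNAL_PEPTIDE_END = "LTCNSGQA"  # last eight residues of SEQ ID NO:1

# offset=2 is an assumption about the reading frame (see lead-in).
peptide = translate(SEQ_ID_23, offset=2)
print(peptide, peptide.startswith(SIGNAL_PEPTIDE_END))
```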
D. Multiple Cloning Site Insertion:
p3DProSigEND was digested with SstI and SmaI followed by the ligation of a new synthetic linker fragment constructed with the non-kinased complementary oligonucleotides 5' AGCTC
CATGG CCGTG GCTCG AGTCT AGACG CGTCC CC 3' (SEQ ID NO:25) and 5' GGGGA
CGCGT CTAGA CTCGA GCCAC GGCCA TGG 3' (SEQ ID NO:26) to form p3DProSigENDlink.
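The complementarity of the two linker oligonucleotides can be verified computationally, as in the following minimal sketch. The oligonucleotide sequences are those given above. The recognition sequences listed for NcoI, XhoI, XbaI and MluI are the standard ones for those enzymes and are supplied here as an assumption about which sites the linker was intended to carry, since the text does not enumerate them.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(oligo):
    """Remove the spacing used in the text and normalize case."""
    return oligo.replace(" ", "").upper()


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


SEQ_ID_25 = clean("AGCTC CATGG CCGTG GCTCG AGTCT AGACG CGTCC CC")
SEQ_ID_26 = clean("GGGGA CGCGT CTAGA CTCGA GCCAC GGCCA TGG")

# Region of SEQ ID NO:25 that base-pairs with SEQ ID NO:26, and the unpaired 5' end.
paired = reverse_complement(SEQ_ID_26)
is_complementary = SEQ_ID_25.endswith(paired)
overhang = SEQ_ID_25[: len(SEQ_ID_25) - len(paired)]

# Standard recognition sequences (assumed set of sites; not listed in the text).
SITES = {"NcoI": "CCATGG", "XhoI": "CTCGAG", "XbaI": "TCTAGA", "MluI": "ACGCGT"}
found = {name: SEQ_ID_25.find(site) for name, site in SITES.items()}

print(is_complementary, overhang, found)
```

Under these assumptions, SEQ ID NO:26 pairs with the 3' 33 nucleotides of SEQ ID NO:25, leaving a four-nucleotide 5' extension on SEQ ID NO:25 and a flush end at the other side of the duplex.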
E. p3DProSigENDlink Flanking Site Modification:
p3DProSigENDlink was digested with SalI and blunted with T4 DNA polymerase followed by EcoRV digestion. The blunt fragment was then inserted into pBluescript KS+ (Stratagene) in the EcoRV site so that the HindIII site is proximal to the promoter and the EcoRI site is proximal to the terminator sequence. The HindIII-EcoRI fragment was then moved into the polylinker of pUCH18+ to form the p3Dv1.0 expression vector.
F. RAmy1A Promoter Insertion:
A 1.9 kb NheI-PstI fragment derived from subclone pOSG2CA2.3 from lambda clone λOSg2 (Huang et al. (1990) Plant Mol. Biol. 14:655-668) was cloned into the vector pGEM5zf at the SpeI and PstI sites to form pGEM5zf(1A/NheI-PstI). pGEM5zf(1A/NheI-PstI) was digested with PstI and SacI, and two non-kinased 35mers and four kinased 32mers were ligated in, with the complementary sequences as follows: 5' GCATG CAGGT GCTGA ACACC ATGGT GAACA AACAC 3' (SEQ ID NO:27); 5' TTCTT GTCCC TTTCG GTCCT CATCG TCCTC CT 3' (SEQ ID NO:28); 5' TGGCC TCTCC TCCAA CTTGA CAGCC GGGAG CT 3' (SEQ ID NO:29); 5' TTCAC CATGG TGTTC AGCAC CTGCA TGCTG CA 3' (SEQ ID NO:30); 5' CGATG AGGAC CGAAA GGGAC AAGAA GTGTT TG 3' (SEQ ID NO:31); 5' CCCGG CTGTC AAGTT GGAGG AGAGG CCAAG GAGGA 3' (SEQ ID NO:32) to form p1AProSig. The HindIII-SacI 0.8 kb promoter fragment was subcloned from p1AProSig into the p3Dv1.0 vector digested with HindIII-SacI to yield the p1Av1.0 expression vector.
G. Construction of p3D-AAT Plasmid
Two PCR primers were used to amplify a fragment encoding AAT according to the sequence disclosed as Genbank Accession No. K01396: N-terminal primer 5' GAGGA TCCCC AGGGA GATGC TGCCC AGAR 3' (SEQ ID NO:33) and C-terminal primer 5' CGCGC TCGAG TTATT TTTGG GTGGG ATTCA CCAC 3' (SEQ ID NO:34). The N-terminal primer amplifies to a blunt site for in-frame insertion with the end of the p3D signal peptide, and the C-terminal primer contains an XhoI site for cloning the fragment into the vector as shown in Figs. 3A and 3B.
Alternatively, the sequence encoding mature AAT (SEQ ID NO:8) or codon-optimized AAT may be chemically synthesized using techniques known in the art, incorporating an XhoI restriction site 3' of the termination codon for insertion into the expression vector as described above.
Example 2
Production of mature α1-antitrypsin in cell culture
After selection of transgenic callus, callus cells were suspended in liquid culture containing AA2 media (Thompson, J.A., et al., Plant Science 47:123 (1986)), at 3% sucrose, pH 5.8.
Thereafter, the cells were shifted to phosphate-buffered media (20 mM phosphate buffer, pH 6.8) using 10 mL multi-well tissue culture plates and shaken at 120 rpm in the dark for 48 hours. The supernatant was then removed and stored at -80°C prior to Western blot analysis.
Supernatants were concentrated using Centricon-10 filters (Amicon cat. #4207) and washed with induction media to remove substances interfering with electrophoretic migration. Samples were concentrated approximately 10 fold, and mature AAT was purified by SDS-PAGE electrophoresis. The purified protein was extracted from the electrophoresis medium, and sequenced at its N-terminus, giving the sequence shown in Fig. 8, identified herein as SEQ ID NO:22.
Example 3
HSA Induction in Germinating Seeds
After selection of transgenic plants which tested positive for the presence of a codon-optimized HSA gene driven by the GA3-responsive RAmy1A promoter, seeds were harvested and imbibed for 24 hours with 100 rpm orbital shaking in the dark at 25°C. GA3 was added to a final concentration of 5 µM and incubated for an additional 24-120 hours. Total soluble protein was isolated by double grinding each seed in 120 µl grinding buffer and centrifuging at 23,000 x g for 1 minute at 4°C. The clear supernatant was carefully removed from the pellet and transferred to a fresh tube.
Bilirubin binding assay
Bilirubin binding to its high-affinity site on mature HSA is assayed using the method described by Jacobsen, J. et al. (1974; Clin. Chem. 20:783) and Reed, R.G. et al. (1975; Biochemistry 14:4578-4583). Briefly, the concentration of free bilirubin in equilibrium with protein-bound bilirubin is determined by the rate of peroxide-peroxidase catalyzed oxidation of free bilirubin. Stock solutions of bilirubin (Nutritional Biochemicals Corp.) are prepared fresh daily in 5 mM NaOH containing 1 mM EDTA and the concentration determined using a molar absorptivity of 47,500 M⁻¹ cm⁻¹ at 440 nm. An aliquot containing between 5 and 30 nmol bilirubin is added to a 1 cm cuvette containing 1 ml PBS and approximately 30 nmol HSA at 37°C. An absorbance spectrum between 500 and 350 nm is recorded. Aliquots of horseradish peroxidase (Sigma), 0.05 mg/ml in PBS, and 0.05% ethyl hydrogen peroxide (Ferrosan; Malmo, Sweden) are added and the change in absorbance at λmax is recorded for 3-5 minutes. The concentrations of free and bound bilirubin calculated from the oxidation rate observed using varying concentrations of total bilirubin are used to construct a Scatchard plot from which the association constant for a single binding site is determined.
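The two calculations in this assay, the stock concentration from absorbance and the association constant from a Scatchard plot, may be sketched as follows. This is an illustrative sketch only. The function names are hypothetical, the Scatchard relation is used in the form ν/[free] = K(n − ν) with ν the moles of bilirubin bound per mole of HSA, and the data points are synthetic values generated from an assumed association constant, not measurements.

```python
EPSILON_440 = 47_500.0  # molar absorptivity at 440 nm, M^-1 cm^-1 (from the text)


def bilirubin_molar(absorbance_440, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = A / (epsilon * path length)."""
    return absorbance_440 / (EPSILON_440 * path_cm)


def scatchard_fit(bound_molar, free_molar, protein_molar):
    """Least-squares line through (nu, nu/free); returns (K in M^-1, site count n)."""
    nu = [b / protein_molar for b in bound_molar]
    y = [v / f for v, f in zip(nu, free_molar)]
    count = len(nu)
    mean_x = sum(nu) / count
    mean_y = sum(y) / count
    sxx = sum((x - mean_x) ** 2 for x in nu)
    sxy = sum((x - mean_x) * (yy - mean_y) for x, yy in zip(nu, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    association_constant = -slope            # slope of a Scatchard plot is -K
    n_sites = intercept / association_constant  # intercept is n * K
    return association_constant, n_sites


# Synthetic illustration: K_TRUE is a hypothetical value, not a result.
K_TRUE = 1.0e8     # M^-1, assumed for generating the example points
PROTEIN = 30e-6    # 30 nmol HSA in 1 ml, as in the text
nu_values = [0.2, 0.4, 0.6, 0.8]
bound = [v * PROTEIN for v in nu_values]
free = [v / (K_TRUE * (1.0 - v)) for v in nu_values]  # single-site binding

k_fit, n_fit = scatchard_fit(bound, free, PROTEIN)
stock = bilirubin_molar(0.475)  # hypothetical absorbance reading
print(k_fit, n_fit, stock)
```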
Although the invention has been described with reference to particular embodiments, it will be appreciated that a variety of changes and modifications can be made without departing from the invention.
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: Applied Phytologics, Inc.
(ii) TITLE OF THE INVENTION: Production of Mature Proteins in Plants
(iii) NUMBER OF SEQUENCES: 34
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Dehlinger & Associates
(B) STREET: P.O. Box 60850
(C) CITY: Palo Alto
(D) STATE: CA
(E) COUNTRY: USA
(F) ZIP: 94306
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: PCT/US98/03068
(B) FILING DATE: 13-FEB-1998
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 60/038,169
(B) FILING DATE: 13-FEB-1997
(A) APPLICATION NUMBER: 60/037,991
(B) FILING DATE: 13-FEB-1997
(A) APPLICATION NUMBER: 60/038,170
(B) FILING DATE: 13-FEB-1997
(A) APPLICATION NUMBER: 60/038,168
(B) FILING DATE: 13-FEB-1997
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Petithory, Joanne R
(B) REGISTRATION NUMBER: P42,995
(C) REFERENCE/DOCKET NUMBER: 0665-0007.41
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 650-324-0880
(B) TELEFAX: 650-324-0960
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
(B) CLONE: 3D signal peptide sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Met Lys Asn Thr Ser Ser Leu Cys Leu Leu Leu Leu Val Val Leu Cys Ser Leu Thr Cys Asn Ser Gly Gln Ala
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 75 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: native 3D signal peptide DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 75 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: codon-optimized 3D signal peptide DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
(B) CLONE: RAmy1A signal peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Val Asn Lys His Phe Leu Ser Leu Ser Val Leu Ile Val Leu Leu Gly Leu Ser Ser Asn Leu Thr Ala Gly
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: RAmy1A 5' untranslated region (UTR)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 321 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: RAmy1A 3' untranslated region (UTR)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CTACGAAAAT TTGATGCGTA G 321
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 394 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vii) IMMEDIATE SOURCE:
(B) CLONE: mature AAT amino acid sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Glu AspProGln GlyAsp AlaAlaGln LysThr AspThrSer HisHis Asp GlnAspHis ProThr PheAsnLys IleThr ProAsnLeu AlaGlu Phe AlaPheSex LeuTyr ArgGlnLeu AlaHis GlnSerAsn SerThr Asn IlePhePhe SerPro ValSerIle AlaThr AlaPileAla MetLeu Ser LeuGlyThr LysAla AspThrHis AspGlu IleLeuGlu GlyLeu Asn PheAsnLeu ThrGlu IleProGlu AlaGln IleHisGlu GlyPhe Gln GluLeuLeu ArgThr LeuAsnGln ProAsp SerGlnLeu GlnLeu Thr ThrGlyAsn GlyLeu PheLeuSer.GluGly LeuLysLeu ValAsp 11.5 12 12 Lys PheLeuGlu AspVal LysLysLeu TyrHis 5erGluAla PheThr Val AsnPheGly AspThr GluGluAla LysLys GlnIleAsn AspTyr Val GluLysGly ThrGln G1yLysIle ValAsp LeuValLys GluLeu Asp ArgAspThr ValPhe AlaLeuVal AsnTyr IlePhePhe LysGly Lys TrpGluArg ProPhe GluValLys AspThr GluGluGlu AspPhe His ValAspGln ValThr ThrValLys ValPro MetMetLys ArgLeu Gly MetPheAsn IleGln HisCysLys LysLeu SerSerTrp ValLeu Leu MetLysTyr LeuGly AsnAlaThr AlaIle PhePheLeu ProAsp SU9ST1TUTE SHEET (RULE 26j WO 98f36085 PCT/US98/U3068 G1u Gly Lys Leu Gln His Leu Glu Asn Glu Leu Thr His Asp Ile Ile 260 265 - .. 270 Thr Lys Phe Leu Glu Asn Glu Asp Arg Arg Ser Ala Ser Leu His Leu 275 280 285 _ Pro Lys Leu Ser Ile Thr Gly Thr Tyr Asp Leu Lys Ser Val Leu Gly Gln Leu Gly Ile Thr Lys Val Phe Ser Asn Gly Ala Asp Leu Ser Gly Val Thr Glu Glu Ala Pro Leu Lys Leu Ser Lys Ala Val His Lys Ala Val Leu Thr Ile Asp Glu Lys Gly Thr Glu Ala Ala Gly Ala Met Phe Leu Glu Ala Ile Pro Met Ser Ile Pro Pro Glu Val Lys Phe Asn Lys Pro Phe VaI Phe Leu Met Ile Glu Gln Asn Thr Lys Ser Pro Leu Phe Met Gly Lys Val Val Asn Pro Thr Gln Lys (2) INFORMATION FOR SEQ ID N0:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1185 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE. native coding sequence of mature AAT
(xi) SEQUENCE DESCRIPTION: SEQ ID~N0:8:
TCAGGATCAC
ATACCGCCAG
CGCTACAGCC
GGAGGGCCTG
GGAACTCCTC
CCTGTTCCTC
GTACCACTCA
CAACGATTAC
CAGAGACACA
CTTTGAAGTC
GGTGCCTATG
CTGGGTGCTG
GGGGAAACTA
AAATGAAGAC
TGATCTGAAG
CCTCTCCGGG
GCTGACCATC
CATGTCTATC
AAATACCAAG
(2) INFORMATION FOR SEQ ID N0:9:
(i) SEQUENCE CHARACTERISTTCS:
(A) LENGTH: 432 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE:
(B) CLONE: mature ATIII as sequence (xi) SEQUENCE DESCRIPTION: SEQ ID N0:9:
His Gly Sex Pro Va1 Asp Ile Cys Thr Ala Lys Pro Arg Asp Ile Pro SUBSTITUTE SHEET (RULE 26) WO 98/36085 PC'd'/CTS98/03068 Met AsnPro MetCys TleTyrArg SerPro GluLysLys AlaThr Glu 20 25 30 _ Asp GluGly SerGlu GlnLysIle ProGlu AlaThrAsn ArgArg Val Trp GluLeu SerLys AlaAsnSer ArgPhe AlaThrThr PheTyr Gln His LeuAla AspSer LysAsnAsp AsnAsp AsnIlePhe LeuSer Pro 65 70 75 g0 Leu SerIle SerThr AlaPheAla MetThr LysLeuGly AlaCys Asn 85 9p g5 Asp ThrLeu GlnGln LeuMetGlu ValPhe LysPheAsp ThrIle Ser Glu LysThr SerAsp GlnIleHis PhePhe PheAlaLys LeuAsn Cys Arg LeuTyr ArgLys AlaAsnLys SerSer LysLeuVal SerA1a Asn Arg LeuPhe GlyAsp LysSerLeu ThrPhe AsnGluThr TyrGln Asp Ile SerGlu LeuVal TyrGlyAla LysLeu GlnProLeu AspPhe Lys Glu AsnAla GluGln SerArgAla AlaIle AsnLysTrp ValSer Asn Lys ThrGlu GlyArg IleThrAsp ValIle ProSerGlu AlaIle Asn Glu LeuThr ValLeu ValLeuVal AsnThr IleTyrPhe LysGly Leu Trp LysSer LysPhe SerProGlu AsnThr ArgLysGlu LeuPhe Tyr _ Lys AlaAsp Gly'GluSerCysSer AlaSer MetMetTyr GlnGlu Gly Lys PheArg T~rArg ArgValAla GluGly ThrGlnVal LeuGlu Leu Pro PheLys GlyAsp AspIleThr MetVal LeuIleLeu ProLys Pro 275 280 2g5 Glu LysSer LeuAla LysValGlu LysGlu LeuThrPro GluVal Leu Gln GluTrp LeuAsp GluLeuGlu GluMet MetLeuVal ValHis Met Pro ArgPhe ArgIle GluAspG1y PheSer LeuLysG1u GlnLeu Gln Asp MetGly LeuVal AspLeuPhe SerPro G1uLysSer LysLeu Pro Gly IleVal AlaGlu GlyArgAsp AspLeu TyrValSer AspA1a Phe His LysAla PheLeu GluValAsn GluGlu GlySerGlu AlaAla Ala Ser ThrAla ValVal TleAlaGly ArgSer LeuAsnPro AsnArg Val 3B5 390 . 395 400 Thr PheLys AlaAsn ArgProPhe LeuVal PheIleArg GluVal Pro Leu AsriThr IleIle PheMetGly ArgVal AlaAsnPro CysVal Lys (2) INFORMATION FORSEQ ID N0:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1299 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: native ATIII DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CACGGAAGCC CTGTGGACAT CTGCACAGCC AAGCCGCGGG ACATTCCCAT GAATCCCATG 60
AATCACCGAT
GGAACTCACC
CAACAGGGTG
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 585 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vii) IMMEDIATE SOURCE:
(B) CLONE: mature HSA amino acid sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
Asp Ala HisLys SerGlu ValAlaHis ArgPheLys AspLeu GlyGlu Glu Asn PheLys AlaLeu ValLeuIle AlaPheAla GlnTyr LeuGln Gln Cys ProPhe GluAsp HisValLys LeuValAsn GluVal ThrGlu Phe Ala LysThr CysVal AlaAspGlu SerAlaGlu AsnCys AspLys Ser Leu HisThr LeuPhe GlyAspLys LeuCysThr ValAla ThrLeu Arg Glu ThrTyr GlyGlu MetAlaAsp CysCysAla LysGln GluPro Glu Arg AsnGlu CysPhe LeuGlnHis LysAspAsp AsnPro AsnLeu 100 lOS' I10 Pro Arg LeuVal ArgPro GluValAsp ValMetCys ThrAla PheHis Asp Asn GluGlu ThrPhe LeuLysLys TyrLeuTyr GluIle AlaArg Arg His ProTyr PheTyr AlaProGlu LeuLeuPhe PheAla LysArg Tyr Lys AlaAla PheThr GluCysCys GlnAlaAla AspLys AlaAla Cys Leu LeuPro LysLeu AspGluLeu ArgAspGlu GlyLys AlaSer Ser Ala LysGln ArgLeu LysCysAla SerLeuGln LysPhe GlyGlu Arg Ala PheLys AlaTrp AlaValAla ArgLeuSer GlnArg PhePro Lys Ala GluPhe AlaGlu ValSerLys LeuValThr AspLeu ThrLys Val His ThrGlu CysCys HisGlyAsp LeuLeuGlu CysAla AspAsp SUBSTfTUTE SHEET (RULE 26) Arg AlaAspLeu AlaLys TyrIle CysGluAsn Gln-Asp SerIleSer Ser LysLeuLys GluCys CysGlu LysProLeu LeuGlu Lys-SerHis Cys IleAlaGlu ValGlu AsnAsp GluMetPro AlaAsp LeuProSer Leu AlaAlaAsp PheVal GluSer LysAspVal CysLys AsnTyrAla 305 310 315 - .- 320 Glu AlaLysAsp ValPhe LeuGly MetPheLeu TyrGlu TyrAlaArg Arg HisProAsp TyrSer ValVal LeuLeuLeu ArgLeu AlaLysThr Tyr GluThrThr LeuGlu LysCys CysAlaAla AlaAsp Pro-HisGlu Cys TyrAlaLys ValPhe AspGlu PheLysPro LeuVal GluGluPro Gln AsnLeuIle LysGln AsnCys G1uLeuPhe LysGln LeuGlyGlu Tyr LysPheGln AsnAla LeuLeu ValArgTyr ThrLys LysValPro G1n ValSerThr ProThr LeuVal GluValSer ArgAsn LeuGlyLys Val GlySerLys CysCys LysHis ProGluAla LysArg MetProCys Ala GluAspTyr LeuSer ValVal LeuAsnGln LeuCys ValLeuHis Glu LysThrPro ValSer AspArg ValThrLys CysCys ThrGluSer Leu -ValAsnArg ArgPro CysPhe SerAlaLeu GluVal AspGluThr Tyr ValProLys GluPhe AsnAla GluThrPhe ThrPhe HisAlaAsp 500 505 510.
Ile CysThrLeu SerGlu LysG1u ArgGlnIle LysLys GlnThrAla Leu ValGluLeu ValLys HisLys ProLysAla ThrLys GluGlnLeu Lys AlaValMet AspAsp PheAla AlaPheVal GluLys CysCysLys Ala AspAspLys GluThr CysPhe AlaGluGlu GlyLys LysLeuVal Ala AlaSerGln AlaAla LeuGly Leu 580 585 - . -(2) INFORMATION FOR SEQ ID N0:12: -(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1865 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: native coding sequence of mature HSA
(xi) SEQUENCE DESCRIPTION: SEQ ID N0:12:
AATGAAGTAA
CTGAATTTGC
AAAAACATGT
GTAGCTGATG
TGCCAGTCTC CAAAAATTTG GAGAAAGAGC TTTCAAAGCA TGGGCAGTGG CTCGCCTGAG 660
AAAACCTCTG TTGGAAAAAT CCCACTGCAT TGCCGAAGTG GAAAATGATG AGATGCCTGC 900
TCCAACTCTT GTAGAGGTCT CAAGAAACCT AGGAAAAGTG GGCAGCAAAT GTTGTAAACA 1320
AACAC
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 352 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vii) IMMEDIATE SOURCE:
(B) CLONE: native proBPN' amino acid sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
Ala Gly Lys Ser Asn Gly Glu Lys Lys Tyr Ile Val Gly Phe Lys Gln
Thr Met Ser Thr Met Ser Ala Ala Lys Lys Lys Asp Val Ile Ser Glu
Lys Gly Gly Lys Val Gln Lys Gln Phe Lys Tyr Val Asp Ala Ala Ser
Ala Thr Leu Asn Glu Lys Ala Val Lys Glu Leu Lys Lys Asp Pro Ser
Val Ala Tyr Val Glu Glu Asp His Val Ala His Ala Tyr Ala Gln Ser
Val Pro Tyr Gly Val Ser Gln Ile Lys Ala Pro Ala Leu His Ser Gln
Gly Tyr Thr Gly Ser Asn Val Lys Val Ala Val Ile Asp Ser Gly Ile
Asp Ser Ser His Pro Asp Leu Lys Val Ala Gly Gly Ala Ser Met Val
Pro Ser Glu Thr Asn Pro Phe Gln Asp Asn Asn Ser His Gly Thr His
Val Ala Gly Thr Val Ala Ala Leu Asn Asn Ser Ile Gly Val Leu Gly
Val Ala Pro Ser Ala Ser Leu Tyr Ala Val Lys Val Leu Gly Ala Asp
Gly Ser Gly Gln Tyr Ser Trp Ile Ile Asn Gly Ile Glu Trp Ala Ile
Ala Asn Asn Met Asp Val Ile Asn Met Ser Leu Gly Gly Pro Ser Gly
Ser Ala Ala Leu Lys Ala Ala Val Asp Lys Ala Val Ala Ser Gly Val
Val Val Val Ala Ala Ala Gly Asn Glu Gly Thr Ser Gly Ser Ser Ser
Thr Val Gly Tyr Pro Gly Lys Tyr Pro Ser Val Ile Ala Val Gly Ala
Val Asp Ser Ser Asn Gln Arg Ala Ser Phe Ser Ser Val Gly Pro Glu
Leu Asp Val Met Ala Pro Gly Val Ser Ile Gln Ser Thr Leu Pro Gly
Asn Lys Tyr Gly Ala Tyr Asn Gly Thr Ser Met Ala Ser Pro His Val
Ala Gly Ala Ala Ala Leu Ile Leu Ser Lys His Pro Asn Trp Thr Asn
Thr Gln Val Arg Ser Ser Leu Glu Asn Thr Thr Thr Lys Leu Gly Asp
Ser Phe Tyr Tyr Gly Lys Gly Leu Ile Asn Val Gln Ala Ala Ala Gln
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1056 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: native proBPN' coding sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
AATGAGCACG
GCAAAAGCAA
AGAATTGAAA
CGCGCAGTCC
CTACACTGGA
TGATTTAAAG
CAACAACTCT
TGTATTAGGC
TTCCGGCCAA
CGTTATTAAC
TAAAGCCGTT
CAGCTCAAGC
TGACAGCAGC
ACCTGGCGTA
GTCAATGGCA
CTGGACAAAC
TTTC
TACTAT
GGAAAAGGGC TGATCAACGT ACAGGCGGCA GCTCAG 1056
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 77 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
(B) CLONE: subtilisin BPN' pro-peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
Ala Gly Lys Ser Asn Gly Glu Lys Lys Tyr Ile Val Gly Phe Lys Gln
Thr Met Ser Thr Met Ser Ala Ala Lys Lys Lys Asp Val Ile Ser Glu
Lys Gly Gly Lys Val Gln Lys Gln Phe Lys Tyr Val Asp Ala Ala Ser
Ala Thr Leu Asn Glu Lys Ala Val Lys Glu Leu Lys Lys Asp Pro Ser
Val Ala Tyr Val Glu Glu Asp His Val Ala His Ala Tyr
(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 275 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vii) IMMEDIATE SOURCE:
(B) CLONE: native mature BPN' amino acid sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
Ala Gln Ser Val Pro Tyr Gly Val Ser Gln Ile Lys Ala Pro Ala Leu
His Ser Gln Gly Tyr Thr Gly Ser Asn Val Lys Val Ala Val Ile Asp
Ser Gly Ile Asp Ser Ser His Pro Asp Leu Lys Val Ala Gly Gly Ala
Ser Met Val Pro Ser Glu Thr Asn Pro Phe Gln Asp Asn Asn Ser His
Gly Thr His Val Ala Gly Thr Val Ala Ala Leu Asn Asn Ser Ile Gly
Val Leu Gly Val Ala Pro Ser Ala Ser Leu Tyr Ala Val Lys Val Leu
Gly Ala Asp Gly Ser Gly Gln Tyr Ser Trp Ile Ile Asn Gly Ile Glu
Trp Ala Ile Ala Asn Asn Met Asp Val Ile Asn Met Ser Leu Gly Gly
Pro Ser Gly Ser Ala Ala Leu Lys Ala Ala Val Asp Lys Ala Val Ala
Ser Gly Val Val Val Val Ala Ala Ala Gly Asn Glu Gly Thr Ser Gly
Ser Ser Ser Thr Val Gly Tyr Pro Gly Lys Tyr Pro Ser Val Ile Ala
Val Gly Ala Val Asp Ser Ser Asn Gln Arg Ala Ser Phe Ser Ser Val
Gly Pro Glu Leu Asp Val Met Ala Pro Gly Val Ser Ile Gln Ser Thr
Leu Pro Gly Asn Lys Tyr Gly Ala Tyr Asn Gly Thr Ser Met Ala Ser
Pro His Val Ala Gly Ala Ala Ala Leu Ile Leu Ser Lys His Pro Asn
Trp Thr Asn Thr Gln Val Arg Ser Ser Leu Glu Asn Thr Thr Thr Lys
Leu Gly Asp Ser Phe Tyr Tyr Gly Lys Gly Leu Ile Asn Val Gln Ala
Ala Ala Gln
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 275 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vii) IMMEDIATE SOURCE:
(B) CLONE: amino acid sequence of mature BPN' variant
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
Ala Gln Ser Val Pro Tyr Gly Val Ser Gln Ile Lys Ala Pro Ala Leu
His Ser Gln Gly Tyr Thr Gly Ser Asn Val Lys Val Ala Val Ile Asp
Ser Gly Ile Asp Ser Ser His Pro Asp Leu Lys Val Ala Gly Gly Ala
Ser Met Val Pro Ser Glu Thr Asn Pro Phe Gln Asp Thr Asn Ser His
Gly Thr His Val Ala Gly Thr Val Ala Ala Leu Thr Asn Ser Ile Gly
Val Leu Gly Val Ala Pro Ser Ala Ser Leu Tyr Ala Val Lys Val Leu
Gly Ala Asp Gly Ser Gly Gln Tyr Ser Trp Ile Ile Asn Gly Ile Glu
Trp Ala Ile Ala Asn Asn Met Asp Val Ile Thr Met Ser Leu Gly Gly
Pro Ser Gly Ser Ala Ala Leu Lys Ala Ala Val Asp Lys Ala Val Ala
Ser Gly Val Val Val Val Ala Ala Ala Gly Asn Glu Gly Thr Ser Gly
Ser Ser Ser Thr Val Gly Tyr Pro Gly Lys Tyr Pro Ser Val Ile Ala
Val Gly Ala Val Asp Ser Ser Asn Gln Arg Ala Ser Phe Ser Ser Val
Gly Pro Glu Leu Asp Val Met Ala Pro Gly Val Ser Ile Gln Ser Thr
Leu Pro Gly Asn Lys Tyr Gly Ala Tyr Ser Gly Thr Ser Met Ala Ser
Pro His Val Ala Gly Ala Ala Ala Leu Ile Leu Ser Lys His Pro Thr
Trp Thr Asn Thr Gln Val Arg Ser Ser Leu Glu Asn Thr Thr Thr Lys
Leu Gly Asp Ser Phe Tyr Tyr Gly Lys Gly Leu Ile Asn Val Gln Ala
Ala Ala Gln
(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1260 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: codon-optimized 3D signal peptide-AAT DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1382 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: codon-optimized 3D signal peptide-ATIII DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1940 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: codon-optimized 3D signal peptide-HSA DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
GAAAAAATAC TTATATGAAA TTGCCAGAAG ACATCCTTAC TTTTATGCCC CGGAACTCCT 540
ATGAATATGC
AAGTGGGCAG
TTGTGAAACA
TGTTGCTGCA AGTCAAGCTG CCTTAGGCTT ATAACATCTA CATTTAAAAG CATCTCAGCC 1860
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1140 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: codon-optimized 3D signal peptide-BPN' DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
(B) CLONE: N-terminus of mature AAT
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
Glu Asp Pro Gln Gly Asp Ala Ala Gln Lys Thr Asp Thr
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
TTCTTGTCCC TTTCGGTCCT CATCGTCCTC CT 32
(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
GAGGATCCCC AGGGAGATGC TGCCCAGAA 29
(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
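As an illustrative cross-check of the listing (not part of the original filing), the sketch below translates the 29-base SEQ ID NO:33 with the standard genetic code and compares the result with SEQ ID NO:22, the N-terminus of mature AAT. The function name `translate` and the one-letter rendering of SEQ ID NO:22 are assumptions introduced here for illustration only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons from the first base; trailing bases are ignored."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# SEQ ID NO:33 as printed in the listing (29 bases).
SEQ_ID_33 = "GAGGATCCCC AGGGAGATGC TGCCCAGAA"
# SEQ ID NO:22 (Glu Asp Pro Gln Gly Asp Ala Ala Gln Lys Thr Asp Thr),
# written here in one-letter code (an assumption of this sketch).
SEQ_ID_22 = "EDPQGDAAQKTDT"

peptide = translate(SEQ_ID_33)
matches_n_terminus = SEQ_ID_22.startswith(peptide)
```

Under these assumptions the nine complete codons of SEQ ID NO:33 read Glu Asp Pro Gln Gly Asp Ala Ala Gln, which are the first nine residues of SEQ ID NO:22; the two trailing bases (AA) are consistent with the start of a Lys codon, the tenth residue.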
Claims (18)
1. A method of producing, in monocot plant cells, a mature heterologous protein selected from the group consisting of
(i) mature, glycosylated α1-antitrypsin (AAT) having the same N-terminal amino acid sequence as mature AAT produced in humans and a glycosylation pattern which increases serum half-life substantially over that of mature non-glycosylated AAT;
(ii) mature, glycosylated antithrombin III (ATIII) having the same N-terminal amino acid sequence as mature ATIII produced in humans;
(iii) mature human serum albumin (HSA) having the same N-terminal amino acid sequence as mature HSA produced in humans and having the folding pattern of native mature HSA as evidenced by its bilirubin-binding characteristics; and
(iv) mature, active subtilisin BPN' (BPN') having the same N-terminal amino acid sequence as BPN' produced in Bacillus;
the method comprising:
(a) obtaining monocot cells transformed with a chimeric gene having (i) a monocot transcriptional regulatory region, inducible by addition or removal of a small molecule, or during seed maturation, (ii) a first DNA sequence encoding the heterologous protein, and (iii) a second DNA sequence encoding a signal peptide, said first and second DNA sequences in translation-frame and encoding a fusion protein, and wherein (i) the transcriptional regulatory region is operably linked to the second DNA sequence, and (ii) said signal peptide is effective to facilitate secretion of the mature heterologous protein from the transformed cells;
(b) cultivating the transformed cells under conditions effective to induce said transcriptional regulatory region, thereby promoting expression of the fusion protein and secretion of the mature heterologous protein from the transformed cells; and
(c) isolating said mature heterologous protein produced by the transformed cells.
2. The method of claim 1, wherein said first DNA sequence encodes proBPN', said cultivating includes cultivating said transformed cells at a pH between 5 and 6 to promote expression and secretion of proBPN' from the cells, and said isolating step includes incubating the proBPN' under conditions effective to allow the autoconversion of proBPN' to active mature BPN'.
3. The method of claim 1, wherein said first DNA sequence encodes mature BPN', and said method further includes:
transforming said cells with a second chimeric gene containing (i) a transcriptional regulatory region inducible by addition or removal of a small molecule, or during seed maturation; (ii) a third DNA sequence encoding the pro-peptide moiety of BPN', and (iii) a fourth DNA sequence encoding a signal polypeptide, where said fourth DNA sequence is operably linked to said transcriptional regulatory region and said third DNA sequence, and where said signal polypeptide is in translation-frame with said pro-peptide moiety and is effective to facilitate secretion of expressed pro-peptide moiety from the transformed cells;
said cultivating step includes cultivating the transformed cells at a pH between 5 and 6 to promote expression and secretion of BPN' and the pro-peptide moiety from the cells;
and said isolating step includes incubating the BPN' and the pro-moiety under conditions effective to allow the conversion of BPN' to active mature BPN', and isolating the active mature BPN'.
4. The method of claim 1, wherein said signal peptide is the RAmy3D signal peptide having the amino acid sequence identified by SEQ ID NO:1.
5. The method of claim 1, wherein said second DNA sequence encodes the RAmy3D signal peptide (SEQ ID NO:1) and has the codon-optimized nucleotide sequence identified by SEQ ID NO:3.
6. The method of claim 1, wherein said signal peptide is the RAmy1A signal peptide having the amino acid sequence identified by SEQ ID NO:4.
7. The method of claim 1, wherein the second DNA sequence, the first DNA sequence, or both the second and the first DNA sequence, is codon-optimized for enhanced expression in said plant.
8. The method of claim 1, wherein said transcriptional regulatory region is a promoter derived from a rice or barley α-amylase gene selected from the group consisting of the RAmy1A, RAmy1B, RAmy2A, RAmy3A, RAmy3B, RAmy3C, RAmy3D, and RAmy3E, pM/C, gKAmy141, gKAmy155, Amy32b, and HV18 genes.
9. The method of claim 8, wherein the chimeric gene further comprises, between said transcriptional regulatory region and said second DNA coding sequence, the 5' untranslated region of an inducible monocot gene selected from the group consisting of RAmy1A, RAmy3B, RAmy3C, RAmy3D, HV18, and RAmy3E.
10. The method of claim 8, wherein said chimeric gene further comprises, downstream of the sequence encoding said fusion protein, the 3' untranslated region of an inducible monocot gene derived from a rice or barley α-amylase gene selected from the group consisting of the RAmy1A, RAmy1B, RAmy2A, RAmy3A, RAmy3B, RAmy3C, RAmy3D, and RAmy3E, pM/C, gKAmy141, gKAmy155, Amy32b, and HV18 genes.
11. The method of claim 1, wherein said cultivating includes culturing the transformed plant cells in a sugar-free or sugar-depleted medium, the transcriptional regulatory region is derived from the RAmy3E or RAmy3D gene, the 5' untranslated region is derived from the RAmy1A gene and has the sequence identified by SEQ ID NO:5, and the 3' untranslated region is derived from the RAmy1A gene.
12. The method of claim 1, wherein the transformed cells are aleurone cells of mature seeds, the transcriptional regulatory region is upregulated by addition of a small molecule to promote seed germination, and said cultivating includes germinating said seeds, either in embryonated or de-embryonated form.
13. The method of claim 12, wherein the transcriptional regulatory region is a rice α-amylase RAmy1A promoter or a barley HV18 promoter, and said small molecule is gibberellic acid.
14. A mature heterologous protein produced by the method of claim 1, wherein said protein is selected from the group consisting of:
(i) mature glycosylated α-antitrypsin (AAT) having the same N-terminal amino acid sequence as mature AAT produced in humans and having a glycosylation pattern which increases serum half-life substantially over that of non-glycosylated mature AAT;
(ii) mature glycosylated antithrombin III (ATIII) having the same N-terminal amino acid sequence as mature ATIII produced in humans; and
(iii) mature glycosylated subtilisin BPN' (BPN') having the same N-terminal amino acid sequence as BPN' produced in Bacillus;
wherein said protein has a glycosylation pattern characteristic of proteins produced in said monocot plant.
15. The method of claim 1, wherein said monocot plant cells are transformed rice, barley, corn, wheat, oat, rye, sorghum, or millet cells.
16. The method of claim 1, wherein said monocot plant cells are transformed rice or barley cells.
17. Plant cells capable of producing the mature heterologous protein according to the method of claim 1, wherein said cultivating includes culturing the transformed plant cells in a sugar-free or sugar-depleted medium, the transcriptional regulatory region is derived from the RAmy3E or RAmy3D gene, the 5' untranslated region is derived from the RAmy1A gene and has the sequence identified by SEQ ID NO:5, and the 3' untranslated region is derived from the RAmy1A gene.
18. Seeds capable of producing the mature heterologous protein according to the method of claim 1, wherein said transformed cells are aleurone cells, the transcriptional regulatory region is upregulated by addition of a small molecule to promote seed germination, and said cultivating includes germinating said seeds, either in embryonated or de-embryonated form.
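Claim 1 requires that the signal-peptide and heterologous-protein coding sequences be joined "in translation-frame". A minimal sketch of what such a check could look like is given below. It is an illustration only, not a procedure described in the claims: the function `is_in_frame_fusion` and the short signal-peptide fragment are hypothetical placeholders (the fragment is not SEQ ID NO:3), and the mature-protein fragment is the 29-base SEQ ID NO:33 from the sequence listing.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame_fusion(signal_cds: str, mature_cds: str) -> bool:
    """Return True if the signal-peptide coding sequence starts with ATG,
    has a length divisible by three (so the mature sequence stays in frame),
    and the fused reading frame has no stop codon before its last codon."""
    signal_cds = signal_cds.replace(" ", "").upper()
    mature_cds = mature_cds.replace(" ", "").upper()
    if not signal_cds.startswith("ATG"):
        return False
    if len(signal_cds) % 3 != 0:
        return False
    fusion = signal_cds + mature_cds
    usable = len(fusion) - len(fusion) % 3
    codons = [fusion[i:i + 3] for i in range(0, usable, 3)]
    return not any(codon in STOP_CODONS for codon in codons[:-1])


# Hypothetical 15-base placeholder for a signal-peptide coding sequence.
HYPOTHETICAL_SIGNAL = "ATGAAGAACACCAGC"
# SEQ ID NO:33 from the listing, encoding the start of mature AAT.
MATURE_FRAGMENT = "GAGGATCCCC AGGGAGATGC TGCCCAGAA"

in_frame = is_in_frame_fusion(HYPOTHETICAL_SIGNAL, MATURE_FRAGMENT)
frame_shifted = is_in_frame_fusion(HYPOTHETICAL_SIGNAL[:-1], MATURE_FRAGMENT)
```

Dropping a single base from the placeholder signal sequence shifts the frame, so the second call reports a failed fusion. A real construct would also be checked against the full coding sequences rather than these short fragments.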
Applications Claiming Priority (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US3816997P | 1997-02-13 | 1997-02-13 | |
US3799197P | 1997-02-13 | 1997-02-13 | |
US3816897P | 1997-02-13 | 1997-02-13 | |
US3817097P | 1997-02-13 | 1997-02-13 | |
US60/038,170 | 1997-02-13 | ||
US60/038,168 | 1997-02-13 | ||
US60/037,991 | 1997-02-13 | ||
US60/038,169 | 1997-02-13 | ||
PCT/US1998/003068 WO1998036085A1 (en) | 1997-02-13 | 1998-02-13 | Production of mature proteins in plants |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2280894A1 (en) | 1998-08-20 |
Family
ID=27488492
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002280894A Abandoned CA2280894A1 (en) | 1997-02-13 | 1998-02-13 | Production of mature proteins in plants |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP0981635A1 (en) |
JP (1) | JP2001512318A (en) |
AU (1) | AU746826B2 (en) |
CA (1) | CA2280894A1 (en) |
WO (1) | WO1998036085A1 (en) |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2774379B1 (en) * | 1998-01-30 | 2002-03-29 | Groupe Limagrain Holding | PROCESS FOR THE PRODUCTION OF ALPHA 1-ANTITRYPSIN AND ITS VARIANTS BY PLANT CELLS, AND PRODUCTS CONTAINING THE ALPHA-ANTITRYPSIN OBTAINED THEREBY |
US6087558A (en) * | 1998-07-22 | 2000-07-11 | Prodigene, Inc. | Commercial production of proteases in plants |
DE19947290A1 (en) * | 1999-10-01 | 2001-04-19 | Greenovation Pflanzenbiotechno | Process for the production of proteinaceous substances |
US8022270B2 (en) | 2000-07-31 | 2011-09-20 | Biolex Therapeutics, Inc. | Expression of biologically active polypeptides in duckweed |
CA2417415C (en) | 2000-07-31 | 2012-10-09 | Biolex, Inc. | Expression of biologically active polypeptides in duckweed |
US7632983B2 (en) | 2000-07-31 | 2009-12-15 | Biolex Therapeutics, Inc. | Expression of monoclonal antibodies in duckweed |
ES2500918T3 (en) | 2001-12-21 | 2014-10-01 | Human Genome Sciences, Inc. | Albumin and interferon beta fusion proteins |
GB0314856D0 (en) | 2003-06-25 | 2003-07-30 | Unitargeting Res As | Protein expression system |
WO2006108830A2 (en) * | 2005-04-13 | 2006-10-19 | Bayer Cropscience Sa | TRANSPLASTOMIC PLANTS EXPRESSING α 1-ANTITRYPSIN |
EP1896568A4 (en) * | 2005-06-28 | 2009-04-29 | Ventria Bioscience | Components of cell culture media produced from plant cells |
JP2007151435A (en) * | 2005-12-02 | 2007-06-21 | Niigata Univ | Transformed plant having high starch-accumulating ability and method for producing the same |
JP5158639B2 (en) | 2008-04-11 | 2013-03-06 | 独立行政法人農業生物資源研究所 | Genes specifically expressed in the endosperm of plants, promoters of the genes, and use thereof |
AU2010215959A1 (en) | 2009-02-20 | 2011-08-04 | Ventria Bioscience | Cell culture media containing combinations of proteins |
CN102532254B (en) * | 2010-12-24 | 2015-06-24 | 武汉禾元生物科技股份有限公司 | Method for separating and purifying recombinant human serum albumin (rHSA) from rice seeds |
KR102435211B1 (en) * | 2021-06-29 | 2022-08-23 | (주)진셀바이오텍 | Plant cell lines producing recombinant albumin with high yield and uses thereof |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE58909876D1 (en) * | 1988-06-20 | 2000-09-14 | Novartis Ag | Process for controlling plant pests with non-plant proteinase inhibitors |
EP0428572A1 (en) * | 1988-07-29 | 1991-05-29 | Washington University School Of Medicine | Producing commercially valuable polypeptides with genetically transformed endosperm tissue |
NL8901932A (en) * | 1989-07-26 | 1991-02-18 | Mogen Int | PRODUCTION OF heterologous PROTEINS IN PLANTS OR PLANTS. |
DK162790D0 (en) * | 1990-07-06 | 1990-07-06 | Novo Nordisk As | PLANT CELL |
US5460952A (en) * | 1992-11-04 | 1995-10-24 | National Science Counsil Of R.O.C. | Gene expression system comprising the promoter region of the α-amylase genes |
US5693506A (en) * | 1993-11-16 | 1997-12-02 | The Regents Of The University Of California | Process for protein production in plants |
-
1998
- 1998-02-13 EP EP98906507A patent/EP0981635A1/en not_active Withdrawn
- 1998-02-13 WO PCT/US1998/003068 patent/WO1998036085A1/en not_active Application Discontinuation
- 1998-02-13 JP JP53599798A patent/JP2001512318A/en not_active Ceased
- 1998-02-13 AU AU61716/98A patent/AU746826B2/en not_active Ceased
- 1998-02-13 CA CA002280894A patent/CA2280894A1/en not_active Abandoned
Also Published As
Publication number | Publication date |
---|---|
JP2001512318A (en) | 2001-08-21 |
WO1998036085A1 (en) | 1998-08-20 |
AU746826B2 (en) | 2002-05-02 |
AU6171698A (en) | 1998-09-08 |
EP0981635A1 (en) | 2000-03-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0871749B1 (en) | Oil body proteins as carriers of high value proteins | |
US6753167B2 (en) | Preparation of heterologous proteins on oil bodies | |
US5948682A (en) | Preparation of heterologous proteins on oil bodies | |
CA2052792C (en) | Method and composition for increasing sterol accumulation in higher plants | |
RUNEBERG‐ROOS et al. | Primary structure of a barley‐grain aspartic proteinase: A plant aspartic proteinase resembling mammalian cathepsin D | |
CA2280894A1 (en) | Production of mature proteins in plants | |
US8158857B2 (en) | Monocot seed product comprising a human serum albumin protein | |
JP4570327B2 (en) | Methods for the production of multimeric proteins and related compositions | |
CA2528741A1 (en) | Methods for the production of insulin in plants | |
US6127145A (en) | Production of α1 -antitrypsin in plants | |
US6066781A (en) | Production of mature proteins in plants | |
EP0977873B1 (en) | Method for cleavage of fusion proteins | |
NZ537779A (en) | Increase in the yield of secreted protein by the introduction of an amino acid sequence motif, preferably by modification of leader sequences | |
AU2003218396B2 (en) | Human blood proteins expressed in monocot seeds | |
US6750046B2 (en) | Preparation of thioredoxin and thioredoxin reductase proteins on oil bodies | |
FR2774379A1 (en) | PROCESS FOR THE PRODUCTION OF ALPHA 1-ANTITRYPSIN AND ITS VARIANTS BY PLANT CELLS, AND PRODUCTS CONTAINING THE ALPHA-ANTITRYPSIN OBTAINED THEREBY | |
CA2296008A1 (en) | Rice beta-glucanase enzymes and genes | |
Sutliff et al. | Production of α 1-antitrypsin in plants | |
AU4714799A (en) | Production of urokinase in plant-based expression systems | |
US20040111765A1 (en) | Production of urokinase in plant-based expression systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FZDE | Discontinued |