KR20220062230A - Expression of modified proteins in peroxisomes - Google Patents
Expression of modified proteins in peroxisomes
- Publication number
- KR20220062230A KR1020217040840A KR20217040840A
- Authority
- KR
- South Korea
- Prior art keywords
- gly
- pro
- ala
- glu
- lys
- Prior art date
Links
- 210000002824 peroxisome Anatomy 0.000 title claims abstract description 133
- 108091005573 modified proteins Proteins 0.000 title claims description 70
- 102000035118 modified proteins Human genes 0.000 title claims description 70
- 230000014509 gene expression Effects 0.000 title claims description 40
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 267
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 246
- 210000004027 cell Anatomy 0.000 claims abstract description 113
- 238000000034 method Methods 0.000 claims abstract description 94
- 210000003527 eukaryotic cell Anatomy 0.000 claims abstract description 75
- 108090000790 Enzymes Proteins 0.000 claims description 125
- 102000004190 Enzymes Human genes 0.000 claims description 122
- 230000000858 peroxisomal effect Effects 0.000 claims description 120
- 150000007523 nucleic acids Chemical class 0.000 claims description 72
- 102000039446 nucleic acids Human genes 0.000 claims description 62
- 108020004707 nucleic acids Proteins 0.000 claims description 62
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 claims description 51
- 239000000758 substrate Substances 0.000 claims description 51
- 108010035532 Collagen Proteins 0.000 claims description 47
- 102000008186 Collagen Human genes 0.000 claims description 46
- 229920001436 collagen Polymers 0.000 claims description 46
- 230000004048 modification Effects 0.000 claims description 37
- 238000012986 modification Methods 0.000 claims description 37
- 230000008685 targeting Effects 0.000 claims description 33
- 238000004519 manufacturing process Methods 0.000 claims description 32
- 108010043005 Prolyl Hydroxylases Proteins 0.000 claims description 23
- 102000004079 Prolyl Hydroxylases Human genes 0.000 claims description 23
- 230000001965 increasing effect Effects 0.000 claims description 22
- 150000001413 amino acids Chemical class 0.000 claims description 20
- 210000005253 yeast cell Anatomy 0.000 claims description 18
- 108020004705 Codon Proteins 0.000 claims description 15
- 238000012258 culturing Methods 0.000 claims description 12
- 230000001939 inductive effect Effects 0.000 claims description 12
- 238000007254 oxidation reaction Methods 0.000 claims description 12
- 230000033444 hydroxylation Effects 0.000 claims description 11
- 238000005805 hydroxylation reaction Methods 0.000 claims description 11
- 230000003647 oxidation Effects 0.000 claims description 11
- 241000235648 Pichia Species 0.000 claims description 10
- 230000004927 fusion Effects 0.000 claims description 10
- 230000026731 phosphorylation Effects 0.000 claims description 10
- 238000006366 phosphorylation reaction Methods 0.000 claims description 10
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 9
- 108010010803 Gelatin Proteins 0.000 claims description 8
- 102000009658 Peptidylprolyl Isomerase Human genes 0.000 claims description 8
- 108010020062 Peptidylprolyl Isomerase Proteins 0.000 claims description 8
- 102000006010 Protein Disulfide-Isomerase Human genes 0.000 claims description 8
- 239000008273 gelatin Substances 0.000 claims description 8
- 229920000159 gelatin Polymers 0.000 claims description 8
- 235000019322 gelatine Nutrition 0.000 claims description 8
- 235000011852 gelatine desserts Nutrition 0.000 claims description 8
- 108020003519 protein disulfide isomerase Proteins 0.000 claims description 8
- 239000000126 substance Substances 0.000 claims description 8
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 7
- ZQPPMHVWECSIRJ-KTKRTIGZSA-N oleic acid Chemical compound CCCCCCCC\C=C/CCCCCCCC(O)=O ZQPPMHVWECSIRJ-KTKRTIGZSA-N 0.000 claims description 7
- WRIDQFICGBMAFQ-UHFFFAOYSA-N (E)-8-Octadecenoic acid Natural products CCCCCCCCCC=CCCCCCCC(O)=O WRIDQFICGBMAFQ-UHFFFAOYSA-N 0.000 claims description 6
- LQJBNNIYVWPHFW-UHFFFAOYSA-N 20:1omega9c fatty acid Natural products CCCCCCCCCCC=CCCCCCCCC(O)=O LQJBNNIYVWPHFW-UHFFFAOYSA-N 0.000 claims description 6
- QSBYPNXLFMSGKH-UHFFFAOYSA-N 9-Heptadecensaeure Natural products CCCCCCCC=CCCCCCCCC(O)=O QSBYPNXLFMSGKH-UHFFFAOYSA-N 0.000 claims description 6
- 108010006519 Molecular Chaperones Proteins 0.000 claims description 6
- 239000005642 Oleic acid Substances 0.000 claims description 6
- ZQPPMHVWECSIRJ-UHFFFAOYSA-N Oleic acid Natural products CCCCCCCCC=CCCCCCCCC(O)=O ZQPPMHVWECSIRJ-UHFFFAOYSA-N 0.000 claims description 6
- QXJSBBXBKPUZAA-UHFFFAOYSA-N isooleic acid Natural products CCCCCCCC=CCCCCCCCCC(O)=O QXJSBBXBKPUZAA-UHFFFAOYSA-N 0.000 claims description 6
- 230000017854 proteolysis Effects 0.000 claims description 6
- 241001112159 Ogataea Species 0.000 claims description 5
- 241000235070 Saccharomyces Species 0.000 claims description 5
- 241000235013 Yarrowia Species 0.000 claims description 5
- 230000030609 dephosphorylation Effects 0.000 claims description 5
- 238000006209 dephosphorylation reaction Methods 0.000 claims description 5
- 238000006317 isomerization reaction Methods 0.000 claims description 5
- 230000005945 translocation Effects 0.000 claims description 5
- 241000222120 Candida <Saccharomycetales> Species 0.000 claims description 4
- 108010003894 Protein-Lysine 6-Oxidase Proteins 0.000 claims description 4
- 102000004669 Protein-Lysine 6-Oxidase Human genes 0.000 claims description 4
- 230000002209 hydrophobic effect Effects 0.000 claims description 4
- 239000000411 inducer Substances 0.000 claims description 4
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 3
- 241000235649 Kluyveromyces Species 0.000 claims description 3
- 241001523626 Arxula Species 0.000 claims description 2
- 241001099157 Komagataella Species 0.000 claims description 2
- 102000004357 Transferases Human genes 0.000 claims description 2
- 108090000992 Transferases Proteins 0.000 claims description 2
- 238000012217 deletion Methods 0.000 claims description 2
- 230000037430 deletion Effects 0.000 claims description 2
- 125000002887 hydroxy group Chemical group [H]O* 0.000 claims description 2
- 230000012846 protein folding Effects 0.000 claims description 2
- 239000000203 mixture Substances 0.000 abstract description 8
- 235000018102 proteins Nutrition 0.000 description 197
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 116
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 108
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 82
- 108010047495 alanylglycine Proteins 0.000 description 77
- 108010029020 prolylglycine Proteins 0.000 description 76
- 108010087846 prolyl-prolyl-glycine Proteins 0.000 description 65
- OOCFXNOVSLSHAB-IUCAKERBSA-N Gly-Pro-Pro Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 OOCFXNOVSLSHAB-IUCAKERBSA-N 0.000 description 60
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 57
- LEIKGVHQTKHOLM-IUCAKERBSA-N Pro-Pro-Gly Chemical compound OC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 LEIKGVHQTKHOLM-IUCAKERBSA-N 0.000 description 50
- 108010077515 glycylproline Proteins 0.000 description 48
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 47
- CAVKXZMMDNOZJU-UHFFFAOYSA-N Gly-Pro-Ala-Gly-Pro Natural products C1CCC(C(O)=O)N1C(=O)CNC(=O)C(C)NC(=O)C1CCCN1C(=O)CN CAVKXZMMDNOZJU-UHFFFAOYSA-N 0.000 description 46
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 44
- 108010064235 lysylglycine Proteins 0.000 description 44
- 108010047857 aspartylglycine Proteins 0.000 description 39
- 108010061238 threonyl-glycine Proteins 0.000 description 39
- GGLIDLCEPDHEJO-BQBZGAKWSA-N Gly-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)CN GGLIDLCEPDHEJO-BQBZGAKWSA-N 0.000 description 38
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 36
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 34
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 34
- 108010078144 glutaminyl-glycine Proteins 0.000 description 34
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 33
- 108010079364 N-glycylalanine Proteins 0.000 description 31
- 108010050848 glycylleucine Proteins 0.000 description 31
- 108020004414 DNA Proteins 0.000 description 30
- 102000053602 DNA Human genes 0.000 description 30
- UGTHTQWIQKEDEH-BQBZGAKWSA-N L-alanyl-L-prolylglycine zwitterion Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UGTHTQWIQKEDEH-BQBZGAKWSA-N 0.000 description 28
- JYPCXBJRLBHWME-UHFFFAOYSA-N glycyl-L-prolyl-L-arginine Natural products NCC(=O)N1CCCC1C(=O)NC(CCCN=C(N)N)C(O)=O JYPCXBJRLBHWME-UHFFFAOYSA-N 0.000 description 28
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 27
- DHDOADIPGZTAHT-YUMQZZPRSA-N Gly-Glu-Arg Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DHDOADIPGZTAHT-YUMQZZPRSA-N 0.000 description 27
- 241000880493 Leptailurus serval Species 0.000 description 26
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 25
- CLNJSLSHKJECME-BQBZGAKWSA-N Pro-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H]1CCCN1 CLNJSLSHKJECME-BQBZGAKWSA-N 0.000 description 24
- BGWKULMLUIUPKY-BQBZGAKWSA-N Pro-Ser-Gly Chemical compound OC(=O)CNC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BGWKULMLUIUPKY-BQBZGAKWSA-N 0.000 description 24
- 108010091092 arginyl-glycyl-proline Proteins 0.000 description 24
- QXPRJQPCFXMCIY-NKWVEPMBSA-N Gly-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN QXPRJQPCFXMCIY-NKWVEPMBSA-N 0.000 description 23
- HAOUOFNNJJLVNS-BQBZGAKWSA-N Gly-Pro-Ser Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O HAOUOFNNJJLVNS-BQBZGAKWSA-N 0.000 description 23
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 23
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 22
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 22
- PUUYVMYCMIWHFE-BQBZGAKWSA-N Gly-Ala-Arg Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PUUYVMYCMIWHFE-BQBZGAKWSA-N 0.000 description 21
- GAAHQHNCMIAYEX-UWVGGRQHSA-N Gly-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GAAHQHNCMIAYEX-UWVGGRQHSA-N 0.000 description 21
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 21
- 108010057821 leucylproline Proteins 0.000 description 21
- 108010031719 prolyl-serine Proteins 0.000 description 21
- MQIGTEQXYCRLGK-BQBZGAKWSA-N Ala-Gly-Pro Chemical compound C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O MQIGTEQXYCRLGK-BQBZGAKWSA-N 0.000 description 19
- NSORZJXKUQFEKL-JGVFFNPUSA-N Gln-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)N)N)C(=O)O NSORZJXKUQFEKL-JGVFFNPUSA-N 0.000 description 19
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 19
- IEGFSKKANYKBDU-QWHCGFSZSA-N Gly-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)CN)C(=O)O IEGFSKKANYKBDU-QWHCGFSZSA-N 0.000 description 19
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 19
- 229940024606 amino acid Drugs 0.000 description 19
- 235000001014 amino acid Nutrition 0.000 description 19
- SCAKQYSGEIHPLV-IUCAKERBSA-N (4S)-4-[(2-aminoacetyl)amino]-5-[(2S)-2-(carboxymethylcarbamoyl)pyrrolidin-1-yl]-5-oxopentanoic acid Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SCAKQYSGEIHPLV-IUCAKERBSA-N 0.000 description 18
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 18
- BNBBNGZZKQUWCD-IUCAKERBSA-N Pro-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 BNBBNGZZKQUWCD-IUCAKERBSA-N 0.000 description 18
- KDGARKCAKHBEDB-NKWVEPMBSA-N Ser-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CO)N)C(=O)O KDGARKCAKHBEDB-NKWVEPMBSA-N 0.000 description 18
- 108090000765 processed proteins & peptides Proteins 0.000 description 18
- NLKVNZUFDPWPNL-YUMQZZPRSA-N Glu-Arg-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O NLKVNZUFDPWPNL-YUMQZZPRSA-N 0.000 description 17
- 108010076818 TEV protease Proteins 0.000 description 17
- PYTZFYUXZZHOAD-WHFBIAKZSA-N Gly-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)CN PYTZFYUXZZHOAD-WHFBIAKZSA-N 0.000 description 16
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 16
- 108010037850 glycylvaline Proteins 0.000 description 16
- FKLSMYYLJHYPHH-UWVGGRQHSA-N Pro-Gly-Leu Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O FKLSMYYLJHYPHH-UWVGGRQHSA-N 0.000 description 15
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 15
- 108010015792 glycyllysine Proteins 0.000 description 15
- 108010034529 leucyl-lysine Proteins 0.000 description 15
- LCMWVZLBCUVDAZ-IUCAKERBSA-N Lys-Gly-Glu Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CCC([O-])=O LCMWVZLBCUVDAZ-IUCAKERBSA-N 0.000 description 14
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 14
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 14
- 238000003776 cleavage reaction Methods 0.000 description 14
- 210000000805 cytoplasm Anatomy 0.000 description 14
- 230000004807 localization Effects 0.000 description 14
- 238000000746 purification Methods 0.000 description 14
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 13
- OQCWXQJLCDPRHV-UWVGGRQHSA-N Arg-Gly-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O OQCWXQJLCDPRHV-UWVGGRQHSA-N 0.000 description 13
- POTCZYQVVNXUIG-BQBZGAKWSA-N Asp-Gly-Pro Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O POTCZYQVVNXUIG-BQBZGAKWSA-N 0.000 description 13
- 108010087924 alanylproline Proteins 0.000 description 13
- 230000007017 scission Effects 0.000 description 13
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 12
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 12
- DTNUIAJCPRMNBT-WHFBIAKZSA-N Asp-Gly-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O DTNUIAJCPRMNBT-WHFBIAKZSA-N 0.000 description 12
- JBRBACJPBZNFMF-YUMQZZPRSA-N Gly-Ala-Lys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN JBRBACJPBZNFMF-YUMQZZPRSA-N 0.000 description 12
- RFQATBGBLDAKGI-VHSXEESVSA-N Lys-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCCN)N)C(=O)O RFQATBGBLDAKGI-VHSXEESVSA-N 0.000 description 12
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 12
- ABSSTGUCBCDKMU-UWVGGRQHSA-N Pro-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 ABSSTGUCBCDKMU-UWVGGRQHSA-N 0.000 description 12
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 12
- 230000000694 effects Effects 0.000 description 12
- HJARVELKOSZUEW-YUMQZZPRSA-N Gly-Pro-Gln Chemical compound [H]NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O HJARVELKOSZUEW-YUMQZZPRSA-N 0.000 description 11
- UIMCLYYSUCIUJM-UWVGGRQHSA-N Pro-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 UIMCLYYSUCIUJM-UWVGGRQHSA-N 0.000 description 11
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 11
- RHAPJNVNWDBFQI-BQBZGAKWSA-N Ser-Pro-Gly Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O RHAPJNVNWDBFQI-BQBZGAKWSA-N 0.000 description 11
- 108010003700 lysyl aspartic acid Proteins 0.000 description 11
- 239000011159 matrix material Substances 0.000 description 11
- 230000004481 post-translational protein modification Effects 0.000 description 11
- 239000000047 product Substances 0.000 description 11
- 235000013930 proline Nutrition 0.000 description 11
- 108010026333 seryl-proline Proteins 0.000 description 11
- ZATRYQNPUHGXCU-DTWKUNHWSA-N Arg-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ZATRYQNPUHGXCU-DTWKUNHWSA-N 0.000 description 10
- SOEATRRYCIPEHA-BQBZGAKWSA-N Gly-Glu-Glu Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SOEATRRYCIPEHA-BQBZGAKWSA-N 0.000 description 10
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 10
- JYPCXBJRLBHWME-IUCAKERBSA-N Gly-Pro-Arg Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JYPCXBJRLBHWME-IUCAKERBSA-N 0.000 description 10
- BMWFDYIYBAFROD-WPRPVWTQSA-N Gly-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN BMWFDYIYBAFROD-WPRPVWTQSA-N 0.000 description 10
- SINRIKQYQJRGDQ-MEYUZBJRSA-N Tyr-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SINRIKQYQJRGDQ-MEYUZBJRSA-N 0.000 description 10
- COYSIHFOCOMGCF-WPRPVWTQSA-N Val-Arg-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-WPRPVWTQSA-N 0.000 description 10
- 108010038633 aspartylglutamate Proteins 0.000 description 10
- 108010092854 aspartyllysine Proteins 0.000 description 10
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 10
- 210000004379 membrane Anatomy 0.000 description 10
- 239000012528 membrane Substances 0.000 description 10
- 108010070643 prolylglutamic acid Proteins 0.000 description 10
- HICVMZCGVFKTPM-BQBZGAKWSA-N Asp-Pro-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HICVMZCGVFKTPM-BQBZGAKWSA-N 0.000 description 9
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 9
- XTQFHTHIAKKCTM-YFKPBYRVSA-N Gly-Glu-Gly Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O XTQFHTHIAKKCTM-YFKPBYRVSA-N 0.000 description 9
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 9
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 9
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 9
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 9
- TYVAWPFQYFPSBR-BFHQHQDPSA-N Thr-Ala-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)NCC(O)=O TYVAWPFQYFPSBR-BFHQHQDPSA-N 0.000 description 9
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 9
- 108010072041 arginyl-glycyl-aspartic acid Proteins 0.000 description 9
- 108010027338 isoleucylcysteine Proteins 0.000 description 9
- 229960002429 proline Drugs 0.000 description 9
- 108010077112 prolyl-proline Proteins 0.000 description 9
- 108010048818 seryl-histidine Proteins 0.000 description 9
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 8
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 8
- VKKYFICVTYKFIO-CIUDSAMLSA-N Arg-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N VKKYFICVTYKFIO-CIUDSAMLSA-N 0.000 description 8
- GWNMUVANAWDZTI-YUMQZZPRSA-N Asn-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N GWNMUVANAWDZTI-YUMQZZPRSA-N 0.000 description 8
- GQRDIVQPSMPQME-ZPFDUUQYSA-N Asn-Ile-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O GQRDIVQPSMPQME-ZPFDUUQYSA-N 0.000 description 8
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 8
- MYRLSKYSMXNLLA-LAEOZQHASA-N Asn-Val-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MYRLSKYSMXNLLA-LAEOZQHASA-N 0.000 description 8
- KVMPVNGOKHTUHZ-GCJQMDKQSA-N Asp-Ala-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KVMPVNGOKHTUHZ-GCJQMDKQSA-N 0.000 description 8
- PDECQIHABNQRHN-GUBZILKMSA-N Asp-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(O)=O PDECQIHABNQRHN-GUBZILKMSA-N 0.000 description 8
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 8
- HYPVLWGNBIYTNA-GUBZILKMSA-N Gln-Leu-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HYPVLWGNBIYTNA-GUBZILKMSA-N 0.000 description 8
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 8
- UTKUTMJSWKKHEM-WDSKDSINSA-N Glu-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O UTKUTMJSWKKHEM-WDSKDSINSA-N 0.000 description 8
- TZOVVRJYUDETQG-RCOVLWMOSA-N Gly-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN TZOVVRJYUDETQG-RCOVLWMOSA-N 0.000 description 8
- MOJKRXIRAZPZLW-WDSKDSINSA-N Gly-Glu-Ala Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MOJKRXIRAZPZLW-WDSKDSINSA-N 0.000 description 8
- MVORZMQFXBLMHM-QWRGUYRKSA-N Gly-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 MVORZMQFXBLMHM-QWRGUYRKSA-N 0.000 description 8
- WDXLKVQATNEAJQ-BQBZGAKWSA-N Gly-Pro-Asp Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WDXLKVQATNEAJQ-BQBZGAKWSA-N 0.000 description 8
- YOBGUCWZPXJHTN-BQBZGAKWSA-N Gly-Ser-Arg Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YOBGUCWZPXJHTN-BQBZGAKWSA-N 0.000 description 8
- KSOBNUBCYHGUKH-UWVGGRQHSA-N Gly-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN KSOBNUBCYHGUKH-UWVGGRQHSA-N 0.000 description 8
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 8
- LAGPXKYZCCTSGQ-JYJNAYRXSA-N Leu-Glu-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LAGPXKYZCCTSGQ-JYJNAYRXSA-N 0.000 description 8
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 8
- MVBZBRKNZVJEKK-DTWKUNHWSA-N Met-Gly-Pro Chemical compound CSCC[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N MVBZBRKNZVJEKK-DTWKUNHWSA-N 0.000 description 8
- ZPPVJIJMIKTERM-YUMQZZPRSA-N Pro-Gln-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ZPPVJIJMIKTERM-YUMQZZPRSA-N 0.000 description 8
- VZKBJNBZMZHKRC-XUXIUFHCSA-N Pro-Ile-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O VZKBJNBZMZHKRC-XUXIUFHCSA-N 0.000 description 8
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 8
- FUOGXAQMNJMBFG-WPRPVWTQSA-N Pro-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FUOGXAQMNJMBFG-WPRPVWTQSA-N 0.000 description 8
- MSIYNSBKKVMGFO-BHNWBGBOSA-N Thr-Gly-Pro Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N)O MSIYNSBKKVMGFO-BHNWBGBOSA-N 0.000 description 8
- PELIQFPESHBTMA-WLTAIBSBSA-N Thr-Tyr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 PELIQFPESHBTMA-WLTAIBSBSA-N 0.000 description 8
- 108010005233 alanylglutamic acid Proteins 0.000 description 8
- 230000006870 function Effects 0.000 description 8
- 108010038983 glycyl-histidyl-lysine Proteins 0.000 description 8
- 108010010147 glycylglutamine Proteins 0.000 description 8
- 108010081551 glycylphenylalanine Proteins 0.000 description 8
- 108010012988 lysyl-glutamyl-aspartyl-glycine Proteins 0.000 description 8
- 108010054155 lysyllysine Proteins 0.000 description 8
- 210000003463 organelle Anatomy 0.000 description 8
- 102000004196 processed proteins & peptides Human genes 0.000 description 8
- 231100000331 toxic Toxicity 0.000 description 8
- 230000002588 toxic effect Effects 0.000 description 8
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 7
- LSLIRHLIUDVNBN-CIUDSAMLSA-N Ala-Asp-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LSLIRHLIUDVNBN-CIUDSAMLSA-N 0.000 description 7
- DEWWPUNXRNGMQN-LPEHRKFASA-N Ala-Met-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N DEWWPUNXRNGMQN-LPEHRKFASA-N 0.000 description 7
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 7
- KMSHNDWHPWXPEC-BQBZGAKWSA-N Arg-Asp-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KMSHNDWHPWXPEC-BQBZGAKWSA-N 0.000 description 7
- YFBGNGASPGRWEM-DCAQKATOSA-N Arg-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YFBGNGASPGRWEM-DCAQKATOSA-N 0.000 description 7
- OLGCWMNDJTWQAG-GUBZILKMSA-N Asn-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(N)=O OLGCWMNDJTWQAG-GUBZILKMSA-N 0.000 description 7
- OLVIPTLKNSAYRJ-YUMQZZPRSA-N Asn-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N OLVIPTLKNSAYRJ-YUMQZZPRSA-N 0.000 description 7
- HPNDBHLITCHRSO-WHFBIAKZSA-N Asp-Ala-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)NCC(O)=O HPNDBHLITCHRSO-WHFBIAKZSA-N 0.000 description 7
- VBVKSAFJPVXMFJ-CIUDSAMLSA-N Asp-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N VBVKSAFJPVXMFJ-CIUDSAMLSA-N 0.000 description 7
- SVABRQFIHCSNCI-FOHZUACHSA-N Asp-Gly-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O SVABRQFIHCSNCI-FOHZUACHSA-N 0.000 description 7
- SWTQDYFZVOJVLL-KKUMJFAQSA-N Asp-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)O)N)O SWTQDYFZVOJVLL-KKUMJFAQSA-N 0.000 description 7
- JUWISGAGWSDGDH-KKUMJFAQSA-N Asp-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(O)=O)CC1=CC=CC=C1 JUWISGAGWSDGDH-KKUMJFAQSA-N 0.000 description 7
- CITDWMLWXNUQKD-FXQIFTODSA-N Gln-Gln-Asn Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CITDWMLWXNUQKD-FXQIFTODSA-N 0.000 description 7
- FQCILXROGNOZON-YUMQZZPRSA-N Gln-Pro-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O FQCILXROGNOZON-YUMQZZPRSA-N 0.000 description 7
- XMPAXPSENRSOSV-RYUDHWBXSA-N Glu-Gly-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XMPAXPSENRSOSV-RYUDHWBXSA-N 0.000 description 7
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 7
- CQAHWYDHKUWYIX-YUMQZZPRSA-N Glu-Pro-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O CQAHWYDHKUWYIX-YUMQZZPRSA-N 0.000 description 7
- GPSHCSTUYOQPAI-JHEQGTHGSA-N Glu-Thr-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O GPSHCSTUYOQPAI-JHEQGTHGSA-N 0.000 description 7
- IWAXHBCACVWNHT-BQBZGAKWSA-N Gly-Asp-Arg Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IWAXHBCACVWNHT-BQBZGAKWSA-N 0.000 description 7
- LXXLEUBUOMCAMR-NKWVEPMBSA-N Gly-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)CN)C(=O)O LXXLEUBUOMCAMR-NKWVEPMBSA-N 0.000 description 7
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 7
- 102000051366 Glycosyltransferases Human genes 0.000 description 7
- 108700023372 Glycosyltransferases Proteins 0.000 description 7
- MDBYBTWRMOAJAY-NHCYSSNCSA-N His-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N MDBYBTWRMOAJAY-NHCYSSNCSA-N 0.000 description 7
- WGVPDSNCHDEDBP-KKUMJFAQSA-N His-Asp-Phe Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WGVPDSNCHDEDBP-KKUMJFAQSA-N 0.000 description 7
- QAMFAYSMNZBNCA-UWVGGRQHSA-N His-Gly-Met Chemical compound CSCC[C@H](NC(=O)CNC(=O)[C@@H](N)Cc1cnc[nH]1)C(O)=O QAMFAYSMNZBNCA-UWVGGRQHSA-N 0.000 description 7
- YYOCMTFVGKDNQP-IHRRRGAJSA-N His-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N YYOCMTFVGKDNQP-IHRRRGAJSA-N 0.000 description 7
- LVQDUPQUJZWKSU-PYJNHQTQSA-N Ile-Arg-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LVQDUPQUJZWKSU-PYJNHQTQSA-N 0.000 description 7
- GQKSJYINYYWPMR-NGZCFLSTSA-N Ile-Gly-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N GQKSJYINYYWPMR-NGZCFLSTSA-N 0.000 description 7
- UIEZQYNXCYHMQS-BJDJZHNGSA-N Ile-Lys-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)O)N UIEZQYNXCYHMQS-BJDJZHNGSA-N 0.000 description 7
- 108010065920 Insulin Lispro Proteins 0.000 description 7
- 241000235058 Komagataella pastoris Species 0.000 description 7
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 7
- LLBQJYDYOLIQAI-JYJNAYRXSA-N Leu-Glu-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LLBQJYDYOLIQAI-JYJNAYRXSA-N 0.000 description 7
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 7
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 7
- YEIYAQQKADPIBJ-GARJFASQSA-N Lys-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O YEIYAQQKADPIBJ-GARJFASQSA-N 0.000 description 7
- GPJGFSFYBJGYRX-YUMQZZPRSA-N Lys-Gly-Asp Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O GPJGFSFYBJGYRX-YUMQZZPRSA-N 0.000 description 7
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 7
- LMGNWHDWJDIOPK-DKIMLUQUSA-N Lys-Phe-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LMGNWHDWJDIOPK-DKIMLUQUSA-N 0.000 description 7
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 7
- PDIDTSZKKFEDMB-UWVGGRQHSA-N Lys-Pro-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O PDIDTSZKKFEDMB-UWVGGRQHSA-N 0.000 description 7
- OLWAOWXIADGIJG-AVGNSLFASA-N Met-Arg-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(O)=O OLWAOWXIADGIJG-AVGNSLFASA-N 0.000 description 7
- QAVZUKIPOMBLMC-AVGNSLFASA-N Met-Val-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C QAVZUKIPOMBLMC-AVGNSLFASA-N 0.000 description 7
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 7
- 108091005804 Peptidases Proteins 0.000 description 7
- JVTMTFMMMHAPCR-UBHSHLNASA-N Phe-Ala-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JVTMTFMMMHAPCR-UBHSHLNASA-N 0.000 description 7
- CDNPIRSCAFMMBE-SRVKXCTJSA-N Phe-Asn-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O CDNPIRSCAFMMBE-SRVKXCTJSA-N 0.000 description 7
- OQTDZEJJWWAGJT-KKUMJFAQSA-N Phe-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O OQTDZEJJWWAGJT-KKUMJFAQSA-N 0.000 description 7
- SCKXGHWQPPURGT-KKUMJFAQSA-N Phe-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O SCKXGHWQPPURGT-KKUMJFAQSA-N 0.000 description 7
- ULIWFCCJIOEHMU-BQBZGAKWSA-N Pro-Gly-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 ULIWFCCJIOEHMU-BQBZGAKWSA-N 0.000 description 7
- XQSREVQDGCPFRJ-STQMWFEESA-N Pro-Gly-Phe Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XQSREVQDGCPFRJ-STQMWFEESA-N 0.000 description 7
- DCHQYSOGURGJST-FJXKBIBVSA-N Pro-Thr-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O DCHQYSOGURGJST-FJXKBIBVSA-N 0.000 description 7
- 239000004365 Protease Substances 0.000 description 7
- IGROJMCBGRFRGI-YTLHQDLWSA-N Thr-Ala-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 7
- LIXBDERDAGNVAV-XKBZYTNZSA-N Thr-Gln-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O LIXBDERDAGNVAV-XKBZYTNZSA-N 0.000 description 7
- SLUWOCTZVGMURC-BFHQHQDPSA-N Thr-Gly-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O SLUWOCTZVGMURC-BFHQHQDPSA-N 0.000 description 7
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 7
- FQPDRTDDEZXCEC-SVSWQMSJSA-N Thr-Ile-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O FQPDRTDDEZXCEC-SVSWQMSJSA-N 0.000 description 7
- VTMGKRABARCZAX-OSUNSFLBSA-N Thr-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O VTMGKRABARCZAX-OSUNSFLBSA-N 0.000 description 7
- UQCNIMDPYICBTR-KYNKHSRBSA-N Thr-Thr-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UQCNIMDPYICBTR-KYNKHSRBSA-N 0.000 description 7
- BXPOOVDVGWEXDU-WZLNRYEVSA-N Tyr-Ile-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BXPOOVDVGWEXDU-WZLNRYEVSA-N 0.000 description 7
- NSGZILIDHCIZAM-KKUMJFAQSA-N Tyr-Leu-Ser Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NSGZILIDHCIZAM-KKUMJFAQSA-N 0.000 description 7
- XJPXTYLVMUZGNW-IHRRRGAJSA-N Tyr-Pro-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O XJPXTYLVMUZGNW-IHRRRGAJSA-N 0.000 description 7
- LMSBRIVOCYOKMU-NRPADANISA-N Val-Gln-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N LMSBRIVOCYOKMU-NRPADANISA-N 0.000 description 7
- QHFQQRKNGCXTHL-AUTRQRHGSA-N Val-Gln-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QHFQQRKNGCXTHL-AUTRQRHGSA-N 0.000 description 7
- YTPLVNUZZOBFFC-SCZZXKLOSA-N Val-Gly-Pro Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N1CCC[C@@H]1C(O)=O YTPLVNUZZOBFFC-SCZZXKLOSA-N 0.000 description 7
- BTWMICVCQLKKNR-DCAQKATOSA-N Val-Leu-Ser Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C([O-])=O BTWMICVCQLKKNR-DCAQKATOSA-N 0.000 description 7
- NSUUANXHLKKHQB-BZSNNMDCSA-N Val-Pro-Trp Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CNC2=CC=CC=C12 NSUUANXHLKKHQB-BZSNNMDCSA-N 0.000 description 7
- JAIZPWVHPQRYOU-ZJDVBMNYSA-N Val-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O JAIZPWVHPQRYOU-ZJDVBMNYSA-N 0.000 description 7
- 230000021736 acetylation Effects 0.000 description 7
- 238000006640 acetylation reaction Methods 0.000 description 7
- 108010044940 alanylglutamine Proteins 0.000 description 7
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 7
- 108010077245 asparaginyl-proline Proteins 0.000 description 7
- 210000004899 c-terminal region Anatomy 0.000 description 7
- 238000004949 mass spectrometry Methods 0.000 description 7
- 108010005942 methionylglycine Proteins 0.000 description 7
- 239000002243 precursor Substances 0.000 description 7
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 7
- 230000014616 translation Effects 0.000 description 7
- 108010084932 tryptophyl-proline Proteins 0.000 description 7
- 108010009962 valyltyrosine Proteins 0.000 description 7
- WGDNWOMKBUXFHR-BQBZGAKWSA-N Ala-Gly-Arg Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N WGDNWOMKBUXFHR-BQBZGAKWSA-N 0.000 description 6
- UISQLSIBJKEJSS-GUBZILKMSA-N Arg-Arg-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(O)=O UISQLSIBJKEJSS-GUBZILKMSA-N 0.000 description 6
- AQPVUEJJARLJHB-BQBZGAKWSA-N Arg-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N AQPVUEJJARLJHB-BQBZGAKWSA-N 0.000 description 6
- 241000894006 Bacteria Species 0.000 description 6
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 6
- 102100033601 Collagen alpha-1(I) chain Human genes 0.000 description 6
- RLFSBAPJTYKSLG-WHFBIAKZSA-N Gly-Ala-Asp Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O RLFSBAPJTYKSLG-WHFBIAKZSA-N 0.000 description 6
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 6
- HFXJIZNEXNIZIJ-BQBZGAKWSA-N Gly-Glu-Gln Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFXJIZNEXNIZIJ-BQBZGAKWSA-N 0.000 description 6
- QSVCIFZPGLOZGH-WDSKDSINSA-N Gly-Glu-Ser Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QSVCIFZPGLOZGH-WDSKDSINSA-N 0.000 description 6
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 6
- WDEHMRNSGHVNOH-VHSXEESVSA-N Gly-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)CN)C(=O)O WDEHMRNSGHVNOH-VHSXEESVSA-N 0.000 description 6
- YYXJFBMCOUSYSF-RYUDHWBXSA-N Gly-Phe-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYXJFBMCOUSYSF-RYUDHWBXSA-N 0.000 description 6
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 6
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 6
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 6
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 6
- ZJSZPXISKMDJKQ-JYJNAYRXSA-N Lys-Phe-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=CC=C1 ZJSZPXISKMDJKQ-JYJNAYRXSA-N 0.000 description 6
- DMKWYMWNEKIPFC-IUCAKERBSA-N Pro-Gly-Arg Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O DMKWYMWNEKIPFC-IUCAKERBSA-N 0.000 description 6
- JMVQDLDPDBXAAX-YUMQZZPRSA-N Pro-Gly-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 JMVQDLDPDBXAAX-YUMQZZPRSA-N 0.000 description 6
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 6
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 6
- 241000723792 Tobacco etch virus Species 0.000 description 6
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 6
- 108010029483 alpha 1 Chain Collagen Type I Proteins 0.000 description 6
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 6
- 108010068265 aspartyltyrosine Proteins 0.000 description 6
- 229910052799 carbon Inorganic materials 0.000 description 6
- 230000009089 cytolysis Effects 0.000 description 6
- 230000001086 cytosolic effect Effects 0.000 description 6
- 230000004151 fermentation Effects 0.000 description 6
- 238000000855 fermentation Methods 0.000 description 6
- 239000012634 fragment Substances 0.000 description 6
- 108010025801 glycyl-prolyl-arginine Proteins 0.000 description 6
- -1 hydroxyltransferase Proteins 0.000 description 6
- 239000000463 material Substances 0.000 description 6
- 125000003729 nucleotide group Chemical group 0.000 description 6
- 108010015796 prolylisoleucine Proteins 0.000 description 6
- 239000004753 textile Substances 0.000 description 6
- 230000032258 transport Effects 0.000 description 6
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 5
- PAIHPOGPJVUFJY-WDSKDSINSA-N Ala-Glu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PAIHPOGPJVUFJY-WDSKDSINSA-N 0.000 description 5
- WMYJZJRILUVVRG-WDSKDSINSA-N Ala-Gly-Gln Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O WMYJZJRILUVVRG-WDSKDSINSA-N 0.000 description 5
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 5
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 5
- IYMAXBFPHPZYIK-BQBZGAKWSA-N Arg-Gly-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O IYMAXBFPHPZYIK-BQBZGAKWSA-N 0.000 description 5
- AGVNTAUPLWIQEN-ZPFDUUQYSA-N Arg-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AGVNTAUPLWIQEN-ZPFDUUQYSA-N 0.000 description 5
- HGKHPCFTRQDHCU-IUCAKERBSA-N Arg-Pro-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HGKHPCFTRQDHCU-IUCAKERBSA-N 0.000 description 5
- 108010051330 Arg-Pro-Gly-Pro Proteins 0.000 description 5
- IICZCLFBILYRCU-WHFBIAKZSA-N Asn-Gly-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O IICZCLFBILYRCU-WHFBIAKZSA-N 0.000 description 5
- AXXCUABIFZPKPM-BQBZGAKWSA-N Asp-Arg-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O AXXCUABIFZPKPM-BQBZGAKWSA-N 0.000 description 5
- JJQGZGOEDSSHTE-FOHZUACHSA-N Asp-Thr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JJQGZGOEDSSHTE-FOHZUACHSA-N 0.000 description 5
- 102000011632 Caseins Human genes 0.000 description 5
- 108010076119 Caseins Proteins 0.000 description 5
- ORYMMTRPKVTGSJ-XVKPBYJWSA-N Gln-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O ORYMMTRPKVTGSJ-XVKPBYJWSA-N 0.000 description 5
- NKLRYVLERDYDBI-FXQIFTODSA-N Glu-Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKLRYVLERDYDBI-FXQIFTODSA-N 0.000 description 5
- JRDYDYXZKFNNRQ-XPUUQOCRSA-N Gly-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN JRDYDYXZKFNNRQ-XPUUQOCRSA-N 0.000 description 5
- VXKCPBPQEKKERH-IUCAKERBSA-N Gly-Arg-Pro Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N1CCC[C@H]1C(O)=O VXKCPBPQEKKERH-IUCAKERBSA-N 0.000 description 5
- XRTDOIOIBMAXCT-NKWVEPMBSA-N Gly-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)CN)C(=O)O XRTDOIOIBMAXCT-NKWVEPMBSA-N 0.000 description 5
- PABFFPWEJMEVEC-JGVFFNPUSA-N Gly-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)CN)C(=O)O PABFFPWEJMEVEC-JGVFFNPUSA-N 0.000 description 5
- NSVOVKWEKGEOQB-LURJTMIESA-N Gly-Pro-Gly Chemical compound NCC(=O)N1CCC[C@H]1C(=O)NCC(O)=O NSVOVKWEKGEOQB-LURJTMIESA-N 0.000 description 5
- 102000004195 Isomerases Human genes 0.000 description 5
- 108090000769 Isomerases Proteins 0.000 description 5
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 5
- ULXYQAJWJGLCNR-YUMQZZPRSA-N Leu-Asp-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 5
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 5
- ITWQLSZTLBKWJM-YUMQZZPRSA-N Lys-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCCN ITWQLSZTLBKWJM-YUMQZZPRSA-N 0.000 description 5
- 108010085186 Peroxisomal Targeting Signals Proteins 0.000 description 5
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 5
- 108010078762 Protein Precursors Proteins 0.000 description 5
- 102000014961 Protein Precursors Human genes 0.000 description 5
- HQTKVSCNCDLXSX-BQBZGAKWSA-N Ser-Arg-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O HQTKVSCNCDLXSX-BQBZGAKWSA-N 0.000 description 5
- VTCKHZJKWQENKX-KBPBESRZSA-N Tyr-Lys-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O VTCKHZJKWQENKX-KBPBESRZSA-N 0.000 description 5
- SJRUJQFQVLMZFW-WPRPVWTQSA-N Val-Pro-Gly Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SJRUJQFQVLMZFW-WPRPVWTQSA-N 0.000 description 5
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 5
- 108010093581 aspartyl-proline Proteins 0.000 description 5
- 230000001580 bacterial effect Effects 0.000 description 5
- 230000015572 biosynthetic process Effects 0.000 description 5
- 238000006664 bond formation reaction Methods 0.000 description 5
- BECPQYXYKAMYBN-UHFFFAOYSA-N casein, tech. Chemical compound NCCCCC(C(O)=O)N=C(O)C(CC(O)=O)N=C(O)C(CCC(O)=N)N=C(O)C(CC(C)C)N=C(O)C(CCC(O)=O)N=C(O)C(CC(O)=O)N=C(O)C(CCC(O)=O)N=C(O)C(C(C)O)N=C(O)C(CCC(O)=N)N=C(O)C(CCC(O)=N)N=C(O)C(CCC(O)=N)N=C(O)C(CCC(O)=O)N=C(O)C(CCC(O)=O)N=C(O)C(COP(O)(O)=O)N=C(O)C(CCC(O)=N)N=C(O)C(N)CC1=CC=CC=C1 BECPQYXYKAMYBN-UHFFFAOYSA-N 0.000 description 5
- 235000021240 caseins Nutrition 0.000 description 5
- 230000002255 enzymatic effect Effects 0.000 description 5
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 5
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 5
- 239000005431 greenhouse gas Substances 0.000 description 5
- 238000001727 in vivo Methods 0.000 description 5
- 108010009298 lysylglutamic acid Proteins 0.000 description 5
- 239000002773 nucleotide Substances 0.000 description 5
- 125000001500 prolyl group Chemical group [H]N1C([H])(C(=O)[*])C([H])([H])C([H])([H])C1([H])[H] 0.000 description 5
- 108010053725 prolylvaline Proteins 0.000 description 5
- 108020003175 receptors Proteins 0.000 description 5
- 102000005962 receptors Human genes 0.000 description 5
- 238000013518 transcription Methods 0.000 description 5
- 230000035897 transcription Effects 0.000 description 5
- 108700004896 tripeptide FEG Proteins 0.000 description 5
- CUVSTAMIHSSVKL-UWVGGRQHSA-N (4s)-4-[(2-aminoacetyl)amino]-5-[[(2s)-6-amino-1-(carboxymethylamino)-1-oxohexan-2-yl]amino]-5-oxopentanoic acid Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN CUVSTAMIHSSVKL-UWVGGRQHSA-N 0.000 description 4
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 4
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 4
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 4
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 4
- OTCJMMRQBVDQRK-DCAQKATOSA-N Arg-Asp-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O OTCJMMRQBVDQRK-DCAQKATOSA-N 0.000 description 4
- HQIZDMIGUJOSNI-IUCAKERBSA-N Arg-Gly-Arg Chemical compound N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O HQIZDMIGUJOSNI-IUCAKERBSA-N 0.000 description 4
- NKNILFJYKKHBKE-WPRPVWTQSA-N Arg-Gly-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O NKNILFJYKKHBKE-WPRPVWTQSA-N 0.000 description 4
- PLTGTJAZQRGMPP-FXQIFTODSA-N Asn-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O PLTGTJAZQRGMPP-FXQIFTODSA-N 0.000 description 4
- RYKWOUUZJFSJOH-FXQIFTODSA-N Asp-Gln-Glu Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N RYKWOUUZJFSJOH-FXQIFTODSA-N 0.000 description 4
- YNCHFVRXEQFPBY-BQBZGAKWSA-N Asp-Gly-Arg Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N YNCHFVRXEQFPBY-BQBZGAKWSA-N 0.000 description 4
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 4
- BIVYLQMZPHDUIH-WHFBIAKZSA-N Asp-Gly-Cys Chemical compound C([C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N)C(=O)O BIVYLQMZPHDUIH-WHFBIAKZSA-N 0.000 description 4
- VIRHEUMYXXLCBF-WDSKDSINSA-N Asp-Gly-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O VIRHEUMYXXLCBF-WDSKDSINSA-N 0.000 description 4
- WSGVTKZFVJSJOG-RCOVLWMOSA-N Asp-Gly-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O WSGVTKZFVJSJOG-RCOVLWMOSA-N 0.000 description 4
- KNDCWFXCFKSEBM-AVGNSLFASA-N Asp-Tyr-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O KNDCWFXCFKSEBM-AVGNSLFASA-N 0.000 description 4
- 101100351811 Caenorhabditis elegans pgal-1 gene Proteins 0.000 description 4
- 108010031425 Casein Kinases Proteins 0.000 description 4
- 102000005403 Casein Kinases Human genes 0.000 description 4
- 108010072062 GEKG peptide Proteins 0.000 description 4
- VSXBYIJUAXPAAL-WDSKDSINSA-N Gln-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O VSXBYIJUAXPAAL-WDSKDSINSA-N 0.000 description 4
- SMLDOQHTOAAFJQ-WDSKDSINSA-N Gln-Gly-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SMLDOQHTOAAFJQ-WDSKDSINSA-N 0.000 description 4
- AFODTOLGSZQDSL-PEFMBERDSA-N Glu-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N AFODTOLGSZQDSL-PEFMBERDSA-N 0.000 description 4
- RAUDKMVXNOWDLS-WDSKDSINSA-N Glu-Gly-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O RAUDKMVXNOWDLS-WDSKDSINSA-N 0.000 description 4
- BRFJMRSRMOMIMU-WHFBIAKZSA-N Gly-Ala-Asn Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O BRFJMRSRMOMIMU-WHFBIAKZSA-N 0.000 description 4
- QPTNELDXWKRIFX-YFKPBYRVSA-N Gly-Gly-Gln Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O QPTNELDXWKRIFX-YFKPBYRVSA-N 0.000 description 4
- BUEFQXUHTUZXHR-LURJTMIESA-N Gly-Gly-Pro zwitterion Chemical compound NCC(=O)NCC(=O)N1CCC[C@H]1C(O)=O BUEFQXUHTUZXHR-LURJTMIESA-N 0.000 description 4
- LOEANKRDMMVOGZ-YUMQZZPRSA-N Gly-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O LOEANKRDMMVOGZ-YUMQZZPRSA-N 0.000 description 4
- GLACUWHUYFBSPJ-FJXKBIBVSA-N Gly-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GLACUWHUYFBSPJ-FJXKBIBVSA-N 0.000 description 4
- 239000004471 Glycine Substances 0.000 description 4
- 101001052493 Homo sapiens Mitogen-activated protein kinase 1 Proteins 0.000 description 4
- AQCUAZTZSPQJFF-ZKWXMUAHSA-N Ile-Ala-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O AQCUAZTZSPQJFF-ZKWXMUAHSA-N 0.000 description 4
- TZCGZYWNIDZZMR-UHFFFAOYSA-N Ile-Arg-Ala Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(C)C(O)=O)CCCN=C(N)N TZCGZYWNIDZZMR-UHFFFAOYSA-N 0.000 description 4
- KCTIFOCXAIUQQK-QXEWZRGKSA-N Ile-Pro-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O KCTIFOCXAIUQQK-QXEWZRGKSA-N 0.000 description 4
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 4
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 4
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 4
- BABSVXFGKFLIGW-UWVGGRQHSA-N Leu-Gly-Arg Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N BABSVXFGKFLIGW-UWVGGRQHSA-N 0.000 description 4
- AAKRWBIIGKPOKQ-ONGXEEELSA-N Leu-Val-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 4
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 4
- AAORVPFVUIHEAB-YUMQZZPRSA-N Lys-Asp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O AAORVPFVUIHEAB-YUMQZZPRSA-N 0.000 description 4
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 4
- FHIAJWBDZVHLAH-YUMQZZPRSA-N Lys-Gly-Ser Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FHIAJWBDZVHLAH-YUMQZZPRSA-N 0.000 description 4
- 102100024193 Mitogen-activated protein kinase 1 Human genes 0.000 description 4
- 102000008109 Mixed Function Oxygenases Human genes 0.000 description 4
- 108010074633 Mixed Function Oxygenases Proteins 0.000 description 4
- 102000005431 Molecular Chaperones Human genes 0.000 description 4
- 108010058846 Ovalbumin Proteins 0.000 description 4
- 102000009913 Peroxisomal Targeting Signal 2 Receptor Human genes 0.000 description 4
- 108010077056 Peroxisomal Targeting Signal 2 Receptor Proteins 0.000 description 4
- FIRWJEJVFFGXSH-RYUDHWBXSA-N Phe-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 FIRWJEJVFFGXSH-RYUDHWBXSA-N 0.000 description 4
- PEFJUUYFEGBXFA-BZSNNMDCSA-N Phe-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 PEFJUUYFEGBXFA-BZSNNMDCSA-N 0.000 description 4
- RBRNEFJTEHPDSL-ACRUOGEOSA-N Phe-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RBRNEFJTEHPDSL-ACRUOGEOSA-N 0.000 description 4
- MMJJFXWMCMJMQA-STQMWFEESA-N Phe-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CC=CC=C1 MMJJFXWMCMJMQA-STQMWFEESA-N 0.000 description 4
- QTDBZORPVYTRJU-KKXDTOCCSA-N Phe-Tyr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O QTDBZORPVYTRJU-KKXDTOCCSA-N 0.000 description 4
- JSGWNFKWZNPDAV-YDHLFZDLSA-N Phe-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JSGWNFKWZNPDAV-YDHLFZDLSA-N 0.000 description 4
- SGCZFWSQERRKBD-BQBZGAKWSA-N Pro-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 SGCZFWSQERRKBD-BQBZGAKWSA-N 0.000 description 4
- UUHXBJHVTVGSKM-BQBZGAKWSA-N Pro-Gly-Asn Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UUHXBJHVTVGSKM-BQBZGAKWSA-N 0.000 description 4
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 4
- 101100465559 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) PRE7 gene Proteins 0.000 description 4
- WDXYVIIVDIDOSX-DCAQKATOSA-N Ser-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N WDXYVIIVDIDOSX-DCAQKATOSA-N 0.000 description 4
- BNFVPSRLHHPQKS-WHFBIAKZSA-N Ser-Asp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O BNFVPSRLHHPQKS-WHFBIAKZSA-N 0.000 description 4
- XJDMUQCLVSCRSJ-VZFHVOOUSA-N Ser-Thr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O XJDMUQCLVSCRSJ-VZFHVOOUSA-N 0.000 description 4
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 4
- MXDOAJQRJBMGMO-FJXKBIBVSA-N Thr-Pro-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O MXDOAJQRJBMGMO-FJXKBIBVSA-N 0.000 description 4
- CTDPLKMBVALCGN-JSGCOSHPSA-N Tyr-Gly-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O CTDPLKMBVALCGN-JSGCOSHPSA-N 0.000 description 4
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 4
- LKUDRJSNRWVGMS-QSFUFRPTSA-N Val-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LKUDRJSNRWVGMS-QSFUFRPTSA-N 0.000 description 4
- VXDSPJJQUQDCKH-UKJIMTQDSA-N Val-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N VXDSPJJQUQDCKH-UKJIMTQDSA-N 0.000 description 4
- 108010066875 alanyl-prolyl-tryptophyl-cysteine Proteins 0.000 description 4
- 108010062796 arginyllysine Proteins 0.000 description 4
- 239000011324 bead Substances 0.000 description 4
- 239000005018 casein Substances 0.000 description 4
- 210000000170 cell membrane Anatomy 0.000 description 4
- 230000008859 change Effects 0.000 description 4
- 230000004186 co-expression Effects 0.000 description 4
- 230000008045 co-localization Effects 0.000 description 4
- 230000001419 dependent effect Effects 0.000 description 4
- 239000004744 fabric Substances 0.000 description 4
- 239000000499 gel Substances 0.000 description 4
- 108010049041 glutamylalanine Proteins 0.000 description 4
- 229960002449 glycine Drugs 0.000 description 4
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 4
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 4
- HHLFWLYXYJOTON-UHFFFAOYSA-N glyoxylic acid Chemical compound OC(=O)C=O HHLFWLYXYJOTON-UHFFFAOYSA-N 0.000 description 4
- 230000012010 growth Effects 0.000 description 4
- 229960002885 histidine Drugs 0.000 description 4
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 4
- 235000015097 nutrients Nutrition 0.000 description 4
- 229940092253 ovalbumin Drugs 0.000 description 4
- 101150076896 pts1 gene Proteins 0.000 description 4
- 229920002477 rna polymer Polymers 0.000 description 4
- GUGNSJAORJLKGP-UHFFFAOYSA-K sodium 8-methoxypyrene-1,3,6-trisulfonate Chemical compound [Na+].[Na+].[Na+].C1=C2C(OC)=CC(S([O-])(=O)=O)=C(C=C3)C2=C2C3=C(S([O-])(=O)=O)C=C(S([O-])(=O)=O)C2=C1 GUGNSJAORJLKGP-UHFFFAOYSA-K 0.000 description 4
- 238000012546 transfer Methods 0.000 description 4
- 230000009466 transformation Effects 0.000 description 4
- 238000013519 translation Methods 0.000 description 4
- 108010051110 tyrosyl-lysine Proteins 0.000 description 4
- 238000001262 western blot Methods 0.000 description 4
- 108010000998 wheylin-2 peptide Proteins 0.000 description 4
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 3
- RLCSROTYKMPBDL-USJZOSNVSA-N 2-[[(2s)-1-[(2s)-2-[[(2s)-2-[[2-[[(2s)-2-amino-3-methylbutanoyl]amino]acetyl]amino]-3-methylbutanoyl]amino]propanoyl]pyrrolidine-2-carbonyl]amino]acetic acid Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O RLCSROTYKMPBDL-USJZOSNVSA-N 0.000 description 3
- WQVFQXXBNHHPLX-ZKWXMUAHSA-N Ala-Ala-His Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O WQVFQXXBNHHPLX-ZKWXMUAHSA-N 0.000 description 3
- CVGNCMIULZNYES-WHFBIAKZSA-N Ala-Asn-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CVGNCMIULZNYES-WHFBIAKZSA-N 0.000 description 3
- IFTVANMRTIHKML-WDSKDSINSA-N Ala-Gln-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O IFTVANMRTIHKML-WDSKDSINSA-N 0.000 description 3
- MPLOSMWGDNJSEV-WHFBIAKZSA-N Ala-Gly-Asp Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MPLOSMWGDNJSEV-WHFBIAKZSA-N 0.000 description 3
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 3
- IORKCNUBHNIMKY-CIUDSAMLSA-N Ala-Pro-Glu Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O IORKCNUBHNIMKY-CIUDSAMLSA-N 0.000 description 3
- PNIGSVZJNVUVJA-BQBZGAKWSA-N Arg-Gly-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O PNIGSVZJNVUVJA-BQBZGAKWSA-N 0.000 description 3
- HAVKMRGWNXMCDR-STQMWFEESA-N Arg-Gly-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HAVKMRGWNXMCDR-STQMWFEESA-N 0.000 description 3
- UZSQXCMNUPKLCC-FJXKBIBVSA-N Arg-Thr-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UZSQXCMNUPKLCC-FJXKBIBVSA-N 0.000 description 3
- HZPSDHRYYIORKR-WHFBIAKZSA-N Asn-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O HZPSDHRYYIORKR-WHFBIAKZSA-N 0.000 description 3
- VJTWLBMESLDOMK-WDSKDSINSA-N Asn-Gln-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VJTWLBMESLDOMK-WDSKDSINSA-N 0.000 description 3
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 3
- IBLAOXSULLECQZ-IUKAMOBKSA-N Asn-Ile-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC(N)=O IBLAOXSULLECQZ-IUKAMOBKSA-N 0.000 description 3
- OOXUBGLNDRGOKT-FXQIFTODSA-N Asn-Ser-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OOXUBGLNDRGOKT-FXQIFTODSA-N 0.000 description 3
- LGCVSPFCFXWUEY-IHPCNDPISA-N Asn-Trp-Tyr Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N LGCVSPFCFXWUEY-IHPCNDPISA-N 0.000 description 3
- ZAESWDKAMDVHLL-RCOVLWMOSA-N Asn-Val-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O ZAESWDKAMDVHLL-RCOVLWMOSA-N 0.000 description 3
- PSLSTUMPZILTAH-BYULHYEWSA-N Asp-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PSLSTUMPZILTAH-BYULHYEWSA-N 0.000 description 3
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 3
- NJLLRXWFPQQPHV-SRVKXCTJSA-N Asp-Tyr-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O NJLLRXWFPQQPHV-SRVKXCTJSA-N 0.000 description 3
- 241000283690 Bos taurus Species 0.000 description 3
- 108010022452 Collagen Type I Proteins 0.000 description 3
- 102000012422 Collagen Type I Human genes 0.000 description 3
- 102100036213 Collagen alpha-2(I) chain Human genes 0.000 description 3
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 3
- GNMQDOGFWYWPNM-LAEOZQHASA-N Gln-Gly-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)CNC(=O)[C@@H](N)CCC(N)=O)C(O)=O GNMQDOGFWYWPNM-LAEOZQHASA-N 0.000 description 3
- FGYPOQPQTUNESW-IUCAKERBSA-N Gln-Gly-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N FGYPOQPQTUNESW-IUCAKERBSA-N 0.000 description 3
- QBLMTCRYYTVUQY-GUBZILKMSA-N Gln-Leu-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O QBLMTCRYYTVUQY-GUBZILKMSA-N 0.000 description 3
- CGOHAEBMDSEKFB-FXQIFTODSA-N Glu-Glu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O CGOHAEBMDSEKFB-FXQIFTODSA-N 0.000 description 3
- OGNJZUXUTPQVBR-BQBZGAKWSA-N Glu-Gly-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OGNJZUXUTPQVBR-BQBZGAKWSA-N 0.000 description 3
- KRGZZKWSBGPLKL-IUCAKERBSA-N Glu-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N KRGZZKWSBGPLKL-IUCAKERBSA-N 0.000 description 3
- QIQABBIDHGQXGA-ZPFDUUQYSA-N Glu-Ile-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QIQABBIDHGQXGA-ZPFDUUQYSA-N 0.000 description 3
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 3
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 3
- HQTDNEZTGZUWSY-XVKPBYJWSA-N Glu-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)NCC(O)=O HQTDNEZTGZUWSY-XVKPBYJWSA-N 0.000 description 3
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 3
- RJIVPOXLQFJRTG-LURJTMIESA-N Gly-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N RJIVPOXLQFJRTG-LURJTMIESA-N 0.000 description 3
- XCLCVBYNGXEVDU-WHFBIAKZSA-N Gly-Asn-Ser Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O XCLCVBYNGXEVDU-WHFBIAKZSA-N 0.000 description 3
- FUTAPPOITCCWTH-WHFBIAKZSA-N Gly-Asp-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FUTAPPOITCCWTH-WHFBIAKZSA-N 0.000 description 3
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 description 3
- CQZDZKRHFWJXDF-WDSKDSINSA-N Gly-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CN CQZDZKRHFWJXDF-WDSKDSINSA-N 0.000 description 3
- DTRUBYPMMVPQPD-YUMQZZPRSA-N Gly-Gln-Arg Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DTRUBYPMMVPQPD-YUMQZZPRSA-N 0.000 description 3
- KMSGYZQRXPUKGI-BYPYZUCNSA-N Gly-Gly-Asn Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O KMSGYZQRXPUKGI-BYPYZUCNSA-N 0.000 description 3
- TVDHVLGFJSHPAX-UWVGGRQHSA-N Gly-His-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 TVDHVLGFJSHPAX-UWVGGRQHSA-N 0.000 description 3
- BHPQOIPBLYJNAW-NGZCFLSTSA-N Gly-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN BHPQOIPBLYJNAW-NGZCFLSTSA-N 0.000 description 3
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 3
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 3
- TWTPDFFBLQEBOE-IUCAKERBSA-N Gly-Leu-Gln Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O TWTPDFFBLQEBOE-IUCAKERBSA-N 0.000 description 3
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 3
- SSFWXSNOKDZNHY-QXEWZRGKSA-N Gly-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN SSFWXSNOKDZNHY-QXEWZRGKSA-N 0.000 description 3
- OHUKZZYSJBKFRR-WHFBIAKZSA-N Gly-Ser-Asp Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O OHUKZZYSJBKFRR-WHFBIAKZSA-N 0.000 description 3
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 3
- MUGLKCQHTUFLGF-WPRPVWTQSA-N Gly-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)CN MUGLKCQHTUFLGF-WPRPVWTQSA-N 0.000 description 3
- HVLSXIKZNLPZJJ-TXZCQADKSA-N HA peptide Chemical compound C([C@@H](C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@H]1N(CCC1)C(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HVLSXIKZNLPZJJ-TXZCQADKSA-N 0.000 description 3
- ZIMTWPHIKZEHSE-UWVGGRQHSA-N His-Arg-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O ZIMTWPHIKZEHSE-UWVGGRQHSA-N 0.000 description 3
- QSLKWWDKIXMWJV-SRVKXCTJSA-N His-Cys-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N QSLKWWDKIXMWJV-SRVKXCTJSA-N 0.000 description 3
- 101000875067 Homo sapiens Collagen alpha-2(I) chain Proteins 0.000 description 3
- 101000976075 Homo sapiens Insulin Proteins 0.000 description 3
- 101001072202 Homo sapiens Protein disulfide-isomerase Proteins 0.000 description 3
- MHAJPDPJQMAIIY-UHFFFAOYSA-N Hydrogen peroxide Chemical compound OO MHAJPDPJQMAIIY-UHFFFAOYSA-N 0.000 description 3
- FZWVCYCYWCLQDH-NHCYSSNCSA-N Ile-Leu-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N FZWVCYCYWCLQDH-NHCYSSNCSA-N 0.000 description 3
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 3
- 108010047761 Interferon-alpha Proteins 0.000 description 3
- 102000006992 Interferon-alpha Human genes 0.000 description 3
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 3
- KWTVLKBOQATPHJ-SRVKXCTJSA-N Leu-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N KWTVLKBOQATPHJ-SRVKXCTJSA-N 0.000 description 3
- DPWGZWUMUUJQDT-IUCAKERBSA-N Leu-Gln-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O DPWGZWUMUUJQDT-IUCAKERBSA-N 0.000 description 3
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 3
- IEWBEPKLKUXQBU-VOAKCMCISA-N Leu-Leu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IEWBEPKLKUXQBU-VOAKCMCISA-N 0.000 description 3
- HDHQQEDVWQGBEE-DCAQKATOSA-N Leu-Met-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O HDHQQEDVWQGBEE-DCAQKATOSA-N 0.000 description 3
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 3
- VDIARPPNADFEAV-WEDXCCLWSA-N Leu-Thr-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O VDIARPPNADFEAV-WEDXCCLWSA-N 0.000 description 3
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 3
- DGWXCIORNLWGGG-CIUDSAMLSA-N Lys-Asn-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O DGWXCIORNLWGGG-CIUDSAMLSA-N 0.000 description 3
- UETQMSASAVBGJY-QWRGUYRKSA-N Lys-Gly-His Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CNC=N1 UETQMSASAVBGJY-QWRGUYRKSA-N 0.000 description 3
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 3
- MIMXMVDLMDMOJD-BZSNNMDCSA-N Lys-Tyr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O MIMXMVDLMDMOJD-BZSNNMDCSA-N 0.000 description 3
- 239000004472 Lysine Substances 0.000 description 3
- GODBLDDYHFTUAH-CIUDSAMLSA-N Met-Asp-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O GODBLDDYHFTUAH-CIUDSAMLSA-N 0.000 description 3
- AXHNAGAYRGCDLG-UWVGGRQHSA-N Met-Lys-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O AXHNAGAYRGCDLG-UWVGGRQHSA-N 0.000 description 3
- 101000913652 Mus musculus Fibronectin type III domain-containing protein 5 Proteins 0.000 description 3
- 101100342977 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) leu-1 gene Proteins 0.000 description 3
- 241001099341 Ogataea polymorpha Species 0.000 description 3
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 3
- DDYIRGBOZVKRFR-AVGNSLFASA-N Phe-Asp-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N DDYIRGBOZVKRFR-AVGNSLFASA-N 0.000 description 3
- LLGTYVHITPVGKR-RYUDHWBXSA-N Phe-Gln-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O LLGTYVHITPVGKR-RYUDHWBXSA-N 0.000 description 3
- UAMFZRNCIFFMLE-FHWLQOOXSA-N Phe-Glu-Tyr Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N UAMFZRNCIFFMLE-FHWLQOOXSA-N 0.000 description 3
- XYHMFGGWNOFUOU-QXEWZRGKSA-N Pro-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 XYHMFGGWNOFUOU-QXEWZRGKSA-N 0.000 description 3
- SUENWIFTSTWUKD-AVGNSLFASA-N Pro-Leu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SUENWIFTSTWUKD-AVGNSLFASA-N 0.000 description 3
- 102100036352 Protein disulfide-isomerase Human genes 0.000 description 3
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 3
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 3
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 3
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 3
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 3
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 3
- XFTYVCHLARBHBQ-FOHZUACHSA-N Thr-Gly-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XFTYVCHLARBHBQ-FOHZUACHSA-N 0.000 description 3
- YOOAQCZYZHGUAZ-KATARQTJSA-N Thr-Leu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YOOAQCZYZHGUAZ-KATARQTJSA-N 0.000 description 3
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 3
- 239000004473 Threonine Substances 0.000 description 3
- ZWZOCUWOXSDYFZ-CQDKDKBSSA-N Tyr-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZWZOCUWOXSDYFZ-CQDKDKBSSA-N 0.000 description 3
- OHNXAUCZVWGTLL-KKUMJFAQSA-N Tyr-His-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CS)C(=O)O)N)O OHNXAUCZVWGTLL-KKUMJFAQSA-N 0.000 description 3
- ASQFIHTXXMFENG-XPUUQOCRSA-N Val-Ala-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O ASQFIHTXXMFENG-XPUUQOCRSA-N 0.000 description 3
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 3
- XTDDIVQWDXMRJL-IHRRRGAJSA-N Val-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](C(C)C)N XTDDIVQWDXMRJL-IHRRRGAJSA-N 0.000 description 3
- VPGCVZRRBYOGCD-AVGNSLFASA-N Val-Lys-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O VPGCVZRRBYOGCD-AVGNSLFASA-N 0.000 description 3
- QTXGUIMEHKCPBH-FHWLQOOXSA-N Val-Trp-Lys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 QTXGUIMEHKCPBH-FHWLQOOXSA-N 0.000 description 3
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 3
- 238000001261 affinity purification Methods 0.000 description 3
- 229960003767 alanine Drugs 0.000 description 3
- 235000004279 alanine Nutrition 0.000 description 3
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 3
- 150000001412 amines Chemical class 0.000 description 3
- 210000002421 cell wall Anatomy 0.000 description 3
- 108010060199 cysteinylproline Proteins 0.000 description 3
- 238000013461 design Methods 0.000 description 3
- 238000004090 dissolution Methods 0.000 description 3
- 229940079593 drug Drugs 0.000 description 3
- 239000003814 drug Substances 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 108020001507 fusion proteins Proteins 0.000 description 3
- 102000037865 fusion proteins Human genes 0.000 description 3
- 229930182830 galactose Natural products 0.000 description 3
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 3
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 3
- 108010051307 glycyl-glycyl-proline Proteins 0.000 description 3
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 3
- 108010089804 glycyl-threonine Proteins 0.000 description 3
- 108010020688 glycylhistidine Proteins 0.000 description 3
- 108010087823 glycyltyrosine Proteins 0.000 description 3
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 3
- 230000006698 induction Effects 0.000 description 3
- 230000002401 inhibitory effect Effects 0.000 description 3
- PBGKTOXHQIOBKM-FHFVDXKLSA-N insulin (human) Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@H]1CSSC[C@H]2C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)N[C@H](C(=O)N[C@H](C(N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=3C=CC(O)=CC=3)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC=3C=CC(O)=CC=3)C(=O)N[C@@H](CSSC[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=3C=CC(O)=CC=3)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=3NC=NC=3)NC(=O)[C@H](CO)NC(=O)CNC1=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O)=O)CSSC[C@@H](C(N2)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](NC(=O)CN)[C@@H](C)CC)[C@@H](C)CC)[C@@H](C)O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC=1C=CC=CC=1)C(C)C)C1=CN=CN1 PBGKTOXHQIOBKM-FHFVDXKLSA-N 0.000 description 3
- 238000001294 liquid chromatography-tandem mass spectrometry Methods 0.000 description 3
- 229960003646 lysine Drugs 0.000 description 3
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 3
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 3
- 108010017391 lysylvaline Proteins 0.000 description 3
- 108010056582 methionylglutamic acid Proteins 0.000 description 3
- 239000000178 monomer Substances 0.000 description 3
- 238000005457 optimization Methods 0.000 description 3
- 108010070409 phenylalanyl-glycyl-glycine Proteins 0.000 description 3
- 108010084525 phenylalanyl-phenylalanyl-glycine Proteins 0.000 description 3
- 108010012581 phenylalanylglutamate Proteins 0.000 description 3
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 3
- 108010051242 phenylalanylserine Proteins 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 108010004914 prolylarginine Proteins 0.000 description 3
- 230000001105 regulatory effect Effects 0.000 description 3
- 210000003705 ribosome Anatomy 0.000 description 3
- 229960001153 serine Drugs 0.000 description 3
- 230000011664 signaling Effects 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 229960002898 threonine Drugs 0.000 description 3
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 3
- 231100000419 toxicity Toxicity 0.000 description 3
- 230000001988 toxicity Effects 0.000 description 3
- 238000010361 transduction Methods 0.000 description 3
- 230000026683 transduction Effects 0.000 description 3
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 3
- 108010073969 valyllysine Proteins 0.000 description 3
- 238000012800 visualization Methods 0.000 description 3
- CEHZCZCQHUNAJF-AVGNSLFASA-N (2s)-1-[2-[[(2s)-1-[(2s)-2-amino-5-(diaminomethylideneamino)pentanoyl]pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(=O)N1[C@H](C(O)=O)CCC1 CEHZCZCQHUNAJF-AVGNSLFASA-N 0.000 description 2
- SNNYHIFMUVVACL-ASHKBJFXSA-N (2s)-2-[[2-[[(2s,3s)-2-[[(2s)-1-(2-aminoacetyl)pyrrolidine-2-carbonyl]amino]-3-methylpentanoyl]amino]acetyl]amino]-3-hydroxypropanoic acid Chemical compound OC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1C(=O)CN SNNYHIFMUVVACL-ASHKBJFXSA-N 0.000 description 2
- 102000007469 Actins Human genes 0.000 description 2
- 108010085238 Actins Proteins 0.000 description 2
- SBGXWWCLHIOABR-UHFFFAOYSA-N Ala Ala Gly Ala Chemical compound CC(N)C(=O)NC(C)C(=O)NCC(=O)NC(C)C(O)=O SBGXWWCLHIOABR-UHFFFAOYSA-N 0.000 description 2
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N Ala-Ala-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 2
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 2
- JAMAWBXXKFGFGX-KZVJFYERSA-N Ala-Arg-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JAMAWBXXKFGFGX-KZVJFYERSA-N 0.000 description 2
- ZIBWKCRKNFYTPT-ZKWXMUAHSA-N Ala-Asn-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O ZIBWKCRKNFYTPT-ZKWXMUAHSA-N 0.000 description 2
- WXERCAHAIKMTKX-ZLUOBGJFSA-N Ala-Asp-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O WXERCAHAIKMTKX-ZLUOBGJFSA-N 0.000 description 2
- MCKSLROAGSDNFC-ACZMJKKPSA-N Ala-Asp-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MCKSLROAGSDNFC-ACZMJKKPSA-N 0.000 description 2
- IKKVASZHTMKJIR-ZKWXMUAHSA-N Ala-Asp-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IKKVASZHTMKJIR-ZKWXMUAHSA-N 0.000 description 2
- RXTBLQVXNIECFP-FXQIFTODSA-N Ala-Gln-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RXTBLQVXNIECFP-FXQIFTODSA-N 0.000 description 2
- LMFXXZPPZDCPTA-ZKWXMUAHSA-N Ala-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N LMFXXZPPZDCPTA-ZKWXMUAHSA-N 0.000 description 2
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 2
- GSHKMNKPMLXSQW-KBIXCLLPSA-N Ala-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C)N GSHKMNKPMLXSQW-KBIXCLLPSA-N 0.000 description 2
- RZZMZYZXNJRPOJ-BJDJZHNGSA-N Ala-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C)N RZZMZYZXNJRPOJ-BJDJZHNGSA-N 0.000 description 2
- BFMIRJBURUXDRG-DLOVCJGASA-N Ala-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 BFMIRJBURUXDRG-DLOVCJGASA-N 0.000 description 2
- FQNILRVJOJBFFC-FXQIFTODSA-N Ala-Pro-Asp Chemical compound C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N FQNILRVJOJBFFC-FXQIFTODSA-N 0.000 description 2
- BTRULDJUUVGRNE-DCAQKATOSA-N Ala-Pro-Lys Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O BTRULDJUUVGRNE-DCAQKATOSA-N 0.000 description 2
- YYAVDNKUWLAFCV-ACZMJKKPSA-N Ala-Ser-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYAVDNKUWLAFCV-ACZMJKKPSA-N 0.000 description 2
- NZGRHTKZFSVPAN-BIIVOSGPSA-N Ala-Ser-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N NZGRHTKZFSVPAN-BIIVOSGPSA-N 0.000 description 2
- QKHWNPQNOHEFST-VZFHVOOUSA-N Ala-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C)N)O QKHWNPQNOHEFST-VZFHVOOUSA-N 0.000 description 2
- HCBKAOZYACJUEF-XQXXSGGOSA-N Ala-Thr-Gln Chemical compound N[C@@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCC(N)=O)C(=O)O HCBKAOZYACJUEF-XQXXSGGOSA-N 0.000 description 2
- WNHNMKOFKCHKKD-BFHQHQDPSA-N Ala-Thr-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O WNHNMKOFKCHKKD-BFHQHQDPSA-N 0.000 description 2
- AOAKQKVICDWCLB-UWJYBYFXSA-N Ala-Tyr-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AOAKQKVICDWCLB-UWJYBYFXSA-N 0.000 description 2
- XKXAZPSREVUCRT-BPNCWPANSA-N Ala-Tyr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=C(O)C=C1 XKXAZPSREVUCRT-BPNCWPANSA-N 0.000 description 2
- OMSKGWFGWCQFBD-KZVJFYERSA-N Ala-Val-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OMSKGWFGWCQFBD-KZVJFYERSA-N 0.000 description 2
- 241000203069 Archaea Species 0.000 description 2
- USNSOPDIZILSJP-FXQIFTODSA-N Arg-Asn-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O USNSOPDIZILSJP-FXQIFTODSA-N 0.000 description 2
- HPKSHFSEXICTLI-CIUDSAMLSA-N Arg-Glu-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O HPKSHFSEXICTLI-CIUDSAMLSA-N 0.000 description 2
- WVNFNPGXYADPPO-BQBZGAKWSA-N Arg-Gly-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O WVNFNPGXYADPPO-BQBZGAKWSA-N 0.000 description 2
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 2
- PRLPSDIHSRITSF-UNQGMJICSA-N Arg-Phe-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PRLPSDIHSRITSF-UNQGMJICSA-N 0.000 description 2
- FVBZXNSRIDVYJS-AVGNSLFASA-N Arg-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N FVBZXNSRIDVYJS-AVGNSLFASA-N 0.000 description 2
- JOTRDIXZHNQYGP-DCAQKATOSA-N Arg-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N JOTRDIXZHNQYGP-DCAQKATOSA-N 0.000 description 2
- ICRHGPYYXMWHIE-LPEHRKFASA-N Arg-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ICRHGPYYXMWHIE-LPEHRKFASA-N 0.000 description 2
- HRCIIMCTUIAKQB-XGEHTFHBSA-N Arg-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O HRCIIMCTUIAKQB-XGEHTFHBSA-N 0.000 description 2
- XRLOBFSLPCHYLQ-ULQDDVLXSA-N Arg-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O XRLOBFSLPCHYLQ-ULQDDVLXSA-N 0.000 description 2
- VYZBPPBKFCHCIS-WPRPVWTQSA-N Arg-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N VYZBPPBKFCHCIS-WPRPVWTQSA-N 0.000 description 2
- CIWBSHSKHKDKBQ-JLAZNSOCSA-N Ascorbic acid Chemical compound OC[C@H](O)[C@H]1OC(=O)C(O)=C1O CIWBSHSKHKDKBQ-JLAZNSOCSA-N 0.000 description 2
- WPOLSNAQGVHROR-GUBZILKMSA-N Asn-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N WPOLSNAQGVHROR-GUBZILKMSA-N 0.000 description 2
- MSBDSTRUMZFSEU-PEFMBERDSA-N Asn-Glu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MSBDSTRUMZFSEU-PEFMBERDSA-N 0.000 description 2
- CTQIOCMSIJATNX-WHFBIAKZSA-N Asn-Gly-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O CTQIOCMSIJATNX-WHFBIAKZSA-N 0.000 description 2
- DXVMJJNAOVECBA-WHFBIAKZSA-N Asn-Gly-Asn Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O DXVMJJNAOVECBA-WHFBIAKZSA-N 0.000 description 2
- RAQMSGVCGSJKCL-FOHZUACHSA-N Asn-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(N)=O RAQMSGVCGSJKCL-FOHZUACHSA-N 0.000 description 2
- NYGILGUOUOXGMJ-YUMQZZPRSA-N Asn-Lys-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O NYGILGUOUOXGMJ-YUMQZZPRSA-N 0.000 description 2
- CDGHMJJJHYKMPA-DLOVCJGASA-N Asn-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC(=O)N)N CDGHMJJJHYKMPA-DLOVCJGASA-N 0.000 description 2
- RAUPFUCUDBQYHE-AVGNSLFASA-N Asn-Phe-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RAUPFUCUDBQYHE-AVGNSLFASA-N 0.000 description 2
- GKKUBLFXKRDMFC-BQBZGAKWSA-N Asn-Pro-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O GKKUBLFXKRDMFC-BQBZGAKWSA-N 0.000 description 2
- AWXDRZJQCVHCIT-DCAQKATOSA-N Asn-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O AWXDRZJQCVHCIT-DCAQKATOSA-N 0.000 description 2
- HPASIOLTWSNMFB-OLHMAJIHSA-N Asn-Thr-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O HPASIOLTWSNMFB-OLHMAJIHSA-N 0.000 description 2
- RDLYUKRPEJERMM-XIRDDKMYSA-N Asn-Trp-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(C)C)C(O)=O RDLYUKRPEJERMM-XIRDDKMYSA-N 0.000 description 2
- KDFQZBWWPYQBEN-ZLUOBGJFSA-N Asp-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N KDFQZBWWPYQBEN-ZLUOBGJFSA-N 0.000 description 2
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 2
- ZLGKHJHFYSRUBH-FXQIFTODSA-N Asp-Arg-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O ZLGKHJHFYSRUBH-FXQIFTODSA-N 0.000 description 2
- GWTLRDMPMJCNMH-WHFBIAKZSA-N Asp-Asn-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GWTLRDMPMJCNMH-WHFBIAKZSA-N 0.000 description 2
- TVVYVAUGRHNTGT-UGYAYLCHSA-N Asp-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O TVVYVAUGRHNTGT-UGYAYLCHSA-N 0.000 description 2
- SBHUBSDEZQFJHJ-CIUDSAMLSA-N Asp-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O SBHUBSDEZQFJHJ-CIUDSAMLSA-N 0.000 description 2
- GHODABZPVZMWCE-FXQIFTODSA-N Asp-Glu-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O GHODABZPVZMWCE-FXQIFTODSA-N 0.000 description 2
- VFUXXFVCYZPOQG-WDSKDSINSA-N Asp-Glu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VFUXXFVCYZPOQG-WDSKDSINSA-N 0.000 description 2
- YRBGRUOSJROZEI-NHCYSSNCSA-N Asp-His-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(O)=O YRBGRUOSJROZEI-NHCYSSNCSA-N 0.000 description 2
- SEMWSADZTMJELF-BYULHYEWSA-N Asp-Ile-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O SEMWSADZTMJELF-BYULHYEWSA-N 0.000 description 2
- SPKCGKRUYKMDHP-GUDRVLHUSA-N Asp-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N SPKCGKRUYKMDHP-GUDRVLHUSA-N 0.000 description 2
- DWOGMPWRQQWPPF-GUBZILKMSA-N Asp-Leu-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O DWOGMPWRQQWPPF-GUBZILKMSA-N 0.000 description 2
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 2
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 2
- DKQCWCQRAMAFLN-UBHSHLNASA-N Asp-Trp-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O DKQCWCQRAMAFLN-UBHSHLNASA-N 0.000 description 2
- SFJUYBCDQBAYAJ-YDHLFZDLSA-N Asp-Val-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SFJUYBCDQBAYAJ-YDHLFZDLSA-N 0.000 description 2
- 241000193738 Bacillus anthracis Species 0.000 description 2
- 102000000584 Calmodulin Human genes 0.000 description 2
- 108010041952 Calmodulin Proteins 0.000 description 2
- CURLTUGMZLYLDI-UHFFFAOYSA-N Carbon dioxide Chemical compound O=C=O CURLTUGMZLYLDI-UHFFFAOYSA-N 0.000 description 2
- 102100035882 Catalase Human genes 0.000 description 2
- 108010053835 Catalase Proteins 0.000 description 2
- 108091035707 Consensus sequence Proteins 0.000 description 2
- 102000005636 Cyclic AMP Response Element-Binding Protein Human genes 0.000 description 2
- 108010045171 Cyclic AMP Response Element-Binding Protein Proteins 0.000 description 2
- GMXSSZUVDNPRMA-FXQIFTODSA-N Cys-Arg-Asp Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O GMXSSZUVDNPRMA-FXQIFTODSA-N 0.000 description 2
- UPJGYXRAPJWIHD-CIUDSAMLSA-N Cys-Asn-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UPJGYXRAPJWIHD-CIUDSAMLSA-N 0.000 description 2
- SQJSYLDKQBZQTG-FXQIFTODSA-N Cys-Asn-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CS)N SQJSYLDKQBZQTG-FXQIFTODSA-N 0.000 description 2
- KIHRUISMQZVCNO-ZLUOBGJFSA-N Cys-Asp-Asp Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KIHRUISMQZVCNO-ZLUOBGJFSA-N 0.000 description 2
- RFHGRMMADHHQSA-KBIXCLLPSA-N Cys-Gln-Ile Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RFHGRMMADHHQSA-KBIXCLLPSA-N 0.000 description 2
- RWAZRMXTVSIVJR-YUMQZZPRSA-N Cys-Gly-His Chemical compound [H]N[C@@H](CS)C(=O)NCC(=O)N[C@@H](CC1=CNC=N1)C(O)=O RWAZRMXTVSIVJR-YUMQZZPRSA-N 0.000 description 2
- OZHXXYOHPLLLMI-CIUDSAMLSA-N Cys-Lys-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OZHXXYOHPLLLMI-CIUDSAMLSA-N 0.000 description 2
- SMEYEQDCCBHTEF-FXQIFTODSA-N Cys-Pro-Ala Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O SMEYEQDCCBHTEF-FXQIFTODSA-N 0.000 description 2
- MQQLYEHXSBJTRK-FXQIFTODSA-N Cys-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CS)N MQQLYEHXSBJTRK-FXQIFTODSA-N 0.000 description 2
- MYMOFIZGZYHOMD-UHFFFAOYSA-N Dioxygen Chemical compound O=O MYMOFIZGZYHOMD-UHFFFAOYSA-N 0.000 description 2
- XZWYTXMRWQJBGX-VXBMVYAYSA-N FLAG peptide Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(O)=O)CC1=CC=C(O)C=C1 XZWYTXMRWQJBGX-VXBMVYAYSA-N 0.000 description 2
- PGPJSRSLQNXBDT-YUMQZZPRSA-N Gln-Arg-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O PGPJSRSLQNXBDT-YUMQZZPRSA-N 0.000 description 2
- MWLYSLMKFXWZPW-ZPFDUUQYSA-N Gln-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCC(N)=O MWLYSLMKFXWZPW-ZPFDUUQYSA-N 0.000 description 2
- WQWMZOIPXWSZNE-WDSKDSINSA-N Gln-Asp-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O WQWMZOIPXWSZNE-WDSKDSINSA-N 0.000 description 2
- AJDMYLOISOCHHC-YVNDNENWSA-N Gln-Gln-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O AJDMYLOISOCHHC-YVNDNENWSA-N 0.000 description 2
- MCAVASRGVBVPMX-FXQIFTODSA-N Gln-Glu-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MCAVASRGVBVPMX-FXQIFTODSA-N 0.000 description 2
- LFIVHGMKWFGUGK-IHRRRGAJSA-N Gln-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N LFIVHGMKWFGUGK-IHRRRGAJSA-N 0.000 description 2
- NSNUZSPSADIMJQ-WDSKDSINSA-N Gln-Gly-Asp Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NSNUZSPSADIMJQ-WDSKDSINSA-N 0.000 description 2
- CLPQUWHBWXFJOX-BQBZGAKWSA-N Gln-Gly-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O CLPQUWHBWXFJOX-BQBZGAKWSA-N 0.000 description 2
- HXOLDXKNWKLDMM-YVNDNENWSA-N Gln-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HXOLDXKNWKLDMM-YVNDNENWSA-N 0.000 description 2
- IHSGESFHTMFHRB-GUBZILKMSA-N Gln-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(N)=O IHSGESFHTMFHRB-GUBZILKMSA-N 0.000 description 2
- JRHPEMVLTRADLJ-AVGNSLFASA-N Gln-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JRHPEMVLTRADLJ-AVGNSLFASA-N 0.000 description 2
- FTTHLXOMDMLKKW-FHWLQOOXSA-N Gln-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(N)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FTTHLXOMDMLKKW-FHWLQOOXSA-N 0.000 description 2
- YPFFHGRJCUBXPX-NHCYSSNCSA-N Gln-Pro-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCC(N)=O)C(O)=O YPFFHGRJCUBXPX-NHCYSSNCSA-N 0.000 description 2
- WZZSKAJIHTUUSG-ACZMJKKPSA-N Glu-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O WZZSKAJIHTUUSG-ACZMJKKPSA-N 0.000 description 2
- SRZLHYPAOXBBSB-HJGDQZAQSA-N Glu-Arg-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SRZLHYPAOXBBSB-HJGDQZAQSA-N 0.000 description 2
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 2
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 2
- JVSBYEDSSRZQGV-GUBZILKMSA-N Glu-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O JVSBYEDSSRZQGV-GUBZILKMSA-N 0.000 description 2
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 2
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 2
- KUTPGXNAAOQSPD-LPEHRKFASA-N Glu-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O KUTPGXNAAOQSPD-LPEHRKFASA-N 0.000 description 2
- MTAOBYXRYJZRGQ-WDSKDSINSA-N Glu-Gly-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MTAOBYXRYJZRGQ-WDSKDSINSA-N 0.000 description 2
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 2
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 2
- WVWZIPOJECFDAG-AVGNSLFASA-N Glu-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N WVWZIPOJECFDAG-AVGNSLFASA-N 0.000 description 2
- YRMZCZIRHYCNHX-RYUDHWBXSA-N Glu-Phe-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O YRMZCZIRHYCNHX-RYUDHWBXSA-N 0.000 description 2
- YTRBQAQSUDSIQE-FHWLQOOXSA-N Glu-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 YTRBQAQSUDSIQE-FHWLQOOXSA-N 0.000 description 2
- MRWYPDWDZSLWJM-ACZMJKKPSA-N Glu-Ser-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O MRWYPDWDZSLWJM-ACZMJKKPSA-N 0.000 description 2
- RFTVTKBHDXCEEX-WDSKDSINSA-N Glu-Ser-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RFTVTKBHDXCEEX-WDSKDSINSA-N 0.000 description 2
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 2
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 2
- LZEUDRYSAZAJIO-AUTRQRHGSA-N Glu-Val-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LZEUDRYSAZAJIO-AUTRQRHGSA-N 0.000 description 2
- WGYHAAXZWPEBDQ-IFFSRLJSSA-N Glu-Val-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGYHAAXZWPEBDQ-IFFSRLJSSA-N 0.000 description 2
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 2
- CLODWIOAKCSBAN-BQBZGAKWSA-N Gly-Arg-Asp Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O CLODWIOAKCSBAN-BQBZGAKWSA-N 0.000 description 2
- OGCIHJPYKVSMTE-YUMQZZPRSA-N Gly-Arg-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O OGCIHJPYKVSMTE-YUMQZZPRSA-N 0.000 description 2
- WKJKBELXHCTHIJ-WPRPVWTQSA-N Gly-Arg-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N WKJKBELXHCTHIJ-WPRPVWTQSA-N 0.000 description 2
- CIMULJZTTOBOPN-WHFBIAKZSA-N Gly-Asn-Asn Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CIMULJZTTOBOPN-WHFBIAKZSA-N 0.000 description 2
- AIJAPFVDBFYNKN-WHFBIAKZSA-N Gly-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN)C(=O)N AIJAPFVDBFYNKN-WHFBIAKZSA-N 0.000 description 2
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 2
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 2
- OCDLPQDYTJPWNG-YUMQZZPRSA-N Gly-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN OCDLPQDYTJPWNG-YUMQZZPRSA-N 0.000 description 2
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 2
- YZPVGIVFMZLQMM-YUMQZZPRSA-N Gly-Gln-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN YZPVGIVFMZLQMM-YUMQZZPRSA-N 0.000 description 2
- HDNXXTBKOJKWNN-WDSKDSINSA-N Gly-Glu-Asn Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O HDNXXTBKOJKWNN-WDSKDSINSA-N 0.000 description 2
- FIQQRCFQXGLOSZ-WDSKDSINSA-N Gly-Glu-Asp Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FIQQRCFQXGLOSZ-WDSKDSINSA-N 0.000 description 2
- STVHDEHTKFXBJQ-LAEOZQHASA-N Gly-Glu-Ile Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STVHDEHTKFXBJQ-LAEOZQHASA-N 0.000 description 2
- CUYLIWAAAYJKJH-RYUDHWBXSA-N Gly-Glu-Tyr Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 CUYLIWAAAYJKJH-RYUDHWBXSA-N 0.000 description 2
- PDAWDNVHMUKWJR-ZETCQYMHSA-N Gly-Gly-His Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC1=CNC=N1 PDAWDNVHMUKWJR-ZETCQYMHSA-N 0.000 description 2
- QITBQGJOXQYMOA-ZETCQYMHSA-N Gly-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QITBQGJOXQYMOA-ZETCQYMHSA-N 0.000 description 2
- IVSWQHKONQIOHA-YUMQZZPRSA-N Gly-His-Cys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN IVSWQHKONQIOHA-YUMQZZPRSA-N 0.000 description 2
- SXJHOPPTOJACOA-QXEWZRGKSA-N Gly-Ile-Arg Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SXJHOPPTOJACOA-QXEWZRGKSA-N 0.000 description 2
- HAXARWKYFIIHKD-ZKWXMUAHSA-N Gly-Ile-Ser Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HAXARWKYFIIHKD-ZKWXMUAHSA-N 0.000 description 2
- NSTUFLGQJCOCDL-UWVGGRQHSA-N Gly-Leu-Arg Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NSTUFLGQJCOCDL-UWVGGRQHSA-N 0.000 description 2
- IUZGUFAJDBHQQV-YUMQZZPRSA-N Gly-Leu-Asn Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IUZGUFAJDBHQQV-YUMQZZPRSA-N 0.000 description 2
- BXICSAQLIHFDDL-YUMQZZPRSA-N Gly-Lys-Asn Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O BXICSAQLIHFDDL-YUMQZZPRSA-N 0.000 description 2
- GMTXWRIDLGTVFC-IUCAKERBSA-N Gly-Lys-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMTXWRIDLGTVFC-IUCAKERBSA-N 0.000 description 2
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 2
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 2
- IBYOLNARKHMLBG-WHOFXGATSA-N Gly-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IBYOLNARKHMLBG-WHOFXGATSA-N 0.000 description 2
- YLEIWGJJBFBFHC-KBPBESRZSA-N Gly-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 YLEIWGJJBFBFHC-KBPBESRZSA-N 0.000 description 2
- FEUPVVCGQLNXNP-IRXDYDNUSA-N Gly-Phe-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FEUPVVCGQLNXNP-IRXDYDNUSA-N 0.000 description 2
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 2
- VDCRBJACQKOSMS-JSGCOSHPSA-N Gly-Phe-Val Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O VDCRBJACQKOSMS-JSGCOSHPSA-N 0.000 description 2
- OCPPBNKYGYSLOE-IUCAKERBSA-N Gly-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN OCPPBNKYGYSLOE-IUCAKERBSA-N 0.000 description 2
- CSMYMGFCEJWALV-WDSKDSINSA-N Gly-Ser-Gln Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O CSMYMGFCEJWALV-WDSKDSINSA-N 0.000 description 2
- FGPLUIQCSKGLTI-WDSKDSINSA-N Gly-Ser-Glu Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O FGPLUIQCSKGLTI-WDSKDSINSA-N 0.000 description 2
- WNGHUXFWEWTKAO-YUMQZZPRSA-N Gly-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN WNGHUXFWEWTKAO-YUMQZZPRSA-N 0.000 description 2
- LCRDMSSAKLTKBU-ZDLURKLDSA-N Gly-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN LCRDMSSAKLTKBU-ZDLURKLDSA-N 0.000 description 2
- ZLCLYFGMKFCDCN-XPUUQOCRSA-N Gly-Ser-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CO)NC(=O)CN)C(O)=O ZLCLYFGMKFCDCN-XPUUQOCRSA-N 0.000 description 2
- NVTPVQLIZCOJFK-FOHZUACHSA-N Gly-Thr-Asp Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O NVTPVQLIZCOJFK-FOHZUACHSA-N 0.000 description 2
- MYXNLWDWWOTERK-BHNWBGBOSA-N Gly-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN)O MYXNLWDWWOTERK-BHNWBGBOSA-N 0.000 description 2
- UVTSZKIATYSKIR-RYUDHWBXSA-N Gly-Tyr-Glu Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O UVTSZKIATYSKIR-RYUDHWBXSA-N 0.000 description 2
- GWCJMBNBFYBQCV-XPUUQOCRSA-N Gly-Val-Ala Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O GWCJMBNBFYBQCV-XPUUQOCRSA-N 0.000 description 2
- SYOJVRNQCXYEOV-XVKPBYJWSA-N Gly-Val-Glu Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SYOJVRNQCXYEOV-XVKPBYJWSA-N 0.000 description 2
- YGHSQRJSHKYUJY-SCZZXKLOSA-N Gly-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN YGHSQRJSHKYUJY-SCZZXKLOSA-N 0.000 description 2
- DZMVESFTHXSSPZ-XVYDVKMFSA-N His-Ala-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DZMVESFTHXSSPZ-XVYDVKMFSA-N 0.000 description 2
- CJGDTAHEMXLRMB-ULQDDVLXSA-N His-Arg-Phe Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O CJGDTAHEMXLRMB-ULQDDVLXSA-N 0.000 description 2
- 108010093488 His-His-His-His-His-His Proteins 0.000 description 2
- CMPHFUWXKBPNRS-WDSOQIARSA-N His-Val-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CNC=N1 CMPHFUWXKBPNRS-WDSOQIARSA-N 0.000 description 2
- 101000947120 Homo sapiens Beta-casein Proteins 0.000 description 2
- QLRMMMQNCWBNPQ-QXEWZRGKSA-N Ile-Arg-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)O)N QLRMMMQNCWBNPQ-QXEWZRGKSA-N 0.000 description 2
- NKRJALPCDNXULF-BYULHYEWSA-N Ile-Asp-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O NKRJALPCDNXULF-BYULHYEWSA-N 0.000 description 2
- NPROWIBAWYMPAZ-GUDRVLHUSA-N Ile-Asp-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N NPROWIBAWYMPAZ-GUDRVLHUSA-N 0.000 description 2
- GECLQMBTZCPAFY-PEFMBERDSA-N Ile-Gln-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GECLQMBTZCPAFY-PEFMBERDSA-N 0.000 description 2
- LPXHYGGZJOCAFR-MNXVOIDGSA-N Ile-Glu-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N LPXHYGGZJOCAFR-MNXVOIDGSA-N 0.000 description 2
- NZOCIWKZUVUNDW-ZKWXMUAHSA-N Ile-Gly-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O NZOCIWKZUVUNDW-ZKWXMUAHSA-N 0.000 description 2
- ODPKZZLRDNXTJZ-WHOFXGATSA-N Ile-Gly-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N ODPKZZLRDNXTJZ-WHOFXGATSA-N 0.000 description 2
- KYLIZSDYWQQTFM-PEDHHIEDSA-N Ile-Ile-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N KYLIZSDYWQQTFM-PEDHHIEDSA-N 0.000 description 2
- HPCFRQWLTRDGHT-AJNGGQMLSA-N Ile-Leu-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O HPCFRQWLTRDGHT-AJNGGQMLSA-N 0.000 description 2
- XDUVMJCBYUKNFJ-MXAVVETBSA-N Ile-Lys-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N XDUVMJCBYUKNFJ-MXAVVETBSA-N 0.000 description 2
- GLYJPWIRLBAIJH-FQUUOJAGSA-N Ile-Lys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N GLYJPWIRLBAIJH-FQUUOJAGSA-N 0.000 description 2
- UAELWXJFLZBKQS-WHOFXGATSA-N Ile-Phe-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O UAELWXJFLZBKQS-WHOFXGATSA-N 0.000 description 2
- CIJLNXXMDUOFPH-HJWJTTGWSA-N Ile-Pro-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 CIJLNXXMDUOFPH-HJWJTTGWSA-N 0.000 description 2
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 2
- VGSPNSSCMOHRRR-BJDJZHNGSA-N Ile-Ser-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N VGSPNSSCMOHRRR-BJDJZHNGSA-N 0.000 description 2
- DGTOKVBDZXJHNZ-WZLNRYEVSA-N Ile-Thr-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N DGTOKVBDZXJHNZ-WZLNRYEVSA-N 0.000 description 2
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 2
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 2
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 2
- TYYLDKGBCJGJGW-UHFFFAOYSA-N L-tryptophan-L-tyrosine Natural products C=1NC2=CC=CC=C2C=1CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 TYYLDKGBCJGJGW-UHFFFAOYSA-N 0.000 description 2
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 2
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 2
- CZCSUZMIRKFFFA-CIUDSAMLSA-N Leu-Ala-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O CZCSUZMIRKFFFA-CIUDSAMLSA-N 0.000 description 2
- KSZCCRIGNVSHFH-UWVGGRQHSA-N Leu-Arg-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O KSZCCRIGNVSHFH-UWVGGRQHSA-N 0.000 description 2
- DBVWMYGBVFCRBE-CIUDSAMLSA-N Leu-Asn-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DBVWMYGBVFCRBE-CIUDSAMLSA-N 0.000 description 2
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 2
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 2
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 2
- NEEOBPIXKWSBRF-IUCAKERBSA-N Leu-Glu-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O NEEOBPIXKWSBRF-IUCAKERBSA-N 0.000 description 2
- JNDYEOUZBLOVOF-AVGNSLFASA-N Leu-Leu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JNDYEOUZBLOVOF-AVGNSLFASA-N 0.000 description 2
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 2
- QNTJIDXQHWUBKC-BZSNNMDCSA-N Leu-Lys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNTJIDXQHWUBKC-BZSNNMDCSA-N 0.000 description 2
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 2
- SYRTUBLKWNDSDK-DKIMLUQUSA-N Leu-Phe-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYRTUBLKWNDSDK-DKIMLUQUSA-N 0.000 description 2
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 2
- YRRCOJOXAJNSAX-IHRRRGAJSA-N Leu-Pro-Lys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N YRRCOJOXAJNSAX-IHRRRGAJSA-N 0.000 description 2
- JDBQSGMJBMPNFT-AVGNSLFASA-N Leu-Pro-Val Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O JDBQSGMJBMPNFT-AVGNSLFASA-N 0.000 description 2
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 2
- FZIJIFCXUCZHOL-CIUDSAMLSA-N Lys-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN FZIJIFCXUCZHOL-CIUDSAMLSA-N 0.000 description 2
- NCTDKZKNBDZDOL-GARJFASQSA-N Lys-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O NCTDKZKNBDZDOL-GARJFASQSA-N 0.000 description 2
- IBQMEXQYZMVIFU-SRVKXCTJSA-N Lys-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N IBQMEXQYZMVIFU-SRVKXCTJSA-N 0.000 description 2
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 2
- SSJBMGCZZXCGJJ-DCAQKATOSA-N Lys-Asp-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O SSJBMGCZZXCGJJ-DCAQKATOSA-N 0.000 description 2
- PBIPLDMFHAICIP-DCAQKATOSA-N Lys-Glu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PBIPLDMFHAICIP-DCAQKATOSA-N 0.000 description 2
- ISHNZELVUVPCHY-ZETCQYMHSA-N Lys-Gly-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O ISHNZELVUVPCHY-ZETCQYMHSA-N 0.000 description 2
- NKKFVJRLCCUJNA-QWRGUYRKSA-N Lys-Gly-Lys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN NKKFVJRLCCUJNA-QWRGUYRKSA-N 0.000 description 2
- WRODMZBHNNPRLN-SRVKXCTJSA-N Lys-Leu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O WRODMZBHNNPRLN-SRVKXCTJSA-N 0.000 description 2
- ATNKHRAIZCMCCN-BZSNNMDCSA-N Lys-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N ATNKHRAIZCMCCN-BZSNNMDCSA-N 0.000 description 2
- GZGWILAQHOVXTD-DCAQKATOSA-N Lys-Met-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O GZGWILAQHOVXTD-DCAQKATOSA-N 0.000 description 2
- TWPCWKVOZDUYAA-KKUMJFAQSA-N Lys-Phe-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O TWPCWKVOZDUYAA-KKUMJFAQSA-N 0.000 description 2
- CNGOEHJCLVCJHN-SRVKXCTJSA-N Lys-Pro-Glu Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O CNGOEHJCLVCJHN-SRVKXCTJSA-N 0.000 description 2
- HKXSZKJMDBHOTG-CIUDSAMLSA-N Lys-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN HKXSZKJMDBHOTG-CIUDSAMLSA-N 0.000 description 2
- SBQDRNOLGSYHQA-YUMQZZPRSA-N Lys-Ser-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SBQDRNOLGSYHQA-YUMQZZPRSA-N 0.000 description 2
- JHNOXVASMSXSNB-WEDXCCLWSA-N Lys-Thr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JHNOXVASMSXSNB-WEDXCCLWSA-N 0.000 description 2
- CAVRAQIDHUPECU-UVOCVTCTSA-N Lys-Thr-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAVRAQIDHUPECU-UVOCVTCTSA-N 0.000 description 2
- VHTOGMKQXXJOHG-RHYQMDGZSA-N Lys-Thr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O VHTOGMKQXXJOHG-RHYQMDGZSA-N 0.000 description 2
- TXTZMVNJIRZABH-ULQDDVLXSA-N Lys-Val-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TXTZMVNJIRZABH-ULQDDVLXSA-N 0.000 description 2
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 2
- 108010052285 Membrane Proteins Proteins 0.000 description 2
- DBXMFHGGHMXYHY-DCAQKATOSA-N Met-Leu-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O DBXMFHGGHMXYHY-DCAQKATOSA-N 0.000 description 2
- JQHYVIKEFYETEW-IHRRRGAJSA-N Met-Phe-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=CC=C1 JQHYVIKEFYETEW-IHRRRGAJSA-N 0.000 description 2
- VSJAPSMRFYUOKS-IUCAKERBSA-N Met-Pro-Gly Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O VSJAPSMRFYUOKS-IUCAKERBSA-N 0.000 description 2
- RDLSEGZJMYGFNS-FXQIFTODSA-N Met-Ser-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RDLSEGZJMYGFNS-FXQIFTODSA-N 0.000 description 2
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 2
- PXHVJJICTQNCMI-UHFFFAOYSA-N Nickel Chemical compound [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 2
- 102000007399 Nuclear hormone receptor Human genes 0.000 description 2
- 101000621749 Penicillium citrinum Peroxiredoxin Pen c 3 Proteins 0.000 description 2
- 108010025366 Peroxins Proteins 0.000 description 2
- 102000013772 Peroxins Human genes 0.000 description 2
- 102100036598 Peroxisomal targeting signal 1 receptor Human genes 0.000 description 2
- BJEYSVHMGIJORT-NHCYSSNCSA-N Phe-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BJEYSVHMGIJORT-NHCYSSNCSA-N 0.000 description 2
- HGNGAMWHGGANAU-WHOFXGATSA-N Phe-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HGNGAMWHGGANAU-WHOFXGATSA-N 0.000 description 2
- APJPXSFJBMMOLW-KBPBESRZSA-N Phe-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 APJPXSFJBMMOLW-KBPBESRZSA-N 0.000 description 2
- NHCKESBLOMHIIE-IRXDYDNUSA-N Phe-Gly-Phe Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NHCKESBLOMHIIE-IRXDYDNUSA-N 0.000 description 2
- KDYPMIZMXDECSU-JYJNAYRXSA-N Phe-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KDYPMIZMXDECSU-JYJNAYRXSA-N 0.000 description 2
- XOHJOMKCRLHGCY-UNQGMJICSA-N Phe-Pro-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOHJOMKCRLHGCY-UNQGMJICSA-N 0.000 description 2
- MVIJMIZJPHQGEN-IHRRRGAJSA-N Phe-Ser-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@H](CO)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 MVIJMIZJPHQGEN-IHRRRGAJSA-N 0.000 description 2
- VGTJSEYTVMAASM-RPTUDFQQSA-N Phe-Thr-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VGTJSEYTVMAASM-RPTUDFQQSA-N 0.000 description 2
- YUPRIZTWANWWHK-DZKIICNBSA-N Phe-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N YUPRIZTWANWWHK-DZKIICNBSA-N 0.000 description 2
- 102000004160 Phosphoric Monoester Hydrolases Human genes 0.000 description 2
- 108090000608 Phosphoric Monoester Hydrolases Proteins 0.000 description 2
- XQLBWXHVZVBNJM-FXQIFTODSA-N Pro-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 XQLBWXHVZVBNJM-FXQIFTODSA-N 0.000 description 2
- UPJGUQPLYWTISV-GUBZILKMSA-N Pro-Gln-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UPJGUQPLYWTISV-GUBZILKMSA-N 0.000 description 2
- SKICPQLTOXGWGO-GARJFASQSA-N Pro-Gln-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)N)C(=O)N2CCC[C@@H]2C(=O)O SKICPQLTOXGWGO-GARJFASQSA-N 0.000 description 2
- NMELOOXSGDRBRU-YUMQZZPRSA-N Pro-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1 NMELOOXSGDRBRU-YUMQZZPRSA-N 0.000 description 2
- FEPSEIDIPBMIOS-QXEWZRGKSA-N Pro-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEPSEIDIPBMIOS-QXEWZRGKSA-N 0.000 description 2
- ZTMLZUNPFDGPKY-VKOGCVSHSA-N Pro-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@@H]3CCCN3 ZTMLZUNPFDGPKY-VKOGCVSHSA-N 0.000 description 2
- XQPHBAKJJJZOBX-SRVKXCTJSA-N Pro-Lys-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O XQPHBAKJJJZOBX-SRVKXCTJSA-N 0.000 description 2
- MHHQQZIFLWFZGR-DCAQKATOSA-N Pro-Lys-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O MHHQQZIFLWFZGR-DCAQKATOSA-N 0.000 description 2
- WIPAMEKBSHNFQE-IUCAKERBSA-N Pro-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@@H]1CCCN1 WIPAMEKBSHNFQE-IUCAKERBSA-N 0.000 description 2
- 102000001253 Protein Kinase Human genes 0.000 description 2
- 102000004879 Racemases and epimerases Human genes 0.000 description 2
- 108090001066 Racemases and epimerases Proteins 0.000 description 2
- MUPFEKGTMRGPLJ-RMMQSMQOSA-N Raffinose Natural products O(C[C@H]1[C@@H](O)[C@H](O)[C@@H](O)[C@@H](O[C@@]2(CO)[C@H](O)[C@@H](O)[C@@H](CO)O2)O1)[C@@H]1[C@H](O)[C@@H](O)[C@@H](O)[C@@H](CO)O1 MUPFEKGTMRGPLJ-RMMQSMQOSA-N 0.000 description 2
- 108091027981 Response element Proteins 0.000 description 2
- 241000607142 Salmonella Species 0.000 description 2
- WTUJZHKANPDPIN-CIUDSAMLSA-N Ser-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N WTUJZHKANPDPIN-CIUDSAMLSA-N 0.000 description 2
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 2
- FTVRVZNYIYWJGB-ACZMJKKPSA-N Ser-Asp-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FTVRVZNYIYWJGB-ACZMJKKPSA-N 0.000 description 2
- SFZKGGOGCNQPJY-CIUDSAMLSA-N Ser-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N SFZKGGOGCNQPJY-CIUDSAMLSA-N 0.000 description 2
- BGOWRLSWJCVYAQ-CIUDSAMLSA-N Ser-Asp-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BGOWRLSWJCVYAQ-CIUDSAMLSA-N 0.000 description 2
- SWSRFJZZMNLMLY-ZKWXMUAHSA-N Ser-Asp-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O SWSRFJZZMNLMLY-ZKWXMUAHSA-N 0.000 description 2
- ZOHGLPQGEHSLPD-FXQIFTODSA-N Ser-Gln-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZOHGLPQGEHSLPD-FXQIFTODSA-N 0.000 description 2
- YRBGKVIWMNEVCZ-WDSKDSINSA-N Ser-Glu-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YRBGKVIWMNEVCZ-WDSKDSINSA-N 0.000 description 2
- BRGQQXQKPUCUJQ-KBIXCLLPSA-N Ser-Glu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRGQQXQKPUCUJQ-KBIXCLLPSA-N 0.000 description 2
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 2
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 2
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical group CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 2
- TVPQRPNBYCRRLL-IHRRRGAJSA-N Ser-Phe-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O TVPQRPNBYCRRLL-IHRRRGAJSA-N 0.000 description 2
- GYDFRTRSSXOZCR-ACZMJKKPSA-N Ser-Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GYDFRTRSSXOZCR-ACZMJKKPSA-N 0.000 description 2
- RXUOAOOZIWABBW-XGEHTFHBSA-N Ser-Thr-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RXUOAOOZIWABBW-XGEHTFHBSA-N 0.000 description 2
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 2
- 101710172711 Structural protein Proteins 0.000 description 2
- 108700005078 Synthetic Genes Proteins 0.000 description 2
- FQPQPTHMHZKGFM-XQXXSGGOSA-N Thr-Ala-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O FQPQPTHMHZKGFM-XQXXSGGOSA-N 0.000 description 2
- MFEBUIFJVPNZLO-OLHMAJIHSA-N Thr-Asp-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O MFEBUIFJVPNZLO-OLHMAJIHSA-N 0.000 description 2
- DSLHSTIUAPKERR-XGEHTFHBSA-N Thr-Cys-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O DSLHSTIUAPKERR-XGEHTFHBSA-N 0.000 description 2
- UDQBCBUXAQIZAK-GLLZPBPUSA-N Thr-Glu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDQBCBUXAQIZAK-GLLZPBPUSA-N 0.000 description 2
- NIEWSKWFURSECR-FOHZUACHSA-N Thr-Gly-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NIEWSKWFURSECR-FOHZUACHSA-N 0.000 description 2
- ZTPXSEUVYNNZRB-CDMKHQONSA-N Thr-Gly-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZTPXSEUVYNNZRB-CDMKHQONSA-N 0.000 description 2
- DJDSEDOKJTZBAR-ZDLURKLDSA-N Thr-Gly-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O DJDSEDOKJTZBAR-ZDLURKLDSA-N 0.000 description 2
- WBCCCPZIJIJTSD-TUBUOCAGSA-N Thr-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H]([C@@H](C)O)N WBCCCPZIJIJTSD-TUBUOCAGSA-N 0.000 description 2
- AHOLTQCAVBSUDP-PPCPHDFISA-N Thr-Ile-Lys Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)[C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O AHOLTQCAVBSUDP-PPCPHDFISA-N 0.000 description 2
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 2
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 2
- JWQNAFHCXKVZKZ-UVOCVTCTSA-N Thr-Lys-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWQNAFHCXKVZKZ-UVOCVTCTSA-N 0.000 description 2
- KKPOGALELPLJTL-MEYUZBJRSA-N Thr-Lys-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KKPOGALELPLJTL-MEYUZBJRSA-N 0.000 description 2
- XKWABWFMQXMUMT-HJGDQZAQSA-N Thr-Pro-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XKWABWFMQXMUMT-HJGDQZAQSA-N 0.000 description 2
- STUAPCLEDMKXKL-LKXGYXEUSA-N Thr-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O STUAPCLEDMKXKL-LKXGYXEUSA-N 0.000 description 2
- SGAOHNPSEPVAFP-ZDLURKLDSA-N Thr-Ser-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SGAOHNPSEPVAFP-ZDLURKLDSA-N 0.000 description 2
- ILUOMMDDGREELW-OSUNSFLBSA-N Thr-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)O ILUOMMDDGREELW-OSUNSFLBSA-N 0.000 description 2
- BKVICMPZWRNWOC-RHYQMDGZSA-N Thr-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)O BKVICMPZWRNWOC-RHYQMDGZSA-N 0.000 description 2
- HLDFBNPSURDYEN-VHWLVUOQSA-N Trp-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N HLDFBNPSURDYEN-VHWLVUOQSA-N 0.000 description 2
- ULHASJWZGUEUNN-XIRDDKMYSA-N Trp-Lys-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O ULHASJWZGUEUNN-XIRDDKMYSA-N 0.000 description 2
- AZGZDDNKFFUDEH-QWRGUYRKSA-N Tyr-Gly-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AZGZDDNKFFUDEH-QWRGUYRKSA-N 0.000 description 2
- HVPPEXXUDXAPOM-MGHWNKPDSA-N Tyr-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HVPPEXXUDXAPOM-MGHWNKPDSA-N 0.000 description 2
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 2
- MUPFEKGTMRGPLJ-UHFFFAOYSA-N UNPD196149 Natural products OC1C(O)C(CO)OC1(CO)OC1C(O)C(O)C(O)C(COC2C(C(O)C(O)C(CO)O2)O)O1 MUPFEKGTMRGPLJ-UHFFFAOYSA-N 0.000 description 2
- ZLFHAAGHGQBQQN-AEJSXWLSSA-N Val-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZLFHAAGHGQBQQN-AEJSXWLSSA-N 0.000 description 2
- LABUITCFCAABSV-UHFFFAOYSA-N Val-Ala-Tyr Natural products CC(C)C(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LABUITCFCAABSV-UHFFFAOYSA-N 0.000 description 2
- JYVKKBDANPZIAW-AVGNSLFASA-N Val-Arg-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](C(C)C)N JYVKKBDANPZIAW-AVGNSLFASA-N 0.000 description 2
- UDLYXGYWTVOIKU-QXEWZRGKSA-N Val-Asn-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UDLYXGYWTVOIKU-QXEWZRGKSA-N 0.000 description 2
- SRWWRLKBEJZFPW-IHRRRGAJSA-N Val-Cys-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N SRWWRLKBEJZFPW-IHRRRGAJSA-N 0.000 description 2
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 2
- CELJCNRXKZPTCX-XPUUQOCRSA-N Val-Gly-Ala Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O CELJCNRXKZPTCX-XPUUQOCRSA-N 0.000 description 2
- WFENBJPLZMPVAX-XVKPBYJWSA-N Val-Gly-Glu Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O WFENBJPLZMPVAX-XVKPBYJWSA-N 0.000 description 2
- SYOMXKPPFZRELL-ONGXEEELSA-N Val-Gly-Lys Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N SYOMXKPPFZRELL-ONGXEEELSA-N 0.000 description 2
- HQYVQDRYODWONX-DCAQKATOSA-N Val-His-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CO)C(=O)O)N HQYVQDRYODWONX-DCAQKATOSA-N 0.000 description 2
- CPGJELLYDQEDRK-NAKRPEOUSA-N Val-Ile-Ala Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](C)C(O)=O CPGJELLYDQEDRK-NAKRPEOUSA-N 0.000 description 2
- UKEVLVBHRKWECS-LSJOCFKGSA-N Val-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](C(C)C)N UKEVLVBHRKWECS-LSJOCFKGSA-N 0.000 description 2
- KTEZUXISLQTDDQ-NHCYSSNCSA-N Val-Lys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KTEZUXISLQTDDQ-NHCYSSNCSA-N 0.000 description 2
- LTTQCQRTSHJPPL-ZKWXMUAHSA-N Val-Ser-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N LTTQCQRTSHJPPL-ZKWXMUAHSA-N 0.000 description 2
- UGFMVXRXULGLNO-XPUUQOCRSA-N Val-Ser-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O UGFMVXRXULGLNO-XPUUQOCRSA-N 0.000 description 2
- OFTXTCGQJXTNQS-XGEHTFHBSA-N Val-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N)O OFTXTCGQJXTNQS-XGEHTFHBSA-N 0.000 description 2
- CFIBZQOLUDURST-IHRRRGAJSA-N Val-Tyr-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CS)C(=O)O)N CFIBZQOLUDURST-IHRRRGAJSA-N 0.000 description 2
- AEFJNECXZCODJM-UWVGGRQHSA-N Val-Val-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C(=O)NCC([O-])=O AEFJNECXZCODJM-UWVGGRQHSA-N 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 238000001042 affinity chromatography Methods 0.000 description 2
- 238000013019 agitation Methods 0.000 description 2
- 108010041407 alanylaspartic acid Proteins 0.000 description 2
- 239000012237 artificial material Substances 0.000 description 2
- 229960005261 aspartic acid Drugs 0.000 description 2
- 235000003704 aspartic acid Nutrition 0.000 description 2
- 108010027234 aspartyl-glycyl-glutamyl-alanine Proteins 0.000 description 2
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 2
- 229940065181 bacillus anthracis Drugs 0.000 description 2
- 230000004888 barrier function Effects 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 2
- 239000002775 capsule Substances 0.000 description 2
- 125000000837 carbohydrate group Chemical group 0.000 description 2
- 230000006037 cell lysis Effects 0.000 description 2
- 239000006285 cell suspension Substances 0.000 description 2
- 230000001413 cellular effect Effects 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- HVYWMOMLDIMFJA-DPAQBDIFSA-N cholesterol Chemical compound C1C=C2C[C@@H](O)CC[C@]2(C)[C@@H]2[C@@H]1[C@@H]1CC[C@H]([C@H](C)CCCC(C)C)[C@@]1(C)CC2 HVYWMOMLDIMFJA-DPAQBDIFSA-N 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 230000000875 corresponding effect Effects 0.000 description 2
- 108091007930 cytoplasmic receptors Proteins 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 239000008121 dextrose Substances 0.000 description 2
- 235000014113 dietary fatty acids Nutrition 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 108010009297 diglycyl-histidine Proteins 0.000 description 2
- RTZKZFJDLAIYFH-UHFFFAOYSA-N ether Substances CCOCC RTZKZFJDLAIYFH-UHFFFAOYSA-N 0.000 description 2
- 229930195729 fatty acid Natural products 0.000 description 2
- 239000000194 fatty acid Substances 0.000 description 2
- 150000004665 fatty acids Chemical class 0.000 description 2
- MHMNJMPURVTYEJ-UHFFFAOYSA-N fluorescein-5-isothiocyanate Chemical compound O1C(=O)C2=CC(N=C=S)=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 MHMNJMPURVTYEJ-UHFFFAOYSA-N 0.000 description 2
- 235000013305 food Nutrition 0.000 description 2
- 235000013373 food additive Nutrition 0.000 description 2
- 239000002778 food additive Substances 0.000 description 2
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 2
- 239000012213 gelatinous substance Substances 0.000 description 2
- 239000011521 glass Substances 0.000 description 2
- 125000003147 glycosyl group Chemical group 0.000 description 2
- 230000013595 glycosylation Effects 0.000 description 2
- 238000006206 glycosylation reaction Methods 0.000 description 2
- 108700014210 glycosyltransferase activity proteins Proteins 0.000 description 2
- 125000003630 glycyl group Chemical group [H]N([H])C([H])([H])C(*)=O 0.000 description 2
- 108010010096 glycyl-glycyl-tyrosine Proteins 0.000 description 2
- 108010084760 glycyl-tyrosyl-glycyl-aspartate Proteins 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- 210000005260 human cell Anatomy 0.000 description 2
- 238000000338 in vitro Methods 0.000 description 2
- 238000010348 incorporation Methods 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 239000002649 leather substitute Substances 0.000 description 2
- 108010076756 leucyl-alanyl-phenylalanine Proteins 0.000 description 2
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 2
- 108010038320 lysylphenylalanine Proteins 0.000 description 2
- 108010056929 lyticase Proteins 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 235000013372 meat Nutrition 0.000 description 2
- 210000004940 nucleus Anatomy 0.000 description 2
- 239000002417 nutraceutical Substances 0.000 description 2
- 235000021436 nutraceutical agent Nutrition 0.000 description 2
- 229910052760 oxygen Inorganic materials 0.000 description 2
- 239000001301 oxygen Substances 0.000 description 2
- 108010073101 phenylalanylleucine Proteins 0.000 description 2
- 239000006187 pill Substances 0.000 description 2
- 229920002791 poly-4-hydroxybutyrate Polymers 0.000 description 2
- 238000003752 polymerase chain reaction Methods 0.000 description 2
- 230000013823 prenylation Effects 0.000 description 2
- 150000003148 prolines Chemical class 0.000 description 2
- 108010007513 prolyl-glycyl-prolyl-leucine Proteins 0.000 description 2
- 108700042769 prolyl-leucyl-glycine Proteins 0.000 description 2
- 108060006633 protein kinase Proteins 0.000 description 2
- 150000003212 purines Chemical class 0.000 description 2
- 150000003230 pyrimidines Chemical class 0.000 description 2
- MUPFEKGTMRGPLJ-ZQSKZDJDSA-N raffinose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO[C@@H]2[C@@H]([C@@H](O)[C@@H](O)[C@@H](CO)O2)O)O1 MUPFEKGTMRGPLJ-ZQSKZDJDSA-N 0.000 description 2
- 238000001542 size-exclusion chromatography Methods 0.000 description 2
- 150000003384 small molecules Chemical class 0.000 description 2
- 125000001424 substituent group Chemical group 0.000 description 2
- 239000006228 supernatant Substances 0.000 description 2
- MPLHNVLQVRSVEE-UHFFFAOYSA-N texas red Chemical compound [O-]S(=O)(=O)C1=CC(S(Cl)(=O)=O)=CC=C1C(C1=CC=2CCCN3CCCC(C=23)=C1O1)=C2C1=C(CCC1)C3=[N+]1CCCC3=C2 MPLHNVLQVRSVEE-UHFFFAOYSA-N 0.000 description 2
- 239000003053 toxin Substances 0.000 description 2
- 231100000765 toxin Toxicity 0.000 description 2
- 230000002103 transcriptional effect Effects 0.000 description 2
- 108010044292 tryptophyltyrosine Proteins 0.000 description 2
- 229960004441 tyrosine Drugs 0.000 description 2
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 2
- 108010020532 tyrosyl-proline Proteins 0.000 description 2
- 230000034512 ubiquitination Effects 0.000 description 2
- 238000010798 ubiquitination Methods 0.000 description 2
- 229960004295 valine Drugs 0.000 description 2
- 239000004474 valine Substances 0.000 description 2
- 235000014393 valine Nutrition 0.000 description 2
- 108010011876 valyl-glycyl-valyl-alanyl-prolyl-glycine Proteins 0.000 description 2
- GJLXVWOMRRWCIB-MERZOTPQSA-N (2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-acetamido-5-(diaminomethylideneamino)pentanoyl]amino]-3-(4-hydroxyphenyl)propanoyl]amino]-3-(4-hydroxyphenyl)propanoyl]amino]-5-(diaminomethylideneamino)pentanoyl]amino]-3-(1H-indol-3-yl)propanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanamide Chemical compound C([C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(N)=O)C1=CC=C(O)C=C1 GJLXVWOMRRWCIB-MERZOTPQSA-N 0.000 description 1
- FDKWRPBBCBCIGA-REOHCLBHSA-N (2r)-2-azaniumyl-3-λ¹-selanylpropanoate Chemical compound [Se]C[C@H](N)C(O)=O FDKWRPBBCBCIGA-REOHCLBHSA-N 0.000 description 1
- CWFMWBHMIMNZLN-NAKRPEOUSA-N (2s)-1-[(2s)-2-[[(2s,3s)-2-amino-3-methylpentanoyl]amino]propanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CWFMWBHMIMNZLN-NAKRPEOUSA-N 0.000 description 1
- VWOBAQCHQNVOSG-QOFFQMATSA-N (2s)-2-[[(2s)-2-[[(2s,3s)-2-[[(2s)-2-[[(2s)-4-amino-2-[[(2s)-6-amino-2-[[(2s)-6-amino-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2,6-diaminohexanoyl]amino]-5-(diaminomethylideneamino)pentanoyl]amino]-5-(diaminomethylideneamino)pentanoyl]amino]-3-(1h-indol-3-yl)p Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCCCN)C1=CC=CC=C1 VWOBAQCHQNVOSG-QOFFQMATSA-N 0.000 description 1
- JBFQOLHAGBKPTP-NZATWWQASA-N (2s)-2-[[(2s)-4-carboxy-2-[[3-carboxy-2-[[(2s)-2,6-diaminohexanoyl]amino]propanoyl]amino]butanoyl]amino]-4-methylpentanoic acid Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)C(CC(O)=O)NC(=O)[C@@H](N)CCCCN JBFQOLHAGBKPTP-NZATWWQASA-N 0.000 description 1
- OTEWWRBKGONZBW-UHFFFAOYSA-N 2-[[2-[[2-[(2-azaniumylacetyl)amino]-4-methylpentanoyl]amino]acetyl]amino]acetate Chemical compound NCC(=O)NC(CC(C)C)C(=O)NCC(=O)NCC(O)=O OTEWWRBKGONZBW-UHFFFAOYSA-N 0.000 description 1
- XWTNPSHCJMZAHQ-QMMMGPOBSA-N 2-[[2-[[2-[[(2s)-2-amino-4-methylpentanoyl]amino]acetyl]amino]acetyl]amino]acetic acid Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(=O)NCC(O)=O XWTNPSHCJMZAHQ-QMMMGPOBSA-N 0.000 description 1
- KPGXRSRHYNQIFN-UHFFFAOYSA-L 2-oxoglutarate(2-) Chemical compound [O-]C(=O)CCC(=O)C([O-])=O KPGXRSRHYNQIFN-UHFFFAOYSA-L 0.000 description 1
- QFVHZQCOUORWEI-UHFFFAOYSA-N 4-[(4-anilino-5-sulfonaphthalen-1-yl)diazenyl]-5-hydroxynaphthalene-2,7-disulfonic acid Chemical compound C=12C(O)=CC(S(O)(=O)=O)=CC2=CC(S(O)(=O)=O)=CC=1N=NC(C1=CC=CC(=C11)S(O)(=O)=O)=CC=C1NC1=CC=CC=C1 QFVHZQCOUORWEI-UHFFFAOYSA-N 0.000 description 1
- FVFVNNKYKYZTJU-UHFFFAOYSA-N 6-chloro-1,3,5-triazine-2,4-diamine Chemical group NC1=NC(N)=NC(Cl)=N1 FVFVNNKYKYZTJU-UHFFFAOYSA-N 0.000 description 1
- 102100026802 72 kDa type IV collagenase Human genes 0.000 description 1
- 101710151806 72 kDa type IV collagenase Proteins 0.000 description 1
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical group N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 1
- 102100032310 A disintegrin and metalloproteinase with thrombospondin motifs 14 Human genes 0.000 description 1
- 102100027399 A disintegrin and metalloproteinase with thrombospondin motifs 2 Human genes 0.000 description 1
- 101710100366 A disintegrin and metalloproteinase with thrombospondin motifs 2 Proteins 0.000 description 1
- 102100027400 A disintegrin and metalloproteinase with thrombospondin motifs 4 Human genes 0.000 description 1
- IMMKUCQIKKXKNP-DCAQKATOSA-N Ala-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCN=C(N)N IMMKUCQIKKXKNP-DCAQKATOSA-N 0.000 description 1
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 1
- NXSFUECZFORGOG-CIUDSAMLSA-N Ala-Asn-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXSFUECZFORGOG-CIUDSAMLSA-N 0.000 description 1
- PBAMJJXWDQXOJA-FXQIFTODSA-N Ala-Asp-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PBAMJJXWDQXOJA-FXQIFTODSA-N 0.000 description 1
- LZRNYBIJOSKKRJ-XVYDVKMFSA-N Ala-Asp-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LZRNYBIJOSKKRJ-XVYDVKMFSA-N 0.000 description 1
- IYCZBJXFSZSHPN-DLOVCJGASA-N Ala-Cys-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IYCZBJXFSZSHPN-DLOVCJGASA-N 0.000 description 1
- CZPAHAKGPDUIPJ-CIUDSAMLSA-N Ala-Gln-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(O)=O CZPAHAKGPDUIPJ-CIUDSAMLSA-N 0.000 description 1
- FUSPCLTUKXQREV-ACZMJKKPSA-N Ala-Glu-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O FUSPCLTUKXQREV-ACZMJKKPSA-N 0.000 description 1
- BGNLUHXLSAQYRQ-FXQIFTODSA-N Ala-Glu-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O BGNLUHXLSAQYRQ-FXQIFTODSA-N 0.000 description 1
- WKOBSJOZRJJVRZ-FXQIFTODSA-N Ala-Glu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WKOBSJOZRJJVRZ-FXQIFTODSA-N 0.000 description 1
- GGNHBHYDMUDXQB-KBIXCLLPSA-N Ala-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)N GGNHBHYDMUDXQB-KBIXCLLPSA-N 0.000 description 1
- VWEWCZSUWOEEFM-WDSKDSINSA-N Ala-Gly-Ala-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(=O)NCC(O)=O VWEWCZSUWOEEFM-WDSKDSINSA-N 0.000 description 1
- NHLAEBFGWPXFGI-WHFBIAKZSA-N Ala-Gly-Asn Chemical compound C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N NHLAEBFGWPXFGI-WHFBIAKZSA-N 0.000 description 1
- VGPWRRFOPXVGOH-BYPYZUCNSA-N Ala-Gly-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)NCC(O)=O VGPWRRFOPXVGOH-BYPYZUCNSA-N 0.000 description 1
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 1
- OKEWAFFWMHBGPT-XPUUQOCRSA-N Ala-His-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CN=CN1 OKEWAFFWMHBGPT-XPUUQOCRSA-N 0.000 description 1
- HUUOZYZWNCXTFK-INTQDDNPSA-N Ala-His-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N2CCC[C@@H]2C(=O)O)N HUUOZYZWNCXTFK-INTQDDNPSA-N 0.000 description 1
- NYDBKUNVSALYPX-NAKRPEOUSA-N Ala-Ile-Arg Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NYDBKUNVSALYPX-NAKRPEOUSA-N 0.000 description 1
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 1
- QPBSRMDNJOTFAL-AICCOOGYSA-N Ala-Leu-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QPBSRMDNJOTFAL-AICCOOGYSA-N 0.000 description 1
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 1
- PIXQDIGKDNNOOV-GUBZILKMSA-N Ala-Lys-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O PIXQDIGKDNNOOV-GUBZILKMSA-N 0.000 description 1
- MFMDKJIPHSWSBM-GUBZILKMSA-N Ala-Lys-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFMDKJIPHSWSBM-GUBZILKMSA-N 0.000 description 1
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- CHFFHQUVXHEGBY-GARJFASQSA-N Ala-Lys-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N CHFFHQUVXHEGBY-GARJFASQSA-N 0.000 description 1
- MDNAVFBZPROEHO-DCAQKATOSA-N Ala-Lys-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MDNAVFBZPROEHO-DCAQKATOSA-N 0.000 description 1
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 1
- RNHKOQHGYMTHFR-UBHSHLNASA-N Ala-Phe-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 RNHKOQHGYMTHFR-UBHSHLNASA-N 0.000 description 1
- IPZQNYYAYVRKKK-FXQIFTODSA-N Ala-Pro-Ala Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IPZQNYYAYVRKKK-FXQIFTODSA-N 0.000 description 1
- FEGOCLZUJUFCHP-CIUDSAMLSA-N Ala-Pro-Gln Chemical compound [H]N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O FEGOCLZUJUFCHP-CIUDSAMLSA-N 0.000 description 1
- BHTBAVZSZCQZPT-GUBZILKMSA-N Ala-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N BHTBAVZSZCQZPT-GUBZILKMSA-N 0.000 description 1
- WQKAQKZRDIZYNV-VZFHVOOUSA-N Ala-Ser-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WQKAQKZRDIZYNV-VZFHVOOUSA-N 0.000 description 1
- XQNRANMFRPCFFW-GCJQMDKQSA-N Ala-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C)N)O XQNRANMFRPCFFW-GCJQMDKQSA-N 0.000 description 1
- KTXKIYXZQFWJKB-VZFHVOOUSA-N Ala-Thr-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O KTXKIYXZQFWJKB-VZFHVOOUSA-N 0.000 description 1
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 1
- 108010011170 Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly Proteins 0.000 description 1
- LFFOJBOTZUWINF-ZANVPECISA-N Ala-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O)=CNC2=C1 LFFOJBOTZUWINF-ZANVPECISA-N 0.000 description 1
- PGNNQOJOEGFAOR-KWQFWETISA-N Ala-Tyr-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 PGNNQOJOEGFAOR-KWQFWETISA-N 0.000 description 1
- JPOQZCHGOTWRTM-FQPOAREZSA-N Ala-Tyr-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPOQZCHGOTWRTM-FQPOAREZSA-N 0.000 description 1
- OIRCZHKOHJUHAC-SIUGBPQLSA-N Ala-Val-Asp-Tyr Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 OIRCZHKOHJUHAC-SIUGBPQLSA-N 0.000 description 1
- LYILPUNCKACNGF-NAKRPEOUSA-N Ala-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N LYILPUNCKACNGF-NAKRPEOUSA-N 0.000 description 1
- XCIGOVDXZULBBV-DCAQKATOSA-N Ala-Val-Lys Chemical compound CC(C)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CCCCN)C(O)=O XCIGOVDXZULBBV-DCAQKATOSA-N 0.000 description 1
- 102100036826 Aldehyde oxidase Human genes 0.000 description 1
- 101000600176 Arabidopsis thaliana Peroxisomal membrane protein PEX14 Proteins 0.000 description 1
- 101000987688 Arabidopsis thaliana Peroxisome biogenesis protein 5 Proteins 0.000 description 1
- MCYJBCKCAPERSE-FXQIFTODSA-N Arg-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N MCYJBCKCAPERSE-FXQIFTODSA-N 0.000 description 1
- OTOXOKCIIQLMFH-KZVJFYERSA-N Arg-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N OTOXOKCIIQLMFH-KZVJFYERSA-N 0.000 description 1
- VWVPYNGMOCSSGK-GUBZILKMSA-N Arg-Arg-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O VWVPYNGMOCSSGK-GUBZILKMSA-N 0.000 description 1
- OVVUNXXROOFSIM-SDDRHHMPSA-N Arg-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O OVVUNXXROOFSIM-SDDRHHMPSA-N 0.000 description 1
- NONSEUUPKITYQT-BQBZGAKWSA-N Arg-Asn-Gly Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N)CN=C(N)N NONSEUUPKITYQT-BQBZGAKWSA-N 0.000 description 1
- GHNDBBVSWOWYII-LPEHRKFASA-N Arg-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O GHNDBBVSWOWYII-LPEHRKFASA-N 0.000 description 1
- OBFTYSPXDRROQO-SRVKXCTJSA-N Arg-Gln-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCN=C(N)N OBFTYSPXDRROQO-SRVKXCTJSA-N 0.000 description 1
- PBSOQGZLPFVXPU-YUMQZZPRSA-N Arg-Glu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PBSOQGZLPFVXPU-YUMQZZPRSA-N 0.000 description 1
- QAXCZGMLVICQKS-SRVKXCTJSA-N Arg-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QAXCZGMLVICQKS-SRVKXCTJSA-N 0.000 description 1
- HPSVTWMFWCHKFN-GARJFASQSA-N Arg-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O HPSVTWMFWCHKFN-GARJFASQSA-N 0.000 description 1
- 108010010777 Arg-Gly-Asp-Gly Proteins 0.000 description 1
- YNSGXDWWPCGGQS-YUMQZZPRSA-N Arg-Gly-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O YNSGXDWWPCGGQS-YUMQZZPRSA-N 0.000 description 1
- QKSAZKCRVQYYGS-UWVGGRQHSA-N Arg-Gly-His Chemical compound N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O QKSAZKCRVQYYGS-UWVGGRQHSA-N 0.000 description 1
- RFXXUWGNVRJTNQ-QXEWZRGKSA-N Arg-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N RFXXUWGNVRJTNQ-QXEWZRGKSA-N 0.000 description 1
- SYAUZLVLXCDRSH-IUCAKERBSA-N Arg-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N SYAUZLVLXCDRSH-IUCAKERBSA-N 0.000 description 1
- KRQSPVKUISQQFS-FJXKBIBVSA-N Arg-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N KRQSPVKUISQQFS-FJXKBIBVSA-N 0.000 description 1
- ZZZWQALDSQQBEW-STQMWFEESA-N Arg-Gly-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZZZWQALDSQQBEW-STQMWFEESA-N 0.000 description 1
- QEHMMRSQJMOYNO-DCAQKATOSA-N Arg-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QEHMMRSQJMOYNO-DCAQKATOSA-N 0.000 description 1
- UBCPNBUIQNMDNH-NAKRPEOUSA-N Arg-Ile-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O UBCPNBUIQNMDNH-NAKRPEOUSA-N 0.000 description 1
- OOIMKQRCPJBGPD-XUXIUFHCSA-N Arg-Ile-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O OOIMKQRCPJBGPD-XUXIUFHCSA-N 0.000 description 1
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 1
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 1
- DNUKXVMPARLPFN-XUXIUFHCSA-N Arg-Leu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DNUKXVMPARLPFN-XUXIUFHCSA-N 0.000 description 1
- NMRHDSAOIURTNT-RWMBFGLXSA-N Arg-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NMRHDSAOIURTNT-RWMBFGLXSA-N 0.000 description 1
- JEOCWTUOMKEEMF-RHYQMDGZSA-N Arg-Leu-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JEOCWTUOMKEEMF-RHYQMDGZSA-N 0.000 description 1
- SSZGOKWBHLOCHK-DCAQKATOSA-N Arg-Lys-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N SSZGOKWBHLOCHK-DCAQKATOSA-N 0.000 description 1
- NGTYEHIRESTSRX-UWVGGRQHSA-N Arg-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N NGTYEHIRESTSRX-UWVGGRQHSA-N 0.000 description 1
- IGFJVXOATGZTHD-UHFFFAOYSA-N Arg-Phe-His Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccccc1)C(=O)NC(Cc2c[nH]cn2)C(=O)O IGFJVXOATGZTHD-UHFFFAOYSA-N 0.000 description 1
- UGZUVYDKAYNCII-ULQDDVLXSA-N Arg-Phe-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UGZUVYDKAYNCII-ULQDDVLXSA-N 0.000 description 1
- DNLQVHBBMPZUGJ-BQBZGAKWSA-N Arg-Ser-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O DNLQVHBBMPZUGJ-BQBZGAKWSA-N 0.000 description 1
- SYFHFLGAROUHNT-VEVYYDQMSA-N Arg-Thr-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O SYFHFLGAROUHNT-VEVYYDQMSA-N 0.000 description 1
- MOGMYRUNTKYZFB-UNQGMJICSA-N Arg-Thr-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MOGMYRUNTKYZFB-UNQGMJICSA-N 0.000 description 1
- QCTOLCVIGRLMQS-HRCADAONSA-N Arg-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O QCTOLCVIGRLMQS-HRCADAONSA-N 0.000 description 1
- CNBIWSCSSCAINS-UFYCRDLUSA-N Arg-Tyr-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CNBIWSCSSCAINS-UFYCRDLUSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- RZVVKNIACROXRM-ZLUOBGJFSA-N Asn-Ala-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N RZVVKNIACROXRM-ZLUOBGJFSA-N 0.000 description 1
- LEFKSBYHUGUWLP-ACZMJKKPSA-N Asn-Ala-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LEFKSBYHUGUWLP-ACZMJKKPSA-N 0.000 description 1
- NTXNUXPCNRDMAF-WFBYXXMGSA-N Asn-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CC(N)=O)C)C(O)=O)=CNC2=C1 NTXNUXPCNRDMAF-WFBYXXMGSA-N 0.000 description 1
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 1
- NVGWESORMHFISY-SRVKXCTJSA-N Asn-Asn-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NVGWESORMHFISY-SRVKXCTJSA-N 0.000 description 1
- XSGBIBGAMKTHMY-WHFBIAKZSA-N Asn-Asp-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O XSGBIBGAMKTHMY-WHFBIAKZSA-N 0.000 description 1
- FANGHKQYFPYDNB-UBHSHLNASA-N Asn-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N FANGHKQYFPYDNB-UBHSHLNASA-N 0.000 description 1
- IYVSIZAXNLOKFQ-BYULHYEWSA-N Asn-Asp-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IYVSIZAXNLOKFQ-BYULHYEWSA-N 0.000 description 1
- ULRPXVNMIIYDDJ-ACZMJKKPSA-N Asn-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N ULRPXVNMIIYDDJ-ACZMJKKPSA-N 0.000 description 1
- UBKOVSLDWIHYSY-ACZMJKKPSA-N Asn-Glu-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UBKOVSLDWIHYSY-ACZMJKKPSA-N 0.000 description 1
- DMLSCRJBWUEALP-LAEOZQHASA-N Asn-Glu-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O DMLSCRJBWUEALP-LAEOZQHASA-N 0.000 description 1
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 1
- JQSWHKKUZMTOIH-QWRGUYRKSA-N Asn-Gly-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N JQSWHKKUZMTOIH-QWRGUYRKSA-N 0.000 description 1
- GJFYPBDMUGGLFR-NKWVEPMBSA-N Asn-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CC(=O)N)N)C(=O)O GJFYPBDMUGGLFR-NKWVEPMBSA-N 0.000 description 1
- NVWJMQNYLYWVNQ-BYULHYEWSA-N Asn-Ile-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O NVWJMQNYLYWVNQ-BYULHYEWSA-N 0.000 description 1
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 1
- BXUHCIXDSWRSBS-CIUDSAMLSA-N Asn-Leu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BXUHCIXDSWRSBS-CIUDSAMLSA-N 0.000 description 1
- YVXRYLVELQYAEQ-SRVKXCTJSA-N Asn-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N YVXRYLVELQYAEQ-SRVKXCTJSA-N 0.000 description 1
- JEEFEQCRXKPQHC-KKUMJFAQSA-N Asn-Leu-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JEEFEQCRXKPQHC-KKUMJFAQSA-N 0.000 description 1
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 1
- NCFJQJRLQJEECD-NHCYSSNCSA-N Asn-Leu-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O NCFJQJRLQJEECD-NHCYSSNCSA-N 0.000 description 1
- GIQCDTKOIPUDSG-GARJFASQSA-N Asn-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N)C(=O)O GIQCDTKOIPUDSG-GARJFASQSA-N 0.000 description 1
- XFJKRRCWLTZIQA-XIRDDKMYSA-N Asn-Lys-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N XFJKRRCWLTZIQA-XIRDDKMYSA-N 0.000 description 1
- ZJIFRAPZHAGLGR-MELADBBJSA-N Asn-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC(=O)N)N)C(=O)O ZJIFRAPZHAGLGR-MELADBBJSA-N 0.000 description 1
- VHQSGALUSWIYOD-QXEWZRGKSA-N Asn-Pro-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O VHQSGALUSWIYOD-QXEWZRGKSA-N 0.000 description 1
- HPBNLFLSSQDFQW-WHFBIAKZSA-N Asn-Ser-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O HPBNLFLSSQDFQW-WHFBIAKZSA-N 0.000 description 1
- DOURAOODTFJRIC-CIUDSAMLSA-N Asn-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N DOURAOODTFJRIC-CIUDSAMLSA-N 0.000 description 1
- NPZJLGMWMDNQDD-GHCJXIJMSA-N Asn-Ser-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NPZJLGMWMDNQDD-GHCJXIJMSA-N 0.000 description 1
- UGXYFDQFLVCDFC-CIUDSAMLSA-N Asn-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O UGXYFDQFLVCDFC-CIUDSAMLSA-N 0.000 description 1
- FMNBYVSGRCXWEK-FOHZUACHSA-N Asn-Thr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O FMNBYVSGRCXWEK-FOHZUACHSA-N 0.000 description 1
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 1
- UXHYOWXTJLBEPG-GSSVUCPTSA-N Asn-Thr-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UXHYOWXTJLBEPG-GSSVUCPTSA-N 0.000 description 1
- QUCCLIXMVPIVOB-BZSNNMDCSA-N Asn-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC(=O)N)N QUCCLIXMVPIVOB-BZSNNMDCSA-N 0.000 description 1
- KBQOUDLMWYWXNP-YDHLFZDLSA-N Asn-Val-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC(=O)N)N KBQOUDLMWYWXNP-YDHLFZDLSA-N 0.000 description 1
- HBUJSDCLZCXXCW-YDHLFZDLSA-N Asn-Val-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HBUJSDCLZCXXCW-YDHLFZDLSA-N 0.000 description 1
- XEDQMTWEYFBOIK-ACZMJKKPSA-N Asp-Ala-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XEDQMTWEYFBOIK-ACZMJKKPSA-N 0.000 description 1
- NECWUSYTYSIFNC-DLOVCJGASA-N Asp-Ala-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 NECWUSYTYSIFNC-DLOVCJGASA-N 0.000 description 1
- NYLBGYLHBDFRHL-VEVYYDQMSA-N Asp-Arg-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NYLBGYLHBDFRHL-VEVYYDQMSA-N 0.000 description 1
- ATYWBXGNXZYZGI-ACZMJKKPSA-N Asp-Asn-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O ATYWBXGNXZYZGI-ACZMJKKPSA-N 0.000 description 1
- VPSHHQXIWLGVDD-ZLUOBGJFSA-N Asp-Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VPSHHQXIWLGVDD-ZLUOBGJFSA-N 0.000 description 1
- FRSGNOZCTWDVFZ-ACZMJKKPSA-N Asp-Asp-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O FRSGNOZCTWDVFZ-ACZMJKKPSA-N 0.000 description 1
- FANQWNCPNFEPGZ-WHFBIAKZSA-N Asp-Asp-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FANQWNCPNFEPGZ-WHFBIAKZSA-N 0.000 description 1
- PXLNPFOJZQMXAT-BYULHYEWSA-N Asp-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O PXLNPFOJZQMXAT-BYULHYEWSA-N 0.000 description 1
- DZQKLNLLWFQONU-LKXGYXEUSA-N Asp-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N)O DZQKLNLLWFQONU-LKXGYXEUSA-N 0.000 description 1
- LJRPYAZQQWHEEV-FXQIFTODSA-N Asp-Gln-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O LJRPYAZQQWHEEV-FXQIFTODSA-N 0.000 description 1
- VAWNQIGQPUOPQW-ACZMJKKPSA-N Asp-Glu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VAWNQIGQPUOPQW-ACZMJKKPSA-N 0.000 description 1
- KHBLRHKVXICFMY-GUBZILKMSA-N Asp-Glu-Lys Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O KHBLRHKVXICFMY-GUBZILKMSA-N 0.000 description 1
- ZEDBMCPXPIYJLW-XHNCKOQMSA-N Asp-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O ZEDBMCPXPIYJLW-XHNCKOQMSA-N 0.000 description 1
- XDGBFDYXZCMYEX-NUMRIWBASA-N Asp-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)O XDGBFDYXZCMYEX-NUMRIWBASA-N 0.000 description 1
- GISFCCXBVJKGEO-QEJZJMRPSA-N Asp-Glu-Trp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O GISFCCXBVJKGEO-QEJZJMRPSA-N 0.000 description 1
- WBDWQKRLTVCDSY-WHFBIAKZSA-N Asp-Gly-Asp Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O WBDWQKRLTVCDSY-WHFBIAKZSA-N 0.000 description 1
- HAFCJCDJGIOYPW-WDSKDSINSA-N Asp-Gly-Gln Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O HAFCJCDJGIOYPW-WDSKDSINSA-N 0.000 description 1
- OMMIEVATLAGRCK-BYPYZUCNSA-N Asp-Gly-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)NCC(O)=O OMMIEVATLAGRCK-BYPYZUCNSA-N 0.000 description 1
- ZSVJVIOVABDTTL-YUMQZZPRSA-N Asp-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)O)N ZSVJVIOVABDTTL-YUMQZZPRSA-N 0.000 description 1
- QCVXMEHGFUMKCO-YUMQZZPRSA-N Asp-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O QCVXMEHGFUMKCO-YUMQZZPRSA-N 0.000 description 1
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 1
- KPNUCOPMVSGRCR-DCAQKATOSA-N Asp-His-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O KPNUCOPMVSGRCR-DCAQKATOSA-N 0.000 description 1
- LNENWJXDHCFVOF-DCAQKATOSA-N Asp-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N LNENWJXDHCFVOF-DCAQKATOSA-N 0.000 description 1
- RKNIUWSZIAUEPK-PBCZWWQYSA-N Asp-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N)O RKNIUWSZIAUEPK-PBCZWWQYSA-N 0.000 description 1
- KTTCQQNRRLCIBC-GHCJXIJMSA-N Asp-Ile-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O KTTCQQNRRLCIBC-GHCJXIJMSA-N 0.000 description 1
- QNFRBNZGVVKBNJ-PEFMBERDSA-N Asp-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N QNFRBNZGVVKBNJ-PEFMBERDSA-N 0.000 description 1
- KYQNAIMCTRZLNP-QSFUFRPTSA-N Asp-Ile-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O KYQNAIMCTRZLNP-QSFUFRPTSA-N 0.000 description 1
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 1
- XSXVLWBWIPKUSN-UHFFFAOYSA-N Asp-Leu-Glu-Asp Chemical compound OC(=O)CC(N)C(=O)NC(CC(C)C)C(=O)NC(CCC(O)=O)C(=O)NC(CC(O)=O)C(O)=O XSXVLWBWIPKUSN-UHFFFAOYSA-N 0.000 description 1
- CJUKAWUWBZCTDQ-SRVKXCTJSA-N Asp-Leu-Lys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O CJUKAWUWBZCTDQ-SRVKXCTJSA-N 0.000 description 1
- QNIACYURSSCLRP-GUBZILKMSA-N Asp-Lys-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O QNIACYURSSCLRP-GUBZILKMSA-N 0.000 description 1
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 1
- NZWDWXSWUQCNMG-GARJFASQSA-N Asp-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N)C(=O)O NZWDWXSWUQCNMG-GARJFASQSA-N 0.000 description 1
- JXGJJQJHXHXJQF-CIUDSAMLSA-N Asp-Met-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O JXGJJQJHXHXJQF-CIUDSAMLSA-N 0.000 description 1
- VWWAFGHMPWBKEP-GMOBBJLQSA-N Asp-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(=O)O)N VWWAFGHMPWBKEP-GMOBBJLQSA-N 0.000 description 1
- PCJOFZYFFMBZKC-PCBIJLKTSA-N Asp-Phe-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PCJOFZYFFMBZKC-PCBIJLKTSA-N 0.000 description 1
- UCHSVZYJKJLPHF-BZSNNMDCSA-N Asp-Phe-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O UCHSVZYJKJLPHF-BZSNNMDCSA-N 0.000 description 1
- PWAIZUBWHRHYKS-MELADBBJSA-N Asp-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC(=O)O)N)C(=O)O PWAIZUBWHRHYKS-MELADBBJSA-N 0.000 description 1
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 1
- AHWRSSLYSGLBGD-CIUDSAMLSA-N Asp-Pro-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O AHWRSSLYSGLBGD-CIUDSAMLSA-N 0.000 description 1
- MNQMTYSEKZHIDF-GCJQMDKQSA-N Asp-Thr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O MNQMTYSEKZHIDF-GCJQMDKQSA-N 0.000 description 1
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 1
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 1
- NWAHPBGBDIFUFD-KKUMJFAQSA-N Asp-Tyr-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O NWAHPBGBDIFUFD-KKUMJFAQSA-N 0.000 description 1
- XWKBWZXGNXTDKY-ZKWXMUAHSA-N Asp-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O XWKBWZXGNXTDKY-ZKWXMUAHSA-N 0.000 description 1
- PLOKOIJSGCISHE-BYULHYEWSA-N Asp-Val-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PLOKOIJSGCISHE-BYULHYEWSA-N 0.000 description 1
- XWKPSMRPIKKDDU-RCOVLWMOSA-N Asp-Val-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O XWKPSMRPIKKDDU-RCOVLWMOSA-N 0.000 description 1
- VXEORMGBKTUUCM-KWBADKCTSA-N Asp-Val-Gly-Pro Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O VXEORMGBKTUUCM-KWBADKCTSA-N 0.000 description 1
- UXRVDHVARNBOIO-QSFUFRPTSA-N Asp-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(=O)O)N UXRVDHVARNBOIO-QSFUFRPTSA-N 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 241000271566 Aves Species 0.000 description 1
- 102100028728 Bone morphogenetic protein 1 Human genes 0.000 description 1
- 108090000654 Bone morphogenetic protein 1 Proteins 0.000 description 1
- 101100512078 Caenorhabditis elegans lys-1 gene Proteins 0.000 description 1
- 101100505161 Caenorhabditis elegans mel-32 gene Proteins 0.000 description 1
- UXVMQQNJUSDDNG-UHFFFAOYSA-L Calcium chloride Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 1
- 102100027992 Casein kinase II subunit beta Human genes 0.000 description 1
- 101710158100 Casein kinase II subunit beta Proteins 0.000 description 1
- 102000003813 Cis-trans-isomerases Human genes 0.000 description 1
- 108090000175 Cis-trans-isomerases Proteins 0.000 description 1
- 108700010070 Codon Usage Proteins 0.000 description 1
- 102100027995 Collagenase 3 Human genes 0.000 description 1
- 108050005238 Collagenase 3 Proteins 0.000 description 1
- IVOMOUWHDPKRLL-KQYNXXCUSA-N Cyclic adenosine monophosphate Chemical compound C([C@H]1O2)OP(O)(=O)O[C@H]1[C@@H](O)[C@@H]2N1C(N=CN=C2N)=C2N=C1 IVOMOUWHDPKRLL-KQYNXXCUSA-N 0.000 description 1
- BMHBJCVEXUBGFI-BIIVOSGPSA-N Cys-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CS)N)C(=O)O BMHBJCVEXUBGFI-BIIVOSGPSA-N 0.000 description 1
- BDWIZLQVVWQMTB-XKBZYTNZSA-N Cys-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CS)N)O BDWIZLQVVWQMTB-XKBZYTNZSA-N 0.000 description 1
- XTHUKRLJRUVVBF-WHFBIAKZSA-N Cys-Gly-Ser Chemical compound SC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O XTHUKRLJRUVVBF-WHFBIAKZSA-N 0.000 description 1
- UVZFZTWNHOQWNK-NAKRPEOUSA-N Cys-Ile-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UVZFZTWNHOQWNK-NAKRPEOUSA-N 0.000 description 1
- ZLHPWFSAUJEEAN-KBIXCLLPSA-N Cys-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CS)N ZLHPWFSAUJEEAN-KBIXCLLPSA-N 0.000 description 1
- LHMSYHSAAJOEBL-CIUDSAMLSA-N Cys-Lys-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O LHMSYHSAAJOEBL-CIUDSAMLSA-N 0.000 description 1
- OZSBRCONEMXYOJ-AVGNSLFASA-N Cys-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CS)N OZSBRCONEMXYOJ-AVGNSLFASA-N 0.000 description 1
- RESAHOSBQHMOKH-KKUMJFAQSA-N Cys-Phe-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CS)N RESAHOSBQHMOKH-KKUMJFAQSA-N 0.000 description 1
- IDZDFWJNPOOOHE-KKUMJFAQSA-N Cys-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CS)N IDZDFWJNPOOOHE-KKUMJFAQSA-N 0.000 description 1
- CAXGCBSRJLADPD-FXQIFTODSA-N Cys-Pro-Asn Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O CAXGCBSRJLADPD-FXQIFTODSA-N 0.000 description 1
- XBELMDARIGXDKY-GUBZILKMSA-N Cys-Pro-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CS)N XBELMDARIGXDKY-GUBZILKMSA-N 0.000 description 1
- JIVJQYNNAYFXDG-LKXGYXEUSA-N Cys-Thr-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O JIVJQYNNAYFXDG-LKXGYXEUSA-N 0.000 description 1
- GFAPBMCRSMSGDZ-XGEHTFHBSA-N Cys-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CS)N)O GFAPBMCRSMSGDZ-XGEHTFHBSA-N 0.000 description 1
- IOLWXFWVYYCVTJ-NRPADANISA-N Cys-Val-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CS)N IOLWXFWVYYCVTJ-NRPADANISA-N 0.000 description 1
- UOEYKPDDHSFMLI-DCAQKATOSA-N Cys-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CS)N UOEYKPDDHSFMLI-DCAQKATOSA-N 0.000 description 1
- 108010005843 Cysteine Proteases Proteins 0.000 description 1
- 102000005927 Cysteine Proteases Human genes 0.000 description 1
- FDKWRPBBCBCIGA-UWTATZPHSA-N D-Selenocysteine Natural products [Se]C[C@@H](N)C(O)=O FDKWRPBBCBCIGA-UWTATZPHSA-N 0.000 description 1
- 150000008574 D-amino acids Chemical class 0.000 description 1
- 102000004674 D-amino-acid oxidase Human genes 0.000 description 1
- 108010003989 D-amino-acid oxidase Proteins 0.000 description 1
- LEVWYRKDKASIDU-QWWZWVQMSA-N D-cystine Chemical compound OC(=O)[C@H](N)CSSC[C@@H](N)C(O)=O LEVWYRKDKASIDU-QWWZWVQMSA-N 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 206010011906 Death Diseases 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 108060002716 Exonuclease Proteins 0.000 description 1
- CWYNVVGOOAEACU-UHFFFAOYSA-N Fe2+ Chemical compound [Fe+2] CWYNVVGOOAEACU-UHFFFAOYSA-N 0.000 description 1
- 208000005577 Gastroenteritis Diseases 0.000 description 1
- HHWQMFIGMMOVFK-WDSKDSINSA-N Gln-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O HHWQMFIGMMOVFK-WDSKDSINSA-N 0.000 description 1
- KVYVOGYEMPEXBT-GUBZILKMSA-N Gln-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O KVYVOGYEMPEXBT-GUBZILKMSA-N 0.000 description 1
- XXLBHPPXDUWYAG-XQXXSGGOSA-N Gln-Ala-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XXLBHPPXDUWYAG-XQXXSGGOSA-N 0.000 description 1
- RKAQZCDMSUQTSS-FXQIFTODSA-N Gln-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RKAQZCDMSUQTSS-FXQIFTODSA-N 0.000 description 1
- UICOTGULOUGGLC-NUMRIWBASA-N Gln-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O UICOTGULOUGGLC-NUMRIWBASA-N 0.000 description 1
- LPJVZYMINRLCQA-AVGNSLFASA-N Gln-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)N)N LPJVZYMINRLCQA-AVGNSLFASA-N 0.000 description 1
- LPYPANUXJGFMGV-FXQIFTODSA-N Gln-Gln-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N LPYPANUXJGFMGV-FXQIFTODSA-N 0.000 description 1
- QFJPFPCSXOXMKI-BPUTZDHNSA-N Gln-Gln-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N QFJPFPCSXOXMKI-BPUTZDHNSA-N 0.000 description 1
- SNLOOPZHAQDMJG-CIUDSAMLSA-N Gln-Glu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SNLOOPZHAQDMJG-CIUDSAMLSA-N 0.000 description 1
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 1
- MAGNEQBFSBREJL-DCAQKATOSA-N Gln-Glu-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N MAGNEQBFSBREJL-DCAQKATOSA-N 0.000 description 1
- DRDSQGHKTLSNEA-GLLZPBPUSA-N Gln-Glu-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DRDSQGHKTLSNEA-GLLZPBPUSA-N 0.000 description 1
- XKBASPWPBXNVLQ-WDSKDSINSA-N Gln-Gly-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XKBASPWPBXNVLQ-WDSKDSINSA-N 0.000 description 1
- HVQCEQTUSWWFOS-WDSKDSINSA-N Gln-Gly-Cys Chemical compound C(CC(=O)N)[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N HVQCEQTUSWWFOS-WDSKDSINSA-N 0.000 description 1
- MFJAPSYJQJCQDN-BQBZGAKWSA-N Gln-Gly-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O MFJAPSYJQJCQDN-BQBZGAKWSA-N 0.000 description 1
- VGTDBGYFVWOQTI-RYUDHWBXSA-N Gln-Gly-Phe Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VGTDBGYFVWOQTI-RYUDHWBXSA-N 0.000 description 1
- IWUFOVSLWADEJC-AVGNSLFASA-N Gln-His-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O IWUFOVSLWADEJC-AVGNSLFASA-N 0.000 description 1
- DAAUVRPSZRDMBV-KBIXCLLPSA-N Gln-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N DAAUVRPSZRDMBV-KBIXCLLPSA-N 0.000 description 1
- MTCXQQINVAFZKW-MNXVOIDGSA-N Gln-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MTCXQQINVAFZKW-MNXVOIDGSA-N 0.000 description 1
- HWEINOMSWQSJDC-SRVKXCTJSA-N Gln-Leu-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O HWEINOMSWQSJDC-SRVKXCTJSA-N 0.000 description 1
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 1
- CELXWPDNIGWCJN-WDCWCFNPSA-N Gln-Lys-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CELXWPDNIGWCJN-WDCWCFNPSA-N 0.000 description 1
- BJPPYOMRAVLXBY-YUMQZZPRSA-N Gln-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N BJPPYOMRAVLXBY-YUMQZZPRSA-N 0.000 description 1
- NPMFDZGLKBNFOO-SRVKXCTJSA-N Gln-Pro-His Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CN=CN1 NPMFDZGLKBNFOO-SRVKXCTJSA-N 0.000 description 1
- WLRYGVYQFXRJDA-DCAQKATOSA-N Gln-Pro-Pro Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 WLRYGVYQFXRJDA-DCAQKATOSA-N 0.000 description 1
- VNTGPISAOMAXRK-CIUDSAMLSA-N Gln-Pro-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O VNTGPISAOMAXRK-CIUDSAMLSA-N 0.000 description 1
- UTOQQOMEJDPDMX-ACZMJKKPSA-N Gln-Ser-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O UTOQQOMEJDPDMX-ACZMJKKPSA-N 0.000 description 1
- LPIKVBWNNVFHCQ-GUBZILKMSA-N Gln-Ser-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LPIKVBWNNVFHCQ-GUBZILKMSA-N 0.000 description 1
- PAOHIZNRJNIXQY-XQXXSGGOSA-N Gln-Thr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PAOHIZNRJNIXQY-XQXXSGGOSA-N 0.000 description 1
- OUBUHIODTNUUTC-WDCWCFNPSA-N Gln-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O OUBUHIODTNUUTC-WDCWCFNPSA-N 0.000 description 1
- OACQOWPRWGNKTP-AVGNSLFASA-N Gln-Tyr-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O OACQOWPRWGNKTP-AVGNSLFASA-N 0.000 description 1
- SGVGIVDZLSHSEN-RYUDHWBXSA-N Gln-Tyr-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O SGVGIVDZLSHSEN-RYUDHWBXSA-N 0.000 description 1
- QXQDADBVIBLBHN-FHWLQOOXSA-N Gln-Tyr-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QXQDADBVIBLBHN-FHWLQOOXSA-N 0.000 description 1
- QGWXAMDECCKGRU-XVKPBYJWSA-N Gln-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(N)=O)C(=O)NCC(O)=O QGWXAMDECCKGRU-XVKPBYJWSA-N 0.000 description 1
- ITYRYNUZHPNCIK-GUBZILKMSA-N Glu-Ala-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O ITYRYNUZHPNCIK-GUBZILKMSA-N 0.000 description 1
- NCWOMXABNYEPLY-NRPADANISA-N Glu-Ala-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O NCWOMXABNYEPLY-NRPADANISA-N 0.000 description 1
- ZOXBSICWUDAOHX-GUBZILKMSA-N Glu-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O ZOXBSICWUDAOHX-GUBZILKMSA-N 0.000 description 1
- JPHYJQHPILOKHC-ACZMJKKPSA-N Glu-Asp-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O JPHYJQHPILOKHC-ACZMJKKPSA-N 0.000 description 1
- VAIWPXWHWAPYDF-FXQIFTODSA-N Glu-Asp-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O VAIWPXWHWAPYDF-FXQIFTODSA-N 0.000 description 1
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 1
- MXPBQDFWIMBACQ-ACZMJKKPSA-N Glu-Cys-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CS)C(O)=O MXPBQDFWIMBACQ-ACZMJKKPSA-N 0.000 description 1
- PXHABOCPJVTGEK-BQBZGAKWSA-N Glu-Gln-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O PXHABOCPJVTGEK-BQBZGAKWSA-N 0.000 description 1
- LVCHEMOPBORRLB-DCAQKATOSA-N Glu-Gln-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O LVCHEMOPBORRLB-DCAQKATOSA-N 0.000 description 1
- VFZIDQZAEBORGY-GLLZPBPUSA-N Glu-Gln-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VFZIDQZAEBORGY-GLLZPBPUSA-N 0.000 description 1
- ILGFBUGLBSAQQB-GUBZILKMSA-N Glu-Glu-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ILGFBUGLBSAQQB-GUBZILKMSA-N 0.000 description 1
- NUSWUSKZRCGFEX-FXQIFTODSA-N Glu-Glu-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(O)=O NUSWUSKZRCGFEX-FXQIFTODSA-N 0.000 description 1
- BUZMZDDKFCSKOT-CIUDSAMLSA-N Glu-Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BUZMZDDKFCSKOT-CIUDSAMLSA-N 0.000 description 1
- KASDBWKLWJKTLJ-GUBZILKMSA-N Glu-Glu-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O KASDBWKLWJKTLJ-GUBZILKMSA-N 0.000 description 1
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 1
- QJCKNLPMTPXXEM-AUTRQRHGSA-N Glu-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O QJCKNLPMTPXXEM-AUTRQRHGSA-N 0.000 description 1
- AIGROOHQXCACHL-WDSKDSINSA-N Glu-Gly-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O AIGROOHQXCACHL-WDSKDSINSA-N 0.000 description 1
- PXXGVUVQWQGGIG-YUMQZZPRSA-N Glu-Gly-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N PXXGVUVQWQGGIG-YUMQZZPRSA-N 0.000 description 1
- WRNAXCVRSBBKGS-BQBZGAKWSA-N Glu-Gly-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O WRNAXCVRSBBKGS-BQBZGAKWSA-N 0.000 description 1
- OAGVHWYIBZMWLA-YFKPBYRVSA-N Glu-Gly-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)NCC(O)=O OAGVHWYIBZMWLA-YFKPBYRVSA-N 0.000 description 1
- ZWQVYZXPYSYPJD-RYUDHWBXSA-N Glu-Gly-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZWQVYZXPYSYPJD-RYUDHWBXSA-N 0.000 description 1
- OPAINBJQDQTGJY-JGVFFNPUSA-N Glu-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)O)N)C(=O)O OPAINBJQDQTGJY-JGVFFNPUSA-N 0.000 description 1
- XIKYNVKEUINBGL-IUCAKERBSA-N Glu-His-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)NCC(O)=O XIKYNVKEUINBGL-IUCAKERBSA-N 0.000 description 1
- VGOFRWOTSXVPAU-SDDRHHMPSA-N Glu-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCC(=O)O)N)C(=O)O VGOFRWOTSXVPAU-SDDRHHMPSA-N 0.000 description 1
- VGUYMZGLJUJRBV-YVNDNENWSA-N Glu-Ile-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O VGUYMZGLJUJRBV-YVNDNENWSA-N 0.000 description 1
- XTZDZAXYPDISRR-MNXVOIDGSA-N Glu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XTZDZAXYPDISRR-MNXVOIDGSA-N 0.000 description 1
- VGBSZQSKQRMLHD-MNXVOIDGSA-N Glu-Leu-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VGBSZQSKQRMLHD-MNXVOIDGSA-N 0.000 description 1
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 1
- GJBUAAAIZSRCDC-GVXVVHGQSA-N Glu-Leu-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O GJBUAAAIZSRCDC-GVXVVHGQSA-N 0.000 description 1
- SJJHXJDSNQJMMW-SRVKXCTJSA-N Glu-Lys-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O SJJHXJDSNQJMMW-SRVKXCTJSA-N 0.000 description 1
- YKBUCXNNBYZYAY-MNXVOIDGSA-N Glu-Lys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YKBUCXNNBYZYAY-MNXVOIDGSA-N 0.000 description 1
- XEKAJTCACGEBOK-KKUMJFAQSA-N Glu-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XEKAJTCACGEBOK-KKUMJFAQSA-N 0.000 description 1
- YHOJJFFTSMWVGR-HJGDQZAQSA-N Glu-Met-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YHOJJFFTSMWVGR-HJGDQZAQSA-N 0.000 description 1
- ITVBKCZZLJUUHI-HTUGSXCWSA-N Glu-Phe-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ITVBKCZZLJUUHI-HTUGSXCWSA-N 0.000 description 1
- MIIGESVJEBDJMP-FHWLQOOXSA-N Glu-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 MIIGESVJEBDJMP-FHWLQOOXSA-N 0.000 description 1
- CBOVGULVQSVMPT-CIUDSAMLSA-N Glu-Pro-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O CBOVGULVQSVMPT-CIUDSAMLSA-N 0.000 description 1
- HLYCMRDRWGSTPZ-CIUDSAMLSA-N Glu-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)O)N)C(=O)N[C@@H](CS)C(=O)O HLYCMRDRWGSTPZ-CIUDSAMLSA-N 0.000 description 1
- HMJULNMJWOZNFI-XHNCKOQMSA-N Glu-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N)C(=O)O HMJULNMJWOZNFI-XHNCKOQMSA-N 0.000 description 1
- ZGXGVBYEJGVJMV-HJGDQZAQSA-N Glu-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O ZGXGVBYEJGVJMV-HJGDQZAQSA-N 0.000 description 1
- CAQXJMUDOLSBPF-SUSMZKCASA-N Glu-Thr-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAQXJMUDOLSBPF-SUSMZKCASA-N 0.000 description 1
- VHPVBPCCWVDGJL-IRIUXVKKSA-N Glu-Thr-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VHPVBPCCWVDGJL-IRIUXVKKSA-N 0.000 description 1
- UCZXXMREFIETQW-AVGNSLFASA-N Glu-Tyr-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O UCZXXMREFIETQW-AVGNSLFASA-N 0.000 description 1
- BKMOHWJHXQLFEX-IRIUXVKKSA-N Glu-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)O)N)O BKMOHWJHXQLFEX-IRIUXVKKSA-N 0.000 description 1
- LSYFGBRDBIQYAQ-FHWLQOOXSA-N Glu-Tyr-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LSYFGBRDBIQYAQ-FHWLQOOXSA-N 0.000 description 1
- YPHPEHMXOYTEQG-LAEOZQHASA-N Glu-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O YPHPEHMXOYTEQG-LAEOZQHASA-N 0.000 description 1
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 1
- 108010033128 Glucan Endo-1,3-beta-D-Glucosidase Proteins 0.000 description 1
- 108010070675 Glutathione transferase Proteins 0.000 description 1
- GZUKEVBTYNNUQF-WDSKDSINSA-N Gly-Ala-Gln Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GZUKEVBTYNNUQF-WDSKDSINSA-N 0.000 description 1
- MFVQGXGQRIXBPK-WDSKDSINSA-N Gly-Ala-Glu Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFVQGXGQRIXBPK-WDSKDSINSA-N 0.000 description 1
- UGVQELHRNUDMAA-BYPYZUCNSA-N Gly-Ala-Gly Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)NCC([O-])=O UGVQELHRNUDMAA-BYPYZUCNSA-N 0.000 description 1
- MZZSCEANQDPJER-ONGXEEELSA-N Gly-Ala-Phe Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MZZSCEANQDPJER-ONGXEEELSA-N 0.000 description 1
- JXYMPBCYRKWJEE-BQBZGAKWSA-N Gly-Arg-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JXYMPBCYRKWJEE-BQBZGAKWSA-N 0.000 description 1
- UPOJUWHGMDJUQZ-IUCAKERBSA-N Gly-Arg-Arg Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UPOJUWHGMDJUQZ-IUCAKERBSA-N 0.000 description 1
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 1
- XUORRGAFUQIMLC-STQMWFEESA-N Gly-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)CN)O XUORRGAFUQIMLC-STQMWFEESA-N 0.000 description 1
- DWUKOTKSTDWGAE-BQBZGAKWSA-N Gly-Asn-Arg Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DWUKOTKSTDWGAE-BQBZGAKWSA-N 0.000 description 1
- GRIRDMVMJJDZKV-RCOVLWMOSA-N Gly-Asn-Val Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O GRIRDMVMJJDZKV-RCOVLWMOSA-N 0.000 description 1
- XEJTYSCIXKYSHR-WDSKDSINSA-N Gly-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN XEJTYSCIXKYSHR-WDSKDSINSA-N 0.000 description 1
- FZQLXNIMCPJVJE-YUMQZZPRSA-N Gly-Asp-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O FZQLXNIMCPJVJE-YUMQZZPRSA-N 0.000 description 1
- QCTLGOYODITHPQ-WHFBIAKZSA-N Gly-Cys-Ser Chemical compound [H]NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O QCTLGOYODITHPQ-WHFBIAKZSA-N 0.000 description 1
- VNBNZUAPOYGRDB-ZDLURKLDSA-N Gly-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)CN)O VNBNZUAPOYGRDB-ZDLURKLDSA-N 0.000 description 1
- JMQFHZWESBGPFC-WDSKDSINSA-N Gly-Gln-Asp Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O JMQFHZWESBGPFC-WDSKDSINSA-N 0.000 description 1
- KTSZUNRRYXPZTK-BQBZGAKWSA-N Gly-Gln-Glu Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KTSZUNRRYXPZTK-BQBZGAKWSA-N 0.000 description 1
- BYYNJRSNDARRBX-YFKPBYRVSA-N Gly-Gln-Gly Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O BYYNJRSNDARRBX-YFKPBYRVSA-N 0.000 description 1
- AQLHORCVPGXDJW-IUCAKERBSA-N Gly-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN AQLHORCVPGXDJW-IUCAKERBSA-N 0.000 description 1
- NPSWCZIRBAYNSB-JHEQGTHGSA-N Gly-Gln-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NPSWCZIRBAYNSB-JHEQGTHGSA-N 0.000 description 1
- BIRKKBCSAIHDDF-WDSKDSINSA-N Gly-Glu-Cys Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(O)=O BIRKKBCSAIHDDF-WDSKDSINSA-N 0.000 description 1
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 1
- ZQIMMEYPEXIYBB-IUCAKERBSA-N Gly-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN ZQIMMEYPEXIYBB-IUCAKERBSA-N 0.000 description 1
- JSNNHGHYGYMVCK-XVKPBYJWSA-N Gly-Glu-Val Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JSNNHGHYGYMVCK-XVKPBYJWSA-N 0.000 description 1
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 1
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 1
- GDOZQTNZPCUARW-YFKPBYRVSA-N Gly-Gly-Glu Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O GDOZQTNZPCUARW-YFKPBYRVSA-N 0.000 description 1
- KAJAOGBVWCYGHZ-JTQLQIEISA-N Gly-Gly-Phe Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 1
- INLIXXRWNUKVCF-JTQLQIEISA-N Gly-Gly-Tyr Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 INLIXXRWNUKVCF-JTQLQIEISA-N 0.000 description 1
- OLPPXYMMIARYAL-QMMMGPOBSA-N Gly-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)CN OLPPXYMMIARYAL-QMMMGPOBSA-N 0.000 description 1
- AYBKPDHHVADEDA-YUMQZZPRSA-N Gly-His-Asn Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O AYBKPDHHVADEDA-YUMQZZPRSA-N 0.000 description 1
- CQIIXEHDSZUSAG-QWRGUYRKSA-N Gly-His-His Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 CQIIXEHDSZUSAG-QWRGUYRKSA-N 0.000 description 1
- UTYGDAHJBBDPBA-BYULHYEWSA-N Gly-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN UTYGDAHJBBDPBA-BYULHYEWSA-N 0.000 description 1
- HKSNHPVETYYJBK-LAEOZQHASA-N Gly-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)CN HKSNHPVETYYJBK-LAEOZQHASA-N 0.000 description 1
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 1
- COVXELOAORHTND-LSJOCFKGSA-N Gly-Ile-Val Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O COVXELOAORHTND-LSJOCFKGSA-N 0.000 description 1
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 1
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 1
- CLNSYANKYVMZNM-UWVGGRQHSA-N Gly-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N CLNSYANKYVMZNM-UWVGGRQHSA-N 0.000 description 1
- IUKIDFVOUHZRAK-QWRGUYRKSA-N Gly-Lys-His Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 IUKIDFVOUHZRAK-QWRGUYRKSA-N 0.000 description 1
- PTIIBFKSLCYQBO-NHCYSSNCSA-N Gly-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)CN PTIIBFKSLCYQBO-NHCYSSNCSA-N 0.000 description 1
- VEPBEGNDJYANCF-QWRGUYRKSA-N Gly-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN VEPBEGNDJYANCF-QWRGUYRKSA-N 0.000 description 1
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 1
- DBJYVKDPGIFXFO-BQBZGAKWSA-N Gly-Met-Ala Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O DBJYVKDPGIFXFO-BQBZGAKWSA-N 0.000 description 1
- ZWRDOVYMQAAISL-UWVGGRQHSA-N Gly-Met-Lys Chemical compound CSCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCCN ZWRDOVYMQAAISL-UWVGGRQHSA-N 0.000 description 1
- LXTRSHQLGYINON-DTWKUNHWSA-N Gly-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN LXTRSHQLGYINON-DTWKUNHWSA-N 0.000 description 1
- WMGHDYWNHNLGBV-ONGXEEELSA-N Gly-Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 WMGHDYWNHNLGBV-ONGXEEELSA-N 0.000 description 1
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 1
- IGOYNRWLWHWAQO-JTQLQIEISA-N Gly-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IGOYNRWLWHWAQO-JTQLQIEISA-N 0.000 description 1
- QAMMIGULQSIRCD-IRXDYDNUSA-N Gly-Phe-Tyr Chemical compound C([C@H](NC(=O)C[NH3+])C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C([O-])=O)C1=CC=CC=C1 QAMMIGULQSIRCD-IRXDYDNUSA-N 0.000 description 1
- JJGBXTYGTKWGAT-YUMQZZPRSA-N Gly-Pro-Glu Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O JJGBXTYGTKWGAT-YUMQZZPRSA-N 0.000 description 1
- IALQAMYQJBZNSK-WHFBIAKZSA-N Gly-Ser-Asn Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O IALQAMYQJBZNSK-WHFBIAKZSA-N 0.000 description 1
- LBDXVCBAJJNJNN-WHFBIAKZSA-N Gly-Ser-Cys Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(O)=O LBDXVCBAJJNJNN-WHFBIAKZSA-N 0.000 description 1
- POJJAZJHBGXEGM-YUMQZZPRSA-N Gly-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)CN POJJAZJHBGXEGM-YUMQZZPRSA-N 0.000 description 1
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 1
- CQMFNTVQVLQRLT-JHEQGTHGSA-N Gly-Thr-Gln Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O CQMFNTVQVLQRLT-JHEQGTHGSA-N 0.000 description 1
- JQFILXICXLDTRR-FBCQKBJTSA-N Gly-Thr-Gly Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)NCC(O)=O JQFILXICXLDTRR-FBCQKBJTSA-N 0.000 description 1
- XHVONGZZVUUORG-WEDXCCLWSA-N Gly-Thr-Lys Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCCN XHVONGZZVUUORG-WEDXCCLWSA-N 0.000 description 1
- TVTZEOHWHUVYCG-KYNKHSRBSA-N Gly-Thr-Thr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O TVTZEOHWHUVYCG-KYNKHSRBSA-N 0.000 description 1
- FOKISINOENBSDM-WLTAIBSBSA-N Gly-Thr-Tyr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O FOKISINOENBSDM-WLTAIBSBSA-N 0.000 description 1
- FXTUGWXZTFMTIV-GJZGRUSLSA-N Gly-Trp-Arg Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)CN FXTUGWXZTFMTIV-GJZGRUSLSA-N 0.000 description 1
- GNNJKUYDWFIBTK-QWRGUYRKSA-N Gly-Tyr-Asp Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O GNNJKUYDWFIBTK-QWRGUYRKSA-N 0.000 description 1
- LYZYGGWCBLBDMC-QWHCGFSZSA-N Gly-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)CN)C(=O)O LYZYGGWCBLBDMC-QWHCGFSZSA-N 0.000 description 1
- DUAWRXXTOQOECJ-JSGCOSHPSA-N Gly-Tyr-Val Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O DUAWRXXTOQOECJ-JSGCOSHPSA-N 0.000 description 1
- GJHWILMUOANXTG-WPRPVWTQSA-N Gly-Val-Arg Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GJHWILMUOANXTG-WPRPVWTQSA-N 0.000 description 1
- NGRPGJGKJMUGDM-XVKPBYJWSA-N Gly-Val-Gln Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O NGRPGJGKJMUGDM-XVKPBYJWSA-N 0.000 description 1
- BNMRSWQOHIQTFL-JSGCOSHPSA-N Gly-Val-Phe Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 BNMRSWQOHIQTFL-JSGCOSHPSA-N 0.000 description 1
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 1
- 108060003393 Granulin Proteins 0.000 description 1
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 1
- 101150105462 HIS6 gene Proteins 0.000 description 1
- 102100029100 Hematopoietic prostaglandin D synthase Human genes 0.000 description 1
- BIAKMWKJMQLZOJ-ZKWXMUAHSA-N His-Ala-Ala Chemical compound C[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](N)Cc1cnc[nH]1)C(O)=O BIAKMWKJMQLZOJ-ZKWXMUAHSA-N 0.000 description 1
- DCRODRAURLJOFY-XPUUQOCRSA-N His-Ala-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)NCC(O)=O DCRODRAURLJOFY-XPUUQOCRSA-N 0.000 description 1
- OMNVOTCFQQLEQU-CIUDSAMLSA-N His-Asn-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N OMNVOTCFQQLEQU-CIUDSAMLSA-N 0.000 description 1
- NOQPTNXSGNPJNS-YUMQZZPRSA-N His-Asn-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O NOQPTNXSGNPJNS-YUMQZZPRSA-N 0.000 description 1
- LMMPTUVWHCFTOT-GARJFASQSA-N His-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O LMMPTUVWHCFTOT-GARJFASQSA-N 0.000 description 1
- LIEIYPBMQJLASB-SRVKXCTJSA-N His-Gln-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CN=CN1 LIEIYPBMQJLASB-SRVKXCTJSA-N 0.000 description 1
- TVRMJKNELJKNRS-GUBZILKMSA-N His-Glu-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N TVRMJKNELJKNRS-GUBZILKMSA-N 0.000 description 1
- OSZUPUINVNPCOE-SDDRHHMPSA-N His-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O OSZUPUINVNPCOE-SDDRHHMPSA-N 0.000 description 1
- OEROYDLRVAYIMQ-YUMQZZPRSA-N His-Gly-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O OEROYDLRVAYIMQ-YUMQZZPRSA-N 0.000 description 1
- CHZRWFUGWRTUOD-IUCAKERBSA-N His-Gly-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N CHZRWFUGWRTUOD-IUCAKERBSA-N 0.000 description 1
- HAPWZEVRQYGLSG-IUCAKERBSA-N His-Gly-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O HAPWZEVRQYGLSG-IUCAKERBSA-N 0.000 description 1
- WZBLRQQCDYYRTD-SIXJUCDHSA-N His-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N WZBLRQQCDYYRTD-SIXJUCDHSA-N 0.000 description 1
- MJUUWJJEUOBDGW-IHRRRGAJSA-N His-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 MJUUWJJEUOBDGW-IHRRRGAJSA-N 0.000 description 1
- LVXFNTIIGOQBMD-SRVKXCTJSA-N His-Leu-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O LVXFNTIIGOQBMD-SRVKXCTJSA-N 0.000 description 1
- LVWIJITYHRZHBO-IXOXFDKPSA-N His-Leu-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LVWIJITYHRZHBO-IXOXFDKPSA-N 0.000 description 1
- XKIYNCLILDLGRS-QWRGUYRKSA-N His-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 XKIYNCLILDLGRS-QWRGUYRKSA-N 0.000 description 1
- UXSATKFPUVZVDK-KKUMJFAQSA-N His-Lys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N UXSATKFPUVZVDK-KKUMJFAQSA-N 0.000 description 1
- RLAOTFTXBFQJDV-KKUMJFAQSA-N His-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CN=CN1 RLAOTFTXBFQJDV-KKUMJFAQSA-N 0.000 description 1
- PYNPBMCLAKTHJL-SRVKXCTJSA-N His-Pro-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O PYNPBMCLAKTHJL-SRVKXCTJSA-N 0.000 description 1
- QCBYAHHNOHBXIH-UWVGGRQHSA-N His-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CN=CN1 QCBYAHHNOHBXIH-UWVGGRQHSA-N 0.000 description 1
- OWYIDJCNRWRSJY-QTKMDUPCSA-N His-Pro-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O OWYIDJCNRWRSJY-QTKMDUPCSA-N 0.000 description 1
- KAXZXLSXFWSNNZ-XVYDVKMFSA-N His-Ser-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KAXZXLSXFWSNNZ-XVYDVKMFSA-N 0.000 description 1
- STGQSBKUYSPPIG-CIUDSAMLSA-N His-Ser-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 STGQSBKUYSPPIG-CIUDSAMLSA-N 0.000 description 1
- IAYPZSHNZQHQNO-KKUMJFAQSA-N His-Ser-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC2=CN=CN2)N IAYPZSHNZQHQNO-KKUMJFAQSA-N 0.000 description 1
- DQZCEKQPSOBNMJ-NKIYYHGXSA-N His-Thr-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DQZCEKQPSOBNMJ-NKIYYHGXSA-N 0.000 description 1
- MDOBWSFNSNPENN-PMVVWTBXSA-N His-Thr-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O MDOBWSFNSNPENN-PMVVWTBXSA-N 0.000 description 1
- VXZZUXWAOMWWJH-QTKMDUPCSA-N His-Thr-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O VXZZUXWAOMWWJH-QTKMDUPCSA-N 0.000 description 1
- FFYYUUWROYYKFY-IHRRRGAJSA-N His-Val-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O FFYYUUWROYYKFY-IHRRRGAJSA-N 0.000 description 1
- 101000798295 Homo sapiens A disintegrin and metalloproteinase with thrombospondin motifs 14 Proteins 0.000 description 1
- 101000936403 Homo sapiens A disintegrin and metalloproteinase with thrombospondin motifs 2 Proteins 0.000 description 1
- 101000936395 Homo sapiens A disintegrin and metalloproteinase with thrombospondin motifs 3 Proteins 0.000 description 1
- 101000928314 Homo sapiens Aldehyde oxidase Proteins 0.000 description 1
- 101001073025 Homo sapiens Peroxisomal targeting signal 1 receptor Proteins 0.000 description 1
- 101000730795 Homo sapiens Peroxisomal targeting signal 2 receptor Proteins 0.000 description 1
- 101001059454 Homo sapiens Serine/threonine-protein kinase MARK2 Proteins 0.000 description 1
- PMMYEEVYMWASQN-DMTCNVIQSA-N Hydroxyproline Chemical compound O[C@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-DMTCNVIQSA-N 0.000 description 1
- YKRYHWJRQUSTKG-KBIXCLLPSA-N Ile-Ala-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YKRYHWJRQUSTKG-KBIXCLLPSA-N 0.000 description 1
- RWIKBYVJQAJYDP-BJDJZHNGSA-N Ile-Ala-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RWIKBYVJQAJYDP-BJDJZHNGSA-N 0.000 description 1
- HERITAGIPLEJMT-GVARAGBVSA-N Ile-Ala-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HERITAGIPLEJMT-GVARAGBVSA-N 0.000 description 1
- TZCGZYWNIDZZMR-NAKRPEOUSA-N Ile-Arg-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](C)C(=O)O)N TZCGZYWNIDZZMR-NAKRPEOUSA-N 0.000 description 1
- WECYRWOMWSCWNX-XUXIUFHCSA-N Ile-Arg-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O WECYRWOMWSCWNX-XUXIUFHCSA-N 0.000 description 1
- NULSANWBUWLTKN-NAKRPEOUSA-N Ile-Arg-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N NULSANWBUWLTKN-NAKRPEOUSA-N 0.000 description 1
- UAVQIQOOBXFKRC-BYULHYEWSA-N Ile-Asn-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O UAVQIQOOBXFKRC-BYULHYEWSA-N 0.000 description 1
- IPYVXYDYLHVWHU-GMOBBJLQSA-N Ile-Asn-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCSC)C(=O)O)N IPYVXYDYLHVWHU-GMOBBJLQSA-N 0.000 description 1
- HVWXAQVMRBKKFE-UGYAYLCHSA-N Ile-Asp-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HVWXAQVMRBKKFE-UGYAYLCHSA-N 0.000 description 1
- RGSOCXHDOPQREB-ZPFDUUQYSA-N Ile-Asp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N RGSOCXHDOPQREB-ZPFDUUQYSA-N 0.000 description 1
- DCQMJRSOGCYKTR-GHCJXIJMSA-N Ile-Asp-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O DCQMJRSOGCYKTR-GHCJXIJMSA-N 0.000 description 1
- LLZLRXBTOOFODM-QSFUFRPTSA-N Ile-Asp-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N LLZLRXBTOOFODM-QSFUFRPTSA-N 0.000 description 1
- CTHAJJYOHOBUDY-GHCJXIJMSA-N Ile-Cys-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N CTHAJJYOHOBUDY-GHCJXIJMSA-N 0.000 description 1
- PPSQSIDMOVPKPI-BJDJZHNGSA-N Ile-Cys-Leu Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)O PPSQSIDMOVPKPI-BJDJZHNGSA-N 0.000 description 1
- WEWCEPOYKANMGZ-MMWGEVLESA-N Ile-Cys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N1CCC[C@@H]1C(=O)O)N WEWCEPOYKANMGZ-MMWGEVLESA-N 0.000 description 1
- VQUCKIAECLVLAD-SVSWQMSJSA-N Ile-Cys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N VQUCKIAECLVLAD-SVSWQMSJSA-N 0.000 description 1
- BSWLQVGEVFYGIM-ZPFDUUQYSA-N Ile-Gln-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N BSWLQVGEVFYGIM-ZPFDUUQYSA-N 0.000 description 1
- CYHJCEKUMCNDFG-LAEOZQHASA-N Ile-Gln-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N CYHJCEKUMCNDFG-LAEOZQHASA-N 0.000 description 1
- KUHFPGIVBOCRMV-MNXVOIDGSA-N Ile-Gln-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N KUHFPGIVBOCRMV-MNXVOIDGSA-N 0.000 description 1
- JRYQSFOFUFXPTB-RWRJDSDZSA-N Ile-Gln-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N JRYQSFOFUFXPTB-RWRJDSDZSA-N 0.000 description 1
- BEWFWZRGBDVXRP-PEFMBERDSA-N Ile-Glu-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O BEWFWZRGBDVXRP-PEFMBERDSA-N 0.000 description 1
- FUOYNOXRWPJPAN-QEWYBTABSA-N Ile-Glu-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N FUOYNOXRWPJPAN-QEWYBTABSA-N 0.000 description 1
- JXMSHKFPDIUYGS-SIUGBPQLSA-N Ile-Glu-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N JXMSHKFPDIUYGS-SIUGBPQLSA-N 0.000 description 1
- WUKLZPHVWAMZQV-UKJIMTQDSA-N Ile-Glu-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N WUKLZPHVWAMZQV-UKJIMTQDSA-N 0.000 description 1
- SLQVFYWBGNNOTK-BYULHYEWSA-N Ile-Gly-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N SLQVFYWBGNNOTK-BYULHYEWSA-N 0.000 description 1
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 1
- IGJWJGIHUFQANP-LAEOZQHASA-N Ile-Gly-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N IGJWJGIHUFQANP-LAEOZQHASA-N 0.000 description 1
- JLWLMGADIQFKRD-QSFUFRPTSA-N Ile-His-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CN=CN1 JLWLMGADIQFKRD-QSFUFRPTSA-N 0.000 description 1
- SJLVSMMIFYTSGY-GRLWGSQLSA-N Ile-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SJLVSMMIFYTSGY-GRLWGSQLSA-N 0.000 description 1
- PFPUFNLHBXKPHY-HTFCKZLJSA-N Ile-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)O)N PFPUFNLHBXKPHY-HTFCKZLJSA-N 0.000 description 1
- KLBVGHCGHUNHEA-BJDJZHNGSA-N Ile-Leu-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)O)N KLBVGHCGHUNHEA-BJDJZHNGSA-N 0.000 description 1
- OUUCIIJSBIBCHB-ZPFDUUQYSA-N Ile-Leu-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O OUUCIIJSBIBCHB-ZPFDUUQYSA-N 0.000 description 1
- NUKXXNFEUZGPRO-BJDJZHNGSA-N Ile-Leu-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)O)N NUKXXNFEUZGPRO-BJDJZHNGSA-N 0.000 description 1
- IOVUXUSIGXCREV-DKIMLUQUSA-N Ile-Leu-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IOVUXUSIGXCREV-DKIMLUQUSA-N 0.000 description 1
- PARSHQDZROHERM-NHCYSSNCSA-N Ile-Lys-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)O)N PARSHQDZROHERM-NHCYSSNCSA-N 0.000 description 1
- WVUDHMBJNBWZBU-XUXIUFHCSA-N Ile-Lys-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N WVUDHMBJNBWZBU-XUXIUFHCSA-N 0.000 description 1
- IDMNOFVUXYYZPF-DKIMLUQUSA-N Ile-Lys-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N IDMNOFVUXYYZPF-DKIMLUQUSA-N 0.000 description 1
- FFJQAEYLAQMGDL-MGHWNKPDSA-N Ile-Lys-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FFJQAEYLAQMGDL-MGHWNKPDSA-N 0.000 description 1
- WYUHAXJAMDTOAU-IAVJCBSLSA-N Ile-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N WYUHAXJAMDTOAU-IAVJCBSLSA-N 0.000 description 1
- KLJKJVXDHVUMMZ-KKPKCPPISA-N Ile-Phe-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N KLJKJVXDHVUMMZ-KKPKCPPISA-N 0.000 description 1
- JODPUDMBQBIWCK-GHCJXIJMSA-N Ile-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O JODPUDMBQBIWCK-GHCJXIJMSA-N 0.000 description 1
- ZDNNDIJTUHQCAM-MXAVVETBSA-N Ile-Ser-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N ZDNNDIJTUHQCAM-MXAVVETBSA-N 0.000 description 1
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 1
- HXIDVIFHRYRXLZ-NAKRPEOUSA-N Ile-Ser-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)O)N HXIDVIFHRYRXLZ-NAKRPEOUSA-N 0.000 description 1
- CNMOKANDJMLAIF-CIQUZCHMSA-N Ile-Thr-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O CNMOKANDJMLAIF-CIQUZCHMSA-N 0.000 description 1
- SAEWJTCJQVZQNZ-IUKAMOBKSA-N Ile-Thr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SAEWJTCJQVZQNZ-IUKAMOBKSA-N 0.000 description 1
- COWHUQXTSYTKQC-RWRJDSDZSA-N Ile-Thr-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N COWHUQXTSYTKQC-RWRJDSDZSA-N 0.000 description 1
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 1
- KBDIBHQICWDGDL-PPCPHDFISA-N Ile-Thr-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N KBDIBHQICWDGDL-PPCPHDFISA-N 0.000 description 1
- HZVRQFKRALAMQS-SLBDDTMCSA-N Ile-Trp-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZVRQFKRALAMQS-SLBDDTMCSA-N 0.000 description 1
- DLEBSGAVWRPTIX-PEDHHIEDSA-N Ile-Val-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)[C@@H](C)CC DLEBSGAVWRPTIX-PEDHHIEDSA-N 0.000 description 1
- PWWVAXIEGOYWEE-UHFFFAOYSA-N Isophenergan Chemical compound C1=CC=C2N(CC(C)N(C)C)C3=CC=CC=C3SC2=C1 PWWVAXIEGOYWEE-UHFFFAOYSA-N 0.000 description 1
- 108010076876 Keratins Proteins 0.000 description 1
- 102000011782 Keratins Human genes 0.000 description 1
- 241001099156 Komagataella phaffii Species 0.000 description 1
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 1
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 1
- AHLPHDHHMVZTML-BYPYZUCNSA-N L-Ornithine Chemical compound NCCC[C@H](N)C(O)=O AHLPHDHHMVZTML-BYPYZUCNSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 1
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 1
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 1
- XIRYQRLFHWWWTC-QEJZJMRPSA-N Leu-Ala-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XIRYQRLFHWWWTC-QEJZJMRPSA-N 0.000 description 1
- WSGXUIQTEZDVHJ-GARJFASQSA-N Leu-Ala-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O WSGXUIQTEZDVHJ-GARJFASQSA-N 0.000 description 1
- QUAAUWNLWMLERT-IHRRRGAJSA-N Leu-Arg-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O QUAAUWNLWMLERT-IHRRRGAJSA-N 0.000 description 1
- KTFHTMHHKXUYPW-ZPFDUUQYSA-N Leu-Asp-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KTFHTMHHKXUYPW-ZPFDUUQYSA-N 0.000 description 1
- JQSXWJXBASFONF-KKUMJFAQSA-N Leu-Asp-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JQSXWJXBASFONF-KKUMJFAQSA-N 0.000 description 1
- NFHJQETXTSDZSI-DCAQKATOSA-N Leu-Cys-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NFHJQETXTSDZSI-DCAQKATOSA-N 0.000 description 1
- KAFOIVJDVSZUMD-UHFFFAOYSA-N Leu-Gln-Gln Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)NC(CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-UHFFFAOYSA-N 0.000 description 1
- RVVBWTWPNFDYBE-SRVKXCTJSA-N Leu-Glu-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RVVBWTWPNFDYBE-SRVKXCTJSA-N 0.000 description 1
- HFBCHNRFRYLZNV-GUBZILKMSA-N Leu-Glu-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HFBCHNRFRYLZNV-GUBZILKMSA-N 0.000 description 1
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 1
- WQWSMEOYXJTFRU-GUBZILKMSA-N Leu-Glu-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O WQWSMEOYXJTFRU-GUBZILKMSA-N 0.000 description 1
- HVJVUYQWFYMGJS-GVXVVHGQSA-N Leu-Glu-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVJVUYQWFYMGJS-GVXVVHGQSA-N 0.000 description 1
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 1
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 1
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 1
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 1
- HRTRLSRYZZKPCO-BJDJZHNGSA-N Leu-Ile-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HRTRLSRYZZKPCO-BJDJZHNGSA-N 0.000 description 1
- PDQDCFBVYXEFSD-SRVKXCTJSA-N Leu-Leu-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PDQDCFBVYXEFSD-SRVKXCTJSA-N 0.000 description 1
- IFMPDNRWZZEZSL-SRVKXCTJSA-N Leu-Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(O)=O IFMPDNRWZZEZSL-SRVKXCTJSA-N 0.000 description 1
- FAELBUXXFQLUAX-AJNGGQMLSA-N Leu-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C FAELBUXXFQLUAX-AJNGGQMLSA-N 0.000 description 1
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 1
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 1
- VVQJGYPTIYOFBR-IHRRRGAJSA-N Leu-Lys-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N VVQJGYPTIYOFBR-IHRRRGAJSA-N 0.000 description 1
- ARRIJPQRBWRNLT-DCAQKATOSA-N Leu-Met-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N ARRIJPQRBWRNLT-DCAQKATOSA-N 0.000 description 1
- FLNPJLDPGMLWAU-UWVGGRQHSA-N Leu-Met-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(C)C FLNPJLDPGMLWAU-UWVGGRQHSA-N 0.000 description 1
- IBSGMIPRBMPMHE-IHRRRGAJSA-N Leu-Met-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O IBSGMIPRBMPMHE-IHRRRGAJSA-N 0.000 description 1
- GCXGCIYIHXSKAY-ULQDDVLXSA-N Leu-Phe-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GCXGCIYIHXSKAY-ULQDDVLXSA-N 0.000 description 1
- YESNGRDJQWDYLH-KKUMJFAQSA-N Leu-Phe-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)O)N YESNGRDJQWDYLH-KKUMJFAQSA-N 0.000 description 1
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 1
- VULJUQZPSOASBZ-SRVKXCTJSA-N Leu-Pro-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O VULJUQZPSOASBZ-SRVKXCTJSA-N 0.000 description 1
- XXXXOVFBXRERQL-ULQDDVLXSA-N Leu-Pro-Phe Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XXXXOVFBXRERQL-ULQDDVLXSA-N 0.000 description 1
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 1
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 1
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 1
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 1
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 1
- MVJRBCJCRYGCKV-GVXVVHGQSA-N Leu-Val-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MVJRBCJCRYGCKV-GVXVVHGQSA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- MPOHDJKRBLVGCT-CIUDSAMLSA-N Lys-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N MPOHDJKRBLVGCT-CIUDSAMLSA-N 0.000 description 1
- BTSXLXFPMZXVPR-DLOVCJGASA-N Lys-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCCN)N BTSXLXFPMZXVPR-DLOVCJGASA-N 0.000 description 1
- IRNSXVOWSXSULE-DCAQKATOSA-N Lys-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN IRNSXVOWSXSULE-DCAQKATOSA-N 0.000 description 1
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 1
- SWWCDAGDQHTKIE-RHYQMDGZSA-N Lys-Arg-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWWCDAGDQHTKIE-RHYQMDGZSA-N 0.000 description 1
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 1
- YVSHZSUKQHNDHD-KKUMJFAQSA-N Lys-Asn-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N YVSHZSUKQHNDHD-KKUMJFAQSA-N 0.000 description 1
- JBRWKVANRYPCAF-XIRDDKMYSA-N Lys-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N JBRWKVANRYPCAF-XIRDDKMYSA-N 0.000 description 1
- LZWNAOIMTLNMDW-NHCYSSNCSA-N Lys-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N LZWNAOIMTLNMDW-NHCYSSNCSA-N 0.000 description 1
- QUYCUALODHJQLK-CIUDSAMLSA-N Lys-Asp-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUYCUALODHJQLK-CIUDSAMLSA-N 0.000 description 1
- HWMZUBUEOYAQSC-DCAQKATOSA-N Lys-Gln-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O HWMZUBUEOYAQSC-DCAQKATOSA-N 0.000 description 1
- LXNPMPIQDNSMTA-AVGNSLFASA-N Lys-Gln-His Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 LXNPMPIQDNSMTA-AVGNSLFASA-N 0.000 description 1
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 1
- MQMIRLVJXQNTRJ-SDDRHHMPSA-N Lys-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O MQMIRLVJXQNTRJ-SDDRHHMPSA-N 0.000 description 1
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 1
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 1
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 1
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 1
- ODUQLUADRKMHOZ-JYJNAYRXSA-N Lys-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)O ODUQLUADRKMHOZ-JYJNAYRXSA-N 0.000 description 1
- DTUZCYRNEJDKSR-NHCYSSNCSA-N Lys-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN DTUZCYRNEJDKSR-NHCYSSNCSA-N 0.000 description 1
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 1
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 1
- SQJSXOQXJYAVRV-SRVKXCTJSA-N Lys-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N SQJSXOQXJYAVRV-SRVKXCTJSA-N 0.000 description 1
- WOEDRPCHKPSFDT-MXAVVETBSA-N Lys-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCCN)N WOEDRPCHKPSFDT-MXAVVETBSA-N 0.000 description 1
- OWRUUFUVXFREBD-KKUMJFAQSA-N Lys-His-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O OWRUUFUVXFREBD-KKUMJFAQSA-N 0.000 description 1
- FGMHXLULNHTPID-KKUMJFAQSA-N Lys-His-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CN=CN1 FGMHXLULNHTPID-KKUMJFAQSA-N 0.000 description 1
- CTBMEDOQJFGNMI-IHPCNDPISA-N Lys-His-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC3=CN=CN3)NC(=O)[C@H](CCCCN)N CTBMEDOQJFGNMI-IHPCNDPISA-N 0.000 description 1
- QOJDBRUCOXQSSK-AJNGGQMLSA-N Lys-Ile-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O QOJDBRUCOXQSSK-AJNGGQMLSA-N 0.000 description 1
- NCZIQZYZPUPMKY-PPCPHDFISA-N Lys-Ile-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NCZIQZYZPUPMKY-PPCPHDFISA-N 0.000 description 1
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 1
- VMTYLUGCXIEDMV-QWRGUYRKSA-N Lys-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN VMTYLUGCXIEDMV-QWRGUYRKSA-N 0.000 description 1
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 1
- ZJWIXBZTAAJERF-IHRRRGAJSA-N Lys-Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZJWIXBZTAAJERF-IHRRRGAJSA-N 0.000 description 1
- RIJCHEVHFWMDKD-SRVKXCTJSA-N Lys-Lys-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RIJCHEVHFWMDKD-SRVKXCTJSA-N 0.000 description 1
- JQSIGLHQNSZZRL-KKUMJFAQSA-N Lys-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N JQSIGLHQNSZZRL-KKUMJFAQSA-N 0.000 description 1
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 1
- TYEJPFJNAHIKRT-DCAQKATOSA-N Lys-Met-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N TYEJPFJNAHIKRT-DCAQKATOSA-N 0.000 description 1
- IPTUBUUIFRZMJK-ACRUOGEOSA-N Lys-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 IPTUBUUIFRZMJK-ACRUOGEOSA-N 0.000 description 1
- YTJFXEDRUOQGSP-DCAQKATOSA-N Lys-Pro-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YTJFXEDRUOQGSP-DCAQKATOSA-N 0.000 description 1
- DYJOORGDQIGZAS-DCAQKATOSA-N Lys-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCCN)N DYJOORGDQIGZAS-DCAQKATOSA-N 0.000 description 1
- DIBZLYZXTSVGLN-CIUDSAMLSA-N Lys-Ser-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O DIBZLYZXTSVGLN-CIUDSAMLSA-N 0.000 description 1
- MIFFFXHMAHFACR-KATARQTJSA-N Lys-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN MIFFFXHMAHFACR-KATARQTJSA-N 0.000 description 1
- TVHCDSBMFQYPNA-RHYQMDGZSA-N Lys-Thr-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TVHCDSBMFQYPNA-RHYQMDGZSA-N 0.000 description 1
- IEVXCWPVBYCJRZ-IXOXFDKPSA-N Lys-Thr-His Chemical compound NCCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 IEVXCWPVBYCJRZ-IXOXFDKPSA-N 0.000 description 1
- CNXOBMMOYZPPGS-NUTKFTJISA-N Lys-Trp-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O CNXOBMMOYZPPGS-NUTKFTJISA-N 0.000 description 1
- XATKLFSXFINPSB-JYJNAYRXSA-N Lys-Tyr-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O XATKLFSXFINPSB-JYJNAYRXSA-N 0.000 description 1
- WINFHLHJTRGLCV-BZSNNMDCSA-N Lys-Tyr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=C(O)C=C1 WINFHLHJTRGLCV-BZSNNMDCSA-N 0.000 description 1
- QLFAPXUXEBAWEK-NHCYSSNCSA-N Lys-Val-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O QLFAPXUXEBAWEK-NHCYSSNCSA-N 0.000 description 1
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 1
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 1
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 1
- 102000000380 Matrix Metalloproteinase 1 Human genes 0.000 description 1
- 108010016113 Matrix Metalloproteinase 1 Proteins 0.000 description 1
- 108010076557 Matrix Metalloproteinase 14 Proteins 0.000 description 1
- 102100030216 Matrix metalloproteinase-14 Human genes 0.000 description 1
- 102000018697 Membrane Proteins Human genes 0.000 description 1
- MUYQDMBLDFEVRJ-LSJOCFKGSA-N Met-Ala-His Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 MUYQDMBLDFEVRJ-LSJOCFKGSA-N 0.000 description 1
- QEVRUYFHWJJUHZ-DCAQKATOSA-N Met-Ala-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(C)C QEVRUYFHWJJUHZ-DCAQKATOSA-N 0.000 description 1
- BVXXDMUMHMXFER-BPNCWPANSA-N Met-Ala-Tyr Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BVXXDMUMHMXFER-BPNCWPANSA-N 0.000 description 1
- CWFYZYQMUDWGTI-GUBZILKMSA-N Met-Arg-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O CWFYZYQMUDWGTI-GUBZILKMSA-N 0.000 description 1
- IIPHCNKHEZYSNE-DCAQKATOSA-N Met-Arg-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O IIPHCNKHEZYSNE-DCAQKATOSA-N 0.000 description 1
- DSWOTZCVCBEPOU-IUCAKERBSA-N Met-Arg-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCNC(N)=N DSWOTZCVCBEPOU-IUCAKERBSA-N 0.000 description 1
- ZEDVFJPQNNBMST-CYDGBPFRSA-N Met-Arg-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZEDVFJPQNNBMST-CYDGBPFRSA-N 0.000 description 1
- WDTLNWHPIPCMMP-AVGNSLFASA-N Met-Arg-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O WDTLNWHPIPCMMP-AVGNSLFASA-N 0.000 description 1
- RJEFZSIVBHGRQJ-SRVKXCTJSA-N Met-Arg-Met Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O RJEFZSIVBHGRQJ-SRVKXCTJSA-N 0.000 description 1
- TZLYIHDABYBOCJ-FXQIFTODSA-N Met-Asp-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O TZLYIHDABYBOCJ-FXQIFTODSA-N 0.000 description 1
- FVKRQMQQFGBXHV-QXEWZRGKSA-N Met-Asp-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O FVKRQMQQFGBXHV-QXEWZRGKSA-N 0.000 description 1
- OFNCSQNBSWGGNV-DCAQKATOSA-N Met-Cys-His Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 OFNCSQNBSWGGNV-DCAQKATOSA-N 0.000 description 1
- HLYIDXAXQIJYIG-CIUDSAMLSA-N Met-Gln-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HLYIDXAXQIJYIG-CIUDSAMLSA-N 0.000 description 1
- UOENBSHXYCHSAU-YUMQZZPRSA-N Met-Gln-Gly Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O UOENBSHXYCHSAU-YUMQZZPRSA-N 0.000 description 1
- AETNZPKUUYYYEK-CIUDSAMLSA-N Met-Glu-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O AETNZPKUUYYYEK-CIUDSAMLSA-N 0.000 description 1
- JPCHYAUKOUGOIB-HJGDQZAQSA-N Met-Glu-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPCHYAUKOUGOIB-HJGDQZAQSA-N 0.000 description 1
- MYAPQOBHGWJZOM-UWVGGRQHSA-N Met-Gly-Leu Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C MYAPQOBHGWJZOM-UWVGGRQHSA-N 0.000 description 1
- PPHLBTXVBJNKOB-FDARSICLSA-N Met-Ile-Trp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O PPHLBTXVBJNKOB-FDARSICLSA-N 0.000 description 1
- PZUUMQPMHBJJKE-AVGNSLFASA-N Met-Leu-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCNC(N)=N PZUUMQPMHBJJKE-AVGNSLFASA-N 0.000 description 1
- WPTHAGXMYDRPFD-SRVKXCTJSA-N Met-Lys-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O WPTHAGXMYDRPFD-SRVKXCTJSA-N 0.000 description 1
- BQHLZUMZOXUWNU-DCAQKATOSA-N Met-Pro-Glu Chemical compound CSCC[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(=O)O)C(=O)O)N BQHLZUMZOXUWNU-DCAQKATOSA-N 0.000 description 1
- WXXNVZMWHOLNRJ-AVGNSLFASA-N Met-Pro-Lys Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O WXXNVZMWHOLNRJ-AVGNSLFASA-N 0.000 description 1
- DSZFTPCSFVWMKP-DCAQKATOSA-N Met-Ser-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN DSZFTPCSFVWMKP-DCAQKATOSA-N 0.000 description 1
- CIIJWIAORKTXAH-FJXKBIBVSA-N Met-Thr-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O CIIJWIAORKTXAH-FJXKBIBVSA-N 0.000 description 1
- HMEVNCOJHJTLNB-BVSLBCMMSA-N Met-Trp-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CC=CC=C3)C(=O)O)N HMEVNCOJHJTLNB-BVSLBCMMSA-N 0.000 description 1
- VYXIKLFLGRTANT-HRCADAONSA-N Met-Tyr-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N VYXIKLFLGRTANT-HRCADAONSA-N 0.000 description 1
- LPNWWHBFXPNHJG-AVGNSLFASA-N Met-Val-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN LPNWWHBFXPNHJG-AVGNSLFASA-N 0.000 description 1
- LBSWWNKMVPAXOI-GUBZILKMSA-N Met-Val-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O LBSWWNKMVPAXOI-GUBZILKMSA-N 0.000 description 1
- 206010027476 Metastases Diseases 0.000 description 1
- 241001291091 Mimivirus Species 0.000 description 1
- 101100268906 Mus musculus Acox1 gene Proteins 0.000 description 1
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 1
- 108010066427 N-valyltryptophan Proteins 0.000 description 1
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 1
- 101150114808 NAA25 gene Proteins 0.000 description 1
- 108010047562 NGR peptide Proteins 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 101100395023 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) his-7 gene Proteins 0.000 description 1
- 102100030411 Neutrophil collagenase Human genes 0.000 description 1
- 101710118230 Neutrophil collagenase Proteins 0.000 description 1
- 108091092724 Noncoding DNA Proteins 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 241000320412 Ogataea angusta Species 0.000 description 1
- 241001563619 Ogataea parapolymorpha Species 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- AHLPHDHHMVZTML-UHFFFAOYSA-N Orn-delta-NH2 Natural products NCCCC(N)C(O)=O AHLPHDHHMVZTML-UHFFFAOYSA-N 0.000 description 1
- UTJLXEIPEHZYQJ-UHFFFAOYSA-N Ornithine Natural products OC(=O)C(C)CCCN UTJLXEIPEHZYQJ-UHFFFAOYSA-N 0.000 description 1
- 102000004316 Oxidoreductases Human genes 0.000 description 1
- 108090000854 Oxidoreductases Proteins 0.000 description 1
- 101150056463 PEX5 gene Proteins 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 108091093037 Peptide nucleic acid Proteins 0.000 description 1
- 101710101427 Peroxisomal targeting signal 1 receptor Proteins 0.000 description 1
- 102100032924 Peroxisomal targeting signal 2 receptor Human genes 0.000 description 1
- 101710183599 Peroxisomal targeting signal receptor Proteins 0.000 description 1
- MPGJIHFJCXTVEX-KKUMJFAQSA-N Phe-Arg-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O MPGJIHFJCXTVEX-KKUMJFAQSA-N 0.000 description 1
- LDSOBEJVGGVWGD-DLOVCJGASA-N Phe-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 LDSOBEJVGGVWGD-DLOVCJGASA-N 0.000 description 1
- QEPZQAPZKIPVDV-KKUMJFAQSA-N Phe-Cys-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N QEPZQAPZKIPVDV-KKUMJFAQSA-N 0.000 description 1
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 1
- IDUCUXTUHHIQIP-SOUVJXGZSA-N Phe-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O IDUCUXTUHHIQIP-SOUVJXGZSA-N 0.000 description 1
- NKLDZIPTGKBDBB-HTUGSXCWSA-N Phe-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N)O NKLDZIPTGKBDBB-HTUGSXCWSA-N 0.000 description 1
- ZLGQEBCCANLYRA-RYUDHWBXSA-N Phe-Gly-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O ZLGQEBCCANLYRA-RYUDHWBXSA-N 0.000 description 1
- NAXPHWZXEXNDIW-JTQLQIEISA-N Phe-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 NAXPHWZXEXNDIW-JTQLQIEISA-N 0.000 description 1
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 1
- FXYXBEZMRACDDR-KKUMJFAQSA-N Phe-His-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O FXYXBEZMRACDDR-KKUMJFAQSA-N 0.000 description 1
- GYEPCBNTTRORKW-PCBIJLKTSA-N Phe-Ile-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O GYEPCBNTTRORKW-PCBIJLKTSA-N 0.000 description 1
- DVOCGBNHAUHKHJ-DKIMLUQUSA-N Phe-Ile-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O DVOCGBNHAUHKHJ-DKIMLUQUSA-N 0.000 description 1
- RORUIHAWOLADSH-HJWJTTGWSA-N Phe-Ile-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=CC=C1 RORUIHAWOLADSH-HJWJTTGWSA-N 0.000 description 1
- XMQSOOJRRVEHRO-ULQDDVLXSA-N Phe-Leu-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 XMQSOOJRRVEHRO-ULQDDVLXSA-N 0.000 description 1
- CMHTUJQZQXFNTQ-OEAJRASXSA-N Phe-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CC=CC=C1)N)O CMHTUJQZQXFNTQ-OEAJRASXSA-N 0.000 description 1
- RMKGXGPQIPLTFC-KKUMJFAQSA-N Phe-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RMKGXGPQIPLTFC-KKUMJFAQSA-N 0.000 description 1
- CJAHQEZWDZNSJO-KKUMJFAQSA-N Phe-Lys-Cys Chemical compound NCCCC[C@@H](C(=O)N[C@@H](CS)C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 CJAHQEZWDZNSJO-KKUMJFAQSA-N 0.000 description 1
- MJAYDXWQQUOURZ-JYJNAYRXSA-N Phe-Lys-Gln Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O MJAYDXWQQUOURZ-JYJNAYRXSA-N 0.000 description 1
- WLYPRKLMRIYGPP-JYJNAYRXSA-N Phe-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 WLYPRKLMRIYGPP-JYJNAYRXSA-N 0.000 description 1
- ZUQACJLOHYRVPJ-DKIMLUQUSA-N Phe-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZUQACJLOHYRVPJ-DKIMLUQUSA-N 0.000 description 1
- RTUWVJVJSMOGPL-KKUMJFAQSA-N Phe-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RTUWVJVJSMOGPL-KKUMJFAQSA-N 0.000 description 1
- SRILZRSXIKRGBF-HRCADAONSA-N Phe-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N SRILZRSXIKRGBF-HRCADAONSA-N 0.000 description 1
- AXIOGMQCDYVTNY-ACRUOGEOSA-N Phe-Phe-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 AXIOGMQCDYVTNY-ACRUOGEOSA-N 0.000 description 1
- WKLMCMXFMQEKCX-SLFFLAALSA-N Phe-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC3=CC=CC=C3)N)C(=O)O WKLMCMXFMQEKCX-SLFFLAALSA-N 0.000 description 1
- JLLJTMHNXQTMCK-UBHSHLNASA-N Phe-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 JLLJTMHNXQTMCK-UBHSHLNASA-N 0.000 description 1
- QARPMYDMYVLFMW-KKUMJFAQSA-N Phe-Pro-Glu Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=CC=C1 QARPMYDMYVLFMW-KKUMJFAQSA-N 0.000 description 1
- NJJBATPLUQHRBM-IHRRRGAJSA-N Phe-Pro-Ser Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CO)C(=O)O NJJBATPLUQHRBM-IHRRRGAJSA-N 0.000 description 1
- BPCLGWHVPVTTFM-QWRGUYRKSA-N Phe-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O BPCLGWHVPVTTFM-QWRGUYRKSA-N 0.000 description 1
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 1
- XNQMZHLAYFWSGJ-HTUGSXCWSA-N Phe-Thr-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XNQMZHLAYFWSGJ-HTUGSXCWSA-N 0.000 description 1
- BSKMOCNNLNDIMU-CDMKHQONSA-N Phe-Thr-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O BSKMOCNNLNDIMU-CDMKHQONSA-N 0.000 description 1
- APECKGGXAXNFLL-RNXOBYDBSA-N Phe-Trp-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 APECKGGXAXNFLL-RNXOBYDBSA-N 0.000 description 1
- VFDRDMOMHBJGKD-UFYCRDLUSA-N Phe-Tyr-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N VFDRDMOMHBJGKD-UFYCRDLUSA-N 0.000 description 1
- ZTVSVSFBHUVYIN-UFYCRDLUSA-N Phe-Tyr-Met Chemical compound C([C@@H](C(=O)N[C@@H](CCSC)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=C(O)C=C1 ZTVSVSFBHUVYIN-UFYCRDLUSA-N 0.000 description 1
- IPVPGAADZXRZSH-RNXOBYDBSA-N Phe-Tyr-Trp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O IPVPGAADZXRZSH-RNXOBYDBSA-N 0.000 description 1
- XALFIVXGQUEGKV-JSGCOSHPSA-N Phe-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 XALFIVXGQUEGKV-JSGCOSHPSA-N 0.000 description 1
- 102000045595 Phosphoprotein Phosphatases Human genes 0.000 description 1
- 108700019535 Phosphoprotein Phosphatases Proteins 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 239000004952 Polyamide Substances 0.000 description 1
- IFMDQWDAJUMMJC-DCAQKATOSA-N Pro-Ala-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O IFMDQWDAJUMMJC-DCAQKATOSA-N 0.000 description 1
- OOLOTUZJUBOMAX-GUBZILKMSA-N Pro-Ala-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O OOLOTUZJUBOMAX-GUBZILKMSA-N 0.000 description 1
- SMCHPSMKAFIERP-FXQIFTODSA-N Pro-Asn-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@@H]1CCCN1 SMCHPSMKAFIERP-FXQIFTODSA-N 0.000 description 1
- INXAPZFIOVGHSV-CIUDSAMLSA-N Pro-Asn-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 INXAPZFIOVGHSV-CIUDSAMLSA-N 0.000 description 1
- XROLYVMNVIKVEM-BQBZGAKWSA-N Pro-Asn-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O XROLYVMNVIKVEM-BQBZGAKWSA-N 0.000 description 1
- ILMLVTGTUJPQFP-FXQIFTODSA-N Pro-Asp-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O ILMLVTGTUJPQFP-FXQIFTODSA-N 0.000 description 1
- KPDRZQUWJKTMBP-DCAQKATOSA-N Pro-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 KPDRZQUWJKTMBP-DCAQKATOSA-N 0.000 description 1
- HXOLCSYHGRNXJJ-IHRRRGAJSA-N Pro-Asp-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HXOLCSYHGRNXJJ-IHRRRGAJSA-N 0.000 description 1
- SFECXGVELZFBFJ-VEVYYDQMSA-N Pro-Asp-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SFECXGVELZFBFJ-VEVYYDQMSA-N 0.000 description 1
- LSIWVWRUTKPXDS-DCAQKATOSA-N Pro-Gln-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LSIWVWRUTKPXDS-DCAQKATOSA-N 0.000 description 1
- HJSCRFZVGXAGNG-SRVKXCTJSA-N Pro-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H]1CCCN1 HJSCRFZVGXAGNG-SRVKXCTJSA-N 0.000 description 1
- DRIJZWBRGMJCDD-DCAQKATOSA-N Pro-Gln-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O DRIJZWBRGMJCDD-DCAQKATOSA-N 0.000 description 1
- DIFXZGPHVCIVSQ-CIUDSAMLSA-N Pro-Gln-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DIFXZGPHVCIVSQ-CIUDSAMLSA-N 0.000 description 1
- FRKBNXCFJBPJOL-GUBZILKMSA-N Pro-Glu-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FRKBNXCFJBPJOL-GUBZILKMSA-N 0.000 description 1
- LGSANCBHSMDFDY-GARJFASQSA-N Pro-Glu-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O LGSANCBHSMDFDY-GARJFASQSA-N 0.000 description 1
- LXVLKXPFIDDHJG-CIUDSAMLSA-N Pro-Glu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O LXVLKXPFIDDHJG-CIUDSAMLSA-N 0.000 description 1
- ZTVCLZLGHZXLOT-ULQDDVLXSA-N Pro-Glu-Trp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O ZTVCLZLGHZXLOT-ULQDDVLXSA-N 0.000 description 1
- HAAQQNHQZBOWFO-LURJTMIESA-N Pro-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H]1CCCN1 HAAQQNHQZBOWFO-LURJTMIESA-N 0.000 description 1
- WSRWHZRUOCACLJ-UWVGGRQHSA-N Pro-Gly-His Chemical compound C([C@@H](C(=O)O)NC(=O)CNC(=O)[C@H]1NCCC1)C1=CN=CN1 WSRWHZRUOCACLJ-UWVGGRQHSA-N 0.000 description 1
- FEVDNIBDCRKMER-IUCAKERBSA-N Pro-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEVDNIBDCRKMER-IUCAKERBSA-N 0.000 description 1
- QEWBZBLXDKIQPS-STQMWFEESA-N Pro-Gly-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QEWBZBLXDKIQPS-STQMWFEESA-N 0.000 description 1
- LCUOTSLIVGSGAU-AVGNSLFASA-N Pro-His-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LCUOTSLIVGSGAU-AVGNSLFASA-N 0.000 description 1
- STASJMBVVHNWCG-IHRRRGAJSA-N Pro-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)NC(=O)[C@H]1[NH2+]CCC1)C1=CN=CN1 STASJMBVVHNWCG-IHRRRGAJSA-N 0.000 description 1
- TYMBHHITTMGGPI-NAKRPEOUSA-N Pro-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@@H]1CCCN1 TYMBHHITTMGGPI-NAKRPEOUSA-N 0.000 description 1
- LNOWDSPAYBWJOR-PEDHHIEDSA-N Pro-Ile-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LNOWDSPAYBWJOR-PEDHHIEDSA-N 0.000 description 1
- UREQLMJCKFLLHM-NAKRPEOUSA-N Pro-Ile-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UREQLMJCKFLLHM-NAKRPEOUSA-N 0.000 description 1
- FXGIMYRVJJEIIM-UWVGGRQHSA-N Pro-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 FXGIMYRVJJEIIM-UWVGGRQHSA-N 0.000 description 1
- SXMSEHDMNIUTSP-DCAQKATOSA-N Pro-Lys-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SXMSEHDMNIUTSP-DCAQKATOSA-N 0.000 description 1
- VWHJZETTZDAGOM-XUXIUFHCSA-N Pro-Lys-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VWHJZETTZDAGOM-XUXIUFHCSA-N 0.000 description 1
- DWGFLKQSGRUQTI-IHRRRGAJSA-N Pro-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 DWGFLKQSGRUQTI-IHRRRGAJSA-N 0.000 description 1
- WFIVLLFYUZZWOD-RHYQMDGZSA-N Pro-Lys-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WFIVLLFYUZZWOD-RHYQMDGZSA-N 0.000 description 1
- MHBSUKYVBZVQRW-HJWJTTGWSA-N Pro-Phe-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MHBSUKYVBZVQRW-HJWJTTGWSA-N 0.000 description 1
- ZVEQWRWMRFIVSD-HRCADAONSA-N Pro-Phe-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N3CCC[C@@H]3C(=O)O ZVEQWRWMRFIVSD-HRCADAONSA-N 0.000 description 1
- AFWBWPCXSWUCLB-WDSKDSINSA-N Pro-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 AFWBWPCXSWUCLB-WDSKDSINSA-N 0.000 description 1
- QUBVFEANYYWBTM-VEVYYDQMSA-N Pro-Thr-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUBVFEANYYWBTM-VEVYYDQMSA-N 0.000 description 1
- HRIXMVRZRGFKNQ-HJGDQZAQSA-N Pro-Thr-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HRIXMVRZRGFKNQ-HJGDQZAQSA-N 0.000 description 1
- AJJDPGVVNPUZCR-RHYQMDGZSA-N Pro-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1)O AJJDPGVVNPUZCR-RHYQMDGZSA-N 0.000 description 1
- JRBWMRUPXWPEID-JYJNAYRXSA-N Pro-Trp-Cys Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CS)C(=O)O)C(=O)[C@@H]1CCCN1 JRBWMRUPXWPEID-JYJNAYRXSA-N 0.000 description 1
- SNSYSBUTTJBPDG-OKZBNKHCSA-N Pro-Trp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N4CCC[C@@H]4C(=O)O SNSYSBUTTJBPDG-OKZBNKHCSA-N 0.000 description 1
- ZAUHSLVPDLNTRZ-QXEWZRGKSA-N Pro-Val-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ZAUHSLVPDLNTRZ-QXEWZRGKSA-N 0.000 description 1
- STGVYUTZKGPRCI-GUBZILKMSA-N Pro-Val-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 STGVYUTZKGPRCI-GUBZILKMSA-N 0.000 description 1
- UCTIUWKCVNGEFH-OBJOEFQTSA-N Pro-Val-Gly-Pro Chemical compound N([C@@H](C(C)C)C(=O)NCC(=O)N1[C@@H](CCC1)C(O)=O)C(=O)[C@@H]1CCCN1 UCTIUWKCVNGEFH-OBJOEFQTSA-N 0.000 description 1
- IIRBTQHFVNGPMQ-AVGNSLFASA-N Pro-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 IIRBTQHFVNGPMQ-AVGNSLFASA-N 0.000 description 1
- XRGIDCGRSSWCKE-SRVKXCTJSA-N Pro-Val-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCSC)C(O)=O XRGIDCGRSSWCKE-SRVKXCTJSA-N 0.000 description 1
- ZMLRZBWCXPQADC-TUAOUCFPSA-N Pro-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ZMLRZBWCXPQADC-TUAOUCFPSA-N 0.000 description 1
- YDTUEBLEAVANFH-RCWTZXSCSA-N Pro-Val-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 YDTUEBLEAVANFH-RCWTZXSCSA-N 0.000 description 1
- 229940124158 Protease/peptidase inhibitor Drugs 0.000 description 1
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 108010001267 Protein Subunits Proteins 0.000 description 1
- 102000004022 Protein-Tyrosine Kinases Human genes 0.000 description 1
- 108090000412 Protein-Tyrosine Kinases Proteins 0.000 description 1
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 1
- 102100033479 RAF proto-oncogene serine/threonine-protein kinase Human genes 0.000 description 1
- 101710141955 RAF proto-oncogene serine/threonine-protein kinase Proteins 0.000 description 1
- 108010079005 RDV peptide Proteins 0.000 description 1
- 108010003201 RGH 0205 Proteins 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- MEFKEPWMEQBLKI-AIRLBKTGSA-N S-adenosyl-L-methioninate Chemical compound O[C@@H]1[C@H](O)[C@@H](C[S+](CC[C@H](N)C([O-])=O)C)O[C@H]1N1C2=NC=NC(N)=C2N=C1 MEFKEPWMEQBLKI-AIRLBKTGSA-N 0.000 description 1
- 101100010928 Saccharolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2) tuf gene Proteins 0.000 description 1
- 241000235343 Saccharomycetales Species 0.000 description 1
- ZUGXSSFMTXKHJS-ZLUOBGJFSA-N Ser-Ala-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O ZUGXSSFMTXKHJS-ZLUOBGJFSA-N 0.000 description 1
- LVVBAKCGXXUHFO-ZLUOBGJFSA-N Ser-Ala-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O LVVBAKCGXXUHFO-ZLUOBGJFSA-N 0.000 description 1
- QFBNNYNWKYKVJO-DCAQKATOSA-N Ser-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N QFBNNYNWKYKVJO-DCAQKATOSA-N 0.000 description 1
- UBRXAVQWXOWRSJ-ZLUOBGJFSA-N Ser-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)C(=O)N UBRXAVQWXOWRSJ-ZLUOBGJFSA-N 0.000 description 1
- VAUMZJHYZQXZBQ-WHFBIAKZSA-N Ser-Asn-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O VAUMZJHYZQXZBQ-WHFBIAKZSA-N 0.000 description 1
- VGNYHOBZJKWRGI-CIUDSAMLSA-N Ser-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO VGNYHOBZJKWRGI-CIUDSAMLSA-N 0.000 description 1
- DKKGAAJTDKHWOD-BIIVOSGPSA-N Ser-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N)C(=O)O DKKGAAJTDKHWOD-BIIVOSGPSA-N 0.000 description 1
- RDFQNDHEHVSONI-ZLUOBGJFSA-N Ser-Asn-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDFQNDHEHVSONI-ZLUOBGJFSA-N 0.000 description 1
- OHKLFYXEOGGGCK-ZLUOBGJFSA-N Ser-Asp-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OHKLFYXEOGGGCK-ZLUOBGJFSA-N 0.000 description 1
- MESDJCNHLZBMEP-ZLUOBGJFSA-N Ser-Asp-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MESDJCNHLZBMEP-ZLUOBGJFSA-N 0.000 description 1
- MMAPOBOTRUVNKJ-ZLUOBGJFSA-N Ser-Asp-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CO)N)C(=O)O MMAPOBOTRUVNKJ-ZLUOBGJFSA-N 0.000 description 1
- DSSOYPJWSWFOLK-CIUDSAMLSA-N Ser-Cys-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O DSSOYPJWSWFOLK-CIUDSAMLSA-N 0.000 description 1
- MPPHJZYXDVDGOF-BWBBJGPYSA-N Ser-Cys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CO MPPHJZYXDVDGOF-BWBBJGPYSA-N 0.000 description 1
- CDVFZMOFNJPUDD-ACZMJKKPSA-N Ser-Gln-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CDVFZMOFNJPUDD-ACZMJKKPSA-N 0.000 description 1
- XWCYBVBLJRWOFR-WDSKDSINSA-N Ser-Gln-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O XWCYBVBLJRWOFR-WDSKDSINSA-N 0.000 description 1
- GWMXFEMMBHOKDX-AVGNSLFASA-N Ser-Gln-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 GWMXFEMMBHOKDX-AVGNSLFASA-N 0.000 description 1
- OHKFXGKHSJKKAL-NRPADANISA-N Ser-Glu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OHKFXGKHSJKKAL-NRPADANISA-N 0.000 description 1
- AEGUWTFAQQWVLC-BQBZGAKWSA-N Ser-Gly-Arg Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEGUWTFAQQWVLC-BQBZGAKWSA-N 0.000 description 1
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 1
- MUARUIBTKQJKFY-WHFBIAKZSA-N Ser-Gly-Asp Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MUARUIBTKQJKFY-WHFBIAKZSA-N 0.000 description 1
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 1
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 1
- MOINZPRHJGTCHZ-MMWGEVLESA-N Ser-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N MOINZPRHJGTCHZ-MMWGEVLESA-N 0.000 description 1
- ZIFYDQAFEMIZII-GUBZILKMSA-N Ser-Leu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZIFYDQAFEMIZII-GUBZILKMSA-N 0.000 description 1
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 1
- IUXGJEIKJBYKOO-SRVKXCTJSA-N Ser-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N IUXGJEIKJBYKOO-SRVKXCTJSA-N 0.000 description 1
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 1
- HDBOEVPDIDDEPC-CIUDSAMLSA-N Ser-Lys-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O HDBOEVPDIDDEPC-CIUDSAMLSA-N 0.000 description 1
- GVMUJUPXFQFBBZ-GUBZILKMSA-N Ser-Lys-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GVMUJUPXFQFBBZ-GUBZILKMSA-N 0.000 description 1
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 1
- FPCGZYMRFFIYIH-CIUDSAMLSA-N Ser-Lys-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O FPCGZYMRFFIYIH-CIUDSAMLSA-N 0.000 description 1
- QJKPECIAWNNKIT-KKUMJFAQSA-N Ser-Lys-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QJKPECIAWNNKIT-KKUMJFAQSA-N 0.000 description 1
- XVWDJUROVRQKAE-KKUMJFAQSA-N Ser-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CC=CC=C1 XVWDJUROVRQKAE-KKUMJFAQSA-N 0.000 description 1
- RWDVVSKYZBNDCO-MELADBBJSA-N Ser-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CO)N)C(=O)O RWDVVSKYZBNDCO-MELADBBJSA-N 0.000 description 1
- QMCDMHWAKMUGJE-IHRRRGAJSA-N Ser-Phe-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O QMCDMHWAKMUGJE-IHRRRGAJSA-N 0.000 description 1
- BSXKBOUZDAZXHE-CIUDSAMLSA-N Ser-Pro-Glu Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O BSXKBOUZDAZXHE-CIUDSAMLSA-N 0.000 description 1
- AABIBDJHSKIMJK-FXQIFTODSA-N Ser-Ser-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O AABIBDJHSKIMJK-FXQIFTODSA-N 0.000 description 1
- KKKVOZNCLALMPV-XKBZYTNZSA-N Ser-Thr-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KKKVOZNCLALMPV-XKBZYTNZSA-N 0.000 description 1
- ZKOKTQPHFMRSJP-YJRXYDGGSA-N Ser-Thr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKOKTQPHFMRSJP-YJRXYDGGSA-N 0.000 description 1
- PIQRHJQWEPWFJG-UWJYBYFXSA-N Ser-Tyr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PIQRHJQWEPWFJG-UWJYBYFXSA-N 0.000 description 1
- FHXGMDRKJHKLKW-QWRGUYRKSA-N Ser-Tyr-Gly Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 FHXGMDRKJHKLKW-QWRGUYRKSA-N 0.000 description 1
- PLQWGQUNUPMNOD-KKUMJFAQSA-N Ser-Tyr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O PLQWGQUNUPMNOD-KKUMJFAQSA-N 0.000 description 1
- PMTWIUBUQRGCSB-FXQIFTODSA-N Ser-Val-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O PMTWIUBUQRGCSB-FXQIFTODSA-N 0.000 description 1
- LLSLRQOEAFCZLW-NRPADANISA-N Ser-Val-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LLSLRQOEAFCZLW-NRPADANISA-N 0.000 description 1
- JZRYFUGREMECBH-XPUUQOCRSA-N Ser-Val-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O JZRYFUGREMECBH-XPUUQOCRSA-N 0.000 description 1
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 1
- LSHUNRICNSEEAN-BPUTZDHNSA-N Ser-Val-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CO)N LSHUNRICNSEEAN-BPUTZDHNSA-N 0.000 description 1
- 102100028904 Serine/threonine-protein kinase MARK2 Human genes 0.000 description 1
- 108700025832 Serum Response Element Proteins 0.000 description 1
- 108010090804 Streptavidin Proteins 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 101150001810 TEAD1 gene Proteins 0.000 description 1
- 101150074253 TEF1 gene Proteins 0.000 description 1
- 108020005038 Terminator Codon Proteins 0.000 description 1
- MQCPGOZXFSYJPS-KZVJFYERSA-N Thr-Ala-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MQCPGOZXFSYJPS-KZVJFYERSA-N 0.000 description 1
- DFTCYYILCSQGIZ-GCJQMDKQSA-N Thr-Ala-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFTCYYILCSQGIZ-GCJQMDKQSA-N 0.000 description 1
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 1
- XSLXHSYIVPGEER-KZVJFYERSA-N Thr-Ala-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O XSLXHSYIVPGEER-KZVJFYERSA-N 0.000 description 1
- TWLMXDWFVNEFFK-FJXKBIBVSA-N Thr-Arg-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O TWLMXDWFVNEFFK-FJXKBIBVSA-N 0.000 description 1
- MQBTXMPQNCGSSZ-OSUNSFLBSA-N Thr-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)CCCN=C(N)N MQBTXMPQNCGSSZ-OSUNSFLBSA-N 0.000 description 1
- NAXBBCLCEOTAIG-RHYQMDGZSA-N Thr-Arg-Lys Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CCCCN)C(O)=O NAXBBCLCEOTAIG-RHYQMDGZSA-N 0.000 description 1
- WFUAUEQXPVNAEF-ZJDVBMNYSA-N Thr-Arg-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CCCN=C(N)N WFUAUEQXPVNAEF-ZJDVBMNYSA-N 0.000 description 1
- QGXCWPNQVCYJEL-NUMRIWBASA-N Thr-Asn-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QGXCWPNQVCYJEL-NUMRIWBASA-N 0.000 description 1
- PQLXHSACXPGWPD-GSSVUCPTSA-N Thr-Asn-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PQLXHSACXPGWPD-GSSVUCPTSA-N 0.000 description 1
- NLSNVZAREYQMGR-HJGDQZAQSA-N Thr-Asp-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NLSNVZAREYQMGR-HJGDQZAQSA-N 0.000 description 1
- XDARBNMYXKUFOJ-GSSVUCPTSA-N Thr-Asp-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XDARBNMYXKUFOJ-GSSVUCPTSA-N 0.000 description 1
- NRUPKQSXTJNQGD-XGEHTFHBSA-N Thr-Cys-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NRUPKQSXTJNQGD-XGEHTFHBSA-N 0.000 description 1
- ZLNWJMRLHLGKFX-SVSWQMSJSA-N Thr-Cys-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZLNWJMRLHLGKFX-SVSWQMSJSA-N 0.000 description 1
- RKDFEMGVMMYYNG-WDCWCFNPSA-N Thr-Gln-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O RKDFEMGVMMYYNG-WDCWCFNPSA-N 0.000 description 1
- XXNLGZRRSKPSGF-HTUGSXCWSA-N Thr-Gln-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O XXNLGZRRSKPSGF-HTUGSXCWSA-N 0.000 description 1
- KGKWKSSSQGGYAU-SUSMZKCASA-N Thr-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O KGKWKSSSQGGYAU-SUSMZKCASA-N 0.000 description 1
- VGYBYGQXZJDZJU-XQXXSGGOSA-N Thr-Glu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VGYBYGQXZJDZJU-XQXXSGGOSA-N 0.000 description 1
- LGNBRHZANHMZHK-NUMRIWBASA-N Thr-Glu-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O LGNBRHZANHMZHK-NUMRIWBASA-N 0.000 description 1
- GKWNLDNXMMLRMC-GLLZPBPUSA-N Thr-Glu-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O GKWNLDNXMMLRMC-GLLZPBPUSA-N 0.000 description 1
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 1
- BIENEHRYNODTLP-HJGDQZAQSA-N Thr-Glu-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N)O BIENEHRYNODTLP-HJGDQZAQSA-N 0.000 description 1
- VULNJDORNLBPNG-SWRJLBSHSA-N Thr-Glu-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O VULNJDORNLBPNG-SWRJLBSHSA-N 0.000 description 1
- QQWNRERCGGZOKG-WEDXCCLWSA-N Thr-Gly-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O QQWNRERCGGZOKG-WEDXCCLWSA-N 0.000 description 1
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 1
- KBBRNEDOYWMIJP-KYNKHSRBSA-N Thr-Gly-Thr Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O KBBRNEDOYWMIJP-KYNKHSRBSA-N 0.000 description 1
- XOWKUMFHEZLKLT-CIQUZCHMSA-N Thr-Ile-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O XOWKUMFHEZLKLT-CIQUZCHMSA-N 0.000 description 1
- PAXANSWUSVPFNK-IUKAMOBKSA-N Thr-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N PAXANSWUSVPFNK-IUKAMOBKSA-N 0.000 description 1
- DDDLIMCZFKOERC-SVSWQMSJSA-N Thr-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N DDDLIMCZFKOERC-SVSWQMSJSA-N 0.000 description 1
- GMXIJHCBTZDAPD-QPHKQPEJSA-N Thr-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N GMXIJHCBTZDAPD-QPHKQPEJSA-N 0.000 description 1
- RRRRCRYTLZVCEN-HJGDQZAQSA-N Thr-Leu-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O RRRRCRYTLZVCEN-HJGDQZAQSA-N 0.000 description 1
- HOVLHEKTGVIKAP-WDCWCFNPSA-N Thr-Leu-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HOVLHEKTGVIKAP-WDCWCFNPSA-N 0.000 description 1
- RFKVQLIXNVEOMB-WEDXCCLWSA-N Thr-Leu-Gly Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N)O RFKVQLIXNVEOMB-WEDXCCLWSA-N 0.000 description 1
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 1
- KZSYAEWQMJEGRZ-RHYQMDGZSA-N Thr-Leu-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O KZSYAEWQMJEGRZ-RHYQMDGZSA-N 0.000 description 1
- ZSPQUTWLWGWTPS-HJGDQZAQSA-N Thr-Lys-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O ZSPQUTWLWGWTPS-HJGDQZAQSA-N 0.000 description 1
- SCSVNSNWUTYSFO-WDCWCFNPSA-N Thr-Lys-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O SCSVNSNWUTYSFO-WDCWCFNPSA-N 0.000 description 1
- JLNMFGCJODTXDH-WEDXCCLWSA-N Thr-Lys-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O JLNMFGCJODTXDH-WEDXCCLWSA-N 0.000 description 1
- XSEPSRUDSPHMPX-KATARQTJSA-N Thr-Lys-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O XSEPSRUDSPHMPX-KATARQTJSA-N 0.000 description 1
- YJVJPJPHHFOVMG-VEVYYDQMSA-N Thr-Met-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O YJVJPJPHHFOVMG-VEVYYDQMSA-N 0.000 description 1
- GYUUYCIXELGTJS-MEYUZBJRSA-N Thr-Phe-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O GYUUYCIXELGTJS-MEYUZBJRSA-N 0.000 description 1
- MUAFDCVOHYAFNG-RCWTZXSCSA-N Thr-Pro-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MUAFDCVOHYAFNG-RCWTZXSCSA-N 0.000 description 1
- PRTHQBSMXILLPC-XGEHTFHBSA-N Thr-Ser-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PRTHQBSMXILLPC-XGEHTFHBSA-N 0.000 description 1
- IVDFVBVIVLJJHR-LKXGYXEUSA-N Thr-Ser-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IVDFVBVIVLJJHR-LKXGYXEUSA-N 0.000 description 1
- BCYUHPXBHCUYBA-CUJWVEQBSA-N Thr-Ser-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O BCYUHPXBHCUYBA-CUJWVEQBSA-N 0.000 description 1
- NQQMWWVVGIXUOX-SVSWQMSJSA-N Thr-Ser-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NQQMWWVVGIXUOX-SVSWQMSJSA-N 0.000 description 1
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 1
- IEZVHOULSUULHD-XGEHTFHBSA-N Thr-Ser-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O IEZVHOULSUULHD-XGEHTFHBSA-N 0.000 description 1
- AAZOYLQUEQRUMZ-GSSVUCPTSA-N Thr-Thr-Asn Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O AAZOYLQUEQRUMZ-GSSVUCPTSA-N 0.000 description 1
- VGNLMPBYWWNQFS-ZEILLAHLSA-N Thr-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O VGNLMPBYWWNQFS-ZEILLAHLSA-N 0.000 description 1
- ZMYCLHFLHRVOEA-HEIBUPTGSA-N Thr-Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ZMYCLHFLHRVOEA-HEIBUPTGSA-N 0.000 description 1
- GRIUMVXCJDKVPI-IZPVPAKOSA-N Thr-Thr-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O GRIUMVXCJDKVPI-IZPVPAKOSA-N 0.000 description 1
- XGUAUKUYQHBUNY-SWRJLBSHSA-N Thr-Trp-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(O)=O XGUAUKUYQHBUNY-SWRJLBSHSA-N 0.000 description 1
- NLWDSYKZUPRMBJ-IEGACIPQSA-N Thr-Trp-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O NLWDSYKZUPRMBJ-IEGACIPQSA-N 0.000 description 1
- CURFABYITJVKEW-QTKMDUPCSA-N Thr-Val-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O CURFABYITJVKEW-QTKMDUPCSA-N 0.000 description 1
- 108091023040 Transcription factor Proteins 0.000 description 1
- 102000040945 Transcription factor Human genes 0.000 description 1
- 102100029898 Transcriptional enhancer factor TEF-1 Human genes 0.000 description 1
- PXYJUECTGMGIDT-WDSOQIARSA-N Trp-Arg-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O)=CNC2=C1 PXYJUECTGMGIDT-WDSOQIARSA-N 0.000 description 1
- HJTYJQVRIQXMHM-XIRDDKMYSA-N Trp-Asp-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N HJTYJQVRIQXMHM-XIRDDKMYSA-N 0.000 description 1
- WQYPAGQDXAJNED-AAEUAGOBSA-N Trp-Cys-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CS)C(=O)NCC(=O)O)N WQYPAGQDXAJNED-AAEUAGOBSA-N 0.000 description 1
- PTAWAMWPRFTACW-SZMVWBNQSA-N Trp-Gln-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N PTAWAMWPRFTACW-SZMVWBNQSA-N 0.000 description 1
- CZWIHKFGHICAJX-BPUTZDHNSA-N Trp-Glu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 CZWIHKFGHICAJX-BPUTZDHNSA-N 0.000 description 1
- WLBZWXXGSOLJBA-HOCLYGCPSA-N Trp-Gly-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 WLBZWXXGSOLJBA-HOCLYGCPSA-N 0.000 description 1
- IQXWAJUIAQLZNX-IHPCNDPISA-N Trp-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N IQXWAJUIAQLZNX-IHPCNDPISA-N 0.000 description 1
- RWAYYYOZMHMEGD-XIRDDKMYSA-N Trp-Leu-Ser Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 RWAYYYOZMHMEGD-XIRDDKMYSA-N 0.000 description 1
- TUUXFNQXSFNFLX-XIRDDKMYSA-N Trp-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N TUUXFNQXSFNFLX-XIRDDKMYSA-N 0.000 description 1
- ACGIVBXINJFALS-HKUYNNGSSA-N Trp-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N ACGIVBXINJFALS-HKUYNNGSSA-N 0.000 description 1
- 108010028230 Trp-Ser-His-Pro-Gln-Phe-Glu-Lys Proteins 0.000 description 1
- ITUAVBRBGKVBLH-BVSLBCMMSA-N Trp-Tyr-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N ITUAVBRBGKVBLH-BVSLBCMMSA-N 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- OOEUVMFKKZYSRX-LEWSCRJBSA-N Tyr-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N OOEUVMFKKZYSRX-LEWSCRJBSA-N 0.000 description 1
- KDGFPPHLXCEQRN-STECZYCISA-N Tyr-Arg-Ile Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDGFPPHLXCEQRN-STECZYCISA-N 0.000 description 1
- GFHYISDTIWZUSU-QWRGUYRKSA-N Tyr-Asn-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GFHYISDTIWZUSU-QWRGUYRKSA-N 0.000 description 1
- AYHSJESDFKREAR-KKUMJFAQSA-N Tyr-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AYHSJESDFKREAR-KKUMJFAQSA-N 0.000 description 1
- XMNDQSYABVWZRK-BZSNNMDCSA-N Tyr-Asn-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XMNDQSYABVWZRK-BZSNNMDCSA-N 0.000 description 1
- YGKVNUAKYPGORG-AVGNSLFASA-N Tyr-Asp-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YGKVNUAKYPGORG-AVGNSLFASA-N 0.000 description 1
- RCLOWEZASFJFEX-KKUMJFAQSA-N Tyr-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 RCLOWEZASFJFEX-KKUMJFAQSA-N 0.000 description 1
- XBWKCYFGRXKWGO-SRVKXCTJSA-N Tyr-Cys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(O)=O XBWKCYFGRXKWGO-SRVKXCTJSA-N 0.000 description 1
- QUILOGWWLXMSAT-IHRRRGAJSA-N Tyr-Gln-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QUILOGWWLXMSAT-IHRRRGAJSA-N 0.000 description 1
- HKYTWJOWZTWBQB-AVGNSLFASA-N Tyr-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HKYTWJOWZTWBQB-AVGNSLFASA-N 0.000 description 1
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 1
- FMOSEWZYZPMJAL-KKUMJFAQSA-N Tyr-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N FMOSEWZYZPMJAL-KKUMJFAQSA-N 0.000 description 1
- LHTGRUZSZOIAKM-SOUVJXGZSA-N Tyr-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O LHTGRUZSZOIAKM-SOUVJXGZSA-N 0.000 description 1
- GIOBXJSONRQHKQ-RYUDHWBXSA-N Tyr-Gly-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O GIOBXJSONRQHKQ-RYUDHWBXSA-N 0.000 description 1
- HIINQLBHPIQYHN-JTQLQIEISA-N Tyr-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HIINQLBHPIQYHN-JTQLQIEISA-N 0.000 description 1
- FNWGDMZVYBVAGJ-XEGUGMAKSA-N Tyr-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CC=C(C=C1)O)N FNWGDMZVYBVAGJ-XEGUGMAKSA-N 0.000 description 1
- WSFXJLFSJSXGMQ-MGHWNKPDSA-N Tyr-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N WSFXJLFSJSXGMQ-MGHWNKPDSA-N 0.000 description 1
- HFJJDMOFTCQGEI-STECZYCISA-N Tyr-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N HFJJDMOFTCQGEI-STECZYCISA-N 0.000 description 1
- JAGGEZACYAAMIL-CQDKDKBSSA-N Tyr-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=C(C=C1)O)N JAGGEZACYAAMIL-CQDKDKBSSA-N 0.000 description 1
- HSBZWINKRYZCSQ-KKUMJFAQSA-N Tyr-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O HSBZWINKRYZCSQ-KKUMJFAQSA-N 0.000 description 1
- YSGAPESOXHFTQY-IHRRRGAJSA-N Tyr-Met-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N YSGAPESOXHFTQY-IHRRRGAJSA-N 0.000 description 1
- UBKKNELWDCBNCF-STQMWFEESA-N Tyr-Met-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UBKKNELWDCBNCF-STQMWFEESA-N 0.000 description 1
- LRHBBGDMBLFYGL-FHWLQOOXSA-N Tyr-Phe-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LRHBBGDMBLFYGL-FHWLQOOXSA-N 0.000 description 1
- SZEIFUXUTBBQFQ-STQMWFEESA-N Tyr-Pro-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SZEIFUXUTBBQFQ-STQMWFEESA-N 0.000 description 1
- MNWINJDPGBNOED-ULQDDVLXSA-N Tyr-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=C(O)C=C1 MNWINJDPGBNOED-ULQDDVLXSA-N 0.000 description 1
- VSYROIRKNBCULO-BWAGICSOSA-N Tyr-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)O VSYROIRKNBCULO-BWAGICSOSA-N 0.000 description 1
- KUXCBJFJURINGF-PXDAIIFMSA-N Tyr-Trp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CC3=CC=C(C=C3)O)N KUXCBJFJURINGF-PXDAIIFMSA-N 0.000 description 1
- GPLTZEMVOCZVAV-UFYCRDLUSA-N Tyr-Tyr-Arg Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=C(O)C=C1 GPLTZEMVOCZVAV-UFYCRDLUSA-N 0.000 description 1
- AFWXOGHZEKARFH-ACRUOGEOSA-N Tyr-Tyr-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CC=C(O)C=C1 AFWXOGHZEKARFH-ACRUOGEOSA-N 0.000 description 1
- FZADUTOCSFDBRV-RNXOBYDBSA-N Tyr-Tyr-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=C(O)C=C1 FZADUTOCSFDBRV-RNXOBYDBSA-N 0.000 description 1
- IVOMOUWHDPKRLL-UHFFFAOYSA-N UNPD107823 Natural products O1C2COP(O)(=O)OC2C(O)C1N1C(N=CN=C2N)=C2N=C1 IVOMOUWHDPKRLL-UHFFFAOYSA-N 0.000 description 1
- 108090000848 Ubiquitin Proteins 0.000 description 1
- 102000044159 Ubiquitin Human genes 0.000 description 1
- TVWHNULVHGKJHS-UHFFFAOYSA-N Uric acid Natural products N1C(=O)NC(=O)C2NC(=O)NC21 TVWHNULVHGKJHS-UHFFFAOYSA-N 0.000 description 1
- UEOOXDLMQZBPFR-ZKWXMUAHSA-N Val-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N UEOOXDLMQZBPFR-ZKWXMUAHSA-N 0.000 description 1
- IZFVRRYRMQFVGX-NRPADANISA-N Val-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N IZFVRRYRMQFVGX-NRPADANISA-N 0.000 description 1
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 1
- LABUITCFCAABSV-BPNCWPANSA-N Val-Ala-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LABUITCFCAABSV-BPNCWPANSA-N 0.000 description 1
- NMANTMWGQZASQN-QXEWZRGKSA-N Val-Arg-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N NMANTMWGQZASQN-QXEWZRGKSA-N 0.000 description 1
- NMPXRFYMZDIBRF-ZOBUZTSGSA-N Val-Asn-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N NMPXRFYMZDIBRF-ZOBUZTSGSA-N 0.000 description 1
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 1
- OVLIFGQSBSNGHY-KKHAAJSZSA-N Val-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N)O OVLIFGQSBSNGHY-KKHAAJSZSA-N 0.000 description 1
- BWVHQINTNLVWGZ-ZKWXMUAHSA-N Val-Cys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N BWVHQINTNLVWGZ-ZKWXMUAHSA-N 0.000 description 1
- XEYUMGGWQCIWAR-XVKPBYJWSA-N Val-Gln-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N XEYUMGGWQCIWAR-XVKPBYJWSA-N 0.000 description 1
- UZDHNIJRRTUKKC-DLOVCJGASA-N Val-Gln-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N UZDHNIJRRTUKKC-DLOVCJGASA-N 0.000 description 1
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 1
- VVZDBPBZHLQPPB-XVKPBYJWSA-N Val-Glu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VVZDBPBZHLQPPB-XVKPBYJWSA-N 0.000 description 1
- FOADDSDHGRFUOC-DZKIICNBSA-N Val-Glu-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N FOADDSDHGRFUOC-DZKIICNBSA-N 0.000 description 1
- OQWNEUXPKHIEJO-NRPADANISA-N Val-Glu-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N OQWNEUXPKHIEJO-NRPADANISA-N 0.000 description 1
- JTWIMNMUYLQNPI-WPRPVWTQSA-N Val-Gly-Arg Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N JTWIMNMUYLQNPI-WPRPVWTQSA-N 0.000 description 1
- PMDOQZFYGWZSTK-LSJOCFKGSA-N Val-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C PMDOQZFYGWZSTK-LSJOCFKGSA-N 0.000 description 1
- PYPZMFDMCCWNST-NAKRPEOUSA-N Val-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N PYPZMFDMCCWNST-NAKRPEOUSA-N 0.000 description 1
- FTKXYXACXYOHND-XUXIUFHCSA-N Val-Ile-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O FTKXYXACXYOHND-XUXIUFHCSA-N 0.000 description 1
- JZWZACGUZVCQPS-RNJOBUHISA-N Val-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N JZWZACGUZVCQPS-RNJOBUHISA-N 0.000 description 1
- BMOFUVHDBROBSE-DCAQKATOSA-N Val-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N BMOFUVHDBROBSE-DCAQKATOSA-N 0.000 description 1
- RFKJNTRMXGCKFE-FHWLQOOXSA-N Val-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC(C)C)C(O)=O)=CNC2=C1 RFKJNTRMXGCKFE-FHWLQOOXSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/80—Vectors or expression systems specially adapted for eukaryotic hosts for fungi
- C12N15/81—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/46—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
- C07K14/47—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
- C07K14/4701—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
- C07K14/4732—Casein
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/575—Hormones
- C07K14/65—Insulin-like growth factors, i.e. somatomedins, e.g. IGF-1, IGF-2
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/76—Albumins
- C07K14/77—Ovalbumin
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/78—Connective tissue peptides, e.g. collagen, elastin, laminin, fibronectin, vitronectin or cold insoluble globulin [CIG]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/52—Genes encoding for enzymes or proenzymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/62—DNA sequences coding for fusion proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/0004—Oxidoreductases (1.)
- C12N9/0071—Oxidoreductases (1.) acting on paired donors with incorporation of molecular oxygen (1.14)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/48—Hydrolases (3) acting on peptide bonds (3.4)
- C12N9/50—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
- C12N9/503—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from viruses
- C12N9/506—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from viruses derived from RNA viruses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/90—Isomerases (5.)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y114/00—Oxidoreductases acting on paired donors, with incorporation or reduction of molecular oxygen (1.14)
- C12Y114/11—Oxidoreductases acting on paired donors, with incorporation or reduction of molecular oxygen (1.14) with 2-oxoglutarate as one donor, and incorporation of one atom each of oxygen into both donors (1.14.11)
- C12Y114/11002—Procollagen-proline dioxygenase (1.14.11.2), i.e. proline-hydroxylase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y207/00—Transferases transferring phosphorus-containing groups (2.7)
- C12Y207/11—Protein-serine/threonine kinases (2.7.11)
- C12Y207/11001—Non-specific serine/threonine protein kinase (2.7.11.1), i.e. casein kinase or checkpoint kinase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y304/00—Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
- C12Y304/22—Cysteine endopeptidases (3.4.22)
- C12Y304/22044—Nuclear-inclusion-a endopeptidase (3.4.22.44)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y503/00—Intramolecular oxidoreductases (5.3)
- C12Y503/04—Intramolecular oxidoreductases (5.3) transposing S-S bonds (5.3.4)
- C12Y503/04001—Protein disulfide-isomerase (5.3.4.1), i.e. disufide bond-forming enzyme
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/50—Fusion polypeptide containing protease site
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/22—Vectors comprising a coding region that has been codon optimised for expression in a respective host
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Engineering & Computer Science (AREA)
- Zoology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Wood Science & Technology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Microbiology (AREA)
- Medicinal Chemistry (AREA)
- Plant Pathology (AREA)
- Physics & Mathematics (AREA)
- Mycology (AREA)
- Gastroenterology & Hepatology (AREA)
- Toxicology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Virology (AREA)
- Diabetes (AREA)
- Endocrinology (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Peptides Or Proteins (AREA)
Abstract
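The disclosure herein includes methods and compositions for making proteins in a peroxisome, and methods of making cells for producing proteins in a peroxisome. Also disclosed herein are cells for producing proteins in a peroxisome and methods of producing proteins in eukaryotic cells containing peroxisomes as described herein.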
Description
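Cross-Reference to Related Applications
This application claims the benefit of U.S. Provisional Application No. 62/847,769, filed May 14, 2019, which is incorporated herein by reference in its entirety.
Reference to Sequence Listing and Tables in Electronic Format
This application is filed together with an electronic sequence listing with the file name PBFAB001WO2SEQLIST.TXT, 235 KB in size and created on May 7, 2020. The information in this electronic sequence listing is incorporated herein by reference in its entirety.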
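Field
Provided herein are methods and compositions for genetically modifying cells to produce proteins and protein precursors that can be used, for example, in artificial materials.
There is a need in the art for improved methods of producing and modifying proteins in cells. Proteins produced and modified in cells are used in a variety of ways.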
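Described herein are methods of producing proteins that can serve as precursors for materials such as substrates for film-developing products; capsules for pills (gelatin in drugs and nutraceuticals); food additives (for example, all gelatin substances); and collagen for food. Synthetic meat, textiles such as synthetic leather, beauty products, and biomedical materials (scaffolds, sutures, grafts, expanding cells, gels, and the like) are also contemplated. Use of these methods may also provide materials with a lower product carbon footprint than the standard manufacturing methods currently in use.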
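Protein precursors that can be used in the production of materials are contemplated. For example, next-generation textiles are contemplated, such as artificially fabricated textiles made using cell engineering and tissue engineering techniques, with lower greenhouse gas emissions than conventionally produced textiles.
The protein precursors can be used as collagen-derived products, which can be found, for example, in facial creams, injectable drugs, and wound dressings.
Provided herein are methods and compositions for genetically modifying cells to produce proteins and protein precursors that can be used, for example, in artificial materials.
Some embodiments provided herein relate to methods and compositions for making genetically modified cells for producing modified proteins in a peroxisome. The modified proteins described herein can be used as building blocks for producing materials such as textiles, artificial skin, or other materials. Production of proteins found in some textiles is contemplated for use in cell production systems.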
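Some embodiments provided herein relate to methods of making a cell that produces a modified protein in a peroxisome. In some embodiments, the method comprises providing a cell, introducing a first nucleic acid into the cell, and introducing a second nucleic acid into the cell. In some embodiments, the first nucleic acid comprises a first sequence encoding a heterologous protein fused to a peroxisome-targeting sequence. In some embodiments, the second nucleic acid comprises a second sequence encoding a heterologous modification enzyme fused to a peroxisome-targeting sequence. In some embodiments, the cell is a bacterial or archaeal cell. In some embodiments, the cell is a eukaryotic cell. In some embodiments, the cell is a yeast cell. In some embodiments, the cell is selected from the genera Arxula, Candida, Hansenula, Kluyveromyces, Komagataella, Ogataea, Pichia, Saccharomyces, or Yarrowia. In some embodiments, the first nucleic acid and/or the second nucleic acid comprises promoter(s). In some embodiments, the promoter is constitutive or inducible. In some embodiments, the peroxisome-targeting sequence comprises a sequence set forth in SEQ ID NO: 1 (SLK), SEQ ID NO: 2 (RLXXXXX(H/Q)L), or SEQ ID NO: 3 (LGRGRRSKL). In some embodiments, the protein comprises a tag. In some embodiments, the tag is cleavable. In some embodiments, the method further comprises introducing a third nucleic acid into the cell. In some embodiments, the third nucleic acid comprises a third sequence encoding a second heterologous modification enzyme fused to a peroxisome-targeting sequence. In some embodiments, the heterologous protein has a molecular weight of 1 Da, 5 Da, 10 Da, 20 Da, 30 Da, 40 Da, 50 Da, 60 Da, 70 Da, 80 Da, 90 Da, 100 Da, 200 Da, 300 Da, 400 Da, 500 Da, 600 Da, 700 Da, 800 Da, 900 Da, 1 kDa, 5 kDa, 10 kDa, 20 kDa, 30 kDa, 40 kDa, 50 kDa, 60 kDa, 70 kDa, 80 kDa, 90 kDa, 100 kDa, 110 kDa, 120 kDa, 130 kDa, 140 kDa, 150 kDa, 160 kDa, 170 kDa, 180 kDa, 190 kDa, 200 kDa, 210 kDa, 220 kDa, 230 kDa, 240 kDa, 250 kDa, 260 kDa, 270 kDa, 280 kDa, 290 kDa, or 300 kDa, or any size within a range defined by any two of the foregoing values. In some embodiments, the enzyme produces a modification. In some embodiments, the modification is folding of the protein. In some embodiments, the protein is not folded. In some embodiments, the modification is protein folding, hydroxylation, glycosylation, oxidation, and/or isomerization. In some embodiments, the enzyme is a prolyl hydroxylase, glycosyltransferase, lysyl oxidase, protein chaperone, or prolyl isomerase. In some embodiments, the enzyme is a glycosyltransferase, prolyl isomerase, protein disulfide isomerase, hydroxyl transferase, or prolyl hydroxylase. In some embodiments, the protein comprises collagen, gelatin, or a silk protein. In some embodiments, the enzyme is a glycosyltransferase, prolyl hydroxylase, or prolyl isomerase. In some embodiments, the protein is collagen, and the collagen is modified to form type I heterotrimer, type 1 alpha homotrimer, or type III homotrimer collagen. In some embodiments, the collagen comprises Col1A1 or Col1A2. In some embodiments, the prolyl-4-hydroxylase is genetically modified to have a deletion of the PDI domain. In some embodiments, the enzyme is genetically modified for improved expression and import into the peroxisome.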
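As an illustration only, the three peroxisome-targeting sequences named in this paragraph (SLK, RLXXXXX(H/Q)L, and LGRGRRSKL) can be checked against a candidate amino acid sequence with a few lines of Python. This is a minimal sketch, not the applicants' method: the function name `find_targeting_sequences` and the example sequences are hypothetical, reading X as "any residue" is an assumption, and so is matching SEQ ID NO: 1 and SEQ ID NO: 3 only at the C-terminus while matching SEQ ID NO: 2 anywhere in the sequence.

```python
import re
from typing import List

# Motifs exactly as written in the text. Matching these two only at the
# C-terminus is an assumption made for this sketch.
C_TERMINAL_TAGS = {"SEQ ID NO: 1": "SLK", "SEQ ID NO: 3": "LGRGRRSKL"}

# SEQ ID NO: 2, RLXXXXX(H/Q)L, with X read as any single residue (assumption).
SEQ2_PATTERN = re.compile(r"RL.{5}[HQ]L")


def find_targeting_sequences(protein: str) -> List[str]:
    """Return the labels of the targeting motifs found in a one-letter sequence."""
    seq = protein.upper()
    hits = [label for label, tag in C_TERMINAL_TAGS.items() if seq.endswith(tag)]
    if SEQ2_PATTERN.search(seq):
        hits.append("SEQ ID NO: 2")
    return hits


# Hypothetical example: a short peptide ending in SLK.
print(find_targeting_sequences("MGPPGAKGESLK"))
```

A check like this only confirms that a motif is present in a designed construct; it says nothing about whether the fusion is actually imported into the peroxisome.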
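In some embodiments, the protein is genetically modified for improved expression and import into the peroxisome. In some embodiments, the nucleic acid is codon optimized for protein expression in a eukaryotic cell, such as a yeast cell. In some embodiments, fusion of the heterologous protein to the peroxisome-targeting sequence results in targeting of the heterologous protein to the peroxisome, thereby separating the heterologous protein from enzymes that are not targeted to the peroxisome. In some embodiments, fusion of the modification enzyme to the peroxisome-targeting sequence results in targeting of the modification enzyme to the peroxisome, thereby separating the modification enzyme from substrates or enzymes that are not targeted to the peroxisome. In some embodiments, the heterologous protein comprises COLsyn1, COLsyn2, COLsyn3, COLsyn4, or an amino acid sequence at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identical to the amino acid sequence of COLsyn1, COLsyn2, COLsyn3, or COLsyn4. In some embodiments, the first nucleic acid is engineered so that at least one hydrophobic amino acid in the heterologous protein is replaced with a hydrophilic or non-hydrophobic amino acid, as compared to an unmodified or naturally occurring first nucleic acid.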
Some embodiments provided herein relate to a eukaryotic cell for producing a protein in a peroxisome, made by any of the methods provided herein.
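Some embodiments provided herein relate to a eukaryotic cell for producing a protein in a peroxisome. In some embodiments, the cell comprises a first nucleic acid comprising a sequence encoding a heterologous protein fused to a peroxisome-targeting sequence, and a second nucleic acid encoding a heterologous modification enzyme fused to a peroxisome-targeting sequence.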
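Some embodiments provided herein relate to a eukaryotic cell comprising a peroxisome for producing a modified protein. In some embodiments, the eukaryotic cell can express a heterologous protein fused to a peroxisome-targeting sequence and a heterologous modification enzyme fused to a peroxisome-targeting sequence. In some embodiments, the protein is modified in the peroxisome. In some embodiments, the cell is Pichia pastoris. In some embodiments, the peroxisome-targeting sequence comprises a sequence set forth in SEQ ID NO: 1, 2, or 3. In some embodiments, the cell further comprises a third nucleic acid encoding a second protein fused to a peroxisome-targeting sequence.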
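Some embodiments provided herein relate to methods of producing a modified protein in a eukaryotic cell that contains a peroxisome. In some embodiments, the eukaryotic cell expresses a heterologous modification enzyme fused to a peroxisome-targeting sequence. In some embodiments, the method includes providing a cell made by a method described herein, or a cell of any one of the alternatives described herein, expressing a heterologous protein in the eukaryotic cell, and culturing the eukaryotic cell under conditions in which the heterologous modification enzyme modifies the heterologous protein in the peroxisome to produce the modified protein. In some embodiments, the heterologous protein is fused to a peroxisome-targeting sequence. In some embodiments, the method further comprises increasing the cargo of the peroxisome. In some embodiments, increasing the cargo of the peroxisome is performed by providing oleic acid or methanol to the eukaryotic cell.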
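Some embodiments provided herein relate to methods of producing a modified protein in a eukaryotic cell that contains a peroxisome. In some embodiments, the eukaryotic cell expresses a heterologous modification enzyme fused to a peroxisome-targeting sequence. In some embodiments, the method includes expressing a heterologous protein in the eukaryotic cell and culturing the eukaryotic cell under conditions in which the heterologous modification enzyme modifies the heterologous protein in the peroxisome to produce the modified protein. In some embodiments, the heterologous protein is fused to a peroxisome-targeting sequence. In some embodiments, the method further comprises increasing the cargo of the peroxisome. In some embodiments, increasing the cargo of the peroxisome is performed by providing oleic acid or methanol to the eukaryotic cell.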
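Some embodiments provided herein relate to methods of producing a modified protein. In some embodiments, the method includes culturing a peroxisome-containing eukaryotic cell under conditions in which the modified protein is produced. In some embodiments, the eukaryotic cell expresses a heterologous protein fused to a peroxisome-targeting sequence and a heterologous modification enzyme fused to a peroxisome-targeting sequence. In some embodiments, the heterologous modification enzyme modifies the heterologous protein in the peroxisome under the culture conditions to produce the modified protein. In some embodiments, the method further comprises increasing the cargo of the peroxisome. In some embodiments, increasing the cargo of the peroxisome is performed by providing oleic acid or methanol to the eukaryotic cell.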
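Some embodiments provided herein relate to methods of increasing the yield of a modified protein. In some embodiments, the method includes culturing a eukaryotic cell that contains a peroxisome under conditions in which the modified protein is produced. In some embodiments, the eukaryotic cell expresses a heterologous protein fused to a peroxisome-targeting sequence and a heterologous modification enzyme fused to a peroxisome-targeting sequence. In some embodiments, expression of the heterologous protein is under the control of a promoter. In some embodiments, the heterologous modification enzyme modifies the heterologous protein in the peroxisome under the culture conditions to produce the modified protein, and production of the heterologous protein is induced by addition of a chemical inducer. In some embodiments, the method further comprises increasing the cargo of the peroxisome. In some embodiments, increasing the cargo of the peroxisome is performed by providing oleic acid or methanol to the eukaryotic cell.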
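Some embodiments relate to a kit for producing a modified protein in a peroxisome of a cell. In some embodiments, the kit comprises a first nucleic acid construct comprising GFP-x-ePTS1 or x-FLAG-ePTS1, and a second nucleic acid construct comprising GFP-y-ePTS1 or y-FLAG-ePTS1. In some embodiments, x is a nucleic acid sequence encoding a heterologous protein to be targeted to the peroxisome. In some embodiments, y is a nucleic acid sequence encoding a modification enzyme to be targeted to the peroxisome. In some embodiments, the modification enzyme is an enzyme capable of modifying the heterologous protein in the peroxisome.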
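FIG. 1 is a schematic diagram illustrating an example of directing a protein and an enzyme to a peroxisome of a cell.
FIG. 2 is a schematic diagram of fermentation of genetically modified yeast and purification of a post-translationally modified protein according to some embodiments.
FIG. 3 shows microscopy images of Saccharomyces cerevisiae (S. cerevisiae) strains expressing the fusion proteins in a wild-type background (top row) or a background with the PEX5 gene deleted (bottom row). The fusions comprise an N-terminal GFP and a C-terminal ePTS1 fused to a synthetic collagen peptide or a collagen modification enzyme.
FIG. 4 shows fluorescence localization of collagen variants fused to GFP and a C-terminal ePTS1 in strains PB000095, PB000163, and PB000297, which represent the different industrial yeast hosts PBH001, PBH002, and PBH004, respectively.
FIG. 5 shows colony growth of serially diluted strains on YPD or YP galactose plates. The strains express GAL-SigD1-351-ePTS1 (top) or GAL-SigD1-351 (bottom).
FIG. 6 shows western blot images of the activity of peroxisome-localized TEV-FLAG-ePTS1 protease on a peroxisome-localized RFP-tev-TFP-ePTS1 substrate (A) or on a cytosolic RFP-tev-YFP substrate (B). TEV protease expression was controlled by the following constitutive or inducible promoters and growth conditions: (1) pTEF1, (2) pRPL18B, (3) pGAL1 repressed by dextrose, (4) pGAL1 repressed by raffinose and dextrose, and (5) pGAL1 induced by raffinose and galactose. The western blots were probed with an anti-tRFP antibody that recognizes the full-length 54 kDa substrate or the 27 kDa cleavage product.
FIG. 7 shows Bant P4H hydroxylase activity on collagen in the peroxisome. Panel A lists the strains. Bant P4H is expressed from the TDH3 promoter and the collagen substrate is expressed from the TEF1 promoter. Panel B shows an alignment, made with Geneious software, of the collagen substrates from each strain. The alignment shows that 1. PB000224, 2. PB000248, and 3. PB000249 share one sequence (SEQ ID NO: 71), and that 4. PB000225, 5. PB000254, and 6. PB000255 share another sequence (SEQ ID NO: 72). Gray boxes beneath the amino acids mark proline positions found by LCMSMS to be oxidized. Panel C gives details of the LCMSMS results for each modification site.
FIG. 8 shows in vivo fluorescence localization in Saccharomyces cerevisiae of ePTS1-tagged full-length collagen AmCol1A or AmCol1A2 fused to a GFP tag, and of ePTS1-tagged BantP4H hydroxylase fused to an mRuby tag. Images are shown in separate FITC and TexasRed channels for detection of GFP and mRuby, respectively. The merged image overlays the FITC and TexasRed channels and indicates co-localization of the two proteins.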
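Definitions
The titles, headings, and subheadings provided herein should not be construed as limiting the various aspects of the disclosure. Accordingly, the terms defined immediately below are more fully defined by reference to the specification in its entirety.
Unless otherwise defined, scientific and technical terms used in connection with the present disclosure shall have the meanings commonly understood by those of ordinary skill in the art. Further, unless otherwise required by context, singular terms shall include the plural and plural terms shall include the singular.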
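The use of "or" herein means "and/or" unless stated otherwise. In the context of multiple dependent claims, the use of "or" refers back to more than one preceding independent or dependent claim in the alternative only. In addition, terms such as "element" or "component" encompass both elements and components comprising one unit and elements and components comprising more than one subunit, unless specifically stated otherwise.
It should be noted that, as used in this specification and the appended claims, the singular forms and any singular use of any word include plural referents unless expressly and unequivocally limited to one referent. As used herein, the term "include" and its grammatical variants are intended to be non-limiting, such that recitation of items in a list does not exclude other like items that can be substituted for or added to the listed items.
Any concentration range, percentage range, ratio range, or integer range described herein is to be understood to include the value of any integer within the recited range and, when appropriate, fractions thereof (such as one tenth and one hundredth of an integer), unless otherwise indicated.
Units, prefixes, and symbols are given in the form accepted by the International System of Units (SI). Numeric ranges are inclusive of the numbers defining the range. Measured values are understood to be approximate, taking into account significant digits and the error associated with the measurement.
As used in accordance with the present disclosure, the following terms shall be understood to have the following meanings unless otherwise indicated: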
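The term "about" as used herein refers to a numeric value, including, for example, whole numbers, fractions, and percentages, whether or not explicitly indicated. The term "about" generally refers to a range of numeric values (e.g., +/- 5 to 10% of the recited value) that one of ordinary skill in the art would consider equivalent to the recited value (e.g., having the same function or result). When terms such as "at least" and "about" precede a list of numeric values or ranges, the terms modify all of the values or ranges in the list. In some instances, the term "about" can include numeric values rounded to the nearest significant figure.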
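A "peroxisome" has its plain and ordinary meaning when read in light of the specification and can include, without limitation, an organelle for the catabolism of very long chain fatty acids, branched chain fatty acids, D-amino acids, and polyamines, for the reduction of reactive oxygen species, and for the biosynthesis of plasmalogens (i.e., ether phospholipids important for the normal function of mammalian brain and lungs). In some yeasts, peroxisomes can also function in the glyoxylate cycle, glycolysis, and methanol and/or amine oxidation and assimilation. Peroxisomes can also have their own native enzymes. Without limitation, these enzymes can include, for example, catalase as well as oxidative enzymes such as D-amino acid oxidase and uric acid oxidase. In the embodiments herein, the peroxisome can function to make a protein or to modify a protein.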
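A "modification" of a protein has its plain and ordinary meaning when read in light of the specification. Without limitation, a modification can include a change in the primary, secondary, tertiary, or quaternary structure of a protein, such as the addition of a covalent modification, folding of the protein, assembly of the protein into the quaternary structure of a multi-subunit complex, and post-translational modification. Modifications other than prolyl hydroxylation can also be achieved in the peroxisome. Peroxisomes are naturally permeable to many small molecules that serve as substrates for modifying enzymes. In fact, peroxisomes have been determined to have a size gating under which molecules smaller than about 700 daltons can diffuse freely into the organelle. Substrates that cannot diffuse freely into the peroxisome must be transported. Transport can occur, specifically or promiscuously, through membrane proteins targeted to the peroxisomal membrane.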
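"Nucleic acid" or "nucleic acid molecule" refers to polynucleotides, such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), oligonucleotides, fragments generated by the polymerase chain reaction (PCR), and fragments generated by any of ligation, scission, endonuclease action, and exonuclease action. Nucleic acid molecules can be composed of monomers that are naturally occurring nucleotides (such as those of DNA and RNA), analogs of naturally occurring nucleotides (e.g., enantiomeric forms of naturally occurring nucleotides), or a combination of both. Modified nucleotides can have alterations in sugar moieties and/or in pyrimidine or purine base moieties. Sugar modifications include, for example, replacement of one or more hydroxyl groups with halogens, alkyl groups, amines, or azido groups, or the sugars can be functionalized as ethers or esters. Moreover, the entire sugar moiety can be replaced with sterically and electronically similar structures, such as aza-sugars and carbocyclic sugar analogs. Examples of base moiety modifications include alkylated purines and pyrimidines, acylated purines or pyrimidines, or other well-known heterocyclic substitutes. Nucleic acid monomers can be linked by phosphodiester bonds or analogs of such linkages. Analogs of phosphodiester linkages include phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoranilidate, phosphoramidate, and the like. The term "nucleic acid molecule" also includes so-called "peptide nucleic acids," which comprise naturally occurring or modified nucleic acid bases attached to a polyamide backbone. Nucleic acids can be single stranded or double stranded. In some alternatives, a nucleic acid sequence comprising a sequence encoding a heterologous protein fused to a peroxisome-targeting sequence is provided. In some alternatives, the nucleic acid is RNA or DNA.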
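"Eukaryotic" cells include, but are not limited to, algal cells, fungal cells (e.g., yeast), plant cells, animal cells, mammalian cells, and human cells (e.g., T-cells). In some embodiments, the cell is selected from the methylotrophic yeast genera consisting of Komagataella, Pichia, Hansenula, and Ogataea. In some embodiments, the cell is selected from the additional budding yeast genera Arxula, Candida, Kluyveromyces, Saccharomyces, and Yarrowia.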
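A "bacterial cell" has its plain and ordinary meaning when read in light of the specification. A bacterial cell is surrounded by a cell membrane made primarily of phospholipids. This membrane encloses the contents of the cell and acts as a barrier that holds the nutrients, proteins, and other essential components of the cytoplasm within the cell. Unlike eukaryotic cells, however, bacteria generally lack large membrane-bound structures in their cytoplasm, such as the nucleus, mitochondria, chloroplasts, and other organelles present in eukaryotic cells. Bacteria used for protein expression can include, for example, E. coli.
"Archaea" has its plain and ordinary meaning when read in light of the specification. Archaea, or archaebacteria, can survive in extreme environments, such as the sea floor around extremely hot hydrothermal vents. Archaea and bacteria are very similar: both are single-celled prokaryotes with a cell wall and a cell membrane. The main differences between them are their chemical structure and where they live. Examples include, but are not limited to, thermophiles, halophiles, and methanogens.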
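A "promoter" has its plain and ordinary meaning when read in light of the specification and can include, for example, a nucleotide sequence that directs the transcription of a structural gene. In some alternatives, the promoter is located in the 5' non-coding region of a gene, proximal to the transcriptional start site of the structural gene. Sequence elements within promoters that function in the initiation of transcription are often characterized by consensus nucleotide sequences. These promoter elements include RNA polymerase binding sites, TATA sequences, CAAT sequences, differentiation-specific elements (DSEs; McGehee et al., Mol. Endocrinol. 7:551 (1993); incorporated by reference in its entirety), cyclic AMP response elements (CREs), serum response elements (SREs; Treisman, Seminars in Cancer Biol. 1:47 (1990); incorporated by reference in its entirety), glucocorticoid response elements (GREs), and binding sites for other transcription factors, such as CRE/ATF (O'Reilly et al., J. Biol. Chem. 267:19938 (1992); incorporated by reference in its entirety), AP2 (Ye et al., J. Biol. Chem. 269:25728 (1994); incorporated by reference in its entirety), SP1, cAMP response element binding protein (CREB; Loeken, Gene Expr. 3:253 (1993); incorporated by reference in its entirety), and octamer factors (see, in general, Watson et al., eds., Molecular Biology of the Gene, 4th ed. (The Benjamin/Cummings Publishing Company, Inc. 1987; incorporated by reference in its entirety), and Lemaigre and Rousseau, Biochem. J. 303:1 (1994); incorporated by reference in its entirety). As used herein, a promoter can be constitutively active, repressible, or inducible. If a promoter is an inducible promoter, the rate of transcription increases in response to an inducing agent. In contrast, the rate of transcription is not regulated by an inducing agent if the promoter is a constitutive promoter. In some embodiments herein, the provided nucleic acid comprises a promoter sequence. In some embodiments, the promoter is a yeast promoter for protein expression. In some embodiments, the cell is Pichia, and the promoter comprises the methanol-inducible promoter PAOX1 or the constitutive promoter PGAP. In some embodiments, the promoter comprises pAOX, pGal, pCup, pGEM, or pZPM.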
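A "peroxisomal targeting signal" (PTS) has its plain and ordinary meaning when read in light of the specification and can include, for example, a region of a peroxisomal protein that a receptor recognizes and binds. Proteins containing such motifs are localized to the peroxisome. In some embodiments herein, a nucleic acid comprising a sequence encoding a protein operably linked to a PTS is provided.
A "protein tag" or "tag" has its plain and ordinary meaning when read in light of the specification and can include, for example, a peptide sequence genetically grafted onto a recombinant protein. Such tags can often be removed by chemical agents or by enzymatic means, such as proteolysis or intein splicing. Tags are attached to proteins for various purposes, for example as affinity tags for purification or as tags for solubilization. A tag can also be added to a protein or enzyme for protein stability while it is in the peroxisome. In some embodiments herein, the protein expressed for modification in the peroxisome comprises a tag. In some embodiments, the tag is selected from the group consisting of histidine (e.g., HIS6), maltose-binding protein, GST, FLAG, an Fc domain, and Strep-tag.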
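A "protein" has its plain and ordinary meaning when read in light of the specification and can include, for example, a macromolecule comprising one or more polypeptide chains. A protein can therefore be composed of peptides, which are chains of amino acid monomers, formed from any one or more amino acids, linked by peptide (amide) bonds. A protein or peptide can contain at least two amino acids, and there is no limit on the maximum number of amino acids that can make up a protein or peptide sequence. Without limitation, the amino acids are, for example, arginine, histidine, lysine, aspartic acid, glutamic acid, serine, threonine, asparagine, glutamine, cysteine, cystine, glycine, proline, alanine, valine, hydroxyproline, isoleucine, leucine, pyrrolysine, methionine, phenylalanine, tyrosine, tryptophan, ornithine, S-adenosylmethionine, and selenocysteine. A protein can also include non-natural amino acids. In some embodiments, incorporation of a non-natural amino acid is performed by amber codon suppression. A protein can also comprise non-peptide components, such as carbohydrate groups. Carbohydrates and other non-peptide substituents can be added to a protein by the cell in which the protein is produced, and will vary with the type of cell. Proteins are defined herein in terms of their amino acid backbone structures; substituents such as carbohydrate groups are generally not specified but can be present nonetheless. In some alternatives described herein, methods of making a modified protein in a peroxisome are provided. In some embodiments, the modified protein comprises collagen, gelatin, or a silk protein. For some textiles, proteins such as globulin-like proteins, keratin, collagen hydrolysates, collagen peptides, and collagen are also contemplated.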
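"Collagen" has its plain and ordinary meaning when read in light of the specification and can include, for example, a structural protein found in skin and other connective tissues. In some embodiments herein, collagen is modified in the peroxisome.
"Gelatin" has its plain and ordinary meaning when read in light of the specification and can include, for example, a water-soluble protein prepared from collagen. In some embodiments, gelatin is provided for modification in the peroxisome.
An "isomerase" has its plain and ordinary meaning when read in light of the specification and can include, for example, an enzyme that catalyzes the conversion of a specified compound to an isomer. Those skilled in the art will appreciate that there are many types of isomerases, such as racemases, epimerases, cis-trans isomerases, and intramolecular transferases.
A "hydroxyl transferase" has its plain and ordinary meaning when read in light of the specification and can include, for example, enzymes such as prolyl hydroxylase and lysyl oxidase.
A "glycosyltransferase" has its plain and ordinary meaning when read in light of the specification and can include, for example, an enzyme that forms glycosidic bonds.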
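Those skilled in the art will recognize that gene expression levels depend on many factors, such as promoter sequences and regulatory elements. Another factor for maximizing protein expression is adapting the codons of the transcribed gene to the typical codon usage of the host. As noted for most bacteria and yeast cells, for example, a small subset of codons is recognized by the tRNA species, leading to translational selection, which can be an important limit on protein expression. In this respect, many synthetic genes can be designed to increase the protein expression level. The codon optimization design process can consist of changing rare codons to codons known to increase protein expression efficiency. In some alternatives, codon selection is described, wherein the codon selection is performed using algorithms known to those skilled in the art to create synthetic gene transcripts optimized for higher levels of transcription and protein yield. Programs containing codon optimization algorithms are known to those skilled in the art. Such programs can include, for example, OptimumGeneTM, GeneGPS® algorithms, and the like. In addition, synthetic codon-optimized sequences can be obtained commercially, for example from Integrated DNA Technologies and other commercially available DNA synthesis services. In some alternatives, the protein is prepared such that the gene for the modified protein is codon optimized for expression in yeast, for example Pichia. In some alternatives, a protein or enzyme is described, wherein the gene for the complete gene transcript of the protein or enzyme is codon optimized for expression in a eukaryotic cell, such as yeast, which can increase the concentration of protein available for modification in the yeast peroxisome.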
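"Purification" has its plain and ordinary meaning when read in light of the specification and can include, for example, the isolation of highly purified cells, peroxisomes, and proteins. In methods of cell purification, cells can be isolated, separated, or selected by their ability to bind a ligand attached to a support, such as a plastic or polycarbonate surface, a bead, a particle, a plate, or a well. Cells can bind on the basis of particular cell surface markers, which makes purification possible. For peroxisomes, those skilled in the art will understand peroxisome purification methods such as centrifugation. Proteins can also be purified. Protein purification methods, such as size exclusion and affinity chromatography, are known to those skilled in the art.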
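Textiles and accessories are consumer goods that are purchased often and replaced most frequently. Moreover, most clothing is not long-lasting and requires frequent replacement. Because of its high turnover, mass production, and energy-intensive use, clothing is an important product category in terms of resource consumption and greenhouse gas emissions.
To address the problems associated with clothing manufacture, several areas will have to be tackled, such as the carbon footprint of clothing and accessories. A carbon footprint can be described as the total set of greenhouse gas emissions caused by an organization, event, product, or person. Described herein are methods and cells for lowering the carbon footprint associated with textile production. For example, the carbon footprint of a clothing item is the total amount of carbon dioxide (CO2) and other greenhouse gases emitted over the life cycle of the item, expressed in kilograms of CO2 equivalents. This includes all greenhouse gases generated in producing the raw materials, manufacturing the item, transporting materials and finished goods, packaging, the use phase with its many washing and drying cycles, and end-of-life disposal.
Protein precursors for other materials are also contemplated. Proteins produced by the cells can be precursors for several materials, such as film development products; capsules for pills (gelatin for drugs and nutraceuticals); food additives (e.g., any gelatin material); and collagen for food. Synthetic meat, synthetic leather, cosmetic products, and biomedical materials (scaffolds, sutures, grafts, expanded cells, gels, and the like) are also contemplated.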
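To eliminate the problems associated with a high carbon footprint, methods of making precursors for producing textiles are described. As described in the embodiments herein, there are methods of making modified proteins within an organelle such as a peroxisome. Peroxisomes are common, multifunctional organelles known mainly for their role in cellular lipid metabolism. Peroxisomes contain peroxisomal enzymes that can catalyze redox reactions as part of their normal function, and these organelles are increasingly recognized as potential regulators of oxidative stress-related signaling pathways.
For processing within the peroxisome, a protein can be directed to the peroxisome by a signal sequence. A sequence encoding the signal sequence can be operably linked to the sequence encoding the protein. Accordingly, after translation, the protein is transported to the peroxisome.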
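Peroxisomes have been well described since their discovery in 1965 (Sabatini et al.; PNAS August 13, 2013. 110 (33) 13234-13235, and Purdue et al.; Annu. Rev. Cell Dev. Biol. 2001. 17:701-52; each incorporated herein by reference in its entirety). Peroxisomes are small organelles that lack DNA and ribosomes and are bounded by a single membrane. Peroxisomal proteins are encoded by nuclear genes, synthesized on ribosomes in the cytosol, and incorporated into pre-existing peroxisomes. Over the life of a cell, a peroxisome can enlarge, for example by the addition of protein and lipid, and can eventually divide to form a new peroxisome.
The size and enzyme composition of peroxisomes can vary. However, peroxisomes can all contain enzymes that use molecular oxygen to oxidize various substrates, forming hydrogen peroxide (H2O2). Peroxisomes are known for H2O2-based respiration and fatty acid β-oxidation. Without limitation, peroxisome functions can include, for example, ether lipid (plasmalogen) synthesis and cholesterol synthesis, the glyoxylate cycle in germinating seeds ("glyoxysomes"), photorespiration, glycolysis in trypanosomes ("glycosomes"), and methanol and/or amine oxidation and assimilation in yeast.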
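Proteins directed to the peroxisome for processing can have C- and/or N-terminal targeting sequences that direct entry of the folded protein into the peroxisomal matrix. After translation and release from cytosolic ribosomes, newly synthesized proteins targeted to the peroxisome can fold into their mature form in the cytosol before entering the organelle. Folding can also occur with the help of chaperone proteins. Protein entry into peroxisomes requires ATP hydrolysis but, unlike some transport systems, there is no electrochemical gradient across the peroxisomal membrane. Tags for transport have been described previously (Purdue et al.). In some embodiments, the protein is folded with the help of a chaperone protein.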
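The uptake-targeting signal for some peroxisome-targeted proteins is a Ser-Lys-Leu sequence (SKL in one-letter code), or a related sequence, at the C-terminus of the protein. The SKL signal can bind a soluble receptor protein in the cytosol, such as a peroxin. There are several classes of peroxisomal targeting signal (PTS), such as PTS1 and PTS2. For a cargo such as catalase, the resulting PTS1R-catalase complex then binds to a receptor protein in the peroxisomal membrane. Cytosolic receptors that subsequently deliver the targeted protein into the peroxisome have been identified, such as Pex5p for PTS1 and Pex7p for PTS2. The SKL sequence is not cleaved from catalase after entry into the peroxisome.
Without limitation, matrix proteins can be synthesized as precursors with an N-terminal uptake-targeting sequence. Proteins with this type of uptake-targeting signal bind a different cytosolic receptor protein, called PTS2R, which, like PTS1R, escorts the precursor protein to the Pex14p receptor in the peroxisomal membrane. After entry of these proteins, the N-terminal targeting sequence is cleaved. Peroxisomal membrane proteins are also synthesized on free polyribosomes and incorporated into the peroxisome after synthesis. The signals that target proteins to the peroxisomal membrane do not contain the SKL sequence, but little is known about the uptake process.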
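Modifications other than prolyl hydroxylation can also be achieved in the peroxisome. For example, a glycosyltransferase can be co-imported into the peroxisome by tagging it with a peroxisomal entry tag, so that a protein substrate such as collagen can be glycosylated. Peroxisomes are naturally permeable to many small molecules that serve as substrates for modifying enzymes. Substrates that cannot diffuse freely into the peroxisome must be transported. Transport can occur, specifically or promiscuously, through membrane proteins targeted to the peroxisomal membrane.
Modifications can also occur on the cytosolic surface of the peroxisome. Without limitation, such modifications can include, for example, ubiquitination and phosphorylation.
Chaperone proteins can also be tagged for translocation into the peroxisome. In this way, chaperones can be used in the peroxisome for proper folding of proteins translocated there.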
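Methods of making genetically modified cells for the production of modified proteins
In some embodiments, a method of making a cell for producing a modified protein in a peroxisome is provided. The method can include providing a cell; introducing a first nucleic acid into the cell, wherein the first nucleic acid comprises a first sequence encoding a heterologous protein fused to a peroxisome-targeting sequence; and introducing a second nucleic acid into the cell, wherein the second nucleic acid comprises a second sequence encoding a heterologous modification enzyme fused to a peroxisome-targeting sequence. The cell can be a eukaryotic cell. In some embodiments, the introducing is performed in the presence of calcium chloride. In some embodiments, the introducing is performed by standard transformation techniques known to those skilled in the art, such as electroporation.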
In some embodiments, the cell is a yeast cell, such as Saccharomyces cerevisiae, Pichia pastoris, or Ogataea polymorpha. For Pichia pastoris cells, for example, the nucleic acid can have a promoter that allows induction of the protein in the presence of methanol.
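In some embodiments, the first and/or second nucleic acid comprises promoter(s). In some embodiments, the promoter is constitutive or inducible.
In some embodiments, the peroxisome-targeting sequence comprises a sequence set forth in SEQ ID NO: 1 (SLK), SEQ ID NO: 2 (RLXXXXX(H/Q)L), or SEQ ID NO: 3 (LGRGRRSKL).
In some embodiments, the protein comprises a tag. In some embodiments, the tag is cleavable. The tag can be one that confers solubility or stability on the protein within the environment of the peroxisome.
In some embodiments, the method further comprises introducing a third nucleic acid into the cell, wherein the third nucleic acid comprises a third sequence encoding a second heterologous modification enzyme fused to a peroxisome-targeting sequence.
In some embodiments, the enzyme catalyzes a modification selected from the group consisting of hydroxylation, oxidation, glycosyl transfer, and isomerization.
In some embodiments, the enzyme comprises a glycosyl transferase, an isomerase (e.g., prolyl and disulfide), or a hydroxyl transferase (e.g., prolyl hydroxylase and lysyl oxidase).
In some embodiments, the enzyme is selected from a glycosyl transferase, an isomerase, a prolyl isomerase, a hydroxyl transferase, or a prolyl hydroxylase.
In some embodiments, the protein comprises collagen, gelatin, or a silk protein.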
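As shown in FIG. 1, the cell comprises nucleic acids encoding a protein and an enzyme that are tagged for translocation into the peroxisome. After translation, the C-terminal or N-terminal tags signal translocation of the protein and the enzyme into the peroxisome, where they are processed further.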
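Cells
In some embodiments, a eukaryotic cell for producing a protein in a peroxisome is made by the method of any one of the embodiments described herein. In some embodiments, the cell comprises a first nucleic acid comprising a sequence encoding a heterologous protein fused to a peroxisome-targeting sequence, and a second nucleic acid encoding a heterologous modification enzyme fused to a peroxisome-targeting sequence. In some embodiments, the cell comprises a peroxisome for producing a modified protein, wherein the eukaryotic cell can express a heterologous protein fused to a peroxisome-targeting sequence and a heterologous modification enzyme fused to a peroxisome-targeting sequence. In some embodiments, the cell comprises a peroxisome for producing a modified protein, wherein the eukaryotic cell comprises a first nucleic acid sequence encoding a heterologous protein fused to a peroxisome-targeting sequence and a second nucleic acid sequence encoding a heterologous modification enzyme fused to a peroxisome-targeting sequence (see FIG. 1).
In some embodiments, a eukaryotic cell comprising a peroxisome for producing a modified protein is provided, wherein the peroxisome comprises a heterologous protein fused to a peroxisome-targeting sequence and a heterologous modification enzyme fused to a peroxisome-targeting sequence.
In some embodiments, the protein is modified in the peroxisome. In some embodiments, the cell is Pichia pastoris. In some embodiments, the peroxisome-targeting sequence comprises a sequence set forth in SEQ ID NO: 1, 2, or 3. In some embodiments, the cell further comprises a third nucleic acid encoding a second protein fused to a peroxisome-targeting sequence.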
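The cells can be used for fermentation in standard liquid fermentation media. Those skilled in the art will recognize standard methods of growing cells for protein production. In some embodiments, fermentation can be performed in the presence of an inducer or methanol.
In some embodiments in which large amounts of protein are required in large-scale production, the cells are grown in a fermenter. An advantage of Saccharomyces cerevisiae, Pichia pastoris, and Ogataea polymorpha is that they can grow at high growth rates. A fermenter can be used to control pH and to prevent limitations caused by oxygen limitation, nutrient limitation, and temperature fluctuation. In a fermenter, the dissolved oxygen (DO) level can be raised by increasing agitation, by increasing airflow, and by supplementing the air stream with pure oxygen. Nutrient limitation can also be minimized, because a fermenter can be run in a "fed-batch mode," in which fresh medium or a growth-limiting nutrient is pumped into the vessel at a rate that replenishes the nutrients being depleted. A fermenter can also regulate the methanol feed rate to acclimate the cells to the presence of methanol, and can supply methanol at an appropriate rate that allows enough methanol for protein synthesis while preventing the addition of excess methanol, which can cause toxicity.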
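Methods of producing a modified protein
In some embodiments, a method of producing a modified protein in a eukaryotic cell that contains a peroxisome is provided, wherein the eukaryotic cell expresses a heterologous modification enzyme fused to a peroxisome-targeting sequence. The method includes providing a cell made by the method of any one of the embodiments herein, or a cell of any one of the embodiments herein; expressing a heterologous protein in the eukaryotic cell, wherein the heterologous protein is fused to a peroxisome-targeting sequence; and culturing the eukaryotic cell under conditions in which the heterologous modification enzyme modifies the heterologous protein in the peroxisome to produce the modified protein.
In some embodiments, a method of producing a modified protein in a eukaryotic cell that contains a peroxisome is provided, wherein the eukaryotic cell expresses a heterologous modification enzyme fused to a peroxisome-targeting sequence. The method can include expressing a heterologous protein in the eukaryotic cell, wherein the heterologous protein is fused to a peroxisome-targeting sequence, and culturing the eukaryotic cell under conditions in which the heterologous modification enzyme modifies the heterologous protein in the peroxisome to produce the modified protein.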
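In some embodiments, a method of producing a modified protein in a eukaryotic cell is provided. The method includes the following step: culturing a eukaryotic cell that contains a peroxisome under conditions in which the modified protein is produced, wherein the eukaryotic cell expresses a heterologous protein fused to a peroxisome-targeting sequence and a heterologous modification enzyme fused to a peroxisome-targeting sequence, and wherein the heterologous modification enzyme modifies the heterologous protein in the peroxisome under the culture conditions to produce the modified protein.
In some embodiments, a method of increasing the yield of a modified protein produced in a eukaryotic cell is provided. In some embodiments, the eukaryotic cell is from Saccharomyces cerevisiae, Pichia pastoris, or Ogataea polymorpha. The method includes culturing a eukaryotic cell that contains a peroxisome under conditions in which the modified protein is produced, wherein the eukaryotic cell expresses a heterologous protein fused to a peroxisome-targeting sequence and a heterologous modification enzyme fused to a peroxisome-targeting sequence, wherein expression of the heterologous protein is under the control of a promoter, and wherein the heterologous modification enzyme modifies the heterologous protein in the peroxisome under the culture conditions to produce the modified protein. In some embodiments, the method further comprises inducing production of the heterologous protein by adding a chemical inducer. In some embodiments, the method further comprises increasing the cargo of the peroxisome, wherein increasing the cargo of the peroxisome is performed by providing oleic acid or methanol to the eukaryotic cell.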
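In some embodiments, the cells are transformed with one or more nucleic acids as described herein (see, e.g., FIG. 2). In some embodiments, the transformed cells are fermented. In some embodiments, the cells are harvested after fermentation and induction of the protein for translation, followed by translocation. In some embodiments, the cells are centrifuged.
In some embodiments, the cells are prepared for lysis. A homogenizer can be used to disrupt yeast cells. A homogenizer can lyse cells by pressurizing the cell suspension and then suddenly releasing the pressure, which creates a liquid shear capable of lysing the cells. Typical operating pressures for older-style homogenizers, the French press, and the Manton-Gaulin homogenizer are 6000 to 10,000 psi. Multiple passes (at least three) are required to achieve a reasonable degree of lysis. However, high operating pressures can raise the operating temperature. Therefore, in some embodiments, the pressure cell is cooled (4°C) before use. In addition to temperature control, in some embodiments care should be taken to avoid inactivating proteins by foaming; the pressure can therefore be applied gradually. In some embodiments, lysis should also be carried out in the presence of protease inhibitors.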
Modern homogenizers are better suited to lysing yeast cells because they can operate at higher pressures. For example, an Avestin Emulsiflex-C5 can be used to lyse Pichia pastoris cells at 30,000 psi (about 200 MPa).
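The operating pressures quoted for homogenizers are given in psi, with one value also given in MPa. As a quick check of those figures, the following minimal sketch converts the pressures mentioned above using the standard factor of 6894.757 Pa per psi; the function name is illustrative only and is not part of the described methods.

```python
# Convert homogenizer operating pressures from psi to MPa.
PSI_TO_PA = 6894.757  # pascals per psi (standard conversion factor)


def psi_to_mpa(psi: float) -> float:
    """Return the pressure in megapascals for a pressure given in psi."""
    return psi * PSI_TO_PA / 1e6


if __name__ == "__main__":
    # 6,000-10,000 psi: older-style homogenizers; 30,000 psi: high-pressure unit
    for psi in (6_000, 10_000, 30_000):
        print(f"{psi:>6} psi = {psi_to_mpa(psi):6.1f} MPa")
```

On this conversion, 30,000 psi corresponds to roughly 207 MPa, consistent with the rounded 200 MPa figure, and the 6000 to 10,000 psi range corresponds to roughly 41 to 69 MPa.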
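Glass bead vortexing, in which yeast cells are disrupted by agitation with glass beads (0.4 to 0.5 mm), can also be used for cell lysis. Multiple rounds of agitation (30 to 60 seconds each) should be interspersed with cooling periods on ice to avoid overheating the cell suspension. Breakage is variable but can exceed 50% (up to 95%). The method above has been described for small volumes (up to 15 mL) but can be scaled up to several liters with specialized equipment.
Enzymatic lysis can also be used to lyse the cells. Enzymatic lysis of yeast cells is based on digestion of the cell wall by several enzymes, of which zymolyase and lyticase are the most widely used.
In some embodiments, after lysis, the lysate can be spun down and the supernatant filtered to remove particulate matter. Purification of peroxisomes is known to those skilled in the art and can be performed by gradient centrifugation. Peroxisomes can also be isolated with commercial kits (for example, the Peroxisome Isolation Kit from Sigma Aldrich).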
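After lysis of the peroxisomes, the lysate can be purified for the protein of interest. After bulk purification, the protein can be separated from the lysed peroxisomes. Purification techniques are known to those skilled in the art. Depending on the type and properties of the protein, different purification techniques can be considered. Without limitation, steps such as ammonium sulfate precipitation can be taken to isolate the protein by precipitation. Sucrose gradient centrifugation can also be used to separate molecules of different sizes in the sample. Size exclusion chromatography is commonly used under non-denaturing conditions, or under denaturing conditions when a method for refolding the protein is known. Proteins can also be separated on the basis of charge or hydrophobicity. In addition, if the protein is tagged, it can be separated by affinity chromatography or by immobilization on a column or resin.
The protein of interest can be analyzed for modifications, for example by mass spectrometry. Proteins such as enzymes can also be analyzed by activity assays.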
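Types of protein can also be analyzed for translocation into the peroxisome. Methods of engineering proteins for stability are known to those skilled in the art. Without limitation, these can include attaching a cleavable tag to artificially change the isoelectric point (pI) of the protein, or introducing a few mutations to artificially change the pI of the protein to be translocated into the peroxisome.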
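Other tags that can be considered are tags from proteins, or domains thereof, that are known to be translocated. The consensus sequence XX(K/R)(K/R)X(3-7)(T/S)XX(D/E)X (SEQ ID NO: 4), described in Purdue et al., wherein X is any amino acid and X(3-7) denotes a run of 3 to 7 residues of any amino acid at the indicated position, is a sequence conserved in peroxisomal proteins that can allow translocation or stability of a protein in the peroxisome.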
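The consensus sequence of SEQ ID NO: 4 can be expressed as a regular expression, which may be convenient when screening candidate proteins for the motif. The following is a minimal sketch under that assumption; the function name and the test peptide are hypothetical and are not taken from the sequence listing.

```python
import re

# SEQ ID NO: 4 consensus: XX(K/R)(K/R)X(3-7)(T/S)XX(D/E)X, where X is any residue.
# "." matches any single residue; ".{3,7}" matches the 3-7 residue spacer.
SEQ4_CONSENSUS = re.compile(r".{2}[KR][KR].{3,7}[TS].{2}[DE].")


def find_consensus(protein: str):
    """Return (0-based start, matched text) for the first match, or None."""
    match = SEQ4_CONSENSUS.search(protein.upper())
    return (match.start(), match.group()) if match else None


if __name__ == "__main__":
    # Hypothetical peptide constructed to satisfy the consensus.
    print(find_consensus("AAKRGGGTAADA"))
    # A peptide without the motif yields None.
    print(find_consensus("AAAAAAAA"))
```

A match only indicates that a sequence fits the pattern; whether a given match supports translocation or stability would still need to be tested experimentally.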
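In some embodiments of the methods, cells, or compositions described herein, a protein, such as a heterologous protein fused to a peroxisome-targeting sequence, is localized to a peroxisome in a cell, such as a eukaryotic cell or a yeast cell. In some embodiments, an enzyme, such as a modification enzyme fused to a peroxisome-targeting sequence, is localized to a peroxisome of a cell, such as a eukaryotic cell or a yeast cell, or is co-localized with a heterologous protein fused to a peroxisome-targeting sequence. In some embodiments, the protein and/or enzyme is fused to a peroxisomal targeting signal such as PTS1 or ePTS1. For example, in some embodiments ePTS1 is the peroxisome-targeting sequence. Examples of an ePTS1 tag and of a nucleic acid sequence encoding the ePTS1 tag are provided in SEQ ID NO: 3 (LGRGRRSKL) and SEQ ID NO: 12 (TTGGGAAGAGGTAGAAGATCCAAATTG).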
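The relationship between the ePTS1 coding sequence (SEQ ID NO: 12) and the ePTS1 peptide (SEQ ID NO: 3) can be checked by translating the DNA with the standard genetic code. The sketch below builds the standard codon table from the conventional TCAG ordering; the helper name `translate` is illustrative only.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    eptS1_dna = "TTGGGAAGAGGTAGAAGATCCAAATTG"  # SEQ ID NO: 12
    print(translate(eptS1_dna))  # LGRGRRSKL, i.e. SEQ ID NO: 3
```

The nine codons TTG GGA AGA GGT AGA AGA TCC AAA TTG translate to LGRGRRSKL, so the two sequences given above are consistent with each other.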
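A variety of proteins and enzymes can be targeted to the peroxisome using a peroxisome-targeting sequence. For example, proteins and enzymes having a molecular weight of 1 to 5, 5 to 10, 10 to 25, 25 to 50, 50 to 75, 75 to 100 kDa, 100 to 200 kDa, or 200 to 300 kDa, or higher, or within a range of values encompassing any of the aforementioned kDa ranges, can be targeted to the peroxisome with a peroxisome-targeting sequence. In some embodiments, a nucleic acid having a sequence that encodes the protein and/or enzyme to be targeted to the peroxisome and that encodes a peroxisome-targeting sequence is delivered to a cell comprising a peroxisome, and the cell translates the protein and/or enzyme and transports it to the peroxisome. Additional examples of proteins and enzymes that can be targeted to the peroxisome include, but are not limited to, structural proteins, collagens, kinases, phosphatases, hydroxylases, isomerases, cleavage enzymes, fluorescent proteins, and hormones. In some embodiments, the protein and/or enzyme to be targeted comprises a tag, such as a fluorescent tag (e.g., GFP, YFP, or CFP), a FLAG tag (e.g., DYKDDDDK, where D = aspartic acid, Y = tyrosine, and K = lysine; SEQ ID NO: 5), or a histidine tag (e.g., His-His-His-His-His-His; SEQ ID NO: 6). Such tags can be used, without limitation, for purifying and/or locating the protein and/or enzyme. Purification techniques can include, but are not limited to, affinity purification or the use of ion columns, such as nickel columns, to purify the protein and/or enzyme by means of the tag(s). Other tags that can be used include calmodulin (KRRWKKNFIAVSAANRFKKISSSGAL; SEQ ID NO: 7), HA (YPYDVPDYA; SEQ ID NO: 8), Myc (EQKLISEEDL; SEQ ID NO: 9), SBP (MDEKTTGWRGGHVVEGLAGELEQLRARLEHHPQGQREP; SEQ ID NO: 10), and/or Strep (WSHPQFEK; SEQ ID NO: 11) tags.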
An example of a GFP tag is provided in SEQ ID NO: 13 (MRKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFARYPDHMKQHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHNVYITADKQKNGIKANFKIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSVLSKDPNEKRDHMVLLEFVTAAGITHGMDELYK). Some embodiments include a nucleic acid encoding a GFP tag, such as the nucleic acid sequence of SEQ ID NO: 14 (ATGCGTAAAGGCGAAGAGCTGTTCACTGGTGTCGTCCCTATTCTGGTGGAACTGGATGGTGATGTCAACGGTCATAAGTTTTCCGTGCGTGGCGAGGGTGAAGGTGACGCAACTAATGGTAAACTGACGCTGAAGTTCATCTGTACTACTGGTAAACTGCCGGTTCCTTGGCCGACTCTGGTAACGACGCTGACTTATGGTGTTCAGTGCTTTGCTCGTTATCCGGACCATATGAAGCAGCATGACTTCTTCAAGTCCGCCATGCCGGAAGGCTATGTGCAGGAACGCACGATTTCCTTTAAGGATGACGGCACGTACAAAACGCGTGCGGAAGTGAAATTTGAAGGCGATACCCTGGTAAACCGCATTGAGCTGAAAGGCATTGACTTTAAAGAGGACGGCAATATCCTGGGCCATAAGCTGGAATACAATTTTAACAGCCACAATGTTTACATCACCGCCGATAAACAAAAAAATGGCATTAAAGCGAATTTTAAAATTCGCCACAACGTGGAGGATGGCAGCGTGCAGCTGGCTGATCACTACCAGCAAAACACTCCAATCGGTGATGGTCCTGTTCTGCTGCCAGACAATCACTATCTGAGCACGCAAAGCGTTCTGTCTAAAGATCCGAACGAGAAACGCGATCATATGGTTCTGCTGGAGTTCGTAACCGCAGCGGGCATCACGCATGGTATGGATGAACTGTACAAA) or a fragment thereof.
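Examples
The examples discussed below are intended to be purely illustrative of the invention and should not be considered to limit the invention in any way. The examples are not intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to the numbers used (e.g., amounts, temperatures, and the like), but some experimental error and deviation should be accounted for. Unless otherwise indicated, parts are parts by weight, molecular weight is average molecular weight, temperature is in degrees Celsius, and pressure is at or near atmospheric.
Example 1: Localization of collagen variants or P4HB to the peroxisome in multiple yeast hosts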
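GFP-x-ePTS1 constructs were produced, in which GFP was included for visualizing localization, ePTS1 was included for targeting to the peroxisome, and x is the protein of interest. Non-limiting examples of proteins of interest include the synthetic collagen peptides COLsyn1a, COLsyn2, COLsyn3, COLsyn4, COLsyn5, and COLsyn6, and the protein disulfide-isomerase P4HB (see Table 1). In some embodiments, the P4HB is BantP4HB, ApmiP4HB, BtauP4HA1, BtauP4HB, BtP4HB, or GFP-B5P4HB-ePTS1, or a fragment or derivative thereof. The nucleic acids encoding these proteins of interest were included in separate constructs. The fusion peptides produced from the constructs, each carrying one protein of interest, were imported into the peroxisomes of a wild-type (WT) Saccharomyces cerevisiae strain, where they were visualized as fluorescent foci in the cells (FIG. 3). In a strain lacking the peroxisomal import receptor (pex5Δ), only diffuse cytosolic localization was seen. These results indicate that, in some embodiments, peroxisome-targeting peptides as described herein can be used to target proteins or enzymes to the peroxisome in cells such as yeast cells. Other non-limiting examples of proteins of interest, and some examples of encoding nucleotide sequences, are also shown in Table 1. In some embodiments, the protein of interest or encoding nucleic acid consists of or comprises an amino acid or nucleic acid sequence that is 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any one or more of SEQ ID NOs: 15 to 70, or identical within a range defined by any two of the aforementioned percentages. Some embodiments include multiple proteins of interest that can be targeted to the peroxisome.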
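The percent-identity thresholds above compare a candidate sequence against a reference sequence. As a minimal sketch, assuming the two sequences have already been aligned to equal length, percent identity can be computed as the fraction of matching positions. The specification does not state which alignment method or identity definition is intended, so both the function and the example peptides here are hypothetical illustrations.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences.

    Assumes the inputs are already aligned and of equal length; this is one
    simple definition of identity, not necessarily the one used in practice.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical collagen-like peptides differing at one of ten positions.
    reference = "GPPGPPGPPG"
    variant = "GPPGAPGPPG"
    print(f"{percent_identity(reference, variant):.1f}% identical")
```

For real sequences of different lengths, an alignment step (for example with an established alignment tool) would be needed before a comparison of this kind.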
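Various collagen variants were observed to localize in several industrial yeast hosts (FIG. 4). Non-limiting examples of full-length collagens include AmCOL1A1, AmCOL1A2, BtCOL1A1, BtCOL1A2, and fragments thereof. Non-limiting examples of smaller collagen fragments include COLsyn1, COLsyn2, COLsyn3, COLsyn4, COLsyn, COLsyn5, and COLsyn6, BtCol1A1 403-11P, and BtCol1A1 403-0P. FIG. 4 shows ePTS1-dependent fluorescence localization of GFP-collagen variants in three different industrial yeast hosts: PBH001, PBH002, and PBH004. Common industrial yeast hosts include, but are not limited to, Arxula, Candida, Hansenula, Kluyveromyces, Komagataella, Ogataea, Pichia, Saccharomyces, or Yarrowia.
The proteins observed to localize to the peroxisome range in size from 31 kDa (GFP-COLSyn1) to 195 kDa (BtCol1A2). Thus, proteins spanning a considerable range of sizes can be imported into the peroxisome.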
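Example 2: Protection from toxic compounds
In some embodiments, targeting a protein and/or enzyme to the peroxisome compartmentalizes it by physically separating it from other enzymes or substrates. This can be used to prevent interaction or activity between the separated protein(s), enzyme(s), and/or substrate(s). For example, a toxic or inhibitory protein such as SigD can be compartmentalized.
Peroxisomal compartmentalization that physically separates an enzyme from its substrate is used in some embodiments to prevent activity on the substrate. To illustrate the ability to compartmentalize an activity, cell viability is rescued when a toxic protein is expressed by sequestering the toxic protein in the peroxisome.
The pathogenic bacterium Salmonella is a common cause of gastroenteritis, which it produces by invading the intestinal mucosa. One of the virulence factors secreted by Salmonella is SigD, a putative inositol phosphatase that has been shown to cause severe growth inhibition when expressed in Saccharomyces cerevisiae. The toxicity is linked to the SigD N-terminal domain (SigD1-351), which lacks the phosphatase domain but affects the organization of the actin cytoskeleton in both yeast and human cells (doi:10.1111/j.1462-5822.2005.00568.x).
By using peroxisomal compartmentalization to remove the access of SigD1-351 to its cytosolic actin cytoskeleton substrate, Saccharomyces cerevisiae can be protected from the growth-inhibiting effect of SigD.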
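FIG. 5 is an example demonstrating that the Saccharomyces cerevisiae host is protected when the toxic protein SigD1-351 is sequestered in the peroxisome. Strains with integrated SigD1-351-ePTS1 or SigD1-351 under the control of the inducible GAL promoter were serially diluted on YPD plates to repress expression or on YPGalactose plates to induce expression. When expression was repressed, both strains grew equally well. When expression was induced, the strain with the peroxisome-localized toxin (SigD1-351-ePTS1) was able to grow, whereas the toxin expressed in the cytosol (SigD1-351) was lethal to the host.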
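The example includes the following design: use of an expression cassette with an inducible GAL promoter to control expression of the toxic SigD; expression of a toxic (SigD1-351) and a non-toxic (SigD1-351(118-142Δ)) variant of SigD from separate expression cassettes transformed into yeast cells; production by the expression cassettes of the fusion protein GFP-x-ePTS1, where x is the toxic or non-toxic SigD variant; and transformation of each into separate yeast cell populations with one of the following strain backgrounds: PEX5 (peroxisomal import) and pex5Δ (no peroxisomal import). In this example, the following experimental techniques are performed: serial dilution of the cells on glucose (repressing) and galactose (inducing) plates to show growth defects, and demonstration of localization by GFP fluorescence.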
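Example 3: Co-localization of enzyme and substrate to carry out post-translational modification in the peroxisome
Various kinds of post-translational modification (PTM) can be demonstrated to occur in the peroxisome. Separating an enzyme from its substrate or protein substrate by the peroxisomal barrier is used in some embodiments to prevent activity of the enzyme on the substrate. Thus, sequestration of a substrate or an enzyme can be used. For example, this can serve to protect the cell contents from a peroxisome-sequestered protein, or vice versa.
In some embodiments, a modification enzyme that carries out a post-translational modification (PTM) of another protein is co-localized with that other protein in the peroxisome of a cell. Examples of PTMs include, but are not limited to, glycosylation (or other sugar additions), isomerization, cleavage, protease cleavage, proteolytic degradation, hydroxylation, proteolysis, phosphorylation, dephosphorylation, ubiquitination (and ubiquitin-like modifications such as neddylation and sumoylation), methylation, nitrosylation, acetylation, and lipidation (including GPI anchoring, prenylation, and myristoylation). Other PTM reactions are also contemplated.
In some embodiments, an enzyme, a cofactor of the enzyme, and a substrate of the enzyme are co-localized to the cytosol and/or the peroxisome. This is used in some embodiments to demonstrate that modification occurs when the enzyme and substrate are co-localized to the same region. Thus, co-localization can be used to carry out a modification such as a PTM.
Examples of PTMs suitable for use in the methods and compositions disclosed herein include protease cleavage, phosphorylation, dephosphorylation, hydroxylation, isomerization, glycosylation, and prenylation. In some embodiments, one or more of protease cleavage, phosphorylation, and dephosphorylation is the preferred PTM.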
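FIG. 8 demonstrates in vivo co-localization of a hydroxylase (BantP4H) and a collagen substrate (AmisCOL1A1 or AmisCOL1A2) in Saccharomyces cerevisiae. BantP4H carries an mRuby fusion tag and the collagen substrate carries a GFP fusion tag, so that localization can be monitored by fluorescence microscopy. With the ePTS1 peroxisomal localization signal, fluorescent foci are observed, and the merged image shows that the localization of the hydroxylase and the collagen overlaps. Exemplary sequences with mRuby can include, for example, SEQ ID NOs: 51 to 52.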
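Example 4: Proteolysis
In some embodiments, TEV protease is used to show that peptide cleavage can occur in the peroxisome. For example, in some embodiments, cleavage can occur only when the protease and the substrate are present in the same subcellular compartment (e.g., the cytosol or the peroxisome). An example showing that TEV protease sequestered in the peroxisome cannot cut a target in the cytosol indicates that other potential targets in the cytosol are likewise not subject to TEV cleavage and are therefore protected from the peroxisome-compartmentalized enzyme. In some embodiments, when an expressed protein/enzyme is toxic to the cell, separating it from its cellular substrate by peroxisomal compartmentalization protects the cell from the protein/enzyme. An example in which the substrate/protein is sequestered in the peroxisome and cannot be cut by TEV protease in the cytosol suggests that, because the substrate is not exposed to other enzymes in the cytosol, the substrate/protein is protected from unwanted modification by the cell, such as proteolytic degradation. Thus, in some embodiments, selective targeting of some proteins results in the desired modification of those proteins while other proteins are not modified and/or unwanted modifications are prevented.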
일부 실시형태에서, 사카로마이세스 세레비지에에서, TEV 단백질분해효소와 절단을 위한 TEV 인식 부위(TEVrs)를 함유하는 기질이 강력한 프로모터에 의해 발현되어야 한다. YFP 또는 REP와의 융합은 현미경으로 세포질 또는 퍼옥시좀에 국소화되는 것을 입증할 것이다. 기질(YFP-TEVrs-IGF2-FLAG)의 단백질가수분해는 웨스턴 블롯으로 분석될 것이다.
In some embodiments, other modifying proteases that can be targeted to the peroxisome include, but are not limited to, the matrix metalloproteinases MMP-1, MMP-2, MMP-8, MMP-13, and MMP-14; the N-proteinases ADAMTS-2, ADAMTS-3, and ADAMTS-14; and the C-proteinases BMP-1, mTLS, and TLL-1.
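In some embodiments, a protein targeted to the peroxisome contains a TEV-cleavable tag. An example of a protein with a cleavable tag is BtCol1A2-TEV-GFP-HIS-ePTS1 (SEQ ID NO: 64), in which the full-length bovine type I collagen alpha 2 protein can be separated by TEV protease from the N-terminal tags that can be used for peroxisomal localization, visualization, and purification. Further examples can include any protein sequence disclosed herein combined with any tag sequence, targeting sequence, domain or fragment, or analog thereof. Examples of such sequences can include, for example, SEQ ID NOs: 57 to 68.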
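TEV protease is a sequence-specific cysteine protease from tobacco etch virus (TEV). In an example showing that heterologous enzymatic activity can be achieved in the peroxisome, TEV protease was expressed in Saccharomyces cerevisiae with an N-terminal ePTS1 signal sequence directing localization to the peroxisome. A substrate for testing TEV activity was generated by flanking the TEV recognition amino acid sequence Glu-Asn-Leu-Tyr-Phe-Gln-Ser with an N-terminal RFP and a C-terminal YFP. This substrate was expressed with the ePTS1 sequence (FIG. 6, panel A) or without it (FIG. 6, panel B). When TEV protease and the substrate were both expressed and co-localized to the peroxisome, the substrate was completely cut, as demonstrated on western blot by the disappearance of the 54 kDa full-length substrate band and the appearance of the 27 kDa RFP cleavage product (FIG. 6, panel A, lanes 1, 2, and 5). However, when expression of TEV protease was repressed, the peroxisome-localized substrate remained uncut (FIG. 6, panel A, lanes 3 and 4). As a control, the substrate was expressed in the cytosol while TEV protease was targeted to the peroxisome. Varying amounts of substrate cleavage were observed and correlated directly with the strength of the promoter driving TEV protease expression, pRPL18B < pTEF1 < pGAL1 (FIG. 6, panel B, lanes 1, 2, and 5). These results suggest that, although TEV protease was imported into the peroxisome, it was still active in the cytosol, because its access to the substrate depended on high expression. By comparison, when the substrate and the protease were co-localized to the peroxisome, TEV cleavage went to completion regardless of differences in TEV protease expression level, which illustrates that co-compartmentalization can also be a way to improve the efficiency of substrate modification.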
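The cleavage of the RFP-TEVrs-YFP substrate described above can be illustrated with a minimal Python sketch. It assumes the commonly described behavior of TEV protease, cutting between the Gln and the final Ser of the recognition sequence, which the text itself does not state. The flanking sequences are short hypothetical placeholders, not the actual RFP or YFP sequences.

```python
# Recognition sequence from the text: Glu-Asn-Leu-Tyr-Phe-Gln-Ser.
TEV_SITE = "ENLYFQS"


def tev_cleave(protein: str) -> list[str]:
    """Split a one-letter protein sequence at every TEV recognition site.

    Assumption (not stated in the text): the cut falls between Gln (Q)
    and Ser (S), so Ser becomes the first residue of the C-terminal fragment.
    """
    fragments = []
    start = 0
    pos = protein.find(TEV_SITE)
    while pos != -1:
        cut = pos + len(TEV_SITE) - 1  # index of the Ser after the cut
        fragments.append(protein[start:cut])
        start = cut
        pos = protein.find(TEV_SITE, pos + 1)
    fragments.append(protein[start:])
    return fragments


# Hypothetical placeholders standing in for the N-terminal RFP and C-terminal YFP.
n_terminal_domain = "MAAAA"
c_terminal_domain = "GGGGK"
substrate = n_terminal_domain + TEV_SITE + c_terminal_domain

print(tev_cleave(substrate))
```

A sequence without the recognition site is returned as a single uncut fragment, which corresponds to the intact full-length band on the blot.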
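Example 5: Phosphorylation and Dephosphorylation
In some embodiments, co-expression of specific kinases (serine/threonine kinases or tyrosine kinases) and/or phosphatases together with their substrates is examined. For example, MEK and its substrate MAPK1 can be encoded on one nucleic acid or on separate nucleic acids to produce fusion peptides of MEK and MAPK1 with peroxisome-targeting peptides that direct MEK and MAPK1 to the peroxisome, where MEK phosphorylates MAPK1. Further enzymes and substrates, for example Raf-1, can also be added.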
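Example 6: Hydroxylation
In some embodiments, hydroxylation of collagen by a P4H dioxygenase in the peroxisome is shown. For example, a design using the bovine P4H subunits can be used. Alternatively, a single bacterial P4H (Bacillus anthracis or mimivirus) can be used. In some embodiments, the medium is supplemented with ascorbic acid and/or alpha-ketoglutarate and iron(II), demonstrating that specific chemical modifications can occur in the peroxisome when the cofactors and/or supplements are able to enter it. In such cases, the collagen is analyzed for oxidation by mass spectrometry. In some embodiments, an in vitro assay is used to further demonstrate enzymatic activity.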
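To demonstrate that heterologous hydroxylation activity can be achieved in the peroxisome in vivo, a prolyl-4-hydroxylase (P4H) and a collagen substrate were co-expressed in Saccharomyces cerevisiae. The P4H enzyme from Bacillus anthracis has previously been shown in vitro to hydroxylate synthetic collagen-like peptides (Schnicker and Dey, 2016), and it was expressed in the cytosol (BantP4H) or in the peroxisome (BantP4H-ePTS1). The collagen helix consists of GXY repeats, where G is glycine, X is any amino acid but often proline, and Y is any amino acid but often proline. Prolines at the Y position are preferably hydroxylated for helical stability (Gorres and Raines, 2010). The substrate designed for this study was a 99-amino-acid fragment of the helical region of bovine type I collagen alpha 1 containing 11 Y-position prolines (BtCol1A1 403-11P). As a control for Y-position proline hydroxylation, the 11 prolines were mutated to alanine or valine (BtCol1A1 403-0P). These substrates were expressed with an N-terminal GFP, for monitoring localization in vivo (see FIG. 8) and for purification, and with a C-terminal ePTS1 peroxisome-localization sequence.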
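The GXY repeat and the proline-free control design described above can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the toy sequence is not the BtCol1A1 fragment, and the code assumes the input starts in frame with glycine at the first position of every triplet.

```python
def y_position_prolines(helix: str) -> list[int]:
    """Return 1-based positions of Pro residues at the Y position of Gly-X-Y triplets.

    Assumes the sequence starts in frame (Gly first); trailing residues
    that do not complete a triplet are ignored.
    """
    positions = []
    for i in range(0, len(helix) - 2, 3):
        if helix[i] != "G":
            raise ValueError(f"triplet at position {i + 1} does not start with Gly")
        if helix[i + 2] == "P":
            positions.append(i + 3)
    return positions


def mutate_y_prolines(helix: str, replacement: str = "A") -> str:
    """Replace every Y-position proline, mimicking a proline-free control substrate."""
    residues = list(helix)
    for pos in y_position_prolines(helix):
        residues[pos - 1] = replacement
    return "".join(residues)


# Toy helix: triplets GPP, GAP, GPA (hypothetical, for illustration only).
toy_helix = "GPPGAPGPA"
print(y_position_prolines(toy_helix))  # positions of candidate hydroxylation sites
print(mutate_y_prolines(toy_helix))    # control with Y-position prolines removed
```

Prolines at the X position are left untouched by the mutation step, since only the Y position is the hydroxylation target discussed in the text.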
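Cells expressing combinations of the BantP4H enzyme and the collagen substrates (FIG. 7, panel A) were grown in YPD in baffled shake flasks at 30°C to early log phase and then harvested. After cell lysis, the substrates were purified on GFP-Trap beads, run on a 10% PAGE gel, stained with Coomassie blue, excised from the gel, and sent to MS Bioworks for LC-MS/MS analysis of oxidation of proline residues.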
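Mass spectrometry results showed BantP4H-specific oxidation at three sites of the collagen substrate when enzyme and substrate were co-expressed in the peroxisome. In strains PB000225, PB000254, and PB000255, the BtCol1A1 403-11P_ePTS1 substrate was oxidized at the Y-position proline P264. In the BtCol1A1 403-0P_ePTS1 control substrate, the corresponding position was mutated to alanine (A264) and no oxidation was observed (FIG. 7, panel B). A closer look at the modifications identified at P264 shows 12.1% oxidation at this position in strain PB000254 (4 modified out of 33 total), in which BantP4H was co-localized to the peroxisome, compared with 2.6% and 4.8% in strains PB000225 (1 modified out of 38 total) and PB000255 (2 modified out of 42 total), respectively. Similarly, oxidation at two additional Y-position prolines, P300 and P324, was observed only in strain PB000254 and not in the other five strains (FIG. 7, panel C). Taken together, these results show that three Y-position prolines on the collagen substrate are specifically hydroxylated by BantP4H when both the enzyme and the substrate are co-localized to the peroxisome. Exemplary sequences with 403-0P-ePTS1 or 403-11P-ePTS1 include, for example, SEQ ID NOs: 53 to 56 and 65 to 68.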
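The oxidation percentages quoted for P264 follow directly from the modified and total counts given in the text. The short Python sketch below reproduces that arithmetic. The strain label of the second entry is assumed to be PB000255, the third strain named in the paragraph, and what exactly is being counted (for example, peptide identifications) is not specified in the text.

```python
# (modified, total) counts at P264, as quoted in the text.
# Assumption: the entry with 2 of 42 belongs to strain PB000255.
p264_counts = {
    "PB000225": (1, 38),
    "PB000255": (2, 42),
    "PB000254": (4, 33),  # BantP4H co-localized to the peroxisome
}


def percent_oxidized(modified: int, total: int) -> float:
    """Percentage of observations carrying the oxidation, rounded to one decimal."""
    return round(100 * modified / total, 1)


for strain, (modified, total) in p264_counts.items():
    print(f"{strain}: {percent_oxidized(modified, total)}% ({modified}/{total})")
```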
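Example 7: Expression of Collagen in Yeast Peroxisomes
Collagen protein is imported into the peroxisome via a peroxisome targeting tag. Prolyl hydroxylase and prolyl isomerase are similarly imported into the peroxisome using peroxisome targeting tags. Co-incubation of the prolyl hydroxylase with collagen in the peroxisome allows the proper triple-helical structure to form. Type I heterotrimer, type I alpha homotrimer, and type III homotrimer collagens are all produced in the manner described. For type I collagen, full-length Col1A1 (pro-alpha1 chain) and Col1A2 (pro-alpha2 chain), as well as N- and C-terminal truncations that isolate the teloprotein, shown in Olsen et al. (2001) to improve expression of Col1A1 (alpha1 chain) and Col1A2 (alpha2 chain), are expressed in Saccharomyces cerevisiae. Similarly, prolyl-4-hydroxylase is expressed both as the full-length protein and as a truncation of the PDI domain (Toman 2000) for improved expression, and is imported into the peroxisome.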
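Example 8: Increasing Peroxisome Cargo
Yeast is grown in a fermenter using any of a variety of conventional protocols. Peroxisome capacity can be increased by induction. For Saccharomyces cerevisiae this can be through the use of oleate, and for Pichia pastoris and Ogataea polymorpha through the use of methanol. The protein to be compartmentalized and purified is tagged with the peroxisome-targeting tag PTS1, PTS2, or an enhanced version of these tags. After fermentation, the plasma membrane of the yeast cells can be lysed using any of many conventional lysis methods, such as a French press or cell-wall digestion with lyticase followed by homogenization. Low-speed centrifugation is used to remove nuclei, plasma membrane, and other cell debris. Peroxisomes can be further purified from the resulting supernatant by other methods such as density-gradient centrifugation. An alternative method of peroxisome purification is to genetically tag a peroxisomal membrane protein with an affinity tag, such as streptavidin or a polyhistidine peptide, for affinity purification. The peroxisomes purified in this way are then lysed using, for example, osmotic lysis (J Cell Biol. 2007 Apr 23; 177(2): 289-303; incorporated herein by reference in its entirety). Peroxisomal debris can be removed by high-speed centrifugation and the soluble fraction containing the desired cargo protein collected. If desired, the cargo protein can be further purified using affinity purification. Without limitation, the cargo protein can be tagged with any of several available peptide or protein-fold affinity tags, for example poly-histidine, maltose-binding protein, or glutathione S-transferase, and purified using the respective protocol. Alternatively, other purification methods such as ion chromatography or gel filtration can be used.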
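Example 9: Expression of Post-Translationally Modified Proteins in Yeast Peroxisomes - Localization of Individual Proteins to the Peroxisome (ePTS1-Based Targeting)
Localization of different classes of proteins, grouped by size and function, to the peroxisome of a typical yeast cell is demonstrated using a peroxisome targeting sequence. Non-limiting examples of proteins and protein types that can be targeted are listed in Table 3. Because the peroxisomal targeting mechanism is conserved, the platform can be used in other organisms, including methylotrophic yeasts such as Pichia pastoris/Komagataella phaffii, Hansenula polymorpha/Ogataea parapolymorpha, and Candida boidinii. GFP-x-ePTS1 and x-FLAG-ePTS1 constructs are produced. In these constructs, GFP is used to visualize localization, FLAG-ePTS1 is used for protein expression and for cases where GFP interferes with function, and "x" denotes the protein or enzyme of interest to be targeted. Some construct sequences and details of some embodiments are provided in Tables 1 and 2.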
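The GFP-x-ePTS1 and x-FLAG-ePTS1 layouts can be sketched in Python as simple sequence concatenations. The sketch below translates the ePTS1 tag DNA of SEQ ID NO: 12 with the standard genetic code and uses the tag peptide of SEQ ID NO: 5 (the widely used FLAG sequence). The direct fusion without linkers, the function names, the example protein "x", and the truncated GFP placeholder are all assumptions made for illustration, not the constructs of Tables 1 and 2.

```python
# Standard genetic code, codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) in frame 1 to one-letter amino acids."""
    dna = dna.replace(" ", "").lower()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


EPTS1_DNA = "ttgggaagag gtagaagatc caaattg"  # SEQ ID NO: 12 (ePTS1 tag)
FLAG = "DYKDDDDK"                           # SEQ ID NO: 5 (tag)
EPTS1 = translate(EPTS1_DNA)


def x_flag_epts1(x: str) -> str:
    """x-FLAG-ePTS1 layout; assumes direct fusion with no linkers."""
    return x + FLAG + EPTS1


def gfp_x_epts1(gfp: str, x: str) -> str:
    """GFP-x-ePTS1 layout; assumes direct fusion with no linkers."""
    return gfp + x + EPTS1


protein_x = "MAAA"              # hypothetical placeholder for the protein of interest
gfp_placeholder = "MRKGEELFTG"  # first 10 residues of SEQ ID NO: 13, truncated for brevity

print(EPTS1)
print(x_flag_epts1(protein_x))
print(gfp_x_epts1(gfp_placeholder, protein_x))
```

In both layouts the targeting peptide ends up at the extreme C-terminus of the fusion, consistent with the construct names given in the text.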
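Example 10: Disulfide Bond Formation
In some embodiments, the modification is disulfide bond formation. For example, a design is used in which a heterologous protein and protein disulfide isomerase (PDI) are co-expressed and targeted to the peroxisome. In such cases, the heterologous protein is analyzed for disulfides by mass spectrometry.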
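To demonstrate disulfide bond formation in the peroxisome in vivo, heterologous genes expressing human insulin, alpha interferon, and mapacalcine are co-expressed with PDI. Ogataea PDI (OgPDI), which is normally targeted to the ER, is designed to be overexpressed and targeted to the peroxisome. Human insulin precursor (Baeshan et al, 2014), alpha interferon (Shi et al, 2007), and mapacalcine (Noubhani et al, 2015) are synthesized using codons optimized for Pichia pastoris. The construct is designed with three expression cassettes: one for the target gene of interest, one for the modifying enzyme, and one for a selectable marker.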
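Each cassette has a promoter, an expressed gene (the gene of interest, the modifying enzyme gene, or the selectable marker gene), and a terminator. The gene of interest and the modifying enzyme gene are designed to include GFP and mRuby fluorescent tags, respectively, as translational fusions. Both the gene of interest and the modifying enzyme gene are targeted to the peroxisome by introduction of an ePTS1 sequence at the 3' end. The sequence of the complete construct co-expressing mapacalcine and OgPDI is set forth in SEQ ID NO: 73. Additional cassettes include the nucleic acid sequences for human insulin precursor (SEQ ID NO: 74), alpha interferon (SEQ ID NO: 75), mapacalcine (SEQ ID NO: 76), and OgPDI (SEQ ID NO: 77).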
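The three-cassette layout described above can be modeled as a small data structure. The Python sketch below is illustrative only: the promoter, terminator, and marker names are hypothetical placeholders, and the position of the fluorescent tag upstream of the gene is an assumption, since the text specifies only that the tags are translational fusions and that ePTS1 sits at the 3' end.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cassette:
    """One expression cassette: promoter, expressed gene, terminator."""
    promoter: str
    gene: str
    terminator: str
    fluorescent_tag: Optional[str] = None  # GFP or mRuby, as a translational fusion
    peroxisome_tag: Optional[str] = None   # ePTS1 at the 3' end of the gene

    def layout(self) -> str:
        """Return the 5'-to-3' order of elements (tag placement is an assumption)."""
        parts = [self.promoter]
        if self.fluorescent_tag:
            parts.append(self.fluorescent_tag)
        parts.append(self.gene)
        if self.peroxisome_tag:
            parts.append(self.peroxisome_tag)
        parts.append(self.terminator)
        return " - ".join(parts)


# Placeholder promoter/terminator/marker names; only the genes and tags come from the text.
construct = [
    Cassette("promoter_A", "mapacalcine", "terminator_A", "GFP", "ePTS1"),
    Cassette("promoter_B", "OgPDI", "terminator_B", "mRuby", "ePTS1"),
    Cassette("promoter_C", "selectable_marker", "terminator_C"),
]

for cassette in construct:
    print(cassette.layout())
```

Swapping the gene of interest (for example, human insulin precursor or alpha interferon) changes only the first cassette, which mirrors how the same construct design is reused across the later examples.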
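Transformants expressing these cassettes are initially screened for the fluorescent markers to confirm targeting to the peroxisome. The heterologous protein of interest purified from the transformant strains is analyzed for disulfide formation by mass spectrometry.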
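Example 11: Phosphorylation
In some embodiments, the modification is phosphorylation. For example, co-expression of human beta-casein II (Greenberg et al, 1984; Thurmond et al, 1997) and a specific protein kinase, namely human casein kinase (Voss et al, 1991), which phosphorylates specific serine and threonine residues of casein, is examined. The codon-optimized sequence of human beta-casein II is set forth in SEQ ID NO: 78, and the codon-optimized sequence of casein kinase II subunit beta is set forth in SEQ ID NO: 79.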
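Constructs for transformation are generated in the same manner as used for the demonstration of disulfide bond formation (as described in Example 10). Casein is used as the gene of interest and casein kinase as the modifying enzyme. Phosphorylation is a major form of regulation within the peroxisome, and target casein expressed in the peroxisome may not require co-expression of casein kinase. Once produced, the recombinant casein is purified and analyzed by mass spectrometry for phosphorylated forms of threonine and serine. In some embodiments, phosphorylation activity is assayed in vitro.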
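Example 12: Acetylation
In some embodiments, the modification is N-terminal acetylation. For example, co-expression of hen egg ovalbumin (Ito & Matsudomi, 2005) and a specific acetylation complex, NatB (Rovere et al, 2008), which promotes acetylation of the N-terminal glycine, is examined. The codon-optimized sequence of ovalbumin is set forth in SEQ ID NO: 80, and the two genes corresponding to the yeast NatB complex (Naa20 and Naa25) are set forth in SEQ ID NOs: 81 and 82, respectively.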
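Constructs for transformation are generated in the same manner as used for the demonstration of disulfide bond formation (as described in Example 10). Ovalbumin is used as the gene of interest, and the two genes of the NatB complex constitute the modifying enzyme. Many yeast proteins are N-terminally acetylated, and target ovalbumin expressed in the peroxisome may show N-terminal acetylation even without co-expression of the NatB complex. Once produced, the recombinant ovalbumin is purified and analyzed by mass spectrometry for acetylation of the N-terminal glycine.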
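With respect to the use of plural and/or singular terms herein, those skilled in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for the sake of clarity.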
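In general, those skilled in the art will understand that terms used herein, and especially in the appended claims (for example, the bodies of the appended claims), are generally intended as "open" terms (for example, the term "including" should be interpreted as "including but not limited to," the term "having" should be interpreted as "having at least," the term "includes" should be interpreted as "includes but is not limited to," and so on). Those skilled in the art will further understand that if a specific number of a claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the phrases "at least one" and "one or more." However, the use of such phrases should not be construed to imply that a recitation introduced without them limits the claim to embodiments containing only one such recitation, even when the same claim contains both a recitation introduced with the phrase "one or more" or "at least one" and a recitation introduced without it (for example, a singular expression should be interpreted to mean "at least one" or "one or more"). In addition, even if a specific number is explicitly recited in a claim, those skilled in the art will recognize that such a recitation should be interpreted to mean at least the recited number (for example, the bare recitation of "two recitations," without other modifiers, means at least two recitations, or two or more recitations). Furthermore, where a convention analogous to "at least one of A, B, and C" is used, in general such a construction is intended in the sense in which one skilled in the art would understand the convention (for example, "a system having at least one of A, B, and C" would include, but not be limited to, systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together). Where a convention analogous to "at least one of A, B, or C" is used, in general such a construction is intended in the sense in which one skilled in the art would understand the convention (for example, "a system having at least one of A, B, or C" would include, but not be limited to, systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together). Those skilled in the art will further understand that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, the claims, or the drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" will be understood to include the possibilities of "A" or "B" or "A and B."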
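In addition, where features or aspects of the disclosure are described in terms of Markush groups, those skilled in the art will recognize that the disclosure is also thereby described in terms of any individual member or subgroup of members of the Markush group.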
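Any feature of an embodiment of any one aspect is applicable to all aspects and embodiments identified herein. Moreover, any feature of an embodiment of any one aspect is independently combinable, in part or in whole, with other embodiments described herein in any way; for example, one, two, three, or more embodiments may be combinable in whole or in part. Further, any feature of an embodiment of any one aspect may be made optional to other aspects or embodiments.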
SEQUENCE LISTING
<110> PROVENANCE BIO, LLC
THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
<120> EXPRESSION OF MODIFIED PROTEINS IN A
PEROXISOME
<130> PBFAB.001WO2
<140> PCT/US2020/032512
<141> 2020-05-12
<150> US 62/847769
<151> 2019-05-14
<160> 82
<170> FastSEQ for Windows Version 4.0
<210> 1
<211> 3
<212> PRT
<213> Artificial Sequence
<220>
<223> Peroxisome targeting sequence
<400> 1
Ser Leu Lys
1
<210> 2
<211> 9
<212> PRT
<213> Artificial Sequence
<220>
<223> Peroxisome targeting sequence
<220>
<221> VARIANT
<222> (3)...(7)
<223> Xaa = any amino acid
<220>
<221> VARIANT
<222> 8
<223> Xaa = H or Q
<400> 2
Arg Leu Xaa Xaa Xaa Xaa Xaa Xaa Leu
1 5
<210> 3
<211> 9
<212> PRT
<213> Artificial Sequence
<220>
<223> Peroxisome targeting sequence
<400> 3
Leu Gly Arg Gly Arg Arg Ser Lys Leu
1 5
<210> 4
<211> 16
<212> PRT
<213> Artificial Sequence
<220>
<223> Consensus sequence
<220>
<221> VARIANT
<222> 1,2,5-11,13,14,16
<223> Xaa = any amino acid
<220>
<221> VARIANT
<222> (3)...(4)
<223> Xaa = K or R
<220>
<221> VARIANT
<222> 12
<223> Xaa = T or S
<220>
<221> VARIANT
<222> 15
<223> Xaa = D or E
<400> 4
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
<210> 5
<211> 8
<212> PRT
<213> Artificial Sequence
<220>
<223> tag
<400> 5
Asp Tyr Lys Asp Asp Asp Asp Lys
1 5
<210> 6
<211> 6
<212> PRT
<213> Artificial Sequence
<220>
<223> tag
<400> 6
His His His His His His
1 5
<210> 7
<211> 26
<212> PRT
<213> Artificial Sequence
<220>
<223> calmodulin tag
<400> 7
Lys Arg Arg Trp Lys Lys Asn Phe Ile Ala Val Ser Ala Ala Asn Arg
1 5 10 15
Phe Lys Lys Ile Ser Ser Ser Gly Ala Leu
20 25
<210> 8
<211> 9
<212> PRT
<213> Artificial Sequence
<220>
<223> HA tag
<400> 8
Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1 5
<210> 9
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Myc tag
<400> 9
Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu
1 5 10
<210> 10
<211> 38
<212> PRT
<213> Artificial Sequence
<220>
<223> SBP tag
<400> 10
Met Asp Glu Lys Thr Thr Gly Trp Arg Gly Gly His Val Val Glu Gly
1 5 10 15
Leu Ala Gly Glu Leu Glu Gln Leu Arg Ala Arg Leu Glu His His Pro
20 25 30
Gln Gly Gln Arg Glu Pro
35
<210> 11
<211> 8
<212> PRT
<213> Artificial Sequence
<220>
<223> Strp tag
<400> 11
Trp Ser His Pro Gln Phe Glu Lys
1 5
<210> 12
<211> 27
<212> DNA
<213> Artificial Sequence
<220>
<223> ePTS1 tag
<400> 12
ttgggaagag gtagaagatc caaattg 27
<210> 13
<211> 238
<212> PRT
<213> Artificial Sequence
<220>
<223> GFP tag
<400> 13
Met Arg Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu
20 25 30
Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu
50 55 60
Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95
Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val
100 105 110
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
Phe Asn Ser His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser Val
165 170 175
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser
195 200 205
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235
<210> 14
<211> 714
<212> DNA
<213> Artificial Sequence
<220>
<223> GFP tag
<400> 14
atgcgtaaag gcgaagagct gttcactggt gtcgtcccta ttctggtgga actggatggt 60
gatgtcaacg gtcataagtt ttccgtgcgt ggcgagggtg aaggtgacgc aactaatggt 120
aaactgacgc tgaagttcat ctgtactact ggtaaactgc cggttccttg gccgactctg 180
gtaacgacgc tgacttatgg tgttcagtgc tttgctcgtt atccggacca tatgaagcag 240
catgacttct tcaagtccgc catgccggaa ggctatgtgc aggaacgcac gatttccttt 300
aaggatgacg gcacgtacaa aacgcgtgcg gaagtgaaat ttgaaggcga taccctggta 360
aaccgcattg agctgaaagg cattgacttt aaagaggacg gcaatatcct gggccataag 420
ctggaataca attttaacag ccacaatgtt tacatcaccg ccgataaaca aaaaaatggc 480
attaaagcga attttaaaat tcgccacaac gtggaggatg gcagcgtgca gctggctgat 540
cactaccagc aaaacactcc aatcggtgat ggtcctgttc tgctgccaga caatcactat 600
ctgagcacgc aaagcgttct gtctaaagat ccgaacgaga aacgcgatca tatggttctg 660
ctggagttcg taaccgcagc gggcatcacg catggtatgg atgaactgta caaa 714
<210> 15
<211> 4392
<212> DNA
<213> Artificial Sequence
<220>
<223> Btau COL1A1
<400> 15
atgttcagct ttgtggacct ccggctcctg ctcctcttag cggccaccgc cctcctgacg 60
cacggccaag aggagggcca ggaagaaggc caagaagaag acatcccacc agtcacctgc 120
gtacagaacg gcctcaggta ccatgaccga gacgtgtgga aacccgtgcc ctgccagatc 180
tgtgtctgcg acaacggcaa cgtgctgtgc gatgacgtga tctgcgacga acttaaggac 240
tgtcctaacg ccaaagtccc cacggacgaa tgctgccccg tctgccccga aggccaggaa 300
tcacccacgg accaagaaac caccggagtc gagggaccga aaggagacac tggcccccga 360
ggcccaaggg gacccgccgg cccccccggc cgagatggca tccctggaca acctggactt 420
cccggacccc ctggaccccc cggacctccc ggaccccctg gcctcggagg aaactttgct 480
ccccagttgt cttacggcta tgatgagaaa tcaacaggaa tttccgtgcc tggtcccatg 540
ggtccttctg gtcctcgtgg tctccctggc ccccctggcg cacctggtcc ccaaggtttc 600
caaggccccc ctggtgagcc tggcgagcca ggagcctcag gtcccatggg tccccgtggt 660
ccccctggcc cccctggcaa gaacggagat gatggcgaag ctggaaagcc tggtcgtcct 720
ggtgagcgcg ggcctcccgg acctcagggt gctcggggat tgcctggaac agctggcctc 780
cctggaatga agggacacag aggtttcagt ggtttggatg gtgccaaggg agatgctggt 840
cctgctggcc ccaagggcga gcctggtagc cccggtgaaa atggagctcc tggtcagatg 900
ggcccccgtg gtctgcctgg tgagagaggt cgccctggag cccctggccc tgctggtgct 960
cgaggaaatg atggtgcgac tggtgctgct gggccccctg gtcccactgg ccccgctggt 1020
cctcctggtt tccctggtgc tgtgggtgct aagggtgaag gtggtcccca aggaccccga 1080
ggttctgaag gtccccaggg tgtacgtggt gagcctggcc cccctggccc tgctggtgct 1140
gctggccctg ctggcaaccc tggtgctgat ggacagcctg gtgctaaagg agccaatggc 1200
gctcctggta ttgctggtgc tcctggcttc cctggtgccc gaggcccctc tggaccccag 1260
ggccccagcg gcccccctgg ccccaagggt aacagcggtg aacctggtgc tcctggcagc 1320
aaaggagaca ctggcgccaa gggagaaccc ggtcccactg gtattcaagg cccccctggc 1380
cccgctgggg aagaaggaaa gcgaggagcc cgaggtgaac ctggacctgc tggcctgcct 1440
ggaccccctg gcgagcgtgg tggacctgga agccgtggtt tccctggcgc cgacggtgtt 1500
gctggtccca agggtcctgc tggtgaacgc ggtgctcctg gccctgctgg ccccaaaggt 1560
tctcctggtg aagctggtcg ccccggtgaa gctggtctgc ccggtgccaa gggtctgact 1620
ggaagccctg gcagcccggg tcctgatggc aaaactggcc cccctggtcc cgccggtcaa 1680
gatggccgcc ctggacctcc aggccctccc ggtgcccgtg gtcaggctgg cgtgatgggt 1740
ttccctggac ctaaaggtgc tgctggagag cctggaaaag ctggagagcg aggtgttcct 1800
ggaccccctg gcgctgttgg tcctgctggc aaagacggag aagctggagc tcagggaccc 1860
ccaggacctg ctggccccgc tggtgagaga ggcgaacaag gccctgctgg ctcccctgga 1920
ttccagggtc tccccggccc tgctggtcct cctggtgaag caggcaaacc tggtgaacag 1980
ggtgttcctg gagatcttgg tgcccccggc ccctctggag caagaggcga gagaggtttc 2040
cccggcgagc gtggtgtgca agggccgccc ggtcctgcag gtccccgtgg ggccaatggt 2100
gcccctggca acgatggtgc taagggtgat gctggtgccc ctggagcccc cggtagccag 2160
ggtgcccctg gccttcaagg aatgcctggt gaacgaggtg cagctggtct tccaggccct 2220
aagggtgaca gaggggatgc tggtcccaaa ggtgctgatg gtgctcctgg caaagatggc 2280
gtccgtggtc tgactggtcc catcggtcct cctggccccg ctggtgcccc tggtgacaag 2340
ggtgaagctg gtcctagtgg cccagccggt cccactggag ctcgtggtgc ccccggtgac 2400
cgtggtgagc ctggtccccc cggccctgct ggcttcgctg gcccccctgg tgctgatggc 2460
caacctggtg ctaaaggcga acctggtgat gctggtgcta aaggtgacgc tggtcccccc 2520
ggccctgctg ggcccgctgg accccccggc cccattggta acgttggtgc tcccggaccc 2580
aaaggtgctc gtggcagcgc tggtccccct ggtgctactg gtttcccagg tgctgctggc 2640
cgagtcggtc cccccggccc ctctggaaat gctggacccc ctggccctcc tggccctgct 2700
ggcaaagaag gcagcaaagg cccccgcggt gagactggcc ccgctgggcg tcccggtgaa 2760
gtcggtcccc ctggtccccc tggccccgct ggtgagaaag gagcccctgg tgctgacgga 2820
cctgctggag ctcctggcac tcctggacct caaggtattg ctggacagcg tggtgtggtc 2880
ggcctgcctg gtcagagagg agaaagaggc ttccctggtc ttcctggccc ctctggtgaa 2940
cccggcaaac aaggtccttc tggagcaagt ggtgaacgtg gcccccctgg tcccatgggc 3000
ccccctggat tggctggacc ccctggcgag tctggacgtg agggagctcc tggtgctgaa 3060
ggatcccctg gacgagatgg ttctcctggc gccaagggtg accgtggtga gaccggccct 3120
gctggacctc ctggtgctcc tggcgctccc ggtgcccccg gccctgtcgg acctgccggc 3180
aagagcggtg atcgtggtga gaccggtcct gctggtcctg ctggtcccat tggccccgtt 3240
ggtgcccgtg gccccgctgg accccaaggc ccccgtggtg acaagggtga gacaggcgaa 3300
cagggcgaca gaggcattaa gggtcaccgt ggcttctctg gtctccaggg tccccccggc 3360
cctcccggct ctcctggtga gcaaggtcct tccggagcct ctggtcctgc tggtccccgc 3420
ggtccccctg gctctgctgg ttctcccggc aaagatggac tcaatggtct cccaggcccc 3480
atcggtcccc ctgggcctcg aggtcgcact ggtgatgctg gtcctgctgg tcctcccggc 3540
cctcctggac cccctggtcc cccaggtcct cccagcggcg gctacgactt gagcttcctg 3600
ccccagccac ctcaagagaa ggctcacgat ggtggccgct actaccgggc tgatgatgcc 3660
aatgtggtcc gtgaccgtga cctcgaggtg gacaccaccc tcaagagcct gagccagcag 3720
atcgagaaca tccggagccc tgaaggcagc cgcaagaacc ccgcccgcac ctgccgtgac 3780
ctcaagatgt gccactctga ctggaagagc ggagaatact ggattgaccc caaccaaggc 3840
tgcaacctgg atgccattaa ggtcttctgc aacatggaaa ccggtgagac ctgtgtatac 3900
cccactcagc ccagcgtggc ccagaagaac tggtatatca gcaagaaccc caaggaaaag 3960
aggcacgtct ggtacggcga gagcatgacc ggcggattcc agttcgagta tggcggccag 4020
gggtccgatc ctgccgatgt ggccatccag ctgactttcc tgcgcctgat gtccaccgag 4080
gcctcccaga acatcaccta ccactgcaag aacagcgtgg cctacatgga ccagcagact 4140
ggcaacctca agaaggccct gctcctccag ggctccaacg agatcgagat ccgggccgag 4200
ggcaacagcc gcttcaccta cagcgtcacc tacgatggct gcacgagtca caccggagcc 4260
tggggcaaga cagtgatcga atacaaaacc accaagacct cccgcttgcc catcatcgat 4320
gtggccccct tggacgttgg cgccccagac caggaattcg gcttcgacgt tggccctgcc 4380
tgcttcctgt aa 4392
<210> 16
<211> 1463
<212> PRT
<213> Artificial Sequence
<220>
<223> Btau COL1A1
<400> 16
Met Phe Ser Phe Val Asp Leu Arg Leu Leu Leu Leu Leu Ala Ala Thr
1 5 10 15
Ala Leu Leu Thr His Gly Gln Glu Glu Gly Gln Glu Glu Gly Gln Glu
20 25 30
Glu Asp Ile Pro Pro Val Thr Cys Val Gln Asn Gly Leu Arg Tyr His
35 40 45
Asp Arg Asp Val Trp Lys Pro Val Pro Cys Gln Ile Cys Val Cys Asp
50 55 60
Asn Gly Asn Val Leu Cys Asp Asp Val Ile Cys Asp Glu Leu Lys Asp
65 70 75 80
Cys Pro Asn Ala Lys Val Pro Thr Asp Glu Cys Cys Pro Val Cys Pro
85 90 95
Glu Gly Gln Glu Ser Pro Thr Asp Gln Glu Thr Thr Gly Val Glu Gly
100 105 110
Pro Lys Gly Asp Thr Gly Pro Arg Gly Pro Arg Gly Pro Ala Gly Pro
115 120 125
Pro Gly Arg Asp Gly Ile Pro Gly Gln Pro Gly Leu Pro Gly Pro Pro
130 135 140
Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Leu Gly Gly Asn Phe Ala
145 150 155 160
Pro Gln Leu Ser Tyr Gly Tyr Asp Glu Lys Ser Thr Gly Ile Ser Val
165 170 175
Pro Gly Pro Met Gly Pro Ser Gly Pro Arg Gly Leu Pro Gly Pro Pro
180 185 190
Gly Ala Pro Gly Pro Gln Gly Phe Gln Gly Pro Pro Gly Glu Pro Gly
195 200 205
Glu Pro Gly Ala Ser Gly Pro Met Gly Pro Arg Gly Pro Pro Gly Pro
210 215 220
Pro Gly Lys Asn Gly Asp Asp Gly Glu Ala Gly Lys Pro Gly Arg Pro
225 230 235 240
Gly Glu Arg Gly Pro Pro Gly Pro Gln Gly Ala Arg Gly Leu Pro Gly
245 250 255
Thr Ala Gly Leu Pro Gly Met Lys Gly His Arg Gly Phe Ser Gly Leu
260 265 270
Asp Gly Ala Lys Gly Asp Ala Gly Pro Ala Gly Pro Lys Gly Glu Pro
275 280 285
Gly Ser Pro Gly Glu Asn Gly Ala Pro Gly Gln Met Gly Pro Arg Gly
290 295 300
Leu Pro Gly Glu Arg Gly Arg Pro Gly Ala Pro Gly Pro Ala Gly Ala
305 310 315 320
Arg Gly Asn Asp Gly Ala Thr Gly Ala Ala Gly Pro Pro Gly Pro Thr
325 330 335
Gly Pro Ala Gly Pro Pro Gly Phe Pro Gly Ala Val Gly Ala Lys Gly
340 345 350
Glu Gly Gly Pro Gln Gly Pro Arg Gly Ser Glu Gly Pro Gln Gly Val
355 360 365
Arg Gly Glu Pro Gly Pro Pro Gly Pro Ala Gly Ala Ala Gly Pro Ala
370 375 380
Gly Asn Pro Gly Ala Asp Gly Gln Pro Gly Ala Lys Gly Ala Asn Gly
385 390 395 400
Ala Pro Gly Ile Ala Gly Ala Pro Gly Phe Pro Gly Ala Arg Gly Pro
405 410 415
Ser Gly Pro Gln Gly Pro Ser Gly Pro Pro Gly Pro Lys Gly Asn Ser
420 425 430
Gly Glu Pro Gly Ala Pro Gly Ser Lys Gly Asp Thr Gly Ala Lys Gly
435 440 445
Glu Pro Gly Pro Thr Gly Ile Gln Gly Pro Pro Gly Pro Ala Gly Glu
450 455 460
Glu Gly Lys Arg Gly Ala Arg Gly Glu Pro Gly Pro Ala Gly Leu Pro
465 470 475 480
Gly Pro Pro Gly Glu Arg Gly Gly Pro Gly Ser Arg Gly Phe Pro Gly
485 490 495
Ala Asp Gly Val Ala Gly Pro Lys Gly Pro Ala Gly Glu Arg Gly Ala
500 505 510
Pro Gly Pro Ala Gly Pro Lys Gly Ser Pro Gly Glu Ala Gly Arg Pro
515 520 525
Gly Glu Ala Gly Leu Pro Gly Ala Lys Gly Leu Thr Gly Ser Pro Gly
530 535 540
Ser Pro Gly Pro Asp Gly Lys Thr Gly Pro Pro Gly Pro Ala Gly Gln
545 550 555 560
Asp Gly Arg Pro Gly Pro Pro Gly Pro Pro Gly Ala Arg Gly Gln Ala
565 570 575
Gly Val Met Gly Phe Pro Gly Pro Lys Gly Ala Ala Gly Glu Pro Gly
580 585 590
Lys Ala Gly Glu Arg Gly Val Pro Gly Pro Pro Gly Ala Val Gly Pro
595 600 605
Ala Gly Lys Asp Gly Glu Ala Gly Ala Gln Gly Pro Pro Gly Pro Ala
610 615 620
Gly Pro Ala Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Pro Gly
625 630 635 640
Phe Gln Gly Leu Pro Gly Pro Ala Gly Pro Pro Gly Glu Ala Gly Lys
645 650 655
Pro Gly Glu Gln Gly Val Pro Gly Asp Leu Gly Ala Pro Gly Pro Ser
660 665 670
Gly Ala Arg Gly Glu Arg Gly Phe Pro Gly Glu Arg Gly Val Gln Gly
675 680 685
Pro Pro Gly Pro Ala Gly Pro Arg Gly Ala Asn Gly Ala Pro Gly Asn
690 695 700
Asp Gly Ala Lys Gly Asp Ala Gly Ala Pro Gly Ala Pro Gly Ser Gln
705 710 715 720
Gly Ala Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly Ala Ala Gly
725 730 735
Leu Pro Gly Pro Lys Gly Asp Arg Gly Asp Ala Gly Pro Lys Gly Ala
740 745 750
Asp Gly Ala Pro Gly Lys Asp Gly Val Arg Gly Leu Thr Gly Pro Ile
755 760 765
Gly Pro Pro Gly Pro Ala Gly Ala Pro Gly Asp Lys Gly Glu Ala Gly
770 775 780
Pro Ser Gly Pro Ala Gly Pro Thr Gly Ala Arg Gly Ala Pro Gly Asp
785 790 795 800
Arg Gly Glu Pro Gly Pro Pro Gly Pro Ala Gly Phe Ala Gly Pro Pro
805 810 815
Gly Ala Asp Gly Gln Pro Gly Ala Lys Gly Glu Pro Gly Asp Ala Gly
820 825 830
Ala Lys Gly Asp Ala Gly Pro Pro Gly Pro Ala Gly Pro Ala Gly Pro
835 840 845
Pro Gly Pro Ile Gly Asn Val Gly Ala Pro Gly Pro Lys Gly Ala Arg
850 855 860
Gly Ser Ala Gly Pro Pro Gly Ala Thr Gly Phe Pro Gly Ala Ala Gly
865 870 875 880
Arg Val Gly Pro Pro Gly Pro Ser Gly Asn Ala Gly Pro Pro Gly Pro
885 890 895
Pro Gly Pro Ala Gly Lys Glu Gly Ser Lys Gly Pro Arg Gly Glu Thr
900 905 910
Gly Pro Ala Gly Arg Pro Gly Glu Val Gly Pro Pro Gly Pro Pro Gly
915 920 925
Pro Ala Gly Glu Lys Gly Ala Pro Gly Ala Asp Gly Pro Ala Gly Ala
930 935 940
Pro Gly Thr Pro Gly Pro Gln Gly Ile Ala Gly Gln Arg Gly Val Val
945 950 955 960
Gly Leu Pro Gly Gln Arg Gly Glu Arg Gly Phe Pro Gly Leu Pro Gly
965 970 975
Pro Ser Gly Glu Pro Gly Lys Gln Gly Pro Ser Gly Ala Ser Gly Glu
980 985 990
Arg Gly Pro Pro Gly Pro Met Gly Pro Pro Gly Leu Ala Gly Pro Pro
995 1000 1005
Gly Glu Ser Gly Arg Glu Gly Ala Pro Gly Ala Glu Gly Ser Pro Gly
1010 1015 1020
Arg Asp Gly Ser Pro Gly Ala Lys Gly Asp Arg Gly Glu Thr Gly Pro
1025 1030 1035 1040
Ala Gly Pro Pro Gly Ala Pro Gly Ala Pro Gly Ala Pro Gly Pro Val
1045 1050 1055
Gly Pro Ala Gly Lys Ser Gly Asp Arg Gly Glu Thr Gly Pro Ala Gly
1060 1065 1070
Pro Ala Gly Pro Ile Gly Pro Val Gly Ala Arg Gly Pro Ala Gly Pro
1075 1080 1085
Gln Gly Pro Arg Gly Asp Lys Gly Glu Thr Gly Glu Gln Gly Asp Arg
1090 1095 1100
Gly Ile Lys Gly His Arg Gly Phe Ser Gly Leu Gln Gly Pro Pro Gly
1105 1110 1115 1120
Pro Pro Gly Ser Pro Gly Glu Gln Gly Pro Ser Gly Ala Ser Gly Pro
1125 1130 1135
Ala Gly Pro Arg Gly Pro Pro Gly Ser Ala Gly Ser Pro Gly Lys Asp
1140 1145 1150
Gly Leu Asn Gly Leu Pro Gly Pro Ile Gly Pro Pro Gly Pro Arg Gly
1155 1160 1165
Arg Thr Gly Asp Ala Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Pro
1170 1175 1180
Pro Gly Pro Pro Gly Pro Pro Ser Gly Gly Tyr Asp Leu Ser Phe Leu
1185 1190 1195 1200
Pro Gln Pro Pro Gln Glu Lys Ala His Asp Gly Gly Arg Tyr Tyr Arg
1205 1210 1215
Ala Asp Asp Ala Asn Val Val Arg Asp Arg Asp Leu Glu Val Asp Thr
1220 1225 1230
Thr Leu Lys Ser Leu Ser Gln Gln Ile Glu Asn Ile Arg Ser Pro Glu
1235 1240 1245
Gly Ser Arg Lys Asn Pro Ala Arg Thr Cys Arg Asp Leu Lys Met Cys
1250 1255 1260
His Ser Asp Trp Lys Ser Gly Glu Tyr Trp Ile Asp Pro Asn Gln Gly
1265 1270 1275 1280
Cys Asn Leu Asp Ala Ile Lys Val Phe Cys Asn Met Glu Thr Gly Glu
1285 1290 1295
Thr Cys Val Tyr Pro Thr Gln Pro Ser Val Ala Gln Lys Asn Trp Tyr
1300 1305 1310
Ile Ser Lys Asn Pro Lys Glu Lys Arg His Val Trp Tyr Gly Glu Ser
1315 1320 1325
Met Thr Gly Gly Phe Gln Phe Glu Tyr Gly Gly Gln Gly Ser Asp Pro
1330 1335 1340
Ala Asp Val Ala Ile Gln Leu Thr Phe Leu Arg Leu Met Ser Thr Glu
1345 1350 1355 1360
Ala Ser Gln Asn Ile Thr Tyr His Cys Lys Asn Ser Val Ala Tyr Met
1365 1370 1375
Asp Gln Gln Thr Gly Asn Leu Lys Lys Ala Leu Leu Leu Gln Gly Ser
1380 1385 1390
Asn Glu Ile Glu Ile Arg Ala Glu Gly Asn Ser Arg Phe Thr Tyr Ser
1395 1400 1405
Val Thr Tyr Asp Gly Cys Thr Ser His Thr Gly Ala Trp Gly Lys Thr
1410 1415 1420
Val Ile Glu Tyr Lys Thr Thr Lys Thr Ser Arg Leu Pro Ile Ile Asp
1425 1430 1435 1440
Val Ala Pro Leu Asp Val Gly Ala Pro Asp Gln Glu Phe Gly Phe Asp
1445 1450 1455
Val Gly Pro Ala Cys Phe Leu
1460
<210> 17
<211> 4095
<212> DNA
<213> Artificial Sequence
<220>
<223> Btau COL1A1
<400> 17
atgctcagct ttgtggatac gcggactttg ttgctgcttg cagtaacttc gtgcctagca 60
acatgccaat ccttacaaga ggcaactgca agaaagggcc caagtggaga tagaggacca 120
cgcggagaaa ggggtccacc aggcccacca ggcagagatg gtgatgacgg catcccaggc 180
cctcctggcc cccctggccc tcctggcccc cctggtcttg gcgggaactt tgctgctcag 240
tttgatgcaa aaggaggtgg ccctggacca atggggctga tgggacctcg cggccctcct 300
ggggcttctg gagcccctgg ccctcaaggt ttccagggac ctccgggtga gcctggtgaa 360
cctggtcaga ctggtcctgc aggtgctcgt ggcccgcctg gccctcctgg caaggctggt 420
gaggatggtc accctggaaa acctggacga cctggtgaga gaggggttgt tggaccacag 480
ggtgctcgtg gctttcctgg aactcctgga ctccctggct tcaagggcat taggggtcac 540
aatggtctgg atggattgaa gggacagcct ggtgctccag gtgtgaaggg tgaacctggt 600
gcccctggtg aaaatggaac tccaggtcaa acgggagccc gtggtcttcc tggtgagaga 660
ggacgtgttg gtgcccctgg cccagctggt gcccgtggaa gtgatggaag tgtgggtcct 720
gtgggccctg ctggtcccat tgggtctgct ggccctccag gcttcccagg tgctcctggc 780
cccaagggtg aactcggacc tgttggtaac cctggccctg ctggtcccgc gggtccccgt 840
ggtgaagtgg gtctcccagg cctttctggc cctgtcggac ctcctggaaa ccccggagcc 900
aatgggcttc ctggcgctaa gggtgctgct ggccttcccg gtgttgctgg ggctcccggc 960
ctccctggac cccggggtat tcctggccct gttggcgctg ctggtgctac tggcgccaga 1020
ggacttgttg gtgagcccgg cccagctggt tcgaaaggag agagcggcaa caagggcgag 1080
cctggtgctg ttgggcagcc aggtcctcct ggccccagtg gtgaagaagg aaagagaggc 1140
tccactggag aaatcggacc cgctggcccc ccaggacctc ctgggctgag gggaaatcct 1200
ggctcccgtg gtctacctgg agctgacggc agagctggtg tcatgggtcc tgctggtagc 1260
cgtggtgcaa ctggccctgc tggtgtgcga ggtcccaatg gagattctgg tcgccctgga 1320
gagcctggcc tcatgggacc ccgaggtttc ccaggttccc ctggaaatat cggcccagct 1380
ggtaaagaag gtcctgtggg tctccctggt attgacggca gacctgggcc cattggccca 1440
gcgggagcaa gaggagagcc tggcaacatt ggattccctg gacccaaagg ccccagtggt 1500
gatcctggca aagctggtga aaaaggtcat gctggtcttg ctggtgctcg gggcgctcca 1560
ggtcccgatg gcaacaacgg tgctcaggga ccccctggac tacagggtgt ccaaggtgga 1620
aaaggtgaac agggtcctgc tggtcctcca ggcttccagg gtctgcctgg ccctgcaggc 1680
acagctggtg aagctggcaa accaggagaa aggggtatcc ctggtgaatt tggtctccct 1740
ggccctgctg gtgcaagagg ggagcggggg cccccaggtg aaagtggtgc tgctgggcct 1800
actgggccta ttggaagccg aggtccttct ggacccccag ggcctgatgg aaacaagggt 1860
gaaccgggtg tggttggcgc tccaggcact gctggcccat ctggtcctag cggactccca 1920
ggagagaggg gtgcggctgg cattcctgga ggcaagggag aaaagggtga aactggtctc 1980
agaggtgaca ttggtagccc tggtagagat ggtgctcgtg gtgctcctgg tgctattggt 2040
gctcctggcc ctgctggagc caatggggac cggggtgaag ctggtcccgc tggccctgct 2100
ggccctgctg gtcctcgtgg tagccctggt gaacgtggtg aggtcggtcc cgctggcccc 2160
aacggatttg ctggtcctgc tggtgctgct ggtcaacctg gtgctaaagg agagagagga 2220
accaaaggac ccaagggtga aaatggtcct gttggtccca caggccccgt tggagctgcc 2280
ggtccgtctg gtccaaatgg cccacctggt cctgctggaa gtcgtggtga tggagggccc 2340
cctggggcta ctggtttccc tggtgctgct ggacggactg gtccccctgg accctctggt 2400
atctctggcc cccctggccc ccctggtcct gctggtaaag aagggcttcg tgggcctcgt 2460
ggtgaccaag gtccagttgg tcgaagtgga gagacaggtg cctctggccc tcctggcttt 2520
gttggtgaga agggtccctc tggagagcct ggtactgctg ggcctcctgg aaccccaggt 2580
ccacaaggcc ttcttggtgc tcctggtttt ctgggtctcc caggctctag aggtgagcgt 2640
ggtctaccag gtgtcgctgg atctgtgggt gaacctggcc ccctcggcat cgcaggccca 2700
cctggggccc gtggtccccc tggtaatgtc ggtaatcctg gcgtcaatgg tgctcctggt 2760
gaagccggtc gtgacggcaa ccctgggaat gacggtcccc caggccgcga tggtcaaccc 2820
ggacacaagg gggagcgtgg ttaccccggt aacgcaggtc ctgttggtgc tgccggtgct 2880
cctggccctc aaggccctgt gggtcccgtt ggtaaacacg gaaaccgtgg tgaaccgggt 2940
cctgccggtg ctgttggtcc tgctggtgcc gttggcccaa gaggtcccag tggcccacaa 3000
ggtattcgag gtgacaaggg agagcctggt gataagggtc ccagaggtct tcctggctta 3060
aagggacaca atgggttgca aggtctcccg ggtcttgctg gtcatcatgg cgatcaaggt 3120
gctcccggtg ctgtgggtcc cgctggtccc aggggccctg ctggtccttc tggccccgct 3180
ggcaaagacg gtcgcattgg acagcctggt gcagtcggac ctgctggcat tcgtggctct 3240
cagggtagcc aaggtcctgc tggccctcct ggtccccctg gccctcctgg acctcctggc 3300
ccaagtggtg gtggttacga gtttggtttt gatggagact tctacagggc tgaccagcct 3360
cgctcaccaa cttctctcag acccaaggat tatgaagttg atgctactct gaaatctctc 3420
aacaaccaga ttgagaccct tcttactcca gaaggctcta ggaagaaccc agctcgcaca 3480
tgccgagact tgagactcag ccacccagaa tggagcagtg gttactactg gattgaccct 3540
aaccaaggat gtactatgga tgctatcaaa gtatactgtg atttctctac tggcgaaacc 3600
tgcatccggg ctcaacctga agacatccca gtcaagaact ggtacagaaa ttccaaggcc 3660
aagaagcatg tctgggtagg agaaactatc aacggtggta cccagtttga atataatgtt 3720
gaaggagtaa ccaccaagga aatggctacc caacttgcct tcatgcgtct gctggccaac 3780
catgcctctc agaacatcac ctaccattgc aagaacagca ttgcatacat ggatgaggaa 3840
actggcaacc tgaaaaaggc tgtcattctg caaggatcca atgatgtcga acttgttgcc 3900
gagggcaaca gcagattcac ttacactgtt cttgtagatg gctgctctaa aaagacaaat 3960
gaatggcaga agacaatcat tgaatataaa acaaacaagc catctcgcct gcctatcctt 4020
gatattgcac ctttggacat cggtggcgct gaccaagaaa tcagattgaa cattggccca 4080
gtctgtttca aataa 4095
<210> 18
<211> 1364
<212> PRT
<213> Artificial Sequence
<220>
<223> Btau COL1A1
<400> 18
Met Leu Ser Phe Val Asp Thr Arg Thr Leu Leu Leu Leu Ala Val Thr
1 5 10 15
Ser Cys Leu Ala Thr Cys Gln Ser Leu Gln Glu Ala Thr Ala Arg Lys
20 25 30
Gly Pro Ser Gly Asp Arg Gly Pro Arg Gly Glu Arg Gly Pro Pro Gly
35 40 45
Pro Pro Gly Arg Asp Gly Asp Asp Gly Ile Pro Gly Pro Pro Gly Pro
50 55 60
Pro Gly Pro Pro Gly Pro Pro Gly Leu Gly Gly Asn Phe Ala Ala Gln
65 70 75 80
Phe Asp Ala Lys Gly Gly Gly Pro Gly Pro Met Gly Leu Met Gly Pro
85 90 95
Arg Gly Pro Pro Gly Ala Ser Gly Ala Pro Gly Pro Gln Gly Phe Gln
100 105 110
Gly Pro Pro Gly Glu Pro Gly Glu Pro Gly Gln Thr Gly Pro Ala Gly
115 120 125
Ala Arg Gly Pro Pro Gly Pro Pro Gly Lys Ala Gly Glu Asp Gly His
130 135 140
Pro Gly Lys Pro Gly Arg Pro Gly Glu Arg Gly Val Val Gly Pro Gln
145 150 155 160
Gly Ala Arg Gly Phe Pro Gly Thr Pro Gly Leu Pro Gly Phe Lys Gly
165 170 175
Ile Arg Gly His Asn Gly Leu Asp Gly Leu Lys Gly Gln Pro Gly Ala
180 185 190
Pro Gly Val Lys Gly Glu Pro Gly Ala Pro Gly Glu Asn Gly Thr Pro
195 200 205
Gly Gln Thr Gly Ala Arg Gly Leu Pro Gly Glu Arg Gly Arg Val Gly
210 215 220
Ala Pro Gly Pro Ala Gly Ala Arg Gly Ser Asp Gly Ser Val Gly Pro
225 230 235 240
Val Gly Pro Ala Gly Pro Ile Gly Ser Ala Gly Pro Pro Gly Phe Pro
245 250 255
Gly Ala Pro Gly Pro Lys Gly Glu Leu Gly Pro Val Gly Asn Pro Gly
260 265 270
Pro Ala Gly Pro Ala Gly Pro Arg Gly Glu Val Gly Leu Pro Gly Leu
275 280 285
Ser Gly Pro Val Gly Pro Pro Gly Asn Pro Gly Ala Asn Gly Leu Pro
290 295 300
Gly Ala Lys Gly Ala Ala Gly Leu Pro Gly Val Ala Gly Ala Pro Gly
305 310 315 320
Leu Pro Gly Pro Arg Gly Ile Pro Gly Pro Val Gly Ala Ala Gly Ala
325 330 335
Thr Gly Ala Arg Gly Leu Val Gly Glu Pro Gly Pro Ala Gly Ser Lys
340 345 350
Gly Glu Ser Gly Asn Lys Gly Glu Pro Gly Ala Val Gly Gln Pro Gly
355 360 365
Pro Pro Gly Pro Ser Gly Glu Glu Gly Lys Arg Gly Ser Thr Gly Glu
370 375 380
Ile Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Leu Arg Gly Asn Pro
385 390 395 400
Gly Ser Arg Gly Leu Pro Gly Ala Asp Gly Arg Ala Gly Val Met Gly
405 410 415
Pro Ala Gly Ser Arg Gly Ala Thr Gly Pro Ala Gly Val Arg Gly Pro
420 425 430
Asn Gly Asp Ser Gly Arg Pro Gly Glu Pro Gly Leu Met Gly Pro Arg
435 440 445
Gly Phe Pro Gly Ser Pro Gly Asn Ile Gly Pro Ala Gly Lys Glu Gly
450 455 460
Pro Val Gly Leu Pro Gly Ile Asp Gly Arg Pro Gly Pro Ile Gly Pro
465 470 475 480
Ala Gly Ala Arg Gly Glu Pro Gly Asn Ile Gly Phe Pro Gly Pro Lys
485 490 495
Gly Pro Ser Gly Asp Pro Gly Lys Ala Gly Glu Lys Gly His Ala Gly
500 505 510
Leu Ala Gly Ala Arg Gly Ala Pro Gly Pro Asp Gly Asn Asn Gly Ala
515 520 525
Gln Gly Pro Pro Gly Leu Gln Gly Val Gln Gly Gly Lys Gly Glu Gln
530 535 540
Gly Pro Ala Gly Pro Pro Gly Phe Gln Gly Leu Pro Gly Pro Ala Gly
545 550 555 560
Thr Ala Gly Glu Ala Gly Lys Pro Gly Glu Arg Gly Ile Pro Gly Glu
565 570 575
Phe Gly Leu Pro Gly Pro Ala Gly Ala Arg Gly Glu Arg Gly Pro Pro
580 585 590
Gly Glu Ser Gly Ala Ala Gly Pro Thr Gly Pro Ile Gly Ser Arg Gly
595 600 605
Pro Ser Gly Pro Pro Gly Pro Asp Gly Asn Lys Gly Glu Pro Gly Val
610 615 620
Val Gly Ala Pro Gly Thr Ala Gly Pro Ser Gly Pro Ser Gly Leu Pro
625 630 635 640
Gly Glu Arg Gly Ala Ala Gly Ile Pro Gly Gly Lys Gly Glu Lys Gly
645 650 655
Glu Thr Gly Leu Arg Gly Asp Ile Gly Ser Pro Gly Arg Asp Gly Ala
660 665 670
Arg Gly Ala Pro Gly Ala Ile Gly Ala Pro Gly Pro Ala Gly Ala Asn
675 680 685
Gly Asp Arg Gly Glu Ala Gly Pro Ala Gly Pro Ala Gly Pro Ala Gly
690 695 700
Pro Arg Gly Ser Pro Gly Glu Arg Gly Glu Val Gly Pro Ala Gly Pro
705 710 715 720
Asn Gly Phe Ala Gly Pro Ala Gly Ala Ala Gly Gln Pro Gly Ala Lys
725 730 735
Gly Glu Arg Gly Thr Lys Gly Pro Lys Gly Glu Asn Gly Pro Val Gly
740 745 750
Pro Thr Gly Pro Val Gly Ala Ala Gly Pro Ser Gly Pro Asn Gly Pro
755 760 765
Pro Gly Pro Ala Gly Ser Arg Gly Asp Gly Gly Pro Pro Gly Ala Thr
770 775 780
Gly Phe Pro Gly Ala Ala Gly Arg Thr Gly Pro Pro Gly Pro Ser Gly
785 790 795 800
Ile Ser Gly Pro Pro Gly Pro Pro Gly Pro Ala Gly Lys Glu Gly Leu
805 810 815
Arg Gly Pro Arg Gly Asp Gln Gly Pro Val Gly Arg Ser Gly Glu Thr
820 825 830
Gly Ala Ser Gly Pro Pro Gly Phe Val Gly Glu Lys Gly Pro Ser Gly
835 840 845
Glu Pro Gly Thr Ala Gly Pro Pro Gly Thr Pro Gly Pro Gln Gly Leu
850 855 860
Leu Gly Ala Pro Gly Phe Leu Gly Leu Pro Gly Ser Arg Gly Glu Arg
865 870 875 880
Gly Leu Pro Gly Val Ala Gly Ser Val Gly Glu Pro Gly Pro Leu Gly
885 890 895
Ile Ala Gly Pro Pro Gly Ala Arg Gly Pro Pro Gly Asn Val Gly Asn
900 905 910
Pro Gly Val Asn Gly Ala Pro Gly Glu Ala Gly Arg Asp Gly Asn Pro
915 920 925
Gly Asn Asp Gly Pro Pro Gly Arg Asp Gly Gln Pro Gly His Lys Gly
930 935 940
Glu Arg Gly Tyr Pro Gly Asn Ala Gly Pro Val Gly Ala Ala Gly Ala
945 950 955 960
Pro Gly Pro Gln Gly Pro Val Gly Pro Val Gly Lys His Gly Asn Arg
965 970 975
Gly Glu Pro Gly Pro Ala Gly Ala Val Gly Pro Ala Gly Ala Val Gly
980 985 990
Pro Arg Gly Pro Ser Gly Pro Gln Gly Ile Arg Gly Asp Lys Gly Glu
995 1000 1005
Pro Gly Asp Lys Gly Pro Arg Gly Leu Pro Gly Leu Lys Gly His Asn
1010 1015 1020
Gly Leu Gln Gly Leu Pro Gly Leu Ala Gly His His Gly Asp Gln Gly
1025 1030 1035 1040
Ala Pro Gly Ala Val Gly Pro Ala Gly Pro Arg Gly Pro Ala Gly Pro
1045 1050 1055
Ser Gly Pro Ala Gly Lys Asp Gly Arg Ile Gly Gln Pro Gly Ala Val
1060 1065 1070
Gly Pro Ala Gly Ile Arg Gly Ser Gln Gly Ser Gln Gly Pro Ala Gly
1075 1080 1085
Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Ser Gly Gly
1090 1095 1100
Gly Tyr Glu Phe Gly Phe Asp Gly Asp Phe Tyr Arg Ala Asp Gln Pro
1105 1110 1115 1120
Arg Ser Pro Thr Ser Leu Arg Pro Lys Asp Tyr Glu Val Asp Ala Thr
1125 1130 1135
Leu Lys Ser Leu Asn Asn Gln Ile Glu Thr Leu Leu Thr Pro Glu Gly
1140 1145 1150
Ser Arg Lys Asn Pro Ala Arg Thr Cys Arg Asp Leu Arg Leu Ser His
1155 1160 1165
Pro Glu Trp Ser Ser Gly Tyr Tyr Trp Ile Asp Pro Asn Gln Gly Cys
1170 1175 1180
Thr Met Asp Ala Ile Lys Val Tyr Cys Asp Phe Ser Thr Gly Glu Thr
1185 1190 1195 1200
Cys Ile Arg Ala Gln Pro Glu Asp Ile Pro Val Lys Asn Trp Tyr Arg
1205 1210 1215
Asn Ser Lys Ala Lys Lys His Val Trp Val Gly Glu Thr Ile Asn Gly
1220 1225 1230
Gly Thr Gln Phe Glu Tyr Asn Val Glu Gly Val Thr Thr Lys Glu Met
1235 1240 1245
Ala Thr Gln Leu Ala Phe Met Arg Leu Leu Ala Asn His Ala Ser Gln
1250 1255 1260
Asn Ile Thr Tyr His Cys Lys Asn Ser Ile Ala Tyr Met Asp Glu Glu
1265 1270 1275 1280
Thr Gly Asn Leu Lys Lys Ala Val Ile Leu Gln Gly Ser Asn Asp Val
1285 1290 1295
Glu Leu Val Ala Glu Gly Asn Ser Arg Phe Thr Tyr Thr Val Leu Val
1300 1305 1310
Asp Gly Cys Ser Lys Lys Thr Asn Glu Trp Gln Lys Thr Ile Ile Glu
1315 1320 1325
Tyr Lys Thr Asn Lys Pro Ser Arg Leu Pro Ile Leu Asp Ile Ala Pro
1330 1335 1340
Leu Asp Ile Gly Gly Ala Asp Gln Glu Ile Arg Leu Asn Ile Gly Pro
1345 1350 1355 1360
Val Cys Phe Lys
<210> 19
<211> 4368
<212> DNA
<213> Artificial Sequence
<220>
<223> Amis COL1A1
<400> 19
atgttcagct ttgtggattc tcggttactg ctgttgatag cagcgactgt actactcacc 60
aaaggtcaag gagaagaaga cattcaaact ggaagctgca tacaggatgg actagcgtac 120
aacaacacag acgtatggaa acccgagccc tgccagatct gcgtatgcga caatggcaac 180
atcctgtgtg acgatgtcat ctgtgatgat acctcggact gtaccaatgc tgagatcccc 240
tttggagaat gctgtcccat ctgtcctgac accgctggct cttctaccta ccccaaatcc 300
actggagtag agggtcctaa gggagacact ggccccagag gacagagggg actcccaggc 360
ccacctggca gagatggcat tcctggacag cctggtctcc ctggactccc aggacctcca 420
ggccctcctg gccttggtgg aaacttcgct cctcaaatgg cttacggtta cggagatgaa 480
accaaatctg ctggcatttc tgtccctgga cccatgggtc cagctggccc ccgtggtctc 540
cccggccccc ctggttctcc tggtcctcaa ggtttccaag gtcctcctgg agagcctgga 600
gagcctggtg cttcaggtcc aatgggtccc cgtggtccag ccggcccccc tggcaagaac 660
ggagatgatg gtgaagctgg aaagcccggc cgtcccggtg agcgcggccc tcctggcccc 720
cagggtgcac gtggtctgcc cggaactgct ggcctgccag gcatgaaggg tcacagaggt 780
ttcagtggtc tggatggtgc taagggtgat gctggtccat ccggccccaa gggtgagcct 840
ggtagccctg gtgagaacgg agctcctgga caaatgggcc ctcgtggtct tcccggtgag 900
agaggccgcc ctggtccatc tggccctgct ggtgctcgtg gtaacgatgg tagtcctggt 960
gctgctggcc ctccaggtcc aactggccca gctggccccc ctggcttccc tggtgctgct 1020
ggtgctaagg gtgaaactgg tcctcaaggt tctcgtggta gtgaaggccc acagggtgct 1080
cgtggtgagc ctggtcctcc tggccctgct ggtgctgctg gtcctgctgg caaccctggt 1140
tctgatggtc aagctggtgc caaaggtgca actggtgctc ctggtattgc tggtgctcct 1200
ggcttccctg gcgctcgtgg cccatctgga ccccagggtc ccagcggtgc tcctggcccc 1260
aagggtaaca gtggtgaacc cggtgctcaa ggcaacaagg gagacactgg tgcaaaagga 1320
gagcctggtc ctgctggtgt ccaaggccca cctggtccag ctggtgaaga aggcaagaga 1380
ggagcccgtg gtgagcccgg ccctggaggt cttcctggcc ctgctggcga acgtggtgct 1440
cctggaagcc gtggtttccc tggcgctgat ggcatttctg gtcccaaggg tccccctggt 1500
gaacgtggtt cccctggccc tgctggtccc aaaggatcta ctggtgaatc tggacgccct 1560
ggtgagcctg gtctccctgg tgccaagggt cttactggaa gcccaggtag cccaggtcct 1620
gatggcaaga ctggtccacc tggccccgct ggtcaagatg gtcgcccagg acccccaggc 1680
ccacctggtg ccagaggtca ggctggtgtg atgggtttcc ctggacctaa aggtgctgct 1740
ggtgagcctg gcaaacctgg tgagagagga gctcctggac cccctggtgc tgttggcgca 1800
gctggtaagg atggtgaagc tggtgcccaa ggttctcctg gcgctgctgg tcctgctgga 1860
gagagaggtg aacaaggtcc tgctggtgct cctggattcc agggtctgcc cggtcctgct 1920
ggcccatctg gtgaatctgg caagcctggt gaacagggtg ttcctggaga tgctggtgct 1980
cctggtccag ctggtgcaag aggcgagaga ggtttccctg gtgagcgtgg tgtccaaggt 2040
caaccaggtc cacagggtcc acgtggtgct aacggtgctc ccggtaacga tggtgctaag 2100
ggtgatgctg gtgctcctgg tgctcctggt ggccaaggtc ctcccggtct gcagggtatg 2160
cctggtgagc gtggtgctgc tggtctgcct ggttccaagg gtgacagagg cgatcctggt 2220
cccaaaggca ctgatggtgc tcctggcaaa gatggcgtca gaggtctaac tggccctatt 2280
ggtcctcctg gcccagctgg tgcccctggt gacaagggtg aagctggtcc ttctggccct 2340
gctggtccca ctggttctcg tggtgcccct ggagatcgtg gtgagcctgg tccacctggc 2400
cctgctggat tcgctggtcc ccctggtgct gatggacaac ctggtgctaa aggtgaatct 2460
ggtgatgctg gtgctaaagg tgatgctggt cctccaggcc ctgctggacc cactggtgct 2520
cctggacctt ctggcgctgt tggtgctcct ggacccaaag gtgctcgtgg tagtgctgga 2580
ccccctggtg ctactggttt ccctggtgct gctggaagag ttggtccacc tggccctgct 2640
ggtaacgtcg gtcttcctgg cccatcaggc cccagtggaa aagaaggctc taaaggaccc 2700
cgtggtgaga ctggccctgc tggacgcccc ggtgaacctg gacctgctgg cccaccagga 2760
ccttctggcg agaagggctc tcctggtggt gatggtcccg ctggtgctcc tggtactcca 2820
ggcccacagg gtattgctgg acagcgtggt gtagttggtc ttcctggaca gagaggcgag 2880
agaggtttcc ctggtctccc cggcccatct ggcgaacctg gcaaacaagg tccatctggc 2940
tcctctggtg aacgcggtcc tcctggtcca atgggaccac ctggcttggc tggacctcct 3000
ggtgaagctg gacgtgaggg tgctcctggt tctgaaggtg ctcctggtcg cgatggcgct 3060
gctggtccca agggtgaccg tggtgagact ggcccctctg gtcctcctgg tgctcccggt 3120
gcccctggag ctcctggccc tattggccct gctggcaaga atggagatcg tggtgagact 3180
ggtccttctg gtcctgctgg ccctgccggt cctgctggtg ctcgtggtcc tgctggtcca 3240
caaggtgccc gtggtgacaa aggtgaaact ggagaacatg gtgacagagg catgaagggt 3300
cacagaggat tccctggtcc ccagggtccc tctggtcctg ctggctctcc tggtgaacaa 3360
ggtccttctg gagcttccgg ccctgctggt ccaagaggtc ctcctggctc tgctggcacc 3420
cctggcaaag atggtctgaa tggtctccct ggccctattg gtccacctgg tccccggggt 3480
cgcactggtg atgttggtcc tgctggtccc cctggacctc ctgggccccc aggtcctcct 3540
ggtgcaccca gcggcggctt tgacttcagc ttcatgcccc agcctcctca ggagaaagcc 3600
catgatcctg gccgctacta cagagctgat gacgccaacg tgatgcgtga ccgtgacctg 3660
gaggtggaca ccaccctcaa gagcctgagc cagcagatcg agaacatccg cagccccgag 3720
ggcaccagga agaaccctgc ccgcacctgc cgtgacctga agatgtgcca caatgactgg 3780
aagagcggcg agtactggat tgaccccaac cagggctgca atctggatgc catcaaggtc 3840
tactgtaaca tggagactgg cgagacttgc gtccacccaa cccaggccac catcgctcag 3900
aagaactggt acatgagcaa gaaccccaag gagaagaaac acatctggtt tggcgagaca 3960
atgagcgatg gcttccagtt cgaatatggt ggggagggct ccaacccagc tgacgttgcc 4020
atccaactga ccttcctgcg cctgatgtcc actgaggcct cccagaacat cacctaccac 4080
tgcaagaaca gcgtggctta catggaccag gagactggca acctgaagaa ggctctgctc 4140
cttcagggct ccaacgagat cgagatcaga gcagaaggca acagccgctt cacctatgga 4200
gtcactgagg atggctgcac aactcacacc ggtgcctggg gcaagacagt cattgaatac 4260
aaaacaacaa aaacctctcg cctgcccgtc attgacgtgg ctcccatgga cgttggagca 4320
caagatcagg aattcggaat tgtcatcgga cctgtctgct tcttgtaa 4368
<210> 20
<211> 1455
<212> PRT
<213> Artificial Sequence
<220>
<223> Amis COL1A1
<400> 20
Met Phe Ser Phe Val Asp Ser Arg Leu Leu Leu Leu Ile Ala Ala Thr
1 5 10 15
Val Leu Leu Thr Lys Gly Gln Gly Glu Glu Asp Ile Gln Thr Gly Ser
20 25 30
Cys Ile Gln Asp Gly Leu Ala Tyr Asn Asn Thr Asp Val Trp Lys Pro
35 40 45
Glu Pro Cys Gln Ile Cys Val Cys Asp Asn Gly Asn Ile Leu Cys Asp
50 55 60
Asp Val Ile Cys Asp Asp Thr Ser Asp Cys Thr Asn Ala Glu Ile Pro
65 70 75 80
Phe Gly Glu Cys Cys Pro Ile Cys Pro Asp Thr Ala Gly Ser Ser Thr
85 90 95
Tyr Pro Lys Ser Thr Gly Val Glu Gly Pro Lys Gly Asp Thr Gly Pro
100 105 110
Arg Gly Gln Arg Gly Leu Pro Gly Pro Pro Gly Arg Asp Gly Ile Pro
115 120 125
Gly Gln Pro Gly Leu Pro Gly Leu Pro Gly Pro Pro Gly Pro Pro Gly
130 135 140
Leu Gly Gly Asn Phe Ala Pro Gln Met Ala Tyr Gly Tyr Gly Asp Glu
145 150 155 160
Thr Lys Ser Ala Gly Ile Ser Val Pro Gly Pro Met Gly Pro Ala Gly
165 170 175
Pro Arg Gly Leu Pro Gly Pro Pro Gly Ser Pro Gly Pro Gln Gly Phe
180 185 190
Gln Gly Pro Pro Gly Glu Pro Gly Glu Pro Gly Ala Ser Gly Pro Met
195 200 205
Gly Pro Arg Gly Pro Ala Gly Pro Pro Gly Lys Asn Gly Asp Asp Gly
210 215 220
Glu Ala Gly Lys Pro Gly Arg Pro Gly Glu Arg Gly Pro Pro Gly Pro
225 230 235 240
Gln Gly Ala Arg Gly Leu Pro Gly Thr Ala Gly Leu Pro Gly Met Lys
245 250 255
Gly His Arg Gly Phe Ser Gly Leu Asp Gly Ala Lys Gly Asp Ala Gly
260 265 270
Pro Ser Gly Pro Lys Gly Glu Pro Gly Ser Pro Gly Glu Asn Gly Ala
275 280 285
Pro Gly Gln Met Gly Pro Arg Gly Leu Pro Gly Glu Arg Gly Arg Pro
290 295 300
Gly Pro Ser Gly Pro Ala Gly Ala Arg Gly Asn Asp Gly Ser Pro Gly
305 310 315 320
Ala Ala Gly Pro Pro Gly Pro Thr Gly Pro Ala Gly Pro Pro Gly Phe
325 330 335
Pro Gly Ala Ala Gly Ala Lys Gly Glu Thr Gly Pro Gln Gly Ser Arg
340 345 350
Gly Ser Glu Gly Pro Gln Gly Ala Arg Gly Glu Pro Gly Pro Pro Gly
355 360 365
Pro Ala Gly Ala Ala Gly Pro Ala Gly Asn Pro Gly Ser Asp Gly Gln
370 375 380
Ala Gly Ala Lys Gly Ala Thr Gly Ala Pro Gly Ile Ala Gly Ala Pro
385 390 395 400
Gly Phe Pro Gly Ala Arg Gly Pro Ser Gly Pro Gln Gly Pro Ser Gly
405 410 415
Ala Pro Gly Pro Lys Gly Asn Ser Gly Glu Pro Gly Ala Gln Gly Asn
420 425 430
Lys Gly Asp Thr Gly Ala Lys Gly Glu Pro Gly Pro Ala Gly Val Gln
435 440 445
Gly Pro Pro Gly Pro Ala Gly Glu Glu Gly Lys Arg Gly Ala Arg Gly
450 455 460
Glu Pro Gly Pro Gly Gly Leu Pro Gly Pro Ala Gly Glu Arg Gly Ala
465 470 475 480
Pro Gly Ser Arg Gly Phe Pro Gly Ala Asp Gly Ile Ser Gly Pro Lys
485 490 495
Gly Pro Pro Gly Glu Arg Gly Ser Pro Gly Pro Ala Gly Pro Lys Gly
500 505 510
Ser Thr Gly Glu Ser Gly Arg Pro Gly Glu Pro Gly Leu Pro Gly Ala
515 520 525
Lys Gly Leu Thr Gly Ser Pro Gly Ser Pro Gly Pro Asp Gly Lys Thr
530 535 540
Gly Pro Pro Gly Pro Ala Gly Gln Asp Gly Arg Pro Gly Pro Pro Gly
545 550 555 560
Pro Pro Gly Ala Arg Gly Gln Ala Gly Val Met Gly Phe Pro Gly Pro
565 570 575
Lys Gly Ala Ala Gly Glu Pro Gly Lys Pro Gly Glu Arg Gly Ala Pro
580 585 590
Gly Pro Pro Gly Ala Val Gly Ala Ala Gly Lys Asp Gly Glu Ala Gly
595 600 605
Ala Gln Gly Ser Pro Gly Ala Ala Gly Pro Ala Gly Glu Arg Gly Glu
610 615 620
Gln Gly Pro Ala Gly Ala Pro Gly Phe Gln Gly Leu Pro Gly Pro Ala
625 630 635 640
Gly Pro Ser Gly Glu Ser Gly Lys Pro Gly Glu Gln Gly Val Pro Gly
645 650 655
Asp Ala Gly Ala Pro Gly Pro Ala Gly Ala Arg Gly Glu Arg Gly Phe
660 665 670
Pro Gly Glu Arg Gly Val Gln Gly Gln Pro Gly Pro Gln Gly Pro Arg
675 680 685
Gly Ala Asn Gly Ala Pro Gly Asn Asp Gly Ala Lys Gly Asp Ala Gly
690 695 700
Ala Pro Gly Ala Pro Gly Gly Gln Gly Pro Pro Gly Leu Gln Gly Met
705 710 715 720
Pro Gly Glu Arg Gly Ala Ala Gly Leu Pro Gly Ser Lys Gly Asp Arg
725 730 735
Gly Asp Pro Gly Pro Lys Gly Thr Asp Gly Ala Pro Gly Lys Asp Gly
740 745 750
Val Arg Gly Leu Thr Gly Pro Ile Gly Pro Pro Gly Pro Ala Gly Ala
755 760 765
Pro Gly Asp Lys Gly Glu Ala Gly Pro Ser Gly Pro Ala Gly Pro Thr
770 775 780
Gly Ser Arg Gly Ala Pro Gly Asp Arg Gly Glu Pro Gly Pro Pro Gly
785 790 795 800
Pro Ala Gly Phe Ala Gly Pro Pro Gly Ala Asp Gly Gln Pro Gly Ala
805 810 815
Lys Gly Glu Ser Gly Asp Ala Gly Ala Lys Gly Asp Ala Gly Pro Pro
820 825 830
Gly Pro Ala Gly Pro Thr Gly Ala Pro Gly Pro Ser Gly Ala Val Gly
835 840 845
Ala Pro Gly Pro Lys Gly Ala Arg Gly Ser Ala Gly Pro Pro Gly Ala
850 855 860
Thr Gly Phe Pro Gly Ala Ala Gly Arg Val Gly Pro Pro Gly Pro Ala
865 870 875 880
Gly Asn Val Gly Leu Pro Gly Pro Ser Gly Pro Ser Gly Lys Glu Gly
885 890 895
Ser Lys Gly Pro Arg Gly Glu Thr Gly Pro Ala Gly Arg Pro Gly Glu
900 905 910
Pro Gly Pro Ala Gly Pro Pro Gly Pro Ser Gly Glu Lys Gly Ser Pro
915 920 925
Gly Gly Asp Gly Pro Ala Gly Ala Pro Gly Thr Pro Gly Pro Gln Gly
930 935 940
Ile Ala Gly Gln Arg Gly Val Val Gly Leu Pro Gly Gln Arg Gly Glu
945 950 955 960
Arg Gly Phe Pro Gly Leu Pro Gly Pro Ser Gly Glu Pro Gly Lys Gln
965 970 975
Gly Pro Ser Gly Ser Ser Gly Glu Arg Gly Pro Pro Gly Pro Met Gly
980 985 990
Pro Pro Gly Leu Ala Gly Pro Pro Gly Glu Ala Gly Arg Glu Gly Ala
995 1000 1005
Pro Gly Ser Glu Gly Ala Pro Gly Arg Asp Gly Ala Ala Gly Pro Lys
1010 1015 1020
Gly Asp Arg Gly Glu Thr Gly Pro Ser Gly Pro Pro Gly Ala Pro Gly
1025 1030 1035 1040
Ala Pro Gly Ala Pro Gly Pro Ile Gly Pro Ala Gly Lys Asn Gly Asp
1045 1050 1055
Arg Gly Glu Thr Gly Pro Ser Gly Pro Ala Gly Pro Ala Gly Pro Ala
1060 1065 1070
Gly Ala Arg Gly Pro Ala Gly Pro Gln Gly Ala Arg Gly Asp Lys Gly
1075 1080 1085
Glu Thr Gly Glu His Gly Asp Arg Gly Met Lys Gly His Arg Gly Phe
1090 1095 1100
Pro Gly Pro Gln Gly Pro Ser Gly Pro Ala Gly Ser Pro Gly Glu Gln
1105 1110 1115 1120
Gly Pro Ser Gly Ala Ser Gly Pro Ala Gly Pro Arg Gly Pro Pro Gly
1125 1130 1135
Ser Ala Gly Thr Pro Gly Lys Asp Gly Leu Asn Gly Leu Pro Gly Pro
1140 1145 1150
Ile Gly Pro Pro Gly Pro Arg Gly Arg Thr Gly Asp Val Gly Pro Ala
1155 1160 1165
Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Ala Pro Ser
1170 1175 1180
Gly Gly Phe Asp Phe Ser Phe Met Pro Gln Pro Pro Gln Glu Lys Ala
1185 1190 1195 1200
His Asp Pro Gly Arg Tyr Tyr Arg Ala Asp Asp Ala Asn Val Met Arg
1205 1210 1215
Asp Arg Asp Leu Glu Val Asp Thr Thr Leu Lys Ser Leu Ser Gln Gln
1220 1225 1230
Ile Glu Asn Ile Arg Ser Pro Glu Gly Thr Arg Lys Asn Pro Ala Arg
1235 1240 1245
Thr Cys Arg Asp Leu Lys Met Cys His Asn Asp Trp Lys Ser Gly Glu
1250 1255 1260
Tyr Trp Ile Asp Pro Asn Gln Gly Cys Asn Leu Asp Ala Ile Lys Val
1265 1270 1275 1280
Tyr Cys Asn Met Glu Thr Gly Glu Thr Cys Val His Pro Thr Gln Ala
1285 1290 1295
Thr Ile Ala Gln Lys Asn Trp Tyr Met Ser Lys Asn Pro Lys Glu Lys
1300 1305 1310
Lys His Ile Trp Phe Gly Glu Thr Met Ser Asp Gly Phe Gln Phe Glu
1315 1320 1325
Tyr Gly Gly Glu Gly Ser Asn Pro Ala Asp Val Ala Ile Gln Leu Thr
1330 1335 1340
Phe Leu Arg Leu Met Ser Thr Glu Ala Ser Gln Asn Ile Thr Tyr His
1345 1350 1355 1360
Cys Lys Asn Ser Val Ala Tyr Met Asp Gln Glu Thr Gly Asn Leu Lys
1365 1370 1375
Lys Ala Leu Leu Leu Gln Gly Ser Asn Glu Ile Glu Ile Arg Ala Glu
1380 1385 1390
Gly Asn Ser Arg Phe Thr Tyr Gly Val Thr Glu Asp Gly Cys Thr Thr
1395 1400 1405
His Thr Gly Ala Trp Gly Lys Thr Val Ile Glu Tyr Lys Thr Thr Lys
1410 1415 1420
Thr Ser Arg Leu Pro Val Ile Asp Val Ala Pro Met Asp Val Gly Ala
1425 1430 1435 1440
Gln Asp Gln Glu Phe Gly Ile Val Ile Gly Pro Val Cys Phe Leu
1445 1450 1455
<210> 21
<211> 4092
<212> DNA
<213> Artificial Sequence
<220>
<223> Amis COL1A2
<400> 21
atgctcagct ttgtggatac acggattttg ttgctgctcg cagtaacttc gtacctagca 60
acatgtcaac aagcaaatga ggcaactgca ggacggaagg gcccaagagg agacaaaggg 120
ccacagggag aaaggggtcc accaggtcca ccaggcagag atggtgaaga tggtccacca 180
gggcctccag ggccccctgg tcctccaggt cttggcggaa actttgctgc tcagtatgac 240
ggagcaaaag caggtgacta tggctcagga ccaatgggtt taatgggacc cagaggccca 300
cctggaacaa gtggacctcc tggtcctcct ggcttccaag gacctcatgg tgagcctggt 360
gaacctggtc aaacaggtcc ccagggtccc cgtggtccat ctggtcctcc tggaaaggct 420
ggtgaagatg gccatcctgg aaaatctgga cgatctggtg agaggggcgt ctctggtcct 480
cagggtgctc gtggtttccc tggaactcct ggtctgcctg gctttaaggg aattagagga 540
cacaatggtc tggatggtca gaagggacaa cctggtactc caggcattaa gggtgaatcc 600
ggtgcccctg gtgaaaatgg taccccagga caatctggtg ctcgtggcct tcccggtgaa 660
agaggaagaa ttggtgcacc tggcccagct ggtgcccgtg gcagcgatgg tagcactggt 720
cccactggtc ctgctggccc tatcggttct gctggtgctc caggtttccc aggtgctcct 780
ggagccaagg gtgaaattgg agctgctggt aatgtaggtc cttctggccc tgctggtcca 840
cgaggagagg ctggacttcc tggttcttct ggtcccgttg gccctcctgg aaaccctggt 900
tctaatggtc ttgctggtgc taaaggtgca actggtcttc ctggtgttgc tggtgctcct 960
ggcttgcctg gtccacgtgg tattcctgga ccttctggcc ctgccggagc tgctggcacc 1020
agaggtcttg ttggtgaacc aggccctgct ggtgccaagg gagaaagtgg taacaagggt 1080
gaacccggtg ctgctggtcc atcaggtccc gctggtccaa gtggtgaaga aggcaagaaa 1140
ggtactactg gtgaacctgg ctcttctggc ccccctggtc cagctggtct aagaggcgtt 1200
cctggatctc gtggtctccc tggagctgac ggcagagctg gtgttatggg acctgctggc 1260
agccgtggtg ctactggtcc tgctggtgct aaaggtccta gtggtgataa tggtcgccct 1320
ggtgagcctg gccttatggg tccaagaggt ctccctggtc aacctggaag ctcaggccct 1380
gctggcaagg aaggtcctgt tggtttccct ggtgcagatg gtagagttgg cccaactggt 1440
ccagctggtg caagaggtga gcctggcaac attggattcc ctggacccaa aggccccact 1500
ggtgaccctg gcaaacctgg tgacagaggc catgctggtc ttgctggtgc tcggggtgcg 1560
cctggtcctg agggcaacaa tggggctcaa ggtcctcctg gtgttgctgg caaccctggt 1620
gcaaaaggtg aacaaggtcc agctggtcct cccggtttcc agggtctccc aggcccctca 1680
ggtccagctg gtgaagctgg caaaccaggt gaaaggggta tggctggtga atttggtgcc 1740
cctggccctg cgggttcaag aggtgaacgt ggtcctccag gcgaaagtgg tgctgttggt 1800
cctgtaggtc ccattggaag ccgtggtcca tctggtccac caggcactga tggcaacaag 1860
ggtgaacctg gtaatgttgg taatgctggt actgcaggcc cctctggcgc tggtggagcc 1920
ccaggagaga gaggcattgc tggtattcca ggacccaagg gtgaaaaggg tgctacaggt 1980
ctgagagggg atactggcgc aacaggaaga gatggtgctc gtggtgctcc tggtgctatt 2040
ggagcccctg gccccgctgg tggagctggt gagcggggtg aaggtggtcc tgctggtgct 2100
gctggccctt ctggtgcccg tggtattcct ggtgaacgtg gtgagcctgg tcctgctggc 2160
cctactggat ttgctggacc tgctggtgca gctggccaac ctggtgctaa aggtgaacga 2220
ggtacaaaag gacccaaggg tgaaaatggt ccacaaggtg ctgttggccc agttggttct 2280
tctggaccat caggtcctgt tggtgcctct ggtcctgctg gtcctcgtgg tgatggtggt 2340
cctcctggtg tcactggttt ccctggagct gctggcagaa ctggtcctcc cggcccctct 2400
ggtatcactg gcccccctgg tccccctggc tcagctggca aagatggtat gagaggccca 2460
cgtggtgata ctggtccagt tggccgcact ggagaacaag gcattgttgg cccacctggc 2520
ttcagtggtg agaaaggtcc atctggagag cctggtgctg ctggtccccc tggtacccca 2580
ggtcctcagg gtattcttgg tgctcctggt atccttggtc tgcctggctc tcggggagaa 2640
cgtggtcttc caggcatctc tggagcaaca ggtgaaccag gtcctcttgg tatttccggt 2700
cctcctggtg cacgtggtcc ctctggcccc gtgggttctg ctggtctgaa tggtgcccct 2760
ggtgaagctg gccgtgatgg caatcctggc catgatggtg ctccaggccg tgatggtgct 2820
cctggtttca agggtgagcg tggtgctcct gggaacaatg gacctgctgg tgctgttggt 2880
gctcctggcg cccatggtca agttggtcct gctggaaagc ctggaaatcg tggtgatcct 2940
ggtcctgttg gtccttctgg tcctgctggt gcttttggtg caaggggtcc ttctggccca 3000
caaggtgcac gtggtgagaa gggagaaaca ggtgaaaagg gacacagagg tatgcctgga 3060
tttaaggggc acaatggact tcagggtctg cctggtcttg ctggccaaca tggagatcaa 3120
ggtcctccag gttctactgg ccccgctggc ccaaggggtc cctctggtcc ttctggtcct 3180
gctggaaaag atggtcgcaa tggactccct ggccctattg gacctgctgg tgtgcgtggt 3240
tctcagggta gccaaggtcc ttcgggtcca cctggcccac ctggtctccc tggtccccct 3300
ggtgcaaatg gtggtggata cgaagttggc tatgatcttg aatactaccg ggctgatcag 3360
cctgctctca gacctaagga ctatgaagtt gatgccactc tgaaaacatt gaacaaccaa 3420
attgagaccc tcctgacccc agaaggctcc aggaagaacc cagctcgcac ctgccgtgac 3480
ctgagactca gccacccaga atggaccagt ggtttctact ggattgatcc caaccagggc 3540
tgtactatgg atgccattag agtgtattgt gacttctcca ctggtgagac ttgcatacat 3600
gccaatctag aaaacatccc cactaagaac tggtatgtca gcaagaactc caaggaaaag 3660
aagcacatgt ggtttggtga aactatcaat ggtggtaccc agtttgaata taacgatgaa 3720
ggagtgactt ccaaggacat ggctacccaa cttgccttca tgcgtctgct ggccaaccat 3780
gcctcccaga acatcaccta ccactgcaag aacagtattg catacatgga tgaagaaact 3840
ggcaacctta agaaggctgt aatactgcag ggatccaatg atgttgaact acgagctgaa 3900
ggcaacagca gattcacttt cagtgttctg gaagatggct gctctagaaa gaacaacgca 3960
tggggcaaaa caatcattga atatagaaca aacaaaccat ctcgcttgcc catccttgac 4020
attgcacctt tggacattgg tggagctgat caagaattcg gtttggacat tggcccagtc 4080
tgtttcaaat ga 4092
<210> 22
<211> 1363
<212> PRT
<213> Artificial Sequence
<220>
<223> Amis COL1A2
<400> 22
Met Leu Ser Phe Val Asp Thr Arg Ile Leu Leu Leu Leu Ala Val Thr
1 5 10 15
Ser Tyr Leu Ala Thr Cys Gln Gln Ala Asn Glu Ala Thr Ala Gly Arg
20 25 30
Lys Gly Pro Arg Gly Asp Lys Gly Pro Gln Gly Glu Arg Gly Pro Pro
35 40 45
Gly Pro Pro Gly Arg Asp Gly Glu Asp Gly Pro Pro Gly Pro Pro Gly
50 55 60
Pro Pro Gly Pro Pro Gly Leu Gly Gly Asn Phe Ala Ala Gln Tyr Asp
65 70 75 80
Gly Ala Lys Ala Gly Asp Tyr Gly Ser Gly Pro Met Gly Leu Met Gly
85 90 95
Pro Arg Gly Pro Pro Gly Thr Ser Gly Pro Pro Gly Pro Pro Gly Phe
100 105 110
Gln Gly Pro His Gly Glu Pro Gly Glu Pro Gly Gln Thr Gly Pro Gln
115 120 125
Gly Pro Arg Gly Pro Ser Gly Pro Pro Gly Lys Ala Gly Glu Asp Gly
130 135 140
His Pro Gly Lys Ser Gly Arg Ser Gly Glu Arg Gly Val Ser Gly Pro
145 150 155 160
Gln Gly Ala Arg Gly Phe Pro Gly Thr Pro Gly Leu Pro Gly Phe Lys
165 170 175
Gly Ile Arg Gly His Asn Gly Leu Asp Gly Gln Lys Gly Gln Pro Gly
180 185 190
Thr Pro Gly Ile Lys Gly Glu Ser Gly Ala Pro Gly Glu Asn Gly Thr
195 200 205
Pro Gly Gln Ser Gly Ala Arg Gly Leu Pro Gly Glu Arg Gly Arg Ile
210 215 220
Gly Ala Pro Gly Pro Ala Gly Ala Arg Gly Ser Asp Gly Ser Thr Gly
225 230 235 240
Pro Thr Gly Pro Ala Gly Pro Ile Gly Ser Ala Gly Ala Pro Gly Phe
245 250 255
Pro Gly Ala Pro Gly Ala Lys Gly Glu Ile Gly Ala Ala Gly Asn Val
260 265 270
Gly Pro Ser Gly Pro Ala Gly Pro Arg Gly Glu Ala Gly Leu Pro Gly
275 280 285
Ser Ser Gly Pro Val Gly Pro Pro Gly Asn Pro Gly Ser Asn Gly Leu
290 295 300
Ala Gly Ala Lys Gly Ala Thr Gly Leu Pro Gly Val Ala Gly Ala Pro
305 310 315 320
Gly Leu Pro Gly Pro Arg Gly Ile Pro Gly Pro Ser Gly Pro Ala Gly
325 330 335
Ala Ala Gly Thr Arg Gly Leu Val Gly Glu Pro Gly Pro Ala Gly Ala
340 345 350
Lys Gly Glu Ser Gly Asn Lys Gly Glu Pro Gly Ala Ala Gly Pro Ser
355 360 365
Gly Pro Ala Gly Pro Ser Gly Glu Glu Gly Lys Lys Gly Thr Thr Gly
370 375 380
Glu Pro Gly Ser Ser Gly Pro Pro Gly Pro Ala Gly Leu Arg Gly Val
385 390 395 400
Pro Gly Ser Arg Gly Leu Pro Gly Ala Asp Gly Arg Ala Gly Val Met
405 410 415
Gly Pro Ala Gly Ser Arg Gly Ala Thr Gly Pro Ala Gly Ala Lys Gly
420 425 430
Pro Ser Gly Asp Asn Gly Arg Pro Gly Glu Pro Gly Leu Met Gly Pro
435 440 445
Arg Gly Leu Pro Gly Gln Pro Gly Ser Ser Gly Pro Ala Gly Lys Glu
450 455 460
Gly Pro Val Gly Phe Pro Gly Ala Asp Gly Arg Val Gly Pro Thr Gly
465 470 475 480
Pro Ala Gly Ala Arg Gly Glu Pro Gly Asn Ile Gly Phe Pro Gly Pro
485 490 495
Lys Gly Pro Thr Gly Asp Pro Gly Lys Pro Gly Asp Arg Gly His Ala
500 505 510
Gly Leu Ala Gly Ala Arg Gly Ala Pro Gly Pro Glu Gly Asn Asn Gly
515 520 525
Ala Gln Gly Pro Pro Gly Val Ala Gly Asn Pro Gly Ala Lys Gly Glu
530 535 540
Gln Gly Pro Ala Gly Pro Pro Gly Phe Gln Gly Leu Pro Gly Pro Ser
545 550 555 560
Gly Pro Ala Gly Glu Ala Gly Lys Pro Gly Glu Arg Gly Met Ala Gly
565 570 575
Glu Phe Gly Ala Pro Gly Pro Ala Gly Ser Arg Gly Glu Arg Gly Pro
580 585 590
Pro Gly Glu Ser Gly Ala Val Gly Pro Val Gly Pro Ile Gly Ser Arg
595 600 605
Gly Pro Ser Gly Pro Pro Gly Thr Asp Gly Asn Lys Gly Glu Pro Gly
610 615 620
Asn Val Gly Asn Ala Gly Thr Ala Gly Pro Ser Gly Ala Gly Gly Ala
625 630 635 640
Pro Gly Glu Arg Gly Ile Ala Gly Ile Pro Gly Pro Lys Gly Glu Lys
645 650 655
Gly Ala Thr Gly Leu Arg Gly Asp Thr Gly Ala Thr Gly Arg Asp Gly
660 665 670
Ala Arg Gly Ala Pro Gly Ala Ile Gly Ala Pro Gly Pro Ala Gly Gly
675 680 685
Ala Gly Glu Arg Gly Glu Gly Gly Pro Ala Gly Ala Ala Gly Pro Ser
690 695 700
Gly Ala Arg Gly Ile Pro Gly Glu Arg Gly Glu Pro Gly Pro Ala Gly
705 710 715 720
Pro Thr Gly Phe Ala Gly Pro Ala Gly Ala Ala Gly Gln Pro Gly Ala
725 730 735
Lys Gly Glu Arg Gly Thr Lys Gly Pro Lys Gly Glu Asn Gly Pro Gln
740 745 750
Gly Ala Val Gly Pro Val Gly Ser Ser Gly Pro Ser Gly Pro Val Gly
755 760 765
Ala Ser Gly Pro Ala Gly Pro Arg Gly Asp Gly Gly Pro Pro Gly Val
770 775 780
Thr Gly Phe Pro Gly Ala Ala Gly Arg Thr Gly Pro Pro Gly Pro Ser
785 790 795 800
Gly Ile Thr Gly Pro Pro Gly Pro Pro Gly Ser Ala Gly Lys Asp Gly
805 810 815
Met Arg Gly Pro Arg Gly Asp Thr Gly Pro Val Gly Arg Thr Gly Glu
820 825 830
Gln Gly Ile Val Gly Pro Pro Gly Phe Ser Gly Glu Lys Gly Pro Ser
835 840 845
Gly Glu Pro Gly Ala Ala Gly Pro Pro Gly Thr Pro Gly Pro Gln Gly
850 855 860
Ile Leu Gly Ala Pro Gly Ile Leu Gly Leu Pro Gly Ser Arg Gly Glu
865 870 875 880
Arg Gly Leu Pro Gly Ile Ser Gly Ala Thr Gly Glu Pro Gly Pro Leu
885 890 895
Gly Ile Ser Gly Pro Pro Gly Ala Arg Gly Pro Ser Gly Pro Val Gly
900 905 910
Ser Ala Gly Leu Asn Gly Ala Pro Gly Glu Ala Gly Arg Asp Gly Asn
915 920 925
Pro Gly His Asp Gly Ala Pro Gly Arg Asp Gly Ala Pro Gly Phe Lys
930 935 940
Gly Glu Arg Gly Ala Pro Gly Asn Asn Gly Pro Ala Gly Ala Val Gly
945 950 955 960
Ala Pro Gly Ala His Gly Gln Val Gly Pro Ala Gly Lys Pro Gly Asn
965 970 975
Arg Gly Asp Pro Gly Pro Val Gly Pro Ser Gly Pro Ala Gly Ala Phe
980 985 990
Gly Ala Arg Gly Pro Ser Gly Pro Gln Gly Ala Arg Gly Glu Lys Gly
995 1000 1005
Glu Thr Gly Glu Lys Gly His Arg Gly Met Pro Gly Phe Lys Gly His
1010 1015 1020
Asn Gly Leu Gln Gly Leu Pro Gly Leu Ala Gly Gln His Gly Asp Gln
1025 1030 1035 1040
Gly Pro Pro Gly Ser Thr Gly Pro Ala Gly Pro Arg Gly Pro Ser Gly
1045 1050 1055
Pro Ser Gly Pro Ala Gly Lys Asp Gly Arg Asn Gly Leu Pro Gly Pro
1060 1065 1070
Ile Gly Pro Ala Gly Val Arg Gly Ser Gln Gly Ser Gln Gly Pro Ser
1075 1080 1085
Gly Pro Pro Gly Pro Pro Gly Leu Pro Gly Pro Pro Gly Ala Asn Gly
1090 1095 1100
Gly Gly Tyr Glu Val Gly Tyr Asp Leu Glu Tyr Tyr Arg Ala Asp Gln
1105 1110 1115 1120
Pro Ala Leu Arg Pro Lys Asp Tyr Glu Val Asp Ala Thr Leu Lys Thr
1125 1130 1135
Leu Asn Asn Gln Ile Glu Thr Leu Leu Thr Pro Glu Gly Ser Arg Lys
1140 1145 1150
Asn Pro Ala Arg Thr Cys Arg Asp Leu Arg Leu Ser His Pro Glu Trp
1155 1160 1165
Thr Ser Gly Phe Tyr Trp Ile Asp Pro Asn Gln Gly Cys Thr Met Asp
1170 1175 1180
Ala Ile Arg Val Tyr Cys Asp Phe Ser Thr Gly Glu Thr Cys Ile His
1185 1190 1195 1200
Ala Asn Leu Glu Asn Ile Pro Thr Lys Asn Trp Tyr Val Ser Lys Asn
1205 1210 1215
Ser Lys Glu Lys Lys His Met Trp Phe Gly Glu Thr Ile Asn Gly Gly
1220 1225 1230
Thr Gln Phe Glu Tyr Asn Asp Glu Gly Val Thr Ser Lys Asp Met Ala
1235 1240 1245
Thr Gln Leu Ala Phe Met Arg Leu Leu Ala Asn His Ala Ser Gln Asn
1250 1255 1260
Ile Thr Tyr His Cys Lys Asn Ser Ile Ala Tyr Met Asp Glu Glu Thr
1265 1270 1275 1280
Gly Asn Leu Lys Lys Ala Val Ile Leu Gln Gly Ser Asn Asp Val Glu
1285 1290 1295
Leu Arg Ala Glu Gly Asn Ser Arg Phe Thr Phe Ser Val Leu Glu Asp
1300 1305 1310
Gly Cys Ser Arg Lys Asn Asn Ala Trp Gly Lys Thr Ile Ile Glu Tyr
1315 1320 1325
Arg Thr Asn Lys Pro Ser Arg Leu Pro Ile Leu Asp Ile Ala Pro Leu
1330 1335 1340
Asp Ile Gly Gly Ala Asp Gln Glu Phe Gly Leu Asp Ile Gly Pro Val
1345 1350 1355 1360
Cys Phe Lys
<210> 23
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> COLsyn1a
<400> 23
ggtcctaagg gtccaaaggg ccctaaggga cccaaaggtc cacctggccc tccaggcgat 60
ccaggtgacc ctggcgaccc cggagatcca 90
<210> 24
<211> 30
<212> PRT
<213> Artificial Sequence
<220>
<223> COLsyn1a
<400> 24
Gly Pro Lys Gly Pro Lys Gly Pro Lys Gly Pro Lys Gly Pro Pro Gly
1 5 10 15
Pro Pro Gly Asp Pro Gly Asp Pro Gly Asp Pro Gly Asp Pro
20 25 30
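The paired DNA and PRT entries in this listing can be cross-checked by translating the nucleotide record and comparing it with the protein record. The sketch below does this for SEQ ID NO: 23 and 24 (COLsyn1a), assuming the standard genetic code and a reading frame that starts at the first base. The helper names (`clean`, `translate`, `to_one_letter`) are hypothetical and are not part of the listing.

```python
# Minimal sketch: check that SEQ ID NO: 23 (DNA) encodes SEQ ID NO: 24 (PRT).
# Assumptions: standard genetic code, reading frame starting at base 1.

BASES = "TCAG"
# Amino acids for all 64 codons, ordered T, C, A, G at each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Three-letter to one-letter codes for the residues used in SEQ ID NO: 24.
THREE_TO_ONE = {"Gly": "G", "Pro": "P", "Lys": "K", "Asp": "D"}

SEQ_23 = """
ggtcctaagg gtccaaaggg ccctaaggga cccaaaggtc cacctggccc tccaggcgat 60
ccaggtgacc ctggcgaccc cggagatcca 90
"""

SEQ_24 = (
    "Gly Pro Lys Gly Pro Lys Gly Pro Lys Gly Pro Lys Gly Pro Pro Gly "
    "Pro Pro Gly Asp Pro Gly Asp Pro Gly Asp Pro Gly Asp Pro"
)


def clean(raw: str) -> str:
    """Drop spacing and position counters, keeping only the bases (uppercased)."""
    return "".join(ch for ch in raw.upper() if ch in "ACGT")


def translate(dna: str) -> str:
    """Translate complete codons in frame 1 into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def to_one_letter(protein: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[res] for res in protein.split())


dna_23 = clean(SEQ_23)
protein_from_dna = translate(dna_23)
protein_listed = to_one_letter(SEQ_24)
print(len(dna_23), protein_from_dna == protein_listed)
```

The same check should carry over to the other 90-base COLsyn records. Records that carry flanking bases, such as the 136-base COLsyn2 cassette, would first need the coding region located.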
<210> 25
<211> 136
<212> DNA
<213> Artificial Sequence
<220>
<223> COLsyn2
<400> 25
gcatcgtctc atcggtctca ttctggtcct aaaggacccg acggaccaaa gggcccagac 60
ggaccccctg gtccaccagg tgaccccggc aagccaggag atcccggtaa accaatcctg 120
agacctgaga cggcat 136
<210> 26
<211> 30
<212> PRT
<213> Artificial Sequence
<220>
<223> COLsyn2
<400> 26
Gly Pro Lys Gly Pro Asp Gly Pro Lys Gly Pro Asp Gly Pro Pro Gly
1 5 10 15
Pro Pro Gly Asp Pro Gly Lys Pro Gly Asp Pro Gly Lys Pro
20 25 30
<210> 27
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> COLsyn3
<400> 27
ggaccaaagg gacccaaagg accagacggc ccagatggcc ccccaggacc tcctggcgac 60
ccaggtgacc caggtaagcc tggcaagcct 90
<210> 28
<211> 30
<212> PRT
<213> Artificial Sequence
<220>
<223> COLsyn3
<400> 28
Gly Pro Lys Gly Pro Lys Gly Pro Asp Gly Pro Asp Gly Pro Pro Gly
1 5 10 15
Pro Pro Gly Asp Pro Gly Asp Pro Gly Lys Pro Gly Lys Pro
20 25 30
<210> 29
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> COLsyn4
<400> 29
ggtcctaaag gaccaaaggg tcccaagggc ccaaagggtc ctccaggagc tcctggacca 60
cctggccctc caggtgtccc aggtccacca 90
<210> 30
<211> 30
<212> PRT
<213> Artificial Sequence
<220>
<223> COLsyn4
<400> 30
Gly Pro Lys Gly Pro Lys Gly Pro Lys Gly Pro Lys Gly Pro Pro Gly
1 5 10 15
Ala Pro Gly Pro Pro Gly Pro Pro Gly Val Pro Gly Pro Pro
20 25 30
<210> 31
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> COLsyn5
<400> 31
ggtcctgacg gacctgatgg accagatggt cctgatggtc ctccaggagc tcctggacca 60
cctggccctc caggtgtccc aggtccacca 90
<210> 32
<211> 30
<212> PRT
<213> Artificial Sequence
<220>
<223> COLsyn5
<400> 32
Gly Pro Asp Gly Pro Asp Gly Pro Asp Gly Pro Asp Gly Pro Pro Gly
1 5 10 15
Ala Pro Gly Pro Pro Gly Pro Pro Gly Val Pro Gly Pro Pro
20 25 30
<210> 33
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> COLsyn6
<400> 33
ggtttagctg gtcccccagg tcctgcagga gctcccggtc ctccaggagc tcctggacca 60
cctggccctc caggtgtccc aggtccacca 90
<210> 34
<211> 30
<212> PRT
<213> Artificial Sequence
<220>
<223> COLsyn6
<400> 34
Gly Leu Ala Gly Pro Pro Gly Pro Ala Gly Ala Pro Gly Pro Pro Gly
1 5 10 15
Ala Pro Gly Pro Pro Gly Pro Pro Gly Val Pro Gly Pro Pro
20 25 30
<210> 35
<211> 877
<212> DNA
<213> Artificial Sequence
<220>
<223> GFP-COLsyn2-ePTS1
<400> 35
atgcgtaaag gcgaagagct gttcactggt gtcgtcccta ttctggtgga actggatggt 60
gatgtcaacg gtcataagtt ttccgtgcgt ggcgagggtg aaggtgacgc aactaatggt 120
aaactgacgc tgaagttcat ctgtactact ggtaaactgc cggttccttg gccgactctg 180
gtaacgacgc tgacttatgg tgttcagtgc tttgctcgtt atccggacca tatgaagcag 240
catgacttct tcaagtccgc catgccggaa ggctatgtgc aggaacgcac gatttccttt 300
aaggatgacg gcacgtacaa aacgcgtgcg gaagtgaaat ttgaaggcga taccctggta 360
aaccgcattg agctgaaagg cattgacttt aaagaggacg gcaatatcct gggccataag 420
ctggaataca attttaacag ccacaatgtt tacatcaccg ccgataaaca aaaaaatggc 480
attaaagcga attttaaaat tcgccacaac gtggaggatg gcagcgtgca gctggctgat 540
cactaccagc aaaacactcc aatcggtgat ggtcctgttc tgctgccaga caatcactat 600
ctgagcacgc aaagcgttct gtctaaagat ccgaacgaga aacgcgatca tatggttctg 660
ctggagttcg taaccgcagc gggcatcacg catggtatgg atgaactgta caaagcatcg 720
tctcatcggt ctcattctgg tcctaaagga cccgacggac caaagggccc agacggaccc 780
cctggtccac caggtgaccc cggcaagcca ggagatcccg gtaaaccaat cctgagacct 840
gagacggcat ttgggaagag gtagaagatc caaattg 877
<210> 36
<211> 277
<212> PRT
<213> Artificial Sequence
<220>
<223> GFP-COLsyn2-ePTS1
<400> 36
Met Arg Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu
20 25 30
Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu
50 55 60
Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95
Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val
100 105 110
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
Phe Asn Ser His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser Val
165 170 175
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser
195 200 205
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Gly Pro
225 230 235 240
Lys Gly Pro Asp Gly Pro Lys Gly Pro Asp Gly Pro Pro Gly Pro Pro
245 250 255
Gly Asp Pro Gly Lys Pro Gly Asp Pro Gly Lys Pro Leu Gly Arg Gly
260 265 270
Arg Arg Ser Lys Leu
275
<210> 37
<211> 831
<212> DNA
<213> Artificial Sequence
<220>
<223> GFP-COLsyn3-ePTS1
<400> 37
atgcgtaaag gcgaagagct gttcactggt gtcgtcccta ttctggtgga actggatggt 60
gatgtcaacg gtcataagtt ttccgtgcgt ggcgagggtg aaggtgacgc aactaatggt 120
aaactgacgc tgaagttcat ctgtactact ggtaaactgc cggttccttg gccgactctg 180
gtaacgacgc tgacttatgg tgttcagtgc tttgctcgtt atccggacca tatgaagcag 240
catgacttct tcaagtccgc catgccggaa ggctatgtgc aggaacgcac gatttccttt 300
aaggatgacg gcacgtacaa aacgcgtgcg gaagtgaaat ttgaaggcga taccctggta 360
aaccgcattg agctgaaagg cattgacttt aaagaggacg gcaatatcct gggccataag 420
ctggaataca attttaacag ccacaatgtt tacatcaccg ccgataaaca aaaaaatggc 480
attaaagcga attttaaaat tcgccacaac gtggaggatg gcagcgtgca gctggctgat 540
cactaccagc aaaacactcc aatcggtgat ggtcctgttc tgctgccaga caatcactat 600
ctgagcacgc aaagcgttct gtctaaagat ccgaacgaga aacgcgatca tatggttctg 660
ctggagttcg taaccgcagc gggcatcacg catggtatgg atgaactgta caaaggacca 720
aagggaccca aaggaccaga cggcccagat ggccccccag gacctcctgg cgacccaggt 780
gacccaggta agcctggcaa gcctttggga agaggtagaa gatccaaatt g 831
<210> 38
<211> 277
<212> PRT
<213> Artificial Sequence
<220>
<223> GFP-COLsyn3-ePTS1
<400> 38
Met Arg Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu
20 25 30
Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu
50 55 60
Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95
Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val
100 105 110
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
Phe Asn Ser His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser Val
165 170 175
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser
195 200 205
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Gly Pro
225 230 235 240
Lys Gly Pro Lys Gly Pro Asp Gly Pro Asp Gly Pro Pro Gly Pro Pro
245 250 255
Gly Asp Pro Gly Asp Pro Gly Lys Pro Gly Lys Pro Leu Gly Arg Gly
260 265 270
Arg Arg Ser Lys Leu
275
<210> 39
<211> 831
<212> DNA
<213> Artificial Sequence
<220>
<223> GFP-COLsyn6-ePTS1
<400> 39
atgcgtaaag gcgaagagct gttcactggt gtcgtcccta ttctggtgga actggatggt 60
gatgtcaacg gtcataagtt ttccgtgcgt ggcgagggtg aaggtgacgc aactaatggt 120
aaactgacgc tgaagttcat ctgtactact ggtaaactgc cggttccttg gccgactctg 180
gtaacgacgc tgacttatgg tgttcagtgc tttgctcgtt atccggacca tatgaagcag 240
catgacttct tcaagtccgc catgccggaa ggctatgtgc aggaacgcac gatttccttt 300
aaggatgacg gcacgtacaa aacgcgtgcg gaagtgaaat ttgaaggcga taccctggta 360
aaccgcattg agctgaaagg cattgacttt aaagaggacg gcaatatcct gggccataag 420
ctggaataca attttaacag ccacaatgtt tacatcaccg ccgataaaca aaaaaatggc 480
attaaagcga attttaaaat tcgccacaac gtggaggatg gcagcgtgca gctggctgat 540
cactaccagc aaaacactcc aatcggtgat ggtcctgttc tgctgccaga caatcactat 600
ctgagcacgc aaagcgttct gtctaaagat ccgaacgaga aacgcgatca tatggttctg 660
ctggagttcg taaccgcagc gggcatcacg catggtatgg atgaactgta caaaggttta 720
gctggtcccc caggtcctgc aggagctccc ggtcctccag gagctcctgg accacctggc 780
cctccaggtg tcccaggtcc accattggga agaggtagaa gatccaaatt g 831
<210> 40
<211> 277
<212> PRT
<213> Artificial Sequence
<220>
<223> GFP-COLsyn6-ePTS1
<400> 40
Met Arg Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu
20 25 30
Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu
50 55 60
Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95
Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val
100 105 110
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
Phe Asn Ser His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser Val
165 170 175
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser
195 200 205
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Gly Leu
225 230 235 240
Ala Gly Pro Pro Gly Pro Ala Gly Ala Pro Gly Pro Pro Gly Ala Pro
245 250 255
Gly Pro Pro Gly Pro Pro Gly Val Pro Gly Pro Pro Leu Gly Arg Gly
260 265 270
Arg Arg Ser Lys Leu
275
<210> 41
<211> 1605
<212> DNA
<213> Artificial Sequence
<220>
<223> BtauP4HA1
<400> 41
atgatctggt atattttagt tgtagggatt ctacttcccc agtctttggc ccatccaggc 60
ttttttactt ctattggtca gatgactgat ttgattcata ctgaaaaaga tctggtgact 120
tccctgaaag actatataaa ggcagaagag gacaaattag aacaaataaa aaaatgggca 180
gagaaattag atcgattaac cagcacagcg acaaaagatc cagaaggatt tgttggacac 240
cctgtaaatg cattcaaatt aatgaaacgt ctgaacactg agtggagtga gttggagaat 300
ctggtcctta aggatatgtc agatggtttt atctctaacc taaccattca gagacagtac 360
ttccctaatg atgaagatca ggttggggca gccaaagctc tgttgcgtct acaggacacc 420
tacaatttgg atacagatac catctcaaag ggtgatcttc caggagtaaa acacaaatct 480
tttctaacag ttgaggactg ttttgagttg ggcaaagtgg cctacacaga agcagattat 540
taccatacag agctgtggat ggaacaagca ctgaggcagc tggatgaagg cgaggtttct 600
accgttgata aagtctctgt tctggattat ttgagctatg cagtatacca gcagggagac 660
ctggataagg cgcttttgct cacaaagaag cttcttgaac tagatcctga acatcagaga 720
gctaacggta acttaaaata ctttgagtat ataatggcta aagaaaaaga tgccaataag 780
tcttcttcag atgaccaatc tgatcagaaa accacactga agaagaaagg tgctgctgtg 840
gattacctgc cagagagaca gaagtacgaa atgctgtgcc gtggggaggg tatcaaaatg 900
actcctcgga gacagaaaaa actcttctgt cgctaccatg atggaaaccg gaatcctaaa 960
tttatcctgg ctccagccaa acaggaggat gagtgggaca agcctcgtat tatccgcttc 1020
catgatatta tttctgatgc agaaattgaa gtcgttaaag atctagcaaa accaaggctg 1080
aggcgagcca ccatttcaaa cccaataaca ggagacttgg agacggtaca ttacagaatt 1140
agcaaaagtg cctggctgtc tggctatgaa aaccctgtgg tgtcacgaat taatatgaga 1200
atccaagatc tgacaggact agatgtctcc acagcagagg aattacaggt agcaaattat 1260
ggagttggag gacagtatga accccatttt gattttgcac ggaaagatga gccagatgct 1320
ttcaaagagc tggggacagg aaatagaatt gctacatggc tgttttatat gagtgatgtg 1380
ttagcaggag gagccactgt ttttcctgaa gtaggagcta gtgtttggcc caaaaaggga 1440
actgctgttt tctggtataa tctgtttgcc agtggagaag gagattatag tacacggcat 1500
gcagcctgtc cagtgctggt tggaaacaaa tgggtatcca ataaatggct ccatgaacgt 1560
ggacaggaat ttcgaagacc atgcaccttg tcagaattgg aatga 1605
<210> 42
<211> 534
<212> PRT
<213> Artificial Sequence
<220>
<223> BtauP4HA1
<400> 42
Met Ile Trp Tyr Ile Leu Val Val Gly Ile Leu Leu Pro Gln Ser Leu
1 5 10 15
Ala His Pro Gly Phe Phe Thr Ser Ile Gly Gln Met Thr Asp Leu Ile
20 25 30
His Thr Glu Lys Asp Leu Val Thr Ser Leu Lys Asp Tyr Ile Lys Ala
35 40 45
Glu Glu Asp Lys Leu Glu Gln Ile Lys Lys Trp Ala Glu Lys Leu Asp
50 55 60
Arg Leu Thr Ser Thr Ala Thr Lys Asp Pro Glu Gly Phe Val Gly His
65 70 75 80
Pro Val Asn Ala Phe Lys Leu Met Lys Arg Leu Asn Thr Glu Trp Ser
85 90 95
Glu Leu Glu Asn Leu Val Leu Lys Asp Met Ser Asp Gly Phe Ile Ser
100 105 110
Asn Leu Thr Ile Gln Arg Gln Tyr Phe Pro Asn Asp Glu Asp Gln Val
115 120 125
Gly Ala Ala Lys Ala Leu Leu Arg Leu Gln Asp Thr Tyr Asn Leu Asp
130 135 140
Thr Asp Thr Ile Ser Lys Gly Asp Leu Pro Gly Val Lys His Lys Ser
145 150 155 160
Phe Leu Thr Val Glu Asp Cys Phe Glu Leu Gly Lys Val Ala Tyr Thr
165 170 175
Glu Ala Asp Tyr Tyr His Thr Glu Leu Trp Met Glu Gln Ala Leu Arg
180 185 190
Gln Leu Asp Glu Gly Glu Val Ser Thr Val Asp Lys Val Ser Val Leu
195 200 205
Asp Tyr Leu Ser Tyr Ala Val Tyr Gln Gln Gly Asp Leu Asp Lys Ala
210 215 220
Leu Leu Leu Thr Lys Lys Leu Leu Glu Leu Asp Pro Glu His Gln Arg
225 230 235 240
Ala Asn Gly Asn Leu Lys Tyr Phe Glu Tyr Ile Met Ala Lys Glu Lys
245 250 255
Asp Ala Asn Lys Ser Ser Ser Asp Asp Gln Ser Asp Gln Lys Thr Thr
260 265 270
Leu Lys Lys Lys Gly Ala Ala Val Asp Tyr Leu Pro Glu Arg Gln Lys
275 280 285
Tyr Glu Met Leu Cys Arg Gly Glu Gly Ile Lys Met Thr Pro Arg Arg
290 295 300
Gln Lys Lys Leu Phe Cys Arg Tyr His Asp Gly Asn Arg Asn Pro Lys
305 310 315 320
Phe Ile Leu Ala Pro Ala Lys Gln Glu Asp Glu Trp Asp Lys Pro Arg
325 330 335
Ile Ile Arg Phe His Asp Ile Ile Ser Asp Ala Glu Ile Glu Val Val
340 345 350
Lys Asp Leu Ala Lys Pro Arg Leu Arg Arg Ala Thr Ile Ser Asn Pro
355 360 365
Ile Thr Gly Asp Leu Glu Thr Val His Tyr Arg Ile Ser Lys Ser Ala
370 375 380
Trp Leu Ser Gly Tyr Glu Asn Pro Val Val Ser Arg Ile Asn Met Arg
385 390 395 400
Ile Gln Asp Leu Thr Gly Leu Asp Val Ser Thr Ala Glu Glu Leu Gln
405 410 415
Val Ala Asn Tyr Gly Val Gly Gly Gln Tyr Glu Pro His Phe Asp Phe
420 425 430
Ala Arg Lys Asp Glu Pro Asp Ala Phe Lys Glu Leu Gly Thr Gly Asn
435 440 445
Arg Ile Ala Thr Trp Leu Phe Tyr Met Ser Asp Val Leu Ala Gly Gly
450 455 460
Ala Thr Val Phe Pro Glu Val Gly Ala Ser Val Trp Pro Lys Lys Gly
465 470 475 480
Thr Ala Val Phe Trp Tyr Asn Leu Phe Ala Ser Gly Glu Gly Asp Tyr
485 490 495
Ser Thr Arg His Ala Ala Cys Pro Val Leu Val Gly Asn Lys Trp Val
500 505 510
Ser Asn Lys Trp Leu His Glu Arg Gly Gln Glu Phe Arg Arg Pro Cys
515 520 525
Thr Leu Ser Glu Leu Glu
530
<210> 43
<211> 1533
<212> DNA
<213> Artificial Sequence
<220>
<223> BtauP4HB
<400> 43
atgctgcgcc gcgctctgct ctgcctggcc ctgaccgcgc tattccgcgc gggtgccggc 60
gcccccgacg aggaggacca cgtcctggtg ctccataagg gcaacttcga cgaggcgctg 120
gcggcccaca agtacctgct ggtggagttc tacgccccat ggtgcggcca ctgcaaggct 180
ctggccccgg agtatgccaa agcagctggg aagctgaagg cagaaggttc tgagatcaga 240
ctggccaagg tggatgccac tgaagagtct gacctggccc agcagtatgg tgtccgaggc 300
taccccacca tcaagttctt caagaatgga gacacagctt cccccaaaga gtacacagct 360
ggccgagaag cggatgatat cgtgaactgg ctgaagaagc gcacgggccc cgctgccagc 420
acgctgtccg acggggctgc tgcagaggcc ttggtggagt ccagtgaggt ggccgtcatt 480
ggcttcttca aggacatgga gtcggactcc gcaaagcagt tcttcttggc agcagaggtc 540
attgatgaca tccccttcgg gatcacatct aacagcgatg tgttctccaa ataccagctg 600
gacaaggatg gggttgtcct ctttaagaag tttgacgaag gccggaacaa ctttgagggg 660
gaggtcacca aagaaaagct tctggacttc atcaagcaca accagttgcc cctggtcatt 720
gagttcaccg agcagacagc cccgaagatc ttcggagggg aaatcaagac tcacatcctg 780
ctgttcctgc cgaaaagcgt gtctgactat gagggcaagc tgagcaactt caaaaaagcg 840
gctgagagct tcaagggcaa gatcctgttt atcttcatcg acagcgacca cactgacaac 900
cagcgcatcc tggaattctt cggcctaaag aaagaggagt gcccggccgt gcgcctcatc 960
acgctggagg aggagatgac caaatataag ccagagtcag atgagctgac ggcagagaag 1020
atcaccgagt tctgccaccg cttcctggag ggcaagatta agccccacct gatgagccag 1080
gagctgcctg acgactggga caagcagcct gtcaaagtgc tggttgggaa gaactttgaa 1140
gaggttgctt ttgatgagaa aaagaacgtc tttgtagagt tctatgcccc gtggtgcggt 1200
cactgcaagc agctggcccc catctgggat aagctgggag agacgtacaa ggaccacgag 1260
aacatagtca tcgccaagat ggactccacg gccaacgagg tggaggcggt gaaagtgcac 1320
agcttcccca cgctcaagtt cttccccgcc agcgccgaca ggacggtcat cgactacaat 1380
ggggagcgga cactggatgg ttttaagaag ttcctggaga gtggtggcca ggatggggcc 1440
ggagatgatg acgatctaga agatcttgaa gaagcagaag agcctgatct ggaggaagat 1500
gatgatcaaa aagctgtgaa agatgaactg taa 1533
<210> 44
<211> 510
<212> PRT
<213> Artificial Sequence
<220>
<223> BtauP4HB
<400> 44
Met Leu Arg Arg Ala Leu Leu Cys Leu Ala Leu Thr Ala Leu Phe Arg
1 5 10 15
Ala Gly Ala Gly Ala Pro Asp Glu Glu Asp His Val Leu Val Leu His
20 25 30
Lys Gly Asn Phe Asp Glu Ala Leu Ala Ala His Lys Tyr Leu Leu Val
35 40 45
Glu Phe Tyr Ala Pro Trp Cys Gly His Cys Lys Ala Leu Ala Pro Glu
50 55 60
Tyr Ala Lys Ala Ala Gly Lys Leu Lys Ala Glu Gly Ser Glu Ile Arg
65 70 75 80
Leu Ala Lys Val Asp Ala Thr Glu Glu Ser Asp Leu Ala Gln Gln Tyr
85 90 95
Gly Val Arg Gly Tyr Pro Thr Ile Lys Phe Phe Lys Asn Gly Asp Thr
100 105 110
Ala Ser Pro Lys Glu Tyr Thr Ala Gly Arg Glu Ala Asp Asp Ile Val
115 120 125
Asn Trp Leu Lys Lys Arg Thr Gly Pro Ala Ala Ser Thr Leu Ser Asp
130 135 140
Gly Ala Ala Ala Glu Ala Leu Val Glu Ser Ser Glu Val Ala Val Ile
145 150 155 160
Gly Phe Phe Lys Asp Met Glu Ser Asp Ser Ala Lys Gln Phe Phe Leu
165 170 175
Ala Ala Glu Val Ile Asp Asp Ile Pro Phe Gly Ile Thr Ser Asn Ser
180 185 190
Asp Val Phe Ser Lys Tyr Gln Leu Asp Lys Asp Gly Val Val Leu Phe
195 200 205
Lys Lys Phe Asp Glu Gly Arg Asn Asn Phe Glu Gly Glu Val Thr Lys
210 215 220
Glu Lys Leu Leu Asp Phe Ile Lys His Asn Gln Leu Pro Leu Val Ile
225 230 235 240
Glu Phe Thr Glu Gln Thr Ala Pro Lys Ile Phe Gly Gly Glu Ile Lys
245 250 255
Thr His Ile Leu Leu Phe Leu Pro Lys Ser Val Ser Asp Tyr Glu Gly
260 265 270
Lys Leu Ser Asn Phe Lys Lys Ala Ala Glu Ser Phe Lys Gly Lys Ile
275 280 285
Leu Phe Ile Phe Ile Asp Ser Asp His Thr Asp Asn Gln Arg Ile Leu
290 295 300
Glu Phe Phe Gly Leu Lys Lys Glu Glu Cys Pro Ala Val Arg Leu Ile
305 310 315 320
Thr Leu Glu Glu Glu Met Thr Lys Tyr Lys Pro Glu Ser Asp Glu Leu
325 330 335
Thr Ala Glu Lys Ile Thr Glu Phe Cys His Arg Phe Leu Glu Gly Lys
340 345 350
Ile Lys Pro His Leu Met Ser Gln Glu Leu Pro Asp Asp Trp Asp Lys
355 360 365
Gln Pro Val Lys Val Leu Val Gly Lys Asn Phe Glu Glu Val Ala Phe
370 375 380
Asp Glu Lys Lys Asn Val Phe Val Glu Phe Tyr Ala Pro Trp Cys Gly
385 390 395 400
His Cys Lys Gln Leu Ala Pro Ile Trp Asp Lys Leu Gly Glu Thr Tyr
405 410 415
Lys Asp His Glu Asn Ile Val Ile Ala Lys Met Asp Ser Thr Ala Asn
420 425 430
Glu Val Glu Ala Val Lys Val His Ser Phe Pro Thr Leu Lys Phe Phe
435 440 445
Pro Ala Ser Ala Asp Arg Thr Val Ile Asp Tyr Asn Gly Glu Arg Thr
450 455 460
Leu Asp Gly Phe Lys Lys Phe Leu Glu Ser Gly Gly Gln Asp Gly Ala
465 470 475 480
Gly Asp Asp Asp Asp Leu Glu Asp Leu Glu Glu Ala Glu Glu Pro Asp
485 490 495
Leu Glu Glu Asp Asp Asp Gln Lys Ala Val Lys Asp Glu Leu
500 505 510
<210> 45
<211> 1470
<212> DNA
<213> Artificial Sequence
<220>
<223> BtP4HB
<400> 45
gcccccgacg aggaggacca cgtcctggtg ctccataagg gcaacttcga cgaggcgctg 60
gcggcccaca agtacctgct ggtggagttc tacgccccat ggtgcggcca ctgcaaggct 120
ctggccccgg agtatgccaa agcagctggg aagctgaagg cagaaggttc tgagatcaga 180
ctggccaagg tggatgccac tgaagagtct gacctggccc agcagtatgg tgtccgaggc 240
taccccacca tcaagttctt caagaatgga gacacagctt cccccaaaga gtacacagct 300
ggccgagaag cggatgatat cgtgaactgg ctgaagaagc gcacgggccc cgctgccagc 360
acgctgtccg acggggctgc tgcagaggcc ttggtggagt ccagtgaggt ggccgtcatt 420
ggcttcttca aggacatgga gtcggactcc gcaaagcagt tcttcttggc agcagaggtc 480
attgatgaca tccccttcgg gatcacatct aacagcgatg tgttctccaa ataccagctg 540
gacaaggatg gggttgtcct ctttaagaag tttgacgaag gccggaacaa ctttgagggg 600
gaggtcacca aagaaaagct tctggacttc atcaagcaca accagttgcc cctggtcatt 660
gagttcaccg agcagacagc cccgaagatc ttcggagggg aaatcaagac tcacatcctg 720
ctgttcctgc cgaaaagcgt gtctgactat gagggcaagc tgagcaactt caaaaaagcg 780
gctgagagct tcaagggcaa gatcctgttt atcttcatcg acagcgacca cactgacaac 840
cagcgcatcc tggaattctt cggcctaaag aaagaggagt gcccggccgt gcgcctcatc 900
acgctggagg aggagatgac caaatataag ccagagtcag atgagctgac ggcagagaag 960
atcaccgagt tctgccaccg cttcctggag ggcaagatta agccccacct gatgagccag 1020
gagctgcctg acgactggga caagcagcct gtcaaagtgc tggttgggaa gaactttgaa 1080
gaggttgctt ttgatgagaa aaagaacgtc tttgtagagt tctatgcccc gtggtgcggt 1140
cactgcaagc agctggcccc catctgggat aagctgggag agacgtacaa ggaccacgag 1200
aacatagtca tcgccaagat ggactccacg gccaacgagg tggaggcggt gaaagtgcac 1260
agcttcccca cgctcaagtt cttccccgcc agcgccgaca ggacggtcat cgactacaat 1320
ggggagcgga cactggatgg ttttaagaag ttcctggaga gtggtggcca ggatggggcc 1380
ggagatgatg acgatctaga agatcttgaa gaagcagaag agcctgatct ggaggaagat 1440
gatgatcaaa aagctgtgaa agatgaactg 1470
<210> 46
<211> 490
<212> PRT
<213> Artificial Sequence
<220>
<223> BtP4HB
<400> 46
Ala Pro Asp Glu Glu Asp His Val Leu Val Leu His Lys Gly Asn Phe
1 5 10 15
Asp Glu Ala Leu Ala Ala His Lys Tyr Leu Leu Val Glu Phe Tyr Ala
20 25 30
Pro Trp Cys Gly His Cys Lys Ala Leu Ala Pro Glu Tyr Ala Lys Ala
35 40 45
Ala Gly Lys Leu Lys Ala Glu Gly Ser Glu Ile Arg Leu Ala Lys Val
50 55 60
Asp Ala Thr Glu Glu Ser Asp Leu Ala Gln Gln Tyr Gly Val Arg Gly
65 70 75 80
Tyr Pro Thr Ile Lys Phe Phe Lys Asn Gly Asp Thr Ala Ser Pro Lys
85 90 95
Glu Tyr Thr Ala Gly Arg Glu Ala Asp Asp Ile Val Asn Trp Leu Lys
100 105 110
Lys Arg Thr Gly Pro Ala Ala Ser Thr Leu Ser Asp Gly Ala Ala Ala
115 120 125
Glu Ala Leu Val Glu Ser Ser Glu Val Ala Val Ile Gly Phe Phe Lys
130 135 140
Asp Met Glu Ser Asp Ser Ala Lys Gln Phe Phe Leu Ala Ala Glu Val
145 150 155 160
Ile Asp Asp Ile Pro Phe Gly Ile Thr Ser Asn Ser Asp Val Phe Ser
165 170 175
Lys Tyr Gln Leu Asp Lys Asp Gly Val Val Leu Phe Lys Lys Phe Asp
180 185 190
Glu Gly Arg Asn Asn Phe Glu Gly Glu Val Thr Lys Glu Lys Leu Leu
195 200 205
Asp Phe Ile Lys His Asn Gln Leu Pro Leu Val Ile Glu Phe Thr Glu
210 215 220
Gln Thr Ala Pro Lys Ile Phe Gly Gly Glu Ile Lys Thr His Ile Leu
225 230 235 240
Leu Phe Leu Pro Lys Ser Val Ser Asp Tyr Glu Gly Lys Leu Ser Asn
245 250 255
Phe Lys Lys Ala Ala Glu Ser Phe Lys Gly Lys Ile Leu Phe Ile Phe
260 265 270
Ile Asp Ser Asp His Thr Asp Asn Gln Arg Ile Leu Glu Phe Phe Gly
275 280 285
Leu Lys Lys Glu Glu Cys Pro Ala Val Arg Leu Ile Thr Leu Glu Glu
290 295 300
Glu Met Thr Lys Tyr Lys Pro Glu Ser Asp Glu Leu Thr Ala Glu Lys
305 310 315 320
Ile Thr Glu Phe Cys His Arg Phe Leu Glu Gly Lys Ile Lys Pro His
325 330 335
Leu Met Ser Gln Glu Leu Pro Asp Asp Trp Asp Lys Gln Pro Val Lys
340 345 350
Val Leu Val Gly Lys Asn Phe Glu Glu Val Ala Phe Asp Glu Lys Lys
355 360 365
Asn Val Phe Val Glu Phe Tyr Ala Pro Trp Cys Gly His Cys Lys Gln
370 375 380
Leu Ala Pro Ile Trp Asp Lys Leu Gly Glu Thr Tyr Lys Asp His Glu
385 390 395 400
Asn Ile Val Ile Ala Lys Met Asp Ser Thr Ala Asn Glu Val Glu Ala
405 410 415
Val Lys Val His Ser Phe Pro Thr Leu Lys Phe Phe Pro Ala Ser Ala
420 425 430
Asp Arg Thr Val Ile Asp Tyr Asn Gly Glu Arg Thr Leu Asp Gly Phe
435 440 445
Lys Lys Phe Leu Glu Ser Gly Gly Gln Asp Gly Ala Gly Asp Asp Asp
450 455 460
Asp Leu Glu Asp Leu Glu Glu Ala Glu Glu Pro Asp Leu Glu Glu Asp
465 470 475 480
Asp Asp Gln Lys Ala Val Lys Asp Glu Leu
485 490
<210> 47
<211> 2211
<212> DNA
<213> Artificial Sequence
<220>
<223> GFP-BtP4HB-ePTS1
<400> 47
atgcgtaaag gcgaagagct gttcactggt gtcgtcccta ttctggtgga actggatggt 60
gatgtcaacg gtcataagtt ttccgtgcgt ggcgagggtg aaggtgacgc aactaatggt 120
aaactgacgc tgaagttcat ctgtactact ggtaaactgc cggttccttg gccgactctg 180
gtaacgacgc tgacttatgg tgttcagtgc tttgctcgtt atccggacca tatgaagcag 240
catgacttct tcaagtccgc catgccggaa ggctatgtgc aggaacgcac gatttccttt 300
aaggatgacg gcacgtacaa aacgcgtgcg gaagtgaaat ttgaaggcga taccctggta 360
aaccgcattg agctgaaagg cattgacttt aaagaggacg gcaatatcct gggccataag 420
ctggaataca attttaacag ccacaatgtt tacatcaccg ccgataaaca aaaaaatggc 480
attaaagcga attttaaaat tcgccacaac gtggaggatg gcagcgtgca gctggctgat 540
cactaccagc aaaacactcc aatcggtgat ggtcctgttc tgctgccaga caatcactat 600
ctgagcacgc aaagcgttct gtctaaagat ccgaacgaga aacgcgatca tatggttctg 660
ctggagttcg taaccgcagc gggcatcacg catggtatgg atgaactgta caaagccccc 720
gacgaggagg accacgtcct ggtgctccat aagggcaact tcgacgaggc gctggcggcc 780
cacaagtacc tgctggtgga gttctacgcc ccatggtgcg gccactgcaa ggctctggcc 840
ccggagtatg ccaaagcagc tgggaagctg aaggcagaag gttctgagat cagactggcc 900
aaggtggatg ccactgaaga gtctgacctg gcccagcagt atggtgtccg aggctacccc 960
accatcaagt tcttcaagaa tggagacaca gcttccccca aagagtacac agctggccga 1020
gaagcggatg atatcgtgaa ctggctgaag aagcgcacgg gccccgctgc cagcacgctg 1080
tccgacgggg ctgctgcaga ggccttggtg gagtccagtg aggtggccgt cattggcttc 1140
ttcaaggaca tggagtcgga ctccgcaaag cagttcttct tggcagcaga ggtcattgat 1200
gacatcccct tcgggatcac atctaacagc gatgtgttct ccaaatacca gctggacaag 1260
gatggggttg tcctctttaa gaagtttgac gaaggccgga acaactttga gggggaggtc 1320
accaaagaaa agcttctgga cttcatcaag cacaaccagt tgcccctggt cattgagttc 1380
accgagcaga cagccccgaa gatcttcgga ggggaaatca agactcacat cctgctgttc 1440
ctgccgaaaa gcgtgtctga ctatgagggc aagctgagca acttcaaaaa agcggctgag 1500
agcttcaagg gcaagatcct gtttatcttc atcgacagcg accacactga caaccagcgc 1560
atcctggaat tcttcggcct aaagaaagag gagtgcccgg ccgtgcgcct catcacgctg 1620
gaggaggaga tgaccaaata taagccagag tcagatgagc tgacggcaga gaagatcacc 1680
gagttctgcc accgcttcct ggagggcaag attaagcccc acctgatgag ccaggagctg 1740
cctgacgact gggacaagca gcctgtcaaa gtgctggttg ggaagaactt tgaagaggtt 1800
gcttttgatg agaaaaagaa cgtctttgta gagttctatg ccccgtggtg cggtcactgc 1860
aagcagctgg cccccatctg ggataagctg ggagagacgt acaaggacca cgagaacata 1920
gtcatcgcca agatggactc cacggccaac gaggtggagg cggtgaaagt gcacagcttc 1980
cccacgctca agttcttccc cgccagcgcc gacaggacgg tcatcgacta caatggggag 2040
cggacactgg atggttttaa gaagttcctg gagagtggtg gccaggatgg ggccggagat 2100
gatgacgatc tagaagatct tgaagaagca gaagagcctg atctggagga agatgatgat 2160
caaaaagctg tgaaagatga actgttggga agaggtagaa gatccaaatt g 2211
<210> 48
<211> 737
<212> PRT
<213> Artificial Sequence
<220>
<223> GFP-BtP4HB-ePTS1
<400> 48
Met Arg Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu
20 25 30
Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu
50 55 60
Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95
Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val
100 105 110
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
Phe Asn Ser His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser Val
165 170 175
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser
195 200 205
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Ala Pro
225 230 235 240
Asp Glu Glu Asp His Val Leu Val Leu His Lys Gly Asn Phe Asp Glu
245 250 255
Ala Leu Ala Ala His Lys Tyr Leu Leu Val Glu Phe Tyr Ala Pro Trp
260 265 270
Cys Gly His Cys Lys Ala Leu Ala Pro Glu Tyr Ala Lys Ala Ala Gly
275 280 285
Lys Leu Lys Ala Glu Gly Ser Glu Ile Arg Leu Ala Lys Val Asp Ala
290 295 300
Thr Glu Glu Ser Asp Leu Ala Gln Gln Tyr Gly Val Arg Gly Tyr Pro
305 310 315 320
Thr Ile Lys Phe Phe Lys Asn Gly Asp Thr Ala Ser Pro Lys Glu Tyr
325 330 335
Thr Ala Gly Arg Glu Ala Asp Asp Ile Val Asn Trp Leu Lys Lys Arg
340 345 350
Thr Gly Pro Ala Ala Ser Thr Leu Ser Asp Gly Ala Ala Ala Glu Ala
355 360 365
Leu Val Glu Ser Ser Glu Val Ala Val Ile Gly Phe Phe Lys Asp Met
370 375 380
Glu Ser Asp Ser Ala Lys Gln Phe Phe Leu Ala Ala Glu Val Ile Asp
385 390 395 400
Asp Ile Pro Phe Gly Ile Thr Ser Asn Ser Asp Val Phe Ser Lys Tyr
405 410 415
Gln Leu Asp Lys Asp Gly Val Val Leu Phe Lys Lys Phe Asp Glu Gly
420 425 430
Arg Asn Asn Phe Glu Gly Glu Val Thr Lys Glu Lys Leu Leu Asp Phe
435 440 445
Ile Lys His Asn Gln Leu Pro Leu Val Ile Glu Phe Thr Glu Gln Thr
450 455 460
Ala Pro Lys Ile Phe Gly Gly Glu Ile Lys Thr His Ile Leu Leu Phe
465 470 475 480
Leu Pro Lys Ser Val Ser Asp Tyr Glu Gly Lys Leu Ser Asn Phe Lys
485 490 495
Lys Ala Ala Glu Ser Phe Lys Gly Lys Ile Leu Phe Ile Phe Ile Asp
500 505 510
Ser Asp His Thr Asp Asn Gln Arg Ile Leu Glu Phe Phe Gly Leu Lys
515 520 525
Lys Glu Glu Cys Pro Ala Val Arg Leu Ile Thr Leu Glu Glu Glu Met
530 535 540
Thr Lys Tyr Lys Pro Glu Ser Asp Glu Leu Thr Ala Glu Lys Ile Thr
545 550 555 560
Glu Phe Cys His Arg Phe Leu Glu Gly Lys Ile Lys Pro His Leu Met
565 570 575
Ser Gln Glu Leu Pro Asp Asp Trp Asp Lys Gln Pro Val Lys Val Leu
580 585 590
Val Gly Lys Asn Phe Glu Glu Val Ala Phe Asp Glu Lys Lys Asn Val
595 600 605
Phe Val Glu Phe Tyr Ala Pro Trp Cys Gly His Cys Lys Gln Leu Ala
610 615 620
Pro Ile Trp Asp Lys Leu Gly Glu Thr Tyr Lys Asp His Glu Asn Ile
625 630 635 640
Val Ile Ala Lys Met Asp Ser Thr Ala Asn Glu Val Glu Ala Val Lys
645 650 655
Val His Ser Phe Pro Thr Leu Lys Phe Phe Pro Ala Ser Ala Asp Arg
660 665 670
Thr Val Ile Asp Tyr Asn Gly Glu Arg Thr Leu Asp Gly Phe Lys Lys
675 680 685
Phe Leu Glu Ser Gly Gly Gln Asp Gly Ala Gly Asp Asp Asp Asp Leu
690 695 700
Glu Asp Leu Glu Glu Ala Glu Glu Pro Asp Leu Glu Glu Asp Asp Asp
705 710 715 720
Gln Lys Ala Val Lys Asp Glu Leu Leu Gly Arg Gly Arg Arg Ser Lys
725 730 735
Leu
<210> 49
<211> 708
<212> DNA
<213> Artificial Sequence
<220>
<223> TEV protease
<400> 49
ggagagtccc tgtttaaagg acccagagac tataacccga ttagtagcac tatttgtcat 60
cttacaaacg aaagtgatgg tcacacgact agtctttacg gaatcggatt cggcccattt 120
attatcacaa acaagcatct gttcagaaga aataacggga cgttgttggt ccaatctctt 180
catggagtat ttaaggtaaa gaacactaca actcttcagc agcatctgat cgacggtagg 240
gatatgatca tcatccgtat gccgaaagac tttccacctt ttcctcagaa gttgaagttt 300
agagaacccc agcgtgagga gcgtatctgt ttagtaacaa caaatttcca aacgaaatct 360
atgtcatcaa tggttagcga taccagttgt actttcccca gttcagatgg gattttctgg 420
aagcactgga ttcagacaaa ggacggtcag tgtggtagtc cgcttgtttc tacaagggac 480
ggatttattg tcgggataca cagtgcttct aactttacga atacaaacaa ctacttcacg 540
tctgtcccta aaaattttat ggagctgttg actaatcagg aagcccaaca gtgggtatct 600
ggctggcgtt tgaacgcgga ttccgtactg tggggtggcc acaaggtttt tatggttaag 660
cctgaagagc cgttccaacc tgtgaaggag gcaacacagc taatgaat 708
<210> 50
<211> 236
<212> PRT
<213> Artificial Sequence
<220>
<223> TEV protease
<400> 50
Gly Glu Ser Leu Phe Lys Gly Pro Arg Asp Tyr Asn Pro Ile Ser Ser
1 5 10 15
Thr Ile Cys His Leu Thr Asn Glu Ser Asp Gly His Thr Thr Ser Leu
20 25 30
Tyr Gly Ile Gly Phe Gly Pro Phe Ile Ile Thr Asn Lys His Leu Phe
35 40 45
Arg Arg Asn Asn Gly Thr Leu Leu Val Gln Ser Leu His Gly Val Phe
50 55 60
Lys Val Lys Asn Thr Thr Thr Leu Gln Gln His Leu Ile Asp Gly Arg
65 70 75 80
Asp Met Ile Ile Ile Arg Met Pro Lys Asp Phe Pro Pro Phe Pro Gln
85 90 95
Lys Leu Lys Phe Arg Glu Pro Gln Arg Glu Glu Arg Ile Cys Leu Val
100 105 110
Thr Thr Asn Phe Gln Thr Lys Ser Met Ser Ser Met Val Ser Asp Thr
115 120 125
Ser Cys Thr Phe Pro Ser Ser Asp Gly Ile Phe Trp Lys His Trp Ile
130 135 140
Gln Thr Lys Asp Gly Gln Cys Gly Ser Pro Leu Val Ser Thr Arg Asp
145 150 155 160
Gly Phe Ile Val Gly Ile His Ser Ala Ser Asn Phe Thr Asn Thr Asn
165 170 175
Asn Tyr Phe Thr Ser Val Pro Lys Asn Phe Met Glu Leu Leu Thr Asn
180 185 190
Gln Glu Ala Gln Gln Trp Val Ser Gly Trp Arg Leu Asn Ala Asp Ser
195 200 205
Val Leu Trp Gly Gly His Lys Val Phe Met Val Lys Pro Glu Glu Pro
210 215 220
Phe Gln Pro Val Lys Glu Ala Thr Gln Leu Met Asn
225 230 235
<210> 51
<211> 1182
<212> DNA
<213> Artificial Sequence
<220>
<223> mRuby-BantP4H-ePTS1
<400> 51
atggtgtcca aaggagagga gttaatcaag gaaaacatga gaatgaaagt tgtcatggag 60
ggctccgtta atggtcacca attcaagtgt acaggggaag gtgaaggtaa tccttacatg 120
ggtacacaaa ctatgagaat taaagtaatt gaaggcggac cactaccatt tgcatttgac 180
attctggcaa cgtcattcat gtacggatca cgaactttca tcaagtaccc taaaggtata 240
ccagactttt tcaagcaatc ttttccagag ggttttacat gggaaagggt tacaagatac 300
gaagatgggg gtgtcgtcac agttatgcaa gatacttcat tagaagatgg ctgccttgtc 360
tatcatgtgc aagtaagagg ggtgaatttt ccttctaacg gacctgtgat gcagaaaaag 420
accaaaggtt gggaaccaaa tactgaaatg atgtacccag ctgatggagg tttgagaggc 480
tacacacaca tggcgcttaa agttgatggt ggaggtcatt tgtcttgtag ttttgttacc 540
acttatcgtt ctaaaaagac tgttggcaat atcaaaatgc caggaataca tgctgtagac 600
cacagactag aaagactcga agagagcgat aacgaaatgt tcgttgtaca gagagagcat 660
gccgtagcca aatttgctgg cttaggcggt ggtatggatg aattgtataa gggttctaaa 720
actttaggct atatccttat ggagaatggc gagaagattg acttggagtt ctttcccgag 780
gaagcaccta aaacagtaga aaacttcaaa aagctggctg agcaaggttt ctacgatggt 840
gttacattcc acagggttat cccaggattc gtgtcacaag gtggagatcc aaccggtacg 900
ggagctggcg gtccaggtta ctcaatccca tgtgaaacgg acggcaatcc ccataggcac 960
cttgttggca gtctaagtat ggcacatgct ggtcgtaaca ccggcggttc acagttcttc 1020
atagtacacg aaccccaacc ccaccttgac ggagtgcaca cagtctttgg taaggctacc 1080
tcaggtatcg aaacggtact aaatatgagg caaggtgacg taatgaaaga ggtcaaggtc 1140
tgggaggaag gatccttggg aagaggtaga agatccaaat tg 1182
<210> 52
<211> 394
<212> PRT
<213> Artificial Sequence
<220>
<223> mRuby-BantP4H-ePTS1
<400> 52
Met Val Ser Lys Gly Glu Glu Leu Ile Lys Glu Asn Met Arg Met Lys
1 5 10 15
Val Val Met Glu Gly Ser Val Asn Gly His Gln Phe Lys Cys Thr Gly
20 25 30
Glu Gly Glu Gly Asn Pro Tyr Met Gly Thr Gln Thr Met Arg Ile Lys
35 40 45
Val Ile Glu Gly Gly Pro Leu Pro Phe Ala Phe Asp Ile Leu Ala Thr
50 55 60
Ser Phe Met Tyr Gly Ser Arg Thr Phe Ile Lys Tyr Pro Lys Gly Ile
65 70 75 80
Pro Asp Phe Phe Lys Gln Ser Phe Pro Glu Gly Phe Thr Trp Glu Arg
85 90 95
Val Thr Arg Tyr Glu Asp Gly Gly Val Val Thr Val Met Gln Asp Thr
100 105 110
Ser Leu Glu Asp Gly Cys Leu Val Tyr His Val Gln Val Arg Gly Val
115 120 125
Asn Phe Pro Ser Asn Gly Pro Val Met Gln Lys Lys Thr Lys Gly Trp
130 135 140
Glu Pro Asn Thr Glu Met Met Tyr Pro Ala Asp Gly Gly Leu Arg Gly
145 150 155 160
Tyr Thr His Met Ala Leu Lys Val Asp Gly Gly Gly His Leu Ser Cys
165 170 175
Ser Phe Val Thr Thr Tyr Arg Ser Lys Lys Thr Val Gly Asn Ile Lys
180 185 190
Met Pro Gly Ile His Ala Val Asp His Arg Leu Glu Arg Leu Glu Glu
195 200 205
Ser Asp Asn Glu Met Phe Val Val Gln Arg Glu His Ala Val Ala Lys
210 215 220
Phe Ala Gly Leu Gly Gly Gly Met Asp Glu Leu Tyr Lys Gly Ser Lys
225 230 235 240
Thr Leu Gly Tyr Ile Leu Met Glu Asn Gly Glu Lys Ile Asp Leu Glu
245 250 255
Phe Phe Pro Glu Glu Ala Pro Lys Thr Val Glu Asn Phe Lys Lys Leu
260 265 270
Ala Glu Gln Gly Phe Tyr Asp Gly Val Thr Phe His Arg Val Ile Pro
275 280 285
Gly Phe Val Ser Gln Gly Gly Asp Pro Thr Gly Thr Gly Ala Gly Gly
290 295 300
Pro Gly Tyr Ser Ile Pro Cys Glu Thr Asp Gly Asn Pro His Arg His
305 310 315 320
Leu Val Gly Ser Leu Ser Met Ala His Ala Gly Arg Asn Thr Gly Gly
325 330 335
Ser Gln Phe Phe Ile Val His Glu Pro Gln Pro His Leu Asp Gly Val
340 345 350
His Thr Val Phe Gly Lys Ala Thr Ser Gly Ile Glu Thr Val Leu Asn
355 360 365
Met Arg Gln Gly Asp Val Met Lys Glu Val Lys Val Trp Glu Glu Gly
370 375 380
Ser Leu Gly Arg Gly Arg Arg Ser Lys Leu
385 390
<210> 53
<211> 1053
<212> DNA
<213> Artificial Sequence
<220>
<223> GFP-BTCol1A1 403-0P-ePTS1
<400> 53
atgcgtaaag gcgaagagct gttcactggt gtcgtcccta ttctggtgga actggatggt 60
gatgtcaacg gtcataagtt ttccgtgcgt ggcgagggtg aaggtgacgc aactaatggt 120
aaactgacgc tgaagttcat ctgtactact ggtaaactgc cggttccttg gccgactctg 180
gtaacgacgc tgacttatgg tgttcagtgc tttgctcgtt atccggacca tatgaagcag 240
catgacttct tcaagtccgc catgccggaa ggctatgtgc aggaacgcac gatttccttt 300
aaggatgacg gcacgtacaa aacgcgtgcg gaagtgaaat ttgaaggcga taccctggta 360
aaccgcattg agctgaaagg cattgacttt aaagaggacg gcaatatcct gggccataag 420
ctggaataca attttaacag ccacaatgtt tacatcaccg ccgataaaca aaaaaatggc 480
attaaagcga attttaaaat tcgccacaac gtggaggatg gcagcgtgca gctggctgat 540
cactaccagc aaaacactcc aatcggtgat ggtcctgttc tgctgccaga caatcactat 600
ctgagcacgc aaagcgttct gtctaaagat ccgaacgaga aacgcgatca tatggttctg 660
ctggagttcg taaccgcagc gggcatcacg catggtatgg atgaactgta caaaggttct 720
ggatttgctg gccctaaagg agccgcagga gaggctggta aagcaggcga acgtggcgta 780
gctggcccag ctggtgctgt gggtccagtt ggtaaagatg gagaggccgg tgcccaaggt 840
ccagccggtc ctgtaggccc agcaggtgaa agaggagaac aaggtcccgc aggctcagct 900
ggattccaag gattagcagg acctgccggc ccagttggcg aagccggtaa agccggagag 960
caaggcgtgg ctggtgacct aggtgcagtt ggccctagtg gcgccagggg agaaagagga 1020
tccttgggaa gaggtagaag atccaaattg taa 1053
<210> 54
<211> 350
<212> PRT
<213> Artificial Sequence
<220>
<223> GFP-BTCol1A1 403-0P-ePTS1
<400> 54
Met Arg Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu
20 25 30
Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu
50 55 60
Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95
Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val
100 105 110
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
Phe Asn Ser His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser Val
165 170 175
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser
195 200 205
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Gly Ser
225 230 235 240
Gly Phe Ala Gly Pro Lys Gly Ala Ala Gly Glu Ala Gly Lys Ala Gly
245 250 255
Glu Arg Gly Val Ala Gly Pro Ala Gly Ala Val Gly Pro Val Gly Lys
260 265 270
Asp Gly Glu Ala Gly Ala Gln Gly Pro Ala Gly Pro Val Gly Pro Ala
275 280 285
Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Ala Gly Phe Gln Gly
290 295 300
Leu Ala Gly Pro Ala Gly Pro Val Gly Glu Ala Gly Lys Ala Gly Glu
305 310 315 320
Gln Gly Val Ala Gly Asp Leu Gly Ala Val Gly Pro Ser Gly Ala Arg
325 330 335
Gly Glu Arg Gly Ser Leu Gly Arg Gly Arg Arg Ser Lys Leu
340 345 350
<210> 55
<211> 1053
<212> DNA
<213> Artificial Sequence
<220>
<223> GFP-BTCol1A1 403-11P-ePTS1
<400> 55
atgcgtaaag gcgaagagct gttcactggt gtcgtcccta ttctggtgga actggatggt 60
gatgtcaacg gtcataagtt ttccgtgcgt ggcgagggtg aaggtgacgc aactaatggt 120
aaactgacgc tgaagttcat ctgtactact ggtaaactgc cggttccttg gccgactctg 180
gtaacgacgc tgacttatgg tgttcagtgc tttgctcgtt atccggacca tatgaagcag 240
catgacttct tcaagtccgc catgccggaa ggctatgtgc aggaacgcac gatttccttt 300
aaggatgacg gcacgtacaa aacgcgtgcg gaagtgaaat ttgaaggcga taccctggta 360
aaccgcattg agctgaaagg cattgacttt aaagaggacg gcaatatcct gggccataag 420
ctggaataca attttaacag ccacaatgtt tacatcaccg ccgataaaca aaaaaatggc 480
attaaagcga attttaaaat tcgccacaac gtggaggatg gcagcgtgca gctggctgat 540
cactaccagc aaaacactcc aatcggtgat ggtcctgttc tgctgccaga caatcactat 600
ctgagcacgc aaagcgttct gtctaaagat ccgaacgaga aacgcgatca tatggttctg 660
ctggagttcg taaccgcagc gggcatcacg catggtatgg atgaactgta caaaggttct 720
ggatttcctg gccctaaggg agccgcagga gagcccggta aagcaggcga aagaggcgta 780
cctggtccac ccggtgctgt gggtccagct ggtaaagatg gagaggccgg tgcccaaggt 840
cctcctggtc ctgctggccc agcaggtgaa agaggagaac aaggtcccgc aggctcacct 900
ggattccaag gattaccagg tccagccgga ccacctggcg aagccggtaa acccggagag 960
caaggcgtgc ctggtgacct aggtgcacca ggacctagtg gcgccagggg agaaagagga 1020
tccttgggaa gaggtagaag atccaaattg taa 1053
<210> 56
<211> 350
<212> PRT
<213> Artificial Sequence
<220>
<223> GFP-BTCol1A1 403-11P-ePTS1
<400> 56
Met Arg Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu
20 25 30
Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu
50 55 60
Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95
Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val
100 105 110
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
Phe Asn Ser His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser Val
165 170 175
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser
195 200 205
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Gly Ser
225 230 235 240
Gly Phe Pro Gly Pro Lys Gly Ala Ala Gly Glu Pro Gly Lys Ala Gly
245 250 255
Glu Arg Gly Val Pro Gly Pro Pro Gly Ala Val Gly Pro Ala Gly Lys
260 265 270
Asp Gly Glu Ala Gly Ala Gln Gly Pro Pro Gly Pro Ala Gly Pro Ala
275 280 285
Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Pro Gly Phe Gln Gly
290 295 300
Leu Pro Gly Pro Ala Gly Pro Pro Gly Glu Ala Gly Lys Pro Gly Glu
305 310 315 320
Gln Gly Val Pro Gly Asp Leu Gly Ala Pro Gly Pro Ser Gly Ala Arg
325 330 335
Gly Glu Arg Gly Ser Leu Gly Arg Gly Arg Arg Ser Lys Leu
340 345 350
<210> 57
<211> 5094
<212> DNA
<213> Artificial Sequence
<220>
<223> AmisCOL1A1-TEV-GFP-HIS-ePTS1
<400> 57
atgcaaggag aagaagacau ucaaacugga agcugcauac aggauggacu agcguacaac 60
aacacagacg uauggaaacc cgagcccugc cagaucugcg uaugcgacaa uggcaacauc 120
cugugugacg augucaucug ugaugauacc ucggacugua ccaaugcuga gauccccuuu 180
ggagaaugcu gucccaucug uccugacacc gcuggcucuu cuaccuaccc caaauccacu 240
ggaguagagg guccuaaggg agacacuggc cccagaggac agaggggacu cccaggccca 300
ccuggcagag auggcauucc uggacagccu ggucucccug gacucccagg accuccaggc 360
ccuccuggcc uugguggaaa cuucgcuccu caaauggcuu acgguuacgg agaugaaacc 420
aaaucugcug gcauuucugu cccuggaccc auggguccag cuggcccccg uggucucccc 480
ggccccccug guucuccugg uccucaaggu uuccaagguc cuccuggaga gccuggagag 540
ccuggugcuu cagguccaau ggguccccgu gguccagccg gccccccugg caagaacgga 600
gaugauggug aagcuggaaa gcccggccgu cccggugagc gcggcccucc uggcccccag 660
ggugcacgug gucugcccgg aacugcuggc cugccaggca ugaaggguca cagagguuuc 720
aguggucugg auggugcuaa gggugaugcu gguccauccg gccccaaggg ugagccuggu 780
agcccuggug agaacggagc uccuggacaa augggcccuc guggucuucc cggugagaga 840
ggccgcccug guccaucugg cccugcuggu gcucguggua acgaugguag uccuggugcu 900
gcuggcccuc cagguccaac uggcccagcu ggccccccug gcuucccugg ugcugcuggu 960
gcuaagggug aaacuggucc ucaagguucu cgugguagug aaggcccaca gggugcucgu 1020
ggugagccug guccuccugg cccugcuggu gcugcugguc cugcuggcaa cccugguucu 1080
gauggucaag cuggugccaa aggugcaacu ggugcuccug guauugcugg ugcuccuggc 1140
uucccuggcg cucguggccc aucuggaccc caggguccca gcggugcucc uggccccaag 1200
gguaacagug gugaacccgg ugcucaaggc aacaagggag acacuggugc aaaaggagag 1260
ccugguccug cuggugucca aggcccaccu gguccagcug gugaagaagg caagagagga 1320
gcccguggug agcccggccc uggaggucuu ccuggcccug cuggcgaacg uggugcuccu 1380
ggaagccgug guuucccugg cgcugauggc auuucugguc ccaagggucc cccuggugaa 1440
cgugguuccc cuggcccugc uggucccaaa ggaucuacug gugaaucugg acgcccuggu 1500
gagccugguc ucccuggugc caagggucuu acuggaagcc cagguagccc agguccugau 1560
ggcaagacug guccaccugg ccccgcuggu caagaugguc gcccaggacc cccaggccca 1620
ccuggugcca gaggucaggc uggugugaug gguuucccug gaccuaaagg ugcugcuggu 1680
gagccuggca aaccugguga gagaggagcu ccuggacccc cuggugcugu uggcgcagcu 1740
gguaaggaug gugaagcugg ugcccaaggu ucuccuggcg cugcuggucc ugcuggagag 1800
agaggugaac aagguccugc uggugcuccu ggauuccagg gucugcccgg uccugcuggc 1860
ccaucuggug aaucuggcaa gccuggugaa caggguguuc cuggagaugc uggugcuccu 1920
gguccagcug gugcaagagg cgagagaggu uucccuggug agcguggugu ccaaggucaa 1980
ccagguccac aggguccacg uggugcuaac ggugcucccg guaacgaugg ugcuaagggu 2040
gaugcuggug cuccuggugc uccugguggc caagguccuc ccggucugca ggguaugccu 2100
ggugagcgug gugcugcugg ucugccuggu uccaagggug acagaggcga uccugguccc 2160
aaaggcacug auggugcucc uggcaaagau ggcgucagag gucuaacugg cccuauuggu 2220
ccuccuggcc cagcuggugc cccuggugac aagggugaag cugguccuuc uggcccugcu 2280
ggucccacug guucucgugg ugccccugga gaucguggug agccuggucc accuggcccu 2340
gcuggauucg cugguccccc uggugcugau ggacaaccug gugcuaaagg ugaaucuggu 2400
gaugcuggug cuaaagguga ugcugguccu ccaggcccug cuggacccac uggugcuccu 2460
ggaccuucug gcgcuguugg ugcuccugga cccaaaggug cucgugguag ugcuggaccc 2520
ccuggugcua cugguuuccc uggugcugcu ggaagaguug guccaccugg cccugcuggu 2580
aacgucgguc uuccuggccc aucaggcccc aguggaaaag aaggcucuaa aggaccccgu 2640
ggugagacug gcccugcugg acgccccggu gaaccuggac cugcuggccc accaggaccu 2700
ucuggcgaga agggcucucc ugguggugau ggucccgcug gugcuccugg uacuccaggc 2760
ccacagggua uugcuggaca gcguggugua guuggucuuc cuggacagag aggcgagaga 2820
gguuucccug gucuccccgg cccaucuggc gaaccuggca aacaaggucc aucuggcucc 2880
ucuggugaac gcgguccucc ugguccaaug ggaccaccug gcuuggcugg accuccuggu 2940
gaagcuggac gugagggugc uccugguucu gaaggugcuc cuggucgcga uggcgcugcu 3000
ggucccaagg gugaccgugg ugagacuggc cccucugguc cuccuggugc ucccggugcc 3060
ccuggagcuc cuggcccuau uggcccugcu ggcaagaaug gagaucgugg ugagacuggu 3120
ccuucugguc cugcuggccc ugccgguccu gcuggugcuc gugguccugc ugguccacaa 3180
ggugcccgug gugacaaagg ugaaacugga gaacauggug acagaggcau gaagggucac 3240
agaggauucc cuggucccca gggucccucu gguccugcug gcucuccugg ugaacaaggu 3300
ccuucuggag cuuccggccc ugcuggucca agagguccuc cuggcucugc uggcaccccu 3360
ggcaaagaug gucugaaugg ucucccuggc ccuauugguc caccuggucc ccggggucgc 3420
acuggugaug uugguccugc uggucccccu ggaccuccug ggcccccagg uccuccuggu 3480
gcacccagcg gcggcuuuga cuucagcuuc augccccagc cuccucagga gaaagcccau 3540
gauccuggcc gcuacuacag agcugaugac gccaacguga ugcgugaccg ugaccuggag 3600
guggacacca cccucaagag ccugagccag cagaucgaga acauccgcag ccccgagggc 3660
accaggaaga acccugcccg caccugccgu gaccugaaga ugugccacaa ugacuggaag 3720
agcggcgagu acuggauuga ccccaaccag ggcugcaauc uggaugccau caaggucuac 3780
uguaacaugg agacuggcga gacuugcguc cacccaaccc aggccaccau cgcucagaag 3840
aacugguaca ugagcaagaa ccccaaggag aagaaacaca ucugguuugg cgagacaaug 3900
agcgauggcu uccaguucga auaugguggg gagggcucca acccagcuga cguugccauc 3960
caacugaccu uccugcgccu gauguccacu gaggccuccc agaacaucac cuaccacugc 4020
aagaacagcg uggcuuacau ggaccaggag acuggcaacc ugaagaaggc ucugcuccuu 4080
cagggcucca acgagaucga gaucagagca gaaggcaaca gccgcuucac cuauggaguc 4140
acugaggaug gcugcacaac ucacaccggu gccuggggca agacagucau ugaauacaaa 4200
acaacaaaaa ccucucgccu gcccgucauu gacguggcuc ccauggacgu uggagcacaa 4260
gaucaggaau ucggaauugu caucggaccu gucugcuucu ugggttctga gaatctttat 4320
tttcagggcc gtaaaggcga agagctgttc actggtgtcg tccctattct ggtggaactg 4380
gatggtgatg tcaacggtca taagttttcc gtgcgtggcg agggtgaagg tgacgcaact 4440
aatggtaaac tgacgctgaa gttcatctgt actactggta aactgccggt tccttggccg 4500
actctggtaa cgacgctgac ttatggtgtt cagtgctttg ctcgttatcc ggaccatatg 4560
aagcagcatg acttcttcaa gtccgccatg ccggaaggct atgtgcagga acgcacgatt 4620
tcctttaagg atgacggcac gtacaaaacg cgtgcggaag tgaaatttga aggcgatacc 4680
ctggtaaacc gcattgagct gaaaggcatt gactttaaag aggacggcaa tatcctgggc 4740
cataagctgg aatacaattt taacagccac aatgtttaca tcaccgccga taaacaaaaa 4800
aatggcatta aagcgaattt taaaattcgc cacaacgtgg aggatggcag cgtgcagctg 4860
gctgatcact accagcaaaa cactccaatc ggtgatggtc ctgttctgct gccagacaat 4920
cactatctga gcacgcaaag cgttctgtct aaagatccga acgagaaacg cgatcatatg 4980
gttctgctgg agttcgtaac cgcagcgggc atcacgcatg gtatggatga actgtacaaa 5040
ggttctcatc atcatcatca tcacttggga agaggtagaa gatccaaatt gtaa 5094
<210> 58
<211> 1697
<212> PRT
<213> Artificial Sequence
<220>
<223> AmisCOL1A1-TEV-GFP-HIS-ePTS1
<400> 58
Met Gln Gly Glu Glu Asp Ile Gln Thr Gly Ser Cys Ile Gln Asp Gly
1 5 10 15
Leu Ala Tyr Asn Asn Thr Asp Val Trp Lys Pro Glu Pro Cys Gln Ile
20 25 30
Cys Val Cys Asp Asn Gly Asn Ile Leu Cys Asp Asp Val Ile Cys Asp
35 40 45
Asp Thr Ser Asp Cys Thr Asn Ala Glu Ile Pro Phe Gly Glu Cys Cys
50 55 60
Pro Ile Cys Pro Asp Thr Ala Gly Ser Ser Thr Tyr Pro Lys Ser Thr
65 70 75 80
Gly Val Glu Gly Pro Lys Gly Asp Thr Gly Pro Arg Gly Gln Arg Gly
85 90 95
Leu Pro Gly Pro Pro Gly Arg Asp Gly Ile Pro Gly Gln Pro Gly Leu
100 105 110
Pro Gly Leu Pro Gly Pro Pro Gly Pro Pro Gly Leu Gly Gly Asn Phe
115 120 125
Ala Pro Gln Met Ala Tyr Gly Tyr Gly Asp Glu Thr Lys Ser Ala Gly
130 135 140
Ile Ser Val Pro Gly Pro Met Gly Pro Ala Gly Pro Arg Gly Leu Pro
145 150 155 160
Gly Pro Pro Gly Ser Pro Gly Pro Gln Gly Phe Gln Gly Pro Pro Gly
165 170 175
Glu Pro Gly Glu Pro Gly Ala Ser Gly Pro Met Gly Pro Arg Gly Pro
180 185 190
Ala Gly Pro Pro Gly Lys Asn Gly Asp Asp Gly Glu Ala Gly Lys Pro
195 200 205
Gly Arg Pro Gly Glu Arg Gly Pro Pro Gly Pro Gln Gly Ala Arg Gly
210 215 220
Leu Pro Gly Thr Ala Gly Leu Pro Gly Met Lys Gly His Arg Gly Phe
225 230 235 240
Ser Gly Leu Asp Gly Ala Lys Gly Asp Ala Gly Pro Ser Gly Pro Lys
245 250 255
Gly Glu Pro Gly Ser Pro Gly Glu Asn Gly Ala Pro Gly Gln Met Gly
260 265 270
Pro Arg Gly Leu Pro Gly Glu Arg Gly Arg Pro Gly Pro Ser Gly Pro
275 280 285
Ala Gly Ala Arg Gly Asn Asp Gly Ser Pro Gly Ala Ala Gly Pro Pro
290 295 300
Gly Pro Thr Gly Pro Ala Gly Pro Pro Gly Phe Pro Gly Ala Ala Gly
305 310 315 320
Ala Lys Gly Glu Thr Gly Pro Gln Gly Ser Arg Gly Ser Glu Gly Pro
325 330 335
Gln Gly Ala Arg Gly Glu Pro Gly Pro Pro Gly Pro Ala Gly Ala Ala
340 345 350
Gly Pro Ala Gly Asn Pro Gly Ser Asp Gly Gln Ala Gly Ala Lys Gly
355 360 365
Ala Thr Gly Ala Pro Gly Ile Ala Gly Ala Pro Gly Phe Pro Gly Ala
370 375 380
Arg Gly Pro Ser Gly Pro Gln Gly Pro Ser Gly Ala Pro Gly Pro Lys
385 390 395 400
Gly Asn Ser Gly Glu Pro Gly Ala Gln Gly Asn Lys Gly Asp Thr Gly
405 410 415
Ala Lys Gly Glu Pro Gly Pro Ala Gly Val Gln Gly Pro Pro Gly Pro
420 425 430
Ala Gly Glu Glu Gly Lys Arg Gly Ala Arg Gly Glu Pro Gly Pro Gly
435 440 445
Gly Leu Pro Gly Pro Ala Gly Glu Arg Gly Ala Pro Gly Ser Arg Gly
450 455 460
Phe Pro Gly Ala Asp Gly Ile Ser Gly Pro Lys Gly Pro Pro Gly Glu
465 470 475 480
Arg Gly Ser Pro Gly Pro Ala Gly Pro Lys Gly Ser Thr Gly Glu Ser
485 490 495
Gly Arg Pro Gly Glu Pro Gly Leu Pro Gly Ala Lys Gly Leu Thr Gly
500 505 510
Ser Pro Gly Ser Pro Gly Pro Asp Gly Lys Thr Gly Pro Pro Gly Pro
515 520 525
Ala Gly Gln Asp Gly Arg Pro Gly Pro Pro Gly Pro Pro Gly Ala Arg
530 535 540
Gly Gln Ala Gly Val Met Gly Phe Pro Gly Pro Lys Gly Ala Ala Gly
545 550 555 560
Glu Pro Gly Lys Pro Gly Glu Arg Gly Ala Pro Gly Pro Pro Gly Ala
565 570 575
Val Gly Ala Ala Gly Lys Asp Gly Glu Ala Gly Ala Gln Gly Ser Pro
580 585 590
Gly Ala Ala Gly Pro Ala Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly
595 600 605
Ala Pro Gly Phe Gln Gly Leu Pro Gly Pro Ala Gly Pro Ser Gly Glu
610 615 620
Ser Gly Lys Pro Gly Glu Gln Gly Val Pro Gly Asp Ala Gly Ala Pro
625 630 635 640
Gly Pro Ala Gly Ala Arg Gly Glu Arg Gly Phe Pro Gly Glu Arg Gly
645 650 655
Val Gln Gly Gln Pro Gly Pro Gln Gly Pro Arg Gly Ala Asn Gly Ala
660 665 670
Pro Gly Asn Asp Gly Ala Lys Gly Asp Ala Gly Ala Pro Gly Ala Pro
675 680 685
Gly Gly Gln Gly Pro Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly
690 695 700
Ala Ala Gly Leu Pro Gly Ser Lys Gly Asp Arg Gly Asp Pro Gly Pro
705 710 715 720
Lys Gly Thr Asp Gly Ala Pro Gly Lys Asp Gly Val Arg Gly Leu Thr
725 730 735
Gly Pro Ile Gly Pro Pro Gly Pro Ala Gly Ala Pro Gly Asp Lys Gly
740 745 750
Glu Ala Gly Pro Ser Gly Pro Ala Gly Pro Thr Gly Ser Arg Gly Ala
755 760 765
Pro Gly Asp Arg Gly Glu Pro Gly Pro Pro Gly Pro Ala Gly Phe Ala
770 775 780
Gly Pro Pro Gly Ala Asp Gly Gln Pro Gly Ala Lys Gly Glu Ser Gly
785 790 795 800
Asp Ala Gly Ala Lys Gly Asp Ala Gly Pro Pro Gly Pro Ala Gly Pro
805 810 815
Thr Gly Ala Pro Gly Pro Ser Gly Ala Val Gly Ala Pro Gly Pro Lys
820 825 830
Gly Ala Arg Gly Ser Ala Gly Pro Pro Gly Ala Thr Gly Phe Pro Gly
835 840 845
Ala Ala Gly Arg Val Gly Pro Pro Gly Pro Ala Gly Asn Val Gly Leu
850 855 860
Pro Gly Pro Ser Gly Pro Ser Gly Lys Glu Gly Ser Lys Gly Pro Arg
865 870 875 880
Gly Glu Thr Gly Pro Ala Gly Arg Pro Gly Glu Pro Gly Pro Ala Gly
885 890 895
Pro Pro Gly Pro Ser Gly Glu Lys Gly Ser Pro Gly Gly Asp Gly Pro
900 905 910
Ala Gly Ala Pro Gly Thr Pro Gly Pro Gln Gly Ile Ala Gly Gln Arg
915 920 925
Gly Val Val Gly Leu Pro Gly Gln Arg Gly Glu Arg Gly Phe Pro Gly
930 935 940
Leu Pro Gly Pro Ser Gly Glu Pro Gly Lys Gln Gly Pro Ser Gly Ser
945 950 955 960
Ser Gly Glu Arg Gly Pro Pro Gly Pro Met Gly Pro Pro Gly Leu Ala
965 970 975
Gly Pro Pro Gly Glu Ala Gly Arg Glu Gly Ala Pro Gly Ser Glu Gly
980 985 990
Ala Pro Gly Arg Asp Gly Ala Ala Gly Pro Lys Gly Asp Arg Gly Glu
995 1000 1005
Thr Gly Pro Ser Gly Pro Pro Gly Ala Pro Gly Ala Pro Gly Ala Pro
1010 1015 1020
Gly Pro Ile Gly Pro Ala Gly Lys Asn Gly Asp Arg Gly Glu Thr Gly
1025 1030 1035 1040
Pro Ser Gly Pro Ala Gly Pro Ala Gly Pro Ala Gly Ala Arg Gly Pro
1045 1050 1055
Ala Gly Pro Gln Gly Ala Arg Gly Asp Lys Gly Glu Thr Gly Glu His
1060 1065 1070
Gly Asp Arg Gly Met Lys Gly His Arg Gly Phe Pro Gly Pro Gln Gly
1075 1080 1085
Pro Ser Gly Pro Ala Gly Ser Pro Gly Glu Gln Gly Pro Ser Gly Ala
1090 1095 1100
Ser Gly Pro Ala Gly Pro Arg Gly Pro Pro Gly Ser Ala Gly Thr Pro
1105 1110 1115 1120
Gly Lys Asp Gly Leu Asn Gly Leu Pro Gly Pro Ile Gly Pro Pro Gly
1125 1130 1135
Pro Arg Gly Arg Thr Gly Asp Val Gly Pro Ala Gly Pro Pro Gly Pro
1140 1145 1150
Pro Gly Pro Pro Gly Pro Pro Gly Ala Pro Ser Gly Gly Phe Asp Phe
1155 1160 1165
Ser Phe Met Pro Gln Pro Pro Gln Glu Lys Ala His Asp Pro Gly Arg
1170 1175 1180
Tyr Tyr Arg Ala Asp Asp Ala Asn Val Met Arg Asp Arg Asp Leu Glu
1185 1190 1195 1200
Val Asp Thr Thr Leu Lys Ser Leu Ser Gln Gln Ile Glu Asn Ile Arg
1205 1210 1215
Ser Pro Glu Gly Thr Arg Lys Asn Pro Ala Arg Thr Cys Arg Asp Leu
1220 1225 1230
Lys Met Cys His Asn Asp Trp Lys Ser Gly Glu Tyr Trp Ile Asp Pro
1235 1240 1245
Asn Gln Gly Cys Asn Leu Asp Ala Ile Lys Val Tyr Cys Asn Met Glu
1250 1255 1260
Thr Gly Glu Thr Cys Val His Pro Thr Gln Ala Thr Ile Ala Gln Lys
1265 1270 1275 1280
Asn Trp Tyr Met Ser Lys Asn Pro Lys Glu Lys Lys His Ile Trp Phe
1285 1290 1295
Gly Glu Thr Met Ser Asp Gly Phe Gln Phe Glu Tyr Gly Gly Glu Gly
1300 1305 1310
Ser Asn Pro Ala Asp Val Ala Ile Gln Leu Thr Phe Leu Arg Leu Met
1315 1320 1325
Ser Thr Glu Ala Ser Gln Asn Ile Thr Tyr His Cys Lys Asn Ser Val
1330 1335 1340
Ala Tyr Met Asp Gln Glu Thr Gly Asn Leu Lys Lys Ala Leu Leu Leu
1345 1350 1355 1360
Gln Gly Ser Asn Glu Ile Glu Ile Arg Ala Glu Gly Asn Ser Arg Phe
1365 1370 1375
Thr Tyr Gly Val Thr Glu Asp Gly Cys Thr Thr His Thr Gly Ala Trp
1380 1385 1390
Gly Lys Thr Val Ile Glu Tyr Lys Thr Thr Lys Thr Ser Arg Leu Pro
1395 1400 1405
Val Ile Asp Val Ala Pro Met Asp Val Gly Ala Gln Asp Gln Glu Phe
1410 1415 1420
Gly Ile Val Ile Gly Pro Val Cys Phe Leu Gly Ser Glu Asn Leu Tyr
1425 1430 1435 1440
Phe Gln Gly Arg Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile
1445 1450 1455
Leu Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg
1460 1465 1470
Gly Glu Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe
1475 1480 1485
Ile Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr
1490 1495 1500
Thr Leu Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met
1505 1510 1515 1520
Lys Gln His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln
1525 1530 1535
Glu Arg Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala
1540 1545 1550
Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys
1555 1560 1565
Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu
1570 1575 1580
Tyr Asn Phe Asn Ser His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys
1585 1590 1595 1600
Asn Gly Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp Gly
1605 1610 1615
Ser Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp
1620 1625 1630
Gly Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val
1635 1640 1645
Leu Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu
1650 1655 1660
Phe Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
1665 1670 1675 1680
Gly Ser His His His His His His Leu Gly Arg Gly Arg Arg Ser Lys
1685 1690 1695
Leu
<210> 59
<211> 4818
<212> DNA
<213> Artificial Sequence
<220>
<223> AmisCOL1A2-TEV-GFP-HIS-ePTS1
<400> 59
atgcaacaag caaatgaggc aactgcagga cggaagggcc caagaggaga caaagggcca 60
cagggagaaa ggggtccacc aggtccacca ggcagagatg gtgaagatgg tccaccaggg 120
cctccagggc cccctggtcc tccaggtctt ggcggaaact ttgctgctca gtatgacgga 180
gcaaaagcag gtgactatgg ctcaggacca atgggtttaa tgggacccag aggcccacct 240
ggaacaagtg gacctcctgg tcctcctggc ttccaaggac ctcatggtga gcctggtgaa 300
cctggtcaaa caggtcccca gggtccccgt ggtccatctg gtcctcctgg aaaggctggt 360
gaagatggcc atcctggaaa atctggacga tctggtgaga ggggcgtctc tggtcctcag 420
ggtgctcgtg gtttccctgg aactcctggt ctgcctggct ttaagggaat tagaggacac 480
aatggtctgg atggtcagaa gggacaacct ggtactccag gcattaaggg tgaatccggt 540
gcccctggtg aaaatggtac cccaggacaa tctggtgctc gtggccttcc cggtgaaaga 600
ggaagaattg gtgcacctgg cccagctggt gcccgtggca gcgatggtag cactggtccc 660
actggtcctg ctggccctat cggttctgct ggtgctccag gtttcccagg tgctcctgga 720
gccaagggtg aaattggagc tgctggtaat gtaggtcctt ctggccctgc tggtccacga 780
ggagaggctg gacttcctgg ttcttctggt cccgttggcc ctcctggaaa ccctggttct 840
aatggtcttg ctggtgctaa aggtgcaact ggtcttcctg gtgttgctgg tgctcctggc 900
ttgcctggtc cacgtggtat tcctggacct tctggccctg ccggagctgc tggcaccaga 960
ggtcttgttg gtgaaccagg ccctgctggt gccaagggag aaagtggtaa caagggtgaa 1020
cccggtgctg ctggtccatc aggtcccgct ggtccaagtg gtgaagaagg caagaaaggt 1080
actactggtg aacctggctc ttctggcccc cctggtccag ctggtctaag aggcgttcct 1140
ggatctcgtg gtctccctgg agctgacggc agagctggtg ttatgggacc tgctggcagc 1200
cgtggtgcta ctggtcctgc tggtgctaaa ggtcctagtg gtgataatgg tcgccctggt 1260
gagcctggcc ttatgggtcc aagaggtctc cctggtcaac ctggaagctc aggccctgct 1320
ggcaaggaag gtcctgttgg tttccctggt gcagatggta gagttggccc aactggtcca 1380
gctggtgcaa gaggtgagcc tggcaacatt ggattccctg gacccaaagg ccccactggt 1440
gaccctggca aacctggtga cagaggccat gctggtcttg ctggtgctcg gggtgcgcct 1500
ggtcctgagg gcaacaatgg ggctcaaggt cctcctggtg ttgctggcaa ccctggtgca 1560
aaaggtgaac aaggtccagc tggtcctccc ggtttccagg gtctcccagg cccctcaggt 1620
ccagctggtg aagctggcaa accaggtgaa aggggtatgg ctggtgaatt tggtgcccct 1680
ggccctgcgg gttcaagagg tgaacgtggt cctccaggcg aaagtggtgc tgttggtcct 1740
gtaggtccca ttggaagccg tggtccatct ggtccaccag gcactgatgg caacaagggt 1800
gaacctggta atgttggtaa tgctggtact gcaggcccct ctggcgctgg tggagcccca 1860
ggagagagag gcattgctgg tattccagga cccaagggtg aaaagggtgc tacaggtctg 1920
agaggggata ctggcgcaac aggaagagat ggtgctcgtg gtgctcctgg tgctattgga 1980
gcccctggcc ccgctggtgg agctggtgag cggggtgaag gtggtcctgc tggtgctgct 2040
ggcccttctg gtgcccgtgg tattcctggt gaacgtggtg agcctggtcc tgctggccct 2100
actggatttg ctggacctgc tggtgcagct ggccaacctg gtgctaaagg tgaacgaggt 2160
acaaaaggac ccaagggtga aaatggtcca caaggtgctg ttggcccagt tggttcttct 2220
ggaccatcag gtcctgttgg tgcctctggt cctgctggtc ctcgtggtga tggtggtcct 2280
cctggtgtca ctggtttccc tggagctgct ggcagaactg gtcctcccgg cccctctggt 2340
atcactggcc cccctggtcc ccctggctca gctggcaaag atggtatgag aggcccacgt 2400
ggtgatactg gtccagttgg ccgcactgga gaacaaggca ttgttggccc acctggcttc 2460
agtggtgaga aaggtccatc tggagagcct ggtgctgctg gtccccctgg taccccaggt 2520
cctcagggta ttcttggtgc tcctggtatc cttggtctgc ctggctctcg gggagaacgt 2580
ggtcttccag gcatctctgg agcaacaggt gaaccaggtc ctcttggtat ttccggtcct 2640
cctggtgcac gtggtccctc tggccccgtg ggttctgctg gtctgaatgg tgcccctggt 2700
gaagctggcc gtgatggcaa tcctggccat gatggtgctc caggccgtga tggtgctcct 2760
ggtttcaagg gtgagcgtgg tgctcctggg aacaatggac ctgctggtgc tgttggtgct 2820
cctggcgccc atggtcaagt tggtcctgct ggaaagcctg gaaatcgtgg tgatcctggt 2880
cctgttggtc cttctggtcc tgctggtgct tttggtgcaa ggggtccttc tggcccacaa 2940
ggtgcacgtg gtgagaaggg agaaacaggt gaaaagggac acagaggtat gcctggattt 3000
aaggggcaca atggacttca gggtctgcct ggtcttgctg gccaacatgg agatcaaggt 3060
cctccaggtt ctactggccc cgctggccca aggggtccct ctggtccttc tggtcctgct 3120
ggaaaagatg gtcgcaatgg actccctggc cctattggac ctgctggtgt gcgtggttct 3180
cagggtagcc aaggtccttc gggtccacct ggcccacctg gtctccctgg tccccctggt 3240
gcaaatggtg gtggatacga agttggctat gatcttgaat actaccgggc tgatcagcct 3300
gctctcagac ctaaggacta tgaagttgat gccactctga aaacattgaa caaccaaatt 3360
gagaccctcc tgaccccaga aggctccagg aagaacccag ctcgcacctg ccgtgacctg 3420
agactcagcc acccagaatg gaccagtggt ttctactgga ttgatcccaa ccagggctgt 3480
actatggatg ccattagagt gtattgtgac ttctccactg gtgagacttg catacatgcc 3540
aatctagaaa acatccccac taagaactgg tatgtcagca agaactccaa ggaaaagaag 3600
cacatgtggt ttggtgaaac tatcaatggt ggtacccagt ttgaatataa cgatgaagga 3660
gtgacttcca aggacatggc tacccaactt gccttcatgc gtctgctggc caaccatgcc 3720
tcccagaaca tcacctacca ctgcaagaac agtattgcat acatggatga agaaactggc 3780
aaccttaaga aggctgtaat actgcaggga tccaatgatg ttgaactacg agctgaaggc 3840
aacagcagat tcactttcag tgttctggaa gatggctgct ctagaaagaa caacgcatgg 3900
ggcaaaacaa tcattgaata tagaacaaac aaaccatctc gcttgcccat ccttgacatt 3960
gcacctttgg acattggtgg agctgatcaa gaattcggtt tggacattgg cccagtctgt 4020
ttcaaaggtt ctgagaatct ttattttcag ggccgtaaag gcgaagagct gttcactggt 4080
gtcgtcccta ttctggtgga actggatggt gatgtcaacg gtcataagtt ttccgtgcgt 4140
ggcgagggtg aaggtgacgc aactaatggt aaactgacgc tgaagttcat ctgtactact 4200
ggtaaactgc cggttccttg gccgactctg gtaacgacgc tgacttatgg tgttcagtgc 4260
tttgctcgtt atccggacca tatgaagcag catgacttct tcaagtccgc catgccggaa 4320
ggctatgtgc aggaacgcac gatttccttt aaggatgacg gcacgtacaa aacgcgtgcg 4380
gaagtgaaat ttgaaggcga taccctggta aaccgcattg agctgaaagg cattgacttt 4440
aaagaggacg gcaatatcct gggccataag ctggaataca attttaacag ccacaatgtt 4500
tacatcaccg ccgataaaca aaaaaatggc attaaagcga attttaaaat tcgccacaac 4560
gtggaggatg gcagcgtgca gctggctgat cactaccagc aaaacactcc aatcggtgat 4620
ggtcctgttc tgctgccaga caatcactat ctgagcacgc aaagcgttct gtctaaagat 4680
ccgaacgaga aacgcgatca tatggttctg ctggagttcg taaccgcagc gggcatcacg 4740
catggtatgg atgaactgta caaaggttct catcatcatc atcatcactt gggaagaggt 4800
agaagatcca aattgtaa 4818
<210> 60
<211> 1605
<212> PRT
<213> Artificial Sequence
<220>
<223> AmisCOL1A2-TEV-GFP-HIS-ePTS1
<400> 60
Met Gln Gln Ala Asn Glu Ala Thr Ala Gly Arg Lys Gly Pro Arg Gly
1 5 10 15
Asp Lys Gly Pro Gln Gly Glu Arg Gly Pro Pro Gly Pro Pro Gly Arg
20 25 30
Asp Gly Glu Asp Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro
35 40 45
Gly Leu Gly Gly Asn Phe Ala Ala Gln Tyr Asp Gly Ala Lys Ala Gly
50 55 60
Asp Tyr Gly Ser Gly Pro Met Gly Leu Met Gly Pro Arg Gly Pro Pro
65 70 75 80
Gly Thr Ser Gly Pro Pro Gly Pro Pro Gly Phe Gln Gly Pro His Gly
85 90 95
Glu Pro Gly Glu Pro Gly Gln Thr Gly Pro Gln Gly Pro Arg Gly Pro
100 105 110
Ser Gly Pro Pro Gly Lys Ala Gly Glu Asp Gly His Pro Gly Lys Ser
115 120 125
Gly Arg Ser Gly Glu Arg Gly Val Ser Gly Pro Gln Gly Ala Arg Gly
130 135 140
Phe Pro Gly Thr Pro Gly Leu Pro Gly Phe Lys Gly Ile Arg Gly His
145 150 155 160
Asn Gly Leu Asp Gly Gln Lys Gly Gln Pro Gly Thr Pro Gly Ile Lys
165 170 175
Gly Glu Ser Gly Ala Pro Gly Glu Asn Gly Thr Pro Gly Gln Ser Gly
180 185 190
Ala Arg Gly Leu Pro Gly Glu Arg Gly Arg Ile Gly Ala Pro Gly Pro
195 200 205
Ala Gly Ala Arg Gly Ser Asp Gly Ser Thr Gly Pro Thr Gly Pro Ala
210 215 220
Gly Pro Ile Gly Ser Ala Gly Ala Pro Gly Phe Pro Gly Ala Pro Gly
225 230 235 240
Ala Lys Gly Glu Ile Gly Ala Ala Gly Asn Val Gly Pro Ser Gly Pro
245 250 255
Ala Gly Pro Arg Gly Glu Ala Gly Leu Pro Gly Ser Ser Gly Pro Val
260 265 270
Gly Pro Pro Gly Asn Pro Gly Ser Asn Gly Leu Ala Gly Ala Lys Gly
275 280 285
Ala Thr Gly Leu Pro Gly Val Ala Gly Ala Pro Gly Leu Pro Gly Pro
290 295 300
Arg Gly Ile Pro Gly Pro Ser Gly Pro Ala Gly Ala Ala Gly Thr Arg
305 310 315 320
Gly Leu Val Gly Glu Pro Gly Pro Ala Gly Ala Lys Gly Glu Ser Gly
325 330 335
Asn Lys Gly Glu Pro Gly Ala Ala Gly Pro Ser Gly Pro Ala Gly Pro
340 345 350
Ser Gly Glu Glu Gly Lys Lys Gly Thr Thr Gly Glu Pro Gly Ser Ser
355 360 365
Gly Pro Pro Gly Pro Ala Gly Leu Arg Gly Val Pro Gly Ser Arg Gly
370 375 380
Leu Pro Gly Ala Asp Gly Arg Ala Gly Val Met Gly Pro Ala Gly Ser
385 390 395 400
Arg Gly Ala Thr Gly Pro Ala Gly Ala Lys Gly Pro Ser Gly Asp Asn
405 410 415
Gly Arg Pro Gly Glu Pro Gly Leu Met Gly Pro Arg Gly Leu Pro Gly
420 425 430
Gln Pro Gly Ser Ser Gly Pro Ala Gly Lys Glu Gly Pro Val Gly Phe
435 440 445
Pro Gly Ala Asp Gly Arg Val Gly Pro Thr Gly Pro Ala Gly Ala Arg
450 455 460
Gly Glu Pro Gly Asn Ile Gly Phe Pro Gly Pro Lys Gly Pro Thr Gly
465 470 475 480
Asp Pro Gly Lys Pro Gly Asp Arg Gly His Ala Gly Leu Ala Gly Ala
485 490 495
Arg Gly Ala Pro Gly Pro Glu Gly Asn Asn Gly Ala Gln Gly Pro Pro
500 505 510
Gly Val Ala Gly Asn Pro Gly Ala Lys Gly Glu Gln Gly Pro Ala Gly
515 520 525
Pro Pro Gly Phe Gln Gly Leu Pro Gly Pro Ser Gly Pro Ala Gly Glu
530 535 540
Ala Gly Lys Pro Gly Glu Arg Gly Met Ala Gly Glu Phe Gly Ala Pro
545 550 555 560
Gly Pro Ala Gly Ser Arg Gly Glu Arg Gly Pro Pro Gly Glu Ser Gly
565 570 575
Ala Val Gly Pro Val Gly Pro Ile Gly Ser Arg Gly Pro Ser Gly Pro
580 585 590
Pro Gly Thr Asp Gly Asn Lys Gly Glu Pro Gly Asn Val Gly Asn Ala
595 600 605
Gly Thr Ala Gly Pro Ser Gly Ala Gly Gly Ala Pro Gly Glu Arg Gly
610 615 620
Ile Ala Gly Ile Pro Gly Pro Lys Gly Glu Lys Gly Ala Thr Gly Leu
625 630 635 640
Arg Gly Asp Thr Gly Ala Thr Gly Arg Asp Gly Ala Arg Gly Ala Pro
645 650 655
Gly Ala Ile Gly Ala Pro Gly Pro Ala Gly Gly Ala Gly Glu Arg Gly
660 665 670
Glu Gly Gly Pro Ala Gly Ala Ala Gly Pro Ser Gly Ala Arg Gly Ile
675 680 685
Pro Gly Glu Arg Gly Glu Pro Gly Pro Ala Gly Pro Thr Gly Phe Ala
690 695 700
Gly Pro Ala Gly Ala Ala Gly Gln Pro Gly Ala Lys Gly Glu Arg Gly
705 710 715 720
Thr Lys Gly Pro Lys Gly Glu Asn Gly Pro Gln Gly Ala Val Gly Pro
725 730 735
Val Gly Ser Ser Gly Pro Ser Gly Pro Val Gly Ala Ser Gly Pro Ala
740 745 750
Gly Pro Arg Gly Asp Gly Gly Pro Pro Gly Val Thr Gly Phe Pro Gly
755 760 765
Ala Ala Gly Arg Thr Gly Pro Pro Gly Pro Ser Gly Ile Thr Gly Pro
770 775 780
Pro Gly Pro Pro Gly Ser Ala Gly Lys Asp Gly Met Arg Gly Pro Arg
785 790 795 800
Gly Asp Thr Gly Pro Val Gly Arg Thr Gly Glu Gln Gly Ile Val Gly
805 810 815
Pro Pro Gly Phe Ser Gly Glu Lys Gly Pro Ser Gly Glu Pro Gly Ala
820 825 830
Ala Gly Pro Pro Gly Thr Pro Gly Pro Gln Gly Ile Leu Gly Ala Pro
835 840 845
Gly Ile Leu Gly Leu Pro Gly Ser Arg Gly Glu Arg Gly Leu Pro Gly
850 855 860
Ile Ser Gly Ala Thr Gly Glu Pro Gly Pro Leu Gly Ile Ser Gly Pro
865 870 875 880
Pro Gly Ala Arg Gly Pro Ser Gly Pro Val Gly Ser Ala Gly Leu Asn
885 890 895
Gly Ala Pro Gly Glu Ala Gly Arg Asp Gly Asn Pro Gly His Asp Gly
900 905 910
Ala Pro Gly Arg Asp Gly Ala Pro Gly Phe Lys Gly Glu Arg Gly Ala
915 920 925
Pro Gly Asn Asn Gly Pro Ala Gly Ala Val Gly Ala Pro Gly Ala His
930 935 940
Gly Gln Val Gly Pro Ala Gly Lys Pro Gly Asn Arg Gly Asp Pro Gly
945 950 955 960
Pro Val Gly Pro Ser Gly Pro Ala Gly Ala Phe Gly Ala Arg Gly Pro
965 970 975
Ser Gly Pro Gln Gly Ala Arg Gly Glu Lys Gly Glu Thr Gly Glu Lys
980 985 990
Gly His Arg Gly Met Pro Gly Phe Lys Gly His Asn Gly Leu Gln Gly
995 1000 1005
Leu Pro Gly Leu Ala Gly Gln His Gly Asp Gln Gly Pro Pro Gly Ser
1010 1015 1020
Thr Gly Pro Ala Gly Pro Arg Gly Pro Ser Gly Pro Ser Gly Pro Ala
1025 1030 1035 1040
Gly Lys Asp Gly Arg Asn Gly Leu Pro Gly Pro Ile Gly Pro Ala Gly
1045 1050 1055
Val Arg Gly Ser Gln Gly Ser Gln Gly Pro Ser Gly Pro Pro Gly Pro
1060 1065 1070
Pro Gly Leu Pro Gly Pro Pro Gly Ala Asn Gly Gly Gly Tyr Glu Val
1075 1080 1085
Gly Tyr Asp Leu Glu Tyr Tyr Arg Ala Asp Gln Pro Ala Leu Arg Pro
1090 1095 1100
Lys Asp Tyr Glu Val Asp Ala Thr Leu Lys Thr Leu Asn Asn Gln Ile
1105 1110 1115 1120
Glu Thr Leu Leu Thr Pro Glu Gly Ser Arg Lys Asn Pro Ala Arg Thr
1125 1130 1135
Cys Arg Asp Leu Arg Leu Ser His Pro Glu Trp Thr Ser Gly Phe Tyr
1140 1145 1150
Trp Ile Asp Pro Asn Gln Gly Cys Thr Met Asp Ala Ile Arg Val Tyr
1155 1160 1165
Cys Asp Phe Ser Thr Gly Glu Thr Cys Ile His Ala Asn Leu Glu Asn
1170 1175 1180
Ile Pro Thr Lys Asn Trp Tyr Val Ser Lys Asn Ser Lys Glu Lys Lys
1185 1190 1195 1200
His Met Trp Phe Gly Glu Thr Ile Asn Gly Gly Thr Gln Phe Glu Tyr
1205 1210 1215
Asn Asp Glu Gly Val Thr Ser Lys Asp Met Ala Thr Gln Leu Ala Phe
1220 1225 1230
Met Arg Leu Leu Ala Asn His Ala Ser Gln Asn Ile Thr Tyr His Cys
1235 1240 1245
Lys Asn Ser Ile Ala Tyr Met Asp Glu Glu Thr Gly Asn Leu Lys Lys
1250 1255 1260
Ala Val Ile Leu Gln Gly Ser Asn Asp Val Glu Leu Arg Ala Glu Gly
1265 1270 1275 1280
Asn Ser Arg Phe Thr Phe Ser Val Leu Glu Asp Gly Cys Ser Arg Lys
1285 1290 1295
Asn Asn Ala Trp Gly Lys Thr Ile Ile Glu Tyr Arg Thr Asn Lys Pro
1300 1305 1310
Ser Arg Leu Pro Ile Leu Asp Ile Ala Pro Leu Asp Ile Gly Gly Ala
1315 1320 1325
Asp Gln Glu Phe Gly Leu Asp Ile Gly Pro Val Cys Phe Lys Gly Ser
1330 1335 1340
Glu Asn Leu Tyr Phe Gln Gly Arg Lys Gly Glu Glu Leu Phe Thr Gly
1345 1350 1355 1360
Val Val Pro Ile Leu Val Glu Leu Asp Gly Asp Val Asn Gly His Lys
1365 1370 1375
Phe Ser Val Arg Gly Glu Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu
1380 1385 1390
Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro
1395 1400 1405
Thr Leu Val Thr Thr Leu Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr
1410 1415 1420
Pro Asp His Met Lys Gln His Asp Phe Phe Lys Ser Ala Met Pro Glu
1425 1430 1435 1440
Gly Tyr Val Gln Glu Arg Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr
1445 1450 1455
Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg
1460 1465 1470
Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly
1475 1480 1485
His Lys Leu Glu Tyr Asn Phe Asn Ser His Asn Val Tyr Ile Thr Ala
1490 1495 1500
Asp Lys Gln Lys Asn Gly Ile Lys Ala Asn Phe Lys Ile Arg His Asn
1505 1510 1515 1520
Val Glu Asp Gly Ser Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr
1525 1530 1535
Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser
1540 1545 1550
Thr Gln Ser Val Leu Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met
1555 1560 1565
Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile Thr His Gly Met Asp
1570 1575 1580
Glu Leu Tyr Lys Gly Ser His His His His His His Leu Gly Arg Gly
1585 1590 1595 1600
Arg Arg Ser Lys Leu
1605
<210> 61
<211> 5118
<212> DNA
<213> Artificial Sequence
<220>
<223> BtCOL1A1-TEV-GFP-HIS-ePTS1
<400> 61
atgcaagagg agggccagga agaaggccaa gaagaagaca tcccaccagt cacctgcgta 60
cagaacggcc tcaggtacca tgaccgagac gtgtggaaac ccgtgccctg ccagatctgt 120
gtctgcgaca acggcaacgt gctgtgcgat gacgtgatct gcgacgaact taaggactgt 180
cctaacgcca aagtccccac ggacgaatgc tgccccgtct gccccgaagg ccaggaatca 240
cccacggacc aagaaaccac cggagtcgag ggaccgaaag gagacactgg cccccgaggc 300
ccaaggggac ccgccggccc ccccggccga gatggcatcc ctggacaacc tggacttccc 360
ggaccccctg gaccccccgg acctcccgga ccccctggcc tcggaggaaa ctttgctccc 420
cagttgtctt acggctatga tgagaaatca acaggaattt ccgtgcctgg tcccatgggt 480
ccttctggtc ctcgtggtct ccctggcccc cctggcgcac ctggtcccca aggtttccaa 540
ggcccccctg gtgagcctgg cgagccagga gcctcaggtc ccatgggtcc ccgtggtccc 600
cctggccccc ctggcaagaa cggagatgat ggcgaagctg gaaagcctgg tcgtcctggt 660
gagcgcgggc ctcccggacc tcagggtgct cggggattgc ctggaacagc tggcctccct 720
ggaatgaagg gacacagagg tttcagtggt ttggatggtg ccaagggaga tgctggtcct 780
gctggcccca agggcgagcc tggtagcccc ggtgaaaatg gagctcctgg tcagatgggc 840
ccccgtggtc tgcctggtga gagaggtcgc cctggagccc ctggccctgc tggtgctcga 900
ggaaatgatg gtgcgactgg tgctgctggg ccccctggtc ccactggccc cgctggtcct 960
cctggtttcc ctggtgctgt gggtgctaag ggtgaaggtg gtccccaagg accccgaggt 1020
tctgaaggtc cccagggtgt acgtggtgag cctggccccc ctggccctgc tggtgctgct 1080
ggccctgctg gcaaccctgg tgctgatgga cagcctggtg ctaaaggagc caatggcgct 1140
cctggtattg ctggtgctcc tggcttccct ggtgcccgag gcccctctgg accccagggc 1200
cccagcggcc cccctggccc caagggtaac agcggtgaac ctggtgctcc tggcagcaaa 1260
ggagacactg gcgccaaggg agaacccggt cccactggta ttcaaggccc ccctggcccc 1320
gctggggaag aaggaaagcg aggagcccga ggtgaacctg gacctgctgg cctgcctgga 1380
ccccctggcg agcgtggtgg acctggaagc cgtggtttcc ctggcgccga cggtgttgct 1440
ggtcccaagg gtcctgctgg tgaacgcggt gctcctggcc ctgctggccc caaaggttct 1500
cctggtgaag ctggtcgccc cggtgaagct ggtctgcccg gtgccaaggg tctgactgga 1560
agccctggca gcccgggtcc tgatggcaaa actggccccc ctggtcccgc cggtcaagat 1620
ggccgccctg gacctccagg ccctcccggt gcccgtggtc aggctggcgt gatgggtttc 1680
cctggaccta aaggtgctgc tggagagcct ggaaaagctg gagagcgagg tgttcctgga 1740
ccccctggcg ctgttggtcc tgctggcaaa gacggagaag ctggagctca gggaccccca 1800
ggacctgctg gccccgctgg tgagagaggc gaacaaggcc ctgctggctc ccctggattc 1860
cagggtctcc ccggccctgc tggtcctcct ggtgaagcag gcaaacctgg tgaacagggt 1920
gttcctggag atcttggtgc ccccggcccc tctggagcaa gaggcgagag aggtttcccc 1980
ggcgagcgtg gtgtgcaagg gccgcccggt cctgcaggtc cccgtggggc caatggtgcc 2040
cctggcaacg atggtgctaa gggtgatgct ggtgcccctg gagcccccgg tagccagggt 2100
gcccctggcc ttcaaggaat gcctggtgaa cgaggtgcag ctggtcttcc aggccctaag 2160
ggtgacagag gggatgctgg tcccaaaggt gctgatggtg ctcctggcaa agatggcgtc 2220
cgtggtctga ctggtcccat cggtcctcct ggccccgctg gtgcccctgg tgacaagggt 2280
gaagctggtc ctagtggccc agccggtccc actggagctc gtggtgcccc cggtgaccgt 2340
ggtgagcctg gtccccccgg ccctgctggc ttcgctggcc cccctggtgc tgatggccaa 2400
cctggtgcta aaggcgaacc tggtgatgct ggtgctaaag gtgacgctgg tccccccggc 2460
cctgctgggc ccgctggacc ccccggcccc attggtaacg ttggtgctcc cggacccaaa 2520
ggtgctcgtg gcagcgctgg tccccctggt gctactggtt tcccaggtgc tgctggccga 2580
gtcggtcccc ccggcccctc tggaaatgct ggaccccctg gccctcctgg ccctgctggc 2640
aaagaaggca gcaaaggccc ccgcggtgag actggccccg ctgggcgtcc cggtgaagtc 2700
ggtccccctg gtccccctgg ccccgctggt gagaaaggag cccctggtgc tgacggacct 2760
gctggagctc ctggcactcc tggacctcaa ggtattgctg gacagcgtgg tgtggtcggc 2820
ctgcctggtc agagaggaga aagaggcttc cctggtcttc ctggcccctc tggtgaaccc 2880
ggcaaacaag gtccttctgg agcaagtggt gaacgtggcc cccctggtcc catgggcccc 2940
cctggattgg ctggaccccc tggcgagtct ggacgtgagg gagctcctgg tgctgaagga 3000
tcccctggac gagatggttc tcctggcgcc aagggtgacc gtggtgagac cggccctgct 3060
ggacctcctg gtgctcctgg cgctcccggt gcccccggcc ctgtcggacc tgccggcaag 3120
agcggtgatc gtggtgagac cggtcctgct ggtcctgctg gtcccattgg ccccgttggt 3180
gcccgtggcc ccgctggacc ccaaggcccc cgtggtgaca agggtgagac aggcgaacag 3240
ggcgacagag gcattaaggg tcaccgtggc ttctctggtc tccagggtcc ccccggccct 3300
cccggctctc ctggtgagca aggtccttcc ggagcctctg gtcctgctgg tccccgcggt 3360
ccccctggct ctgctggttc tcccggcaaa gatggactca atggtctccc aggccccatc 3420
ggtccccctg ggcctcgagg tcgcactggt gatgctggtc ctgctggtcc tcccggccct 3480
cctggacccc ctggtccccc aggtcctccc agcggcggct acgacttgag cttcctgccc 3540
cagccacctc aagagaaggc tcacgatggt ggccgctact accgggctga tgatgccaat 3600
gtggtccgtg accgtgacct cgaggtggac accaccctca agagcctgag ccagcagatc 3660
gagaacatcc ggagccctga aggcagccgc aagaaccccg cccgcacctg ccgtgacctc 3720
aagatgtgcc actctgactg gaagagcgga gaatactgga ttgaccccaa ccaaggctgc 3780
aacctggatg ccattaaggt cttctgcaac atggaaaccg gtgagacctg tgtatacccc 3840
actcagccca gcgtggccca gaagaactgg tatatcagca agaaccccaa ggaaaagagg 3900
cacgtctggt acggcgagag catgaccggc ggattccagt tcgagtatgg cggccagggg 3960
tccgatcctg ccgatgtggc catccagctg actttcctgc gcctgatgtc caccgaggcc 4020
tcccagaaca tcacctacca ctgcaagaac agcgtggcct acatggacca gcagactggc 4080
aacctcaaga aggccctgct cctccagggc tccaacgaga tcgagatccg ggccgagggc 4140
aacagccgct tcacctacag cgtcacctac gatggctgca cgagtcacac cggagcctgg 4200
ggcaagacag tgatcgaata caaaaccacc aagacctccc gcttgcccat catcgatgtg 4260
gcccccttgg acgttggcgc cccagaccag gaattcggct tcgacgttgg ccctgcctgc 4320
ttcctgggtt ctgagaatct ttattttcag ggccgtaaag gcgaagagct gttcactggt 4380
gtcgtcccta ttctggtgga actggatggt gatgtcaacg gtcataagtt ttccgtgcgt 4440
ggcgagggtg aaggtgacgc aactaatggt aaactgacgc tgaagttcat ctgtactact 4500
ggtaaactgc cggttccttg gccgactctg gtaacgacgc tgacttatgg tgttcagtgc 4560
tttgctcgtt atccggacca tatgaagcag catgacttct tcaagtccgc catgccggaa 4620
ggctatgtgc aggaacgcac gatttccttt aaggatgacg gcacgtacaa aacgcgtgcg 4680
gaagtgaaat ttgaaggcga taccctggta aaccgcattg agctgaaagg cattgacttt 4740
aaagaggacg gcaatatcct gggccataag ctggaataca attttaacag ccacaatgtt 4800
tacatcaccg ccgataaaca aaaaaatggc attaaagcga attttaaaat tcgccacaac 4860
gtggaggatg gcagcgtgca gctggctgat cactaccagc aaaacactcc aatcggtgat 4920
ggtcctgttc tgctgccaga caatcactat ctgagcacgc aaagcgttct gtctaaagat 4980
ccgaacgaga aacgcgatca tatggttctg ctggagttcg taaccgcagc gggcatcacg 5040
catggtatgg atgaactgta caaaggttct catcatcatc atcatcactt gggaagaggt 5100
agaagatcca aattgtaa 5118
<210> 62
<211> 1705
<212> PRT
<213> Artificial Sequence
<220>
<223> BtCOL1A1-TEV-GFP-HIS-ePTS1
<400> 62
Met Gln Glu Glu Gly Gln Glu Glu Gly Gln Glu Glu Asp Ile Pro Pro
1 5 10 15
Val Thr Cys Val Gln Asn Gly Leu Arg Tyr His Asp Arg Asp Val Trp
20 25 30
Lys Pro Val Pro Cys Gln Ile Cys Val Cys Asp Asn Gly Asn Val Leu
35 40 45
Cys Asp Asp Val Ile Cys Asp Glu Leu Lys Asp Cys Pro Asn Ala Lys
50 55 60
Val Pro Thr Asp Glu Cys Cys Pro Val Cys Pro Glu Gly Gln Glu Ser
65 70 75 80
Pro Thr Asp Gln Glu Thr Thr Gly Val Glu Gly Pro Lys Gly Asp Thr
85 90 95
Gly Pro Arg Gly Pro Arg Gly Pro Ala Gly Pro Pro Gly Arg Asp Gly
100 105 110
Ile Pro Gly Gln Pro Gly Leu Pro Gly Pro Pro Gly Pro Pro Gly Pro
115 120 125
Pro Gly Pro Pro Gly Leu Gly Gly Asn Phe Ala Pro Gln Leu Ser Tyr
130 135 140
Gly Tyr Asp Glu Lys Ser Thr Gly Ile Ser Val Pro Gly Pro Met Gly
145 150 155 160
Pro Ser Gly Pro Arg Gly Leu Pro Gly Pro Pro Gly Ala Pro Gly Pro
165 170 175
Gln Gly Phe Gln Gly Pro Pro Gly Glu Pro Gly Glu Pro Gly Ala Ser
180 185 190
Gly Pro Met Gly Pro Arg Gly Pro Pro Gly Pro Pro Gly Lys Asn Gly
195 200 205
Asp Asp Gly Glu Ala Gly Lys Pro Gly Arg Pro Gly Glu Arg Gly Pro
210 215 220
Pro Gly Pro Gln Gly Ala Arg Gly Leu Pro Gly Thr Ala Gly Leu Pro
225 230 235 240
Gly Met Lys Gly His Arg Gly Phe Ser Gly Leu Asp Gly Ala Lys Gly
245 250 255
Asp Ala Gly Pro Ala Gly Pro Lys Gly Glu Pro Gly Ser Pro Gly Glu
260 265 270
Asn Gly Ala Pro Gly Gln Met Gly Pro Arg Gly Leu Pro Gly Glu Arg
275 280 285
Gly Arg Pro Gly Ala Pro Gly Pro Ala Gly Ala Arg Gly Asn Asp Gly
290 295 300
Ala Thr Gly Ala Ala Gly Pro Pro Gly Pro Thr Gly Pro Ala Gly Pro
305 310 315 320
Pro Gly Phe Pro Gly Ala Val Gly Ala Lys Gly Glu Gly Gly Pro Gln
325 330 335
Gly Pro Arg Gly Ser Glu Gly Pro Gln Gly Val Arg Gly Glu Pro Gly
340 345 350
Pro Pro Gly Pro Ala Gly Ala Ala Gly Pro Ala Gly Asn Pro Gly Ala
355 360 365
Asp Gly Gln Pro Gly Ala Lys Gly Ala Asn Gly Ala Pro Gly Ile Ala
370 375 380
Gly Ala Pro Gly Phe Pro Gly Ala Arg Gly Pro Ser Gly Pro Gln Gly
385 390 395 400
Pro Ser Gly Pro Pro Gly Pro Lys Gly Asn Ser Gly Glu Pro Gly Ala
405 410 415
Pro Gly Ser Lys Gly Asp Thr Gly Ala Lys Gly Glu Pro Gly Pro Thr
420 425 430
Gly Ile Gln Gly Pro Pro Gly Pro Ala Gly Glu Glu Gly Lys Arg Gly
435 440 445
Ala Arg Gly Glu Pro Gly Pro Ala Gly Leu Pro Gly Pro Pro Gly Glu
450 455 460
Arg Gly Gly Pro Gly Ser Arg Gly Phe Pro Gly Ala Asp Gly Val Ala
465 470 475 480
Gly Pro Lys Gly Pro Ala Gly Glu Arg Gly Ala Pro Gly Pro Ala Gly
485 490 495
Pro Lys Gly Ser Pro Gly Glu Ala Gly Arg Pro Gly Glu Ala Gly Leu
500 505 510
Pro Gly Ala Lys Gly Leu Thr Gly Ser Pro Gly Ser Pro Gly Pro Asp
515 520 525
Gly Lys Thr Gly Pro Pro Gly Pro Ala Gly Gln Asp Gly Arg Pro Gly
530 535 540
Pro Pro Gly Pro Pro Gly Ala Arg Gly Gln Ala Gly Val Met Gly Phe
545 550 555 560
Pro Gly Pro Lys Gly Ala Ala Gly Glu Pro Gly Lys Ala Gly Glu Arg
565 570 575
Gly Val Pro Gly Pro Pro Gly Ala Val Gly Pro Ala Gly Lys Asp Gly
580 585 590
Glu Ala Gly Ala Gln Gly Pro Pro Gly Pro Ala Gly Pro Ala Gly Glu
595 600 605
Arg Gly Glu Gln Gly Pro Ala Gly Ser Pro Gly Phe Gln Gly Leu Pro
610 615 620
Gly Pro Ala Gly Pro Pro Gly Glu Ala Gly Lys Pro Gly Glu Gln Gly
625 630 635 640
Val Pro Gly Asp Leu Gly Ala Pro Gly Pro Ser Gly Ala Arg Gly Glu
645 650 655
Arg Gly Phe Pro Gly Glu Arg Gly Val Gln Gly Pro Pro Gly Pro Ala
660 665 670
Gly Pro Arg Gly Ala Asn Gly Ala Pro Gly Asn Asp Gly Ala Lys Gly
675 680 685
Asp Ala Gly Ala Pro Gly Ala Pro Gly Ser Gln Gly Ala Pro Gly Leu
690 695 700
Gln Gly Met Pro Gly Glu Arg Gly Ala Ala Gly Leu Pro Gly Pro Lys
705 710 715 720
Gly Asp Arg Gly Asp Ala Gly Pro Lys Gly Ala Asp Gly Ala Pro Gly
725 730 735
Lys Asp Gly Val Arg Gly Leu Thr Gly Pro Ile Gly Pro Pro Gly Pro
740 745 750
Ala Gly Ala Pro Gly Asp Lys Gly Glu Ala Gly Pro Ser Gly Pro Ala
755 760 765
Gly Pro Thr Gly Ala Arg Gly Ala Pro Gly Asp Arg Gly Glu Pro Gly
770 775 780
Pro Pro Gly Pro Ala Gly Phe Ala Gly Pro Pro Gly Ala Asp Gly Gln
785 790 795 800
Pro Gly Ala Lys Gly Glu Pro Gly Asp Ala Gly Ala Lys Gly Asp Ala
805 810 815
Gly Pro Pro Gly Pro Ala Gly Pro Ala Gly Pro Pro Gly Pro Ile Gly
820 825 830
Asn Val Gly Ala Pro Gly Pro Lys Gly Ala Arg Gly Ser Ala Gly Pro
835 840 845
Pro Gly Ala Thr Gly Phe Pro Gly Ala Ala Gly Arg Val Gly Pro Pro
850 855 860
Gly Pro Ser Gly Asn Ala Gly Pro Pro Gly Pro Pro Gly Pro Ala Gly
865 870 875 880
Lys Glu Gly Ser Lys Gly Pro Arg Gly Glu Thr Gly Pro Ala Gly Arg
885 890 895
Pro Gly Glu Val Gly Pro Pro Gly Pro Pro Gly Pro Ala Gly Glu Lys
900 905 910
Gly Ala Pro Gly Ala Asp Gly Pro Ala Gly Ala Pro Gly Thr Pro Gly
915 920 925
Pro Gln Gly Ile Ala Gly Gln Arg Gly Val Val Gly Leu Pro Gly Gln
930 935 940
Arg Gly Glu Arg Gly Phe Pro Gly Leu Pro Gly Pro Ser Gly Glu Pro
945 950 955 960
Gly Lys Gln Gly Pro Ser Gly Ala Ser Gly Glu Arg Gly Pro Pro Gly
965 970 975
Pro Met Gly Pro Pro Gly Leu Ala Gly Pro Pro Gly Glu Ser Gly Arg
980 985 990
Glu Gly Ala Pro Gly Ala Glu Gly Ser Pro Gly Arg Asp Gly Ser Pro
995 1000 1005
Gly Ala Lys Gly Asp Arg Gly Glu Thr Gly Pro Ala Gly Pro Pro Gly
1010 1015 1020
Ala Pro Gly Ala Pro Gly Ala Pro Gly Pro Val Gly Pro Ala Gly Lys
1025 1030 1035 1040
Ser Gly Asp Arg Gly Glu Thr Gly Pro Ala Gly Pro Ala Gly Pro Ile
1045 1050 1055
Gly Pro Val Gly Ala Arg Gly Pro Ala Gly Pro Gln Gly Pro Arg Gly
1060 1065 1070
Asp Lys Gly Glu Thr Gly Glu Gln Gly Asp Arg Gly Ile Lys Gly His
1075 1080 1085
Arg Gly Phe Ser Gly Leu Gln Gly Pro Pro Gly Pro Pro Gly Ser Pro
1090 1095 1100
Gly Glu Gln Gly Pro Ser Gly Ala Ser Gly Pro Ala Gly Pro Arg Gly
1105 1110 1115 1120
Pro Pro Gly Ser Ala Gly Ser Pro Gly Lys Asp Gly Leu Asn Gly Leu
1125 1130 1135
Pro Gly Pro Ile Gly Pro Pro Gly Pro Arg Gly Arg Thr Gly Asp Ala
1140 1145 1150
Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly
1155 1160 1165
Pro Pro Ser Gly Gly Tyr Asp Leu Ser Phe Leu Pro Gln Pro Pro Gln
1170 1175 1180
Glu Lys Ala His Asp Gly Gly Arg Tyr Tyr Arg Ala Asp Asp Ala Asn
1185 1190 1195 1200
Val Val Arg Asp Arg Asp Leu Glu Val Asp Thr Thr Leu Lys Ser Leu
1205 1210 1215
Ser Gln Gln Ile Glu Asn Ile Arg Ser Pro Glu Gly Ser Arg Lys Asn
1220 1225 1230
Pro Ala Arg Thr Cys Arg Asp Leu Lys Met Cys His Ser Asp Trp Lys
1235 1240 1245
Ser Gly Glu Tyr Trp Ile Asp Pro Asn Gln Gly Cys Asn Leu Asp Ala
1250 1255 1260
Ile Lys Val Phe Cys Asn Met Glu Thr Gly Glu Thr Cys Val Tyr Pro
1265 1270 1275 1280
Thr Gln Pro Ser Val Ala Gln Lys Asn Trp Tyr Ile Ser Lys Asn Pro
1285 1290 1295
Lys Glu Lys Arg His Val Trp Tyr Gly Glu Ser Met Thr Gly Gly Phe
1300 1305 1310
Gln Phe Glu Tyr Gly Gly Gln Gly Ser Asp Pro Ala Asp Val Ala Ile
1315 1320 1325
Gln Leu Thr Phe Leu Arg Leu Met Ser Thr Glu Ala Ser Gln Asn Ile
1330 1335 1340
Thr Tyr His Cys Lys Asn Ser Val Ala Tyr Met Asp Gln Gln Thr Gly
1345 1350 1355 1360
Asn Leu Lys Lys Ala Leu Leu Leu Gln Gly Ser Asn Glu Ile Glu Ile
1365 1370 1375
Arg Ala Glu Gly Asn Ser Arg Phe Thr Tyr Ser Val Thr Tyr Asp Gly
1380 1385 1390
Cys Thr Ser His Thr Gly Ala Trp Gly Lys Thr Val Ile Glu Tyr Lys
1395 1400 1405
Thr Thr Lys Thr Ser Arg Leu Pro Ile Ile Asp Val Ala Pro Leu Asp
1410 1415 1420
Val Gly Ala Pro Asp Gln Glu Phe Gly Phe Asp Val Gly Pro Ala Cys
1425 1430 1435 1440
Phe Leu Gly Ser Glu Asn Leu Tyr Phe Gln Gly Arg Lys Gly Glu Glu
1445 1450 1455
Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly Asp Val
1460 1465 1470
Asn Gly His Lys Phe Ser Val Arg Gly Glu Gly Glu Gly Asp Ala Thr
1475 1480 1485
Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu Pro
1490 1495 1500
Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly Val Gln Cys
1505 1510 1515 1520
Phe Ala Arg Tyr Pro Asp His Met Lys Gln His Asp Phe Phe Lys Ser
1525 1530 1535
Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Ser Phe Lys Asp
1540 1545 1550
Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp Thr
1555 1560 1565
Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp Gly
1570 1575 1580
Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Phe Asn Ser His Asn Val
1585 1590 1595 1600
Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly Ile Lys Ala Asn Phe Lys
1605 1610 1615
Ile Arg His Asn Val Glu Asp Gly Ser Val Gln Leu Ala Asp His Tyr
1620 1625 1630
Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp Asn
1635 1640 1645
His Tyr Leu Ser Thr Gln Ser Val Leu Ser Lys Asp Pro Asn Glu Lys
1650 1655 1660
Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile Thr
1665 1670 1675 1680
His Gly Met Asp Glu Leu Tyr Lys Gly Ser His His His His His His
1685 1690 1695
Leu Gly Arg Gly Arg Arg Ser Lys Leu
1700 1705
<210> 63
<211> 4080
<212> DNA
<213> Artificial Sequence
<220>
<223> BtCOL1A2-TEV-GFP-HIS-ePTS1
<400> 63
atgcaatcct tacaagaggc aactgcaaga aagggcccaa gtggagatag aggaccacgc 60
ggagaaaggg gtccaccagg cccaccaggc agagatggtg atgacggcat cccaggccct 120
cctggccccc ctggccctcc tggcccccct ggtcttggcg ggaactttgc tgctcagttt 180
gatgcaaaag gaggtggccc tggaccaatg gggctgatgg gacctcgcgg ccctcctggg 240
gcttctggag cccctggccc tcaaggtttc cagggacctc cgggtgagcc tggtgaacct 300
ggtcagactg gtcctgcagg tgctcgtggc ccgcctggcc ctcctggcaa ggctggtgag 360
gatggtcacc ctggaaaacc tggacgacct ggtgagagag gggttgttgg accacagggt 420
gctcgtggct ttcctggaac tcctggactc cctggcttca agggcattag gggtcacaat 480
ggtctggatg gattgaaggg acagcctggt gctccaggtg tgaagggtga acctggtgcc 540
cctggtgaaa atggaactcc aggtcaaacg ggagcccgtg gtcttcctgg tgagagagga 600
cgtgttggtg cccctggccc agctggtgcc cgtggaagtg atggaagtgt gggtcctgtg 660
ggccctgctg gtcccattgg gtctgctggc cctccaggct tcccaggtgc tcctggcccc 720
aagggtgaac tcggacctgt tggtaaccct ggccctgctg gtcccgcggg tccccgtggt 780
gaagtgggtc tcccaggcct ttctggccct gtcggacctc ctggaaaccc cggagccaat 840
gggcttcctg gcgctaaggg tgctgctggc cttcccggtg ttgctggggc tcccggcctc 900
cctggacccc ggggtattcc tggccctgtt ggcgctgctg gtgctactgg cgccagagga 960
cttgttggtg agcccggccc agctggttcg aaaggagaga gcggcaacaa gggcgagcct 1020
ggtgctgttg ggcagccagg tcctcctggc cccagtggtg aagaaggaaa gagaggctcc 1080
actggagaaa tcggacccgc tggcccccca ggacctcctg ggctgagggg aaatcctggc 1140
tcccgtggtc tacctggagc tgacggcaga gctggtgtca tgggtcctgc tggtagccgt 1200
ggtgcaactg gccctgctgg tgtgcgaggt cccaatggag attctggtcg ccctggagag 1260
cctggcctca tgggaccccg aggtttccca ggttcccctg gaaatatcgg cccagctggt 1320
aaagaaggtc ctgtgggtct ccctggtatt gacggcagac ctgggcccat tggcccagcg 1380
ggagcaagag gagagcctgg caacattgga ttccctggac ccaaaggccc cagtggtgat 1440
cctggcaaag ctggtgaaaa aggtcatgct ggtcttgctg gtgctcgggg cgctccaggt 1500
cccgatggca acaacggtgc tcagggaccc cctggactac agggtgtcca aggtggaaaa 1560
ggtgaacagg gtcctgctgg tcctccaggc ttccagggtc tgcctggccc tgcaggcaca 1620
gctggtgaag ctggcaaacc aggagaaagg ggtatccctg gtgaatttgg tctccctggc 1680
cctgctggtg caagagggga gcgggggccc ccaggtgaaa gtggtgctgc tgggcctact 1740
gggcctattg gaagccgagg tccttctgga cccccagggc ctgatggaaa caagggtgaa 1800
ccgggtgtgg ttggcgctcc aggcactgct ggcccatctg gtcctagcgg actcccagga 1860
gagaggggtg cggctggcat tcctggaggc aagggagaaa agggtgaaac tggtctcaga 1920
ggtgacattg gtagccctgg tagagatggt gctcgtggtg ctcctggtgc tattggtgct 1980
cctggccctg ctggagccaa tggggaccgg ggtgaagctg gtcccgctgg ccctgctggc 2040
cctgctggtc ctcgtggtag ccctggtgaa cgtggtgagg tcggtcccgc tggccccaac 2100
ggatttgctg gtcctgctgg tgctgctggt caacctggtg ctaaaggaga gagaggaacc 2160
aaaggaccca agggtgaaaa tggtcctgtt ggtcccacag gccccgttgg agctgccggt 2220
ccgtctggtc caaatggccc acctggtcct gctggaagtc gtggtgatgg agggccccct 2280
ggggctactg gtttccctgg tgctgctgga cggactggtc cccctggacc ctctggtatc 2340
tctggccccc ctggcccccc tggtcctgct ggtaaagaag ggcttcgtgg gcctcgtggt 2400
gaccaaggtc cagttggtcg aagtggagag acaggtgcct ctggccctcc tggctttgtt 2460
ggtgagaagg gtccctctgg agagcctggt actgctgggc ctcctggaac cccaggtcca 2520
caaggccttc ttggtgctcc tggttttctg ggtctcccag gctctagagg tgagcgtggt 2580
ctaccaggtg tcgctggatc tgtgggtgaa cctggccccc tcggcatcgc aggcccacct 2640
ggggcccgtg gtccccctgg taatgtcggt aatcctggcg tcaatggtgc tcctggtgaa 2700
gccggtcgtg acggcaaccc tgggaatgac ggtcccccag gccgcgatgg tcaacccgga 2760
cacaaggggg agcgtggtta ccccggtaac gcaggtcctg ttggtgctgc cggtgctcct 2820
ggccctcaag gccctgtggg tcccgttggt aaacacggaa accgtggtga accgggtcct 2880
gccggtgctg ttggtcctgc tggtgccgtt ggcccaagag gtcccagtgg cccacaaggt 2940
attcgaggtg acaagggaga gcctggtgat aagggtccca gaggtcttcc tggcttaaag 3000
ggacacaatg ggttgcaagg tctcccgggt cttgctggtc atcatggcga tcaaggtgct 3060
cccggtgctg tgggtcccgc tggtcccagg ggccctgctg gtccttctgg ccccgctggc 3120
aaagacggtc gcattggaca gcctggtgca gtcggacctg ctggcattcg tggctctcag 3180
ggtagccaag gtcctgctgg ccctcctggt ccccctggcc ctcctggacc tcctggccca 3240
agtggtggtg gttacgagtt tggttttgat ggagacttct acagggctgg ttctgagaat 3300
ctttattttc agggccgtaa aggcgaagag ctgttcactg gtgtcgtccc tattctggtg 3360
gaactggatg gtgatgtcaa cggtcataag ttttccgtgc gtggcgaggg tgaaggtgac 3420
gcaactaatg gtaaactgac gctgaagttc atctgtacta ctggtaaact gccggttcct 3480
tggccgactc tggtaacgac gctgacttat ggtgttcagt gctttgctcg ttatccggac 3540
catatgaagc agcatgactt cttcaagtcc gccatgccgg aaggctatgt gcaggaacgc 3600
acgatttcct ttaaggatga cggcacgtac aaaacgcgtg cggaagtgaa atttgaaggc 3660
gataccctgg taaaccgcat tgagctgaaa ggcattgact ttaaagagga cggcaatatc 3720
ctgggccata agctggaata caattttaac agccacaatg tttacatcac cgccgataaa 3780
caaaaaaatg gcattaaagc gaattttaaa attcgccaca acgtggagga tggcagcgtg 3840
cagctggctg atcactacca gcaaaacact ccaatcggtg atggtcctgt tctgctgcca 3900
gacaatcact atctgagcac gcaaagcgtt ctgtctaaag atccgaacga gaaacgcgat 3960
catatggttc tgctggagtt cgtaaccgca gcgggcatca cgcatggtat ggatgaactg 4020
tacaaaggtt ctcatcatca tcatcatcac ttgggaagag gtagaagatc caaattgtaa 4080
<210> 64
<211> 1359
<212> PRT
<213> Artificial Sequence
<220>
<223> BtCOL1A2-TEV-GFP-HIS-ePTS1
<400> 64
Met Gln Ser Leu Gln Glu Ala Thr Ala Arg Lys Gly Pro Ser Gly Asp
1 5 10 15
Arg Gly Pro Arg Gly Glu Arg Gly Pro Pro Gly Pro Pro Gly Arg Asp
20 25 30
Gly Asp Asp Gly Ile Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly
35 40 45
Pro Pro Gly Leu Gly Gly Asn Phe Ala Ala Gln Phe Asp Ala Lys Gly
50 55 60
Gly Gly Pro Gly Pro Met Gly Leu Met Gly Pro Arg Gly Pro Pro Gly
65 70 75 80
Ala Ser Gly Ala Pro Gly Pro Gln Gly Phe Gln Gly Pro Pro Gly Glu
85 90 95
Pro Gly Glu Pro Gly Gln Thr Gly Pro Ala Gly Ala Arg Gly Pro Pro
100 105 110
Gly Pro Pro Gly Lys Ala Gly Glu Asp Gly His Pro Gly Lys Pro Gly
115 120 125
Arg Pro Gly Glu Arg Gly Val Val Gly Pro Gln Gly Ala Arg Gly Phe
130 135 140
Pro Gly Thr Pro Gly Leu Pro Gly Phe Lys Gly Ile Arg Gly His Asn
145 150 155 160
Gly Leu Asp Gly Leu Lys Gly Gln Pro Gly Ala Pro Gly Val Lys Gly
165 170 175
Glu Pro Gly Ala Pro Gly Glu Asn Gly Thr Pro Gly Gln Thr Gly Ala
180 185 190
Arg Gly Leu Pro Gly Glu Arg Gly Arg Val Gly Ala Pro Gly Pro Ala
195 200 205
Gly Ala Arg Gly Ser Asp Gly Ser Val Gly Pro Val Gly Pro Ala Gly
210 215 220
Pro Ile Gly Ser Ala Gly Pro Pro Gly Phe Pro Gly Ala Pro Gly Pro
225 230 235 240
Lys Gly Glu Leu Gly Pro Val Gly Asn Pro Gly Pro Ala Gly Pro Ala
245 250 255
Gly Pro Arg Gly Glu Val Gly Leu Pro Gly Leu Ser Gly Pro Val Gly
260 265 270
Pro Pro Gly Asn Pro Gly Ala Asn Gly Leu Pro Gly Ala Lys Gly Ala
275 280 285
Ala Gly Leu Pro Gly Val Ala Gly Ala Pro Gly Leu Pro Gly Pro Arg
290 295 300
Gly Ile Pro Gly Pro Val Gly Ala Ala Gly Ala Thr Gly Ala Arg Gly
305 310 315 320
Leu Val Gly Glu Pro Gly Pro Ala Gly Ser Lys Gly Glu Ser Gly Asn
325 330 335
Lys Gly Glu Pro Gly Ala Val Gly Gln Pro Gly Pro Pro Gly Pro Ser
340 345 350
Gly Glu Glu Gly Lys Arg Gly Ser Thr Gly Glu Ile Gly Pro Ala Gly
355 360 365
Pro Pro Gly Pro Pro Gly Leu Arg Gly Asn Pro Gly Ser Arg Gly Leu
370 375 380
Pro Gly Ala Asp Gly Arg Ala Gly Val Met Gly Pro Ala Gly Ser Arg
385 390 395 400
Gly Ala Thr Gly Pro Ala Gly Val Arg Gly Pro Asn Gly Asp Ser Gly
405 410 415
Arg Pro Gly Glu Pro Gly Leu Met Gly Pro Arg Gly Phe Pro Gly Ser
420 425 430
Pro Gly Asn Ile Gly Pro Ala Gly Lys Glu Gly Pro Val Gly Leu Pro
435 440 445
Gly Ile Asp Gly Arg Pro Gly Pro Ile Gly Pro Ala Gly Ala Arg Gly
450 455 460
Glu Pro Gly Asn Ile Gly Phe Pro Gly Pro Lys Gly Pro Ser Gly Asp
465 470 475 480
Pro Gly Lys Ala Gly Glu Lys Gly His Ala Gly Leu Ala Gly Ala Arg
485 490 495
Gly Ala Pro Gly Pro Asp Gly Asn Asn Gly Ala Gln Gly Pro Pro Gly
500 505 510
Leu Gln Gly Val Gln Gly Gly Lys Gly Glu Gln Gly Pro Ala Gly Pro
515 520 525
Pro Gly Phe Gln Gly Leu Pro Gly Pro Ala Gly Thr Ala Gly Glu Ala
530 535 540
Gly Lys Pro Gly Glu Arg Gly Ile Pro Gly Glu Phe Gly Leu Pro Gly
545 550 555 560
Pro Ala Gly Ala Arg Gly Glu Arg Gly Pro Pro Gly Glu Ser Gly Ala
565 570 575
Ala Gly Pro Thr Gly Pro Ile Gly Ser Arg Gly Pro Ser Gly Pro Pro
580 585 590
Gly Pro Asp Gly Asn Lys Gly Glu Pro Gly Val Val Gly Ala Pro Gly
595 600 605
Thr Ala Gly Pro Ser Gly Pro Ser Gly Leu Pro Gly Glu Arg Gly Ala
610 615 620
Ala Gly Ile Pro Gly Gly Lys Gly Glu Lys Gly Glu Thr Gly Leu Arg
625 630 635 640
Gly Asp Ile Gly Ser Pro Gly Arg Asp Gly Ala Arg Gly Ala Pro Gly
645 650 655
Ala Ile Gly Ala Pro Gly Pro Ala Gly Ala Asn Gly Asp Arg Gly Glu
660 665 670
Ala Gly Pro Ala Gly Pro Ala Gly Pro Ala Gly Pro Arg Gly Ser Pro
675 680 685
Gly Glu Arg Gly Glu Val Gly Pro Ala Gly Pro Asn Gly Phe Ala Gly
690 695 700
Pro Ala Gly Ala Ala Gly Gln Pro Gly Ala Lys Gly Glu Arg Gly Thr
705 710 715 720
Lys Gly Pro Lys Gly Glu Asn Gly Pro Val Gly Pro Thr Gly Pro Val
725 730 735
Gly Ala Ala Gly Pro Ser Gly Pro Asn Gly Pro Pro Gly Pro Ala Gly
740 745 750
Ser Arg Gly Asp Gly Gly Pro Pro Gly Ala Thr Gly Phe Pro Gly Ala
755 760 765
Ala Gly Arg Thr Gly Pro Pro Gly Pro Ser Gly Ile Ser Gly Pro Pro
770 775 780
Gly Pro Pro Gly Pro Ala Gly Lys Glu Gly Leu Arg Gly Pro Arg Gly
785 790 795 800
Asp Gln Gly Pro Val Gly Arg Ser Gly Glu Thr Gly Ala Ser Gly Pro
805 810 815
Pro Gly Phe Val Gly Glu Lys Gly Pro Ser Gly Glu Pro Gly Thr Ala
820 825 830
Gly Pro Pro Gly Thr Pro Gly Pro Gln Gly Leu Leu Gly Ala Pro Gly
835 840 845
Phe Leu Gly Leu Pro Gly Ser Arg Gly Glu Arg Gly Leu Pro Gly Val
850 855 860
Ala Gly Ser Val Gly Glu Pro Gly Pro Leu Gly Ile Ala Gly Pro Pro
865 870 875 880
Gly Ala Arg Gly Pro Pro Gly Asn Val Gly Asn Pro Gly Val Asn Gly
885 890 895
Ala Pro Gly Glu Ala Gly Arg Asp Gly Asn Pro Gly Asn Asp Gly Pro
900 905 910
Pro Gly Arg Asp Gly Gln Pro Gly His Lys Gly Glu Arg Gly Tyr Pro
915 920 925
Gly Asn Ala Gly Pro Val Gly Ala Ala Gly Ala Pro Gly Pro Gln Gly
930 935 940
Pro Val Gly Pro Val Gly Lys His Gly Asn Arg Gly Glu Pro Gly Pro
945 950 955 960
Ala Gly Ala Val Gly Pro Ala Gly Ala Val Gly Pro Arg Gly Pro Ser
965 970 975
Gly Pro Gln Gly Ile Arg Gly Asp Lys Gly Glu Pro Gly Asp Lys Gly
980 985 990
Pro Arg Gly Leu Pro Gly Leu Lys Gly His Asn Gly Leu Gln Gly Leu
995 1000 1005
Pro Gly Leu Ala Gly His His Gly Asp Gln Gly Ala Pro Gly Ala Val
1010 1015 1020
Gly Pro Ala Gly Pro Arg Gly Pro Ala Gly Pro Ser Gly Pro Ala Gly
1025 1030 1035 1040
Lys Asp Gly Arg Ile Gly Gln Pro Gly Ala Val Gly Pro Ala Gly Ile
1045 1050 1055
Arg Gly Ser Gln Gly Ser Gln Gly Pro Ala Gly Pro Pro Gly Pro Pro
1060 1065 1070
Gly Pro Pro Gly Pro Pro Gly Pro Ser Gly Gly Gly Tyr Glu Phe Gly
1075 1080 1085
Phe Asp Gly Asp Phe Tyr Arg Ala Gly Ser Glu Asn Leu Tyr Phe Gln
1090 1095 1100
Gly Arg Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1105 1110 1115 1120
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu
1125 1130 1135
Gly Glu Gly Asp Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys
1140 1145 1150
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu
1155 1160 1165
Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln
1170 1175 1180
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
1185 1190 1195 1200
Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val
1205 1210 1215
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
1220 1225 1230
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
1235 1240 1245
Phe Asn Ser His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly
1250 1255 1260
Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser Val
1265 1270 1275 1280
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
1285 1290 1295
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser
1300 1305 1310
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
1315 1320 1325
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Gly Ser
1330 1335 1340
His His His His His His Leu Gly Arg Gly Arg Arg Ser Lys Leu
1345 1350 1355
<210> 65
<211> 1092
<212> DNA
<213> Artificial Sequence
<220>
<223> BtCOL1A1 403-0P-TEV-GFP-HIS-ePTS1
<400> 65
atgggatttg ctggccctaa aggagccgca ggagaggctg gtaaagcagg cgaacgtggc 60
gtagctggcc cagctggtgc tgtgggtcca gttggtaaag atggagaggc cggtgcccaa 120
ggtccagccg gtcctgtagg cccagcaggt gaaagaggag aacaaggtcc cgcaggctca 180
gctggattcc aaggattagc aggacctgcc ggcccagttg gcgaagccgg taaagccgga 240
gagcaaggcg tggctggtga cctaggtgca gttggcccta gtggcgccag gggagaaaga 300
ggttctgaga atctttattt tcagggccgt aaaggcgaag agctgttcac tggtgtcgtc 360
cctattctgg tggaactgga tggtgatgtc aacggtcata agttttccgt gcgtggcgag 420
ggtgaaggtg acgcaactaa tggtaaactg acgctgaagt tcatctgtac tactggtaaa 480
ctgccggttc cttggccgac tctggtaacg acgctgactt atggtgttca gtgctttgct 540
cgttatccgg accatatgaa gcagcatgac ttcttcaagt ccgccatgcc ggaaggctat 600
gtgcaggaac gcacgatttc ctttaaggat gacggcacgt acaaaacgcg tgcggaagtg 660
aaatttgaag gcgataccct ggtaaaccgc attgagctga aaggcattga ctttaaagag 720
gacggcaata tcctgggcca taagctggaa tacaatttta acagccacaa tgtttacatc 780
accgccgata aacaaaaaaa tggcattaaa gcgaatttta aaattcgcca caacgtggag 840
gatggcagcg tgcagctggc tgatcactac cagcaaaaca ctccaatcgg tgatggtcct 900
gttctgctgc cagacaatca ctatctgagc acgcaaagcg ttctgtctaa agatccgaac 960
gagaaacgcg atcatatggt tctgctggag ttcgtaaccg cagcgggcat cacgcatggt 1020
atggatgaac tgtacaaagg ttctcatcat catcatcatc acttgggaag aggtagaaga 1080
tccaaattgt aa 1092
<210> 66
<211> 363
<212> PRT
<213> Artificial Sequence
<220>
<223> BtCOL1A1 403-0P-TEV-GFP-HIS-ePTS1
<400> 66
Met Gly Phe Ala Gly Pro Lys Gly Ala Ala Gly Glu Ala Gly Lys Ala
1 5 10 15
Gly Glu Arg Gly Val Ala Gly Pro Ala Gly Ala Val Gly Pro Val Gly
20 25 30
Lys Asp Gly Glu Ala Gly Ala Gln Gly Pro Ala Gly Pro Val Gly Pro
35 40 45
Ala Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Ala Gly Phe Gln
50 55 60
Gly Leu Ala Gly Pro Ala Gly Pro Val Gly Glu Ala Gly Lys Ala Gly
65 70 75 80
Glu Gln Gly Val Ala Gly Asp Leu Gly Ala Val Gly Pro Ser Gly Ala
85 90 95
Arg Gly Glu Arg Gly Ser Glu Asn Leu Tyr Phe Gln Gly Arg Lys Gly
100 105 110
Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly
115 120 125
Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu Gly Glu Gly Asp
130 135 140
Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys
145 150 155 160
Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly Val
165 170 175
Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln His Asp Phe Phe
180 185 190
Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Ser Phe
195 200 205
Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly
210 215 220
Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu
225 230 235 240
Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Phe Asn Ser His
245 250 255
Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly Ile Lys Ala Asn
260 265 270
Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser Val Gln Leu Ala Asp
275 280 285
His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro
290 295 300
Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser Lys Asp Pro Asn
305 310 315 320
Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly
325 330 335
Ile Thr His Gly Met Asp Glu Leu Tyr Lys Gly Ser His His His His
340 345 350
His His Leu Gly Arg Gly Arg Arg Ser Lys Leu
355 360
<210> 67
<211> 1092
<212> DNA
<213> Artificial Sequence
<220>
<223> BtCOL1A1 403-11P-TEV-GFP-HIS-ePTS1
<400> 67
atgggatttc ctggccctaa gggagccgca ggagagcccg gtaaagcagg cgaaagaggc 60
gtacctggtc cacccggtgc tgtgggtcca gctggtaaag atggagaggc cggtgcccaa 120
ggtcctcctg gtcctgctgg cccagcaggt gaaagaggag aacaaggtcc cgcaggctca 180
cctggattcc aaggattacc aggtccagcc ggaccacctg gcgaagccgg taaacccgga 240
gagcaaggcg tgcctggtga cctaggtgca ccaggaccta gtggcgccag gggagaaaga 300
ggttctgaga atctttattt tcagggccgt aaaggcgaag agctgttcac tggtgtcgtc 360
cctattctgg tggaactgga tggtgatgtc aacggtcata agttttccgt gcgtggcgag 420
ggtgaaggtg acgcaactaa tggtaaactg acgctgaagt tcatctgtac tactggtaaa 480
ctgccggttc cttggccgac tctggtaacg acgctgactt atggtgttca gtgctttgct 540
cgttatccgg accatatgaa gcagcatgac ttcttcaagt ccgccatgcc ggaaggctat 600
gtgcaggaac gcacgatttc ctttaaggat gacggcacgt acaaaacgcg tgcggaagtg 660
aaatttgaag gcgataccct ggtaaaccgc attgagctga aaggcattga ctttaaagag 720
gacggcaata tcctgggcca taagctggaa tacaatttta acagccacaa tgtttacatc 780
accgccgata aacaaaaaaa tggcattaaa gcgaatttta aaattcgcca caacgtggag 840
gatggcagcg tgcagctggc tgatcactac cagcaaaaca ctccaatcgg tgatggtcct 900
gttctgctgc cagacaatca ctatctgagc acgcaaagcg ttctgtctaa agatccgaac 960
gagaaacgcg atcatatggt tctgctggag ttcgtaaccg cagcgggcat cacgcatggt 1020
atggatgaac tgtacaaagg ttctcatcat catcatcatc acttgggaag aggtagaaga 1080
tccaaattgt aa 1092
<210> 68
<211> 363
<212> PRT
<213> Artificial Sequence
<220>
<223> BtCOL1A1 403-11P-TEV-GFP-HIS-ePTS1
<400> 68
Met Gly Phe Pro Gly Pro Lys Gly Ala Ala Gly Glu Pro Gly Lys Ala
1 5 10 15
Gly Glu Arg Gly Val Pro Gly Pro Pro Gly Ala Val Gly Pro Ala Gly
20 25 30
Lys Asp Gly Glu Ala Gly Ala Gln Gly Pro Pro Gly Pro Ala Gly Pro
35 40 45
Ala Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Pro Gly Phe Gln
50 55 60
Gly Leu Pro Gly Pro Ala Gly Pro Pro Gly Glu Ala Gly Lys Pro Gly
65 70 75 80
Glu Gln Gly Val Pro Gly Asp Leu Gly Ala Pro Gly Pro Ser Gly Ala
85 90 95
Arg Gly Glu Arg Gly Ser Glu Asn Leu Tyr Phe Gln Gly Arg Lys Gly
100 105 110
Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly
115 120 125
Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu Gly Glu Gly Asp
130 135 140
Ala Thr Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys
145 150 155 160
Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly Val
165 170 175
Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln His Asp Phe Phe
180 185 190
Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Ser Phe
195 200 205
Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly
210 215 220
Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu
225 230 235 240
Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Phe Asn Ser His
245 250 255
Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly Ile Lys Ala Asn
260 265 270
Phe Lys Ile Arg His Asn Val Glu Asp Gly Ser Val Gln Leu Ala Asp
275 280 285
His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro
290 295 300
Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser Lys Asp Pro Asn
305 310 315 320
Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly
325 330 335
Ile Thr His Gly Met Asp Glu Leu Tyr Lys Gly Ser His His His His
340 345 350
His His Leu Gly Arg Gly Arg Arg Ser Lys Leu
355 360
<210> 69
<211> 81
<212> DNA
<213> Artificial Sequence
<220>
<223> T4 fibritin foldon domain
<400> 69
ggttacatcc ccgaagctcc tcgtgacggc caggcttacg tcaggaaaga tggcgagtgg 60
gttcttttgt ccacttttct g 81
<210> 70
<211> 27
<212> PRT
<213> Artificial Sequence
<220>
<223> T4 fibritin foldon domain
<400> 70
Gly Tyr Ile Pro Glu Ala Pro Arg Asp Gly Gln Ala Tyr Val Arg Lys
1 5 10 15
Asp Gly Glu Trp Val Leu Leu Ser Thr Phe Leu
20 25
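The DNA entry for the T4 fibritin foldon domain (SEQ ID NO: 69, 81 nucleotides) and the protein entry (SEQ ID NO: 70, 27 residues) can be cross-checked by translating one into the other. The following is a minimal sketch, assuming the standard genetic code and reading frame 0; the helper names `clean_dna` and `translate` are illustrative and not part of the patent.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID NO: 69 as printed in the listing (spacing and position counters included).
FOLDON_DNA = """
ggttacatcc ccgaagctcc tcgtgacggc caggcttacg tcaggaaaga tggcgagtgg 60
gttcttttgt ccacttttct g 81
"""

def clean_dna(listing_text):
    """Keep only A/C/G/T so spacing and the running position counters are dropped."""
    return "".join(ch for ch in listing_text.upper() if ch in "ACGT")

def translate(listing_text):
    """Translate in frame 0 with the standard code, stopping at the first stop codon."""
    dna = clean_dna(listing_text)
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

foldon_protein = translate(FOLDON_DNA)
print(len(clean_dna(FOLDON_DNA)), foldon_protein)
# prints: 81 GYIPEAPRDGQAYVRKDGEWVLLSTFL
```

The 27-residue result corresponds, in one-letter code, to the residues listed for SEQ ID NO: 70. The same arithmetic explains the construct entries: a 1092-nucleotide entry such as SEQ ID NO: 65 holds 363 codons plus one stop codon, which is consistent with the 363-residue length given for SEQ ID NO: 66.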
<210> 71
<211> 99
<212> PRT
<213> Artificial Sequence
<220>
<223> Collagen substrate 1
<400> 71
Gly Phe Ala Gly Pro Lys Gly Ala Ala Gly Glu Ala Gly Lys Ala Gly
1 5 10 15
Glu Arg Gly Val Ala Gly Pro Ala Gly Ala Val Gly Pro Val Gly Lys
20 25 30
Asp Gly Glu Ala Gly Ala Gln Gly Pro Ala Gly Pro Val Gly Pro Ala
35 40 45
Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Ala Gly Phe Gln Gly
50 55 60
Leu Ala Gly Pro Ala Gly Pro Val Gly Glu Ala Gly Lys Ala Gly Glu
65 70 75 80
Gln Gly Val Ala Gly Asp Leu Gly Ala Val Gly Pro Ser Gly Ala Arg
85 90 95
Gly Glu Arg
<210> 72
<211> 99
<212> PRT
<213> Artificial Sequence
<220>
<223> Collagen substrate 2
<400> 72
Gly Phe Pro Gly Pro Lys Gly Ala Ala Gly Glu Pro Gly Lys Ala Gly
1 5 10 15
Glu Arg Gly Val Pro Gly Pro Pro Gly Ala Val Gly Pro Ala Gly Lys
20 25 30
Asp Gly Glu Ala Gly Ala Gln Gly Pro Pro Gly Pro Ala Gly Pro Ala
35 40 45
Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Pro Gly Phe Gln Gly
50 55 60
Leu Pro Gly Pro Ala Gly Pro Pro Gly Glu Ala Gly Lys Pro Gly Glu
65 70 75 80
Gln Gly Val Pro Gly Asp Leu Gly Ala Pro Gly Pro Ser Gly Ala Arg
85 90 95
Gly Glu Arg
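Collagen substrate 1 (SEQ ID NO: 71) and collagen substrate 2 (SEQ ID NO: 72) have the same length and differ only at a handful of positions. The sketch below compares the two entries residue by residue and counts the substitutions that introduce proline. Reading the "0P" and "11P" labels of SEQ ID NOs: 65 to 68 as a count of such prolines is an assumption made here, not a statement from the listing; the helper names are illustrative.

```python
# Residue lines copied from SEQ ID NO: 71 and SEQ ID NO: 72 (position rulers omitted).
SUBSTRATE_1 = """
Gly Phe Ala Gly Pro Lys Gly Ala Ala Gly Glu Ala Gly Lys Ala Gly
Glu Arg Gly Val Ala Gly Pro Ala Gly Ala Val Gly Pro Val Gly Lys
Asp Gly Glu Ala Gly Ala Gln Gly Pro Ala Gly Pro Val Gly Pro Ala
Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Ala Gly Phe Gln Gly
Leu Ala Gly Pro Ala Gly Pro Val Gly Glu Ala Gly Lys Ala Gly Glu
Gln Gly Val Ala Gly Asp Leu Gly Ala Val Gly Pro Ser Gly Ala Arg
Gly Glu Arg
"""
SUBSTRATE_2 = """
Gly Phe Pro Gly Pro Lys Gly Ala Ala Gly Glu Pro Gly Lys Ala Gly
Glu Arg Gly Val Pro Gly Pro Pro Gly Ala Val Gly Pro Ala Gly Lys
Asp Gly Glu Ala Gly Ala Gln Gly Pro Pro Gly Pro Ala Gly Pro Ala
Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Pro Gly Phe Gln Gly
Leu Pro Gly Pro Ala Gly Pro Pro Gly Glu Ala Gly Lys Pro Gly Glu
Gln Gly Val Pro Gly Asp Leu Gly Ala Pro Gly Pro Ser Gly Ala Arg
Gly Glu Arg
"""

def residues(block):
    """Split a three-letter-code block into a list of residues."""
    return block.split()

def substitutions(seq_a, seq_b):
    """Return (1-based position, residue in a, residue in b) for every mismatch."""
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

sub1, sub2 = residues(SUBSTRATE_1), residues(SUBSTRATE_2)
all_subs = substitutions(sub1, sub2)
to_pro = [s for s in all_subs if s[2] == "Pro"]

print(len(sub1), len(sub2), len(all_subs), len(to_pro))
# prints: 99 99 13 11
print([pos for pos, _, _ in to_pro])
# prints: [3, 12, 21, 24, 42, 60, 66, 72, 78, 84, 90]
```

In this comparison, eleven of the thirteen differences replace Ala or Val with Pro, and each of those falls on every third residue, immediately before a Gly in the repeating Gly-X-Y pattern. The remaining two differences (positions 30 and 45) replace Val with Ala.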
<210> 73
<211> 6481
<212> DNA
<213> Artificial Sequence
<220>
<223> OgPDI Construct
<400> 73
ttgcttgtgt gcagttgcgg tgtttttcgg gggaggtttc gtttattcgc ttttcaaata 60
taaccagggc cgtcaatcag tttctcactg gaaacgcaat tccttcacag ctggagtgaa 120
aagccttgct caggtaagtc aacgtgaaga gatttacaaa taacgagata ctaatatcga 180
ctcaatcggc tatagaggga atttatgcac aatcgatacc gtgataatgt gccagaattc 240
aatagctttt tcagctcatt ttcaggagat gaaaacgagt tgattgatgc actggacgcg 300
caacagaatg aaaactaata gaccatttag cagtctatat caaaggtgaa caattaagtt 360
acatcttatt acccggcatt aacgttccgg gtaaccaaac ttataaggcc gaaaattttt 420
gcgttgaggc gaactgaata aggtgggaat tgacttgaaa aattttcttc agcaggtttt 480
tcaaagactg aatactataa atatgatatg caatggtcaa tggaccagtg tcggctcagc 540
cggcttgtac tacaccatca aggctgatag tatgtgcgta gacatacact atacggacgg 600
cttcattcaa ccttcctgtc agggcctaca ggttattgga ccatgcaacc gttatcagaa 660
cggccctagg gattttgtag cctgccagac atctggaggt agtggccacc ccatctgtat 720
tcagtccacc aacggcaaca ttgagctgtg cgctaactgt tactgtcctc aaggttctga 780
gaatctttat tttcagggcc gtaaaggcga agagctgttc actggtgtcg tccctattct 840
ggtggaactg gatggtgatg tcaacggtca taagttttcc gtgcgtggcg agggtgaagg 900
tgacgcaact aatggtaaac tgacgctgaa gttcatctgt actactggta aactgccggt 960
tccttggccg actctggtaa cgacgctgac ttatggtgtt cagtgctttg ctcgttatcc 1020
ggaccatatg aagcagcatg acttcttcaa gtccgccatg ccggaaggct atgtgcagga 1080
acgcacgatt tcctttaagg atgacggcac gtacaaaacg cgtgcggaag tgaaatttga 1140
aggcgatacc ctggtaaacc gcattgagct gaaaggcatt gactttaaag aggacggcaa 1200
tatcctgggc cataagctgg aatacaattt taacagccac aatgtttaca tcaccgccga 1260
taaacaaaaa aatggcatta aagcgaattt taaaattcgc cacaacgtgg aggatggcag 1320
cgtgcagctg gctgatcact accagcaaaa cactccaatc ggtgatggtc ctgttctgct 1380
gccagacaat cactatctga gcacgcaaag cgttctgtct aaagatccga acgagaaacg 1440
cgatcatatg gttctgctgg agttcgtaac cgcagcgggc atcacgcatg gtatggatga 1500
actgtacaaa ggttctcatc atcatcatca tcacttggga agaggtagaa gatccaaatt 1560
gtaactcgag agcttttgat taagccttct agtccaaaaa acacgttttt ttgtcattta 1620
tttcattttc ttagaatagt ttagtttatt cattttatag tcacgaatgt tttatgattc 1680
tatatagggt tgcaaacaag catttttcat tttatgttaa aacaatttca ggtttacctt 1740
ttattctgct tgtggtgacg cgtgtatccg cccgctcttt tggtcaccca tgtatgctga 1800
cggggtcatc acggctcatc atgcgccaaa caaatgtgtg caatacacgc tcggatgact 1860
gcatgatgac cgcactgact ggggacagca gatccaccta agcctgtgag agaagcagac 1920
acccgacaga tcaaggcagt taaacgcctt gccaacaggg agttcttcag agacatggag 1980
gctcaaaacg aaattattga cagcctagac atcaatagtc atacaacaga aagcgaccac 2040
ccaactttgg ctgataatag cgtataaaca atgcatactt tgtacgttca aaatacaatg 2100
cagtagatat atttatgcat attacatata atacatatca cataggaagc aacaggcgcg 2160
ttggactttt aattttcgag gaccgcgaat ccttacatca cacccaatcc cccacaagtg 2220
atcccccaca caccatagct tcaaaatgtt tctactcctt ttttactctt ccagattttc 2280
tcggactccg cgcatcgccg taccacttca aaacacccaa gcacagcata ctaaatttcc 2340
cctctttctt cctctagggt gtcgttaatt acccgtacta aaggtttgga aaagaaaaaa 2400
gacaccgcct cgtttctttt tcttcgtcga aaaaggcaat aaaaattttt atcacgtttc 2460
tttttcttga aaattttttt ttttgatttt tttctctttc gatgacctcc cattgatatt 2520
taagttaata aacggtcatc aatttctcaa gtttcagttt catttttctt gttctattac 2580
aacttttttt acttcttgct cattagaaag aaagcatagc aatctaatct aagttttaat 2640
tacaaaagat ctatggtgtc caaaggagag gagttaatca aggaaaacat gagaatgaaa 2700
gttgtcatgg agggctccgt taatggtcac caattcaagt gtacagggga aggtgaaggt 2760
aatccttaca tgggtacaca aactatgaga attaaagtaa ttgaaggcgg accactacca 2820
tttgcatttg acattctggc aacgtcattc atgtacggat cacgaacttt catcaagtac 2880
cctaaaggta taccagactt tttcaagcaa tcttttccag agggttttac atgggaaagg 2940
gttacaagat acgaagatgg gggtgtcgtc acagttatgc aagatacttc attagaagat 3000
ggctgccttg tctatcatgt gcaagtaaga ggggtgaatt ttccttctaa cggacctgtg 3060
atgcagaaaa agaccaaagg ttgggaacca aatactgaaa tgatgtaccc agctgatgga 3120
ggtttgagag gctacacaca catggcgctt aaagttgatg gtggaggtca tttgtcttgt 3180
agttttgtta ccacttatcg ttctaaaaag actgttggca atatcaaaat gccaggaata 3240
catgctgtag accacagact agaaagactc gaagagagcg ataacgaaat gttcgttgta 3300
cagagagagc atgccgtagc caaatttgct ggcttaggcg gtggtatgga tgaattgtat 3360
aagggttctg ctgtggccaa aggtgacgcc gacgaagccg ccattgcgtc gccggattcc 3420
gctgttgtga agttgacagc agaatccttc gagtcgttta tcaaggagaa cccgctcgtt 3480
ttggctgagt tctttgcgcc atggtgtggc cactgcaagc gtcttggacc agagtttagc 3540
gctgctgccg acaaacttgt cgagaaagac atcaagttgg cccagattga ttgcacccaa 3600
gagagagatc tatgtgcgga ctatggtatc cgtggttatc catctctcaa ggtcttcaga 3660
ggcaacaaca cgccatccga gtaccagggc caaagagaac aagatgcaat tgtcagctac 3720
atgatcaagc aagccctacc tccagtgtcg ttgcttgagg atacggctga tctgctggac 3780
gctctggccg atctgagcga accaatgatc ttgcaagttc tgccacctga ctcgaagtct 3840
tccggcaacg aaacgttcca ttcgttggcc aaccgtctta gaaacgactt caggttcgtg 3900
tctacctcca accctgagta tgttgagaaa tacgtcaagg aaaagtccac tccaacctac 3960
gttgttttca gaccgggtga aaagattgag gacgcatctg ttctcaccaa caagactata 4020
gacgaagagg gattgcagag attcattagt gttgagacta agcctctttt cggcgaggtc 4080
accggtgcca cgttccaggc ctacatggac tccaaacttc ctttagcata ctttttctat 4140
gaagaggagt ctcagaaggc tgctgtcgca gacgaaatca ctaagttggc caagaaatat 4200
agaggcgaga tcaatttcgc cggactggag gccaagaaat acggaatgca cgctaaaaac 4260
ctcaacatgc aagaaaagtt cccactgttt gccattcacg atctgcaagg cgacctaaag 4320
tacggcatcc cacaagataa ggatctggac ttctctgaga ttcctaaatt tgtcgaaaac 4380
ttcaagaagg gcaagctgaa gccaattgtc aagagcgagc ctattccaga gactcaagag 4440
gaggctgtct accacttggt cggctacgag cacgacaaga tcgtcaacca aaagaaggac 4500
gttctggtcg agtactatgc tccatggtgt ggtcactgca agagacttgc ccctacatat 4560
gaggagctgg ctgctatcta caagaacgac accgctgcta gtgccaaggt cgtgatcgcc 4620
aagattgacc acaccgctaa cgatgttgcg ggcgtcgaga tcaccggata ccctaccatt 4680
ttcctttatc cagctgacgg ttctggtccg gtcaattacg agggacaaag aactttggag 4740
tccctagctt ctttcattca agagaagggt acctttggtg ttgacggttt ggccatcaga 4800
ggcgctaaga gcggcggagc tgataaaccg gagtccgaca ctaaggacag tactggatcc 4860
ttgggaagag gtagaagatc caaattgtaa ctcgagtggc gcgaatttct tatgatttat 4920
gatttttatt attaaataag ttataaaaaa aataagtgta tacaaatttt aaagtgactc 4980
ttaggtttta aaacgaaaat tcttattctt gagtaactct ttcctgtagg tcaggttgct 5040
ttctcaggta tagcatgagg tcgctcttat tgaccacacc tctaccggca tgccgagcaa 5100
atgcctgcaa atcgctcccc atttcgctgg aaatctgctc gtcagtggtg ctcacactga 5160
cgaatcatgt acagatcata ccgatgactg cctggcgact cacaactaag caagacagcc 5220
ggaaccagcg ccggcgaaca ccactgcata tatggcatat cacaacagtc cacgtctcaa 5280
gcagttacag agatgttacg aaccactagt gcactgcagt acagtttagc ttgcctcgtc 5340
cccgccgggt cacccggcca gcgacatgga ggcccagaat accctccttg acagtcttga 5400
cgtgcgcagc tcaggggcat gatgtgactg tcgcccgtac atttagccca tacatcccca 5460
tgtataatca tttgcatcca tacattttga tggccgcacg gcgcgaagca aaaattacgg 5520
ctcctcgctc cagacctgcg agcagggaaa cgctcccctc acagacgcgt tgaattgtcc 5580
ccacgccgcg cccctgtaga gaaatataaa aggttaggat ttgccactga ggttcttctt 5640
tcatatactt ccttttaaaa tcttgctagg atacagttct cacatcacat ccgaacataa 5700
acaaaaatgg gtactacctt agatgataca gcctacagat acagaacatc agtccctggt 5760
gatgctgaag caattgaggc tttagacggt tcattcacca ccgacaccgt ctttagagta 5820
accgccaccg gtgatggatt taccttaaga gaagtcccag tcgaccctcc attaactaaa 5880
gtctttccag atgatgaatc tgatgacgaa agcgacgacg gagaagatgg tgacccagat 5940
tcaagaactt tcgtagcata cggtgatgac ggtgatttgg ctggttttgt agtcgtttct 6000
tattcaggtt ggaatagaag gttgaccgtt gaagatatag aagtcgcccc agagcataga 6060
ggtcatggtg taggaagagc tttgatgggt ttggctacag aatttgcaag agagagagga 6120
gccggtcatt tatggttaga agttactaat gttaacgccc ctgctatcca tgcttataga 6180
agaatgggtt tcacattatg tggtttagat actgctttat atgatggaac agcatctgac 6240
ggtgaacagg ccttgtatat gtctatgcct tgcccttaaa gtaactgaca ataaaaagat 6300
tcttgttttc aagaacttgt catttgtata gtttttttat attgtagttg ttctatttta 6360
atcaaatgtt agcgtgattt atattttttt tcgcctcgac atcatctgcc cagatgcgaa 6420
gttaagtgcg cagaaagtaa tatcatgcgt caatcgtatg tgaatgctgg tcgctatact 6480
g 6481
<210> 74
<211> 54
<212> PRT
<213> Artificial Sequence
<220>
<223> Human insulin precursor
<400> 74
Met Phe Val Asn Gln His Leu Cys Gly Ser His Leu Val Glu Ala Leu
1 5 10 15
Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys Ala Ala
20 25 30
Lys Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr Gln
35 40 45
Phe Glu Asn Tyr Cys Asn
50
<210> 75
<211> 166
<212> PRT
<213> Artificial Sequence
<220>
<223> Alpha interferon
<400> 75
Met Lys Tyr Thr Ser Tyr Ile Leu Ala Phe Gln Leu Cys Ile Val Leu
1 5 10 15
Gly Ser Leu Gly Cys Tyr Cys Gln Asp Pro Tyr Val Lys Glu Ala Glu
20 25 30
Asn Leu Lys Lys Tyr Phe Asn Ala Gly His Ser Asp Val Ala Asp Asn
35 40 45
Gly Thr Leu Phe Leu Gly Ile Leu Lys Asn Trp Lys Glu Glu Ser Asp
50 55 60
Arg Lys Ile Met Gln Ser Gln Ile Val Ser Phe Tyr Phe Lys Leu Phe
65 70 75 80
Lys Asn Phe Lys Asp Asp Gln Ser Ile Gln Lys Ser Val Glu Thr Ile
85 90 95
Lys Glu Asp Met Asn Val Lys Phe Phe Asn Ser Asn Lys Lys Lys Arg
100 105 110
Asp Asp Phe Glu Lys Leu Thr Asn Tyr Ser Val Thr Asp Leu Asn Val
115 120 125
Gln Arg Lys Ala Ile His Glu Leu Ile Gln Val Met Ala Glu Leu Ser
130 135 140
Pro Ala Ala Lys Thr Gly Lys Arg Lys Arg Ser Gln Met Leu Phe Arg
145 150 155 160
Gly Arg Arg Ala Ser Gln
165
<210> 76
<211> 89
<212> PRT
<213> Artificial Sequence
<220>
<223> Mapacalcine
<400> 76
Ile Cys Asn Gly Gln Trp Thr Ser Val Gly Ser Ala Gly Leu Tyr Tyr
1 5 10 15
Thr Ile Lys Ala Asp Ser Met Cys Val Asp Ile His Tyr Thr Asp Gly
20 25 30
Phe Ile Gln Pro Ser Cys Gln Gly Leu Gln Val Ile Gly Pro Cys Asn
35 40 45
Arg Tyr Gln Asn Gly Pro Arg Asp Phe Val Ala Cys Gln Thr Ser Gly
50 55 60
Gly Ser Gly His Pro Ile Cys Ile Gln Ser Thr Asn Gly Asn Ile Glu
65 70 75 80
Leu Cys Ala Asn Cys Tyr Cys Pro Gln
85
<210> 77
<211> 496
<212> PRT
<213> Artificial Sequence
<220>
<223> OgPDI
<400> 77
Met Ala Val Ala Lys Gly Asp Ala Asp Glu Ala Ala Ile Ala Ser Pro
1 5 10 15
Asp Ser Ala Val Val Lys Leu Thr Ala Glu Ser Phe Glu Ser Phe Ile
20 25 30
Lys Glu Asn Pro Leu Val Leu Ala Glu Phe Phe Ala Pro Trp Cys Gly
35 40 45
His Cys Lys Arg Leu Gly Pro Glu Phe Ser Ala Ala Ala Asp Lys Leu
50 55 60
Val Glu Lys Asp Ile Lys Leu Ala Gln Ile Asp Cys Thr Gln Glu Arg
65 70 75 80
Asp Leu Cys Ala Asp Tyr Gly Ile Arg Gly Tyr Pro Ser Leu Lys Val
85 90 95
Phe Arg Gly Asn Asn Thr Pro Ser Glu Tyr Gln Gly Gln Arg Glu Gln
100 105 110
Asp Ala Ile Val Ser Tyr Met Ile Lys Gln Ala Leu Pro Pro Val Ser
115 120 125
Leu Leu Glu Asp Thr Ala Asp Leu Leu Asp Ala Leu Ala Asp Leu Ser
130 135 140
Glu Pro Met Ile Leu Gln Val Leu Pro Pro Asp Ser Lys Ser Ser Gly
145 150 155 160
Asn Glu Thr Phe His Ser Leu Ala Asn Arg Leu Arg Asn Asp Phe Arg
165 170 175
Phe Val Ser Thr Ser Asn Pro Glu Tyr Val Glu Lys Tyr Val Lys Glu
180 185 190
Lys Ser Thr Pro Thr Tyr Val Val Phe Arg Pro Gly Glu Lys Ile Glu
195 200 205
Asp Ala Ser Val Leu Thr Asn Lys Thr Ile Asp Glu Glu Gly Leu Gln
210 215 220
Arg Phe Ile Ser Val Glu Thr Lys Pro Leu Phe Gly Glu Val Thr Gly
225 230 235 240
Ala Thr Phe Gln Ala Tyr Met Asp Ser Lys Leu Pro Leu Ala Tyr Phe
245 250 255
Phe Tyr Glu Glu Glu Ser Gln Lys Ala Ala Val Ala Asp Glu Ile Thr
260 265 270
Lys Leu Ala Lys Lys Tyr Arg Gly Glu Ile Asn Phe Ala Gly Leu Glu
275 280 285
Ala Lys Lys Tyr Gly Met His Ala Lys Asn Leu Asn Met Gln Glu Lys
290 295 300
Phe Pro Leu Phe Ala Ile His Asp Leu Gln Gly Asp Leu Lys Tyr Gly
305 310 315 320
Ile Pro Gln Asp Lys Asp Leu Asp Phe Ser Glu Ile Pro Lys Phe Val
325 330 335
Glu Asn Phe Lys Lys Gly Lys Leu Lys Pro Ile Val Lys Ser Glu Pro
340 345 350
Ile Pro Glu Thr Gln Glu Glu Ala Val Tyr His Leu Val Gly Tyr Glu
355 360 365
His Asp Lys Ile Val Asn Gln Lys Lys Asp Val Leu Val Glu Tyr Tyr
370 375 380
Ala Pro Trp Cys Gly His Cys Lys Arg Leu Ala Pro Thr Tyr Glu Glu
385 390 395 400
Leu Ala Ala Ile Tyr Lys Asn Asp Thr Ala Ala Ser Ala Lys Val Val
405 410 415
Ile Ala Lys Ile Asp His Thr Ala Asn Asp Val Ala Gly Val Glu Ile
420 425 430
Thr Gly Tyr Pro Thr Ile Phe Leu Tyr Pro Ala Asp Gly Ser Gly Pro
435 440 445
Val Asn Tyr Glu Gly Gln Arg Thr Leu Glu Ser Leu Ala Ser Phe Ile
450 455 460
Gln Glu Lys Gly Thr Phe Gly Val Asp Gly Leu Ala Ile Arg Asp Ala
465 470 475 480
Lys Ser Gly Gly Ala Asp Lys Pro Glu Ser Asp Thr Lys Asp Ser Thr
485 490 495
<210> 78
<211> 675
<212> DNA
<213> Artificial Sequence
<220>
<223> Human beta-Casein II
<400> 78
atgaaagtcc ttattttagc ttgccttgtc gcattggctc tggcaagaga gacgattgaa 60
tcactaagta gttccgaaga aagtatcacc gaatataaaa aggtcgagaa ggtgaagcat 120
gaagaccagc aacagggcga agacgagcat caagacaaga tttaccctag tttccaacca 180
cagcctttaa tttatccctt cgtggaacca ataccatatg gcttcctgcc acaaaatatc 240
ctgcccttag cccaacccgc cgtcgttctg ccagtgcctc aacctgagat catggaagtt 300
ccaaaagcca aggatactgt ttatactaaa ggacgtgtga tgcctgtttt aaaatctccc 360
accattcctt tctttgatcc ccaaatccca aaacttactg accttgagaa cctacatcta 420
cccctaccac ttttacagcc actaatgcaa caggtgcctc agcctattcc tcagacccta 480
gctctaccac cacagcccct ttggtctgtc ccccaaccca aggttcttcc catacctcaa 540
caagtagttc catacccaca acgtgctgtc cctgtgcagg ctctgctact gaaccaggaa 600
ttgttactga atcctaccca ccaaatctac ccagtgactc agcccttagc cccagtacat 660
aatcccatca gtgtt 675
<210> 79
<211> 645
<212> DNA
<213> Artificial Sequence
<220>
<223> Casein Kinase II
<400> 79
atgtcatcct ccgaggaagt cagttggatc tcatggttct gcggtctgag gggcaacgag 60
tttttctgtg aggtagatga agactatatt caagacaagt tcaatctgac gggacttaat 120
gaacaggttc ctcactacag acaagcacta gacatgatat tagacctgga gcctgacgaa 180
gaactagaag ataaccccaa tcagtcagat ctaatcgaac aagccgcaga gatgttgtat 240
ggcttgatac acgccagata catattaact aaccgtggta ttgcacagat gttggaaaag 300
tatcagcaag gtgattttgg atattgcccc agagtatatt gcgagaacca acctatgtta 360
cccataggac tttctgatat tcctggagag gctatggtga aattgtactg ccctaaatgt 420
atggatgttt acactcctaa atcttcccgt catcatcata cggatggcgc ttattttgga 480
actggttttc cccacatgtt gttcatggtc caccctgagt ataggccaaa aagacctgca 540
aatcaatttg ttcctagact ttatggattt aagatacatc caatggctta ccaactgcag 600
ttacaagctg ctagtaattt taaatctcca gtcaaaacca ttaga 645
<210> 80
<211> 1155
<212> DNA
<213> Artificial Sequence
<220>
<223> Ovalbumin
<400> 80
ggttctatcg gagcagctag tatggaattc tgtttcgacg tgttcaagga attaaaagtc 60
catcatgcta atgaaaacat attctactgc ccaattgcca ttatgtcagc cctggccatg 120
gtgtacctag gtgccaaaga ttccacgaga actcaaataa ataaggttgt tagattcgac 180
aagttgccag gtttcggtga tagtattgaa gcccagtgcg gaacgtctgt taacgttcac 240
agttccctaa gagacatttt aaatcaaatc acaaagccca acgacgtgta ttcattttcc 300
ttagcctcca ggctgtacgc cgaggaaaga tatccaattt tgcccgaata cctgcagtgc 360
gtaaaagagc tgtatagagg cggacttgaa ccaataaatt tccagaccgc tgctgaccaa 420
gcccgtgagt tgataaactc ctgggtcgag tcccaaacta atggaatcat acgtaacgtg 480
ctgcaaccaa gtagtgttga ctcacagacc gcaatggttt tggttaacgc tatcgtattt 540
aaaggtttgt gggagaaagc attcaaagac gaagacacac aagccatgcc tttcagagtg 600
acggagcagg agagtaagcc agtacagatg atgtaccaga ttggcctttt cagagttgct 660
tccatggcct cagagaagat gaaaatcctg gagcttccat ttgcttctgg tactatgtcc 720
atgctagtgc ttctgcccga cgaagtgtct ggattggagc aattagaatc tatcataaac 780
tttgaaaagt taacagagtg gacttcttca aacgtaatgg aggaacgtaa gattaaagtt 840
tacctaccac gtatgaaaat ggaagaaaag tataatttga cctctgttct gatggcaatg 900
ggcataaccg acgtattcag ttcttcagca aatttatccg gcatatcctc tgcagaatct 960
ctaaagatat ctcaggccgt acacgccgct cacgcagaaa ttaacgaggc cggacgtgag 1020
gtagtaggat ccgctgaggc cggtgtggac gctgcatctg tgtctgagga gtttagagcc 1080
gaccaccctt ttctattctg tattaaacat atagccacta atgccgttct tttcttcggt 1140
aggtgcgtct cacca 1155
<210> 81
<211> 540
<212> DNA
<213> Artificial Sequence
<220>
<223> NatB complex Naa20
<400> 81
atgaccgata caagaaaatt taaggcaaca gatttatttt cctttaacaa cataaatctg 60
gacccactta ccgaaacatt taacatatca ttttaccttt catatcttaa caagtggccc 120
tccttatgcg tagtacaaga gtccgatctt tcagacccca cgttgatggg ttatattatg 180
ggcaagtccg aaggaacagg caaagagtgg cacacgcatg ttaccgccat cacagtggcc 240
ccaaattcac gtagattggg ccttgccagg acaatgatgg attacttgga gacggtcggt 300
aactctgaga acgccttctt cgtggacctg ttcgtcaggg catccaacgc cctagcaatt 360
gacttttaca aaggattggg ctactctgtc taccgtagag tgattggtta ttacagtaac 420
cctcatggca aagatgaaga ctctttcgac atgagaaaac cattgtctag ggatgtaaac 480
agggagtcaa tcagagaaaa cggtgagaac tttaagtgct cacccgcaga tgtgagtttt 540
<210> 82
<211> 2433
<212> DNA
<213> Artificial Sequence
<220>
<223> NatB complex Naa25
<400> 82
atgaggagga gtggctccaa ggagagtact attgtatact cagctttgag tcttgctcag 60
gccggaagag gacccgaagc actggccttg cttgagcccc tgaagtcaac tccaatcaat 120
tctcttgagt tattggatat catacaggct gtatacgatg atcaaaagaa aggagaagaa 180
tccttcgttt tctgggagaa gttccttcag acttatggta agcaggagaa aaatttactg 240
gcttacttca aggcttctat tagaattaaa tctctgagtc accaacgtaa ggctgcagtg 300
gagctgcaaa agaactttcc aagtaggaaa cacacgttat gggttattag tagtctttat 360
ttactgtcca aaaaatccga gaacgaagtt gagcagaggc tgctgaaggc tctagctgaa 420
aaaaccgcta aattaatttt tgaaaaacca accggatata ttgattcttg tgaggaattc 480
catttgtatc tggacgtact acttctagta ggtgataaag atagggctct tgacgcttta 540
attcaccaag acgcagatag attcgtcgac gctgatgctg acctactgct tcgtaaatta 600
gaactattag caagttgtgc aagatgggat tcactgttta ctttctccct gagtttgttc 660
cagactggta acacggactg gaaagtctgt aaagcactac tagattctgc ctccaacgat 720
gatagtaagt tagtaccatt gaaggattgc atacttaaag cattatctac gtcttctact 780
aaaagaaatt tacacctgct atggattgaa gcatccgcac gtttctttcc cgaggaacac 840
gagtcagcac tattaggcta cataaagaaa ctttacatga agcccattgt cttcgaggat 900
cttaggccat atctgctgaa actaaacgtc gatgcacagc accgtttgtt ggacgctttc 960
aagctagctg atttgggcga gtcaaatgag tcacagaagg tcgataaatt atacgctgag 1020
gttcttctgt tgaaaatcca cttcctgcta ttcgagagtt tcacagccga gagtgtggta 1080
gactacgttc gtcgttgttt cgttgccttt gagaaaggac tttcactgtc taaaggactt 1140
ttgccaacgg acttcactca tggatatgag gctcttctgt tagcagtaca ttcccttatt 1200
tatatgtggg aaggtaacaa ggatttaaaa ccagcagaaa agcaggcatt aattttcgac 1260
gctatttgcc tgttagaaaa gggtataaca tacagtcaac ataatttcca cctgaaactt 1320
cccctgataa ggttatacct actgcttgac ggaggattcc ccgcagcagc aaaagtttac 1380
gatactatga gtataaagca aattcagaac gatacattag atcactattt actgaccagg 1440
gctaccacat attacccctc ttccgtcacg tcacattata taaattcatc cttaaagata 1500
tatggctcta acgagttcga aaccccagaa atgatttcta tggcatatga ggacggcgca 1560
tacagtcaga tcgaggacat gcgtaatttt agatctaggc ttgaccattc cacctggaag 1620
agtatatccc tagtcgaaag ggccaggata cactatctta ccgcatttaa gcctcctaaa 1680
cagtacctac ccaagtgttc cagtcctaaa gataaccgtg acctaaaggt gttcgctgat 1740
tacggatcag acaagcttcc taccgtggag gaaagtctaa ggaactcccc caagcccgat 1800
acgttgtgga tccacctaac tgtaatcggt cattccttag ttcaggatag tattgtgaat 1860
ggcgattttg agaaggccgt tctgtcagcc aaagaaatgg aagtcttgtg tgaaaataac 1920
gatctgtcta agcaactaac atcagaagag atcgtgcaca tgaagctact aatccaatta 1980
ggacttttaa gtgtgaaggt taagaatgga gattatgaaa actcctcttt tgagactatc 2040
gagaacctta tagaaagttt cgattatgaa aacagtactc ccctaagtca gttgacaaag 2100
tataccgaga tcatcaatga tttaatcacc tgcttgaact catttttgta tcatgtaagt 2160
gctactaaga aaaaggaatt cacacgtcaa taccagttgc tgaaaaatat aagttccaac 2220
aaacttggat caatctctgg tatcaccaaa cataagaaga aagctgccag aaagtacgtt 2280
tccgagctgt tgagtaattc ttggctaagt aacctatctg agacccaggt cccttacgat 2340
ccaaagtttg caaagcaagt gggtgagggt atgatcgact cttatataca gacaacggac 2400
gcagtgtcaa aattaccaaa gttcgtgaag ttt 2433
Claims (42)
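- A method of producing a modified protein in a peroxisome, the method comprising:
providing a cell;
introducing into the cell a first nucleic acid, wherein the first nucleic acid comprises a first sequence encoding a heterologous protein fused to a peroxisome-targeting sequence; and
introducing into the cell a second nucleic acid, wherein the second nucleic acid comprises a second sequence encoding a heterologous modifying enzyme fused to a peroxisome-targeting sequence.
- The method of claim 1, wherein the cell is a eukaryotic cell.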
- The method of claim 1 or 2, wherein the cell is a yeast cell.
- The method of any one of claims 1 to 3, wherein the cell is selected from Arxula, Candida, Hansenula, Kluyveromyces, Komagataella, Ogataea, Pichia, Saccharomyces, or Yarrowia.
- The method of any one of claims 1 to 4, wherein the first nucleic acid and/or the second nucleic acid comprises promoter(s).
- The method of claim 5, wherein the promoter is constitutive or inducible.
- The method of any one of claims 1 to 6, wherein the peroxisome-targeting sequence comprises a sequence set forth in SEQ ID NO: 1 (SLK), SEQ ID NO: 2 (RLXXXXX(H/Q)L), or SEQ ID NO: 3 (LGRGRRSKL).
- The method of any one of claims 1 to 7, wherein the protein comprises a tag.
- The method of claim 8, wherein the tag is cleavable.
- The method of any one of claims 1 to 9, further comprising introducing into the cell a third nucleic acid, wherein the third nucleic acid comprises a third sequence encoding a second heterologous modifying enzyme fused to a peroxisome-targeting sequence.
- The method of any one of claims 1 to 10, wherein the heterologous protein has a molecular weight of 1 Da, 5 Da, 10 Da, 20 Da, 30 Da, 40 Da, 50 Da, 60 Da, 70 Da, 80 Da, 90 Da, 100 Da, 200 Da, 300 Da, 400 Da, 500 Da, 600 Da, 700 Da, 800 Da, 900 Da, 1 kDa, 5 kDa, 10 kDa, 20 kDa, 30 kDa, 40 kDa, 50 kDa, 60 kDa, 70 kDa, 80 kDa, 90 kDa, 100 kDa, 110 kDa, 120 kDa, 130 kDa, 140 kDa, 150 kDa, 160 kDa, 170 kDa, 180 kDa, 190 kDa, 200 kDa, 210 kDa, 220 kDa, 230 kDa, 240 kDa, 250 kDa, 260 kDa, 270 kDa, 280 kDa, 290 kDa, or 300 kDa, or any size within a range defined by any two of the aforementioned values, or a molecular weight of up to 300 kDa.
- The method of any one of claims 1 to 11, wherein the enzyme produces a modification.
- The method of claim 12, wherein the modification is hydroxylation, protein folding, oxidation, proteolysis, phosphorylation, dephosphorylation, and/or isomerization.
- The method of any one of claims 1 to 13, wherein the enzyme comprises a prolyl hydroxylase, a lysyl oxidase, a protein chaperone, or a prolyl isomerase.
- The method of any one of claims 1 to 14, wherein the enzyme is selected from a prolyl isomerase, a protein disulfide isomerase, a hydroxyl transferase, or a prolyl hydroxylase.
- The method of any one of claims 1 to 15, wherein the protein comprises collagen, gelatin, or a silk protein.
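- The method of any one of claims 1 to 16, wherein the nucleic acid is codon-optimized for protein expression in a eukaryotic cell, such as a yeast cell.
- The method of any one of claims 1 to 17, wherein the enzyme comprises a prolyl hydroxylase or a prolyl isomerase.
- The method of any one of claims 1 to 18, wherein the protein is collagen, and the collagen is modified to form a type I heterotrimeric, type 1 alpha homotrimeric, or type III homotrimeric collagen.
- The method of any one of claims 1 to 19, wherein the heterologous protein comprises Col1A1 or Col1A2.
- The method of any one of claims 1 to 20, wherein the enzyme comprises a prolyl-4-hydroxylase.
- The method of claim 21, wherein the prolyl-4-hydroxylase is genetically modified to have a deletion of the PDI domain.
- The method of any one of claims 1 to 22, wherein the enzyme is genetically modified for improved expression and import into the peroxisome.
- The method of any one of claims 1 to 23, wherein the protein is genetically modified for improved expression and import into the peroxisome.
- The method of any one of claims 1 to 24, wherein fusion of the heterologous protein to the peroxisome-targeting sequence results in targeting of the heterologous protein to the peroxisome, thereby separating the heterologous protein from enzymes that are not targeted to the peroxisome.
- The method of any one of claims 1 to 25, wherein fusion of the modified protein to the peroxisome-targeting sequence results in targeting of the modified protein to the peroxisome, thereby separating the modified protein from substrates or enzymes that are not targeted to the peroxisome.
- The method of any one of claims 1 to 26, wherein the heterologous protein comprises COLsyn2, COLsyn3, or an amino acid sequence at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identical to the amino acid sequence of COLsyn2 or COLsyn3.
- The method of any one of claims 1 to 27, wherein the first nucleic acid is engineered such that at least one hydrophobic amino acid in the heterologous protein is replaced with a hydrophilic or non-hydrophobic amino acid, as compared with an unmodified first nucleic acid or a naturally occurring first nucleic acid.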
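- A eukaryotic cell that produces a protein in a peroxisome, made by the method of any one of claims 1 to 28.
- A eukaryotic cell that produces a protein in a peroxisome, comprising:
a first nucleic acid comprising a sequence encoding a heterologous protein fused to a peroxisome-targeting sequence; and
a second nucleic acid encoding a heterologous modifying enzyme fused to a peroxisome-targeting sequence.
- A eukaryotic cell that produces a modified protein in a peroxisome, the cell being capable of expressing:
a heterologous protein fused to a peroxisome-targeting sequence; and
a heterologous modifying enzyme fused to a peroxisome-targeting sequence.
- The eukaryotic cell of claim 30 or 31, wherein the protein is modified in the peroxisome.
- The eukaryotic cell of any one of claims 29 to 32, wherein the cell is Pastoris.
- The eukaryotic cell of any one of claims 29 to 33, wherein the peroxisome-targeting sequence comprises a sequence set forth in SEQ ID NO: 1, 2, or 3.
- The eukaryotic cell of any one of claims 29 to 34, wherein the cell further comprises a third nucleic acid encoding a protein fused to a peroxisome-targeting sequence.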
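- A method of producing a modified protein in a eukaryotic cell, wherein the eukaryotic cell expresses a heterologous modifying enzyme fused to a peroxisome-targeting sequence, the method comprising:
providing a cell made by the method of any one of claims 1 to 28 or the cell of any one of claims 29 to 35;
expressing a heterologous protein in the eukaryotic cell, wherein the heterologous protein is fused to a peroxisome-targeting sequence; and
culturing the eukaryotic cell under conditions that allow the heterologous modifying enzyme to modify the heterologous protein in the peroxisome to produce a modified protein.
- A method of producing a modified protein in a eukaryotic cell, wherein the cell comprises a peroxisome and the eukaryotic cell expresses a heterologous modifying enzyme fused to a peroxisome-targeting sequence, the method comprising:
expressing a heterologous protein in the eukaryotic cell, wherein the heterologous protein is fused to a peroxisome-targeting sequence; and
culturing the eukaryotic cell under conditions that allow the heterologous modifying enzyme to modify the heterologous protein in the peroxisome to produce a modified protein.
- A method of producing a modified protein, comprising culturing a eukaryotic cell containing a peroxisome under conditions that allow the modified protein to be produced, wherein the eukaryotic cell expresses:
a heterologous protein fused to a peroxisome-targeting sequence; and
a heterologous modifying enzyme fused to a peroxisome-targeting sequence,
and wherein the heterologous modifying enzyme modifies the heterologous protein in the peroxisome under the culture conditions to produce the modified protein.
- A method of increasing the yield of a modified protein, comprising culturing a eukaryotic cell containing a peroxisome under conditions that allow the modified protein to be produced, wherein the eukaryotic cell expresses:
a heterologous protein fused to a peroxisome-targeting sequence, wherein expression of the heterologous protein is under the influence of a promoter; and
a heterologous modifying enzyme fused to a peroxisome-targeting sequence,
and wherein the heterologous modifying enzyme modifies the heterologous protein in the peroxisome under the culture conditions to produce the modified protein.
- The method of claim 39, wherein production of the heterologous protein is induced by a chemical inducer.
- The method of claim 39 or 40, further comprising increasing the cargo of the peroxisome, wherein increasing the cargo of the peroxisome is performed by providing oleic acid or methanol to the eukaryotic cell.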
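- A kit for producing a modified protein in a peroxisome of a cell, comprising:
a first nucleic acid construct comprising GFP-x-ePTS1 or x-FLAG-ePTS1, wherein x is a nucleic acid sequence encoding a heterologous protein targeted to the peroxisome; and
a second nucleic acid construct comprising GFP-y-ePTS1 or y-FLAG-ePTS1, wherein y is a nucleic acid sequence encoding a modifying enzyme targeted to the peroxisome, the modifying enzyme being configured to modify the heterologous protein in the peroxisome.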
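The claims identify peroxisome-targeting sequences by motif: LGRGRRSKL (SEQ ID NO: 3, the ePTS1 that ends the constructs in the sequence listing) and RLXXXXX(H/Q)L (SEQ ID NO: 2). A minimal sketch of how a protein sequence could be screened for these motifs is given below. It assumes that X stands for any residue and that the ePTS1 must sit at the very C-terminus; both are working assumptions for illustration, and the function names are hypothetical.

```python
import re

# One-letter motifs as written in the claims.
EPTS1 = "LGRGRRSKL"                      # SEQ ID NO: 3
# SEQ ID NO: 2, RLXXXXX(H/Q)L, with X read as "any residue" (assumption).
# The lookahead lets overlapping occurrences be reported as well.
PTS2_PATTERN = re.compile(r"(?=RL.{5}[HQ]L)")

def has_c_terminal_epts1(protein):
    """True if the sequence ends with the ePTS1 motif."""
    return protein.upper().endswith(EPTS1)

def find_pts2(protein):
    """0-based start positions of every RLXXXXX(H/Q)L match."""
    return [m.start() for m in PTS2_PATTERN.finditer(protein.upper())]

# Residues 337 to 363 of SEQ ID NO: 66, converted to one-letter code:
# end of GFP, Gly-Ser linker, six His residues, then the ePTS1.
SEQ66_TAIL = "ITHGMDELYKGSHHHHHHLGRGRRSKL"

# Made-up sequence used only to exercise the second pattern.
HYPOTHETICAL_PTS2 = "MRLAAAAAHLGG"

print(has_c_terminal_epts1(SEQ66_TAIL), find_pts2(SEQ66_TAIL))
# prints: True []
print(has_c_terminal_epts1(HYPOTHETICAL_PTS2), find_pts2(HYPOTHETICAL_PTS2))
# prints: False [1]
```

A check of this kind only confirms that a designed construct carries the intended tag in the intended place. It says nothing about whether the protein is actually imported into the peroxisome.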
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201962847769P | 2019-05-14 | 2019-05-14 | |
US62/847,769 | 2019-05-14 | ||
PCT/US2020/032512 WO2020232017A2 (en) | 2019-05-14 | 2020-05-12 | Expression of modified proteins in a peroxisome |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20220062230A (ko) | 2022-05-16 |
Family
ID=73289613
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020217040840A KR20220062230A (ko) | 2019-05-14 | 2020-05-12 | Expression of modified proteins in a peroxisome |
Country Status (12)
Country | Link |
---|---|
US (1) | US20230148256A1 (ko) |
EP (1) | EP4004196A4 (ko) |
JP (1) | JP2022537640A (ko) |
KR (1) | KR20220062230A (ko) |
CN (1) | CN114423861A (ko) |
AU (1) | AU2020274089A1 (ko) |
BR (1) | BR112021022900A8 (ko) |
CA (1) | CA3140144A1 (ko) |
IL (1) | IL288015A (ko) |
MX (1) | MX2021013900A (ko) |
SG (1) | SG11202112632UA (ko) |
WO (1) | WO2020232017A2 (ko) |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020142391A1 (en) * | 1991-06-12 | 2002-10-03 | Kivirikko Kari I. | Synthesis of human procollagens and collagens in recombinant DNA systems |
EP3630940A1 (en) * | 2017-05-31 | 2020-04-08 | Universität für Bodenkultur Wien | Yeast expressing a synthetic calvin cycle |
-
2020
- 2020-05-12 CA CA3140144A patent/CA3140144A1/en active Pending
- 2020-05-12 US US17/595,293 patent/US20230148256A1/en active Pending
- 2020-05-12 EP EP20806832.0A patent/EP4004196A4/en active Pending
- 2020-05-12 CN CN202080048603.2A patent/CN114423861A/zh active Pending
- 2020-05-12 BR BR112021022900A patent/BR112021022900A8/pt unknown
- 2020-05-12 SG SG11202112632UA patent/SG11202112632UA/en unknown
- 2020-05-12 KR KR1020217040840A patent/KR20220062230A/ko active Search and Examination
- 2020-05-12 AU AU2020274089A patent/AU2020274089A1/en not_active Abandoned
- 2020-05-12 WO PCT/US2020/032512 patent/WO2020232017A2/en unknown
- 2020-05-12 JP JP2021568479A patent/JP2022537640A/ja active Pending
- 2020-05-12 MX MX2021013900A patent/MX2021013900A/es unknown
2021
- 2021-11-11 IL IL288015A patent/IL288015A/en unknown
Also Published As
Publication number | Publication date |
---|---|
SG11202112632UA (en) | 2021-12-30 |
EP4004196A2 (en) | 2022-06-01 |
WO2020232017A3 (en) | 2020-12-30 |
AU2020274089A1 (en) | 2022-01-20 |
IL288015A (en) | 2022-01-01 |
EP4004196A4 (en) | 2024-01-17 |
JP2022537640A (ja) | 2022-08-29 |
BR112021022900A8 (pt) | 2022-08-30 |
US20230148256A1 (en) | 2023-05-11 |
CN114423861A (zh) | 2022-04-29 |
BR112021022900A2 (pt) | 2022-06-07 |
WO2020232017A2 (en) | 2020-11-19 |
MX2021013900A (es) | 2022-04-27 |
CA3140144A1 (en) | 2020-11-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
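JP5469786B2 (ja) | | Methods and compositions for enhanced protein expression and purification |
KR20170085129A (ko) | | Fusion partner for peptide production |
Schwarze et al. | | Requirements for construction of a functional hybrid complex of photosystem I and [NiFe]-hydrogenase |
CN110564708B (zh) | | Recombinant phospholipase D and its use in synthesizing phosphatidylserine or other phospholipids |
US20230399379A1 (en) | | Expression of collagen peptide components in prokaryotic systems |
KR101304735B1 (ko) | | Method for producing a protein having a triple-helix structure |
Nogueira et al. | | High-level secretion of recombinant full-length streptavidin in Pichia pastoris and its application to enantioselective catalysis |
CN111378047A (zh) | | Fusion tag protein for improving protein expression and its use |
CN114761553A (zh) | | Nucleic acids, vectors, host cells and methods for producing β-fructofuranosidase from Aspergillus niger |
KR20220062230A (ko) | | Expression of modified proteins in a peroxisome |
CN113201074B (zh) | | PKEK fusion protein, preparation method and use thereof |
KR20030022274A (ko) | | Method for producing hydroxylated collagen-like compounds |
JP3896460B2 (ja) | | Method for producing non-natural protein, immobilization method, and kit |
CN108179160B (zh) | | Method for preparing phytanol-linked high-mannose oligosaccharides |
CN113151377A (zh) | | Enzymatic method for preparing glucosamine from glucose, and use of the enzymes |
JP5425350B2 (ja) | | Compositions and methods for producing enterokinase in yeast |
EP3103881A1 (en) | | Method for producing peptides having azole-derived skeleton |
CN114746548A (zh) | | Nucleic acids, vectors, host cells and methods for producing fructosyltransferase from Aspergillus japonicus |
CN112513259A (zh) | | Peptide macrocyclase |
CN115838712B (zh) | | Protease with carnosine hydrolase function and its use in L-carnosine synthesis |
JP2011239686A (ja) | | Method for producing peptides and peptide derivatives |
EP3676370B1 (en) | | Compositions and methods using methanotrophic s-layer proteins for expression of heterologous proteins |
Li et al. | | High-level expression of biotin ligase BirA from Escherichia coli K12 in Pichia pastoris KM71 |
JP4252299B2 (ja) | | Novel disulfide oxidoreductase and method for activating proteins using the enzyme |
KR101833427B1 (ko) | | Method for producing ε-caprolactam using a novel caprolactam-converting enzyme |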
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | A201 | Request for examination | |