KR20230088911A - Bacterial microcompartment virus-like particles - Google Patents
Bacterial microcompartment virus-like particles
- Publication number
- KR20230088911A KR1020237016963A
- Authority
- KR
- South Korea
- Prior art keywords
- seq
- shell
- gly
- val
- cso
- Prior art date
- 230000001580 bacterial effect Effects 0.000 title claims abstract description 48
- 239000002245 particle Substances 0.000 title claims abstract description 22
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 51
- 108010001267 Protein Subunits Proteins 0.000 claims abstract description 44
- 102000002067 Protein Subunits Human genes 0.000 claims abstract description 44
- 238000000034 method Methods 0.000 claims abstract description 36
- 241000605178 Halothiobacillus neapolitanus Species 0.000 claims abstract description 26
- 238000005538 encapsulation Methods 0.000 claims abstract description 22
- 238000004519 manufacturing process Methods 0.000 claims abstract description 13
- 241001600172 Haliangium ochraceum Species 0.000 claims abstract description 10
- 108091033319 polynucleotide Proteins 0.000 claims abstract description 6
- 102000040430 polynucleotide Human genes 0.000 claims abstract description 6
- 239000002157 polynucleotide Substances 0.000 claims abstract description 6
- 239000013612 plasmid Substances 0.000 claims description 118
- 108010028230 Trp-Ser-His-Pro-Gln-Phe-Glu-Lys Proteins 0.000 claims description 73
- 240000004808 Saccharomyces cerevisiae Species 0.000 claims description 70
- 102000004190 Enzymes Human genes 0.000 claims description 65
- 108090000790 Enzymes Proteins 0.000 claims description 65
- 241000588724 Escherichia coli Species 0.000 claims description 39
- 150000001413 amino acids Chemical class 0.000 claims description 36
- 239000000203 mixture Substances 0.000 claims description 23
- 230000014509 gene expression Effects 0.000 claims description 22
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 13
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 claims description 12
- 150000007523 nucleic acids Chemical class 0.000 claims description 11
- 230000001105 regulatory effect Effects 0.000 claims description 11
- 238000006243 chemical reaction Methods 0.000 claims description 8
- 108091006047 fluorescent proteins Proteins 0.000 claims description 8
- 102000034287 fluorescent proteins Human genes 0.000 claims description 8
- 230000002068 genetic effect Effects 0.000 claims description 8
- 108020004707 nucleic acids Proteins 0.000 claims description 7
- 102000039446 nucleic acids Human genes 0.000 claims description 7
- 101100118148 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) YEF3 gene Proteins 0.000 claims description 6
- 238000011282 treatment Methods 0.000 claims description 6
- 241000341975 Haliangium Species 0.000 claims description 5
- 230000000295 complement effect Effects 0.000 claims description 5
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims description 5
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 4
- 101150085381 CDC19 gene Proteins 0.000 claims description 4
- 101100234604 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) ace-8 gene Proteins 0.000 claims description 4
- 101150093629 PYK1 gene Proteins 0.000 claims description 4
- 201000010099 disease Diseases 0.000 claims description 4
- 239000003814 drug Substances 0.000 claims description 4
- 230000002163 immunogen Effects 0.000 claims description 4
- 229960005486 vaccine Drugs 0.000 claims description 4
- 230000003851 biochemical process Effects 0.000 claims description 3
- 230000002265 prevention Effects 0.000 claims description 2
- 229940002612 prodrug Drugs 0.000 claims description 2
- 239000000651 prodrug Substances 0.000 claims description 2
- 229940124597 therapeutic agent Drugs 0.000 claims description 2
- 125000003275 alpha amino acid group Chemical group 0.000 claims 8
- 230000000069 prophylactic effect Effects 0.000 claims 1
- 238000002560 therapeutic procedure Methods 0.000 claims 1
- 229910052739 hydrogen Inorganic materials 0.000 abstract description 2
- 229910052698 phosphorus Inorganic materials 0.000 abstract description 2
- 108090000623 proteins and genes Proteins 0.000 description 105
- 108020004414 DNA Proteins 0.000 description 93
- 102000004169 proteins and genes Human genes 0.000 description 86
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 67
- 229940088598 enzyme Drugs 0.000 description 62
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 52
- 230000037361 pathway Effects 0.000 description 40
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 39
- 239000002773 nucleotide Substances 0.000 description 27
- 125000003729 nucleotide group Chemical group 0.000 description 27
- 239000011780 sodium chloride Substances 0.000 description 26
- 230000000694 effects Effects 0.000 description 19
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 18
- 238000005349 anion exchange Methods 0.000 description 16
- 239000000872 buffer Substances 0.000 description 16
- 108060006004 Ascorbate peroxidase Proteins 0.000 description 15
- 210000004027 cell Anatomy 0.000 description 15
- 238000000746 purification Methods 0.000 description 15
- 238000005259 measurement Methods 0.000 description 14
- 239000000523 sample Substances 0.000 description 14
- 239000006166 lysate Substances 0.000 description 13
- 230000006870 function Effects 0.000 description 12
- 239000005090 green fluorescent protein Substances 0.000 description 12
- 239000013615 primer Substances 0.000 description 12
- 230000015572 biosynthetic process Effects 0.000 description 11
- 238000010828 elution Methods 0.000 description 11
- 108010005233 alanylglutamic acid Proteins 0.000 description 10
- 238000004458 analytical method Methods 0.000 description 10
- 108010050848 glycylleucine Proteins 0.000 description 10
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 9
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 9
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 9
- 238000003556 assay Methods 0.000 description 9
- 238000002296 dynamic light scattering Methods 0.000 description 9
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 8
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 8
- UAJAYRMZGNQILN-BQBZGAKWSA-N Ser-Gly-Met Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O UAJAYRMZGNQILN-BQBZGAKWSA-N 0.000 description 8
- 108010047857 aspartylglycine Proteins 0.000 description 8
- 238000004891 communication Methods 0.000 description 8
- 230000010354 integration Effects 0.000 description 8
- 108010057821 leucylproline Proteins 0.000 description 8
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 8
- 108010061238 threonyl-glycine Proteins 0.000 description 8
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 7
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 7
- CNGOEHJCLVCJHN-SRVKXCTJSA-N Lys-Pro-Glu Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O CNGOEHJCLVCJHN-SRVKXCTJSA-N 0.000 description 7
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 7
- HEUVHBXOVZONPU-BJDJZHNGSA-N Ser-Leu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HEUVHBXOVZONPU-BJDJZHNGSA-N 0.000 description 7
- CYCGARJWIQWPQM-YJRXYDGGSA-N Thr-Tyr-Ser Chemical compound C[C@@H](O)[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CO)C([O-])=O)CC1=CC=C(O)C=C1 CYCGARJWIQWPQM-YJRXYDGGSA-N 0.000 description 7
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 7
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 7
- 108010015792 glycyllysine Proteins 0.000 description 7
- 108010081551 glycylphenylalanine Proteins 0.000 description 7
- 239000000047 product Substances 0.000 description 7
- 238000012552 review Methods 0.000 description 7
- 238000013518 transcription Methods 0.000 description 7
- 230000035897 transcription Effects 0.000 description 7
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 6
- 101100512078 Caenorhabditis elegans lys-1 gene Proteins 0.000 description 6
- BZKDJRSZWLPJNI-SRVKXCTJSA-N His-His-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O BZKDJRSZWLPJNI-SRVKXCTJSA-N 0.000 description 6
- 108010065920 Insulin Lispro Proteins 0.000 description 6
- LRALLISKBZNSKN-BQBZGAKWSA-N Met-Gly-Ser Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LRALLISKBZNSKN-BQBZGAKWSA-N 0.000 description 6
- IOVBCLGAJJXOHK-SRVKXCTJSA-N Ser-His-His Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 IOVBCLGAJJXOHK-SRVKXCTJSA-N 0.000 description 6
- 238000003917 TEM image Methods 0.000 description 6
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 6
- 108010077245 asparaginyl-proline Proteins 0.000 description 6
- 108010092854 aspartyllysine Proteins 0.000 description 6
- 230000002759 chromosomal effect Effects 0.000 description 6
- 230000008045 co-localization Effects 0.000 description 6
- 230000004927 fusion Effects 0.000 description 6
- 108010078144 glutaminyl-glycine Proteins 0.000 description 6
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 6
- 108010028295 histidylhistidine Proteins 0.000 description 6
- 230000003993 interaction Effects 0.000 description 6
- 239000007788 liquid Substances 0.000 description 6
- 102000004196 processed proteins & peptides Human genes 0.000 description 6
- 238000001228 spectrum Methods 0.000 description 6
- 108010080629 tryptophan-leucine Proteins 0.000 description 6
- 238000001262 western blot Methods 0.000 description 6
- NHCPCLJZRSIDHS-ZLUOBGJFSA-N Ala-Asp-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O NHCPCLJZRSIDHS-ZLUOBGJFSA-N 0.000 description 5
- OMMDTNGURYRDAC-NRPADANISA-N Ala-Glu-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OMMDTNGURYRDAC-NRPADANISA-N 0.000 description 5
- 241000894006 Bacteria Species 0.000 description 5
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 5
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 5
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 5
- 101000896205 Haliangium ochraceum (strain DSM 14365 / JCM 11303 / SMP-2) Bacterial microcompartment shell vertex protein Proteins 0.000 description 5
- 241000588747 Klebsiella pneumoniae Species 0.000 description 5
- 102000003960 Ligases Human genes 0.000 description 5
- 108090000364 Ligases Proteins 0.000 description 5
- PDIDTSZKKFEDMB-UWVGGRQHSA-N Lys-Pro-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O PDIDTSZKKFEDMB-UWVGGRQHSA-N 0.000 description 5
- LOGFVTREOLYCPF-RHYQMDGZSA-N Lys-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN LOGFVTREOLYCPF-RHYQMDGZSA-N 0.000 description 5
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 5
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 5
- 101001010478 Rhodospirillum rubrum (strain F11) Bacterial microcompartment shell vertex protein GrpN Proteins 0.000 description 5
- LRWBCWGEUCKDTN-BJDJZHNGSA-N Ser-Lys-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LRWBCWGEUCKDTN-BJDJZHNGSA-N 0.000 description 5
- UHBPFYOQQPFKQR-JHEQGTHGSA-N Thr-Gln-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O UHBPFYOQQPFKQR-JHEQGTHGSA-N 0.000 description 5
- 229940098773 bovine serum albumin Drugs 0.000 description 5
- 238000010586 diagram Methods 0.000 description 5
- 238000001493 electron microscopy Methods 0.000 description 5
- 239000001963 growth medium Substances 0.000 description 5
- LHGVFZTZFXWLCP-UHFFFAOYSA-N guaiacol Chemical compound COC1=CC=CC=C1O LHGVFZTZFXWLCP-UHFFFAOYSA-N 0.000 description 5
- 238000011534 incubation Methods 0.000 description 5
- 239000002609 medium Substances 0.000 description 5
- 239000000178 monomer Substances 0.000 description 5
- 239000000243 solution Substances 0.000 description 5
- 230000008685 targeting Effects 0.000 description 5
- 230000002103 transcriptional effect Effects 0.000 description 5
- 238000004627 transmission electron microscopy Methods 0.000 description 5
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 5
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 4
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 4
- WQVFQXXBNHHPLX-ZKWXMUAHSA-N Ala-Ala-His Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O WQVFQXXBNHHPLX-ZKWXMUAHSA-N 0.000 description 4
- IEAUDUOCWNPZBR-LKTVYLICSA-N Ala-Trp-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N IEAUDUOCWNPZBR-LKTVYLICSA-N 0.000 description 4
- NGYHSXDNNOFHNE-AVGNSLFASA-N Arg-Pro-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O NGYHSXDNNOFHNE-AVGNSLFASA-N 0.000 description 4
- MFFOYNGMOYFPBD-DCAQKATOSA-N Asn-Arg-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O MFFOYNGMOYFPBD-DCAQKATOSA-N 0.000 description 4
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 4
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 4
- 101710139569 Bacterial microcompartment shell protein EutM Proteins 0.000 description 4
- 101710091770 Bacterial microcompartment shell protein PduA Proteins 0.000 description 4
- 101710093019 Bacterial microcompartment shell protein PduU Proteins 0.000 description 4
- 241000282326 Felis catus Species 0.000 description 4
- CRRFJBGUGNNOCS-PEFMBERDSA-N Gln-Asp-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CRRFJBGUGNNOCS-PEFMBERDSA-N 0.000 description 4
- NSEKYCAADBNQFE-XIRDDKMYSA-N Gln-Trp-Arg Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(N)=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 NSEKYCAADBNQFE-XIRDDKMYSA-N 0.000 description 4
- RUFHOVYUYSNDNY-ACZMJKKPSA-N Glu-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O RUFHOVYUYSNDNY-ACZMJKKPSA-N 0.000 description 4
- UGSVSNXPJJDJKL-SDDRHHMPSA-N Glu-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N UGSVSNXPJJDJKL-SDDRHHMPSA-N 0.000 description 4
- AIJAPFVDBFYNKN-WHFBIAKZSA-N Gly-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN)C(=O)N AIJAPFVDBFYNKN-WHFBIAKZSA-N 0.000 description 4
- NPSWCZIRBAYNSB-JHEQGTHGSA-N Gly-Gln-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NPSWCZIRBAYNSB-JHEQGTHGSA-N 0.000 description 4
- QPDUVFSVVAOUHE-XVKPBYJWSA-N Gly-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)CN)C(O)=O QPDUVFSVVAOUHE-XVKPBYJWSA-N 0.000 description 4
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 4
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 4
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 4
- 241000880493 Leptailurus serval Species 0.000 description 4
- KYIIALJHAOIAHF-KKUMJFAQSA-N Leu-Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KYIIALJHAOIAHF-KKUMJFAQSA-N 0.000 description 4
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 4
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 4
- JLLJTMHNXQTMCK-UBHSHLNASA-N Phe-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 JLLJTMHNXQTMCK-UBHSHLNASA-N 0.000 description 4
- NJJBATPLUQHRBM-IHRRRGAJSA-N Phe-Pro-Ser Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CO)C(=O)O NJJBATPLUQHRBM-IHRRRGAJSA-N 0.000 description 4
- UVKNEILZSJMKSR-FXQIFTODSA-N Pro-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 UVKNEILZSJMKSR-FXQIFTODSA-N 0.000 description 4
- NAIPAPCKKRCMBL-JYJNAYRXSA-N Pro-Pro-Phe Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1N(CCC1)C(=O)[C@H]1NCCC1)C1=CC=CC=C1 NAIPAPCKKRCMBL-JYJNAYRXSA-N 0.000 description 4
- PRKWBYCXBBSLSK-GUBZILKMSA-N Pro-Ser-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O PRKWBYCXBBSLSK-GUBZILKMSA-N 0.000 description 4
- 101100415710 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) RPL41B gene Proteins 0.000 description 4
- 101100090299 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) RPL42B gene Proteins 0.000 description 4
- 241001138501 Salmonella enterica Species 0.000 description 4
- 101100415709 Schizosaccharomyces pombe (strain 972 / ATCC 24843) rpl4102 gene Proteins 0.000 description 4
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 4
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 4
- DJDSEDOKJTZBAR-ZDLURKLDSA-N Thr-Gly-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O DJDSEDOKJTZBAR-ZDLURKLDSA-N 0.000 description 4
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 4
- SUEGAFMNTXXNLR-WFBYXXMGSA-N Trp-Ser-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O SUEGAFMNTXXNLR-WFBYXXMGSA-N 0.000 description 4
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 4
- SLLKXDSRVAOREO-KZVJFYERSA-N Val-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N)O SLLKXDSRVAOREO-KZVJFYERSA-N 0.000 description 4
- UDNYEPLJTRDMEJ-RCOVLWMOSA-N Val-Asn-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N UDNYEPLJTRDMEJ-RCOVLWMOSA-N 0.000 description 4
- XWYUBUYQMOUFRQ-IFFSRLJSSA-N Val-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N)O XWYUBUYQMOUFRQ-IFFSRLJSSA-N 0.000 description 4
- OTJMMKPMLUNTQT-AVGNSLFASA-N Val-Leu-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N OTJMMKPMLUNTQT-AVGNSLFASA-N 0.000 description 4
- 238000001261 affinity purification Methods 0.000 description 4
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 4
- 108010070944 alanylhistidine Proteins 0.000 description 4
- 108010087924 alanylproline Proteins 0.000 description 4
- 229940024606 amino acid Drugs 0.000 description 4
- 108010060035 arginylproline Proteins 0.000 description 4
- 108010093581 aspartyl-proline Proteins 0.000 description 4
- 210000004899 c-terminal region Anatomy 0.000 description 4
- 230000015556 catabolic process Effects 0.000 description 4
- 238000006731 degradation reaction Methods 0.000 description 4
- 238000001514 detection method Methods 0.000 description 4
- 238000009826 distribution Methods 0.000 description 4
- 238000009472 formulation Methods 0.000 description 4
- 238000007710 freezing Methods 0.000 description 4
- 108010049041 glutamylalanine Proteins 0.000 description 4
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 4
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 4
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 4
- 108010038983 glycyl-histidyl-lysine Proteins 0.000 description 4
- 108010020688 glycylhistidine Proteins 0.000 description 4
- 108010087823 glycyltyrosine Proteins 0.000 description 4
- 108010040030 histidinoalanine Proteins 0.000 description 4
- 108010025306 histidylleucine Proteins 0.000 description 4
- 108010085325 histidylproline Proteins 0.000 description 4
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 4
- 229930027917 kanamycin Natural products 0.000 description 4
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 4
- 229960000318 kanamycin Drugs 0.000 description 4
- 229930182823 kanamycin A Natural products 0.000 description 4
- 108010034529 leucyl-lysine Proteins 0.000 description 4
- 238000011068 loading method Methods 0.000 description 4
- 108010017391 lysylvaline Proteins 0.000 description 4
- 238000012423 maintenance Methods 0.000 description 4
- 108010005942 methionylglycine Proteins 0.000 description 4
- 238000012986 modification Methods 0.000 description 4
- 230000004048 modification Effects 0.000 description 4
- 235000015097 nutrients Nutrition 0.000 description 4
- 108010024654 phenylalanyl-prolyl-alanine Proteins 0.000 description 4
- 108010018625 phenylalanylarginine Proteins 0.000 description 4
- 108010051242 phenylalanylserine Proteins 0.000 description 4
- YBYRMVIVWMBXKQ-UHFFFAOYSA-N phenylmethanesulfonyl fluoride Chemical compound FS(=O)(=O)CC1=CC=CC=C1 YBYRMVIVWMBXKQ-UHFFFAOYSA-N 0.000 description 4
- 108010070643 prolylglutamic acid Proteins 0.000 description 4
- 108010015796 prolylisoleucine Proteins 0.000 description 4
- 238000011002 quantification Methods 0.000 description 4
- 238000011160 research Methods 0.000 description 4
- 108091008146 restriction endonucleases Proteins 0.000 description 4
- 230000002441 reversible effect Effects 0.000 description 4
- 108010048818 seryl-histidine Proteins 0.000 description 4
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 4
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 4
- 239000002904 solvent Substances 0.000 description 4
- 241000894007 species Species 0.000 description 4
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 4
- 108010051110 tyrosyl-lysine Proteins 0.000 description 4
- JEPNLGMEZMCFEX-QSFUFRPTSA-N Ala-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C)N JEPNLGMEZMCFEX-QSFUFRPTSA-N 0.000 description 3
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 3
- VKKYFICVTYKFIO-CIUDSAMLSA-N Arg-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N VKKYFICVTYKFIO-CIUDSAMLSA-N 0.000 description 3
- AGVNTAUPLWIQEN-ZPFDUUQYSA-N Arg-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AGVNTAUPLWIQEN-ZPFDUUQYSA-N 0.000 description 3
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 3
- BLQBMRNMBAYREH-UWJYBYFXSA-N Asp-Ala-Tyr Chemical compound N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O BLQBMRNMBAYREH-UWJYBYFXSA-N 0.000 description 3
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 3
- LNENWJXDHCFVOF-DCAQKATOSA-N Asp-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N LNENWJXDHCFVOF-DCAQKATOSA-N 0.000 description 3
- 108700038233 Bacterial microcompartment shell protein EutL Proteins 0.000 description 3
- 108700038232 Bacterial microcompartment shell protein PduB Proteins 0.000 description 3
- 241000192700 Cyanobacteria Species 0.000 description 3
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 3
- WSFSSNUMVMOOMR-UHFFFAOYSA-N Formaldehyde Chemical compound O=C WSFSSNUMVMOOMR-UHFFFAOYSA-N 0.000 description 3
- 101150081655 GPM1 gene Proteins 0.000 description 3
- PXXGVUVQWQGGIG-YUMQZZPRSA-N Glu-Gly-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N PXXGVUVQWQGGIG-YUMQZZPRSA-N 0.000 description 3
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 3
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 3
- XTQFHTHIAKKCTM-YFKPBYRVSA-N Gly-Glu-Gly Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O XTQFHTHIAKKCTM-YFKPBYRVSA-N 0.000 description 3
- UTYGDAHJBBDPBA-BYULHYEWSA-N Gly-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN UTYGDAHJBBDPBA-BYULHYEWSA-N 0.000 description 3
- BMWFDYIYBAFROD-WPRPVWTQSA-N Gly-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN BMWFDYIYBAFROD-WPRPVWTQSA-N 0.000 description 3
- 108010093488 His-His-His-His-His-His Proteins 0.000 description 3
- 101000583156 Homo sapiens Pituitary homeobox 1 Proteins 0.000 description 3
- GVKKVHNRTUFCCE-BJDJZHNGSA-N Ile-Leu-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)O)N GVKKVHNRTUFCCE-BJDJZHNGSA-N 0.000 description 3
- 108010029660 Intrinsically Disordered Proteins Proteins 0.000 description 3
- 125000000174 L-prolyl group Chemical group [H]N1C([H])([H])C([H])([H])C([H])([H])[C@@]1([H])C(*)=O 0.000 description 3
- KSZCCRIGNVSHFH-UWVGGRQHSA-N Leu-Arg-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O KSZCCRIGNVSHFH-UWVGGRQHSA-N 0.000 description 3
- JNDYEOUZBLOVOF-AVGNSLFASA-N Leu-Leu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JNDYEOUZBLOVOF-AVGNSLFASA-N 0.000 description 3
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 3
- GQUDMNDPQTXZRV-DCAQKATOSA-N Lys-Arg-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O GQUDMNDPQTXZRV-DCAQKATOSA-N 0.000 description 3
- ISHNZELVUVPCHY-ZETCQYMHSA-N Lys-Gly-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O ISHNZELVUVPCHY-ZETCQYMHSA-N 0.000 description 3
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 3
- GODBLDDYHFTUAH-CIUDSAMLSA-N Met-Asp-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O GODBLDDYHFTUAH-CIUDSAMLSA-N 0.000 description 3
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 3
- 108700026244 Open Reading Frames Proteins 0.000 description 3
- 108091005804 Peptidases Proteins 0.000 description 3
- NEHSHYOUIWBYSA-DCPHZVHLSA-N Phe-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=CC=C3)N NEHSHYOUIWBYSA-DCPHZVHLSA-N 0.000 description 3
- MWQXFDIQXIXPMS-UNQGMJICSA-N Phe-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC1=CC=CC=C1)N)O MWQXFDIQXIXPMS-UNQGMJICSA-N 0.000 description 3
- 102100030345 Pituitary homeobox 1 Human genes 0.000 description 3
- NMELOOXSGDRBRU-YUMQZZPRSA-N Pro-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1 NMELOOXSGDRBRU-YUMQZZPRSA-N 0.000 description 3
- 239000004365 Protease Substances 0.000 description 3
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 3
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 3
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 3
- 239000007983 Tris buffer Substances 0.000 description 3
- BURPTJBFWIOHEY-UWJYBYFXSA-N Tyr-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BURPTJBFWIOHEY-UWJYBYFXSA-N 0.000 description 3
- MJFSRZZJQWZHFQ-SRVKXCTJSA-N Val-Met-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)O)N MJFSRZZJQWZHFQ-SRVKXCTJSA-N 0.000 description 3
- 238000002835 absorbance Methods 0.000 description 3
- 238000013019 agitation Methods 0.000 description 3
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 3
- 230000000712 assembly Effects 0.000 description 3
- 238000000429 assembly Methods 0.000 description 3
- 229910052799 carbon Inorganic materials 0.000 description 3
- 230000003197 catalytic effect Effects 0.000 description 3
- 239000003153 chemical reaction reagent Substances 0.000 description 3
- 150000001875 compounds Chemical class 0.000 description 3
- 239000013078 crystal Substances 0.000 description 3
- 210000000172 cytosol Anatomy 0.000 description 3
- 238000013480 data collection Methods 0.000 description 3
- 238000004925 denaturation Methods 0.000 description 3
- 230000036425 denaturation Effects 0.000 description 3
- 238000010217 densitometric analysis Methods 0.000 description 3
- 230000029087 digestion Effects 0.000 description 3
- 239000000499 gel Substances 0.000 description 3
- 239000008103 glucose Substances 0.000 description 3
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 3
- 108010089804 glycyl-threonine Proteins 0.000 description 3
- 108010077515 glycylproline Proteins 0.000 description 3
- 101150084612 gpmA gene Proteins 0.000 description 3
- 238000003119 immunoblot Methods 0.000 description 3
- 230000001965 increasing effect Effects 0.000 description 3
- 238000003780 insertion Methods 0.000 description 3
- 230000037431 insertion Effects 0.000 description 3
- 230000003834 intracellular effect Effects 0.000 description 3
- 238000010369 molecular cloning Methods 0.000 description 3
- 238000002887 multiple sequence alignment Methods 0.000 description 3
- 239000008194 pharmaceutical composition Substances 0.000 description 3
- 108010053725 prolylvaline Proteins 0.000 description 3
- 238000010379 pull-down assay Methods 0.000 description 3
- 238000003259 recombinant expression Methods 0.000 description 3
- 230000002829 reductive effect Effects 0.000 description 3
- 230000000717 retained effect Effects 0.000 description 3
- 230000035939 shock Effects 0.000 description 3
- 239000000126 substance Substances 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 239000013638 trimer Substances 0.000 description 3
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 3
- 108010020532 tyrosyl-proline Proteins 0.000 description 3
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 2
- QFVHZQCOUORWEI-UHFFFAOYSA-N 4-[(4-anilino-5-sulfonaphthalen-1-yl)diazenyl]-5-hydroxynaphthalene-2,7-disulfonic acid Chemical compound C=12C(O)=CC(S(O)(=O)=O)=CC2=CC(S(O)(=O)=O)=CC=1N=NC(C1=CC=CC(=C11)S(O)(=O)=O)=CC=C1NC1=CC=CC=C1 QFVHZQCOUORWEI-UHFFFAOYSA-N 0.000 description 2
- 102100023415 40S ribosomal protein S20 Human genes 0.000 description 2
- 241000266272 Acidithiobacillus Species 0.000 description 2
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 2
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N Ala-Ala-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 2
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 2
- UCIYCBSJBQGDGM-LPEHRKFASA-N Ala-Arg-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N UCIYCBSJBQGDGM-LPEHRKFASA-N 0.000 description 2
- STACJSVFHSEZJV-GHCJXIJMSA-N Ala-Asn-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STACJSVFHSEZJV-GHCJXIJMSA-N 0.000 description 2
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 2
- LZRNYBIJOSKKRJ-XVYDVKMFSA-N Ala-Asp-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LZRNYBIJOSKKRJ-XVYDVKMFSA-N 0.000 description 2
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 2
- LSLIRHLIUDVNBN-CIUDSAMLSA-N Ala-Asp-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LSLIRHLIUDVNBN-CIUDSAMLSA-N 0.000 description 2
- KUDREHRZRIVKHS-UWJYBYFXSA-N Ala-Asp-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KUDREHRZRIVKHS-UWJYBYFXSA-N 0.000 description 2
- IKKVASZHTMKJIR-ZKWXMUAHSA-N Ala-Asp-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IKKVASZHTMKJIR-ZKWXMUAHSA-N 0.000 description 2
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 2
- NWVVKQZOVSTDBQ-CIUDSAMLSA-N Ala-Glu-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NWVVKQZOVSTDBQ-CIUDSAMLSA-N 0.000 description 2
- NJPMYXWVWQWCSR-ACZMJKKPSA-N Ala-Glu-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NJPMYXWVWQWCSR-ACZMJKKPSA-N 0.000 description 2
- HXNNRBHASOSVPG-GUBZILKMSA-N Ala-Glu-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HXNNRBHASOSVPG-GUBZILKMSA-N 0.000 description 2
- VWEWCZSUWOEEFM-WDSKDSINSA-N Ala-Gly-Ala-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(=O)NCC(O)=O VWEWCZSUWOEEFM-WDSKDSINSA-N 0.000 description 2
- WMYJZJRILUVVRG-WDSKDSINSA-N Ala-Gly-Gln Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O WMYJZJRILUVVRG-WDSKDSINSA-N 0.000 description 2
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 2
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 2
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 2
- LXAARTARZJJCMB-CIQUZCHMSA-N Ala-Ile-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LXAARTARZJJCMB-CIQUZCHMSA-N 0.000 description 2
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 2
- OQWQTGBOFPJOIF-DLOVCJGASA-N Ala-Lys-His Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N OQWQTGBOFPJOIF-DLOVCJGASA-N 0.000 description 2
- KQESEZXHYOUIIM-CQDKDKBSSA-N Ala-Lys-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KQESEZXHYOUIIM-CQDKDKBSSA-N 0.000 description 2
- OMDNCNKNEGFOMM-BQBZGAKWSA-N Ala-Met-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O OMDNCNKNEGFOMM-BQBZGAKWSA-N 0.000 description 2
- DEWWPUNXRNGMQN-LPEHRKFASA-N Ala-Met-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N DEWWPUNXRNGMQN-LPEHRKFASA-N 0.000 description 2
- AWNAEZICPNGAJK-FXQIFTODSA-N Ala-Met-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O AWNAEZICPNGAJK-FXQIFTODSA-N 0.000 description 2
- HYIDEIQUCBKIPL-CQDKDKBSSA-N Ala-Phe-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N HYIDEIQUCBKIPL-CQDKDKBSSA-N 0.000 description 2
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 2
- IORKCNUBHNIMKY-CIUDSAMLSA-N Ala-Pro-Glu Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O IORKCNUBHNIMKY-CIUDSAMLSA-N 0.000 description 2
- ADSGHMXEAZJJNF-DCAQKATOSA-N Ala-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N ADSGHMXEAZJJNF-DCAQKATOSA-N 0.000 description 2
- YHBDGLZYNIARKJ-GUBZILKMSA-N Ala-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N YHBDGLZYNIARKJ-GUBZILKMSA-N 0.000 description 2
- RMAWDDRDTRSZIR-ZLUOBGJFSA-N Ala-Ser-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RMAWDDRDTRSZIR-ZLUOBGJFSA-N 0.000 description 2
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 2
- YNOCMHZSWJMGBB-GCJQMDKQSA-N Ala-Thr-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O YNOCMHZSWJMGBB-GCJQMDKQSA-N 0.000 description 2
- QOIGKCBMXUCDQU-KDXUFGMBSA-N Ala-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N)O QOIGKCBMXUCDQU-KDXUFGMBSA-N 0.000 description 2
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 2
- TVUFMYKTYXTRPY-HERUPUMHSA-N Ala-Trp-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O TVUFMYKTYXTRPY-HERUPUMHSA-N 0.000 description 2
- IYKVSFNGSWTTNZ-GUBZILKMSA-N Ala-Val-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IYKVSFNGSWTTNZ-GUBZILKMSA-N 0.000 description 2
- YJHKTAMKPGFJCT-NRPADANISA-N Ala-Val-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O YJHKTAMKPGFJCT-NRPADANISA-N 0.000 description 2
- XKHLBBQNPSOGPI-GUBZILKMSA-N Ala-Val-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N XKHLBBQNPSOGPI-GUBZILKMSA-N 0.000 description 2
- DBKNLHKEVPZVQC-LPEHRKFASA-N Arg-Ala-Pro Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O DBKNLHKEVPZVQC-LPEHRKFASA-N 0.000 description 2
- DXQIQUIQYAGRCC-CIUDSAMLSA-N Arg-Asp-Gln Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)CN=C(N)N DXQIQUIQYAGRCC-CIUDSAMLSA-N 0.000 description 2
- HKRXJBBCQBAGIM-FXQIFTODSA-N Arg-Asp-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N)CN=C(N)N HKRXJBBCQBAGIM-FXQIFTODSA-N 0.000 description 2
- ASQYTJJWAMDISW-BPUTZDHNSA-N Arg-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N ASQYTJJWAMDISW-BPUTZDHNSA-N 0.000 description 2
- YUGFLWBWAJFGKY-BQBZGAKWSA-N Arg-Cys-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O YUGFLWBWAJFGKY-BQBZGAKWSA-N 0.000 description 2
- XLWSGICNBZGYTA-CIUDSAMLSA-N Arg-Glu-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XLWSGICNBZGYTA-CIUDSAMLSA-N 0.000 description 2
- OHYQKYUTLIPFOX-ZPFDUUQYSA-N Arg-Glu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OHYQKYUTLIPFOX-ZPFDUUQYSA-N 0.000 description 2
- XUUXCWCKKCZEAW-YFKPBYRVSA-N Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N XUUXCWCKKCZEAW-YFKPBYRVSA-N 0.000 description 2
- YKBHOXLMMPZPHQ-GMOBBJLQSA-N Arg-Ile-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O YKBHOXLMMPZPHQ-GMOBBJLQSA-N 0.000 description 2
- FFEUXEAKYRCACT-PEDHHIEDSA-N Arg-Ile-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCNC(N)=N)[C@@H](C)CC)C(O)=O FFEUXEAKYRCACT-PEDHHIEDSA-N 0.000 description 2
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 2
- NMRHDSAOIURTNT-RWMBFGLXSA-N Arg-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NMRHDSAOIURTNT-RWMBFGLXSA-N 0.000 description 2
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 2
- JEOCWTUOMKEEMF-RHYQMDGZSA-N Arg-Leu-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JEOCWTUOMKEEMF-RHYQMDGZSA-N 0.000 description 2
- JCROZIFVIYMXHM-GUBZILKMSA-N Arg-Met-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CCCN=C(N)N JCROZIFVIYMXHM-GUBZILKMSA-N 0.000 description 2
- YTMKMRSYXHBGER-IHRRRGAJSA-N Arg-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YTMKMRSYXHBGER-IHRRRGAJSA-N 0.000 description 2
- FKQITMVNILRUCQ-IHRRRGAJSA-N Arg-Phe-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O FKQITMVNILRUCQ-IHRRRGAJSA-N 0.000 description 2
- VUGWHBXPMAHEGZ-SRVKXCTJSA-N Arg-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N VUGWHBXPMAHEGZ-SRVKXCTJSA-N 0.000 description 2
- YNSUUAOAFCVINY-OSUNSFLBSA-N Arg-Thr-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YNSUUAOAFCVINY-OSUNSFLBSA-N 0.000 description 2
- ZJBUILVYSXQNSW-YTWAJWBKSA-N Arg-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ZJBUILVYSXQNSW-YTWAJWBKSA-N 0.000 description 2
- ZUVMUOOHJYNJPP-XIRDDKMYSA-N Arg-Trp-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZUVMUOOHJYNJPP-XIRDDKMYSA-N 0.000 description 2
- YHZQOSXDTFRZKU-WDSOQIARSA-N Arg-Trp-Leu Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N)=CNC2=C1 YHZQOSXDTFRZKU-WDSOQIARSA-N 0.000 description 2
- JBQORRNSZGTLCV-WDSOQIARSA-N Arg-Trp-Lys Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N)=CNC2=C1 JBQORRNSZGTLCV-WDSOQIARSA-N 0.000 description 2
- XRLOBFSLPCHYLQ-ULQDDVLXSA-N Arg-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O XRLOBFSLPCHYLQ-ULQDDVLXSA-N 0.000 description 2
- QCTOLCVIGRLMQS-HRCADAONSA-N Arg-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O QCTOLCVIGRLMQS-HRCADAONSA-N 0.000 description 2
- LXTGAOAXPSJWOU-DCAQKATOSA-N Asn-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N LXTGAOAXPSJWOU-DCAQKATOSA-N 0.000 description 2
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 2
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 2
- QYXNFROWLZPWPC-FXQIFTODSA-N Asn-Glu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QYXNFROWLZPWPC-FXQIFTODSA-N 0.000 description 2
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 2
- OLGCWMNDJTWQAG-GUBZILKMSA-N Asn-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(N)=O OLGCWMNDJTWQAG-GUBZILKMSA-N 0.000 description 2
- UBKOVSLDWIHYSY-ACZMJKKPSA-N Asn-Glu-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UBKOVSLDWIHYSY-ACZMJKKPSA-N 0.000 description 2
- GWNMUVANAWDZTI-YUMQZZPRSA-N Asn-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N GWNMUVANAWDZTI-YUMQZZPRSA-N 0.000 description 2
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 2
- XLHLPYFMXGOASD-CIUDSAMLSA-N Asn-His-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N XLHLPYFMXGOASD-CIUDSAMLSA-N 0.000 description 2
- WQLJRNRLHWJIRW-KKUMJFAQSA-N Asn-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N)O WQLJRNRLHWJIRW-KKUMJFAQSA-N 0.000 description 2
- ANPFQTJEPONRPL-UGYAYLCHSA-N Asn-Ile-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O ANPFQTJEPONRPL-UGYAYLCHSA-N 0.000 description 2
- XVBDDUPJVQXDSI-PEFMBERDSA-N Asn-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVBDDUPJVQXDSI-PEFMBERDSA-N 0.000 description 2
- GQRDIVQPSMPQME-ZPFDUUQYSA-N Asn-Ile-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O GQRDIVQPSMPQME-ZPFDUUQYSA-N 0.000 description 2
- TZFQICWZWFNIKU-KKUMJFAQSA-N Asn-Leu-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 TZFQICWZWFNIKU-KKUMJFAQSA-N 0.000 description 2
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 2
- IDUUACUJKUXKKD-VEVYYDQMSA-N Asn-Pro-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O IDUUACUJKUXKKD-VEVYYDQMSA-N 0.000 description 2
- XTMZYFMTYJNABC-ZLUOBGJFSA-N Asn-Ser-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N XTMZYFMTYJNABC-ZLUOBGJFSA-N 0.000 description 2
- GZXOUBTUAUAVHD-ACZMJKKPSA-N Asn-Ser-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GZXOUBTUAUAVHD-ACZMJKKPSA-N 0.000 description 2
- DOURAOODTFJRIC-CIUDSAMLSA-N Asn-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N DOURAOODTFJRIC-CIUDSAMLSA-N 0.000 description 2
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 2
- TZQWZQSMHDVLQL-QEJZJMRPSA-N Asn-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N TZQWZQSMHDVLQL-QEJZJMRPSA-N 0.000 description 2
- RTFXPCYMDYBZNQ-SRVKXCTJSA-N Asn-Tyr-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O RTFXPCYMDYBZNQ-SRVKXCTJSA-N 0.000 description 2
- QNNBHTFDFFFHGC-KKUMJFAQSA-N Asn-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QNNBHTFDFFFHGC-KKUMJFAQSA-N 0.000 description 2
- DXHINQUXBZNUCF-MELADBBJSA-N Asn-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC(=O)N)N)C(=O)O DXHINQUXBZNUCF-MELADBBJSA-N 0.000 description 2
- XOQYDFCQPWAMSA-KKHAAJSZSA-N Asn-Val-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOQYDFCQPWAMSA-KKHAAJSZSA-N 0.000 description 2
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 2
- KVMPVNGOKHTUHZ-GCJQMDKQSA-N Asp-Ala-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KVMPVNGOKHTUHZ-GCJQMDKQSA-N 0.000 description 2
- QHAJMRDEWNAIBQ-FXQIFTODSA-N Asp-Arg-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O QHAJMRDEWNAIBQ-FXQIFTODSA-N 0.000 description 2
- SOYOSFXLXYZNRG-CIUDSAMLSA-N Asp-Arg-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O SOYOSFXLXYZNRG-CIUDSAMLSA-N 0.000 description 2
- IXIWEFWRKIUMQX-DCAQKATOSA-N Asp-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(O)=O IXIWEFWRKIUMQX-DCAQKATOSA-N 0.000 description 2
- ILJQISGMGXRZQQ-IHRRRGAJSA-N Asp-Arg-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ILJQISGMGXRZQQ-IHRRRGAJSA-N 0.000 description 2
- QRULNKJGYQQZMW-ZLUOBGJFSA-N Asp-Asn-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QRULNKJGYQQZMW-ZLUOBGJFSA-N 0.000 description 2
- FANQWNCPNFEPGZ-WHFBIAKZSA-N Asp-Asp-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FANQWNCPNFEPGZ-WHFBIAKZSA-N 0.000 description 2
- SVFOIXMRMLROHO-SRVKXCTJSA-N Asp-Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SVFOIXMRMLROHO-SRVKXCTJSA-N 0.000 description 2
- LKIYSIYBKYLKPU-BIIVOSGPSA-N Asp-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O LKIYSIYBKYLKPU-BIIVOSGPSA-N 0.000 description 2
- CSEJMKNZDCJYGJ-XHNCKOQMSA-N Asp-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N)C(=O)O CSEJMKNZDCJYGJ-XHNCKOQMSA-N 0.000 description 2
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 2
- HSWYMWGDMPLTTH-FXQIFTODSA-N Asp-Glu-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HSWYMWGDMPLTTH-FXQIFTODSA-N 0.000 description 2
- PDECQIHABNQRHN-GUBZILKMSA-N Asp-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(O)=O PDECQIHABNQRHN-GUBZILKMSA-N 0.000 description 2
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 2
- HAFCJCDJGIOYPW-WDSKDSINSA-N Asp-Gly-Gln Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O HAFCJCDJGIOYPW-WDSKDSINSA-N 0.000 description 2
- POTCZYQVVNXUIG-BQBZGAKWSA-N Asp-Gly-Pro Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O POTCZYQVVNXUIG-BQBZGAKWSA-N 0.000 description 2
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 2
- PGUYEUCYVNZGGV-QWRGUYRKSA-N Asp-Gly-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PGUYEUCYVNZGGV-QWRGUYRKSA-N 0.000 description 2
- SWTQDYFZVOJVLL-KKUMJFAQSA-N Asp-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)O)N)O SWTQDYFZVOJVLL-KKUMJFAQSA-N 0.000 description 2
- QNMKWNONJGKJJC-NHCYSSNCSA-N Asp-Leu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O QNMKWNONJGKJJC-NHCYSSNCSA-N 0.000 description 2
- LIVXPXUVXFRWNY-CIUDSAMLSA-N Asp-Lys-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O LIVXPXUVXFRWNY-CIUDSAMLSA-N 0.000 description 2
- YTXCCDCOHIYQFC-GUBZILKMSA-N Asp-Met-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O YTXCCDCOHIYQFC-GUBZILKMSA-N 0.000 description 2
- RNAQPBOOJRDICC-BPUTZDHNSA-N Asp-Met-Trp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(=O)O)N RNAQPBOOJRDICC-BPUTZDHNSA-N 0.000 description 2
- YRZIYQGXTSBRLT-AVGNSLFASA-N Asp-Phe-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YRZIYQGXTSBRLT-AVGNSLFASA-N 0.000 description 2
- QJHOOKBAHRJPPX-QWRGUYRKSA-N Asp-Phe-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 QJHOOKBAHRJPPX-QWRGUYRKSA-N 0.000 description 2
- UCHSVZYJKJLPHF-BZSNNMDCSA-N Asp-Phe-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O UCHSVZYJKJLPHF-BZSNNMDCSA-N 0.000 description 2
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 2
- QOCFFCUFZGDHTP-NUMRIWBASA-N Asp-Thr-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QOCFFCUFZGDHTP-NUMRIWBASA-N 0.000 description 2
- GWWSUMLEWKQHLR-NUMRIWBASA-N Asp-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O GWWSUMLEWKQHLR-NUMRIWBASA-N 0.000 description 2
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 2
- PLNJUJGNLDSFOP-UWJYBYFXSA-N Asp-Tyr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PLNJUJGNLDSFOP-UWJYBYFXSA-N 0.000 description 2
- NWAHPBGBDIFUFD-KKUMJFAQSA-N Asp-Tyr-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O NWAHPBGBDIFUFD-KKUMJFAQSA-N 0.000 description 2
- XWKBWZXGNXTDKY-ZKWXMUAHSA-N Asp-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O XWKBWZXGNXTDKY-ZKWXMUAHSA-N 0.000 description 2
- XWKPSMRPIKKDDU-RCOVLWMOSA-N Asp-Val-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O XWKPSMRPIKKDDU-RCOVLWMOSA-N 0.000 description 2
- QOJJMJKTMKNFEF-ZKWXMUAHSA-N Asp-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O QOJJMJKTMKNFEF-ZKWXMUAHSA-N 0.000 description 2
- 241000823258 Betaproteobacteria bacterium Species 0.000 description 2
- 241000823281 Burkholderiales bacterium Species 0.000 description 2
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 2
- 241000588923 Citrobacter Species 0.000 description 2
- 241000588919 Citrobacter freundii Species 0.000 description 2
- PJWWRFATQTVXHA-UHFFFAOYSA-N Cyclohexylaminopropanesulfonic acid Chemical compound OS(=O)(=O)CCCNC1CCCCC1 PJWWRFATQTVXHA-UHFFFAOYSA-N 0.000 description 2
- TVYMKYUSZSVOAG-ZLUOBGJFSA-N Cys-Ala-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O TVYMKYUSZSVOAG-ZLUOBGJFSA-N 0.000 description 2
- PKNIZMPLMSKROD-BIIVOSGPSA-N Cys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N PKNIZMPLMSKROD-BIIVOSGPSA-N 0.000 description 2
- XABFFGOGKOORCG-CIUDSAMLSA-N Cys-Asp-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XABFFGOGKOORCG-CIUDSAMLSA-N 0.000 description 2
- YZKOXEJTLWZOQL-GUBZILKMSA-N Cys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CS)N YZKOXEJTLWZOQL-GUBZILKMSA-N 0.000 description 2
- OTXLNICGSXPGQF-KBIXCLLPSA-N Cys-Ile-Glu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTXLNICGSXPGQF-KBIXCLLPSA-N 0.000 description 2
- CHRCKSPMGYDLIA-SRVKXCTJSA-N Cys-Phe-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O CHRCKSPMGYDLIA-SRVKXCTJSA-N 0.000 description 2
- NAPULYCVEVVFRB-HEIBUPTGSA-N Cys-Thr-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)CS NAPULYCVEVVFRB-HEIBUPTGSA-N 0.000 description 2
- YAHZABJORDUQGO-NQXXGFSBSA-N D-ribulose 1,5-bisphosphate Chemical compound OP(=O)(O)OC[C@@H](O)[C@@H](O)C(=O)COP(O)(O)=O YAHZABJORDUQGO-NQXXGFSBSA-N 0.000 description 2
- 108010090461 DFG peptide Proteins 0.000 description 2
- 102000053602 DNA Human genes 0.000 description 2
- 238000007702 DNA assembly Methods 0.000 description 2
- MWLYSLMKFXWZPW-ZPFDUUQYSA-N Gln-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCC(N)=O MWLYSLMKFXWZPW-ZPFDUUQYSA-N 0.000 description 2
- SOBBAYVQSNXYPQ-ACZMJKKPSA-N Gln-Asn-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SOBBAYVQSNXYPQ-ACZMJKKPSA-N 0.000 description 2
- CYTSBCIIEHUPDU-ACZMJKKPSA-N Gln-Asp-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O CYTSBCIIEHUPDU-ACZMJKKPSA-N 0.000 description 2
- WLODHVXYKYHLJD-ACZMJKKPSA-N Gln-Asp-Ser Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N WLODHVXYKYHLJD-ACZMJKKPSA-N 0.000 description 2
- ALUBSZXSNSPDQV-WDSKDSINSA-N Gln-Cys-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O ALUBSZXSNSPDQV-WDSKDSINSA-N 0.000 description 2
- CITDWMLWXNUQKD-FXQIFTODSA-N Gln-Gln-Asn Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CITDWMLWXNUQKD-FXQIFTODSA-N 0.000 description 2
- PKVWNYGXMNWJSI-CIUDSAMLSA-N Gln-Gln-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O PKVWNYGXMNWJSI-CIUDSAMLSA-N 0.000 description 2
- JEFZIKRIDLHOIF-BYPYZUCNSA-N Gln-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(O)=O JEFZIKRIDLHOIF-BYPYZUCNSA-N 0.000 description 2
- MFJAPSYJQJCQDN-BQBZGAKWSA-N Gln-Gly-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O MFJAPSYJQJCQDN-BQBZGAKWSA-N 0.000 description 2
- XSBGUANSZDGULP-IUCAKERBSA-N Gln-Gly-Lys Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O XSBGUANSZDGULP-IUCAKERBSA-N 0.000 description 2
- LTXLIIZACMCQTO-GUBZILKMSA-N Gln-His-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N LTXLIIZACMCQTO-GUBZILKMSA-N 0.000 description 2
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 2
- KHNJVFYHIKLUPD-SRVKXCTJSA-N Gln-Leu-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N KHNJVFYHIKLUPD-SRVKXCTJSA-N 0.000 description 2
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 2
- SXGMGNZEHFORAV-IUCAKERBSA-N Gln-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N SXGMGNZEHFORAV-IUCAKERBSA-N 0.000 description 2
- FKXCBKCOSVIGCT-AVGNSLFASA-N Gln-Lys-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FKXCBKCOSVIGCT-AVGNSLFASA-N 0.000 description 2
- KLKYKPXITJBSNI-CIUDSAMLSA-N Gln-Met-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O KLKYKPXITJBSNI-CIUDSAMLSA-N 0.000 description 2
- CULXMOZETKLBDI-XIRDDKMYSA-N Gln-Met-Trp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCC(=O)N)N CULXMOZETKLBDI-XIRDDKMYSA-N 0.000 description 2
- HMIXCETWRYDVMO-GUBZILKMSA-N Gln-Pro-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O HMIXCETWRYDVMO-GUBZILKMSA-N 0.000 description 2
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 2
- KPNWAJMEMRCLAL-GUBZILKMSA-N Gln-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N KPNWAJMEMRCLAL-GUBZILKMSA-N 0.000 description 2
- GJLXZITZLUUXMJ-NHCYSSNCSA-N Gln-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N GJLXZITZLUUXMJ-NHCYSSNCSA-N 0.000 description 2
- FHPXTPQBODWBIY-CIUDSAMLSA-N Glu-Ala-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHPXTPQBODWBIY-CIUDSAMLSA-N 0.000 description 2
- UTKICHUQEQBDGC-ACZMJKKPSA-N Glu-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N UTKICHUQEQBDGC-ACZMJKKPSA-N 0.000 description 2
- UTKUTMJSWKKHEM-WDSKDSINSA-N Glu-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O UTKUTMJSWKKHEM-WDSKDSINSA-N 0.000 description 2
- BPDVTFBJZNBHEU-HGNGGELXSA-N Glu-Ala-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 BPDVTFBJZNBHEU-HGNGGELXSA-N 0.000 description 2
- NCWOMXABNYEPLY-NRPADANISA-N Glu-Ala-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O NCWOMXABNYEPLY-NRPADANISA-N 0.000 description 2
- SRZLHYPAOXBBSB-HJGDQZAQSA-N Glu-Arg-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SRZLHYPAOXBBSB-HJGDQZAQSA-N 0.000 description 2
- DYFJZDDQPNIPAB-NHCYSSNCSA-N Glu-Arg-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O DYFJZDDQPNIPAB-NHCYSSNCSA-N 0.000 description 2
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 2
- RDPOETHPAQEGDP-ACZMJKKPSA-N Glu-Asp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RDPOETHPAQEGDP-ACZMJKKPSA-N 0.000 description 2
- VAIWPXWHWAPYDF-FXQIFTODSA-N Glu-Asp-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O VAIWPXWHWAPYDF-FXQIFTODSA-N 0.000 description 2
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 2
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 2
- MTAOBYXRYJZRGQ-WDSKDSINSA-N Glu-Gly-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MTAOBYXRYJZRGQ-WDSKDSINSA-N 0.000 description 2
- OGNJZUXUTPQVBR-BQBZGAKWSA-N Glu-Gly-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OGNJZUXUTPQVBR-BQBZGAKWSA-N 0.000 description 2
- OPAINBJQDQTGJY-JGVFFNPUSA-N Glu-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)O)N)C(=O)O OPAINBJQDQTGJY-JGVFFNPUSA-N 0.000 description 2
- XMPAXPSENRSOSV-RYUDHWBXSA-N Glu-Gly-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XMPAXPSENRSOSV-RYUDHWBXSA-N 0.000 description 2
- VGOFRWOTSXVPAU-SDDRHHMPSA-N Glu-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCC(=O)O)N)C(=O)O VGOFRWOTSXVPAU-SDDRHHMPSA-N 0.000 description 2
- GXMXPCXXKVWOSM-KQXIARHKSA-N Glu-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N GXMXPCXXKVWOSM-KQXIARHKSA-N 0.000 description 2
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 2
- VSRCAOIHMGCIJK-SRVKXCTJSA-N Glu-Leu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VSRCAOIHMGCIJK-SRVKXCTJSA-N 0.000 description 2
- VMKCPNBBPGGQBJ-GUBZILKMSA-N Glu-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N VMKCPNBBPGGQBJ-GUBZILKMSA-N 0.000 description 2
- ATVYZJGOZLVXDK-IUCAKERBSA-N Glu-Leu-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O ATVYZJGOZLVXDK-IUCAKERBSA-N 0.000 description 2
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 2
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 2
- SJJHXJDSNQJMMW-SRVKXCTJSA-N Glu-Lys-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O SJJHXJDSNQJMMW-SRVKXCTJSA-N 0.000 description 2
- CBWKURKPYSLMJV-SOUVJXGZSA-N Glu-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CCC(=O)O)N)C(=O)O CBWKURKPYSLMJV-SOUVJXGZSA-N 0.000 description 2
- TZXOPHFCAATANZ-QEJZJMRPSA-N Glu-Ser-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N TZXOPHFCAATANZ-QEJZJMRPSA-N 0.000 description 2
- DDXZHOHEABQXSE-NKIYYHGXSA-N Glu-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O DDXZHOHEABQXSE-NKIYYHGXSA-N 0.000 description 2
- MXJYXYDREQWUMS-XKBZYTNZSA-N Glu-Thr-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O MXJYXYDREQWUMS-XKBZYTNZSA-N 0.000 description 2
- JDAYMLXPUJRSDJ-XIRDDKMYSA-N Glu-Trp-Arg Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 JDAYMLXPUJRSDJ-XIRDDKMYSA-N 0.000 description 2
- HVKAAUOFFTUSAA-XDTLVQLUSA-N Glu-Tyr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O HVKAAUOFFTUSAA-XDTLVQLUSA-N 0.000 description 2
- YQPFCZVKMUVZIN-AUTRQRHGSA-N Glu-Val-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQPFCZVKMUVZIN-AUTRQRHGSA-N 0.000 description 2
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 2
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 2
- OVSKVOOUFAKODB-UWVGGRQHSA-N Gly-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OVSKVOOUFAKODB-UWVGGRQHSA-N 0.000 description 2
- XZRZILPOZBVTDB-GJZGRUSLSA-N Gly-Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCNC(N)=N)NC(=O)CN)C(O)=O)=CNC2=C1 XZRZILPOZBVTDB-GJZGRUSLSA-N 0.000 description 2
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 2
- XRTDOIOIBMAXCT-NKWVEPMBSA-N Gly-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)CN)C(=O)O XRTDOIOIBMAXCT-NKWVEPMBSA-N 0.000 description 2
- XCLCVBYNGXEVDU-WHFBIAKZSA-N Gly-Asn-Ser Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O XCLCVBYNGXEVDU-WHFBIAKZSA-N 0.000 description 2
- FUTAPPOITCCWTH-WHFBIAKZSA-N Gly-Asp-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FUTAPPOITCCWTH-WHFBIAKZSA-N 0.000 description 2
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 description 2
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 2
- TZOVVRJYUDETQG-RCOVLWMOSA-N Gly-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN TZOVVRJYUDETQG-RCOVLWMOSA-N 0.000 description 2
- GYAUWXXORNTCHU-QWRGUYRKSA-N Gly-Cys-Tyr Chemical compound NCC(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 GYAUWXXORNTCHU-QWRGUYRKSA-N 0.000 description 2
- SOEATRRYCIPEHA-BQBZGAKWSA-N Gly-Glu-Glu Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SOEATRRYCIPEHA-BQBZGAKWSA-N 0.000 description 2
- STVHDEHTKFXBJQ-LAEOZQHASA-N Gly-Glu-Ile Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STVHDEHTKFXBJQ-LAEOZQHASA-N 0.000 description 2
- PDAWDNVHMUKWJR-ZETCQYMHSA-N Gly-Gly-His Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC1=CNC=N1 PDAWDNVHMUKWJR-ZETCQYMHSA-N 0.000 description 2
- KAJAOGBVWCYGHZ-JTQLQIEISA-N Gly-Gly-Phe Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 2
- ADZGCWWDPFDHCY-ZETCQYMHSA-N Gly-His-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 ADZGCWWDPFDHCY-ZETCQYMHSA-N 0.000 description 2
- MVORZMQFXBLMHM-QWRGUYRKSA-N Gly-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 MVORZMQFXBLMHM-QWRGUYRKSA-N 0.000 description 2
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 2
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 2
- FCKPEGOCSVZPNC-WHOFXGATSA-N Gly-Ile-Phe Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FCKPEGOCSVZPNC-WHOFXGATSA-N 0.000 description 2
- IUZGUFAJDBHQQV-YUMQZZPRSA-N Gly-Leu-Asn Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IUZGUFAJDBHQQV-YUMQZZPRSA-N 0.000 description 2
- CCBIBMKQNXHNIN-ZETCQYMHSA-N Gly-Leu-Gly Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CCBIBMKQNXHNIN-ZETCQYMHSA-N 0.000 description 2
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 2
- WDEHMRNSGHVNOH-VHSXEESVSA-N Gly-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)CN)C(=O)O WDEHMRNSGHVNOH-VHSXEESVSA-N 0.000 description 2
- FJWSJWACLMTDMI-WPRPVWTQSA-N Gly-Met-Val Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O FJWSJWACLMTDMI-WPRPVWTQSA-N 0.000 description 2
- DHNXGWVNLFPOMQ-KBPBESRZSA-N Gly-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)CN DHNXGWVNLFPOMQ-KBPBESRZSA-N 0.000 description 2
- VDCRBJACQKOSMS-JSGCOSHPSA-N Gly-Phe-Val Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O VDCRBJACQKOSMS-JSGCOSHPSA-N 0.000 description 2
- ZZJVYSAQQMDIRD-UWVGGRQHSA-N Gly-Pro-His Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O ZZJVYSAQQMDIRD-UWVGGRQHSA-N 0.000 description 2
- GAAHQHNCMIAYEX-UWVGGRQHSA-N Gly-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GAAHQHNCMIAYEX-UWVGGRQHSA-N 0.000 description 2
- IXHQLZIWBCQBLQ-STQMWFEESA-N Gly-Pro-Phe Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IXHQLZIWBCQBLQ-STQMWFEESA-N 0.000 description 2
- SOEGEPHNZOISMT-BYPYZUCNSA-N Gly-Ser-Gly Chemical compound NCC(=O)N[C@@H](CO)C(=O)NCC(O)=O SOEGEPHNZOISMT-BYPYZUCNSA-N 0.000 description 2
- WNGHUXFWEWTKAO-YUMQZZPRSA-N Gly-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN WNGHUXFWEWTKAO-YUMQZZPRSA-N 0.000 description 2
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 2
- HUFUVTYGPOUCBN-MBLNEYKQSA-N Gly-Thr-Ile Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HUFUVTYGPOUCBN-MBLNEYKQSA-N 0.000 description 2
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 2
- KSOBNUBCYHGUKH-UWVGGRQHSA-N Gly-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN KSOBNUBCYHGUKH-UWVGGRQHSA-N 0.000 description 2
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 2
- 241001289523 Halothece Species 0.000 description 2
- XINDHUAGVGCNSF-QSFUFRPTSA-N His-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XINDHUAGVGCNSF-QSFUFRPTSA-N 0.000 description 2
- MBSSHYPAEHPSGY-LSJOCFKGSA-N His-Ala-Met Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O MBSSHYPAEHPSGY-LSJOCFKGSA-N 0.000 description 2
- IDNNYVGVSZMQTK-IHRRRGAJSA-N His-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N IDNNYVGVSZMQTK-IHRRRGAJSA-N 0.000 description 2
- UZZXGLOJRZKYEL-DJFWLOJKSA-N His-Asn-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UZZXGLOJRZKYEL-DJFWLOJKSA-N 0.000 description 2
- MDBYBTWRMOAJAY-NHCYSSNCSA-N His-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N MDBYBTWRMOAJAY-NHCYSSNCSA-N 0.000 description 2
- HVCRQRQPIIRNLY-IUCAKERBSA-N His-Gln-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N HVCRQRQPIIRNLY-IUCAKERBSA-N 0.000 description 2
- BQFGKVYHKCNEMF-DCAQKATOSA-N His-Glu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 BQFGKVYHKCNEMF-DCAQKATOSA-N 0.000 description 2
- QAMFAYSMNZBNCA-UWVGGRQHSA-N His-Gly-Met Chemical compound CSCC[C@H](NC(=O)CNC(=O)[C@@H](N)Cc1cnc[nH]1)C(O)=O QAMFAYSMNZBNCA-UWVGGRQHSA-N 0.000 description 2
- ORERHHPZDDEMSC-VGDYDELISA-N His-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N ORERHHPZDDEMSC-VGDYDELISA-N 0.000 description 2
- LVWIJITYHRZHBO-IXOXFDKPSA-N His-Leu-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LVWIJITYHRZHBO-IXOXFDKPSA-N 0.000 description 2
- IGBBXBFSLKRHJB-BZSNNMDCSA-N His-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 IGBBXBFSLKRHJB-BZSNNMDCSA-N 0.000 description 2
- YYOCMTFVGKDNQP-IHRRRGAJSA-N His-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N YYOCMTFVGKDNQP-IHRRRGAJSA-N 0.000 description 2
- WYSJPCTWSBJFCO-AVGNSLFASA-N His-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CN=CN1)N WYSJPCTWSBJFCO-AVGNSLFASA-N 0.000 description 2
- ZVKDCQVQTGYBQT-LSJOCFKGSA-N His-Pro-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O ZVKDCQVQTGYBQT-LSJOCFKGSA-N 0.000 description 2
- QCBYAHHNOHBXIH-UWVGGRQHSA-N His-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CN=CN1 QCBYAHHNOHBXIH-UWVGGRQHSA-N 0.000 description 2
- KAXZXLSXFWSNNZ-XVYDVKMFSA-N His-Ser-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KAXZXLSXFWSNNZ-XVYDVKMFSA-N 0.000 description 2
- STGQSBKUYSPPIG-CIUDSAMLSA-N His-Ser-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 STGQSBKUYSPPIG-CIUDSAMLSA-N 0.000 description 2
- WSWAUVHXQREQQG-JYJNAYRXSA-N His-Tyr-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O WSWAUVHXQREQQG-JYJNAYRXSA-N 0.000 description 2
- DAKSMIWQZPHRIB-BZSNNMDCSA-N His-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DAKSMIWQZPHRIB-BZSNNMDCSA-N 0.000 description 2
- 101001114932 Homo sapiens 40S ribosomal protein S20 Proteins 0.000 description 2
- AQCUAZTZSPQJFF-ZKWXMUAHSA-N Ile-Ala-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O AQCUAZTZSPQJFF-ZKWXMUAHSA-N 0.000 description 2
- LVQDUPQUJZWKSU-PYJNHQTQSA-N Ile-Arg-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LVQDUPQUJZWKSU-PYJNHQTQSA-N 0.000 description 2
- DCQMJRSOGCYKTR-GHCJXIJMSA-N Ile-Asp-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O DCQMJRSOGCYKTR-GHCJXIJMSA-N 0.000 description 2
- WEWCEPOYKANMGZ-MMWGEVLESA-N Ile-Cys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N1CCC[C@@H]1C(=O)O)N WEWCEPOYKANMGZ-MMWGEVLESA-N 0.000 description 2
- VQUCKIAECLVLAD-SVSWQMSJSA-N Ile-Cys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N VQUCKIAECLVLAD-SVSWQMSJSA-N 0.000 description 2
- WZDCVAWMBUNDDY-KBIXCLLPSA-N Ile-Glu-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C)C(=O)O)N WZDCVAWMBUNDDY-KBIXCLLPSA-N 0.000 description 2
- WUKLZPHVWAMZQV-UKJIMTQDSA-N Ile-Glu-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N WUKLZPHVWAMZQV-UKJIMTQDSA-N 0.000 description 2
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 2
- VOBYAKCXGQQFLR-LSJOCFKGSA-N Ile-Gly-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O VOBYAKCXGQQFLR-LSJOCFKGSA-N 0.000 description 2
- GLLAUPMJCGKPFY-BLMTYFJBSA-N Ile-Ile-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 GLLAUPMJCGKPFY-BLMTYFJBSA-N 0.000 description 2
- NUKXXNFEUZGPRO-BJDJZHNGSA-N Ile-Leu-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)O)N NUKXXNFEUZGPRO-BJDJZHNGSA-N 0.000 description 2
- GLYJPWIRLBAIJH-FQUUOJAGSA-N Ile-Lys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N GLYJPWIRLBAIJH-FQUUOJAGSA-N 0.000 description 2
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 2
- CIDLJWVDMNDKPT-FIRPJDEBSA-N Ile-Phe-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N CIDLJWVDMNDKPT-FIRPJDEBSA-N 0.000 description 2
- JHNJNTMTZHEDLJ-NAKRPEOUSA-N Ile-Ser-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O JHNJNTMTZHEDLJ-NAKRPEOUSA-N 0.000 description 2
- RQJUKVXWAKJDBW-SVSWQMSJSA-N Ile-Ser-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N RQJUKVXWAKJDBW-SVSWQMSJSA-N 0.000 description 2
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 2
- NGKPIPCGMLWHBX-WZLNRYEVSA-N Ile-Tyr-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N NGKPIPCGMLWHBX-WZLNRYEVSA-N 0.000 description 2
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 2
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 2
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 2
- KVRKAGGMEWNURO-CIUDSAMLSA-N Leu-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(C)C)N KVRKAGGMEWNURO-CIUDSAMLSA-N 0.000 description 2
- REPPKAMYTOJTFC-DCAQKATOSA-N Leu-Arg-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O REPPKAMYTOJTFC-DCAQKATOSA-N 0.000 description 2
- CNNQBZRGQATKNY-DCAQKATOSA-N Leu-Arg-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N CNNQBZRGQATKNY-DCAQKATOSA-N 0.000 description 2
- UCOCBWDBHCUPQP-DCAQKATOSA-N Leu-Arg-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O UCOCBWDBHCUPQP-DCAQKATOSA-N 0.000 description 2
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 2
- BPANDPNDMJHFEV-CIUDSAMLSA-N Leu-Asp-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O BPANDPNDMJHFEV-CIUDSAMLSA-N 0.000 description 2
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 2
- ULXYQAJWJGLCNR-YUMQZZPRSA-N Leu-Asp-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 2
- KTFHTMHHKXUYPW-ZPFDUUQYSA-N Leu-Asp-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KTFHTMHHKXUYPW-ZPFDUUQYSA-N 0.000 description 2
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 2
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 2
- LLBQJYDYOLIQAI-JYJNAYRXSA-N Leu-Glu-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LLBQJYDYOLIQAI-JYJNAYRXSA-N 0.000 description 2
- FMEICTQWUKNAGC-YUMQZZPRSA-N Leu-Gly-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O FMEICTQWUKNAGC-YUMQZZPRSA-N 0.000 description 2
- VBZOAGIPCULURB-QWRGUYRKSA-N Leu-Gly-His Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N VBZOAGIPCULURB-QWRGUYRKSA-N 0.000 description 2
- CFZZDVMBRYFFNU-QWRGUYRKSA-N Leu-His-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)NCC(O)=O CFZZDVMBRYFFNU-QWRGUYRKSA-N 0.000 description 2
- OHZIZVWQXJPBJS-IXOXFDKPSA-N Leu-His-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OHZIZVWQXJPBJS-IXOXFDKPSA-N 0.000 description 2
- LKXANTUNFMVCNF-IHPCNDPISA-N Leu-His-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O LKXANTUNFMVCNF-IHPCNDPISA-N 0.000 description 2
- AVEGDIAXTDVBJS-XUXIUFHCSA-N Leu-Ile-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AVEGDIAXTDVBJS-XUXIUFHCSA-N 0.000 description 2
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 2
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 2
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 2
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 2
- QNTJIDXQHWUBKC-BZSNNMDCSA-N Leu-Lys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNTJIDXQHWUBKC-BZSNNMDCSA-N 0.000 description 2
- OVZLLFONXILPDZ-VOAKCMCISA-N Leu-Lys-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OVZLLFONXILPDZ-VOAKCMCISA-N 0.000 description 2
- IBSGMIPRBMPMHE-IHRRRGAJSA-N Leu-Met-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O IBSGMIPRBMPMHE-IHRRRGAJSA-N 0.000 description 2
- FZMNAYBEFGZEIF-AVGNSLFASA-N Leu-Met-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCSC)C(=O)O)N FZMNAYBEFGZEIF-AVGNSLFASA-N 0.000 description 2
- HDHQQEDVWQGBEE-DCAQKATOSA-N Leu-Met-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O HDHQQEDVWQGBEE-DCAQKATOSA-N 0.000 description 2
- GCXGCIYIHXSKAY-ULQDDVLXSA-N Leu-Phe-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GCXGCIYIHXSKAY-ULQDDVLXSA-N 0.000 description 2
- PTRKPHUGYULXPU-KKUMJFAQSA-N Leu-Phe-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O PTRKPHUGYULXPU-KKUMJFAQSA-N 0.000 description 2
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 2
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 2
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 2
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 2
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 2
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 2
- SVBJIZVVYJYGLA-DCAQKATOSA-N Leu-Ser-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O SVBJIZVVYJYGLA-DCAQKATOSA-N 0.000 description 2
- ICYRCNICGBJLGM-HJGDQZAQSA-N Leu-Thr-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O ICYRCNICGBJLGM-HJGDQZAQSA-N 0.000 description 2
- LFSQWRSVPNKJGP-WDCWCFNPSA-N Leu-Thr-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O LFSQWRSVPNKJGP-WDCWCFNPSA-N 0.000 description 2
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 2
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 2
- HGLKOTPFWOMPOB-MEYUZBJRSA-N Leu-Thr-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HGLKOTPFWOMPOB-MEYUZBJRSA-N 0.000 description 2
- IDGRADDMTTWOQC-WDSOQIARSA-N Leu-Trp-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IDGRADDMTTWOQC-WDSOQIARSA-N 0.000 description 2
- FPFOYSCDUWTZBF-IHPCNDPISA-N Leu-Trp-Leu Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H]([NH3+])CC(C)C)C(=O)N[C@@H](CC(C)C)C([O-])=O)=CNC2=C1 FPFOYSCDUWTZBF-IHPCNDPISA-N 0.000 description 2
- UCRJTSIIAYHOHE-ULQDDVLXSA-N Leu-Tyr-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UCRJTSIIAYHOHE-ULQDDVLXSA-N 0.000 description 2
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 2
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 2
- QESXLSQLQHHTIX-RHYQMDGZSA-N Leu-Val-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QESXLSQLQHHTIX-RHYQMDGZSA-N 0.000 description 2
- YIBOAHAOAWACDK-QEJZJMRPSA-N Lys-Ala-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YIBOAHAOAWACDK-QEJZJMRPSA-N 0.000 description 2
- LXNPMPIQDNSMTA-AVGNSLFASA-N Lys-Gln-His Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 LXNPMPIQDNSMTA-AVGNSLFASA-N 0.000 description 2
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 2
- QQUJSUFWEDZQQY-AVGNSLFASA-N Lys-Gln-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN QQUJSUFWEDZQQY-AVGNSLFASA-N 0.000 description 2
- DRCILAJNUJKAHC-SRVKXCTJSA-N Lys-Glu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DRCILAJNUJKAHC-SRVKXCTJSA-N 0.000 description 2
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 2
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 2
- OWRUUFUVXFREBD-KKUMJFAQSA-N Lys-His-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O OWRUUFUVXFREBD-KKUMJFAQSA-N 0.000 description 2
- GNLJXWBNLAIPEP-MELADBBJSA-N Lys-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCCN)N)C(=O)O GNLJXWBNLAIPEP-MELADBBJSA-N 0.000 description 2
- MXMDJEJWERYPMO-XUXIUFHCSA-N Lys-Ile-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MXMDJEJWERYPMO-XUXIUFHCSA-N 0.000 description 2
- NCZIQZYZPUPMKY-PPCPHDFISA-N Lys-Ile-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NCZIQZYZPUPMKY-PPCPHDFISA-N 0.000 description 2
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 2
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 2
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 2
- WBSCNDJQPKSPII-KKUMJFAQSA-N Lys-Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O WBSCNDJQPKSPII-KKUMJFAQSA-N 0.000 description 2
- BEGQVWUZFXLNHZ-IHPCNDPISA-N Lys-Lys-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 BEGQVWUZFXLNHZ-IHPCNDPISA-N 0.000 description 2
- LMGNWHDWJDIOPK-DKIMLUQUSA-N Lys-Phe-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LMGNWHDWJDIOPK-DKIMLUQUSA-N 0.000 description 2
- JHNOXVASMSXSNB-WEDXCCLWSA-N Lys-Thr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JHNOXVASMSXSNB-WEDXCCLWSA-N 0.000 description 2
- RPWTZTBIFGENIA-VOAKCMCISA-N Lys-Thr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RPWTZTBIFGENIA-VOAKCMCISA-N 0.000 description 2
- YFQSSOAGMZGXFT-MEYUZBJRSA-N Lys-Thr-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YFQSSOAGMZGXFT-MEYUZBJRSA-N 0.000 description 2
- MDDUIRLQCYVRDO-NHCYSSNCSA-N Lys-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN MDDUIRLQCYVRDO-NHCYSSNCSA-N 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- VHGIWFGJIHTASW-FXQIFTODSA-N Met-Ala-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O VHGIWFGJIHTASW-FXQIFTODSA-N 0.000 description 2
- DRINJBAHUGXNFC-DCAQKATOSA-N Met-Asp-His Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(O)=O DRINJBAHUGXNFC-DCAQKATOSA-N 0.000 description 2
- XOMXAVJBLRROMC-IHRRRGAJSA-N Met-Asp-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XOMXAVJBLRROMC-IHRRRGAJSA-N 0.000 description 2
- AVTWKENDGGUWDC-BQBZGAKWSA-N Met-Cys-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O AVTWKENDGGUWDC-BQBZGAKWSA-N 0.000 description 2
- SJDQOYTYNGZZJX-SRVKXCTJSA-N Met-Glu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SJDQOYTYNGZZJX-SRVKXCTJSA-N 0.000 description 2
- SLQDSYZHHOKQSR-QXEWZRGKSA-N Met-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCSC SLQDSYZHHOKQSR-QXEWZRGKSA-N 0.000 description 2
- WXJXYMFUTRXRGO-UWVGGRQHSA-N Met-His-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CNC=N1 WXJXYMFUTRXRGO-UWVGGRQHSA-N 0.000 description 2
- FZUNSVYYPYJYAP-NAKRPEOUSA-N Met-Ile-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O FZUNSVYYPYJYAP-NAKRPEOUSA-N 0.000 description 2
- PZUUMQPMHBJJKE-AVGNSLFASA-N Met-Leu-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCNC(N)=N PZUUMQPMHBJJKE-AVGNSLFASA-N 0.000 description 2
- CHDYFPCQVUOJEB-ULQDDVLXSA-N Met-Leu-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 CHDYFPCQVUOJEB-ULQDDVLXSA-N 0.000 description 2
- HOTNHEUETJELDL-BPNCWPANSA-N Met-Tyr-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCSC)N HOTNHEUETJELDL-BPNCWPANSA-N 0.000 description 2
- YGNUDKAPJARTEM-GUBZILKMSA-N Met-Val-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O YGNUDKAPJARTEM-GUBZILKMSA-N 0.000 description 2
- 108010085220 Multiprotein Complexes Proteins 0.000 description 2
- 102000007474 Multiprotein Complexes Human genes 0.000 description 2
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 2
- 108010079364 N-glycylalanine Proteins 0.000 description 2
- 108010066427 N-valyltryptophan Proteins 0.000 description 2
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 2
- 108010047562 NGR peptide Proteins 0.000 description 2
- 239000001888 Peptone Substances 0.000 description 2
- 108010080698 Peptones Proteins 0.000 description 2
- XWBJLKDCHJVKAK-KKUMJFAQSA-N Phe-Arg-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N XWBJLKDCHJVKAK-KKUMJFAQSA-N 0.000 description 2
- MPGJIHFJCXTVEX-KKUMJFAQSA-N Phe-Arg-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O MPGJIHFJCXTVEX-KKUMJFAQSA-N 0.000 description 2
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 2
- HXSUFWQYLPKEHF-IHRRRGAJSA-N Phe-Asn-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HXSUFWQYLPKEHF-IHRRRGAJSA-N 0.000 description 2
- KIEPQOIQHFKQLK-PCBIJLKTSA-N Phe-Asn-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KIEPQOIQHFKQLK-PCBIJLKTSA-N 0.000 description 2
- JIYJYFIXQTYDNF-YDHLFZDLSA-N Phe-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N JIYJYFIXQTYDNF-YDHLFZDLSA-N 0.000 description 2
- XMPUYNHKEPFERE-IHRRRGAJSA-N Phe-Asp-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 XMPUYNHKEPFERE-IHRRRGAJSA-N 0.000 description 2
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 2
- CSYVXYQDIVCQNU-QWRGUYRKSA-N Phe-Asp-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O CSYVXYQDIVCQNU-QWRGUYRKSA-N 0.000 description 2
- HQVPQHLNOVTLDD-IHRRRGAJSA-N Phe-Cys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC1=CC=CC=C1)N HQVPQHLNOVTLDD-IHRRRGAJSA-N 0.000 description 2
- KAGCQPSEVAETCA-JYJNAYRXSA-N Phe-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N KAGCQPSEVAETCA-JYJNAYRXSA-N 0.000 description 2
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 2
- FIRWJEJVFFGXSH-RYUDHWBXSA-N Phe-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 FIRWJEJVFFGXSH-RYUDHWBXSA-N 0.000 description 2
- VADLTGVIOIOKGM-BZSNNMDCSA-N Phe-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CN=CN1 VADLTGVIOIOKGM-BZSNNMDCSA-N 0.000 description 2
- VZFPYFRVHMSSNA-JURCDPSOSA-N Phe-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=CC=C1 VZFPYFRVHMSSNA-JURCDPSOSA-N 0.000 description 2
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 2
- OQTDZEJJWWAGJT-KKUMJFAQSA-N Phe-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O OQTDZEJJWWAGJT-KKUMJFAQSA-N 0.000 description 2
- WLYPRKLMRIYGPP-JYJNAYRXSA-N Phe-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 WLYPRKLMRIYGPP-JYJNAYRXSA-N 0.000 description 2
- SCKXGHWQPPURGT-KKUMJFAQSA-N Phe-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O SCKXGHWQPPURGT-KKUMJFAQSA-N 0.000 description 2
- JKJSIYKSGIDHPM-WBAXXEDZSA-N Phe-Phe-Ala Chemical compound C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)Cc1ccccc1)C(O)=O JKJSIYKSGIDHPM-WBAXXEDZSA-N 0.000 description 2
- WEDZFLRYSIDIRX-IHRRRGAJSA-N Phe-Ser-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 WEDZFLRYSIDIRX-IHRRRGAJSA-N 0.000 description 2
- LTAWNJXSRUCFAN-UNQGMJICSA-N Phe-Thr-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LTAWNJXSRUCFAN-UNQGMJICSA-N 0.000 description 2
- KLYYKKGCPOGDPE-OEAJRASXSA-N Phe-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O KLYYKKGCPOGDPE-OEAJRASXSA-N 0.000 description 2
- MSSXKZBDKZAHCX-UNQGMJICSA-N Phe-Thr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O MSSXKZBDKZAHCX-UNQGMJICSA-N 0.000 description 2
- QUUCAHIYARMNBL-FHWLQOOXSA-N Phe-Tyr-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N QUUCAHIYARMNBL-FHWLQOOXSA-N 0.000 description 2
- ZYNBEWGJFXTBDU-ACRUOGEOSA-N Phe-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N ZYNBEWGJFXTBDU-ACRUOGEOSA-N 0.000 description 2
- 229920002562 Polyethylene Glycol 3350 Polymers 0.000 description 2
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 2
- VXCHGLYSIOOZIS-GUBZILKMSA-N Pro-Ala-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 VXCHGLYSIOOZIS-GUBZILKMSA-N 0.000 description 2
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 2
- IFMDQWDAJUMMJC-DCAQKATOSA-N Pro-Ala-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O IFMDQWDAJUMMJC-DCAQKATOSA-N 0.000 description 2
- SBYVDRLQAGENMY-DCAQKATOSA-N Pro-Asn-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O SBYVDRLQAGENMY-DCAQKATOSA-N 0.000 description 2
- JARJPEMLQAWNBR-GUBZILKMSA-N Pro-Asp-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JARJPEMLQAWNBR-GUBZILKMSA-N 0.000 description 2
- CJZTUKSFZUSNCC-FXQIFTODSA-N Pro-Asp-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 CJZTUKSFZUSNCC-FXQIFTODSA-N 0.000 description 2
- YFNOUBWUIIJQHF-LPEHRKFASA-N Pro-Asp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O YFNOUBWUIIJQHF-LPEHRKFASA-N 0.000 description 2
- UPJGUQPLYWTISV-GUBZILKMSA-N Pro-Gln-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UPJGUQPLYWTISV-GUBZILKMSA-N 0.000 description 2
- LHALYDBUDCWMDY-CIUDSAMLSA-N Pro-Glu-Ala Chemical compound C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O LHALYDBUDCWMDY-CIUDSAMLSA-N 0.000 description 2
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 2
- LXVLKXPFIDDHJG-CIUDSAMLSA-N Pro-Glu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O LXVLKXPFIDDHJG-CIUDSAMLSA-N 0.000 description 2
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 2
- UIMCLYYSUCIUJM-UWVGGRQHSA-N Pro-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 UIMCLYYSUCIUJM-UWVGGRQHSA-N 0.000 description 2
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 2
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 2
- VZKBJNBZMZHKRC-XUXIUFHCSA-N Pro-Ile-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O VZKBJNBZMZHKRC-XUXIUFHCSA-N 0.000 description 2
- FMLRRBDLBJLJIK-DCAQKATOSA-N Pro-Leu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 FMLRRBDLBJLJIK-DCAQKATOSA-N 0.000 description 2
- BRJGUPWVFXKBQI-XUXIUFHCSA-N Pro-Leu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRJGUPWVFXKBQI-XUXIUFHCSA-N 0.000 description 2
- SRBFGSGDNNQABI-FHWLQOOXSA-N Pro-Leu-Trp Chemical compound N([C@@H](CC(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C(=O)[C@@H]1CCCN1 SRBFGSGDNNQABI-FHWLQOOXSA-N 0.000 description 2
- RMODQFBNDDENCP-IHRRRGAJSA-N Pro-Lys-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O RMODQFBNDDENCP-IHRRRGAJSA-N 0.000 description 2
- BLJMJZOMZRCESA-GUBZILKMSA-N Pro-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BLJMJZOMZRCESA-GUBZILKMSA-N 0.000 description 2
- SBVPYBFMIGDIDX-SRVKXCTJSA-N Pro-Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1N(C(=O)[C@H]2NCCC2)CCC1 SBVPYBFMIGDIDX-SRVKXCTJSA-N 0.000 description 2
- GOMUXSCOIWIJFP-GUBZILKMSA-N Pro-Ser-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GOMUXSCOIWIJFP-GUBZILKMSA-N 0.000 description 2
- SEZGGSHLMROBFX-CIUDSAMLSA-N Pro-Ser-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O SEZGGSHLMROBFX-CIUDSAMLSA-N 0.000 description 2
- IURWWZYKYPEANQ-HJGDQZAQSA-N Pro-Thr-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O IURWWZYKYPEANQ-HJGDQZAQSA-N 0.000 description 2
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 2
- DLZBBDSPTJBOOD-BPNCWPANSA-N Pro-Tyr-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O DLZBBDSPTJBOOD-BPNCWPANSA-N 0.000 description 2
- JXVXYRZQIUPYSA-NHCYSSNCSA-N Pro-Val-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JXVXYRZQIUPYSA-NHCYSSNCSA-N 0.000 description 2
- VDHGTOHMHHQSKG-JYJNAYRXSA-N Pro-Val-Phe Chemical compound CC(C)[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O VDHGTOHMHHQSKG-JYJNAYRXSA-N 0.000 description 2
- ZMLRZBWCXPQADC-TUAOUCFPSA-N Pro-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ZMLRZBWCXPQADC-TUAOUCFPSA-N 0.000 description 2
- FHJQROWZEJFZPO-SRVKXCTJSA-N Pro-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FHJQROWZEJFZPO-SRVKXCTJSA-N 0.000 description 2
- 108010079005 RDV peptide Proteins 0.000 description 2
- 229920002684 Sepharose Polymers 0.000 description 2
- SRTCFKGBYBZRHA-ACZMJKKPSA-N Ser-Ala-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SRTCFKGBYBZRHA-ACZMJKKPSA-N 0.000 description 2
- IYCBDVBJWDXQRR-FXQIFTODSA-N Ser-Ala-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O IYCBDVBJWDXQRR-FXQIFTODSA-N 0.000 description 2
- GXXTUIUYTWGPMV-FXQIFTODSA-N Ser-Arg-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O GXXTUIUYTWGPMV-FXQIFTODSA-N 0.000 description 2
- WXUBSIDKNMFAGS-IHRRRGAJSA-N Ser-Arg-Tyr Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WXUBSIDKNMFAGS-IHRRRGAJSA-N 0.000 description 2
- SFZKGGOGCNQPJY-CIUDSAMLSA-N Ser-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N SFZKGGOGCNQPJY-CIUDSAMLSA-N 0.000 description 2
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 2
- IXUGADGDCQDLSA-FXQIFTODSA-N Ser-Gln-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N IXUGADGDCQDLSA-FXQIFTODSA-N 0.000 description 2
- SMIDBHKWSYUBRZ-ACZMJKKPSA-N Ser-Glu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O SMIDBHKWSYUBRZ-ACZMJKKPSA-N 0.000 description 2
- DSGYZICNAMEJOC-AVGNSLFASA-N Ser-Glu-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DSGYZICNAMEJOC-AVGNSLFASA-N 0.000 description 2
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 2
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 2
- JFWDJFULOLKQFY-QWRGUYRKSA-N Ser-Gly-Phe Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JFWDJFULOLKQFY-QWRGUYRKSA-N 0.000 description 2
- WEQAYODCJHZSJZ-KKUMJFAQSA-N Ser-His-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 WEQAYODCJHZSJZ-KKUMJFAQSA-N 0.000 description 2
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 2
- FUMGHWDRRFCKEP-CIUDSAMLSA-N Ser-Leu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O FUMGHWDRRFCKEP-CIUDSAMLSA-N 0.000 description 2
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 2
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 2
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 2
- GVIGVIOEYBOTCB-XIRDDKMYSA-N Ser-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC(C)C)C(O)=O)=CNC2=C1 GVIGVIOEYBOTCB-XIRDDKMYSA-N 0.000 description 2
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 2
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 2
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 2
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 2
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 2
- SZRNDHWMVSFPSP-XKBZYTNZSA-N Ser-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N)O SZRNDHWMVSFPSP-XKBZYTNZSA-N 0.000 description 2
- WMZVVNLPHFSUPA-BPUTZDHNSA-N Ser-Trp-Arg Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 WMZVVNLPHFSUPA-BPUTZDHNSA-N 0.000 description 2
- ZWSZBWAFDZRBNM-UBHSHLNASA-N Ser-Trp-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O ZWSZBWAFDZRBNM-UBHSHLNASA-N 0.000 description 2
- PLQWGQUNUPMNOD-KKUMJFAQSA-N Ser-Tyr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O PLQWGQUNUPMNOD-KKUMJFAQSA-N 0.000 description 2
- ZVBCMFDJIMUELU-BZSNNMDCSA-N Ser-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N ZVBCMFDJIMUELU-BZSNNMDCSA-N 0.000 description 2
- KIEIJCFVGZCUAS-MELADBBJSA-N Ser-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N)C(=O)O KIEIJCFVGZCUAS-MELADBBJSA-N 0.000 description 2
- UKKROEYWYIHWBD-ZKWXMUAHSA-N Ser-Val-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O UKKROEYWYIHWBD-ZKWXMUAHSA-N 0.000 description 2
- LLSLRQOEAFCZLW-NRPADANISA-N Ser-Val-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LLSLRQOEAFCZLW-NRPADANISA-N 0.000 description 2
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 2
- 108020004682 Single-Stranded DNA Proteins 0.000 description 2
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 2
- 241000605268 Thiobacillus thioparus Species 0.000 description 2
- IGROJMCBGRFRGI-YTLHQDLWSA-N Thr-Ala-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 2
- STGXWWBXWXZOER-MBLNEYKQSA-N Thr-Ala-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 STGXWWBXWXZOER-MBLNEYKQSA-N 0.000 description 2
- LVHHEVGYAZGXDE-KDXUFGMBSA-N Thr-Ala-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(=O)O)N)O LVHHEVGYAZGXDE-KDXUFGMBSA-N 0.000 description 2
- CAGTXGDOIFXLPC-KZVJFYERSA-N Thr-Arg-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CCCN=C(N)N CAGTXGDOIFXLPC-KZVJFYERSA-N 0.000 description 2
- MQBTXMPQNCGSSZ-OSUNSFLBSA-N Thr-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)CCCN=C(N)N MQBTXMPQNCGSSZ-OSUNSFLBSA-N 0.000 description 2
- XVNZSJIKGJLQLH-RCWTZXSCSA-N Thr-Arg-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCSC)C(=O)O)N)O XVNZSJIKGJLQLH-RCWTZXSCSA-N 0.000 description 2
- VXMHQKHDKCATDV-VEVYYDQMSA-N Thr-Asp-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VXMHQKHDKCATDV-VEVYYDQMSA-N 0.000 description 2
- GARULAKWZGFIKC-RWRJDSDZSA-N Thr-Gln-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GARULAKWZGFIKC-RWRJDSDZSA-N 0.000 description 2
- RKDFEMGVMMYYNG-WDCWCFNPSA-N Thr-Gln-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O RKDFEMGVMMYYNG-WDCWCFNPSA-N 0.000 description 2
- DKDHTRVDOUZZTP-IFFSRLJSSA-N Thr-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DKDHTRVDOUZZTP-IFFSRLJSSA-N 0.000 description 2
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 2
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 2
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 2
- JKGGPMOUIAAJAA-YEPSODPASA-N Thr-Gly-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O JKGGPMOUIAAJAA-YEPSODPASA-N 0.000 description 2
- FDALPRWYVKJCLL-PMVVWTBXSA-N Thr-His-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)NCC(O)=O FDALPRWYVKJCLL-PMVVWTBXSA-N 0.000 description 2
- URPSJRMWHQTARR-MBLNEYKQSA-N Thr-Ile-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O URPSJRMWHQTARR-MBLNEYKQSA-N 0.000 description 2
- NCXVJIQMWSGRHY-KXNHARMFSA-N Thr-Leu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O NCXVJIQMWSGRHY-KXNHARMFSA-N 0.000 description 2
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 2
- KZSYAEWQMJEGRZ-RHYQMDGZSA-N Thr-Leu-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O KZSYAEWQMJEGRZ-RHYQMDGZSA-N 0.000 description 2
- JLNMFGCJODTXDH-WEDXCCLWSA-N Thr-Lys-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O JLNMFGCJODTXDH-WEDXCCLWSA-N 0.000 description 2
- GUHLYMZJVXUIPO-RCWTZXSCSA-N Thr-Met-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O GUHLYMZJVXUIPO-RCWTZXSCSA-N 0.000 description 2
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 2
- NYQIZWROIMIQSL-VEVYYDQMSA-N Thr-Pro-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O NYQIZWROIMIQSL-VEVYYDQMSA-N 0.000 description 2
- JAJOFWABAUKAEJ-QTKMDUPCSA-N Thr-Pro-His Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O JAJOFWABAUKAEJ-QTKMDUPCSA-N 0.000 description 2
- VTMGKRABARCZAX-OSUNSFLBSA-N Thr-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O VTMGKRABARCZAX-OSUNSFLBSA-N 0.000 description 2
- STUAPCLEDMKXKL-LKXGYXEUSA-N Thr-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O STUAPCLEDMKXKL-LKXGYXEUSA-N 0.000 description 2
- NLWDSYKZUPRMBJ-IEGACIPQSA-N Thr-Trp-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O NLWDSYKZUPRMBJ-IEGACIPQSA-N 0.000 description 2
- PELIQFPESHBTMA-WLTAIBSBSA-N Thr-Tyr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 PELIQFPESHBTMA-WLTAIBSBSA-N 0.000 description 2
- XGFYGMKZKFRGAI-RCWTZXSCSA-N Thr-Val-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XGFYGMKZKFRGAI-RCWTZXSCSA-N 0.000 description 2
- BKIOKSLLAAZYTC-KKHAAJSZSA-N Thr-Val-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O BKIOKSLLAAZYTC-KKHAAJSZSA-N 0.000 description 2
- FYBFTPLPAXZBOY-KKHAAJSZSA-N Thr-Val-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O FYBFTPLPAXZBOY-KKHAAJSZSA-N 0.000 description 2
- MNYNCKZAEIAONY-XGEHTFHBSA-N Thr-Val-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O MNYNCKZAEIAONY-XGEHTFHBSA-N 0.000 description 2
- VYVBSMCZNHOZGD-RCWTZXSCSA-N Thr-Val-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O VYVBSMCZNHOZGD-RCWTZXSCSA-N 0.000 description 2
- GTNCSPKYWCJZAC-XIRDDKMYSA-N Trp-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N GTNCSPKYWCJZAC-XIRDDKMYSA-N 0.000 description 2
- DEZKIRSBKKXUEV-NYVOZVTQSA-N Trp-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)O)N DEZKIRSBKKXUEV-NYVOZVTQSA-N 0.000 description 2
- HDQJVXVRGJUDML-UBHSHLNASA-N Trp-Cys-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HDQJVXVRGJUDML-UBHSHLNASA-N 0.000 description 2
- CZSMNLQMRWPGQF-XEGUGMAKSA-N Trp-Gln-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CZSMNLQMRWPGQF-XEGUGMAKSA-N 0.000 description 2
- CZWIHKFGHICAJX-BPUTZDHNSA-N Trp-Glu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 CZWIHKFGHICAJX-BPUTZDHNSA-N 0.000 description 2
- ILDJYIDXESUBOE-HSCHXYMDSA-N Trp-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N ILDJYIDXESUBOE-HSCHXYMDSA-N 0.000 description 2
- SAKLWFSRZTZQAJ-GQGQLFGLSA-N Trp-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N SAKLWFSRZTZQAJ-GQGQLFGLSA-N 0.000 description 2
- VPRHDRKAPYZMHL-SZMVWBNQSA-N Trp-Leu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 VPRHDRKAPYZMHL-SZMVWBNQSA-N 0.000 description 2
- IKUMWSDCGQVGHC-UMPQAUOISA-N Trp-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CNC3=CC=CC=C32)N)O IKUMWSDCGQVGHC-UMPQAUOISA-N 0.000 description 2
- RNDWCRUOGGQDKN-UBHSHLNASA-N Trp-Ser-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RNDWCRUOGGQDKN-UBHSHLNASA-N 0.000 description 2
- UMIACFRBELJMGT-GQGQLFGLSA-N Trp-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N UMIACFRBELJMGT-GQGQLFGLSA-N 0.000 description 2
- LNGFWVPNKLWATF-ZVZYQTTQSA-N Trp-Val-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LNGFWVPNKLWATF-ZVZYQTTQSA-N 0.000 description 2
- VCXWRWYFJLXITF-AUTRQRHGSA-N Tyr-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 VCXWRWYFJLXITF-AUTRQRHGSA-N 0.000 description 2
- NSTPFWRAIDTNGH-BZSNNMDCSA-N Tyr-Asn-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NSTPFWRAIDTNGH-BZSNNMDCSA-N 0.000 description 2
- GAYLGYUVTDMLKC-UWJYBYFXSA-N Tyr-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 GAYLGYUVTDMLKC-UWJYBYFXSA-N 0.000 description 2
- QUILOGWWLXMSAT-IHRRRGAJSA-N Tyr-Gln-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QUILOGWWLXMSAT-IHRRRGAJSA-N 0.000 description 2
- TWAVEIJGFCBWCG-JYJNAYRXSA-N Tyr-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N TWAVEIJGFCBWCG-JYJNAYRXSA-N 0.000 description 2
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 2
- HIINQLBHPIQYHN-JTQLQIEISA-N Tyr-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HIINQLBHPIQYHN-JTQLQIEISA-N 0.000 description 2
- JKUZFODWJGEQAP-KBPBESRZSA-N Tyr-Gly-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O JKUZFODWJGEQAP-KBPBESRZSA-N 0.000 description 2
- CTDPLKMBVALCGN-JSGCOSHPSA-N Tyr-Gly-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O CTDPLKMBVALCGN-JSGCOSHPSA-N 0.000 description 2
- HFJJDMOFTCQGEI-STECZYCISA-N Tyr-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N HFJJDMOFTCQGEI-STECZYCISA-N 0.000 description 2
- SINRIKQYQJRGDQ-MEYUZBJRSA-N Tyr-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SINRIKQYQJRGDQ-MEYUZBJRSA-N 0.000 description 2
- AUZADXNWQMBZOO-JYJNAYRXSA-N Tyr-Pro-Arg Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=C(O)C=C1 AUZADXNWQMBZOO-JYJNAYRXSA-N 0.000 description 2
- XJPXTYLVMUZGNW-IHRRRGAJSA-N Tyr-Pro-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O XJPXTYLVMUZGNW-IHRRRGAJSA-N 0.000 description 2
- SOEGLGLDSUHWTI-STECZYCISA-N Tyr-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=C(O)C=C1 SOEGLGLDSUHWTI-STECZYCISA-N 0.000 description 2
- PWKMJDQXKCENMF-MEYUZBJRSA-N Tyr-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O PWKMJDQXKCENMF-MEYUZBJRSA-N 0.000 description 2
- LDKDSFQSEUOCOO-RPTUDFQQSA-N Tyr-Thr-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LDKDSFQSEUOCOO-RPTUDFQQSA-N 0.000 description 2
- KLOZTPOXVVRVAQ-DZKIICNBSA-N Tyr-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 KLOZTPOXVVRVAQ-DZKIICNBSA-N 0.000 description 2
- PQPWEALFTLKSEB-DZKIICNBSA-N Tyr-Val-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PQPWEALFTLKSEB-DZKIICNBSA-N 0.000 description 2
- FZSPNKUFROZBSG-ZKWXMUAHSA-N Val-Ala-Asp Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O FZSPNKUFROZBSG-ZKWXMUAHSA-N 0.000 description 2
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 2
- JOQSQZFKFYJKKJ-GUBZILKMSA-N Val-Arg-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N JOQSQZFKFYJKKJ-GUBZILKMSA-N 0.000 description 2
- KKHRWGYHBZORMQ-NHCYSSNCSA-N Val-Arg-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKHRWGYHBZORMQ-NHCYSSNCSA-N 0.000 description 2
- NMPXRFYMZDIBRF-ZOBUZTSGSA-N Val-Asn-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N NMPXRFYMZDIBRF-ZOBUZTSGSA-N 0.000 description 2
- VUTHNLMCXKLLFI-LAEOZQHASA-N Val-Asp-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VUTHNLMCXKLLFI-LAEOZQHASA-N 0.000 description 2
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 2
- XTAUQCGQFJQGEJ-NHCYSSNCSA-N Val-Gln-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XTAUQCGQFJQGEJ-NHCYSSNCSA-N 0.000 description 2
- LMSBRIVOCYOKMU-NRPADANISA-N Val-Gln-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N LMSBRIVOCYOKMU-NRPADANISA-N 0.000 description 2
- QHFQQRKNGCXTHL-AUTRQRHGSA-N Val-Gln-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QHFQQRKNGCXTHL-AUTRQRHGSA-N 0.000 description 2
- VFOHXOLPLACADK-GVXVVHGQSA-N Val-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N VFOHXOLPLACADK-GVXVVHGQSA-N 0.000 description 2
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 2
- ROLGIBMFNMZANA-GVXVVHGQSA-N Val-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N ROLGIBMFNMZANA-GVXVVHGQSA-N 0.000 description 2
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 2
- UEHRGZCNLSWGHK-DLOVCJGASA-N Val-Glu-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UEHRGZCNLSWGHK-DLOVCJGASA-N 0.000 description 2
- BVWPHWLFGRCECJ-JSGCOSHPSA-N Val-Gly-Tyr Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N BVWPHWLFGRCECJ-JSGCOSHPSA-N 0.000 description 2
- VHRLUTIMTDOVCG-PEDHHIEDSA-N Val-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](C(C)C)N VHRLUTIMTDOVCG-PEDHHIEDSA-N 0.000 description 2
- HGJRMXOWUWVUOA-GVXVVHGQSA-N Val-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N HGJRMXOWUWVUOA-GVXVVHGQSA-N 0.000 description 2
- LYERIXUFCYVFFX-GVXVVHGQSA-N Val-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LYERIXUFCYVFFX-GVXVVHGQSA-N 0.000 description 2
- KTEZUXISLQTDDQ-NHCYSSNCSA-N Val-Lys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KTEZUXISLQTDDQ-NHCYSSNCSA-N 0.000 description 2
- HPANGHISDXDUQY-ULQDDVLXSA-N Val-Lys-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HPANGHISDXDUQY-ULQDDVLXSA-N 0.000 description 2
- LJSZPMSUYKKKCP-UBHSHLNASA-N Val-Phe-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 LJSZPMSUYKKKCP-UBHSHLNASA-N 0.000 description 2
- CKTMJBPRVQWPHU-JSGCOSHPSA-N Val-Phe-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)O)N CKTMJBPRVQWPHU-JSGCOSHPSA-N 0.000 description 2
- USLVEJAHTBLSIL-CYDGBPFRSA-N Val-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C USLVEJAHTBLSIL-CYDGBPFRSA-N 0.000 description 2
- NHXZRXLFOBFMDM-AVGNSLFASA-N Val-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C NHXZRXLFOBFMDM-AVGNSLFASA-N 0.000 description 2
- BGXVHVMJZCSOCA-AVGNSLFASA-N Val-Pro-Lys Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N BGXVHVMJZCSOCA-AVGNSLFASA-N 0.000 description 2
- QIVPZSWBBHRNBA-JYJNAYRXSA-N Val-Pro-Phe Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccccc1)C(O)=O QIVPZSWBBHRNBA-JYJNAYRXSA-N 0.000 description 2
- SSYBNWFXCFNRFN-GUBZILKMSA-N Val-Pro-Ser Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SSYBNWFXCFNRFN-GUBZILKMSA-N 0.000 description 2
- NSUUANXHLKKHQB-BZSNNMDCSA-N Val-Pro-Trp Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CNC2=CC=CC=C12 NSUUANXHLKKHQB-BZSNNMDCSA-N 0.000 description 2
- UGFMVXRXULGLNO-XPUUQOCRSA-N Val-Ser-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O UGFMVXRXULGLNO-XPUUQOCRSA-N 0.000 description 2
- YQYFYUSYEDNLSD-YEPSODPASA-N Val-Thr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O YQYFYUSYEDNLSD-YEPSODPASA-N 0.000 description 2
- PDDJTOSAVNRJRH-UNQGMJICSA-N Val-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](C(C)C)N)O PDDJTOSAVNRJRH-UNQGMJICSA-N 0.000 description 2
- JAIZPWVHPQRYOU-ZJDVBMNYSA-N Val-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O JAIZPWVHPQRYOU-ZJDVBMNYSA-N 0.000 description 2
- RSEIVHMDTNNEOW-JYJNAYRXSA-N Val-Trp-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CS)C(=O)O)N RSEIVHMDTNNEOW-JYJNAYRXSA-N 0.000 description 2
- GTACFKZDQFTVAI-STECZYCISA-N Val-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 GTACFKZDQFTVAI-STECZYCISA-N 0.000 description 2
- VVIZITNVZUAEMI-DLOVCJGASA-N Val-Val-Gln Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(N)=O VVIZITNVZUAEMI-DLOVCJGASA-N 0.000 description 2
- NLNCNKIVJPEFBC-DLOVCJGASA-N Val-Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O NLNCNKIVJPEFBC-DLOVCJGASA-N 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- 239000004480 active ingredient Substances 0.000 description 2
- 230000002776 aggregation Effects 0.000 description 2
- 238000004220 aggregation Methods 0.000 description 2
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 2
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 2
- 108010041407 alanylaspartic acid Proteins 0.000 description 2
- 108010047495 alanylglycine Proteins 0.000 description 2
- 239000003963 antioxidant agent Substances 0.000 description 2
- 235000006708 antioxidants Nutrition 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 108010013835 arginine glutamate Proteins 0.000 description 2
- 108010008355 arginyl-glutamine Proteins 0.000 description 2
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 2
- 108010057412 arginyl-glycyl-aspartyl-phenylalanine Proteins 0.000 description 2
- 108010062796 arginyllysine Proteins 0.000 description 2
- 108010038633 aspartylglutamate Proteins 0.000 description 2
- 108010005774 beta-Galactosidase Proteins 0.000 description 2
- 102000005936 beta-Galactosidase Human genes 0.000 description 2
- 239000012148 binding buffer Substances 0.000 description 2
- 230000003115 biocidal effect Effects 0.000 description 2
- 229960002685 biotin Drugs 0.000 description 2
- 235000020958 biotin Nutrition 0.000 description 2
- 239000011616 biotin Substances 0.000 description 2
- 230000000903 blocking effect Effects 0.000 description 2
- 229940041514 candida albicans extract Drugs 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 230000001276 controlling effect Effects 0.000 description 2
- 239000006184 cosolvent Substances 0.000 description 2
- 101150101649 csoS1D gene Proteins 0.000 description 2
- 108010016616 cysteinylglycine Proteins 0.000 description 2
- 230000006378 damage Effects 0.000 description 2
- 239000008121 dextrose Substances 0.000 description 2
- 108010009297 diglycyl-histidine Proteins 0.000 description 2
- 239000003085 diluting agent Substances 0.000 description 2
- 108010054813 diprotin B Proteins 0.000 description 2
- 108010048367 enhanced green fluorescent protein Proteins 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 239000012634 fragment Substances 0.000 description 2
- 230000008014 freezing Effects 0.000 description 2
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 2
- 108010080575 glutamyl-aspartyl-alanine Proteins 0.000 description 2
- 108010079547 glutamylmethionine Proteins 0.000 description 2
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 2
- 108010010096 glycyl-glycyl-tyrosine Proteins 0.000 description 2
- 108010033719 glycyl-histidyl-glycine Proteins 0.000 description 2
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 2
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 2
- 108010045126 glycyl-tyrosyl-glycine Proteins 0.000 description 2
- 108010010147 glycylglutamine Proteins 0.000 description 2
- 108010037850 glycylvaline Proteins 0.000 description 2
- 238000002744 homologous recombination Methods 0.000 description 2
- 230000006801 homologous recombination Effects 0.000 description 2
- 238000003384 imaging method Methods 0.000 description 2
- 238000001597 immobilized metal affinity chromatography Methods 0.000 description 2
- 108010027338 isoleucylcysteine Proteins 0.000 description 2
- 108010078274 isoleucylvaline Proteins 0.000 description 2
- 108010053037 kyotorphin Proteins 0.000 description 2
- 108010077158 leucinyl-arginyl-tryptophan Proteins 0.000 description 2
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 2
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 2
- XIXADJRWDQXREU-UHFFFAOYSA-M lithium acetate Chemical compound [Li+].CC([O-])=O XIXADJRWDQXREU-UHFFFAOYSA-M 0.000 description 2
- 230000028744 lysogeny Effects 0.000 description 2
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 2
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 108091005958 mTurquoise2 Proteins 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 108700023046 methionyl-leucyl-phenylalanine Proteins 0.000 description 2
- 108010056582 methionylglutamic acid Proteins 0.000 description 2
- 238000001000 micrograph Methods 0.000 description 2
- 235000019319 peptone Nutrition 0.000 description 2
- 108010012581 phenylalanylglutamate Proteins 0.000 description 2
- 238000005498 polishing Methods 0.000 description 2
- 229920002401 polyacrylamide Polymers 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 238000002360 preparation method Methods 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 108010025826 prolyl-leucyl-arginine Proteins 0.000 description 2
- 108010093296 prolyl-prolyl-alanine Proteins 0.000 description 2
- 108010004914 prolylarginine Proteins 0.000 description 2
- 108010090894 prolylleucine Proteins 0.000 description 2
- 108020001580 protein domains Proteins 0.000 description 2
- 230000017854 proteolysis Effects 0.000 description 2
- 230000004044 response Effects 0.000 description 2
- 238000002864 sequence alignment Methods 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 108010026333 seryl-proline Proteins 0.000 description 2
- 108010071207 serylmethionine Proteins 0.000 description 2
- 238000002741 site-directed mutagenesis Methods 0.000 description 2
- 159000000000 sodium salts Chemical class 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 230000000087 stabilizing effect Effects 0.000 description 2
- 239000011550 stock solution Substances 0.000 description 2
- 229960005322 streptomycin Drugs 0.000 description 2
- 238000012916 structural analysis Methods 0.000 description 2
- 239000000758 substrate Substances 0.000 description 2
- 239000000725 suspension Substances 0.000 description 2
- 238000005382 thermal cycling Methods 0.000 description 2
- 230000014616 translation Effects 0.000 description 2
- 108700004896 tripeptide FEG Proteins 0.000 description 2
- 108010015666 tryptophyl-leucyl-glutamic acid Proteins 0.000 description 2
- 108010084932 tryptophyl-proline Proteins 0.000 description 2
- 108010044292 tryptophyltyrosine Proteins 0.000 description 2
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 2
- 108010071635 tyrosyl-prolyl-arginine Proteins 0.000 description 2
- 229940035893 uracil Drugs 0.000 description 2
- 108010009962 valyltyrosine Proteins 0.000 description 2
- 239000003981 vehicle Substances 0.000 description 2
- 239000012224 working solution Substances 0.000 description 2
- 210000005253 yeast cell Anatomy 0.000 description 2
- 239000012138 yeast extract Substances 0.000 description 2
- AUTOLBMXDDTRRT-JGVFFNPUSA-N (4R,5S)-dethiobiotin Chemical compound C[C@@H]1NC(=O)N[C@@H]1CCCCCC(O)=O AUTOLBMXDDTRRT-JGVFFNPUSA-N 0.000 description 1
- PHLXSNIEQIKENK-UHFFFAOYSA-N 2-[[2-[5-methyl-3-(trifluoromethyl)pyrazol-1-yl]acetyl]amino]-4,5,6,7-tetrahydro-1-benzothiophene-3-carboxamide Chemical compound CC1=CC(C(F)(F)F)=NN1CC(=O)NC1=C(C(N)=O)C(CCCC2)=C2S1 PHLXSNIEQIKENK-UHFFFAOYSA-N 0.000 description 1
- IVLXQGJVBGMLRR-UHFFFAOYSA-N 2-aminoacetic acid;hydron;chloride Chemical compound Cl.NCC(O)=O IVLXQGJVBGMLRR-UHFFFAOYSA-N 0.000 description 1
- ANXOZVKTXWJGOY-UHFFFAOYSA-N 2-aminoethanone Chemical compound NC[C]=O ANXOZVKTXWJGOY-UHFFFAOYSA-N 0.000 description 1
- FWBHETKCLVMNFS-UHFFFAOYSA-N 4',6-Diamidino-2-phenylindole Chemical compound C1=CC(C(=N)N)=CC=C1C1=CC2=CC=C(C(N)=N)C=C2N1 FWBHETKCLVMNFS-UHFFFAOYSA-N 0.000 description 1
- 240000004507 Abelmoschus esculentus Species 0.000 description 1
- 241000352732 Acidithiobacillus ferridurans Species 0.000 description 1
- 241000321865 Acidithiobacillus ferrivorans Species 0.000 description 1
- BTYTYHBSJKQBQA-GCJQMDKQSA-N Ala-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)N)O BTYTYHBSJKQBQA-GCJQMDKQSA-N 0.000 description 1
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- 241000607305 Arctica Species 0.000 description 1
- OZNSCVPYWZRQPY-CIUDSAMLSA-N Arg-Asp-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OZNSCVPYWZRQPY-CIUDSAMLSA-N 0.000 description 1
- YFBGNGASPGRWEM-DCAQKATOSA-N Arg-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YFBGNGASPGRWEM-DCAQKATOSA-N 0.000 description 1
- CYXCAHZVPFREJD-LURJTMIESA-N Arg-Gly-Gly Chemical compound NC(=N)NCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O CYXCAHZVPFREJD-LURJTMIESA-N 0.000 description 1
- CRCCTGPNZUCAHE-DCAQKATOSA-N Arg-His-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 CRCCTGPNZUCAHE-DCAQKATOSA-N 0.000 description 1
- YLVGUOGAFAJMKP-JYJNAYRXSA-N Arg-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YLVGUOGAFAJMKP-JYJNAYRXSA-N 0.000 description 1
- MJIJBEYEHBKTIM-BYULHYEWSA-N Asn-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N MJIJBEYEHBKTIM-BYULHYEWSA-N 0.000 description 1
- XYBJLTKSGFBLCS-QXEWZRGKSA-N Asp-Arg-Val Chemical compound NC(N)=NCCC[C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CC(O)=O XYBJLTKSGFBLCS-QXEWZRGKSA-N 0.000 description 1
- SPWXXPFDTMYTRI-IUKAMOBKSA-N Asp-Ile-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SPWXXPFDTMYTRI-IUKAMOBKSA-N 0.000 description 1
- WQSXAPPYLGNMQL-IHRRRGAJSA-N Asp-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N WQSXAPPYLGNMQL-IHRRRGAJSA-N 0.000 description 1
- WMLFFCRUSPNENW-ZLUOBGJFSA-N Asp-Ser-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O WMLFFCRUSPNENW-ZLUOBGJFSA-N 0.000 description 1
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 1
- 108010083946 Asp-Tyr-Leu-Lys Proteins 0.000 description 1
- BPAUXFVCSYQDQX-JRQIVUDYSA-N Asp-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(=O)O)N)O BPAUXFVCSYQDQX-JRQIVUDYSA-N 0.000 description 1
- 102100021277 Beta-secretase 2 Human genes 0.000 description 1
- 101710150190 Beta-secretase 2 Proteins 0.000 description 1
- 101800001415 Bri23 peptide Proteins 0.000 description 1
- 102400000107 C-terminal peptide Human genes 0.000 description 1
- 101800000655 C-terminal peptide Proteins 0.000 description 1
- 102000003846 Carbonic anhydrases Human genes 0.000 description 1
- 108090000209 Carbonic anhydrases Proteins 0.000 description 1
- 108090000317 Chymotrypsin Proteins 0.000 description 1
- 206010010071 Coma Diseases 0.000 description 1
- 241000061196 Comamonadaceae bacterium Species 0.000 description 1
- 238000011537 Coomassie blue staining Methods 0.000 description 1
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 1
- NRVQLLDIJJEIIZ-VZFHVOOUSA-N Cys-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CS)N)O NRVQLLDIJJEIIZ-VZFHVOOUSA-N 0.000 description 1
- 102100025698 Cytosolic carboxypeptidase 4 Human genes 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 241000701867 Enterobacteria phage T7 Species 0.000 description 1
- OTMSDBZUPAUEDD-UHFFFAOYSA-N Ethane Chemical compound CC OTMSDBZUPAUEDD-UHFFFAOYSA-N 0.000 description 1
- 241001018496 Ferrovum Species 0.000 description 1
- 241001009037 Gallionellaceae bacterium Species 0.000 description 1
- 108700023863 Gene Components Proteins 0.000 description 1
- WQWMZOIPXWSZNE-WDSKDSINSA-N Gln-Asp-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O WQWMZOIPXWSZNE-WDSKDSINSA-N 0.000 description 1
- COYGBRTZEVWZBW-XKBZYTNZSA-N Gln-Cys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CCC(N)=O COYGBRTZEVWZBW-XKBZYTNZSA-N 0.000 description 1
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 1
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 1
- LURQDGKYBFWWJA-MNXVOIDGSA-N Gln-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)N)N LURQDGKYBFWWJA-MNXVOIDGSA-N 0.000 description 1
- WIMVKDYAKRAUCG-IHRRRGAJSA-N Gln-Tyr-Glu Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O WIMVKDYAKRAUCG-IHRRRGAJSA-N 0.000 description 1
- NLKVNZUFDPWPNL-YUMQZZPRSA-N Glu-Arg-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O NLKVNZUFDPWPNL-YUMQZZPRSA-N 0.000 description 1
- NTBDVNJIWCKURJ-ACZMJKKPSA-N Glu-Asp-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NTBDVNJIWCKURJ-ACZMJKKPSA-N 0.000 description 1
- ZWABFSSWTSAMQN-KBIXCLLPSA-N Glu-Ile-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O ZWABFSSWTSAMQN-KBIXCLLPSA-N 0.000 description 1
- VGUYMZGLJUJRBV-YVNDNENWSA-N Glu-Ile-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O VGUYMZGLJUJRBV-YVNDNENWSA-N 0.000 description 1
- JZJGEKDPWVJOLD-QEWYBTABSA-N Glu-Phe-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JZJGEKDPWVJOLD-QEWYBTABSA-N 0.000 description 1
- PUUYVMYCMIWHFE-BQBZGAKWSA-N Gly-Ala-Arg Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PUUYVMYCMIWHFE-BQBZGAKWSA-N 0.000 description 1
- QIZJOTQTCAGKPU-KWQFWETISA-N Gly-Ala-Tyr Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 QIZJOTQTCAGKPU-KWQFWETISA-N 0.000 description 1
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 1
- GDOZQTNZPCUARW-YFKPBYRVSA-N Gly-Gly-Glu Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O GDOZQTNZPCUARW-YFKPBYRVSA-N 0.000 description 1
- OLPPXYMMIARYAL-QMMMGPOBSA-N Gly-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)CN OLPPXYMMIARYAL-QMMMGPOBSA-N 0.000 description 1
- ORXZVPZCPMKHNR-IUCAKERBSA-N Gly-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CNC=N1 ORXZVPZCPMKHNR-IUCAKERBSA-N 0.000 description 1
- CQMFNTVQVLQRLT-JHEQGTHGSA-N Gly-Thr-Gln Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O CQMFNTVQVLQRLT-JHEQGTHGSA-N 0.000 description 1
- UIQGJYUEQDOODF-KWQFWETISA-N Gly-Tyr-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 UIQGJYUEQDOODF-KWQFWETISA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 101100494141 Haliangium ochraceum (strain DSM 14365 / JCM 11303 / SMP-2) Hoch_5814 gene Proteins 0.000 description 1
- 101100494139 Haliangium ochraceum (strain DSM 14365 / JCM 11303 / SMP-2) Hoch_5815 gene Proteins 0.000 description 1
- 241000918547 Halothece sp. PCC 7418 Species 0.000 description 1
- KYMUEAZVLPRVAE-GUBZILKMSA-N His-Asn-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KYMUEAZVLPRVAE-GUBZILKMSA-N 0.000 description 1
- QMUHTRISZMFKAY-MXAVVETBSA-N His-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N QMUHTRISZMFKAY-MXAVVETBSA-N 0.000 description 1
- LPBWRHRHEIYAIP-KKUMJFAQSA-N His-Tyr-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LPBWRHRHEIYAIP-KKUMJFAQSA-N 0.000 description 1
- 101000932590 Homo sapiens Cytosolic carboxypeptidase 4 Proteins 0.000 description 1
- 241000253370 Hydrogenophilales Species 0.000 description 1
- IDAHFEPYTJJZFD-PEFMBERDSA-N Ile-Asp-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N IDAHFEPYTJJZFD-PEFMBERDSA-N 0.000 description 1
- HGNUKGZQASSBKQ-PCBIJLKTSA-N Ile-Asp-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HGNUKGZQASSBKQ-PCBIJLKTSA-N 0.000 description 1
- PWDSHAAAFXISLE-SXTJYALSSA-N Ile-Ile-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O PWDSHAAAFXISLE-SXTJYALSSA-N 0.000 description 1
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 1
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 1
- OWSWUWDMSNXTNE-GMOBBJLQSA-N Ile-Pro-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N OWSWUWDMSNXTNE-GMOBBJLQSA-N 0.000 description 1
- YWCJXQKATPNPOE-UKJIMTQDSA-N Ile-Val-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N YWCJXQKATPNPOE-UKJIMTQDSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-N L-arginine Chemical compound OC(=O)[C@@H](N)CCCN=C(N)N ODKSFYDXXFIFQN-BYPYZUCNSA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N Leu-Ala-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 1
- LAGPXKYZCCTSGQ-JYJNAYRXSA-N Leu-Glu-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LAGPXKYZCCTSGQ-JYJNAYRXSA-N 0.000 description 1
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 1
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 1
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 1
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- AAORVPFVUIHEAB-YUMQZZPRSA-N Lys-Asp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O AAORVPFVUIHEAB-YUMQZZPRSA-N 0.000 description 1
- WTZUSCUIVPVCRH-SRVKXCTJSA-N Lys-Gln-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N WTZUSCUIVPVCRH-SRVKXCTJSA-N 0.000 description 1
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 1
- LCMWVZLBCUVDAZ-IUCAKERBSA-N Lys-Gly-Glu Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CCC([O-])=O LCMWVZLBCUVDAZ-IUCAKERBSA-N 0.000 description 1
- SLQJJFAVWSZLBL-BJDJZHNGSA-N Lys-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN SLQJJFAVWSZLBL-BJDJZHNGSA-N 0.000 description 1
- RIJCHEVHFWMDKD-SRVKXCTJSA-N Lys-Lys-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RIJCHEVHFWMDKD-SRVKXCTJSA-N 0.000 description 1
- YXPJCVNIDDKGOE-MELADBBJSA-N Lys-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N)C(=O)O YXPJCVNIDDKGOE-MELADBBJSA-N 0.000 description 1
- CAVRAQIDHUPECU-UVOCVTCTSA-N Lys-Thr-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAVRAQIDHUPECU-UVOCVTCTSA-N 0.000 description 1
- XGZDDOKIHSYHTO-SZMVWBNQSA-N Lys-Trp-Glu Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 XGZDDOKIHSYHTO-SZMVWBNQSA-N 0.000 description 1
- BWECSLVQIWEMSC-IHRRRGAJSA-N Lys-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCCN)N BWECSLVQIWEMSC-IHRRRGAJSA-N 0.000 description 1
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 1
- QGQGAIBGTUJRBR-NAKRPEOUSA-N Met-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCSC QGQGAIBGTUJRBR-NAKRPEOUSA-N 0.000 description 1
- OBVHKUFUDCPZDW-JYJNAYRXSA-N Met-Arg-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OBVHKUFUDCPZDW-JYJNAYRXSA-N 0.000 description 1
- NCVJJAJVWILAGI-SRVKXCTJSA-N Met-Gln-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N NCVJJAJVWILAGI-SRVKXCTJSA-N 0.000 description 1
- YORIKIDJCPKBON-YUMQZZPRSA-N Met-Glu-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YORIKIDJCPKBON-YUMQZZPRSA-N 0.000 description 1
- CULGJGUDIJATIP-STQMWFEESA-N Met-Tyr-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 CULGJGUDIJATIP-STQMWFEESA-N 0.000 description 1
- QAVZUKIPOMBLMC-AVGNSLFASA-N Met-Val-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C QAVZUKIPOMBLMC-AVGNSLFASA-N 0.000 description 1
- LBSWWNKMVPAXOI-GUBZILKMSA-N Met-Val-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O LBSWWNKMVPAXOI-GUBZILKMSA-N 0.000 description 1
- ZOKXTWBITQBERF-UHFFFAOYSA-N Molybdenum Chemical compound [Mo] ZOKXTWBITQBERF-UHFFFAOYSA-N 0.000 description 1
- 101001033003 Mus musculus Granzyme F Proteins 0.000 description 1
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 1
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 1
- MPFGIYLYWUCSJG-AVGNSLFASA-N Phe-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MPFGIYLYWUCSJG-AVGNSLFASA-N 0.000 description 1
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 1
- 229920000388 Polyphosphate Polymers 0.000 description 1
- APKRGYLBSCWJJP-FXQIFTODSA-N Pro-Ala-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O APKRGYLBSCWJJP-FXQIFTODSA-N 0.000 description 1
- LQZZPNDMYNZPFT-KKUMJFAQSA-N Pro-Gln-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LQZZPNDMYNZPFT-KKUMJFAQSA-N 0.000 description 1
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 1
- MCWHYUWXVNRXFV-RWMBFGLXSA-N Pro-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 MCWHYUWXVNRXFV-RWMBFGLXSA-N 0.000 description 1
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 1
- GMJDSFYVTAMIBF-FXQIFTODSA-N Pro-Ser-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GMJDSFYVTAMIBF-FXQIFTODSA-N 0.000 description 1
- FZXSYIPVAFVYBH-KKUMJFAQSA-N Pro-Tyr-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O FZXSYIPVAFVYBH-KKUMJFAQSA-N 0.000 description 1
- 241000611429 Prochlorococcus marinus subsp. pastoris str. CCMP1986 Species 0.000 description 1
- OFOBLEOULBTSOW-UHFFFAOYSA-N Propanedioic acid Natural products OC(=O)CC(O)=O OFOBLEOULBTSOW-UHFFFAOYSA-N 0.000 description 1
- 229940124158 Protease/peptidase inhibitor Drugs 0.000 description 1
- 101100230601 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) HBT1 gene Proteins 0.000 description 1
- 101100544819 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) YPT31 gene Proteins 0.000 description 1
- JPIDMRXXNMIVKY-VZFHVOOUSA-N Ser-Ala-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPIDMRXXNMIVKY-VZFHVOOUSA-N 0.000 description 1
- DBIDZNUXSLXVRG-FXQIFTODSA-N Ser-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N DBIDZNUXSLXVRG-FXQIFTODSA-N 0.000 description 1
- GZSZPKSBVAOGIE-CIUDSAMLSA-N Ser-Lys-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O GZSZPKSBVAOGIE-CIUDSAMLSA-N 0.000 description 1
- GYDFRTRSSXOZCR-ACZMJKKPSA-N Ser-Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GYDFRTRSSXOZCR-ACZMJKKPSA-N 0.000 description 1
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 1
- PCMZJFMUYWIERL-ZKWXMUAHSA-N Ser-Val-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PCMZJFMUYWIERL-ZKWXMUAHSA-N 0.000 description 1
- JBHMLZSKIXMVFS-XVSYOHENSA-N Thr-Asn-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JBHMLZSKIXMVFS-XVSYOHENSA-N 0.000 description 1
- WLDUCKSCDRIVLJ-NUMRIWBASA-N Thr-Gln-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O WLDUCKSCDRIVLJ-NUMRIWBASA-N 0.000 description 1
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 1
- WBCCCPZIJIJTSD-TUBUOCAGSA-N Thr-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H]([C@@H](C)O)N WBCCCPZIJIJTSD-TUBUOCAGSA-N 0.000 description 1
- QHUWWSQZTFLXPQ-FJXKBIBVSA-N Thr-Met-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O QHUWWSQZTFLXPQ-FJXKBIBVSA-N 0.000 description 1
- YGZWVPBHYABGLT-KJEVXHAQSA-N Thr-Pro-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YGZWVPBHYABGLT-KJEVXHAQSA-N 0.000 description 1
- RTAQQCXQSZGOHL-UHFFFAOYSA-N Titanium Chemical compound [Ti] RTAQQCXQSZGOHL-UHFFFAOYSA-N 0.000 description 1
- NXJZCPKZIKTYLX-XEGUGMAKSA-N Trp-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N NXJZCPKZIKTYLX-XEGUGMAKSA-N 0.000 description 1
- 108090000631 Trypsin Proteins 0.000 description 1
- 102000004142 Trypsin Human genes 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- JAGGEZACYAAMIL-CQDKDKBSSA-N Tyr-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=C(C=C1)O)N JAGGEZACYAAMIL-CQDKDKBSSA-N 0.000 description 1
- VTCKHZJKWQENKX-KBPBESRZSA-N Tyr-Lys-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O VTCKHZJKWQENKX-KBPBESRZSA-N 0.000 description 1
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 1
- AOIZTZRWMSPPAY-KAOXEZKKSA-N Tyr-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)O AOIZTZRWMSPPAY-KAOXEZKKSA-N 0.000 description 1
- GOPQNCQSXBJAII-ULQDDVLXSA-N Tyr-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N GOPQNCQSXBJAII-ULQDDVLXSA-N 0.000 description 1
- SVFRYKBZHUGKLP-QXEWZRGKSA-N Val-Met-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SVFRYKBZHUGKLP-QXEWZRGKSA-N 0.000 description 1
- CEKSLIVSNNGOKH-KZVJFYERSA-N Val-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](C(C)C)N)O CEKSLIVSNNGOKH-KZVJFYERSA-N 0.000 description 1
- HTONZBWRYUKUKC-RCWTZXSCSA-N Val-Thr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HTONZBWRYUKUKC-RCWTZXSCSA-N 0.000 description 1
- 238000011481 absorbance measurement Methods 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 239000002671 adjuvant Substances 0.000 description 1
- 108010070783 alanyltyrosine Proteins 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 125000000129 anionic group Chemical group 0.000 description 1
- 230000000890 antigenic effect Effects 0.000 description 1
- 230000003078 antioxidant effect Effects 0.000 description 1
- 239000007864 aqueous solution Substances 0.000 description 1
- 229960003589 arginine hydrochloride Drugs 0.000 description 1
- 239000011230 binding agent Substances 0.000 description 1
- 238000005842 biochemical reaction Methods 0.000 description 1
- 230000002715 bioenergetic effect Effects 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 239000012620 biological material Substances 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 210000000234 capsid Anatomy 0.000 description 1
- 239000002775 capsule Substances 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 239000003054 catalyst Substances 0.000 description 1
- 101150036922 ccmK2 gene Proteins 0.000 description 1
- 101150108476 ccmL gene Proteins 0.000 description 1
- 239000013592 cell lysate Substances 0.000 description 1
- 239000006285 cell suspension Substances 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 229960002376 chymotrypsin Drugs 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 239000000470 constituent Substances 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000013270 controlled release Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 229910052802 copper Inorganic materials 0.000 description 1
- 239000010949 copper Substances 0.000 description 1
- 230000008878 coupling Effects 0.000 description 1
- 238000010168 coupling process Methods 0.000 description 1
- 238000005859 coupling reaction Methods 0.000 description 1
- 238000000604 cryogenic transmission electron microscopy Methods 0.000 description 1
- 101150022417 csoS2 gene Proteins 0.000 description 1
- 101150079192 csoS3 gene Proteins 0.000 description 1
- 101150090418 csoS4A gene Proteins 0.000 description 1
- 230000001086 cytosolic effect Effects 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000000326 densitometry Methods 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 229910003460 diamond Inorganic materials 0.000 description 1
- 239000010432 diamond Substances 0.000 description 1
- 239000007884 disintegrant Substances 0.000 description 1
- 238000010494 dissociation reaction Methods 0.000 description 1
- 230000005593 dissociations Effects 0.000 description 1
- 238000004090 dissolution Methods 0.000 description 1
- 239000003937 drug carrier Substances 0.000 description 1
- 238000012377 drug delivery Methods 0.000 description 1
- 238000000635 electron micrograph Methods 0.000 description 1
- 239000012149 elution buffer Substances 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 238000003028 enzyme activity measurement method Methods 0.000 description 1
- 101150081045 eutM gene Proteins 0.000 description 1
- 101150016941 eutN gene Proteins 0.000 description 1
- 239000000945 filler Substances 0.000 description 1
- 238000002073 fluorescence micrograph Methods 0.000 description 1
- 239000013505 freshwater Substances 0.000 description 1
- 238000007306 functionalization reaction Methods 0.000 description 1
- LYQGMALGKYWNIU-UHFFFAOYSA-K gadolinium(3+);triacetate Chemical compound [Gd+3].CC([O-])=O.CC([O-])=O.CC([O-])=O LYQGMALGKYWNIU-UHFFFAOYSA-K 0.000 description 1
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 235000021472 generally recognized as safe Nutrition 0.000 description 1
- 239000003979 granulating agent Substances 0.000 description 1
- 229960001867 guaiacol Drugs 0.000 description 1
- 238000003306 harvesting Methods 0.000 description 1
- 238000010438 heat treatment Methods 0.000 description 1
- 125000004435 hydrogen atom Chemical group [H]* 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 239000000411 inducer Substances 0.000 description 1
- 230000006698 induction Effects 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 238000011090 industrial biotechnology method and process Methods 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 238000009434 installation Methods 0.000 description 1
- 230000009878 intermolecular interaction Effects 0.000 description 1
- 238000004255 ion exchange chromatography Methods 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 1
- 125000001449 isopropyl group Chemical group [H]C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 1
- 238000003367 kinetic assay Methods 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 230000033001 locomotion Effects 0.000 description 1
- 239000000314 lubricant Substances 0.000 description 1
- 108010054155 lysyllysine Proteins 0.000 description 1
- 108010038320 lysylphenylalanine Proteins 0.000 description 1
- VZCYOOQTPOCHFL-UPHRSURJSA-N maleic acid Chemical compound OC(=O)\C=C/C(O)=O VZCYOOQTPOCHFL-UPHRSURJSA-N 0.000 description 1
- 239000011976 maleic acid Substances 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 230000003340 mental effect Effects 0.000 description 1
- 238000012269 metabolic engineering Methods 0.000 description 1
- 238000006241 metabolic reaction Methods 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 238000000386 microscopy Methods 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 238000001823 molecular biology technique Methods 0.000 description 1
- 229910052750 molybdenum Inorganic materials 0.000 description 1
- 239000011733 molybdenum Substances 0.000 description 1
- 230000035772 mutation Effects 0.000 description 1
- 239000002086 nanomaterial Substances 0.000 description 1
- 239000013642 negative control Substances 0.000 description 1
- 229910052757 nitrogen Inorganic materials 0.000 description 1
- 238000005580 one pot reaction Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000007911 parenteral administration Methods 0.000 description 1
- 101150056273 pduA gene Proteins 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 1
- 102000013415 peroxidase activity proteins Human genes 0.000 description 1
- 108040007629 peroxidase activity proteins Proteins 0.000 description 1
- 239000000546 pharmaceutical excipient Substances 0.000 description 1
- 230000000144 pharmacologic effect Effects 0.000 description 1
- 230000035479 physiological effects, processes and functions Effects 0.000 description 1
- 239000000049 pigment Substances 0.000 description 1
- 230000037039 plant physiology Effects 0.000 description 1
- 229920000729 poly(L-lysine) polymer Polymers 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 239000001205 polyphosphate Substances 0.000 description 1
- 235000011176 polyphosphates Nutrition 0.000 description 1
- 238000012805 post-processing Methods 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 239000003755 preservative agent Substances 0.000 description 1
- 238000007639 printing Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 108010079317 prolyl-tyrosine Proteins 0.000 description 1
- 238000011321 prophylaxis Methods 0.000 description 1
- 230000001681 protective effect Effects 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 239000012460 protein solution Substances 0.000 description 1
- 230000004850 protein–protein interaction Effects 0.000 description 1
- 239000002510 pyrogen Substances 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 238000004064 recycling Methods 0.000 description 1
- 108010054624 red fluorescent protein Proteins 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 239000011347 resin Substances 0.000 description 1
- 229920005989 resin Polymers 0.000 description 1
- 238000007480 sanger sequencing Methods 0.000 description 1
- 238000009738 saturating Methods 0.000 description 1
- 230000009919 sequestration Effects 0.000 description 1
- 239000010420 shell particle Substances 0.000 description 1
- 229910052709 silver Inorganic materials 0.000 description 1
- 239000004332 silver Substances 0.000 description 1
- QZRSVBDWRWTHMT-UHFFFAOYSA-M silver;3-carboxy-3,5-dihydroxy-5-oxopentanoate Chemical compound [Ag+].OC(=O)CC(O)(C([O-])=O)CC(O)=O QZRSVBDWRWTHMT-UHFFFAOYSA-M 0.000 description 1
- 230000006641 stabilisation Effects 0.000 description 1
- 238000011105 stabilization Methods 0.000 description 1
- 239000003381 stabilizer Substances 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 230000004797 therapeutic response Effects 0.000 description 1
- 230000001988 toxicity Effects 0.000 description 1
- 231100000419 toxicity Toxicity 0.000 description 1
- VZCYOOQTPOCHFL-UHFFFAOYSA-N trans-butenedioic acid Natural products OC(=O)C=CC(O)=O VZCYOOQTPOCHFL-UHFFFAOYSA-N 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 239000012588 trypsin Substances 0.000 description 1
- 125000000430 tryptophan group Chemical class [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C2=C([H])C([H])=C([H])C([H])=C12 0.000 description 1
- 230000007306 turnover Effects 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 108010073969 valyllysine Proteins 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
- A61K39/385—Haptens or antigens, bound to carriers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/32—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Bacillus (G)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/80—Vectors or expression systems specially adapted for eukaryotic hosts for fungi
- C12N15/81—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
- A61K2039/51—Medicinal preparations containing antigens or antibodies comprising whole cells, viruses or DNA/RNA
- A61K2039/525—Virus
- A61K2039/5256—Virus expressing foreign proteins
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
- A61K2039/51—Medicinal preparations containing antigens or antibodies comprising whole cells, viruses or DNA/RNA
- A61K2039/525—Virus
- A61K2039/5258—Virus-like particles
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/30—Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- General Engineering & Computer Science (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- Biophysics (AREA)
- Mycology (AREA)
- Molecular Biology (AREA)
- Plant Pathology (AREA)
- Physics & Mathematics (AREA)
- Medicinal Chemistry (AREA)
- Pharmacology & Pharmacy (AREA)
- Gastroenterology & Hepatology (AREA)
- Veterinary Medicine (AREA)
- Immunology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Public Health (AREA)
- Animal Behavior & Ethology (AREA)
- Epidemiology (AREA)
- Peptides Or Proteins (AREA)
- Medicinal Preparation (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
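The present invention relates to a method of producing a bacterial microcompartment virus-like particle (VLP) carrying a cargo molecule, the method comprising introducing into a host cell or organism, and expressing, one or more polynucleotides comprising (a) a first sequence encoding bacterial microcompartment shell protomers and a second sequence encoding a cargo molecule fused to an encapsulation peptide comprising the sequence SKITGSSGNDTQGSLITYSGGARG, so as to form a microcompartment encapsulating the cargo molecule, or (b) a first sequence encoding bacterial microcompartment shell protomers and a second sequence encoding at least one of said protomers fused to a cargo molecule or a biochemical tag, so as to form a microcompartment displaying the cargo molecule or the biochemical tag on its outer surface. In one embodiment, the bacterial microcompartment protomers comprise CsoS1A and CsoS4A from Halothiobacillus neapolitanus, or HO-H, HO-P and HO-T1 from Haliangium ochraceum.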
Description
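The present invention relates to bacterial microcompartment virus-like particles (VLPs) carrying cargo molecules, methods of producing bacterial microcompartment VLPs, isolated plasmid or vector nucleic acids used to produce the VLPs, compositions comprising at least one of said VLPs, uses of said VLPs, and methods of treatment using said VLPs.
Bacterial microcompartments (BMCs) are protein shells found in some species of bacteria and are thought to have evolved as a strategy to compartmentalize certain challenging biochemical reactions [see Kerfeld, C. A., et al., Nature Reviews Microbiology 16: 277-290 (2018)]. These protein complexes consist of hundreds to thousands of polypeptide subunits that self-assemble into polyhedral structures ranging from 40 nm to 400 nm in diameter. Carboxysomes, found in cyanobacteria and some chemotrophic bacterial species, are the earliest known examples of BMCs. This protein shell encapsulates ribulose-1,5-bisphosphate carboxylase (RuBisCO) and improves its catalytic efficiency by concentrating its substrates, CO2 and ribulose-1,5-bisphosphate, in the vicinity of RuBisCO. Carboxysomes are classified into two major groups according to the class of RuBisCO they encapsulate. Alpha-carboxysomes contain Form 1A RuBisCO, found in α-cyanobacteria (typically marine cyanobacteria) and chemoautotrophs, whereas beta-carboxysomes house Form 1B RuBisCO, observed in β-cyanobacteria (typically freshwater cyanobacteria) [see Turmo, A., et al., FEMS Microbiol Lett 364: (2017)].
Atomic-scale structures of many of the subunits that make up BMC shells have been reported in recent years, together with the structures of three intact shells: a reduced-component beta-carboxysome shell from Halothece sp. PCC 7418, a synthetic glycyl-radical-associated BMC group 2 (GRM2) shell from Klebsiella pneumoniae, and a BMC of undetermined function from Haliangium ochraceum (HO-BMC) [see Kalnins, G., et al., Nature Communications 11: 388 (2020); Sutter, M., et al., Science 356(6344): 1293-1297 (2017); Sutter, M., et al., Plant Physiology (2019)]. Despite the diversity in appearance and function of BMCs, the tertiary structures of the main building blocks are conserved. BMC-H domain proteins (pfam00936) are stoichiometrically the major module and form homo-hexamers with C6 geometry. BMC-T proteins are formed by a tandem repeat of two BMC-H domains and assemble as trimers, or as a double stack of appressed trimers, with pseudohexagonal symmetry. BMC-P domain units (pfam03319) are a minor but crucial module of the BMC shell complex. BMC-P protomers assemble into homo-pentamers with pyramidal geometry that occupy the vertices of the shell, which gives BMCs their polyhedral appearance. A detailed molecular understanding of the structures of the components has contributed to the field of BMC shell engineering. These efforts include targeting heterologous protein cargo to the shell lumen, either by using encapsulation peptides (EPs), which are short peptide sequences derived from cognate lumenal proteins, or by protein engineering of the shell components [see Lawrence, A. D., et al., ACS Synthetic Biology 3: 454-465 (2014)]. These modifications aim to repurpose BMCs as intracellular nanoreactors or as scaffolds for the delivery of biomolecules.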
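The alpha-carboxysome of Halothiobacillus neapolitanus has previously been produced in Escherichia coli by transplanting the entire alpha-carboxysome operon (cso) (Figure 1) into the recombinant host [see Bonacci, W., et al. PNAS 109: 478-483 (2012)]. The genes found in the operon include three BMC-H paralogs (csoS1ABC), one BMC-T protein (csoS1D) and two BMC-P paralogs (csoS4AB), together with genes encoding the RuBisCO large and small subunits (cbbLS), a carbonic anhydrase (csoS3/SCA) and an intrinsically disordered protein (IDP), csoS2. This IDP is known to be important for alpha-carboxysome assembly because it promotes interactions between the shell and the lumenal proteins [see Cai, F. et al. Life (Basel, Switzerland) 5: 1141-1171 (2015)]. BMCs devoid of their native cargo are better suited to engineered applications, because heterologous cargo can then be packaged into the lumen more efficiently. However, despite decades of research into its structure and biochemical processes, the H. neapolitanus alpha-carboxysome has not been recombinantly expressed in a structurally closed form with fewer than the ten genes described above [see Bonacci, W., et al. PNAS 109: 478-483 (2012)].
There is a need to provide bacterial microcompartment virus-like particles with improved production efficiency in recombinant bacterial and yeast hosts, and alternative methods of encapsulating cargo molecules and/or presenting them on the surface.
Surprisingly, it was found that BMC VLPs can be formed using only two or three types of BMC protomers from Halothiobacillus neapolitanus and Haliangium ochraceum, respectively. The H. neapolitanus BMC VLP is referred to as Cso-BMC, and the H. ochraceum BMC VLP is referred to as HO-BMC. In addition, cargo molecules can be encapsulated within Cso-BMCs using a novel short peptide derived from CsoS2, referred to as S2CP, or a variant thereof referred to as S2CP(30). Cso-BMCs encapsulating cargo molecules had a well-defined shell morphology that was not observed in Cso-BMCs without encapsulated cargo. Notably, the peptide termini of the protomers of both shells face outward, which allows genetic fusion of proteins of interest. Cargo molecules can therefore be displayed on the outer surface of the BMC VLPs of the invention, either by expressing them fused to a terminus of a protomer or via a cargo molecule that carries a binding partner complementary to a biochemical tag attached to a protomer terminus.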
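According to a first aspect, the present invention provides a method of producing a bacterial microcompartment virus-like particle (VLP) carrying a cargo molecule, the method comprising:
A) introducing into a host cell or organism one or more heterologous polynucleotides comprising (i) a first sequence encoding bacterial microcompartment shell protomers; and (ii) a second sequence encoding a cargo molecule fused to an encapsulation peptide, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO: 1 (SKITGSSGNDTQGSLITYSGGARG) or SEQ ID NO: 94 (KPEKPGSKITGSSGNDTQGSLITYSGGARG), or a functional variant thereof;
a) expressing the first and second sequences, and
b) forming a microcompartment encapsulating the cargo molecule; or
B) introducing into a host cell or organism one or more polynucleotides comprising (i) a first sequence encoding bacterial microcompartment shell protomers; and (ii) a second sequence encoding at least one of said protomers fused to a cargo molecule or a biochemical tag;
a) expressing the first and second sequences; and
b) forming a microcompartment that displays the cargo molecule on its outer surface, or forming a microcompartment that displays on its outer surface a biochemical tag to which a cargo molecule comprising a complementary tag can bind.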
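In some embodiments, the functional variant of the encapsulation peptide set forth in SEQ ID NO: 1 comprises, at its amino terminus, 1, 2, 3, 4 or 5 of the additional amino acids found at the amino terminus of SEQ ID NO: 94. For example, a variant of the encapsulation peptide of SEQ ID NO: 1 may comprise 'G', 'PG', 'KPG' and so on at its amino terminus and retain function. Such variants are intermediates between the sequences of SEQ ID NO: 1 and SEQ ID NO: 94.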
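The intermediate variants described above can be enumerated mechanically. The following Python sketch is illustrative only: the function name `n_terminal_intermediates` is hypothetical and not part of the disclosure, and it assumes that each variant is formed by prepending to SEQ ID NO: 1 the residues of SEQ ID NO: 94 that lie immediately upstream of it. Whether a given variant retains encapsulation function would still have to be established experimentally.

```python
# Sequences as given in the specification.
SEQ_ID_1 = "SKITGSSGNDTQGSLITYSGGARG"          # S2CP, 24 residues
SEQ_ID_94 = "KPEKPGSKITGSSGNDTQGSLITYSGGARG"   # S2CP(30), 30 residues


def n_terminal_intermediates(short: str, long: str) -> list[str]:
    """Return the sequences strictly between `short` and `long`.

    Assumption (for illustration): `long` equals `short` plus an
    N-terminal extension, and each intermediate keeps the last k
    residues of that extension, for k = 1 .. len(extension) - 1.
    """
    if not long.endswith(short) or len(long) <= len(short):
        raise ValueError("long must be short with an N-terminal extension")
    extension = long[: len(long) - len(short)]  # "KPEKPG" for the pair above
    return [extension[-k:] + short for k in range(1, len(extension))]


if __name__ == "__main__":
    for variant in n_terminal_intermediates(SEQ_ID_1, SEQ_ID_94):
        added = len(variant) - len(SEQ_ID_1)
        print(f"+{added} residue(s): {variant}")
```

For the two sequences above, the extension is KPEKPG, so the sketch yields five intermediates with the N-terminal additions G, PG, KPG, EKPG and PEKPG.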
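In some embodiments, owing to the redundancy of the genetic code, the encapsulation peptide is encoded by a polynucleotide sequence having at least 80% identity, at least 85% identity, at least 90% identity, at least 95% identity or 100% identity to the nucleic acid sequence set forth in SEQ ID NO: 7 or SEQ ID NO: 95 (S2CP(30)), respectively.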
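As an illustration of how such an identity threshold might be checked, the following Python sketch computes percent identity between two nucleotide sequences. It is a simplification under stated assumptions: the sequences are taken to be already aligned and of equal length (a real comparison would first align them with a dedicated tool), the helper names are hypothetical, and the example sequences are made up for illustration and are not SEQ ID NO: 7 or SEQ ID NO: 95.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions with identical bases.

    Assumes the two sequences are pre-aligned and of equal length.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity(seq_a: str, seq_b: str, threshold: float = 80.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    # Hypothetical example: two different codings of Ser-Lys-Ile.
    # Synonymous codons differ, so only 4 of the 9 positions match.
    coding_1 = "TCTAAAATT"
    coding_2 = "AGCAAGATC"
    print(round(percent_identity(coding_1, coding_2), 1))
    print(meets_identity(coding_1, coding_2, threshold=80.0))
```

The example shows why claims of this kind are framed as identity ranges: because of codon redundancy, two polynucleotides can encode the same peptide while differing at many nucleotide positions.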
In some embodiments, the bacterial microcompartment protomers are derived from Halothiobacillus neapolitanus or Haliangium ochraceum.
In some embodiments, the bacterial microcompartment protomers are CsoS1A (SEQ ID NO: 2) and CsoS4A (SEQ ID NO: 3) from Halothiobacillus neapolitanus; or HO-H (SEQ ID NO: 4), HO-P (SEQ ID NO: 5) and HO-T1 (SEQ ID NO: 6) from Haliangium ochraceum, and variants thereof.
In some embodiments, the cargo molecule is at least one peptide, such as an enzyme and/or a fluorescent protein and/or an immunogenic peptide.
In some embodiments, the biochemical tag may be selected from the group comprising Strep-Tag II (SII), the SpyCatcher/SpyTag (SC/ST) pair and the CC-Di-A/B (CCA/CCB) pair.
In some embodiments, expression of CsoS1A is regulated by promoter PT7; CsoS4A by promoter PCON5; HO-H by yeast promoter PTDH3; HO-P by yeast promoter PPYK1; and HO-T1 by yeast promoter PYEF3.
In some embodiments, the host organism is E. coli or S. cerevisiae.
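According to a second aspect, the present invention provides an engineered bacterial microcompartment VLP carrying a cargo molecule, comprising i) bacterial microcompartment shell protomers and a cargo molecule fused to an encapsulation peptide, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO: 1 (SKITGSSGNDTQGSLITYSGGARG) or SEQ ID NO: 94 (KPEKPGSKITGSSGNDTQGSLITYSGGARG), or a functional variant thereof; or ii) bacterial microcompartment shell protomers and a cargo molecule, wherein the cargo molecule is fused to a terminus of at least one of said protomers, or at least one of said protomers is fused to a tag and a cargo molecule comprising a complementary tag is bound to the outer surface of the VLP.
In some embodiments, the bacterial microcompartment protomers are derived from Halothiobacillus neapolitanus or Haliangium ochraceum.
In some embodiments, the bacterial microcompartment protomers are CsoS1A comprising the amino acid sequence set forth in SEQ ID NO: 2 and CsoS4A comprising the amino acid sequence set forth in SEQ ID NO: 3, from Halothiobacillus neapolitanus; or HO-H comprising the amino acid sequence set forth in SEQ ID NO: 4, HO-P comprising the amino acid sequence set forth in SEQ ID NO: 5 and HO-T1 comprising the amino acid sequence set forth in SEQ ID NO: 6, from Haliangium ochraceum, and variants thereof.
In some embodiments, the cargo molecule is at least one peptide, such as an enzyme and/or a fluorescent protein and/or an immunogenic peptide.
In some embodiments, the biochemical tag may be selected from the group comprising Strep-Tag II (SII), the SpyCatcher/SpyTag (SC/ST) pair and the CC-Di-A/B (CCA/CCB) pair.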
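According to a third aspect, the present invention provides an isolated plasmid or vector nucleic acid comprising:
a) first DNA sequences encoding bacterial microcompartment shell protomers, each operably linked to a promoter, and
b) a second DNA sequence, operably linked to a promoter, encoding a cargo molecule fused to an encapsulation peptide, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO: 1 (SKITGSSGNDTQGSLITYSGGARG) or SEQ ID NO: 94 (KPEKPGSKITGSSGNDTQGSLITYSGGARG), or a functional variant thereof; or
c) first DNA sequences encoding bacterial microcompartment shell protomers, each operably linked to a promoter, and
d) a second DNA sequence encoding at least one of said protomers fused to a cargo molecule or a biochemical tag.
In some embodiments, the isolated plasmid or vector nucleic acid comprises the bacterial microcompartment shell protomers, promoters, cargo molecules and tags as previously defined.
In some embodiments, owing to the redundancy of the genetic code, the DNA sequences of the isolated plasmid or vector nucleic acid that encode said bacterial microcompartment shell protomers, cargo molecules and tags set forth in SEQ ID NOs: 1-6 and 94 have at least 70%, at least 80%, at least 90% or 100% identity to the nucleic acid sequences set forth in SEQ ID NOs: 7-12 and 95 (S2CP(30)).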
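According to a fourth aspect, the present invention provides a composition or combination comprising at least one engineered VLP of any aspect of the invention, for use in a) the prophylaxis or treatment of a disease in a subject; or b) a biochemical process.
In some embodiments, the at least one engineered VLP comprises an enzyme for the conversion of a prodrug.
In some embodiments, the composition may comprise one or more additional therapeutic agents. The composition may be used as a vaccine.
According to a fifth aspect, the present invention provides the use of at least one engineered VLP of any aspect of the invention in the manufacture of a medicament for the prophylaxis or treatment of a disease in a subject.
According to a sixth aspect, the present invention provides a method of prophylaxis or treatment comprising administering an effective amount of an engineered VLP of any aspect of the invention to a subject in need of such treatment.
It will be understood that the present invention is not limited to the specific embodiments described in detail below.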
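Figure 1 shows a schematic of the alpha-carboxysome operon (cso) of Halothiobacillus neapolitanus. The dashed line represents the 10 genes between csoS1B and csoS1D that are unlikely to be associated with the carboxysome. Gene lengths and intervening distances are not drawn to scale.
Figures 2A-D show schematics of the plasmids developed. (A) The TU acceptor plasmid, pESX, contains a streptomycin selection marker (StrepR) together with a pUC origin of replication. The RFP cassette is replaced by the incoming TU through digestion with the restriction enzyme (RE) BsmBI. (B) The pathway acceptor plasmid, pCKH, receives TUs released from pESX plasmids by digestion with the RE BsaI. pCKH contains a kanamycin selection marker (KanR). (C and D) Modified HcKan_O plasmids that can attach an SII or His6 tag, or one of four FPs (mT2, meGFP, mKOK, mCh), at the N- or C-terminus of an ORF. After insertion of the ORF with BsaI, the ORF-tag fusion product (separated by a Gly-Ser-Ser linker) is released with BsmBI.
Figure 3 shows schematics of the VLP pathways generated. For Cso-BMC, all terminators used were TT7. The grayscale intensity of the promoter arrows symbolizes their relative strength: the darker the arrow, the stronger the promoter.
Figures 4A-B show the identification of a cargo-targeting peptide sequence for the alpha-carboxysome system. (A) Multiple sequence alignment of H. neapolitanus CsoS2 with its orthologs (the top nine from different genera are shown) indicates that the C-terminal region is highly conserved, as depicted by the sequence logo. (B) Pull-down assays of His6-meGFP-S2CP with the shell proteins CsoS1A-SII, SII-CsoS1D and CsoS4A-SII demonstrate that S2CP mediates interaction with CsoS1A-SII only.
Figures 5A-F show the expression and purification of alpha-carboxysome shell components. (A) Schematic of the synthetic operon used to express alpha-carboxysome components. Shell modules are also represented by geometric icons. CsoS4A: pentagon; CsoS1D: trimeric hexagon; CsoS1A: hexameric hexagon. (B) Fluorescence micrographs of cells expressing pathway Cso-PmChTHC. Co-localization of meGFP-S2CP and CsoS4A-mCherry can be seen. DIC: differential interference contrast channel. Scale bars (white, bottom right) represent 2 µm. (C-F) TEM visualization of purified protein shells in the 0.4 M NaCl elution fractions after AIEX purification for (C) Cso-PmChTHC, (D) Cso-PSIITHC, (E) Cso-PSIITH and (F) Cso-PSIIH. Scale bars (black, bottom right) represent 50 nm.
Figures 6A-D show that S2CP acts as an encapsulation peptide. (A) Schematic illustrating that encapsulation of UmuD1-40 protease-signal-tagged GFP by S2CP can protect it from the endogenous ClpXP protease. (B) S2CP was able to target UmuD1-40-meGFP to the lumen of the simplified carboxysome. Purified shells were subjected to Western blot analysis using an anti-GFP antibody. UmuD1-40-meGFP was detected only from Cso-PSIITHCU,S2CP and not from Cso-PSIITHCU. Electron micrographs show that the shells produced by (C) Cso-PSIITHCU,S2CP and (D) Cso-PSIITHCU are similar. Scale bars (black, bottom right) represent 50 nm.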
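Figures 7A-B show an atomic model of the simplified alpha-carboxysome shell. (A, B) Surface representations of the shell, with CsoS1A in gray and CsoS4A in light gray. The right and bottom arrows on light gray indicate the N- and C-termini of a CsoS4A monomer, respectively, and the top and right arrows on gray indicate the N- and C-termini of a CsoS1A monomer, respectively.
Figures 8A-B show fluorescence micrographs of the shell probes (A) CsoS4A-mCherry and (B) meGFP-S2CP used to investigate interactions between shell components in vivo. The probes were generally evenly distributed in the cytosol when expressed individually. Scale bars (bottom right) represent 2 µm.
Figures 9A-C show anion-exchange (AIEX) chromatograms for affinity-purified proteins from (A) the Cso-PmChTHC pathway construct, (B) Cso-PSIITHC and (C) CsoS4A-SII. Blue traces (left Y-axis) represent absorbance at 280 nm (mAU), and green traces (right Y-axis) represent the percentage of AIEX buffer B (50 mM Tris, 1.0 M NaCl, pH 7.9) used at the indicated elution volume. The TEM micrographs on the right are representative of the elution fractions obtained at 0.3 M NaCl. It can be seen that CsoS4A-SII expressed on its own does not form protein shells. Scale bars (bottom right) represent 50 nm.
Figures 10A-B show the purification of Cso-PSIITH. (A) AIEX chromatogram and (B) TEM micrograph of the elution fraction corresponding to 0.3 M NaCl. Scale bars (bottom right) represent 50 nm.
Figures 11A-E show the purification of Cso-PSIIH. (A) AIEX chromatogram for affinity-purified protein of Cso-PSIIH. (B) TEM micrograph of the elution fraction of Cso-PSIIH corresponding to 0.3 M NaCl. TEM micrographs of (C) CsoS1A-SII, (D) CsoS1A-SII co-expressed with CsoS1D and (E) CsoS4A-SII co-expressed with CsoS1D show that these combinations do not form protein shells. Scale bars (bottom right) represent 50 nm.
Figures 12A-D show sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis of fractions collected from AIEX purification of (A) Cso-PmChTHC, (B) PSIITHC, (C) PSIITH and (D) PSIIH. Arrows indicate the fractions used for TEM analysis; the left and right arrows correspond to [NaCl] = 0.3 M and 0.4 M, respectively. Protein ladder lanes are denoted L, with masses (kDa) indicated.
Figure 13 shows particle size distributions of protein shells measured by dynamic light scattering. (A) Cso-PmChTHC, (B) Cso-PSIITHC, (C) Cso-PSIITH and (D) Cso-PSIIH.
Figure 14 shows a table summarizing the differences between purified Cso-BMC and HO-BMC.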
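Figure 15 shows a magnified view of the Cso-BMC exterior, demonstrating that the N- and C-termini of the hexameric subunits (gray) and pentameric subunits (light gray) point away from the shell lumen. The N- and C-termini of selected hexamer and pentamer chains are indicated by arrows. The topology of the Cso-BMC hexameric and pentameric subunits shown is representative of HO-BMC.
Figure 16 shows densitometry analysis of Cso-PSIIH shells co-expressed with UmuD1-40-GFP-S2CP or UmuD1-40-GFP-S2CP(30). Approximately equal amounts of shell (judged by band peak area) were loaded per shell sample so that the relative amounts of UmuD1-40-GFP-S2CP and UmuD1-40-GFP-S2CP(30) could be compared directly. Arrows indicate UmuD1-40-GFP-S2CP and UmuD1-40-GFP-S2CP(30).
Figures 17A-D show an evaluation of the stability of Cso-BMC shells against common denaturing factors. (A-D) DLS spectra of empty Cso-BMCs tested under the indicated conditions. The baseline is shifted vertically by 0.2 for each subsequent spectrum so that all spectra can be viewed in one graph.
Figures 18A-E show the loading of the enzymes APEX2 and LacZ into Cso-BMC shells. (A-B) SDS-PAGE and Western blot analysis (using an anti-His6 antibody) of Cso-BMCs co-expressed with the enzymes. (C-D) TEM micrographs of enzyme-loaded Cso-BMCs. Scale bars (black, bottom right) represent 50 nm. (E) DLS spectra of Cso-BMCs co-expressed with the enzymes, using empty Cso-BMC shells as reference.
Figure 19 shows Michaelis-Menten kinetics of free and Cso-BMC-encapsulated APEX2 and LacZ enzymes.
Figures 20A-D show an evaluation of the stabilizing effect conferred by Cso-BMC on APEX2 and LacZ against denaturing conditions. Residual enzymatic activities of the free and encapsulated (+ shell) enzymes were obtained by normalizing activity to that of the pristine sample, given as (A) 23 °C, (B) 0% v/v methanol, (C) no freeze-thaw and (D) pH 8. Error bars represent one standard deviation of the mean.
Figures 21A-E show the purification of the HO-BMC shells HO-HTP and HO-HTSTP+GFP-SpyCatcher. (A-B) SDS-PAGE analysis of the purified shells; (C) Western blot analysis (using an anti-GFP antibody) showing the presence of GFP-SpyCatcher in the HO-HTSTP+GFP-SpyCatcher sample; (D-E) TEM micrographs of both HO-BMC constructs. Scale bars (black, bottom right) represent 50 nm.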
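The normalization described for Figures 20A-D (residual activity expressed relative to the pristine sample) can be sketched in Python as follows. This is a minimal illustration, not the analysis actually used: the function name and the replicate values are hypothetical, and it assumes that replicate activities are divided by the mean activity of the pristine replicates and that the error bar is the sample standard deviation of the normalized replicates.

```python
from statistics import mean, stdev


def residual_activity(treated: list[float], pristine: list[float]) -> tuple[float, float]:
    """Return (mean, standard deviation) of treated activity relative to pristine.

    Assumes at least two treated replicates and a non-zero pristine mean.
    """
    reference = mean(pristine)
    if reference == 0:
        raise ValueError("pristine activity must be non-zero")
    normalized = [value / reference for value in treated]
    return mean(normalized), stdev(normalized)


if __name__ == "__main__":
    # Hypothetical replicate activities in arbitrary units.
    pristine = [10.0, 10.4, 9.6]   # reference condition, e.g. 23 °C
    heated = [4.1, 3.8, 4.4]       # same enzyme after a denaturing treatment
    avg, sd = residual_activity(heated, pristine)
    print(f"residual activity: {avg:.2f} ± {sd:.2f}")
```

Running the same function on free and on shell-encapsulated enzyme gives two values on a common 0 to 1 scale, which is what makes the stabilizing effect directly comparable across conditions.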
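Bibliographic references mentioned in the present specification are, for convenience, listed in the form of a list of references added at the end of the examples. The entire contents of these bibliographic references are incorporated herein by reference.
Definitions
Certain terms used in the specification, examples and appended claims are collected here for convenience.
As used herein, the term "comprising" or "including" is to be interpreted as specifying the presence of the stated features, integers, steps or components as referred to, but does not preclude the presence or addition of one or more features, integers, steps or components, or groups thereof. However, in the context of the present disclosure, the term "comprising" or "including" also includes "consisting of". Variations of the word "comprising", such as "comprise" and "comprises", and of "including", such as "include" and "includes", have correspondingly varied meanings.
As used herein, the term Cso-PSIIH is used interchangeably with the term Cso-BMC.
As used herein, the term "variant" refers to an amino acid sequence that is altered by one or more amino acids but retains the ability to function as an encapsulation peptide in the present invention. A variant may have "conservative" changes, in which a substituted amino acid has similar structural or chemical properties (for example, replacement of leucine with isoleucine). More rarely, a variant may have "non-conservative" changes (for example, replacement of glycine with tryptophan). Analogous minor variations may also include amino acid deletions or insertions, or both. Guidance in determining which amino acid residues may be substituted, inserted or deleted without abolishing biological or immunological activity may be found using computer programs well known in the art, for example DNASTAR® software (DNASTAR, Inc., Madison, Wisconsin, USA). One type of variant is, for example, the peptide having the amino acid sequence set forth in SEQ ID NO: 94, which is longer than the sequence set forth in SEQ ID NO: 1, is also derived from CsoS2, and retains the encapsulation functionality of SEQ ID NO: 1. Other variants having amino acid sequences intermediate between SEQ ID NO: 1 and SEQ ID NO: 94 are expected to retain functionality.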
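The compositions or combinations of the invention will generally be administered as a pharmaceutical formulation in admixture with a pharmaceutically acceptable adjuvant, diluent, or carrier, which may be selected with regard to the intended route of administration and standard pharmaceutical practice. Such pharmaceutically acceptable carriers may be chemically inert to the active compounds and may have no detrimental side effects or toxicity under the conditions of use. Suitable pharmaceutical formulations may be found in, for example, [see: Remington: The Science and Practice of Pharmacy, 19th ed., Mack Printing Company, Easton, Pennsylvania (1995)]. For parenteral administration, a parenterally acceptable aqueous solution may be used that is pyrogen-free and has the requisite pH, isotonicity, and stability. Suitable solutions are well known to the skilled person, and numerous methods are described in the literature. A brief review of drug delivery methods may be found in, for example, [see: Langer, Science 249: 1527 (1990)].
Otherwise, the preparation of suitable formulations may be achieved routinely by the skilled person using routine techniques and/or in accordance with standard and/or accepted pharmaceutical practice.
The amount of the composition or combination in any pharmaceutical formulation used in accordance with the present invention will depend on various factors, such as the severity of the condition to be treated, the particular patient to be treated, and the compound(s) used. In some embodiments, the BMC-VLP displays antigenic molecules on its surface and functions as a vaccine. In any event, the amount of the composition or combination in the formulation may be determined routinely by the skilled person.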
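For example, a solid oral composition such as a tablet or capsule may contain from 1 to 99% (w/w) active ingredient; from 0 to 99% (w/w) diluent or filler; from 0 to 20% (w/w) of a disintegrant; from 0 to 5% (w/w) of a lubricant; from 0 to 5% (w/w) of a flow aid; from 0 to 50% (w/w) of a granulating agent or binder; from 0 to 5% (w/w) of an antioxidant; and from 0 to 5% (w/w) of a pigment. A controlled-release tablet may additionally contain from 0 to 90% (w/w) of a release-controlling polymer.
A parenteral formulation (such as a solution or suspension for injection, or a solution for infusion) may contain from 1 to 50% (w/w) active ingredient; from 50% (w/w) to 99% (w/w) of a liquid or semi-solid carrier or vehicle (for example, a solvent such as water); and from 0 to 20% (w/w) of one or more other excipients, such as buffering agents, antioxidants, suspension stabilizers, tonicity adjusting agents, and preservatives.
Depending on the disorder, the patient to be treated, and the route of administration, compositions or combinations comprising the BMC-VLP of the invention may be administered at varying therapeutically effective doses to a patient in need thereof.
However, in the context of the present invention, the dose administered to a mammal, particularly a human, should be sufficient to effect a therapeutic response in the mammal over a reasonable timeframe. One skilled in the art will recognize that the selection of the exact dose and composition, and of the most appropriate delivery regimen, will be influenced by, among other things, the pharmacological properties of the formulation, the nature and severity of the condition being treated, and the physical condition and mental acuity of the recipient, as well as the potency of the specific compound and the age, condition, body weight, sex, and response of the patient to be treated.
The invention now being generally described, it will be more readily understood by reference to the following examples, which are provided by way of illustration and are not intended to limit the present invention.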
Examples
Standard molecular biology techniques known in the art and not specifically described were generally carried out as described in [see: Sambrook and Russell, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York (2001)].
Example 1:
Materials and Methods
Bacterial strains and culture
E. coli Acella (DE3) cells (EdgeBio) were used for both molecular cloning and protein expression of Cso-BMC VLPs. Cells were grown in lysogeny broth (LB) or terrific broth (TB) supplemented with 50 μg/mL of the appropriate antibiotic (kanamycin or streptomycin).
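Saccharomyces cerevisiae (hereafter simply yeast) cells were used for both molecular cloning and protein expression of HO-BMC VLPs. Plasmid-based expression in yeast relies on nutritional selection, which requires a formulated growth medium. The medium lacks a nutrient that is essential to the engineered yeast strain and that can be produced by a protein encoded by a gene on the plasmid. In pCKU, the gene product Ura3p produces uracil. Because this chemically defined medium is expensive (approximately SGD 30/L), we sought to integrate the pathway chromosomally into the yeast genome so that the yeast strain could still express the pathway proteins in a less expensive (approximately SGD 5/L) undefined culture medium (yeast-peptone-dextrose). We therefore developed pGAU-YMRWδ15, which installs homology sites flanking the pathway to be integrated into yeast. Using the endogenous homologous recombination machinery of yeast, the desired pathway is inserted at the YMRWδ15 chromosomal site, and the pathway proteins can be expressed without the need for selection.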
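Golden Gate assembly of plasmids
One-pot Golden Gate plasmid assembly largely followed a previously published protocol with slight modifications [see: Guo, Y. et al., Nucleic Acids Res 43: e88 (2015)]. For insertion of 1 to 3 fragments, the reaction pot was prepared with 1 μL of T4 ligase buffer (NEB), 0.5 μL of 10× purified bovine serum albumin (BSA, NEB), 5 U of BsaI (NEB) or Esp3I (Thermo), 0.2 U of T4 ligase (Thermo), 15 ng of destination plasmid, 1 to 3 μL of inserts, and water to 10 μL. The reaction pot was subjected to thermocycling between 37°C and 18°C, with a 5-minute incubation at each step, for 15 cycles. This was followed by a 55°C step for 15 minutes, which digests unassembled plasmid while suppressing ligation and so reduces the number of colonies carrying the original destination vector. When more than three inserts were assembled, the amounts of restriction enzyme and ligase were doubled, the amount of destination plasmid was increased to 75 ng, the number of thermocycles was increased to 70, and the inserts and destination plasmid were added at a 2:1 molar ratio rather than as fixed volumes. These changes were made to increase the number of correctly assembled plasmids.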
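The two thermocycling programs described above differ mainly in cycle number, so their run times can be compared with a few lines of arithmetic. The following is a minimal sketch, assuming that each cycle consists of one 37°C step and one 18°C step of 5 minutes each and that ramp times are ignored; the function name `golden_gate_minutes` is hypothetical.

```python
def golden_gate_minutes(cycles, step_min=5.0, final_min=15.0):
    """Approximate block time in minutes for the Golden Gate program.

    Each cycle is one 37 C step plus one 18 C step (step_min each),
    followed by a single 55 C step (final_min). Ramp times are ignored.
    """
    if cycles < 1:
        raise ValueError("at least one cycle is required")
    return cycles * 2 * step_min + final_min


standard = golden_gate_minutes(15)  # program used for 1 to 3 fragments
extended = golden_gate_minutes(70)  # program used for larger assemblies
print(f"standard: {standard / 60:.2f} h, extended: {extended / 60:.2f} h")
```

Under these assumptions the extended program occupies the thermocycler for roughly half a day, which may matter when planning larger assemblies.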
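Codon-optimized BMC genes were synthesized (BioBasic) and cloned into HcKan_O. Promoter and terminator parts were amplified from various templates as PCR products and cloned into HcKan_P and HcKan_T, respectively.
meGFP, a monomeric form of eGFP obtained through the A206K mutation, was generated by site-directed mutagenesis (SDM) using NEBuilder® HiFi assembly. Plasmids pES1-7, pCKH, and the modified HcKan_O plasmids used to append a fluorescent protein, S2CP, or a purification tag to an ORF were likewise generated using HiFi assembly.
Primers for sequencing
The specific oligonucleotide primers used to sequence the various plasmid constructs are listed in Table 1.
Primer name | Plasmid(s) used | Sequence | SEQ ID NO |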
HcKan_chc_F' | HcKan | GATCCTTTGATTTTCTACCG | 85 |
HcKan_chc_R' | HcKan | CTCGATAACTCAAAAAATACG | 86 |
pES_Chc_F' | pESX | CGGAGCCTATGGAAAAACGC | 87 |
pES_Chc_R' | pESX | CCGCAGTGTCTTGGGTCTCT | 88 |
His_chc_F' | pCKH | TAGAGTGTACTAGAGGAGGCCAA | 89 |
CEN_chc_R' | pCKH/pCKU | GGTGATGACGGTGAAAACCT | 90 |
Ura_chc_F' | pCKU | TCTGTTCGGAGATTACCGAATCAA | 91 |
pGau_chc_F' | pGAU-YMRWδ15/ pGAH-YPRCδ15 | CCACCTCAGGCAGAGAACCT | 92 |
pGau_chc_R' | pGAU-YMRWδ15/ pGAH-YPRCδ15 | GGAAAAACGCCAGCAACGC | 93 |
Sequence alignment
CsoS2 sequences were aligned with Clustal Omega, and the output alignment file was prepared with JalView 2 [see: Waterhouse, A. M. et al., Bioinformatics 25: 1189-1191 (2009); Sievers, F. and Higgins, D. G. Methods in Molecular Biology (Clifton, N. J.) 1079: 105-116 (2014)]. Accession numbers for the sequences used in the sequence alignments are detailed in Table 2.
Gene | Organism | GenBank accession number |
csos2 | Halothiobacillus neapolitanus | ACX95763.1 |
csos2 | Acidithiobacillus ferrivorans | OYV82041.1 |
csos2 | Burkholderiales bacterium | TNF63637.1 |
csos2 | Gallionellaceae bacterium | TAJ81120.1 |
csos2 | Hydrogenophilales bacterium | OZA28367.1 |
csos2 | Thiobacillus thioparus | WP_018507371.1 |
csos2 | Comamonadaceae bacterium | KJS73712.1 |
csos2 | Acidithiobacillus ferridurans | BBF66259.1 |
csos2 | Betaproteobacteria bacterium | TSA22668.1 |
csos2 | Ferrovum sp. Z-31 | WP_062187313.1 |
csos1A | Halothiobacillus neapolitanus | WP_012823794.1 |
Hoch_5815 (BMC-H) | Haliangium ochraceum | WP_012830883.1 |
ccmK2 | Halothece sp. 7418 | WP_015227514.1 |
eutM | Salmonella enterica | VFS02811.1 |
pduA | Citrobacter freundii | WP_098065011.1 |
cmcC | Klebsiella pneumoniae | WP_004146125.1 |
csos4A | Halothiobacillus neapolitanus | WP_012823797.1 |
Hoch_5814 (BMC-P) | Haliangium ochraceum | WP_012830882.1 |
ccmL | Halothece sp. 7418 | WP_015227516.1 |
eutN | Salmonella enterica | EBA6053551.1 |
pduN | Citrobacter freundii | WP_038641685.1 |
cmcD | Klebsiella pneumoniae | WP_009486245.1 |
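Purification of VLPs and cargo-loading analysis
For Cso-BMC: Acella (DE3) cells were cultured in 500 mL of terrific broth (TB, BioBasic) supplemented with 50 mg/L kanamycin and shaken at 37°C until the optical density of the culture (at λ = 600 nm) reached approximately 0.6 to 1.0. The culture was then cooled to 25°C, and isopropyl β-D-1-thiogalactopyranoside (IPTG, GoldBio) was added to 50 μM to induce protein expression. Cells were cultured at 25°C for approximately 30 hours before being harvested by centrifugation. Cells were lysed at 15,000 psi in three passes using an M-110P microfluidizer (Microfluidics). The protease inhibitor phenylmethylsulfonyl fluoride (PMSF) was added to the cell lysate at 0.1 mM. The lysate was centrifuged twice at 20,000 × g for 20 minutes each time. The clarified lysate was loaded onto a StrepTrap™ HP 5 mL column (GE Life Sciences) at a flow rate of 1 mL/min. Purification was carried out on an AKTA FPLC with a wash of 12 column volumes (CV) of binding buffer (Tris·HCl 100 mM, NaCl 150 mM, pH 8.0) followed by elution with 6 CV of elution buffer (binding buffer supplemented with 2.5 mM desthiobiotin) at a flow rate of 3 mL/min.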
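To obtain high-quality protein shells for structural studies, anion exchange (AIEX) chromatography was performed after StrepTrap™ affinity purification. The pooled StrepTrap™ elution fractions were diluted 2-fold by adding AIEX buffer A (Tris·HCl 50 mM, pH 8.0). The sample was loaded at 1 mL/min onto a Q Sepharose column (GE Life Sciences) with a 10 mL resin bed volume. Elution used a two-step gradient protocol, with both steps at a flow rate of 2 mL/min: 0 to 60% AIEX buffer B (Tris·HCl 50 mM, NaCl 1.0 M, pH 8.0) over 6 CV, then 60 to 100% AIEX buffer B over 2 CV.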
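The two-step gradient can be turned into a nominal NaCl concentration at any point of the elution, which is convenient when relating a peak position to a salt concentration. This is a minimal sketch, assuming ideal linear mixing of buffers A and B, no NaCl carried over from the sample, and no system delay volume; the function names are hypothetical.

```python
def percent_b(cv):
    """Percent AIEX buffer B after `cv` column volumes of elution.

    Step 1: 0 -> 60% B over 6 CV. Step 2: 60 -> 100% B over the next 2 CV.
    """
    if cv < 0:
        raise ValueError("column volumes must be non-negative")
    if cv <= 6:
        return 60.0 * cv / 6.0
    if cv <= 8:
        return 60.0 + 40.0 * (cv - 6.0) / 2.0
    return 100.0


def nacl_molar(cv, nacl_in_b=1.0):
    """Nominal NaCl concentration (M); buffer A contains no NaCl."""
    return percent_b(cv) / 100.0 * nacl_in_b


BED_VOLUME_ML = 10.0  # Q Sepharose bed volume stated in the text
for cv in (3, 4, 6, 8):
    print(f"{cv} CV ({cv * BED_VOLUME_ML:.0f} mL): {nacl_molar(cv):.2f} M NaCl")
```

With a 10 mL bed, the first gradient step rises by a nominal 0.1 M NaCl per column volume, so fractions that elute about 0.1 M apart are separated by roughly 10 mL of eluate.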
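Proteins were analyzed on 13% stacking sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) gels and stained using InstantBlue (Expedeon). Densitometric analysis was performed using Bio-Rad Image Lab software, following a previous report on BMC cargo quantification [see: Hagen, et al., Nature Communications 9: 2881, doi:10.1038/s41467-018-05162-z (2018)]. Background subtraction was performed, and the peak areas corresponding to the bands of interest were used for quantification. Absolute protein concentrations were measured on a DeNovix spectrophotometer using the calculated molar extinction coefficient at 280 nm (ε280). The calculated ε280 of the T = 3 shell, taken as the sum of the ε280 values of its individual components, is 1,588,200 M⁻¹·cm⁻¹. The ε280 of the T = 4 shell was calculated as 1,677,600 M⁻¹·cm⁻¹. The small difference between the calculated ε280 values of the two shell types is due to the low ε280 of CsoS1A (1,490 M⁻¹·cm⁻¹). Most of the ε280 contribution comes from CsoS4A (23,490 M⁻¹·cm⁻¹), which is present in the same number of copies in both shell types. The average number of GFP molecules per shell was determined by fluorescence according to the protocol described by Hagen et al. [see: Hagen, et al., Nature Communications 9: 2881, doi:10.1038/s41467-018-05162-z (2018)].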
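The shell extinction coefficients quoted above can be reproduced by summing the per-chain values over the shell composition. The sketch below assumes the compositions given for the shell models elsewhere in this document (12 CsoS4A pentamers in both shells, with 20 CsoS1A hexamers for T = 3 and 30 for T = 4) and a 1 cm path length; the function names are hypothetical, and cargo absorbance is not considered.

```python
EPS_CSOS1A = 1490    # M^-1 cm^-1 per CsoS1A chain (value from the text)
EPS_CSOS4A = 23490   # M^-1 cm^-1 per CsoS4A chain (value from the text)
PENTAMERS = 12               # CsoS4A pentamers per shell, both shell types
HEXAMERS = {3: 20, 4: 30}    # CsoS1A hexamers per shell, keyed by T number


def shell_eps280(t_number):
    """Sum of chain extinction coefficients for a T = 3 or T = 4 shell."""
    csos4a_chains = PENTAMERS * 5
    csos1a_chains = HEXAMERS[t_number] * 6
    return csos4a_chains * EPS_CSOS4A + csos1a_chains * EPS_CSOS1A


def shell_molar(a280, t_number, path_cm=1.0):
    """Beer-Lambert estimate of shell concentration (M) from A280."""
    return a280 / (shell_eps280(t_number) * path_cm)


for t in (3, 4):
    print(f"T = {t}: eps280 = {shell_eps280(t):,} M^-1 cm^-1")
```

Because CsoS4A dominates the sum and its copy number is the same in both shells, an A280 reading converts to nearly the same molar shell concentration whichever shell type is assumed.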
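For HO-BMC: After growth in 8 L of YPD (yeast extract 1%, peptone 2%, glucose 2%, BioBasic) at 25°C for 48 hours, yeast cells were pelleted and lysed using the M-110P microfluidizer at 20,000 psi in eight passes. The lysate was centrifuged twice at 20,000 × g for 20 minutes each time, and the clarified lysate was adjusted to pH 8 using 1 M Tris·HCl pH 12. The lysate was incubated with 300 μL of biotin blocking buffer (IBA Lifesciences) for 15 minutes with gentle agitation. StrepTrap™ affinity purification was performed in the same manner as described above.
Pull-down assay and immunoblotting
Purified His6-meGFP-S2CP and His6-meGFP were incubated separately with clarified E. coli lysates containing CsoS1A-SII, SII-CsoS1D, or CsoS4A-SII at 25°C for 1 hour with gentle agitation. The lysate mixtures were purified using the StrepTrap™ protocol described above. Immunoblot detection of the eGFP epitope was performed using an HRP-conjugated GFP antibody (GF28R, Invitrogen), and an HRP-conjugated His-tag antibody (Genscript) was used for detection of the His6 epitope. Detection was performed according to the manufacturers' recommended protocols.
Fluorescence microscopy
E. coli cells were grown under the conditions described, and 0.1 mL of culture was collected and pelleted. The pellet was resuspended in PBS containing 1% formaldehyde and left at room temperature for 10 minutes. Cells were washed twice with PBS and resuspended in 0.5 mL of PBS. A small volume (about 3 μL) of the cell suspension was mixed with an equal volume of Prolong™ Diamond Antifade Mountant (Thermo Scientific) before being mounted on a poly-L-lysine microscope slide (Thermo Scientific). Samples were allowed to cure in the dark for at least 24 hours before imaging. Slides were imaged on an Olympus FV1200 confocal microscope at 100× objective magnification. Co-localization analysis was performed using ImageJ software.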
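Transmission electron microscopy
Formvar/carbon-coated copper grids were glow-discharged, after which 5 μL of purified protein sample (diluted to an A280 of about 0.05 or less) was applied for 60 seconds and the droplet was removed with filter paper. The grids were then negatively stained by adding a 5 μL droplet of 2.5% gadolinium(III) acetate, incubating for 90 seconds, and blotting in the same way. Grids were imaged using a JEOL JEM-1220 TEM.
Shell particle size and stability measurements
Particle size distributions were determined by dynamic light scattering (DLS) using an Uncle™ instrument (Unchained Labs). Unless otherwise specified, samples were diluted to 1 mg/mL in TBS-50/350 pH 8.0 (Tris·HCl 50 mM, NaCl 350 mM, pH 8.0) and centrifuged at 20,000 × g for 5 minutes to remove aggregates before measurement. Care was taken to use the uppermost supernatant for analysis. 9 μL of sample was added to the mini cuvette. Unless otherwise specified, all DLS measurements were performed in triplicate at 20°C. Particle size distributions were analyzed using the Uncle™ analysis software.
For shell stability measurements at various temperatures, shell samples were aliquoted into thin-walled PCR tubes and subjected, in the Uncle™ instrument, to temperatures ranging from 20 to 80°C in 10°C increments for 15 minutes each. A DLS spectrum was taken at the end of each 15-minute incubation.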
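For shell stability measurements under various buffer conditions, TBS-50/350 pH 8.0 buffers containing 10% and 20% (v/v) methanol were freshly prepared from stock solutions of Tris·HCl 1.0 M pH 8.0, NaCl 5.0 M, and 99.8% methanol (ACS reagent grade, Sigma), and used within the day of preparation. Because heat is released when methanol is mixed with water, methanol-containing buffers were allowed to re-equilibrate to room temperature for at least 1 hour after preparation. To prepare buffers at various pH values, the following components were used at 50 mM for the indicated pH ranges: glycine·HCl for pH 2 to 4; 4-morpholineethanesulfonic acid (MES) sodium salt for pH 5 to 7; Tris·HCl for pH 8 to 9; N-cyclohexyl-3-aminopropanesulfonic acid (CAPS) sodium salt for pH 10 to 11; and arginine·HCl for pH 12 to 13. All buffers contained 350 mM NaCl. Shells were incubated in these buffers for 15 minutes before particle size measurement to allow time for possible shell disassembly or protein denaturation.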
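The methanol-containing buffers are made by diluting the three stocks named above, so the volumes follow from C1·V1 = C2·V2. The sketch below is illustrative only: it assumes a 10 mL final volume (not stated in the text), treats volumes as additive (ignoring the volume contraction of methanol/water mixtures), and leaves out any pH readjustment; the function names are hypothetical.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed so that C_stock * V_stock = C_final * V_final."""
    return final_conc * final_volume_ml / stock_conc


def methanol_buffer_recipe(methanol_pct, final_volume_ml=10.0):
    """Stock volumes (mL) for TBS-50/350 containing methanol_pct % (v/v) methanol."""
    tris = stock_volume_ml(50.0, 1000.0, final_volume_ml)    # mM, from 1.0 M stock
    nacl = stock_volume_ml(350.0, 5000.0, final_volume_ml)   # mM, from 5.0 M stock
    meoh = stock_volume_ml(methanol_pct, 99.8, final_volume_ml)  # % v/v
    water = final_volume_ml - tris - nacl - meoh
    if water < 0:
        raise ValueError("requested composition exceeds the final volume")
    return {"tris_ml": tris, "nacl_ml": nacl, "methanol_ml": meoh, "water_ml": water}


for pct in (10.0, 20.0):
    recipe = methanol_buffer_recipe(pct)
    print(pct, {name: round(volume, 3) for name, volume in recipe.items()})
```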
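For freeze-thaw stability, shell samples in TBS-50/350 pH 8.0 were aliquoted into thin-walled PCR tubes and flash-frozen in liquid N2. Samples were thawed at room temperature by standing for 15 minutes, until no ice crystals were visible, before being re-frozen.
Enzyme steady-state kinetics assays
For APEX2, all reagents were prepared to appropriate working concentrations using TBS-50/350 pH 8.0. Working solutions of guaiacol and H2O2, both at 10 mM, were prepared on the day of the assay. To ensure complete dissolution, the guaiacol solution was shaken vigorously at 30°C before being re-equilibrated to room temperature. The assay concentration of APEX2 was 10 nM, that of H2O2 was 1 mM, and guaiacol concentrations ranged from 0.20 to 2.0 mM. The total reaction volume was 200 μL. The reaction was monitored by the absorbance at 470 nm of the tetraguaiacol formed, using a BioTek Synergy™ HT microplate reader. The rate of tetraguaiacol formation was found to be constant up to 90 seconds, and this time point was taken for initial rate (V0) measurements. Kinetic constants were obtained by non-linear least squares Michaelis-Menten fitting in GraphPad Prism software.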
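For steady-state kinetics of LacZ, all reagents were prepared to appropriate working concentrations using TBS-50/350 pH 8.0 supplemented with 1 mM MgCl2. A working solution of ONPG at 10 mM was prepared on the day of the assay from a 50 mM stock solution in DMSO. The assay concentration of LacZ was 10 nM, and assay concentrations of ONPG (ortho-nitrophenyl-β-galactoside) ranged from 0.050 to 1.5 mM. The total reaction volume was 100 μL. Hydrolysis of ONPG was followed by absorbance measurement at 405 nm. The rate of product formation was found to be constant up to 60 seconds, and this time point was taken for V0 measurements.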
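The step from initial rates to kinetic constants can be illustrated with a short script. The document's fits were done by non-linear least squares in GraphPad Prism; the sketch below swaps in a Hanes-Woolf linearization ([S]/v = [S]/Vmax + KM/Vmax) only because it needs nothing beyond the standard library. The substrate series spans the guaiacol range given above, but the rates are synthetic and the Vmax and KM used to generate them are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Estimate (Vmax, KM) by ordinary least squares on [S]/v versus [S]."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # equals 1 / Vmax
    intercept = mean_y - slope * mean_x   # equals KM / Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax


def kcat(vmax, enzyme_conc):
    """Turnover number; vmax and enzyme_conc must share concentration units."""
    return vmax / enzyme_conc


# Synthetic, noise-free example (hypothetical Vmax = 1.0 uM/s, KM = 0.5 mM).
substrate_mm = [0.2, 0.4, 0.8, 1.2, 1.6, 2.0]
rates = [michaelis_menten(s, 1.0, 0.5) for s in substrate_mm]
vmax_fit, km_fit = hanes_woolf_fit(substrate_mm, rates)
print(vmax_fit, km_fit, kcat(vmax_fit, 0.010))  # 10 nM enzyme = 0.010 uM
```

With real, noisy data a linearization weights the points differently from a direct non-linear fit, so the two approaches should not be expected to return identical constants.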
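Enzyme/enzyme-shell activity and stability assays
All enzyme activity measurements were performed at ambient temperature (23°C) with an enzyme working concentration of 10 nM. Measurements were performed in triplicate. Enzyme activity was determined as the initial rate of product formation at saturating substrate concentrations (i.e., near Vmax). For APEX2, these were 1.4 mM guaiacol and 1 mM H2O2. For LacZ, this was 1.5 mM ONPG.
For the heat shock assay, enzyme/enzyme-shell samples were aliquoted into thin-walled PCR tubes and subjected to the indicated elevated temperatures (FIG. 6B) in a thermocycler for 15 minutes. After incubation, samples were cooled to 20°C and re-equilibrated to ambient temperature for 15 minutes before being assayed.
For stability measurements in methanol-containing buffers and under various pH conditions, enzyme/enzyme-shell samples were dialyzed into the various buffers as described in the particle size measurement section. Solutions were left to stand for at least 15 minutes before the assay to allow time for possible protein denaturation.
For freeze-thaw stability, enzyme/enzyme-shell samples were aliquoted into thin-walled PCR tubes and flash-frozen in liquid N2. Samples were thawed at room temperature by standing for 15 minutes, until no ice crystals were visible, before being re-frozen or assayed.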
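Cryo-electron microscopy and structural analysis
Protein solutions were diluted to a concentration of 0.5 mg/mL using ice-cold TBS-50/400 buffer (Tris·HCl 50 mM, NaCl 400 mM, pH 8.0). 2.5 μL of protein sample was applied to glow-discharged R1.2/1.3 and R2/2 molybdenum 200 grids with a holey carbon support film (Quantifoil). Grids were transferred to a Leica EM GP plunge freezer, blotted for 2 seconds at 90% humidity, and flash-frozen in liquid ethane cooled by liquid N2. Grids were stored at liquid N2 temperature to prevent the formation of crystalline ice.
The best cryoEM grid preparation conditions were screened on a Talos™ Arctica Cryo-TEM (ThermoFisher Scientific) at the Institute for Protein Research, Osaka University, equipped with a FEG operated at 200 kV and a minimum dose system. Images were captured at an exposure time of 33.67 seconds, giving a dose of approximately 20 e-/Å2 at 92,000× magnification and defocus values of 1.6 to 2.5 μm. Images were recorded using a BM-Falcon3 camera in counting mode, with an exposure setting of 1.1 Å pixel size and 70 frames per individual image. For data collection, grids were prepared and imaged on a Titan™ Krios™ (FEI) (ThermoFisher Scientific) at the Research Center for Ultra-High Voltage Electron Microscopy (UHVEM), Osaka University, equipped with a FEG operated at 300 kV and a minimum dose system. Imaging was performed using the EPU software (FEI) attached to the Titan™ Krios™. Images were recorded at a nominal magnification of 96,000× without an objective aperture, with an actual defocus range of 1.5 to 2.2 μm, an electron dose of 64.3 to 68.1 e-/Å2, an exposure time of 1 second, and 8 images acquired per hole. Images were recorded using a Falcon II detector (FEI) at a pixel size of 0.86 Å/pixel and 17 frames per individual image.
Approximately 2,100 to 2,500 raw movies were collected in different microscopy sessions and processed in RELION 3.0 software [see: Zivanov, J. et al., eLife 7: e42166 (2018)]. Drift was corrected with MotionCor2 software, and the CTF of each micrograph was estimated with CTFFind-4.1 and Gctf software [see: Zhang, K. J Struct Biol 193: 1-12 (2016)]. Micrographs with good CTF estimates were selected for further processing. Shells were picked manually and extracted in RELION 3.0 with a box size of 300 × 300 pixels. Particles from 2D classes displaying clear secondary structure elements were selected. The initial 3D reference model was prepared using a cylinder from the RELION toolbox kit. 3D refinement was performed with a low-pass filter of 20 Å together with solvent flattening. CTF refinement was carried out without particle polishing (polishing had no effect), and a final 3D refinement was performed. The final resolution for the protein shells was obtained by post-processing with solvent flattening and a soft mask.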
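Model building and structural analysis
Biological assembly models for the pentamer (PDB ID: 2RCF) [see: Tanaka, S., et al., Science 319: 1083-1086 (2008)] and the hexamer (PDB ID: 2EWH) [see: Tsai, Y. et al., PLOS Biology 5: e144 (2007)] were manually fitted into the electron density maps using UCSF Chimera [see: Pettersen, E. F., et al., J Comput Chem 25: 1605-1612 (2004)]. An asymmetric unit of the icosahedral reconstruction was extracted and rebuilt in COOT [see: Emsley, P., et al., Acta Crystallogr. D Biol. Crystallogr. 66: 486-501 (2010)]. Whole-shell models were obtained by symmetry expansion and real-space refinement in PHENIX [see: Liebschner, D., et al., Acta Crys D 75: 861-877 (2019)] and CCP4 [see: Winn, M. D., Acta Crystallogr. D Biol. Crystallogr. 67: 235-242 (2011)].
Example 2:
Genetic toolkit for modular construction of microcompartment parts
Our Golden Gate cloning-based genetic part assembly toolkit extends the published YeastFab plasmid suite used for metabolic engineering of yeast [see: Guo, Y., et al., Nucleic Acids Res 43: e88 (2015)]. Briefly, this is a hierarchical approach to DNA assembly in which the genetic parts, namely promoters (Pro), open reading frames (ORF), and terminators (Ter), are modularized. The plasmids carrying these parts are called Level 0 plasmids. Level 1 plasmids link Pro-ORF-Ter together to form a gene expression cassette and are called POTX plasmids (X = 1 to 11). Level 2 plasmids link two or more expression cassettes to form a pathway combination. Level 3 plasmids are used for chromosomal integration of the pathway into the genome. For yeast expression, we used the Level 0 and 1 plasmids of the published YeastFab toolkit but developed our own Level 2 and 3 plasmids to better suit our requirements. For E. coli expression, we retained the YeastFab Level 0 plasmids but developed our own Level 1 and 2 plasmids. We did not develop a Level 3 (genomic integration) plasmid for E. coli.
The Level 0 plasmids, called HcKan_P, _O, and _T, are used to maintain the Pro, ORF, and Ter parts, respectively, for both E. coli and yeast, as listed in Table 3.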
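Genetic part / level | SEQ ID NO (genetic part) | Name | SEQ ID NO (maintenance plasmid) | Function |
Pro / 0 | 84 | PCon2 | 83 | Strong bacterial constitutive promoter from the Anderson collection (Anderson, 2006), ID: BBa_J23100 |
Pro / 0 | 77 | PCon3 | 40 | Strong bacterial constitutive promoter from the Anderson collection, ID: BBa_J23108 |
Pro / 0 | 78 | PCon4 | 41 | Medium-strength bacterial constitutive promoter from the Anderson collection, ID: BBa_J23105 |
Pro / 0 | 24 | PCon5 | 42 | Weak bacterial constitutive promoter from the Anderson collection, ID: BBa_J23114 |
Pro / 0 | 23 | PT7 | 43 | T7 bacteriophage promoter regulated by the lac operon. Used for very high-level protein production in E. coli strains carrying the λDE3 lysogen (Baneyx, 1999). |
Pro / 0 | 25 | PTDH3 | 65 | Strong yeast constitutive promoter |
Pro / 0 | 27 | PYEF3 | 66 | Medium-strength yeast constitutive promoter |
Pro / 0 | 26 | PPYK1 | 67 | Medium-strength yeast constitutive promoter |
Pro / 0 | 104 | PGPM1 | 115 | Medium-strength yeast constitutive promoter |
ORF / 0 | 2, 8 | CsoS1A | 53 | Hexameric BMC protomer |
ORF / 0 | 3, 9 | CsoS4A | 54 | Pentameric BMC protomer |
ORF / 0 | 4, 10 | HO-H | 68 | Hexameric BMC protomer |
ORF / 0 | 5, 11 | HO-P | 69 | Pentameric BMC protomer |
ORF / 0 | 6, 12 | HO-T1 | 70 | Trimeric BMC protomer |
ORF / 0 | 97 | HO-T1-SpyTag | 116 | Trimeric BMC protomer with an internal SpyTag |
ORF / 0 | 45, 46 | meGFP | 44 | Monomeric enhanced green fluorescent protein |
ORF / 0 | 30, 29 | mCherry | 28 | Monomeric red fluorescent protein |
ORF / 0 | 48, 49 | UmuD1-40-meGFP-S2CP | 47 | UmuD1-40 degradation tag fused to meGFP-S2CP |
ORF / 0 | 51, 52 | UmuD1-40-meGFP | 50 | UmuD1-40 degradation tag fused to meGFP |
ORF / 0 | 99 | GFP-SpyCatcher | 121 | GFP fused to SpyCatcher |
ORF / 0 | 110 | APEX2-S2CP(30) | 109 | Engineered pea ascorbate peroxidase (Lam et al., 2015) fused to S2CP(30) |
ORF / 0 | 103 | LacZ-S2CP(30) | 106 | E. coli beta-galactosidase fused to S2CP(30) |
Ter / 0 | 79 | TT7 | 55 | HcKan_T-TT7 containing the T7 transcription terminator (Baneyx, 1999) |
Ter / 0 | 80 | TRPL41B | 71 | Yeast transcription terminator |
Ter / 0 | 81 | THBT1 | 72 | Yeast transcription terminator |
Ter / 0 | 82 | TRPS20 | 73 | Yeast transcription terminator |
Ter / 0 | 105 | TYPT31 | 119 | Yeast transcription terminator |
Gene expression / 1 | N.A. | pESN (N = 1 - 7) | 56 to 62 | Accepts Pro-ORF-Ter to form a gene expression cassette |
Gene expression / 1 | N.A. | POTX (X = 1 - 11) | N.A. | Accepts Pro-ORF-Ter to form a gene expression cassette |
Pathway / 2 | N.A. | pCKH | 63 | Accepts pESN plasmids starting from N = 2, continuing with even-numbered pES and ending with an odd-numbered pES |
Pathway / 2 | N.A. | pCKU | 74 | Accepts POTX plasmids starting from N = 1 or 2, continuing with even-numbered POT and ending with an odd-numbered POT |
Integration / 3 | N.A. | pGAU-YMRWδ15 | 75 | Integrates the pathway at YMRWδ15 in the yeast genome |
Integration / 3 | N.A. | pGAU-YPRCδ15 | 76 | Integrates the pathway at YPRCδ15 in the yeast genome |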
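The Level 1 plasmids were tailored for protein expression in E. coli by removing genetic elements from the POTX plasmids that could place an unnecessary burden on the host cell. The E. coli Level 1 plasmids are called pESN (N = 1 to 7) and contain the minimal genetic elements required to maintain a transcription unit (TU) (FIG. 2A). For assembly of multiple Pro-ORF-Ter units from POTX or pESN plasmids, we developed the Level 2 plasmids pCKU (SEQ ID NO: 74) and pCKH (SEQ ID NO: 63), designated for yeast and E. coli, respectively. Plasmid-based expression in yeast relies on nutritional selection, which requires a formulated growth medium. The medium lacks a nutrient that is essential to the engineered yeast strain and that can be produced by a protein encoded by a gene on the plasmid. In pCKU, the gene product Ura3p produces uracil. Because this chemically defined medium is expensive (approximately SGD 30/L), we sought to integrate the pathway chromosomally into the yeast genome so that the yeast strain could still express the pathway proteins on a less expensive (approximately SGD 5/L) undefined culture medium (yeast-peptone-dextrose). We therefore developed pGAU-YMRWδ15 (SEQ ID NO: 75), which installs homology sites flanking the pathway to be integrated into yeast. Using the endogenous homologous recombination machinery of yeast, the desired pathway is inserted at the YMRWδ15 chromosomal site, and the pathway proteins can be expressed without the need for selection. For pathway expression in E. coli, plasmid selection in bacteria is routinely carried out with an appropriate antibiotic (kanamycin in this case) in undefined culture media (lysogeny broth or terrific broth), so we did not find chromosomal integration of the pathway necessary at this point.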
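The ordering rule for loading pESN transcription units into pCKH (start at pES2, continue with even-numbered pES, end with an odd-numbered pES) can be expressed as a small check. The sketch below is one reading of that rule, assuming that the even positions must be consecutive and that the final odd position directly follows the last even one; this matches the TU combinations listed in Table 6 (for example 2B-3A, 2B-4A-5A, and 2A-4A-6A-7A) but is not stated explicitly in the text. Both function names are hypothetical.

```python
def parse_tu_chain(chain):
    """Turn a label such as '2A-4A-6B-7A' into pES positions [2, 4, 6, 7]."""
    return [int(token[:-1]) for token in chain.split("-")]


def is_valid_pckh_chain(positions, max_position=7):
    """Check an assumed pCKH loading order for pES positions.

    Assumed rule: the first TU comes from pES2, later TUs come from
    consecutive even-numbered pES plasmids, and the last TU comes from
    the odd-numbered pES that follows the final even one.
    """
    if len(positions) < 2 or positions[0] != 2:
        return False
    *body, last = positions
    evens_ok = all(p % 2 == 0 for p in body) and all(
        later - earlier == 2 for earlier, later in zip(body, body[1:])
    )
    return evens_ok and last == body[-1] + 1 and last <= max_position


for label in ("2B-3A", "2B-4A-5A", "2A-4A-6A-7A", "2A-4A-6A"):
    print(label, is_valid_pckh_chain(parse_tu_chain(label)))
```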
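In addition, the HcKan_O plasmid was modified to install a fluorescent protein (FP), a biochemical/affinity tag, or an encapsulation peptide at the amino or carboxy terminus of the ORF (FIGS. 2C and D).
Four FPs were selected: mTurquoise2 (mT2), monomeric enhanced GFP (meGFP), monomeric Kusabira Orange-kappa (mKOK), and mCherry (mCh). All are known to exhibit monomeric behavior, which should reduce artificial aggregation of fusion products. An example of a modified HcKan_O plasmid is HcKan_O-CmCherry (SEQ ID NO: 28), which tags mCherry to the C-terminus of the ORF. The two affinity tags introduced are the hexahistidine (His6) and Strep-tag II (SII) tags, which enable protein purification by immobilized metal affinity chromatography (IMAC) or Strep-Tactin, respectively. Examples of modified HcKan_O plasmids are HcKan_O-CHis6 (SEQ ID NO: 32), which tags His6 to the C-terminus of the ORF, and HcKan_O-CSII (SEQ ID NO: 31), which tags Strep-tag II to the C-terminus of the ORF.
Other tags include the SpyCatcher/SpyTag (SC/ST) pair (SEQ ID NOs: 13 and 16) and the CC-Di-A/B (CCA/CCB) pair (SEQ ID NOs: 17-20). Examples of modified HcKan_O plasmids are HcKan_O-CSpyCatcher (SEQ ID NO: 37), which tags SpyCatcher to the C-terminus of the ORF; HcKan_O-CSpyTag (SEQ ID NO: 38), which tags SpyTag to the C-terminus of the ORF; HcKan_O-CCCDiA (SEQ ID NO: 35), which tags coiled-coil dimer-A to the C-terminus of the ORF; and HcKan_O-CCCDiB (SEQ ID NO: 36), which tags coiled-coil dimer-B to the C-terminus of the ORF. The SII tag (SEQ ID NOs: 21 and 22) is widely used for purification of proteins and protein complexes, whereas the SC/ST and CCA/CCB pairs have found use in the functionalization of VLPs and other protein nanostructures [see: Fletcher, J. M. et al., Science 340: 595-599 (2013); Keeble, A. H., & Howarth, M. Methods in Enzymology, 617, 443-461 (2019)]. A protein tagged with SpyCatcher (SEQ ID NOs: 13 and 14) forms a covalent isopeptide bond with another protein tagged with SpyTag (SEQ ID NOs: 15 and 16), whereas a protein tagged with CC-Di-A (SEQ ID NOs: 17 and 18) forms a strong intermolecular interaction with another protein tagged with CC-Di-B (SEQ ID NOs: 19 and 20) (dissociation constant, Kd about 1 nM) [see: Thomas, F., et al., Journal of the American Chemical Society 135: 5161-5166, (2013)]. Installing one member of the SC/ST or CCA/CCB pair on the surface of the VLP allows a guest protein tagged with the corresponding other member of the pair to be conjugated to the shell surface.
Controlling the intracellular stoichiometry of shell protomers is known to be important for successful assembly of BMC shells [see: Kerfeld, C. A., et al., Nature Reviews Microbiology 16: 277 (2018)]. To tune the expression of each component, we introduced five constitutively active promoters from the Anderson collection into HcKan_P (Table 4) [see: Anderson, J. C. Anderson Promoter Library Registry of Standard Biological Parts (2006)].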
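Promoter | Anderson collection identifier | Relative strength | SEQ ID NO |
PCON2 | BBa_J23100 | 1.00 | 84 |
PCON3 | BBa_J23108 | 0.50 | 77 |
PCON4 | BBa_J23105 | 0.24 | 78 |
PCON5 | BBa_J23114 | 0.10 | 24 |
For brevity, we renamed these promoters PCON1 to PCON5, with PCON2 being the strongest and PCON5 the weakest. The PCON2 to PCON5 sequences (SEQ ID NOs: 84, 77, 78, and 24, respectively) are shown in lower case within SEQ ID NOs: 83 and 40 to 42, respectively. In addition, for inducible gene expression upon addition of the inducer isopropyl β-D-1-thiogalactopyranoside (IPTG), we included the T7 promoter (PT7; SEQ ID NO: 23) together with the lacI repressor and lac operator sequences (LacI + PT7). For transcription termination, the T7 terminator (TT7) was used across all TUs. Using this multi-monocistronic system of DNA assembly, the expression levels of the BMC components can be tuned in the pESN plasmids (Table 5).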
pES | A | B | C | D |
2 | P4-meG-S2CP | P5-CsoS4A-SII | P4-UmuD1-40-meG-S2CP | P4-UmuD1-40-meG |
3 | PT7-CsoS1A | | | |
4 | P4-CsoS1D | | | |
5 | PT7-CsoS1A | | | |
6 | P5-CsoS4A-mCh | P5-CsoS4A-SII | | |
7 | PT7-CsoS1A | | | |
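For encapsulation of cargo in Cso-BMC, we identified an encapsulation peptide (EP) sequence (SKITGSSGNDTQGSLITYSGGARG; SEQ ID NO: 1), which we named S2CP, that mediates sequestration of protein cargo into the simplified carboxysome. An example of a modified HcKan_O plasmid that tags S2CP to the C-terminus of an ORF is HcKan_O-S2CP (SEQ ID NO: 39). Details of how S2CP was identified as an EP are described later. For encapsulation of cargo in HO-BMC, we found that the EP reported for HO-BMC, which was reported to function when E. coli is the recombinant host, does not function in yeast [see: Lassila, J. K., et al., Journal of molecular biology 426: 2217-2228 (2014)]. Schematics of the synthetic operon of the pathway used to produce Cso-BMC in E. coli and of the HO-ACB pathway used to express HO-BMC in yeast are shown in FIG. 3.
One-pot Golden Gate plasmid assembly largely followed a previously published protocol with slight modifications [see: Guo, Y., et al., Nucleic Acids Research, 43(13), e88 (2015)]. For insertion of 1 to 3 fragments, the reaction pot was prepared with 1 μL of T4 ligase buffer, 0.5 μL of 10× purified bovine serum albumin (BSA), 5 U of BsaI (for Level 0 and 2 assemblies) or Esp3I (for Level 1 assemblies), 10 U of T4 ligase, 20 ng of destination plasmid, 1 to 3 μL of inserts, and water to 10 μL. All enzymes and BSA used were purchased from New England Biolabs (NEB). The reaction pot was subjected to thermocycling between 37°C and 18°C, with a 5-minute incubation at each step, for 70 cycles, followed by a 55°C step for 15 minutes. Plasmids were transformed into the E. coli Acella (DE3) strain (EdgeBio) and verified by Sanger sequencing.
Transformation of plasmids and chromosomal integration of genes into yeast were performed according to the high-efficiency lithium acetate/single-stranded DNA/PEG-3350 protocol described by Schiestl and co-workers [see: Gietz, R. D. and Schiestl, R. H. Nature Protocols 2: 31 (2007)].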
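Example 3:
Identification of a targeting peptide for the Cso system
An important strategy for repurposing bacterial BMCs as intracellular nanoreactors is to encapsulate heterologous enzymes within the shell by installing an EP on the cargo. Although EP sequences have been identified and characterized for some BMC systems, no such sequence has been reported for the alpha-carboxysome [see: Kerfeld, C. A., et al., Nature Reviews: Microbiology, 16, 277 (2018)]. An EP sequence has been proposed to reside in CsoS2 [see: Kerfeld, C. A., et al., Nature Reviews: Microbiology, 16, 277 (2018)]. Studies on CsoS2 suggest that it initiates carboxysome assembly by recruiting luminal cargo through its N-terminus while its C-terminal region anchors the shell proteins [see: Oltrogge, L. M., et al., Nature Structural & Molecular Biology 27: 281-287 (2020)]. A multiple sequence alignment of 100 CsoS2 orthologs showed that the C-terminal region is highly conserved, particularly at the terminal residues (FIG. 4A), which suggests functional importance. We therefore decided to investigate the function of the terminal 24 residues of H. neapolitanus CsoS2 (SKITGSSGNDTQGSLITYSGGARG; SEQ ID NO: 1), which we named "S2CP", short for CsoS2 C-terminal Peptide. The nucleic acid sequence encoding the S2CP peptide is set forth in SEQ ID NO: 7. We also considered the possibility that a slightly longer variant of S2CP could improve encapsulation efficacy without adding excessive bulk to the heterologous protein cargo. To this end, the terminal 30 residues of H. neapolitanus CsoS2 (KPEKPGSKITGSSGNDTQGSLITYSGGARG; SEQ ID NO: 94) were selected as an encapsulation peptide variant and named "S2CP(30)". The nucleic acid sequence encoding the S2CP(30) peptide is set forth in SEQ ID NO: 95.
Using a pull-down assay, we investigated whether a non-native protein cargo tagged with S2CP could interact with CsoS1A, CsoS1D, or CsoS4A, which represent the BMC-H, BMC-T, and BMC-P shell protein types, respectively. We generated pES2-Pcon4-His6-meGFP-S2CP-TT7, purified His6-meGFP-S2CP, and incubated the protein with E. coli lysates in which CsoS1A-SII, SII-CsoS1D, or CsoS4A-SII was expressed using PT7. As a negative control, purified His6-meGFP was incubated in the same way with the same shell protein-containing lysates. The mixtures were purified via Strep-Tactin, and the purified fractions from the six mixtures were analyzed by western blotting for the presence of GFP. His6-meGFP-S2CP co-eluted with CsoS1A-SII but not with SII-CsoS1D or CsoS4A-SII (FIG. 4B). His6-meGFP did not appear to co-elute with CsoS1A-SII, SII-CsoS1D, or CsoS4A-SII. This indicates that S2CP is required for His6-meGFP to interact with CsoS1A. A previous report demonstrated that full-length CsoS2 interacts with CsoS1A [see: Cai et al., 2015]; here we show that the terminal 24 residues of CsoS2 alone are sufficient for the interaction. The association of S2CP with CsoS1A, the major shell module of the alpha-carboxysome, should allow this peptide sequence to target protein cargo to the shell complex. Based on this result alone, however, it cannot yet be determined whether S2CP can mediate cargo encapsulation within the shell or merely targets cargo to the shell periphery.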
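Example 4:
Recombinant formation of simplified alpha-carboxysome shells
We investigated the interactions between Cso components in order to construct a simplified microcompartment shell based on knowledge of the component structures. Our approach involved translational fusions of FPs to the shell components and to S2CP, to serve as probes of protein-protein interactions. Using the HcKan_O-FP plasmids, four FPs (mTurquoise2, meGFP, mKOK, and mCherry) were fused to both the amino and carboxy termini of CsoS4A, and the hybrid proteins were expressed in E. coli using the PCON5 promoter from a pES6 plasmid. Only CsoS4A-mCherry appeared to be generally homogeneously distributed in the cytosol (FIG. 8A). The remaining fusion products displayed varying degrees of aggregation (data not shown), making them less than ideal for use as probes. CsoS4A-mCherry was therefore chosen as the shell component probe. In addition, meGFP-S2CP was expressed using PCON4 in the pES2 plasmid, and the protein was found to be generally diffuse in the cytosol (FIG. 8B).
With the shell (CsoS4A-mCherry) and targeting peptide (meGFP-S2CP) probes established, we then expressed CsoS1D and CsoS1A together with these probes using the pathway plasmid pCKH-Cso-PmChTHC (FIG. 5A, Table 6).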
Pathway | Assembled TUs |
Cso-PmChTHC | 2A-4A-6A-7A |
Cso-PSIITHC | 2A-4A-6B-7A |
Cso-PSIITH | 2B-4A-5A |
Cso-PSIIH | 2B-3A |
Cso-PSIITHCU,S2CP | 2C-4A-6B-7A |
Cso-PSIITHCU | 2D-4A-6B-7A |
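In our pathway nomenclature, PmCh denotes the pentameric shell protein (CsoS4A) fused to mCherry, T the trimer (CsoS1D), H the hexamer (CsoS1A), and C the cargo (meGFP-S2CP). In cells expressing these four components, adding IPTG to 50 μM caused CsoS4A-mCherry and meGFP-S2CP to co-localize (FIG. 5B). The degree of co-localization was quantified using Manders' co-localization coefficients (MCC), [tM1, tM2], where tM1 is the fraction of green signal found in regions with red signal and tM2 is the fraction of red signal found in green regions [see: Dunn, K. W., et al., American Journal of Physiology-Cell Physiology 300: C723-C742 (2011)]. For the cells examined, the MCC values were [0.688, 0.758], which indicates a substantial degree of co-localization.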
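The definition of tM1 and tM2 given above can be made concrete with a few lines of code. The sketch below shows one common thresholded formulation of Manders' coefficients on flattened pixel-intensity lists; it is not the ImageJ implementation used in the document, and the thresholds and pixel values are hypothetical.

```python
def manders(green, red, green_thr=0.0, red_thr=0.0):
    """Return (tM1, tM2) for two equally sized lists of pixel intensities.

    tM1: fraction of above-threshold green intensity located in pixels
         where red is also above its threshold.
    tM2: fraction of above-threshold red intensity located in pixels
         where green is also above its threshold.
    """
    if len(green) != len(red):
        raise ValueError("channels must have the same number of pixels")
    pairs = list(zip(green, red))
    green_total = sum(g for g, _ in pairs if g > green_thr)
    red_total = sum(r for _, r in pairs if r > red_thr)
    green_coloc = sum(g for g, r in pairs if g > green_thr and r > red_thr)
    red_coloc = sum(r for g, r in pairs if g > green_thr and r > red_thr)
    tm1 = green_coloc / green_total if green_total else 0.0
    tm2 = red_coloc / red_total if red_total else 0.0
    return tm1, tm2


# Toy four-pixel image: two pixels carry both signals.
tm1, tm2 = manders([10, 20, 0, 30], [5, 0, 40, 10])
print(f"tM1 = {tm1:.3f}, tM2 = {tm2:.3f}")
```

Values near 1 for both coefficients mean that almost all of each signal sits where the other signal is present; the choice of thresholds strongly affects the result, so they should be set by a documented procedure.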
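We then proceeded to determine whether the observed fluorescent foci represented protein assemblies that could be purified. Two purification strategies were attempted. The first was to incubate E. coli lysate expressing Cso-PmChTHC with pure CsoS4A-SII before purification by Strep-Tactin. The second was to replace CsoS4A-mCherry with CsoS4A-SII in the Cso-PmChTHC pathway, creating a new pathway, Cso-PSIITHC. Proteins purified via Strep-Tactin were further purified by anion exchange chromatography (AIEX) using Q Sepharose. For both purification strategies, two elution peaks appeared in the AIEX chromatogram, at 0.3 M and 0.4 M NaCl (FIGS. 9A-B). Fractions from both peaks were examined by transmission electron microscopy (TEM). Numerous capsid-like structures about 20 nm in diameter were observed in the 0.4 M NaCl elution fraction (FIGS. 5C-D), and markedly fewer such structures were observed in the 0.3 M NaCl fraction (FIGS. 9A-B). We reasoned that the capsid-like structures elute mainly at 0.4 M NaCl and that some were observed in the 0.3 M NaCl fraction because the two peaks overlap. Furthermore, CsoS4A-SII alone eluted as a single peak at 0.3 M NaCl when subjected to the same AIEX procedure (FIG. 9C). The 0.3 M NaCl peak observed for Cso-PmChTHC and Cso-PSIITHC therefore most likely corresponds to CsoS4A-SII that was not incorporated into shells.
CsoS2 has been proposed to be important for assembly of the alpha-carboxysome by recruiting shell proteins through its C-terminus [see: Oltrogge, L. M., et al., Nature Structural & Molecular Biology 27: 281-287 (2020)]. In the Cso-PmChTHC and Cso-PSIITHC constructs, the terminal 24 residues of CsoS2 (S2CP; SEQ ID NO: 1) could be supporting shell assembly. To examine whether S2CP is essential for the formation of shells derived from alpha-carboxysome components, we constructed the pathway Cso-PSIITH, which lacks S2CP. In addition to a similar AIEX chromatogram (FIG. 10A), capsid-like structures indistinguishable from those produced by Cso-PmChTHC and Cso-PSIITHC appeared with the Cso-PSIITH combination. These structures were more abundant in the 0.4 M NaCl fraction than in the 0.3 M NaCl fraction (FIG. 5E for 0.4 M NaCl, FIG. 10B for 0.3 M). These results demonstrate that S2CP is dispensable for formation of the observed protein shells.
We next sought to determine the minimal components required for shell assembly. Given that CsoS1A and CsoS1D are built from the same protein domain, we considered the possibility that a protein shell could be built from CsoS1A and CsoS4A alone, each of which comes from a different protein domain. A new pathway combination expressing CsoS1A and CsoS4A-SII, Cso-PSIIH, was constructed (pCKH-Cso-BMC; SEQ ID NO: 64), and the proteins were purified as before (FIG. 11A). Capsid-like structures resembling those purified from the previous pathway combinations were again observed (FIG. 11B for the 0.3 M NaCl fraction, FIG. 5D for 0.4 M NaCl). Capsid-like structures did not appear to assemble from CsoS1A-SII alone (FIG. 11C). Moreover, constructs in which CsoS1A and CsoS1D were co-expressed, or in which CsoS1D and CsoS4A were co-expressed, failed to produce protein shells (FIGS. 11D-E). Taken together, these results indicate that CsoS1A and CsoS4A are necessary and sufficient for assembly of the capsid-like shells.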
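Example 5:
S2CP targets cargo to the lumen of simplified carboxysome shells.
To determine whether S2CP can target cargo to the lumen of the simplified carboxysome shell, the E. coli UmuD N-terminal degradation tag (residues 1 - 40) was fused to the amino terminus of meGFP-S2CP [see: Neher, S. B. et al., Proceedings of the National Academy of Sciences 100: 13219-13224 (2003)]. UmuD1-40-meGFP-S2CP was co-expressed with CsoS1A, CsoS1D, and CsoS4A by constructing the plasmid pCKH-Cso-PSIITHCU,S2CP. We hypothesized that if S2CP can target UmuD1-40-meGFP into the carboxysome, UmuD1-40-meGFP-S2CP (SEQ ID NO: 49) would be protected from proteolysis by the endogenous ClpXP protease, which recognizes and degrades proteins tagged with the N-terminal region of UmuD (FIG. 6A). If, on the other hand, S2CP only targets cargo to the outside of the shell, UmuD1-40-meGFP-S2CP would be exposed to ClpXP and degraded. A similar construct, pCKH-Cso-PSIITHCU, whose only difference is the absence of S2CP from UmuD1-40-meGFP, served to account for stochastic encapsulation of UmuD1-40-meGFP by the shell. Western blotting was used for GFP detection (FIG. 6B). UmuD1-40-meGFP-S2CP was detected in the lane corresponding to purified protein from Cso-PSIITHCU,S2CP. UmuD1-40-meGFP (SEQ ID NO: 52) could not be detected in the lane corresponding to the same amount of purified protein from Cso-PSIITHCU (as determined by absorbance at 280 nm). As a further corroborating analysis, similar protein shells were observed in the elution fractions for both pathway combinations (FIGS. 6C-D). This indicates that the protection of UmuD1-40-meGFP from proteolysis is likely due to S2CP-mediated encapsulation into the shell.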
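Example 6:
Atomic model of simplified alpha-carboxysome shells
To better understand the molecular structure of the simplified alpha-carboxysome, cryo-electron microscopy (cryo-EM) was used to obtain near-atomic-scale models of Cso-PSIITHC, Cso-PSIITH, and Cso-PSIIH. Two different shell sizes were observed for Cso-PSIITHC, corresponding to icosahedral capsid triangulation numbers T = 3 and T = 4. The shell models were obtained at resolutions of 3.24 and 2.90 Å, respectively. For Cso-PSIITH and Cso-PSIIH, only T = 3 shells were observed, and the structures were obtained at resolutions of 3.35 Å and 3.14 Å, respectively. The proportion of T = 3 shells observed in Cso-PSIITHC was 14.6%, whereas that of T = 4 shells was 85.4%. The reported X-ray crystal structures of H. neapolitanus CsoS1A (PDB: 2EWH) and CsoS4A (PDB: 2RCF) were used for model fitting [see: Tanaka, S. et al., Science 319: 1083-1086 (2008); Tsai, Y. et al., PLOS Biology 5: e144 (2007)]. H. neapolitanus CsoS1D is expected to assemble as a double-stacked layer of trimers, as inferred from the structure of CsoS1D from Prochlorococcus marinus MED4, with which it shares 60% identical residues [see: Klein, M. G., et al., Journal of Molecular Biology 392: 319-333 (2009)]. However, no double-stacked layer could be discerned in the electron density maps for Cso-PSIITHC and Cso-PSIITH, which suggests that CsoS1D was not incorporated into these shells. Electron density was also not detected for the meGFP-S2CP cargo in the luminal space of shells purified from Cso-PSIITHC. Nevertheless, in light of computational studies suggesting that interactions between shell protomers and cargo influence shell size and shape, it is conceivable that formation of the T = 4 shells, which were observed only in Cso-PSIITHC, is influenced by cargo encapsulation and that shells without cargo assemble as the smaller T = 3 form.
Because there were no noticeable differences among the T = 3 shell models obtained from the three pathway combinations, we focused on the shells produced by Cso-PSIITHC for model building and refinement (Table 7).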
Parameter | Cso-PSIITHC, T = 3 | Cso-PSIITHC, T = 4 | Cso-PSIITH, T = 3 | Cso-PSIIH, T = 3 |
Accession codes | | | | |
Map (EMDB) | EMD-30384 | EMD-30385 | Not deposited; structure nearly identical to Cso-PSIITHC T = 3 | Not deposited; structure nearly identical to Cso-PSIITHC T = 3 |
Coordinates (PDB) | 7CKB | 7CKC | | |
Data collection | | | | |
Microscope | Titan Krios (Research Center for Ultra-High Voltage Electron Microscopy, Osaka University, Osaka, Japan) | | | |
Voltage (kV) | 300 | | | |
Detector | Falcon II | | | |
Magnification | 96 k | | | |
Pixel size (Å) | 0.86 | | | |
Defocus range (μm) | 1.5~1.9 | 1.5~1.9 | 1.5~2.2 | 1.5~2.2 |
Electron exposure (e-/Å2) | 64.3 | 64.3 | 68.1 | 68.1 |
Reconstruction | | | | |
Software | | | | |
Initial particle images (no.) | 94129 | 94129 | 14401 | 15680 |
Final particle images (no.) | 11468 | 67192 | 9349 | 7678 |
Box size (pixels) | | | | |
Symmetry imposed | | | | |
Accuracy of rotations (°) | | | | |
Accuracy of translations (pixels) | | | | |
Map resolution (Å) | 3.24 | 2.9 | 3.35 | 3.14 |
FSC threshold | 0.143 | 0.143 | 0.143 | 0.143 |
Map resolution range (Å) | ∞ ~ 3.24 | ∞ ~ 2.90 | ∞ ~ 3.35 | ∞ ~ 3.14 |
Map sharpening B factor (Å2) | | | | |
Model building & refinement | | | | |
Software | Chimera (Pettersen et al., 2004), Coot (Emsley & Cowtan, 2004), Phenix (Adams et al., 2010) | | | |
Initial models used (PDB code) | 2RCF, 2EWH | | | |
Model resolution (Å) | 2.15, 1.40 | | | |
FSC threshold | | | | |
Model resolution range (Å) | | | | |
Model composition | | | | |
Non-hydrogen atoms | 114840 | 150300 | | |
Protein residues | 15840 | 20820 | | |
Ligands & water | 0 | 0 | | |
B factor, overall (Å2) | 10.67 | 13.91 | | |
R.m.s. deviations | | | | |
Bond lengths (Å) | 0.0076 | 0.0076 | | |
Bond angles (°) | 0.87 | 1.23 | | |
Validation | | | | |
MolProbity score | 1.67 | 1.33 | | |
Clashscore | 6.97 | 5.49 | | |
Poor rotamers (%) | 1.05 | 0 | | |
Cβ deviations (%) | 0 | 0 | | |
Ramachandran plot | | | | |
Favored (%) | 96.01 | 97.84 | | |
Allowed (%) | 3.99 | 2.16 | | |
Disallowed (%) | 0 | 0 | | |
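The T = 3 shell contains 12 homo-pentamers of CsoS4A and 20 homo-hexamers of CsoS1A, with an outer diameter of 217 Å and a calculated molecular mass of 1.7 MDa (FIG. 7A). The T = 4 shell contains 12 homo-pentamers and 30 homo-hexamers, with an outer diameter of 247 Å and a mass of 2.3 MDa (FIG. 7B). The two shell types are otherwise broadly similar. The concave sides of CsoS1A and CsoS4A, where the N- and C-termini reside, face the outside of the shell (FIG. 7A).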
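The pentamer and hexamer counts stated above follow the standard relations for icosahedral capsids with triangulation number T: 12 pentamers, 10(T - 1) hexamers, and 60T chains in total. The short sketch below checks the two shells against those relations; the function name is hypothetical.

```python
def icosahedral_counts(t_number):
    """Capsomer and chain counts for an icosahedral shell of triangulation number T."""
    if t_number < 1:
        raise ValueError("T must be a positive integer")
    pentamers = 12
    hexamers = 10 * (t_number - 1)
    chains = 5 * pentamers + 6 * hexamers  # always equal to 60 * T
    return {"pentamers": pentamers, "hexamers": hexamers, "chains": chains}


for t in (3, 4):
    print(f"T = {t}:", icosahedral_counts(t))
```

In the shells described here, the pentamer positions are occupied by CsoS4A and the hexamer positions by CsoS1A, so the T = 4 shell differs from the T = 3 shell only by ten additional CsoS1A hexamers.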
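Example 7:
Determination of the encapsulation efficacy of S2CP and S2CP(30) for Cso-BMC
Using the Cso-BMC shell structures, the average copy numbers of UmuD1-40-GFP-S2CP and UmuD1-40-GFP-S2CP(30) could be quantified via GFP fluorescence. Based on the cryo-EM observations, for all calculations involving shell molecular mass we assumed that all shells co-expressed with cargo are a mixture of the T = 3 and T = 4 forms. Because the ratio of shell forms may differ from sample to sample, two values are given, calculated assuming that all shells in the sample are either T = 3 or T = 4. On average, 7.7 - 8.0 copies of UmuD1-40-GFP-S2CP(30) were encapsulated per shell, compared with 1.6 - 1.7 copies of UmuD1-40-GFP-S2CP. Densitometric analysis of Cso-PSIIH shells encapsulating UmuD1-40-GFP-S2CP or UmuD1-40-GFP-S2CP(30) also showed that about 4-fold more UmuD1-40-GFP-S2CP(30) than UmuD1-40-GFP-S2CP was found in the shells (FIG. 16). S2CP(30) is therefore a more effective encapsulation peptide than S2CP.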
Encapsulation peptide | Number of residues | Average number of UmuD1-40-GFP cargo per shell (T = 3) | Average number of UmuD1-40-GFP cargo per shell (T = 4) |
S2CP | 24 | 1.6 | 1.7 |
S2CP(30) | 30 | 7.7 | 8.0 |
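The copy numbers in the table are ratios of cargo to shell, so the relative benefit of the longer peptide is a simple division. The sketch below takes the table values as given and computes the fold difference under each shell-size assumption; the helper `copies_per_shell` shows the general form of the calculation, and its name and inputs are hypothetical.

```python
# Average cargo copies per shell from the table, keyed by assumed T number.
COPIES = {
    "S2CP": {3: 1.6, 4: 1.7},
    "S2CP(30)": {3: 7.7, 4: 8.0},
}


def copies_per_shell(cargo_molar, shell_molar):
    """Average cargo molecules per shell from two molar concentrations."""
    if shell_molar <= 0:
        raise ValueError("shell concentration must be positive")
    return cargo_molar / shell_molar


def fold_improvement(t_number):
    """Ratio of S2CP(30) loading to S2CP loading for a given shell type."""
    return COPIES["S2CP(30)"][t_number] / COPIES["S2CP"][t_number]


for t in (3, 4):
    print(f"T = {t}: {fold_improvement(t):.1f}-fold more cargo with S2CP(30)")
```

The fluorescence-based ratios come out slightly above the roughly 4-fold difference estimated by densitometry, and the choice of T = 3 or T = 4 changes the ratio only marginally.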
실시예 8:
Cso-BMC를 사용한 효소 활성의 안정화
단백질 쉘은 가열 또는 동결과 같은 물리적 손상 또는 유기 공용매 또는 비생리학적 pH의 존재와 같은 화학적 손상에 대해 효소에 안정성을 부여하기 위한 플랫폼으로 주목받고 있다[참조: Demchuk & Patel, Biotechnology Advances, 41: 107547 (2020); Silva, C., et al., Critical Reviews in Biotechnology, 38(3): 335-350 (2018)]. 효소 제한은 종종 이들의 구조적 유연성을 감소시키며, 이는 때때로 변성을 유도하는 구조적 변화에 대한 안정성을 부여한다[참조: Das, Zhao, (2020) Biochemistry, 59(31): 2870-2881; Kuchler, et al., Nature Nanotechnology, 11(5): 409-420 (2016)]. 현재, 동종체 단백질 쉘은 어셈블리의 상대적 용이성 및 입자 크기의 균질성에 기여하는 효소를 호스팅하기 위해 더 잘 확립되어 있고, 이는 조작 중의 예측가능성과 취급용이성을 개선시킨다[참조: Patterson, D. P., Prevelige, P. E., & Douglas, T. (2012). ACS Nano, 6(6): 5000-5009; Patterson, D. P., Schwarz, B., El-Boubbou, K., van der Oost, J., Prevelige, P. E., & Douglas, T. (2012). Soft Matter, 8(39): 10158-10166; Sanchez-Sanchez et al., Journal of Nanobiotechnology 13(1): 66 (2015); Tan, Xue, & Yew, Molecules 26(5): 1389 (2021)]. 이들의 이종체 조성으로 인해, 최소 BMC-유래 쉘은 효소를 호스팅하기 위한 신규 스캐폴드를 나타내고, 이러한 쉘은 목적의 변형을 위한 더 많은 방법을 제공할 수 있는 반면, 일반적으로 균질한 입자 크기는 여전히 조작을 촉진하는 예측가능성을 부여한다[참조: Turmo, A., Gonzalez-Esquer, C. R., & Kerfeld, C. A. FEMS Microbiology Letters, 364(18): fnx176 (2017)]. 그러나, 최소 BMC-유래 쉘은 이종 효소를 호스팅하기 위해 아직 조사되지 않았다[참조: Cai, F., Bernstein, S. L., Wilson, S. C. & Kerfeld, C. A. Plant Physiol 170: 1868-1877 (2016); Hagen, A., et al., Nature Communications 9: 2881, (2018)]. 이에 의해, Cso-BMC가 효소를 호스팅하고 안정화할 수 있는지를 조사하는 것이 촉진되었다. 공 Cso-BMC(Cso-PSIIH)는 먼저 열 충격, 동결, 메탄올 공용매의 존재 및 pH 2~13의 환경에 대한 이들의 안정성에 대해 시험되었다. 이들 조건에 적용된 쉘의 DLS 스펙트럼은 트리스·HCl-50/350(트리스·HCl 50mM pH 8.0, NaCl 350mM)의 쉘의 것과 비교했다. 입자 크기 분포의 현저한 변화 및/또는 복수 피크의 외관은 단백질 쉘 분해를 나타낸다[참조: Yu, Z., Reid, J. C., & Yang, Y.-P. Journal of Pharmaceutical Sciences 102(12): 4284-4290 (2013)]. 시험된 조건에 기초하여, Cso-BMC는 15분 동안 최대 70℃까지, 20% v/v 메탄올, 7회 연속 동결-해동 및 pH 5 내지 11 사이에서 안정한 것으로 생각되었다(도 17).
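To investigate the ability of Cso-BMC to enclose enzymes of substantially different molecular sizes, evolved pea cytosolic ascorbate peroxidase (APEX2), a 27.0 kDa monomer [see: Lam et al., Nature Methods 12(1): 51-54 (2015)], and E. coli beta-galactosidase (LacZ), a 466.0 kDa homotetramer, were selected for encapsulation [see: Golan, et al., Biochimica et Biophysica Acta (BBA) - Bioenergetics, 1293(2): 238-242; Lam et al., Nature Methods 12(1): 51-54 (2015)]. Because S2CP(30) had been found to be more effective than S2CP at mediating encapsulation of recombinant proteins, it was fused to the C-terminus of each enzyme to mediate encapsulation. The enzymes were also N-terminally tagged with a hexahistidine (His6) tag to facilitate downstream removal of unencapsulated enzyme that may co-purify with the shells [see: Nichols, Kennedy, & Tullman-Ercek, 2019]. Cso-PSIIH shells co-expressed with the enzymes were constructed and purified. SDS-PAGE analysis and western blotting confirmed the presence of the target enzymes in the shell samples (Figure 18A-B), and the average copy number of enzyme per shell was estimated by Coomassie blue densitometry (Table 8) [see: Hagen, A., et al., Nature Communications 9: 2881 (2018); Nichols et al., Methods in Enzymology, 617, 155-186 (2019)]. Encapsulation of these enzymes did not appear to affect Cso-BMC size and morphology significantly (Figure 18C-E).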
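Encapsulating enzymes in protein shells is known, in some cases, to alter the catalytic properties of the enzymes. To investigate how encapsulation by Cso-BMC might affect the catalytic efficiency of APEX2 and LacZ, steady-state kinetics were performed on both the free and the encapsulated enzymes, and the data were fitted to the Michaelis-Menten model to obtain the turnover number (kcat), the Michaelis-Menten constant (KM), and the catalytic efficiency (kcat/KM) (Table 8, Figure 19). For encapsulated APEX2, kcat/KM decreased to about 30% of that of the free enzyme. For encapsulated LacZ, kcat/KM was not significantly different from that of the free enzyme. The kinetic constants kcat and KM obtained for both free enzymes were in reasonable agreement with previous studies, suggesting that the presence of S2CP(30) does not affect the activity of the free enzymes [see: Juers, Hakda, Matthews, & Huber, Biochemistry, 42(46), 13505-13511 (2003); Lam et al., Nature Methods 12(1): 51-54 (2015)].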
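As an illustration of how kcat, KM and kcat/KM can be extracted from steady-state rate data, the sketch below fits the Michaelis-Menten model through a Hanes-Woolf linearization, [S]/v = [S]/Vmax + KM/Vmax, using ordinary least squares. The linearization is a simplification chosen to keep the example dependency-free; the specification does not state which fitting procedure was used. The function names, the synthetic noise-free data and the enzyme concentration are all hypothetical.

```python
def fit_michaelis_menten(substrate_conc, initial_rates):
    """Estimate (Vmax, KM) from a Hanes-Woolf linear fit of [S]/v against [S]."""
    xs = list(substrate_conc)
    ys = [s / v for s, v in zip(xs, initial_rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # slope equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # intercept equals KM / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def catalytic_efficiency(vmax, km, enzyme_conc):
    """Return (kcat, kcat/KM) given Vmax, KM and total enzyme concentration."""
    kcat = vmax / enzyme_conc
    return kcat, kcat / km


# Synthetic, noise-free data generated from known parameters (arbitrary units).
true_vmax, true_km = 2.0, 0.5
substrate = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
rates = [true_vmax * s / (true_km + s) for s in substrate]

vmax, km = fit_michaelis_menten(substrate, rates)
kcat, efficiency = catalytic_efficiency(vmax, km, enzyme_conc=0.01)
```

With real, noisy measurements a direct nonlinear least-squares fit of the rate equation is generally preferable, because linearized forms distort the error structure of the data.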
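To determine the possible stabilizing effect of Cso-BMC on enzymes, free and shell-encapsulated enzyme samples were challenged with the conditions described above under which empty shells were found to be stable. Enzyme activities were normalized to the activity of the native samples to determine residual activity (Figure 20). Cso-BMC conferred a moderate level of thermostability on both enzymes. The encapsulated enzymes retained about 90% of their activity after incubation at 40 °C for 15 minutes, in contrast to 40% for the free enzymes. At 50 °C, encapsulated APEX2 retained about half of its activity, whereas the free enzyme was essentially inactive. However, at 50 °C the activity of encapsulated LacZ was slightly higher than that of the free enzyme. At 60 °C and above, all enzyme samples were inactive. Cso-BMC had a protective effect on APEX2 in up to 20% v/v methanol. By contrast, an increase in activity was observed in methanol for both free and encapsulated LacZ. The presence of up to 40% v/v methanol has been reported not to denature LacZ but rather to enhance its activity [see: Shifrin & Hunn, Archives of Biochemistry and Biophysics, 130, 530-535 (1969)]. It is therefore unlikely that Cso-BMC stabilized LacZ against methanol. With regard to freeze-thaw stability, Cso-BMC stabilized both enzyme species for up to 7 consecutive cycles.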
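The encapsulated enzymes showed higher activity within Cso-BMC at pH 10 to 11, but lower activity at pH 5 to 6. It was reasoned that an acidic microenvironment within Cso-BMC may have shifted the pH-activity profile of the encapsulated enzymes towards more alkaline conditions compared with the free enzymes. The influence of anionic scaffolds on the pH-dependent activity of enzymes has been observed for synthetic maleic acid polymer scaffolds carrying trypsin and chymotrypsin, and more recently for the DNA polyphosphate backbone in the glucose oxidase-horseradish peroxidase (GOx-HRP) cascade [see: Goldstein, Biochemistry 11(22): 4072-4084 (1972); Goldstein, Levin, & Katchalski, Biochemistry, 3(12): 1913-1919 (1964); Zhang, Tsitkov, & Hess, Nature Communications, 7(1): 13982 (2016)].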
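To date, Cso-BMC likely demonstrates the highest heterologous cargo loading via an encapsulation peptide among minimal BMC-derived shells. The use of encapsulation peptides with such shells has been largely inefficient, and the cargo often cannot be detected by Coomassie blue staining, requiring more sensitive techniques such as immunoblotting or fluorescence [see: Cai, F., et al., Plant Physiol 170: 1868-1877 (2016); Hagen, A., et al., Nature Communications 9: 2881 (2018); Lassila, Bernstein, Kinney, Axen, & Kerfeld, (2014)]. In contrast, for the Cso-BMC and S2CP(30) system, all three heterologous protein cargoes tested (GFP, APEX2, LacZ) can be clearly identified on Coomassie blue-stained gels (Figures 16 and 18).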
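[Table 8]
Quantification of the average copy number of encapsulated enzyme per shell, and kinetic constants of the encapsulated and free enzymes. For the average enzyme copy number per shell, values calculated by assuming that all shells are either T = 3 or T = 4 are provided. Kinetic measurements were performed in triplicate, and mean values are shown with standard errors.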
Example 9:
Production of HO-BMC VLPs in S. cerevisiae
Golden Gate cloning system
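The constructs for expressing HO-BMC VLPs comprised the components described in Table 3, Figure 2 and Figure 3, and were assembled according to the method described in Example 2. Briefly, the yeast promoter PTDH3 was cloned into HcKan_P and designated HcKan_P-TDH3 (SEQ ID NO: 65). The yeast promoter PYEF3 was cloned into HcKan_P and designated HcKan_P-YEF3 (SEQ ID NO: 66). The yeast promoter PPYK1 was cloned into HcKan_P and designated HcKan_P-PYK1 (SEQ ID NO: 67). The yeast promoter PGPM1 was cloned into HcKan_P and designated HcKan_P-GPM1 (SEQ ID NO: 115).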
The HO-H ORF was cloned into HcKan_O and designated HcKan_O-HO-H (SEQ ID NO: 68). The HO-P ORF was cloned into HcKan_O and designated HcKan_O-HO-P (SEQ ID NO: 69). The HO-T1 ORF was cloned into HcKan_O and designated HcKan_O-HO-T1 (SEQ ID NO: 70). The HO-T1-SpyTag ORF was cloned into HcKan_O and designated HcKan_O-HO-T1-SpyTag (SEQ ID NO: 116).
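The yeast terminator TRPL41B (SEQ ID NO: 80) was cloned into HcKan_T and designated HcKan_T-RPL41B (SEQ ID NO: 71). The yeast terminator THBT1 (SEQ ID NO: 81) was cloned into HcKan_T and designated HcKan_T-HBT1 (SEQ ID NO: 72). The yeast terminator TRPS20 (SEQ ID NO: 82) was cloned into HcKan_T and designated HcKan_T-RPS20 (SEQ ID NO: 73). The yeast terminator TYPT31 (SEQ ID NO: 105) was cloned into HcKan_T and designated HcKan_T-YPT31 (SEQ ID NO: 119).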
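The promoter, ORF and terminator parts described above were assembled into the pathway assembly plasmid pCKU (SEQ ID NO: 74). The assembled HO-BMC pathway was then sub-cloned into pGAU-YMRWδ15 (SEQ ID NO: 75) for chromosomal integration of the pathway into the yeast YMRWδ15 site. The construct comprising HO-BMC for integration of the HO-BMC pathway into the yeast YMRWδ15 site was named pGAU-YMRWδ15-HO-BMC (SEQ ID NO: 76). The construct comprising PGPM1-GFP-SpyCatcher-TRPS20 for integration of GFP-SpyCatcher into the yeast YPRCδ15 site was named pGAH-YPRCδ15-GFP-SpyCatcher (SEQ ID NO: 121).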
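Transformation of plasmids and chromosomal integration of genes into yeast were performed following the high-efficiency lithium acetate/single-stranded DNA/PEG-3350 protocol described by Schiestl and co-workers [see: Gietz, R. D. and Schiestl, R. H. Nature Protocols 2: 31 (2007)].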
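For encapsulation of cargo within HO-BMC, it was found that although EPs for HO-BMC have been reported to function when E. coli is the recombinant host, they do not function in yeast [see: Lassila, J. K. et al., Journal of Molecular Biology 426: 2217-2228 (2014)]. An alternative method for cargo encapsulation in HO-shells was therefore adopted, using the SpyCatcher/SpyTag protein conjugation system [see: Hagen, A., et al., Nature Communications 9: 2881 (2018)]. This method involved grafting the SpyTag sequence into a peptide loop of HO-T1 that faces into the shell. The modified HO-T1 subunit is referred to as HO-T1-SpyTag. A cargo protein bearing a fused SpyCatcher domain can thus form a covalent isopeptide bond with HO-T1-SpyTag and be encapsulated within the HO-shell. The transcription units that make up the HO-shell in yeast are compiled in Table 9, and the yeast strains expressing the HO-shell pathway are compiled in Table 10. Schematics of the synthetic operon of the pathway used to produce Cso-BMC in E. coli, and of the HO-ACB pathway used to express HO-BMC in yeast, are presented in Figure 3.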
pPOT | A | B |
22 | PTDH3-HO-H-TRPL41B | |
44 | PYEF3-HO-T1-TRPL41B | PYEF3-HO-T1-SpyTag-TRPL41B |
55 | PPYK1-HO-P-SII-TRPS20 | |
Assembled TU | |
HO-PTH | 2A-4A-5A (integrated at YMRWδ15) |
HO-PT ST H+GFP-SpyCatcher | 2A-4B-5A (integrated at YMRWδ15) + GFP-SpyCatcher (integrated at YPRCδ15) |
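The relationship between the two tables above can be sketched as a small lookup: each pathway code (for example 2A-4B-5A) names one transcription unit per position, and each transcription unit is a promoter-ORF-terminator triple. The dictionary and function names below are hypothetical, and the sketch assumes that the rows labelled 22, 44 and 55 in the first table correspond to the positions written as 2, 4 and 5 in the pathway codes of the second table.

```python
# Transcription units (promoter, ORF, terminator) as listed in the tables above.
# Assumption: rows labelled 22, 44 and 55 correspond to positions 2, 4 and 5
# in the pathway codes; columns A and B give the letter suffix.
TRANSCRIPTION_UNITS = {
    "2A": ("PTDH3", "HO-H", "TRPL41B"),
    "4A": ("PYEF3", "HO-T1", "TRPL41B"),
    "4B": ("PYEF3", "HO-T1-SpyTag", "TRPL41B"),
    "5A": ("PPYK1", "HO-P-SII", "TRPS20"),
}


def expand_pathway(code):
    """Expand a pathway code such as '2A-4B-5A' into its transcription units."""
    return ["-".join(TRANSCRIPTION_UNITS[slot]) for slot in code.split("-")]


ho_pth = expand_pathway("2A-4A-5A")      # shell without SpyTag
ho_spytag = expand_pathway("2A-4B-5A")   # shell carrying HO-T1-SpyTag
```

The two strains differ only at position 4, where the B variant swaps HO-T1 for HO-T1-SpyTag while keeping the same promoter and terminator.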
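Purification of VLPs
After growth in 8 L of YPD (yeast extract 1%, peptone 2%, glucose 2%, BioBasic) at 25 °C for 48 hours, the yeast cells were pelleted and lysed at 20,000 psi using an M-110P microfluidizer with 8 passes. The lysate was centrifuged twice at 20,000 ×g for 20 minutes each time, and the clarified lysate was adjusted to pH 8 using 1 M Tris·HCl pH 12. The lysate was then incubated with 300 μL of biotin blocking buffer (IBA Lifesciences) for 15 minutes with gentle stirring. StrepTrap affinity purification was performed in the same manner as described above.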
Results:
Purification of Cso-BMC and HO-BMC
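Using the constructed synthetic operons (Figure 3), Cso-BMC from E. coli and HO-BMC from yeast were purified. To the inventors' knowledge, this is the first known instance of recombinant protein shell formation using only two components of the H. neapolitanus cso operon. Silver and co-workers reported the formation of H. neapolitanus carboxysomes in E. coli, but this was accomplished by transplanting the entire cso operon, encoding 10 genes, into E. coli [see: Bonacci, W. et al., Proceedings of National Academy of Sciences 109: 478-483 (2012)]. Our system simplifies formation of the protein shell to two genes, csoS1A and csoS4A. CsoS1A assembles into hexamers that form flat hexagonal tiles, whereas CsoS4A assembles into pentamers that occupy the vertices of the shell, capping the flat tiles formed by CsoS1A and giving the shell its icosahedral geometry. The resulting Cso-BMC shells are smaller (22 nm in diameter) than native H. neapolitanus carboxysomes (90 to 110 nm in diameter), but the synthetic shells are highly uniform in size, as evidenced by DLS measurements (Figure 17).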
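Kerfeld and co-workers reported the recombinant expression of HO-BMC in E. coli using three shell protomers, HO-H, HO-P and HO-T1, and obtained its atomic-scale structure [see: Sutter, M. et al., Science 356: 1293-1297 (2017)]. The structure and geometric function of HO-H are similar to those of CsoS1A, and those of HO-P are similar to those of CsoS4A. HO-T1, which resembles a tandem repeat of two HO-H, assembles into trimers that likewise form flat hexagonal tiles. We reconstituted HO-BMC in yeast. Based on the current understanding of the literature, this is the first instance of recombinant expression of BMC-derived protein shells in yeast. Although recombinant protein titers in yeast are generally lower than in E. coli (Figure 14), expression of HO-BMC VLPs in yeast opens the way to tailoring by the eukaryotic post-translational modification machinery [see: Sudbery, P. E. Curr Opin Biotechnol 7 (1996)]. It is also noteworthy that many yeast-derived biomolecules, and the organism itself, have Generally Recognized As Safe (GRAS) status, which places HO-BMC in a good position for vaccine development [see: Sewalt, V. et al., Industrial Biotechnology 12: 295-302 (2016)].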
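When observed under TEM, Cso-BMC appear as capsid-like structures about 20 nm in diameter, some of them with angular facets. This shape is reminiscent of native H. neapolitanus carboxysomes although, as noted above, the diameter of the synthetic Cso-BMC is about 20% of that of the native carboxysome. A plausible explanation for the smaller size of Cso-BMC is that the luminal space is empty. In native carboxysomes, hundreds to thousands of proteins are known to be densely packed within the shell [see: Bonacci, W. et al. Proceedings of the National Academy of Sciences 109: 478-483 (2012)]. For biotechnological purposes, it should be more desirable for VLPs to be free of their native luminal proteins, so that recombinant protein cargo can be encapsulated within these shells more efficiently [see: Schwarz, B. et al., Advances in Virus Research 97: 1-60 (2017)].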
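Our yeast-expressed HO-BMC shells closely resemble, in size and shape, those expressed in E. coli as reported by Kerfeld and co-workers [see: Sutter et al., Science 356: 1293-1297 (2017)]. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) analysis of the protein eluates from affinity purification of Cso-BMC and HO-BMC shows the presence of the expected shell protomer proteins. Because the hexamers (CsoS1A, HO-H) and pentamers (CsoS4A, HO-P) have similar molecular weights (10±1 kDa), they are not well resolved by SDS-PAGE. Nevertheless, given the presence of protein shells, it can be inferred that both species are present in the observed protein band at about 10 kDa. Atomic-level structural details of both Cso-BMC and HO-shells indicate that these particles are largely uniform in size [see: Sutter, M. et al., Science 356: 1293-1297 (2017); Tan, Ali, et al., Biomacromolecules doi:10.1021/acs.biomac.1c00533 (2021)]. This uniformity of size is a useful feature in VLP engineering, because it translates into predictability when VLPs are functionalized as biomaterials [see: Schwarz, B. et al., Advances in Virus Research 97: 1-60 (2017)].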
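The similarity in molecular weight between the hexamer and pentamer subunits can be checked approximately from their sequences. The sketch below sums standard average residue masses over the CsoS1A and CsoS4A amino acid sequences given as SEQ ID NO: 2 and SEQ ID NO: 3 in the sequence listing. The function and variable names are hypothetical, and the result is for the untagged polypeptides only: any affinity tag (such as SII) or post-translational modification would change the value.

```python
# Standard average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "Gly": 57.0519, "Ala": 71.0788, "Ser": 87.0782, "Pro": 97.1167,
    "Val": 99.1326, "Thr": 101.1051, "Cys": 103.1388, "Leu": 113.1594,
    "Ile": 113.1594, "Asn": 114.1038, "Asp": 115.0886, "Gln": 128.1307,
    "Lys": 128.1741, "Glu": 129.1155, "Met": 131.1926, "His": 137.1411,
    "Phe": 147.1766, "Arg": 156.1875, "Tyr": 163.1760, "Trp": 186.2132,
}
WATER_MASS = 18.0153  # one water is added back for the free termini

# Three-letter sequences copied from SEQ ID NO: 2 and SEQ ID NO: 3.
CSOS1A = """
Met Ala Asp Val Thr Gly Ile Ala Leu Gly Met Ile Glu Thr Arg Gly
Leu Val Pro Ala Ile Glu Ala Ala Asp Ala Met Thr Lys Ala Ala Glu
Val Arg Leu Val Gly Arg Gln Phe Val Gly Gly Gly Tyr Val Thr Val
Leu Val Arg Gly Glu Thr Gly Ala Val Asn Ala Ala Val Arg Ala Gly
Ala Asp Ala Cys Glu Arg Val Gly Asp Gly Leu Val Ala Ala His Ile
Ile Ala Arg Val His Ser Glu Val Glu Asn Ile Leu Pro Lys Ala Pro
Gln Ala
"""
CSOS4A = """
Met Lys Ile Met Gln Val Glu Lys Thr Leu Val Ser Thr Asn Arg Ile
Ala Asp Met Gly His Lys Pro Leu Leu Val Val Trp Glu Lys Pro Gly
Ala Pro Arg Gln Val Ala Val Asp Ala Ile Gly Cys Ile Pro Gly Asp
Trp Val Leu Cys Val Gly Ser Ser Ala Ala Arg Glu Ala Ala Gly Ser
Lys Ser Tyr Pro Ser Asp Leu Thr Ile Ile Gly Ile Ile Asp Gln Trp
Asn Gly Glu
"""


def molecular_mass_kda(three_letter_sequence):
    """Return (average mass in kDa, residue count) for a three-letter sequence."""
    residues = three_letter_sequence.split()
    mass_da = sum(AVERAGE_RESIDUE_MASS[r] for r in residues) + WATER_MASS
    return mass_da / 1000.0, len(residues)


csos1a_mass, csos1a_length = molecular_mass_kda(CSOS1A)
csos4a_mass, csos4a_length = molecular_mass_kda(CSOS4A)
print(f"CsoS1A: {csos1a_length} residues, {csos1a_mass:.2f} kDa")
print(f"CsoS4A: {csos4a_length} residues, {csos4a_mass:.2f} kDa")
```

Both subunits fall close to 10 kDa on this estimate, which is consistent with the observation that they are not well resolved on a standard SDS-PAGE gel.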
Summary
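BMCs are promising platforms for the spatial programming of metabolic reactions in microbial cell factories, and can be repurposed as specialized biochemical delivery vehicles [see: Kerfeld C.A. et al. Nature Reviews Microbiology 16: 277 (2018)]. However, a major obstacle to exploiting such protein shells for these purposes is the often-complex nature of their assembly, which cannot readily be translated into recombinant systems. Assembly of protein shells from two types of shell protein is a drastic reduction from the 10 identified components of the H. neapolitanus cso operon that produce native-like alpha-carboxysomes [see: Bonacci, W. et al. Proceedings of the National Academy of Sciences 109: 478-483 (2012)]. In addition, a sequence capable of targeting heterologous protein cargo to the simplified carboxysome shell, S2CP, was identified. An encapsulation peptide variant containing six additional residues, S2CP(30), was shown to be about four times more effective than S2CP at mediating encapsulation of GFP cargo protein into Cso-BMC. S2CP and S2CP(30) are therefore both useful for controlling the amount of heterologous protein cargo packaged within Cso-BMC. Cso-BMC can also stabilize two enzymes, APEX2 and LacZ, against common enzyme-denaturing factors such as heat shock, the presence of methanol co-solvent, consecutive freeze-thaw cycles and highly alkaline environments. To our knowledge, this is the first demonstration of the use of a minimal-component BMC-derived shell to host enzymes and stabilize them against such denaturing factors. Cso-BMC expands the current range of VLPs that can be used to encapsulate and stabilize enzymes [see: Demchuk & Patel, Biotechnology Advances, 41: 107547 (2020)].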
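In addition, we recombinantly expressed HO-BMC in yeast and provide evidence that the shells can encapsulate recombinant protein cargo. To our knowledge, this is the first demonstration of recombinant expression of BMC shells in yeast.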
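References
The listing or discussion of an apparently prior-published document in this specification should not necessarily be taken as an acknowledgement that the document is part of the state of the art or is common general knowledge.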
Adams, P. D., Afonine, P. V., Bunkoczi, G., Chen, V. B., Davis, I. W., Echols, N., Zwart, P. H. (2010). PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallographica. Section D: Biological Crystallography, 66(Pt 2), 213-221. doi:10.1107/s0907444909052925.
Anderson, J. C. (2006). Anderson Promoter Library Registry of Standard Biological Parts. Retrieved from parts.igem.org/Promoters/Catalog/Anderson.
Baneyx, F. (1999). Recombinant protein expression in Escherichia coli. Current Opinion in Biotechnology, 10(5), 411-421. doi:10.1016/S0958-1669(99)00003-8.
Bonacci, W., Teng, P. K., Afonso, B., Niederholtmeyer, H., Grob, P., Silver, P. A., & Savage, D. F. (2012). Modularity of a carbon-fixing protein organelle. Proceedings of the National Academy of Sciences, 109(2), 478-483. doi:10.1073/pnas.1108557109.
Cai, F., Bernstein, S. L., Wilson, S. C., & Kerfeld, C. A. (2016). Production and Characterization of Synthetic Carboxysome Shells with Incorporated Luminal Proteins. Plant Physiology, 170(3), 1868-1877. doi:10.1104/pp.15.01822.
Cai, F., Dou, Z., Bernstein, S. L., Leverenz, R., Williams, E. B., Heinhorst, S., Kerfeld, C. A. (2015). Advances in Understanding Carboxysome Assembly in Prochlorococcus and Synechococcus Implicate CsoS2 as a Critical Component. Life (Basel), 5(2), 1141-1171. doi:10.3390/life5021141.
Das, S., Zhao, L., Elofson, K., & Finn, M. G. (2020). Enzyme Stabilization by Virus-Like Particles. Biochemistry, 59(31), 2870-2881. doi:10.1021/acs.biochem.0c00435.
Demchuk, A. M., & Patel, T. R. (2020). The biomedical and bioengineering potential of protein nanocompartments. Biotechnology Advances, 41, 107547. doi:10.1016/j.biotechadv.2020.107547.
Dunn, K. W., Kamocka, M. M., & McDonald, J. H. (2011). A practical guide to evaluating colocalization in biological microscopy. American Journal of Physiology-Cell Physiology, 300(4), C723-C742. doi:10.1152/ajpcell.00462.2010.
Emsley, P., & Cowtan, K. (2004). Coot: model-building tools for molecular graphics. Acta Crystallographica. Section D: Biological Crystallography, 60(Pt 12 Pt 1), 2126-2132. doi:10.1107/s0907444904019158.
Fletcher, J. M., Harniman, R. L., Barnes, F. R. H., Boyle, A. L., Collins, A., Mantell, J., Woolfson, D. N. (2013). Self-Assembling Cages from Coiled-Coil Peptide Modules. Science, 340(6132), 595-599. doi:10.1126/science.1233936.
Gietz, R. D., & Schiestl, R. H. (2007). High-efficiency yeast transformation using the LiAc/SS carrier DNA/PEG method. Nature Protocols, 2, 31. doi:10.1038/nprot.2007.13.
Golan, R., Zehavi, U., Naim, M., Patchornik, A., & Smirnoff, P. (1996). Inhibition of Escherichia coli beta-galactosidase by 2-nitro-1-(4,5-dimethoxy-2-nitrophenyl) ethyl, a photoreversible thiol label. Biochimica et Biophysica Acta (BBA) - Bioenergetics, 1293(2), 238-242. doi:10.1016/0167-4838(95)00254-5.
Goldstein, L. (1972). Microenvironmental effects on enzyme catalysis. Kinetic study of polyanionic and polycationic derivatives of chymotrypsin. Biochemistry, 11(22), 4072-4084. doi:10.1021/bi00772a009.
Goldstein, L., Levin, Y., & Katchalski, E. (1964). A Water-insoluble Polyanionic Derivative of Trypsin. II. Effect of the Polyelectrolyte Carrier on the Kinetic Behavior of the Bound Trypsin*. Biochemistry, 3(12), 1913-1919. doi:10.1021/bi00900a022.
Guo, Y., Dong, J., Zhou, T., Auxillos, J., Li, T., Zhang, W., Dai, J. (2015). YeastFab: the design and construction of standard biological parts for metabolic engineering in Saccharomyces cerevisiae. Nucleic Acids Research, 43(13), e88. doi:10.1093/nar/gkv464.
Hagen, A., Sutter, M., Sloan, N., & Kerfeld, C. A. (2018). Programmed loading and rapid purification of engineered bacterial microcompartment shells. Nature Communications, 9(1), 2881. doi:10.1038/s41467-018-05162-z.
Juers, D. H., Hakda, S., Matthews, B. W., & Huber, R. E. (2003). Structural Basis for the Altered Activity of Gly794 Variants of Escherichia coli β-Galactosidase. Biochemistry, 42(46), 13505-13511. doi:10.1021/bi035506j.
Kalnins, G., Cesle, E.-E., Jansons, J., Liepins, J., Filimonenko, A., & Tars, K. (2020). Encapsulation mechanisms and structural studies of GRM2 bacterial microcompartment particles. Nature Communications, 11(1), 388. doi:10.1038/s41467-019-14205-y.
Keeble, A. H., & Howarth, M. (2019). Insider information on successful covalent protein coupling with help from SpyBank. Methods in Enzymology, 617, 443-461. doi:10.1016/bs.mie.2018.12.010.
Kerfeld, C. A., Aussignargues, C., Zarzycki, J., Cai, F., & Sutter, M. (2018). Bacterial microcompartments. Nature Reviews: Microbiology, 16, 277. doi:10.1038/nrmicro.2018.10.
Klein, M. G., Zwart, P., Bagby, S. C., Cai, F., Chisholm, S. W., Heinhorst, S., Kerfeld, C. A. (2009). Identification and structural analysis of a novel carboxysome shell protein with implications for metabolite transport. Journal of Molecular Biology, 392(2), 319-333. doi:10.1016/j.jmb.2009.03.056.
Küchler, A., Yoshimoto, M., Luginbühl, S., Mavelli, F., & Walde, P. (2016). Enzymatic reactions in confined environments. Nature Nanotechnology, 11(5), 409-420. doi:10.1038/nnano.2016.54.
Lam, S. S., Martell, J. D., Kamer, K. J., Deerinck, T. J., Ellisman, M. H., Mootha, V. K., & Ting, A. Y. (2015). Directed evolution of APEX2 for electron microscopy and proximity labeling. Nature Methods, 12(1), 51-54. doi:10.1038/nmeth.3179.
Lassila, J. K., Bernstein, S. L., Kinney, J. N., Axen, S. D., & Kerfeld, C. A. (2014). Assembly of robust bacterial microcompartment shells using building blocks from an organelle of unknown function. Journal of Molecular Biology, 426(11), 2217-2228. doi:10.1016/j.jmb.2014.02.025.
Lawrence, A. D., Frank, S., Newnham, S., Lee, M. J., Brown, I. R., Xue, W.-F., Warren, M. J. (2014). Solution Structure of a Bacterial Microcompartment Targeting Peptide and Its Application in the Construction of an Ethanol Bioreactor. ACS Synthetic Biology, 3(7), 454-465. doi:10.1021/sb4001118.
Liebschner, D., Afonine, P. V., Baker, M. L., Bunkoczi, G., Chen, V. B., Croll, T. I., Adams, P. D. (2019). Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallographica Section D: Structural Biology, 75(10), 861-877. doi:10.1107/S2059798319011471.
Neher, S. B., Sauer, R. T., & Baker, T. A. (2003). Distinct peptide signals in the UmuD and UmuD′ subunits of UmuD/D′ mediate tethering and substrate processing by the ClpXP protease. Proceedings of the National Academy of Sciences, 100(23), 13219-13224. doi:10.1073/pnas.2235804100.
Nichols, T. M., Kennedy, N. W., & Tullman-Ercek, D. (2019). Cargo encapsulation in bacterial microcompartments: Methods and analysis. Methods in Enzymology, 617, 155-186. doi:10.1016/bs.mie.2018.12.009.
Oltrogge, L. M., Chaijarasphong, T., Chen, A. W., Bolin, E. R., Marqusee, S., & Savage, D. F. (2020). Multivalent interactions between CsoS2 and Rubisco mediate α-carboxysome formation. Nature Structural & Molecular Biology, 27(3), 281-287. doi:10.1038/s41594-020-0387-7.
Patterson, D. P., Prevelige, P. E., & Douglas, T. (2012). Nanoreactors by Programmed Enzyme Encapsulation Inside the Capsid of the Bacteriophage P22. ACS Nano, 6(6), 5000-5009. doi:10.1021/nn300545z.
Patterson, D. P., Schwarz, B., El-Boubbou, K., van der Oost, J., Prevelige, P. E., & Douglas, T. (2012). Virus-like particle nanoreactors: programmed encapsulation of the thermostable CelB glycosidase inside the P22 capsid. Soft Matter, 8(39), 10158-10166. doi:10.1039/C2SM26485D.
Pettersen, E. F., Goddard, T. D., Huang, C. C., Couch, G. S., Greenblatt, D. M., Meng, E. C., & Ferrin, T. E. (2004). UCSF Chimera - a visualization system for exploratory research and analysis. Journal of Computational Chemistry, 25(13), 1605-1612. doi:10.1002/jcc.20084.
Sánchez-Sánchez, L., Tapia-Moreno, A., Juarez-Moreno, K., Patterson, D. P., Cadena-Nava, R. D., Douglas, T., & Vazquez-Duhalt, R. (2015). Design of a VLP-nanovehicle for CYP450 enzymatic activity delivery. Journal of Nanobiotechnology, 13(1), 66. doi:10.1186/s12951-015-0127-z.
Schwarz, B., Uchida, M., & Douglas, T. (2017). Chapter One - Biomedical and Catalytic Opportunities of Virus-Like Particles in Nanotechnology. In M. Kielian, T. C. Mettenleiter, & M. J. Roossinck (Eds.), Advances in Virus Research (Vol. 97, pp. 1-60): Academic Press.
Sewalt, V., Shanahan, D., Gregg, L., La Marta, J., & Carrillo, R. (2016). The Generally Recognized as Safe (GRAS) Process for Industrial Microbial Enzymes. Industrial Biotechnology, 12(5), 295-302. doi:10.1089/ind.2016.0011.
Shifrin, S., & Hunn, G. (1969). Effect of alcohols on the enzymatic activity and subunit association of β-galactosidase. Archives of Biochemistry and Biophysics, 130, 530-535. doi:10.1016/0003-9861(69)90066-6.
Sievers, F., & Higgins, D. G. (2014). Clustal Omega, accurate alignment of very large numbers of sequences. Methods in Molecular Biology, 1079, 105-116. doi:10.1007/978-1-62703-646-7_6.
Silva, C., Martins, M., Jing, S., Fu, J., & Cavaco-Paulo, A. (2018). Practical insights on enzyme stabilization. Critical Reviews in Biotechnology, 38(3), 335-350. doi:10.1080/07388551.2017.1355294.
Sudbery, P. E. (1996). The expression of recombinant proteins in yeasts. Current Opinion in Biotechnology, 7. doi:10.1016/s0958-1669(96)80055-3.
Sutter, M., Greber, B., Aussignargues, C., & Kerfeld, C. A. (2017). Assembly principles and structure of a 6.5-MDa bacterial microcompartment shell. Science, 356(6344), 1293-1297. doi:10.1126/science.aan3289.
Sutter, M., Laughlin, T. G., Sloan, N. B., Serwas, D., Davies, K. M., & Kerfeld, C. A. (2019). Structure of a synthetic beta-carboxysome shell. Plant Physiology, 181(3), 1050-1058. doi:10.1104/pp.19.00885.
Tan, Y. Q., Ali, S., Xue, B., Teo, W. Z., Ling, L. H., Go, M. K., . . . Yew, W. S. (2021). Structure of a Minimal α-Carboxysome-Derived Shell and Its Utility in Enzyme Stabilization. Biomacromolecules. doi:10.1021/acs.biomac.1c00533.
Tan, Y. Q., Xue, B., & Yew, W. S. (2021). Genetically Encodable Scaffolds for Optimizing Enzyme Function. Molecules, 26(5), 1389. Retrieved from www.mdpi.com/1420-3049/26/5/1389.
Tanaka, S., Kerfeld, C. A., Sawaya, M. R., Cai, F., Heinhorst, S., Cannon, G. C., & Yeates, T. O. (2008). Atomic-Level Models of the Bacterial Carboxysome Shell. Science, 319(5866), 1083-1086. doi:10.1126/science.1151458.
Thomas, F., Boyle, A. L., Burton, A. J., & Woolfson, D. N. (2013). A Set of de Novo Designed Parallel Heterodimeric Coiled Coils with Quantified Dissociation Constants in the Micromolar to Sub-nanomolar Regime. Journal of the American Chemical Society, 135(13), 5161-5166. doi:10.1021/ja312310g.
Tsai, Y., Sawaya, M. R., Cannon, G. C., Cai, F., Williams, E. B., Heinhorst, S., . . . Yeates, T. O. (2007). Structural Analysis of CsoS1A and the Protein Shell of the Halothiobacillus neapolitanus Carboxysome. PLoS Biology, 5(6), e144. doi:10.1371/journal.pbio.0050144.
Turmo, A., Gonzalez-Esquer, C. R., & Kerfeld, C. A. (2017). Carboxysomes: metabolic modules for CO2 fixation. FEMS Microbiology Letters, 364(18), fnx176. doi:10.1093/femsle/fnx176.
Waterhouse, A. M., Procter, J. B., Martin, D. M. A., Clamp, M., & Barton, G. J. (2009). Jalview Version 2―a multiple sequence alignment editor and analysis workbench. Bioinformatics, 25(9), 1189-1191. doi:10.1093/bioinformatics/btp033.
Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., . . . Wilson, K. S. (2011). Overview of the CCP4 suite and current developments. Acta Crystallographica. Section D: Biological Crystallography, 67(Pt 4), 235-242. doi:10.1107/s0907444910045749.
Yu, Z., Reid, J. C., & Yang, Y.-P. (2013). Utilizing Dynamic Light Scattering as a Process Analytical Technology for Protein Folding and Aggregation Monitoring in Vaccine Manufacturing. Journal of Pharmaceutical Sciences, 102(12), 4284-4290. doi:10.1002/jps.23746.
Zhang, K. (2016). Gctf: Real-time CTF determination and correction. Journal of Structural Biology, 193(1), 1-12. doi:10.1016/j.jsb.2015.11.003
Zhang, Y., Tsitkov, S., & Hess, H. (2016). Proximity does not contribute to activity enhancement in the glucose oxidase-horseradish peroxidase cascade. Nature Communications, 7(1), 13982. doi:10.1038/ncomms13982
Zivanov, J., Nakane, T., Forsberg, B. O., Kimanius, D., Hagen, W. J. H., Lindahl, E., & Scheres, S. H. W. (2018). New tools for automated high-resolution cryo-EM structure determination in RELION-3. eLife, 7, e42166. doi:10.7554/eLife.42166.
SEQUENCE LISTING
<110> National University of Singapore
<120> BACTERIAL MICROCOMPARTMENT VIRUS-LIKE PARTICLES
<130> SP102877WO
<150> SG10202010547W
<151> 2020-10-23
<160> 121
<170> PatentIn version 3.5
<210> 1
<211> 24
<212> PRT
<213> Artificial Sequence
<220>
<223> S2CP amino acid sequence
<400> 1
Ser Lys Ile Thr Gly Ser Ser Gly Asn Asp Thr Gln Gly Ser Leu Ile
1 5 10 15
Thr Tyr Ser Gly Gly Ala Arg Gly
20
<210> 2
<211> 98
<212> PRT
<213> Artificial Sequence
<220>
<223> CsoS1A amino acid sequence
<400> 2
Met Ala Asp Val Thr Gly Ile Ala Leu Gly Met Ile Glu Thr Arg Gly
1 5 10 15
Leu Val Pro Ala Ile Glu Ala Ala Asp Ala Met Thr Lys Ala Ala Glu
20 25 30
Val Arg Leu Val Gly Arg Gln Phe Val Gly Gly Gly Tyr Val Thr Val
35 40 45
Leu Val Arg Gly Glu Thr Gly Ala Val Asn Ala Ala Val Arg Ala Gly
50 55 60
Ala Asp Ala Cys Glu Arg Val Gly Asp Gly Leu Val Ala Ala His Ile
65 70 75 80
Ile Ala Arg Val His Ser Glu Val Glu Asn Ile Leu Pro Lys Ala Pro
85 90 95
Gln Ala
<210> 3
<211> 83
<212> PRT
<213> Artificial Sequence
<220>
<223> CsoS4A amino acid sequence
<400> 3
Met Lys Ile Met Gln Val Glu Lys Thr Leu Val Ser Thr Asn Arg Ile
1 5 10 15
Ala Asp Met Gly His Lys Pro Leu Leu Val Val Trp Glu Lys Pro Gly
20 25 30
Ala Pro Arg Gln Val Ala Val Asp Ala Ile Gly Cys Ile Pro Gly Asp
35 40 45
Trp Val Leu Cys Val Gly Ser Ser Ala Ala Arg Glu Ala Ala Gly Ser
50 55 60
Lys Ser Tyr Pro Ser Asp Leu Thr Ile Ile Gly Ile Ile Asp Gln Trp
65 70 75 80
Asn Gly Glu
<210> 4
<211> 99
<212> PRT
<213> Artificial Sequence
<220>
<223> HO-H amino acid sequence
<400> 4
Met Ala Asp Ala Leu Gly Met Ile Glu Val Arg Gly Phe Val Gly Met
1 5 10 15
Val Glu Ala Ala Asp Ala Met Val Lys Ala Ala Lys Val Glu Leu Ile
20 25 30
Gly Tyr Glu Lys Thr Gly Gly Gly Tyr Val Thr Ala Val Val Arg Gly
35 40 45
Asp Val Ala Ala Val Lys Ala Ala Thr Glu Ala Gly Gln Arg Ala Ala
50 55 60
Glu Arg Val Gly Glu Val Val Ala Val His Val Ile Pro Arg Pro His
65 70 75 80
Val Asn Val Asp Ala Ala Leu Pro Leu Gly Arg Thr Pro Gly Met Asp
85 90 95
Lys Ser Ala
<210> 5
<211> 96
<212> PRT
<213> Artificial Sequence
<220>
<223> HO-P amino acid sequence
<400> 5
Met Val Leu Gly Lys Val Val Gly Thr Val Val Ala Ser Arg Lys Glu
1 5 10 15
Pro Arg Ile Glu Gly Leu Ser Leu Leu Leu Val Arg Ala Cys Asp Pro
20 25 30
Asp Gly Thr Pro Thr Gly Gly Ala Val Val Cys Ala Asp Ala Val Gly
35 40 45
Ala Gly Val Gly Glu Val Val Leu Tyr Ala Ser Gly Ser Ser Ala Arg
50 55 60
Gln Thr Glu Val Thr Asn Asn Arg Pro Val Asp Ala Thr Ile Met Ala
65 70 75 80
Ile Val Asp Leu Val Glu Met Gly Gly Asp Val Arg Phe Arg Lys Asp
85 90 95
<210> 6
<211> 205
<212> PRT
<213> Artificial Sequence
<220>
<223> HO-T1 amino acid sequence
<400> 6
Met Asp His Ala Pro Glu Arg Phe Asp Ala Thr Pro Pro Ala Gly Glu
1 5 10 15
Pro Asp Arg Pro Ala Leu Gly Val Leu Glu Leu Thr Ser Ile Ala Arg
20 25 30
Gly Ile Thr Val Ala Asp Ala Ala Leu Lys Arg Ala Pro Ser Leu Leu
35 40 45
Leu Met Ser Arg Pro Val Ser Ser Gly Lys His Leu Leu Met Met Arg
50 55 60
Gly Gln Val Ala Glu Val Glu Glu Ser Met Ile Ala Ala Arg Glu Ile
65 70 75 80
Ala Gly Ala Gly Ser Gly Ala Leu Leu Asp Glu Leu Glu Leu Pro Tyr
85 90 95
Ala His Glu Gln Leu Trp Arg Phe Leu Asp Ala Pro Val Val Ala Asp
100 105 110
Ala Trp Glu Glu Asp Thr Glu Ser Val Ile Ile Val Glu Thr Ala Thr
115 120 125
Val Cys Ala Ala Ile Asp Ser Ala Asp Ala Ala Leu Lys Thr Ala Pro
130 135 140
Val Val Leu Arg Asp Met Arg Leu Ala Ile Gly Ile Ala Gly Lys Ala
145 150 155 160
Phe Phe Thr Leu Thr Gly Glu Leu Ala Asp Val Glu Ala Ala Ala Glu
165 170 175
Val Val Arg Glu Arg Cys Gly Ala Arg Leu Leu Glu Leu Ala Cys Ile
180 185 190
Ala Arg Pro Val Asp Glu Leu Arg Gly Arg Leu Phe Phe
195 200 205
<210> 7
<211> 72
<212> DNA
<213> Artificial Sequence
<220>
<223> S2CP nucleotide sequence
<400> 7
tctaagatta ctggttcttc tggtaacgat acccaaggtt ctttgattac ttactctggt 60
ggtgctagag gt 72
<210> 8
<211> 294
<212> DNA
<213> Artificial Sequence
<220>
<223> CsoS1A nucleotide sequence
<400> 8
atggctgatg ttactggtat tgctttgggt atgattgaaa ctagaggttt ggttccagct 60
atcgaagctg ctgacgctat gaccaaggcc gctgaagtca gattggtcgg tagacaattt 120
gttggaggtg gttacgtcac tgttttggtt cgtggtgaaa ccggtgccgt taacgctgct 180
gttagagctg gtgctgatgc ttgtgaaaga gttggtgacg gtttagttgc tgcccacatt 240
attgccagag tccactctga agttgaaaac attttgccaa aggctccaca ggct 294
<210> 9
<211> 249
<212> DNA
<213> Artificial Sequence
<220>
<223> CsoS4A nucleotide sequence
<400> 9
atgaagatca tgcaagttga aaagactttg gtttctacca acagaattgc tgatatgggt 60
cacaagccat tgttggttgt ttgggaaaaa cctggtgctc caagacaagt tgctgttgat 120
gctattggtt gtattccagg tgactgggtt ttgtgtgttg gttcttctgc tgccagagaa 180
gctgctggtt ccaagtctta cccatctgat ttgactatca tcggtattat tgaccaatgg 240
aacggtgaa 249
<210> 10
<211> 297
<212> DNA
<213> Artificial Sequence
<220>
<223> HO-H nucleotide sequence
<400> 10
atggctgatg ctttgggtat gattgaagtt agaggtttcg ttggtatggt tgaagctgct 60
gatgctatgg ttaaggctgc taaagttgaa ttgatcggtt acgaaaaaac tggtggtggt 120
tatgttactg ctgttgttag aggtgatgtt gctgctgtaa aagctgctac tgaagctggt 180
caaagggctg ctgaaagagt tggagaagtt gttgctgttc atgttattcc aagaccacat 240
gttaatgttg atgctgcttt gccattgggt agaactccag gtatggataa gtctgct 297
<210> 11
<211> 288
<212> DNA
<213> Artificial Sequence
<220>
<223> HO-P nucleotide sequence
<400> 11
atggttttag gtaaagttgt cggtactgtt gttgcatcaa gaaaggaacc aagaattgaa 60
ggtttatctt tattattggt tagagcttgt gatccagatg gtactccaac tggtggtgct 120
gttgtttgtg ctgatgctgt tggtgctggt gttggtgaag ttgttttata tgcttctggt 180
tcttctgcta gacaaactga agttactaat aatagaccag ttgatgctac tattatggct 240
attgttgatt tggttgaaat gggtggtgat gttagattta gaaaagat 288
<210> 12
<211> 615
<212> DNA
<213> Artificial Sequence
<220>
<223> HO-T1 nucleotide sequence
<400> 12
atggatcatg ctccagaaag atttgatgct actcctccag ctggtgaacc agatagacca 60
gctttgggtg ttttggaatt gacttctatt gctagaggta ttaccgttgc tgatgctgct 120
ttgaaaagag caccatcttt gttgttgatg tccagaccag tttcttccgg taaacatttg 180
ttgatgatga gaggtcaagt tgccgaagtt gaagaatcta tgattgctgc tagagaaatt 240
gctggtgctg gttctggtgc tttgttggat gaattggaat tgccatatgc tcacgaacaa 300
ctttggagat ttttggatgc tccagttgtt gcagatgctt gggaagaaga tactgaatcc 360
gttattatcg ttgaaaccgc tactgtttgt gctgctattg attctgctga tgcagcctta 420
aaaactgctc ctgttgtttt gagagatatg agattggcta ttggtattgc tggtaaggct 480
ttctttactt tgactggtga attggctgat gttgaagctg ctgctgaagt tgttagagaa 540
agatgtggtg ctagattgct agaattggct tgtattgcaa gaccagttga cgaattgaga 600
ggtaggttgt ttttc 615
<210> 13
<211> 83
<212> PRT
<213> Artificial Sequence
<220>
<223> Spycatcher tag amino acid sequence
<400> 13
Asp Ser Ala Thr His Ile Lys Phe Ser Lys Arg Asp Glu Asp Gly Lys
1 5 10 15
Glu Leu Ala Gly Ala Thr Met Glu Leu Arg Asp Ser Ser Gly Lys Thr
20 25 30
Ile Ser Thr Trp Ile Ser Asp Gly Gln Val Lys Asp Phe Tyr Leu Tyr
35 40 45
Pro Gly Lys Tyr Thr Phe Val Glu Thr Ala Ala Pro Asp Gly Tyr Glu
50 55 60
Val Ala Thr Ala Ile Thr Phe Thr Val Asn Glu Gln Gly Gln Val Thr
65 70 75 80
Val Asn Gly
<210> 14
<211> 249
<212> DNA
<213> Artificial Sequence
<220>
<223> Spycatcher tag nucleotide sequence
<400> 14
gattctgcta ctcatattaa gttctccaag agggacgaag atggtaaaga attggctggt 60
gcaactatgg aattgagaga ttcttctggt aagaccattt ccacctggat ttctgatggt 120
caagttaagg atttctactt gtacccaggt aagtacactt tcgttgaaac tgctgctcca 180
gatggttatg aagttgctac tgctattact ttcaccgtca atgaacaagg tcaagtcact 240
gttaatggt 249
<210> 15
<211> 13
<212> PRT
<213> Artificial Sequence
<220>
<223> Spytag amino acid sequence
<400> 15
Ala His Ile Val Met Val Asp Ala Tyr Lys Pro Thr Lys
1 5 10
<210> 16
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Spytag nucleotide sequence
<400> 16
gctcatatag ttatggttga tgcttacaag ccaacaaaa 39
<210> 17
<211> 21
<212> PRT
<213> Artificial Sequence
<220>
<223> CC-Di-A amino acid sequence
<400> 17
Glu Ile Ala Ala Leu Glu Lys Glu Asn Ala Ala Leu Glu Gln Glu Ile
1 5 10 15
Ala Ala Leu Glu Gln
20
<210> 18
<211> 63
<212> DNA
<213> Artificial Sequence
<220>
<223> CC-Di-A nucleotide sequence
<400> 18
gaaattgcag ctttggaaaa agaaaacgct gccttggaac aagaaattgc cgcattagaa 60
caa 63
<210> 19
<211> 21
<212> PRT
<213> Artificial Sequence
<220>
<223> CC-Di-B amino acid sequence
<400> 19
Lys Ile Ala Ala Leu Lys Lys Lys Asn Ala Ala Leu Lys Gln Lys Ile
1 5 10 15
Ala Ala Leu Lys Gln
20
<210> 20
<211> 63
<212> DNA
<213> Artificial Sequence
<220>
<223> CC-Di-B nucleotide sequence
<400> 20
aaaattgcag cattgaaaaa gaagaacgcc gccttgaaac aaaaaattgc tgccttaaaa 60
caa 63
<210> 21
<211> 8
<212> PRT
<213> Artificial Sequence
<220>
<223> Strep-Tag II (SII) amino acid sequence
<400> 21
Trp Ser His Pro Gln Phe Glu Lys
1 5
<210> 22
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Strep-Tag II (SII) nucleotide sequence
<400> 22
tggtcacatc cacaatttga aaag 24
<210> 23
<211> 1555
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PT7 nucleotide sequence
<400> 23
tcactgcccg ctttccagtc gggaaacctg tcgtgccagc tgcattaatg aatcggccaa 60
cgcgcgggga gaggcggttt gcgtattggg cgccagggtg gtttttcttt tcaccagtga 120
aacgggcaac agctgattgc ccttcaccgc ctggccctga gagagttgca gcaagcggtc 180
cacgctggtt tgccccagca ggcgaaaatc ctgtttgatg gtggttaacg gcgggatata 240
acatgagctg tcttcggtat cgtcgtatcc cactaccgag atatccgcac caacgcgcag 300
cccggactcg gtaatggcgc gcattgcgcc cagcgccatc tgatcgttgg caaccagcat 360
cgcagtggga acgatgccct cattcagcat ttgcatggtt tgttgaaaac cggacatggc 420
actccagtcg ccttcccgtt ccgctatcgg ctgaatttga ttgcgagtga gatatttatg 480
ccagccagcc agacgcagac gcgccgagac agaacttaat gggcccgcta acagcgcgat 540
ttgctggtga cccaatgcga ccagatgctc cacgcccagt cgcgtaccgt cttcatggga 600
gaaaataata ctgttgatgg gtgtctggtc agagacatca agaaataacg ccggaacatt 660
agtgcaggca gcttccacag caatggcatc ctggtcatcc agcggatagt taatgatcag 720
cccactgacg cgttgcgcga gaagattgtg caccgccgct ttacaggctt cgacgccgct 780
tcgttctacc atcgacacca ccacgctggc acccagttga tcggcgcgag atttaatcgc 840
cgcgacaatt tgcgacggcg cgtgcagggc cagactggag gtggcaacgc caatcagcaa 900
cgactgtttg cccgccagtt gttgtgccac gcggttggga atgtaattca gctccgccat 960
cgccgcttcc actttttccc gcgttttcgc agaaacgtgg ctggcctggt tcaccacgcg 1020
ggaaacggtc tgataagaga caccggcata ctctgcgaca tcgtataacg ttactggttt 1080
cacattcacc accctgaatt gactctcttc cgggcgctat catgccatac cgcgaaaggt 1140
tttgcgccat tcgatggtgt ccgggatctc gacgctctcc cttatgcgac tcctgcatta 1200
ggaagcagcc cagtagtagg ttgaggccgt tgagcaccgc cgccgcaagg aatggtgcat 1260
gcaaggagat ggcgcccaac agtcccccgg ccacggggcc tgccaccata cccacgccga 1320
aacaagcgct catgagcccg aagtggcgag cccgatcttc cccatcggtg atgtcggcga 1380
tataggcgcc agcaaccgca cctgtggcgc cggtgatgcc ggccacgatg cgtccggcgt 1440
agaggatcga gatctcgatc ccgcgaaatt aatacgactc actatagggg aattgtgagc 1500
ggataacaat tcccctctag aaataatttt gtttaacttt aagaaggaga tatac 1555
<210> 24
<211> 118
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PCON5 nucleotide sequence
<400> 24
ggcttcccaa ccttaccaga gggcgcccca gctggcaatt ccgacgtctt tatggctagc 60
tcagtcctag gtacaatgct agcgaattca aaagatcttt taagaaggag atatacat 118
<210> 25
<211> 500
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PTDH3 nucleotide sequence
<400> 25
acagtttatt cctggcatcc actaaatata atggagcccg ctttttaagc tggcatccag 60
aaaaaaaaag aatcccagca ccaaaatatt gttttcttca ccaaccatca gttcataggt 120
ccattctctt agcgcaacta cagagaacag gggcacaaac aggcaaaaaa cgggcacaac 180
ctcaatggag tgatgcaacc tgcctggagt aaatgatgac acaaggcaat tgacccacgc 240
atgtatctat ctcattttct tacaccttct attaccttct gctctctctg atttggaaaa 300
agctgaaaaa aaaggttgaa accagttccc tgaaattatt cccctacttg actaataagt 360
atataaagac ggtaggtatt gattgtaatt ctgtaaatct atttcttaaa cttcttaaat 420
tctactttta tagttagtct tttttttagt tttaaaacac caagaactta gtttcgaata 480
aacacacata aacaaacaaa 500
<210> 26
<211> 500
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PPYK1 nucleotide sequence
<400> 26
acagattggg agattttcat agtagaattc agcatgatag ctacgtaaat gtgttccgca 60
ccgtcacaaa gtgttttcta ctgttctttc ttctttcgtt cattcagttg agttgagtga 120
gtgctttgtt caatggatct tagctaaaat gcatattttt tctcttggta aatgaatgct 180
tgtgatgtct tccaagtgat ttcctttcct tcccatatga tgctaggtac ctttagtgtc 240
ttcctaaaaa aaaaaaaagg ctcgccatca aaacgatatt cgttggcttt tttttctgaa 300
ttataaatac tctttggtaa cttttcattt ccaagaacct cttttttcca gttatatcat 360
ggtccccttt caaagttatt ctctactctt tttcatattc attctttttc atcctttggt 420
tttttattct taacttgttt attattctct cttgtttcta tttacaagac accaatcaaa 480
acaaataaaa catcatcaca 500
<210> 27
<211> 500
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PYEF3 nucleotide sequence
<400> 27
attaaaaaaa caacttacaa tcattgttcg ccccttccat acttactgcc actcgcaaaa 60
gggcccaacc agggcaatta cgtatcaaaa aatcatgaca ggctgggtaa taaatattcg 120
tgaagaaaga agaaattaaa aaaagaaacg aagaagcaaa aaaaagaaaa gactccgttt 180
aatcactttc aaccgcggtt tatccggccc cacccatgca taaccctaaa ttattagatc 240
acttagcacg tgaaaaagaa acgtttttaa tgtttttttt ttttttttct ttttcttttt 300
ttgcgttggt gaaaattttt tcgcttcctc gagtataatt atctcatctc atctttcata 360
taagataaga agttttataa aaaccttttg catcaaaatt ttgtagaata tctctttttc 420
ttacgctctc tttctttcct taattgtttt ctaaagaacc gtgtattttt ctagttcgaa 480
tccatcgata acattaaaag 500
<210> 28
<211> 3589
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CmCherry plasmid
<400> 28
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttcttctat ggtgagcaag ggcgaggagg ataacatggc catcatcaag gagttcatgc 1260
gcttcaaggt gcacatggag ggctccgtga acggccacga gttcgagatc gagggcgagg 1320
gcgagggccg cccctacgag ggcacccaga ccgccaagct gaaggtgacc aagggtggcc 1380
ccctgccctt cgcctgggac atcctgtccc ctcagttcat gtacggctcc aaggcctacg 1440
tgaagcaccc cgccgacatc cccgactact tgaagctgtc cttccccgag ggcttcaagt 1500
gggagcgcgt gatgaacttc gaggacggcg gcgtggtgac cgtgacccag gactcctccc 1560
tgcaggacgg cgagttcatc tacaaggtga agctgcgcgg caccaacttc ccctccgacg 1620
gccccgtaat gcagaagaag accatgggct gggaggcctc ctccgagcgg atgtaccccg 1680
aggacggcgc cctgaagggc gagatcaagc agaggctgaa gctgaaggac ggcggccact 1740
acgacgctga ggtcaagacc acctacaagg ccaagaagcc cgtgcagctg cccggcgcct 1800
acaacgtcaa catcaagttg gacatcacct cccacaacga ggactacacc atcgtggaac 1860
agtacgaacg cgccgagggc cgccactcca ccggcggcat ggacgagctg tacaagtagc 1920
cgagacgact gaccatttaa atcatacctg acctccatag cagaaagtca aaagcctccg 1980
accggaggct tttgacttga tcggcacgta agaggttcca actttcacca taatgaaata 2040
agatcactac cgggcgtatt ttttgagtta tcgagatttt caggagctaa ggaagctaaa 2100
atgagccata ttcaacggga aacgtcttgc tcgaggccgc gattaaattc caacatggat 2160
gctgatttat atgggtataa atgggctcgc gataatgtcg ggcaatcagg tgcgacaatc 2220
tatcgattgt atgggaagcc cgatgcgcca gagttgtttc tgaaacatgg caaaggtagc 2280
gttgccaatg atgttacaga tgagatggtc aggctaaact ggctgacgga atttatgcct 2340
cttccgacca tcaagcattt tatccgtact cctgatgatg catggttact caccactgcg 2400
atcccaggga aaacagcatt ccaggtatta gaagaatatc ctgattcagg tgaaaatatt 2460
gttgatgcgc tggcagtgtt cctgcgccgg ttgcattcga ttcctgtttg taattgtcct 2520
tttaacggcg atcgcgtatt tcgtctcgct caggcgcaat cacgaatgaa taacggtttg 2580
gttggtgcga gtgattttga tgacgagcgt aatggctggc ctgttgaaca agtctggaaa 2640
gaaatgcata agcttttgcc attctcaccg gattcagtcg tcactcatgg tgatttctca 2700
cttgataacc ttatttttga cgaggggaaa ttaataggtt gtattgatgt tggacgagtc 2760
ggaatcgcag accgatacca ggatcttgcc atcctatgga actgcctcgg tgagttttct 2820
ccttcattac agaaacggct ttttcaaaaa tatggtattg ataatcctga tatgaataaa 2880
ttgcagtttc acttgatgct cgatgagttt ttctaatgag ggcccaaatg taatcacctg 2940
gctcaccttc gggtgggcct ttctgcgttg ctggcgtttt tccataggct ccgcccccct 3000
gacgagcatc acaaaaatcg atgctcaagt cagaggtggc gaaacccgac aggactataa 3060
agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 3120
cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca 3180
cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 3240
ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 3300
gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 3360
tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga 3420
acagtatttg gtatctgcgc tctgctgaag ccagttacct cggaaaaaga gttggtagct 3480
cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga 3540
ttacgcgcag aaaaaaagga tctcaagaag atcctttgat tttctaccg 3589
<210> 29
<211> 708
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CmCherry nucleotide sequence of key ORF
<400> 29
atggtgagca agggcgagga ggataacatg gccatcatca aggagttcat gcgcttcaag 60
gtgcacatgg agggctccgt gaacggccac gagttcgaga tcgagggcga gggcgagggc 120
cgcccctacg agggcaccca gaccgccaag ctgaaggtga ccaagggtgg ccccctgccc 180
ttcgcctggg acatcctgtc ccctcagttc atgtacggct ccaaggccta cgtgaagcac 240
cccgccgaca tccccgacta cttgaagctg tccttccccg agggcttcaa gtgggagcgc 300
gtgatgaact tcgaggacgg cggcgtggtg accgtgaccc aggactcctc cctgcaggac 360
ggcgagttca tctacaaggt gaagctgcgc ggcaccaact tcccctccga cggccccgta 420
atgcagaaga agaccatggg ctgggaggcc tcctccgagc ggatgtaccc cgaggacggc 480
gccctgaagg gcgagatcaa gcagaggctg aagctgaagg acggcggcca ctacgacgct 540
gaggtcaaga ccacctacaa ggccaagaag cccgtgcagc tgcccggcgc ctacaacgtc 600
aacatcaagt tggacatcac ctcccacaac gaggactaca ccatcgtgga acagtacgaa 660
cgcgccgagg gccgccactc caccggcggc atggacgagc tgtacaag 708
<210> 30
<211> 236
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-CmCherry amino acid sequence of key ORF
<400> 30
Met Val Ser Lys Gly Glu Glu Asp Asn Met Ala Ile Ile Lys Glu Phe
1 5 10 15
Met Arg Phe Lys Val His Met Glu Gly Ser Val Asn Gly His Glu Phe
20 25 30
Glu Ile Glu Gly Glu Gly Glu Gly Arg Pro Tyr Glu Gly Thr Gln Thr
35 40 45
Ala Lys Leu Lys Val Thr Lys Gly Gly Pro Leu Pro Phe Ala Trp Asp
50 55 60
Ile Leu Ser Pro Gln Phe Met Tyr Gly Ser Lys Ala Tyr Val Lys His
65 70 75 80
Pro Ala Asp Ile Pro Asp Tyr Leu Lys Leu Ser Phe Pro Glu Gly Phe
85 90 95
Lys Trp Glu Arg Val Met Asn Phe Glu Asp Gly Gly Val Val Thr Val
100 105 110
Thr Gln Asp Ser Ser Leu Gln Asp Gly Glu Phe Ile Tyr Lys Val Lys
115 120 125
Leu Arg Gly Thr Asn Phe Pro Ser Asp Gly Pro Val Met Gln Lys Lys
130 135 140
Thr Met Gly Trp Glu Ala Ser Ser Glu Arg Met Tyr Pro Glu Asp Gly
145 150 155 160
Ala Leu Lys Gly Glu Ile Lys Gln Arg Leu Lys Leu Lys Asp Gly Gly
165 170 175
His Tyr Asp Ala Glu Val Lys Thr Thr Tyr Lys Ala Lys Lys Pro Val
180 185 190
Gln Leu Pro Gly Ala Tyr Asn Val Asn Ile Lys Leu Asp Ile Thr Ser
195 200 205
His Asn Glu Asp Tyr Thr Ile Val Glu Gln Tyr Glu Arg Ala Glu Gly
210 215 220
Arg His Ser Thr Gly Gly Met Asp Glu Leu Tyr Lys
225 230 235
<210> 31
<211> 2905
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CSII plasmid
<400> 31
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttcttcttg gtcacatcca caatttgaaa agtagccgag acgactgacc atttaaatca 1260
tacctgacct ccatagcaga aagtcaaaag cctccgaccg gaggcttttg acttgatcgg 1320
cacgtaagag gttccaactt tcaccataat gaaataagat cactaccggg cgtatttttt 1380
gagttatcga gattttcagg agctaaggaa gctaaaatga gccatattca acgggaaacg 1440
tcttgctcga ggccgcgatt aaattccaac atggatgctg atttatatgg gtataaatgg 1500
gctcgcgata atgtcgggca atcaggtgcg acaatctatc gattgtatgg gaagcccgat 1560
gcgccagagt tgtttctgaa acatggcaaa ggtagcgttg ccaatgatgt tacagatgag 1620
atggtcaggc taaactggct gacggaattt atgcctcttc cgaccatcaa gcattttatc 1680
cgtactcctg atgatgcatg gttactcacc actgcgatcc cagggaaaac agcattccag 1740
gtattagaag aatatcctga ttcaggtgaa aatattgttg atgcgctggc agtgttcctg 1800
cgccggttgc attcgattcc tgtttgtaat tgtcctttta acggcgatcg cgtatttcgt 1860
ctcgctcagg cgcaatcacg aatgaataac ggtttggttg gtgcgagtga ttttgatgac 1920
gagcgtaatg gctggcctgt tgaacaagtc tggaaagaaa tgcataagct tttgccattc 1980
tcaccggatt cagtcgtcac tcatggtgat ttctcacttg ataaccttat ttttgacgag 2040
gggaaattaa taggttgtat tgatgttgga cgagtcggaa tcgcagaccg ataccaggat 2100
cttgccatcc tatggaactg cctcggtgag ttttctcctt cattacagaa acggcttttt 2160
caaaaatatg gtattgataa tcctgatatg aataaattgc agtttcactt gatgctcgat 2220
gagtttttct aatgagggcc caaatgtaat cacctggctc accttcgggt gggcctttct 2280
gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgatgc 2340
tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga 2400
agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt 2460
ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg 2520
taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc 2580
gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg 2640
gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc 2700
ttgaagtggt ggcctaacta cggctacact agaagaacag tatttggtat ctgcgctctg 2760
ctgaagccag ttacctcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg 2820
ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc 2880
aagaagatcc tttgattttc taccg 2905
<210> 32
<211> 2899
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CHis6 plasmid
<400> 32
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttcttctca tcatcaccat caccattagc cgagacgact gaccatttaa atcatacctg 1260
acctccatag cagaaagtca aaagcctccg accggaggct tttgacttga tcggcacgta 1320
agaggttcca actttcacca taatgaaata agatcactac cgggcgtatt ttttgagtta 1380
tcgagatttt caggagctaa ggaagctaaa atgagccata ttcaacggga aacgtcttgc 1440
tcgaggccgc gattaaattc caacatggat gctgatttat atgggtataa atgggctcgc 1500
gataatgtcg ggcaatcagg tgcgacaatc tatcgattgt atgggaagcc cgatgcgcca 1560
gagttgtttc tgaaacatgg caaaggtagc gttgccaatg atgttacaga tgagatggtc 1620
aggctaaact ggctgacgga atttatgcct cttccgacca tcaagcattt tatccgtact 1680
cctgatgatg catggttact caccactgcg atcccaggga aaacagcatt ccaggtatta 1740
gaagaatatc ctgattcagg tgaaaatatt gttgatgcgc tggcagtgtt cctgcgccgg 1800
ttgcattcga ttcctgtttg taattgtcct tttaacggcg atcgcgtatt tcgtctcgct 1860
caggcgcaat cacgaatgaa taacggtttg gttggtgcga gtgattttga tgacgagcgt 1920
aatggctggc ctgttgaaca agtctggaaa gaaatgcata agcttttgcc attctcaccg 1980
gattcagtcg tcactcatgg tgatttctca cttgataacc ttatttttga cgaggggaaa 2040
ttaataggtt gtattgatgt tggacgagtc ggaatcgcag accgatacca ggatcttgcc 2100
atcctatgga actgcctcgg tgagttttct ccttcattac agaaacggct ttttcaaaaa 2160
tatggtattg ataatcctga tatgaataaa ttgcagtttc acttgatgct cgatgagttt 2220
ttctaatgag ggcccaaatg taatcacctg gctcaccttc gggtgggcct ttctgcgttg 2280
ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg atgctcaagt 2340
cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 2400
ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct 2460
tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc 2520
gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 2580
tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca 2640
gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 2700
tggtggccta actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag 2760
ccagttacct cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 2820
gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 2880
atcctttgat tttctaccg 2899
<210> 33
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CHis6 nucleotide sequence of key ORF
<400> 33
catcatcacc atcaccat 18
<210> 34
<211> 6
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-CHis6 amino acid sequence of key ORF
<400> 34
His His His His His His
1 5
<210> 35
<211> 2983
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CCCDiA plasmid
<400> 35
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttctggtgg tggttcaggt ggttctgaaa ttgcagcttt ggaaaaagaa aacgctgcct 1260
tggaacaaga aattgccgca ttagaacaag gtggtagtgg tggatctggt tagccgagac 1320
gactgaccat ttaaatcata cctgacctcc atagcagaaa gtcaaaagcc tccgaccgga 1380
ggcttttgac ttgatcggca cgtaagaggt tccaactttc accataatga aataagatca 1440
ctaccgggcg tattttttga gttatcgaga ttttcaggag ctaaggaagc taaaatgagc 1500
catattcaac gggaaacgtc ttgctcgagg ccgcgattaa attccaacat ggatgctgat 1560
ttatatgggt ataaatgggc tcgcgataat gtcgggcaat caggtgcgac aatctatcga 1620
ttgtatggga agcccgatgc gccagagttg tttctgaaac atggcaaagg tagcgttgcc 1680
aatgatgtta cagatgagat ggtcaggcta aactggctga cggaatttat gcctcttccg 1740
accatcaagc attttatccg tactcctgat gatgcatggt tactcaccac tgcgatccca 1800
gggaaaacag cattccaggt attagaagaa tatcctgatt caggtgaaaa tattgttgat 1860
gcgctggcag tgttcctgcg ccggttgcat tcgattcctg tttgtaattg tccttttaac 1920
ggcgatcgcg tatttcgtct cgcacaggcg caatcacgaa tgaataacgg tttggttggt 1980
gcgagtgatt ttgatgacga gcgtaatggc tggcctgttg aacaagtctg gaaagaaatg 2040
cataagcttt tgccattctc accggattca gtcgtcactc atggtgattt ctcacttgat 2100
aaccttattt ttgacgaggg gaaattaata ggttgtattg atgttggacg agtcggaatc 2160
gcagaccgat accaggatct tgccatccta tggaactgcc tcggtgagtt ttctccttca 2220
ttacagaaac ggctttttca aaaatatggt attgataatc ctgatatgaa taaattgcag 2280
tttcacttga tgctcgatga gtttttctaa tgagggccca aatgtaatca cctggctcac 2340
cttcgggtgg gcctttctgc gttgctggcg tttttccata ggctccgccc ccctgacgag 2400
catcacaaaa atcgatgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 2460
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 2520
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 2580
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 2640
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 2700
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 2760
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta 2820
tttggtatct gcgctctgct gaagccagtt acctcggaaa aagagttggt agctcttgat 2880
ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 2940
gcagaaaaaa aggatctcaa gaagatcctt tgattttcta ccg 2983
<210> 36
<211> 2983
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CCCDiB plasmid
<400> 36
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttctggtgg tggtagtggt ggttctaaaa ttgcagcatt gaaaaagaag aacgccgcct 1260
tgaaacaaaa aattgctgcc ttaaaacaag gtggtagtgg tggatctggt tagccgagac 1320
gactgaccat ttaaatcata cctgacctcc atagcagaaa gtcaaaagcc tccgaccgga 1380
ggcttttgac ttgatcggca cgtaagaggt tccaactttc accataatga aataagatca 1440
ctaccgggcg tattttttga gttatcgaga ttttcaggag ctaaggaagc taaaatgagc 1500
catattcaac gggaaacgtc ttgctcgagg ccgcgattaa attccaacat ggatgctgat 1560
ttatatgggt ataaatgggc tcgcgataat gtcgggcaat caggtgcgac aatctatcga 1620
ttgtatggga agcccgatgc gccagagttg tttctgaaac atggcaaagg tagcgttgcc 1680
aatgatgtta cagatgagat ggtcaggcta aactggctga cggaatttat gcctcttccg 1740
accatcaagc attttatccg tactcctgat gatgcatggt tactcaccac tgcgatccca 1800
gggaaaacag cattccaggt attagaagaa tatcctgatt caggtgaaaa tattgttgat 1860
gcgctggcag tgttcctgcg ccggttgcat tcgattcctg tttgtaattg tccttttaac 1920
ggcgatcgcg tatttcgtct cgcacaggcg caatcacgaa tgaataacgg tttggttggt 1980
gcgagtgatt ttgatgacga gcgtaatggc tggcctgttg aacaagtctg gaaagaaatg 2040
cataagcttt tgccattctc accggattca gtcgtcactc atggtgattt ctcacttgat 2100
aaccttattt ttgacgaggg gaaattaata ggttgtattg atgttggacg agtcggaatc 2160
gcagaccgat accaggatct tgccatccta tggaactgcc tcggtgagtt ttctccttca 2220
ttacagaaac ggctttttca aaaatatggt attgataatc ctgatatgaa taaattgcag 2280
tttcacttga tgctcgatga gtttttctaa tgagggccca aatgtaatca cctggctcac 2340
cttcgggtgg gcctttctgc gttgctggcg tttttccata ggctccgccc ccctgacgag 2400
catcacaaaa atcgatgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 2460
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 2520
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 2580
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 2640
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 2700
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 2760
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta 2820
tttggtatct gcgctctgct gaagccagtt acctcggaaa aagagttggt agctcttgat 2880
ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 2940
gcagaaaaaa aggatctcaa gaagatcctt tgattttcta ccg 2983
<210> 37
<211> 3136
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CSpyCatcher plasmid
<400> 37
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttctggtgg ttctgattct gctactcata ttaagttctc caagagggac gaagatggta 1260
aagaattggc tggtgcaact atggaattga gagattcttc tggtaagacc atttccacct 1320
ggatttctga tggtcaagtt aaggatttct acttgtaccc aggtaagtac actttcgttg 1380
aaactgctgc tccagatggt tatgaagttg ctactgctat tactttcacc gtcaatgaac 1440
aaggtcaagt cactgttaat ggttagccga gacgactgac catttaaatc atacctgacc 1500
tccatagcag aaagtcaaaa gcctccgacc ggaggctttt gacttgatcg gcacgtaaga 1560
ggttccaact ttcaccataa tgaaataaga tcactaccgg gcgtattttt tgagttatcg 1620
agattttcag gagctaagga agctaaaatg agccatattc aacgggaaac gtcttgctcg 1680
aggccgcgat taaattccaa catggatgct gatttatatg ggtataaatg ggctcgcgat 1740
aatgtcgggc aatcaggtgc gacaatctat cgattgtatg ggaagcccga tgcgccagag 1800
ttgtttctga aacatggcaa aggtagcgtt gccaatgatg ttacagatga gatggtcagg 1860
ctaaactggc tgacggaatt tatgcctctt ccgaccatca agcattttat ccgtactcct 1920
gatgatgcat ggttactcac cactgcgatc ccagggaaaa cagcattcca ggtattagaa 1980
gaatatcctg attcaggtga aaatattgtt gatgcgctgg cagtgttcct gcgccggttg 2040
cattcgattc ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg tctcgctcag 2100
gcgcaatcac gaatgaataa cggtttggtt ggtgcgagtg attttgatga cgagcgtaat 2160
ggctggcctg ttgaacaagt ctggaaagaa atgcataagc ttttgccatt ctcaccggat 2220
tcagtcgtca ctcatggtga tttctcactt gataacctta tttttgacga ggggaaatta 2280
ataggttgta ttgatgttgg acgagtcgga atcgcagacc gataccagga tcttgccatc 2340
ctatggaact gcctcggtga gttttctcct tcattacaga aacggctttt tcaaaaatat 2400
ggtattgata atcctgatat gaataaattg cagtttcact tgatgctcga tgagtttttc 2460
taatgagggc ccaaatgtaa tcacctggct caccttcggg tgggcctttc tgcgttgctg 2520
gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgatg ctcaagtcag 2580
aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc 2640
gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg 2700
ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt 2760
cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc 2820
ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc 2880
actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg 2940
tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca 3000
gttacctcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg 3060
gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc 3120
ctttgatttt ctaccg 3136
<210> 38
<211> 2926
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CSpyTag plasmid
<400> 38
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttctggtgg ttctgctcat atagttatgg ttgatgctta caagccaaca aaatagccga 1260
gacgactgac catttaaatc atacctgacc tccatagcag aaagtcaaaa gcctccgacc 1320
ggaggctttt gacttgatcg gcacgtaaga ggttccaact ttcaccataa tgaaataaga 1380
tcactaccgg gcgtattttt tgagttatcg agattttcag gagctaagga agctaaaatg 1440
agccatattc aacgggaaac gtcttgctcg aggccgcgat taaattccaa catggatgct 1500
gatttatatg ggtataaatg ggctcgcgat aatgtcgggc aatcaggtgc gacaatctat 1560
cgattgtatg ggaagcccga tgcgccagag ttgtttctga aacatggcaa aggtagcgtt 1620
gccaatgatg ttacagatga gatggtcagg ctaaactggc tgacggaatt tatgcctctt 1680
ccgaccatca agcattttat ccgtactcct gatgatgcat ggttactcac cactgcgatc 1740
ccagggaaaa cagcattcca ggtattagaa gaatatcctg attcaggtga aaatattgtt 1800
gatgcgctgg cagtgttcct gcgccggttg cattcgattc ctgtttgtaa ttgtcctttt 1860
aacggcgatc gcgtatttcg tctcgctcag gcgcaatcac gaatgaataa cggtttggtt 1920
ggtgcgagtg attttgatga cgagcgtaat ggctggcctg ttgaacaagt ctggaaagaa 1980
atgcataagc ttttgccatt ctcaccggat tcagtcgtca ctcatggtga tttctcactt 2040
gataacctta tttttgacga ggggaaatta ataggttgta ttgatgttgg acgagtcgga 2100
atcgcagacc gataccagga tcttgccatc ctatggaact gcctcggtga gttttctcct 2160
tcattacaga aacggctttt tcaaaaatat ggtattgata atcctgatat gaataaattg 2220
cagtttcact tgatgctcga tgagtttttc taatgagggc ccaaatgtaa tcacctggct 2280
caccttcggg tgggcctttc tgcgttgctg gcgtttttcc ataggctccg cccccctgac 2340
gagcatcaca aaaatcgatg ctcaagtcag aggtggcgaa acccgacagg actataaaga 2400
taccaggcgt ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt 2460
accggatacc tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca tagctcacgc 2520
tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc 2580
cccgttcagc ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta 2640
agacacgact tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat 2700
gtaggcggtg ctacagagtt cttgaagtgg tggcctaact acggctacac tagaagaaca 2760
gtatttggta tctgcgctct gctgaagcca gttacctcgg aaaaagagtt ggtagctctt 2820
gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta 2880
cgcgcagaaa aaaaggatct caagaagatc ctttgatttt ctaccg 2926
<210> 39
<211> 2944
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-S2CP plasmid
<400> 39
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctt 1200
ctaagattac tggttcttct ggtaacgata cccaaggttc tttgattact tactctggtg 1260
gtgctagagg ttagccgaga cgactgacca tttaaatcat acctgacctc catagcagaa 1320
agtcaaaagc ctccgaccgg aggcttttga cttgatcggc acgtaagagg ttccaacttt 1380
caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga 1440
gctaaggaag ctaaaatgag ccatattcaa cgggaaacgt cttgctcgag gccgcgatta 1500
aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 1560
tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt gtttctgaaa 1620
catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcaggct aaactggctg 1680
acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 1740
ttactcacca ctgcgatccc agggaaaaca gcattccagg tattagaaga atatcctgat 1800
tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 1860
gtttgtaatt gtccttttaa cggcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 1920
atgaataacg gtttggttgg tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 1980
gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc agtcgtcact 2040
catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 2100
gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 2160
ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 2220
cctgatatga ataaattgca gtttcacttg atgctcgatg agtttttcta atgagggccc 2280
aaatgtaatc acctggctca ccttcgggtg ggcctttctg cgttgctggc gtttttccat 2340
aggctccgcc cccctgacga gcatcacaaa aatcgatgct caagtcagag gtggcgaaac 2400
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 2460
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 2520
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 2580
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 2640
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 2700
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 2760
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt tacctcggaa 2820
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg 2880
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgattttct 2940
accg 2944
<210> 40
<211> 1864
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-Pcon3 plasmid
<400> 40
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctggcttcc caaccttacc agagggcgcc ccagctggca attccgacgt 120
cctgacagct agctcagtcc taggtataat gctagcgaat tcaaaagatc ttttaagaag 180
gagatataca tgatgcgaga cgactgacca tttaaatcat acctgacctc catagcagaa 240
agtcaaaagc ctccgaccgg aggcttttga cttgatcggc acgtaagagg ttccaacttt 300
caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga 360
gctaaggaag ctaaaatgag ccatattcaa cgggaaacgt cttgctcgag gccgcgatta 420
aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 480
tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt gtttctgaaa 540
catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcaggct aaactggctg 600
acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 660
ttactcacca ctgcgatccc agggaaaaca gcattccagg tattagaaga atatcctgat 720
tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 780
gtttgtaatt gtccttttaa cggcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 840
atgaataacg gtttggttgg tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 900
gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc agtcgtcact 960
catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 1020
gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 1080
ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 1140
cctgatatga ataaattgca gtttcacttg atgctcgatg agtttttcta atgagggccc 1200
aaatgtaatc acctggctca ccttcgggtg ggcctttctg cgttgctggc gtttttccat 1260
aggctccgcc cccctgacga gcatcacaaa aatcgatgct caagtcagag gtggcgaaac 1320
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1380
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1440
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1500
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1560
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1620
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1680
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt tacctcggaa 1740
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg 1800
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgattttct 1860
accg 1864
<210> 41
<211> 1864
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-Pcon4 plasmid
<400> 41
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctggcttcc caaccttacc agagggcgcc ccagctggca attccgacgt 120
ctttacggct agctcagtcc taggtactat gctagcgaat tcaaaagatc ttttaagaag 180
gagatataca tgatgcgaga cgactgacca tttaaatcat acctgacctc catagcagaa 240
agtcaaaagc ctccgaccgg aggcttttga cttgatcggc acgtaagagg ttccaacttt 300
caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga 360
gctaaggaag ctaaaatgag ccatattcaa cgggaaacgt cttgctcgag gccgcgatta 420
aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 480
tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt gtttctgaaa 540
catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcaggct aaactggctg 600
acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 660
ttactcacca ctgcgatccc agggaaaaca gcattccagg tattagaaga atatcctgat 720
tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 780
gtttgtaatt gtccttttaa cggcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 840
atgaataacg gtttggttgg tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 900
gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc agtcgtcact 960
catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 1020
gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 1080
ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 1140
cctgatatga ataaattgca gtttcacttg atgctcgatg agtttttcta atgagggccc 1200
aaatgtaatc acctggctca ccttcgggtg ggcctttctg cgttgctggc gtttttccat 1260
aggctccgcc cccctgacga gcatcacaaa aatcgatgct caagtcagag gtggcgaaac 1320
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1380
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1440
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1500
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1560
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1620
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1680
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt tacctcggaa 1740
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg 1800
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgattttct 1860
accg 1864
<210> 42
<211> 1864
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-Pcon5 plasmid
<400> 42
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctggcttcc caaccttacc agagggcgcc ccagctggca attccgacgt 120
ctttatggct agctcagtcc taggtacaat gctagcgaat tcaaaagatc ttttaagaag 180
gagatataca tgatgcgaga cgactgacca tttaaatcat acctgacctc catagcagaa 240
agtcaaaagc ctccgaccgg aggcttttga cttgatcggc acgtaagagg ttccaacttt 300
caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga 360
gctaaggaag ctaaaatgag ccatattcaa cgggaaacgt cttgctcgag gccgcgatta 420
aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 480
tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt gtttctgaaa 540
catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcaggct aaactggctg 600
acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 660
ttactcacca ctgcgatccc agggaaaaca gcattccagg tattagaaga atatcctgat 720
tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 780
gtttgtaatt gtccttttaa cggcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 840
atgaataacg gtttggttgg tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 900
gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc agtcgtcact 960
catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 1020
gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 1080
ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 1140
cctgatatga ataaattgca gtttcacttg atgctcgatg agtttttcta atgagggccc 1200
aaatgtaatc acctggctca ccttcgggtg ggcctttctg cgttgctggc gtttttccat 1260
aggctccgcc cccctgacga gcatcacaaa aatcgatgct caagtcagag gtggcgaaac 1320
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1380
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1440
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1500
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1560
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1620
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1680
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt tacctcggaa 1740
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg 1800
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgattttct 1860
accg 1864
<210> 43
<211> 3301
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-LacI+PT7 plasmid
<400> 43
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gcttcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta 120
atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgccagg gtggtttttc 180
ttttcaccag tgaaacgggc aacagctgat tgcccttcac cgcctggccc tgagagagtt 240
gcagcaagcg gtccacgctg gtttgcccca gcaggcgaaa atcctgtttg atggtggtta 300
acggcgggat ataacatgag ctgtcttcgg tatcgtcgta tcccactacc gagatatccg 360
caccaacgcg cagcccggac tcggtaatgg cgcgcattgc gcccagcgcc atctgatcgt 420
tggcaaccag catcgcagtg ggaacgatgc cctcattcag catttgcatg gtttgttgaa 480
aaccggacat ggcactccag tcgccttccc gttccgctat cggctgaatt tgattgcgag 540
tgagatattt atgccagcca gccagacgca gacgcgccga gacagaactt aatgggcccg 600
ctaacagcgc gatttgctgg tgacccaatg cgaccagatg ctccacgccc agtcgcgtac 660
cgtcttcatg ggagaaaata atactgttga tgggtgtctg gtcagagaca tcaagaaata 720
acgccggaac attagtgcag gcagcttcca cagcaatggc atcctggtca tccagcggat 780
agttaatgat cagcccactg acgcgttgcg cgagaagatt gtgcaccgcc gctttacagg 840
cttcgacgcc gcttcgttct accatcgaca ccaccacgct ggcacccagt tgatcggcgc 900
gagatttaat cgccgcgaca atttgcgacg gcgcgtgcag ggccagactg gaggtggcaa 960
cgccaatcag caacgactgt ttgcccgcca gttgttgtgc cacgcggttg ggaatgtaat 1020
tcagctccgc catcgccgct tccacttttt cccgcgtttt cgcagaaacg tggctggcct 1080
ggttcaccac gcgggaaacg gtctgataag agacaccggc atactctgcg acatcgtata 1140
acgttactgg tttcacattc accaccctga attgactctc ttccgggcgc tatcatgcca 1200
taccgcgaaa ggttttgcgc cattcgatgg tgtccgggat ctcgacgctc tcccttatgc 1260
gactcctgca ttaggaagca gcccagtagt aggttgaggc cgttgagcac cgccgccgca 1320
aggaatggtg catgcaagga gatggcgccc aacagtcccc cggccacggg gcctgccacc 1380
atacccacgc cgaaacaagc gctcatgagc ccgaagtggc gagcccgatc ttccccatcg 1440
gtgatgtcgg cgatataggc gccagcaacc gcacctgtgg cgccggtgat gccggccacg 1500
atgcgtccgg cgtagaggat cgagatctcg atcccgcgaa attaatacga ctcactatag 1560
gggaattgtg agcggataac aattcccctc tagaaataat tttgtttaac tttaagaagg 1620
agatatacga tgcgagacga ctgaccattt aaatcatacc tgacctccat agcagaaagt 1680
caaaagcctc cgaccggagg cttttgactt gatcggcacg taagaggttc caactttcac 1740
cataatgaaa taagatcact accgggcgta ttttttgagt tatcgagatt ttcaggagct 1800
aaggaagcta aaatgagcca tattcaacgg gaaacgtctt gctcgaggcc gcgattaaat 1860
tccaacatgg atgctgattt atatgggtat aaatgggctc gcgataatgt cgggcaatca 1920
ggtgcgacaa tctatcgatt gtatgggaag cccgatgcgc cagagttgtt tctgaaacat 1980
ggcaaaggta gcgttgccaa tgatgttaca gatgagatgg tcaggctaaa ctggctgacg 2040
gaatttatgc ctcttccgac catcaagcat tttatccgta ctcctgatga tgcatggtta 2100
ctcaccactg cgatcccagg gaaaacagca ttccaggtat tagaagaata tcctgattca 2160
ggtgaaaata ttgttgatgc gctggcagtg ttcctgcgcc ggttgcattc gattcctgtt 2220
tgtaattgtc cttttaacgg cgatcgcgta tttcgtctcg ctcaggcgca atcacgaatg 2280
aataacggtt tggttggtgc gagtgatttt gatgacgagc gtaatggctg gcctgttgaa 2340
caagtctgga aagaaatgca taagcttttg ccattctcac cggattcagt cgtcactcat 2400
ggtgatttct cacttgataa ccttattttt gacgagggga aattaatagg ttgtattgat 2460
gttggacgag tcggaatcgc agaccgatac caggatcttg ccatcctatg gaactgcctc 2520
ggtgagtttt ctccttcatt acagaaacgg ctttttcaaa aatatggtat tgataatcct 2580
gatatgaata aattgcagtt tcacttgatg ctcgatgagt ttttctaatg agggcccaaa 2640
tgtaatcacc tggctcacct tcgggtgggc ctttctgcgt tgctggcgtt tttccatagg 2700
ctccgccccc ctgacgagca tcacaaaaat cgatgctcaa gtcagaggtg gcgaaacccg 2760
acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt 2820
ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt 2880
tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc 2940
tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt 3000
gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt 3060
agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc 3120
tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac ctcggaaaaa 3180
gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt 3240
gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg attttctacc 3300
g 3301
<210> 44
<211> 2499
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-His6-meGFP plasmid
<400> 44
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgggttctt ctcatcatca ccatcaccat tcttctggga tgtctaaagg 120
tgaagaatta ttcactggtg ttgtcccaat tttggttgaa ttagatggtg atgttaatgg 180
tcacaaattt tctgtctccg gtgaaggtga aggtgatgct acttacggta aattgacctt 240
aaaatttatt tgtactactg gtaaattgcc agttccatgg ccaaccttag tcactacttt 300
aacttatggt gttcaatgtt tttctagata cccagatcat atgaaacaac atgacttttt 360
caagtctgcc atgccagaag gttatgttca agaaagaact atttttttca aagatgacgg 420
taactacaag accagagctg aagtcaagtt tgaaggtgat accttagtta atagaatcga 480
attaaaaggt attgatttta aagaagatgg taacatttta ggtcacaaat tggaatacaa 540
ctataactct cacaatgttt acatcatggc tgacaaacaa aagaatggta tcaaagttaa 600
cttcaaaatt agacacaaca ttgaagatgg ttctgttcaa ttagctgacc attatcaaca 660
aaatactcca attggtgatg gtccagtctt gttaccagac aaccattact tatccactca 720
atctaaatta tccaaagatc caaacgaaaa gagagatcac atggtcttgt tagaatttgt 780
tactgctgct ggtattaccc atggtatgga tgaattgtac aaataatagc cgagacgact 840
gaccatttaa atcatacctg acctccatag cagaaagtca aaagcctccg accggaggct 900
tttgacttga tcggcacgta agaggttcca actttcacca taatgaaata agatcactac 960
cgggcgtatt ttttgagtta tcgagatttt caggagctaa ggaagctaaa atgagccata 1020
ttcaacggga aacgtcttgc tcgaggccgc gattaaattc caacatggat gctgatttat 1080
atgggtataa atgggctcgc gataatgtcg ggcaatcagg tgcgacaatc tatcgattgt 1140
atgggaagcc cgatgcgcca gagttgtttc tgaaacatgg caaaggtagc gttgccaatg 1200
atgttacaga tgagatggtc aggctaaact ggctgacgga atttatgcct cttccgacca 1260
tcaagcattt tatccgtact cctgatgatg catggttact caccactgcg atcccaggga 1320
aaacagcatt ccaggtatta gaagaatatc ctgattcagg tgaaaatatt gttgatgcgc 1380
tggcagtgtt cctgcgccgg ttgcattcga ttcctgtttg taattgtcct tttaacggcg 1440
atcgcgtatt tcgtctcgct caggcgcaat cacgaatgaa taacggtttg gttggtgcga 1500
gtgattttga tgacgagcgt aatggctggc ctgttgaaca agtctggaaa gaaatgcata 1560
agcttttgcc attctcaccg gattcagtcg tcactcatgg tgatttctca cttgataacc 1620
ttatttttga cgaggggaaa ttaataggtt gtattgatgt tggacgagtc ggaatcgcag 1680
accgatacca ggatcttgcc atcctatgga actgcctcgg tgagttttct ccttcattac 1740
agaaacggct ttttcaaaaa tatggtattg ataatcctga tatgaataaa ttgcagtttc 1800
acttgatgct cgatgagttt ttctaatgag ggcccaaatg taatcacctg gctcaccttc 1860
gggtgggcct ttctgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 1920
acaaaaatcg atgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 1980
cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 2040
acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 2100
atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 2160
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 2220
acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 2280
gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg 2340
gtatctgcgc tctgctgaag ccagttacct cggaaaaaga gttggtagct cttgatccgg 2400
caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 2460
aaaaaaagga tctcaagaag atcctttgat tttctaccg 2499
<210> 45
<211> 753
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-His6-meGFP nucleotide sequence of key ORF
<400> 45
atgggttctt ctcatcatca ccatcaccat tcttctggga tgtctaaagg tgaagaatta 60
ttcactggtg ttgtcccaat tttggttgaa ttagatggtg atgttaatgg tcacaaattt 120
tctgtctccg gtgaaggtga aggtgatgct acttacggta aattgacctt aaaatttatt 180
tgtactactg gtaaattgcc agttccatgg ccaaccttag tcactacttt aacttatggt 240
gttcaatgtt tttctagata cccagatcat atgaaacaac atgacttttt caagtctgcc 300
atgccagaag gttatgttca agaaagaact atttttttca aagatgacgg taactacaag 360
accagagctg aagtcaagtt tgaaggtgat accttagtta atagaatcga attaaaaggt 420
attgatttta aagaagatgg taacatttta ggtcacaaat tggaatacaa ctataactct 480
cacaatgttt acatcatggc tgacaaacaa aagaatggta tcaaagttaa cttcaaaatt 540
agacacaaca ttgaagatgg ttctgttcaa ttagctgacc attatcaaca aaatactcca 600
attggtgatg gtccagtctt gttaccagac aaccattact tatccactca atctaaatta 660
tccaaagatc caaacgaaaa gagagatcac atggtcttgt tagaatttgt tactgctgct 720
ggtattaccc atggtatgga tgaattgtac aaa 753
<210> 46
<211> 251
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-His6-meGFP amino acid sequence of key ORF
<400> 46
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Ser Lys
1               5                   10                  15
Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp
            20                  25                  30
Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly
        35                  40                  45
Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly
    50                  55                  60
Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly
65                  70                  75                  80
Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln His Asp Phe
                85                  90                  95
Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe
            100                 105                 110
Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu
        115                 120                 125
Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys
    130                 135                 140
Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser
145                 150                 155                 160
His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val
                165                 170                 175
Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala
            180                 185                 190
Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu
        195                 200                 205
Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Lys Leu Ser Lys Asp Pro
    210                 215                 220
Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala
225                 230                 235                 240
Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
                245                 250
<210> 47
<211> 2658
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP-S2CP plasmid
<400> 47
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgctgttta tcaaacctgc cgatctgcgt gaaattgtta cctttccgct 120
gtttagcgat ctggttcagt gtggttttcc gagtccggca gcagattatg ttgaacagcg 180
tattgatctg agcagcggta tgtctaaagg tgaagaatta ttcactggtg ttgtcccaat 240
tttggttgaa ttagatggtg atgttaatgg tcacaaattt tctgtctccg gtgaaggtga 300
aggtgatgct acttacggta aattgacctt aaaatttatt tgtactactg gtaaattgcc 360
agttccatgg ccaaccttag tcactacttt aacttatggt gttcaatgtt tttctagata 420
cccagatcat atgaaacaac atgacttttt caagtctgcc atgccagaag gttatgttca 480
agaaagaact atttttttca aagatgacgg taactacaag accagagctg aagtcaagtt 540
tgaaggtgat accttagtta atagaatcga attaaaaggt attgatttta aagaagatgg 600
taacatttta ggtcacaaat tggaatacaa ctataactct cacaatgttt acatcatggc 660
tgacaaacaa aagaatggta tcaaagttaa cttcaaaatt agacacaaca ttgaagatgg 720
ttctgttcaa ttagctgacc attatcaaca aaatactcca attggtgatg gtccagtctt 780
gttaccagac aaccattact tatccactca atctaaatta tccaaagatc caaacgaaaa 840
gagagatcac atggtcttgt tagaatttgt tactgctgct ggtattaccc atggtatgga 900
tgaattgtac aaatctaaga ttactggttc ttctggtaac gatacccaag gttctttgat 960
tacttactct ggtggtgcta gaggttagcc gagacgactg accatttaaa tcatacctga 1020
cctccatagc agaaagtcaa aagcctccga ccggaggctt ttgacttgat cggcacgtaa 1080
gaggttccaa ctttcaccat aatgaaataa gatcactacc gggcgtattt tttgagttat 1140
cgagattttc aggagctaag gaagctaaaa tgagccatat tcaacgggaa acgtcttgct 1200
cgaggccgcg attaaattcc aacatggatg ctgatttata tgggtataaa tgggctcgcg 1260
ataatgtcgg gcaatcaggt gcgacaatct atcgattgta tgggaagccc gatgcgccag 1320
agttgtttct gaaacatggc aaaggtagcg ttgccaatga tgttacagat gagatggtca 1380
ggctaaactg gctgacggaa tttatgcctc ttccgaccat caagcatttt atccgtactc 1440
ctgatgatgc atggttactc accactgcga tcccagggaa aacagcattc caggtattag 1500
aagaatatcc tgattcaggt gaaaatattg ttgatgcgct ggcagtgttc ctgcgccggt 1560
tgcattcgat tcctgtttgt aattgtcctt ttaacggcga tcgcgtattt cgtctcgctc 1620
aggcgcaatc acgaatgaat aacggtttgg ttggtgcgag tgattttgat gacgagcgta 1680
atggctggcc tgttgaacaa gtctggaaag aaatgcataa gcttttgcca ttctcaccgg 1740
attcagtcgt cactcatggt gatttctcac ttgataacct tatttttgac gaggggaaat 1800
taataggttg tattgatgtt ggacgagtcg gaatcgcaga ccgataccag gatcttgcca 1860
tcctatggaa ctgcctcggt gagttttctc cttcattaca gaaacggctt tttcaaaaat 1920
atggtattga taatcctgat atgaataaat tgcagtttca cttgatgctc gatgagtttt 1980
tctaatgagg gcccaaatgt aatcacctgg ctcaccttcg ggtgggcctt tctgcgttgc 2040
tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga tgctcaagtc 2100
agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 2160
tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 2220
cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 2280
ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 2340
ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 2400
ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 2460
ggtggcctaa ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc 2520
cagttacctc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag 2580
cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga 2640
tcctttgatt ttctaccg 2658
<210> 48
<211> 915
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP-S2CP nucleotide sequence of key ORF
<400> 48
atgctgttta tcaaacctgc cgatctgcgt gaaattgtta cctttccgct gtttagcgat 60
ctggttcagt gtggttttcc gagtccggca gcagattatg ttgaacagcg tattgatctg 120
agcagcggta tgtctaaagg tgaagaatta ttcactggtg ttgtcccaat tttggttgaa 180
ttagatggtg atgttaatgg tcacaaattt tctgtctccg gtgaaggtga aggtgatgct 240
acttacggta aattgacctt aaaatttatt tgtactactg gtaaattgcc agttccatgg 300
ccaaccttag tcactacttt aacttatggt gttcaatgtt tttctagata cccagatcat 360
atgaaacaac atgacttttt caagtctgcc atgccagaag gttatgttca agaaagaact 420
atttttttca aagatgacgg taactacaag accagagctg aagtcaagtt tgaaggtgat 480
accttagtta atagaatcga attaaaaggt attgatttta aagaagatgg taacatttta 540
ggtcacaaat tggaatacaa ctataactct cacaatgttt acatcatggc tgacaaacaa 600
aagaatggta tcaaagttaa cttcaaaatt agacacaaca ttgaagatgg ttctgttcaa 660
ttagctgacc attatcaaca aaatactcca attggtgatg gtccagtctt gttaccagac 720
aaccattact tatccactca atctaaatta tccaaagatc caaacgaaaa gagagatcac 780
atggtcttgt tagaatttgt tactgctgct ggtattaccc atggtatgga tgaattgtac 840
aaatctaaga ttactggttc ttctggtaac gatacccaag gttctttgat tacttactct 900
ggtggtgcta gaggt 915
<210> 49
<211> 305
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP-S2CP amino acid sequence of key ORF
<400> 49
Met Leu Phe Ile Lys Pro Ala Asp Leu Arg Glu Ile Val Thr Phe Pro
1               5                   10                  15
Leu Phe Ser Asp Leu Val Gln Cys Gly Phe Pro Ser Pro Ala Ala Asp
            20                  25                  30
Tyr Val Glu Gln Arg Ile Asp Leu Ser Ser Gly Met Ser Lys Gly Glu
        35                  40                  45
Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly Asp
    50                  55                  60
Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly Asp Ala
65                  70                  75                  80
Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu
                85                  90                  95
Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly Val Gln
            100                 105                 110
Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln His Asp Phe Phe Lys
        115                 120                 125
Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe Phe Lys
    130                 135                 140
Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp
145                 150                 155                 160
Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp
                165                 170                 175
Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser His Asn
            180                 185                 190
Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val Asn Phe
        195                 200                 205
Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala Asp His
    210                 215                 220
Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp
225                 230                 235                 240
Asn His Tyr Leu Ser Thr Gln Ser Lys Leu Ser Lys Asp Pro Asn Glu
                245                 250                 255
Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile
            260                 265                 270
Thr His Gly Met Asp Glu Leu Tyr Lys Ser Lys Ile Thr Gly Ser Ser
        275                 280                 285
Gly Asn Asp Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly Gly Ala Arg
    290                 295                 300
Gly
305
<210> 50
<211> 2586
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP plasmid
<400> 50
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgctgttta tcaaacctgc cgatctgcgt gaaattgtta cctttccgct 120
gtttagcgat ctggttcagt gtggttttcc gagtccggca gcagattatg ttgaacagcg 180
tattgatctg agcagcggta tgtctaaagg tgaagaatta ttcactggtg ttgtcccaat 240
tttggttgaa ttagatggtg atgttaatgg tcacaaattt tctgtctccg gtgaaggtga 300
aggtgatgct acttacggta aattgacctt aaaatttatt tgtactactg gtaaattgcc 360
agttccatgg ccaaccttag tcactacttt aacttatggt gttcaatgtt tttctagata 420
cccagatcat atgaaacaac atgacttttt caagtctgcc atgccagaag gttatgttca 480
agaaagaact atttttttca aagatgacgg taactacaag accagagctg aagtcaagtt 540
tgaaggtgat accttagtta atagaatcga attaaaaggt attgatttta aagaagatgg 600
taacatttta ggtcacaaat tggaatacaa ctataactct cacaatgttt acatcatggc 660
tgacaaacaa aagaatggta tcaaagttaa cttcaaaatt agacacaaca ttgaagatgg 720
ttctgttcaa ttagctgacc attatcaaca aaatactcca attggtgatg gtccagtctt 780
gttaccagac aaccattact tatccactca atctaaatta tccaaagatc caaacgaaaa 840
gagagatcac atggtcttgt tagaatttgt tactgctgct ggtattaccc atggtatgga 900
tgaattgtac aaatagccga gacgactgac catttaaatc atacctgacc tccatagcag 960
aaagtcaaaa gcctccgacc ggaggctttt gacttgatcg gcacgtaaga ggttccaact 1020
ttcaccataa tgaaataaga tcactaccgg gcgtattttt tgagttatcg agattttcag 1080
gagctaagga agctaaaatg agccatattc aacgggaaac gtcttgctcg aggccgcgat 1140
taaattccaa catggatgct gatttatatg ggtataaatg ggctcgcgat aatgtcgggc 1200
aatcaggtgc gacaatctat cgattgtatg ggaagcccga tgcgccagag ttgtttctga 1260
aacatggcaa aggtagcgtt gccaatgatg ttacagatga gatggtcagg ctaaactggc 1320
tgacggaatt tatgcctctt ccgaccatca agcattttat ccgtactcct gatgatgcat 1380
ggttactcac cactgcgatc ccagggaaaa cagcattcca ggtattagaa gaatatcctg 1440
attcaggtga aaatattgtt gatgcgctgg cagtgttcct gcgccggttg cattcgattc 1500
ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg tctcgctcag gcgcaatcac 1560
gaatgaataa cggtttggtt ggtgcgagtg attttgatga cgagcgtaat ggctggcctg 1620
ttgaacaagt ctggaaagaa atgcataagc ttttgccatt ctcaccggat tcagtcgtca 1680
ctcatggtga tttctcactt gataacctta tttttgacga ggggaaatta ataggttgta 1740
ttgatgttgg acgagtcgga atcgcagacc gataccagga tcttgccatc ctatggaact 1800
gcctcggtga gttttctcct tcattacaga aacggctttt tcaaaaatat ggtattgata 1860
atcctgatat gaataaattg cagtttcact tgatgctcga tgagtttttc taatgagggc 1920
ccaaatgtaa tcacctggct caccttcggg tgggcctttc tgcgttgctg gcgtttttcc 1980
ataggctccg cccccctgac gagcatcaca aaaatcgatg ctcaagtcag aggtggcgaa 2040
acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc 2100
ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 2160
cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 2220
tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 2280
gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca 2340
ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 2400
acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca gttacctcgg 2460
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 2520
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatttt 2580
ctaccg 2586
<210> 51
<211> 843
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP nucleotide sequence of key ORF
<400> 51
atgctgttta tcaaacctgc cgatctgcgt gaaattgtta cctttccgct gtttagcgat 60
ctggttcagt gtggttttcc gagtccggca gcagattatg ttgaacagcg tattgatctg 120
agcagcggta tgtctaaagg tgaagaatta ttcactggtg ttgtcccaat tttggttgaa 180
ttagatggtg atgttaatgg tcacaaattt tctgtctccg gtgaaggtga aggtgatgct 240
acttacggta aattgacctt aaaatttatt tgtactactg gtaaattgcc agttccatgg 300
ccaaccttag tcactacttt aacttatggt gttcaatgtt tttctagata cccagatcat 360
atgaaacaac atgacttttt caagtctgcc atgccagaag gttatgttca agaaagaact 420
atttttttca aagatgacgg taactacaag accagagctg aagtcaagtt tgaaggtgat 480
accttagtta atagaatcga attaaaaggt attgatttta aagaagatgg taacatttta 540
ggtcacaaat tggaatacaa ctataactct cacaatgttt acatcatggc tgacaaacaa 600
aagaatggta tcaaagttaa cttcaaaatt agacacaaca ttgaagatgg ttctgttcaa 660
ttagctgacc attatcaaca aaatactcca attggtgatg gtccagtctt gttaccagac 720
aaccattact tatccactca atctaaatta tccaaagatc caaacgaaaa gagagatcac 780
atggtcttgt tagaatttgt tactgctgct ggtattaccc atggtatgga tgaattgtac 840
aaa 843
<210> 52
<211> 281
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP amino acid sequence of key ORF
<400> 52
Met Leu Phe Ile Lys Pro Ala Asp Leu Arg Glu Ile Val Thr Phe Pro
1               5                   10                  15
Leu Phe Ser Asp Leu Val Gln Cys Gly Phe Pro Ser Pro Ala Ala Asp
            20                  25                  30
Tyr Val Glu Gln Arg Ile Asp Leu Ser Ser Gly Met Ser Lys Gly Glu
        35                  40                  45
Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly Asp
    50                  55                  60
Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly Asp Ala
65                  70                  75                  80
Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu
                85                  90                  95
Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly Val Gln
            100                 105                 110
Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln His Asp Phe Phe Lys
        115                 120                 125
Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe Phe Lys
    130                 135                 140
Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp
145                 150                 155                 160
Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp
                165                 170                 175
Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser His Asn
            180                 185                 190
Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val Asn Phe
        195                 200                 205
Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala Asp His
    210                 215                 220
Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp
225                 230                 235                 240
Asn His Tyr Leu Ser Thr Gln Ser Lys Leu Ser Lys Asp Pro Asn Glu
                245                 250                 255
Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile
            260                 265                 270
Thr His Gly Met Asp Glu Leu Tyr Lys
        275                 280
<210> 53
<211> 2037
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CsoS1A plasmid
<400> 53
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atggctgatg ttactggtat tgctttgggt atgattgaaa ctagaggttt 120
ggttccagct atcgaagctg ctgacgctat gaccaaggcc gctgaagtca gattggtcgg 180
tagacaattt gttggaggtg gttacgtcac tgttttggtt cgtggtgaaa ccggtgccgt 240
taacgctgct gttagagctg gtgctgatgc ttgtgaaaga gttggtgacg gtttagttgc 300
tgcccacatt attgccagag tccactctga agttgaaaac attttgccaa aggctccaca 360
ggcttagccg agacgactga ccatttaaat catacctgac ctccatagca gaaagtcaaa 420
agcctccgac cggaggcttt tgacttgatc ggcacgtaag aggttccaac tttcaccata 480
atgaaataag atcactaccg ggcgtatttt ttgagttatc gagattttca ggagctaagg 540
aagctaaaat gagccatatt caacgggaaa cgtcttgctc gaggccgcga ttaaattcca 600
acatggatgc tgatttatat gggtataaat gggctcgcga taatgtcggg caatcaggtg 660
cgacaatcta tcgattgtat gggaagcccg atgcgccaga gttgtttctg aaacatggca 720
aaggtagcgt tgccaatgat gttacagatg agatggtcag gctaaactgg ctgacggaat 780
ttatgcctct tccgaccatc aagcatttta tccgtactcc tgatgatgca tggttactca 840
ccactgcgat cccagggaaa acagcattcc aggtattaga agaatatcct gattcaggtg 900
aaaatattgt tgatgcgctg gcagtgttcc tgcgccggtt gcattcgatt cctgtttgta 960
attgtccttt taacggcgat cgcgtatttc gtctcgctca ggcgcaatca cgaatgaata 1020
acggtttggt tggtgcgagt gattttgatg acgagcgtaa tggctggcct gttgaacaag 1080
tctggaaaga aatgcataag cttttgccat tctcaccgga ttcagtcgtc actcatggtg 1140
atttctcact tgataacctt atttttgacg aggggaaatt aataggttgt attgatgttg 1200
gacgagtcgg aatcgcagac cgataccagg atcttgccat cctatggaac tgcctcggtg 1260
agttttctcc ttcattacag aaacggcttt ttcaaaaata tggtattgat aatcctgata 1320
tgaataaatt gcagtttcac ttgatgctcg atgagttttt ctaatgaggg cccaaatgta 1380
atcacctggc tcaccttcgg gtgggccttt ctgcgttgct ggcgtttttc cataggctcc 1440
gcccccctga cgagcatcac aaaaatcgat gctcaagtca gaggtggcga aacccgacag 1500
gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga 1560
ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc 1620
atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg 1680
tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt 1740
ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca 1800
gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca 1860
ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttacctcg gaaaaagagt 1920
tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa 1980
gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgattt tctaccg 2037
<210> 54
<211> 1992
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CsoS4A plasmid
<400> 54
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgaagatca tgcaagttga aaagactttg gtttctacca acagaattgc 120
tgatatgggt cacaagccat tgttggttgt ttgggaaaaa cctggtgctc caagacaagt 180
tgctgttgat gctattggtt gtattccagg tgactgggtt ttgtgtgttg gttcttctgc 240
tgccagagaa gctgctggtt ccaagtctta cccatctgat ttgactatca tcggtattat 300
tgaccaatgg aacggtgaat agccgagacg actgaccatt taaatcatac ctgacctcca 360
tagcagaaag tcaaaagcct ccgaccggag gcttttgact tgatcggcac gtaagaggtt 420
ccaactttca ccataatgaa ataagatcac taccgggcgt attttttgag ttatcgagat 480
tttcaggagc taaggaagct aaaatgagcc atattcaacg ggaaacgtct tgctcgaggc 540
cgcgattaaa ttccaacatg gatgctgatt tatatgggta taaatgggct cgcgataatg 600
tcgggcaatc aggtgcgaca atctatcgat tgtatgggaa gcccgatgcg ccagagttgt 660
ttctgaaaca tggcaaaggt agcgttgcca atgatgttac agatgagatg gtcaggctaa 720
actggctgac ggaatttatg cctcttccga ccatcaagca ttttatccgt actcctgatg 780
atgcatggtt actcaccact gcgatcccag ggaaaacagc attccaggta ttagaagaat 840
atcctgattc aggtgaaaat attgttgatg cgctggcagt gttcctgcgc cggttgcatt 900
cgattcctgt ttgtaattgt ccttttaacg gcgatcgcgt atttcgtctc gctcaggcgc 960
aatcacgaat gaataacggt ttggttggtg cgagtgattt tgatgacgag cgtaatggct 1020
ggcctgttga acaagtctgg aaagaaatgc ataagctttt gccattctca ccggattcag 1080
tcgtcactca tggtgatttc tcacttgata accttatttt tgacgagggg aaattaatag 1140
gttgtattga tgttggacga gtcggaatcg cagaccgata ccaggatctt gccatcctat 1200
ggaactgcct cggtgagttt tctccttcat tacagaaacg gctttttcaa aaatatggta 1260
ttgataatcc tgatatgaat aaattgcagt ttcacttgat gctcgatgag tttttctaat 1320
gagggcccaa atgtaatcac ctggctcacc ttcgggtggg cctttctgcg ttgctggcgt 1380
ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgatgctca agtcagaggt 1440
ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 1500
gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa 1560
gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct 1620
ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta 1680
actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg 1740
gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc 1800
ctaactacgg ctacactaga agaacagtat ttggtatctg cgctctgctg aagccagtta 1860
cctcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 1920
tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 1980
gattttctac cg 1992
<210> 55
<211> 1896
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_T-TT7 plasmid
<400> 55
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgt agctaacaaa gcccgaaagg aagctgagtt ggctgctgcc accgctgagc 120
aataactagc ataacccctt ggggcctcta aacgggtctt gaggggtttt ttgctgaaag 180
gaggaactat atccggatat cccgcaagag gcccggcagt acccctccga gacgactgac 240
catttaaatc atacctgacc tccatagcag aaagtcaaaa gcctccgacc ggaggctttt 300
gacttgatcg gcacgtaaga ggttccaact ttcaccataa tgaaataaga tcactaccgg 360
gcgtattttt tgagttatcg agattttcag gagctaagga agctaaaatg agccatattc 420
aacgggaaac gtcttgctcg aggccgcgat taaattccaa catggatgct gatttatatg 480
ggtataaatg ggctcgcgat aatgtcgggc aatcaggtgc gacaatctat cgattgtatg 540
ggaagcccga tgcgccagag ttgtttctga aacatggcaa aggtagcgtt gccaatgatg 600
ttacagatga gatggtcagg ctaaactggc tgacggaatt tatgcctctt ccgaccatca 660
agcattttat ccgtactcct gatgatgcat ggttactcac cactgcgatc ccagggaaaa 720
cagcattcca ggtattagaa gaatatcctg attcaggtga aaatattgtt gatgcgctgg 780
cagtgttcct gcgccggttg cattcgattc ctgtttgtaa ttgtcctttt aacggcgatc 840
gcgtatttcg tctcgctcag gcgcaatcac gaatgaataa cggtttggtt ggtgcgagtg 900
attttgatga cgagcgtaat ggctggcctg ttgaacaagt ctggaaagaa atgcataagc 960
ttttgccatt ctcaccggat tcagtcgtca ctcatggtga tttctcactt gataacctta 1020
tttttgacga ggggaaatta ataggttgta ttgatgttgg acgagtcgga atcgcagacc 1080
gataccagga tcttgccatc ctatggaact gcctcggtga gttttctcct tcattacaga 1140
aacggctttt tcaaaaatat ggtattgata atcctgatat gaataaattg cagtttcact 1200
tgatgctcga tgagtttttc taatgagggc ccaaatgtaa tcacctggct caccttcggg 1260
tgggcctttc tgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca 1320
aaaatcgatg ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt 1380
ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc 1440
tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc 1500
tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc 1560
ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact 1620
tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg 1680
ctacagagtt cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta 1740
tctgcgctct gctgaagcca gttacctcgg aaaaagagtt ggtagctctt gatccggcaa 1800
acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 1860
aaaaggatct caagaagatc ctttgatttt ctaccg 1896
<210> 56
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES1 plasmid
<400> 56
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctctacctgg ctagagacgg caatacgcaa accgcctctc 1920
cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctct 2940
gag 2943
<210> 57
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES2 plasmid
<400> 57
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctctacctgg ctagagacgg caatacgcaa accgcctctc 1920
cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctca 2940
ggc 2943
<210> 58
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES3 plasmid
<400> 58
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctctaggcgg ctagagacgg caatacgcaa accgcctcta 1920
gccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctct 2940
gag 2943
<210> 59
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES4 plasmid
<400> 59
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctctaggcgg ctagagacgg caatacgcaa accgcctcta 1920
gccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctct 2940
gcc 2943
<210> 60
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES5 plasmid
<400> 60
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctcttgccgg ctagagacgg caatacgcaa accgcctcta 1920
gccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctct 2940
gag 2943
<210> 61
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES6 plasmid
<400> 61
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctcttgccgg ctagagacgg caatacgcaa accgcctctc 1920
cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctcc 2940
act 2943
<210> 62
<211> 2942
<212> DNA
<213> Artificial Sequence
<220>
<223> pES7 plasmid
<400> 62
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctctcactgg ctagagacgg caatacgcaa accgcctcta 1920
gccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagacgt ctcacctctg 2940
ag 2942
<210> 63
<211> 4498
<212> DNA
<213> Artificial Sequence
<220>
<223> pCKH plasmid
<400> 63
actcctccct gcaagacggt gagttcatct acaaagttaa actgcgtggt accaacttcc 60
cgtccgacgg tccggttatg cagaaaaaaa ccatgggttg ggaagcttcc accgaacgta 120
tgtacccgga agacggtgct ctgaaaggtg aaatcaaaat gcgtctgaaa ctgaaagacg 180
gtggtcacta cgacgctgaa gttaaaacca cctacatggc taaaaaaccg gttcagctgc 240
cgggtgctta caaaaccgac atcaaactgg acatcacctc ccacaacgaa gactacacca 300
tcgttgaaca gtacgaacgt gctgaaggtc gtcactccac cggtgcttaa taacgctgat 360
agtgctagtg tagatcgcta ctagagccag gcatcaaata aaacgaaagg ctcagtcgaa 420
agactgggcc tttcgtttta tctgttgttt gtcggtgaac gctctctact agagtggtct 480
catgagcgag acgtccggca tccgcttaca gacaagctgt gacagtctcc gggagctgca 540
tgtgtcagag gttttcaccg tcatcaccga aacgcgcgag actaaagggc ctcgtgatac 600
gcctattttt ataggttaat gtcatgataa taatggtttc ttaggacgga tcgcttgcct 660
gtaacttaca cgcgcctcgt atcttttaat gatggaataa tttgggaatt tactctgtgt 720
ttatttattt ttatgttttg tatttggatt ttagaaagta aataaagaag gtagaagagt 780
tacggaatga agaaaaaaaa ataaacaaag gtttaaaaaa tttcaacaaa aagcgtactt 840
tacatatata tttattagac aagaaaagca gattaaatag atatacattc gattaacgat 900
aagtaaaatg taaaatcaca ggattttcgt gtgtggtctt ctacacagac aagatgaaac 960
aattcggcat taatacctga gagcaggaag agcaagataa aaggtagtat ttgttggcga 1020
tccccctaga gtcttttaca tcttcggaaa acaaaaacta ttttttcttt aatttctttt 1080
tttactttct atttttaatt tatatattta tattaaaaaa tttaaattat aattattttt 1140
atagcacgtg atgaaaagga cccaggtggc attgacttga tcggcacgta agaggttcca 1200
actttcacca taatgaaata agatcactac cgggcgtatt ttttgagtta tcgagatttt 1260
caggagctaa ggaagctaaa atgagccata ttcaacggga aacgtcttgc tcgaggccgc 1320
gattaaattc caacatggat gctgatttat atgggtataa atgggctcgc gataatgtcg 1380
ggcaatcagg tgcgacaatc tatcgattgt atgggaagcc cgatgcgcca gagttgtttc 1440
tgaaacatgg caaaggtagc gttgccaatg atgttacaga tgagatggtc aggctaaact 1500
ggctgacgga atttatgcct cttccgacca tcaagcattt tatccgtact cctgatgatg 1560
catggttact caccactgcg atcccaggga aaacagcatt ccaggtatta gaagaatatc 1620
ctgattcagg tgaaaatatt gttgatgcgc tggcagtgtt cctgcgccgg ttgcattcga 1680
ttcctgtttg taattgtcct tttaacggcg atcgcgtatt tcgtctcgca caggcgcaat 1740
cacgaatgaa taacggtttg gttggtgcga gtgattttga tgacgagcgt aatggctggc 1800
ctgttgaaca agtctggaaa gaaatgcata agcttttgcc attctcaccg gattcagtcg 1860
tcactcatgg tgatttctca cttgataacc ttatttttga cgaggggaaa ttaataggtt 1920
gtattgatgt tggacgagtc ggaatcgcag accgatacca ggatcttgcc atcctatgga 1980
actgcctcgg tgagttttct ccttcattac agaaacggct ttttcaaaaa tatggtattg 2040
ataatcctga tatgaataaa ttgcagtttc acttgatgct cgatgagttt ttctaatgag 2100
ggcccaaatg taatcacctg gctcaccttc gggtgggcct ttctgcgttg ctggcgtttt 2160
tccataggct ccgcccccct gacgagcatc acaaaaatcg atgctcaagt cagaggtggc 2220
gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 2280
ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 2340
tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 2400
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 2460
atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 2520
acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 2580
actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 2640
cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt 2700
ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat 2760
tttctaccga actgtgcggt atttcacacc gcatagatcc gtcgagttca agagaaaaaa 2820
aaagaaaaag caaaaagaaa aaaggaaagc gcgcctcgtt cagaatgaca cgtatagaat 2880
gatgcattac cttgtcatct tcagtatcat actgttcgta tacatactta ctgacattca 2940
taggtataca tatatacaca tgtatatata tcgtatgctg cagctttaaa taatcggtgt 3000
cactacataa gaacaccttt ggtggaggga acatcgttgg taccattggg cgaggtggct 3060
tctcttatgg caaccgcaag agccttgaac gcactctcac tacggtgatg atcattcttg 3120
cctcgcagac aatcaacgtg gagggtaatt ctgctagcct ctgcaaagct ttcaagaaaa 3180
tgcgggatca tctcgcaaga gagatctcct actttctccc tttgcaaacc aagttcgaca 3240
actgcgtacg gcctgttcga aagatctacc accgctctgg aaagtgcctc atccaaaggc 3300
gcaaatcctg atccaaacct ttttactcca cgcacggccc ctagggcctc tttaaaagct 3360
tgaccgagag caatcccgca gtcttcagtg gtgtgatggt cgtctatgtg taagtcacca 3420
atgcactcaa cgattagcga ccagccggaa tgcttggcca gagcatgtat catatggtcc 3480
agaaacccta tacctgtgtg gacgttaatc acttgcgatt gtgtggcctg ttctgctact 3540
gcttctgcct ctttttctgg gaagatcgag tgctctatcg ctaggggacc accctttaaa 3600
gagatcgcaa tctgaatctt ggtttcattt gtaatacgct ttactagggc tttctgctct 3660
gtcatctttg ccttcgttta tcttgcctgc tcatttttta gtatattctt cgaagaaatc 3720
acattacttt atataatgta taattcatta tgtgataatg ccaatcgcta agaaaaaaaa 3780
agagtcatcc gctaggtgga aaaaaaaaaa tgaaaatcat taccgaggca taaaaaaata 3840
tagagtgtac tagaggaggc caagagtaat agaaaaagaa aattgcggga aaggactgtg 3900
ttatgacttc cctgactaat gccgacgtct cgacctcgag accgcaatac gcaaaccgcc 3960
tctccccgcg cgttggccga ttcattaatg cagctggcac gacaggtttc ccgactggaa 4020
agcgggcagt gagcgcaacg caattaatgt gagttagctc actcattagg caccccaggc 4080
tttacacttt atgcttccgg ctcgtatgtt gtgtggaatt gtgagcggat aacaatttca 4140
cacatactag agaaagagga gaaatactag atggcttcct ccgaagacgt tatcaaagag 4200
ttcatgcgtt tcaaagttcg tatggaaggt tccgttaacg gtcacgagtt cgaaatcgaa 4260
ggtgaaggtg aaggtcgtcc gtacgaaggt acccagaccg ctaaactgaa agttaccaaa 4320
ggtggtccgc tgccgttcgc ttgggacatc ctgtccccgc agttccagta cggttccaaa 4380
gcttacgtta aacacccggc tgacatcccg gactacctga aactgtcctt cccggaaggt 4440
ttcaaatggg aacgtgttat gaacttcgaa gacggtggtg ttgttaccgt tacccagg 4498
<210> 64
<211> 6033
<212> DNA
<213> Artificial Sequence
<220>
<223> pCKH-Cso-BMC plasmid
<400> 64
tgagcgagac gtccggcatc cgcttacaga caagctgtga cagtctccgg gagctgcatg 60
tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgagac taaagggcct cgtgatacgc 120
ctatttttat aggttaatgt catgataata atggtttctt aggacggatc gcttgcctgt 180
aacttacacg cgcctcgtat cttttaatga tggaataatt tgggaattta ctctgtgttt 240
atttattttt atgttttgta tttggatttt agaaagtaaa taaagaaggt agaagagtta 300
cggaatgaag aaaaaaaaat aaacaaaggt ttaaaaaatt tcaacaaaaa gcgtacttta 360
catatatatt tattagacaa gaaaagcaga ttaaatagat atacattcga ttaacgataa 420
gtaaaatgta aaatcacagg attttcgtgt gtggtcttct acacagacaa gatgaaacaa 480
ttcggcatta atacctgaga gcaggaagag caagataaaa ggtagtattt gttggcgatc 540
cccctagagt cttttacatc ttcggaaaac aaaaactatt ttttctttaa tttctttttt 600
tactttctat ttttaattta tatatttata ttaaaaaatt taaattataa ttatttttat 660
agcacgtgat gaaaaggacc caggtggcat tgacttgatc ggcacgtaag aggttccaac 720
tttcaccata atgaaataag atcactaccg ggcgtatttt ttgagttatc gagattttca 780
ggagctaagg aagctaaaat gagccatatt caacgggaaa cgtcttgctc gaggccgcga 840
ttaaattcca acatggatgc tgatttatat gggtataaat gggctcgcga taatgtcggg 900
caatcaggtg cgacaatcta tcgattgtat gggaagcccg atgcgccaga gttgtttctg 960
aaacatggca aaggtagcgt tgccaatgat gttacagatg agatggtcag gctaaactgg 1020
ctgacggaat ttatgcctct tccgaccatc aagcatttta tccgtactcc tgatgatgca 1080
tggttactca ccactgcgat cccagggaaa acagcattcc aggtattaga agaatatcct 1140
gattcaggtg aaaatattgt tgatgcgctg gcagtgttcc tgcgccggtt gcattcgatt 1200
cctgtttgta attgtccttt taacggcgat cgcgtatttc gtctcgcaca ggcgcaatca 1260
cgaatgaata acggtttggt tggtgcgagt gattttgatg acgagcgtaa tggctggcct 1320
gttgaacaag tctggaaaga aatgcataag cttttgccat tctcaccgga ttcagtcgtc 1380
actcatggtg atttctcact tgataacctt atttttgacg aggggaaatt aataggttgt 1440
attgatgttg gacgagtcgg aatcgcagac cgataccagg atcttgccat cctatggaac 1500
tgcctcggtg agttttctcc ttcattacag aaacggcttt ttcaaaaata tggtattgat 1560
aatcctgata tgaataaatt gcagtttcac ttgatgctcg atgagttttt ctaatgaggg 1620
cccaaatgta atcacctggc tcaccttcgg gtgggccttt ctgcgttgct ggcgtttttc 1680
cataggctcc gcccccctga cgagcatcac aaaaatcgat gctcaagtca gaggtggcga 1740
aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct 1800
cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg 1860
gcgctttctc atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag 1920
ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat 1980
cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac 2040
aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac 2100
tacggctaca ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttacctcg 2160
gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt 2220
ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgattt 2280
tctaccgaac tgtgcggtat ttcacaccgc atagatccgt cgagttcaag agaaaaaaaa 2340
agaaaaagca aaaagaaaaa aggaaagcgc gcctcgttca gaatgacacg tatagaatga 2400
tgcattacct tgtcatcttc agtatcatac tgttcgtata catacttact gacattcata 2460
ggtatacata tatacacatg tatatatatc gtatgctgca gctttaaata atcggtgtca 2520
ctacataaga acacctttgg tggagggaac atcgttggta ccattgggcg aggtggcttc 2580
tcttatggca accgcaagag ccttgaacgc actctcacta cggtgatgat cattcttgcc 2640
tcgcagacaa tcaacgtgga gggtaattct gctagcctct gcaaagcttt caagaaaatg 2700
cgggatcatc tcgcaagaga gatctcctac tttctccctt tgcaaaccaa gttcgacaac 2760
tgcgtacggc ctgttcgaaa gatctaccac cgctctggaa agtgcctcat ccaaaggcgc 2820
aaatcctgat ccaaaccttt ttactccacg cacggcccct agggcctctt taaaagcttg 2880
accgagagca atcccgcagt cttcagtggt gtgatggtcg tctatgtgta agtcaccaat 2940
gcactcaacg attagcgacc agccggaatg cttggccaga gcatgtatca tatggtccag 3000
aaaccctata cctgtgtgga cgttaatcac ttgcgattgt gtggcctgtt ctgctactgc 3060
ttctgcctct ttttctggga agatcgagtg ctctatcgct aggggaccac cctttaaaga 3120
gatcgcaatc tgaatcttgg tttcatttgt aatacgcttt actagggctt tctgctctgt 3180
catctttgcc ttcgtttatc ttgcctgctc attttttagt atattcttcg aagaaatcac 3240
attactttat ataatgtata attcattatg tgataatgcc aatcgctaag aaaaaaaaag 3300
agtcatccgc taggtggaaa aaaaaaaatg aaaatcatta ccgaggcata aaaaaatata 3360
gagtgtacta gaggaggcca agagtaatag aaaaagaaaa ttgcgggaaa ggactgtgtt 3420
atgacttccc tgactaatgc cgacgtctcg acctggctgg cttcccaacc ttaccagagg 3480
gcgccccagc tggcaattcc gacgtcttta tggctagctc agtcctaggt acaatgctag 3540
cgaattcaaa agatctttta agaaggagat atacatgatg aagatcatgc aagttgaaaa 3600
gactttggtt tctaccaaca gaattgctga tatgggtcac aagccattgt tggttgtttg 3660
ggaaaaacct ggtgctccaa gacaagttgc tgttgatgct attggttgta ttccaggtga 3720
ctgggttttg tgtgttggtt cttctgctgc cagagaagct gctggttcca agtcttaccc 3780
atctgatttg actatcatcg gtattattga ccaatggaac ggtgaaggtt cttcttggtc 3840
acatccacaa tttgaaaagt agctaacaaa gcccgaaagg aagctgagtt ggctgctgcc 3900
accgctgagc aataactagc ataacccctt ggggcctcta aacgggtctt gaggggtttt 3960
ttgctgaaag gaggaactat atccggatat cccgcaagag gcccggcagt acccctcagg 4020
cggcttcact gcccgctttc cagtcgggaa acctgtcgtg ccagctgcat taatgaatcg 4080
gccaacgcgc ggggagaggc ggtttgcgta ttgggcgcca gggtggtttt tcttttcacc 4140
agtgaaacgg gcaacagctg attgcccttc accgcctggc cctgagagag ttgcagcaag 4200
cggtccacgc tggtttgccc cagcaggcga aaatcctgtt tgatggtggt taacggcggg 4260
atataacatg agctgtcttc ggtatcgtcg tatcccacta ccgagatatc cgcaccaacg 4320
cgcagcccgg actcggtaat ggcgcgcatt gcgcccagcg ccatctgatc gttggcaacc 4380
agcatcgcag tgggaacgat gccctcattc agcatttgca tggtttgttg aaaaccggac 4440
atggcactcc agtcgccttc ccgttccgct atcggctgaa tttgattgcg agtgagatat 4500
ttatgccagc cagccagacg cagacgcgcc gagacagaac ttaatgggcc cgctaacagc 4560
gcgatttgct ggtgacccaa tgcgaccaga tgctccacgc ccagtcgcgt accgtcttca 4620
tgggagaaaa taatactgtt gatgggtgtc tggtcagaga catcaagaaa taacgccgga 4680
acattagtgc aggcagcttc cacagcaatg gcatcctggt catccagcgg atagttaatg 4740
atcagcccac tgacgcgttg cgcgagaaga ttgtgcaccg ccgctttaca ggcttcgacg 4800
ccgcttcgtt ctaccatcga caccaccacg ctggcaccca gttgatcggc gcgagattta 4860
atcgccgcga caatttgcga cggcgcgtgc agggccagac tggaggtggc aacgccaatc 4920
agcaacgact gtttgcccgc cagttgttgt gccacgcggt tgggaatgta attcagctcc 4980
gccatcgccg cttccacttt ttcccgcgtt ttcgcagaaa cgtggctggc ctggttcacc 5040
acgcgggaaa cggtctgata agagacaccg gcatactctg cgacatcgta taacgttact 5100
ggtttcacat tcaccaccct gaattgactc tcttccgggc gctatcatgc cataccgcga 5160
aaggttttgc gccattcgat ggtgtccggg atctcgacgc tctcccttat gcgactcctg 5220
cattaggaag cagcccagta gtaggttgag gccgttgagc accgccgccg caaggaatgg 5280
tgcatgcaag gagatggcgc ccaacagtcc cccggccacg gggcctgcca ccatacccac 5340
gccgaaacaa gcgctcatga gcccgaagtg gcgagcccga tcttccccat cggtgatgtc 5400
ggcgatatag gcgccagcaa ccgcacctgt ggcgccggtg atgccggcca cgatgcgtcc 5460
ggcgtagagg atcgagatct cgatcccgcg aaattaatac gactcactat aggggaattg 5520
tgagcggata acaattcccc tctagaaata attttgttta actttaagaa ggagatatac 5580
gatggctgat gttactggta ttgctttggg tatgattgaa actagaggtt tggttccagc 5640
tatcgaagct gctgacgcta tgaccaaggc cgctgaagtc agattggtcg gtagacaatt 5700
tgttggaggt ggttacgtca ctgttttggt tcgtggtgaa accggtgccg ttaacgctgc 5760
tgttagagct ggtgctgatg cttgtgaaag agttggtgac ggtttagttg ctgcccacat 5820
tattgccaga gtccactctg aagttgaaaa cattttgcca aaggctccac aggcttagct 5880
aacaaagccc gaaaggaagc tgagttggct gctgccaccg ctgagcaata actagcataa 5940
ccccttgggg cctctaaacg ggtcttgagg ggttttttgc tgaaaggagg aactatatcc 6000
ggatatcccg caagaggccc ggcagtaccc ctc 6033
<210> 65
<211> 2246
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-TDH3 plasmid
<400> 65
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctacagttt attcctggca tccactaaat ataatggagc ccgcttttta 120
agctggcatc cagaaaaaaa aagaatccca gcaccaaaat attgttttct tcaccaacca 180
tcagttcata ggtccattct cttagcgcaa ctacagagaa caggggcaca aacaggcaaa 240
aaacgggcac aacctcaatg gagtgatgca acctgcctgg agtaaatgat gacacaaggc 300
aattgaccca cgcatgtatc tatctcattt tcttacacct tctattacct tctgctctct 360
ctgatttgga aaaagctgaa aaaaaaggtt gaaaccagtt ccctgaaatt attcccctac 420
ttgactaata agtatataaa gacggtaggt attgattgta attctgtaaa tctatttctt 480
aaacttctta aattctactt ttatagttag tctttttttt agttttaaaa caccaagaac 540
ttagtttcga ataaacacac ataaacaaac aaagatgcga gacgactgac catttaaatc 600
atacctgacc tccatagcag aaagtcaaaa gcctccgacc ggaggctttt gacttgatcg 660
gcacgtaaga ggttccaact ttcaccataa tgaaataaga tcactaccgg gcgtattttt 720
tgagttatcg agattttcag gagctaagga agctaaaatg agccatattc aacgggaaac 780
gtcttgctcg aggccgcgat taaattccaa catggatgct gatttatatg ggtataaatg 840
ggctcgcgat aatgtcgggc aatcaggtgc gacaatctat cgattgtatg ggaagcccga 900
tgcgccagag ttgtttctga aacatggcaa aggtagcgtt gccaatgatg ttacagatga 960
gatggtcagg ctaaactggc tgacggaatt tatgcctctt ccgaccatca agcattttat 1020
ccgtactcct gatgatgcat ggttactcac cactgcgatc ccagggaaaa cagcattcca 1080
ggtattagaa gaatatcctg attcaggtga aaatattgtt gatgcgctgg cagtgttcct 1140
gcgccggttg cattcgattc ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg 1200
tctcgctcag gcgcaatcac gaatgaataa cggtttggtt ggtgcgagtg attttgatga 1260
cgagcgtaat ggctggcctg ttgaacaagt ctggaaagaa atgcataagc ttttgccatt 1320
ctcaccggat tcagtcgtca ctcatggtga tttctcactt gataacctta tttttgacga 1380
ggggaaatta ataggttgta ttgatgttgg acgagtcgga atcgcagacc gataccagga 1440
tcttgccatc ctatggaact gcctcggtga gttttctcct tcattacaga aacggctttt 1500
tcaaaaatat ggtattgata atcctgatat gaataaattg cagtttcact tgatgctcga 1560
tgagtttttc taatgagggc ccaaatgtaa tcacctggct caccttcggg tgggcctttc 1620
tgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgatg 1680
ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg 1740
aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt 1800
tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt 1860
gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg 1920
cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact 1980
ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt 2040
cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct 2100
gctgaagcca gttacctcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc 2160
gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct 2220
caagaagatc ctttgatttt ctaccg 2246
<210> 66
<211> 2246
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-YEF3 plasmid
<400> 66
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctattaaaa aaacaactta caatcattgt tcgccccttc catacttact 120
gccactcgca aaagggccca accagggcaa ttacgtatca aaaaatcatg acaggctggg 180
taataaatat tcgtgaagaa agaagaaatt aaaaaaagaa acgaagaagc aaaaaaaaga 240
aaagactccg tttaatcact ttcaaccgcg gtttatccgg ccccacccat gcataaccct 300
aaattattag atcacttagc acgtgaaaaa gaaacgtttt taatgttttt tttttttttt 360
tctttttctt tttttgcgtt ggtgaaaatt ttttcgcttc ctcgagtata attatctcat 420
ctcatctttc atataagata agaagtttta taaaaacctt ttgcatcaaa attttgtaga 480
atatctcttt ttcttacgct ctctttcttt ccttaattgt tttctaaaga accgtgtatt 540
tttctagttc gaatccatcg ataacattaa aaggatgcga gacgactgac catttaaatc 600
atacctgacc tccatagcag aaagtcaaaa gcctccgacc ggaggctttt gacttgatcg 660
gcacgtaaga ggttccaact ttcaccataa tgaaataaga tcactaccgg gcgtattttt 720
tgagttatcg agattttcag gagctaagga agctaaaatg agccatattc aacgggaaac 780
gtcttgctcg aggccgcgat taaattccaa catggatgct gatttatatg ggtataaatg 840
ggctcgcgat aatgtcgggc aatcaggtgc gacaatctat cgattgtatg ggaagcccga 900
tgcgccagag ttgtttctga aacatggcaa aggtagcgtt gccaatgatg ttacagatga 960
gatggtcagg ctaaactggc tgacggaatt tatgcctctt ccgaccatca agcattttat 1020
ccgtactcct gatgatgcat ggttactcac cactgcgatc ccagggaaaa cagcattcca 1080
ggtattagaa gaatatcctg attcaggtga aaatattgtt gatgcgctgg cagtgttcct 1140
gcgccggttg cattcgattc ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg 1200
tctcgctcag gcgcaatcac gaatgaataa cggtttggtt ggtgcgagtg attttgatga 1260
cgagcgtaat ggctggcctg ttgaacaagt ctggaaagaa atgcataagc ttttgccatt 1320
ctcaccggat tcagtcgtca ctcatggtga tttctcactt gataacctta tttttgacga 1380
ggggaaatta ataggttgta ttgatgttgg acgagtcgga atcgcagacc gataccagga 1440
tcttgccatc ctatggaact gcctcggtga gttttctcct tcattacaga aacggctttt 1500
tcaaaaatat ggtattgata atcctgatat gaataaattg cagtttcact tgatgctcga 1560
tgagtttttc taatgagggc ccaaatgtaa tcacctggct caccttcggg tgggcctttc 1620
tgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgatg 1680
ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg 1740
aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt 1800
tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt 1860
gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg 1920
cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact 1980
ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt 2040
cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct 2100
gctgaagcca gttacctcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc 2160
gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct 2220
caagaagatc ctttgatttt ctaccg 2246
<210> 67
<211> 2246
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-PYK1 plasmid
<400> 67
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctacagatt gggagatttt catagtagaa ttcagcatga tagctacgta 120
aatgtgttcc gcaccgtcac aaagtgtttt ctactgttct ttcttctttc gttcattcag 180
ttgagttgag tgagtgcttt gttcaatgga tcttagctaa aatgcatatt ttttctcttg 240
gtaaatgaat gcttgtgatg tcttccaagt gatttccttt ccttcccata tgatgctagg 300
tacctttagt gtcttcctaa aaaaaaaaaa aggctcgcca tcaaaacgat attcgttggc 360
ttttttttct gaattataaa tactctttgg taacttttca tttccaagaa cctctttttt 420
ccagttatat catggtcccc tttcaaagtt attctctact ctttttcata ttcattcttt 480
ttcatccttt ggttttttat tcttaacttg tttattattc tctcttgttt ctatttacaa 540
gacaccaatc aaaacaaata aaacatcatc acagatgcga gacgactgac catttaaatc 600
atacctgacc tccatagcag aaagtcaaaa gcctccgacc ggaggctttt gacttgatcg 660
gcacgtaaga ggttccaact ttcaccataa tgaaataaga tcactaccgg gcgtattttt 720
tgagttatcg agattttcag gagctaagga agctaaaatg agccatattc aacgggaaac 780
gtcttgctcg aggccgcgat taaattccaa catggatgct gatttatatg ggtataaatg 840
ggctcgcgat aatgtcgggc aatcaggtgc gacaatctat cgattgtatg ggaagcccga 900
tgcgccagag ttgtttctga aacatggcaa aggtagcgtt gccaatgatg ttacagatga 960
gatggtcagg ctaaactggc tgacggaatt tatgcctctt ccgaccatca agcattttat 1020
ccgtactcct gatgatgcat ggttactcac cactgcgatc ccagggaaaa cagcattcca 1080
ggtattagaa gaatatcctg attcaggtga aaatattgtt gatgcgctgg cagtgttcct 1140
gcgccggttg cattcgattc ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg 1200
tctcgctcag gcgcaatcac gaatgaataa cggtttggtt ggtgcgagtg attttgatga 1260
cgagcgtaat ggctggcctg ttgaacaagt ctggaaagaa atgcataagc ttttgccatt 1320
ctcaccggat tcagtcgtca ctcatggtga tttctcactt gataacctta tttttgacga 1380
ggggaaatta ataggttgta ttgatgttgg acgagtcgga atcgcagacc gataccagga 1440
tcttgccatc ctatggaact gcctcggtga gttttctcct tcattacaga aacggctttt 1500
tcaaaaatat ggtattgata atcctgatat gaataaattg cagtttcact tgatgctcga 1560
tgagtttttc taatgagggc ccaaatgtaa tcacctggct caccttcggg tgggcctttc 1620
tgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgatg 1680
ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg 1740
aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt 1800
tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt 1860
gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg 1920
cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact 1980
ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt 2040
cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct 2100
gctgaagcca gttacctcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc 2160
gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct 2220
caagaagatc ctttgatttt ctaccg 2246
<210> 68
<211> 2040
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-H plasmid
<400> 68
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atggctgatg ctttgggtat gattgaagtt agaggtttcg ttggtatggt 120
tgaagctgct gatgctatgg ttaaggctgc taaagttgaa ttgatcggtt acgaaaaaac 180
tggtggtggt tatgttactg ctgttgttag aggtgatgtt gctgctgtaa aagctgctac 240
tgaagctggt caaagggctg ctgaaagagt tggagaagtt gttgctgttc atgttattcc 300
aagaccacat gttaatgttg atgctgcttt gccattgggt agaactccag gtatggataa 360
gtctgcttag ccgagacgac tgaccattta aatcatacct gacctccata gcagaaagtc 420
aaaagcctcc gaccggaggc ttttgacttg atcggcacgt aagaggttcc aactttcacc 480
ataatgaaat aagatcacta ccgggcgtat tttttgagtt atcgagattt tcaggagcta 540
aggaagctaa aatgagccat attcaacggg aaacgtcttg ctcgaggccg cgattaaatt 600
ccaacatgga tgctgattta tatgggtata aatgggctcg cgataatgtc gggcaatcag 660
gtgcgacaat ctatcgattg tatgggaagc ccgatgcgcc agagttgttt ctgaaacatg 720
gcaaaggtag cgttgccaat gatgttacag atgagatggt caggctaaac tggctgacgg 780
aatttatgcc tcttccgacc atcaagcatt ttatccgtac tcctgatgat gcatggttac 840
tcaccactgc gatcccaggg aaaacagcat tccaggtatt agaagaatat cctgattcag 900
gtgaaaatat tgttgatgcg ctggcagtgt tcctgcgccg gttgcattcg attcctgttt 960
gtaattgtcc ttttaacggc gatcgcgtat ttcgtctcgc tcaggcgcaa tcacgaatga 1020
ataacggttt ggttggtgcg agtgattttg atgacgagcg taatggctgg cctgttgaac 1080
aagtctggaa agaaatgcat aagcttttgc cattctcacc ggattcagtc gtcactcatg 1140
gtgatttctc acttgataac cttatttttg acgaggggaa attaataggt tgtattgatg 1200
ttggacgagt cggaatcgca gaccgatacc aggatcttgc catcctatgg aactgcctcg 1260
gtgagttttc tccttcatta cagaaacggc tttttcaaaa atatggtatt gataatcctg 1320
atatgaataa attgcagttt cacttgatgc tcgatgagtt tttctaatga gggcccaaat 1380
gtaatcacct ggctcacctt cgggtgggcc tttctgcgtt gctggcgttt ttccataggc 1440
tccgcccccc tgacgagcat cacaaaaatc gatgctcaag tcagaggtgg cgaaacccga 1500
caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc 1560
cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt 1620
ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct 1680
gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg 1740
agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta 1800
gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct 1860
acactagaag aacagtattt ggtatctgcg ctctgctgaa gccagttacc tcggaaaaag 1920
agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg 1980
caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga ttttctaccg 2040
<210> 69
<211> 2031
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-P plasmid
<400> 69
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atggttttag gtaaagttgt cggtactgtt gttgcatcaa gaaaggaacc 120
aagaattgaa ggtttatctt tattattggt tagagcttgt gatccagatg gtactccaac 180
tggtggtgct gttgtttgtg ctgatgctgt tggtgctggt gttggtgaag ttgttttata 240
tgcttctggt tcttctgcta gacaaactga agttactaat aatagaccag ttgatgctac 300
tattatggct attgttgatt tggttgaaat gggtggtgat gttagattta gaaaagatta 360
gccgagacga ctgaccattt aaatcatacc tgacctccat agcagaaagt caaaagcctc 420
cgaccggagg cttttgactt gatcggcacg taagaggttc caactttcac cataatgaaa 480
taagatcact accgggcgta ttttttgagt tatcgagatt ttcaggagct aaggaagcta 540
aaatgagcca tattcaacgg gaaacgtctt gctcgaggcc gcgattaaat tccaacatgg 600
atgctgattt atatgggtat aaatgggctc gcgataatgt cgggcaatca ggtgcgacaa 660
tctatcgatt gtatgggaag cccgatgcgc cagagttgtt tctgaaacat ggcaaaggta 720
gcgttgccaa tgatgttaca gatgagatgg tcaggctaaa ctggctgacg gaatttatgc 780
ctcttccgac catcaagcat tttatccgta ctcctgatga tgcatggtta ctcaccactg 840
cgatcccagg gaaaacagca ttccaggtat tagaagaata tcctgattca ggtgaaaata 900
ttgttgatgc gctggcagtg ttcctgcgcc ggttgcattc gattcctgtt tgtaattgtc 960
cttttaacgg cgatcgcgta tttcgtctcg ctcaggcgca atcacgaatg aataacggtt 1020
tggttggtgc gagtgatttt gatgacgagc gtaatggctg gcctgttgaa caagtctgga 1080
aagaaatgca taagcttttg ccattctcac cggattcagt cgtcactcat ggtgatttct 1140
cacttgataa ccttattttt gacgagggga aattaatagg ttgtattgat gttggacgag 1200
tcggaatcgc agaccgatac caggatcttg ccatcctatg gaactgcctc ggtgagtttt 1260
ctccttcatt acagaaacgg ctttttcaaa aatatggtat tgataatcct gatatgaata 1320
aattgcagtt tcacttgatg ctcgatgagt ttttctaatg agggcccaaa tgtaatcacc 1380
tggctcacct tcgggtgggc ctttctgcgt tgctggcgtt tttccatagg ctccgccccc 1440
ctgacgagca tcacaaaaat cgatgctcaa gtcagaggtg gcgaaacccg acaggactat 1500
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 1560
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct 1620
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 1680
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 1740
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 1800
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 1860
gaacagtatt tggtatctgc gctctgctga agccagttac ctcggaaaaa gagttggtag 1920
ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 1980
gattacgcgc agaaaaaaag gatctcaaga agatcctttg attttctacc g 2031
<210> 70
<211> 2358
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-T1 plasmid
<400> 70
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atggatcatg ctccagaaag atttgatgct actcctccag ctggtgaacc 120
agatagacca gctttgggtg ttttggaatt gacttctatt gctagaggta ttaccgttgc 180
tgatgctgct ttgaaaagag caccatcttt gttgttgatg tccagaccag tttcttccgg 240
taaacatttg ttgatgatga gaggtcaagt tgccgaagtt gaagaatcta tgattgctgc 300
tagagaaatt gctggtgctg gttctggtgc tttgttggat gaattggaat tgccatatgc 360
tcacgaacaa ctttggagat ttttggatgc tccagttgtt gcagatgctt gggaagaaga 420
tactgaatcc gttattatcg ttgaaaccgc tactgtttgt gctgctattg attctgctga 480
tgcagcctta aaaactgctc ctgttgtttt gagagatatg agattggcta ttggtattgc 540
tggtaaggct ttctttactt tgactggtga attggctgat gttgaagctg ctgctgaagt 600
tgttagagaa agatgtggtg ctagattgct agaattggct tgtattgcaa gaccagttga 660
cgaattgaga ggtaggttgt ttttctagcc gagacgactg accatttaaa tcatacctga 720
cctccatagc agaaagtcaa aagcctccga ccggaggctt ttgacttgat cggcacgtaa 780
gaggttccaa ctttcaccat aatgaaataa gatcactacc gggcgtattt tttgagttat 840
cgagattttc aggagctaag gaagctaaaa tgagccatat tcaacgggaa acgtcttgct 900
cgaggccgcg attaaattcc aacatggatg ctgatttata tgggtataaa tgggctcgcg 960
ataatgtcgg gcaatcaggt gcgacaatct atcgattgta tgggaagccc gatgcgccag 1020
agttgtttct gaaacatggc aaaggtagcg ttgccaatga tgttacagat gagatggtca 1080
ggctaaactg gctgacggaa tttatgcctc ttccgaccat caagcatttt atccgtactc 1140
ctgatgatgc atggttactc accactgcga tcccagggaa aacagcattc caggtattag 1200
aagaatatcc tgattcaggt gaaaatattg ttgatgcgct ggcagtgttc ctgcgccggt 1260
tgcattcgat tcctgtttgt aattgtcctt ttaacggcga tcgcgtattt cgtctcgctc 1320
aggcgcaatc acgaatgaat aacggtttgg ttggtgcgag tgattttgat gacgagcgta 1380
atggctggcc tgttgaacaa gtctggaaag aaatgcataa gcttttgcca ttctcaccgg 1440
attcagtcgt cactcatggt gatttctcac ttgataacct tatttttgac gaggggaaat 1500
taataggttg tattgatgtt ggacgagtcg gaatcgcaga ccgataccag gatcttgcca 1560
tcctatggaa ctgcctcggt gagttttctc cttcattaca gaaacggctt tttcaaaaat 1620
atggtattga taatcctgat atgaataaat tgcagtttca cttgatgctc gatgagtttt 1680
tctaatgagg gcccaaatgt aatcacctgg ctcaccttcg ggtgggcctt tctgcgttgc 1740
tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga tgctcaagtc 1800
agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 1860
tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 1920
cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 1980
ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 2040
ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 2100
ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 2160
ggtggcctaa ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc 2220
cagttacctc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag 2280
cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga 2340
tcctttgatt ttctaccg 2358
<210> 71
<211> 2201
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_T-RPL41B plasmid
<400> 71
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgt agcgcggatt gagagcaaat cgttaagttc aggtcaagta aaaattgatt 120
tcgaaaacta atttctctta tacaatcctt tgattggacc gtcatccttt cgaatataag 180
attttgttaa gaatatttta gacagagatc tactttatat ttaatatcta gatattacat 240
aatttcctct ctaataaaat atcattaata aaataaaaat gaagcgattt gattttgtgt 300
tgtcaactta gtttgccgct atgcctcttg ggtaatgcta ttattgaatc gaagggcttt 360
attatattac cctttagctt attctgaggt ttctgtggcg tgcaaagtga tgaaccgggc 420
gggttttaag gataaaatca aaaagtgaaa aaatgaacgg aaaatggaat acctgtgaaa 480
tggagaatga taatgaatct ttctgtcgtg cttgaaagat tttcggctcc tccgagacga 540
ctgaccattt aaatcatacc tgacctccat agcagaaagt caaaagcctc cgaccggagg 600
cttttgactt gatcggcacg taagaggttc caactttcac cataatgaaa taagatcact 660
accgggcgta ttttttgagt tatcgagatt ttcaggagct aaggaagcta aaatgagcca 720
tattcaacgg gaaacgtctt gctcgaggcc gcgattaaat tccaacatgg atgctgattt 780
atatgggtat aaatgggctc gcgataatgt cgggcaatca ggtgcgacaa tctatcgatt 840
gtatgggaag cccgatgcgc cagagttgtt tctgaaacat ggcaaaggta gcgttgccaa 900
tgatgttaca gatgagatgg tcaggctaaa ctggctgacg gaatttatgc ctcttccgac 960
catcaagcat tttatccgta ctcctgatga tgcatggtta ctcaccactg cgatcccagg 1020
gaaaacagca ttccaggtat tagaagaata tcctgattca ggtgaaaata ttgttgatgc 1080
gctggcagtg ttcctgcgcc ggttgcattc gattcctgtt tgtaattgtc cttttaacgg 1140
cgatcgcgta tttcgtctcg ctcaggcgca atcacgaatg aataacggtt tggttggtgc 1200
gagtgatttt gatgacgagc gtaatggctg gcctgttgaa caagtctgga aagaaatgca 1260
taagcttttg ccattctcac cggattcagt cgtcactcat ggtgatttct cacttgataa 1320
ccttattttt gacgagggga aattaatagg ttgtattgat gttggacgag tcggaatcgc 1380
agaccgatac caggatcttg ccatcctatg gaactgcctc ggtgagtttt ctccttcatt 1440
acagaaacgg ctttttcaaa aatatggtat tgataatcct gatatgaata aattgcagtt 1500
tcacttgatg ctcgatgagt ttttctaatg agggcccaaa tgtaatcacc tggctcacct 1560
tcgggtgggc ctttctgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca 1620
tcacaaaaat cgatgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca 1680
ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 1740
atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag 1800
gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt 1860
tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca 1920
cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg 1980
cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa gaacagtatt 2040
tggtatctgc gctctgctga agccagttac ctcggaaaaa gagttggtag ctcttgatcc 2100
ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca gattacgcgc 2160
agaaaaaaag gatctcaaga agatcctttg attttctacc g 2201
<210> 72
<211> 2172
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_T-HBT1 plasmid
<400> 72
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgt agcacacttc tcgattaaca aattcccagt attctttgaa atctattttt 120
cttcctcaat tgaatttgaa taactgtcta cgcggactcc tcctatctac aactacaaca 180
aattttaacc actttattac cactttcctc tttcatttat ttttgtcttt tatgttgtca 240
atttactagt attttttttt ttttcattta cgttcaaggt tttttatact catttaactt 300
gtcttaggtt atttatatat atacctatat atttatatat atatatatat atgtatgtat 360
atattattat caccaaatga gaaataatag ctaatttgat ttttgattat ttaaaatatt 420
ggtttgttct ttctgcaaac atctcgtttg gtacgatatt agtgaaaaac gatgtaatta 480
tcaacacgtg cattacccac ctccgagacg actgaccatt taaatcatac ctgacctcca 540
tagcagaaag tcaaaagcct ccgaccggag gcttttgact tgatcggcac gtaagaggtt 600
ccaactttca ccataatgaa ataagatcac taccgggcgt attttttgag ttatcgagat 660
tttcaggagc taaggaagct aaaatgagcc atattcaacg ggaaacgtct tgctcgaggc 720
cgcgattaaa ttccaacatg gatgctgatt tatatgggta taaatgggct cgcgataatg 780
tcgggcaatc aggtgcgaca atctatcgat tgtatgggaa gcccgatgcg ccagagttgt 840
ttctgaaaca tggcaaaggt agcgttgcca atgatgttac agatgagatg gtcaggctaa 900
actggctgac ggaatttatg cctcttccga ccatcaagca ttttatccgt actcctgatg 960
atgcatggtt actcaccact gcgatcccag ggaaaacagc attccaggta ttagaagaat 1020
atcctgattc aggtgaaaat attgttgatg cgctggcagt gttcctgcgc cggttgcatt 1080
cgattcctgt ttgtaattgt ccttttaacg gcgatcgcgt atttcgtctc gctcaggcgc 1140
aatcacgaat gaataacggt ttggttggtg cgagtgattt tgatgacgag cgtaatggct 1200
ggcctgttga acaagtctgg aaagaaatgc ataagctttt gccattctca ccggattcag 1260
tcgtcactca tggtgatttc tcacttgata accttatttt tgacgagggg aaattaatag 1320
gttgtattga tgttggacga gtcggaatcg cagaccgata ccaggatctt gccatcctat 1380
ggaactgcct cggtgagttt tctccttcat tacagaaacg gctttttcaa aaatatggta 1440
ttgataatcc tgatatgaat aaattgcagt ttcacttgat gctcgatgag tttttctaat 1500
gagggcccaa atgtaatcac ctggctcacc ttcgggtggg cctttctgcg ttgctggcgt 1560
ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgatgctca agtcagaggt 1620
ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 1680
gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa 1740
gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct 1800
ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta 1860
actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg 1920
gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc 1980
ctaactacgg ctacactaga agaacagtat ttggtatctg cgctctgctg aagccagtta 2040
cctcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 2100
tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 2160
gattttctac cg 2172
<210> 73
<211> 2200
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_T-RPS20 plasmid
<400> 73
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgt agcaactaag ctggttctaa ctggaaataa tttccattag attcctcttt 120
ttctcgtcca ttaaccaaaa tatattattg aattcagcgg ttcctttttt ctcattttcg 180
catatagctg cactattaga atcagcccac tctaggtaaa cacagttcct cgatatacct 240
ctgtcttact atcagtggtt aaaccttatg caaatataat atatatatat atatatatat 300
ctcatacttt tgttgattct tgtgtaatta ttggaaaaga caaaacaaag caagcgtttc 360
tattcatcat atttacaagt atttttatga aaaactattt cttaattttc ccaccggcgg 420
ctttgaataa ggcaatgtca ttgtcctgca taatatattg tttgcctgca cgtttgataa 480
gtcccttaga ttttagtaaa gactcattta gcggtggttc catcttccct ccgagacgac 540
tgaccattta aatcatacct gacctccata gcagaaagtc aaaagcctcc gaccggaggc 600
ttttgacttg atcggcacgt aagaggttcc aactttcacc ataatgaaat aagatcacta 660
ccgggcgtat tttttgagtt atcgagattt tcaggagcta aggaagctaa aatgagccat 720
attcaacggg aaacgtcttg ctcgaggccg cgattaaatt ccaacatgga tgctgattta 780
tatgggtata aatgggctcg cgataatgtc gggcaatcag gtgcgacaat ctatcgattg 840
tatgggaagc ccgatgcgcc agagttgttt ctgaaacatg gcaaaggtag cgttgccaat 900
gatgttacag atgagatggt caggctaaac tggctgacgg aatttatgcc tcttccgacc 960
atcaagcatt ttatccgtac tcctgatgat gcatggttac tcaccactgc gatcccaggg 1020
aaaacagcat tccaggtatt agaagaatat cctgattcag gtgaaaatat tgttgatgcg 1080
ctggcagtgt tcctgcgccg gttgcattcg attcctgttt gtaattgtcc ttttaacggc 1140
gatcgcgtat ttcgtctcgc tcaggcgcaa tcacgaatga ataacggttt ggttggtgcg 1200
agtgattttg atgacgagcg taatggctgg cctgttgaac aagtctggaa agaaatgcat 1260
aagcttttgc cattctcacc ggattcagtc gtcactcatg gtgatttctc acttgataac 1320
cttatttttg acgaggggaa attaataggt tgtattgatg ttggacgagt cggaatcgca 1380
gaccgatacc aggatcttgc catcctatgg aactgcctcg gtgagttttc tccttcatta 1440
cagaaacggc tttttcaaaa atatggtatt gataatcctg atatgaataa attgcagttt 1500
cacttgatgc tcgatgagtt tttctaatga gggcccaaat gtaatcacct ggctcacctt 1560
cgggtgggcc tttctgcgtt gctggcgttt ttccataggc tccgcccccc tgacgagcat 1620
cacaaaaatc gatgctcaag tcagaggtgg cgaaacccga caggactata aagataccag 1680
gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga 1740
tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg 1800
tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga accccccgtt 1860
cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc ggtaagacac 1920
gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag gtatgtaggc 1980
ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag aacagtattt 2040
ggtatctgcg ctctgctgaa gccagttacc tcggaaaaag agttggtagc tcttgatccg 2100
gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 2160
gaaaaaaagg atctcaagaa gatcctttga ttttctaccg 2200
<210> 74
<211> 4531
<212> DNA
<213> Artificial Sequence
<220>
<223> pCKU plasmid
<400> 74
actcctccct gcaagacggt gagttcatct acaaagttaa actgcgtggt accaacttcc 60
cgtccgacgg tccggttatg cagaaaaaaa ccatgggttg ggaagcttcc accgaacgta 120
tgtacccgga agacggtgct ctgaaaggtg aaatcaaaat gcgtctgaaa ctgaaagacg 180
gtggtcacta cgacgctgaa gttaaaacca cctacatggc taaaaaaccg gttcagctgc 240
cgggtgctta caaaaccgac atcaaactgg acatcacctc ccacaacgaa gactacacca 300
tcgttgaaca gtacgaacgt gctgaaggtc gtcactccac cggtgcttaa taacgctgat 360
agtgctagtg tagatcgcta ctagagccag gcatcaaata aaacgaaagg ctcagtcgaa 420
agactgggcc tttcgtttta tctgttgttt gtcggtgaac gctctctact agagtggtct 480
catgagcgag acgtccggca tccgcttaca gacaagctgt gacaatctcc gggagctgca 540
tgtgtcagag gttttcaccg tcatcaccga aacgcgcgag attaaagggc ctcgtgatac 600
gcctattttt ataggttaat gtcatgataa taatggtttc ttagacggat cgcttgcctg 660
taacttacac gcgcctcgta tcttttaatg atggaataat ttgggaattt actctgtgtt 720
tatttatttt tatgttttgt atttggattt tagaaagtaa ataaagaagg tagaagagtt 780
acggaatgaa gaaaaaaaaa taaacaaagg tttaaaaaat ttcaacaaaa agcgtacttt 840
acatatatat ttattagaca agaaaagcag attaaataga tatacattcg attaacgata 900
agtaaaatgt aaaatcacag gattttcgtg tgtggtcttc tacacagaca agatgaaaca 960
attcggcatt aatacctgag agcaggaaga gcaagataaa aggtagtatt tgttggcgat 1020
ccccctagag tcttttacat cttcggaaaa caaaaactat tttttcttta atttcttttt 1080
ttactttcta tttttaattt atatatttat attaaaaaat ttaaattata attattttta 1140
tagcacgtga tgaaaaggac ccaggtggca ttgacttgat cggcacgtaa gaggttccaa 1200
ctttcaccat aatgaaataa gatcactacc gggcgtattt tttgagttat cgagattttc 1260
aggagctaag gaagctaaaa tgagccatat tcaacgggaa acgtcttgct cgaggccgcg 1320
attaaattcc aacatggatg ctgatttata tgggtataaa tgggctcgcg ataatgtcgg 1380
gcaatcaggt gcgacaatct atcgattgta tgggaagccc gatgcgccag agttgtttct 1440
gaaacatggc aaaggtagcg ttgccaatga tgttacagat gagatggtca ggctaaactg 1500
gctgacggaa tttatgcctc ttccgaccat caagcatttt atccgtactc ctgatgatgc 1560
atggttactc accactgcga tcccagggaa aacagcattc caggtattag aagaatatcc 1620
tgattcaggt gaaaatattg ttgatgcgct ggcagtgttc ctgcgccggt tgcattcgat 1680
tcctgtttgt aattgtcctt ttaacggcga tcgcgtattt cgtctcgcac aggcgcaatc 1740
acgaatgaat aacggtttgg ttggtgcgag tgattttgat gacgagcgta atggctggcc 1800
tgttgaacaa gtctggaaag aaatgcataa gcttttgcca ttctcaccgg attcagtcgt 1860
cactcatggt gatttctcac ttgataacct tatttttgac gaggggaaat taataggttg 1920
tattgatgtt ggacgagtcg gaatcgcaga ccgataccag gatcttgcca tcctatggaa 1980
ctgcctcggt gagttttctc cttcattaca gaaacggctt tttcaaaaat atggtattga 2040
taatcctgat atgaataaat tgcagtttca cttgatgctc gatgagtttt tctaatgagg 2100
gcccaaatgt aatcacctgg ctcaccttcg ggtgggcctt tctgcgttgc tggcgttttt 2160
ccataggctc cgcccccctg acgagcatca caaaaatcga tgctcaagtc agaggtggcg 2220
aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc 2280
tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt 2340
ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa 2400
gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta 2460
tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa 2520
caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa 2580
ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc cagttacctc 2640
ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt 2700
tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatt 2760
ttctaccgaa cgtttacaat ttcctgatgc ggtattttct ccttacgcat ctgtgcggta 2820
tttcacaccg catagggtaa taactgatat aattaaattg aagctctaat ttgtgagttt 2880
agtatacatg catttactta taatacagtt ttttagtttt gctggccgca tcttctcaaa 2940
tatgcttccc agcctgcttt tctgtaacgt tcaccctcta ccttagcatc ccttcccttt 3000
gcaaatagtc ctcttccaac aataataatg tcagatcctg tagacaccac atcatccacg 3060
gttctatact gttgacccaa tgcgtcaccc ttgtcatcta aacccacacc gggtgtcata 3120
atcaaccaat cgtaaccttc atctcttcca cccatgtctc tttgagcaat aaagccgata 3180
acaaaatctt tgtcgctctt cgcaatgtca acagtaccct tagtatattc tccagtagat 3240
agggagccct tgcatgacaa ttctgctaac atcaaaaggc ctctaggttc ctttgttact 3300
tcttctgccg cctgcttcaa accgctaaca atacctgggc ccaccacacc gtgtgcattc 3360
gtaatgtctg cccattctgc tattctgtat acacccgcag agtactgcaa tttgactgta 3420
ttaccaatgt cagcaaattt tctgtcttcg aagagtaaaa aattgtactt ggcggataat 3480
gcctttagcg gcttaactgt gccctccatg gaaaaatcag tcaagatatc cacatgtgtt 3540
tttagtaaac aaattttggg acctaatgct tcaactaact ccagtaattc cttggtggta 3600
cgaacatcca atgaagcaca caagtttgtt tgcttttcgt gcatgatatt aaatagcttg 3660
gcagcaacag gactaggatg agtagcagca cgttccttat atgtagcttt cgacatgatt 3720
tatcttcgtt tcctgcaggt ttttgttctg tgcagttggg ttaagaatac tgggcaattt 3780
catgtttctt caacactaca tatgcgtata tataccaatc taagtctgtg ctccttcctt 3840
cgttcttcct tctgttcgga gattaccgaa tcaaaaaaat ttcaaagaaa ccgaaatcaa 3900
aaaaaagaat aaaaaaaaaa tgatgaattg aattgaaaag ctgtggtatg gtgcactacg 3960
tctcgacctc gagaccgcaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta 4020
atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca acgcaattaa 4080
tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc cggctcgtat 4140
gttgtgtgga attgtgagcg gataacaatt tcacacatac tagagaaaga ggagaaatac 4200
tagatggctt cctccgaaga cgttatcaaa gagttcatgc gtttcaaagt tcgtatggaa 4260
ggttccgtta acggtcacga gttcgaaatc gaaggtgaag gtgaaggtcg tccgtacgaa 4320
ggtacccaga ccgctaaact gaaagttacc aaaggtggtc cgctgccgtt cgcttgggac 4380
atcctgtccc cgcagttcca gtacggttcc aaagcttacg ttaaacaccc ggctgacatc 4440
ccggactacc tgaaactgtc cttcccggaa ggtttcaaat gggaacgtgt tatgaacttc 4500
gaagacggtg gtgttgttac cgttacccag g 4531
<210> 75
<211> 6441
<212> DNA
<213> Artificial Sequence
<220>
<223> pGAU-YMRWΔ15 plasmid
<400> 75
gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc 60
ccctgacgag catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact 120
ataaagatac caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct 180
gccgcttacc ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag 240
ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca 300
cgaacccccc gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa 360
cccggtaaga cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc 420
gaggtatgta ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag 480
aaggacagta tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg 540
tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca 600
gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc 660
tgacgctcag tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag 720
gatcttcacc tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata 780
tgagtaaact tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat 840
ctgtctattt cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg 900
ggagggctta ccatctggcc ccagtgctgc aatgataccg cggctcccac gctcaccggc 960
tccagattta tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc 1020
aactttatcc gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc 1080
gccagttaat agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc 1140
gtcgtttggt atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc 1200
ccccatgttg tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa 1260
gttggccgca gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat 1320
gccatccgta agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata 1380
gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca 1440
tagcagaact ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag 1500
gatcttaccg ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc 1560
agcatctttt actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc 1620
aaaaaaggga ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata 1680
ttattgaagc atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta 1740
gaaaaataaa caaatagggg ttccgcgcac atttccccga aaagtgccac ggtctctgtc 1800
atcaatcaaa gcaacccaca aatcctaggc tgaatcatga tatcgatgga agcaatcaac 1860
aattttatca agaccgcacc aaagcacgac tatctgacag gcggagttca tcattctggt 1920
aatgtagacg tgttacaatt aagcggcaat aaagaagatg gtagtttagt atggaaccat 1980
acttttgttg atgtagacaa caatgtggta gctaagtttg aagacgctct cgaaaaactt 2040
gaaagtttgc accggcgctc atcctcatcc acaggcaatg aagaacacgc taacgtttaa 2100
ccgaggggag tcacttcata atgatgtgag aaataagtga atattgtaat aattgttggg 2160
actccattgt caacaaaagc tataatgtag gtatacagta tatactagaa gttctcctcg 2220
aggatcttgg aatccacaaa agggagtcga taaatctata taataaaaat tactttatct 2280
tctttcgttt tatacgttgt cgtttattat cctattacgt tatcaatctt cgcatttcag 2340
ctttcattag atttgatgac tgtttctcaa actttatgtc attttcttac accgctctct 2400
acctggctcg aagcacgcta gtaacatcag ctaacgaaag agttagaggc tcgctaaatc 2460
gcactgtcgg ggtcccttgg gtattttaca ctagcgtcag gacgactagc atgtgtcttt 2520
ccttccaggg gtatgcgggt gcgtggacaa atgagcagca tacgtattta ctcggcgtgc 2580
ctgctctctc gtatttctcc tggagatcaa ggaaatgttt catgtccaag cgaaaagccg 2640
ctctacggaa tggatctacg ttactgcctg cataaggaaa ccggtgtagc caaggacgaa 2700
agcgacccta ggttctaacc atcgactttg gcggaaaggt ttcactcagg aagcagacac 2760
tgattgacac ggtttagcag aacgtttgag gactaggtca aattgagtgg tttaatatcg 2820
gcatgtctgg ctttaaaatt cagtatagtg cgctgatcgg aaacgaatta aaaacacgag 2880
ttcccaaaac caggcgggct cgccacgcta atcgggatgc ataccacagc ttttcaattc 2940
aattcatcat ttttttttta ttcttttttt tgatttcggt ttctttgaaa tttttttgat 3000
tcggtaatct ccgaacagaa ggaagaacga aggaaggagc acagacttag attggtatat 3060
atacgcatat gtagtgttga agaaacatga aattgcccag tattcttaac ccaactgcac 3120
agaacaaaaa cctgcaggaa acgaagataa atcatgtcga aagctacata taaggaacgt 3180
gctgctactc atcctagtcc tgttgctgcc aagctattta atatcatgca cgaaaagcaa 3240
acaaacttgt gtgcttcatt ggatgttcgt accaccaagg aattactgga gttagttgaa 3300
gcattaggtc ccaaaatttg tttactaaaa acacatgtgg atatcttgac tgatttttcc 3360
atggagggca cagttaagcc gctaaaggca ttatccgcca agtacaattt tttactcttc 3420
gaagacagaa aatttgctga cattggtaat acagtcaaat tgcagtactc tgcgggtgta 3480
tacagaatag cagaatgggc agacattacg aatgcacacg gtgtggtggg cccaggtatt 3540
gttagcggtt tgaagcaggc ggcagaagaa gtaacaaagg aacctagagg ccttttgatg 3600
ttagcagaat tgtcatgcaa gggctcccta tctactggag aatatactaa gggtactgtt 3660
gacattgcga agagcgacaa agattttgtt atcggcttta ttgctcaaag agacatgggt 3720
ggaagagatg aaggttacga ttggttgatt atgacacccg gtgtgggttt agatgacaag 3780
ggtgacgcat tgggtcaaca gtatagaacc gtggatgatg tggtgtctac aggatctgac 3840
attattattg ttggaagagg actatttgca aagggaaggg atgctaaggt agagggtgaa 3900
cgttacagaa aagcaggctg ggaagcatat ttgagaagat gcggccagca aaactaaaaa 3960
actgtattat aagtaaatgc atgtatacta aactcacaaa ttagagcttc aatttaatta 4020
tatcagttat taccctatgc ggtgtgaaat accgcacaga tgcgtaagga gaaaataccg 4080
catcaggtag ccgctagtaa catcagctaa cgaaagagtt agaggctcgc taaatcgcac 4140
tgtcggggtc ccttgggtat tttacactag cgtcaggacg actagcatgt gtctttcctt 4200
ccaggggtat gcgggtgcgt ggacaaatga gcagcatacg tatttactcg gcgtgcctgc 4260
tctctcgtat ttctcctgga gatcaaggaa atgtttcatg tccaagcgaa aagccgctct 4320
acggaatgga tctacgttac tgcctgcata aggaaaccgg tgtagccaag gacgaaagcg 4380
accctaggtt ctaaccatcg actttggcgg aaaggtttca ctcaggaagc agacactgat 4440
tgacacggtt tagcagaacg tttgaggact aggtcaaatt gagtggttta atatcggcat 4500
gtctggcttt aaaattcagt atagtgcgct gatcggaaac gaattaaaaa cacgagttcc 4560
caaaaccagg cgggctcgcc acgctaatcg gtgcaccacc tcaggcagag aacctagaga 4620
cggcaatacg caaaccgcct ctccccgcgc gttggccgat tcattaatgc agctggcacg 4680
acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc aattaatgtg agttagctca 4740
ctcattaggc accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg 4800
tgagcggata acaatttcac acatactaga gaaagaggag aaatactaga tggcttcctc 4860
cgaagacgtt atcaaagagt tcatgcgttt caaagttcgt atggaaggtt ccgttaacgg 4920
tcacgagttc gaaatcgaag gtgaaggtga aggtcgtccg tacgaaggta cccagaccgc 4980
taaactgaaa gttaccaaag gtggtccgct gccgttcgct tgggacatcc tgtccccgca 5040
gttccagtac ggttccaaag cttacgttaa acacccggct gacatcccgg actacctgaa 5100
actgtccttc ccggaaggtt tcaaatggga acgtgttatg aacttcgaag acggtggtgt 5160
tgttaccgtt acccaggact cctccctgca agacggtgag ttcatctaca aagttaaact 5220
gcgtggtacc aacttcccgt ccgacggtcc ggttatgcag aaaaaaacca tgggttggga 5280
agcttccacc gaacgtatgt acccggaaga cggtgctctg aaaggtgaaa tcaaaatgcg 5340
tctgaaactg aaagacggtg gtcactacga cgctgaagtt aaaaccacct acatggctaa 5400
aaaaccggtt cagctgccgg gtgcttacaa aaccgacatc aaactggaca tcacctccca 5460
caacgaagac tacaccatcg ttgaacagta cgaacgtgct gaaggtcgtc actccaccgg 5520
tgcttaataa cgctgatagt gctagtgtag atcgctacta gagccaggca tcaaataaaa 5580
cgaaaggctc agtcgaaaga ctgggccttt cgttttatct gttgtttgtc ggtgaacgct 5640
ctctactaga gtcacactgg ctccgtctca tgagcgctca tggaaaatgc aaccgataaa 5700
ccattataaa tcttcgcggt tatctggcat tgttattaac caaaaaaatg ccggcctatt 5760
acaagctact gttcaataaa tattgttgta atgaagacgg tccaactgta caaatacagc 5820
aaactgtcat atataaggag tcttatgtga cagcacttgc gttattgtca gccggagtat 5880
gtctttgtcg cattctgggc tttttacttt ctgctcagaa ggaagtacga acaagaaaaa 5940
aaaatcacca atgcttccct tttcagtatt agtttcatat ttgtttacgt tcaaactcgt 6000
cgtttgcgcg ataacctcta aaaaagtcaa ttacgtaact atatcaatca gagaatgcaa 6060
aaagcactat cataaaaatg tgtctagggg atgtgagaca tgtcaattat aagaagtgat 6120
ggtgtcatag tatatatatc ataaaagatt atcaaagttt caatcctttg tattttctag 6180
tttagcgcca acttttgaca aaacctaaac tttagataat catcattctt acaattttta 6240
tctggatggc aataatctcc tatataaagc ccagataaac tgtaaaaaga atccatcact 6300
atttgaaaaa aagtcatctg gcacgtttaa ttatcagagc agaaatgatg aagggtgtta 6360
gcgccgtcca ctgatgtgcc tggtagtcat gatttacgta taactaacac atcatgagga 6420
cggcggctcg gagagaccga t 6441
<210> 76
<211> 7606
<212> DNA
<213> Artificial Sequence
<220>
<223> pGAU-YMRWΔ15-HO-BMC plasmid
<400> 76
tgagcgagac gtccggcatc cgcttacaga caagctgtga caatctccgg gagctgcatg 60
tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgagat taaagggcct cgtgatacgc 120
ctatttttat aggttaatgt catgataata atggtttctt agacggatcg cttgcctgta 180
acttacacgc gcctcgtatc ttttaatgat ggaataattt gggaatttac tctgtgttta 240
tttattttta tgttttgtat ttggatttta gaaagtaaat aaagaaggta gaagagttac 300
ggaatgaaga aaaaaaaata aacaaaggtt taaaaaattt caacaaaaag cgtactttac 360
atatatattt attagacaag aaaagcagat taaatagata tacattcgat taacgataag 420
taaaatgtaa aatcacagga ttttcgtgtg tggtcttcta cacagacaag atgaaacaat 480
tcggcattaa tacctgagag caggaagagc aagataaaag gtagtatttg ttggcgatcc 540
ccctagagtc ttttacatct tcggaaaaca aaaactattt tttctttaat ttcttttttt 600
actttctatt tttaatttat atatttatat taaaaaattt aaattataat tatttttata 660
gcacgtgatg aaaaggaccc aggtggcatt gacttgatcg gcacgtaaga ggttccaact 720
ttcaccataa tgaaataaga tcactaccgg gcgtattttt tgagttatcg agattttcag 780
gagctaagga agctaaaatg agccatattc aacgggaaac gtcttgctcg aggccgcgat 840
taaattccaa catggatgct gatttatatg ggtataaatg ggctcgcgat aatgtcgggc 900
aatcaggtgc gacaatctat cgattgtatg ggaagcccga tgcgccagag ttgtttctga 960
aacatggcaa aggtagcgtt gccaatgatg ttacagatga gatggtcagg ctaaactggc 1020
tgacggaatt tatgcctctt ccgaccatca agcattttat ccgtactcct gatgatgcat 1080
ggttactcac cactgcgatc ccagggaaaa cagcattcca ggtattagaa gaatatcctg 1140
attcaggtga aaatattgtt gatgcgctgg cagtgttcct gcgccggttg cattcgattc 1200
ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg tctcgcacag gcgcaatcac 1260
gaatgaataa cggtttggtt ggtgcgagtg attttgatga cgagcgtaat ggctggcctg 1320
ttgaacaagt ctggaaagaa atgcataagc ttttgccatt ctcaccggat tcagtcgtca 1380
ctcatggtga tttctcactt gataacctta tttttgacga ggggaaatta ataggttgta 1440
ttgatgttgg acgagtcgga atcgcagacc gataccagga tcttgccatc ctatggaact 1500
gcctcggtga gttttctcct tcattacaga aacggctttt tcaaaaatat ggtattgata 1560
atcctgatat gaataaattg cagtttcact tgatgctcga tgagtttttc taatgagggc 1620
ccaaatgtaa tcacctggct caccttcggg tgggcctttc tgcgttgctg gcgtttttcc 1680
ataggctccg cccccctgac gagcatcaca aaaatcgatg ctcaagtcag aggtggcgaa 1740
acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc 1800
ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 1860
cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 1920
tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 1980
gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca 2040
ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 2100
acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca gttacctcgg 2160
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 2220
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatttt 2280
ctaccgaacg tttacaattt cctgatgcgg tattttctcc ttacgcatct gtgcggtatt 2340
tcacaccgca tagggtaata actgatataa ttaaattgaa gctctaattt gtgagtttag 2400
tatacatgca tttacttata atacagtttt ttagttttgc tggccgcatc ttctcaaata 2460
tgcttcccag cctgcttttc tgtaacgttc accctctacc ttagcatccc ttccctttgc 2520
aaatagtcct cttccaacaa taataatgtc agatcctgta gacaccacat catccacggt 2580
tctatactgt tgacccaatg cgtcaccctt gtcatctaaa cccacaccgg gtgtcataat 2640
caaccaatcg taaccttcat ctcttccacc catgtctctt tgagcaataa agccgataac 2700
aaaatctttg tcgctcttcg caatgtcaac agtaccctta gtatattctc cagtagatag 2760
ggagcccttg catgacaatt ctgctaacat caaaaggcct ctaggttcct ttgttacttc 2820
ttctgccgcc tgcttcaaac cgctaacaat acctgggccc accacaccgt gtgcattcgt 2880
aatgtctgcc cattctgcta ttctgtatac acccgcagag tactgcaatt tgactgtatt 2940
accaatgtca gcaaattttc tgtcttcgaa gagtaaaaaa ttgtacttgg cggataatgc 3000
ctttagcggc ttaactgtgc cctccatgga aaaatcagtc aagatatcca catgtgtttt 3060
tagtaaacaa attttgggac ctaatgcttc aactaactcc agtaattcct tggtggtacg 3120
aacatccaat gaagcacaca agtttgtttg cttttcgtgc atgatattaa atagcttggc 3180
agcaacagga ctaggatgag tagcagcacg ttccttatat gtagctttcg acatgattta 3240
tcttcgtttc ctgcaggttt ttgttctgtg cagttgggtt aagaatactg ggcaatttca 3300
tgtttcttca acactacata tgcgtatata taccaatcta agtctgtgct ccttccttcg 3360
ttcttccttc tgttcggaga ttaccgaatc aaaaaaattt caaagaaacc gaaatcaaaa 3420
aaaagaataa aaaaaaaatg atgaattgaa ttgaaaagct gtggtatggt gcactacgtc 3480
tcgacctggc tacagtttat tcctggcatc cactaaatat aatggagccc gctttttaag 3540
ctggcatcca gaaaaaaaaa gaatcccagc accaaaatat tgttttcttc accaaccatc 3600
agttcatagg tccattctct tagcgcaact acagagaaca ggggcacaaa caggcaaaaa 3660
acgggcacaa cctcaatgga gtgatgcaac ctgcctggag taaatgatga cacaaggcaa 3720
ttgacccacg catgtatcta tctcattttc ttacaccttc tattaccttc tgctctctct 3780
gatttggaaa aagctgaaaa aaaaggttga aaccagttcc ctgaaattat tcccctactt 3840
gactaataag tatataaaga cggtaggtat tgattgtaat tctgtaaatc tatttcttaa 3900
acttcttaaa ttctactttt atagttagtc ttttttttag ttttaaaaca ccaagaactt 3960
agtttcgaat aaacacacat aaacaaacaa agatggctga tgctttgggt atgattgaag 4020
ttagaggttt cgttggtatg gttgaagctg ctgatgctat ggttaaggct gctaaagttg 4080
aattgatcgg ttacgaaaaa actggtggtg gttatgttac tgctgttgtt agaggtgatg 4140
ttgctgctgt aaaagctgct actgaagctg gtcaaagggc tgctgaaaga gttggagaag 4200
ttgttgctgt tcatgttatt ccaagaccac atgttaatgt tgatgctgct ttgccattgg 4260
gtagaactcc aggtatggat aagtctgctt agcgcggatt gagagcaaat cgttaagttc 4320
aggtcaagta aaaattgatt tcgaaaacta atttctctta tacaatcctt tgattggacc 4380
gtcatccttt cgaatataag attttgttaa gaatatttta gacagagatc tactttatat 4440
ttaatatcta gatattacat aatttcctct ctaataaaat atcattaata aaataaaaat 4500
gaagcgattt gattttgtgt tgtcaactta gtttgccgct atgcctcttg ggtaatgcta 4560
ttattgaatc gaagggcttt attatattac cctttagctt attctgaggt ttctgtggcg 4620
tgcaaagtga tgaaccgggc gggttttaag gataaaatca aaaagtgaaa aaatgaacgg 4680
aaaatggaat acctgtgaaa tggagaatga taatgaatct ttctgtcgtg cttgaaagat 4740
tttcggctcc tcaggcggct attaaaaaaa caacttacaa tcattgttcg ccccttccat 4800
acttactgcc actcgcaaaa gggcccaacc agggcaatta cgtatcaaaa aatcatgaca 4860
ggctgggtaa taaatattcg tgaagaaaga agaaattaaa aaaagaaacg aagaagcaaa 4920
aaaaagaaaa gactccgttt aatcactttc aaccgcggtt tatccggccc cacccatgca 4980
taaccctaaa ttattagatc acttagcacg tgaaaaagaa acgtttttaa tgtttttttt 5040
ttttttttct ttttcttttt ttgcgttggt gaaaattttt tcgcttcctc gagtataatt 5100
atctcatctc atctttcata taagataaga agttttataa aaaccttttg catcaaaatt 5160
ttgtagaata tctctttttc ttacgctctc tttctttcct taattgtttt ctaaagaacc 5220
gtgtattttt ctagttcgaa tccatcgata acattaaaag gatggatcat gctccagaaa 5280
gatttgatgc tactcctcca gctggtgaac cagatagacc agctttgggt gttttggaat 5340
tgacttctat tgctagaggt attaccgttg ctgatgctgc tttgaaaaga gcaccatctt 5400
tgttgttgat gtccagacca gtttcttccg gtaaacattt gttgatgatg agaggtcaag 5460
ttgccgaagt tgaagaatct atgattgctg ctagagaaat tgctggtgct ggttctggtg 5520
ctttgttgga tgaattggaa ttgccatatg ctcacgaaca actttggaga tttttggatg 5580
ctccagttgt tgcagatgct tgggaagaag atactgaatc cgttattatc gttgaaaccg 5640
ctactgtttg tgctgctatt gattctgctg atgcagcctt aaaaactgct cctgttgttt 5700
tgagagatat gagattggct attggtattg ctggtaaggc tttctttact ttgactggtg 5760
aattggctga tgttgaagct gctgctgaag ttgttagaga aagatgtggt gctagattgc 5820
tagaattggc ttgtattgca agaccagttg acgaattgag aggtaggttg tttttctagc 5880
acacttctcg attaacaaat tcccagtatt ctttgaaatc tatttttctt cctcaattga 5940
atttgaataa ctgtctacgc ggactcctcc tatctacaac tacaacaaat tttaaccact 6000
ttattaccac tttcctcttt catttatttt tgtcttttat gttgtcaatt tactagtatt 6060
tttttttttt tcatttacgt tcaaggtttt ttatactcat ttaacttgtc ttaggttatt 6120
tatatatata cctatatatt tatatatata tatatatatg tatgtatata ttattatcac 6180
caaatgagaa ataatagcta atttgatttt tgattattta aaatattggt ttgttctttc 6240
tgcaaacatc tcgtttggta cgatattagt gaaaaacgat gtaattatca acacgtgcat 6300
tacccacctc tgccggctac agattgggag attttcatag tagaattcag catgatagct 6360
acgtaaatgt gttccgcacc gtcacaaagt gttttctact gttctttctt ctttcgttca 6420
ttcagttgag ttgagtgagt gctttgttca atggatctta gctaaaatgc atattttttc 6480
tcttggtaaa tgaatgcttg tgatgtcttc caagtgattt cctttccttc ccatatgatg 6540
ctaggtacct ttagtgtctt cctaaaaaaa aaaaaaggct cgccatcaaa acgatattcg 6600
ttggcttttt tttctgaatt ataaatactc tttggtaact tttcatttcc aagaacctct 6660
tttttccagt tatatcatgg tcccctttca aagttattct ctactctttt tcatattcat 6720
tctttttcat cctttggttt tttattctta acttgtttat tattctctct tgtttctatt 6780
tacaagacac caatcaaaac aaataaaaca tcatcacaga tggttttagg taaagttgtc 6840
ggtactgttg ttgcatcaag aaaggaacca agaattgaag gtttatcttt attattggtt 6900
agagcttgtg atccagatgg tactccaact ggtggtgctg ttgtttgtgc tgatgctgtt 6960
ggtgctggtg ttggtgaagt tgttttatat gcttctggtt cttctgctag acaaactgaa 7020
gttactaata atagaccagt tgatgctact attatggcta ttgttgattt ggttgaaatg 7080
ggtggtgatg ttagatttag aaaagatggt tcttcttggt cacatccaca atttgaaaag 7140
tagcaactaa gctggttcta actggaaata atttccatta gattcctctt tttctcgtcc 7200
attaaccaaa atatattatt gaattcagcg gttccttttt tctcattttc gcatatagct 7260
gcactattag aatcagccca ctctaggtaa acacagttcc tcgatatacc tctgtcttac 7320
tatcagtggt taaaccttat gcaaatataa tatatatata tatatatata tatatatctc 7380
atacttttgt tgattcttgt gtaattattg gaaaagacaa aacaaagcaa gcgtttctat 7440
tcatatttac aagtattttt tatgacaaac tatttcttaa ttttcccacc ggcggctttg 7500
aataaggcaa tgtcattgtc ctgcataata tattgtttgc ctgcacgttt gataagtccc 7560
ttagatttta gtaaagactc atttagcggt ggttccatct tccctc 7606
<210> 77
<211> 122
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter Pcon3 nucleotide sequence
<400> 77
ggctggcttc ccaaccttac cagagggcgc cccagctggc aattccgacg tcctgacagc 60
tagctcagtc ctaggtataa tgctagcgaa ttcaaaagat cttttaagaa ggagatatac 120
at 122
<210> 78
<211> 122
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter Pcon4 nucleotide sequence
<400> 78
ggctggcttc ccaaccttac cagagggcgc cccagctggc aattccgacg tctttacggc 60
tagctcagtc ctaggtacta tgctagcgaa ttcaaaagat cttttaagaa ggagatatac 120
at 122
<210> 79
<211> 154
<212> DNA
<213> Artificial Sequence
<220>
<223> Terminator TT7 nucleotide sequence
<400> 79
tagctaacaa agcccgaaag gaagctgagt tggctgctgc caccgctgag caataactag 60
cataacccct tggggcctct aaacgggtct tgaggggttt tttgctgaaa ggaggaacta 120
tatccggata tcccgcaaga ggcccggcag tacc 154
<210> 80
<211> 459
<212> DNA
<213> Artificial Sequence
<220>
<223> Yeast transcriptional terminator TRPL41B
<400> 80
tagcgcggat tgagagcaaa tcgttaagtt caggtcaagt aaaaattgat ttcgaaaact 60
aatttctctt atacaatcct ttgattggac cgtcatcctt tcgaatataa gattttgtta 120
agaatatttt agacagagat ctactttata tttaatatct agatattaca taatttcctc 180
tctaataaaa tatcattaat aaaataaaaa tgaagcgatt tgattttgtg ttgtcaactt 240
agtttgccgc tatgcctctt gggtaatgct attattgaat cgaagggctt tattatatta 300
ccctttagct tattctgagg tttctgtggc gtgcaaagtg atgaaccggg cgggttttaa 360
ggataaaatc aaaaagtgaa aaaatgaacg gaaaatggaa tacctgtgaa atggagaatg 420
ataatgaatc tttctgtcgt gcttgaaaga ttttcggct 459
<210> 81
<211> 430
<212> DNA
<213> Artificial Sequence
<220>
<223> Yeast transcriptional terminator THBT1
<400> 81
tagcacactt ctcgattaac aaattcccag tattctttga aatctatttt tcttcctcaa 60
ttgaatttga ataactgtct acgcggactc ctcctatcta caactacaac aaattttaac 120
cactttatta ccactttcct ctttcattta tttttgtctt ttatgttgtc aatttactag 180
tatttttttt tttttcattt acgttcaagg ttttttatac tcatttaact tgtcttaggt 240
tatttatata tatacctata tatttatata tatatatata tatgtatgta tatattatta 300
tcaccaaatg agaaataata gctaatttga tttttgatta tttaaaatat tggtttgttc 360
tttctgcaaa catctcgttt ggtacgatat tagtgaaaaa cgatgtaatt atcaacacgt 420
gcattaccca 430
<210> 82
<211> 458
<212> DNA
<213> Artificial Sequence
<220>
<223> Yeast transcriptional terminator TRPS20
<400> 82
tagcaactaa gctggttcta actggaaata atttccatta gattcctctt tttctcgtcc 60
attaaccaaa atatattatt gaattcagcg gttccttttt tctcattttc gcatatagct 120
gcactattag aatcagccca ctctaggtaa acacagttcc tcgatatacc tctgtcttac 180
tatcagtggt taaaccttat gcaaatataa tatatatata tatatatata tctcatactt 240
ttgttgattc ttgtgtaatt attggaaaag acaaaacaaa gcaagcgttt ctattcatca 300
tatttacaag tatttttatg aaaaactatt tcttaatttt cccaccggcg gctttgaata 360
aggcaatgtc attgtcctgc ataatatatt gtttgcctgc acgtttgata agtcccttag 420
attttagtaa agactcattt agcggtggtt ccatcttc 458
<210> 83
<211> 1864
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-Pcon2 plasmid
<400> 83
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctggcttcc caaccttacc agagggcgcc ccagctggca attccgacgt 120
cttgacggct agctcagtcc taggtacagt gctagcgaat tcaaaagatc ttttaagaag 180
gagatataca tgatgcgaga cgactgacca tttaaatcat acctgacctc catagcagaa 240
agtcaaaagc ctccgaccgg aggcttttga cttgatcggc acgtaagagg ttccaacttt 300
caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga 360
gctaaggaag ctaaaatgag ccatattcaa cgggaaacgt cttgctcgag gccgcgatta 420
aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 480
tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt gtttctgaaa 540
catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcaggct aaactggctg 600
acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 660
ttactcacca ctgcgatccc agggaaaaca gcattccagg tattagaaga atatcctgat 720
tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 780
gtttgtaatt gtccttttaa cggcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 840
atgaataacg gtttggttgg tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 900
gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc agtcgtcact 960
catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 1020
gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 1080
ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 1140
cctgatatga ataaattgca gtttcacttg atgctcgatg agtttttcta atgagggccc 1200
aaatgtaatc acctggctca ccttcgggtg ggcctttctg cgttgctggc gtttttccat 1260
aggctccgcc cccctgacga gcatcacaaa aatcgatgct caagtcagag gtggcgaaac 1320
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1380
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1440
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1500
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1560
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1620
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1680
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt tacctcggaa 1740
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg 1800
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgattttct 1860
accg 1864
<210> 84
<211> 122
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter Pcon2 nucleotide sequence
<400> 84
ggctggcttc ccaaccttac cagagggcgc cccagctggc aattccgacg tcttgacggc 60
tagctcagtc ctaggtacag tgctagcgaa ttcaaaagat cttttaagaa ggagatatac 120
at 122
<210> 85
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_chc_F' forward primer
<400> 85
gatcctttga ttttctaccg 20
<210> 86
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_chc_R' reverse primer
<400> 86
ctcgataact caaaaaatac g 21
<210> 87
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> pES_Chc_F' forward primer
<400> 87
cggagcctat ggaaaaacgc 20
<210> 88
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> pES_Chc_R' reverse primer
<400> 88
ccgcagtgtc ttgggtctct 20
<210> 89
<211> 23
<212> DNA
<213> Artificial Sequence
<220>
<223> His_chc_F' forward primer
<400> 89
tagagtgtac tagaggaggc caa 23
<210> 90
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> CEN_chc_R' reverse primer
<400> 90
ggtgatgacg gtgaaaacct 20
<210> 91
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Ura_chc_F' forward primer
<400> 91
tctgttcgga gattaccgaa tcaa 24
<210> 92
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> pGau_chc_F' forward primer
<400> 92
ccacctcagg cagagaacct 20
<210> 93
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> pGau_chc_R' reverse primer
<400> 93
ggaaaaacgc cagcaacgc 19
<210> 94
<211> 30
<212> PRT
<213> Artificial Sequence
<220>
<223> S2CP(30) amino acid sequence
<400> 94
Lys Pro Glu Lys Pro Gly Ser Lys Ile Thr Gly Ser Ser Gly Asn Asp
1 5 10 15
Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly Gly Ala Arg Gly
20 25 30
<210> 95
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> S2CP(30) nucleotide sequence
<400> 95
aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac ccagggtagc 60
ctgattacct atagcggtgg tgcacgtggt 90
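SEQ ID NO: 95 is described as the nucleotide sequence of S2CP(30), and SEQ ID NO: 94 as its amino acid sequence. A minimal sketch of how that correspondence can be checked is given below, assuming the standard genetic code; the function names `clean`, `translate`, and `to_three_letter` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# SEQ ID NO: 95 body, copied from the listing (position numbers included).
SEQ_95 = """
aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac ccagggtagc 60
ctgattacct atagcggtgg tgcacgtggt 90
"""

# SEQ ID NO: 94 residues, copied from the listing.
SEQ_94 = (
    "Lys Pro Glu Lys Pro Gly Ser Lys Ile Thr Gly Ser Ser Gly Asn Asp "
    "Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly Gly Ala Arg Gly"
)


def clean(text):
    """Keep only the base letters; drop spaces, newlines, and position numbers."""
    return "".join(ch for ch in text.lower() if ch in "acgt")


def translate(dna):
    """Translate complete codons in frame 1 into one-letter amino acids."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def to_three_letter(protein):
    """Convert one-letter codes to the three-letter style used in the listing."""
    return " ".join(THREE_LETTER.get(aa, "Ter") for aa in protein)


protein = translate(clean(SEQ_95))
print(len(protein), to_three_letter(protein) == SEQ_94)
```

The same check can be repeated for the other nucleotide and amino acid pairs in the listing, such as SEQ ID NOs: 96 and 97, by substituting the corresponding bodies.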
<210> 96
<211> 229
<212> PRT
<213> Artificial Sequence
<220>
<223> HO-T1-SpyTag amino acid sequence
<400> 96
Met Asp His Ala Pro Glu Arg Phe Asp Ala Thr Pro Pro Ala Gly Glu
1 5 10 15
Pro Asp Arg Pro Ala Leu Gly Val Leu Glu Leu Thr Ser Ile Ala Arg
20 25 30
Gly Ile Thr Val Ala Asp Ala Ala Leu Lys Arg Ala Pro Ser Leu Leu
35 40 45
Leu Met Ser Arg Pro Val Ser Ser Gly Lys His Leu Leu Met Met Arg
50 55 60
Gly Gln Val Ala Glu Val Glu Glu Ser Met Ile Ala Ala Arg Glu Ile
65 70 75 80
Ala Gly Ala Gly Gly Gly Ser Gly Gly Ser Ala His Ile Val Met Val
85 90 95
Asp Ala Tyr Lys Pro Thr Lys Gly Gly Ser Gly Gly Ser Gly Ala Leu
100 105 110
Leu Asp Glu Leu Glu Leu Pro Tyr Ala His Glu Gln Leu Trp Arg Phe
115 120 125
Leu Asp Ala Pro Val Val Ala Asp Ala Trp Glu Glu Asp Thr Glu Ser
130 135 140
Val Ile Ile Val Glu Thr Ala Thr Val Cys Ala Ala Ile Asp Ser Ala
145 150 155 160
Asp Ala Ala Leu Lys Thr Ala Pro Val Val Leu Arg Asp Met Arg Leu
165 170 175
Ala Ile Gly Ile Ala Gly Lys Ala Phe Phe Thr Leu Thr Gly Glu Leu
180 185 190
Ala Asp Val Glu Ala Ala Ala Glu Val Val Arg Glu Arg Cys Gly Ala
195 200 205
Arg Leu Leu Glu Leu Ala Cys Ile Ala Arg Pro Val Asp Glu Leu Arg
210 215 220
Gly Arg Leu Phe Phe
225
<210> 97
<211> 687
<212> DNA
<213> Artificial Sequence
<220>
<223> HO-T1-SpyTag nucleotide sequence
<400> 97
atggatcatg ctccagaaag atttgatgct actcctccag ctggtgaacc agatagacca 60
gctttgggtg ttttggaatt gacttctatt gctagaggta ttaccgttgc tgatgctgct 120
ttgaaaagag caccatcttt gttgttgatg tccagaccag tttcttccgg taaacatttg 180
ttgatgatga gaggtcaagt tgccgaagtt gaagaatcta tgattgctgc tagagaaatt 240
gctggtgctg gtggtggttc aggtggttct gctcatatag ttatggttga tgcttacaag 300
ccaacaaaag gtggtagtgg tggatctggt gctttgttgg atgaattgga attgccatat 360
gctcacgaac aactttggag atttttggat gctccagttg ttgcagatgc ttgggaagaa 420
gatactgaat ccgttattat cgttgaaacc gctactgttt gtgctgctat tgattctgct 480
gatgcagcct taaaaactgc tcctgttgtt ttgagagata tgagattggc tattggtatt 540
gctggtaagg ctttctttac tttgactggt gaattggctg atgttgaagc tgctgctgaa 600
gttgttagag aaagatgtgg tgctagattg ctagaattgg catgtattgc aagaccagtt 660
gacgaattga gaggtaggtt gtttttc 687
<210> 98
<211> 339
<212> PRT
<213> Artificial Sequence
<220>
<223> GFP-SpyCatcher amino acid sequence
<400> 98
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Ser Lys
1 5 10 15
Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp
20 25 30
Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly
35 40 45
Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly
50 55 60
Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly
65 70 75 80
Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln His Asp Phe
85 90 95
Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe
100 105 110
Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu
115 120 125
Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys
130 135 140
Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser
145 150 155 160
His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val
165 170 175
Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala
180 185 190
Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu
195 200 205
Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Lys Leu Ser Lys Asp Pro
210 215 220
Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala
225 230 235 240
Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Gly Ser Gly Gly Ser
245 250 255
Asp Ser Ala Thr His Ile Lys Phe Ser Lys Arg Asp Glu Asp Gly Lys
260 265 270
Glu Leu Ala Gly Ala Thr Met Glu Leu Arg Asp Ser Ser Gly Lys Thr
275 280 285
Ile Ser Thr Trp Ile Ser Asp Gly Gln Val Lys Asp Phe Tyr Leu Tyr
290 295 300
Pro Gly Lys Tyr Thr Phe Val Glu Thr Ala Ala Pro Asp Gly Tyr Glu
305 310 315 320
Val Ala Thr Ala Ile Thr Phe Thr Val Asn Glu Gln Gly Gln Val Thr
325 330 335
Val Asn Gly
<210> 99
<211> 1017
<212> DNA
<213> Artificial Sequence
<220>
<223> GFP-SpyCatcher nucleotide sequence
<400> 99
atgggttctt ctcatcatca ccatcaccat tcttctggga tgtctaaagg tgaagaatta 60
ttcactggtg ttgtcccaat tttggttgaa ttagatggtg atgttaatgg tcacaaattt 120
tctgtctccg gtgaaggtga aggtgatgct acttacggta aattgacctt aaaatttatt 180
tgtactactg gtaaattgcc agttccatgg ccaaccttag tcactacttt aacttatggt 240
gttcaatgtt tttctagata cccagatcat atgaaacaac atgacttttt caagtctgcc 300
atgccagaag gttatgttca agaaagaact atttttttca aagatgacgg taactacaag 360
accagagctg aagtcaagtt tgaaggtgat accttagtta atagaatcga attaaaaggt 420
attgatttta aagaagatgg taacatttta ggtcacaaat tggaatacaa ctataactct 480
cacaatgttt acatcatggc tgacaaacaa aagaatggta tcaaagttaa cttcaaaatt 540
agacacaaca ttgaagatgg ttctgttcaa ttagctgacc attatcaaca aaatactcca 600
attggtgatg gtccagtctt gttaccagac aaccattact tatccactca atctaaatta 660
tccaaagatc caaacgaaaa gagagatcac atggtcttgt tagaatttgt tactgctgct 720
ggtattaccc atggtatgga tgaattgtac aaaggttctg gtggttctga ttctgctact 780
catattaagt tctccaagag ggacgaagat ggtaaagaat tggctggtgc aactatggaa 840
ttgagagatt cttctggtaa gaccatttcc acctggattt ctgatggtca agttaaggat 900
ttctacttgt acccaggtaa gtacactttc gttgaaactg ctgctccaga tggttatgaa 960
gttgctactg ctattacttt caccgtcaat gaacaaggtc aagtcactgt taatggt 1017
<210> 100
<211> 296
<212> PRT
<213> Artificial Sequence
<220>
<223> APEX2-S2CP(30) amino acid sequence
<400> 100
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Gly Lys
1 5 10 15
Ser Tyr Pro Thr Val Ser Ala Asp Tyr Gln Asp Ala Val Glu Lys Ala
20 25 30
Lys Lys Lys Leu Arg Gly Phe Ile Ala Glu Lys Arg Cys Ala Pro Leu
35 40 45
Met Leu Arg Leu Ala Phe His Ser Ala Gly Thr Phe Asp Lys Gly Thr
50 55 60
Lys Thr Gly Gly Pro Phe Gly Thr Ile Lys His Pro Ala Glu Leu Ala
65 70 75 80
His Ser Ala Asn Asn Gly Leu Asp Ile Ala Val Arg Leu Leu Glu Pro
85 90 95
Leu Lys Ala Glu Phe Pro Ile Leu Ser Tyr Ala Asp Phe Tyr Gln Leu
100 105 110
Ala Gly Val Val Ala Val Glu Val Thr Gly Gly Pro Lys Val Pro Phe
115 120 125
His Pro Gly Arg Glu Asp Lys Pro Glu Pro Pro Pro Glu Gly Arg Leu
130 135 140
Pro Asp Pro Thr Lys Gly Ser Asp His Leu Arg Asp Val Phe Gly Lys
145 150 155 160
Ala Met Gly Leu Thr Asp Gln Asp Ile Val Ala Leu Ser Gly Gly His
165 170 175
Thr Ile Gly Ala Ala His Lys Glu Arg Ser Gly Phe Glu Gly Pro Trp
180 185 190
Thr Ser Asn Pro Leu Ile Phe Asp Asn Ser Tyr Phe Thr Glu Leu Leu
195 200 205
Ser Gly Glu Lys Glu Gly Leu Leu Gln Leu Pro Ser Asp Lys Ala Leu
210 215 220
Leu Ser Asp Pro Val Phe Arg Pro Leu Val Asp Lys Tyr Ala Ala Asp
225 230 235 240
Glu Asp Ala Phe Phe Ala Asp Tyr Ala Glu Ala His Gln Lys Leu Ser
245 250 255
Glu Leu Gly Phe Ala Asp Ala Gly Ser Ser Lys Pro Glu Lys Pro Gly
260 265 270
Ser Lys Ile Thr Gly Ser Ser Gly Asn Asp Thr Gln Gly Ser Leu Ile
275 280 285
Thr Tyr Ser Gly Gly Ala Arg Gly
290 295
<210> 101
<211> 888
<212> DNA
<213> Artificial Sequence
<220>
<223> APEX2-S2CP(30) nucleotide sequence
<400> 101
atgggttctt ctcatcatca ccatcaccat tcttctggga tgggtaagtc ttacccaact 60
gtttctgctg attatcaaga tgctgttgaa aaggccaaga agaagttgag aggtttcatt 120
gctgaaaaaa gatgcgctcc attgatgttg agattggctt ttcattctgc tggtactttc 180
gataagggta caaaaactgg tggtccattc ggtactatca aacatccagc tgaattggct 240
cattcagcta acaatggttt ggatattgct gtcagattgc tggaaccatt gaaagccgaa 300
tttccaattt tgtcctacgc cgatttttac caattggctg gtgttgttgc agttgaagtt 360
acaggtggtc caaaagttcc atttcatcca ggtagagaag ataagccaga accaccacca 420
gaaggtagat tgccagatcc aacaaaaggt tctgatcact tgagagatgt tttcggtaaa 480
gctatgggtt tgactgatca agatattgtc gctttgtctg gtggtcatac aattggtgct 540
gctcacaaag aaagatcagg ttttgaaggt ccttggactt ctaacccatt gatctttgat 600
aactcttact tcaccgagtt gttgtccggt gaaaaagaag gtttgttgca attgccatct 660
gataaggctt tgttgtctga tccagttttc agaccattgg ttgataagta tgctgctgat 720
gaagatgctt tctttgctga ttacgctgaa gctcatcaaa agttgtctga attgggtttt 780
gctgatgctg gttcttctaa accggaaaaa ccaggtagca aaattaccgg tagcagcggc 840
aatgataccc agggtagcct gattacctat agcggtggtg cacgtggt 888
<210> 102
<211> 1070
<212> PRT
<213> Artificial Sequence
<220>
<223> LacZ-S2CP(30) amino acid sequence
<400> 102
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Thr Met
1 5 10 15
Ile Thr Asp Ser Leu Ala Val Val Leu Gln Arg Arg Asp Trp Glu Asn
20 25 30
Pro Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His Pro Pro Phe Ala
35 40 45
Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg Pro Ser Gln Gln
50 55 60
Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp Phe Pro Ala Pro
65 70 75 80
Glu Ala Val Pro Glu Ser Trp Leu Glu Cys Asp Leu Pro Glu Ala Asp
85 90 95
Thr Val Val Val Pro Ser Asn Trp Gln Met His Gly Tyr Asp Ala Pro
100 105 110
Ile Tyr Thr Asn Val Thr Tyr Pro Ile Thr Val Asn Pro Pro Phe Val
115 120 125
Pro Thr Glu Asn Pro Thr Gly Cys Tyr Ser Leu Thr Phe Asn Val Asp
130 135 140
Glu Ser Trp Leu Gln Glu Gly Gln Thr Arg Ile Ile Phe Asp Gly Val
145 150 155 160
Asn Ser Ala Phe His Leu Trp Cys Asn Gly Arg Trp Val Gly Tyr Gly
165 170 175
Gln Asp Ser Arg Leu Pro Ser Glu Phe Asp Leu Ser Ala Phe Leu Arg
180 185 190
Ala Gly Glu Asn Arg Leu Ala Val Met Val Leu Arg Trp Ser Asp Gly
195 200 205
Ser Tyr Leu Glu Asp Gln Asp Met Trp Arg Met Ser Gly Ile Phe Arg
210 215 220
Asp Val Ser Leu Leu His Lys Pro Thr Thr Gln Ile Ser Asp Phe His
225 230 235 240
Val Ala Thr Arg Phe Asn Asp Asp Phe Ser Arg Ala Val Leu Glu Ala
245 250 255
Glu Val Gln Met Cys Gly Glu Leu Arg Asp Tyr Leu Arg Val Thr Val
260 265 270
Ser Leu Trp Gln Gly Glu Thr Gln Val Ala Ser Gly Thr Ala Pro Phe
275 280 285
Gly Gly Glu Ile Ile Asp Glu Arg Gly Gly Tyr Ala Asp Arg Val Thr
290 295 300
Leu Arg Leu Asn Val Glu Asn Pro Lys Leu Trp Ser Ala Glu Ile Pro
305 310 315 320
Asn Leu Tyr Arg Ala Val Val Glu Leu His Thr Ala Asp Gly Thr Leu
325 330 335
Ile Glu Ala Glu Ala Cys Asp Val Gly Phe Arg Glu Val Arg Ile Glu
340 345 350
Asn Gly Leu Leu Leu Leu Asn Gly Lys Pro Leu Leu Ile Arg Gly Val
355 360 365
Asn Arg His Glu His His Pro Leu His Gly Gln Val Met Asp Glu Gln
370 375 380
Thr Met Val Gln Asp Ile Leu Leu Met Lys Gln Asn Asn Phe Asn Ala
385 390 395 400
Val Arg Cys Ser His Tyr Pro Asn His Pro Leu Trp Tyr Thr Leu Cys
405 410 415
Asp Arg Tyr Gly Leu Tyr Val Val Asp Glu Ala Asn Ile Glu Thr His
420 425 430
Gly Met Val Pro Met Asn Arg Leu Thr Asp Asp Pro Arg Trp Leu Pro
435 440 445
Ala Met Ser Glu Arg Val Thr Arg Met Val Gln Arg Asp Arg Asn His
450 455 460
Pro Ser Val Ile Ile Trp Ser Leu Gly Asn Glu Ser Gly His Gly Ala
465 470 475 480
Asn His Asp Ala Leu Tyr Arg Trp Ile Lys Ser Val Asp Pro Ser Arg
485 490 495
Pro Val Gln Tyr Glu Gly Gly Gly Ala Asp Thr Thr Ala Thr Asp Ile
500 505 510
Ile Cys Pro Met Tyr Ala Arg Val Asp Glu Asp Gln Pro Phe Pro Ala
515 520 525
Val Pro Lys Trp Ser Ile Lys Lys Trp Leu Ser Leu Pro Gly Glu Thr
530 535 540
Arg Pro Leu Ile Leu Cys Glu Tyr Ala His Ala Met Gly Asn Ser Leu
545 550 555 560
Gly Gly Phe Ala Lys Tyr Trp Gln Ala Phe Arg Gln Tyr Pro Arg Leu
565 570 575
Gln Gly Gly Phe Val Trp Asp Trp Val Asp Gln Ser Leu Ile Lys Tyr
580 585 590
Asp Glu Asn Gly Asn Pro Trp Ser Ala Tyr Gly Gly Asp Phe Gly Asp
595 600 605
Thr Pro Asn Asp Arg Gln Phe Cys Met Asn Gly Leu Val Phe Ala Asp
610 615 620
Arg Thr Pro His Pro Ala Leu Thr Glu Ala Lys His Gln Gln Gln Phe
625 630 635 640
Phe Gln Phe Arg Leu Ser Gly Gln Thr Ile Glu Val Thr Ser Glu Tyr
645 650 655
Leu Phe Arg His Ser Asp Asn Glu Leu Leu His Trp Met Val Ala Leu
660 665 670
Asp Gly Lys Pro Leu Ala Ser Gly Glu Val Pro Leu Asp Val Ala Pro
675 680 685
Gln Gly Lys Gln Leu Ile Glu Leu Pro Glu Leu Pro Gln Pro Glu Ser
690 695 700
Ala Gly Gln Leu Trp Leu Thr Val Arg Val Val Gln Pro Asn Ala Thr
705 710 715 720
Ala Trp Ser Glu Ala Gly His Ile Ser Ala Trp Gln Gln Trp Arg Leu
725 730 735
Ala Glu Asn Leu Ser Val Thr Leu Pro Ala Ala Ser His Ala Ile Pro
740 745 750
His Leu Thr Thr Ser Glu Met Asp Phe Cys Ile Glu Leu Gly Asn Lys
755 760 765
Arg Trp Gln Phe Asn Arg Gln Ser Gly Phe Leu Ser Gln Met Trp Ile
770 775 780
Gly Asp Lys Lys Gln Leu Leu Thr Pro Leu Arg Asp Gln Phe Thr Arg
785 790 795 800
Ala Pro Leu Asp Asn Asp Ile Gly Val Ser Glu Ala Thr Arg Ile Asp
805 810 815
Pro Asn Ala Trp Val Glu Arg Trp Lys Ala Ala Gly His Tyr Gln Ala
820 825 830
Glu Ala Ala Leu Leu Gln Cys Thr Ala Asp Thr Leu Ala Asp Ala Val
835 840 845
Leu Ile Thr Thr Ala His Ala Trp Gln His Gln Gly Lys Thr Leu Phe
850 855 860
Ile Ser Arg Lys Thr Tyr Arg Ile Asp Gly Ser Gly Gln Met Ala Ile
865 870 875 880
Thr Val Asp Val Glu Val Ala Ser Asp Thr Pro His Pro Ala Arg Ile
885 890 895
Gly Leu Asn Cys Gln Leu Ala Gln Val Ala Glu Arg Val Asn Trp Leu
900 905 910
Gly Leu Gly Pro Gln Glu Asn Tyr Pro Asp Arg Leu Thr Ala Ala Cys
915 920 925
Phe Asp Arg Trp Asp Leu Pro Leu Ser Asp Met Tyr Thr Pro Tyr Val
930 935 940
Phe Pro Ser Glu Asn Gly Leu Arg Cys Gly Thr Arg Glu Leu Asn Tyr
945 950 955 960
Gly Pro His Gln Trp Arg Gly Asp Phe Gln Phe Asn Ile Ser Arg Tyr
965 970 975
Ser Gln Gln Gln Leu Met Glu Thr Ser His Arg His Leu Leu His Ala
980 985 990
Glu Glu Gly Thr Trp Leu Asn Ile Asp Gly Phe His Met Gly Ile Gly
995 1000 1005
Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Gln Leu
1010 1015 1020
Ser Ala Gly Arg Tyr His Tyr Gln Leu Val Trp Cys Gln Lys Gly
1025 1030 1035
Ser Ser Lys Pro Glu Lys Pro Gly Ser Lys Ile Thr Gly Ser Ser
1040 1045 1050
Gly Asn Asp Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly Gly Ala
1055 1060 1065
Arg Gly
1070
<210> 103
<211> 3210
<212> DNA
<213> Artificial Sequence
<220>
<223> LacZ-S2CP(30) nucleotide sequence
<400> 103
atgggttctt ctcatcatca ccatcaccat tcttctggga tgaccatgat tacggattca 60
ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca acttaatcgc 120
cttgcagcac atcccccttt cgccagctgg cgtaatagcg aagaggcccg caccgatcgc 180
ccttcccaac agttgcgcag cctgaatggc gaatggcgct ttgcctggtt tccggcacca 240
gaagcggtgc cggaaagctg gctggagtgc gatcttcctg aggccgatac tgtcgtcgtc 300
ccctcaaact ggcagatgca cggttacgat gcgcccatct acaccaacgt gacctatccc 360
attacggtca atccgccgtt tgttcccacg gagaatccga cgggttgtta ctcgctcaca 420
tttaatgttg atgaaagctg gctacaggaa ggccagacgc gaattatttt tgatggcgtt 480
aactcggcgt ttcatctgtg gtgcaacggg cgctgggtcg gttacggcca ggacagtcgt 540
ttgccgtctg aatttgacct gagcgcattt ttacgcgccg gagaaaaccg cctcgcggtg 600
atggtgctgc gctggagtga cggcagttat ctggaagatc aggatatgtg gcggatgagc 660
ggcattttcc gtgatgtctc gttgctgcat aaaccgacta cacaaatcag cgatttccat 720
gttgccactc gctttaatga tgatttcagc cgcgctgtac tggaggctga agttcagatg 780
tgcggcgagt tgcgtgacta cctacgggta acagtttctt tatggcaggg tgaaacgcag 840
gtcgccagcg gcaccgcgcc tttcggcggt gaaattatcg atgagcgtgg tggttatgcc 900
gatcgcgtca cactacgtct gaacgtcgaa aacccgaaac tgtggagcgc cgaaatcccg 960
aatctctatc gtgcggtggt tgaactgcac accgccgacg gcacgctgat tgaagcagaa 1020
gcctgcgatg tcggtttccg cgaggtgcgg attgaaaatg gtctgctgct gctgaacggc 1080
aagccgttgc tgattcgagg cgttaaccgt cacgagcatc atcctctgca tggtcaggtc 1140
atggatgagc agacgatggt gcaggatatc ctgctgatga agcagaacaa ctttaacgcc 1200
gtgcgctgtt cgcattatcc gaaccatccg ctgtggtaca cgctgtgcga ccgctacggc 1260
ctgtatgtgg tggatgaagc caatattgaa acccacggca tggtgccaat gaatcgtctg 1320
accgatgatc cgcgctggct accggcgatg agcgaacgcg taacgcgaat ggtgcagcgc 1380
gatcgtaatc acccgagtgt gatcatctgg tcgctgggga atgaatcagg ccacggcgct 1440
aatcacgacg cgctgtatcg ctggatcaaa tctgtcgatc cttcccgccc ggtgcagtat 1500
gaaggcggcg gagccgacac cacggccacc gatattattt gcccgatgta cgcgcgcgtg 1560
gatgaagacc agcccttccc ggctgtgccg aaatggtcca tcaaaaaatg gctttcgcta 1620
cctggagaaa cgcgcccgct gatcctttgc gaatacgccc acgcgatggg taacagtctt 1680
ggcggtttcg ctaaatactg gcaggcgttt cgtcagtatc cccgtttaca gggcggcttc 1740
gtctgggact gggtggatca gtcgctgatt aaatatgatg aaaacggcaa cccgtggtcg 1800
gcttacggcg gtgattttgg cgatacgccg aacgatcgcc agttctgtat gaacggtctg 1860
gtctttgccg accgcacgcc gcatccagcg ctgacggaag caaaacacca gcagcagttt 1920
ttccagttcc gtttatccgg gcaaaccatc gaagtgacca gcgaatacct gttccgtcat 1980
agcgataacg agctcctgca ctggatggtg gcgctggatg gtaagccgct ggcaagcggt 2040
gaagtgcctc tggatgtcgc tccacaaggt aaacagttga ttgaactgcc tgaactaccg 2100
cagccggaga gcgccgggca actctggctc acagtacgcg tagtgcaacc gaacgcgacc 2160
gcatggtcag aagccggaca catcagcgcc tggcagcagt ggcgtctggc tgaaaacctc 2220
agcgtgacac tccccgccgc gtcccacgcc atcccgcatc tgaccaccag cgaaatggat 2280
ttttgcatcg agctgggtaa taagcgttgg caatttaacc gccagtcagg ctttctttca 2340
cagatgtgga ttggcgataa aaaacaactg ctgacgccgc tgcgcgatca gttcacccgt 2400
gcaccgctgg ataacgacat tggcgtaagt gaagcgaccc gcattgaccc taacgcctgg 2460
gtcgaacgct ggaaggcggc gggccattac caggccgaag cagcgttgtt gcagtgcacg 2520
gcagatacac ttgctgatgc ggtgctgatt acgaccgctc acgcgtggca gcatcagggg 2580
aaaaccttat ttatcagccg gaaaacctac cggattgatg gtagtggtca aatggcgatt 2640
accgttgatg ttgaagtggc gagcgataca ccgcatccgg cgcggattgg cctgaactgc 2700
cagctggcgc aggtagcaga gcgggtaaac tggctcggat tagggccgca agaaaactat 2760
cccgaccgcc ttactgccgc ctgttttgac cgctgggatc tgccattgtc agacatgtat 2820
accccgtacg tcttcccgag cgaaaacggt ctgcgctgcg ggacgcgcga attgaattat 2880
ggcccacacc agtggcgcgg cgacttccag ttcaacatca gccgctacag tcaacagcaa 2940
ctgatggaaa ccagccatcg ccatctgctg cacgcggaag aaggcacatg gctgaatatc 3000
gacggtttcc atatggggat tggtggcgac gactcctgga gcccgtcagt atcggcggaa 3060
ttccagctga gcgccggtcg ctaccattac cagttggtct ggtgtcaaaa aggttcttct 3120
aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac ccagggtagc 3180
ctgattacct atagcggtgg tgcacgtggt 3210
<210> 104
<211> 453
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PGPM1 nucleotide sequence
<400> 104
gtgatgtcta agtaaccttt atggtatatt tcttaatgtg gaaagatact agcgcgcgca 60
cccacacaca agcttcgtct tttcttgaag aaaagaggaa gctcgctaaa tgggattcca 120
ctttccgttc cctgccagct gatggaaaaa ggttagtgga acgatgaaga ataaaaagag 180
agatccactg aggtgaaatt tcagctgaca gcgagtttca tgatcgtgat gaacaatggt 240
aacgagttgt ggctgttgcc agggagggtg gttctcaact tttaatgtat ggccaaatcg 300
ctacttgggt ttgttatata acaaagaaga aataatgaac tgattctctt cctccttctt 360
gtcctttctt aattctgttg taattacctt cctttgtaat tttttttgta attattcttc 420
ttaataatcc aaacaaacac acatattaca ata 453
<210> 105
<211> 432
<212> DNA
<213> Artificial Sequence
<220>
<223> Yeast transcriptional terminator TYPT31
<400> 105
gagatatttt gcagcagttg cgcacttgca tgtgaatgac tcttctcccc tttaattctg 60
tgctatattt ttacaatttt ctgctgacat atagtttata tacatataga acgcatatag 120
gaaattgaag taaacagaat acacaagtag aggccggtat gtacgacatt ttgcttacta 180
ctctttaaaa tcatcgtctt cttcgtcttc atcgtcttct tctttttcac catatcctac 240
atcatcttta gagcctgtgc taggttcctt cttgtctaat tcttctgcag tctttttata 300
gtcaattact ttgccgcgtg ttcttcttcc ggatgtgatg atattagagg tatcaatttc 360
tgccaaatcg tcctcttctt cttctccctc atttcccatc aatgcgtcta acttggcatc 420
gtccatatca ga 432
<210> 106
<211> 2971
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-S2CP(30) plasmid
<400> 106
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttcttctaa accggaaaaa ccaggtagca aaattaccgg tagcagcggc aatgataccc 1260
agggtagcct gattacctat agcggtggtg cacgtggtta gccgagacga ctgaccattt 1320
aaatcatacc tgacctccat agcagaaagt caaaagcctc cgaccggagg cttttgactt 1380
gatcggcacg taagaggttc caactttcac cataatgaaa taagatcact accgggcgta 1440
ttttttgagt tatcgagatt ttcaggagct aaggaagcta aaatgagcca tattcaacgg 1500
gaaacgtctt gctcgaggcc gcgattaaat tccaacatgg atgctgattt atatgggtat 1560
aaatgggctc gcgataatgt cgggcaatca ggtgcgacaa tctatcgatt gtatgggaag 1620
cccgatgcgc cagagttgtt tctgaaacat ggcaaaggta gcgttgccaa tgatgttaca 1680
gatgagatgg tcaggctaaa ctggctgacg gaatttatgc ctcttccgac catcaagcat 1740
tttatccgta ctcctgatga tgcatggtta ctcaccactg cgatcccagg gaaaacagca 1800
ttccaggtat tagaagaata tcctgattca ggtgaaaata ttgttgatgc gctggcagtg 1860
ttcctgcgcc ggttgcattc gattcctgtt tgtaattgtc cttttaacgg cgatcgcgta 1920
tttcgtctcg ctcaggcgca atcacgaatg aataacggtt tggttggtgc gagtgatttt 1980
gatgacgagc gtaatggctg gcctgttgaa caagtctgga aagaaatgca taagcttttg 2040
ccattctcac cggattcagt cgtcactcat ggtgatttct cacttgataa ccttattttt 2100
gacgagggga aattaatagg ttgtattgat gttggacgag tcggaatcgc agaccgatac 2160
caggatcttg ccatcctatg gaactgcctc ggtgagtttt ctccttcatt acagaaacgg 2220
ctttttcaaa aatatggtat tgataatcct gatatgaata aattgcagtt tcacttgatg 2280
ctcgatgagt ttttctaatg agggcccaaa tgtaatcacc tggctcacct tcgggtgggc 2340
ctttctgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca tcacaaaaat 2400
cgatgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca ggcgtttccc 2460
cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc 2520
gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag gtatctcagt 2580
tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt tcagcccgac 2640
cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca cgacttatcg 2700
ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg cggtgctaca 2760
gagttcttga agtggtggcc taactacggc tacactagaa gaacagtatt tggtatctgc 2820
gctctgctga agccagttac ctcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa 2880
ccaccgctgg tagcggtggt ttttttgttt gcaagcagca gattacgcgc agaaaaaaag 2940
gatctcaaga agatcctttg attttctacc g 2971
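The primers of SEQ ID NOs: 85 and 86 (HcKan_chc_F' and HcKan_chc_R') can be located on the HcKan plasmid backbone of SEQ ID NO: 106 by plain string matching: the forward primer is searched as written, and the reverse primer is searched as its reverse complement. The sketch below uses two short excerpts of SEQ ID NO: 106 copied from the listing rather than the full plasmid; the variable and function names are assumptions made for this illustration.

```python
def clean(text):
    """Keep only base letters; drop spaces and position numbers."""
    return "".join(ch for ch in text.lower() if ch in "acgt")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


# Primers as given in SEQ ID NOs: 85 and 86.
FORWARD_PRIMER = clean("gatcctttga ttttctaccg")
REVERSE_PRIMER = clean("ctcgataact caaaaaatac g")

# Excerpt of SEQ ID NO: 106, positions 1381 to 1500 of the listing.
EXCERPT_UPSTREAM = clean("""
gatcggcacg taagaggttc caactttcac cataatgaaa taagatcact accgggcgta 1440
ttttttgagt tatcgagatt ttcaggagct aaggaagcta aaatgagcca tattcaacgg 1500
""")

# Final line of SEQ ID NO: 106, positions 2941 to 2971 of the listing.
EXCERPT_END = clean("gatctcaaga agatcctttg attttctacc g 2971")

# The forward primer should match the top strand exactly.
forward_hit = EXCERPT_END.find(FORWARD_PRIMER)

# The reverse primer anneals to the top strand, so its reverse
# complement is what appears in the listed sequence.
reverse_hit = EXCERPT_UPSTREAM.find(reverse_complement(REVERSE_PRIMER))

print("forward primer found:", forward_hit >= 0)
print("reverse primer site found:", reverse_hit >= 0)
```

Because `str.find` returns a 0-based offset within each excerpt, converting a hit to a position in the full plasmid requires adding the listing position of the first base of the excerpt.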
<210> 107
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-S2CP(30) nucleotide sequence of key ORF
<400> 107
aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac ccagggtagc 60
ctgattacct atagcggtgg tgcacgtggt 90
<210> 108
<211> 36
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-S2CP(30) amino acid sequence of key ORF
<400> 108
Lys Pro Glu Lys Pro Gly Lys Pro Glu Lys Pro Gly Ser Lys Ile Thr
1 5 10 15
Gly Ser Ser Gly Asn Asp Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly
20 25 30
Gly Ala Arg Gly
35
<210> 109
<211> 2631
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-APEX2-S2CP(30) plasmid
<400> 109
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgggttctt ctcatcatca ccatcaccat tcttctggga tgggtaagtc 120
ttacccaact gtttctgctg attatcaaga tgctgttgaa aaggccaaga agaagttgag 180
aggtttcatt gctgaaaaaa gatgcgctcc attgatgttg agattggctt ttcattctgc 240
tggtactttc gataagggta caaaaactgg tggtccattc ggtactatca aacatccagc 300
tgaattggct cattcagcta acaatggttt ggatattgct gtcagattgc tggaaccatt 360
gaaagccgaa tttccaattt tgtcctacgc cgatttttac caattggctg gtgttgttgc 420
agttgaagtt acaggtggtc caaaagttcc atttcatcca ggtagagaag ataagccaga 480
accaccacca gaaggtagat tgccagatcc aacaaaaggt tctgatcact tgagagatgt 540
tttcggtaaa gctatgggtt tgactgatca agatattgtc gctttgtctg gtggtcatac 600
aattggtgct gctcacaaag aaagatcagg ttttgaaggt ccttggactt ctaacccatt 660
gatctttgat aactcttact tcaccgagtt gttgtccggt gaaaaagaag gtttgttgca 720
attgccatct gataaggctt tgttgtctga tccagttttc agaccattgg ttgataagta 780
tgctgctgat gaagatgctt tctttgctga ttacgctgaa gctcatcaaa agttgtctga 840
attgggtttt gctgatgctg gttcttctaa accggaaaaa ccaggtagca aaattaccgg 900
tagcagcggc aatgataccc agggtagcct gattacctat agcggtggtg cacgtggtta 960
gccgagacga ctgaccattt aaatcatacc tgacctccat agcagaaagt caaaagcctc 1020
cgaccggagg cttttgactt gatcggcacg taagaggttc caactttcac cataatgaaa 1080
taagatcact accgggcgta ttttttgagt tatcgagatt ttcaggagct aaggaagcta 1140
aaatgagcca tattcaacgg gaaacgtctt gctcgaggcc gcgattaaat tccaacatgg 1200
atgctgattt atatgggtat aaatgggctc gcgataatgt cgggcaatca ggtgcgacaa 1260
tctatcgatt gtatgggaag cccgatgcgc cagagttgtt tctgaaacat ggcaaaggta 1320
gcgttgccaa tgatgttaca gatgagatgg tcaggctaaa ctggctgacg gaatttatgc 1380
ctcttccgac catcaagcat tttatccgta ctcctgatga tgcatggtta ctcaccactg 1440
cgatcccagg gaaaacagca ttccaggtat tagaagaata tcctgattca ggtgaaaata 1500
ttgttgatgc gctggcagtg ttcctgcgcc ggttgcattc gattcctgtt tgtaattgtc 1560
cttttaacgg cgatcgcgta tttcgtctcg ctcaggcgca atcacgaatg aataacggtt 1620
tggttggtgc gagtgatttt gatgacgagc gtaatggctg gcctgttgaa caagtctgga 1680
aagaaatgca taagcttttg ccattctcac cggattcagt cgtcactcat ggtgatttct 1740
cacttgataa ccttattttt gacgagggga aattaatagg ttgtattgat gttggacgag 1800
tcggaatcgc agaccgatac caggatcttg ccatcctatg gaactgcctc ggtgagtttt 1860
ctccttcatt acagaaacgg ctttttcaaa aatatggtat tgataatcct gatatgaata 1920
aattgcagtt tcacttgatg ctcgatgagt ttttctaatg agggcccaaa tgtaatcacc 1980
tggctcacct tcgggtgggc ctttctgcgt tgctggcgtt tttccatagg ctccgccccc 2040
ctgacgagca tcacaaaaat cgatgctcaa gtcagaggtg gcgaaacccg acaggactat 2100
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 2160
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct 2220
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 2280
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 2340
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 2400
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 2460
gaacagtatt tggtatctgc gctctgctga agccagttac ctcggaaaaa gagttggtag 2520
ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 2580
gattacgcgc agaaaaaaag gatctcaaga agatcctttg attttctacc g 2631
<210> 110
<211> 888
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-APEX2-S2CP(30) nucleotide sequence of key ORF
<400> 110
atgggttctt ctcatcatca ccatcaccat tcttctggga tgggtaagtc ttacccaact 60
gtttctgctg attatcaaga tgctgttgaa aaggccaaga agaagttgag aggtttcatt 120
gctgaaaaaa gatgcgctcc attgatgttg agattggctt ttcattctgc tggtactttc 180
gataagggta caaaaactgg tggtccattc ggtactatca aacatccagc tgaattggct 240
cattcagcta acaatggttt ggatattgct gtcagattgc tggaaccatt gaaagccgaa 300
tttccaattt tgtcctacgc cgatttttac caattggctg gtgttgttgc agttgaagtt 360
acaggtggtc caaaagttcc atttcatcca ggtagagaag ataagccaga accaccacca 420
gaaggtagat tgccagatcc aacaaaaggt tctgatcact tgagagatgt tttcggtaaa 480
gctatgggtt tgactgatca agatattgtc gctttgtctg gtggtcatac aattggtgct 540
gctcacaaag aaagatcagg ttttgaaggt ccttggactt ctaacccatt gatctttgat 600
aactcttact tcaccgagtt gttgtccggt gaaaaagaag gtttgttgca attgccatct 660
gataaggctt tgttgtctga tccagttttc agaccattgg ttgataagta tgctgctgat 720
gaagatgctt tctttgctga ttacgctgaa gctcatcaaa agttgtctga attgggtttt 780
gctgatgctg gttcttctaa accggaaaaa ccaggtagca aaattaccgg tagcagcggc 840
aatgataccc agggtagcct gattacctat agcggtggtg cacgtggt 888
<210> 111
<211> 296
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-APEX2-S2CP(30) amino acid sequence of key ORF
<400> 111
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Gly Lys
1 5 10 15
Ser Tyr Pro Thr Val Ser Ala Asp Tyr Gln Asp Ala Val Glu Lys Ala
20 25 30
Lys Lys Lys Leu Arg Gly Phe Ile Ala Glu Lys Arg Cys Ala Pro Leu
35 40 45
Met Leu Arg Leu Ala Phe His Ser Ala Gly Thr Phe Asp Lys Gly Thr
50 55 60
Lys Thr Gly Gly Pro Phe Gly Thr Ile Lys His Pro Ala Glu Leu Ala
65 70 75 80
His Ser Ala Asn Asn Gly Leu Asp Ile Ala Val Arg Leu Leu Glu Pro
85 90 95
Leu Lys Ala Glu Phe Pro Ile Leu Ser Tyr Ala Asp Phe Tyr Gln Leu
100 105 110
Ala Gly Val Val Ala Val Glu Val Thr Gly Gly Pro Lys Val Pro Phe
115 120 125
His Pro Gly Arg Glu Asp Lys Pro Glu Pro Pro Pro Glu Gly Arg Leu
130 135 140
Pro Asp Pro Thr Lys Gly Ser Asp His Leu Arg Asp Val Phe Gly Lys
145 150 155 160
Ala Met Gly Leu Thr Asp Gln Asp Ile Val Ala Leu Ser Gly Gly His
165 170 175
Thr Ile Gly Ala Ala His Lys Glu Arg Ser Gly Phe Glu Gly Pro Trp
180 185 190
Thr Ser Asn Pro Leu Ile Phe Asp Asn Ser Tyr Phe Thr Glu Leu Leu
195 200 205
Ser Gly Glu Lys Glu Gly Leu Leu Gln Leu Pro Ser Asp Lys Ala Leu
210 215 220
Leu Ser Asp Pro Val Phe Arg Pro Leu Val Asp Lys Tyr Ala Ala Asp
225 230 235 240
Glu Asp Ala Phe Phe Ala Asp Tyr Ala Glu Ala His Gln Lys Leu Ser
245 250 255
Glu Leu Gly Phe Ala Asp Ala Gly Ser Ser Lys Pro Glu Lys Pro Gly
260 265 270
Ser Lys Ile Thr Gly Ser Ser Gly Asn Asp Thr Gln Gly Ser Leu Ile
275 280 285
Thr Tyr Ser Gly Gly Ala Arg Gly
290 295
<210> 112
<211> 4953
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-LacZ-S2CTP plasmid
<400> 112
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgggttctt ctcatcatca ccatcaccat tcttctggga tgaccatgat 120
tacggattca ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca 180
acttaatcgc cttgcagcac atcccccttt cgccagctgg cgtaatagcg aagaggcccg 240
caccgatcgc ccttcccaac agttgcgcag cctgaatggc gaatggcgct ttgcctggtt 300
tccggcacca gaagcggtgc cggaaagctg gctggagtgc gatcttcctg aggccgatac 360
tgtcgtcgtc ccctcaaact ggcagatgca cggttacgat gcgcccatct acaccaacgt 420
gacctatccc attacggtca atccgccgtt tgttcccacg gagaatccga cgggttgtta 480
ctcgctcaca tttaatgttg atgaaagctg gctacaggaa ggccagacgc gaattatttt 540
tgatggcgtt aactcggcgt ttcatctgtg gtgcaacggg cgctgggtcg gttacggcca 600
ggacagtcgt ttgccgtctg aatttgacct gagcgcattt ttacgcgccg gagaaaaccg 660
cctcgcggtg atggtgctgc gctggagtga cggcagttat ctggaagatc aggatatgtg 720
gcggatgagc ggcattttcc gtgatgtctc gttgctgcat aaaccgacta cacaaatcag 780
cgatttccat gttgccactc gctttaatga tgatttcagc cgcgctgtac tggaggctga 840
agttcagatg tgcggcgagt tgcgtgacta cctacgggta acagtttctt tatggcaggg 900
tgaaacgcag gtcgccagcg gcaccgcgcc tttcggcggt gaaattatcg atgagcgtgg 960
tggttatgcc gatcgcgtca cactacgtct gaacgtcgaa aacccgaaac tgtggagcgc 1020
cgaaatcccg aatctctatc gtgcggtggt tgaactgcac accgccgacg gcacgctgat 1080
tgaagcagaa gcctgcgatg tcggtttccg cgaggtgcgg attgaaaatg gtctgctgct 1140
gctgaacggc aagccgttgc tgattcgagg cgttaaccgt cacgagcatc atcctctgca 1200
tggtcaggtc atggatgagc agacgatggt gcaggatatc ctgctgatga agcagaacaa 1260
ctttaacgcc gtgcgctgtt cgcattatcc gaaccatccg ctgtggtaca cgctgtgcga 1320
ccgctacggc ctgtatgtgg tggatgaagc caatattgaa acccacggca tggtgccaat 1380
gaatcgtctg accgatgatc cgcgctggct accggcgatg agcgaacgcg taacgcgaat 1440
ggtgcagcgc gatcgtaatc acccgagtgt gatcatctgg tcgctgggga atgaatcagg 1500
ccacggcgct aatcacgacg cgctgtatcg ctggatcaaa tctgtcgatc cttcccgccc 1560
ggtgcagtat gaaggcggcg gagccgacac cacggccacc gatattattt gcccgatgta 1620
cgcgcgcgtg gatgaagacc agcccttccc ggctgtgccg aaatggtcca tcaaaaaatg 1680
gctttcgcta cctggagaaa cgcgcccgct gatcctttgc gaatacgccc acgcgatggg 1740
taacagtctt ggcggtttcg ctaaatactg gcaggcgttt cgtcagtatc cccgtttaca 1800
gggcggcttc gtctgggact gggtggatca gtcgctgatt aaatatgatg aaaacggcaa 1860
cccgtggtcg gcttacggcg gtgattttgg cgatacgccg aacgatcgcc agttctgtat 1920
gaacggtctg gtctttgccg accgcacgcc gcatccagcg ctgacggaag caaaacacca 1980
gcagcagttt ttccagttcc gtttatccgg gcaaaccatc gaagtgacca gcgaatacct 2040
gttccgtcat agcgataacg agctcctgca ctggatggtg gcgctggatg gtaagccgct 2100
ggcaagcggt gaagtgcctc tggatgtcgc tccacaaggt aaacagttga ttgaactgcc 2160
tgaactaccg cagccggaga gcgccgggca actctggctc acagtacgcg tagtgcaacc 2220
gaacgcgacc gcatggtcag aagccggaca catcagcgcc tggcagcagt ggcgtctggc 2280
tgaaaacctc agcgtgacac tccccgccgc gtcccacgcc atcccgcatc tgaccaccag 2340
cgaaatggat ttttgcatcg agctgggtaa taagcgttgg caatttaacc gccagtcagg 2400
ctttctttca cagatgtgga ttggcgataa aaaacaactg ctgacgccgc tgcgcgatca 2460
gttcacccgt gcaccgctgg ataacgacat tggcgtaagt gaagcgaccc gcattgaccc 2520
taacgcctgg gtcgaacgct ggaaggcggc gggccattac caggccgaag cagcgttgtt 2580
gcagtgcacg gcagatacac ttgctgatgc ggtgctgatt acgaccgctc acgcgtggca 2640
gcatcagggg aaaaccttat ttatcagccg gaaaacctac cggattgatg gtagtggtca 2700
aatggcgatt accgttgatg ttgaagtggc gagcgataca ccgcatccgg cgcggattgg 2760
cctgaactgc cagctggcgc aggtagcaga gcgggtaaac tggctcggat tagggccgca 2820
agaaaactat cccgaccgcc ttactgccgc ctgttttgac cgctgggatc tgccattgtc 2880
agacatgtat accccgtacg tcttcccgag cgaaaacggt ctgcgctgcg ggacgcgcga 2940
attgaattat ggcccacacc agtggcgcgg cgacttccag ttcaacatca gccgctacag 3000
tcaacagcaa ctgatggaaa ccagccatcg ccatctgctg cacgcggaag aaggcacatg 3060
gctgaatatc gacggtttcc atatggggat tggtggcgac gactcctgga gcccgtcagt 3120
atcggcggaa ttccagctga gcgccggtcg ctaccattac cagttggtct ggtgtcaaaa 3180
aggttcttct aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac 3240
ccagggtagc ctgattacct atagcggtgg tgcacgtggt tagccgagac gactgaccat 3300
ttaaatcata cctgacctcc atagcagaaa gtcaaaagcc tccgaccgga ggcttttgac 3360
ttgatcggca cgtaagaggt tccaactttc accataatga aataagatca ctaccgggcg 3420
tattttttga gttatcgaga ttttcaggag ctaaggaagc taaaatgagc catattcaac 3480
gggaaacgtc ttgctcgagg ccgcgattaa attccaacat ggatgctgat ttatatgggt 3540
ataaatgggc tcgcgataat gtcgggcaat caggtgcgac aatctatcga ttgtatggga 3600
agcccgatgc gccagagttg tttctgaaac atggcaaagg tagcgttgcc aatgatgtta 3660
cagatgagat ggtcaggcta aactggctga cggaatttat gcctcttccg accatcaagc 3720
attttatccg tactcctgat gatgcatggt tactcaccac tgcgatccca gggaaaacag 3780
cattccaggt attagaagaa tatcctgatt caggtgaaaa tattgttgat gcgctggcag 3840
tgttcctgcg ccggttgcat tcgattcctg tttgtaattg tccttttaac ggcgatcgcg 3900
tatttcgtct cgctcaggcg caatcacgaa tgaataacgg tttggttggt gcgagtgatt 3960
ttgatgacga gcgtaatggc tggcctgttg aacaagtctg gaaagaaatg cataagcttt 4020
tgccattctc accggattca gtcgtcactc atggtgattt ctcacttgat aaccttattt 4080
ttgacgaggg gaaattaata ggttgtattg atgttggacg agtcggaatc gcagaccgat 4140
accaggatct tgccatccta tggaactgcc tcggtgagtt ttctccttca ttacagaaac 4200
ggctttttca aaaatatggt attgataatc ctgatatgaa taaattgcag tttcacttga 4260
tgctcgatga gtttttctaa tgagggccca aatgtaatca cctggctcac cttcgggtgg 4320
gcctttctgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 4380
atcgatgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 4440
cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 4500
ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 4560
gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 4620
accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 4680
cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 4740
cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta tttggtatct 4800
gcgctctgct gaagccagtt acctcggaaa aagagttggt agctcttgat ccggcaaaca 4860
aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa 4920
aggatctcaa gaagatcctt tgattttcta ccg 4953
<210> 113
<211> 3210
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-LacZ-S2CTP nucleotide sequence of key ORF
<400> 113
atgggttctt ctcatcatca ccatcaccat tcttctggga tgaccatgat tacggattca 60
ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca acttaatcgc 120
cttgcagcac atcccccttt cgccagctgg cgtaatagcg aagaggcccg caccgatcgc 180
ccttcccaac agttgcgcag cctgaatggc gaatggcgct ttgcctggtt tccggcacca 240
gaagcggtgc cggaaagctg gctggagtgc gatcttcctg aggccgatac tgtcgtcgtc 300
ccctcaaact ggcagatgca cggttacgat gcgcccatct acaccaacgt gacctatccc 360
attacggtca atccgccgtt tgttcccacg gagaatccga cgggttgtta ctcgctcaca 420
tttaatgttg atgaaagctg gctacaggaa ggccagacgc gaattatttt tgatggcgtt 480
aactcggcgt ttcatctgtg gtgcaacggg cgctgggtcg gttacggcca ggacagtcgt 540
ttgccgtctg aatttgacct gagcgcattt ttacgcgccg gagaaaaccg cctcgcggtg 600
atggtgctgc gctggagtga cggcagttat ctggaagatc aggatatgtg gcggatgagc 660
ggcattttcc gtgatgtctc gttgctgcat aaaccgacta cacaaatcag cgatttccat 720
gttgccactc gctttaatga tgatttcagc cgcgctgtac tggaggctga agttcagatg 780
tgcggcgagt tgcgtgacta cctacgggta acagtttctt tatggcaggg tgaaacgcag 840
gtcgccagcg gcaccgcgcc tttcggcggt gaaattatcg atgagcgtgg tggttatgcc 900
gatcgcgtca cactacgtct gaacgtcgaa aacccgaaac tgtggagcgc cgaaatcccg 960
aatctctatc gtgcggtggt tgaactgcac accgccgacg gcacgctgat tgaagcagaa 1020
gcctgcgatg tcggtttccg cgaggtgcgg attgaaaatg gtctgctgct gctgaacggc 1080
aagccgttgc tgattcgagg cgttaaccgt cacgagcatc atcctctgca tggtcaggtc 1140
atggatgagc agacgatggt gcaggatatc ctgctgatga agcagaacaa ctttaacgcc 1200
gtgcgctgtt cgcattatcc gaaccatccg ctgtggtaca cgctgtgcga ccgctacggc 1260
ctgtatgtgg tggatgaagc caatattgaa acccacggca tggtgccaat gaatcgtctg 1320
accgatgatc cgcgctggct accggcgatg agcgaacgcg taacgcgaat ggtgcagcgc 1380
gatcgtaatc acccgagtgt gatcatctgg tcgctgggga atgaatcagg ccacggcgct 1440
aatcacgacg cgctgtatcg ctggatcaaa tctgtcgatc cttcccgccc ggtgcagtat 1500
gaaggcggcg gagccgacac cacggccacc gatattattt gcccgatgta cgcgcgcgtg 1560
gatgaagacc agcccttccc ggctgtgccg aaatggtcca tcaaaaaatg gctttcgcta 1620
cctggagaaa cgcgcccgct gatcctttgc gaatacgccc acgcgatggg taacagtctt 1680
ggcggtttcg ctaaatactg gcaggcgttt cgtcagtatc cccgtttaca gggcggcttc 1740
gtctgggact gggtggatca gtcgctgatt aaatatgatg aaaacggcaa cccgtggtcg 1800
gcttacggcg gtgattttgg cgatacgccg aacgatcgcc agttctgtat gaacggtctg 1860
gtctttgccg accgcacgcc gcatccagcg ctgacggaag caaaacacca gcagcagttt 1920
ttccagttcc gtttatccgg gcaaaccatc gaagtgacca gcgaatacct gttccgtcat 1980
agcgataacg agctcctgca ctggatggtg gcgctggatg gtaagccgct ggcaagcggt 2040
gaagtgcctc tggatgtcgc tccacaaggt aaacagttga ttgaactgcc tgaactaccg 2100
cagccggaga gcgccgggca actctggctc acagtacgcg tagtgcaacc gaacgcgacc 2160
gcatggtcag aagccggaca catcagcgcc tggcagcagt ggcgtctggc tgaaaacctc 2220
agcgtgacac tccccgccgc gtcccacgcc atcccgcatc tgaccaccag cgaaatggat 2280
ttttgcatcg agctgggtaa taagcgttgg caatttaacc gccagtcagg ctttctttca 2340
cagatgtgga ttggcgataa aaaacaactg ctgacgccgc tgcgcgatca gttcacccgt 2400
gcaccgctgg ataacgacat tggcgtaagt gaagcgaccc gcattgaccc taacgcctgg 2460
gtcgaacgct ggaaggcggc gggccattac caggccgaag cagcgttgtt gcagtgcacg 2520
gcagatacac ttgctgatgc ggtgctgatt acgaccgctc acgcgtggca gcatcagggg 2580
aaaaccttat ttatcagccg gaaaacctac cggattgatg gtagtggtca aatggcgatt 2640
accgttgatg ttgaagtggc gagcgataca ccgcatccgg cgcggattgg cctgaactgc 2700
cagctggcgc aggtagcaga gcgggtaaac tggctcggat tagggccgca agaaaactat 2760
cccgaccgcc ttactgccgc ctgttttgac cgctgggatc tgccattgtc agacatgtat 2820
accccgtacg tcttcccgag cgaaaacggt ctgcgctgcg ggacgcgcga attgaattat 2880
ggcccacacc agtggcgcgg cgacttccag ttcaacatca gccgctacag tcaacagcaa 2940
ctgatggaaa ccagccatcg ccatctgctg cacgcggaag aaggcacatg gctgaatatc 3000
gacggtttcc atatggggat tggtggcgac gactcctgga gcccgtcagt atcggcggaa 3060
ttccagctga gcgccggtcg ctaccattac cagttggtct ggtgtcaaaa aggttcttct 3120
aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac ccagggtagc 3180
ctgattacct atagcggtgg tgcacgtggt 3210
<210> 114
<211> 1070
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-LacZ-S2CTP amino acid sequence of key ORF
<400> 114
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Thr Met
1 5 10 15
Ile Thr Asp Ser Leu Ala Val Val Leu Gln Arg Arg Asp Trp Glu Asn
20 25 30
Pro Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His Pro Pro Phe Ala
35 40 45
Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg Pro Ser Gln Gln
50 55 60
Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp Phe Pro Ala Pro
65 70 75 80
Glu Ala Val Pro Glu Ser Trp Leu Glu Cys Asp Leu Pro Glu Ala Asp
85 90 95
Thr Val Val Val Pro Ser Asn Trp Gln Met His Gly Tyr Asp Ala Pro
100 105 110
Ile Tyr Thr Asn Val Thr Tyr Pro Ile Thr Val Asn Pro Pro Phe Val
115 120 125
Pro Thr Glu Asn Pro Thr Gly Cys Tyr Ser Leu Thr Phe Asn Val Asp
130 135 140
Glu Ser Trp Leu Gln Glu Gly Gln Thr Arg Ile Ile Phe Asp Gly Val
145 150 155 160
Asn Ser Ala Phe His Leu Trp Cys Asn Gly Arg Trp Val Gly Tyr Gly
165 170 175
Gln Asp Ser Arg Leu Pro Ser Glu Phe Asp Leu Ser Ala Phe Leu Arg
180 185 190
Ala Gly Glu Asn Arg Leu Ala Val Met Val Leu Arg Trp Ser Asp Gly
195 200 205
Ser Tyr Leu Glu Asp Gln Asp Met Trp Arg Met Ser Gly Ile Phe Arg
210 215 220
Asp Val Ser Leu Leu His Lys Pro Thr Thr Gln Ile Ser Asp Phe His
225 230 235 240
Val Ala Thr Arg Phe Asn Asp Asp Phe Ser Arg Ala Val Leu Glu Ala
245 250 255
Glu Val Gln Met Cys Gly Glu Leu Arg Asp Tyr Leu Arg Val Thr Val
260 265 270
Ser Leu Trp Gln Gly Glu Thr Gln Val Ala Ser Gly Thr Ala Pro Phe
275 280 285
Gly Gly Glu Ile Ile Asp Glu Arg Gly Gly Tyr Ala Asp Arg Val Thr
290 295 300
Leu Arg Leu Asn Val Glu Asn Pro Lys Leu Trp Ser Ala Glu Ile Pro
305 310 315 320
Asn Leu Tyr Arg Ala Val Val Glu Leu His Thr Ala Asp Gly Thr Leu
325 330 335
Ile Glu Ala Glu Ala Cys Asp Val Gly Phe Arg Glu Val Arg Ile Glu
340 345 350
Asn Gly Leu Leu Leu Leu Asn Gly Lys Pro Leu Leu Ile Arg Gly Val
355 360 365
Asn Arg His Glu His His Pro Leu His Gly Gln Val Met Asp Glu Gln
370 375 380
Thr Met Val Gln Asp Ile Leu Leu Met Lys Gln Asn Asn Phe Asn Ala
385 390 395 400
Val Arg Cys Ser His Tyr Pro Asn His Pro Leu Trp Tyr Thr Leu Cys
405 410 415
Asp Arg Tyr Gly Leu Tyr Val Val Asp Glu Ala Asn Ile Glu Thr His
420 425 430
Gly Met Val Pro Met Asn Arg Leu Thr Asp Asp Pro Arg Trp Leu Pro
435 440 445
Ala Met Ser Glu Arg Val Thr Arg Met Val Gln Arg Asp Arg Asn His
450 455 460
Pro Ser Val Ile Ile Trp Ser Leu Gly Asn Glu Ser Gly His Gly Ala
465 470 475 480
Asn His Asp Ala Leu Tyr Arg Trp Ile Lys Ser Val Asp Pro Ser Arg
485 490 495
Pro Val Gln Tyr Glu Gly Gly Gly Ala Asp Thr Thr Ala Thr Asp Ile
500 505 510
Ile Cys Pro Met Tyr Ala Arg Val Asp Glu Asp Gln Pro Phe Pro Ala
515 520 525
Val Pro Lys Trp Ser Ile Lys Lys Trp Leu Ser Leu Pro Gly Glu Thr
530 535 540
Arg Pro Leu Ile Leu Cys Glu Tyr Ala His Ala Met Gly Asn Ser Leu
545 550 555 560
Gly Gly Phe Ala Lys Tyr Trp Gln Ala Phe Arg Gln Tyr Pro Arg Leu
565 570 575
Gln Gly Gly Phe Val Trp Asp Trp Val Asp Gln Ser Leu Ile Lys Tyr
580 585 590
Asp Glu Asn Gly Asn Pro Trp Ser Ala Tyr Gly Gly Asp Phe Gly Asp
595 600 605
Thr Pro Asn Asp Arg Gln Phe Cys Met Asn Gly Leu Val Phe Ala Asp
610 615 620
Arg Thr Pro His Pro Ala Leu Thr Glu Ala Lys His Gln Gln Gln Phe
625 630 635 640
Phe Gln Phe Arg Leu Ser Gly Gln Thr Ile Glu Val Thr Ser Glu Tyr
645 650 655
Leu Phe Arg His Ser Asp Asn Glu Leu Leu His Trp Met Val Ala Leu
660 665 670
Asp Gly Lys Pro Leu Ala Ser Gly Glu Val Pro Leu Asp Val Ala Pro
675 680 685
Gln Gly Lys Gln Leu Ile Glu Leu Pro Glu Leu Pro Gln Pro Glu Ser
690 695 700
Ala Gly Gln Leu Trp Leu Thr Val Arg Val Val Gln Pro Asn Ala Thr
705 710 715 720
Ala Trp Ser Glu Ala Gly His Ile Ser Ala Trp Gln Gln Trp Arg Leu
725 730 735
Ala Glu Asn Leu Ser Val Thr Leu Pro Ala Ala Ser His Ala Ile Pro
740 745 750
His Leu Thr Thr Ser Glu Met Asp Phe Cys Ile Glu Leu Gly Asn Lys
755 760 765
Arg Trp Gln Phe Asn Arg Gln Ser Gly Phe Leu Ser Gln Met Trp Ile
770 775 780
Gly Asp Lys Lys Gln Leu Leu Thr Pro Leu Arg Asp Gln Phe Thr Arg
785 790 795 800
Ala Pro Leu Asp Asn Asp Ile Gly Val Ser Glu Ala Thr Arg Ile Asp
805 810 815
Pro Asn Ala Trp Val Glu Arg Trp Lys Ala Ala Gly His Tyr Gln Ala
820 825 830
Glu Ala Ala Leu Leu Gln Cys Thr Ala Asp Thr Leu Ala Asp Ala Val
835 840 845
Leu Ile Thr Thr Ala His Ala Trp Gln His Gln Gly Lys Thr Leu Phe
850 855 860
Ile Ser Arg Lys Thr Tyr Arg Ile Asp Gly Ser Gly Gln Met Ala Ile
865 870 875 880
Thr Val Asp Val Glu Val Ala Ser Asp Thr Pro His Pro Ala Arg Ile
885 890 895
Gly Leu Asn Cys Gln Leu Ala Gln Val Ala Glu Arg Val Asn Trp Leu
900 905 910
Gly Leu Gly Pro Gln Glu Asn Tyr Pro Asp Arg Leu Thr Ala Ala Cys
915 920 925
Phe Asp Arg Trp Asp Leu Pro Leu Ser Asp Met Tyr Thr Pro Tyr Val
930 935 940
Phe Pro Ser Glu Asn Gly Leu Arg Cys Gly Thr Arg Glu Leu Asn Tyr
945 950 955 960
Gly Pro His Gln Trp Arg Gly Asp Phe Gln Phe Asn Ile Ser Arg Tyr
965 970 975
Ser Gln Gln Gln Leu Met Glu Thr Ser His Arg His Leu Leu His Ala
980 985 990
Glu Glu Gly Thr Trp Leu Asn Ile Asp Gly Phe His Met Gly Ile Gly
995 1000 1005
Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Gln Leu
1010 1015 1020
Ser Ala Gly Arg Tyr His Tyr Gln Leu Val Trp Cys Gln Lys Gly
1025 1030 1035
Ser Ser Lys Pro Glu Lys Pro Gly Ser Lys Ile Thr Gly Ser Ser
1040 1045 1050
Gly Asn Asp Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly Gly Ala
1055 1060 1065
Arg Gly
1070
<210> 115
<211> 2199
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-GPM1 plasmid
<400> 115
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctgtgatgt ctaagtaacc tttatggtat atttcttaat gtggaaagat 120
actagcgcgc gcacccacac acaagcttcg tcttttcttg aagaaaagag gaagctcgct 180
aaatgggatt ccactttccg ttccctgcca gctgatggaa aaaggttagt ggaacgatga 240
agaataaaaa gagagatcca ctgaggtgaa atttcagctg acagcgagtt tcatgatcgt 300
gatgaacaat ggtaacgagt tgtggctgtt gccagggagg gtggttctca acttttaatg 360
tatggccaaa tcgctacttg ggtttgttat ataacaaaga agaaataatg aactgattct 420
cttcctcctt cttgtccttt cttaattctg ttgtaattac cttcctttgt aatttttttt 480
gtaattattc ttcttaataa tccaaacaaa cacacatatt acaatagatg cgagacgact 540
gaccatttaa atcatacctg acctccatag cagaaagtca aaagcctccg accggaggct 600
tttgacttga tcggcacgta agaggttcca actttcacca taatgaaata agatcactac 660
cgggcgtatt ttttgagtta tcgagatttt caggagctaa ggaagctaaa atgagccata 720
ttcaacggga aacgtcttgc tcgaggccgc gattaaattc caacatggat gctgatttat 780
atgggtataa atgggctcgc gataatgtcg ggcaatcagg tgcgacaatc tatcgattgt 840
atgggaagcc cgatgcgcca gagttgtttc tgaaacatgg caaaggtagc gttgccaatg 900
atgttacaga tgagatggtc aggctaaact ggctgacgga atttatgcct cttccgacca 960
tcaagcattt tatccgtact cctgatgatg catggttact caccactgcg atcccaggga 1020
aaacagcatt ccaggtatta gaagaatatc ctgattcagg tgaaaatatt gttgatgcgc 1080
tggcagtgtt cctgcgccgg ttgcattcga ttcctgtttg taattgtcct tttaacggcg 1140
atcgcgtatt tcgtctcgct caggcgcaat cacgaatgaa taacggtttg gttggtgcga 1200
gtgattttga tgacgagcgt aatggctggc ctgttgaaca agtctggaaa gaaatgcata 1260
agcttttgcc attctcaccg gattcagtcg tcactcatgg tgatttctca cttgataacc 1320
ttatttttga cgaggggaaa ttaataggtt gtattgatgt tggacgagtc ggaatcgcag 1380
accgatacca ggatcttgcc atcctatgga actgcctcgg tgagttttct ccttcattac 1440
agaaacggct ttttcaaaaa tatggtattg ataatcctga tatgaataaa ttgcagtttc 1500
acttgatgct cgatgagttt ttctaatgag ggcccaaatg taatcacctg gctcaccttc 1560
gggtgggcct ttctgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 1620
acaaaaatcg atgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 1680
cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 1740
acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 1800
atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 1860
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 1920
acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 1980
gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg 2040
gtatctgcgc tctgctgaag ccagttacct cggaaaaaga gttggtagct cttgatccgg 2100
caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 2160
aaaaaaagga tctcaagaag atcctttgat tttctaccg 2199
<210> 116
<211> 2430
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-T1-SpyTag plasmid
<400> 116
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atggatcatg ctccagaaag atttgatgct actcctccag ctggtgaacc 120
agatagacca gctttgggtg ttttggaatt gacttctatt gctagaggta ttaccgttgc 180
tgatgctgct ttgaaaagag caccatcttt gttgttgatg tccagaccag tttcttccgg 240
taaacatttg ttgatgatga gaggtcaagt tgccgaagtt gaagaatcta tgattgctgc 300
tagagaaatt gctggtgctg gtggtggttc aggtggttct gctcatatag ttatggttga 360
tgcttacaag ccaacaaaag gtggtagtgg tggatctggt gctttgttgg atgaattgga 420
attgccatat gctcacgaac aactttggag atttttggat gctccagttg ttgcagatgc 480
ttgggaagaa gatactgaat ccgttattat cgttgaaacc gctactgttt gtgctgctat 540
tgattctgct gatgcagcct taaaaactgc tcctgttgtt ttgagagata tgagattggc 600
tattggtatt gctggtaagg ctttctttac tttgactggt gaattggctg atgttgaagc 660
tgctgctgaa gttgttagag aaagatgtgg tgctagattg ctagaattgg catgtattgc 720
aagaccagtt gacgaattga gaggtaggtt gtttttctag ccgagacgac tgaccattta 780
aatcatacct gacctccata gcagaaagtc aaaagcctcc gaccggaggc ttttgacttg 840
atcggcacgt aagaggttcc aactttcacc ataatgaaat aagatcacta ccgggcgtat 900
tttttgagtt atcgagattt tcaggagcta aggaagctaa aatgagccat attcaacggg 960
aaacgtcttg ctcgaggccg cgattaaatt ccaacatgga tgctgattta tatgggtata 1020
aatgggctcg cgataatgtc gggcaatcag gtgcgacaat ctatcgattg tatgggaagc 1080
ccgatgcgcc agagttgttt ctgaaacatg gcaaaggtag cgttgccaat gatgttacag 1140
atgagatggt caggctaaac tggctgacgg aatttatgcc tcttccgacc atcaagcatt 1200
ttatccgtac tcctgatgat gcatggttac tcaccactgc gatcccaggg aaaacagcat 1260
tccaggtatt agaagaatat cctgattcag gtgaaaatat tgttgatgcg ctggcagtgt 1320
tcctgcgccg gttgcattcg attcctgttt gtaattgtcc ttttaacggc gatcgcgtat 1380
ttcgtctcgc acaggcgcaa tcacgaatga ataacggttt ggttggtgcg agtgattttg 1440
atgacgagcg taatggctgg cctgttgaac aagtctggaa agaaatgcat aagcttttgc 1500
cattctcacc ggattcagtc gtcactcatg gtgatttctc acttgataac cttatttttg 1560
acgaggggaa attaataggt tgtattgatg ttggacgagt cggaatcgca gaccgatacc 1620
aggatcttgc catcctatgg aactgcctcg gtgagttttc tccttcatta cagaaacggc 1680
tttttcaaaa atatggtatt gataatcctg atatgaataa attgcagttt cacttgatgc 1740
tcgatgagtt tttctaatga gggcccaaat gtaatcacct ggctcacctt cgggtgggcc 1800
tttctgcgtt gctggcgttt ttccataggc tccgcccccc tgacgagcat cacaaaaatc 1860
gatgctcaag tcagaggtgg cgaaacccga caggactata aagataccag gcgtttcccc 1920
ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga tacctgtccg 1980
cctttctccc ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg tatctcagtt 2040
cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga accccccgtt cagcccgacc 2100
gctgcgcctt atccggtaac tatcgtcttg agtccaaccc ggtaagacac gacttatcgc 2160
cactggcagc agccactggt aacaggatta gcagagcgag gtatgtaggc ggtgctacag 2220
agttcttgaa gtggtggcct aactacggct acactagaag aacagtattt ggtatctgcg 2280
ctctgctgaa gccagttacc tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 2340
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 2400
atctcaagaa gatcctttga ttttctaccg 2430
<210> 117
<211> 687
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-T1-SpyTag nucleotide sequence of key ORF
<400> 117
atggatcatg ctccagaaag atttgatgct actcctccag ctggtgaacc agatagacca 60
gctttgggtg ttttggaatt gacttctatt gctagaggta ttaccgttgc tgatgctgct 120
ttgaaaagag caccatcttt gttgttgatg tccagaccag tttcttccgg taaacatttg 180
ttgatgatga gaggtcaagt tgccgaagtt gaagaatcta tgattgctgc tagagaaatt 240
gctggtgctg gtggtggttc aggtggttct gctcatatag ttatggttga tgcttacaag 300
ccaacaaaag gtggtagtgg tggatctggt gctttgttgg atgaattgga attgccatat 360
gctcacgaac aactttggag atttttggat gctccagttg ttgcagatgc ttgggaagaa 420
gatactgaat ccgttattat cgttgaaacc gctactgttt gtgctgctat tgattctgct 480
gatgcagcct taaaaactgc tcctgttgtt ttgagagata tgagattggc tattggtatt 540
gctggtaagg ctttctttac tttgactggt gaattggctg atgttgaagc tgctgctgaa 600
gttgttagag aaagatgtgg tgctagattg ctagaattgg catgtattgc aagaccagtt 660
gacgaattga gaggtaggtt gtttttc 687
<210> 118
<211> 229
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-T1-SpyTag amino acid sequence of key ORF
<400> 118
Met Asp His Ala Pro Glu Arg Phe Asp Ala Thr Pro Pro Ala Gly Glu
1 5 10 15
Pro Asp Arg Pro Ala Leu Gly Val Leu Glu Leu Thr Ser Ile Ala Arg
20 25 30
Gly Ile Thr Val Ala Asp Ala Ala Leu Lys Arg Ala Pro Ser Leu Leu
35 40 45
Leu Met Ser Arg Pro Val Ser Ser Gly Lys His Leu Leu Met Met Arg
50 55 60
Gly Gln Val Ala Glu Val Glu Glu Ser Met Ile Ala Ala Arg Glu Ile
65 70 75 80
Ala Gly Ala Gly Gly Gly Ser Gly Gly Ser Ala His Ile Val Met Val
85 90 95
Asp Ala Tyr Lys Pro Thr Lys Gly Gly Ser Gly Gly Ser Gly Ala Leu
100 105 110
Leu Asp Glu Leu Glu Leu Pro Tyr Ala His Glu Gln Leu Trp Arg Phe
115 120 125
Leu Asp Ala Pro Val Val Ala Asp Ala Trp Glu Glu Asp Thr Glu Ser
130 135 140
Val Ile Ile Val Glu Thr Ala Thr Val Cys Ala Ala Ile Asp Ser Ala
145 150 155 160
Asp Ala Ala Leu Lys Thr Ala Pro Val Val Leu Arg Asp Met Arg Leu
165 170 175
Ala Ile Gly Ile Ala Gly Lys Ala Phe Phe Thr Leu Thr Gly Glu Leu
180 185 190
Ala Asp Val Glu Ala Ala Ala Glu Val Val Arg Glu Arg Cys Gly Ala
195 200 205
Arg Leu Leu Glu Leu Ala Cys Ile Ala Arg Pro Val Asp Glu Leu Arg
210 215 220
Gly Arg Leu Phe Phe
225
<210> 119
<211> 2178
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_T-YPT31 plasmid
<400> 119
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgt agcgagatat tttgcagcag ttgcgcactt gcatgtgaat gactcttctc 120
ccctttaatt ctgtgctata tttttacaat tttctgctga catatagttt atatacatat 180
agaacgcata taggaaattg aagtaaacag aatacacaag tagaggccgg tatgtacgac 240
attttgctta ctactcttta aaatcatcgt cttcttcgtc ttcatcgtct tcttcttttt 300
caccatatcc tacatcatct ttagagcctg tgctaggttc cttcttgtct aattcttctg 360
cagtcttttt atagtcaatt actttgccgc gtgttcttct tccggatgtg atgatattag 420
aggtatcaat ttctgccaaa tcgtcctctt cttcttctcc ctcatttccc atcaatgcgt 480
ctaacttggc atcgtccata tcagacctcc gagacgactg accatttaaa tcatacctga 540
cctccatagc agaaagtcaa aagcctccga ccggaggctt ttgacttgat cggcacgtaa 600
gaggttccaa ctttcaccat aatgaaataa gatcactacc gggcgtattt tttgagttat 660
cgagattttc aggagctaag gaagctaaaa tgagccatat tcaacgggaa acgtcttgct 720
cgaggccgcg attaaattcc aacatggatg ctgatttata tgggtataaa tgggctcgcg 780
ataatgtcgg gcaatcaggt gcgacaatct atcgattgta tgggaagccc gatgcgccag 840
agttgtttct gaaacatggc aaaggtagcg ttgccaatga tgttacagat gagatggtca 900
ggctaaactg gctgacggaa tttatgcctc ttccgaccat caagcatttt atccgtactc 960
ctgatgatgc atggttactc accactgcga tcccagggaa aacagcattc caggtattag 1020
aagaatatcc tgattcaggt gaaaatattg ttgatgcgct ggcagtgttc ctgcgccggt 1080
tgcattcgat tcctgtttgt aattgtcctt ttaacggcga tcgcgtattt cgtctcgctc 1140
aggcgcaatc acgaatgaat aacggtttgg ttggtgcgag tgattttgat gacgagcgta 1200
atggctggcc tgttgaacaa gtctggaaag aaatgcataa gcttttgcca ttctcaccgg 1260
attcagtcgt cactcatggt gatttctcac ttgataacct tatttttgac gaggggaaat 1320
taataggttg tattgatgtt ggacgagtcg gaatcgcaga ccgataccag gatcttgcca 1380
tcctatggaa ctgcctcggt gagttttctc cttcattaca gaaacggctt tttcaaaaat 1440
atggtattga taatcctgat atgaataaat tgcagtttca cttgatgctc gatgagtttt 1500
tctaatgagg gcccaaatgt aatcacctgg ctcaccttcg ggtgggcctt tctgcgttgc 1560
tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga tgctcaagtc 1620
agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 1680
tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 1740
cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 1800
ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 1860
ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 1920
ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 1980
ggtggcctaa ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc 2040
cagttacctc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag 2100
cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga 2160
tcctttgatt ttctaccg 2178
<210> 120
<211> 5404
<212> DNA
<213> Artificial Sequence
<220>
<223> pGAH-YPRCd15 plasmid
<400> 120
gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc 60
ccctgacgag catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact 120
ataaagatac caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct 180
gccgcttacc ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag 240
ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca 300
cgaacccccc gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa 360
cccggtaaga cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc 420
gaggtatgta ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag 480
aaggacagta tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg 540
tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca 600
gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc 660
tgacgctcag tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag 720
gatcttcacc tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata 780
tgagtaaact tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat 840
ctgtctattt cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg 900
ggagggctta ccatctggcc ccagtgctgc aatgataccg cggctcccac gctcaccggc 960
tccagattta tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc 1020
aactttatcc gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc 1080
gccagttaat agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc 1140
gtcgtttggt atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc 1200
ccccatgttg tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa 1260
gttggccgca gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat 1320
gccatccgta agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata 1380
gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca 1440
tagcagaact ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag 1500
gatcttaccg ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc 1560
agcatctttt actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc 1620
aaaaaaggga ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata 1680
ttattgaagc atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta 1740
gaaaaataaa caaatagggg ttccgcgcac atttccccga aaagtgccac ggtctctgtc 1800
agataacgcc aggcgccttt atatcatata attaagacac aaaaggataa aacaaaggtg 1860
ttaactattc tgcatactca ctatcgtaaa ctgtcctgca aatcgtgtaa atatgtattt 1920
catttttttt gcagtgaaaa aaggcatgta aaataccgca tcaagtaact ctactccgcc 1980
tgtggtttca agactaacgg cttgagacaa aatgggaaga aatgattgca gaaaagccat 2040
atgtgtaata gcaaaaagct ggatactgct taccagatgt ttaccttaat ttcttggtga 2100
attagagaag tacagaagtt ttactattaa tcccaccata gaaatttgta taggaaagta 2160
gtttattgga gttattggat atactgtgta aactatttct tgaaattgta atcttaagat 2220
gctcttctta ttctattaaa aatagaaaat gattttcata tttatttatt tatttatatt 2280
ttggcattac tcttcatcat ttttttccct ctaagaagct tcctttcttt ttataaggat 2340
aacaaaacca aaaggaatat tgggtcagat gaatggacgc gaatgcaaga cagaagtcca 2400
aatcacgtca agacaaagaa agaaagaaag aaaaactaac acattaatgt agttttaaaa 2460
tttcaaatcc gaacaacaga gcatagggtt tcgcaaatct ctacctggct cgaagcagcg 2520
gtatttcaca ccgcatagat ccgtcgagtt caagagaaaa aaaaagaaaa agcaaaaaga 2580
aaaaaggaaa gcgcgcctcg ttcagaatga cacgtataga atgatgcatt accttgtcat 2640
cttcagtatc atactgttcg tatacatact tactgacatt cataggtata catatataca 2700
catgtatata tatcgtatgc tgcagcttta aataatcggt gtcactacat aagaacacct 2760
ttggtggagg gaacatcgtt ggtaccattg ggcgaggtgg cttctcttat ggcaaccgca 2820
agagccttga acgcactctc actacggtga tgatcattct tgcctcgcag acaatcaacg 2880
tggagggtaa ttctgctagc ctctgcaaag ctttcaagaa aatgcgggat catctcgcaa 2940
gagagatctc ctactttctc cctttgcaaa ccaagttcga caactgcgta cggcctgttc 3000
gaaagatcta ccaccgctct ggaaagtgcc tcatccaaag gcgcaaatcc tgatccaaac 3060
ctttttactc cacgcacggc ccctagggcc tctttaaaag cttgaccgag agcaatcccg 3120
cagtcttcag tggtgtgatg gtcgtctatg tgtaagtcac caatgcactc aacgattagc 3180
gaccagccgg aatgcttggc cagagcatgt atcatatggt ccagaaaccc tatacctgtg 3240
tggacgttaa tcacttgcga ttgtgtggcc tgttctgcta ctgcttctgc ctctttttct 3300
gggaagatcg agtgctctat cgctagggga ccacccttta aagagatcgc aatctgaatc 3360
ttggtttcat ttgtaatacg ctttactagg gctttctgct ctgtcatctt tgccttcgtt 3420
tatcttgcct gctcattttt tagtatattc ttcgaagaaa tcacattact ttatataatg 3480
tataattcat tatgtgataa tgccaatcgc taagaaaaaa aaagagtcat ccgctaggtg 3540
gaaaaaaaaa aatgaaaatc attaccgagg cataaaaaaa tatagagtgt actagaggag 3600
gccaagagta atagaaaaag aaaattgcgg gaaaggactg tgttatgact tccctgtgca 3660
ccacctcagg cagagaacct agagacggca atacgcaaac cgcctctccc cgcgcgttgg 3720
ccgattcatt aatgcagctg gcacgacagg tttcccgact ggaaagcggg cagtgagcgc 3780
aacgcaatta atgtgagtta gctcactcat taggcacccc aggctttaca ctttatgctt 3840
ccggctcgta tgttgtgtgg aattgtgagc ggataacaat ttcacacata ctagagaaag 3900
aggagaaata ctagatggct tcctccgaag acgttatcaa agagttcatg cgtttcaaag 3960
ttcgtatgga aggttccgtt aacggtcacg agttcgaaat cgaaggtgaa ggtgaaggtc 4020
gtccgtacga aggtacccag accgctaaac tgaaagttac caaaggtggt ccgctgccgt 4080
tcgcttggga catcctgtcc ccgcagttcc agtacggttc caaagcttac gttaaacacc 4140
cggctgacat cccggactac ctgaaactgt ccttcccgga aggtttcaaa tgggaacgtg 4200
ttatgaactt cgaagacggt ggtgttgtta ccgttaccca ggactcctcc ctgcaagacg 4260
gtgagttcat ctacaaagtt aaactgcgtg gtaccaactt cccgtccgac ggtccggtta 4320
tgcagaaaaa aaccatgggt tgggaagctt ccaccgaacg tatgtacccg gaagacggtg 4380
ctctgaaagg tgaaatcaaa atgcgtctga aactgaaaga cggtggtcac tacgacgctg 4440
aagttaaaac cacctacatg gctaaaaaac cggttcagct gccgggtgct tacaaaaccg 4500
acatcaaact ggacatcacc tcccacaacg aagactacac catcgttgaa cagtacgaac 4560
gtgctgaagg tcgtcactcc accggtgctt aataacgctg atagtgctag tgtagatcgc 4620
tactagagcc aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg cctttcgttt 4680
tatctgttgt ttgtcggtga acgctctcta ctagagtcac actggctccg tctcatgagc 4740
gcttggaagg tcgggatgag catatacaag cactaagaag aacaatacag aactctacac 4800
ggtattattg tgctacaagc tcgagtaaaa ccgagtgttt tgacgatact aacgttgtta 4860
agaaagtaac ttgttatcaa actcattacc aacttgtgat taattggtga ataatatgat 4920
aattgtcgaa attccattgt tggtaaagcc tataatatta tgtatacaga ttatactaga 4980
aattctctcg agaatataag aatccccaaa attgaatcgg tatttctaca tactaatatt 5040
accattactt ctcctttcgt tttatatgtt tcattcctat tacattatcg atctttgcat 5100
ttcagcttcc attatatttg atgtctgttt tatgtcccca cgttacaccg catgtgacag 5160
tatactagta acatgagtgc taccgaatag atgacatttt agactttcat tccaacaact 5220
tggttgacag aatgttacgt accctatatc taatctatat gaggcctgaa tctaactgaa 5280
aggtggaatt tcagtaattt atcaagcttt aataagtttg ggtagtttaa ctgtgcaaaa 5340
aggtatttac cttacatact gaatcttgtc tgtttggtag cggctgcttt atgtcggaga 5400
gacc 5404
<210> 121
<211> 6264
<212> DNA
<213> Artificial Sequence
<220>
<223> pGAH-YPRCd15-GFP-SpyCatcher plasmid
<400> 121
gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc 60
ccctgacgag catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact 120
ataaagatac caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct 180
gccgcttacc ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag 240
ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca 300
cgaacccccc gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa 360
cccggtaaga cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc 420
gaggtatgta ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag 480
aaggacagta tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg 540
tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca 600
gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc 660
tgacgctcag tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag 720
gatcttcacc tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata 780
tgagtaaact tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat 840
ctgtctattt cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg 900
ggagggctta ccatctggcc ccagtgctgc aatgataccg cggctcccac gctcaccggc 960
tccagattta tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc 1020
aactttatcc gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc 1080
gccagttaat agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc 1140
gtcgtttggt atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc 1200
ccccatgttg tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa 1260
gttggccgca gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat 1320
gccatccgta agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata 1380
gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca 1440
tagcagaact ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag 1500
gatcttaccg ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc 1560
agcatctttt actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc 1620
aaaaaaggga ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata 1680
ttattgaagc atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta 1740
gaaaaataaa caaatagggg ttccgcgcac atttccccga aaagtgccac ggtctctgtc 1800
agataacgcc aggcgccttt atatcatata attaagacac aaaaggataa aacaaaggtg 1860
ttaactattc tgcatactca ctatcgtaaa ctgtcctgca aatcgtgtaa atatgtattt 1920
catttttttt gcagtgaaaa aaggcatgta aaataccgca tcaagtaact ctactccgcc 1980
tgtggtttca agactaacgg cttgagacaa aatgggaaga aatgattgca gaaaagccat 2040
atgtgtaata gcaaaaagct ggatactgct taccagatgt ttaccttaat ttcttggtga 2100
attagagaag tacagaagtt ttactattaa tcccaccata gaaatttgta taggaaagta 2160
gtttattgga gttattggat atactgtgta aactatttct tgaaattgta atcttaagat 2220
gctcttctta ttctattaaa aatagaaaat gattttcata tttatttatt tatttatatt 2280
ttggcattac tcttcatcat ttttttccct ctaagaagct tcctttcttt ttataaggat 2340
aacaaaacca aaaggaatat tgggtcagat gaatggacgc gaatgcaaga cagaagtcca 2400
aatcacgtca agacaaagaa agaaagaaag aaaaactaac acattaatgt agttttaaaa 2460
tttcaaatcc gaacaacaga gcatagggtt tcgcaaatct ctacctggct cgaagcagcg 2520
gtatttcaca ccgcatagat ccgtcgagtt caagagaaaa aaaaagaaaa agcaaaaaga 2580
aaaaaggaaa gcgcgcctcg ttcagaatga cacgtataga atgatgcatt accttgtcat 2640
cttcagtatc atactgttcg tatacatact tactgacatt cataggtata catatataca 2700
catgtatata tatcgtatgc tgcagcttta aataatcggt gtcactacat aagaacacct 2760
ttggtggagg gaacatcgtt ggtaccattg ggcgaggtgg cttctcttat ggcaaccgca 2820
agagccttga acgcactctc actacggtga tgatcattct tgcctcgcag acaatcaacg 2880
tggagggtaa ttctgctagc ctctgcaaag ctttcaagaa aatgcgggat catctcgcaa 2940
gagagatctc ctactttctc cctttgcaaa ccaagttcga caactgcgta cggcctgttc 3000
gaaagatcta ccaccgctct ggaaagtgcc tcatccaaag gcgcaaatcc tgatccaaac 3060
ctttttactc cacgcacggc ccctagggcc tctttaaaag cttgaccgag agcaatcccg 3120
cagtcttcag tggtgtgatg gtcgtctatg tgtaagtcac caatgcactc aacgattagc 3180
gaccagccgg aatgcttggc cagagcatgt atcatatggt ccagaaaccc tatacctgtg 3240
tggacgttaa tcacttgcga ttgtgtggcc tgttctgcta ctgcttctgc ctctttttct 3300
gggaagatcg agtgctctat cgctagggga ccacccttta aagagatcgc aatctgaatc 3360
ttggtttcat ttgtaatacg ctttactagg gctttctgct ctgtcatctt tgccttcgtt 3420
tatcttgcct gctcattttt tagtatattc ttcgaagaaa tcacattact ttatataatg 3480
tataattcat tatgtgataa tgccaatcgc taagaaaaaa aaagagtcat ccgctaggtg 3540
gaaaaaaaaa aatgaaaatc attaccgagg cataaaaaaa tatagagtgt actagaggag 3600
gccaagagta atagaaaaag aaaattgcgg gaaaggactg tgttatgact tccctgtgca 3660
ccacctcagg cagagaacct ggctgtgatg tctaagtaac ctttatggta tatttcttaa 3720
tgtggaaaga tactagcgcg cgcacccaca cacaagcttc gtcttttctt gaagaaaaga 3780
ggaagctcgc taaatgggat tccactttcc gttccctgcc agctgatgga aaaaggttag 3840
tggaacgatg aagaataaaa agagagatcc actgaggtga aatttcagct gacagcgagt 3900
ttcatgatcg tgatgaacaa tggtaacgag ttgtggctgt tgccagggag ggtggttctc 3960
aacttttaat gtatggccaa atcgctactt gggtttgtta tataacaaag aagaaataat 4020
gaactgattc tcttcctcct tcttgtcctt tcttaattct gttgtaatta ccttcctttg 4080
taattttttt tgtaattatt cttcttaata atccaaacaa acacacatat tacaatagat 4140
gggttcttct catcatcacc atcaccattc ttctgggatg tctaaaggtg aagaattatt 4200
cactggtgtt gtcccaattt tggttgaatt agatggtgat gttaatggtc acaaattttc 4260
tgtctccggt gaaggtgaag gtgatgctac ttacggtaaa ttgaccttaa aatttatttg 4320
tactactggt aaattgccag ttccatggcc aaccttagtc actactttaa cttatggtgt 4380
tcaatgtttt tctagatacc cagatcatat gaaacaacat gactttttca agtctgccat 4440
gccagaaggt tatgttcaag aaagaactat ttttttcaaa gatgacggta actacaagac 4500
cagagctgaa gtcaagtttg aaggtgatac cttagttaat agaatcgaat taaaaggtat 4560
tgattttaaa gaagatggta acattttagg tcacaaattg gaatacaact ataactctca 4620
caatgtttac atcatggctg acaaacaaaa gaatggtatc aaagttaact tcaaaattag 4680
acacaacatt gaagatggtt ctgttcaatt agctgaccat tatcaacaaa atactccaat 4740
tggtgatggt ccagtcttgt taccagacaa ccattactta tccactcaat ctaaattatc 4800
caaagatcca aacgaaaaga gagatcacat ggtcttgtta gaatttgtta ctgctgctgg 4860
tattacccat ggtatggatg aattgtacaa aggttctggt ggttctgatt ctgctactca 4920
tattaagttc tccaagaggg acgaagatgg taaagaattg gctggtgcaa ctatggaatt 4980
gagagattct tctggtaaga ccatttccac ctggatttct gatggtcaag ttaaggattt 5040
ctacttgtac ccaggtaagt acactttcgt tgaaactgct gctccagatg gttatgaagt 5100
tgctactgct attactttca ccgtcaatga acaaggtcaa gtcactgtta atggttagcg 5160
agatattttg cagcagttgc gcacttgcat gtgaatgact cttctcccct ttaattctgt 5220
gctatatttt tacaattttc tgctgacata tagtttatat acatatagaa cgcatatagg 5280
aaattgaagt aaacagaata cacaagtaga ggccggtatg tacgacattt tgcttactac 5340
tctttaaaat catcgtcttc ttcgtcttca tcgtcttctt ctttttcacc atatcctaca 5400
tcatctttag agcctgtgct aggttccttc ttgtctaatt cttctgcagt ctttttatag 5460
tcaattactt tgccgcgtgt tcttcttccg gatgtgatga tattagaggt atcaatttct 5520
gccaaatcgt cctcttcttc ttctccctca tttcccatca atgcgtctaa cttggcatcg 5580
tccatatcag acctctgagc gcttggaagg tcgggatgag catatacaag cactaagaag 5640
aacaatacag aactctacac ggtattattg tgctacaagc tcgagtaaaa ccgagtgttt 5700
tgacgatact aacgttgtta agaaagtaac ttgttatcaa actcattacc aacttgtgat 5760
taattggtga ataatatgat aattgtcgaa attccattgt tggtaaagcc tataatatta 5820
tgtatacaga ttatactaga aattctctcg agaatataag aatccccaaa attgaatcgg 5880
tatttctaca tactaatatt accattactt ctcctttcgt tttatatgtt tcattcctat 5940
tacattatcg atctttgcat ttcagcttcc attatatttg atgtctgttt tatgtcccca 6000
cgttacaccg catgtgacag tatactagta acatgagtgc taccgaatag atgacatttt 6060
agactttcat tccaacaact tggttgacag aatgttacgt accctatatc taatctatat 6120
gaggcctgaa tctaactgaa aggtggaatt tcagtaattt atcaagcttt aataagtttg 6180
ggtagtttaa ctgtgcaaaa aggtatttac cttacatact gaatcttgtc tgtttggtag 6240
cggctgcttt atgtcggaga gacc 6264
Claims (24)
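- A method for producing a bacterial microcompartment virus-like particle (VLP) carrying a cargo molecule, the method comprising:
A) introducing into a host cell or organism one or more heterologous polynucleotides comprising (i) a first sequence encoding bacterial microcompartment shell protomers; and
(ii) a second sequence encoding a cargo molecule fused to an encapsulation peptide, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO: 1 (SKITGSSGNDTQGSLITYSGGARG) or SEQ ID NO: 94 (KPEKPGSKITGSSGNDTQGSLITYSGGARG), or a functional variant thereof;
a) expressing the first and second sequences; and
b) forming microcompartments that encapsulate the cargo molecule; or
B) introducing into a host cell or organism one or more polynucleotides comprising (i) a first sequence encoding bacterial microcompartment shell protomers; and
(ii) a second sequence encoding at least one of said protomers fused to a cargo molecule or a biochemical tag;
a) expressing the first and second sequences; and
b) forming microcompartments that express the cargo molecule on the outer surface, or
c) forming microcompartments that express the biochemical tag on the outer surface, to which a cargo molecule comprising a complementary tag can bind.
- The method of claim 1, wherein the functional variant of the encapsulation peptide set forth in SEQ ID NO: 1 comprises, at its amino terminus, 1, 2, 3, 4 or 5 of the additional amino acids at the amino terminus of SEQ ID NO: 94, such a variant being an intermediate between the sequences of SEQ ID NO: 1 and SEQ ID NO: 94.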
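- The method of claim 1 or 2, wherein the bacterial microcompartment protomers are derived from Halothiobacillus neapolitanus or Haliangium ochraceum.
- The method of claim 3, wherein the bacterial microcompartment protomers are CsoS1A (SEQ ID NO: 2) and CsoS4A (SEQ ID NO: 3) of Halothiobacillus neapolitanus; or HO-H (SEQ ID NO: 4), HO-P (SEQ ID NO: 5) and HO-T1 (SEQ ID NO: 6) of Haliangium ochraceum, and variants thereof.
- The method of any one of claims 1 to 4, wherein the cargo molecule is at least one peptide, such as an enzyme and/or a fluorescent protein and/or an immunogenic peptide.
- The method of any one of claims 1 to 5, wherein the biochemical tag may be selected from the group comprising Strep-Tag II (SII), a SpyCatcher/SpyTag (SC/ST) pair and a CC-Di-A/B (CCA/CCB) pair.
- The method of any one of claims 4 to 6, wherein expression of CsoS1A is regulated by the promoter PT7; CsoS4A is regulated by the promoter PCON5; HO-H is regulated by the yeast promoter PTDH3; HO-P is regulated by the yeast promoter PPYK1; and HO-T1 is regulated by the yeast promoter PYEF3.
- The method of any one of claims 1 to 7, wherein the host organism is E. coli or S. cerevisiae.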
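- An engineered bacterial microcompartment VLP carrying a cargo molecule, comprising:
i) bacterial microcompartment shell protomers, and a cargo molecule fused to an encapsulation peptide, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO: 1 or SEQ ID NO: 94, or a functional variant thereof; or
ii) bacterial microcompartment shell protomers and a cargo molecule, wherein the cargo molecule is fused to a terminus of at least one of said protomers, or at least one of said protomers is fused to a tag and a cargo molecule comprising a complementary tag is bound to the outer surface of the VLP.
- The engineered VLP of claim 9, wherein the functional variant of the encapsulation peptide set forth in SEQ ID NO: 1 comprises, at its amino terminus, 1, 2, 3, 4 or 5 of the additional amino acids at the amino terminus of SEQ ID NO: 94, such a variant being an intermediate between the sequences of SEQ ID NO: 1 and SEQ ID NO: 94.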
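- The engineered VLP of claim 9 or 10, wherein the bacterial microcompartment protomers are derived from Halothiobacillus neapolitanus or Haliangium ochraceum.
- The engineered VLP of claim 11, wherein the bacterial microcompartment protomers are CsoS1A comprising the amino acid sequence set forth in SEQ ID NO: 2 and CsoS4A comprising the amino acid sequence set forth in SEQ ID NO: 3, from Halothiobacillus neapolitanus; or HO-H comprising the amino acid sequence set forth in SEQ ID NO: 4, HO-P comprising the amino acid sequence set forth in SEQ ID NO: 5 and HO-T1 comprising the amino acid sequence set forth in SEQ ID NO: 6, from Haliangium ochraceum, and variants thereof.
- The engineered VLP of any one of claims 9 to 12, wherein the cargo molecule is at least one peptide, such as an enzyme and/or a fluorescent protein and/or an immunogenic peptide.
- The engineered VLP of any one of claims 9 to 13, wherein the biochemical tag is selected from the group comprising Strep-Tag II (SII), a SpyCatcher/SpyTag (SC/ST) pair and a CC-Di-A/B (CCA/CCB) pair.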
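- An isolated plasmid or vector nucleic acid comprising:
a) first DNA sequences encoding bacterial microcompartment shell protomers, each operably linked to a promoter, and
b) a second DNA sequence, operably linked to a promoter, encoding a cargo molecule fused to an encapsulation peptide, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO: 1 or SEQ ID NO: 94, or a functional variant thereof; or
c) first DNA sequences encoding bacterial microcompartment shell protomers, each operably linked to a promoter, and
d) a second DNA sequence encoding at least one of said protomers fused to a cargo molecule or a biochemical tag.
- The isolated plasmid or vector of claim 13, wherein the functional variant of the encapsulation peptide set forth in SEQ ID NO: 1 comprises, at its amino terminus, 1, 2, 3, 4 or 5 of the additional amino acids at the amino terminus of SEQ ID NO: 94, such a variant being an intermediate between the sequences of SEQ ID NO: 1 and SEQ ID NO: 94.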
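- The isolated plasmid or vector of claim 15 or 16, wherein the bacterial microcompartment shell protomers, promoters, cargo molecule and tag are as defined in any one of claims 1 to 7.
- The isolated plasmid or vector of claim 17, wherein the DNA sequences encoding the bacterial microcompartment shell protomers, cargo molecule and tag have at least 70%, at least 80%, at least 90% or 100% identity to SEQ ID NOs: 7 to 12 and 95, owing to the redundancy of the genetic code.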
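- A composition or combination comprising at least one engineered VLP of any one of claims 1 to 14, for use in:
a) the prevention or treatment of a disease in a subject; or
b) a biochemical process.
- The composition or combination of claim 19, wherein the at least one engineered VLP comprises an enzyme for the conversion of a prodrug.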
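- The combination of claim 19 or 20, comprising one or more additional therapeutic agents.
- The composition or combination of claim 19 or 21, which is a vaccine.
- Use of at least one engineered VLP of any one of claims 9 to 14 in the manufacture of a medicament for the prevention or treatment of a disease in a subject.
- A method of prevention or treatment, comprising administering to a subject in need of such treatment an effective amount of the engineered VLP of any one of claims 9 to 14.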
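The functional variants recited in claims 2, 10 and 16 sit between SEQ ID NO: 1 and SEQ ID NO: 94, which differ only by a six-residue amino-terminal extension (KPEKPG). The following Python sketch enumerates such intermediates. It rests on an assumption that the claims do not state explicitly: that the 1 to 5 added residues are the ones immediately preceding SEQ ID NO: 1 within SEQ ID NO: 94, taken contiguously. The function name `intermediate_variants` is hypothetical and used for illustration only.

```python
# Encapsulation peptide sequences as recited in claim 1.
SEQ_ID_NO_1 = "SKITGSSGNDTQGSLITYSGGARG"
SEQ_ID_NO_94 = "KPEKPGSKITGSSGNDTQGSLITYSGGARG"


def intermediate_variants(short: str, long: str) -> list[str]:
    """Return peptides extending `short` by 1..(n-1) N-terminal residues of `long`.

    Assumption (not stated in the claims): the added residues are the ones
    immediately preceding `short` within `long`, taken as a contiguous run.
    """
    if not long.endswith(short):
        raise ValueError("long sequence must end with the short sequence")
    # N-terminal extension present in `long` but absent from `short`.
    extension = long[: len(long) - len(short)]
    # k = 1 .. len(extension) - 1 gives the strict intermediates.
    return [extension[-k:] + short for k in range(1, len(extension))]


variants = intermediate_variants(SEQ_ID_NO_1, SEQ_ID_NO_94)
for v in variants:
    print(len(v), v)
```

Under this reading there are five intermediates, from 25 to 29 residues long. Other readings of the claim language (for example, non-contiguous selections from the extension) would yield a larger set.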
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG10202010547W | 2020-10-23 | ||
SG10202010547W | 2020-10-23 | ||
PCT/SG2021/050639 WO2022086450A1 (en) | 2020-10-23 | 2021-10-21 | Bacterial microcompartment virus-like particles |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20230088911A (ko) | 2023-06-20 |
Family
ID=81291750
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020237016963A KR20230088911A (ko) | Bacterial microcompartment virus-like particles | 2020-10-23 | 2021-10-21 |
Country Status (8)
Country | Link |
---|---|
US (1) | US20230302124A1 (ko) |
EP (1) | EP4232585A1 (ko) |
JP (1) | JP2023546670A (ko) |
KR (1) | KR20230088911A (ko) |
CN (1) | CN116615550A (ko) |
AU (1) | AU2021364272A1 (ko) |
CA (1) | CA3196412A1 (ko) |
WO (1) | WO2022086450A1 (ko) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2011094765A2 (en) * | 2010-02-01 | 2011-08-04 | The Regents Of The University Of California | A targeting signal for integrating proteins, peptides and biological molecules into bacterial microcompartments |
EP2574620A1 (en) * | 2011-09-28 | 2013-04-03 | University College Cork | Accumulation of metabolic products in bacterial microcompartments |
CN110438095B (zh) * | 2019-07-31 | 2023-09-05 | 中国科学院武汉病毒研究所 | Synthesis and application of a novel nano-reaction vessel |
2021
- 2021-10-21 CA CA3196412A patent/CA3196412A1/en active Pending
- 2021-10-21 WO PCT/SG2021/050639 patent/WO2022086450A1/en active Application Filing
- 2021-10-21 JP JP2023524512A patent/JP2023546670A/ja active Pending
- 2021-10-21 KR KR1020237016963A patent/KR20230088911A/ko unknown
- 2021-10-21 AU AU2021364272A patent/AU2021364272A1/en active Pending
- 2021-10-21 CN CN202180080176.0A patent/CN116615550A/zh active Pending
- 2021-10-21 US US18/033,282 patent/US20230302124A1/en active Pending
- 2021-10-21 EP EP21883435.6A patent/EP4232585A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
CA3196412A1 (en) | 2022-04-28 |
WO2022086450A1 (en) | 2022-04-28 |
JP2023546670A (ja) | 2023-11-07 |
CN116615550A (zh) | 2023-08-18 |
AU2021364272A1 (en) | 2023-06-08 |
US20230302124A1 (en) | 2023-09-28 |
EP4232585A1 (en) | 2023-08-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2021203937B2 (en) | | Compositions and methods for rapid and dynamic flux control using synthetic metabolic valves |
KR102370675B1 (ko) | | Improved methods for modification of target nucleic acids |
AU2016203445B2 (en) | | Integration of a polynucleotide encoding a polypeptide that catalyzes pyruvate to acetolactate conversion |
AU2023270322A1 (en) | | Compositions and methods for modifying genomes |
KR102006527B1 (ko) | | Vectors for expression of prostate-associated antigens |
KR20190122647A (ko) | | Systems and methods for biocontrol of plant pathogens |
KR20140099224A (ko) | | Keto-isovalerate decarboxylase enzymes and methods of use thereof |
DK2931918T5 (en) | | PROCEDURE FOR IDENTIFYING A CELL WITH INCREASED CONCENTRATION OF A PARTICULAR METABOLITE COMPARED TO THE SIMILAR WILD TYPE CELL ..... |
DK2443248T3 (en) | | IMPROVEMENT OF LONG-CHAIN POLYUNSATURATED OMEGA-3 AND OMEGA-6 FATTY ACID BIOSYNTHESIS BY EXPRESSION OF ACYL-CoA LYSOPHOSPHOLIPID ACYLTRANSFERASES |
KR20140092759A (ko) | | Host cells and methods for production of isobutanol |
KR20210080375A (ko) | | Recombinant poxviruses for cancer immunotherapy |
KR20220012327A (ko) | | Methods and cells for production of phytocannabinoids and phytocannabinoid precursors |
KR20210096629A (ko) | | Novel promoter sequences and methods thereof for enhanced protein production in Bacillus cells |
CN107630029B (zh) | | Candida utilis episomal expression vector and construction method and application thereof |
CN109996874A (zh) | | Heterologous production of 10-methylstearic acid |
KR20210148270A (ko) | | Methods for integrating polynucleotides into the Bacillus genome using dual circular recombinant DNA constructs and compositions thereof |
KR20220002910A (ko) | | Triple helix terminator for efficient RNA trans-splicing |
KR20210148269A (ko) | | Methods for integrating a donor DNA sequence into the Bacillus genome using linear recombinant DNA constructs and compositions thereof |
CN115927299A (zh) | | Methods and compositions for increasing double-stranded RNA production |
CN107002070A (zh) | | Co-expression plasmids |
KR20080030956A (ko) | | Treatment of disease using an improved regulated expression system |
KR20230088911A (ko) | | Bacterial microcompartment virus-like particles |
CN115209909A (zh) | | Delivery compositions and methods |
CN112553240A (zh) | | Recombinant expression vector system, recombinant engineered bacterium, and preparation method and use thereof |
CN115243701A (zh) | | IgG variants for adjuvant-free induction of immune responses |