CN116615550A - Bacterial microcompartment virus-like particles - Google Patents
Bacterial microcompartment virus-like particles
- Publication number
- CN116615550A (application number CN202180080176.0A)
- Authority
- CN
- China
- Prior art keywords
- seq
- gly
- val
- bmc
- cso
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
- 230000001580 bacterial effect Effects 0.000 title claims abstract description 45
- 239000002245 particle Substances 0.000 title claims abstract description 20
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 50
- 238000000034 method Methods 0.000 claims abstract description 36
- 241000605118 Thiobacillus Species 0.000 claims abstract description 24
- 235000013557 nattō Nutrition 0.000 claims abstract description 24
- 241000204031 Mycoplasma Species 0.000 claims abstract description 19
- 244000052769 pathogen Species 0.000 claims abstract description 19
- 239000003150 biochemical marker Substances 0.000 claims abstract description 14
- 238000004519 manufacturing process Methods 0.000 claims abstract description 7
- 241000607142 Salmonella Species 0.000 claims abstract description 6
- 108091033319 polynucleotide Proteins 0.000 claims abstract description 6
- 102000040430 polynucleotide Human genes 0.000 claims abstract description 6
- 239000002157 polynucleotide Substances 0.000 claims abstract description 6
- 239000013612 plasmid Substances 0.000 claims description 113
- 108010028230 Trp-Ser-His-Pro-Gln-Phe-Glu-Lys Proteins 0.000 claims description 68
- 102000004190 Enzymes Human genes 0.000 claims description 65
- 108090000790 Enzymes Proteins 0.000 claims description 65
- 240000004808 Saccharomyces cerevisiae Species 0.000 claims description 63
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 claims description 63
- 150000001413 amino acids Chemical class 0.000 claims description 43
- 238000005538 encapsulation Methods 0.000 claims description 32
- 241000588724 Escherichia coli Species 0.000 claims description 28
- 239000000203 mixture Substances 0.000 claims description 25
- 230000014509 gene expression Effects 0.000 claims description 21
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 13
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 claims description 12
- 230000002068 genetic effect Effects 0.000 claims description 11
- 150000007523 nucleic acids Chemical class 0.000 claims description 11
- 238000011282 treatment Methods 0.000 claims description 9
- 108091006047 fluorescent proteins Proteins 0.000 claims description 8
- 102000034287 fluorescent proteins Human genes 0.000 claims description 8
- 230000001717 pathogenic effect Effects 0.000 claims description 8
- 230000001276 controlling effect Effects 0.000 claims description 7
- 108020004707 nucleic acids Proteins 0.000 claims description 7
- 102000039446 nucleic acids Human genes 0.000 claims description 7
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims description 6
- 101100118148 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) YEF3 gene Proteins 0.000 claims description 5
- 230000000295 complement effect Effects 0.000 claims description 5
- 239000003550 marker Substances 0.000 claims description 5
- 101150085381 CDC19 gene Proteins 0.000 claims description 4
- 101100234604 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) ace-8 gene Proteins 0.000 claims description 4
- 101150093629 PYK1 gene Proteins 0.000 claims description 4
- 201000010099 disease Diseases 0.000 claims description 4
- 239000003814 drug Substances 0.000 claims description 4
- 230000002163 immunogen Effects 0.000 claims description 4
- 229960005486 vaccine Drugs 0.000 claims description 4
- 230000003851 biochemical process Effects 0.000 claims description 3
- 101150090418 csoS4A gene Proteins 0.000 claims description 3
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 2
- 229940002612 prodrug Drugs 0.000 claims description 2
- 239000000651 prodrug Substances 0.000 claims description 2
- 238000011321 prophylaxis Methods 0.000 claims description 2
- 230000001105 regulatory effect Effects 0.000 claims description 2
- 229940124597 therapeutic agent Drugs 0.000 claims description 2
- 125000003275 alpha amino acid group Chemical group 0.000 claims 8
- 229910052698 phosphorus Inorganic materials 0.000 abstract description 4
- 108020004414 DNA Proteins 0.000 description 100
- 108090000623 proteins and genes Proteins 0.000 description 96
- 102000004169 proteins and genes Human genes 0.000 description 82
- 229940088598 enzyme Drugs 0.000 description 62
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 51
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 39
- 108700026244 Open Reading Frames Proteins 0.000 description 38
- 230000037361 pathway Effects 0.000 description 37
- 239000002773 nucleotide Substances 0.000 description 32
- 125000003729 nucleotide group Chemical group 0.000 description 32
- 239000011780 sodium chloride Substances 0.000 description 25
- 229940024606 amino acid Drugs 0.000 description 20
- 239000000523 sample Substances 0.000 description 20
- 230000000694 effects Effects 0.000 description 19
- 239000000872 buffer Substances 0.000 description 18
- 238000000746 purification Methods 0.000 description 17
- 238000005349 anion exchange Methods 0.000 description 16
- 210000004027 cell Anatomy 0.000 description 16
- 108010050848 glycylleucine Proteins 0.000 description 14
- 238000004458 analytical method Methods 0.000 description 13
- 239000006166 lysate Substances 0.000 description 13
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 12
- 230000015572 biosynthetic process Effects 0.000 description 12
- LHGVFZTZFXWLCP-UHFFFAOYSA-N guaiacol Chemical compound COC1=CC=CC=C1O LHGVFZTZFXWLCP-UHFFFAOYSA-N 0.000 description 12
- 238000005259 measurement Methods 0.000 description 12
- 238000005755 formation reaction Methods 0.000 description 11
- 239000013615 primer Substances 0.000 description 11
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 10
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 10
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 10
- 108010011939 Pyruvate Decarboxylase Proteins 0.000 description 10
- 108010005233 alanylglutamic acid Proteins 0.000 description 10
- 108010057821 leucylproline Proteins 0.000 description 10
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 9
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 9
- 239000013256 coordination polymer Substances 0.000 description 9
- 238000002296 dynamic light scattering Methods 0.000 description 9
- 239000002609 medium Substances 0.000 description 9
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 9
- UAJAYRMZGNQILN-BQBZGAKWSA-N Ser-Gly-Met Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O UAJAYRMZGNQILN-BQBZGAKWSA-N 0.000 description 8
- CYCGARJWIQWPQM-YJRXYDGGSA-N Thr-Tyr-Ser Chemical compound C[C@@H](O)[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CO)C([O-])=O)CC1=CC=C(O)C=C1 CYCGARJWIQWPQM-YJRXYDGGSA-N 0.000 description 8
- 108010047857 aspartylglycine Proteins 0.000 description 8
- 238000006243 chemical reaction Methods 0.000 description 8
- 230000006870 function Effects 0.000 description 8
- 108010015792 glycyllysine Proteins 0.000 description 8
- 108010081551 glycylphenylalanine Proteins 0.000 description 8
- QPDUVFSVVAOUHE-XVKPBYJWSA-N Gly-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)CN)C(O)=O QPDUVFSVVAOUHE-XVKPBYJWSA-N 0.000 description 7
- 108010065920 Insulin Lispro Proteins 0.000 description 7
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 7
- CNGOEHJCLVCJHN-SRVKXCTJSA-N Lys-Pro-Glu Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O CNGOEHJCLVCJHN-SRVKXCTJSA-N 0.000 description 7
- HEUVHBXOVZONPU-BJDJZHNGSA-N Ser-Leu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HEUVHBXOVZONPU-BJDJZHNGSA-N 0.000 description 7
- 238000003556 assay Methods 0.000 description 7
- 238000004891 communication Methods 0.000 description 7
- 108010078144 glutaminyl-glycine Proteins 0.000 description 7
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 7
- 230000003993 interaction Effects 0.000 description 7
- 239000000047 product Substances 0.000 description 7
- 230000008685 targeting Effects 0.000 description 7
- 108010061238 threonyl-glycine Proteins 0.000 description 7
- OMMDTNGURYRDAC-NRPADANISA-N Ala-Glu-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OMMDTNGURYRDAC-NRPADANISA-N 0.000 description 6
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 6
- BZKDJRSZWLPJNI-SRVKXCTJSA-N His-His-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O BZKDJRSZWLPJNI-SRVKXCTJSA-N 0.000 description 6
- LRALLISKBZNSKN-BQBZGAKWSA-N Met-Gly-Ser Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LRALLISKBZNSKN-BQBZGAKWSA-N 0.000 description 6
- IOVBCLGAJJXOHK-SRVKXCTJSA-N Ser-His-His Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 IOVBCLGAJJXOHK-SRVKXCTJSA-N 0.000 description 6
- LRWBCWGEUCKDTN-BJDJZHNGSA-N Ser-Lys-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LRWBCWGEUCKDTN-BJDJZHNGSA-N 0.000 description 6
- 238000003917 TEM image Methods 0.000 description 6
- 108010077245 asparaginyl-proline Proteins 0.000 description 6
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 6
- 108010092854 aspartyllysine Proteins 0.000 description 6
- 210000004899 c-terminal region Anatomy 0.000 description 6
- 238000010276 construction Methods 0.000 description 6
- 239000013078 crystal Substances 0.000 description 6
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 6
- 229960001867 guaiacol Drugs 0.000 description 6
- 108010028295 histidylhistidine Proteins 0.000 description 6
- 230000010354 integration Effects 0.000 description 6
- 239000007788 liquid Substances 0.000 description 6
- 238000001228 spectrum Methods 0.000 description 6
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 6
- 239000013638 trimer Substances 0.000 description 6
- 108010080629 tryptophan-leucine Proteins 0.000 description 6
- 238000001262 western blot Methods 0.000 description 6
- KUWPCJHYPSUOFW-YBXAARCKSA-N 2-nitrophenyl beta-D-galactoside Chemical compound O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@H]1OC1=CC=CC=C1[N+]([O-])=O KUWPCJHYPSUOFW-YBXAARCKSA-N 0.000 description 5
- 101710139569 Bacterial microcompartment shell protein EutM Proteins 0.000 description 5
- 101710091770 Bacterial microcompartment shell protein PduA Proteins 0.000 description 5
- 101710093019 Bacterial microcompartment shell protein PduU Proteins 0.000 description 5
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 5
- 241000282326 Felis catus Species 0.000 description 5
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 5
- 102000003960 Ligases Human genes 0.000 description 5
- 108090000364 Ligases Proteins 0.000 description 5
- PDIDTSZKKFEDMB-UWVGGRQHSA-N Lys-Pro-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O PDIDTSZKKFEDMB-UWVGGRQHSA-N 0.000 description 5
- GODBLDDYHFTUAH-CIUDSAMLSA-N Met-Asp-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O GODBLDDYHFTUAH-CIUDSAMLSA-N 0.000 description 5
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 5
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 5
- UHBPFYOQQPFKQR-JHEQGTHGSA-N Thr-Gln-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O UHBPFYOQQPFKQR-JHEQGTHGSA-N 0.000 description 5
- DJDSEDOKJTZBAR-ZDLURKLDSA-N Thr-Gly-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O DJDSEDOKJTZBAR-ZDLURKLDSA-N 0.000 description 5
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 5
- 108010087924 alanylproline Proteins 0.000 description 5
- 108010060035 arginylproline Proteins 0.000 description 5
- 229940098773 bovine serum albumin Drugs 0.000 description 5
- 210000000234 capsid Anatomy 0.000 description 5
- 210000000349 chromosome Anatomy 0.000 description 5
- 238000010828 elution Methods 0.000 description 5
- 230000004927 fusion Effects 0.000 description 5
- 239000008103 glucose Substances 0.000 description 5
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 5
- 108010040030 histidinoalanine Proteins 0.000 description 5
- 229930027917 kanamycin Natural products 0.000 description 5
- 229960000318 kanamycin Drugs 0.000 description 5
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 5
- 229930182823 kanamycin A Natural products 0.000 description 5
- 108010005942 methionylglycine Proteins 0.000 description 5
- 239000000178 monomer Substances 0.000 description 5
- 238000002360 preparation method Methods 0.000 description 5
- 102000004196 processed proteins & peptides Human genes 0.000 description 5
- 238000012552 review Methods 0.000 description 5
- 239000000243 solution Substances 0.000 description 5
- 239000002904 solvent Substances 0.000 description 5
- 238000004627 transmission electron microscopy Methods 0.000 description 5
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 4
- WQVFQXXBNHHPLX-ZKWXMUAHSA-N Ala-Ala-His Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O WQVFQXXBNHHPLX-ZKWXMUAHSA-N 0.000 description 4
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 4
- NHCPCLJZRSIDHS-ZLUOBGJFSA-N Ala-Asp-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O NHCPCLJZRSIDHS-ZLUOBGJFSA-N 0.000 description 4
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 4
- IEAUDUOCWNPZBR-LKTVYLICSA-N Ala-Trp-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N IEAUDUOCWNPZBR-LKTVYLICSA-N 0.000 description 4
- AGVNTAUPLWIQEN-ZPFDUUQYSA-N Arg-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AGVNTAUPLWIQEN-ZPFDUUQYSA-N 0.000 description 4
- MFFOYNGMOYFPBD-DCAQKATOSA-N Asn-Arg-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O MFFOYNGMOYFPBD-DCAQKATOSA-N 0.000 description 4
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 4
- 108700038233 Bacterial microcompartment shell protein EutL Proteins 0.000 description 4
- 108700038232 Bacterial microcompartment shell protein PduB Proteins 0.000 description 4
- 108090000565 Capsid Proteins Proteins 0.000 description 4
- 102100023321 Ceruloplasmin Human genes 0.000 description 4
- CRRFJBGUGNNOCS-PEFMBERDSA-N Gln-Asp-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CRRFJBGUGNNOCS-PEFMBERDSA-N 0.000 description 4
- NSEKYCAADBNQFE-XIRDDKMYSA-N Gln-Trp-Arg Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(N)=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 NSEKYCAADBNQFE-XIRDDKMYSA-N 0.000 description 4
- UGSVSNXPJJDJKL-SDDRHHMPSA-N Glu-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N UGSVSNXPJJDJKL-SDDRHHMPSA-N 0.000 description 4
- AIJAPFVDBFYNKN-WHFBIAKZSA-N Gly-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN)C(=O)N AIJAPFVDBFYNKN-WHFBIAKZSA-N 0.000 description 4
- NPSWCZIRBAYNSB-JHEQGTHGSA-N Gly-Gln-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NPSWCZIRBAYNSB-JHEQGTHGSA-N 0.000 description 4
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 4
- 101000896205 Haliangium ochraceum (strain DSM 14365 / JCM 11303 / SMP-2) Bacterial microcompartment shell vertex protein Proteins 0.000 description 4
- 241000880493 Leptailurus serval Species 0.000 description 4
- KYIIALJHAOIAHF-KKUMJFAQSA-N Leu-Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KYIIALJHAOIAHF-KKUMJFAQSA-N 0.000 description 4
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 4
- LOGFVTREOLYCPF-RHYQMDGZSA-N Lys-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN LOGFVTREOLYCPF-RHYQMDGZSA-N 0.000 description 4
- VHGIWFGJIHTASW-FXQIFTODSA-N Met-Ala-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O VHGIWFGJIHTASW-FXQIFTODSA-N 0.000 description 4
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 4
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 4
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 4
- JLLJTMHNXQTMCK-UBHSHLNASA-N Phe-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 JLLJTMHNXQTMCK-UBHSHLNASA-N 0.000 description 4
- NJJBATPLUQHRBM-IHRRRGAJSA-N Phe-Pro-Ser Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CO)C(=O)O NJJBATPLUQHRBM-IHRRRGAJSA-N 0.000 description 4
- UVKNEILZSJMKSR-FXQIFTODSA-N Pro-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 UVKNEILZSJMKSR-FXQIFTODSA-N 0.000 description 4
- NAIPAPCKKRCMBL-JYJNAYRXSA-N Pro-Pro-Phe Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1N(CCC1)C(=O)[C@H]1NCCC1)C1=CC=CC=C1 NAIPAPCKKRCMBL-JYJNAYRXSA-N 0.000 description 4
- PRKWBYCXBBSLSK-GUBZILKMSA-N Pro-Ser-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O PRKWBYCXBBSLSK-GUBZILKMSA-N 0.000 description 4
- 101001010478 Rhodospirillum rubrum (strain F11) Bacterial microcompartment shell vertex protein GrpN Proteins 0.000 description 4
- 101100415710 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) RPL41B gene Proteins 0.000 description 4
- 101100090299 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) RPL42B gene Proteins 0.000 description 4
- 101100415709 Schizosaccharomyces pombe (strain 972 / ATCC 24843) rpl4102 gene Proteins 0.000 description 4
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 4
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 4
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 4
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 4
- SUEGAFMNTXXNLR-WFBYXXMGSA-N Trp-Ser-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O SUEGAFMNTXXNLR-WFBYXXMGSA-N 0.000 description 4
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 4
- UDNYEPLJTRDMEJ-RCOVLWMOSA-N Val-Asn-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N UDNYEPLJTRDMEJ-RCOVLWMOSA-N 0.000 description 4
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 4
- XWYUBUYQMOUFRQ-IFFSRLJSSA-N Val-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N)O XWYUBUYQMOUFRQ-IFFSRLJSSA-N 0.000 description 4
- 238000001261 affinity purification Methods 0.000 description 4
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 4
- 108010041407 alanylaspartic acid Proteins 0.000 description 4
- 108010070944 alanylhistidine Proteins 0.000 description 4
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 4
- 108010038633 aspartylglutamate Proteins 0.000 description 4
- 229910052799 carbon Inorganic materials 0.000 description 4
- 230000003197 catalytic effect Effects 0.000 description 4
- 238000010367 cloning Methods 0.000 description 4
- 150000001875 compounds Chemical class 0.000 description 4
- 238000004925 denaturation Methods 0.000 description 4
- 230000036425 denaturation Effects 0.000 description 4
- 238000009826 distribution Methods 0.000 description 4
- 238000009472 formulation Methods 0.000 description 4
- 230000008014 freezing Effects 0.000 description 4
- 238000007710 freezing Methods 0.000 description 4
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 4
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 4
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 4
- 108010038983 glycyl-histidyl-lysine Proteins 0.000 description 4
- 108010020688 glycylhistidine Proteins 0.000 description 4
- 108010087823 glycyltyrosine Proteins 0.000 description 4
- 108010025306 histidylleucine Proteins 0.000 description 4
- 108010085325 histidylproline Proteins 0.000 description 4
- 238000003780 insertion Methods 0.000 description 4
- 230000037431 insertion Effects 0.000 description 4
- 108010034529 leucyl-lysine Proteins 0.000 description 4
- 108010017391 lysylvaline Proteins 0.000 description 4
- 238000012986 modification Methods 0.000 description 4
- 230000004048 modification Effects 0.000 description 4
- 108010024654 phenylalanyl-prolyl-alanine Proteins 0.000 description 4
- YBYRMVIVWMBXKQ-UHFFFAOYSA-N phenylmethanesulfonyl fluoride Chemical compound FS(=O)(=O)CC1=CC=CC=C1 YBYRMVIVWMBXKQ-UHFFFAOYSA-N 0.000 description 4
- 108010070643 prolylglutamic acid Proteins 0.000 description 4
- 108010015796 prolylisoleucine Proteins 0.000 description 4
- 238000011002 quantification Methods 0.000 description 4
- 238000011160 research Methods 0.000 description 4
- 108091008146 restriction endonucleases Proteins 0.000 description 4
- 230000002441 reversible effect Effects 0.000 description 4
- 108010048818 seryl-histidine Proteins 0.000 description 4
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 4
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 4
- 241000894007 species Species 0.000 description 4
- 239000000758 substrate Substances 0.000 description 4
- 230000002103 transcriptional effect Effects 0.000 description 4
- 108010020532 tyrosyl-proline Proteins 0.000 description 4
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 4
- 102100023415 40S ribosomal protein S20 Human genes 0.000 description 3
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 3
- VWEWCZSUWOEEFM-WDSKDSINSA-N Ala-Gly-Ala-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(=O)NCC(O)=O VWEWCZSUWOEEFM-WDSKDSINSA-N 0.000 description 3
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 3
- IORKCNUBHNIMKY-CIUDSAMLSA-N Ala-Pro-Glu Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O IORKCNUBHNIMKY-CIUDSAMLSA-N 0.000 description 3
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 3
- QOIGKCBMXUCDQU-KDXUFGMBSA-N Ala-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N)O QOIGKCBMXUCDQU-KDXUFGMBSA-N 0.000 description 3
- VKKYFICVTYKFIO-CIUDSAMLSA-N Arg-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N VKKYFICVTYKFIO-CIUDSAMLSA-N 0.000 description 3
- FKQITMVNILRUCQ-IHRRRGAJSA-N Arg-Phe-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O FKQITMVNILRUCQ-IHRRRGAJSA-N 0.000 description 3
- VUGWHBXPMAHEGZ-SRVKXCTJSA-N Arg-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N VUGWHBXPMAHEGZ-SRVKXCTJSA-N 0.000 description 3
- GQRDIVQPSMPQME-ZPFDUUQYSA-N Asn-Ile-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O GQRDIVQPSMPQME-ZPFDUUQYSA-N 0.000 description 3
- QOCFFCUFZGDHTP-NUMRIWBASA-N Asp-Thr-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QOCFFCUFZGDHTP-NUMRIWBASA-N 0.000 description 3
- 241000192700 Cyanobacteria Species 0.000 description 3
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 3
- WSFSSNUMVMOOMR-UHFFFAOYSA-N Formaldehyde Chemical compound O=C WSFSSNUMVMOOMR-UHFFFAOYSA-N 0.000 description 3
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 3
- PXXGVUVQWQGGIG-YUMQZZPRSA-N Glu-Gly-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N PXXGVUVQWQGGIG-YUMQZZPRSA-N 0.000 description 3
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 3
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 3
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 3
- XTQFHTHIAKKCTM-YFKPBYRVSA-N Gly-Glu-Gly Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O XTQFHTHIAKKCTM-YFKPBYRVSA-N 0.000 description 3
- MVORZMQFXBLMHM-QWRGUYRKSA-N Gly-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 MVORZMQFXBLMHM-QWRGUYRKSA-N 0.000 description 3
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 3
- BMWFDYIYBAFROD-WPRPVWTQSA-N Gly-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN BMWFDYIYBAFROD-WPRPVWTQSA-N 0.000 description 3
- WNGHUXFWEWTKAO-YUMQZZPRSA-N Gly-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN WNGHUXFWEWTKAO-YUMQZZPRSA-N 0.000 description 3
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 3
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 3
- 108010093488 His-His-His-His-His-His Proteins 0.000 description 3
- 101001114932 Homo sapiens 40S ribosomal protein S20 Proteins 0.000 description 3
- GVKKVHNRTUFCCE-BJDJZHNGSA-N Ile-Leu-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)O)N GVKKVHNRTUFCCE-BJDJZHNGSA-N 0.000 description 3
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 3
- KSZCCRIGNVSHFH-UWVGGRQHSA-N Leu-Arg-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O KSZCCRIGNVSHFH-UWVGGRQHSA-N 0.000 description 3
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 3
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 3
- FZMNAYBEFGZEIF-AVGNSLFASA-N Leu-Met-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCSC)C(=O)O)N FZMNAYBEFGZEIF-AVGNSLFASA-N 0.000 description 3
- HDHQQEDVWQGBEE-DCAQKATOSA-N Leu-Met-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O HDHQQEDVWQGBEE-DCAQKATOSA-N 0.000 description 3
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 3
- ISHNZELVUVPCHY-ZETCQYMHSA-N Lys-Gly-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O ISHNZELVUVPCHY-ZETCQYMHSA-N 0.000 description 3
- OWRUUFUVXFREBD-KKUMJFAQSA-N Lys-His-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O OWRUUFUVXFREBD-KKUMJFAQSA-N 0.000 description 3
- JHNOXVASMSXSNB-WEDXCCLWSA-N Lys-Thr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JHNOXVASMSXSNB-WEDXCCLWSA-N 0.000 description 3
- DRINJBAHUGXNFC-DCAQKATOSA-N Met-Asp-His Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(O)=O DRINJBAHUGXNFC-DCAQKATOSA-N 0.000 description 3
- FZUNSVYYPYJYAP-NAKRPEOUSA-N Met-Ile-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O FZUNSVYYPYJYAP-NAKRPEOUSA-N 0.000 description 3
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 3
- NEHSHYOUIWBYSA-DCPHZVHLSA-N Phe-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=CC=C3)N NEHSHYOUIWBYSA-DCPHZVHLSA-N 0.000 description 3
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 3
- IFMDQWDAJUMMJC-DCAQKATOSA-N Pro-Ala-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O IFMDQWDAJUMMJC-DCAQKATOSA-N 0.000 description 3
- JARJPEMLQAWNBR-GUBZILKMSA-N Pro-Asp-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JARJPEMLQAWNBR-GUBZILKMSA-N 0.000 description 3
- NMELOOXSGDRBRU-YUMQZZPRSA-N Pro-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1 NMELOOXSGDRBRU-YUMQZZPRSA-N 0.000 description 3
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 3
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 3
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 3
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 3
- LVHHEVGYAZGXDE-KDXUFGMBSA-N Thr-Ala-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(=O)O)N)O LVHHEVGYAZGXDE-KDXUFGMBSA-N 0.000 description 3
- YQYFYUSYEDNLSD-YEPSODPASA-N Val-Thr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O YQYFYUSYEDNLSD-YEPSODPASA-N 0.000 description 3
- 238000002835 absorbance Methods 0.000 description 3
- 238000013019 agitation Methods 0.000 description 3
- 108010047495 alanylglycine Proteins 0.000 description 3
- 108010008355 arginyl-glutamine Proteins 0.000 description 3
- 108010093581 aspartyl-proline Proteins 0.000 description 3
- 239000003153 chemical reaction reagent Substances 0.000 description 3
- 230000002759 chromosomal effect Effects 0.000 description 3
- 230000008045 co-localization Effects 0.000 description 3
- 210000000805 cytoplasm Anatomy 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 238000001493 electron microscopy Methods 0.000 description 3
- 239000000499 gel Substances 0.000 description 3
- 108010049041 glutamylalanine Proteins 0.000 description 3
- 108010079547 glutamylmethionine Proteins 0.000 description 3
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 3
- 108010010096 glycyl-glycyl-tyrosine Proteins 0.000 description 3
- 108010089804 glycyl-threonine Proteins 0.000 description 3
- 108010077515 glycylproline Proteins 0.000 description 3
- 238000003119 immunoblot Methods 0.000 description 3
- 238000011534 incubation Methods 0.000 description 3
- 230000003834 intracellular effect Effects 0.000 description 3
- 230000007246 mechanism Effects 0.000 description 3
- 230000001404 mediated effect Effects 0.000 description 3
- 238000010369 molecular cloning Methods 0.000 description 3
- 230000003287 optical effect Effects 0.000 description 3
- 239000008194 pharmaceutical composition Substances 0.000 description 3
- 108010093296 prolyl-prolyl-alanine Proteins 0.000 description 3
- 108010053725 prolylvaline Proteins 0.000 description 3
- 238000003259 recombinant expression Methods 0.000 description 3
- 229920006395 saturated elastomer Polymers 0.000 description 3
- 238000002864 sequence alignment Methods 0.000 description 3
- 108010071207 serylmethionine Proteins 0.000 description 3
- 230000006641 stabilisation Effects 0.000 description 3
- 238000011105 stabilization Methods 0.000 description 3
- 229960005322 streptomycin Drugs 0.000 description 3
- 239000000126 substance Substances 0.000 description 3
- 238000013518 transcription Methods 0.000 description 3
- 230000035897 transcription Effects 0.000 description 3
- 108010051110 tyrosyl-lysine Proteins 0.000 description 3
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 2
- QFVHZQCOUORWEI-UHFFFAOYSA-N 4-[(4-anilino-5-sulfonaphthalen-1-yl)diazenyl]-5-hydroxynaphthalene-2,7-disulfonic acid Chemical compound C=12C(O)=CC(S(O)(=O)=O)=CC2=CC(S(O)(=O)=O)=CC=1N=NC(C1=CC=CC(=C11)S(O)(=O)=O)=CC=C1NC1=CC=CC=C1 QFVHZQCOUORWEI-UHFFFAOYSA-N 0.000 description 2
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 2
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N Ala-Ala-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 2
- STACJSVFHSEZJV-GHCJXIJMSA-N Ala-Asn-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STACJSVFHSEZJV-GHCJXIJMSA-N 0.000 description 2
- LZRNYBIJOSKKRJ-XVYDVKMFSA-N Ala-Asp-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LZRNYBIJOSKKRJ-XVYDVKMFSA-N 0.000 description 2
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 2
- LSLIRHLIUDVNBN-CIUDSAMLSA-N Ala-Asp-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LSLIRHLIUDVNBN-CIUDSAMLSA-N 0.000 description 2
- KUDREHRZRIVKHS-UWJYBYFXSA-N Ala-Asp-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KUDREHRZRIVKHS-UWJYBYFXSA-N 0.000 description 2
- IKKVASZHTMKJIR-ZKWXMUAHSA-N Ala-Asp-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IKKVASZHTMKJIR-ZKWXMUAHSA-N 0.000 description 2
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 2
- NWVVKQZOVSTDBQ-CIUDSAMLSA-N Ala-Glu-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NWVVKQZOVSTDBQ-CIUDSAMLSA-N 0.000 description 2
- NJPMYXWVWQWCSR-ACZMJKKPSA-N Ala-Glu-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NJPMYXWVWQWCSR-ACZMJKKPSA-N 0.000 description 2
- HXNNRBHASOSVPG-GUBZILKMSA-N Ala-Glu-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HXNNRBHASOSVPG-GUBZILKMSA-N 0.000 description 2
- WMYJZJRILUVVRG-WDSKDSINSA-N Ala-Gly-Gln Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O WMYJZJRILUVVRG-WDSKDSINSA-N 0.000 description 2
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 2
- JEPNLGMEZMCFEX-QSFUFRPTSA-N Ala-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C)N JEPNLGMEZMCFEX-QSFUFRPTSA-N 0.000 description 2
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 2
- LXAARTARZJJCMB-CIQUZCHMSA-N Ala-Ile-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LXAARTARZJJCMB-CIQUZCHMSA-N 0.000 description 2
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 2
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 2
- OQWQTGBOFPJOIF-DLOVCJGASA-N Ala-Lys-His Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N OQWQTGBOFPJOIF-DLOVCJGASA-N 0.000 description 2
- KQESEZXHYOUIIM-CQDKDKBSSA-N Ala-Lys-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KQESEZXHYOUIIM-CQDKDKBSSA-N 0.000 description 2
- OMDNCNKNEGFOMM-BQBZGAKWSA-N Ala-Met-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O OMDNCNKNEGFOMM-BQBZGAKWSA-N 0.000 description 2
- DEWWPUNXRNGMQN-LPEHRKFASA-N Ala-Met-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N DEWWPUNXRNGMQN-LPEHRKFASA-N 0.000 description 2
- HYIDEIQUCBKIPL-CQDKDKBSSA-N Ala-Phe-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N HYIDEIQUCBKIPL-CQDKDKBSSA-N 0.000 description 2
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 2
- ADSGHMXEAZJJNF-DCAQKATOSA-N Ala-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N ADSGHMXEAZJJNF-DCAQKATOSA-N 0.000 description 2
- RMAWDDRDTRSZIR-ZLUOBGJFSA-N Ala-Ser-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RMAWDDRDTRSZIR-ZLUOBGJFSA-N 0.000 description 2
- YNOCMHZSWJMGBB-GCJQMDKQSA-N Ala-Thr-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O YNOCMHZSWJMGBB-GCJQMDKQSA-N 0.000 description 2
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 2
- TVUFMYKTYXTRPY-HERUPUMHSA-N Ala-Trp-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O TVUFMYKTYXTRPY-HERUPUMHSA-N 0.000 description 2
- IYKVSFNGSWTTNZ-GUBZILKMSA-N Ala-Val-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IYKVSFNGSWTTNZ-GUBZILKMSA-N 0.000 description 2
- YJHKTAMKPGFJCT-NRPADANISA-N Ala-Val-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O YJHKTAMKPGFJCT-NRPADANISA-N 0.000 description 2
- XKHLBBQNPSOGPI-GUBZILKMSA-N Ala-Val-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N XKHLBBQNPSOGPI-GUBZILKMSA-N 0.000 description 2
- DXQIQUIQYAGRCC-CIUDSAMLSA-N Arg-Asp-Gln Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)CN=C(N)N DXQIQUIQYAGRCC-CIUDSAMLSA-N 0.000 description 2
- HKRXJBBCQBAGIM-FXQIFTODSA-N Arg-Asp-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N)CN=C(N)N HKRXJBBCQBAGIM-FXQIFTODSA-N 0.000 description 2
- YUGFLWBWAJFGKY-BQBZGAKWSA-N Arg-Cys-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O YUGFLWBWAJFGKY-BQBZGAKWSA-N 0.000 description 2
- XLWSGICNBZGYTA-CIUDSAMLSA-N Arg-Glu-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XLWSGICNBZGYTA-CIUDSAMLSA-N 0.000 description 2
- OHYQKYUTLIPFOX-ZPFDUUQYSA-N Arg-Glu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OHYQKYUTLIPFOX-ZPFDUUQYSA-N 0.000 description 2
- XUUXCWCKKCZEAW-YFKPBYRVSA-N Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N XUUXCWCKKCZEAW-YFKPBYRVSA-N 0.000 description 2
- YKBHOXLMMPZPHQ-GMOBBJLQSA-N Arg-Ile-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O YKBHOXLMMPZPHQ-GMOBBJLQSA-N 0.000 description 2
- FFEUXEAKYRCACT-PEDHHIEDSA-N Arg-Ile-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCNC(N)=N)[C@@H](C)CC)C(O)=O FFEUXEAKYRCACT-PEDHHIEDSA-N 0.000 description 2
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 2
- NMRHDSAOIURTNT-RWMBFGLXSA-N Arg-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NMRHDSAOIURTNT-RWMBFGLXSA-N 0.000 description 2
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 2
- JEOCWTUOMKEEMF-RHYQMDGZSA-N Arg-Leu-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JEOCWTUOMKEEMF-RHYQMDGZSA-N 0.000 description 2
- JCROZIFVIYMXHM-GUBZILKMSA-N Arg-Met-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CCCN=C(N)N JCROZIFVIYMXHM-GUBZILKMSA-N 0.000 description 2
- NGYHSXDNNOFHNE-AVGNSLFASA-N Arg-Pro-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O NGYHSXDNNOFHNE-AVGNSLFASA-N 0.000 description 2
- YNSUUAOAFCVINY-OSUNSFLBSA-N Arg-Thr-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YNSUUAOAFCVINY-OSUNSFLBSA-N 0.000 description 2
- ZJBUILVYSXQNSW-YTWAJWBKSA-N Arg-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ZJBUILVYSXQNSW-YTWAJWBKSA-N 0.000 description 2
- ZUVMUOOHJYNJPP-XIRDDKMYSA-N Arg-Trp-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZUVMUOOHJYNJPP-XIRDDKMYSA-N 0.000 description 2
- YHZQOSXDTFRZKU-WDSOQIARSA-N Arg-Trp-Leu Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N)=CNC2=C1 YHZQOSXDTFRZKU-WDSOQIARSA-N 0.000 description 2
- JBQORRNSZGTLCV-WDSOQIARSA-N Arg-Trp-Lys Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N)=CNC2=C1 JBQORRNSZGTLCV-WDSOQIARSA-N 0.000 description 2
- XRLOBFSLPCHYLQ-ULQDDVLXSA-N Arg-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O XRLOBFSLPCHYLQ-ULQDDVLXSA-N 0.000 description 2
- QCTOLCVIGRLMQS-HRCADAONSA-N Arg-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O QCTOLCVIGRLMQS-HRCADAONSA-N 0.000 description 2
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 2
- LXTGAOAXPSJWOU-DCAQKATOSA-N Asn-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N LXTGAOAXPSJWOU-DCAQKATOSA-N 0.000 description 2
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 2
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 2
- QYXNFROWLZPWPC-FXQIFTODSA-N Asn-Glu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QYXNFROWLZPWPC-FXQIFTODSA-N 0.000 description 2
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 2
- UBKOVSLDWIHYSY-ACZMJKKPSA-N Asn-Glu-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UBKOVSLDWIHYSY-ACZMJKKPSA-N 0.000 description 2
- GWNMUVANAWDZTI-YUMQZZPRSA-N Asn-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N GWNMUVANAWDZTI-YUMQZZPRSA-N 0.000 description 2
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 2
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 2
- XLHLPYFMXGOASD-CIUDSAMLSA-N Asn-His-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N XLHLPYFMXGOASD-CIUDSAMLSA-N 0.000 description 2
- WQLJRNRLHWJIRW-KKUMJFAQSA-N Asn-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N)O WQLJRNRLHWJIRW-KKUMJFAQSA-N 0.000 description 2
- ANPFQTJEPONRPL-UGYAYLCHSA-N Asn-Ile-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O ANPFQTJEPONRPL-UGYAYLCHSA-N 0.000 description 2
- XVBDDUPJVQXDSI-PEFMBERDSA-N Asn-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVBDDUPJVQXDSI-PEFMBERDSA-N 0.000 description 2
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 2
- IDUUACUJKUXKKD-VEVYYDQMSA-N Asn-Pro-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O IDUUACUJKUXKKD-VEVYYDQMSA-N 0.000 description 2
- XTMZYFMTYJNABC-ZLUOBGJFSA-N Asn-Ser-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N XTMZYFMTYJNABC-ZLUOBGJFSA-N 0.000 description 2
- GZXOUBTUAUAVHD-ACZMJKKPSA-N Asn-Ser-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GZXOUBTUAUAVHD-ACZMJKKPSA-N 0.000 description 2
- DOURAOODTFJRIC-CIUDSAMLSA-N Asn-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N DOURAOODTFJRIC-CIUDSAMLSA-N 0.000 description 2
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 2
- TZQWZQSMHDVLQL-QEJZJMRPSA-N Asn-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N TZQWZQSMHDVLQL-QEJZJMRPSA-N 0.000 description 2
- RTFXPCYMDYBZNQ-SRVKXCTJSA-N Asn-Tyr-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O RTFXPCYMDYBZNQ-SRVKXCTJSA-N 0.000 description 2
- QNNBHTFDFFFHGC-KKUMJFAQSA-N Asn-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QNNBHTFDFFFHGC-KKUMJFAQSA-N 0.000 description 2
- DXHINQUXBZNUCF-MELADBBJSA-N Asn-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC(=O)N)N)C(=O)O DXHINQUXBZNUCF-MELADBBJSA-N 0.000 description 2
- XOQYDFCQPWAMSA-KKHAAJSZSA-N Asn-Val-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOQYDFCQPWAMSA-KKHAAJSZSA-N 0.000 description 2
- KVMPVNGOKHTUHZ-GCJQMDKQSA-N Asp-Ala-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KVMPVNGOKHTUHZ-GCJQMDKQSA-N 0.000 description 2
- BLQBMRNMBAYREH-UWJYBYFXSA-N Asp-Ala-Tyr Chemical compound N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O BLQBMRNMBAYREH-UWJYBYFXSA-N 0.000 description 2
- SOYOSFXLXYZNRG-CIUDSAMLSA-N Asp-Arg-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O SOYOSFXLXYZNRG-CIUDSAMLSA-N 0.000 description 2
- IXIWEFWRKIUMQX-DCAQKATOSA-N Asp-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(O)=O IXIWEFWRKIUMQX-DCAQKATOSA-N 0.000 description 2
- ILJQISGMGXRZQQ-IHRRRGAJSA-N Asp-Arg-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ILJQISGMGXRZQQ-IHRRRGAJSA-N 0.000 description 2
- XYBJLTKSGFBLCS-QXEWZRGKSA-N Asp-Arg-Val Chemical compound NC(N)=NCCC[C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CC(O)=O XYBJLTKSGFBLCS-QXEWZRGKSA-N 0.000 description 2
- QRULNKJGYQQZMW-ZLUOBGJFSA-N Asp-Asn-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QRULNKJGYQQZMW-ZLUOBGJFSA-N 0.000 description 2
- FANQWNCPNFEPGZ-WHFBIAKZSA-N Asp-Asp-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FANQWNCPNFEPGZ-WHFBIAKZSA-N 0.000 description 2
- LKIYSIYBKYLKPU-BIIVOSGPSA-N Asp-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O LKIYSIYBKYLKPU-BIIVOSGPSA-N 0.000 description 2
- CSEJMKNZDCJYGJ-XHNCKOQMSA-N Asp-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N)C(=O)O CSEJMKNZDCJYGJ-XHNCKOQMSA-N 0.000 description 2
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 2
- PDECQIHABNQRHN-GUBZILKMSA-N Asp-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(O)=O PDECQIHABNQRHN-GUBZILKMSA-N 0.000 description 2
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 2
- HAFCJCDJGIOYPW-WDSKDSINSA-N Asp-Gly-Gln Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O HAFCJCDJGIOYPW-WDSKDSINSA-N 0.000 description 2
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 2
- POTCZYQVVNXUIG-BQBZGAKWSA-N Asp-Gly-Pro Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O POTCZYQVVNXUIG-BQBZGAKWSA-N 0.000 description 2
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 2
- PGUYEUCYVNZGGV-QWRGUYRKSA-N Asp-Gly-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PGUYEUCYVNZGGV-QWRGUYRKSA-N 0.000 description 2
- LNENWJXDHCFVOF-DCAQKATOSA-N Asp-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N LNENWJXDHCFVOF-DCAQKATOSA-N 0.000 description 2
- SWTQDYFZVOJVLL-KKUMJFAQSA-N Asp-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)O)N)O SWTQDYFZVOJVLL-KKUMJFAQSA-N 0.000 description 2
- QNMKWNONJGKJJC-NHCYSSNCSA-N Asp-Leu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O QNMKWNONJGKJJC-NHCYSSNCSA-N 0.000 description 2
- LIVXPXUVXFRWNY-CIUDSAMLSA-N Asp-Lys-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O LIVXPXUVXFRWNY-CIUDSAMLSA-N 0.000 description 2
- RNAQPBOOJRDICC-BPUTZDHNSA-N Asp-Met-Trp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(=O)O)N RNAQPBOOJRDICC-BPUTZDHNSA-N 0.000 description 2
- WQSXAPPYLGNMQL-IHRRRGAJSA-N Asp-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N WQSXAPPYLGNMQL-IHRRRGAJSA-N 0.000 description 2
- YRZIYQGXTSBRLT-AVGNSLFASA-N Asp-Phe-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YRZIYQGXTSBRLT-AVGNSLFASA-N 0.000 description 2
- QJHOOKBAHRJPPX-QWRGUYRKSA-N Asp-Phe-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 QJHOOKBAHRJPPX-QWRGUYRKSA-N 0.000 description 2
- UCHSVZYJKJLPHF-BZSNNMDCSA-N Asp-Phe-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O UCHSVZYJKJLPHF-BZSNNMDCSA-N 0.000 description 2
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 2
- WMLFFCRUSPNENW-ZLUOBGJFSA-N Asp-Ser-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O WMLFFCRUSPNENW-ZLUOBGJFSA-N 0.000 description 2
- GWWSUMLEWKQHLR-NUMRIWBASA-N Asp-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O GWWSUMLEWKQHLR-NUMRIWBASA-N 0.000 description 2
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 2
- PLNJUJGNLDSFOP-UWJYBYFXSA-N Asp-Tyr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PLNJUJGNLDSFOP-UWJYBYFXSA-N 0.000 description 2
- NWAHPBGBDIFUFD-KKUMJFAQSA-N Asp-Tyr-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O NWAHPBGBDIFUFD-KKUMJFAQSA-N 0.000 description 2
- XWKBWZXGNXTDKY-ZKWXMUAHSA-N Asp-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O XWKBWZXGNXTDKY-ZKWXMUAHSA-N 0.000 description 2
- XWKPSMRPIKKDDU-RCOVLWMOSA-N Asp-Val-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O XWKPSMRPIKKDDU-RCOVLWMOSA-N 0.000 description 2
- QOJJMJKTMKNFEF-ZKWXMUAHSA-N Asp-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O QOJJMJKTMKNFEF-ZKWXMUAHSA-N 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 2
- 102000013392 Carboxylesterase Human genes 0.000 description 2
- 108010051152 Carboxylesterase Proteins 0.000 description 2
- PJWWRFATQTVXHA-UHFFFAOYSA-N Cyclohexylaminopropanesulfonic acid Chemical compound OS(=O)(=O)CCCNC1CCCCC1 PJWWRFATQTVXHA-UHFFFAOYSA-N 0.000 description 2
- TVYMKYUSZSVOAG-ZLUOBGJFSA-N Cys-Ala-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O TVYMKYUSZSVOAG-ZLUOBGJFSA-N 0.000 description 2
- PKNIZMPLMSKROD-BIIVOSGPSA-N Cys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N PKNIZMPLMSKROD-BIIVOSGPSA-N 0.000 description 2
- XABFFGOGKOORCG-CIUDSAMLSA-N Cys-Asp-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XABFFGOGKOORCG-CIUDSAMLSA-N 0.000 description 2
- YZKOXEJTLWZOQL-GUBZILKMSA-N Cys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CS)N YZKOXEJTLWZOQL-GUBZILKMSA-N 0.000 description 2
- OTXLNICGSXPGQF-KBIXCLLPSA-N Cys-Ile-Glu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTXLNICGSXPGQF-KBIXCLLPSA-N 0.000 description 2
- CHRCKSPMGYDLIA-SRVKXCTJSA-N Cys-Phe-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O CHRCKSPMGYDLIA-SRVKXCTJSA-N 0.000 description 2
- NAPULYCVEVVFRB-HEIBUPTGSA-N Cys-Thr-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)CS NAPULYCVEVVFRB-HEIBUPTGSA-N 0.000 description 2
- YAHZABJORDUQGO-NQXXGFSBSA-N D-ribulose 1,5-bisphosphate Chemical compound OP(=O)(O)OC[C@@H](O)[C@@H](O)C(=O)COP(O)(O)=O YAHZABJORDUQGO-NQXXGFSBSA-N 0.000 description 2
- 108010090461 DFG peptide Proteins 0.000 description 2
- 102000053602 DNA Human genes 0.000 description 2
- 101150081655 GPM1 gene Proteins 0.000 description 2
- MWLYSLMKFXWZPW-ZPFDUUQYSA-N Gln-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCC(N)=O MWLYSLMKFXWZPW-ZPFDUUQYSA-N 0.000 description 2
- SOBBAYVQSNXYPQ-ACZMJKKPSA-N Gln-Asn-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SOBBAYVQSNXYPQ-ACZMJKKPSA-N 0.000 description 2
- CYTSBCIIEHUPDU-ACZMJKKPSA-N Gln-Asp-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O CYTSBCIIEHUPDU-ACZMJKKPSA-N 0.000 description 2
- WLODHVXYKYHLJD-ACZMJKKPSA-N Gln-Asp-Ser Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N WLODHVXYKYHLJD-ACZMJKKPSA-N 0.000 description 2
- ALUBSZXSNSPDQV-WDSKDSINSA-N Gln-Cys-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O ALUBSZXSNSPDQV-WDSKDSINSA-N 0.000 description 2
- CITDWMLWXNUQKD-FXQIFTODSA-N Gln-Gln-Asn Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CITDWMLWXNUQKD-FXQIFTODSA-N 0.000 description 2
- PKVWNYGXMNWJSI-CIUDSAMLSA-N Gln-Gln-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O PKVWNYGXMNWJSI-CIUDSAMLSA-N 0.000 description 2
- JEFZIKRIDLHOIF-BYPYZUCNSA-N Gln-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(O)=O JEFZIKRIDLHOIF-BYPYZUCNSA-N 0.000 description 2
- MFJAPSYJQJCQDN-BQBZGAKWSA-N Gln-Gly-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O MFJAPSYJQJCQDN-BQBZGAKWSA-N 0.000 description 2
- XSBGUANSZDGULP-IUCAKERBSA-N Gln-Gly-Lys Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O XSBGUANSZDGULP-IUCAKERBSA-N 0.000 description 2
- LTXLIIZACMCQTO-GUBZILKMSA-N Gln-His-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N LTXLIIZACMCQTO-GUBZILKMSA-N 0.000 description 2
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 2
- KHNJVFYHIKLUPD-SRVKXCTJSA-N Gln-Leu-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N KHNJVFYHIKLUPD-SRVKXCTJSA-N 0.000 description 2
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 2
- SXGMGNZEHFORAV-IUCAKERBSA-N Gln-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N SXGMGNZEHFORAV-IUCAKERBSA-N 0.000 description 2
- FKXCBKCOSVIGCT-AVGNSLFASA-N Gln-Lys-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FKXCBKCOSVIGCT-AVGNSLFASA-N 0.000 description 2
- KLKYKPXITJBSNI-CIUDSAMLSA-N Gln-Met-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O KLKYKPXITJBSNI-CIUDSAMLSA-N 0.000 description 2
- CULXMOZETKLBDI-XIRDDKMYSA-N Gln-Met-Trp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCC(=O)N)N CULXMOZETKLBDI-XIRDDKMYSA-N 0.000 description 2
- HMIXCETWRYDVMO-GUBZILKMSA-N Gln-Pro-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O HMIXCETWRYDVMO-GUBZILKMSA-N 0.000 description 2
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 2
- KPNWAJMEMRCLAL-GUBZILKMSA-N Gln-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N KPNWAJMEMRCLAL-GUBZILKMSA-N 0.000 description 2
- RUFHOVYUYSNDNY-ACZMJKKPSA-N Glu-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O RUFHOVYUYSNDNY-ACZMJKKPSA-N 0.000 description 2
- FHPXTPQBODWBIY-CIUDSAMLSA-N Glu-Ala-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHPXTPQBODWBIY-CIUDSAMLSA-N 0.000 description 2
- UTKICHUQEQBDGC-ACZMJKKPSA-N Glu-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N UTKICHUQEQBDGC-ACZMJKKPSA-N 0.000 description 2
- UTKUTMJSWKKHEM-WDSKDSINSA-N Glu-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O UTKUTMJSWKKHEM-WDSKDSINSA-N 0.000 description 2
- BPDVTFBJZNBHEU-HGNGGELXSA-N Glu-Ala-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 BPDVTFBJZNBHEU-HGNGGELXSA-N 0.000 description 2
- NCWOMXABNYEPLY-NRPADANISA-N Glu-Ala-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O NCWOMXABNYEPLY-NRPADANISA-N 0.000 description 2
- NLKVNZUFDPWPNL-YUMQZZPRSA-N Glu-Arg-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O NLKVNZUFDPWPNL-YUMQZZPRSA-N 0.000 description 2
- SRZLHYPAOXBBSB-HJGDQZAQSA-N Glu-Arg-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SRZLHYPAOXBBSB-HJGDQZAQSA-N 0.000 description 2
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 2
- RDPOETHPAQEGDP-ACZMJKKPSA-N Glu-Asp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RDPOETHPAQEGDP-ACZMJKKPSA-N 0.000 description 2
- VAIWPXWHWAPYDF-FXQIFTODSA-N Glu-Asp-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O VAIWPXWHWAPYDF-FXQIFTODSA-N 0.000 description 2
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 2
- MTAOBYXRYJZRGQ-WDSKDSINSA-N Glu-Gly-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MTAOBYXRYJZRGQ-WDSKDSINSA-N 0.000 description 2
- OGNJZUXUTPQVBR-BQBZGAKWSA-N Glu-Gly-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OGNJZUXUTPQVBR-BQBZGAKWSA-N 0.000 description 2
- OPAINBJQDQTGJY-JGVFFNPUSA-N Glu-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)O)N)C(=O)O OPAINBJQDQTGJY-JGVFFNPUSA-N 0.000 description 2
- XMPAXPSENRSOSV-RYUDHWBXSA-N Glu-Gly-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XMPAXPSENRSOSV-RYUDHWBXSA-N 0.000 description 2
- XOFYVODYSNKPDK-AVGNSLFASA-N Glu-His-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XOFYVODYSNKPDK-AVGNSLFASA-N 0.000 description 2
- GXMXPCXXKVWOSM-KQXIARHKSA-N Glu-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N GXMXPCXXKVWOSM-KQXIARHKSA-N 0.000 description 2
- VSRCAOIHMGCIJK-SRVKXCTJSA-N Glu-Leu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VSRCAOIHMGCIJK-SRVKXCTJSA-N 0.000 description 2
- VMKCPNBBPGGQBJ-GUBZILKMSA-N Glu-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N VMKCPNBBPGGQBJ-GUBZILKMSA-N 0.000 description 2
- ATVYZJGOZLVXDK-IUCAKERBSA-N Glu-Leu-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O ATVYZJGOZLVXDK-IUCAKERBSA-N 0.000 description 2
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 2
- SJJHXJDSNQJMMW-SRVKXCTJSA-N Glu-Lys-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O SJJHXJDSNQJMMW-SRVKXCTJSA-N 0.000 description 2
- CBWKURKPYSLMJV-SOUVJXGZSA-N Glu-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CCC(=O)O)N)C(=O)O CBWKURKPYSLMJV-SOUVJXGZSA-N 0.000 description 2
- TZXOPHFCAATANZ-QEJZJMRPSA-N Glu-Ser-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N TZXOPHFCAATANZ-QEJZJMRPSA-N 0.000 description 2
- DDXZHOHEABQXSE-NKIYYHGXSA-N Glu-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O DDXZHOHEABQXSE-NKIYYHGXSA-N 0.000 description 2
- MXJYXYDREQWUMS-XKBZYTNZSA-N Glu-Thr-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O MXJYXYDREQWUMS-XKBZYTNZSA-N 0.000 description 2
- JDAYMLXPUJRSDJ-XIRDDKMYSA-N Glu-Trp-Arg Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 JDAYMLXPUJRSDJ-XIRDDKMYSA-N 0.000 description 2
- HVKAAUOFFTUSAA-XDTLVQLUSA-N Glu-Tyr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O HVKAAUOFFTUSAA-XDTLVQLUSA-N 0.000 description 2
- YQPFCZVKMUVZIN-AUTRQRHGSA-N Glu-Val-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQPFCZVKMUVZIN-AUTRQRHGSA-N 0.000 description 2
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 2
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 2
- JRDYDYXZKFNNRQ-XPUUQOCRSA-N Gly-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN JRDYDYXZKFNNRQ-XPUUQOCRSA-N 0.000 description 2
- OVSKVOOUFAKODB-UWVGGRQHSA-N Gly-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OVSKVOOUFAKODB-UWVGGRQHSA-N 0.000 description 2
- XZRZILPOZBVTDB-GJZGRUSLSA-N Gly-Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCNC(N)=N)NC(=O)CN)C(O)=O)=CNC2=C1 XZRZILPOZBVTDB-GJZGRUSLSA-N 0.000 description 2
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 2
- XRTDOIOIBMAXCT-NKWVEPMBSA-N Gly-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)CN)C(=O)O XRTDOIOIBMAXCT-NKWVEPMBSA-N 0.000 description 2
- XCLCVBYNGXEVDU-WHFBIAKZSA-N Gly-Asn-Ser Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O XCLCVBYNGXEVDU-WHFBIAKZSA-N 0.000 description 2
- FUTAPPOITCCWTH-WHFBIAKZSA-N Gly-Asp-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FUTAPPOITCCWTH-WHFBIAKZSA-N 0.000 description 2
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 description 2
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 2
- TZOVVRJYUDETQG-RCOVLWMOSA-N Gly-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN TZOVVRJYUDETQG-RCOVLWMOSA-N 0.000 description 2
- GYAUWXXORNTCHU-QWRGUYRKSA-N Gly-Cys-Tyr Chemical compound NCC(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 GYAUWXXORNTCHU-QWRGUYRKSA-N 0.000 description 2
- GDOZQTNZPCUARW-YFKPBYRVSA-N Gly-Gly-Glu Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O GDOZQTNZPCUARW-YFKPBYRVSA-N 0.000 description 2
- PDAWDNVHMUKWJR-ZETCQYMHSA-N Gly-Gly-His Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC1=CNC=N1 PDAWDNVHMUKWJR-ZETCQYMHSA-N 0.000 description 2
- KAJAOGBVWCYGHZ-JTQLQIEISA-N Gly-Gly-Phe Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 2
- ADZGCWWDPFDHCY-ZETCQYMHSA-N Gly-His-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 ADZGCWWDPFDHCY-ZETCQYMHSA-N 0.000 description 2
- UTYGDAHJBBDPBA-BYULHYEWSA-N Gly-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN UTYGDAHJBBDPBA-BYULHYEWSA-N 0.000 description 2
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 2
- FCKPEGOCSVZPNC-WHOFXGATSA-N Gly-Ile-Phe Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FCKPEGOCSVZPNC-WHOFXGATSA-N 0.000 description 2
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 2
- IUZGUFAJDBHQQV-YUMQZZPRSA-N Gly-Leu-Asn Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IUZGUFAJDBHQQV-YUMQZZPRSA-N 0.000 description 2
- CCBIBMKQNXHNIN-ZETCQYMHSA-N Gly-Leu-Gly Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CCBIBMKQNXHNIN-ZETCQYMHSA-N 0.000 description 2
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 2
- FJWSJWACLMTDMI-WPRPVWTQSA-N Gly-Met-Val Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O FJWSJWACLMTDMI-WPRPVWTQSA-N 0.000 description 2
- DHNXGWVNLFPOMQ-KBPBESRZSA-N Gly-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)CN DHNXGWVNLFPOMQ-KBPBESRZSA-N 0.000 description 2
- VDCRBJACQKOSMS-JSGCOSHPSA-N Gly-Phe-Val Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O VDCRBJACQKOSMS-JSGCOSHPSA-N 0.000 description 2
- ZZJVYSAQQMDIRD-UWVGGRQHSA-N Gly-Pro-His Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O ZZJVYSAQQMDIRD-UWVGGRQHSA-N 0.000 description 2
- GAAHQHNCMIAYEX-UWVGGRQHSA-N Gly-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GAAHQHNCMIAYEX-UWVGGRQHSA-N 0.000 description 2
- IXHQLZIWBCQBLQ-STQMWFEESA-N Gly-Pro-Phe Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IXHQLZIWBCQBLQ-STQMWFEESA-N 0.000 description 2
- SOEGEPHNZOISMT-BYPYZUCNSA-N Gly-Ser-Gly Chemical compound NCC(=O)N[C@@H](CO)C(=O)NCC(O)=O SOEGEPHNZOISMT-BYPYZUCNSA-N 0.000 description 2
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 2
- HUFUVTYGPOUCBN-MBLNEYKQSA-N Gly-Thr-Ile Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HUFUVTYGPOUCBN-MBLNEYKQSA-N 0.000 description 2
- UIQGJYUEQDOODF-KWQFWETISA-N Gly-Tyr-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 UIQGJYUEQDOODF-KWQFWETISA-N 0.000 description 2
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 2
- XINDHUAGVGCNSF-QSFUFRPTSA-N His-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XINDHUAGVGCNSF-QSFUFRPTSA-N 0.000 description 2
- MBSSHYPAEHPSGY-LSJOCFKGSA-N His-Ala-Met Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O MBSSHYPAEHPSGY-LSJOCFKGSA-N 0.000 description 2
- IDNNYVGVSZMQTK-IHRRRGAJSA-N His-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N IDNNYVGVSZMQTK-IHRRRGAJSA-N 0.000 description 2
- UZZXGLOJRZKYEL-DJFWLOJKSA-N His-Asn-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UZZXGLOJRZKYEL-DJFWLOJKSA-N 0.000 description 2
- MDBYBTWRMOAJAY-NHCYSSNCSA-N His-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N MDBYBTWRMOAJAY-NHCYSSNCSA-N 0.000 description 2
- HVCRQRQPIIRNLY-IUCAKERBSA-N His-Gln-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N HVCRQRQPIIRNLY-IUCAKERBSA-N 0.000 description 2
- BQFGKVYHKCNEMF-DCAQKATOSA-N His-Glu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 BQFGKVYHKCNEMF-DCAQKATOSA-N 0.000 description 2
- QAMFAYSMNZBNCA-UWVGGRQHSA-N His-Gly-Met Chemical compound CSCC[C@H](NC(=O)CNC(=O)[C@@H](N)Cc1cnc[nH]1)C(O)=O QAMFAYSMNZBNCA-UWVGGRQHSA-N 0.000 description 2
- ORERHHPZDDEMSC-VGDYDELISA-N His-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N ORERHHPZDDEMSC-VGDYDELISA-N 0.000 description 2
- LVWIJITYHRZHBO-IXOXFDKPSA-N His-Leu-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LVWIJITYHRZHBO-IXOXFDKPSA-N 0.000 description 2
- IGBBXBFSLKRHJB-BZSNNMDCSA-N His-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 IGBBXBFSLKRHJB-BZSNNMDCSA-N 0.000 description 2
- YYOCMTFVGKDNQP-IHRRRGAJSA-N His-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N YYOCMTFVGKDNQP-IHRRRGAJSA-N 0.000 description 2
- ZVKDCQVQTGYBQT-LSJOCFKGSA-N His-Pro-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O ZVKDCQVQTGYBQT-LSJOCFKGSA-N 0.000 description 2
- QCBYAHHNOHBXIH-UWVGGRQHSA-N His-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CN=CN1 QCBYAHHNOHBXIH-UWVGGRQHSA-N 0.000 description 2
- KAXZXLSXFWSNNZ-XVYDVKMFSA-N His-Ser-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KAXZXLSXFWSNNZ-XVYDVKMFSA-N 0.000 description 2
- STGQSBKUYSPPIG-CIUDSAMLSA-N His-Ser-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 STGQSBKUYSPPIG-CIUDSAMLSA-N 0.000 description 2
- WSWAUVHXQREQQG-JYJNAYRXSA-N His-Tyr-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O WSWAUVHXQREQQG-JYJNAYRXSA-N 0.000 description 2
- DAKSMIWQZPHRIB-BZSNNMDCSA-N His-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DAKSMIWQZPHRIB-BZSNNMDCSA-N 0.000 description 2
- 101000583156 Homo sapiens Pituitary homeobox 1 Proteins 0.000 description 2
- AQCUAZTZSPQJFF-ZKWXMUAHSA-N Ile-Ala-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O AQCUAZTZSPQJFF-ZKWXMUAHSA-N 0.000 description 2
- LVQDUPQUJZWKSU-PYJNHQTQSA-N Ile-Arg-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LVQDUPQUJZWKSU-PYJNHQTQSA-N 0.000 description 2
- HGNUKGZQASSBKQ-PCBIJLKTSA-N Ile-Asp-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HGNUKGZQASSBKQ-PCBIJLKTSA-N 0.000 description 2
- DCQMJRSOGCYKTR-GHCJXIJMSA-N Ile-Asp-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O DCQMJRSOGCYKTR-GHCJXIJMSA-N 0.000 description 2
- WEWCEPOYKANMGZ-MMWGEVLESA-N Ile-Cys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N1CCC[C@@H]1C(=O)O)N WEWCEPOYKANMGZ-MMWGEVLESA-N 0.000 description 2
- VQUCKIAECLVLAD-SVSWQMSJSA-N Ile-Cys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N VQUCKIAECLVLAD-SVSWQMSJSA-N 0.000 description 2
- WZDCVAWMBUNDDY-KBIXCLLPSA-N Ile-Glu-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C)C(=O)O)N WZDCVAWMBUNDDY-KBIXCLLPSA-N 0.000 description 2
- WUKLZPHVWAMZQV-UKJIMTQDSA-N Ile-Glu-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N WUKLZPHVWAMZQV-UKJIMTQDSA-N 0.000 description 2
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 2
- VOBYAKCXGQQFLR-LSJOCFKGSA-N Ile-Gly-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O VOBYAKCXGQQFLR-LSJOCFKGSA-N 0.000 description 2
- PWDSHAAAFXISLE-SXTJYALSSA-N Ile-Ile-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O PWDSHAAAFXISLE-SXTJYALSSA-N 0.000 description 2
- GLLAUPMJCGKPFY-BLMTYFJBSA-N Ile-Ile-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 GLLAUPMJCGKPFY-BLMTYFJBSA-N 0.000 description 2
- NUKXXNFEUZGPRO-BJDJZHNGSA-N Ile-Leu-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)O)N NUKXXNFEUZGPRO-BJDJZHNGSA-N 0.000 description 2
- GLYJPWIRLBAIJH-FQUUOJAGSA-N Ile-Lys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N GLYJPWIRLBAIJH-FQUUOJAGSA-N 0.000 description 2
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 2
- CIDLJWVDMNDKPT-FIRPJDEBSA-N Ile-Phe-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N CIDLJWVDMNDKPT-FIRPJDEBSA-N 0.000 description 2
- JHNJNTMTZHEDLJ-NAKRPEOUSA-N Ile-Ser-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O JHNJNTMTZHEDLJ-NAKRPEOUSA-N 0.000 description 2
- RQJUKVXWAKJDBW-SVSWQMSJSA-N Ile-Ser-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N RQJUKVXWAKJDBW-SVSWQMSJSA-N 0.000 description 2
- NGKPIPCGMLWHBX-WZLNRYEVSA-N Ile-Tyr-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N NGKPIPCGMLWHBX-WZLNRYEVSA-N 0.000 description 2
- 241000588747 Klebsiella pneumoniae Species 0.000 description 2
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 2
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 2
- REPPKAMYTOJTFC-DCAQKATOSA-N Leu-Arg-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O REPPKAMYTOJTFC-DCAQKATOSA-N 0.000 description 2
- CNNQBZRGQATKNY-DCAQKATOSA-N Leu-Arg-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N CNNQBZRGQATKNY-DCAQKATOSA-N 0.000 description 2
- UCOCBWDBHCUPQP-DCAQKATOSA-N Leu-Arg-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O UCOCBWDBHCUPQP-DCAQKATOSA-N 0.000 description 2
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 2
- BPANDPNDMJHFEV-CIUDSAMLSA-N Leu-Asp-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O BPANDPNDMJHFEV-CIUDSAMLSA-N 0.000 description 2
- KTFHTMHHKXUYPW-ZPFDUUQYSA-N Leu-Asp-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KTFHTMHHKXUYPW-ZPFDUUQYSA-N 0.000 description 2
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 2
- LLBQJYDYOLIQAI-JYJNAYRXSA-N Leu-Glu-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LLBQJYDYOLIQAI-JYJNAYRXSA-N 0.000 description 2
- FMEICTQWUKNAGC-YUMQZZPRSA-N Leu-Gly-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O FMEICTQWUKNAGC-YUMQZZPRSA-N 0.000 description 2
- VBZOAGIPCULURB-QWRGUYRKSA-N Leu-Gly-His Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N VBZOAGIPCULURB-QWRGUYRKSA-N 0.000 description 2
- LKXANTUNFMVCNF-IHPCNDPISA-N Leu-His-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O LKXANTUNFMVCNF-IHPCNDPISA-N 0.000 description 2
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 2
- JNDYEOUZBLOVOF-AVGNSLFASA-N Leu-Leu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JNDYEOUZBLOVOF-AVGNSLFASA-N 0.000 description 2
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 2
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 2
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 2
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 2
- QNTJIDXQHWUBKC-BZSNNMDCSA-N Leu-Lys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNTJIDXQHWUBKC-BZSNNMDCSA-N 0.000 description 2
- IBSGMIPRBMPMHE-IHRRRGAJSA-N Leu-Met-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O IBSGMIPRBMPMHE-IHRRRGAJSA-N 0.000 description 2
- GCXGCIYIHXSKAY-ULQDDVLXSA-N Leu-Phe-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GCXGCIYIHXSKAY-ULQDDVLXSA-N 0.000 description 2
- PTRKPHUGYULXPU-KKUMJFAQSA-N Leu-Phe-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O PTRKPHUGYULXPU-KKUMJFAQSA-N 0.000 description 2
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 2
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 2
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 2
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 2
- SVBJIZVVYJYGLA-DCAQKATOSA-N Leu-Ser-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O SVBJIZVVYJYGLA-DCAQKATOSA-N 0.000 description 2
- ICYRCNICGBJLGM-HJGDQZAQSA-N Leu-Thr-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O ICYRCNICGBJLGM-HJGDQZAQSA-N 0.000 description 2
- LFSQWRSVPNKJGP-WDCWCFNPSA-N Leu-Thr-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O LFSQWRSVPNKJGP-WDCWCFNPSA-N 0.000 description 2
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 2
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 2
- HGLKOTPFWOMPOB-MEYUZBJRSA-N Leu-Thr-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HGLKOTPFWOMPOB-MEYUZBJRSA-N 0.000 description 2
- IDGRADDMTTWOQC-WDSOQIARSA-N Leu-Trp-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IDGRADDMTTWOQC-WDSOQIARSA-N 0.000 description 2
- FPFOYSCDUWTZBF-IHPCNDPISA-N Leu-Trp-Leu Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H]([NH3+])CC(C)C)C(=O)N[C@@H](CC(C)C)C([O-])=O)=CNC2=C1 FPFOYSCDUWTZBF-IHPCNDPISA-N 0.000 description 2
- UCRJTSIIAYHOHE-ULQDDVLXSA-N Leu-Tyr-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UCRJTSIIAYHOHE-ULQDDVLXSA-N 0.000 description 2
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 2
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 2
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 2
- QESXLSQLQHHTIX-RHYQMDGZSA-N Leu-Val-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QESXLSQLQHHTIX-RHYQMDGZSA-N 0.000 description 2
- YIBOAHAOAWACDK-QEJZJMRPSA-N Lys-Ala-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YIBOAHAOAWACDK-QEJZJMRPSA-N 0.000 description 2
- GQUDMNDPQTXZRV-DCAQKATOSA-N Lys-Arg-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O GQUDMNDPQTXZRV-DCAQKATOSA-N 0.000 description 2
- LXNPMPIQDNSMTA-AVGNSLFASA-N Lys-Gln-His Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 LXNPMPIQDNSMTA-AVGNSLFASA-N 0.000 description 2
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 2
- QQUJSUFWEDZQQY-AVGNSLFASA-N Lys-Gln-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN QQUJSUFWEDZQQY-AVGNSLFASA-N 0.000 description 2
- DRCILAJNUJKAHC-SRVKXCTJSA-N Lys-Glu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DRCILAJNUJKAHC-SRVKXCTJSA-N 0.000 description 2
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 2
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 2
- GNLJXWBNLAIPEP-MELADBBJSA-N Lys-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCCN)N)C(=O)O GNLJXWBNLAIPEP-MELADBBJSA-N 0.000 description 2
- MXMDJEJWERYPMO-XUXIUFHCSA-N Lys-Ile-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MXMDJEJWERYPMO-XUXIUFHCSA-N 0.000 description 2
- NCZIQZYZPUPMKY-PPCPHDFISA-N Lys-Ile-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NCZIQZYZPUPMKY-PPCPHDFISA-N 0.000 description 2
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 2
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 2
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 2
- WBSCNDJQPKSPII-KKUMJFAQSA-N Lys-Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O WBSCNDJQPKSPII-KKUMJFAQSA-N 0.000 description 2
- BEGQVWUZFXLNHZ-IHPCNDPISA-N Lys-Lys-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 BEGQVWUZFXLNHZ-IHPCNDPISA-N 0.000 description 2
- LMGNWHDWJDIOPK-DKIMLUQUSA-N Lys-Phe-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LMGNWHDWJDIOPK-DKIMLUQUSA-N 0.000 description 2
- RPWTZTBIFGENIA-VOAKCMCISA-N Lys-Thr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RPWTZTBIFGENIA-VOAKCMCISA-N 0.000 description 2
- YFQSSOAGMZGXFT-MEYUZBJRSA-N Lys-Thr-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YFQSSOAGMZGXFT-MEYUZBJRSA-N 0.000 description 2
- MDDUIRLQCYVRDO-NHCYSSNCSA-N Lys-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN MDDUIRLQCYVRDO-NHCYSSNCSA-N 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- XOMXAVJBLRROMC-IHRRRGAJSA-N Met-Asp-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XOMXAVJBLRROMC-IHRRRGAJSA-N 0.000 description 2
- AVTWKENDGGUWDC-BQBZGAKWSA-N Met-Cys-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O AVTWKENDGGUWDC-BQBZGAKWSA-N 0.000 description 2
- SJDQOYTYNGZZJX-SRVKXCTJSA-N Met-Glu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SJDQOYTYNGZZJX-SRVKXCTJSA-N 0.000 description 2
- SLQDSYZHHOKQSR-QXEWZRGKSA-N Met-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCSC SLQDSYZHHOKQSR-QXEWZRGKSA-N 0.000 description 2
- WXJXYMFUTRXRGO-UWVGGRQHSA-N Met-His-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CNC=N1 WXJXYMFUTRXRGO-UWVGGRQHSA-N 0.000 description 2
- PZUUMQPMHBJJKE-AVGNSLFASA-N Met-Leu-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCNC(N)=N PZUUMQPMHBJJKE-AVGNSLFASA-N 0.000 description 2
- CHDYFPCQVUOJEB-ULQDDVLXSA-N Met-Leu-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 CHDYFPCQVUOJEB-ULQDDVLXSA-N 0.000 description 2
- HOTNHEUETJELDL-BPNCWPANSA-N Met-Tyr-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCSC)N HOTNHEUETJELDL-BPNCWPANSA-N 0.000 description 2
- YGNUDKAPJARTEM-GUBZILKMSA-N Met-Val-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O YGNUDKAPJARTEM-GUBZILKMSA-N 0.000 description 2
- 108010085220 Multiprotein Complexes Proteins 0.000 description 2
- 102000007474 Multiprotein Complexes Human genes 0.000 description 2
- 241000204003 Mycoplasmatales Species 0.000 description 2
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 2
- 108010079364 N-glycylalanine Proteins 0.000 description 2
- 108010066427 N-valyltryptophan Proteins 0.000 description 2
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 2
- 108010047562 NGR peptide Proteins 0.000 description 2
- 108091005804 Peptidases Proteins 0.000 description 2
- 239000001888 Peptone Substances 0.000 description 2
- 108010080698 Peptones Proteins 0.000 description 2
- XWBJLKDCHJVKAK-KKUMJFAQSA-N Phe-Arg-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N XWBJLKDCHJVKAK-KKUMJFAQSA-N 0.000 description 2
- MPGJIHFJCXTVEX-KKUMJFAQSA-N Phe-Arg-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O MPGJIHFJCXTVEX-KKUMJFAQSA-N 0.000 description 2
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 2
- HXSUFWQYLPKEHF-IHRRRGAJSA-N Phe-Asn-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HXSUFWQYLPKEHF-IHRRRGAJSA-N 0.000 description 2
- KIEPQOIQHFKQLK-PCBIJLKTSA-N Phe-Asn-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KIEPQOIQHFKQLK-PCBIJLKTSA-N 0.000 description 2
- JIYJYFIXQTYDNF-YDHLFZDLSA-N Phe-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N JIYJYFIXQTYDNF-YDHLFZDLSA-N 0.000 description 2
- XMPUYNHKEPFERE-IHRRRGAJSA-N Phe-Asp-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 XMPUYNHKEPFERE-IHRRRGAJSA-N 0.000 description 2
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 2
- CSYVXYQDIVCQNU-QWRGUYRKSA-N Phe-Asp-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O CSYVXYQDIVCQNU-QWRGUYRKSA-N 0.000 description 2
- HQVPQHLNOVTLDD-IHRRRGAJSA-N Phe-Cys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC1=CC=CC=C1)N HQVPQHLNOVTLDD-IHRRRGAJSA-N 0.000 description 2
- KAGCQPSEVAETCA-JYJNAYRXSA-N Phe-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N KAGCQPSEVAETCA-JYJNAYRXSA-N 0.000 description 2
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 2
- FIRWJEJVFFGXSH-RYUDHWBXSA-N Phe-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 FIRWJEJVFFGXSH-RYUDHWBXSA-N 0.000 description 2
- VADLTGVIOIOKGM-BZSNNMDCSA-N Phe-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CN=CN1 VADLTGVIOIOKGM-BZSNNMDCSA-N 0.000 description 2
- VZFPYFRVHMSSNA-JURCDPSOSA-N Phe-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=CC=C1 VZFPYFRVHMSSNA-JURCDPSOSA-N 0.000 description 2
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 2
- OQTDZEJJWWAGJT-KKUMJFAQSA-N Phe-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O OQTDZEJJWWAGJT-KKUMJFAQSA-N 0.000 description 2
- WLYPRKLMRIYGPP-JYJNAYRXSA-N Phe-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 WLYPRKLMRIYGPP-JYJNAYRXSA-N 0.000 description 2
- SCKXGHWQPPURGT-KKUMJFAQSA-N Phe-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O SCKXGHWQPPURGT-KKUMJFAQSA-N 0.000 description 2
- JKJSIYKSGIDHPM-WBAXXEDZSA-N Phe-Phe-Ala Chemical compound C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)Cc1ccccc1)C(O)=O JKJSIYKSGIDHPM-WBAXXEDZSA-N 0.000 description 2
- WEDZFLRYSIDIRX-IHRRRGAJSA-N Phe-Ser-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 WEDZFLRYSIDIRX-IHRRRGAJSA-N 0.000 description 2
- LTAWNJXSRUCFAN-UNQGMJICSA-N Phe-Thr-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LTAWNJXSRUCFAN-UNQGMJICSA-N 0.000 description 2
- KLYYKKGCPOGDPE-OEAJRASXSA-N Phe-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O KLYYKKGCPOGDPE-OEAJRASXSA-N 0.000 description 2
- MSSXKZBDKZAHCX-UNQGMJICSA-N Phe-Thr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O MSSXKZBDKZAHCX-UNQGMJICSA-N 0.000 description 2
- QUUCAHIYARMNBL-FHWLQOOXSA-N Phe-Tyr-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N QUUCAHIYARMNBL-FHWLQOOXSA-N 0.000 description 2
- ZYNBEWGJFXTBDU-ACRUOGEOSA-N Phe-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N ZYNBEWGJFXTBDU-ACRUOGEOSA-N 0.000 description 2
- 102100030345 Pituitary homeobox 1 Human genes 0.000 description 2
- 229920002562 Polyethylene Glycol 3350 Polymers 0.000 description 2
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 2
- VXCHGLYSIOOZIS-GUBZILKMSA-N Pro-Ala-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 VXCHGLYSIOOZIS-GUBZILKMSA-N 0.000 description 2
- SBYVDRLQAGENMY-DCAQKATOSA-N Pro-Asn-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O SBYVDRLQAGENMY-DCAQKATOSA-N 0.000 description 2
- CJZTUKSFZUSNCC-FXQIFTODSA-N Pro-Asp-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 CJZTUKSFZUSNCC-FXQIFTODSA-N 0.000 description 2
- YFNOUBWUIIJQHF-LPEHRKFASA-N Pro-Asp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O YFNOUBWUIIJQHF-LPEHRKFASA-N 0.000 description 2
- UPJGUQPLYWTISV-GUBZILKMSA-N Pro-Gln-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UPJGUQPLYWTISV-GUBZILKMSA-N 0.000 description 2
- LHALYDBUDCWMDY-CIUDSAMLSA-N Pro-Glu-Ala Chemical compound C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O LHALYDBUDCWMDY-CIUDSAMLSA-N 0.000 description 2
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 2
- LXVLKXPFIDDHJG-CIUDSAMLSA-N Pro-Glu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O LXVLKXPFIDDHJG-CIUDSAMLSA-N 0.000 description 2
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 2
- UIMCLYYSUCIUJM-UWVGGRQHSA-N Pro-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 UIMCLYYSUCIUJM-UWVGGRQHSA-N 0.000 description 2
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 2
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 2
- FMLRRBDLBJLJIK-DCAQKATOSA-N Pro-Leu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 FMLRRBDLBJLJIK-DCAQKATOSA-N 0.000 description 2
- FYPGHGXAOZTOBO-IHRRRGAJSA-N Pro-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@@H]2CCCN2 FYPGHGXAOZTOBO-IHRRRGAJSA-N 0.000 description 2
- BRJGUPWVFXKBQI-XUXIUFHCSA-N Pro-Leu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRJGUPWVFXKBQI-XUXIUFHCSA-N 0.000 description 2
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 2
- SRBFGSGDNNQABI-FHWLQOOXSA-N Pro-Leu-Trp Chemical compound N([C@@H](CC(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C(=O)[C@@H]1CCCN1 SRBFGSGDNNQABI-FHWLQOOXSA-N 0.000 description 2
- RMODQFBNDDENCP-IHRRRGAJSA-N Pro-Lys-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O RMODQFBNDDENCP-IHRRRGAJSA-N 0.000 description 2
- BLJMJZOMZRCESA-GUBZILKMSA-N Pro-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BLJMJZOMZRCESA-GUBZILKMSA-N 0.000 description 2
- SBVPYBFMIGDIDX-SRVKXCTJSA-N Pro-Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1N(C(=O)[C@H]2NCCC2)CCC1 SBVPYBFMIGDIDX-SRVKXCTJSA-N 0.000 description 2
- GOMUXSCOIWIJFP-GUBZILKMSA-N Pro-Ser-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GOMUXSCOIWIJFP-GUBZILKMSA-N 0.000 description 2
- GMJDSFYVTAMIBF-FXQIFTODSA-N Pro-Ser-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GMJDSFYVTAMIBF-FXQIFTODSA-N 0.000 description 2
- SEZGGSHLMROBFX-CIUDSAMLSA-N Pro-Ser-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O SEZGGSHLMROBFX-CIUDSAMLSA-N 0.000 description 2
- IURWWZYKYPEANQ-HJGDQZAQSA-N Pro-Thr-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O IURWWZYKYPEANQ-HJGDQZAQSA-N 0.000 description 2
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 2
- DLZBBDSPTJBOOD-BPNCWPANSA-N Pro-Tyr-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O DLZBBDSPTJBOOD-BPNCWPANSA-N 0.000 description 2
- JXVXYRZQIUPYSA-NHCYSSNCSA-N Pro-Val-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JXVXYRZQIUPYSA-NHCYSSNCSA-N 0.000 description 2
- ZMLRZBWCXPQADC-TUAOUCFPSA-N Pro-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ZMLRZBWCXPQADC-TUAOUCFPSA-N 0.000 description 2
- FHJQROWZEJFZPO-SRVKXCTJSA-N Pro-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FHJQROWZEJFZPO-SRVKXCTJSA-N 0.000 description 2
- 239000004365 Protease Substances 0.000 description 2
- 108010079005 RDV peptide Proteins 0.000 description 2
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 2
- 229920002684 Sepharose Polymers 0.000 description 2
- SRTCFKGBYBZRHA-ACZMJKKPSA-N Ser-Ala-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SRTCFKGBYBZRHA-ACZMJKKPSA-N 0.000 description 2
- IYCBDVBJWDXQRR-FXQIFTODSA-N Ser-Ala-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O IYCBDVBJWDXQRR-FXQIFTODSA-N 0.000 description 2
- WXUBSIDKNMFAGS-IHRRRGAJSA-N Ser-Arg-Tyr Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WXUBSIDKNMFAGS-IHRRRGAJSA-N 0.000 description 2
- SFZKGGOGCNQPJY-CIUDSAMLSA-N Ser-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N SFZKGGOGCNQPJY-CIUDSAMLSA-N 0.000 description 2
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 2
- IXUGADGDCQDLSA-FXQIFTODSA-N Ser-Gln-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N IXUGADGDCQDLSA-FXQIFTODSA-N 0.000 description 2
- SMIDBHKWSYUBRZ-ACZMJKKPSA-N Ser-Glu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O SMIDBHKWSYUBRZ-ACZMJKKPSA-N 0.000 description 2
- DSGYZICNAMEJOC-AVGNSLFASA-N Ser-Glu-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DSGYZICNAMEJOC-AVGNSLFASA-N 0.000 description 2
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 2
- JFWDJFULOLKQFY-QWRGUYRKSA-N Ser-Gly-Phe Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JFWDJFULOLKQFY-QWRGUYRKSA-N 0.000 description 2
- WEQAYODCJHZSJZ-KKUMJFAQSA-N Ser-His-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 WEQAYODCJHZSJZ-KKUMJFAQSA-N 0.000 description 2
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 2
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 2
- GVIGVIOEYBOTCB-XIRDDKMYSA-N Ser-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC(C)C)C(O)=O)=CNC2=C1 GVIGVIOEYBOTCB-XIRDDKMYSA-N 0.000 description 2
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 2
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 2
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 2
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 2
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 2
- SZRNDHWMVSFPSP-XKBZYTNZSA-N Ser-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N)O SZRNDHWMVSFPSP-XKBZYTNZSA-N 0.000 description 2
- WMZVVNLPHFSUPA-BPUTZDHNSA-N Ser-Trp-Arg Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 WMZVVNLPHFSUPA-BPUTZDHNSA-N 0.000 description 2
- ZWSZBWAFDZRBNM-UBHSHLNASA-N Ser-Trp-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O ZWSZBWAFDZRBNM-UBHSHLNASA-N 0.000 description 2
- PLQWGQUNUPMNOD-KKUMJFAQSA-N Ser-Tyr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O PLQWGQUNUPMNOD-KKUMJFAQSA-N 0.000 description 2
- ZVBCMFDJIMUELU-BZSNNMDCSA-N Ser-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N ZVBCMFDJIMUELU-BZSNNMDCSA-N 0.000 description 2
- KIEIJCFVGZCUAS-MELADBBJSA-N Ser-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N)C(=O)O KIEIJCFVGZCUAS-MELADBBJSA-N 0.000 description 2
- UKKROEYWYIHWBD-ZKWXMUAHSA-N Ser-Val-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O UKKROEYWYIHWBD-ZKWXMUAHSA-N 0.000 description 2
- LLSLRQOEAFCZLW-NRPADANISA-N Ser-Val-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LLSLRQOEAFCZLW-NRPADANISA-N 0.000 description 2
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 2
- 108020004682 Single-Stranded DNA Proteins 0.000 description 2
- IGROJMCBGRFRGI-YTLHQDLWSA-N Thr-Ala-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 2
- STGXWWBXWXZOER-MBLNEYKQSA-N Thr-Ala-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 STGXWWBXWXZOER-MBLNEYKQSA-N 0.000 description 2
- CAGTXGDOIFXLPC-KZVJFYERSA-N Thr-Arg-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CCCN=C(N)N CAGTXGDOIFXLPC-KZVJFYERSA-N 0.000 description 2
- MQBTXMPQNCGSSZ-OSUNSFLBSA-N Thr-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)CCCN=C(N)N MQBTXMPQNCGSSZ-OSUNSFLBSA-N 0.000 description 2
- VXMHQKHDKCATDV-VEVYYDQMSA-N Thr-Asp-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VXMHQKHDKCATDV-VEVYYDQMSA-N 0.000 description 2
- GARULAKWZGFIKC-RWRJDSDZSA-N Thr-Gln-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GARULAKWZGFIKC-RWRJDSDZSA-N 0.000 description 2
- RKDFEMGVMMYYNG-WDCWCFNPSA-N Thr-Gln-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O RKDFEMGVMMYYNG-WDCWCFNPSA-N 0.000 description 2
- DKDHTRVDOUZZTP-IFFSRLJSSA-N Thr-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DKDHTRVDOUZZTP-IFFSRLJSSA-N 0.000 description 2
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 2
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 2
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 2
- FDALPRWYVKJCLL-PMVVWTBXSA-N Thr-His-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)NCC(O)=O FDALPRWYVKJCLL-PMVVWTBXSA-N 0.000 description 2
- WBCCCPZIJIJTSD-TUBUOCAGSA-N Thr-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H]([C@@H](C)O)N WBCCCPZIJIJTSD-TUBUOCAGSA-N 0.000 description 2
- URPSJRMWHQTARR-MBLNEYKQSA-N Thr-Ile-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O URPSJRMWHQTARR-MBLNEYKQSA-N 0.000 description 2
- NCXVJIQMWSGRHY-KXNHARMFSA-N Thr-Leu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O NCXVJIQMWSGRHY-KXNHARMFSA-N 0.000 description 2
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 2
- KZSYAEWQMJEGRZ-RHYQMDGZSA-N Thr-Leu-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O KZSYAEWQMJEGRZ-RHYQMDGZSA-N 0.000 description 2
- JLNMFGCJODTXDH-WEDXCCLWSA-N Thr-Lys-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O JLNMFGCJODTXDH-WEDXCCLWSA-N 0.000 description 2
- GUHLYMZJVXUIPO-RCWTZXSCSA-N Thr-Met-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O GUHLYMZJVXUIPO-RCWTZXSCSA-N 0.000 description 2
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 2
- NYQIZWROIMIQSL-VEVYYDQMSA-N Thr-Pro-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O NYQIZWROIMIQSL-VEVYYDQMSA-N 0.000 description 2
- JAJOFWABAUKAEJ-QTKMDUPCSA-N Thr-Pro-His Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O JAJOFWABAUKAEJ-QTKMDUPCSA-N 0.000 description 2
- VTMGKRABARCZAX-OSUNSFLBSA-N Thr-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O VTMGKRABARCZAX-OSUNSFLBSA-N 0.000 description 2
- YGZWVPBHYABGLT-KJEVXHAQSA-N Thr-Pro-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YGZWVPBHYABGLT-KJEVXHAQSA-N 0.000 description 2
- STUAPCLEDMKXKL-LKXGYXEUSA-N Thr-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O STUAPCLEDMKXKL-LKXGYXEUSA-N 0.000 description 2
- NLWDSYKZUPRMBJ-IEGACIPQSA-N Thr-Trp-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O NLWDSYKZUPRMBJ-IEGACIPQSA-N 0.000 description 2
- PELIQFPESHBTMA-WLTAIBSBSA-N Thr-Tyr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 PELIQFPESHBTMA-WLTAIBSBSA-N 0.000 description 2
- XGFYGMKZKFRGAI-RCWTZXSCSA-N Thr-Val-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XGFYGMKZKFRGAI-RCWTZXSCSA-N 0.000 description 2
- BKIOKSLLAAZYTC-KKHAAJSZSA-N Thr-Val-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O BKIOKSLLAAZYTC-KKHAAJSZSA-N 0.000 description 2
- FYBFTPLPAXZBOY-KKHAAJSZSA-N Thr-Val-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O FYBFTPLPAXZBOY-KKHAAJSZSA-N 0.000 description 2
- MNYNCKZAEIAONY-XGEHTFHBSA-N Thr-Val-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O MNYNCKZAEIAONY-XGEHTFHBSA-N 0.000 description 2
- VYVBSMCZNHOZGD-RCWTZXSCSA-N Thr-Val-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O VYVBSMCZNHOZGD-RCWTZXSCSA-N 0.000 description 2
- 239000007983 Tris buffer Substances 0.000 description 2
- GTNCSPKYWCJZAC-XIRDDKMYSA-N Trp-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N GTNCSPKYWCJZAC-XIRDDKMYSA-N 0.000 description 2
- DEZKIRSBKKXUEV-NYVOZVTQSA-N Trp-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)O)N DEZKIRSBKKXUEV-NYVOZVTQSA-N 0.000 description 2
- HDQJVXVRGJUDML-UBHSHLNASA-N Trp-Cys-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HDQJVXVRGJUDML-UBHSHLNASA-N 0.000 description 2
- CZSMNLQMRWPGQF-XEGUGMAKSA-N Trp-Gln-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CZSMNLQMRWPGQF-XEGUGMAKSA-N 0.000 description 2
- CZWIHKFGHICAJX-BPUTZDHNSA-N Trp-Glu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 CZWIHKFGHICAJX-BPUTZDHNSA-N 0.000 description 2
- ILDJYIDXESUBOE-HSCHXYMDSA-N Trp-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N ILDJYIDXESUBOE-HSCHXYMDSA-N 0.000 description 2
- SAKLWFSRZTZQAJ-GQGQLFGLSA-N Trp-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N SAKLWFSRZTZQAJ-GQGQLFGLSA-N 0.000 description 2
- VPRHDRKAPYZMHL-SZMVWBNQSA-N Trp-Leu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 VPRHDRKAPYZMHL-SZMVWBNQSA-N 0.000 description 2
- IKUMWSDCGQVGHC-UMPQAUOISA-N Trp-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CNC3=CC=CC=C32)N)O IKUMWSDCGQVGHC-UMPQAUOISA-N 0.000 description 2
- RNDWCRUOGGQDKN-UBHSHLNASA-N Trp-Ser-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RNDWCRUOGGQDKN-UBHSHLNASA-N 0.000 description 2
- UMIACFRBELJMGT-GQGQLFGLSA-N Trp-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N UMIACFRBELJMGT-GQGQLFGLSA-N 0.000 description 2
- LNGFWVPNKLWATF-ZVZYQTTQSA-N Trp-Val-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LNGFWVPNKLWATF-ZVZYQTTQSA-N 0.000 description 2
- BURPTJBFWIOHEY-UWJYBYFXSA-N Tyr-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BURPTJBFWIOHEY-UWJYBYFXSA-N 0.000 description 2
- NSTPFWRAIDTNGH-BZSNNMDCSA-N Tyr-Asn-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NSTPFWRAIDTNGH-BZSNNMDCSA-N 0.000 description 2
- GAYLGYUVTDMLKC-UWJYBYFXSA-N Tyr-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 GAYLGYUVTDMLKC-UWJYBYFXSA-N 0.000 description 2
- QUILOGWWLXMSAT-IHRRRGAJSA-N Tyr-Gln-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QUILOGWWLXMSAT-IHRRRGAJSA-N 0.000 description 2
- TWAVEIJGFCBWCG-JYJNAYRXSA-N Tyr-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N TWAVEIJGFCBWCG-JYJNAYRXSA-N 0.000 description 2
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 2
- HIINQLBHPIQYHN-JTQLQIEISA-N Tyr-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HIINQLBHPIQYHN-JTQLQIEISA-N 0.000 description 2
- JKUZFODWJGEQAP-KBPBESRZSA-N Tyr-Gly-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O JKUZFODWJGEQAP-KBPBESRZSA-N 0.000 description 2
- CTDPLKMBVALCGN-JSGCOSHPSA-N Tyr-Gly-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O CTDPLKMBVALCGN-JSGCOSHPSA-N 0.000 description 2
- HFJJDMOFTCQGEI-STECZYCISA-N Tyr-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N HFJJDMOFTCQGEI-STECZYCISA-N 0.000 description 2
- SINRIKQYQJRGDQ-MEYUZBJRSA-N Tyr-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SINRIKQYQJRGDQ-MEYUZBJRSA-N 0.000 description 2
- AUZADXNWQMBZOO-JYJNAYRXSA-N Tyr-Pro-Arg Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=C(O)C=C1 AUZADXNWQMBZOO-JYJNAYRXSA-N 0.000 description 2
- XJPXTYLVMUZGNW-IHRRRGAJSA-N Tyr-Pro-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O XJPXTYLVMUZGNW-IHRRRGAJSA-N 0.000 description 2
- SOEGLGLDSUHWTI-STECZYCISA-N Tyr-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=C(O)C=C1 SOEGLGLDSUHWTI-STECZYCISA-N 0.000 description 2
- PWKMJDQXKCENMF-MEYUZBJRSA-N Tyr-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O PWKMJDQXKCENMF-MEYUZBJRSA-N 0.000 description 2
- LDKDSFQSEUOCOO-RPTUDFQQSA-N Tyr-Thr-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LDKDSFQSEUOCOO-RPTUDFQQSA-N 0.000 description 2
- KLOZTPOXVVRVAQ-DZKIICNBSA-N Tyr-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 KLOZTPOXVVRVAQ-DZKIICNBSA-N 0.000 description 2
- PQPWEALFTLKSEB-DZKIICNBSA-N Tyr-Val-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PQPWEALFTLKSEB-DZKIICNBSA-N 0.000 description 2
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 2
- SLLKXDSRVAOREO-KZVJFYERSA-N Val-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N)O SLLKXDSRVAOREO-KZVJFYERSA-N 0.000 description 2
- JOQSQZFKFYJKKJ-GUBZILKMSA-N Val-Arg-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N JOQSQZFKFYJKKJ-GUBZILKMSA-N 0.000 description 2
- KKHRWGYHBZORMQ-NHCYSSNCSA-N Val-Arg-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKHRWGYHBZORMQ-NHCYSSNCSA-N 0.000 description 2
- NMPXRFYMZDIBRF-ZOBUZTSGSA-N Val-Asn-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N NMPXRFYMZDIBRF-ZOBUZTSGSA-N 0.000 description 2
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 2
- VUTHNLMCXKLLFI-LAEOZQHASA-N Val-Asp-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VUTHNLMCXKLLFI-LAEOZQHASA-N 0.000 description 2
- CWSIBTLMMQLPPZ-FXQIFTODSA-N Val-Cys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](C(C)C)N CWSIBTLMMQLPPZ-FXQIFTODSA-N 0.000 description 2
- LMSBRIVOCYOKMU-NRPADANISA-N Val-Gln-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N LMSBRIVOCYOKMU-NRPADANISA-N 0.000 description 2
- QHFQQRKNGCXTHL-AUTRQRHGSA-N Val-Gln-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QHFQQRKNGCXTHL-AUTRQRHGSA-N 0.000 description 2
- VFOHXOLPLACADK-GVXVVHGQSA-N Val-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N VFOHXOLPLACADK-GVXVVHGQSA-N 0.000 description 2
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 2
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 2
- UEHRGZCNLSWGHK-DLOVCJGASA-N Val-Glu-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UEHRGZCNLSWGHK-DLOVCJGASA-N 0.000 description 2
- BVWPHWLFGRCECJ-JSGCOSHPSA-N Val-Gly-Tyr Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N BVWPHWLFGRCECJ-JSGCOSHPSA-N 0.000 description 2
- VHRLUTIMTDOVCG-PEDHHIEDSA-N Val-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](C(C)C)N VHRLUTIMTDOVCG-PEDHHIEDSA-N 0.000 description 2
- OTJMMKPMLUNTQT-AVGNSLFASA-N Val-Leu-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N OTJMMKPMLUNTQT-AVGNSLFASA-N 0.000 description 2
- KTEZUXISLQTDDQ-NHCYSSNCSA-N Val-Lys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KTEZUXISLQTDDQ-NHCYSSNCSA-N 0.000 description 2
- HPANGHISDXDUQY-ULQDDVLXSA-N Val-Lys-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HPANGHISDXDUQY-ULQDDVLXSA-N 0.000 description 2
- MJFSRZZJQWZHFQ-SRVKXCTJSA-N Val-Met-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)O)N MJFSRZZJQWZHFQ-SRVKXCTJSA-N 0.000 description 2
- LJSZPMSUYKKKCP-UBHSHLNASA-N Val-Phe-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 LJSZPMSUYKKKCP-UBHSHLNASA-N 0.000 description 2
- CKTMJBPRVQWPHU-JSGCOSHPSA-N Val-Phe-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)O)N CKTMJBPRVQWPHU-JSGCOSHPSA-N 0.000 description 2
- NHXZRXLFOBFMDM-AVGNSLFASA-N Val-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C NHXZRXLFOBFMDM-AVGNSLFASA-N 0.000 description 2
- BGXVHVMJZCSOCA-AVGNSLFASA-N Val-Pro-Lys Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N BGXVHVMJZCSOCA-AVGNSLFASA-N 0.000 description 2
- QIVPZSWBBHRNBA-JYJNAYRXSA-N Val-Pro-Phe Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccccc1)C(O)=O QIVPZSWBBHRNBA-JYJNAYRXSA-N 0.000 description 2
- SSYBNWFXCFNRFN-GUBZILKMSA-N Val-Pro-Ser Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SSYBNWFXCFNRFN-GUBZILKMSA-N 0.000 description 2
- NSUUANXHLKKHQB-BZSNNMDCSA-N Val-Pro-Trp Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CNC2=CC=CC=C12 NSUUANXHLKKHQB-BZSNNMDCSA-N 0.000 description 2
- UGFMVXRXULGLNO-XPUUQOCRSA-N Val-Ser-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O UGFMVXRXULGLNO-XPUUQOCRSA-N 0.000 description 2
- PDDJTOSAVNRJRH-UNQGMJICSA-N Val-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](C(C)C)N)O PDDJTOSAVNRJRH-UNQGMJICSA-N 0.000 description 2
- JAIZPWVHPQRYOU-ZJDVBMNYSA-N Val-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O JAIZPWVHPQRYOU-ZJDVBMNYSA-N 0.000 description 2
- RSEIVHMDTNNEOW-JYJNAYRXSA-N Val-Trp-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CS)C(=O)O)N RSEIVHMDTNNEOW-JYJNAYRXSA-N 0.000 description 2
- GTACFKZDQFTVAI-STECZYCISA-N Val-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 GTACFKZDQFTVAI-STECZYCISA-N 0.000 description 2
- VVIZITNVZUAEMI-DLOVCJGASA-N Val-Val-Gln Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(N)=O VVIZITNVZUAEMI-DLOVCJGASA-N 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- 239000004480 active ingredient Substances 0.000 description 2
- 230000002776 aggregation Effects 0.000 description 2
- 238000004220 aggregation Methods 0.000 description 2
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 2
- 239000003963 antioxidant agent Substances 0.000 description 2
- 235000006708 antioxidants Nutrition 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 108010013835 arginine glutamate Proteins 0.000 description 2
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 2
- 108010057412 arginyl-glycyl-aspartyl-phenylalanine Proteins 0.000 description 2
- 108010062796 arginyllysine Proteins 0.000 description 2
- 239000012148 binding buffer Substances 0.000 description 2
- 229960002685 biotin Drugs 0.000 description 2
- 235000020958 biotin Nutrition 0.000 description 2
- 239000011616 biotin Substances 0.000 description 2
- 230000000903 blocking effect Effects 0.000 description 2
- 229940041514 candida albicans extract Drugs 0.000 description 2
- 230000015556 catabolic process Effects 0.000 description 2
- 230000021615 conjugation Effects 0.000 description 2
- 238000013270 controlled release Methods 0.000 description 2
- 101150029677 csoS1A gene Proteins 0.000 description 2
- 108010016616 cysteinylglycine Proteins 0.000 description 2
- 210000000172 cytosol Anatomy 0.000 description 2
- 230000006378 damage Effects 0.000 description 2
- 238000013480 data collection Methods 0.000 description 2
- 238000000326 densiometry Methods 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 108010009297 diglycyl-histidine Proteins 0.000 description 2
- 239000003085 diluting agent Substances 0.000 description 2
- 108010054813 diprotin B Proteins 0.000 description 2
- 238000010494 dissociation reaction Methods 0.000 description 2
- 230000005593 dissociations Effects 0.000 description 2
- 238000003028 enzyme activity measurement method Methods 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 238000002073 fluorescence micrograph Methods 0.000 description 2
- 239000012634 fragment Substances 0.000 description 2
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 2
- 108010080575 glutamyl-aspartyl-alanine Proteins 0.000 description 2
- 108010033719 glycyl-histidyl-glycine Proteins 0.000 description 2
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 2
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 2
- 108010045126 glycyl-tyrosyl-glycine Proteins 0.000 description 2
- 108010010147 glycylglutamine Proteins 0.000 description 2
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 2
- 229910052737 gold Inorganic materials 0.000 description 2
- 239000010931 gold Substances 0.000 description 2
- 101150084612 gpmA gene Proteins 0.000 description 2
- 238000009499 grossing Methods 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 239000001963 growth medium Substances 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- 238000002744 homologous recombination Methods 0.000 description 2
- 230000006801 homologous recombination Effects 0.000 description 2
- 238000003384 imaging method Methods 0.000 description 2
- 238000001597 immobilized metal affinity chromatography Methods 0.000 description 2
- 239000000543 intermediate Substances 0.000 description 2
- 108010027338 isoleucylcysteine Proteins 0.000 description 2
- 108010078274 isoleucylvaline Proteins 0.000 description 2
- 125000001449 isopropyl group Chemical group [H]C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 2
- 101150109249 lacI gene Proteins 0.000 description 2
- 108010077158 leucinyl-arginyl-tryptophan Proteins 0.000 description 2
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 2
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 2
- XIXADJRWDQXREU-UHFFFAOYSA-M lithium acetate Chemical compound [Li+].CC([O-])=O XIXADJRWDQXREU-UHFFFAOYSA-M 0.000 description 2
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 2
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 238000012423 maintenance Methods 0.000 description 2
- 108700023046 methionyl-leucyl-phenylalanine Proteins 0.000 description 2
- 108010056582 methionylglutamic acid Proteins 0.000 description 2
- 238000002887 multiple sequence alignment Methods 0.000 description 2
- 235000015097 nutrients Nutrition 0.000 description 2
- 235000016709 nutrition Nutrition 0.000 description 2
- 238000005580 one pot reaction Methods 0.000 description 2
- 235000019319 peptone Nutrition 0.000 description 2
- 108010018625 phenylalanylarginine Proteins 0.000 description 2
- 108010012581 phenylalanylglutamate Proteins 0.000 description 2
- 108010051242 phenylalanylserine Proteins 0.000 description 2
- 238000005498 polishing Methods 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 239000002243 precursor Substances 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 108010025826 prolyl-leucyl-arginine Proteins 0.000 description 2
- 108010004914 prolylarginine Proteins 0.000 description 2
- 108020001580 protein domains Proteins 0.000 description 2
- 230000017854 proteolysis Effects 0.000 description 2
- 238000010379 pull-down assay Methods 0.000 description 2
- 102000005962 receptors Human genes 0.000 description 2
- 108020003175 receptors Proteins 0.000 description 2
- 230000000717 retained effect Effects 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 108010026333 seryl-proline Proteins 0.000 description 2
- 230000035939 shock Effects 0.000 description 2
- 159000000000 sodium salts Chemical class 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 239000011550 stock solution Substances 0.000 description 2
- 238000012916 structural analysis Methods 0.000 description 2
- 239000000725 suspension Substances 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 238000010257 thawing Methods 0.000 description 2
- 238000005382 thermal cycling Methods 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 108700004896 tripeptide FEG Proteins 0.000 description 2
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 2
- 108010015666 tryptophyl-leucyl-glutamic acid Proteins 0.000 description 2
- 108010084932 tryptophyl-proline Proteins 0.000 description 2
- 108010044292 tryptophyltyrosine Proteins 0.000 description 2
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 2
- 108010071635 tyrosyl-prolyl-arginine Proteins 0.000 description 2
- 229940035893 uracil Drugs 0.000 description 2
- 108010009962 valyltyrosine Proteins 0.000 description 2
- 239000003981 vehicle Substances 0.000 description 2
- 239000012224 working solution Substances 0.000 description 2
- 210000005253 yeast cell Anatomy 0.000 description 2
- 239000012138 yeast extract Substances 0.000 description 2
- AUTOLBMXDDTRRT-JGVFFNPUSA-N (4R,5S)-dethiobiotin Chemical compound C[C@@H]1NC(=O)N[C@@H]1CCCCCC(O)=O AUTOLBMXDDTRRT-JGVFFNPUSA-N 0.000 description 1
- SXGZJKUKBWWHRA-UHFFFAOYSA-N 2-(N-morpholiniumyl)ethanesulfonate Chemical compound [O-]S(=O)(=O)CC[NH+]1CCOCC1 SXGZJKUKBWWHRA-UHFFFAOYSA-N 0.000 description 1
- PHLXSNIEQIKENK-UHFFFAOYSA-N 2-[[2-[5-methyl-3-(trifluoromethyl)pyrazol-1-yl]acetyl]amino]-4,5,6,7-tetrahydro-1-benzothiophene-3-carboxamide Chemical compound CC1=CC(C(F)(F)F)=NN1CC(=O)NC1=C(C(N)=O)C(CCCC2)=C2S1 PHLXSNIEQIKENK-UHFFFAOYSA-N 0.000 description 1
- IVLXQGJVBGMLRR-UHFFFAOYSA-N 2-aminoacetic acid;hydron;chloride Chemical compound Cl.NCC(O)=O IVLXQGJVBGMLRR-UHFFFAOYSA-N 0.000 description 1
- ANXOZVKTXWJGOY-UHFFFAOYSA-N 2-aminoethanone Chemical compound NC[C]=O ANXOZVKTXWJGOY-UHFFFAOYSA-N 0.000 description 1
- FWBHETKCLVMNFS-UHFFFAOYSA-N 4',6-Diamino-2-phenylindol Chemical compound C1=CC(C(=N)N)=CC=C1C1=CC2=CC=C(C(N)=N)C=C2N1 FWBHETKCLVMNFS-UHFFFAOYSA-N 0.000 description 1
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 1
- FOWHQTWRLFTELJ-FXQIFTODSA-N Ala-Asp-Met Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N FOWHQTWRLFTELJ-FXQIFTODSA-N 0.000 description 1
- DAEFQZCYZKRTLR-ZLUOBGJFSA-N Ala-Cys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O DAEFQZCYZKRTLR-ZLUOBGJFSA-N 0.000 description 1
- HQJKCXHQNUCKMY-GHCJXIJMSA-N Ala-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C)N HQJKCXHQNUCKMY-GHCJXIJMSA-N 0.000 description 1
- MNZHHDPWDWQJCQ-YUMQZZPRSA-N Ala-Leu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O MNZHHDPWDWQJCQ-YUMQZZPRSA-N 0.000 description 1
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- MAZZQZWCCYJQGZ-GUBZILKMSA-N Ala-Pro-Arg Chemical compound [H]N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MAZZQZWCCYJQGZ-GUBZILKMSA-N 0.000 description 1
- PXAFZDXYEIIUTF-LKTVYLICSA-N Ala-Trp-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(O)=O PXAFZDXYEIIUTF-LKTVYLICSA-N 0.000 description 1
- CRCCTGPNZUCAHE-DCAQKATOSA-N Arg-His-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 CRCCTGPNZUCAHE-DCAQKATOSA-N 0.000 description 1
- CVXXSWQORBZAAA-SRVKXCTJSA-N Arg-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N CVXXSWQORBZAAA-SRVKXCTJSA-N 0.000 description 1
- YLVGUOGAFAJMKP-JYJNAYRXSA-N Arg-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YLVGUOGAFAJMKP-JYJNAYRXSA-N 0.000 description 1
- 108060006004 Ascorbate peroxidase Proteins 0.000 description 1
- YNDLOUMBVDVALC-ZLUOBGJFSA-N Asn-Ala-Ala Chemical compound C[C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CC(=O)N)N YNDLOUMBVDVALC-ZLUOBGJFSA-N 0.000 description 1
- JJGRJMKUOYXZRA-LPEHRKFASA-N Asn-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N)C(=O)O JJGRJMKUOYXZRA-LPEHRKFASA-N 0.000 description 1
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 1
- MJIJBEYEHBKTIM-BYULHYEWSA-N Asn-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N MJIJBEYEHBKTIM-BYULHYEWSA-N 0.000 description 1
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 1
- RGKKALNPOYURGE-ZKWXMUAHSA-N Asp-Ala-Val Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O RGKKALNPOYURGE-ZKWXMUAHSA-N 0.000 description 1
- SVABRQFIHCSNCI-FOHZUACHSA-N Asp-Gly-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O SVABRQFIHCSNCI-FOHZUACHSA-N 0.000 description 1
- SPWXXPFDTMYTRI-IUKAMOBKSA-N Asp-Ile-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SPWXXPFDTMYTRI-IUKAMOBKSA-N 0.000 description 1
- 108010083946 Asp-Tyr-Leu-Lys Proteins 0.000 description 1
- BPAUXFVCSYQDQX-JRQIVUDYSA-N Asp-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(=O)O)N)O BPAUXFVCSYQDQX-JRQIVUDYSA-N 0.000 description 1
- GFYOIYJJMSHLSN-QXEWZRGKSA-N Asp-Val-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GFYOIYJJMSHLSN-QXEWZRGKSA-N 0.000 description 1
- XMKXONRMGJXCJV-LAEOZQHASA-N Asp-Val-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XMKXONRMGJXCJV-LAEOZQHASA-N 0.000 description 1
- 102000003846 Carbonic anhydrases Human genes 0.000 description 1
- 108090000209 Carbonic anhydrases Proteins 0.000 description 1
- 229920002101 Chitin Polymers 0.000 description 1
- 108090000317 Chymotrypsin Proteins 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- 238000011537 Coomassie blue staining Methods 0.000 description 1
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 1
- 241000195493 Cryptophyta Species 0.000 description 1
- AZDQAZRURQMSQD-XPUUQOCRSA-N Cys-Val-Gly Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AZDQAZRURQMSQD-XPUUQOCRSA-N 0.000 description 1
- 102100025698 Cytosolic carboxypeptidase 4 Human genes 0.000 description 1
- 238000007702 DNA assembly Methods 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 101001065501 Escherichia phage MS2 Lysis protein Proteins 0.000 description 1
- OTMSDBZUPAUEDD-UHFFFAOYSA-N Ethane Chemical compound CC OTMSDBZUPAUEDD-UHFFFAOYSA-N 0.000 description 1
- FAQVCWVVIYYWRR-WHFBIAKZSA-N Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O FAQVCWVVIYYWRR-WHFBIAKZSA-N 0.000 description 1
- WQWMZOIPXWSZNE-WDSKDSINSA-N Gln-Asp-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O WQWMZOIPXWSZNE-WDSKDSINSA-N 0.000 description 1
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 1
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 1
- LURQDGKYBFWWJA-MNXVOIDGSA-N Gln-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)N)N LURQDGKYBFWWJA-MNXVOIDGSA-N 0.000 description 1
- DUGYCMAIAKAQPB-GLLZPBPUSA-N Gln-Thr-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DUGYCMAIAKAQPB-GLLZPBPUSA-N 0.000 description 1
- WIMVKDYAKRAUCG-IHRRRGAJSA-N Gln-Tyr-Glu Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O WIMVKDYAKRAUCG-IHRRRGAJSA-N 0.000 description 1
- ZZLDMBMFKZFQMU-NRPADANISA-N Gln-Val-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O ZZLDMBMFKZFQMU-NRPADANISA-N 0.000 description 1
- NTBDVNJIWCKURJ-ACZMJKKPSA-N Glu-Asp-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NTBDVNJIWCKURJ-ACZMJKKPSA-N 0.000 description 1
- SBCYJMOOHUDWDA-NUMRIWBASA-N Glu-Asp-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SBCYJMOOHUDWDA-NUMRIWBASA-N 0.000 description 1
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 1
- ZWABFSSWTSAMQN-KBIXCLLPSA-N Glu-Ile-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O ZWABFSSWTSAMQN-KBIXCLLPSA-N 0.000 description 1
- VGUYMZGLJUJRBV-YVNDNENWSA-N Glu-Ile-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O VGUYMZGLJUJRBV-YVNDNENWSA-N 0.000 description 1
- FMBWLLMUPXTXFC-SDDRHHMPSA-N Glu-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N)C(=O)O FMBWLLMUPXTXFC-SDDRHHMPSA-N 0.000 description 1
- QDMVXRNLOPTPIE-WDCWCFNPSA-N Glu-Lys-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QDMVXRNLOPTPIE-WDCWCFNPSA-N 0.000 description 1
- JZJGEKDPWVJOLD-QEWYBTABSA-N Glu-Phe-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JZJGEKDPWVJOLD-QEWYBTABSA-N 0.000 description 1
- DMYACXMQUABZIQ-NRPADANISA-N Glu-Ser-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O DMYACXMQUABZIQ-NRPADANISA-N 0.000 description 1
- HZISRJBYZAODRV-XQXXSGGOSA-N Glu-Thr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O HZISRJBYZAODRV-XQXXSGGOSA-N 0.000 description 1
- TWYSSILQABLLME-HJGDQZAQSA-N Glu-Thr-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYSSILQABLLME-HJGDQZAQSA-N 0.000 description 1
- LZEUDRYSAZAJIO-AUTRQRHGSA-N Glu-Val-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LZEUDRYSAZAJIO-AUTRQRHGSA-N 0.000 description 1
- QIZJOTQTCAGKPU-KWQFWETISA-N Gly-Ala-Tyr Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 QIZJOTQTCAGKPU-KWQFWETISA-N 0.000 description 1
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 1
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 1
- STVHDEHTKFXBJQ-LAEOZQHASA-N Gly-Glu-Ile Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STVHDEHTKFXBJQ-LAEOZQHASA-N 0.000 description 1
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 1
- JSNNHGHYGYMVCK-XVKPBYJWSA-N Gly-Glu-Val Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JSNNHGHYGYMVCK-XVKPBYJWSA-N 0.000 description 1
- INLIXXRWNUKVCF-JTQLQIEISA-N Gly-Gly-Tyr Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 INLIXXRWNUKVCF-JTQLQIEISA-N 0.000 description 1
- OLPPXYMMIARYAL-QMMMGPOBSA-N Gly-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)CN OLPPXYMMIARYAL-QMMMGPOBSA-N 0.000 description 1
- ORXZVPZCPMKHNR-IUCAKERBSA-N Gly-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CNC=N1 ORXZVPZCPMKHNR-IUCAKERBSA-N 0.000 description 1
- OQQKUTVULYLCDG-ONGXEEELSA-N Gly-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)CN)C(O)=O OQQKUTVULYLCDG-ONGXEEELSA-N 0.000 description 1
- LPHQAFLNEHWKFF-QXEWZRGKSA-N Gly-Met-Ile Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LPHQAFLNEHWKFF-QXEWZRGKSA-N 0.000 description 1
- CQMFNTVQVLQRLT-JHEQGTHGSA-N Gly-Thr-Gln Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O CQMFNTVQVLQRLT-JHEQGTHGSA-N 0.000 description 1
- UVTSZKIATYSKIR-RYUDHWBXSA-N Gly-Tyr-Glu Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O UVTSZKIATYSKIR-RYUDHWBXSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 241000168525 Haematococcus Species 0.000 description 1
- 241001600172 Haliangium ochraceum Species 0.000 description 1
- 241000605178 Halothiobacillus neapolitanus Species 0.000 description 1
- KYMUEAZVLPRVAE-GUBZILKMSA-N His-Asn-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KYMUEAZVLPRVAE-GUBZILKMSA-N 0.000 description 1
- LPBWRHRHEIYAIP-KKUMJFAQSA-N His-Tyr-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LPBWRHRHEIYAIP-KKUMJFAQSA-N 0.000 description 1
- 101000932590 Homo sapiens Cytosolic carboxypeptidase 4 Proteins 0.000 description 1
- JXUGDUWBMKIJDC-NAKRPEOUSA-N Ile-Ala-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O JXUGDUWBMKIJDC-NAKRPEOUSA-N 0.000 description 1
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 1
- UDLAWRKOVFDKFL-PEFMBERDSA-N Ile-Asp-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N UDLAWRKOVFDKFL-PEFMBERDSA-N 0.000 description 1
- OEQKGSPBDVKYOC-ZKWXMUAHSA-N Ile-Gly-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N OEQKGSPBDVKYOC-ZKWXMUAHSA-N 0.000 description 1
- MQFGXJNSUJTXDT-QSFUFRPTSA-N Ile-Gly-Ile Chemical compound N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)O MQFGXJNSUJTXDT-QSFUFRPTSA-N 0.000 description 1
- QZZIBQZLWBOOJH-PEDHHIEDSA-N Ile-Ile-Val Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(=O)O QZZIBQZLWBOOJH-PEDHHIEDSA-N 0.000 description 1
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 1
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 1
- OWSWUWDMSNXTNE-GMOBBJLQSA-N Ile-Pro-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N OWSWUWDMSNXTNE-GMOBBJLQSA-N 0.000 description 1
- KCTIFOCXAIUQQK-QXEWZRGKSA-N Ile-Pro-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O KCTIFOCXAIUQQK-QXEWZRGKSA-N 0.000 description 1
- BCISUQVFDGYZBO-QSFUFRPTSA-N Ile-Val-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O BCISUQVFDGYZBO-QSFUFRPTSA-N 0.000 description 1
- YWCJXQKATPNPOE-UKJIMTQDSA-N Ile-Val-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N YWCJXQKATPNPOE-UKJIMTQDSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-N L-arginine Chemical compound OC(=O)[C@@H](N)CCCN=C(N)N ODKSFYDXXFIFQN-BYPYZUCNSA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 1
- KWLWZYMNUZJKMZ-IHRRRGAJSA-N Leu-Pro-Leu Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O KWLWZYMNUZJKMZ-IHRRRGAJSA-N 0.000 description 1
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 1
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 1
- VDIARPPNADFEAV-WEDXCCLWSA-N Leu-Thr-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O VDIARPPNADFEAV-WEDXCCLWSA-N 0.000 description 1
- LJBVRCDPWOJOEK-PPCPHDFISA-N Leu-Thr-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LJBVRCDPWOJOEK-PPCPHDFISA-N 0.000 description 1
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 1
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- AAORVPFVUIHEAB-YUMQZZPRSA-N Lys-Asp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O AAORVPFVUIHEAB-YUMQZZPRSA-N 0.000 description 1
- WTZUSCUIVPVCRH-SRVKXCTJSA-N Lys-Gln-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N WTZUSCUIVPVCRH-SRVKXCTJSA-N 0.000 description 1
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 1
- LCMWVZLBCUVDAZ-IUCAKERBSA-N Lys-Gly-Glu Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CCC([O-])=O LCMWVZLBCUVDAZ-IUCAKERBSA-N 0.000 description 1
- SLQJJFAVWSZLBL-BJDJZHNGSA-N Lys-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN SLQJJFAVWSZLBL-BJDJZHNGSA-N 0.000 description 1
- RIJCHEVHFWMDKD-SRVKXCTJSA-N Lys-Lys-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RIJCHEVHFWMDKD-SRVKXCTJSA-N 0.000 description 1
- YXPJCVNIDDKGOE-MELADBBJSA-N Lys-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N)C(=O)O YXPJCVNIDDKGOE-MELADBBJSA-N 0.000 description 1
- HKXSZKJMDBHOTG-CIUDSAMLSA-N Lys-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN HKXSZKJMDBHOTG-CIUDSAMLSA-N 0.000 description 1
- YRNRVKTYDSLKMD-KKUMJFAQSA-N Lys-Ser-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YRNRVKTYDSLKMD-KKUMJFAQSA-N 0.000 description 1
- CAVRAQIDHUPECU-UVOCVTCTSA-N Lys-Thr-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAVRAQIDHUPECU-UVOCVTCTSA-N 0.000 description 1
- XGZDDOKIHSYHTO-SZMVWBNQSA-N Lys-Trp-Glu Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 XGZDDOKIHSYHTO-SZMVWBNQSA-N 0.000 description 1
- BWECSLVQIWEMSC-IHRRRGAJSA-N Lys-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCCN)N BWECSLVQIWEMSC-IHRRRGAJSA-N 0.000 description 1
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 1
- QGQGAIBGTUJRBR-NAKRPEOUSA-N Met-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCSC QGQGAIBGTUJRBR-NAKRPEOUSA-N 0.000 description 1
- OBVHKUFUDCPZDW-JYJNAYRXSA-N Met-Arg-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OBVHKUFUDCPZDW-JYJNAYRXSA-N 0.000 description 1
- NCVJJAJVWILAGI-SRVKXCTJSA-N Met-Gln-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N NCVJJAJVWILAGI-SRVKXCTJSA-N 0.000 description 1
- PHWSCIFNNLLUFJ-NHCYSSNCSA-N Met-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCSC)N PHWSCIFNNLLUFJ-NHCYSSNCSA-N 0.000 description 1
- YORIKIDJCPKBON-YUMQZZPRSA-N Met-Glu-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YORIKIDJCPKBON-YUMQZZPRSA-N 0.000 description 1
- UZWMJZSOXGOVIN-LURJTMIESA-N Met-Gly-Gly Chemical compound CSCC[C@H](N)C(=O)NCC(=O)NCC(O)=O UZWMJZSOXGOVIN-LURJTMIESA-N 0.000 description 1
- QGRJTULYDZUBAY-ZPFDUUQYSA-N Met-Ile-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O QGRJTULYDZUBAY-ZPFDUUQYSA-N 0.000 description 1
- HOZNVKDCKZPRER-XUXIUFHCSA-N Met-Lys-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HOZNVKDCKZPRER-XUXIUFHCSA-N 0.000 description 1
- CULGJGUDIJATIP-STQMWFEESA-N Met-Tyr-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 CULGJGUDIJATIP-STQMWFEESA-N 0.000 description 1
- QAVZUKIPOMBLMC-AVGNSLFASA-N Met-Val-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C QAVZUKIPOMBLMC-AVGNSLFASA-N 0.000 description 1
- LBSWWNKMVPAXOI-GUBZILKMSA-N Met-Val-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O LBSWWNKMVPAXOI-GUBZILKMSA-N 0.000 description 1
- ZOKXTWBITQBERF-UHFFFAOYSA-N Molybdenum Chemical compound [Mo] ZOKXTWBITQBERF-UHFFFAOYSA-N 0.000 description 1
- 101001033003 Mus musculus Granzyme F Proteins 0.000 description 1
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 1
- YYRCPTVAPLQRNC-ULQDDVLXSA-N Phe-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CC1=CC=CC=C1 YYRCPTVAPLQRNC-ULQDDVLXSA-N 0.000 description 1
- MPFGIYLYWUCSJG-AVGNSLFASA-N Phe-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MPFGIYLYWUCSJG-AVGNSLFASA-N 0.000 description 1
- MGLBSROLWAWCKN-FCLVOEFKSA-N Phe-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MGLBSROLWAWCKN-FCLVOEFKSA-N 0.000 description 1
- XALFIVXGQUEGKV-JSGCOSHPSA-N Phe-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 XALFIVXGQUEGKV-JSGCOSHPSA-N 0.000 description 1
- APKRGYLBSCWJJP-FXQIFTODSA-N Pro-Ala-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O APKRGYLBSCWJJP-FXQIFTODSA-N 0.000 description 1
- GRIRJQGZZJVANI-CYDGBPFRSA-N Pro-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H]1CCCN1 GRIRJQGZZJVANI-CYDGBPFRSA-N 0.000 description 1
- LQZZPNDMYNZPFT-KKUMJFAQSA-N Pro-Gln-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LQZZPNDMYNZPFT-KKUMJFAQSA-N 0.000 description 1
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 1
- FEVDNIBDCRKMER-IUCAKERBSA-N Pro-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEVDNIBDCRKMER-IUCAKERBSA-N 0.000 description 1
- XYSXOCIWCPFOCG-IHRRRGAJSA-N Pro-Leu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XYSXOCIWCPFOCG-IHRRRGAJSA-N 0.000 description 1
- MCWHYUWXVNRXFV-RWMBFGLXSA-N Pro-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 MCWHYUWXVNRXFV-RWMBFGLXSA-N 0.000 description 1
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 1
- DCHQYSOGURGJST-FJXKBIBVSA-N Pro-Thr-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O DCHQYSOGURGJST-FJXKBIBVSA-N 0.000 description 1
- FZXSYIPVAFVYBH-KKUMJFAQSA-N Pro-Tyr-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O FZXSYIPVAFVYBH-KKUMJFAQSA-N 0.000 description 1
- 241000192137 Prochlorococcus marinus Species 0.000 description 1
- OFOBLEOULBTSOW-UHFFFAOYSA-N Propanedioic acid Natural products OC(=O)CC(O)=O OFOBLEOULBTSOW-UHFFFAOYSA-N 0.000 description 1
- 229940124158 Protease/peptidase inhibitor Drugs 0.000 description 1
- 108010003581 Ribulose-bisphosphate carboxylase Proteins 0.000 description 1
- 101100230601 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) HBT1 gene Proteins 0.000 description 1
- 101100544819 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) YPT31 gene Proteins 0.000 description 1
- LVVBAKCGXXUHFO-ZLUOBGJFSA-N Ser-Ala-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O LVVBAKCGXXUHFO-ZLUOBGJFSA-N 0.000 description 1
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 1
- GZSZPKSBVAOGIE-CIUDSAMLSA-N Ser-Lys-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O GZSZPKSBVAOGIE-CIUDSAMLSA-N 0.000 description 1
- GYDFRTRSSXOZCR-ACZMJKKPSA-N Ser-Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GYDFRTRSSXOZCR-ACZMJKKPSA-N 0.000 description 1
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 1
- PCMZJFMUYWIERL-ZKWXMUAHSA-N Ser-Val-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PCMZJFMUYWIERL-ZKWXMUAHSA-N 0.000 description 1
- ITJLNEXJUADEMK-UHFFFAOYSA-N Shirin Natural products CCC(C)(O)c1c(Cl)c(OC)c(C)c2OC(=O)c3c(C)c(Cl)c(O)c(Cl)c3Oc12 ITJLNEXJUADEMK-UHFFFAOYSA-N 0.000 description 1
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 1
- IRKWVRSEQFTGGV-VEVYYDQMSA-N Thr-Asn-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IRKWVRSEQFTGGV-VEVYYDQMSA-N 0.000 description 1
- JBHMLZSKIXMVFS-XVSYOHENSA-N Thr-Asn-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JBHMLZSKIXMVFS-XVSYOHENSA-N 0.000 description 1
- WLDUCKSCDRIVLJ-NUMRIWBASA-N Thr-Gln-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O WLDUCKSCDRIVLJ-NUMRIWBASA-N 0.000 description 1
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 1
- LCCSEJSPBWKBNT-OSUNSFLBSA-N Thr-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N LCCSEJSPBWKBNT-OSUNSFLBSA-N 0.000 description 1
- QHUWWSQZTFLXPQ-FJXKBIBVSA-N Thr-Met-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O QHUWWSQZTFLXPQ-FJXKBIBVSA-N 0.000 description 1
- RTAQQCXQSZGOHL-UHFFFAOYSA-N Titanium Chemical compound [Ti] RTAQQCXQSZGOHL-UHFFFAOYSA-N 0.000 description 1
- NXJZCPKZIKTYLX-XEGUGMAKSA-N Trp-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N NXJZCPKZIKTYLX-XEGUGMAKSA-N 0.000 description 1
- NMOIRIIIUVELLY-WDSOQIARSA-N Trp-Val-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)C(C)C)=CNC2=C1 NMOIRIIIUVELLY-WDSOQIARSA-N 0.000 description 1
- 108090000631 Trypsin Proteins 0.000 description 1
- 102000004142 Trypsin Human genes 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- JAGGEZACYAAMIL-CQDKDKBSSA-N Tyr-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=C(C=C1)O)N JAGGEZACYAAMIL-CQDKDKBSSA-N 0.000 description 1
- VTCKHZJKWQENKX-KBPBESRZSA-N Tyr-Lys-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O VTCKHZJKWQENKX-KBPBESRZSA-N 0.000 description 1
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 1
- GOPQNCQSXBJAII-ULQDDVLXSA-N Tyr-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N GOPQNCQSXBJAII-ULQDDVLXSA-N 0.000 description 1
- 240000006064 Urena lobata Species 0.000 description 1
- AZSHAZJLOZQYAY-FXQIFTODSA-N Val-Ala-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O AZSHAZJLOZQYAY-FXQIFTODSA-N 0.000 description 1
- VDPRBUOZLIFUIM-GUBZILKMSA-N Val-Arg-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](C(C)C)N VDPRBUOZLIFUIM-GUBZILKMSA-N 0.000 description 1
- COYSIHFOCOMGCF-WPRPVWTQSA-N Val-Arg-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-WPRPVWTQSA-N 0.000 description 1
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 1
- DBOXBUDEAJVKRE-LSJOCFKGSA-N Val-Asn-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DBOXBUDEAJVKRE-LSJOCFKGSA-N 0.000 description 1
- KZKMBGXCNLPYKD-YEPSODPASA-N Val-Gly-Thr Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O KZKMBGXCNLPYKD-YEPSODPASA-N 0.000 description 1
- HQYVQDRYODWONX-DCAQKATOSA-N Val-His-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CO)C(=O)O)N HQYVQDRYODWONX-DCAQKATOSA-N 0.000 description 1
- WDIWOIRFNMLNKO-ULQDDVLXSA-N Val-Leu-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WDIWOIRFNMLNKO-ULQDDVLXSA-N 0.000 description 1
- SVFRYKBZHUGKLP-QXEWZRGKSA-N Val-Met-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SVFRYKBZHUGKLP-QXEWZRGKSA-N 0.000 description 1
- CEKSLIVSNNGOKH-KZVJFYERSA-N Val-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](C(C)C)N)O CEKSLIVSNNGOKH-KZVJFYERSA-N 0.000 description 1
- PQSNETRGCRUOGP-KKHAAJSZSA-N Val-Thr-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O PQSNETRGCRUOGP-KKHAAJSZSA-N 0.000 description 1
- HTONZBWRYUKUKC-RCWTZXSCSA-N Val-Thr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HTONZBWRYUKUKC-RCWTZXSCSA-N 0.000 description 1
- RTJPAGFXOWEBAI-SRVKXCTJSA-N Val-Val-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RTJPAGFXOWEBAI-SRVKXCTJSA-N 0.000 description 1
- WHNSHJJNWNSTSU-BZSNNMDCSA-N Val-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 WHNSHJJNWNSTSU-BZSNNMDCSA-N 0.000 description 1
- 238000011481 absorbance measurement Methods 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 239000002671 adjuvant Substances 0.000 description 1
- 108010070783 alanyltyrosine Proteins 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 125000000129 anionic group Chemical group 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 230000000890 antigenic effect Effects 0.000 description 1
- 230000003078 antioxidant effect Effects 0.000 description 1
- 239000007864 aqueous solution Substances 0.000 description 1
- 229960003589 arginine hydrochloride Drugs 0.000 description 1
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 1
- 108010005774 beta-Galactosidase Proteins 0.000 description 1
- 102000005936 beta-Galactosidase Human genes 0.000 description 1
- 239000011230 binding agent Substances 0.000 description 1
- 238000005842 biochemical reaction Methods 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 239000012620 biological material Substances 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 239000002775 capsule Substances 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 239000013592 cell lysate Substances 0.000 description 1
- 239000006285 cell suspension Substances 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 229960002376 chymotrypsin Drugs 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 229910052802 copper Inorganic materials 0.000 description 1
- 239000010949 copper Substances 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 239000006184 cosolvent Substances 0.000 description 1
- 238000000604 cryogenic transmission electron microscopy Methods 0.000 description 1
- 101150052710 csoS1B gene Proteins 0.000 description 1
- 101150101649 csoS1D gene Proteins 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 230000001086 cytosolic effect Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 230000002939 deleterious effect Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 229910003460 diamond Inorganic materials 0.000 description 1
- 239000010432 diamond Substances 0.000 description 1
- 239000007884 disintegrant Substances 0.000 description 1
- 238000004090 dissolution Methods 0.000 description 1
- 239000003937 drug carrier Substances 0.000 description 1
- 238000012377 drug delivery Methods 0.000 description 1
- 238000000635 electron micrograph Methods 0.000 description 1
- 239000012149 elution buffer Substances 0.000 description 1
- 108010048367 enhanced green fluorescent protein Proteins 0.000 description 1
- 239000000945 filler Substances 0.000 description 1
- 238000000799 fluorescence microscopy Methods 0.000 description 1
- 239000013505 freshwater Substances 0.000 description 1
- 238000007306 functionalization reaction Methods 0.000 description 1
- ZZUFCTLCJUWOSV-UHFFFAOYSA-N furosemide Chemical compound C1=C(Cl)C(S(=O)(=O)N)=CC(C(O)=O)=C1NCC1=CC=CO1 ZZUFCTLCJUWOSV-UHFFFAOYSA-N 0.000 description 1
- LYQGMALGKYWNIU-UHFFFAOYSA-K gadolinium(3+);triacetate Chemical compound [Gd+3].CC([O-])=O.CC([O-])=O.CC([O-])=O LYQGMALGKYWNIU-UHFFFAOYSA-K 0.000 description 1
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 1
- 229960002449 glycine Drugs 0.000 description 1
- 229960001269 glycine hydrochloride Drugs 0.000 description 1
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 1
- 239000003979 granulating agent Substances 0.000 description 1
- 238000010438 heat treatment Methods 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 238000007654 immersion Methods 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 239000000411 inducer Substances 0.000 description 1
- 230000006698 induction Effects 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 238000011090 industrial biotechnology method and process Methods 0.000 description 1
- 238000001802 infusion Methods 0.000 description 1
- 239000007924 injection Substances 0.000 description 1
- 238000002347 injection Methods 0.000 description 1
- 230000009878 intermolecular interaction Effects 0.000 description 1
- 238000004255 ion exchange chromatography Methods 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 101150066555 lacZ gene Proteins 0.000 description 1
- 238000011068 loading method Methods 0.000 description 1
- 230000033001 locomotion Effects 0.000 description 1
- 239000000314 lubricant Substances 0.000 description 1
- 230000001320 lysogenic effect Effects 0.000 description 1
- 108010003700 lysyl aspartic acid Proteins 0.000 description 1
- 108010054155 lysyllysine Proteins 0.000 description 1
- 108091005958 mTurquoise2 Proteins 0.000 description 1
- VZCYOOQTPOCHFL-UPHRSURJSA-N maleic acid Chemical compound OC(=O)\C=C/C(O)=O VZCYOOQTPOCHFL-UPHRSURJSA-N 0.000 description 1
- 239000011976 maleic acid Substances 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 230000003340 mental effect Effects 0.000 description 1
- 238000012269 metabolic engineering Methods 0.000 description 1
- 238000006241 metabolic reaction Methods 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 238000001000 micrograph Methods 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 238000001823 molecular biology technique Methods 0.000 description 1
- 229910052750 molybdenum Inorganic materials 0.000 description 1
- 239000011733 molybdenum Substances 0.000 description 1
- 230000035772 mutation Effects 0.000 description 1
- 210000004897 n-terminal region Anatomy 0.000 description 1
- 239000002086 nanomaterial Substances 0.000 description 1
- 239000013642 negative control Substances 0.000 description 1
- 229910052757 nitrogen Inorganic materials 0.000 description 1
- 238000007911 parenteral administration Methods 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 1
- 102000013415 peroxidase activity proteins Human genes 0.000 description 1
- 108040007629 peroxidase activity proteins Proteins 0.000 description 1
- 239000000546 pharmaceutical excipient Substances 0.000 description 1
- 230000000144 pharmacologic effect Effects 0.000 description 1
- 230000035479 physiological effects, processes and functions Effects 0.000 description 1
- 239000000049 pigment Substances 0.000 description 1
- 230000037039 plant physiology Effects 0.000 description 1
- 229920000729 poly(L-lysine) polymer Polymers 0.000 description 1
- 229920002401 polyacrylamide Polymers 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 229920000137 polyphosphoric acid Polymers 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 239000003755 preservative agent Substances 0.000 description 1
- 238000007639 printing Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 230000001681 protective effect Effects 0.000 description 1
- 239000012460 protein solution Substances 0.000 description 1
- 230000004850 protein–protein interaction Effects 0.000 description 1
- 239000002510 pyrogen Substances 0.000 description 1
- 230000007115 recruitment Effects 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 239000011347 resin Substances 0.000 description 1
- 229920005989 resin Polymers 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 238000007480 sanger sequencing Methods 0.000 description 1
- 239000010420 shell particle Substances 0.000 description 1
- 229910052709 silver Inorganic materials 0.000 description 1
- 239000004332 silver Substances 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 108010005652 splenotritin Proteins 0.000 description 1
- 239000003381 stabilizer Substances 0.000 description 1
- 230000000087 stabilizing effect Effects 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 230000004797 therapeutic response Effects 0.000 description 1
- 230000001988 toxicity Effects 0.000 description 1
- 231100000419 toxicity Toxicity 0.000 description 1
- VZCYOOQTPOCHFL-UHFFFAOYSA-N trans-butenedioic acid Natural products OC(=O)C=CC(O)=O VZCYOOQTPOCHFL-UHFFFAOYSA-N 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 239000012588 trypsin Substances 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 108010073969 valyllysine Proteins 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
- A61K39/385—Haptens or antigens, bound to carriers
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/32—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Bacillus (G)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/80—Vectors or expression systems specially adapted for eukaryotic hosts for fungi
- C12N15/81—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
- A61K2039/51—Medicinal preparations containing antigens or antibodies comprising whole cells, viruses or DNA/RNA
- A61K2039/525—Virus
- A61K2039/5256—Virus expressing foreign proteins
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
- A61K2039/51—Medicinal preparations containing antigens or antibodies comprising whole cells, viruses or DNA/RNA
- A61K2039/525—Virus
- A61K2039/5258—Virus-like particles
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/30—Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Wood Science & Technology (AREA)
- Biomedical Technology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Zoology (AREA)
- General Health & Medical Sciences (AREA)
- Biochemistry (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Mycology (AREA)
- Microbiology (AREA)
- Plant Pathology (AREA)
- Physics & Mathematics (AREA)
- Medicinal Chemistry (AREA)
- Pharmacology & Pharmacy (AREA)
- Immunology (AREA)
- Epidemiology (AREA)
- Animal Behavior & Ethology (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Gastroenterology & Hepatology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Peptides Or Proteins (AREA)
- Medicinal Preparation (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
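The present invention relates to a method for producing a bacterial microcompartment virus-like particle (VLP) carrying a cargo molecule, the method comprising: introducing into, and expressing in, a host cell or organism one or more polynucleotides comprising (a) a first sequence encoding bacterial microcompartment shell protomers and a second sequence encoding a cargo molecule fused to an encapsulation peptide comprising the sequence SKITGSSGNDTQGSLITYSGGARG, and forming a microcompartment that encapsulates the cargo molecule; or (b) a first sequence encoding bacterial microcompartment shell protomers and a second sequence encoding at least one of said protomers fused to a cargo molecule or a biochemical tag, and forming a microcompartment that displays the cargo molecule or biochemical tag on its outer surface. In one embodiment, the bacterial microcompartment protomers are CsoS1A and CsoS4A from Halothiobacillus neapolitanus, or HO-H, HO-P and HO-T1 from Haliangium ochraceum.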
Description
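Technical Field
The present invention relates to bacterial microcompartment virus-like particles (VLPs) carrying cargo molecules, methods for producing said bacterial microcompartment VLPs, isolated plasmid or vector nucleic acids for producing said VLPs, compositions comprising at least one of said VLPs, uses of said VLPs, and methods of treatment using said VLPs.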
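Background
Bacterial microcompartments (BMCs) are protein shells found in some bacterial species and are thought to have evolved as a strategy for compartmentalizing certain challenging biochemical reactions [Kerfeld, C.A. et al., Nature Reviews Microbiology 16:277-290 (2018)]. These protein complexes consist of hundreds to thousands of polypeptide subunits that self-assemble into polyhedral structures ranging from 40 nm to 400 nm in diameter. The carboxysome, found in cyanobacteria and some chemotrophic bacterial species, is the earliest known example of a BMC. This protein shell encapsulates ribulose-1,5-bisphosphate carboxylase (RuBisCO) and increases its catalytic efficiency by concentrating its substrates, CO2 and ribulose-1,5-bisphosphate, in the vicinity of RuBisCO. Carboxysomes fall into two classes according to the form of RuBisCO they encapsulate. α-Carboxysomes contain Form 1A RuBisCO, found in α-cyanobacteria (typically saltwater cyanobacteria) and chemoautotrophs, whereas β-carboxysomes contain Form 1B RuBisCO, observed in β-cyanobacteria (typically freshwater cyanobacteria) [Turmo, A. et al., FEMS Microbiol Lett 364 (2017)].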
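In recent years, many atomic-scale structures of the subunits that make up BMC shells have been reported, together with atomic-scale structures of three complete shells: a reduced-component β-carboxysome from Halothece sp. PCC 7418, a synthetic glycyl radical-associated BMC group 2 (GRM2) shell from Klebsiella pneumoniae, and a BMC of unknown function from Haliangium ochraceum (HO-BMC) [Kalnins, G. et al., Nature Communications 11:388 (2020); Sutter, M. et al., Science 356(6344):1293-1297 (2017); Sutter, M. et al., Plant Physiology (2019)]. Although BMCs vary widely in appearance and function, the tertiary structures of the main building blocks are conserved. BMC-H domain proteins (pfam00936) are the stoichiometrically dominant module and form homohexamers with C6 geometry. BMC-T proteins are formed by a tandem repeat of two BMC-H domains and assemble as trimers, or as double-stacked, closely apposed trimers, with pseudohexagonal symmetry. The BMC-P domain unit (pfam03319) is a minor but essential module of the BMC shell complex. BMC-P protomers assemble into homopentamers with pyramidal geometry that occupy the vertices of the shell, capping the facets formed by the BMC-T and BMC-H proteins. This gives BMCs their polyhedral appearance. A detailed molecular understanding of the architecture of these components has given rise to the field of BMC shell engineering. Such efforts include targeting heterologous protein cargo to the shell lumen, either by using encapsulation peptides (EPs), which are short peptide sequences derived from cognate lumen proteins, or by protein engineering of the shell components [Lawrence, A.D. et al., ACS Synthetic Biology 3:454-465 (2014)]. These modifications aim to repurpose BMCs as intracellular nanoreactors or as scaffolds for biomolecule delivery.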
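The α-carboxysome from Halothiobacillus neapolitanus has previously been produced in Escherichia coli by transplanting the entire α-carboxysome operon (cso) (Figure 1) into the recombinant host [Bonacci, W. et al., PNAS 109:478-483 (2012)]. The genes found in the operon include three BMC-H paralogs (csoS1ABC), one BMC-T protein (csoS1D) and two BMC-P paralogs (csoS4AB), as well as genes encoding the RuBisCO large and small subunits (cbbLS), a carbonic anhydrase (csoS3/SCA) and an intrinsically disordered protein (IDP) (csoS2). This IDP is known to be essential for α-carboxysome assembly because it promotes interactions between the shell and the lumen proteins [Cai, F. et al., Life (Basel, Switzerland) 5:1141-1171 (2015)]. BMCs devoid of their native cargo are better suited to engineering applications, since heterologous cargo can be packaged into the lumen more efficiently. However, despite decades of research on its structure and biochemistry, the H. neapolitanus α-carboxysome has never been recombinantly expressed in a structurally closed form from fewer than the ten genes listed above [Bonacci, W. et al., PNAS 109:478-483 (2012)].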
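There is a need for bacterial microcompartment virus-like particles that can be produced more efficiently in recombinant bacterial and yeast hosts, and for alternative ways of encapsulating cargo molecules and/or displaying them on the surface.
Summary of the Invention
It has surprisingly been found that BMC VLPs can be formed using two or three types of BMC protomers from Halothiobacillus neapolitanus and Haliangium ochraceum, respectively. The H. neapolitanus BMC VLP is referred to as Cso-BMC and the H. ochraceum BMC VLP as HO-BMC. Furthermore, cargo molecules can be encapsulated within Cso-BMC using a novel short peptide derived from CsoS2, termed S2CP, or a variant thereof termed S2CP(30). Cso-BMC that encapsulates cargo molecules adopts a distinct shell conformation not observed in Cso-BMC without encapsulated cargo. Notably, the peptide termini of the protomers of both shells face outward, which allows genetic fusion of proteins of interest. Cargo molecules can therefore be displayed on the outer surface of the BMC VLPs of the invention, either by expressing the cargo molecule fused to a terminus of a protomer, or by binding a cargo molecule that carries a complementary binding partner to a biochemical tag attached to a protomer terminus.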
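According to a first aspect, the present invention provides a method for producing a bacterial microcompartment virus-like particle (VLP) carrying a cargo molecule, the method comprising
A) introducing into a host cell or organism one or more heterologous polynucleotides comprising (i) a first sequence encoding bacterial microcompartment shell protomers; and (ii) a second sequence encoding a cargo molecule fused to an encapsulation peptide, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO: 1 (SKITGSSGNDTQGSLITYSGGARG) or SEQ ID NO: 94 (KPEKPGSKITGSSGNDTQGSLITYSGGARG), or a functional variant thereof;
a) expressing the first sequence and the second sequence; and
b) forming a microcompartment that encapsulates the cargo molecule; or
B) introducing into a host cell or organism one or more polynucleotides comprising (i) a first sequence encoding bacterial microcompartment shell protomers; and (ii) a second sequence encoding at least one of said protomers fused to a cargo molecule or a biochemical tag;
a) expressing the first sequence and the second sequence; and
b) forming a microcompartment that displays the cargo molecule on its outer surface, or forming a microcompartment that displays the biochemical tag on its outer surface, to which a cargo molecule comprising a complementary tag is able to bind.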
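In some embodiments, the functional variant of the encapsulation peptide set forth in SEQ ID NO: 1 comprises, at its amino terminus, 1, 2, 3, 4 or 5 of the additional amino acids found at the amino terminus of SEQ ID NO: 94. For example, a variant of the encapsulation peptide of SEQ ID NO: 1 may comprise "G", "PG", "KPG" and so on at its amino terminus and retain function. Such variants are intermediates between the sequences SEQ ID NO: 1 and SEQ ID NO: 94.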
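The intermediate variants described above can be enumerated mechanically. The following is a minimal sketch, assuming only the two peptide sequences quoted in the text; the function name `intermediate_variants` is hypothetical and introduced purely for illustration. It says nothing about whether any given variant is functional.

```python
# SEQ ID NO: 1 (S2CP) and SEQ ID NO: 94 (S2CP(30)), as quoted in the text.
S2CP = "SKITGSSGNDTQGSLITYSGGARG"
S2CP_30 = "KPEKPGSKITGSSGNDTQGSLITYSGGARG"


def intermediate_variants(short: str, long: str) -> list[str]:
    """Return the peptides strictly between `short` and `long`.

    Each variant is `short` extended at its amino terminus by the last
    k residues of the extra N-terminal stretch present in `long`.
    (Hypothetical helper, not part of the patent.)
    """
    if not long.endswith(short):
        raise ValueError("the longer peptide must end with the shorter one")
    extension = long[: len(long) - len(short)]  # the extra N-terminal residues
    # k = 1 .. len(extension) - 1 gives the strictly intermediate peptides.
    return [extension[-k:] + short for k in range(1, len(extension))]


variants = intermediate_variants(S2CP, S2CP_30)
for v in variants:
    print(len(v), v)
```

The first three variants begin with "G", "PG" and "KPG", matching the examples given in the text.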
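In some embodiments, the encapsulation peptide is encoded by a polynucleotide sequence that, owing to the redundancy of the genetic code, has at least 80% identity, at least 85% identity, at least 90% identity, at least 95% identity or 100% identity to the nucleic acid sequence set forth in SEQ ID NO: 7 or SEQ ID NO: 95 (S2CP(30)), respectively.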
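To illustrate why the redundancy of the genetic code leads to identity thresholds below 100%, the sketch below compares two toy DNA sequences that encode the same four residues (S-K-I-T, the first four residues of SEQ ID NO: 1). Both DNA sequences are invented for illustration and are not SEQ ID NO: 7 or SEQ ID NO: 95. The comparison is position by position over equal-length sequences; a real identity calculation would first align the sequences.

```python
# Only the eight standard-genetic-code codons needed for this toy example.
CODONS = {
    "TCT": "S", "AGC": "S",
    "AAA": "K", "AAG": "K",
    "ATT": "I", "ATC": "I",
    "ACT": "T", "ACC": "T",
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon using the small table above."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Two illustrative (invented) coding sequences for the peptide "SKIT".
dna_1 = "TCTAAAATTACT"
dna_2 = "AGCAAGATCACC"

print(translate(dna_1), translate(dna_2))
print(percent_identity(dna_1, dna_2))
```

Both toy sequences translate to the same peptide even though only half of their nucleotide positions match.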
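In some embodiments, the bacterial microcompartment protomers are derived from Halothiobacillus neapolitanus or Haliangium ochraceum.
In some embodiments, the bacterial microcompartment protomers are CsoS1A (SEQ ID NO: 2) and CsoS4A (SEQ ID NO: 3) from Halothiobacillus neapolitanus; or HO-H (SEQ ID NO: 4), HO-P (SEQ ID NO: 5) and HO-T1 (SEQ ID NO: 6) from Haliangium ochraceum, and variants thereof.
In some embodiments, the cargo molecule is at least one peptide, such as an enzyme and/or a fluorescent protein and/or an immunogenic peptide.
In some embodiments, the biochemical tag may be selected from Strep-Tag II (SII), the SpyCatcher/SpyTag (SC/ST) pair and the CC-Di-A/B (CCA/CCB) pair.
In some embodiments, expression of CsoS1A is controlled by the promoter PT7; CsoS4A by the promoter PCON5; HO-H by the yeast promoter PTDH3; HO-P by the yeast promoter PPYK1; and HO-T1 by the yeast promoter PYEF3.
In some embodiments, the host organism is Escherichia coli or Saccharomyces cerevisiae.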
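According to a second aspect, the present invention provides an engineered bacterial microcompartment VLP carrying a cargo molecule, the engineered bacterial microcompartment VLP comprising: i) bacterial microcompartment shell protomers and a cargo molecule fused to an encapsulation peptide, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO: 1 (SKITGSSGNDTQGSLITYSGGARG) or SEQ ID NO: 94 (KPEKPGSKITGSSGNDTQGSLITYSGGARG), or a functional variant thereof; or ii) bacterial microcompartment shell protomers and a cargo molecule, wherein the cargo molecule is fused to a terminus of at least one of said protomers, or wherein at least one of said protomers is fused to a tag and a cargo molecule comprising a complementary tag is bound to said tag on the outer surface of the VLP.
In some embodiments, the bacterial microcompartment protomers are derived from Halothiobacillus neapolitanus or Haliangium ochraceum.
In some embodiments, the bacterial microcompartment protomers are CsoS1A comprising the amino acid sequence set forth in SEQ ID NO: 2 and CsoS4A comprising the amino acid sequence set forth in SEQ ID NO: 3, from Halothiobacillus neapolitanus; or HO-H comprising the amino acid sequence set forth in SEQ ID NO: 4, HO-P comprising the amino acid sequence set forth in SEQ ID NO: 5 and HO-T1 comprising the amino acid sequence set forth in SEQ ID NO: 6, from Haliangium ochraceum, and variants thereof.
In some embodiments, the cargo molecule is at least one peptide, such as an enzyme and/or a fluorescent protein and/or an immunogenic peptide.
In some embodiments, the biochemical tag may be selected from Strep-Tag II (SII), the SpyCatcher/SpyTag (SC/ST) pair and the CC-Di-A/B (CCA/CCB) pair.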
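According to a third aspect, the present invention provides an isolated plasmid or vector nucleic acid comprising:
a) first DNA sequences encoding bacterial microcompartment shell protomers, each operably linked to a promoter, and b) a second DNA sequence encoding a cargo molecule fused to an encapsulation peptide, operably linked to a promoter, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO: 1 (SKITGSSGNDTQGSLITYSGGARG) or SEQ ID NO: 94 (KPEKPGSKITGSSGNDTQGSLITYSGGARG), or a functional variant thereof; or c) first DNA sequences encoding bacterial microcompartment shell protomers, each operably linked to a promoter, and d) a second DNA sequence encoding at least one of said protomers fused to a cargo molecule or a biochemical tag.
In some embodiments, the isolated plasmid or vector nucleic acid comprises the bacterial microcompartment shell protomers, promoters, cargo molecules and tags as defined above.
In some embodiments, the isolated plasmid or vector nucleic acid DNA sequences encoding the bacterial microcompartment shell protomers, cargo molecules and tags set forth in SEQ ID NO: 1-6 and 94 have, owing to the redundancy of the genetic code, at least 70%, at least 80%, at least 90% or 100% identity to the nucleic acid sequences set forth in SEQ ID NO: 7-12 and 95 (S2CP(30)), respectively.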
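According to a fourth aspect, the present invention provides a composition or combination comprising at least one engineered VLP of any aspect of the invention, for use in: a) preventing or treating a disease in a subject; or b) a biochemical process.
In some embodiments, the at least one engineered VLP comprises an enzyme for converting a prodrug.
In some embodiments, the composition may comprise one or more additional therapeutic agents. The composition may be used as a vaccine.
According to a fifth aspect, the present invention provides the use of at least one engineered VLP of any aspect of the invention in the manufacture of a medicament for preventing or treating a disease in a subject.
According to a sixth aspect, the present invention provides a method of prophylaxis or treatment, the method comprising administering to a subject in need of such treatment an effective amount of an engineered VLP of any aspect of the invention.
It should be understood that the invention is not limited to the specific embodiments described in detail below.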
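Brief Description of the Drawings
Figure 1 shows a schematic of the α-carboxysome operon (cso) from Halothiobacillus neapolitanus. The dashed line indicates ten genes between csoS1B and csoS1D that are unlikely to be associated with the carboxysome. Gene lengths and intergenic distances are not drawn to scale.
Figures 2A-2D show schematics of the plasmids developed. (A) The transcription unit (TU) acceptor plasmid pESX contains a streptomycin selection marker (streptomycin R) and the pUC origin of replication. The RFP cassette is replaced by the incoming TU upon digestion with the restriction enzyme (RE) BsmBI. (B) The pathway acceptor plasmid pCKH accepts TUs released from pESX plasmids by digestion with the RE BsaI. pCKH contains a kanamycin selection marker (kanamycin R). (C and D) Modified HcKan_O plasmids, which can attach an SII or His6 tag, or one of four FPs (mT2, meGFP, mKOκ, mCh), to the N-terminus or C-terminus of an ORF. After the ORF is inserted via BsaI, the ORF-tag fusion product (separated by a Gly-Ser-Ser linker) is released by BsmBI.
Figure 3 shows schematics of the VLP pathways created. For Cso-BMC, all terminators used are TT7. The grayscale intensity of the promoter arrows indicates their relative strength: the darker the arrow, the stronger the promoter.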
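Figures 4A-4B show the identification of a cargo-targeting peptide sequence for the α-carboxysome system. (A) Multiple sequence alignment of CsoS2 orthologs (the top 9 from different genera are shown) with the ortholog from Halothiobacillus neapolitanus reveals that the C-terminal region is highly conserved, as shown by the sequence logo. (B) His6-meGFP-S2CP pull-down assays with the shell proteins CsoS1A-SII, SII-CsoS1D and CsoS4A-SII demonstrate that S2CP mediates interaction with CsoS1A-SII only.
Figures 5A-5B show the expression and purification of α-carboxysome shell components. (A) Schematic of the synthetic operons used to express the α-carboxysome components. The shell modules are also represented by their geometric icons; CsoS4A: pentagon; CsoS1D: trimeric hexagon; CsoS1A: hexameric hexagon. (B) Fluorescence micrographs of cells expressing the pathway Cso-PmChTHC. Colocalization of meGFP-S2CP and CsoS4A-mCherry can be seen. DIC: differential interference contrast channel. Scale bar (white, bottom right) represents 2 μm. (C-F) HTEM visualization of purified protein shells in the 0.4 M NaCl elution fraction after AIEX purification of (C) Cso-PmChTHC, (D) Cso-PSIITHC, (E) Cso-PSIITH and (F) Cso-PSIIH. Scale bar (black, bottom right) represents 50 nm.
Figures 6A-6D show how S2CP acts as an encapsulation peptide. (A) Schematic depicting how encapsulation, via S2CP, of GFP tagged with the UmuD1-40 protease signal can protect it from the endogenous ClpXP protease. (B) S2CP is able to target UmuD1-40-meGFP to the lumen of the simplified carboxysome. Purified shells were analyzed by Western blot using an anti-GFP antibody. UmuD1-40meGFP was detected only from Cso-PSIITHCU,S2CP and not from Cso-PSIITHCU. Electron micrographs show that the shells produced by (C) Cso-PSIITHCU,S2CP and (D) Cso-PSIITHCU are similar. Scale bar (black, bottom right) represents 50 nm.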
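Figures 7A-7B show an atomic model of the simplified α-carboxysome shell. (A, B) Surface representation of the shell, with CsoS1A in gray and CsoS4A in light gray. The right and down arrows on light gray indicate the N-terminus and C-terminus of a CsoS4A monomer, respectively, and the up and right arrows on gray indicate the N-terminus and C-terminus of a CsoS1A monomer, respectively.
Figures 8A-8B show fluorescence micrographs of the shell probes (A) CsoS4A-mCherry and (B) meGFP-S2CP used to study interactions between shell components in vivo. When expressed alone, the probes are generally distributed evenly throughout the cytosol. Scale bar (bottom right) represents 2 μm.
Figures 9A-9C show anion-exchange (AIEX) chromatograms of affinity-purified protein from (A) the Cso-PmChTHC pathway construct, (B) Cso-PSIITHC and (C) CsoS4A-SII. The blue trace (left Y axis) indicates absorbance at 280 nm (mAU), and the green trace (right Y axis) indicates the percentage of AIEX buffer B (Tris 50 mM, NaCl 1.0 M, pH 7.9) at the elution volume shown. The TEM micrographs on the right are views of the elution fraction obtained at 0.3 M NaCl. It can be seen that CsoS4A-SII expressed on its own does not form protein shells. Scale bar (bottom right) represents 50 nm.
Figures 10A-10B show the purification of Cso-PSIITH. (A) AIEX chromatogram and (B) TEM micrograph of the elution fraction corresponding to 0.3 M NaCl. Scale bar (bottom right) represents 50 nm.
Figures 11A-11E show the purification of Cso-PSIIH. (A) AIEX chromatogram of affinity-purified protein from Cso-PSIIH. (B) TEM micrograph of the elution fraction from Cso-PSIIH corresponding to 0.3 M NaCl. TEM micrographs of (C) CsoS1A-SII, (D) CsoS1A-SII co-expressed with CsoS1D and (E) CsoS4A-SII co-expressed with CsoS1D demonstrate that these combinations do not form protein shells. Scale bar (bottom right) represents 50 nm.
Figures 12A-12D show sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis of fractions collected from AIEX purification of (A) Cso-PmChTHC, (B) PSIITHC, (C) PSIITH and (D) PSIIH. Arrows indicate the fractions used for TEM analysis, with the left and right arrows corresponding to [NaCl] = 0.3 M and 0.4 M, respectively. The protein ladder lane is labeled L, with masses shown in kDa.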
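Figure 13 shows the particle size distribution of the protein shells as measured by dynamic light scattering. (A) Cso-PmChTHC, (B) Cso-PSIITHC, (C) Cso-PSIITH and (D) Cso-PSIIH.
Figure 14 shows a table summarizing the differences between purified Cso-BMC and purified HO-BMC.
Figure 15 shows a close-up view of the Cso-BMC exterior, showing that the N-termini and C-termini of the hexamer subunits (gray) and pentamer subunits (light gray) point away from the shell lumen. Arrows indicate the N-termini and C-termini of selected hexamer and pentamer chains. The topology shown for the Cso-BMC hexamer and pentamer subunits is representative of HO-BMC.
Figure 16 shows densitometric analysis of Cso-PSIIH shells co-expressed with UmuD1-40-GFP-S2CP or UmuD1-40-GFP-S2CP(30). Approximately the same amount of shell was loaded for each shell sample (as judged by band peak area), so that the relative amounts of UmuD1-40-GFP-S2CP and UmuD1-40-GFP-S2CP(30) can be compared directly. Arrows indicate UmuD1-40-GFP-S2CP and UmuD1-40-GFP-S2CP(30).
Figures 17A-17D show an evaluation of the stability of Cso-BMC shells against common denaturing factors. (A-D) DLS spectra of empty Cso-BMC tested under the conditions shown. For each successive spectrum, the baseline is shifted vertically by 0.2 so that all spectra can be seen in one plot.
Figures 18A-18E show the loading of the enzymes APEX2 and LacZ into Cso-BMC shells. (A-B) SDS-PAGE and Western blot analysis (using an anti-His6 antibody) of Cso-BMC co-expressed with enzyme. (C-D) TEM micrographs of enzyme-loaded Cso-BMC. Scale bar (black, bottom right) represents 50 nm. (E) DLS spectra of Cso-BMC co-expressed with enzyme, with empty Cso-BMC shells as reference.
Figure 19 shows Michaelis-Menten kinetics of free and Cso-BMC-encapsulated APEX2 and LacZ enzymes.
Figures 20A-20D show an evaluation of the stabilization that Cso-BMC confers on APEX2 and LacZ against denaturing conditions. Residual enzyme activity of free and encapsulated (+shell) enzymes was obtained by normalizing activity to that of the pristine sample, shown as (A) 23 °C, (B) 0% v/v methanol, (C) no freeze-thaw, and (D) pH 8. Error bars represent one standard deviation from the mean.
Figures 21A-21E show the purification of the following HO-BMC shells: HO-HTP and HO-HTSTP+GFP-SpyCatcher. (A-B) SDS-PAGE analysis of purified shells; (C) Western blot analysis (using an anti-GFP antibody) showing the presence of GFP-SpyCatcher in the HO-HTSTP+GFP-SpyCatcher sample; (D-E) TEM micrographs of the two HO-BMC constructs. Scale bar (black, bottom right) represents 50 nm.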
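Detailed Description
For convenience, the references mentioned in this specification are listed in the form of a reference list appended at the end of the Examples. The entire contents of such references are incorporated herein by reference.
Definitions
For convenience, certain terms used in the specification, examples and appended claims are collected here.
As used herein, the term "comprising" or "including" is to be interpreted as specifying the presence of the stated features, integers, steps or components as referred to, but not precluding the presence or addition of one or more features, integers, steps or components, or groups thereof. However, in the context of the present disclosure, the term "comprising" or "including" also includes "consisting of". Variations of the word "comprising", such as "comprise" and "comprises", and of "including", such as "include" and "includes", have correspondingly varied meanings.
As used herein, the term Cso-PSIIH is used interchangeably with the term Cso-BMC.
The term "variant" as used herein refers to an amino acid sequence that is altered by one or more amino acids but retains, in the present invention, the ability to act as an encapsulation peptide. A variant may have "conservative" changes, in which the substituted amino acid has similar structural or chemical properties (for example, replacement of leucine with isoleucine). More rarely, a variant may have "non-conservative" changes (for example, replacement of glycine with tryptophan). Analogous minor variations may also include amino acid deletions or insertions, or both. Guidance in determining which amino acid residues may be substituted, inserted or deleted without abolishing biological or immunological activity can be found using computer programs well known in the art, for example software from DNASTAR, Inc. (Madison, Wisconsin, USA). One type of variant is, for example, a peptide having the amino acid sequence set forth in SEQ ID NO: 94, which is longer than the sequence set forth in SEQ ID NO: 1, is likewise derived from CsoS2, and retains the encapsulation function of SEQ ID NO: 1. Other variants having amino acid sequences intermediate between SEQ ID NO: 1 and SEQ ID NO: 94 are expected to retain functionality.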
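The compositions or combinations of the invention will generally be administered as a pharmaceutical formulation in admixture with a pharmaceutically acceptable adjuvant, diluent or carrier, which may be selected with due regard to the intended route of administration and standard pharmaceutical practice. Such pharmaceutically acceptable carriers may be chemically inert to the active compounds and may have no detrimental side effects or toxicity under the conditions of use. Suitable pharmaceutical formulations may be found in, for example, Remington: The Science and Practice of Pharmacy, 19th edition, Mack Printing Company, Easton, Pennsylvania (1995). For parenteral administration, a parenterally acceptable aqueous solution may be employed, which is pyrogen-free and has the requisite pH, isotonicity and stability. Suitable solutions are well known to the skilled person, and numerous methods are described in the literature. A brief review of drug delivery methods may also be found in, for example, Langer (Science 249:1527 (1990)).
In addition, the preparation of suitable formulations may be achieved routinely by the skilled person using routine techniques and/or in accordance with standard and/or accepted pharmaceutical practice.
The amount of the composition or combination in any pharmaceutical formulation used in accordance with the invention will depend on various factors, such as the severity of the condition to be treated, the particular patient to be treated, and the compound or compounds employed. In some embodiments, the BMC-VLP displays antigenic molecules on its surface and is used as a vaccine. In any event, the amount of the composition or combination in the formulation may be determined routinely by the skilled person.
For example, a solid oral composition such as a tablet or capsule may contain 1% to 99% (w/w) active ingredient; 0 to 99% (w/w) diluent or filler; 0 to 20% (w/w) disintegrant; 0 to 5% (w/w) lubricant; 0 to 5% (w/w) flow aid; 0 to 50% (w/w) granulating agent or binder; 0 to 5% (w/w) antioxidant; and 0 to 5% (w/w) pigment. A controlled-release tablet may additionally contain 0 to 90% (w/w) of a release-controlling polymer.
Parenteral formulations (such as solutions or suspensions for injection, or solutions for infusion) may contain 1% to 50% (w/w) active ingredient; 50% (w/w) to 99% (w/w) of a liquid or semi-solid carrier or vehicle (for example a solvent such as water); and 0 to 20% (w/w) of one or more other excipients such as buffers, antioxidants, suspension stabilizers, tonicity-adjusting agents and preservatives.
Depending on the disorder and the patient to be treated, as well as the route of administration, the BMC-VLP-containing compositions or combinations of the invention may be administered at varying therapeutically effective doses to a patient in need thereof.
However, in the context of the present invention, the dose administered to a mammal, particularly a human, should be sufficient to effect a therapeutic response in the mammal over a reasonable time frame. Those skilled in the art will recognize that the selection of the exact dose and composition and the most appropriate delivery regimen will also be influenced by, inter alia, the pharmacological properties of the formulation; the nature and severity of the condition being treated; the physical condition and mental acuity of the recipient; the potency of the specific compound; and the age, condition, body weight, sex and response of the patient to be treated.
Although the invention has now been described in general terms, it will be more readily understood by reference to the following examples, which are provided by way of illustration and are not intended to limit the invention.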
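Examples
Standard molecular biology techniques known in the art and not specifically described generally follow Sambrook and Russell, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York (2001).
Example 1:
Materials and Methods
Bacterial strains and culture
Escherichia coli Acella (DE3) (EdgeBio) cells were used for molecular cloning and protein expression of the Cso-BMC VLPs. Cells were grown in Lysogeny Broth (LB) or Terrific Broth (TB) supplemented with the appropriate antibiotic (kanamycin or streptomycin) at 50 μg/mL.
Saccharomyces cerevisiae (hereinafter, yeast) cells were used for molecular cloning and protein expression of the HO-BMC VLPs. Plasmid-based expression in yeast relies on nutritional selection, which requires a formulated growth medium. The medium lacks a key nutrient required by the engineered yeast strain, which can instead be produced by a protein encoded by a gene on the plasmid. On pCKU, the gene product Ura3p produces uracil. Because this chemically defined medium is expensive (about SGD 30/L), we sought to integrate the pathways chromosomally into the yeast genome so that the yeast strain could still express the pathway proteins on less expensive (about SGD 5/L) undefined medium (yeast extract-peptone-dextrose). We therefore developed pGAU-YMRWδ15, which installs homology sites on both sides of the pathway to be integrated into yeast. Using the endogenous homologous recombination machinery of yeast, the desired pathway is inserted at the YMRWδ15 chromosomal site, and the proteins on the pathway can be expressed without selection.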
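Golden Gate assembly of plasmids
Golden Gate one-pot plasmid assembly largely followed a previously published protocol, with slight modifications [Guo, Y. et al., Nucleic Acids Res 43:e88 (2015)]. For insertion of one to three fragments, the reaction pot was prepared with 1 μL T4 ligase buffer (NEB), 0.5 μL 10x purified bovine serum albumin (BSA, NEB), 5 U BsaI (NEB) or Esp3I (Thermo), 0.2 U T4 ligase (Thermo), 15 ng destination plasmid and 1 to 3 μL of the insert fragment(s), and made up to 10 μL with water. The reaction was thermocycled between 37 °C and 18 °C, with a 5 min incubation at each step, for 15 cycles, followed by a 55 °C step for 15 min to digest unassembled plasmid while suppressing ligation, thereby reducing the number of colonies carrying the original destination vector. For assemblies of more than three inserts, the amounts of restriction enzyme and ligase were doubled, the amount of destination plasmid was increased to 75 ng, the number of thermal cycles was increased to 70, and inserts and destination plasmid were added at a 2:1 molar ratio rather than by fixed volume. This was done to increase the number of correctly assembled plasmids.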
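The thermocycling schedule described above can be written down as data, which makes the total run time easy to check. This is a minimal sketch that encodes only the step temperatures, hold times and cycle counts stated in the text; the names `Step`, `golden_gate_program` and `total_minutes` are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    minutes: int   # hold time at that temperature


def golden_gate_program(n_inserts: int) -> list[Step]:
    """Build the cycling program for a given number of insert fragments.

    Per the text: 15 cycles for one to three inserts, 70 cycles for more
    than three, each cycle being 5 min at 37 C then 5 min at 18 C,
    followed by a single 15 min step at 55 C.
    """
    if n_inserts < 1:
        raise ValueError("at least one insert fragment is required")
    cycles = 15 if n_inserts <= 3 else 70
    program: list[Step] = []
    for _ in range(cycles):
        program.append(Step(37, 5))   # digestion-favoring step
        program.append(Step(18, 5))   # ligation-favoring step
    program.append(Step(55, 15))      # final digestion of unassembled plasmid
    return program


def total_minutes(program: list[Step]) -> int:
    """Sum of hold times, excluding ramping."""
    return sum(step.minutes for step in program)


print(total_minutes(golden_gate_program(2)))
print(total_minutes(golden_gate_program(5)))
```

Under these assumptions the hold times add up to 165 min for the standard program and 715 min for the extended one.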
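Codon-optimized BMC genes were synthesized (BioBasic) and cloned into HcKan_O. Promoter and terminator parts were amplified as PCR products from various templates and cloned into HcKan_P and HcKan_T, respectively.
meGFP, the monomeric form of eGFP obtained through the A206K mutation, was generated by site-directed mutagenesis (SDM) using HiFi assembly. The plasmids pES1-7, pCKH and the modified HcKan_O plasmids used to add fluorescent proteins, S2CP or purification tags to ORFs were likewise generated using HiFi assembly.
Primers used for sequencing
The specific oligonucleotide primers used to sequence the various plasmid constructs are shown in Table 1.
Table 1. Primers used for sequencing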
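Sequence alignment
CsoS2 sequences were aligned with Clustal Omega, and the output alignment file was generated with JalView 2 [Waterhouse, A.M. et al., Bioinformatics 25:1189-1191 (2009); Sievers, F. and Higgins, D.G., Methods in Molecular Biology (Clifton, N.J.) 1079:105-116 (2014)]. Table 2 details the accession numbers of the sequences used for the alignment.
Table 2. GenBank accession numbers of the sequences shown and used in the multiple sequence alignment.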
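VLP purification and cargo loading analysis
For Cso-BMC: Acella (DE3) cells were cultured in 500 mL Terrific Broth (TB, BioBasic) supplemented with kanamycin at 50 mg/L and shaken at 37 °C until the optical density of the culture (at λ = 600 nm) reached about 0.6 to 1.0. The culture was then cooled to 25 °C, and isopropyl β-D-1-thiogalactopyranoside (IPTG, GoldBio) was added to 50 μM to induce protein expression. Cells were grown at 25 °C for about 30 h and then harvested by centrifugation. Cells were lysed by three passes through an M-110P microfluidizer (Microfluidics) at 15,000 psi. The protease inhibitor phenylmethylsulfonyl fluoride (PMSF) was added to the cell lysate at 0.1 mM. The lysate was centrifuged twice at 20,000 x g for 20 min each. The clarified lysate was loaded onto a StrepTrap™ HP 5 mL column (GE Life Sciences) at a linear flow rate of 1 mL/min. Purification was carried out by FPLC at a linear flow rate of 3 mL/min, with a 12 column volume (CV) wash in binding buffer (Tris·HCl 100 mM, NaCl 150 mM, pH 8.0) and a 6 CV elution in elution buffer (binding buffer supplemented with 2.5 mM desthiobiotin).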
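To obtain high-quality protein shells for structural studies, anion-exchange (AIEX) chromatography was performed after StrepTrap™ affinity purification. AIEX buffer A (Tris·HCl 50 mM, pH 8.0) was added to dilute the pooled StrepTrap™ elution fractions two-fold. The sample was loaded at 1 mL/min onto a Q Sepharose (GE Life Sciences) column with a 10 mL resin bed volume. Elution used a two-step gradient scheme: 0 to 60% AIEX buffer B (Tris·HCl 50 mM, NaCl 1.0 M, pH 8.0) over 6 CV, then 60% to 100% AIEX buffer B over 2 CV, both at a linear flow rate of 2 mL/min.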
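The two-step gradient above maps directly onto a nominal NaCl concentration, which is how elution fractions are referred to elsewhere in the document (for example 0.3 M and 0.4 M NaCl). The following is a minimal sketch of that mapping, assuming ideal linear mixing, a buffer A with no NaCl (as its stated composition suggests) and no system dead volume; the function names are hypothetical.

```python
def percent_b(cv: float) -> float:
    """Nominal %B at `cv` column volumes into the gradient.

    Step 1: 0 -> 60% B over 6 CV; step 2: 60 -> 100% B over 2 CV.
    """
    if cv < 0 or cv > 8:
        raise ValueError("the gradient spans 0 to 8 column volumes")
    if cv <= 6:
        return 60.0 * cv / 6.0
    return 60.0 + 40.0 * (cv - 6.0) / 2.0


def nacl_molar(cv: float, nacl_in_b: float = 1.0) -> float:
    """Nominal NaCl concentration (M), given buffer B at `nacl_in_b` M NaCl."""
    return percent_b(cv) / 100.0 * nacl_in_b


for cv in (0, 3, 4, 6, 7, 8):
    print(cv, percent_b(cv), nacl_molar(cv))
```

Under these assumptions the first step rises by 10% B per CV, so nominal NaCl concentrations of 0.3 M and 0.4 M fall at 3 CV and 4 CV into the gradient.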
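Proteins were analyzed on 13% stacking sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gels and stained with InstantBlue (Expedeon). Densitometric analysis was performed with Bio-Rad Image Lab software, following a previous report on BMC cargo quantification [Hagen et al., Nature Communications 9:2881, doi:10.1038/s41467-018-05162-z (2018)]. Background subtraction was applied, and the peak area corresponding to the band of interest was used for quantification. Absolute protein concentrations were measured on a DeNovix spectrophotometer using the calculated molar attenuation coefficient at 280 nm (ε280). The calculated ε280 of the T = 3 shell (taken as the sum of the ε280 values of its individual components) is 1 588 200 M⁻¹·cm⁻¹. The ε280 of the T = 4 shell is calculated to be 1 677 600 M⁻¹·cm⁻¹. The small difference between the calculated ε280 values of the two shell types is due to the low ε280 of CsoS1A (1490 M⁻¹·cm⁻¹). Most of the ε280 contribution comes from CsoS4A (23 490 M⁻¹·cm⁻¹), which has the same copy number in both shell types. The average number of GFP molecules per shell was determined by fluorescence, following the protocol described by Hagen et al. [Hagen et al., Nature Communications 9:2881, doi:10.1038/s41467-018-05162-z (2018)].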
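The two shell ε280 values quoted above can be reproduced by summing the per-chain coefficients. The sketch below assumes an icosahedral shell of 60·T chains in which the 12 vertices are CsoS4A pentamers (60 chains) and all remaining chains are CsoS1A; that stoichiometry is an assumption, chosen because it reproduces the numbers in the text. The function names are hypothetical, and the concentration helper is simply the Beer-Lambert law.

```python
EPS280_CSOS1A = 1_490    # M^-1 cm^-1 per CsoS1A chain (value from the text)
EPS280_CSOS4A = 23_490   # M^-1 cm^-1 per CsoS4A chain (value from the text)


def shell_epsilon280(t_number: int) -> int:
    """Sum of per-chain coefficients for an assumed icosahedral shell.

    Assumption: 60 * T chains in total, of which 12 pentamers (60 chains)
    are CsoS4A and the remainder are CsoS1A.
    """
    pentamer_chains = 12 * 5
    hexamer_chains = 60 * t_number - pentamer_chains
    return pentamer_chains * EPS280_CSOS4A + hexamer_chains * EPS280_CSOS1A


def shell_molar_concentration(a280: float, t_number: int,
                              path_cm: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), in mol/L of whole shells."""
    return a280 / (shell_epsilon280(t_number) * path_cm)


print(shell_epsilon280(3))
print(shell_epsilon280(4))
print(shell_molar_concentration(0.05, 3))
```

The sums for T = 3 and T = 4 match the 1 588 200 and 1 677 600 M⁻¹·cm⁻¹ given in the text, and the difference between them is exactly 60 additional CsoS1A chains.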
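For HO-BMC: after growth in 8 L YPD (yeast extract 1%, peptone 2%, glucose 2%, BioBasic) at 25 °C for 48 h, yeast cells were pelleted and lysed by eight passes through the M-110P microfluidizer at 20,000 psi. The lysate was centrifuged twice at 20,000 x g for 20 min each, and the clarified lysate was adjusted to pH 8 using 1 M Tris·HCl (pH 12). The lysate was also incubated with 300 μL biotin blocking buffer (IBA Lifesciences) for 15 min with gentle agitation. StrepTrap™ affinity purification was performed in the same manner as described above.
Pull-down assay and immunoblotting
Purified His6-meGFP-S2CP and His6-meGFP were each incubated with clarified E. coli lysate containing CsoS1A-SII, SII-CsoS1D or CsoS4A-SII for 1 h at 25 °C with gentle agitation. The lysate mixtures were purified using the StrepTrap™ protocol described above. The eGFP epitope was detected by immunoblotting with a GFP-HRP-conjugated antibody (GF28R, Invitrogen), and the His6 epitope was detected with an HRP-conjugated His Tag antibody (Genscript). Detection followed the manufacturers' recommended protocols.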
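Fluorescence microscopy
E. coli cells were grown under the conditions described, and 0.1 mL of culture was collected and pelleted. The pellet was resuspended in PBS containing 1% formaldehyde and left to stand at room temperature for 10 min. Cells were washed twice with PBS and resuspended in 0.5 mL PBS. A small volume (about 3 μL) of the cell suspension was mixed with an equal volume of ProLong™ Diamond Antifade Mountant (Thermo Scientific) and mounted on a poly-L-lysine microscope slide (Thermo Scientific). Samples were allowed to cure in the dark for at least 24 h before imaging. Slides were imaged on an Olympus FV1200 confocal microscope with a 100x objective. Colocalization analysis was performed with ImageJ software.
Transmission electron microscopy
Formvar/carbon-coated copper grids were glow-discharged, after which 5 μL of purified protein sample (diluted to an A280 of about 0.05 or lower) was mounted for 60 s and the droplet was removed with filter paper. The grids were then negatively stained by adding a 5 μL droplet of 2.5% gadolinium(III) acetate, incubating for 90 s and blotting in the same way. Grids were imaged on a JEOL JEM-1220 TEM.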
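Shell particle size and stability measurements
Particle size distributions were determined by dynamic light scattering (DLS) on an Uncle™ instrument (Unchained Labs). Unless otherwise stated, samples were diluted to 1 mg/mL in TBS-50/350 pH 8.0 (Tris·HCl 50 mM, NaCl 350 mM, pH 8.0) and centrifuged at 20,000 g for 5 min to remove aggregates before measurement. Care was taken to use the uppermost supernatant for analysis. 9 μL of sample was added to the mini cuvette. Unless otherwise stated, all DLS measurements were done in triplicate at 20 °C. Particle size distributions were analyzed with the Uncle™ analysis software.
For shell stability measurements at various temperatures, shell samples were aliquoted into thin-walled PCR tubes and subjected in the Uncle™ instrument to temperatures ranging from 20 °C to 80 °C, raised by 10 °C every 15 min. A DLS spectrum was acquired at the end of each 15 min incubation.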
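For shell stability measurements under various buffer conditions, TBS-50/350 (pH 8.0) buffers containing 10% and 20% (v/v) methanol were freshly prepared from stock solutions of Tris·HCl 1.0 M (pH 8.0), NaCl 5.0 M and 99.8% methanol (ACS reagent grade, Sigma) and used on the day of preparation. Because heat of mixing is generated when methanol is mixed with water, the methanol-containing buffers were allowed to equilibrate back to room temperature for at least 1 h after preparation. To prepare buffers at various pH values, the following components were used at 50 mM to obtain the pH ranges shown: glycine hydrochloride for pH 2-4; 4-morpholineethanesulfonic acid (MES) sodium salt for pH 5-7; Tris·HCl for pH 8-9; N-cyclohexyl-3-aminopropanesulfonic acid (CAPS) sodium salt for pH 10-11; arginine hydrochloride for pH 12-13. All buffers contained 350 mM NaCl. Shells were incubated in these buffers for 15 min before particle size measurement to allow time for possible shell dissociation or protein denaturation.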
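The pH-to-buffer assignment above is a simple lookup. The sketch below encodes it as a table, assuming integer pH set points only, since the text does not say which component would be used between the listed ranges; the names `BUFFER_RANGES` and `buffer_recipe` are hypothetical.

```python
# (lowest pH, highest pH, buffering component), as listed in the text.
BUFFER_RANGES = [
    (2, 4, "glycine hydrochloride"),
    (5, 7, "MES sodium salt"),
    (8, 9, "Tris-HCl"),
    (10, 11, "CAPS sodium salt"),
    (12, 13, "arginine hydrochloride"),
]

BUFFER_MM = 50   # buffering component concentration, mM
NACL_MM = 350    # NaCl concentration in every buffer, mM


def buffer_recipe(ph: int) -> dict:
    """Return the buffer composition for an integer pH set point."""
    for low, high, component in BUFFER_RANGES:
        if low <= ph <= high:
            return {
                "pH": ph,
                "component": component,
                "component_mM": BUFFER_MM,
                "NaCl_mM": NACL_MM,
            }
    raise ValueError(f"no buffer is listed for pH {ph}")


for ph in (3, 6, 8, 11, 13):
    print(buffer_recipe(ph))
```

Raising an error outside pH 2-13 keeps the lookup from silently extrapolating beyond the conditions described.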
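For freeze-thaw stability, shell samples in TBS-50/350 (pH 8.0) were aliquoted into thin-walled PCR tubes and flash-frozen in liquid N2. Samples were thawed at room temperature by standing for 15 min until no ice crystals were visible, and then frozen again.
Enzyme steady-state kinetics assays.
For APEX2, all reagents were brought to the appropriate working concentrations in TBS-50/350 (pH 8.0). Working solutions of guaiacol and H2O2, both at 10 mM, were prepared on the day of the assay. The guaiacol solution was shaken vigorously at 30 °C to ensure complete dissolution and then equilibrated back to room temperature. Assay concentrations were 10 nM APEX2 and 1 mM H2O2, with guaiacol ranging from 0.20 to 2.0 mM. The total reaction volume was 200 μL. The reaction was monitored on a BioTek Synergy™ HT microplate reader via the absorbance at 470 nm of the tetraguaiacol formed. The rate of tetraguaiacol formation was found to be constant up to 90 s, and this time point was used for measurement of the initial rate V0. Kinetic constants were obtained by nonlinear least-squares Michaelis-Menten fitting in GraphPad Prism software.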
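The fitting step can be illustrated with a small, dependency-free sketch. It is not GraphPad Prism's algorithm, which the text does not describe: here the sum of squared residuals is minimized by a golden-section search over Km, with the best Vmax for each trial Km obtained in closed form (the model is linear in Vmax). The search assumes a single minimum within the bracket, and the data are synthetic, generated from chosen parameters over the guaiacol range quoted above. All function names are hypothetical.

```python
def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial rate v0 = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def _best_vmax(substrate, rates, km):
    # For a fixed Km the model is linear in Vmax, so least squares is closed form.
    g = [s / (km + s) for s in substrate]
    return sum(gi * vi for gi, vi in zip(g, rates)) / sum(gi * gi for gi in g)


def _sse(substrate, rates, km):
    vmax = _best_vmax(substrate, rates, km)
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, iterations=200):
    """Return (Vmax, Km) minimizing the sum of squared residuals."""
    phi = (5 ** 0.5 - 1) / 2
    a, b = 1e-9, 10 * max(substrate)      # assumed search bracket for Km
    c, d = b - phi * (b - a), a + phi * (b - a)
    for _ in range(iterations):
        if _sse(substrate, rates, c) < _sse(substrate, rates, d):
            b = d                         # minimum lies in [a, d]
        else:
            a = c                         # minimum lies in [c, b]
        c, d = b - phi * (b - a), a + phi * (b - a)
    km = (a + b) / 2
    return _best_vmax(substrate, rates, km), km


# Synthetic, noise-free data (mM substrate; arbitrary rate units).
true_vmax, true_km = 2.0, 0.5
substrate = [0.2, 0.4, 0.8, 1.2, 1.6, 2.0]
rates = [michaelis_menten(s, true_vmax, true_km) for s in substrate]

vmax_fit, km_fit = fit_michaelis_menten(substrate, rates)
print(vmax_fit, km_fit)
```

With real, noisy measurements a dedicated fitting package would also report confidence intervals, which this sketch does not attempt.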
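For LacZ steady-state kinetics, all reagents were brought to the appropriate working concentrations in TBS-50/350 (pH 8.0) supplemented with 1 mM MgCl2. On the day of the assay, a 10 mM working solution of ONPG (o-nitrophenyl-β-galactoside) was prepared from a 50 mM stock solution in DMSO. Assay concentrations were 10 nM LacZ, with ONPG ranging from 0.050 to 1.5 mM. The total reaction volume was 100 μL. Hydrolysis of ONPG was followed by absorbance measurements at 405 nm. The rate of product formation was found to be constant up to 60 s, and this time point was used for V0 measurements.
Enzyme/enzyme-shell activity and stability assays.
All enzyme activity measurements were performed at ambient temperature (23 °C) with a working enzyme concentration of 10 nM. Measurements were performed in triplicate. Enzyme activity was determined as the initial rate of product formation at saturating substrate concentrations (that is, close to Vmax). For APEX2, the saturating substrate concentrations were 1.4 mM guaiacol and 1 mM H2O2. For LacZ, the saturating substrate concentration was 1.5 mM ONPG.
For heat-shock assays, enzyme/enzyme-shell samples were aliquoted into thin-walled PCR tubes and subjected in a thermocycler to the elevated temperatures shown (Figure 6B) for 15 min. After incubation, samples were cooled to 20 °C and equilibrated back to ambient temperature for 15 min before being assayed.
For stability measurements in methanol-containing buffers and under various pH conditions, enzyme/enzyme-shell samples were dialyzed into the various buffers described in the particle size measurement section. Solutions were left to stand for at least 15 min before assaying to allow time for possible protein denaturation.
For freeze-thaw stability, enzyme/enzyme-shell samples were aliquoted into thin-walled PCR tubes and flash-frozen in liquid N2. Samples were thawed at room temperature by standing for 15 min until no ice crystals were visible, and then either frozen again or assayed.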
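Cryo-electron microscopy and structural analysis
The protein solution was diluted to a concentration of 0.5 mg/mL in ice-cold TBS-50/400 buffer (Tris·HCl 50 mM, NaCl 400 mM, pH 8.0). 2.5 μL of protein sample was applied to glow-discharged R1.2/1.3 and R2/2 molybdenum 200-mesh grids with holey carbon support film (Quantifoil). Grids were transferred to a Leica EM GP plunge freezer, blotted for 2 s at 90% humidity and plunge-frozen in liquid ethane kept cold by liquid N2. Grids were stored at liquid N2 temperature to prevent the formation of crystalline ice.
Optimal cryoEM grid preparation conditions were screened on a Talos™ Arctica Cryo-TEM (ThermoFisher Scientific) equipped with an FEG operated at 200 kV with a minimum-dose system, at the Institute for Protein Research, Osaka University. Images were captured under the following conditions: an exposure time of 33.67 s, giving a dose of about […], 92,000x magnification and defocus values of 1.6 to 2.5 μm. Images were recorded in counting mode on a BM-Falcon3 camera, with the exposure set to a pixel size of […] and fractionated into 70 frames per image. For data collection, grids were prepared and imaged on a Titan™ Krios™ (FEI) (ThermoFisher Scientific) equipped with an FEG operated at 300 kV with a minimum-dose system, at the Research Center for Ultra-High Voltage Electron Microscopy (UHVEM), Osaka University. Imaging was performed with the EPU software (FEI) attached to the Titan™ Krios™. Images were recorded under the following conditions: a nominal magnification of 96,000x, no objective aperture, an actual defocus range of 1.5 to 2.2 μm, a dose rate of 64.3 to […], an exposure time of 1 s and 8 image acquisitions per hole. Images were recorded on a Falcon II detector (FEI) at a pixel size of […]/pixel and a frame rate of 17 frames per image.
About 2100 to 2500 raw movies were collected from different microscope sessions and processed in RELION 3.0 software [Zivanov, J. et al., eLife 7:e42166 (2018)]. Drift was motion-corrected with MotionCor2 software, and the CTF of each micrograph was estimated with CTFFind-4.1 software and Gctf software [Zhang, K., J Struct Biol 193:1-12 (2016)]. Micrographs with good observed CTF estimates were selected for further processing. Shells were picked manually and extracted in RELION 3.0 with a box size of 300 x 300 pixels. Particles from 2D classes showing clear secondary structure elements were selected. An initial 3D reference model (a cylinder) was prepared using the RELION toolbox suite. 3D refinement was performed with a low-pass filter of […] and solvent flattening. CTF refinement was performed without particle polishing (particle polishing had no effect), followed by the final 3D refinement. The final resolution of the protein shell was obtained by post-processing with solvent flattening and a soft mask.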
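Model building and structural analysis
Biological assembly models for the pentamer (PDB ID: 2RCF) [Tanaka, S. et al., Science 319:1083-1086 (2008)] and the hexamer (PDB ID: 2EWH) [Tsai, Y. et al., PLOS Biology 5:e144 (2007)] were manually fitted to the electron density map using UCSF Chimera [Pettersen, E.F. et al., J Comput Chem 25:1605-1612 (2004)]. The asymmetric unit of the icosahedral reconstruction was extracted and rebuilt in COOT [Emsley, P. et al., Acta Crystallogr. D Biol. Crystallogr. 66:486-501 (2010)]. The complete shell model was obtained by symmetry expansion and real-space refinement in PHENIX [Liebschner, D. et al., Acta Cryst. D 75:861-877 (2019)] and CCP4 [Winn, M.D., Acta Crystallogr. D Biol. Crystallogr. 67:235-242 (2011)].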
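Example 2:
Genetic toolkit for modular construction of microcompartment parts
Our Golden Gate cloning-based toolkit for assembling genetic parts extends the published YeastFab plasmid suite for metabolic engineering of yeast [Guo, Y. et al., Nucleic Acids Res 43:e88 (2015)]. In brief, it is a hierarchical approach to DNA assembly in which the genetic parts, namely promoters (Pro), open reading frames (ORF) and terminators (Ter), are modularized. The plasmids holding these parts are referred to as level 0 plasmids. Level 1 plasmids join Pro-ORF-Ter into a gene expression cassette and are called POTX plasmids (X = 1 to 11). Level 2 plasmids concatenate two or more expression cassettes to form a pathway combination. Level 3 plasmids are used to chromosomally integrate a pathway into the genome. For yeast expression, we used the level 0 and level 1 plasmids of the published YeastFab toolkit but developed our own level 2 and level 3 plasmids to better suit our requirements. For E. coli expression, we retained the YeastFab level 0 plasmids but developed our own level 1 and level 2 plasmids. We did not develop level 3 (genome integration) plasmids for E. coli.
For both E. coli and yeast, the level 0 plasmids named HcKan_P, _O and _T are used to maintain the Pro, ORF and Ter parts, respectively, as described in Table 3.
Table 3. List of genetic parts and the corresponding maintenance plasmids (used to store and release the genetic parts) for expressing our VLPs in E. coli and yeast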
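The level 1 plasmids were adapted for protein expression in E. coli by removing from the POTX plasmids genetic elements that could impose an unnecessary burden on the host cell. We call the E. coli level 1 plasmids pESN (N = 1 to 7); they contain the minimal genetic elements required for TU maintenance (Figure 2A). To assemble multiple Pro-ORF-Ter units from POTX or pESN plasmids, we developed the level 2 plasmids pCKU (SEQ ID NO: 74) and pCKH (SEQ ID NO: 63), designated for yeast and E. coli respectively (Figure 2B). Plasmid-based expression in yeast relies on nutritional selection, which requires a formulated growth medium. The medium lacks a key nutrient required by the engineered yeast strain, and that nutrient can be produced by a protein encoded by a gene on the plasmid. On pCKU, the gene product Ura3p produces uracil. Because this chemically defined medium is expensive (SGD 30 per liter), we sought to chromosomally integrate the pathways into the yeast genome so that the yeast strain could still express the pathway proteins on a less expensive (SGD 5 per liter) undefined medium (yeast extract-peptone-dextrose). We therefore developed pGAU-YMRWδ15 (SEQ ID NO: 75), which installs homology sites on both sides of the pathway to be integrated into yeast. Using the endogenous homologous recombination machinery of yeast, the desired pathway is inserted into the YMRWδ15 chromosomal site, and the proteins of the pathway can be expressed without selection. For pathway expression in E. coli, we did not find chromosomal integration necessary at present, because plasmid selection in bacteria is usually carried out with a suitable antibiotic (kanamycin in this case) in undefined media (lysogeny broth or terrific broth).
We also modified the HcKan_O plasmids to install a fluorescent protein (FP), a biochemical/affinity tag or an encapsulation peptide at the amino or carboxyl terminus of the ORF (Figures 2C and 2D).
Four FPs were chosen: mTurquoise2 (mT2), monomeric enhanced GFP (meGFP), monomeric Kusabira orange-kappa (mKOκ) and mCherry (mCh). They are known to behave as monomers, which should reduce artefactual aggregation of the fusion products. An example of a modified HcKan_O plasmid is HcKan_O-CmCherry (SEQ ID NO: 28), which tags mCherry to the C-terminus of the ORF. The two affinity tags introduced are the hexahistidine (His6) and Strep-tag II (SII) tags, which allow proteins to be purified by immobilized metal affinity chromatography (IMAC) or by Strep-Tactin, respectively. Examples of modified HcKan_O plasmids are HcKan_O-CHis6 (SEQ ID NO: 32), which tags His6 to the C-terminus of the ORF, and HcKan_O-CSII (SEQ ID NO: 31), which tags Strep-tag II to the C-terminus of the ORF.
Other tags include the SpyCatcher/SpyTag (ST/SC) pair (SEQ ID NO: 13 and 16) and the CC-Di-A/B (CCA/CCB) pair (SEQ ID NO: 17-20). Examples of modified HcKan_O plasmids are HcKan_O-CSpyCatcher (SEQ ID NO: 37), which tags SpyCatcher to the C-terminus of the ORF; HcKan_O-CSpyTag (SEQ ID NO: 38), which tags SpyTag to the C-terminus of the ORF; HcKan_O-CCCDiA (SEQ ID NO: 35), which tags coiled-coil dimer-A to the C-terminus of the ORF; and HcKan_O-CCCDiB (SEQ ID NO: 36), which tags coiled-coil dimer-B to the C-terminus of the ORF. The SII tag (SEQ ID NO: 21 and 22) has been widely used for the purification of proteins and protein complexes, while the ST/SC and CCA/CCB pairs have been used to functionalize VLPs and other protein nanostructures [Fletcher, J.M. et al., Science 340:595-599 (2013); Keeble, A.H. and Howarth, M. Methods in Enzymology, 617, 443-461 (2019)]. A protein tagged with SpyCatcher (SEQ ID NO: 13 and 14) forms a covalent isopeptide bond with another protein tagged with SpyTag (SEQ ID NO: 15 and 16), whereas a protein tagged with CC-Di-A (SEQ ID NO: 17 and 18) forms a strong intermolecular interaction (dissociation constant, Kd, of about 1 nM) with another protein tagged with CC-Di-B (SEQ ID NO: 19 and 20) [Thomas, F. et al., Journal of the American Chemical Society 135:5161-5166 (2013)]. Installing one member of the SC/ST or CCA/CCB pair on the surface of a VLP allows a guest protein tagged with the corresponding other member of the pair to be conjugated to the shell surface.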
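Controlling the intracellular stoichiometry of the shell protomers is known to be important for successful assembly of BMC shells [Kerfeld, C.A. et al., Nature Reviews Microbiology 16:277 (2018)]. To tune the expression of each component, we incorporated five constitutively active promoters from the Anderson collection into HcKan_P (Table 4) [Anderson, J.C. Anderson Promoter Library Registry of Standard Biological Parts (2006)].
Table 4. List of the constitutively active Anderson collection promoters used in this study (PCON2-5), with their original identities and characterized relative strengths
Promoter | Anderson collection identity | Relative strength | SEQ ID NO. |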
PCON2 | BBa_J23100 | 1.00 | 84 |
PCON3 | BBa_J23108 | 0.50 | 77 |
PCON4 | BBa_J23105 | 0.24 | 78 |
PCON5 | BBa_J23114 | 0.10 | 24 |
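For brevity, we renamed these promoters PCON1 to PCON5, with PCON2 being the strongest and PCON5 the weakest. The PCON2 to PCON5 sequences (SEQ ID NO: 84, 77, 78 and 24, respectively) are shown in lowercase within SEQ ID NO: 83 and 40 to 42, respectively. We also included the T7 promoter (PT7; SEQ ID NO: 23) together with the lacI repressor and lac operator sequences (LacI+PT7), so that gene expression can be induced by adding the inducer isopropyl β-d-1-thiogalactopyranoside (IPTG). For transcription termination, we used the T7 terminator (TT7) in all TUs. With this multi-monocistronic system of DNA assembly, the expression levels of the BMC components in the pESN plasmids can be customized (Table 5).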
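As an illustration of how the relative strengths in Table 4 could guide promoter choice, the following minimal Python sketch stores the four listed promoters and picks the one closest to a target strength. The helper names (`closest_promoter`, `strength_ratio`) are hypothetical, and treating relative promoter strength as a proxy for relative expression is a simplifying assumption, not a claim made in the text.

```python
# Relative strengths exactly as listed in Table 4 (PCON2 to PCON5).
PROMOTERS = {
    "PCON2": ("BBa_J23100", 1.00),
    "PCON3": ("BBa_J23108", 0.50),
    "PCON4": ("BBa_J23105", 0.24),
    "PCON5": ("BBa_J23114", 0.10),
}


def closest_promoter(target_strength: float) -> str:
    """Return the promoter whose listed relative strength is nearest the target.

    Hypothetical helper for illustration only.
    """
    return min(PROMOTERS, key=lambda name: abs(PROMOTERS[name][1] - target_strength))


def strength_ratio(promoter_a: str, promoter_b: str) -> float:
    """Ratio of listed relative strengths (assumed proxy for relative expression)."""
    return PROMOTERS[promoter_a][1] / PROMOTERS[promoter_b][1]


if __name__ == "__main__":
    print(closest_promoter(0.3))
    print(round(strength_ratio("PCON4", "PCON5"), 2))
```

For example, a component driven by PCON4 and another driven by PCON5 would differ in nominal promoter strength by the ratio of 0.24 to 0.10.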
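Table 5. Compilation of the TUs (assembled in pES plasmids). Transcription units (TUs) are annotated with the letters A-D. Abbreviations used are PY: PCONY (Table 4); PT7: PT7 with lacI and the lac operator; meG: meGFP; mCh: mCherry. All TUs are terminated with TT7.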
pES | A | B | C | D |
2 | P4-meG-S2CP | P5-CsoS4A-SII | P4-UmuD1-40-meG-S2CP | P4-UmuD1-40-meG |
3 | PT7-CsoS1A | |||
4 | P4-CsoS1D | |||
5 | PT7-CsoS1A | |||
6 | P5-CsoS4A-mCh | P5-CsoS4A-SII | ||
7 | PT7-CsoS1A |
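For encapsulating cargo in the Cso-BMC, we have identified an encapsulation peptide (EP) sequence (SKITGSSGNDTQGSLITYSGGARG; SEQ ID NO: 1), which we call S2CP, that mediates sequestration of protein cargo into the simplified carboxysome. An example of a modified HcKan_O plasmid that tags S2CP to the C-terminus of the ORF is HcKan_O-S2CP (SEQ ID NO: 39). Details of the identification of S2CP as an EP are discussed later. For encapsulating cargo in the HO-BMC, although the EP reported for the HO-BMC was reported to function when E. coli is the recombinant host, we found that it does not function in yeast [Lassila, J.K. et al., Journal of Molecular Biology 426:2217-2228 (2014)]. Schematics of the synthetic operons for the pathway used to produce Cso-BMC in E. coli and for the HO-ACB pathway used to express HO-BMC in yeast are shown in Figure 3.
Golden Gate one-pot plasmid assembly largely followed the previously published protocol, with slight modifications [Guo, Y. et al., Nucleic Acids Research, 43(13), e88 (2015)]. For inserting one to three fragments, the reaction pot was set up with 1 μL T4 ligase buffer, 0.5 μL 10x purified bovine serum albumin (BSA), 5 U BsaI (for level 0 and level 2 assembly) or Esp3I (for level 1 assembly), 10 U T4 ligase, 20 ng destination plasmid and 1 to 3 μL of one or more inserts, made up to 10 μL with water. All enzymes and BSA were from New England Biolabs (NEB). The reaction pot was thermocycled between 37 °C and 18 °C, with each step incubated for 5 min, for 70 cycles, followed by a 55 °C step for 15 min. The plasmids were transformed into the E. coli Acella (DE3) strain (EdgeBio) and verified by Sanger sequencing.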
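The enzyme choice per assembly level and the thermocycling programme described above can be captured in a few lines. The following Python sketch is illustrative only: the function names are hypothetical, ramp times between temperatures are ignored (an assumption), and no enzyme is returned for level 3 because the text does not state one.

```python
def restriction_enzyme(level: int) -> str:
    """Type IIS enzyme named in the text for each assembly level."""
    if level in (0, 2):
        return "BsaI"
    if level == 1:
        return "Esp3I"
    # The text does not state an enzyme for other levels, so none is guessed here.
    raise ValueError(f"no enzyme is stated for level {level}")


def thermocycling_minutes(cycles: int = 70, step_minutes: int = 5,
                          final_minutes: int = 15) -> int:
    """Total programmed hold time in minutes.

    Each cycle has two steps (37 °C and 18 °C); the final step is the
    55 °C hold. Ramp times are ignored (assumption).
    """
    return cycles * 2 * step_minutes + final_minutes


if __name__ == "__main__":
    total = thermocycling_minutes()
    print(restriction_enzyme(1), total, "min", round(total / 60, 1), "h")
```

With the stated defaults the programme amounts to 715 min of hold time, so the reaction is in practice an overnight run.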
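Plasmid transformation and chromosomal integration of genes into yeast were carried out according to the high-efficiency lithium acetate/single-stranded DNA/PEG-3350 protocol described by Schiestl and colleagues [Gietz, R.D. and Schiestl, R.H. Nature Protocols 2:31 (2007)].
Example 3:
Identification of a targeting peptide for the Cso system
A key strategy for repurposing bacterial BMCs as intracellular nanoreactors is to encapsulate heterologous enzymes within the shell by installing an EP on the cargo. Although EP sequences have been identified and characterized for some BMC systems, no such sequence has been reported for the α-carboxysome [Kerfeld, C.A. et al., Nature Reviews: Microbiology, 16, 277 (2018)]. It has been suggested that the EP sequence resides on CsoS2 [Oltrogge, L.M. et al., Nature Structural & Molecular Biology 27:281-287 (2020)]. Studies on CsoS2 indicate that it initiates carboxysome assembly by recruiting luminal cargo via its N-terminus while its C-terminal region anchors to the shell proteins [Oltrogge, L.M. et al., Nature Structural & Molecular Biology 27:281-287 (2020)]. A multiple sequence alignment of 100 CsoS2 orthologs revealed that the C-terminal region is highly conserved, especially at the terminal residues (Figure 4A), which suggests functional importance. We therefore decided to interrogate the function of the terminal 24 residues of Halothiobacillus neapolitanus CsoS2 (SKITGSSGNDTQGSLITYSGGARG; SEQ ID NO: 1), which we call "S2CP" as an abbreviation of CsoS2 C-terminal peptide. The nucleic acid sequence encoding the S2CP peptide is shown in SEQ ID NO: 7. We also considered the possibility that a slightly longer S2CP variant might improve encapsulation efficacy without adding much extra bulk to the heterologous protein cargo. To this end, we chose the terminal 30 residues of H. neapolitanus CsoS2 (KPEKPGSKITGSSGNDTQGSLITYSGGARG; SEQ ID NO: 94) as an encapsulation peptide variant, which we call "S2CP(30)". The nucleic acid sequence encoding the S2CP(30) peptide is shown in SEQ ID NO: 95.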
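The relationship between the sequences quoted above can be checked mechanically. The short Python sketch below translates the SEQ ID NO: 7 nucleotide sequence (copied from the sequence listing) and confirms that it encodes S2CP, and that S2CP(30) is S2CP extended by six N-terminal residues. The codon table is deliberately partial: it contains only the standard-code codons that occur in SEQ ID NO: 7, and the function name `translate` is an illustrative choice.

```python
S2CP = "SKITGSSGNDTQGSLITYSGGARG"           # SEQ ID NO: 1
S2CP_30 = "KPEKPGSKITGSSGNDTQGSLITYSGGARG"  # SEQ ID NO: 94

# SEQ ID NO: 7, kept in the 10-base blocks of the sequence listing and joined.
SEQ_ID_7 = "".join(
    "tctaagatta ctggttcttc tggtaacgat acccaaggtt ctttgattac ttactctggt "
    "ggtgctagag gt".split()
)

# Partial standard codon table: only the codons used in SEQ ID NO: 7.
CODONS = {
    "tct": "S", "aag": "K", "att": "I", "act": "T", "acc": "T",
    "ggt": "G", "aac": "N", "gat": "D", "caa": "Q", "ttg": "L",
    "tac": "Y", "gct": "A", "aga": "R",
}


def translate(dna: str) -> str:
    """Translate an in-frame lowercase DNA string using the partial table above."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print(translate(SEQ_ID_7) == S2CP)          # nucleotide sequence encodes S2CP
    print(S2CP_30.endswith(S2CP))               # S2CP(30) ends with S2CP
    print(S2CP_30[: len(S2CP_30) - len(S2CP)])  # the additional N-terminal residues
```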
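We used pull-down assays to investigate whether a non-native protein cargo tagged with S2CP can interact with CsoS1A, CsoS1D or CsoS4A, which represent the BMC-H, BMC-T and BMC-P shell protein types, respectively. We created pES2-Pcon4-His6-meGFP-S2CP-TT7, purified His6-meGFP-S2CP, and incubated the protein with E. coli lysates in which CsoS1A-SII, SII-CsoS1D or CsoS4A-SII had been expressed using PT7. As a negative control, purified His6-meGFP was incubated in the same way with lysates containing the same shell proteins. The mixtures were purified via Strep-Tactin, and the purified fractions from the six mixtures were analyzed for the presence of GFP by Western blot. His6-meGFP-S2CP was found to co-elute with CsoS1A-SII but not with SII-CsoS1D or CsoS4A-SII (Figure 4B). His6-meGFP was not observed to co-elute with CsoS1A-SII, SII-CsoS1D or CsoS4A-SII either. This demonstrates that S2CP is required for His6-meGFP to interact with CsoS1A. While previous reports demonstrated that full-length CsoS2 interacts with CsoS1A (Cai et al., 2015), we have shown that the terminal 24 residues of CsoS2 alone are sufficient for the interaction. The binding of S2CP to CsoS1A, the major shell building block of the α-carboxysome, should allow this peptide sequence to target protein cargo to the shell complex. However, based on this result alone, it cannot be determined whether S2CP mediates encapsulation of cargo within the shell or merely targets it to the shell periphery.
Example 4:
Recombinant formation of simplified α-carboxysome shells
We sought to investigate the interactions among the Cso components, with the aim of building a simplified microcompartment shell based on knowledge of the component structures. Our approach involved translationally fusing FPs to a shell component and to S2CP to serve as probes of protein-protein interactions. Using the HcKan_O-FP plasmids, we fused the four FPs (mTurquoise2, meGFP, mKOκ and mCherry) to the amino and carboxyl termini of CsoS4A and expressed the hybrid proteins in E. coli using the PCON5 promoter from the pES6 plasmid. Only CsoS4A-mCherry was found to be distributed more or less uniformly within the cytosol (Figure 8A). The remaining fusion products displayed varying degrees of aggregation (data not shown), which made them less suitable as probes. CsoS4A-mCherry was therefore chosen as the shell component probe. We also expressed meGFP-S2CP using PCON4 in the pES2 plasmid and found the protein to be generally diffuse in the cytosol (Figure 8B).
Having established the shell (CsoS4A-mCherry) and targeting peptide (meGFP-S2CP) probes, we next expressed CsoS1D and CsoS1A alongside these probes using the pathway plasmid pCKH-Cso-PmChTHC (Figure 5A, Table 6).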
Table 6. List of pathway plasmids and the corresponding pES TUs (Table 5) used in their assembly.
Pathway | Assembled TUs |
Cso-PmChTHC | 2A-4A-6A-7A |
Cso-PSIITHC | 2A-4A-6B-7A |
Cso-PSIITH | 2B-4A-5A |
Cso-PSIIH | 2B-3A |
Cso-PSIITHCUS2CP | 2C-4A-6B-7A |
Cso-PSIITHCU | 2D-4A-6B-7A |
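Tables 5 and 6 together define each pathway as an ordered list of TU codes (pES number plus TU letter). A minimal Python sketch of that lookup follows; the dictionary and function names are illustrative and not part of the toolkit, while the entries are copied from the two tables.

```python
# Table 5: pES plasmid number + TU letter -> transcription unit (all end in TT7).
PES_TUS = {
    "2A": "P4-meG-S2CP",
    "2B": "P5-CsoS4A-SII",
    "2C": "P4-UmuD1-40-meG-S2CP",
    "2D": "P4-UmuD1-40-meG",
    "3A": "PT7-CsoS1A",
    "4A": "P4-CsoS1D",
    "5A": "PT7-CsoS1A",
    "6A": "P5-CsoS4A-mCh",
    "6B": "P5-CsoS4A-SII",
    "7A": "PT7-CsoS1A",
}

# Table 6: pathway name -> assembled TU codes.
PATHWAYS = {
    "Cso-PmChTHC": "2A-4A-6A-7A",
    "Cso-PSIITHC": "2A-4A-6B-7A",
    "Cso-PSIITH": "2B-4A-5A",
    "Cso-PSIIH": "2B-3A",
    "Cso-PSIITHCUS2CP": "2C-4A-6B-7A",
    "Cso-PSIITHCU": "2D-4A-6B-7A",
}


def expand_pathway(name: str) -> list:
    """Resolve a pathway name into its ordered list of transcription units."""
    return [PES_TUS[code] for code in PATHWAYS[name].split("-")]


if __name__ == "__main__":
    for pathway in PATHWAYS:
        print(pathway, "->", ", ".join(expand_pathway(pathway)))
```

Expanding `Cso-PSIIH`, for instance, shows that the minimal pathway carries only the CsoS4A-SII and CsoS1A transcription units.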
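In our pathway nomenclature, PmCh denotes the pentameric shell protein (CsoS4A) fused to mCherry, T denotes the trimer (CsoS1D), H denotes the hexamer (CsoS1A) and C denotes the cargo (meGFP-S2CP). In cells expressing these four components, CsoS4A-mCherry and meGFP-S2CP were observed to colocalize when IPTG was added to 50 μM (Figure 5B). We quantified the degree of colocalization using Manders' colocalization coefficients (MCC) [tM1, tM2], where tM1 is the fraction of green signal found in regions with red signal and tM2 is the fraction of red signal found in regions with green signal [Dunn, K.W. et al., American Journal of Physiology-Cell Physiology 300:C723-C742 (2011)]. MCC values of [0.688, 0.758] were found for the cells surveyed, indicating that a large proportion of the probes were colocalized.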
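The two coefficients defined above can be illustrated with a small, dependency-free Python sketch. It assumes the two channels are given as equal-length flat lists of pixel intensities and that a pixel counts as "signal" when its intensity exceeds a per-channel threshold; the thresholding scheme, the function name `manders` and the toy data are assumptions for illustration and do not reproduce the image analysis software used for the reported values.

```python
def manders(green, red, green_threshold=0.0, red_threshold=0.0):
    """Thresholded Manders coefficients (tM1, tM2) for two pixel lists.

    tM1: fraction of above-threshold green intensity located in pixels
         where red is also above threshold.
    tM2: fraction of above-threshold red intensity located in pixels
         where green is also above threshold.
    """
    if len(green) != len(red):
        raise ValueError("channels must have the same number of pixels")
    green_total = sum(g for g in green if g > green_threshold)
    red_total = sum(r for r in red if r > red_threshold)
    if green_total == 0 or red_total == 0:
        raise ValueError("a channel has no signal above its threshold")
    both = [(g, r) for g, r in zip(green, red)
            if g > green_threshold and r > red_threshold]
    tm1 = sum(g for g, _ in both) / green_total
    tm2 = sum(r for _, r in both) / red_total
    return tm1, tm2


if __name__ == "__main__":
    # Toy four-pixel image: only the second pixel has signal in both channels.
    print(manders([0, 10, 20, 0], [5, 5, 0, 0]))
```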
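We then determined whether the observed fluorescent foci represent protein assemblies that can be purified. Two purification strategies were attempted. The first was to incubate lysate of E. coli expressing Cso-PmChTHC with pure CsoS4A-SII, followed by purification via Strep-Tactin. The second was to replace CsoS4A-mCherry in the Cso-PmChTHC pathway with CsoS4A-SII, creating a new pathway, Cso-PSIITHC. Proteins purified via Strep-Tactin were further purified by anion exchange chromatography (AIEX) using Q Sepharose. For both purification strategies, two elution peaks were observed in the AIEX chromatograms, at 0.3 M and 0.4 M NaCl (Figures 9A-9B). Fractions from both peaks were examined by transmission electron microscopy (TEM); many capsid-like structures of about 20 nm in diameter were seen in the 0.4 M NaCl elution fraction (Figures 5A-5B), and noticeably fewer such structures were seen in the 0.3 M NaCl fraction (Figures 9A-9B). We reasoned that although the capsid-like structures elute mainly at 0.4 M NaCl, some are observed in the 0.3 M NaCl fraction because the two peaks overlap. We also noted that CsoS4A-SII alone eluted as a single peak at 0.3 M NaCl when subjected to the same AIEX procedure (Figure 9C). The 0.3 M NaCl peak observed for Cso-PmChTHC and Cso-PSIITHC therefore probably corresponds to CsoS4A-SII that was not incorporated into shells.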
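CsoS2 has been proposed to be essential for α-carboxysome assembly by recruiting shell proteins via its C-terminus [Oltrogge, L.M. et al., Nature Structural & Molecular Biology 27:281-287 (2020)]. In the Cso-PmChTHC and Cso-PSIITHC constructs, the terminal 24 residues of CsoS2 (S2CP; SEQ ID NO: 1) could aid shell assembly. We wished to investigate whether S2CP is necessary for the formation of shells derived from α-carboxysome components. We therefore constructed the pathway Cso-PSIITH, in which S2CP is absent. In addition to a similar AIEX chromatogram (Figure 10A), capsid-like structures were also seen for the Cso-PSIITH combination, which could be distinguished from those produced by Cso-PmChTHC and Cso-PSIITHC. Again, these structures were more abundant in the 0.4 M NaCl fraction than in the 0.3 M NaCl fraction (Figure 5C for 0.4 M NaCl, Figure 10B for 0.3 M NaCl). These results demonstrate that S2CP is not required for formation of the observed protein shells.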
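Next, we sought to determine the minimal components required for shell assembly. Given that CsoS1A and CsoS1D are built from the same protein domain, we considered the possibility of building protein shells from just CsoS1A and CsoS4A, which each derive from a different protein domain. A new pathway combination expressing CsoS1A and CsoS4A-SII, Cso-PSIIH (pCKH-Cso-BMC; SEQ ID NO: 64), was constructed and the proteins were purified as before (Figure 11A). Capsid-like structures similar to those purified from the previous pathway combinations were again seen (Figure 11B for the 0.3 M NaCl fraction, Figure 5D for the 0.4 M NaCl fraction). No capsid-like structures assembled from CsoS1A-SII alone were seen (Figure 11C). Furthermore, constructs in which CsoS1A and CsoS1D, or CsoS1D and CsoS4A, were co-expressed failed to produce protein shells (Figures 11D-11E). Taken together, these results demonstrate that CsoS1A and CsoS4A are necessary and sufficient for assembly of the capsid-like shells.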
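Example 5:
S2CP targets cargo into the lumen of the simplified carboxysome shell
To determine whether S2CP can target cargo into the lumen of the simplified carboxysome shell, we fused the E. coli UmuD N-terminal degradation tag (residues 1-40) to the amino terminus of meGFP-S2CP [Neher, S.B. et al., Proceedings of the National Academy of Sciences 100:13219-13224 (2003)]. We co-expressed UmuD1-40-meGFP-S2CP with CsoS1A, CsoS1D and CsoS4A by constructing the plasmid pCKH-Cso-PSIITHCU,S2CP. We hypothesized that if S2CP can target UmuD1-40-meGFP into the carboxysome, UmuD1-40-meGFP-S2CP (SEQ ID NO: 49) would be protected from proteolysis by the endogenous ClpXP protease, which recognizes and degrades proteins tagged with the N-terminal region of UmuD (Figure 6A). Conversely, if S2CP only targets cargo to the shell exterior, UmuD1-40-meGFP-S2CP would be exposed to ClpXP and degraded. A similar construct, pCKH-Cso-PSIITHCU (in which the only difference is that UmuD1-40-meGFP lacks S2CP), was used to account for random encapsulation of UmuD1-40-meGFP by the shells. GFP was detected by Western blot (Figure 6D). In the lane corresponding to purified protein from Cso-PSIITHCU,S2CP, UmuD1-40-meGFP-S2CP was detected. In the lane corresponding to the same amount of purified protein (determined by absorbance at 280 nm) from Cso-PSIITHCU, UmuD1-40-meGFP (SEQ ID NO: 52) was not detected. As a further confirmatory analysis, similar protein shells could be seen in the elution fractions for both pathway combinations (Figures 6C-6D). This demonstrates that the protection of UmuD1-40-meGFP from proteolysis is probably due to its S2CP-mediated encapsulation into the shells.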
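Example 6:
Atomic models of the simplified α-carboxysome shells
To better understand the molecular structure of the simplified α-carboxysomes, near-atomic-scale models of Cso-PSIITHC, Cso-PSIITH and Cso-PSIIH were obtained using cryo-electron microscopy (cryo-EM). For Cso-PSIITHC, two different shell sizes were seen, corresponding to icosahedral capsid-like triangulation numbers T=3 and T=4. Shell models were obtained at resolutions of 3.24 and [value missing], respectively. For Cso-PSIITH and Cso-PSIIH, only T=3 shells were seen, and structures were obtained at resolutions of 3.35 and [value missing], respectively. The proportion of T=3 shells observed in Cso-PSIITHC was 14.6%, while that of T=4 shells was 85.4%. Model fitting was carried out using the reported X-ray crystal structures of H. neapolitanus CsoS1A (PDB: 2EWH) and CsoS4A (PDB: 2RCF) [Tanaka, S. et al., Science 319:1083-1086 (2008); Tsai, Y. et al., PLOS Biology 5:e144 (2007)]. H. neapolitanus CsoS1D is expected to assemble as a double-stacked layer of trimers, as inferred from the structure of CsoS1D from Prochlorococcus marinus MED4, with which it shares 60% identical residues [Klein, M.G. et al., Journal of Molecular Biology 392:319-333 (2009)]. However, we could not discern a double-stacked layer in the electron density maps of Cso-PSIITHC and Cso-PSIITH, which indicates that CsoS1D is not incorporated into these shells. No electron density for the meGFP-S2CP cargo was detected in the luminal space of shells purified from Cso-PSIITHC either. Nevertheless, given that computational studies show that interactions between shell protomers and cargo affect shell size and shape, it is conceivable that formation of the T=4 shells (seen only in Cso-PSIITHC) is influenced by cargo encapsulation, whereas shells without cargo assemble into the smaller T=3 form.
Since there were no apparent differences among the three pathway combinations used to obtain the T=3 shell models, we focused on the shells produced by Cso-PSIITHC for model building and refinement (Table 7).
Table 7. Cryo-EM data collection, map and model refinement, and model validation
The T=3 shell contains 12 CsoS4A homopentamers and 20 CsoS1A homohexamers, with an outer diameter of [value missing] and a calculated molecular weight of 1.7 MDa (Figure 7A). The T=4 shell contains 12 homopentamers and 30 homohexamers, with an outer diameter of [value missing] and a molecular weight of 2.3 MDa (Figure 7B). The two shell types are otherwise largely similar. The concave faces of CsoS1A and CsoS4A, where the N- and C-termini reside, face the exterior of the shell (Figure 7A).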
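The pentamer and hexamer counts quoted above follow from standard icosahedral (Caspar-Klug) geometry: a shell with triangulation number T has 12 pentamers and 10(T-1) hexamers, for 60T subunits in total. The Python sketch below reproduces the counts for T=3 and T=4; the function name is illustrative, and the caller is assumed to pass a valid triangulation number (the function does not check that T has the form h² + hk + k²).

```python
def shell_composition(t_number: int) -> dict:
    """Capsomer and subunit counts for an icosahedral shell of triangulation number T.

    Assumes t_number is a valid triangulation number (e.g. 1, 3, 4, 7, ...).
    """
    if t_number < 1:
        raise ValueError("triangulation number must be a positive integer")
    pentamers = 12                    # one at each icosahedral vertex
    hexamers = 10 * (t_number - 1)
    subunits = 5 * pentamers + 6 * hexamers  # equals 60 * T
    return {"pentamers": pentamers, "hexamers": hexamers, "subunits": subunits}


if __name__ == "__main__":
    for t in (3, 4):
        print(t, shell_composition(t))
```

For the shells described here, the pentamer subunits correspond to CsoS4A and the hexamer subunits to CsoS1A.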
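Example 7:
Determining the encapsulation efficiency of S2CP and S2CP(30) for the Cso-BMC
Using the Cso-BMC shell structures, the average copy numbers of UmuD1-40-GFP-S2CP and UmuD1-40-GFP-S2CP(30) can be quantified via GFP fluorescence. Based on the cryo-EM observations, for all calculations involving the molecular mass of the shell we assumed that all shells co-expressed with cargo are a mixture of the T=3 and T=4 forms. Because the proportion of shell forms can vary between samples, two values are provided, calculated by assuming that all shells in a sample are either T=3 or T=4. An average of 7.7-8.0 copies of UmuD1-40-GFP-S2CP(30) was determined to be encapsulated per shell, compared with an average of 1.6-1.7 copies of UmuD1-40-GFP-S2CP per shell. Densitometric analysis of Cso-PSIIH shells encapsulating UmuD1-40-GFP-S2CP or UmuD1-40-GFP-S2CP(30) also indicated that about 4 times as much UmuD1-40-GFP-S2CP(30) as UmuD1-40-GFP-S2CP was found in the shells (Figure 16). S2CP(30) is therefore a more efficient encapsulation peptide than S2CP.
Table 8. Quantification of the average number of encapsulated UmuD1-40-GFP cargo mediated by S2CP or S2CP(30). Because the proportion of shell forms is unknown, the values given were calculated by assuming that all shells are either T=3 or T=4.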
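Example 8:
Stabilizing enzyme activity using Cso-BMC
Protein shells have attracted attention as platforms that confer stability on enzymes against physical insults (such as heating or freezing) or chemical insults (such as the presence of organic co-solvents or non-physiological pH) [Demchuk and Patel, Biotechnology Advances, 41:107547 (2020); Silva, C. et al., Critical Reviews in Biotechnology, 38(3):335-350 (2018)]. Confining an enzyme generally reduces its conformational flexibility, which can sometimes confer stability against the structural changes that lead to denaturation [Das, Zhao, (2020) Biochemistry, 59(31):2870-2881; Küchler et al., Nature Nanotechnology, 11(5):409-420 (2016)]. At present, homomeric protein shells are more established for housing enzymes, because their relative ease of assembly and homogeneity in particle size improve predictability and tractability during engineering [Patterson, D.P., Prevelige, P.E. and Douglas, T. (2012). ACS Nano, 6(6):5000-5009; Patterson, D.P., Schwarz, B., El-Boubbou, K., van der Oost, J., Prevelige, P.E. and Douglas, T. (2012). Soft Matter, 8(39):10158-10166; Sánchez-Sánchez et al., Journal of Nanobiotechnology 13(1):66 (2015); Tan, Xue and Yew, Molecules 26(5):1389 (2021)]. Owing to their heteromeric composition, minimal BMC-derived shells represent emerging scaffolds for housing enzymes, as these shells could offer more avenues for purposeful modification, while their generally uniform particle size still confers the predictability that facilitates engineering [Turmo, A., Gonzalez-Esquer, C.R. and Kerfeld, C.A. FEMS Microbiology Letters, 364(18):fnx176 (2017)]. However, minimal BMC-derived shells have not yet been explored for housing heterologous enzymes [Cai, F., Bernstein, S.L., Wilson, S.C. and Kerfeld, C.A. Plant Physiol 170:1868-1877 (2016); Hagen, A. et al., Nature Communications 9:2881 (2018)]. This encouraged us to investigate whether the Cso-BMC can house and stabilize enzymes. We first tested the stability of empty Cso-BMC (Cso-PSIIH) against heat shock, freezing, the presence of methanol co-solvent and environments of pH 2 to 13. The DLS spectra of shells subjected to these conditions were compared with those of shells in Tris·HCl-50/350 (Tris·HCl 50 mM (pH 8.0), NaCl 350 mM). A significant change in the particle size distribution and/or the appearance of multiple peaks indicates disassembly of the protein shells [Yu, Z., Reid, J.C. and Yang, Y.-P. Journal of Pharmaceutical Sciences 102(12):4284-4290 (2013)]. Based on the conditions tested, the Cso-BMC was considered stable at up to 70 °C for 15 min, in 20% v/v methanol, through seven consecutive freeze-thaw cycles and at pH 5-11 (Figure 17).
To explore the ability of the Cso-BMC to encase enzymes of quite different molecular sizes, evolved pea cytosolic ascorbate peroxidase (APEX2), a 27.0 kDa monomer [Lam et al., Nature Methods 12(1):51-54 (2015)], and E. coli β-galactosidase (LacZ), a 466.0 kDa homotetramer, were chosen for encapsulation [Golan et al., Biochimica et Biophysica Acta (BBA) - Bioenergetics, 1293(2):238-242; Lam et al., Nature Methods 12(1):51-54 (2015)]. S2CP(30) was fused to the C-terminus of the enzymes to mediate encapsulation, as it was found to mediate encapsulation of recombinant proteins more efficiently than S2CP. The enzymes were also N-terminally tagged with a hexahistidine (His6) tag to facilitate downstream removal of unencapsulated enzyme that might co-purify with the shells (Nichols, Kennedy and Tullman-Ercek, 2019). Cso-PSIIH shells co-expressed with the enzymes were constructed and purified. SDS-PAGE analysis and Western blot confirmed the presence of the target enzymes in the shell samples (Figures 18A-18B), and the average copy number of enzyme per shell was estimated by Coomassie blue densitometry (Table 8) [Hagen, A. et al., Nature Communications 9:2881 (2018); Nichols et al., Methods in Enzymology, 617, 155-186 (2019)]. Encapsulation of these enzymes did not appear to significantly affect the size and morphology of the Cso-BMC (Figures 18C-18E).
In some cases, encapsulating an enzyme in a protein shell is known to alter the enzyme's catalytic properties. To explore how encapsulation by the Cso-BMC might affect the catalytic efficiency of APEX2 and LacZ, we performed steady-state kinetics on both the free and the encapsulated enzymes and fitted the data to the Michaelis-Menten model to obtain the turnover number (kcat), the Michaelis-Menten constant (KM) and the catalytic efficiency (kcat/KM) (Table 8, Figure 19). For encapsulated APEX2, kcat/KM decreased to about 30% of that of the free enzyme. For encapsulated LacZ, kcat/KM was not significantly different from that of the free enzyme. The kinetic constants kcat and KM obtained for the two free enzymes are in reasonable agreement with previous work, indicating that the presence of S2CP(30) does not affect the activity of the free enzymes [Juers, Hakda, Matthews and Huber, Biochemistry, 42(46), 13505-13511 (2003); Lam et al., Nature Methods 12(1):51-54 (2015)].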
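To make the quantities kcat, KM and kcat/KM concrete, the sketch below evaluates the Michaelis-Menten rate law v = Vmax·[S]/(KM + [S]) and recovers Vmax and KM from rate data. The fitting procedure used for the reported values is not stated in the text; the Hanes-Woolf linearization ([S]/v = [S]/Vmax + KM/Vmax) used here is a simple stand-in chosen so that the example needs no external libraries, and the numbers in the usage example are synthetic, not measured data. Units are whatever the caller supplies consistently.

```python
def mm_rate(substrate: float, vmax: float, km: float) -> float:
    """Michaelis-Menten initial rate for a given substrate concentration."""
    return vmax * substrate / (km + substrate)


def fit_hanes_woolf(substrate, rates):
    """Estimate (Vmax, KM) by least squares on the Hanes-Woolf plot.

    Regresses [S]/v against [S]: slope = 1/Vmax, intercept = KM/Vmax.
    A stand-in for illustration; nonlinear fitting is generally preferred
    for noisy data.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def catalytic_efficiency(vmax: float, km: float, enzyme_conc: float) -> float:
    """kcat/KM, with kcat = Vmax / [E]."""
    return (vmax / enzyme_conc) / km


if __name__ == "__main__":
    conc = [0.1, 0.5, 1.0, 2.0, 5.0]                 # synthetic substrate series
    data = [mm_rate(s, 2.0, 0.5) for s in conc]      # synthetic, noise-free rates
    vmax, km = fit_hanes_woolf(conc, data)
    print(round(vmax, 6), round(km, 6))
    print(round(catalytic_efficiency(vmax, km, enzyme_conc=0.01), 3))
```

The function `mm_rate` also shows why assays at saturating substrate approximate Vmax: when [S] is much larger than KM, the ratio [S]/(KM + [S]) approaches 1.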
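To determine a possible stabilizing effect of the Cso-BMC on the enzymes, free and shell-encapsulated enzyme samples were challenged with the conditions described above under which the empty shells were found to be stable. Enzyme activities were normalized to those of the untreated samples to determine the residual activity (Figure 20). The Cso-BMC conferred a moderate level of thermal stability on both enzymes. The encapsulated enzymes retained about 90% of their activity after incubation at 40 °C for 15 min, compared with 40% for the free enzymes. At 50 °C, encapsulated APEX2 retained about half of its activity, whereas the free enzyme was essentially inactive. At 50 °C, however, the activity of encapsulated LacZ was only slightly higher than that of the free enzyme. At 60 °C and above, all enzyme samples were inactive. The Cso-BMC was protective of APEX2 in up to 20% v/v methanol. For LacZ in methanol, on the other hand, an increase in activity was observed for both the free and the encapsulated enzyme. The presence of up to 40% v/v methanol has been reported not to denature LacZ but to enhance its activity [Shifrin and Hunn, Archives of Biochemistry and Biophysics, 130, 530-535 (1969)]. It is therefore unlikely that the Cso-BMC stabilizes LacZ in the presence of methanol. For freeze-thaw stability, the Cso-BMC stabilized both enzymes over seven consecutive cycles.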
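The normalization described above can be sketched as follows. The function divides each treated replicate by the mean activity of the untreated sample and reports the mean residual activity with its standard error; the function name, the choice to normalize against the mean of the untreated replicates and the example numbers are assumptions for illustration, not the authors' stated procedure or data.

```python
from math import sqrt
from statistics import mean, stdev


def residual_activity(treated, untreated):
    """Mean residual activity (fraction of untreated) and its standard error.

    Each treated replicate is divided by the mean untreated activity
    (assumed normalization scheme).
    """
    reference = mean(untreated)
    fractions = [value / reference for value in treated]
    sem = stdev(fractions) / sqrt(len(fractions)) if len(fractions) > 1 else 0.0
    return mean(fractions), sem


if __name__ == "__main__":
    # Hypothetical triplicate rates (arbitrary units), not measured values.
    residual, sem = residual_activity([4.4, 4.5, 4.6], [5.0, 5.0, 5.0])
    print(round(residual, 3), round(sem, 3))
```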
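The encapsulated enzymes displayed higher activity within the Cso-BMC at pH 10-11 but lower activity at pH 5-6. We believe that an acidic microenvironment within the Cso-BMC may shift the pH-activity profile of the encapsulated enzymes towards more alkaline conditions compared with the free enzymes. Such effects of anionic scaffolds on the pH-dependent activity of enzymes have been observed for synthetic maleic acid polymer scaffolds with trypsin and chymotrypsin, and more recently for the DNA polyphosphate backbone with the glucose oxidase-horseradish peroxidase (GOx-HRP) cascade [Goldstein, Biochemistry 11(22):4072-4084 (1972); Goldstein, Levin and Katchalski, Biochemistry, 3(12):1913-1919 (1964); Zhang, Tsitkov and Hess, Nature Communications, 7(1):13982 (2016)].
To date, the Cso-BMC probably displays the highest heterologous cargo loading achieved via an encapsulation peptide among minimal BMC-derived shells. The use of encapsulation peptides with such shells has been largely ineffective, and the cargo often cannot be detected by Coomassie blue staining, so that more sensitive techniques such as immunoblotting or fluorescence are required [Cai, F. et al., Plant Physiol 170:1868-1877 (2016); Hagen, A. et al., Nature Communications 9:2881 (2018); Lassila, Bernstein, Kinney, Axen and Kerfeld, (2014)]. In contrast, for the Cso-BMC and S2CP(30) system, all three heterologous protein cargoes tested (GFP, APEX2, LacZ) could be clearly identified in Coomassie blue-stained gels (Figures 16 and 18).
Table 8. Quantification of the average copy number of enzyme encapsulated per shell and the kinetic constants of the encapsulated and free enzymes. For the average enzyme copy number per shell, values calculated by assuming that all shells are either T=3 or T=4 are provided. Kinetic measurements were performed in triplicate, and the means are shown with standard errors.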
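Example 9:
Production of HO-BMC VLPs in Saccharomyces cerevisiae
Golden Gate cloning system
The constructs expressing HO-BMC VLPs comprise the components described in Table 3, Figure 2 and Figure 3, and were assembled according to the methods described in Example 2. Briefly, the yeast promoter PTDH3 was cloned into HcKan_P and named HcKan_P-TDH3 (SEQ ID NO: 65). The yeast promoter PYEF3 was cloned into HcKan_P and named HcKan_P-YEF3 (SEQ ID NO: 66). The yeast promoter PPYK1 was cloned into HcKan_P and named HcKan_P-PYK1 (SEQ ID NO: 67). The yeast promoter PGPM1 was cloned into HcKan_P and named HcKan_P-GPM1 (SEQ ID NO: 115). The HO-H ORF was cloned into HcKan_O and named HcKan_O-HO-H (SEQ ID NO: 68). The HO-P ORF was cloned into HcKan_O and named HcKan_O-HO-P (SEQ ID NO: 69). The HO-T1 ORF was cloned into HcKan_O and named HcKan_O-HO-T1 (SEQ ID NO: 70). The HO-T1-SpyTag ORF was cloned into HcKan_O and named HcKan_O-HO-T1-SpyTag (SEQ ID NO: 116).
The yeast terminator TRPL41B (SEQ ID NO: 80) was cloned into HcKan_T and named HcKan_T-RPL41B (SEQ ID NO: 71). The yeast terminator THBT1 (SEQ ID NO: 81) was cloned into HcKan_T and named HcKan_T-HBT1 (SEQ ID NO: 72). The yeast terminator TRPS20 (SEQ ID NO: 82) was cloned into HcKan_T and named HcKan_T-RPS20 (SEQ ID NO: 73). The yeast terminator TYPT31 (SEQ ID NO: 105) was cloned into HcKan_T and named HcKan_T-YPT31 (SEQ ID NO: 119).
The above promoter, ORF and terminator parts were assembled into the pathway assembly plasmid pCKU (SEQ ID NO: 74). The assembled HO-BMC pathway was then subcloned into pGAU-YMRWδ15 (SEQ ID NO: 75) for chromosomal integration of the pathway into the yeast YMRWδ15 site. The construct comprising the HO-BMC for integrating the HO-BMC pathway into the yeast YMRWδ15 site was named pGAU-YMRWδ15-HO-BMC (SEQ ID NO: 76). The construct comprising PGPM1-GFP-SpyCatcher-TRPS20 for integrating GFP-SpyCatcher into the yeast YPRCδ15 site was named pGAH-YPRCδ15-GFP-SpyCatcher (SEQ ID NO: 121).
Plasmid transformation and chromosomal integration of genes into yeast were carried out according to the high-efficiency lithium acetate/single-stranded DNA/PEG-3350 protocol described by Schiestl and colleagues [Gietz, R.D. and Schiestl, R.H. Nature Protocols 2:31 (2007)].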
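For encapsulating cargo in the HO-BMC, although the EP for the HO-BMC was reported to function when E. coli is the recombinant host, we found that it does not function in yeast [Lassila, J.K. et al., Journal of Molecular Biology 426:2217-2228 (2014)]. An alternative approach was therefore adopted, which uses the SpyCatcher/SpyTag protein conjugation system to encapsulate cargo in the HO-shell [Hagen, A. et al., Nature Communications 9:2881 (2018)]. The approach involves grafting the SpyTag sequence into a shell-facing peptide loop of HO-T1. This modified HO-T1 subunit is called HO-T1-SpyTag. Cargo proteins bearing a fused SpyCatcher domain can thus form a covalent isopeptide bond with HO-T1-SpyTag and be encapsulated within the HO-shell. The transcription units that make up the HO-shell in yeast are compiled in Table 9, and the yeast strains expressing the HO-shell pathways are compiled in Table 10. Schematics of the synthetic operons for the pathway used to produce Cso-BMC in E. coli and for the HO-ACB pathway used to express HO-BMC in yeast are shown in Figure 3.
Table 9. Compilation of the transcription units (TUs) (assembled in POT plasmids). TUs are annotated with the letters A-D.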
POT | A | B |
2 | PTDH3-HO-H-TRPL41B | |
4 | PYEF3-HO-T1-TRPL41B | PYEF3-HO-T1-SpyTag-TRPL41B |
5 | PPYK1-HO-P-SII-TRPS20 |
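Table 10. List of yeast strains expressing the HO-shell pathways, in relation to Table 9.
Purification of VLPs
After growth in 8 L YPD (yeast extract 1%, peptone 2%, dextrose 2%, BioBasic) at 25 °C for 48 h, the yeast cells were pelleted and lysed by eight passes through an M-110P microfluidizer at 20,000 psi. The lysate was centrifuged twice at 20,000 x g for 20 min each, and the clarified lysate was adjusted to pH 8 using 1 M Tris·HCl (pH 12). The lysate was also incubated with 300 μL biotin blocking buffer (IBA Lifesciences) for 15 min with gentle stirring. StrepTrap affinity purification was carried out in the same way as described above.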
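Results:
Purification of Cso-BMC and HO-BMC
Using the constructed synthetic operons (Figure 3), we purified Cso-BMC from E. coli and HO-BMC from yeast. To the inventors' knowledge, this is the first known example of the formation of recombinant protein shells using only two components from the H. neapolitanus cso operon. Although Silver and colleagues have reported the formation of H. neapolitanus carboxysomes in E. coli, this was done by transplanting the entire cso operon, which encodes ten genes, into E. coli [Bonacci, W. et al., Proceedings of the National Academy of Sciences 109:478-483 (2012)]. Our system reduces the formation of protein shells to just two genes, csoS1A and csoS4A. CsoS1A assembles into hexamers that form flat hexagonal tiles, while CsoS4A assembles into pentamers that occupy the vertices of the shell, capping the flat tiles formed by CsoS1A and giving the shell its icosahedral geometry. Although the resulting Cso-BMC shells (22 nm in diameter) are smaller than native H. neapolitanus carboxysomes (90 to 110 nm in diameter), the synthetic shells are highly uniform in size, as demonstrated by DLS measurements (Figure 17).
Kerfeld and colleagues have reported the recombinant expression of HO-BMC in E. coli, obtaining an atomic-scale structure using the three shell protomers HO-H, HO-P and HO-T1 [Sutter, M. et al., Science 356:1293-1297 (2017)]. HO-H is analogous in structure and geometric function to CsoS1A, while HO-P is analogous to CsoS4A. HO-T1 resembles a tandem repeat of two HO-H units and assembles into trimers that likewise form flat hexagonal tiles. We have managed to reconstitute the HO-BMC in yeast. To our current understanding of the literature, this is the first example of recombinant expression of a BMC-derived protein shell in yeast. Although recombinant protein titers in yeast are generally lower than in E. coli (Figure 14), expression of HO-BMC VLPs in yeast opens up avenues for customization via the eukaryotic post-translational modification machinery [Sudbery, P.E. Curr Opin Biotechnol 7 (1996)]. It is also noteworthy that many yeast-derived biomolecules, and the organism itself, have been granted generally recognized as safe (GRAS) status, which places the HO-BMC in a good position for vaccine development [Sewalt, V. et al., Industrial Biotechnology 12:295-302 (2016)].
Under TEM, the Cso-BMC appears as capsid-like structures about 20 nm in diameter, some of which have angled facets. This shape is reminiscent of native H. neapolitanus carboxysomes, but as noted above, the diameter of the synthetic Cso-BMC is about 20% of that of the native carboxysome. A plausible reason for the smaller size of the Cso-BMC is that its luminal space is empty. In native carboxysomes, hundreds to thousands of proteins are known to be tightly packed within the shell [Bonacci, W. et al., Proceedings of the National Academy of Sciences 109:478-483 (2012)]. For bioengineering purposes, it should be more desirable for VLPs to be devoid of their native luminal proteins so that recombinant protein cargo can be encapsulated within these shells more efficiently [Schwarz, B. et al., Advances in Virus Research 97:1-60 (2017)].
Our yeast-expressed HO-BMC shells are very similar in size and shape to those expressed in E. coli as reported by Kerfeld and colleagues [Sutter et al., Science 356:1293-1297 (2017)]. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) analysis of the affinity-purified protein eluates from Cso-BMC and HO-BMC showed the presence of the expected shell protomer proteins. Because the hexamers (CsoS1A, HO-H) and pentamers (CsoS4A, HO-P) have similar molecular weights (10±1 kDa), they could not be well resolved by SDS-PAGE. Nevertheless, given the presence of protein shells, it can be inferred that both species are present in the observed protein band at about 10 kDa. The atomic-level structural details of both the Cso-BMC and the HO-shell indicate that these particles are essentially uniform in size [Sutter, M. et al., Science 356:1293-1297 (2017); Tan, Ali et al., Biomacromolecules doi:10.1021/acs.biomac.1c00533 (2021)]. This uniformity of size is a useful feature in VLP engineering, as it translates into predictability when VLPs are functionalized as biomaterials [Schwarz, B. et al., Advances in Virus Research 97:1-60 (2017)].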
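Summary
BMCs are promising platforms for spatially programming metabolic reactions in microbial cell factories and can be repurposed as specialized biochemical delivery vehicles [Kerfeld, C.A. et al., Nature Reviews Microbiology 16:277 (2018)]. However, a major obstacle to using these protein shells for such purposes is their often complex assembly, which does not translate simply into recombinant systems. The assembly of protein shells using two types of shell protein is a significant reduction from the ten identified components of the H. neapolitanus cso operon that produce native-like α-carboxysomes [Bonacci, W. et al., Proceedings of the National Academy of Sciences 109:478-483 (2012)]. In addition, we have identified the sequence S2CP, which can target heterologous protein cargo into the simplified carboxysome shells. The encapsulation peptide S2CP(30), which contains 6 additional residues, was shown to be about 4 times as efficient as S2CP in mediating encapsulation of GFP cargo protein into the Cso-BMC. Both S2CP and S2CP(30) can therefore be used to control the amount of heterologous protein cargo packaged within the Cso-BMC. The Cso-BMC was also able to stabilize two enzymes, APEX2 and LacZ, against common enzyme-denaturing factors such as heat shock, the presence of methanol co-solvent, consecutive freeze-thaw cycles and highly alkaline environments. To our knowledge, this is the first demonstration of a minimal-component BMC-derived shell being used to house and stabilize enzymes against such denaturing factors. The Cso-BMC expands the range of VLPs currently available for encapsulating and stabilizing enzymes [Demchuk and Patel, Biotechnology Advances, 41:107547 (2020)].
We also recombinantly expressed the HO-BMC in yeast and provided evidence that the shells can encapsulate recombinant protein cargo. To our knowledge, this is the first demonstration of recombinant expression of BMC shells in yeast.
References
The listing or discussion of an apparently prior-published document in this specification should not necessarily be taken as an acknowledgement that the document is part of the state of the art or is common general knowledge.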
Adams, P.D., Afonine, P.V., Bunkóczi, G., Chen, V.B., Davis, I.W., Echols, N., Zwart, P.H. (2010). PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallographica. Section D: Biological Crystallography, 66(Pt 2), 213-221. doi:10.1107/s0907444909052925.
Anderson, J.C. (2006). Anderson Promoter Library Registry of Standard Biological Parts. Retrieved from parts.igem.org/Promoters/Catalog/Anderson.
Baneyx, F. (1999). Recombinant protein expression in Escherichia coli. Current Opinion in Biotechnology, 10(5), 411-421. doi:10.1016/S0958-1669(99)00003-8.
Bonacci, W., Teng, P.K., Afonso, B., Niederholtmeyer, H., Grob, P., Silver, P.A., & Savage, D.F. (2012). Modularity of a carbon-fixing protein organelle. Proceedings of the National Academy of Sciences, 109(2), 478-483. doi:10.1073/pnas.1108557109.
Cai, F., Bernstein, S.L., Wilson, S.C., & Kerfeld, C.A. (2016). Production and Characterization of Synthetic Carboxysome Shells with Incorporated Luminal Proteins. Plant Physiology, 170(3), 1868-1877. doi:10.1104/pp.15.01822.
Cai, F., Dou, Z., Bernstein, S.L., Leverenz, R., Williams, E.B., Heinhorst, S., Kerfeld, C.A. (2015). Advances in Understanding Carboxysome Assembly in Prochlorococcus and Synechococcus Implicate CsoS2 as a Critical Component. Life (Basel), 5(2), 1141-1171. doi:10.3390/life5021141.
Das, S., Zhao, L., Elofson, K., & Finn, M.G. (2020). Enzyme Stabilization by Virus-Like Particles. Biochemistry, 59(31), 2870-2881. doi:10.1021/acs.biochem.0c00435.
Demchuk, A.M., & Patel, T.R. (2020). The biomedical and bioengineering potential of protein nanocompartments. Biotechnology Advances, 41, 107547. doi:10.1016/j.biotechadv.2020.107547.
Dunn, K.W., Kamocka, M.M., & McDonald, J.H. (2011). A practical guide to evaluating colocalization in biological microscopy. American Journal of Physiology-Cell Physiology, 300(4), C723-C742. doi:10.1152/ajpcell.00462.2010.
Emsley, P., & Cowtan, K. (2004). Coot: model-building tools for molecular graphics. Acta Crystallographica. Section D: Biological Crystallography, 60(Pt 12 Pt 1), 2126-2132. doi:10.1107/s0907444904019158.
Fletcher, J.M., Harniman, R.L., Barnes, F.R.H., Boyle, A.L., Collins, A., Mantell, J., Woolfson, D.N. (2013). Self-Assembling Cages from Coiled-Coil Peptide Modules. Science, 340(6132), 595-599. doi:10.1126/science.1233936.
Gietz, R.D., & Schiestl, R.H. (2007). High-efficiency yeast transformation using the LiAc/SS carrier DNA/PEG method. Nature Protocols, 2, 31. doi:10.1038/nprot.2007.13.
Golan, R., Zehavi, U., Naim, M., Patchornik, A., & Smirnoff, P. (1996). Inhibition of Escherichia coli beta-galactosidase by 2-nitro-1-(4,5-dimethoxy-2-nitrophenyl)ethyl, a photoreversible thiol label. Biochimica et Biophysica Acta (BBA) - Bioenergetics, 1293(2), 238-242. doi:10.1016/0167-4838(95)00254-5.
Goldstein, L. (1972). Microenvironmental effects on enzyme catalysis. Kinetic study of polyanionic and polycationic derivatives of chymotrypsin. Biochemistry, 11(22), 4072-4084. doi:10.1021/bi00772a009.
Goldstein, L., Levin, Y., & Katchalski, E. (1964). A Water-insoluble Polyanionic Derivative of Trypsin. II. Effect of the Polyelectrolyte Carrier on the Kinetic Behavior of the Bound Trypsin*. Biochemistry, 3(12), 1913-1919. doi:10.1021/bi00900a022.
Guo, Y., Dong, J., Zhou, T., Auxillos, J., Li, T., Zhang, W., Dai, J. (2015). YeastFab: the design and construction of standard biological parts for metabolic engineering in Saccharomyces cerevisiae. Nucleic Acids Research, 43(13), e88. doi:10.1093/nar/gkv464.
Hagen, A., Sutter, M., Sloan, N., & Kerfeld, C.A. (2018). Programmed loading and rapid purification of engineered bacterial microcompartment shells. Nature Communications, 9(1), 2881. doi:10.1038/s41467-018-05162-z.
Juers, D.H., Hakda, S., Matthews, B.W., & Huber, R.E. (2003). Structural Basis for the Altered Activity of Gly794 Variants of Escherichia coli β-Galactosidase. Biochemistry, 42(46), 13505-13511. doi:10.1021/bi035506j.
Kalnins, G., Cesle, E.-E., Jansons, J., Liepins, J., Filimonenko, A., & Tars, K. (2020). Encapsulation mechanisms and structural studies of GRM2 bacterial microcompartment particles. Nature Communications, 11(1), 388. doi:10.1038/s41467-019-14205-y.
Keeble, A.H., & Howarth, M. (2019). Insider information on successful covalent protein coupling with help from SpyBank. Methods in Enzymology, 617, 443-461. doi:10.1016/bs.mie.2018.12.010.
Kerfeld, C.A., Aussignargues, C., Zarzycki, J., Cai, F., & Sutter, M. (2018). Bacterial microcompartments. Nature Reviews: Microbiology, 16, 277. doi:10.1038/nrmicro.2018.10.
Klein, M.G., Zwart, P., Bagby, S.C., Cai, F., Chisholm, S.W., Heinhorst, S., Kerfeld, C.A. (2009). Identification and structural analysis of a novel carboxysome shell protein with implications for metabolite transport. Journal of Molecular Biology, 392(2), 319-333. doi:10.1016/j.jmb.2009.03.056.
Küchler, A., Yoshimoto, M., Luginbühl, S., Mavelli, F., & Walde, P. (2016). Enzymatic reactions in confined environments. Nature Nanotechnology, 11(5), 409-420. doi:10.1038/nnano.2016.54.
Lam, S.S., Martell, J.D., Kamer, K.J., Deerinck, T.J., Ellisman, M.H., Mootha, V.K., & Ting, A.Y. (2015). Directed evolution of APEX2 for electron microscopy and proximity labeling. Nature Methods, 12(1), 51-54. doi:10.1038/nmeth.3179.
Lassila, J.K., Bernstein, S.L., Kinney, J.N., Axen, S.D., & Kerfeld, C.A. (2014). Assembly of robust bacterial microcompartment shells using building blocks from an organelle of unknown function. Journal of Molecular Biology, 426(11), 2217-2228. doi:10.1016/j.jmb.2014.02.025.
Lawrence, A.D., Frank, S., Newnham, S., Lee, M.J., Brown, I.R., Xue, W.-F., Warren, M.J. (2014). Solution Structure of a Bacterial Microcompartment Targeting Peptide and Its Application in the Construction of an Ethanol Bioreactor. ACS Synthetic Biology, 3(7), 454-465. doi:10.1021/sb4001118.
Liebschner, D., Afonine, P.V., Baker, M.L., Bunkoczi, G., Chen, V.B., Croll, T.I., Adams, P.D. (2019). Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallographica Section D: Structural Biology, 75(10), 861-877. doi:10.1107/S2059798319011471.
Neher, S.B., Sauer, R.T., & Baker, T.A. (2003). Distinct peptide signals in the UmuD and UmuD′ subunits of UmuD/D′ mediate tethering and substrate processing by the ClpXP protease. Proceedings of the National Academy of Sciences, 100(23), 13219-13224. doi:10.1073/pnas.2235804100.
Nichols, T.M., Kennedy, N.W., & Tullman-Ercek, D. (2019). Cargo encapsulation in bacterial microcompartments: Methods and analysis. Methods in Enzymology, 617, 155-186. doi:10.1016/bs.mie.2018.12.009.
Oltrogge, L.M., Chaijarasphong, T., Chen, A.W., Bolin, E.R., Marqusee, S., & Savage, D.F. (2020). Multivalent interactions between CsoS2 and Rubisco mediate α-carboxysome formation. Nature Structural & Molecular Biology, 27(3), 281-287. doi:10.1038/s41594-020-0387-7.
Patterson, D.P., Prevelige, P.E., & Douglas, T. (2012). Nanoreactors by Programmed Enzyme Encapsulation Inside the Capsid of the Bacteriophage P22. ACS Nano, 6(6), 5000-5009. doi:10.1021/nn300545z.
Patterson, D.P., Schwarz, B., El-Boubbou, K., van der Oost, J., Prevelige, P.E., & Douglas, T. (2012). Virus-like particle nanoreactors: programmed encapsulation of the thermostable CelB glycosidase inside the P22 capsid. Soft Matter, 8(39), 10158-10166. doi:10.1039/C2SM26485D.
Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M., Meng, E.C., & Ferrin, T.E. (2004). UCSF Chimera - a visualization system for exploratory research and analysis. Journal of Computational Chemistry, 25(13), 1605-1612. doi:10.1002/jcc.20084.
Sánchez-Sánchez, L., Tapia-Moreno, A., Juarez-Moreno, K., Patterson, D.P., Cadena-Nava, R.D., Douglas, T., & Vazquez-Duhalt, R. (2015). Design of a VLP-nanovehicle for CYP450 enzymatic activity delivery. Journal of Nanobiotechnology, 13(1), 66. doi:10.1186/s12951-015-0127-z.
Schwarz, B., Uchida, M., & Douglas, T. (2017). Chapter One - Biomedical and Catalytic Opportunities of Virus-Like Particles in Nanotechnology. In M. Kielian, T.C. Mettenleiter, & M.J. Roossinck (Eds.), Advances in Virus Research (Vol. 97, pp. 1-60): Academic Press.
Sewalt, V., Shanahan, D., Gregg, L., La Marta, J., & Carrillo, R. (2016). The Generally Recognized as Safe (GRAS) Process for Industrial Microbial Enzymes. Industrial Biotechnology, 12(5), 295-302. doi:10.1089/ind.2016.0011.
Shifrin, S., & Hunn, G. (1969). Effect of alcohols on the enzymatic activity and subunit association of β-galactosidase. Archives of Biochemistry and Biophysics, 130, 530-535. doi:10.1016/0003-9861(69)90066-6.
Sievers, F., & Higgins, D.G. (2014). Clustal Omega, accurate alignment of very large numbers of sequences. Methods in Molecular Biology, 1079, 105-116. doi:10.1007/978-1-62703-646-7_6.
Silva, C., Martins, M., Jing, S., Fu, J., & Cavaco-Paulo, A. (2018). Practical insights on enzyme stabilization. Critical Reviews in Biotechnology, 38(3), 335-350. doi:10.1080/07388551.2017.1355294.
Sudbery, P.E. (1996). The expression of recombinant proteins in yeasts. Current Opinion in Biotechnology, 7. doi:10.1016/s0958-1669(96)80055-3.
Sutter, M., Greber, B., Aussignargues, C., & Kerfeld, C.A. (2017). Assembly principles and structure of a 6.5-MDa bacterial microcompartment shell. Science, 356(6344), 1293-1297. doi:10.1126/science.aan3289.
Sutter, M., Laughlin, T.G., Sloan, N.B., Serwas, D., Davies, K.M., & Kerfeld, C.A. (2019). Structure of a synthetic beta-carboxysome shell. Plant Physiology, 181(3), 1050–1058. doi:10.1104/pp.19.00885.
Tan, Y.Q., Ali, S., Xue, B., Teo, W.Z., Ling, L.H., Go, M.K., ... Yew, W.S. (2021). Structure of a Minimal α-Carboxysome-Derived Shell and Its Utility in Enzyme Stabilization. Biomacromolecules. doi:10.1021/acs.biomac.1c00533.
Tan, Y.Q., Xue, B., & Yew, W.S. (2021). Genetically Encodable Scaffolds for Optimizing Enzyme Function. Molecules, 26(5), 1389. Retrieved from wwwdotmdpidotcom/1420-3049/26/5/1389.
Tanaka, S., Kerfeld, C.A., Sawaya, M.R., Cai, F., Heinhorst, S., Cannon, G.C., & Yeates, T.O. (2008). Atomic-Level Models of the Bacterial Carboxysome Shell. Science, 319(5866), 1083-1086. doi:10.1126/science.1151458.
Thomas, F., Boyle, A.L., Burton, A.J., & Woolfson, D.N. (2013). A Set of de Novo Designed Parallel Heterodimeric Coiled Coils with Quantified Dissociation Constants in the Micromolar to Sub-nanomolar Regime. Journal of the American Chemical Society, 135(13), 5161-5166. doi:10.1021/ja312310g.
Tsai, Y., Sawaya, M.R., Cannon, G.C., Cai, F., Williams, E.B., Heinhorst, S., ... Yeates, T.O. (2007). Structural Analysis of CsoS1A and the Protein Shell of the Halothiobacillus neapolitanus Carboxysome. PLoS Biology, 5(6), e144. doi:10.1371/journal.pbio.0050144.
Turmo, A., Gonzalez-Esquer, C.R., & Kerfeld, C.A. (2017). Carboxysomes: metabolic modules for CO2 fixation. FEMS Microbiology Letters, 364(18), fnx176. doi:10.1093/femsle/fnx176.
Waterhouse, A.M., Procter, J.B., Martin, D.M.A., Clamp, M., & Barton, G.J. (2009). Jalview Version 2 - a multiple sequence alignment editor and analysis workbench. Bioinformatics, 25(9), 1189-1191. doi:10.1093/bioinformatics/btp033.
Winn, M.D., Ballard, C.C., Cowtan, K.D., Dodson, E.J., Emsley, P., Evans, P.R., ... Wilson, K.S. (2011). Overview of the CCP4 suite and current developments. Acta Crystallographica. Section D: Biological Crystallography, 67(Pt 4), 235-242. doi:10.1107/s0907444910045749.
Yu, Z., Reid, J.C., & Yang, Y.-P. (2013). Utilizing Dynamic Light Scattering as a Process Analytical Technology for Protein Folding and Aggregation Monitoring in Vaccine Manufacturing. Journal of Pharmaceutical Sciences, 102(12), 4284-4290. doi:10.1002/jps.23746.
Zhang, K. (2016). Gctf: Real-time CTF determination and correction. Journal of Structural Biology, 193(1), 1-12. doi:10.1016/j.jsb.2015.11.003.
Zhang, Y., Tsitkov, S., & Hess, H. (2016). Proximity does not contribute to activity enhancement in the glucose oxidase–horseradish peroxidase cascade. Nature Communications, 7(1), 13982. doi:10.1038/ncomms13982.
Zivanov, J., Nakane, T., Forsberg, B.O., Kimanius, D., Hagen, W.J.H., Lindahl, E., & Scheres, S.H.W. (2018). New tools for automated high-resolution cryo-EM structure determination in RELION-3. eLife, 7, e42166. doi:10.7554/eLife.42166.
SEQUENCE LISTING
<110> 新加坡国立大学
<120> 细菌微区室病毒样颗粒
<130> SP102877WO
<150> SG10202010547W
<151> 2020-10-23
<160> 121
<170> PatentIn version 3.5
<210> 1
<211> 24
<212> PRT
<213> Artificial Sequence
<220>
<223> S2CP amino acid sequence
<400> 1
Ser Lys Ile Thr Gly Ser Ser Gly Asn Asp Thr Gln Gly Ser Leu Ile
1 5 10 15
Thr Tyr Ser Gly Gly Ala Arg Gly
20
<210> 2
<211> 98
<212> PRT
<213> Artificial Sequence
<220>
<223> CsoS1A amino acid sequence
<400> 2
Met Ala Asp Val Thr Gly Ile Ala Leu Gly Met Ile Glu Thr Arg Gly
1 5 10 15
Leu Val Pro Ala Ile Glu Ala Ala Asp Ala Met Thr Lys Ala Ala Glu
20 25 30
Val Arg Leu Val Gly Arg Gln Phe Val Gly Gly Gly Tyr Val Thr Val
35 40 45
Leu Val Arg Gly Glu Thr Gly Ala Val Asn Ala Ala Val Arg Ala Gly
50 55 60
Ala Asp Ala Cys Glu Arg Val Gly Asp Gly Leu Val Ala Ala His Ile
65 70 75 80
Ile Ala Arg Val His Ser Glu Val Glu Asn Ile Leu Pro Lys Ala Pro
85 90 95
Gln Ala
<210> 3
<211> 83
<212> PRT
<213> Artificial Sequence
<220>
<223> CsoS4A amino acid sequence
<400> 3
Met Lys Ile Met Gln Val Glu Lys Thr Leu Val Ser Thr Asn Arg Ile
1 5 10 15
Ala Asp Met Gly His Lys Pro Leu Leu Val Val Trp Glu Lys Pro Gly
20 25 30
Ala Pro Arg Gln Val Ala Val Asp Ala Ile Gly Cys Ile Pro Gly Asp
35 40 45
Trp Val Leu Cys Val Gly Ser Ser Ala Ala Arg Glu Ala Ala Gly Ser
50 55 60
Lys Ser Tyr Pro Ser Asp Leu Thr Ile Ile Gly Ile Ile Asp Gln Trp
65 70 75 80
Asn Gly Glu
<210> 4
<211> 99
<212> PRT
<213> Artificial Sequence
<220>
<223> HO-H amino acid sequence
<400> 4
Met Ala Asp Ala Leu Gly Met Ile Glu Val Arg Gly Phe Val Gly Met
1 5 10 15
Val Glu Ala Ala Asp Ala Met Val Lys Ala Ala Lys Val Glu Leu Ile
20 25 30
Gly Tyr Glu Lys Thr Gly Gly Gly Tyr Val Thr Ala Val Val Arg Gly
35 40 45
Asp Val Ala Ala Val Lys Ala Ala Thr Glu Ala Gly Gln Arg Ala Ala
50 55 60
Glu Arg Val Gly Glu Val Val Ala Val His Val Ile Pro Arg Pro His
65 70 75 80
Val Asn Val Asp Ala Ala Leu Pro Leu Gly Arg Thr Pro Gly Met Asp
85 90 95
Lys Ser Ala
<210> 5
<211> 96
<212> PRT
<213> Artificial Sequence
<220>
<223> HO-P amino acid sequence
<400> 5
Met Val Leu Gly Lys Val Val Gly Thr Val Val Ala Ser Arg Lys Glu
1 5 10 15
Pro Arg Ile Glu Gly Leu Ser Leu Leu Leu Val Arg Ala Cys Asp Pro
20 25 30
Asp Gly Thr Pro Thr Gly Gly Ala Val Val Cys Ala Asp Ala Val Gly
35 40 45
Ala Gly Val Gly Glu Val Val Leu Tyr Ala Ser Gly Ser Ser Ala Arg
50 55 60
Gln Thr Glu Val Thr Asn Asn Arg Pro Val Asp Ala Thr Ile Met Ala
65 70 75 80
Ile Val Asp Leu Val Glu Met Gly Gly Asp Val Arg Phe Arg Lys Asp
85 90 95
<210> 6
<211> 205
<212> PRT
<213> Artificial Sequence
<220>
<223> HO-T1 amino acid sequence
<400> 6
Met Asp His Ala Pro Glu Arg Phe Asp Ala Thr Pro Pro Ala Gly Glu
1 5 10 15
Pro Asp Arg Pro Ala Leu Gly Val Leu Glu Leu Thr Ser Ile Ala Arg
20 25 30
Gly Ile Thr Val Ala Asp Ala Ala Leu Lys Arg Ala Pro Ser Leu Leu
35 40 45
Leu Met Ser Arg Pro Val Ser Ser Gly Lys His Leu Leu Met Met Arg
50 55 60
Gly Gln Val Ala Glu Val Glu Glu Ser Met Ile Ala Ala Arg Glu Ile
65 70 75 80
Ala Gly Ala Gly Ser Gly Ala Leu Leu Asp Glu Leu Glu Leu Pro Tyr
85 90 95
Ala His Glu Gln Leu Trp Arg Phe Leu Asp Ala Pro Val Val Ala Asp
100 105 110
Ala Trp Glu Glu Asp Thr Glu Ser Val Ile Ile Val Glu Thr Ala Thr
115 120 125
Val Cys Ala Ala Ile Asp Ser Ala Asp Ala Ala Leu Lys Thr Ala Pro
130 135 140
Val Val Leu Arg Asp Met Arg Leu Ala Ile Gly Ile Ala Gly Lys Ala
145 150 155 160
Phe Phe Thr Leu Thr Gly Glu Leu Ala Asp Val Glu Ala Ala Ala Glu
165 170 175
Val Val Arg Glu Arg Cys Gly Ala Arg Leu Leu Glu Leu Ala Cys Ile
180 185 190
Ala Arg Pro Val Asp Glu Leu Arg Gly Arg Leu Phe Phe
195 200 205
<210> 7
<211> 72
<212> DNA
<213> Artificial Sequence
<220>
<223> S2CP nucleotide sequence
<400> 7
tctaagatta ctggttcttc tggtaacgat acccaaggtt ctttgattac ttactctggt 60
ggtgctagag gt 72
<210> 8
<211> 294
<212> DNA
<213> Artificial Sequence
<220>
<223> CsoS1A nucleotide sequence
<400> 8
atggctgatg ttactggtat tgctttgggt atgattgaaa ctagaggttt ggttccagct 60
atcgaagctg ctgacgctat gaccaaggcc gctgaagtca gattggtcgg tagacaattt 120
gttggaggtg gttacgtcac tgttttggtt cgtggtgaaa ccggtgccgt taacgctgct 180
gttagagctg gtgctgatgc ttgtgaaaga gttggtgacg gtttagttgc tgcccacatt 240
attgccagag tccactctga agttgaaaac attttgccaa aggctccaca ggct 294
<210> 9
<211> 249
<212> DNA
<213> Artificial Sequence
<220>
<223> CsoS4A nucleotide sequence
<400> 9
atgaagatca tgcaagttga aaagactttg gtttctacca acagaattgc tgatatgggt 60
cacaagccat tgttggttgt ttgggaaaaa cctggtgctc caagacaagt tgctgttgat 120
gctattggtt gtattccagg tgactgggtt ttgtgtgttg gttcttctgc tgccagagaa 180
gctgctggtt ccaagtctta cccatctgat ttgactatca tcggtattat tgaccaatgg 240
aacggtgaa 249
<210> 10
<211> 297
<212> DNA
<213> Artificial Sequence
<220>
<223> HO-H nucleotide sequence
<400> 10
atggctgatg ctttgggtat gattgaagtt agaggtttcg ttggtatggt tgaagctgct 60
gatgctatgg ttaaggctgc taaagttgaa ttgatcggtt acgaaaaaac tggtggtggt 120
tatgttactg ctgttgttag aggtgatgtt gctgctgtaa aagctgctac tgaagctggt 180
caaagggctg ctgaaagagt tggagaagtt gttgctgttc atgttattcc aagaccacat 240
gttaatgttg atgctgcttt gccattgggt agaactccag gtatggataa gtctgct 297
<210> 11
<211> 288
<212> DNA
<213> Artificial Sequence
<220>
<223> HO-P nucleotide sequence
<400> 11
atggttttag gtaaagttgt cggtactgtt gttgcatcaa gaaaggaacc aagaattgaa 60
ggtttatctt tattattggt tagagcttgt gatccagatg gtactccaac tggtggtgct 120
gttgtttgtg ctgatgctgt tggtgctggt gttggtgaag ttgttttata tgcttctggt 180
tcttctgcta gacaaactga agttactaat aatagaccag ttgatgctac tattatggct 240
attgttgatt tggttgaaat gggtggtgat gttagattta gaaaagat 288
<210> 12
<211> 615
<212> DNA
<213> Artificial Sequence
<220>
<223> HO-T1 nucleotide sequence
<400> 12
atggatcatg ctccagaaag atttgatgct actcctccag ctggtgaacc agatagacca 60
gctttgggtg ttttggaatt gacttctatt gctagaggta ttaccgttgc tgatgctgct 120
ttgaaaagag caccatcttt gttgttgatg tccagaccag tttcttccgg taaacatttg 180
ttgatgatga gaggtcaagt tgccgaagtt gaagaatcta tgattgctgc tagagaaatt 240
gctggtgctg gttctggtgc tttgttggat gaattggaat tgccatatgc tcacgaacaa 300
ctttggagat ttttggatgc tccagttgtt gcagatgctt gggaagaaga tactgaatcc 360
gttattatcg ttgaaaccgc tactgtttgt gctgctattg attctgctga tgcagcctta 420
aaaactgctc ctgttgtttt gagagatatg agattggcta ttggtattgc tggtaaggct 480
ttctttactt tgactggtga attggctgat gttgaagctg ctgctgaagt tgttagagaa 540
agatgtggtg ctagattgct agaattggct tgtattgcaa gaccagttga cgaattgaga 600
ggtaggttgt ttttc 615
<210> 13
<211> 83
<212> PRT
<213> Artificial Sequence
<220>
<223> SpyCatcher tag amino acid sequence
<400> 13
Asp Ser Ala Thr His Ile Lys Phe Ser Lys Arg Asp Glu Asp Gly Lys
1 5 10 15
Glu Leu Ala Gly Ala Thr Met Glu Leu Arg Asp Ser Ser Gly Lys Thr
20 25 30
Ile Ser Thr Trp Ile Ser Asp Gly Gln Val Lys Asp Phe Tyr Leu Tyr
35 40 45
Pro Gly Lys Tyr Thr Phe Val Glu Thr Ala Ala Pro Asp Gly Tyr Glu
50 55 60
Val Ala Thr Ala Ile Thr Phe Thr Val Asn Glu Gln Gly Gln Val Thr
65 70 75 80
Val Asn Gly
<210> 14
<211> 249
<212> DNA
<213> Artificial Sequence
<220>
<223> SpyCatcher tag nucleotide sequence
<400> 14
gattctgcta ctcatattaa gttctccaag agggacgaag atggtaaaga attggctggt 60
gcaactatgg aattgagaga ttcttctggt aagaccattt ccacctggat ttctgatggt 120
caagttaagg atttctactt gtacccaggt aagtacactt tcgttgaaac tgctgctcca 180
gatggttatg aagttgctac tgctattact ttcaccgtca atgaacaagg tcaagtcact 240
gttaatggt 249
<210> 15
<211> 13
<212> PRT
<213> Artificial Sequence
<220>
<223> SpyTag amino acid sequence
<400> 15
Ala His Ile Val Met Val Asp Ala Tyr Lys Pro Thr Lys
1 5 10
<210> 16
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> SpyTag nucleotide sequence
<400> 16
gctcatatag ttatggttga tgcttacaag ccaacaaaa 39
<210> 17
<211> 21
<212> PRT
<213> Artificial Sequence
<220>
<223> CC-Di-A amino acid sequence
<400> 17
Glu Ile Ala Ala Leu Glu Lys Glu Asn Ala Ala Leu Glu Gln Glu Ile
1 5 10 15
Ala Ala Leu Glu Gln
20
<210> 18
<211> 63
<212> DNA
<213> Artificial Sequence
<220>
<223> CC-Di-A nucleotide sequence
<400> 18
gaaattgcag ctttggaaaa agaaaacgct gccttggaac aagaaattgc cgcattagaa 60
caa 63
<210> 19
<211> 21
<212> PRT
<213> Artificial Sequence
<220>
<223> CC-Di-B amino acid sequence
<400> 19
Lys Ile Ala Ala Leu Lys Lys Lys Asn Ala Ala Leu Lys Gln Lys Ile
1 5 10 15
Ala Ala Leu Lys Gln
20
<210> 20
<211> 63
<212> DNA
<213> Artificial Sequence
<220>
<223> CC-Di-B nucleotide sequence
<400> 20
aaaattgcag cattgaaaaa gaagaacgcc gccttgaaac aaaaaattgc tgccttaaaa 60
caa 63
<210> 21
<211> 8
<212> PRT
<213> Artificial Sequence
<220>
<223> Strep-Tag II (SII) amino acid sequence
<400> 21
Trp Ser His Pro Gln Phe Glu Lys
1 5
<210> 22
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Strep-Tag II (SII) nucleotide sequence
<400> 22
tggtcacatc cacaatttga aaag 24
<210> 23
<211> 1555
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PT7 nucleotide sequence
<400> 23
tcactgcccg ctttccagtc gggaaacctg tcgtgccagc tgcattaatg aatcggccaa 60
cgcgcgggga gaggcggttt gcgtattggg cgccagggtg gtttttcttt tcaccagtga 120
aacgggcaac agctgattgc ccttcaccgc ctggccctga gagagttgca gcaagcggtc 180
cacgctggtt tgccccagca ggcgaaaatc ctgtttgatg gtggttaacg gcgggatata 240
acatgagctg tcttcggtat cgtcgtatcc cactaccgag atatccgcac caacgcgcag 300
cccggactcg gtaatggcgc gcattgcgcc cagcgccatc tgatcgttgg caaccagcat 360
cgcagtggga acgatgccct cattcagcat ttgcatggtt tgttgaaaac cggacatggc 420
actccagtcg ccttcccgtt ccgctatcgg ctgaatttga ttgcgagtga gatatttatg 480
ccagccagcc agacgcagac gcgccgagac agaacttaat gggcccgcta acagcgcgat 540
ttgctggtga cccaatgcga ccagatgctc cacgcccagt cgcgtaccgt cttcatggga 600
gaaaataata ctgttgatgg gtgtctggtc agagacatca agaaataacg ccggaacatt 660
agtgcaggca gcttccacag caatggcatc ctggtcatcc agcggatagt taatgatcag 720
cccactgacg cgttgcgcga gaagattgtg caccgccgct ttacaggctt cgacgccgct 780
tcgttctacc atcgacacca ccacgctggc acccagttga tcggcgcgag atttaatcgc 840
cgcgacaatt tgcgacggcg cgtgcagggc cagactggag gtggcaacgc caatcagcaa 900
cgactgtttg cccgccagtt gttgtgccac gcggttggga atgtaattca gctccgccat 960
cgccgcttcc actttttccc gcgttttcgc agaaacgtgg ctggcctggt tcaccacgcg 1020
ggaaacggtc tgataagaga caccggcata ctctgcgaca tcgtataacg ttactggttt 1080
cacattcacc accctgaatt gactctcttc cgggcgctat catgccatac cgcgaaaggt 1140
tttgcgccat tcgatggtgt ccgggatctc gacgctctcc cttatgcgac tcctgcatta 1200
ggaagcagcc cagtagtagg ttgaggccgt tgagcaccgc cgccgcaagg aatggtgcat 1260
gcaaggagat ggcgcccaac agtcccccgg ccacggggcc tgccaccata cccacgccga 1320
aacaagcgct catgagcccg aagtggcgag cccgatcttc cccatcggtg atgtcggcga 1380
tataggcgcc agcaaccgca cctgtggcgc cggtgatgcc ggccacgatg cgtccggcgt 1440
agaggatcga gatctcgatc ccgcgaaatt aatacgactc actatagggg aattgtgagc 1500
ggataacaat tcccctctag aaataatttt gtttaacttt aagaaggaga tatac 1555
<210> 24
<211> 118
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PCON5 nucleotide sequence
<400> 24
ggcttcccaa ccttaccaga gggcgcccca gctggcaatt ccgacgtctt tatggctagc 60
tcagtcctag gtacaatgct agcgaattca aaagatcttt taagaaggag atatacat 118
<210> 25
<211> 500
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PTDH3 nucleotide sequence
<400> 25
acagtttatt cctggcatcc actaaatata atggagcccg ctttttaagc tggcatccag 60
aaaaaaaaag aatcccagca ccaaaatatt gttttcttca ccaaccatca gttcataggt 120
ccattctctt agcgcaacta cagagaacag gggcacaaac aggcaaaaaa cgggcacaac 180
ctcaatggag tgatgcaacc tgcctggagt aaatgatgac acaaggcaat tgacccacgc 240
atgtatctat ctcattttct tacaccttct attaccttct gctctctctg atttggaaaa 300
agctgaaaaa aaaggttgaa accagttccc tgaaattatt cccctacttg actaataagt 360
atataaagac ggtaggtatt gattgtaatt ctgtaaatct atttcttaaa cttcttaaat 420
tctactttta tagttagtct tttttttagt tttaaaacac caagaactta gtttcgaata 480
aacacacata aacaaacaaa 500
<210> 26
<211> 500
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PPYK1 nucleotide sequence
<400> 26
acagattggg agattttcat agtagaattc agcatgatag ctacgtaaat gtgttccgca 60
ccgtcacaaa gtgttttcta ctgttctttc ttctttcgtt cattcagttg agttgagtga 120
gtgctttgtt caatggatct tagctaaaat gcatattttt tctcttggta aatgaatgct 180
tgtgatgtct tccaagtgat ttcctttcct tcccatatga tgctaggtac ctttagtgtc 240
ttcctaaaaa aaaaaaaagg ctcgccatca aaacgatatt cgttggcttt tttttctgaa 300
ttataaatac tctttggtaa cttttcattt ccaagaacct cttttttcca gttatatcat 360
ggtccccttt caaagttatt ctctactctt tttcatattc attctttttc atcctttggt 420
tttttattct taacttgttt attattctct cttgtttcta tttacaagac accaatcaaa 480
acaaataaaa catcatcaca 500
<210> 27
<211> 500
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PYEF3 nucleotide sequence
<400> 27
attaaaaaaa caacttacaa tcattgttcg ccccttccat acttactgcc actcgcaaaa 60
gggcccaacc agggcaatta cgtatcaaaa aatcatgaca ggctgggtaa taaatattcg 120
tgaagaaaga agaaattaaa aaaagaaacg aagaagcaaa aaaaagaaaa gactccgttt 180
aatcactttc aaccgcggtt tatccggccc cacccatgca taaccctaaa ttattagatc 240
acttagcacg tgaaaaagaa acgtttttaa tgtttttttt ttttttttct ttttcttttt 300
ttgcgttggt gaaaattttt tcgcttcctc gagtataatt atctcatctc atctttcata 360
taagataaga agttttataa aaaccttttg catcaaaatt ttgtagaata tctctttttc 420
ttacgctctc tttctttcct taattgtttt ctaaagaacc gtgtattttt ctagttcgaa 480
tccatcgata acattaaaag 500
<210> 28
<211> 3589
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CmCherry plasmid
<400> 28
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttcttctat ggtgagcaag ggcgaggagg ataacatggc catcatcaag gagttcatgc 1260
gcttcaaggt gcacatggag ggctccgtga acggccacga gttcgagatc gagggcgagg 1320
gcgagggccg cccctacgag ggcacccaga ccgccaagct gaaggtgacc aagggtggcc 1380
ccctgccctt cgcctgggac atcctgtccc ctcagttcat gtacggctcc aaggcctacg 1440
tgaagcaccc cgccgacatc cccgactact tgaagctgtc cttccccgag ggcttcaagt 1500
gggagcgcgt gatgaacttc gaggacggcg gcgtggtgac cgtgacccag gactcctccc 1560
tgcaggacgg cgagttcatc tacaaggtga agctgcgcgg caccaacttc ccctccgacg 1620
gccccgtaat gcagaagaag accatgggct gggaggcctc ctccgagcgg atgtaccccg 1680
aggacggcgc cctgaagggc gagatcaagc agaggctgaa gctgaaggac ggcggccact 1740
acgacgctga ggtcaagacc acctacaagg ccaagaagcc cgtgcagctg cccggcgcct 1800
acaacgtcaa catcaagttg gacatcacct cccacaacga ggactacacc atcgtggaac 1860
agtacgaacg cgccgagggc cgccactcca ccggcggcat ggacgagctg tacaagtagc 1920
cgagacgact gaccatttaa atcatacctg acctccatag cagaaagtca aaagcctccg 1980
accggaggct tttgacttga tcggcacgta agaggttcca actttcacca taatgaaata 2040
agatcactac cgggcgtatt ttttgagtta tcgagatttt caggagctaa ggaagctaaa 2100
atgagccata ttcaacggga aacgtcttgc tcgaggccgc gattaaattc caacatggat 2160
gctgatttat atgggtataa atgggctcgc gataatgtcg ggcaatcagg tgcgacaatc 2220
tatcgattgt atgggaagcc cgatgcgcca gagttgtttc tgaaacatgg caaaggtagc 2280
gttgccaatg atgttacaga tgagatggtc aggctaaact ggctgacgga atttatgcct 2340
cttccgacca tcaagcattt tatccgtact cctgatgatg catggttact caccactgcg 2400
atcccaggga aaacagcatt ccaggtatta gaagaatatc ctgattcagg tgaaaatatt 2460
gttgatgcgc tggcagtgtt cctgcgccgg ttgcattcga ttcctgtttg taattgtcct 2520
tttaacggcg atcgcgtatt tcgtctcgct caggcgcaat cacgaatgaa taacggtttg 2580
gttggtgcga gtgattttga tgacgagcgt aatggctggc ctgttgaaca agtctggaaa 2640
gaaatgcata agcttttgcc attctcaccg gattcagtcg tcactcatgg tgatttctca 2700
cttgataacc ttatttttga cgaggggaaa ttaataggtt gtattgatgt tggacgagtc 2760
ggaatcgcag accgatacca ggatcttgcc atcctatgga actgcctcgg tgagttttct 2820
ccttcattac agaaacggct ttttcaaaaa tatggtattg ataatcctga tatgaataaa 2880
ttgcagtttc acttgatgct cgatgagttt ttctaatgag ggcccaaatg taatcacctg 2940
gctcaccttc gggtgggcct ttctgcgttg ctggcgtttt tccataggct ccgcccccct 3000
gacgagcatc acaaaaatcg atgctcaagt cagaggtggc gaaacccgac aggactataa 3060
agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 3120
cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca 3180
cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 3240
ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 3300
gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 3360
tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga 3420
acagtatttg gtatctgcgc tctgctgaag ccagttacct cggaaaaaga gttggtagct 3480
cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga 3540
ttacgcgcag aaaaaaagga tctcaagaag atcctttgat tttctaccg 3589
<210> 29
<211> 708
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CmCherry nucleotide sequence of key ORF
<400> 29
atggtgagca agggcgagga ggataacatg gccatcatca aggagttcat gcgcttcaag 60
gtgcacatgg agggctccgt gaacggccac gagttcgaga tcgagggcga gggcgagggc 120
cgcccctacg agggcaccca gaccgccaag ctgaaggtga ccaagggtgg ccccctgccc 180
ttcgcctggg acatcctgtc ccctcagttc atgtacggct ccaaggccta cgtgaagcac 240
cccgccgaca tccccgacta cttgaagctg tccttccccg agggcttcaa gtgggagcgc 300
gtgatgaact tcgaggacgg cggcgtggtg accgtgaccc aggactcctc cctgcaggac 360
ggcgagttca tctacaaggt gaagctgcgc ggcaccaact tcccctccga cggccccgta 420
atgcagaaga agaccatggg ctgggaggcc tcctccgagc ggatgtaccc cgaggacggc 480
gccctgaagg gcgagatcaa gcagaggctg aagctgaagg acggcggcca ctacgacgct 540
gaggtcaaga ccacctacaa ggccaagaag cccgtgcagc tgcccggcgc ctacaacgtc 600
aacatcaagt tggacatcac ctcccacaac gaggactaca ccatcgtgga acagtacgaa 660
cgcgccgagg gccgccactc caccggcggc atggacgagc tgtacaag 708
<210> 30
<211> 236
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-CmCherry amino acid sequence of key ORF
<400> 30
Met Val Ser Lys Gly Glu Glu Asp Asn Met Ala Ile Ile Lys Glu Phe
1 5 10 15
Met Arg Phe Lys Val His Met Glu Gly Ser Val Asn Gly His Glu Phe
20 25 30
Glu Ile Glu Gly Glu Gly Glu Gly Arg Pro Tyr Glu Gly Thr Gln Thr
35 40 45
Ala Lys Leu Lys Val Thr Lys Gly Gly Pro Leu Pro Phe Ala Trp Asp
50 55 60
Ile Leu Ser Pro Gln Phe Met Tyr Gly Ser Lys Ala Tyr Val Lys His
65 70 75 80
Pro Ala Asp Ile Pro Asp Tyr Leu Lys Leu Ser Phe Pro Glu Gly Phe
85 90 95
Lys Trp Glu Arg Val Met Asn Phe Glu Asp Gly Gly Val Val Thr Val
100 105 110
Thr Gln Asp Ser Ser Leu Gln Asp Gly Glu Phe Ile Tyr Lys Val Lys
115 120 125
Leu Arg Gly Thr Asn Phe Pro Ser Asp Gly Pro Val Met Gln Lys Lys
130 135 140
Thr Met Gly Trp Glu Ala Ser Ser Glu Arg Met Tyr Pro Glu Asp Gly
145 150 155 160
Ala Leu Lys Gly Glu Ile Lys Gln Arg Leu Lys Leu Lys Asp Gly Gly
165 170 175
His Tyr Asp Ala Glu Val Lys Thr Thr Tyr Lys Ala Lys Lys Pro Val
180 185 190
Gln Leu Pro Gly Ala Tyr Asn Val Asn Ile Lys Leu Asp Ile Thr Ser
195 200 205
His Asn Glu Asp Tyr Thr Ile Val Glu Gln Tyr Glu Arg Ala Glu Gly
210 215 220
Arg His Ser Thr Gly Gly Met Asp Glu Leu Tyr Lys
225 230 235
<210> 31
<211> 2905
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CSII plasmid
<400> 31
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttcttcttg gtcacatcca caatttgaaa agtagccgag acgactgacc atttaaatca 1260
tacctgacct ccatagcaga aagtcaaaag cctccgaccg gaggcttttg acttgatcgg 1320
cacgtaagag gttccaactt tcaccataat gaaataagat cactaccggg cgtatttttt 1380
gagttatcga gattttcagg agctaaggaa gctaaaatga gccatattca acgggaaacg 1440
tcttgctcga ggccgcgatt aaattccaac atggatgctg atttatatgg gtataaatgg 1500
gctcgcgata atgtcgggca atcaggtgcg acaatctatc gattgtatgg gaagcccgat 1560
gcgccagagt tgtttctgaa acatggcaaa ggtagcgttg ccaatgatgt tacagatgag 1620
atggtcaggc taaactggct gacggaattt atgcctcttc cgaccatcaa gcattttatc 1680
cgtactcctg atgatgcatg gttactcacc actgcgatcc cagggaaaac agcattccag 1740
gtattagaag aatatcctga ttcaggtgaa aatattgttg atgcgctggc agtgttcctg 1800
cgccggttgc attcgattcc tgtttgtaat tgtcctttta acggcgatcg cgtatttcgt 1860
ctcgctcagg cgcaatcacg aatgaataac ggtttggttg gtgcgagtga ttttgatgac 1920
gagcgtaatg gctggcctgt tgaacaagtc tggaaagaaa tgcataagct tttgccattc 1980
tcaccggatt cagtcgtcac tcatggtgat ttctcacttg ataaccttat ttttgacgag 2040
gggaaattaa taggttgtat tgatgttgga cgagtcggaa tcgcagaccg ataccaggat 2100
cttgccatcc tatggaactg cctcggtgag ttttctcctt cattacagaa acggcttttt 2160
caaaaatatg gtattgataa tcctgatatg aataaattgc agtttcactt gatgctcgat 2220
gagtttttct aatgagggcc caaatgtaat cacctggctc accttcgggt gggcctttct 2280
gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgatgc 2340
tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga 2400
agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt 2460
ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg 2520
taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc 2580
gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg 2640
gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc 2700
ttgaagtggt ggcctaacta cggctacact agaagaacag tatttggtat ctgcgctctg 2760
ctgaagccag ttacctcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg 2820
ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc 2880
aagaagatcc tttgattttc taccg 2905
<210> 32
<211> 2899
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CHis6 plasmid
<400> 32
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttcttctca tcatcaccat caccattagc cgagacgact gaccatttaa atcatacctg 1260
acctccatag cagaaagtca aaagcctccg accggaggct tttgacttga tcggcacgta 1320
agaggttcca actttcacca taatgaaata agatcactac cgggcgtatt ttttgagtta 1380
tcgagatttt caggagctaa ggaagctaaa atgagccata ttcaacggga aacgtcttgc 1440
tcgaggccgc gattaaattc caacatggat gctgatttat atgggtataa atgggctcgc 1500
gataatgtcg ggcaatcagg tgcgacaatc tatcgattgt atgggaagcc cgatgcgcca 1560
gagttgtttc tgaaacatgg caaaggtagc gttgccaatg atgttacaga tgagatggtc 1620
aggctaaact ggctgacgga atttatgcct cttccgacca tcaagcattt tatccgtact 1680
cctgatgatg catggttact caccactgcg atcccaggga aaacagcatt ccaggtatta 1740
gaagaatatc ctgattcagg tgaaaatatt gttgatgcgc tggcagtgtt cctgcgccgg 1800
ttgcattcga ttcctgtttg taattgtcct tttaacggcg atcgcgtatt tcgtctcgct 1860
caggcgcaat cacgaatgaa taacggtttg gttggtgcga gtgattttga tgacgagcgt 1920
aatggctggc ctgttgaaca agtctggaaa gaaatgcata agcttttgcc attctcaccg 1980
gattcagtcg tcactcatgg tgatttctca cttgataacc ttatttttga cgaggggaaa 2040
ttaataggtt gtattgatgt tggacgagtc ggaatcgcag accgatacca ggatcttgcc 2100
atcctatgga actgcctcgg tgagttttct ccttcattac agaaacggct ttttcaaaaa 2160
tatggtattg ataatcctga tatgaataaa ttgcagtttc acttgatgct cgatgagttt 2220
ttctaatgag ggcccaaatg taatcacctg gctcaccttc gggtgggcct ttctgcgttg 2280
ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg atgctcaagt 2340
cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 2400
ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct 2460
tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc 2520
gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 2580
tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca 2640
gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 2700
tggtggccta actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag 2760
ccagttacct cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 2820
gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 2880
atcctttgat tttctaccg 2899
<210> 33
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CHis6 nucleotide sequence of key ORF
<400> 33
catcatcacc atcaccat 18
<210> 34
<211> 6
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-CHis6 amino acid sequence of key ORF
<400> 34
His His His His His His
1 5
<210> 35
<211> 2983
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CCCDiA plasmid
<400> 35
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttctggtgg tggttcaggt ggttctgaaa ttgcagcttt ggaaaaagaa aacgctgcct 1260
tggaacaaga aattgccgca ttagaacaag gtggtagtgg tggatctggt tagccgagac 1320
gactgaccat ttaaatcata cctgacctcc atagcagaaa gtcaaaagcc tccgaccgga 1380
ggcttttgac ttgatcggca cgtaagaggt tccaactttc accataatga aataagatca 1440
ctaccgggcg tattttttga gttatcgaga ttttcaggag ctaaggaagc taaaatgagc 1500
catattcaac gggaaacgtc ttgctcgagg ccgcgattaa attccaacat ggatgctgat 1560
ttatatgggt ataaatgggc tcgcgataat gtcgggcaat caggtgcgac aatctatcga 1620
ttgtatggga agcccgatgc gccagagttg tttctgaaac atggcaaagg tagcgttgcc 1680
aatgatgtta cagatgagat ggtcaggcta aactggctga cggaatttat gcctcttccg 1740
accatcaagc attttatccg tactcctgat gatgcatggt tactcaccac tgcgatccca 1800
gggaaaacag cattccaggt attagaagaa tatcctgatt caggtgaaaa tattgttgat 1860
gcgctggcag tgttcctgcg ccggttgcat tcgattcctg tttgtaattg tccttttaac 1920
ggcgatcgcg tatttcgtct cgcacaggcg caatcacgaa tgaataacgg tttggttggt 1980
gcgagtgatt ttgatgacga gcgtaatggc tggcctgttg aacaagtctg gaaagaaatg 2040
cataagcttt tgccattctc accggattca gtcgtcactc atggtgattt ctcacttgat 2100
aaccttattt ttgacgaggg gaaattaata ggttgtattg atgttggacg agtcggaatc 2160
gcagaccgat accaggatct tgccatccta tggaactgcc tcggtgagtt ttctccttca 2220
ttacagaaac ggctttttca aaaatatggt attgataatc ctgatatgaa taaattgcag 2280
tttcacttga tgctcgatga gtttttctaa tgagggccca aatgtaatca cctggctcac 2340
cttcgggtgg gcctttctgc gttgctggcg tttttccata ggctccgccc ccctgacgag 2400
catcacaaaa atcgatgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 2460
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 2520
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 2580
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 2640
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 2700
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 2760
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta 2820
tttggtatct gcgctctgct gaagccagtt acctcggaaa aagagttggt agctcttgat 2880
ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 2940
gcagaaaaaa aggatctcaa gaagatcctt tgattttcta ccg 2983
<210> 36
<211> 2983
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CCCDiB plasmid
<400> 36
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttctggtgg tggtagtggt ggttctaaaa ttgcagcatt gaaaaagaag aacgccgcct 1260
tgaaacaaaa aattgctgcc ttaaaacaag gtggtagtgg tggatctggt tagccgagac 1320
gactgaccat ttaaatcata cctgacctcc atagcagaaa gtcaaaagcc tccgaccgga 1380
ggcttttgac ttgatcggca cgtaagaggt tccaactttc accataatga aataagatca 1440
ctaccgggcg tattttttga gttatcgaga ttttcaggag ctaaggaagc taaaatgagc 1500
catattcaac gggaaacgtc ttgctcgagg ccgcgattaa attccaacat ggatgctgat 1560
ttatatgggt ataaatgggc tcgcgataat gtcgggcaat caggtgcgac aatctatcga 1620
ttgtatggga agcccgatgc gccagagttg tttctgaaac atggcaaagg tagcgttgcc 1680
aatgatgtta cagatgagat ggtcaggcta aactggctga cggaatttat gcctcttccg 1740
accatcaagc attttatccg tactcctgat gatgcatggt tactcaccac tgcgatccca 1800
gggaaaacag cattccaggt attagaagaa tatcctgatt caggtgaaaa tattgttgat 1860
gcgctggcag tgttcctgcg ccggttgcat tcgattcctg tttgtaattg tccttttaac 1920
ggcgatcgcg tatttcgtct cgcacaggcg caatcacgaa tgaataacgg tttggttggt 1980
gcgagtgatt ttgatgacga gcgtaatggc tggcctgttg aacaagtctg gaaagaaatg 2040
cataagcttt tgccattctc accggattca gtcgtcactc atggtgattt ctcacttgat 2100
aaccttattt ttgacgaggg gaaattaata ggttgtattg atgttggacg agtcggaatc 2160
gcagaccgat accaggatct tgccatccta tggaactgcc tcggtgagtt ttctccttca 2220
ttacagaaac ggctttttca aaaatatggt attgataatc ctgatatgaa taaattgcag 2280
tttcacttga tgctcgatga gtttttctaa tgagggccca aatgtaatca cctggctcac 2340
cttcgggtgg gcctttctgc gttgctggcg tttttccata ggctccgccc ccctgacgag 2400
catcacaaaa atcgatgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 2460
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 2520
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 2580
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 2640
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 2700
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 2760
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta 2820
tttggtatct gcgctctgct gaagccagtt acctcggaaa aagagttggt agctcttgat 2880
ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 2940
gcagaaaaaa aggatctcaa gaagatcctt tgattttcta ccg 2983
<210> 37
<211> 3136
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CSpyCatcher plasmid
<400> 37
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttctggtgg ttctgattct gctactcata ttaagttctc caagagggac gaagatggta 1260
aagaattggc tggtgcaact atggaattga gagattcttc tggtaagacc atttccacct 1320
ggatttctga tggtcaagtt aaggatttct acttgtaccc aggtaagtac actttcgttg 1380
aaactgctgc tccagatggt tatgaagttg ctactgctat tactttcacc gtcaatgaac 1440
aaggtcaagt cactgttaat ggttagccga gacgactgac catttaaatc atacctgacc 1500
tccatagcag aaagtcaaaa gcctccgacc ggaggctttt gacttgatcg gcacgtaaga 1560
ggttccaact ttcaccataa tgaaataaga tcactaccgg gcgtattttt tgagttatcg 1620
agattttcag gagctaagga agctaaaatg agccatattc aacgggaaac gtcttgctcg 1680
aggccgcgat taaattccaa catggatgct gatttatatg ggtataaatg ggctcgcgat 1740
aatgtcgggc aatcaggtgc gacaatctat cgattgtatg ggaagcccga tgcgccagag 1800
ttgtttctga aacatggcaa aggtagcgtt gccaatgatg ttacagatga gatggtcagg 1860
ctaaactggc tgacggaatt tatgcctctt ccgaccatca agcattttat ccgtactcct 1920
gatgatgcat ggttactcac cactgcgatc ccagggaaaa cagcattcca ggtattagaa 1980
gaatatcctg attcaggtga aaatattgtt gatgcgctgg cagtgttcct gcgccggttg 2040
cattcgattc ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg tctcgctcag 2100
gcgcaatcac gaatgaataa cggtttggtt ggtgcgagtg attttgatga cgagcgtaat 2160
ggctggcctg ttgaacaagt ctggaaagaa atgcataagc ttttgccatt ctcaccggat 2220
tcagtcgtca ctcatggtga tttctcactt gataacctta tttttgacga ggggaaatta 2280
ataggttgta ttgatgttgg acgagtcgga atcgcagacc gataccagga tcttgccatc 2340
ctatggaact gcctcggtga gttttctcct tcattacaga aacggctttt tcaaaaatat 2400
ggtattgata atcctgatat gaataaattg cagtttcact tgatgctcga tgagtttttc 2460
taatgagggc ccaaatgtaa tcacctggct caccttcggg tgggcctttc tgcgttgctg 2520
gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgatg ctcaagtcag 2580
aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc 2640
gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg 2700
ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt 2760
cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc 2820
ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc 2880
actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg 2940
tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca 3000
gttacctcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg 3060
gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc 3120
ctttgatttt ctaccg 3136
<210> 38
<211> 2926
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CSpyTag plasmid
<400> 38
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttctggtgg ttctgctcat atagttatgg ttgatgctta caagccaaca aaatagccga 1260
gacgactgac catttaaatc atacctgacc tccatagcag aaagtcaaaa gcctccgacc 1320
ggaggctttt gacttgatcg gcacgtaaga ggttccaact ttcaccataa tgaaataaga 1380
tcactaccgg gcgtattttt tgagttatcg agattttcag gagctaagga agctaaaatg 1440
agccatattc aacgggaaac gtcttgctcg aggccgcgat taaattccaa catggatgct 1500
gatttatatg ggtataaatg ggctcgcgat aatgtcgggc aatcaggtgc gacaatctat 1560
cgattgtatg ggaagcccga tgcgccagag ttgtttctga aacatggcaa aggtagcgtt 1620
gccaatgatg ttacagatga gatggtcagg ctaaactggc tgacggaatt tatgcctctt 1680
ccgaccatca agcattttat ccgtactcct gatgatgcat ggttactcac cactgcgatc 1740
ccagggaaaa cagcattcca ggtattagaa gaatatcctg attcaggtga aaatattgtt 1800
gatgcgctgg cagtgttcct gcgccggttg cattcgattc ctgtttgtaa ttgtcctttt 1860
aacggcgatc gcgtatttcg tctcgctcag gcgcaatcac gaatgaataa cggtttggtt 1920
ggtgcgagtg attttgatga cgagcgtaat ggctggcctg ttgaacaagt ctggaaagaa 1980
atgcataagc ttttgccatt ctcaccggat tcagtcgtca ctcatggtga tttctcactt 2040
gataacctta tttttgacga ggggaaatta ataggttgta ttgatgttgg acgagtcgga 2100
atcgcagacc gataccagga tcttgccatc ctatggaact gcctcggtga gttttctcct 2160
tcattacaga aacggctttt tcaaaaatat ggtattgata atcctgatat gaataaattg 2220
cagtttcact tgatgctcga tgagtttttc taatgagggc ccaaatgtaa tcacctggct 2280
caccttcggg tgggcctttc tgcgttgctg gcgtttttcc ataggctccg cccccctgac 2340
gagcatcaca aaaatcgatg ctcaagtcag aggtggcgaa acccgacagg actataaaga 2400
taccaggcgt ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt 2460
accggatacc tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca tagctcacgc 2520
tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc 2580
cccgttcagc ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta 2640
agacacgact tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat 2700
gtaggcggtg ctacagagtt cttgaagtgg tggcctaact acggctacac tagaagaaca 2760
gtatttggta tctgcgctct gctgaagcca gttacctcgg aaaaagagtt ggtagctctt 2820
gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta 2880
cgcgcagaaa aaaaggatct caagaagatc ctttgatttt ctaccg 2926
<210> 39
<211> 2944
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-S2CP plasmid
<400> 39
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctt 1200
ctaagattac tggttcttct ggtaacgata cccaaggttc tttgattact tactctggtg 1260
gtgctagagg ttagccgaga cgactgacca tttaaatcat acctgacctc catagcagaa 1320
agtcaaaagc ctccgaccgg aggcttttga cttgatcggc acgtaagagg ttccaacttt 1380
caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga 1440
gctaaggaag ctaaaatgag ccatattcaa cgggaaacgt cttgctcgag gccgcgatta 1500
aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 1560
tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt gtttctgaaa 1620
catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcaggct aaactggctg 1680
acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 1740
ttactcacca ctgcgatccc agggaaaaca gcattccagg tattagaaga atatcctgat 1800
tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 1860
gtttgtaatt gtccttttaa cggcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 1920
atgaataacg gtttggttgg tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 1980
gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc agtcgtcact 2040
catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 2100
gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 2160
ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 2220
cctgatatga ataaattgca gtttcacttg atgctcgatg agtttttcta atgagggccc 2280
aaatgtaatc acctggctca ccttcgggtg ggcctttctg cgttgctggc gtttttccat 2340
aggctccgcc cccctgacga gcatcacaaa aatcgatgct caagtcagag gtggcgaaac 2400
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 2460
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 2520
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 2580
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 2640
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 2700
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 2760
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt tacctcggaa 2820
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg 2880
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgattttct 2940
accg 2944
<210> 40
<211> 1864
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-Pcon3 plasmid
<400> 40
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctggcttcc caaccttacc agagggcgcc ccagctggca attccgacgt 120
cctgacagct agctcagtcc taggtataat gctagcgaat tcaaaagatc ttttaagaag 180
gagatataca tgatgcgaga cgactgacca tttaaatcat acctgacctc catagcagaa 240
agtcaaaagc ctccgaccgg aggcttttga cttgatcggc acgtaagagg ttccaacttt 300
caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga 360
gctaaggaag ctaaaatgag ccatattcaa cgggaaacgt cttgctcgag gccgcgatta 420
aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 480
tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt gtttctgaaa 540
catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcaggct aaactggctg 600
acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 660
ttactcacca ctgcgatccc agggaaaaca gcattccagg tattagaaga atatcctgat 720
tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 780
gtttgtaatt gtccttttaa cggcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 840
atgaataacg gtttggttgg tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 900
gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc agtcgtcact 960
catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 1020
gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 1080
ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 1140
cctgatatga ataaattgca gtttcacttg atgctcgatg agtttttcta atgagggccc 1200
aaatgtaatc acctggctca ccttcgggtg ggcctttctg cgttgctggc gtttttccat 1260
aggctccgcc cccctgacga gcatcacaaa aatcgatgct caagtcagag gtggcgaaac 1320
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1380
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1440
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1500
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1560
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1620
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1680
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt tacctcggaa 1740
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg 1800
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgattttct 1860
accg 1864
<210> 41
<211> 1864
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-Pcon4 plasmid
<400> 41
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctggcttcc caaccttacc agagggcgcc ccagctggca attccgacgt 120
ctttacggct agctcagtcc taggtactat gctagcgaat tcaaaagatc ttttaagaag 180
gagatataca tgatgcgaga cgactgacca tttaaatcat acctgacctc catagcagaa 240
agtcaaaagc ctccgaccgg aggcttttga cttgatcggc acgtaagagg ttccaacttt 300
caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga 360
gctaaggaag ctaaaatgag ccatattcaa cgggaaacgt cttgctcgag gccgcgatta 420
aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 480
tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt gtttctgaaa 540
catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcaggct aaactggctg 600
acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 660
ttactcacca ctgcgatccc agggaaaaca gcattccagg tattagaaga atatcctgat 720
tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 780
gtttgtaatt gtccttttaa cggcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 840
atgaataacg gtttggttgg tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 900
gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc agtcgtcact 960
catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 1020
gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 1080
ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 1140
cctgatatga ataaattgca gtttcacttg atgctcgatg agtttttcta atgagggccc 1200
aaatgtaatc acctggctca ccttcgggtg ggcctttctg cgttgctggc gtttttccat 1260
aggctccgcc cccctgacga gcatcacaaa aatcgatgct caagtcagag gtggcgaaac 1320
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1380
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1440
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1500
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1560
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1620
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1680
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt tacctcggaa 1740
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg 1800
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgattttct 1860
accg 1864
<210> 42
<211> 1864
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-Pcon5 plasmid
<400> 42
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctggcttcc caaccttacc agagggcgcc ccagctggca attccgacgt 120
ctttatggct agctcagtcc taggtacaat gctagcgaat tcaaaagatc ttttaagaag 180
gagatataca tgatgcgaga cgactgacca tttaaatcat acctgacctc catagcagaa 240
agtcaaaagc ctccgaccgg aggcttttga cttgatcggc acgtaagagg ttccaacttt 300
caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga 360
gctaaggaag ctaaaatgag ccatattcaa cgggaaacgt cttgctcgag gccgcgatta 420
aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 480
tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt gtttctgaaa 540
catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcaggct aaactggctg 600
acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 660
ttactcacca ctgcgatccc agggaaaaca gcattccagg tattagaaga atatcctgat 720
tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 780
gtttgtaatt gtccttttaa cggcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 840
atgaataacg gtttggttgg tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 900
gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc agtcgtcact 960
catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 1020
gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 1080
ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 1140
cctgatatga ataaattgca gtttcacttg atgctcgatg agtttttcta atgagggccc 1200
aaatgtaatc acctggctca ccttcgggtg ggcctttctg cgttgctggc gtttttccat 1260
aggctccgcc cccctgacga gcatcacaaa aatcgatgct caagtcagag gtggcgaaac 1320
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1380
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1440
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1500
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1560
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1620
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1680
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt tacctcggaa 1740
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg 1800
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgattttct 1860
accg 1864
<210> 43
<211> 3301
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-LacI+PT7 plasmid
<400> 43
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gcttcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta 120
atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgccagg gtggtttttc 180
ttttcaccag tgaaacgggc aacagctgat tgcccttcac cgcctggccc tgagagagtt 240
gcagcaagcg gtccacgctg gtttgcccca gcaggcgaaa atcctgtttg atggtggtta 300
acggcgggat ataacatgag ctgtcttcgg tatcgtcgta tcccactacc gagatatccg 360
caccaacgcg cagcccggac tcggtaatgg cgcgcattgc gcccagcgcc atctgatcgt 420
tggcaaccag catcgcagtg ggaacgatgc cctcattcag catttgcatg gtttgttgaa 480
aaccggacat ggcactccag tcgccttccc gttccgctat cggctgaatt tgattgcgag 540
tgagatattt atgccagcca gccagacgca gacgcgccga gacagaactt aatgggcccg 600
ctaacagcgc gatttgctgg tgacccaatg cgaccagatg ctccacgccc agtcgcgtac 660
cgtcttcatg ggagaaaata atactgttga tgggtgtctg gtcagagaca tcaagaaata 720
acgccggaac attagtgcag gcagcttcca cagcaatggc atcctggtca tccagcggat 780
agttaatgat cagcccactg acgcgttgcg cgagaagatt gtgcaccgcc gctttacagg 840
cttcgacgcc gcttcgttct accatcgaca ccaccacgct ggcacccagt tgatcggcgc 900
gagatttaat cgccgcgaca atttgcgacg gcgcgtgcag ggccagactg gaggtggcaa 960
cgccaatcag caacgactgt ttgcccgcca gttgttgtgc cacgcggttg ggaatgtaat 1020
tcagctccgc catcgccgct tccacttttt cccgcgtttt cgcagaaacg tggctggcct 1080
ggttcaccac gcgggaaacg gtctgataag agacaccggc atactctgcg acatcgtata 1140
acgttactgg tttcacattc accaccctga attgactctc ttccgggcgc tatcatgcca 1200
taccgcgaaa ggttttgcgc cattcgatgg tgtccgggat ctcgacgctc tcccttatgc 1260
gactcctgca ttaggaagca gcccagtagt aggttgaggc cgttgagcac cgccgccgca 1320
aggaatggtg catgcaagga gatggcgccc aacagtcccc cggccacggg gcctgccacc 1380
atacccacgc cgaaacaagc gctcatgagc ccgaagtggc gagcccgatc ttccccatcg 1440
gtgatgtcgg cgatataggc gccagcaacc gcacctgtgg cgccggtgat gccggccacg 1500
atgcgtccgg cgtagaggat cgagatctcg atcccgcgaa attaatacga ctcactatag 1560
gggaattgtg agcggataac aattcccctc tagaaataat tttgtttaac tttaagaagg 1620
agatatacga tgcgagacga ctgaccattt aaatcatacc tgacctccat agcagaaagt 1680
caaaagcctc cgaccggagg cttttgactt gatcggcacg taagaggttc caactttcac 1740
cataatgaaa taagatcact accgggcgta ttttttgagt tatcgagatt ttcaggagct 1800
aaggaagcta aaatgagcca tattcaacgg gaaacgtctt gctcgaggcc gcgattaaat 1860
tccaacatgg atgctgattt atatgggtat aaatgggctc gcgataatgt cgggcaatca 1920
ggtgcgacaa tctatcgatt gtatgggaag cccgatgcgc cagagttgtt tctgaaacat 1980
ggcaaaggta gcgttgccaa tgatgttaca gatgagatgg tcaggctaaa ctggctgacg 2040
gaatttatgc ctcttccgac catcaagcat tttatccgta ctcctgatga tgcatggtta 2100
ctcaccactg cgatcccagg gaaaacagca ttccaggtat tagaagaata tcctgattca 2160
ggtgaaaata ttgttgatgc gctggcagtg ttcctgcgcc ggttgcattc gattcctgtt 2220
tgtaattgtc cttttaacgg cgatcgcgta tttcgtctcg ctcaggcgca atcacgaatg 2280
aataacggtt tggttggtgc gagtgatttt gatgacgagc gtaatggctg gcctgttgaa 2340
caagtctgga aagaaatgca taagcttttg ccattctcac cggattcagt cgtcactcat 2400
ggtgatttct cacttgataa ccttattttt gacgagggga aattaatagg ttgtattgat 2460
gttggacgag tcggaatcgc agaccgatac caggatcttg ccatcctatg gaactgcctc 2520
ggtgagtttt ctccttcatt acagaaacgg ctttttcaaa aatatggtat tgataatcct 2580
gatatgaata aattgcagtt tcacttgatg ctcgatgagt ttttctaatg agggcccaaa 2640
tgtaatcacc tggctcacct tcgggtgggc ctttctgcgt tgctggcgtt tttccatagg 2700
ctccgccccc ctgacgagca tcacaaaaat cgatgctcaa gtcagaggtg gcgaaacccg 2760
acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt 2820
ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt 2880
tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc 2940
tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt 3000
gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt 3060
agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc 3120
tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac ctcggaaaaa 3180
gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt 3240
gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg attttctacc 3300
g 3301
<210> 44
<211> 2499
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-His6-meGFP plasmid
<400> 44
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgggttctt ctcatcatca ccatcaccat tcttctggga tgtctaaagg 120
tgaagaatta ttcactggtg ttgtcccaat tttggttgaa ttagatggtg atgttaatgg 180
tcacaaattt tctgtctccg gtgaaggtga aggtgatgct acttacggta aattgacctt 240
aaaatttatt tgtactactg gtaaattgcc agttccatgg ccaaccttag tcactacttt 300
aacttatggt gttcaatgtt tttctagata cccagatcat atgaaacaac atgacttttt 360
caagtctgcc atgccagaag gttatgttca agaaagaact atttttttca aagatgacgg 420
taactacaag accagagctg aagtcaagtt tgaaggtgat accttagtta atagaatcga 480
attaaaaggt attgatttta aagaagatgg taacatttta ggtcacaaat tggaatacaa 540
ctataactct cacaatgttt acatcatggc tgacaaacaa aagaatggta tcaaagttaa 600
cttcaaaatt agacacaaca ttgaagatgg ttctgttcaa ttagctgacc attatcaaca 660
aaatactcca attggtgatg gtccagtctt gttaccagac aaccattact tatccactca 720
atctaaatta tccaaagatc caaacgaaaa gagagatcac atggtcttgt tagaatttgt 780
tactgctgct ggtattaccc atggtatgga tgaattgtac aaataatagc cgagacgact 840
gaccatttaa atcatacctg acctccatag cagaaagtca aaagcctccg accggaggct 900
tttgacttga tcggcacgta agaggttcca actttcacca taatgaaata agatcactac 960
cgggcgtatt ttttgagtta tcgagatttt caggagctaa ggaagctaaa atgagccata 1020
ttcaacggga aacgtcttgc tcgaggccgc gattaaattc caacatggat gctgatttat 1080
atgggtataa atgggctcgc gataatgtcg ggcaatcagg tgcgacaatc tatcgattgt 1140
atgggaagcc cgatgcgcca gagttgtttc tgaaacatgg caaaggtagc gttgccaatg 1200
atgttacaga tgagatggtc aggctaaact ggctgacgga atttatgcct cttccgacca 1260
tcaagcattt tatccgtact cctgatgatg catggttact caccactgcg atcccaggga 1320
aaacagcatt ccaggtatta gaagaatatc ctgattcagg tgaaaatatt gttgatgcgc 1380
tggcagtgtt cctgcgccgg ttgcattcga ttcctgtttg taattgtcct tttaacggcg 1440
atcgcgtatt tcgtctcgct caggcgcaat cacgaatgaa taacggtttg gttggtgcga 1500
gtgattttga tgacgagcgt aatggctggc ctgttgaaca agtctggaaa gaaatgcata 1560
agcttttgcc attctcaccg gattcagtcg tcactcatgg tgatttctca cttgataacc 1620
ttatttttga cgaggggaaa ttaataggtt gtattgatgt tggacgagtc ggaatcgcag 1680
accgatacca ggatcttgcc atcctatgga actgcctcgg tgagttttct ccttcattac 1740
agaaacggct ttttcaaaaa tatggtattg ataatcctga tatgaataaa ttgcagtttc 1800
acttgatgct cgatgagttt ttctaatgag ggcccaaatg taatcacctg gctcaccttc 1860
gggtgggcct ttctgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 1920
acaaaaatcg atgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 1980
cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 2040
acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 2100
atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 2160
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 2220
acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 2280
gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg 2340
gtatctgcgc tctgctgaag ccagttacct cggaaaaaga gttggtagct cttgatccgg 2400
caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 2460
aaaaaaagga tctcaagaag atcctttgat tttctaccg 2499
<210> 45
<211> 753
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-His6-meGFP nucleotide sequence of key ORF
<400> 45
atgggttctt ctcatcatca ccatcaccat tcttctggga tgtctaaagg tgaagaatta 60
ttcactggtg ttgtcccaat tttggttgaa ttagatggtg atgttaatgg tcacaaattt 120
tctgtctccg gtgaaggtga aggtgatgct acttacggta aattgacctt aaaatttatt 180
tgtactactg gtaaattgcc agttccatgg ccaaccttag tcactacttt aacttatggt 240
gttcaatgtt tttctagata cccagatcat atgaaacaac atgacttttt caagtctgcc 300
atgccagaag gttatgttca agaaagaact atttttttca aagatgacgg taactacaag 360
accagagctg aagtcaagtt tgaaggtgat accttagtta atagaatcga attaaaaggt 420
attgatttta aagaagatgg taacatttta ggtcacaaat tggaatacaa ctataactct 480
cacaatgttt acatcatggc tgacaaacaa aagaatggta tcaaagttaa cttcaaaatt 540
agacacaaca ttgaagatgg ttctgttcaa ttagctgacc attatcaaca aaatactcca 600
attggtgatg gtccagtctt gttaccagac aaccattact tatccactca atctaaatta 660
tccaaagatc caaacgaaaa gagagatcac atggtcttgt tagaatttgt tactgctgct 720
ggtattaccc atggtatgga tgaattgtac aaa 753
<210> 46
<211> 251
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-His6-meGFP amino acid sequence of key ORF
<400> 46
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Ser Lys
1 5 10 15
Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp
20 25 30
Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly
35 40 45
Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly
50 55 60
Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly
65 70 75 80
Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln His Asp Phe
85 90 95
Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe
100 105 110
Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu
115 120 125
Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys
130 135 140
Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser
145 150 155 160
His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val
165 170 175
Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala
180 185 190
Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu
195 200 205
Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Lys Leu Ser Lys Asp Pro
210 215 220
Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala
225 230 235 240
Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
245 250
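As an illustrative check that is not part of the original listing, the amino acid sequence of SEQ ID NO: 46 can be related to the nucleotide sequence of SEQ ID NO: 45 by translation. The sketch below assumes the standard genetic code; the helper name `translate` and the variable `orf_start` are hypothetical, and only the first 50 nucleotides of SEQ ID NO: 45 are used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (assumption: the ORF is read with the standard code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring whitespace and any trailing partial codon."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 50 nucleotides of SEQ ID NO: 45, copied with the listing's spacing.
orf_start = "atgggttctt ctcatcatca ccatcaccat tcttctggga tgtctaaagg"
print(translate(orf_start))  # MGSSHHHHHHSSGMSK
```

The one-letter result corresponds to residues 1 to 16 of SEQ ID NO: 46 (Met Gly Ser Ser His His His His His His Ser Ser Gly Met Ser Lys).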
<210> 47
<211> 2658
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP-S2CP plasmid
<400> 47
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgctgttta tcaaacctgc cgatctgcgt gaaattgtta cctttccgct 120
gtttagcgat ctggttcagt gtggttttcc gagtccggca gcagattatg ttgaacagcg 180
tattgatctg agcagcggta tgtctaaagg tgaagaatta ttcactggtg ttgtcccaat 240
tttggttgaa ttagatggtg atgttaatgg tcacaaattt tctgtctccg gtgaaggtga 300
aggtgatgct acttacggta aattgacctt aaaatttatt tgtactactg gtaaattgcc 360
agttccatgg ccaaccttag tcactacttt aacttatggt gttcaatgtt tttctagata 420
cccagatcat atgaaacaac atgacttttt caagtctgcc atgccagaag gttatgttca 480
agaaagaact atttttttca aagatgacgg taactacaag accagagctg aagtcaagtt 540
tgaaggtgat accttagtta atagaatcga attaaaaggt attgatttta aagaagatgg 600
taacatttta ggtcacaaat tggaatacaa ctataactct cacaatgttt acatcatggc 660
tgacaaacaa aagaatggta tcaaagttaa cttcaaaatt agacacaaca ttgaagatgg 720
ttctgttcaa ttagctgacc attatcaaca aaatactcca attggtgatg gtccagtctt 780
gttaccagac aaccattact tatccactca atctaaatta tccaaagatc caaacgaaaa 840
gagagatcac atggtcttgt tagaatttgt tactgctgct ggtattaccc atggtatgga 900
tgaattgtac aaatctaaga ttactggttc ttctggtaac gatacccaag gttctttgat 960
tacttactct ggtggtgcta gaggttagcc gagacgactg accatttaaa tcatacctga 1020
cctccatagc agaaagtcaa aagcctccga ccggaggctt ttgacttgat cggcacgtaa 1080
gaggttccaa ctttcaccat aatgaaataa gatcactacc gggcgtattt tttgagttat 1140
cgagattttc aggagctaag gaagctaaaa tgagccatat tcaacgggaa acgtcttgct 1200
cgaggccgcg attaaattcc aacatggatg ctgatttata tgggtataaa tgggctcgcg 1260
ataatgtcgg gcaatcaggt gcgacaatct atcgattgta tgggaagccc gatgcgccag 1320
agttgtttct gaaacatggc aaaggtagcg ttgccaatga tgttacagat gagatggtca 1380
ggctaaactg gctgacggaa tttatgcctc ttccgaccat caagcatttt atccgtactc 1440
ctgatgatgc atggttactc accactgcga tcccagggaa aacagcattc caggtattag 1500
aagaatatcc tgattcaggt gaaaatattg ttgatgcgct ggcagtgttc ctgcgccggt 1560
tgcattcgat tcctgtttgt aattgtcctt ttaacggcga tcgcgtattt cgtctcgctc 1620
aggcgcaatc acgaatgaat aacggtttgg ttggtgcgag tgattttgat gacgagcgta 1680
atggctggcc tgttgaacaa gtctggaaag aaatgcataa gcttttgcca ttctcaccgg 1740
attcagtcgt cactcatggt gatttctcac ttgataacct tatttttgac gaggggaaat 1800
taataggttg tattgatgtt ggacgagtcg gaatcgcaga ccgataccag gatcttgcca 1860
tcctatggaa ctgcctcggt gagttttctc cttcattaca gaaacggctt tttcaaaaat 1920
atggtattga taatcctgat atgaataaat tgcagtttca cttgatgctc gatgagtttt 1980
tctaatgagg gcccaaatgt aatcacctgg ctcaccttcg ggtgggcctt tctgcgttgc 2040
tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga tgctcaagtc 2100
agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 2160
tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 2220
cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 2280
ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 2340
ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 2400
ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 2460
ggtggcctaa ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc 2520
cagttacctc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag 2580
cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga 2640
tcctttgatt ttctaccg 2658
<210> 48
<211> 915
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP-S2CP nucleotide sequence of key ORF
<400> 48
atgctgttta tcaaacctgc cgatctgcgt gaaattgtta cctttccgct gtttagcgat 60
ctggttcagt gtggttttcc gagtccggca gcagattatg ttgaacagcg tattgatctg 120
agcagcggta tgtctaaagg tgaagaatta ttcactggtg ttgtcccaat tttggttgaa 180
ttagatggtg atgttaatgg tcacaaattt tctgtctccg gtgaaggtga aggtgatgct 240
acttacggta aattgacctt aaaatttatt tgtactactg gtaaattgcc agttccatgg 300
ccaaccttag tcactacttt aacttatggt gttcaatgtt tttctagata cccagatcat 360
atgaaacaac atgacttttt caagtctgcc atgccagaag gttatgttca agaaagaact 420
atttttttca aagatgacgg taactacaag accagagctg aagtcaagtt tgaaggtgat 480
accttagtta atagaatcga attaaaaggt attgatttta aagaagatgg taacatttta 540
ggtcacaaat tggaatacaa ctataactct cacaatgttt acatcatggc tgacaaacaa 600
aagaatggta tcaaagttaa cttcaaaatt agacacaaca ttgaagatgg ttctgttcaa 660
ttagctgacc attatcaaca aaatactcca attggtgatg gtccagtctt gttaccagac 720
aaccattact tatccactca atctaaatta tccaaagatc caaacgaaaa gagagatcac 780
atggtcttgt tagaatttgt tactgctgct ggtattaccc atggtatgga tgaattgtac 840
aaatctaaga ttactggttc ttctggtaac gatacccaag gttctttgat tacttactct 900
ggtggtgcta gaggt 915
<210> 49
<211> 305
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP-S2CP amino acid sequence of key ORF
<400> 49
Met Leu Phe Ile Lys Pro Ala Asp Leu Arg Glu Ile Val Thr Phe Pro
1 5 10 15
Leu Phe Ser Asp Leu Val Gln Cys Gly Phe Pro Ser Pro Ala Ala Asp
20 25 30
Tyr Val Glu Gln Arg Ile Asp Leu Ser Ser Gly Met Ser Lys Gly Glu
35 40 45
Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly Asp
50 55 60
Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly Asp Ala
65 70 75 80
Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu
85 90 95
Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly Val Gln
100 105 110
Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln His Asp Phe Phe Lys
115 120 125
Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe Phe Lys
130 135 140
Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp
145 150 155 160
Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp
165 170 175
Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser His Asn
180 185 190
Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val Asn Phe
195 200 205
Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala Asp His
210 215 220
Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp
225 230 235 240
Asn His Tyr Leu Ser Thr Gln Ser Lys Leu Ser Lys Asp Pro Asn Glu
245 250 255
Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile
260 265 270
Thr His Gly Met Asp Glu Leu Tyr Lys Ser Lys Ile Thr Gly Ser Ser
275 280 285
Gly Asn Asp Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly Gly Ala Arg
290 295 300
Gly
305
<210> 50
<211> 2586
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP plasmid
<400> 50
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgctgttta tcaaacctgc cgatctgcgt gaaattgtta cctttccgct 120
gtttagcgat ctggttcagt gtggttttcc gagtccggca gcagattatg ttgaacagcg 180
tattgatctg agcagcggta tgtctaaagg tgaagaatta ttcactggtg ttgtcccaat 240
tttggttgaa ttagatggtg atgttaatgg tcacaaattt tctgtctccg gtgaaggtga 300
aggtgatgct acttacggta aattgacctt aaaatttatt tgtactactg gtaaattgcc 360
agttccatgg ccaaccttag tcactacttt aacttatggt gttcaatgtt tttctagata 420
cccagatcat atgaaacaac atgacttttt caagtctgcc atgccagaag gttatgttca 480
agaaagaact atttttttca aagatgacgg taactacaag accagagctg aagtcaagtt 540
tgaaggtgat accttagtta atagaatcga attaaaaggt attgatttta aagaagatgg 600
taacatttta ggtcacaaat tggaatacaa ctataactct cacaatgttt acatcatggc 660
tgacaaacaa aagaatggta tcaaagttaa cttcaaaatt agacacaaca ttgaagatgg 720
ttctgttcaa ttagctgacc attatcaaca aaatactcca attggtgatg gtccagtctt 780
gttaccagac aaccattact tatccactca atctaaatta tccaaagatc caaacgaaaa 840
gagagatcac atggtcttgt tagaatttgt tactgctgct ggtattaccc atggtatgga 900
tgaattgtac aaatagccga gacgactgac catttaaatc atacctgacc tccatagcag 960
aaagtcaaaa gcctccgacc ggaggctttt gacttgatcg gcacgtaaga ggttccaact 1020
ttcaccataa tgaaataaga tcactaccgg gcgtattttt tgagttatcg agattttcag 1080
gagctaagga agctaaaatg agccatattc aacgggaaac gtcttgctcg aggccgcgat 1140
taaattccaa catggatgct gatttatatg ggtataaatg ggctcgcgat aatgtcgggc 1200
aatcaggtgc gacaatctat cgattgtatg ggaagcccga tgcgccagag ttgtttctga 1260
aacatggcaa aggtagcgtt gccaatgatg ttacagatga gatggtcagg ctaaactggc 1320
tgacggaatt tatgcctctt ccgaccatca agcattttat ccgtactcct gatgatgcat 1380
ggttactcac cactgcgatc ccagggaaaa cagcattcca ggtattagaa gaatatcctg 1440
attcaggtga aaatattgtt gatgcgctgg cagtgttcct gcgccggttg cattcgattc 1500
ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg tctcgctcag gcgcaatcac 1560
gaatgaataa cggtttggtt ggtgcgagtg attttgatga cgagcgtaat ggctggcctg 1620
ttgaacaagt ctggaaagaa atgcataagc ttttgccatt ctcaccggat tcagtcgtca 1680
ctcatggtga tttctcactt gataacctta tttttgacga ggggaaatta ataggttgta 1740
ttgatgttgg acgagtcgga atcgcagacc gataccagga tcttgccatc ctatggaact 1800
gcctcggtga gttttctcct tcattacaga aacggctttt tcaaaaatat ggtattgata 1860
atcctgatat gaataaattg cagtttcact tgatgctcga tgagtttttc taatgagggc 1920
ccaaatgtaa tcacctggct caccttcggg tgggcctttc tgcgttgctg gcgtttttcc 1980
ataggctccg cccccctgac gagcatcaca aaaatcgatg ctcaagtcag aggtggcgaa 2040
acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc 2100
ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 2160
cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 2220
tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 2280
gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca 2340
ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 2400
acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca gttacctcgg 2460
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 2520
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatttt 2580
ctaccg 2586
<210> 51
<211> 843
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP nucleotide sequence of key ORF
<400> 51
atgctgttta tcaaacctgc cgatctgcgt gaaattgtta cctttccgct gtttagcgat 60
ctggttcagt gtggttttcc gagtccggca gcagattatg ttgaacagcg tattgatctg 120
agcagcggta tgtctaaagg tgaagaatta ttcactggtg ttgtcccaat tttggttgaa 180
ttagatggtg atgttaatgg tcacaaattt tctgtctccg gtgaaggtga aggtgatgct 240
acttacggta aattgacctt aaaatttatt tgtactactg gtaaattgcc agttccatgg 300
ccaaccttag tcactacttt aacttatggt gttcaatgtt tttctagata cccagatcat 360
atgaaacaac atgacttttt caagtctgcc atgccagaag gttatgttca agaaagaact 420
atttttttca aagatgacgg taactacaag accagagctg aagtcaagtt tgaaggtgat 480
accttagtta atagaatcga attaaaaggt attgatttta aagaagatgg taacatttta 540
ggtcacaaat tggaatacaa ctataactct cacaatgttt acatcatggc tgacaaacaa 600
aagaatggta tcaaagttaa cttcaaaatt agacacaaca ttgaagatgg ttctgttcaa 660
ttagctgacc attatcaaca aaatactcca attggtgatg gtccagtctt gttaccagac 720
aaccattact tatccactca atctaaatta tccaaagatc caaacgaaaa gagagatcac 780
atggtcttgt tagaatttgt tactgctgct ggtattaccc atggtatgga tgaattgtac 840
aaa 843
<210> 52
<211> 281
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-UmuD1-40-meGFP amino acid sequence of key ORF
<400> 52
Met Leu Phe Ile Lys Pro Ala Asp Leu Arg Glu Ile Val Thr Phe Pro
1 5 10 15
Leu Phe Ser Asp Leu Val Gln Cys Gly Phe Pro Ser Pro Ala Ala Asp
20 25 30
Tyr Val Glu Gln Arg Ile Asp Leu Ser Ser Gly Met Ser Lys Gly Glu
35 40 45
Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly Asp
50 55 60
Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly Asp Ala
65 70 75 80
Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu
85 90 95
Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly Val Gln
100 105 110
Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln His Asp Phe Phe Lys
115 120 125
Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe Phe Lys
130 135 140
Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp
145 150 155 160
Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp
165 170 175
Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser His Asn
180 185 190
Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val Asn Phe
195 200 205
Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala Asp His
210 215 220
Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp
225 230 235 240
Asn His Tyr Leu Ser Thr Gln Ser Lys Leu Ser Lys Asp Pro Asn Glu
245 250 255
Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile
260 265 270
Thr His Gly Met Asp Glu Leu Tyr Lys
275 280
<210> 53
<211> 2037
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CsoS1A plasmid
<400> 53
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atggctgatg ttactggtat tgctttgggt atgattgaaa ctagaggttt 120
ggttccagct atcgaagctg ctgacgctat gaccaaggcc gctgaagtca gattggtcgg 180
tagacaattt gttggaggtg gttacgtcac tgttttggtt cgtggtgaaa ccggtgccgt 240
taacgctgct gttagagctg gtgctgatgc ttgtgaaaga gttggtgacg gtttagttgc 300
tgcccacatt attgccagag tccactctga agttgaaaac attttgccaa aggctccaca 360
ggcttagccg agacgactga ccatttaaat catacctgac ctccatagca gaaagtcaaa 420
agcctccgac cggaggcttt tgacttgatc ggcacgtaag aggttccaac tttcaccata 480
atgaaataag atcactaccg ggcgtatttt ttgagttatc gagattttca ggagctaagg 540
aagctaaaat gagccatatt caacgggaaa cgtcttgctc gaggccgcga ttaaattcca 600
acatggatgc tgatttatat gggtataaat gggctcgcga taatgtcggg caatcaggtg 660
cgacaatcta tcgattgtat gggaagcccg atgcgccaga gttgtttctg aaacatggca 720
aaggtagcgt tgccaatgat gttacagatg agatggtcag gctaaactgg ctgacggaat 780
ttatgcctct tccgaccatc aagcatttta tccgtactcc tgatgatgca tggttactca 840
ccactgcgat cccagggaaa acagcattcc aggtattaga agaatatcct gattcaggtg 900
aaaatattgt tgatgcgctg gcagtgttcc tgcgccggtt gcattcgatt cctgtttgta 960
attgtccttt taacggcgat cgcgtatttc gtctcgctca ggcgcaatca cgaatgaata 1020
acggtttggt tggtgcgagt gattttgatg acgagcgtaa tggctggcct gttgaacaag 1080
tctggaaaga aatgcataag cttttgccat tctcaccgga ttcagtcgtc actcatggtg 1140
atttctcact tgataacctt atttttgacg aggggaaatt aataggttgt attgatgttg 1200
gacgagtcgg aatcgcagac cgataccagg atcttgccat cctatggaac tgcctcggtg 1260
agttttctcc ttcattacag aaacggcttt ttcaaaaata tggtattgat aatcctgata 1320
tgaataaatt gcagtttcac ttgatgctcg atgagttttt ctaatgaggg cccaaatgta 1380
atcacctggc tcaccttcgg gtgggccttt ctgcgttgct ggcgtttttc cataggctcc 1440
gcccccctga cgagcatcac aaaaatcgat gctcaagtca gaggtggcga aacccgacag 1500
gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga 1560
ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc 1620
atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg 1680
tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt 1740
ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca 1800
gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca 1860
ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttacctcg gaaaaagagt 1920
tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa 1980
gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgattt tctaccg 2037
<210> 54
<211> 1992
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-CsoS4A plasmid
<400> 54
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgaagatca tgcaagttga aaagactttg gtttctacca acagaattgc 120
tgatatgggt cacaagccat tgttggttgt ttgggaaaaa cctggtgctc caagacaagt 180
tgctgttgat gctattggtt gtattccagg tgactgggtt ttgtgtgttg gttcttctgc 240
tgccagagaa gctgctggtt ccaagtctta cccatctgat ttgactatca tcggtattat 300
tgaccaatgg aacggtgaat agccgagacg actgaccatt taaatcatac ctgacctcca 360
tagcagaaag tcaaaagcct ccgaccggag gcttttgact tgatcggcac gtaagaggtt 420
ccaactttca ccataatgaa ataagatcac taccgggcgt attttttgag ttatcgagat 480
tttcaggagc taaggaagct aaaatgagcc atattcaacg ggaaacgtct tgctcgaggc 540
cgcgattaaa ttccaacatg gatgctgatt tatatgggta taaatgggct cgcgataatg 600
tcgggcaatc aggtgcgaca atctatcgat tgtatgggaa gcccgatgcg ccagagttgt 660
ttctgaaaca tggcaaaggt agcgttgcca atgatgttac agatgagatg gtcaggctaa 720
actggctgac ggaatttatg cctcttccga ccatcaagca ttttatccgt actcctgatg 780
atgcatggtt actcaccact gcgatcccag ggaaaacagc attccaggta ttagaagaat 840
atcctgattc aggtgaaaat attgttgatg cgctggcagt gttcctgcgc cggttgcatt 900
cgattcctgt ttgtaattgt ccttttaacg gcgatcgcgt atttcgtctc gctcaggcgc 960
aatcacgaat gaataacggt ttggttggtg cgagtgattt tgatgacgag cgtaatggct 1020
ggcctgttga acaagtctgg aaagaaatgc ataagctttt gccattctca ccggattcag 1080
tcgtcactca tggtgatttc tcacttgata accttatttt tgacgagggg aaattaatag 1140
gttgtattga tgttggacga gtcggaatcg cagaccgata ccaggatctt gccatcctat 1200
ggaactgcct cggtgagttt tctccttcat tacagaaacg gctttttcaa aaatatggta 1260
ttgataatcc tgatatgaat aaattgcagt ttcacttgat gctcgatgag tttttctaat 1320
gagggcccaa atgtaatcac ctggctcacc ttcgggtggg cctttctgcg ttgctggcgt 1380
ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgatgctca agtcagaggt 1440
ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 1500
gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa 1560
gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct 1620
ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta 1680
actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg 1740
gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc 1800
ctaactacgg ctacactaga agaacagtat ttggtatctg cgctctgctg aagccagtta 1860
cctcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 1920
tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 1980
gattttctac cg 1992
<210> 55
<211> 1896
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_T-TT7 plasmid
<400> 55
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgt agctaacaaa gcccgaaagg aagctgagtt ggctgctgcc accgctgagc 120
aataactagc ataacccctt ggggcctcta aacgggtctt gaggggtttt ttgctgaaag 180
gaggaactat atccggatat cccgcaagag gcccggcagt acccctccga gacgactgac 240
catttaaatc atacctgacc tccatagcag aaagtcaaaa gcctccgacc ggaggctttt 300
gacttgatcg gcacgtaaga ggttccaact ttcaccataa tgaaataaga tcactaccgg 360
gcgtattttt tgagttatcg agattttcag gagctaagga agctaaaatg agccatattc 420
aacgggaaac gtcttgctcg aggccgcgat taaattccaa catggatgct gatttatatg 480
ggtataaatg ggctcgcgat aatgtcgggc aatcaggtgc gacaatctat cgattgtatg 540
ggaagcccga tgcgccagag ttgtttctga aacatggcaa aggtagcgtt gccaatgatg 600
ttacagatga gatggtcagg ctaaactggc tgacggaatt tatgcctctt ccgaccatca 660
agcattttat ccgtactcct gatgatgcat ggttactcac cactgcgatc ccagggaaaa 720
cagcattcca ggtattagaa gaatatcctg attcaggtga aaatattgtt gatgcgctgg 780
cagtgttcct gcgccggttg cattcgattc ctgtttgtaa ttgtcctttt aacggcgatc 840
gcgtatttcg tctcgctcag gcgcaatcac gaatgaataa cggtttggtt ggtgcgagtg 900
attttgatga cgagcgtaat ggctggcctg ttgaacaagt ctggaaagaa atgcataagc 960
ttttgccatt ctcaccggat tcagtcgtca ctcatggtga tttctcactt gataacctta 1020
tttttgacga ggggaaatta ataggttgta ttgatgttgg acgagtcgga atcgcagacc 1080
gataccagga tcttgccatc ctatggaact gcctcggtga gttttctcct tcattacaga 1140
aacggctttt tcaaaaatat ggtattgata atcctgatat gaataaattg cagtttcact 1200
tgatgctcga tgagtttttc taatgagggc ccaaatgtaa tcacctggct caccttcggg 1260
tgggcctttc tgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca 1320
aaaatcgatg ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt 1380
ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc 1440
tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc 1500
tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc 1560
ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact 1620
tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg 1680
ctacagagtt cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta 1740
tctgcgctct gctgaagcca gttacctcgg aaaaagagtt ggtagctctt gatccggcaa 1800
acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 1860
aaaaggatct caagaagatc ctttgatttt ctaccg 1896
<210> 56
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES1 plasmid
<400> 56
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctctacctgg ctagagacgg caatacgcaa accgcctctc 1920
cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctct 2940
gag 2943
<210> 57
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES2 plasmid
<400> 57
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctctacctgg ctagagacgg caatacgcaa accgcctctc 1920
cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctca 2940
ggc 2943
<210> 58
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES3 plasmid
<400> 58
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctctaggcgg ctagagacgg caatacgcaa accgcctcta 1920
gccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctct 2940
gag 2943
<210> 59
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES4 plasmid
<400> 59
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctctaggcgg ctagagacgg caatacgcaa accgcctcta 1920
gccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctct 2940
gcc 2943
<210> 60
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES5 plasmid
<400> 60
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctcttgccgg ctagagacgg caatacgcaa accgcctcta 1920
gccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctct 2940
gag 2943
<210> 61
<211> 2943
<212> DNA
<213> Artificial Sequence
<220>
<223> pES6 plasmid
<400> 61
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctcttgccgg ctagagacgg caatacgcaa accgcctctc 1920
cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagagcg tctcacctcc 2940
act 2943
<210> 62
<211> 2942
<212> DNA
<213> Artificial Sequence
<220>
<223> pES7 plasmid
<400> 62
agagacccaa gacactgcgg ctttgtatgt gtccgcagcg cccgccgcag tctcacgccc 60
ggagcgtagc gaccgagtga gctagctatt tgtttatttt tctaaataca ttcaaatatg 120
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 180
atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 240
gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 300
ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 360
acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 420
gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 480
tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 540
atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 600
catagcgttg ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 660
gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 720
ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 780
aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 840
cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 900
tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 960
gtcggcaaat aatgtctaac aattcgttca agccgagggg ccgcaagatc cggccacgat 1020
gacccggtcg tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt 1080
accggtttat tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggtttc 1140
agcaaaaaac ccctcaagac ccgtttagag gccccaaggg gttatgctag ttattgctca 1200
gcggcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 1260
gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 1320
caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 1380
actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 1440
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 1500
cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 1560
gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 1620
cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 1680
gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 1740
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 1800
tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 1860
gccgttggca gtgactcggt ctctcactgg ctagagacgg caatacgcaa accgcctcta 1920
gccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 1980
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 2040
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 2100
tactagagaa agaggagaaa tactagatgg cttcctccga agacgttatc aaagagttca 2160
tgcgtttcaa agttcgtatg gaaggttccg ttaacggtca cgagttcgaa atcgaaggtg 2220
aaggtgaagg tcgtccgtac gaaggtaccc agaccgctaa actgaaagtt accaaaggtg 2280
gtccgctgcc gttcgcttgg gacatcctgt ccccgcagtt ccagtacggt tccaaagctt 2340
acgttaaaca cccggctgac atcccggact acctgaaact gtccttcccg gaaggtttca 2400
aatgggaacg tgttatgaac ttcgaagacg gtggtgttgt taccgttacc caggactcct 2460
ccctgcaaga cggtgagttc atctacaaag ttaaactgcg tggtaccaac ttcccgtccg 2520
acggtccggt tatgcagaaa aaaaccatgg gttgggaagc ttccaccgaa cgtatgtacc 2580
cggaagacgg tgctctgaaa ggtgaaatca aaatgcgtct gaaactgaaa gacggtggtc 2640
actacgacgc tgaagttaaa accacctaca tggctaaaaa accggttcag ctgccgggtg 2700
cttacaaaac cgacatcaaa ctggacatca cctcccacaa cgaagactac accatcgttg 2760
aacagtacga acgtgctgaa ggtcgtcact ccaccggtgc ttaataacgc tgatagtgct 2820
agtgtagatc gctactagag ccaggcatca aataaaacga aaggctcagt cgaaagactg 2880
ggcctttcgt tttatctgtt gtttgtcggt gaacgctctc tactagacgt ctcacctctg 2940
ag 2942
<210> 63
<211> 4498
<212> DNA
<213> Artificial Sequence
<220>
<223> pCKH plasmid
<400> 63
actcctccct gcaagacggt gagttcatct acaaagttaa actgcgtggt accaacttcc 60
cgtccgacgg tccggttatg cagaaaaaaa ccatgggttg ggaagcttcc accgaacgta 120
tgtacccgga agacggtgct ctgaaaggtg aaatcaaaat gcgtctgaaa ctgaaagacg 180
gtggtcacta cgacgctgaa gttaaaacca cctacatggc taaaaaaccg gttcagctgc 240
cgggtgctta caaaaccgac atcaaactgg acatcacctc ccacaacgaa gactacacca 300
tcgttgaaca gtacgaacgt gctgaaggtc gtcactccac cggtgcttaa taacgctgat 360
agtgctagtg tagatcgcta ctagagccag gcatcaaata aaacgaaagg ctcagtcgaa 420
agactgggcc tttcgtttta tctgttgttt gtcggtgaac gctctctact agagtggtct 480
catgagcgag acgtccggca tccgcttaca gacaagctgt gacagtctcc gggagctgca 540
tgtgtcagag gttttcaccg tcatcaccga aacgcgcgag actaaagggc ctcgtgatac 600
gcctattttt ataggttaat gtcatgataa taatggtttc ttaggacgga tcgcttgcct 660
gtaacttaca cgcgcctcgt atcttttaat gatggaataa tttgggaatt tactctgtgt 720
ttatttattt ttatgttttg tatttggatt ttagaaagta aataaagaag gtagaagagt 780
tacggaatga agaaaaaaaa ataaacaaag gtttaaaaaa tttcaacaaa aagcgtactt 840
tacatatata tttattagac aagaaaagca gattaaatag atatacattc gattaacgat 900
aagtaaaatg taaaatcaca ggattttcgt gtgtggtctt ctacacagac aagatgaaac 960
aattcggcat taatacctga gagcaggaag agcaagataa aaggtagtat ttgttggcga 1020
tccccctaga gtcttttaca tcttcggaaa acaaaaacta ttttttcttt aatttctttt 1080
tttactttct atttttaatt tatatattta tattaaaaaa tttaaattat aattattttt 1140
atagcacgtg atgaaaagga cccaggtggc attgacttga tcggcacgta agaggttcca 1200
actttcacca taatgaaata agatcactac cgggcgtatt ttttgagtta tcgagatttt 1260
caggagctaa ggaagctaaa atgagccata ttcaacggga aacgtcttgc tcgaggccgc 1320
gattaaattc caacatggat gctgatttat atgggtataa atgggctcgc gataatgtcg 1380
ggcaatcagg tgcgacaatc tatcgattgt atgggaagcc cgatgcgcca gagttgtttc 1440
tgaaacatgg caaaggtagc gttgccaatg atgttacaga tgagatggtc aggctaaact 1500
ggctgacgga atttatgcct cttccgacca tcaagcattt tatccgtact cctgatgatg 1560
catggttact caccactgcg atcccaggga aaacagcatt ccaggtatta gaagaatatc 1620
ctgattcagg tgaaaatatt gttgatgcgc tggcagtgtt cctgcgccgg ttgcattcga 1680
ttcctgtttg taattgtcct tttaacggcg atcgcgtatt tcgtctcgca caggcgcaat 1740
cacgaatgaa taacggtttg gttggtgcga gtgattttga tgacgagcgt aatggctggc 1800
ctgttgaaca agtctggaaa gaaatgcata agcttttgcc attctcaccg gattcagtcg 1860
tcactcatgg tgatttctca cttgataacc ttatttttga cgaggggaaa ttaataggtt 1920
gtattgatgt tggacgagtc ggaatcgcag accgatacca ggatcttgcc atcctatgga 1980
actgcctcgg tgagttttct ccttcattac agaaacggct ttttcaaaaa tatggtattg 2040
ataatcctga tatgaataaa ttgcagtttc acttgatgct cgatgagttt ttctaatgag 2100
ggcccaaatg taatcacctg gctcaccttc gggtgggcct ttctgcgttg ctggcgtttt 2160
tccataggct ccgcccccct gacgagcatc acaaaaatcg atgctcaagt cagaggtggc 2220
gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 2280
ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 2340
tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 2400
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 2460
atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 2520
acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 2580
actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 2640
cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt 2700
ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat 2760
tttctaccga actgtgcggt atttcacacc gcatagatcc gtcgagttca agagaaaaaa 2820
aaagaaaaag caaaaagaaa aaaggaaagc gcgcctcgtt cagaatgaca cgtatagaat 2880
gatgcattac cttgtcatct tcagtatcat actgttcgta tacatactta ctgacattca 2940
taggtataca tatatacaca tgtatatata tcgtatgctg cagctttaaa taatcggtgt 3000
cactacataa gaacaccttt ggtggaggga acatcgttgg taccattggg cgaggtggct 3060
tctcttatgg caaccgcaag agccttgaac gcactctcac tacggtgatg atcattcttg 3120
cctcgcagac aatcaacgtg gagggtaatt ctgctagcct ctgcaaagct ttcaagaaaa 3180
tgcgggatca tctcgcaaga gagatctcct actttctccc tttgcaaacc aagttcgaca 3240
actgcgtacg gcctgttcga aagatctacc accgctctgg aaagtgcctc atccaaaggc 3300
gcaaatcctg atccaaacct ttttactcca cgcacggccc ctagggcctc tttaaaagct 3360
tgaccgagag caatcccgca gtcttcagtg gtgtgatggt cgtctatgtg taagtcacca 3420
atgcactcaa cgattagcga ccagccggaa tgcttggcca gagcatgtat catatggtcc 3480
agaaacccta tacctgtgtg gacgttaatc acttgcgatt gtgtggcctg ttctgctact 3540
gcttctgcct ctttttctgg gaagatcgag tgctctatcg ctaggggacc accctttaaa 3600
gagatcgcaa tctgaatctt ggtttcattt gtaatacgct ttactagggc tttctgctct 3660
gtcatctttg ccttcgttta tcttgcctgc tcatttttta gtatattctt cgaagaaatc 3720
acattacttt atataatgta taattcatta tgtgataatg ccaatcgcta agaaaaaaaa 3780
agagtcatcc gctaggtgga aaaaaaaaaa tgaaaatcat taccgaggca taaaaaaata 3840
tagagtgtac tagaggaggc caagagtaat agaaaaagaa aattgcggga aaggactgtg 3900
ttatgacttc cctgactaat gccgacgtct cgacctcgag accgcaatac gcaaaccgcc 3960
tctccccgcg cgttggccga ttcattaatg cagctggcac gacaggtttc ccgactggaa 4020
agcgggcagt gagcgcaacg caattaatgt gagttagctc actcattagg caccccaggc 4080
tttacacttt atgcttccgg ctcgtatgtt gtgtggaatt gtgagcggat aacaatttca 4140
cacatactag agaaagagga gaaatactag atggcttcct ccgaagacgt tatcaaagag 4200
ttcatgcgtt tcaaagttcg tatggaaggt tccgttaacg gtcacgagtt cgaaatcgaa 4260
ggtgaaggtg aaggtcgtcc gtacgaaggt acccagaccg ctaaactgaa agttaccaaa 4320
ggtggtccgc tgccgttcgc ttgggacatc ctgtccccgc agttccagta cggttccaaa 4380
gcttacgtta aacacccggc tgacatcccg gactacctga aactgtcctt cccggaaggt 4440
ttcaaatggg aacgtgttat gaacttcgaa gacggtggtg ttgttaccgt tacccagg 4498
<210> 64
<211> 6033
<212> DNA
<213> Artificial Sequence
<220>
<223> pCKH-Cso-BMC plasmid
<400> 64
tgagcgagac gtccggcatc cgcttacaga caagctgtga cagtctccgg gagctgcatg 60
tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgagac taaagggcct cgtgatacgc 120
ctatttttat aggttaatgt catgataata atggtttctt aggacggatc gcttgcctgt 180
aacttacacg cgcctcgtat cttttaatga tggaataatt tgggaattta ctctgtgttt 240
atttattttt atgttttgta tttggatttt agaaagtaaa taaagaaggt agaagagtta 300
cggaatgaag aaaaaaaaat aaacaaaggt ttaaaaaatt tcaacaaaaa gcgtacttta 360
catatatatt tattagacaa gaaaagcaga ttaaatagat atacattcga ttaacgataa 420
gtaaaatgta aaatcacagg attttcgtgt gtggtcttct acacagacaa gatgaaacaa 480
ttcggcatta atacctgaga gcaggaagag caagataaaa ggtagtattt gttggcgatc 540
cccctagagt cttttacatc ttcggaaaac aaaaactatt ttttctttaa tttctttttt 600
tactttctat ttttaattta tatatttata ttaaaaaatt taaattataa ttatttttat 660
agcacgtgat gaaaaggacc caggtggcat tgacttgatc ggcacgtaag aggttccaac 720
tttcaccata atgaaataag atcactaccg ggcgtatttt ttgagttatc gagattttca 780
ggagctaagg aagctaaaat gagccatatt caacgggaaa cgtcttgctc gaggccgcga 840
ttaaattcca acatggatgc tgatttatat gggtataaat gggctcgcga taatgtcggg 900
caatcaggtg cgacaatcta tcgattgtat gggaagcccg atgcgccaga gttgtttctg 960
aaacatggca aaggtagcgt tgccaatgat gttacagatg agatggtcag gctaaactgg 1020
ctgacggaat ttatgcctct tccgaccatc aagcatttta tccgtactcc tgatgatgca 1080
tggttactca ccactgcgat cccagggaaa acagcattcc aggtattaga agaatatcct 1140
gattcaggtg aaaatattgt tgatgcgctg gcagtgttcc tgcgccggtt gcattcgatt 1200
cctgtttgta attgtccttt taacggcgat cgcgtatttc gtctcgcaca ggcgcaatca 1260
cgaatgaata acggtttggt tggtgcgagt gattttgatg acgagcgtaa tggctggcct 1320
gttgaacaag tctggaaaga aatgcataag cttttgccat tctcaccgga ttcagtcgtc 1380
actcatggtg atttctcact tgataacctt atttttgacg aggggaaatt aataggttgt 1440
attgatgttg gacgagtcgg aatcgcagac cgataccagg atcttgccat cctatggaac 1500
tgcctcggtg agttttctcc ttcattacag aaacggcttt ttcaaaaata tggtattgat 1560
aatcctgata tgaataaatt gcagtttcac ttgatgctcg atgagttttt ctaatgaggg 1620
cccaaatgta atcacctggc tcaccttcgg gtgggccttt ctgcgttgct ggcgtttttc 1680
cataggctcc gcccccctga cgagcatcac aaaaatcgat gctcaagtca gaggtggcga 1740
aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct 1800
cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg 1860
gcgctttctc atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag 1920
ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat 1980
cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac 2040
aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac 2100
tacggctaca ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttacctcg 2160
gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt 2220
ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgattt 2280
tctaccgaac tgtgcggtat ttcacaccgc atagatccgt cgagttcaag agaaaaaaaa 2340
agaaaaagca aaaagaaaaa aggaaagcgc gcctcgttca gaatgacacg tatagaatga 2400
tgcattacct tgtcatcttc agtatcatac tgttcgtata catacttact gacattcata 2460
ggtatacata tatacacatg tatatatatc gtatgctgca gctttaaata atcggtgtca 2520
ctacataaga acacctttgg tggagggaac atcgttggta ccattgggcg aggtggcttc 2580
tcttatggca accgcaagag ccttgaacgc actctcacta cggtgatgat cattcttgcc 2640
tcgcagacaa tcaacgtgga gggtaattct gctagcctct gcaaagcttt caagaaaatg 2700
cgggatcatc tcgcaagaga gatctcctac tttctccctt tgcaaaccaa gttcgacaac 2760
tgcgtacggc ctgttcgaaa gatctaccac cgctctggaa agtgcctcat ccaaaggcgc 2820
aaatcctgat ccaaaccttt ttactccacg cacggcccct agggcctctt taaaagcttg 2880
accgagagca atcccgcagt cttcagtggt gtgatggtcg tctatgtgta agtcaccaat 2940
gcactcaacg attagcgacc agccggaatg cttggccaga gcatgtatca tatggtccag 3000
aaaccctata cctgtgtgga cgttaatcac ttgcgattgt gtggcctgtt ctgctactgc 3060
ttctgcctct ttttctggga agatcgagtg ctctatcgct aggggaccac cctttaaaga 3120
gatcgcaatc tgaatcttgg tttcatttgt aatacgcttt actagggctt tctgctctgt 3180
catctttgcc ttcgtttatc ttgcctgctc attttttagt atattcttcg aagaaatcac 3240
attactttat ataatgtata attcattatg tgataatgcc aatcgctaag aaaaaaaaag 3300
agtcatccgc taggtggaaa aaaaaaaatg aaaatcatta ccgaggcata aaaaaatata 3360
gagtgtacta gaggaggcca agagtaatag aaaaagaaaa ttgcgggaaa ggactgtgtt 3420
atgacttccc tgactaatgc cgacgtctcg acctggctgg cttcccaacc ttaccagagg 3480
gcgccccagc tggcaattcc gacgtcttta tggctagctc agtcctaggt acaatgctag 3540
cgaattcaaa agatctttta agaaggagat atacatgatg aagatcatgc aagttgaaaa 3600
gactttggtt tctaccaaca gaattgctga tatgggtcac aagccattgt tggttgtttg 3660
ggaaaaacct ggtgctccaa gacaagttgc tgttgatgct attggttgta ttccaggtga 3720
ctgggttttg tgtgttggtt cttctgctgc cagagaagct gctggttcca agtcttaccc 3780
atctgatttg actatcatcg gtattattga ccaatggaac ggtgaaggtt cttcttggtc 3840
acatccacaa tttgaaaagt agctaacaaa gcccgaaagg aagctgagtt ggctgctgcc 3900
accgctgagc aataactagc ataacccctt ggggcctcta aacgggtctt gaggggtttt 3960
ttgctgaaag gaggaactat atccggatat cccgcaagag gcccggcagt acccctcagg 4020
cggcttcact gcccgctttc cagtcgggaa acctgtcgtg ccagctgcat taatgaatcg 4080
gccaacgcgc ggggagaggc ggtttgcgta ttgggcgcca gggtggtttt tcttttcacc 4140
agtgaaacgg gcaacagctg attgcccttc accgcctggc cctgagagag ttgcagcaag 4200
cggtccacgc tggtttgccc cagcaggcga aaatcctgtt tgatggtggt taacggcggg 4260
atataacatg agctgtcttc ggtatcgtcg tatcccacta ccgagatatc cgcaccaacg 4320
cgcagcccgg actcggtaat ggcgcgcatt gcgcccagcg ccatctgatc gttggcaacc 4380
agcatcgcag tgggaacgat gccctcattc agcatttgca tggtttgttg aaaaccggac 4440
atggcactcc agtcgccttc ccgttccgct atcggctgaa tttgattgcg agtgagatat 4500
ttatgccagc cagccagacg cagacgcgcc gagacagaac ttaatgggcc cgctaacagc 4560
gcgatttgct ggtgacccaa tgcgaccaga tgctccacgc ccagtcgcgt accgtcttca 4620
tgggagaaaa taatactgtt gatgggtgtc tggtcagaga catcaagaaa taacgccgga 4680
acattagtgc aggcagcttc cacagcaatg gcatcctggt catccagcgg atagttaatg 4740
atcagcccac tgacgcgttg cgcgagaaga ttgtgcaccg ccgctttaca ggcttcgacg 4800
ccgcttcgtt ctaccatcga caccaccacg ctggcaccca gttgatcggc gcgagattta 4860
atcgccgcga caatttgcga cggcgcgtgc agggccagac tggaggtggc aacgccaatc 4920
agcaacgact gtttgcccgc cagttgttgt gccacgcggt tgggaatgta attcagctcc 4980
gccatcgccg cttccacttt ttcccgcgtt ttcgcagaaa cgtggctggc ctggttcacc 5040
acgcgggaaa cggtctgata agagacaccg gcatactctg cgacatcgta taacgttact 5100
ggtttcacat tcaccaccct gaattgactc tcttccgggc gctatcatgc cataccgcga 5160
aaggttttgc gccattcgat ggtgtccggg atctcgacgc tctcccttat gcgactcctg 5220
cattaggaag cagcccagta gtaggttgag gccgttgagc accgccgccg caaggaatgg 5280
tgcatgcaag gagatggcgc ccaacagtcc cccggccacg gggcctgcca ccatacccac 5340
gccgaaacaa gcgctcatga gcccgaagtg gcgagcccga tcttccccat cggtgatgtc 5400
ggcgatatag gcgccagcaa ccgcacctgt ggcgccggtg atgccggcca cgatgcgtcc 5460
ggcgtagagg atcgagatct cgatcccgcg aaattaatac gactcactat aggggaattg 5520
tgagcggata acaattcccc tctagaaata attttgttta actttaagaa ggagatatac 5580
gatggctgat gttactggta ttgctttggg tatgattgaa actagaggtt tggttccagc 5640
tatcgaagct gctgacgcta tgaccaaggc cgctgaagtc agattggtcg gtagacaatt 5700
tgttggaggt ggttacgtca ctgttttggt tcgtggtgaa accggtgccg ttaacgctgc 5760
tgttagagct ggtgctgatg cttgtgaaag agttggtgac ggtttagttg ctgcccacat 5820
tattgccaga gtccactctg aagttgaaaa cattttgcca aaggctccac aggcttagct 5880
aacaaagccc gaaaggaagc tgagttggct gctgccaccg ctgagcaata actagcataa 5940
ccccttgggg cctctaaacg ggtcttgagg ggttttttgc tgaaaggagg aactatatcc 6000
ggatatcccg caagaggccc ggcagtaccc ctc 6033
<210> 65
<211> 2246
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-TDH3 plasmid
<400> 65
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctacagttt attcctggca tccactaaat ataatggagc ccgcttttta 120
agctggcatc cagaaaaaaa aagaatccca gcaccaaaat attgttttct tcaccaacca 180
tcagttcata ggtccattct cttagcgcaa ctacagagaa caggggcaca aacaggcaaa 240
aaacgggcac aacctcaatg gagtgatgca acctgcctgg agtaaatgat gacacaaggc 300
aattgaccca cgcatgtatc tatctcattt tcttacacct tctattacct tctgctctct 360
ctgatttgga aaaagctgaa aaaaaaggtt gaaaccagtt ccctgaaatt attcccctac 420
ttgactaata agtatataaa gacggtaggt attgattgta attctgtaaa tctatttctt 480
aaacttctta aattctactt ttatagttag tctttttttt agttttaaaa caccaagaac 540
ttagtttcga ataaacacac ataaacaaac aaagatgcga gacgactgac catttaaatc 600
atacctgacc tccatagcag aaagtcaaaa gcctccgacc ggaggctttt gacttgatcg 660
gcacgtaaga ggttccaact ttcaccataa tgaaataaga tcactaccgg gcgtattttt 720
tgagttatcg agattttcag gagctaagga agctaaaatg agccatattc aacgggaaac 780
gtcttgctcg aggccgcgat taaattccaa catggatgct gatttatatg ggtataaatg 840
ggctcgcgat aatgtcgggc aatcaggtgc gacaatctat cgattgtatg ggaagcccga 900
tgcgccagag ttgtttctga aacatggcaa aggtagcgtt gccaatgatg ttacagatga 960
gatggtcagg ctaaactggc tgacggaatt tatgcctctt ccgaccatca agcattttat 1020
ccgtactcct gatgatgcat ggttactcac cactgcgatc ccagggaaaa cagcattcca 1080
ggtattagaa gaatatcctg attcaggtga aaatattgtt gatgcgctgg cagtgttcct 1140
gcgccggttg cattcgattc ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg 1200
tctcgctcag gcgcaatcac gaatgaataa cggtttggtt ggtgcgagtg attttgatga 1260
cgagcgtaat ggctggcctg ttgaacaagt ctggaaagaa atgcataagc ttttgccatt 1320
ctcaccggat tcagtcgtca ctcatggtga tttctcactt gataacctta tttttgacga 1380
ggggaaatta ataggttgta ttgatgttgg acgagtcgga atcgcagacc gataccagga 1440
tcttgccatc ctatggaact gcctcggtga gttttctcct tcattacaga aacggctttt 1500
tcaaaaatat ggtattgata atcctgatat gaataaattg cagtttcact tgatgctcga 1560
tgagtttttc taatgagggc ccaaatgtaa tcacctggct caccttcggg tgggcctttc 1620
tgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgatg 1680
ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg 1740
aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt 1800
tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt 1860
gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg 1920
cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact 1980
ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt 2040
cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct 2100
gctgaagcca gttacctcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc 2160
gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct 2220
caagaagatc ctttgatttt ctaccg 2246
<210> 66
<211> 2246
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-YEF3 plasmid
<400> 66
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctattaaaa aaacaactta caatcattgt tcgccccttc catacttact 120
gccactcgca aaagggccca accagggcaa ttacgtatca aaaaatcatg acaggctggg 180
taataaatat tcgtgaagaa agaagaaatt aaaaaaagaa acgaagaagc aaaaaaaaga 240
aaagactccg tttaatcact ttcaaccgcg gtttatccgg ccccacccat gcataaccct 300
aaattattag atcacttagc acgtgaaaaa gaaacgtttt taatgttttt tttttttttt 360
tctttttctt tttttgcgtt ggtgaaaatt ttttcgcttc ctcgagtata attatctcat 420
ctcatctttc atataagata agaagtttta taaaaacctt ttgcatcaaa attttgtaga 480
atatctcttt ttcttacgct ctctttcttt ccttaattgt tttctaaaga accgtgtatt 540
tttctagttc gaatccatcg ataacattaa aaggatgcga gacgactgac catttaaatc 600
atacctgacc tccatagcag aaagtcaaaa gcctccgacc ggaggctttt gacttgatcg 660
gcacgtaaga ggttccaact ttcaccataa tgaaataaga tcactaccgg gcgtattttt 720
tgagttatcg agattttcag gagctaagga agctaaaatg agccatattc aacgggaaac 780
gtcttgctcg aggccgcgat taaattccaa catggatgct gatttatatg ggtataaatg 840
ggctcgcgat aatgtcgggc aatcaggtgc gacaatctat cgattgtatg ggaagcccga 900
tgcgccagag ttgtttctga aacatggcaa aggtagcgtt gccaatgatg ttacagatga 960
gatggtcagg ctaaactggc tgacggaatt tatgcctctt ccgaccatca agcattttat 1020
ccgtactcct gatgatgcat ggttactcac cactgcgatc ccagggaaaa cagcattcca 1080
ggtattagaa gaatatcctg attcaggtga aaatattgtt gatgcgctgg cagtgttcct 1140
gcgccggttg cattcgattc ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg 1200
tctcgctcag gcgcaatcac gaatgaataa cggtttggtt ggtgcgagtg attttgatga 1260
cgagcgtaat ggctggcctg ttgaacaagt ctggaaagaa atgcataagc ttttgccatt 1320
ctcaccggat tcagtcgtca ctcatggtga tttctcactt gataacctta tttttgacga 1380
ggggaaatta ataggttgta ttgatgttgg acgagtcgga atcgcagacc gataccagga 1440
tcttgccatc ctatggaact gcctcggtga gttttctcct tcattacaga aacggctttt 1500
tcaaaaatat ggtattgata atcctgatat gaataaattg cagtttcact tgatgctcga 1560
tgagtttttc taatgagggc ccaaatgtaa tcacctggct caccttcggg tgggcctttc 1620
tgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgatg 1680
ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg 1740
aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt 1800
tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt 1860
gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg 1920
cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact 1980
ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt 2040
cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct 2100
gctgaagcca gttacctcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc 2160
gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct 2220
caagaagatc ctttgatttt ctaccg 2246
<210> 67
<211> 2246
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-PYK1 plasmid
<400> 67
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctacagatt gggagatttt catagtagaa ttcagcatga tagctacgta 120
aatgtgttcc gcaccgtcac aaagtgtttt ctactgttct ttcttctttc gttcattcag 180
ttgagttgag tgagtgcttt gttcaatgga tcttagctaa aatgcatatt ttttctcttg 240
gtaaatgaat gcttgtgatg tcttccaagt gatttccttt ccttcccata tgatgctagg 300
tacctttagt gtcttcctaa aaaaaaaaaa aggctcgcca tcaaaacgat attcgttggc 360
ttttttttct gaattataaa tactctttgg taacttttca tttccaagaa cctctttttt 420
ccagttatat catggtcccc tttcaaagtt attctctact ctttttcata ttcattcttt 480
ttcatccttt ggttttttat tcttaacttg tttattattc tctcttgttt ctatttacaa 540
gacaccaatc aaaacaaata aaacatcatc acagatgcga gacgactgac catttaaatc 600
atacctgacc tccatagcag aaagtcaaaa gcctccgacc ggaggctttt gacttgatcg 660
gcacgtaaga ggttccaact ttcaccataa tgaaataaga tcactaccgg gcgtattttt 720
tgagttatcg agattttcag gagctaagga agctaaaatg agccatattc aacgggaaac 780
gtcttgctcg aggccgcgat taaattccaa catggatgct gatttatatg ggtataaatg 840
ggctcgcgat aatgtcgggc aatcaggtgc gacaatctat cgattgtatg ggaagcccga 900
tgcgccagag ttgtttctga aacatggcaa aggtagcgtt gccaatgatg ttacagatga 960
gatggtcagg ctaaactggc tgacggaatt tatgcctctt ccgaccatca agcattttat 1020
ccgtactcct gatgatgcat ggttactcac cactgcgatc ccagggaaaa cagcattcca 1080
ggtattagaa gaatatcctg attcaggtga aaatattgtt gatgcgctgg cagtgttcct 1140
gcgccggttg cattcgattc ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg 1200
tctcgctcag gcgcaatcac gaatgaataa cggtttggtt ggtgcgagtg attttgatga 1260
cgagcgtaat ggctggcctg ttgaacaagt ctggaaagaa atgcataagc ttttgccatt 1320
ctcaccggat tcagtcgtca ctcatggtga tttctcactt gataacctta tttttgacga 1380
ggggaaatta ataggttgta ttgatgttgg acgagtcgga atcgcagacc gataccagga 1440
tcttgccatc ctatggaact gcctcggtga gttttctcct tcattacaga aacggctttt 1500
tcaaaaatat ggtattgata atcctgatat gaataaattg cagtttcact tgatgctcga 1560
tgagtttttc taatgagggc ccaaatgtaa tcacctggct caccttcggg tgggcctttc 1620
tgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgatg 1680
ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg 1740
aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt 1800
tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt 1860
gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg 1920
cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact 1980
ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt 2040
cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct 2100
gctgaagcca gttacctcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc 2160
gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct 2220
caagaagatc ctttgatttt ctaccg 2246
<210> 68
<211> 2040
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-H plasmid
<400> 68
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atggctgatg ctttgggtat gattgaagtt agaggtttcg ttggtatggt 120
tgaagctgct gatgctatgg ttaaggctgc taaagttgaa ttgatcggtt acgaaaaaac 180
tggtggtggt tatgttactg ctgttgttag aggtgatgtt gctgctgtaa aagctgctac 240
tgaagctggt caaagggctg ctgaaagagt tggagaagtt gttgctgttc atgttattcc 300
aagaccacat gttaatgttg atgctgcttt gccattgggt agaactccag gtatggataa 360
gtctgcttag ccgagacgac tgaccattta aatcatacct gacctccata gcagaaagtc 420
aaaagcctcc gaccggaggc ttttgacttg atcggcacgt aagaggttcc aactttcacc 480
ataatgaaat aagatcacta ccgggcgtat tttttgagtt atcgagattt tcaggagcta 540
aggaagctaa aatgagccat attcaacggg aaacgtcttg ctcgaggccg cgattaaatt 600
ccaacatgga tgctgattta tatgggtata aatgggctcg cgataatgtc gggcaatcag 660
gtgcgacaat ctatcgattg tatgggaagc ccgatgcgcc agagttgttt ctgaaacatg 720
gcaaaggtag cgttgccaat gatgttacag atgagatggt caggctaaac tggctgacgg 780
aatttatgcc tcttccgacc atcaagcatt ttatccgtac tcctgatgat gcatggttac 840
tcaccactgc gatcccaggg aaaacagcat tccaggtatt agaagaatat cctgattcag 900
gtgaaaatat tgttgatgcg ctggcagtgt tcctgcgccg gttgcattcg attcctgttt 960
gtaattgtcc ttttaacggc gatcgcgtat ttcgtctcgc tcaggcgcaa tcacgaatga 1020
ataacggttt ggttggtgcg agtgattttg atgacgagcg taatggctgg cctgttgaac 1080
aagtctggaa agaaatgcat aagcttttgc cattctcacc ggattcagtc gtcactcatg 1140
gtgatttctc acttgataac cttatttttg acgaggggaa attaataggt tgtattgatg 1200
ttggacgagt cggaatcgca gaccgatacc aggatcttgc catcctatgg aactgcctcg 1260
gtgagttttc tccttcatta cagaaacggc tttttcaaaa atatggtatt gataatcctg 1320
atatgaataa attgcagttt cacttgatgc tcgatgagtt tttctaatga gggcccaaat 1380
gtaatcacct ggctcacctt cgggtgggcc tttctgcgtt gctggcgttt ttccataggc 1440
tccgcccccc tgacgagcat cacaaaaatc gatgctcaag tcagaggtgg cgaaacccga 1500
caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc 1560
cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt 1620
ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct 1680
gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg 1740
agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta 1800
gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct 1860
acactagaag aacagtattt ggtatctgcg ctctgctgaa gccagttacc tcggaaaaag 1920
agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg 1980
caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga ttttctaccg 2040
<210> 69
<211> 2031
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-P plasmid
<400> 69
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atggttttag gtaaagttgt cggtactgtt gttgcatcaa gaaaggaacc 120
aagaattgaa ggtttatctt tattattggt tagagcttgt gatccagatg gtactccaac 180
tggtggtgct gttgtttgtg ctgatgctgt tggtgctggt gttggtgaag ttgttttata 240
tgcttctggt tcttctgcta gacaaactga agttactaat aatagaccag ttgatgctac 300
tattatggct attgttgatt tggttgaaat gggtggtgat gttagattta gaaaagatta 360
gccgagacga ctgaccattt aaatcatacc tgacctccat agcagaaagt caaaagcctc 420
cgaccggagg cttttgactt gatcggcacg taagaggttc caactttcac cataatgaaa 480
taagatcact accgggcgta ttttttgagt tatcgagatt ttcaggagct aaggaagcta 540
aaatgagcca tattcaacgg gaaacgtctt gctcgaggcc gcgattaaat tccaacatgg 600
atgctgattt atatgggtat aaatgggctc gcgataatgt cgggcaatca ggtgcgacaa 660
tctatcgatt gtatgggaag cccgatgcgc cagagttgtt tctgaaacat ggcaaaggta 720
gcgttgccaa tgatgttaca gatgagatgg tcaggctaaa ctggctgacg gaatttatgc 780
ctcttccgac catcaagcat tttatccgta ctcctgatga tgcatggtta ctcaccactg 840
cgatcccagg gaaaacagca ttccaggtat tagaagaata tcctgattca ggtgaaaata 900
ttgttgatgc gctggcagtg ttcctgcgcc ggttgcattc gattcctgtt tgtaattgtc 960
cttttaacgg cgatcgcgta tttcgtctcg ctcaggcgca atcacgaatg aataacggtt 1020
tggttggtgc gagtgatttt gatgacgagc gtaatggctg gcctgttgaa caagtctgga 1080
aagaaatgca taagcttttg ccattctcac cggattcagt cgtcactcat ggtgatttct 1140
cacttgataa ccttattttt gacgagggga aattaatagg ttgtattgat gttggacgag 1200
tcggaatcgc agaccgatac caggatcttg ccatcctatg gaactgcctc ggtgagtttt 1260
ctccttcatt acagaaacgg ctttttcaaa aatatggtat tgataatcct gatatgaata 1320
aattgcagtt tcacttgatg ctcgatgagt ttttctaatg agggcccaaa tgtaatcacc 1380
tggctcacct tcgggtgggc ctttctgcgt tgctggcgtt tttccatagg ctccgccccc 1440
ctgacgagca tcacaaaaat cgatgctcaa gtcagaggtg gcgaaacccg acaggactat 1500
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 1560
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct 1620
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 1680
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 1740
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 1800
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 1860
gaacagtatt tggtatctgc gctctgctga agccagttac ctcggaaaaa gagttggtag 1920
ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 1980
gattacgcgc agaaaaaaag gatctcaaga agatcctttg attttctacc g 2031
<210> 70
<211> 2358
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-T1 plasmid
<400> 70
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atggatcatg ctccagaaag atttgatgct actcctccag ctggtgaacc 120
agatagacca gctttgggtg ttttggaatt gacttctatt gctagaggta ttaccgttgc 180
tgatgctgct ttgaaaagag caccatcttt gttgttgatg tccagaccag tttcttccgg 240
taaacatttg ttgatgatga gaggtcaagt tgccgaagtt gaagaatcta tgattgctgc 300
tagagaaatt gctggtgctg gttctggtgc tttgttggat gaattggaat tgccatatgc 360
tcacgaacaa ctttggagat ttttggatgc tccagttgtt gcagatgctt gggaagaaga 420
tactgaatcc gttattatcg ttgaaaccgc tactgtttgt gctgctattg attctgctga 480
tgcagcctta aaaactgctc ctgttgtttt gagagatatg agattggcta ttggtattgc 540
tggtaaggct ttctttactt tgactggtga attggctgat gttgaagctg ctgctgaagt 600
tgttagagaa agatgtggtg ctagattgct agaattggct tgtattgcaa gaccagttga 660
cgaattgaga ggtaggttgt ttttctagcc gagacgactg accatttaaa tcatacctga 720
cctccatagc agaaagtcaa aagcctccga ccggaggctt ttgacttgat cggcacgtaa 780
gaggttccaa ctttcaccat aatgaaataa gatcactacc gggcgtattt tttgagttat 840
cgagattttc aggagctaag gaagctaaaa tgagccatat tcaacgggaa acgtcttgct 900
cgaggccgcg attaaattcc aacatggatg ctgatttata tgggtataaa tgggctcgcg 960
ataatgtcgg gcaatcaggt gcgacaatct atcgattgta tgggaagccc gatgcgccag 1020
agttgtttct gaaacatggc aaaggtagcg ttgccaatga tgttacagat gagatggtca 1080
ggctaaactg gctgacggaa tttatgcctc ttccgaccat caagcatttt atccgtactc 1140
ctgatgatgc atggttactc accactgcga tcccagggaa aacagcattc caggtattag 1200
aagaatatcc tgattcaggt gaaaatattg ttgatgcgct ggcagtgttc ctgcgccggt 1260
tgcattcgat tcctgtttgt aattgtcctt ttaacggcga tcgcgtattt cgtctcgctc 1320
aggcgcaatc acgaatgaat aacggtttgg ttggtgcgag tgattttgat gacgagcgta 1380
atggctggcc tgttgaacaa gtctggaaag aaatgcataa gcttttgcca ttctcaccgg 1440
attcagtcgt cactcatggt gatttctcac ttgataacct tatttttgac gaggggaaat 1500
taataggttg tattgatgtt ggacgagtcg gaatcgcaga ccgataccag gatcttgcca 1560
tcctatggaa ctgcctcggt gagttttctc cttcattaca gaaacggctt tttcaaaaat 1620
atggtattga taatcctgat atgaataaat tgcagtttca cttgatgctc gatgagtttt 1680
tctaatgagg gcccaaatgt aatcacctgg ctcaccttcg ggtgggcctt tctgcgttgc 1740
tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga tgctcaagtc 1800
agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 1860
tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 1920
cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 1980
ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 2040
ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 2100
ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 2160
ggtggcctaa ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc 2220
cagttacctc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag 2280
cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga 2340
tcctttgatt ttctaccg 2358
<210> 71
<211> 2201
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_T-RPL41B plasmid
<400> 71
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgt agcgcggatt gagagcaaat cgttaagttc aggtcaagta aaaattgatt 120
tcgaaaacta atttctctta tacaatcctt tgattggacc gtcatccttt cgaatataag 180
attttgttaa gaatatttta gacagagatc tactttatat ttaatatcta gatattacat 240
aatttcctct ctaataaaat atcattaata aaataaaaat gaagcgattt gattttgtgt 300
tgtcaactta gtttgccgct atgcctcttg ggtaatgcta ttattgaatc gaagggcttt 360
attatattac cctttagctt attctgaggt ttctgtggcg tgcaaagtga tgaaccgggc 420
gggttttaag gataaaatca aaaagtgaaa aaatgaacgg aaaatggaat acctgtgaaa 480
tggagaatga taatgaatct ttctgtcgtg cttgaaagat tttcggctcc tccgagacga 540
ctgaccattt aaatcatacc tgacctccat agcagaaagt caaaagcctc cgaccggagg 600
cttttgactt gatcggcacg taagaggttc caactttcac cataatgaaa taagatcact 660
accgggcgta ttttttgagt tatcgagatt ttcaggagct aaggaagcta aaatgagcca 720
tattcaacgg gaaacgtctt gctcgaggcc gcgattaaat tccaacatgg atgctgattt 780
atatgggtat aaatgggctc gcgataatgt cgggcaatca ggtgcgacaa tctatcgatt 840
gtatgggaag cccgatgcgc cagagttgtt tctgaaacat ggcaaaggta gcgttgccaa 900
tgatgttaca gatgagatgg tcaggctaaa ctggctgacg gaatttatgc ctcttccgac 960
catcaagcat tttatccgta ctcctgatga tgcatggtta ctcaccactg cgatcccagg 1020
gaaaacagca ttccaggtat tagaagaata tcctgattca ggtgaaaata ttgttgatgc 1080
gctggcagtg ttcctgcgcc ggttgcattc gattcctgtt tgtaattgtc cttttaacgg 1140
cgatcgcgta tttcgtctcg ctcaggcgca atcacgaatg aataacggtt tggttggtgc 1200
gagtgatttt gatgacgagc gtaatggctg gcctgttgaa caagtctgga aagaaatgca 1260
taagcttttg ccattctcac cggattcagt cgtcactcat ggtgatttct cacttgataa 1320
ccttattttt gacgagggga aattaatagg ttgtattgat gttggacgag tcggaatcgc 1380
agaccgatac caggatcttg ccatcctatg gaactgcctc ggtgagtttt ctccttcatt 1440
acagaaacgg ctttttcaaa aatatggtat tgataatcct gatatgaata aattgcagtt 1500
tcacttgatg ctcgatgagt ttttctaatg agggcccaaa tgtaatcacc tggctcacct 1560
tcgggtgggc ctttctgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca 1620
tcacaaaaat cgatgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca 1680
ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 1740
atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag 1800
gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt 1860
tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca 1920
cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg 1980
cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa gaacagtatt 2040
tggtatctgc gctctgctga agccagttac ctcggaaaaa gagttggtag ctcttgatcc 2100
ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca gattacgcgc 2160
agaaaaaaag gatctcaaga agatcctttg attttctacc g 2201
<210> 72
<211> 2172
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_T-HBT1 plasmid
<400> 72
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgt agcacacttc tcgattaaca aattcccagt attctttgaa atctattttt 120
cttcctcaat tgaatttgaa taactgtcta cgcggactcc tcctatctac aactacaaca 180
aattttaacc actttattac cactttcctc tttcatttat ttttgtcttt tatgttgtca 240
atttactagt attttttttt ttttcattta cgttcaaggt tttttatact catttaactt 300
gtcttaggtt atttatatat atacctatat atttatatat atatatatat atgtatgtat 360
atattattat caccaaatga gaaataatag ctaatttgat ttttgattat ttaaaatatt 420
ggtttgttct ttctgcaaac atctcgtttg gtacgatatt agtgaaaaac gatgtaatta 480
tcaacacgtg cattacccac ctccgagacg actgaccatt taaatcatac ctgacctcca 540
tagcagaaag tcaaaagcct ccgaccggag gcttttgact tgatcggcac gtaagaggtt 600
ccaactttca ccataatgaa ataagatcac taccgggcgt attttttgag ttatcgagat 660
tttcaggagc taaggaagct aaaatgagcc atattcaacg ggaaacgtct tgctcgaggc 720
cgcgattaaa ttccaacatg gatgctgatt tatatgggta taaatgggct cgcgataatg 780
tcgggcaatc aggtgcgaca atctatcgat tgtatgggaa gcccgatgcg ccagagttgt 840
ttctgaaaca tggcaaaggt agcgttgcca atgatgttac agatgagatg gtcaggctaa 900
actggctgac ggaatttatg cctcttccga ccatcaagca ttttatccgt actcctgatg 960
atgcatggtt actcaccact gcgatcccag ggaaaacagc attccaggta ttagaagaat 1020
atcctgattc aggtgaaaat attgttgatg cgctggcagt gttcctgcgc cggttgcatt 1080
cgattcctgt ttgtaattgt ccttttaacg gcgatcgcgt atttcgtctc gctcaggcgc 1140
aatcacgaat gaataacggt ttggttggtg cgagtgattt tgatgacgag cgtaatggct 1200
ggcctgttga acaagtctgg aaagaaatgc ataagctttt gccattctca ccggattcag 1260
tcgtcactca tggtgatttc tcacttgata accttatttt tgacgagggg aaattaatag 1320
gttgtattga tgttggacga gtcggaatcg cagaccgata ccaggatctt gccatcctat 1380
ggaactgcct cggtgagttt tctccttcat tacagaaacg gctttttcaa aaatatggta 1440
ttgataatcc tgatatgaat aaattgcagt ttcacttgat gctcgatgag tttttctaat 1500
gagggcccaa atgtaatcac ctggctcacc ttcgggtggg cctttctgcg ttgctggcgt 1560
ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgatgctca agtcagaggt 1620
ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 1680
gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa 1740
gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct 1800
ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta 1860
actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg 1920
gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc 1980
ctaactacgg ctacactaga agaacagtat ttggtatctg cgctctgctg aagccagtta 2040
cctcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 2100
tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 2160
gattttctac cg 2172
<210> 73
<211> 2200
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_T-RPS20 plasmid
<400> 73
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgt agcaactaag ctggttctaa ctggaaataa tttccattag attcctcttt 120
ttctcgtcca ttaaccaaaa tatattattg aattcagcgg ttcctttttt ctcattttcg 180
catatagctg cactattaga atcagcccac tctaggtaaa cacagttcct cgatatacct 240
ctgtcttact atcagtggtt aaaccttatg caaatataat atatatatat atatatatat 300
ctcatacttt tgttgattct tgtgtaatta ttggaaaaga caaaacaaag caagcgtttc 360
tattcatcat atttacaagt atttttatga aaaactattt cttaattttc ccaccggcgg 420
ctttgaataa ggcaatgtca ttgtcctgca taatatattg tttgcctgca cgtttgataa 480
gtcccttaga ttttagtaaa gactcattta gcggtggttc catcttccct ccgagacgac 540
tgaccattta aatcatacct gacctccata gcagaaagtc aaaagcctcc gaccggaggc 600
ttttgacttg atcggcacgt aagaggttcc aactttcacc ataatgaaat aagatcacta 660
ccgggcgtat tttttgagtt atcgagattt tcaggagcta aggaagctaa aatgagccat 720
attcaacggg aaacgtcttg ctcgaggccg cgattaaatt ccaacatgga tgctgattta 780
tatgggtata aatgggctcg cgataatgtc gggcaatcag gtgcgacaat ctatcgattg 840
tatgggaagc ccgatgcgcc agagttgttt ctgaaacatg gcaaaggtag cgttgccaat 900
gatgttacag atgagatggt caggctaaac tggctgacgg aatttatgcc tcttccgacc 960
atcaagcatt ttatccgtac tcctgatgat gcatggttac tcaccactgc gatcccaggg 1020
aaaacagcat tccaggtatt agaagaatat cctgattcag gtgaaaatat tgttgatgcg 1080
ctggcagtgt tcctgcgccg gttgcattcg attcctgttt gtaattgtcc ttttaacggc 1140
gatcgcgtat ttcgtctcgc tcaggcgcaa tcacgaatga ataacggttt ggttggtgcg 1200
agtgattttg atgacgagcg taatggctgg cctgttgaac aagtctggaa agaaatgcat 1260
aagcttttgc cattctcacc ggattcagtc gtcactcatg gtgatttctc acttgataac 1320
cttatttttg acgaggggaa attaataggt tgtattgatg ttggacgagt cggaatcgca 1380
gaccgatacc aggatcttgc catcctatgg aactgcctcg gtgagttttc tccttcatta 1440
cagaaacggc tttttcaaaa atatggtatt gataatcctg atatgaataa attgcagttt 1500
cacttgatgc tcgatgagtt tttctaatga gggcccaaat gtaatcacct ggctcacctt 1560
cgggtgggcc tttctgcgtt gctggcgttt ttccataggc tccgcccccc tgacgagcat 1620
cacaaaaatc gatgctcaag tcagaggtgg cgaaacccga caggactata aagataccag 1680
gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga 1740
tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg 1800
tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga accccccgtt 1860
cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc ggtaagacac 1920
gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag gtatgtaggc 1980
ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag aacagtattt 2040
ggtatctgcg ctctgctgaa gccagttacc tcggaaaaag agttggtagc tcttgatccg 2100
gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 2160
gaaaaaaagg atctcaagaa gatcctttga ttttctaccg 2200
<210> 74
<211> 4531
<212> DNA
<213> Artificial Sequence
<220>
<223> pCKU plasmid
<400> 74
actcctccct gcaagacggt gagttcatct acaaagttaa actgcgtggt accaacttcc 60
cgtccgacgg tccggttatg cagaaaaaaa ccatgggttg ggaagcttcc accgaacgta 120
tgtacccgga agacggtgct ctgaaaggtg aaatcaaaat gcgtctgaaa ctgaaagacg 180
gtggtcacta cgacgctgaa gttaaaacca cctacatggc taaaaaaccg gttcagctgc 240
cgggtgctta caaaaccgac atcaaactgg acatcacctc ccacaacgaa gactacacca 300
tcgttgaaca gtacgaacgt gctgaaggtc gtcactccac cggtgcttaa taacgctgat 360
agtgctagtg tagatcgcta ctagagccag gcatcaaata aaacgaaagg ctcagtcgaa 420
agactgggcc tttcgtttta tctgttgttt gtcggtgaac gctctctact agagtggtct 480
catgagcgag acgtccggca tccgcttaca gacaagctgt gacaatctcc gggagctgca 540
tgtgtcagag gttttcaccg tcatcaccga aacgcgcgag attaaagggc ctcgtgatac 600
gcctattttt ataggttaat gtcatgataa taatggtttc ttagacggat cgcttgcctg 660
taacttacac gcgcctcgta tcttttaatg atggaataat ttgggaattt actctgtgtt 720
tatttatttt tatgttttgt atttggattt tagaaagtaa ataaagaagg tagaagagtt 780
acggaatgaa gaaaaaaaaa taaacaaagg tttaaaaaat ttcaacaaaa agcgtacttt 840
acatatatat ttattagaca agaaaagcag attaaataga tatacattcg attaacgata 900
agtaaaatgt aaaatcacag gattttcgtg tgtggtcttc tacacagaca agatgaaaca 960
attcggcatt aatacctgag agcaggaaga gcaagataaa aggtagtatt tgttggcgat 1020
ccccctagag tcttttacat cttcggaaaa caaaaactat tttttcttta atttcttttt 1080
ttactttcta tttttaattt atatatttat attaaaaaat ttaaattata attattttta 1140
tagcacgtga tgaaaaggac ccaggtggca ttgacttgat cggcacgtaa gaggttccaa 1200
ctttcaccat aatgaaataa gatcactacc gggcgtattt tttgagttat cgagattttc 1260
aggagctaag gaagctaaaa tgagccatat tcaacgggaa acgtcttgct cgaggccgcg 1320
attaaattcc aacatggatg ctgatttata tgggtataaa tgggctcgcg ataatgtcgg 1380
gcaatcaggt gcgacaatct atcgattgta tgggaagccc gatgcgccag agttgtttct 1440
gaaacatggc aaaggtagcg ttgccaatga tgttacagat gagatggtca ggctaaactg 1500
gctgacggaa tttatgcctc ttccgaccat caagcatttt atccgtactc ctgatgatgc 1560
atggttactc accactgcga tcccagggaa aacagcattc caggtattag aagaatatcc 1620
tgattcaggt gaaaatattg ttgatgcgct ggcagtgttc ctgcgccggt tgcattcgat 1680
tcctgtttgt aattgtcctt ttaacggcga tcgcgtattt cgtctcgcac aggcgcaatc 1740
acgaatgaat aacggtttgg ttggtgcgag tgattttgat gacgagcgta atggctggcc 1800
tgttgaacaa gtctggaaag aaatgcataa gcttttgcca ttctcaccgg attcagtcgt 1860
cactcatggt gatttctcac ttgataacct tatttttgac gaggggaaat taataggttg 1920
tattgatgtt ggacgagtcg gaatcgcaga ccgataccag gatcttgcca tcctatggaa 1980
ctgcctcggt gagttttctc cttcattaca gaaacggctt tttcaaaaat atggtattga 2040
taatcctgat atgaataaat tgcagtttca cttgatgctc gatgagtttt tctaatgagg 2100
gcccaaatgt aatcacctgg ctcaccttcg ggtgggcctt tctgcgttgc tggcgttttt 2160
ccataggctc cgcccccctg acgagcatca caaaaatcga tgctcaagtc agaggtggcg 2220
aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc 2280
tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt 2340
ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa 2400
gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta 2460
tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa 2520
caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa 2580
ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc cagttacctc 2640
ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt 2700
tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatt 2760
ttctaccgaa cgtttacaat ttcctgatgc ggtattttct ccttacgcat ctgtgcggta 2820
tttcacaccg catagggtaa taactgatat aattaaattg aagctctaat ttgtgagttt 2880
agtatacatg catttactta taatacagtt ttttagtttt gctggccgca tcttctcaaa 2940
tatgcttccc agcctgcttt tctgtaacgt tcaccctcta ccttagcatc ccttcccttt 3000
gcaaatagtc ctcttccaac aataataatg tcagatcctg tagacaccac atcatccacg 3060
gttctatact gttgacccaa tgcgtcaccc ttgtcatcta aacccacacc gggtgtcata 3120
atcaaccaat cgtaaccttc atctcttcca cccatgtctc tttgagcaat aaagccgata 3180
acaaaatctt tgtcgctctt cgcaatgtca acagtaccct tagtatattc tccagtagat 3240
agggagccct tgcatgacaa ttctgctaac atcaaaaggc ctctaggttc ctttgttact 3300
tcttctgccg cctgcttcaa accgctaaca atacctgggc ccaccacacc gtgtgcattc 3360
gtaatgtctg cccattctgc tattctgtat acacccgcag agtactgcaa tttgactgta 3420
ttaccaatgt cagcaaattt tctgtcttcg aagagtaaaa aattgtactt ggcggataat 3480
gcctttagcg gcttaactgt gccctccatg gaaaaatcag tcaagatatc cacatgtgtt 3540
tttagtaaac aaattttggg acctaatgct tcaactaact ccagtaattc cttggtggta 3600
cgaacatcca atgaagcaca caagtttgtt tgcttttcgt gcatgatatt aaatagcttg 3660
gcagcaacag gactaggatg agtagcagca cgttccttat atgtagcttt cgacatgatt 3720
tatcttcgtt tcctgcaggt ttttgttctg tgcagttggg ttaagaatac tgggcaattt 3780
catgtttctt caacactaca tatgcgtata tataccaatc taagtctgtg ctccttcctt 3840
cgttcttcct tctgttcgga gattaccgaa tcaaaaaaat ttcaaagaaa ccgaaatcaa 3900
aaaaaagaat aaaaaaaaaa tgatgaattg aattgaaaag ctgtggtatg gtgcactacg 3960
tctcgacctc gagaccgcaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta 4020
atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca acgcaattaa 4080
tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc cggctcgtat 4140
gttgtgtgga attgtgagcg gataacaatt tcacacatac tagagaaaga ggagaaatac 4200
tagatggctt cctccgaaga cgttatcaaa gagttcatgc gtttcaaagt tcgtatggaa 4260
ggttccgtta acggtcacga gttcgaaatc gaaggtgaag gtgaaggtcg tccgtacgaa 4320
ggtacccaga ccgctaaact gaaagttacc aaaggtggtc cgctgccgtt cgcttgggac 4380
atcctgtccc cgcagttcca gtacggttcc aaagcttacg ttaaacaccc ggctgacatc 4440
ccggactacc tgaaactgtc cttcccggaa ggtttcaaat gggaacgtgt tatgaacttc 4500
gaagacggtg gtgttgttac cgttacccag g 4531
<210> 75
<211> 6441
<212> DNA
<213> Artificial Sequence
<220>
<223> pGAU-YMRWδ15 plasmid
<400> 75
gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc 60
ccctgacgag catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact 120
ataaagatac caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct 180
gccgcttacc ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag 240
ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca 300
cgaacccccc gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa 360
cccggtaaga cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc 420
gaggtatgta ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag 480
aaggacagta tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg 540
tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca 600
gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc 660
tgacgctcag tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag 720
gatcttcacc tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata 780
tgagtaaact tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat 840
ctgtctattt cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg 900
ggagggctta ccatctggcc ccagtgctgc aatgataccg cggctcccac gctcaccggc 960
tccagattta tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc 1020
aactttatcc gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc 1080
gccagttaat agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc 1140
gtcgtttggt atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc 1200
ccccatgttg tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa 1260
gttggccgca gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat 1320
gccatccgta agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata 1380
gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca 1440
tagcagaact ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag 1500
gatcttaccg ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc 1560
agcatctttt actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc 1620
aaaaaaggga ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata 1680
ttattgaagc atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta 1740
gaaaaataaa caaatagggg ttccgcgcac atttccccga aaagtgccac ggtctctgtc 1800
atcaatcaaa gcaacccaca aatcctaggc tgaatcatga tatcgatgga agcaatcaac 1860
aattttatca agaccgcacc aaagcacgac tatctgacag gcggagttca tcattctggt 1920
aatgtagacg tgttacaatt aagcggcaat aaagaagatg gtagtttagt atggaaccat 1980
acttttgttg atgtagacaa caatgtggta gctaagtttg aagacgctct cgaaaaactt 2040
gaaagtttgc accggcgctc atcctcatcc acaggcaatg aagaacacgc taacgtttaa 2100
ccgaggggag tcacttcata atgatgtgag aaataagtga atattgtaat aattgttggg 2160
actccattgt caacaaaagc tataatgtag gtatacagta tatactagaa gttctcctcg 2220
aggatcttgg aatccacaaa agggagtcga taaatctata taataaaaat tactttatct 2280
tctttcgttt tatacgttgt cgtttattat cctattacgt tatcaatctt cgcatttcag 2340
ctttcattag atttgatgac tgtttctcaa actttatgtc attttcttac accgctctct 2400
acctggctcg aagcacgcta gtaacatcag ctaacgaaag agttagaggc tcgctaaatc 2460
gcactgtcgg ggtcccttgg gtattttaca ctagcgtcag gacgactagc atgtgtcttt 2520
ccttccaggg gtatgcgggt gcgtggacaa atgagcagca tacgtattta ctcggcgtgc 2580
ctgctctctc gtatttctcc tggagatcaa ggaaatgttt catgtccaag cgaaaagccg 2640
ctctacggaa tggatctacg ttactgcctg cataaggaaa ccggtgtagc caaggacgaa 2700
agcgacccta ggttctaacc atcgactttg gcggaaaggt ttcactcagg aagcagacac 2760
tgattgacac ggtttagcag aacgtttgag gactaggtca aattgagtgg tttaatatcg 2820
gcatgtctgg ctttaaaatt cagtatagtg cgctgatcgg aaacgaatta aaaacacgag 2880
ttcccaaaac caggcgggct cgccacgcta atcgggatgc ataccacagc ttttcaattc 2940
aattcatcat ttttttttta ttcttttttt tgatttcggt ttctttgaaa tttttttgat 3000
tcggtaatct ccgaacagaa ggaagaacga aggaaggagc acagacttag attggtatat 3060
atacgcatat gtagtgttga agaaacatga aattgcccag tattcttaac ccaactgcac 3120
agaacaaaaa cctgcaggaa acgaagataa atcatgtcga aagctacata taaggaacgt 3180
gctgctactc atcctagtcc tgttgctgcc aagctattta atatcatgca cgaaaagcaa 3240
acaaacttgt gtgcttcatt ggatgttcgt accaccaagg aattactgga gttagttgaa 3300
gcattaggtc ccaaaatttg tttactaaaa acacatgtgg atatcttgac tgatttttcc 3360
atggagggca cagttaagcc gctaaaggca ttatccgcca agtacaattt tttactcttc 3420
gaagacagaa aatttgctga cattggtaat acagtcaaat tgcagtactc tgcgggtgta 3480
tacagaatag cagaatgggc agacattacg aatgcacacg gtgtggtggg cccaggtatt 3540
gttagcggtt tgaagcaggc ggcagaagaa gtaacaaagg aacctagagg ccttttgatg 3600
ttagcagaat tgtcatgcaa gggctcccta tctactggag aatatactaa gggtactgtt 3660
gacattgcga agagcgacaa agattttgtt atcggcttta ttgctcaaag agacatgggt 3720
ggaagagatg aaggttacga ttggttgatt atgacacccg gtgtgggttt agatgacaag 3780
ggtgacgcat tgggtcaaca gtatagaacc gtggatgatg tggtgtctac aggatctgac 3840
attattattg ttggaagagg actatttgca aagggaaggg atgctaaggt agagggtgaa 3900
cgttacagaa aagcaggctg ggaagcatat ttgagaagat gcggccagca aaactaaaaa 3960
actgtattat aagtaaatgc atgtatacta aactcacaaa ttagagcttc aatttaatta 4020
tatcagttat taccctatgc ggtgtgaaat accgcacaga tgcgtaagga gaaaataccg 4080
catcaggtag ccgctagtaa catcagctaa cgaaagagtt agaggctcgc taaatcgcac 4140
tgtcggggtc ccttgggtat tttacactag cgtcaggacg actagcatgt gtctttcctt 4200
ccaggggtat gcgggtgcgt ggacaaatga gcagcatacg tatttactcg gcgtgcctgc 4260
tctctcgtat ttctcctgga gatcaaggaa atgtttcatg tccaagcgaa aagccgctct 4320
acggaatgga tctacgttac tgcctgcata aggaaaccgg tgtagccaag gacgaaagcg 4380
accctaggtt ctaaccatcg actttggcgg aaaggtttca ctcaggaagc agacactgat 4440
tgacacggtt tagcagaacg tttgaggact aggtcaaatt gagtggttta atatcggcat 4500
gtctggcttt aaaattcagt atagtgcgct gatcggaaac gaattaaaaa cacgagttcc 4560
caaaaccagg cgggctcgcc acgctaatcg gtgcaccacc tcaggcagag aacctagaga 4620
cggcaatacg caaaccgcct ctccccgcgc gttggccgat tcattaatgc agctggcacg 4680
acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc aattaatgtg agttagctca 4740
ctcattaggc accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg 4800
tgagcggata acaatttcac acatactaga gaaagaggag aaatactaga tggcttcctc 4860
cgaagacgtt atcaaagagt tcatgcgttt caaagttcgt atggaaggtt ccgttaacgg 4920
tcacgagttc gaaatcgaag gtgaaggtga aggtcgtccg tacgaaggta cccagaccgc 4980
taaactgaaa gttaccaaag gtggtccgct gccgttcgct tgggacatcc tgtccccgca 5040
gttccagtac ggttccaaag cttacgttaa acacccggct gacatcccgg actacctgaa 5100
actgtccttc ccggaaggtt tcaaatggga acgtgttatg aacttcgaag acggtggtgt 5160
tgttaccgtt acccaggact cctccctgca agacggtgag ttcatctaca aagttaaact 5220
gcgtggtacc aacttcccgt ccgacggtcc ggttatgcag aaaaaaacca tgggttggga 5280
agcttccacc gaacgtatgt acccggaaga cggtgctctg aaaggtgaaa tcaaaatgcg 5340
tctgaaactg aaagacggtg gtcactacga cgctgaagtt aaaaccacct acatggctaa 5400
aaaaccggtt cagctgccgg gtgcttacaa aaccgacatc aaactggaca tcacctccca 5460
caacgaagac tacaccatcg ttgaacagta cgaacgtgct gaaggtcgtc actccaccgg 5520
tgcttaataa cgctgatagt gctagtgtag atcgctacta gagccaggca tcaaataaaa 5580
cgaaaggctc agtcgaaaga ctgggccttt cgttttatct gttgtttgtc ggtgaacgct 5640
ctctactaga gtcacactgg ctccgtctca tgagcgctca tggaaaatgc aaccgataaa 5700
ccattataaa tcttcgcggt tatctggcat tgttattaac caaaaaaatg ccggcctatt 5760
acaagctact gttcaataaa tattgttgta atgaagacgg tccaactgta caaatacagc 5820
aaactgtcat atataaggag tcttatgtga cagcacttgc gttattgtca gccggagtat 5880
gtctttgtcg cattctgggc tttttacttt ctgctcagaa ggaagtacga acaagaaaaa 5940
aaaatcacca atgcttccct tttcagtatt agtttcatat ttgtttacgt tcaaactcgt 6000
cgtttgcgcg ataacctcta aaaaagtcaa ttacgtaact atatcaatca gagaatgcaa 6060
aaagcactat cataaaaatg tgtctagggg atgtgagaca tgtcaattat aagaagtgat 6120
ggtgtcatag tatatatatc ataaaagatt atcaaagttt caatcctttg tattttctag 6180
tttagcgcca acttttgaca aaacctaaac tttagataat catcattctt acaattttta 6240
tctggatggc aataatctcc tatataaagc ccagataaac tgtaaaaaga atccatcact 6300
atttgaaaaa aagtcatctg gcacgtttaa ttatcagagc agaaatgatg aagggtgtta 6360
gcgccgtcca ctgatgtgcc tggtagtcat gatttacgta taactaacac atcatgagga 6420
cggcggctcg gagagaccga t 6441
<210> 76
<211> 7606
<212> DNA
<213> Artificial Sequence
<220>
<223> pGAU-YMRWδ15-HO-BMC plasmid
<400> 76
tgagcgagac gtccggcatc cgcttacaga caagctgtga caatctccgg gagctgcatg 60
tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgagat taaagggcct cgtgatacgc 120
ctatttttat aggttaatgt catgataata atggtttctt agacggatcg cttgcctgta 180
acttacacgc gcctcgtatc ttttaatgat ggaataattt gggaatttac tctgtgttta 240
tttattttta tgttttgtat ttggatttta gaaagtaaat aaagaaggta gaagagttac 300
ggaatgaaga aaaaaaaata aacaaaggtt taaaaaattt caacaaaaag cgtactttac 360
atatatattt attagacaag aaaagcagat taaatagata tacattcgat taacgataag 420
taaaatgtaa aatcacagga ttttcgtgtg tggtcttcta cacagacaag atgaaacaat 480
tcggcattaa tacctgagag caggaagagc aagataaaag gtagtatttg ttggcgatcc 540
ccctagagtc ttttacatct tcggaaaaca aaaactattt tttctttaat ttcttttttt 600
actttctatt tttaatttat atatttatat taaaaaattt aaattataat tatttttata 660
gcacgtgatg aaaaggaccc aggtggcatt gacttgatcg gcacgtaaga ggttccaact 720
ttcaccataa tgaaataaga tcactaccgg gcgtattttt tgagttatcg agattttcag 780
gagctaagga agctaaaatg agccatattc aacgggaaac gtcttgctcg aggccgcgat 840
taaattccaa catggatgct gatttatatg ggtataaatg ggctcgcgat aatgtcgggc 900
aatcaggtgc gacaatctat cgattgtatg ggaagcccga tgcgccagag ttgtttctga 960
aacatggcaa aggtagcgtt gccaatgatg ttacagatga gatggtcagg ctaaactggc 1020
tgacggaatt tatgcctctt ccgaccatca agcattttat ccgtactcct gatgatgcat 1080
ggttactcac cactgcgatc ccagggaaaa cagcattcca ggtattagaa gaatatcctg 1140
attcaggtga aaatattgtt gatgcgctgg cagtgttcct gcgccggttg cattcgattc 1200
ctgtttgtaa ttgtcctttt aacggcgatc gcgtatttcg tctcgcacag gcgcaatcac 1260
gaatgaataa cggtttggtt ggtgcgagtg attttgatga cgagcgtaat ggctggcctg 1320
ttgaacaagt ctggaaagaa atgcataagc ttttgccatt ctcaccggat tcagtcgtca 1380
ctcatggtga tttctcactt gataacctta tttttgacga ggggaaatta ataggttgta 1440
ttgatgttgg acgagtcgga atcgcagacc gataccagga tcttgccatc ctatggaact 1500
gcctcggtga gttttctcct tcattacaga aacggctttt tcaaaaatat ggtattgata 1560
atcctgatat gaataaattg cagtttcact tgatgctcga tgagtttttc taatgagggc 1620
ccaaatgtaa tcacctggct caccttcggg tgggcctttc tgcgttgctg gcgtttttcc 1680
ataggctccg cccccctgac gagcatcaca aaaatcgatg ctcaagtcag aggtggcgaa 1740
acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc 1800
ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 1860
cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 1920
tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 1980
gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca 2040
ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 2100
acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca gttacctcgg 2160
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 2220
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatttt 2280
ctaccgaacg tttacaattt cctgatgcgg tattttctcc ttacgcatct gtgcggtatt 2340
tcacaccgca tagggtaata actgatataa ttaaattgaa gctctaattt gtgagtttag 2400
tatacatgca tttacttata atacagtttt ttagttttgc tggccgcatc ttctcaaata 2460
tgcttcccag cctgcttttc tgtaacgttc accctctacc ttagcatccc ttccctttgc 2520
aaatagtcct cttccaacaa taataatgtc agatcctgta gacaccacat catccacggt 2580
tctatactgt tgacccaatg cgtcaccctt gtcatctaaa cccacaccgg gtgtcataat 2640
caaccaatcg taaccttcat ctcttccacc catgtctctt tgagcaataa agccgataac 2700
aaaatctttg tcgctcttcg caatgtcaac agtaccctta gtatattctc cagtagatag 2760
ggagcccttg catgacaatt ctgctaacat caaaaggcct ctaggttcct ttgttacttc 2820
ttctgccgcc tgcttcaaac cgctaacaat acctgggccc accacaccgt gtgcattcgt 2880
aatgtctgcc cattctgcta ttctgtatac acccgcagag tactgcaatt tgactgtatt 2940
accaatgtca gcaaattttc tgtcttcgaa gagtaaaaaa ttgtacttgg cggataatgc 3000
ctttagcggc ttaactgtgc cctccatgga aaaatcagtc aagatatcca catgtgtttt 3060
tagtaaacaa attttgggac ctaatgcttc aactaactcc agtaattcct tggtggtacg 3120
aacatccaat gaagcacaca agtttgtttg cttttcgtgc atgatattaa atagcttggc 3180
agcaacagga ctaggatgag tagcagcacg ttccttatat gtagctttcg acatgattta 3240
tcttcgtttc ctgcaggttt ttgttctgtg cagttgggtt aagaatactg ggcaatttca 3300
tgtttcttca acactacata tgcgtatata taccaatcta agtctgtgct ccttccttcg 3360
ttcttccttc tgttcggaga ttaccgaatc aaaaaaattt caaagaaacc gaaatcaaaa 3420
aaaagaataa aaaaaaaatg atgaattgaa ttgaaaagct gtggtatggt gcactacgtc 3480
tcgacctggc tacagtttat tcctggcatc cactaaatat aatggagccc gctttttaag 3540
ctggcatcca gaaaaaaaaa gaatcccagc accaaaatat tgttttcttc accaaccatc 3600
agttcatagg tccattctct tagcgcaact acagagaaca ggggcacaaa caggcaaaaa 3660
acgggcacaa cctcaatgga gtgatgcaac ctgcctggag taaatgatga cacaaggcaa 3720
ttgacccacg catgtatcta tctcattttc ttacaccttc tattaccttc tgctctctct 3780
gatttggaaa aagctgaaaa aaaaggttga aaccagttcc ctgaaattat tcccctactt 3840
gactaataag tatataaaga cggtaggtat tgattgtaat tctgtaaatc tatttcttaa 3900
acttcttaaa ttctactttt atagttagtc ttttttttag ttttaaaaca ccaagaactt 3960
agtttcgaat aaacacacat aaacaaacaa agatggctga tgctttgggt atgattgaag 4020
ttagaggttt cgttggtatg gttgaagctg ctgatgctat ggttaaggct gctaaagttg 4080
aattgatcgg ttacgaaaaa actggtggtg gttatgttac tgctgttgtt agaggtgatg 4140
ttgctgctgt aaaagctgct actgaagctg gtcaaagggc tgctgaaaga gttggagaag 4200
ttgttgctgt tcatgttatt ccaagaccac atgttaatgt tgatgctgct ttgccattgg 4260
gtagaactcc aggtatggat aagtctgctt agcgcggatt gagagcaaat cgttaagttc 4320
aggtcaagta aaaattgatt tcgaaaacta atttctctta tacaatcctt tgattggacc 4380
gtcatccttt cgaatataag attttgttaa gaatatttta gacagagatc tactttatat 4440
ttaatatcta gatattacat aatttcctct ctaataaaat atcattaata aaataaaaat 4500
gaagcgattt gattttgtgt tgtcaactta gtttgccgct atgcctcttg ggtaatgcta 4560
ttattgaatc gaagggcttt attatattac cctttagctt attctgaggt ttctgtggcg 4620
tgcaaagtga tgaaccgggc gggttttaag gataaaatca aaaagtgaaa aaatgaacgg 4680
aaaatggaat acctgtgaaa tggagaatga taatgaatct ttctgtcgtg cttgaaagat 4740
tttcggctcc tcaggcggct attaaaaaaa caacttacaa tcattgttcg ccccttccat 4800
acttactgcc actcgcaaaa gggcccaacc agggcaatta cgtatcaaaa aatcatgaca 4860
ggctgggtaa taaatattcg tgaagaaaga agaaattaaa aaaagaaacg aagaagcaaa 4920
aaaaagaaaa gactccgttt aatcactttc aaccgcggtt tatccggccc cacccatgca 4980
taaccctaaa ttattagatc acttagcacg tgaaaaagaa acgtttttaa tgtttttttt 5040
ttttttttct ttttcttttt ttgcgttggt gaaaattttt tcgcttcctc gagtataatt 5100
atctcatctc atctttcata taagataaga agttttataa aaaccttttg catcaaaatt 5160
ttgtagaata tctctttttc ttacgctctc tttctttcct taattgtttt ctaaagaacc 5220
gtgtattttt ctagttcgaa tccatcgata acattaaaag gatggatcat gctccagaaa 5280
gatttgatgc tactcctcca gctggtgaac cagatagacc agctttgggt gttttggaat 5340
tgacttctat tgctagaggt attaccgttg ctgatgctgc tttgaaaaga gcaccatctt 5400
tgttgttgat gtccagacca gtttcttccg gtaaacattt gttgatgatg agaggtcaag 5460
ttgccgaagt tgaagaatct atgattgctg ctagagaaat tgctggtgct ggttctggtg 5520
ctttgttgga tgaattggaa ttgccatatg ctcacgaaca actttggaga tttttggatg 5580
ctccagttgt tgcagatgct tgggaagaag atactgaatc cgttattatc gttgaaaccg 5640
ctactgtttg tgctgctatt gattctgctg atgcagcctt aaaaactgct cctgttgttt 5700
tgagagatat gagattggct attggtattg ctggtaaggc tttctttact ttgactggtg 5760
aattggctga tgttgaagct gctgctgaag ttgttagaga aagatgtggt gctagattgc 5820
tagaattggc ttgtattgca agaccagttg acgaattgag aggtaggttg tttttctagc 5880
acacttctcg attaacaaat tcccagtatt ctttgaaatc tatttttctt cctcaattga 5940
atttgaataa ctgtctacgc ggactcctcc tatctacaac tacaacaaat tttaaccact 6000
ttattaccac tttcctcttt catttatttt tgtcttttat gttgtcaatt tactagtatt 6060
tttttttttt tcatttacgt tcaaggtttt ttatactcat ttaacttgtc ttaggttatt 6120
tatatatata cctatatatt tatatatata tatatatatg tatgtatata ttattatcac 6180
caaatgagaa ataatagcta atttgatttt tgattattta aaatattggt ttgttctttc 6240
tgcaaacatc tcgtttggta cgatattagt gaaaaacgat gtaattatca acacgtgcat 6300
tacccacctc tgccggctac agattgggag attttcatag tagaattcag catgatagct 6360
acgtaaatgt gttccgcacc gtcacaaagt gttttctact gttctttctt ctttcgttca 6420
ttcagttgag ttgagtgagt gctttgttca atggatctta gctaaaatgc atattttttc 6480
tcttggtaaa tgaatgcttg tgatgtcttc caagtgattt cctttccttc ccatatgatg 6540
ctaggtacct ttagtgtctt cctaaaaaaa aaaaaaggct cgccatcaaa acgatattcg 6600
ttggcttttt tttctgaatt ataaatactc tttggtaact tttcatttcc aagaacctct 6660
tttttccagt tatatcatgg tcccctttca aagttattct ctactctttt tcatattcat 6720
tctttttcat cctttggttt tttattctta acttgtttat tattctctct tgtttctatt 6780
tacaagacac caatcaaaac aaataaaaca tcatcacaga tggttttagg taaagttgtc 6840
ggtactgttg ttgcatcaag aaaggaacca agaattgaag gtttatcttt attattggtt 6900
agagcttgtg atccagatgg tactccaact ggtggtgctg ttgtttgtgc tgatgctgtt 6960
ggtgctggtg ttggtgaagt tgttttatat gcttctggtt cttctgctag acaaactgaa 7020
gttactaata atagaccagt tgatgctact attatggcta ttgttgattt ggttgaaatg 7080
ggtggtgatg ttagatttag aaaagatggt tcttcttggt cacatccaca atttgaaaag 7140
tagcaactaa gctggttcta actggaaata atttccatta gattcctctt tttctcgtcc 7200
attaaccaaa atatattatt gaattcagcg gttccttttt tctcattttc gcatatagct 7260
gcactattag aatcagccca ctctaggtaa acacagttcc tcgatatacc tctgtcttac 7320
tatcagtggt taaaccttat gcaaatataa tatatatata tatatatata tatatatctc 7380
atacttttgt tgattcttgt gtaattattg gaaaagacaa aacaaagcaa gcgtttctat 7440
tcatatttac aagtattttt tatgacaaac tatttcttaa ttttcccacc ggcggctttg 7500
aataaggcaa tgtcattgtc ctgcataata tattgtttgc ctgcacgttt gataagtccc 7560
ttagatttta gtaaagactc atttagcggt ggttccatct tccctc 7606
<210> 77
<211> 122
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter Pcon3 nucleotide sequence
<400> 77
ggctggcttc ccaaccttac cagagggcgc cccagctggc aattccgacg tcctgacagc 60
tagctcagtc ctaggtataa tgctagcgaa ttcaaaagat cttttaagaa ggagatatac 120
at 122
<210> 78
<211> 122
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter Pcon4 nucleotide sequence
<400> 78
ggctggcttc ccaaccttac cagagggcgc cccagctggc aattccgacg tctttacggc 60
tagctcagtc ctaggtacta tgctagcgaa ttcaaaagat cttttaagaa ggagatatac 120
at 122
<210> 79
<211> 154
<212> DNA
<213> Artificial Sequence
<220>
<223> Terminator TT7 nucleotide sequence
<400> 79
tagctaacaa agcccgaaag gaagctgagt tggctgctgc caccgctgag caataactag 60
cataacccct tggggcctct aaacgggtct tgaggggttt tttgctgaaa ggaggaacta 120
tatccggata tcccgcaaga ggcccggcag tacc 154
<210> 80
<211> 459
<212> DNA
<213> Artificial Sequence
<220>
<223> Yeast transcriptional terminator TRPL41B
<400> 80
tagcgcggat tgagagcaaa tcgttaagtt caggtcaagt aaaaattgat ttcgaaaact 60
aatttctctt atacaatcct ttgattggac cgtcatcctt tcgaatataa gattttgtta 120
agaatatttt agacagagat ctactttata tttaatatct agatattaca taatttcctc 180
tctaataaaa tatcattaat aaaataaaaa tgaagcgatt tgattttgtg ttgtcaactt 240
agtttgccgc tatgcctctt gggtaatgct attattgaat cgaagggctt tattatatta 300
ccctttagct tattctgagg tttctgtggc gtgcaaagtg atgaaccggg cgggttttaa 360
ggataaaatc aaaaagtgaa aaaatgaacg gaaaatggaa tacctgtgaa atggagaatg 420
ataatgaatc tttctgtcgt gcttgaaaga ttttcggct 459
<210> 81
<211> 430
<212> DNA
<213> Artificial Sequence
<220>
<223> Yeast transcriptional terminator THBT1
<400> 81
tagcacactt ctcgattaac aaattcccag tattctttga aatctatttt tcttcctcaa 60
ttgaatttga ataactgtct acgcggactc ctcctatcta caactacaac aaattttaac 120
cactttatta ccactttcct ctttcattta tttttgtctt ttatgttgtc aatttactag 180
tatttttttt tttttcattt acgttcaagg ttttttatac tcatttaact tgtcttaggt 240
tatttatata tatacctata tatttatata tatatatata tatgtatgta tatattatta 300
tcaccaaatg agaaataata gctaatttga tttttgatta tttaaaatat tggtttgttc 360
tttctgcaaa catctcgttt ggtacgatat tagtgaaaaa cgatgtaatt atcaacacgt 420
gcattaccca 430
<210> 82
<211> 458
<212> DNA
<213> Artificial Sequence
<220>
<223> Yeast transcriptional terminator TRPS20
<400> 82
tagcaactaa gctggttcta actggaaata atttccatta gattcctctt tttctcgtcc 60
attaaccaaa atatattatt gaattcagcg gttccttttt tctcattttc gcatatagct 120
gcactattag aatcagccca ctctaggtaa acacagttcc tcgatatacc tctgtcttac 180
tatcagtggt taaaccttat gcaaatataa tatatatata tatatatata tctcatactt 240
ttgttgattc ttgtgtaatt attggaaaag acaaaacaaa gcaagcgttt ctattcatca 300
tatttacaag tatttttatg aaaaactatt tcttaatttt cccaccggcg gctttgaata 360
aggcaatgtc attgtcctgc ataatatatt gtttgcctgc acgtttgata agtcccttag 420
attttagtaa agactcattt agcggtggtt ccatcttc 458
<210> 83
<211> 1864
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-Pcon2 plasmid
<400> 83
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctggcttcc caaccttacc agagggcgcc ccagctggca attccgacgt 120
cttgacggct agctcagtcc taggtacagt gctagcgaat tcaaaagatc ttttaagaag 180
gagatataca tgatgcgaga cgactgacca tttaaatcat acctgacctc catagcagaa 240
agtcaaaagc ctccgaccgg aggcttttga cttgatcggc acgtaagagg ttccaacttt 300
caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga 360
gctaaggaag ctaaaatgag ccatattcaa cgggaaacgt cttgctcgag gccgcgatta 420
aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 480
tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt gtttctgaaa 540
catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcaggct aaactggctg 600
acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 660
ttactcacca ctgcgatccc agggaaaaca gcattccagg tattagaaga atatcctgat 720
tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 780
gtttgtaatt gtccttttaa cggcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 840
atgaataacg gtttggttgg tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 900
gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc agtcgtcact 960
catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 1020
gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 1080
ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 1140
cctgatatga ataaattgca gtttcacttg atgctcgatg agtttttcta atgagggccc 1200
aaatgtaatc acctggctca ccttcgggtg ggcctttctg cgttgctggc gtttttccat 1260
aggctccgcc cccctgacga gcatcacaaa aatcgatgct caagtcagag gtggcgaaac 1320
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1380
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1440
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1500
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1560
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1620
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1680
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt tacctcggaa 1740
aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg 1800
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgattttct 1860
accg 1864
<210> 84
<211> 122
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter Pcon2 nucleotide sequence
<400> 84
ggctggcttc ccaaccttac cagagggcgc cccagctggc aattccgacg tcttgacggc 60
tagctcagtc ctaggtacag tgctagcgaa ttcaaaagat cttttaagaa ggagatatac 120
at 122
<210> 85
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_chc_F' forward primer
<400> 85
gatcctttga ttttctaccg 20
<210> 86
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_chc_R' reverse primer
<400> 86
ctcgataact caaaaaatac g 21
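A reverse primer is listed 5' to 3' on the strand opposite the template, so its binding site in the plasmid entry is the primer's reverse complement. The Python sketch below is illustrative only and is not part of the filed listing. It locates the site of the HcKan_chc_R' primer (SEQ ID NO: 86) within a 60-base window copied from SEQ ID NO: 83 (the row ending at position 360). The names `reverse_complement`, `PLASMID_WINDOW`, and `PRIMER_R` are assumptions introduced here.

```python
# Illustrative check, not part of the filed sequence listing.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    cleaned = "".join(dna.split()).lower()
    return cleaned.translate(COMPLEMENT)[::-1]


# Positions 301-360 of SEQ ID NO: 83 (HcKan_P-Pcon2 plasmid), spaces removed.
PLASMID_WINDOW = (
    "caccataatg aaataagatc actaccgggc gtattttttg agttatcgag attttcagga"
).replace(" ", "")

# SEQ ID NO: 86, HcKan_chc_R' reverse primer, written 5' to 3'.
PRIMER_R = "ctcgataact caaaaaatac g"

site = reverse_complement(PRIMER_R)
offset = PLASMID_WINDOW.find(site)  # 0-based offset within the window, -1 if absent

if offset >= 0:
    start = 301 + offset            # 1-based coordinate in SEQ ID NO: 83
    end = start + len(site) - 1
    print(f"binding site {site} at positions {start}-{end}")
else:
    print("primer site not found in this window")
```

The same lookup can be run against the full plasmid sequence instead of a window. The forward primer (SEQ ID NO: 85) would be searched for directly, without reverse complementing.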
<210> 87
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> pES_Chc_F' forward primer
<400> 87
cggagcctat ggaaaaacgc 20
<210> 88
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> pES_Chc_R' reverse primer
<400> 88
ccgcagtgtc ttgggtctct 20
<210> 89
<211> 23
<212> DNA
<213> Artificial Sequence
<220>
<223> His_chc_F' forward primer
<400> 89
tagagtgtac tagaggaggc caa 23
<210> 90
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> CEN_chc_R' reverse primer
<400> 90
ggtgatgacg gtgaaaacct 20
<210> 91
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Ura_chc_F' forward primer
<400> 91
tctgttcgga gattaccgaa tcaa 24
<210> 92
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> pGau_chc_F' forward primer
<400> 92
ccacctcagg cagagaacct 20
<210> 93
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> pGau_chc_R' reverse primer
<400> 93
ggaaaaacgc cagcaacgc 19
<210> 94
<211> 30
<212> PRT
<213> Artificial Sequence
<220>
<223> S2CP(30) amino acid sequence
<400> 94
Lys Pro Glu Lys Pro Gly Ser Lys Ile Thr Gly Ser Ser Gly Asn Asp
1 5 10 15
Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly Gly Ala Arg Gly
20 25 30
<210> 95
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> S2CP(30) nucleotide sequence
<400> 95
aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac ccagggtagc 60
ctgattacct atagcggtgg tgcacgtggt 90
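The nucleotide entry SEQ ID NO: 95 can be cross-checked against the amino acid entry SEQ ID NO: 94 by translating it in reading frame 1 with the standard genetic code. The Python sketch below is illustrative only and is not part of the filed listing. The helper name `translate` and the one-letter rendering of SEQ ID NO: 94 are assumptions introduced here.

```python
# Illustrative check, not part of the filed sequence listing.
# Standard genetic code, codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace is ignored."""
    cleaned = "".join(dna.split()).lower()
    if len(cleaned) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cleaned[i:i + 3]] for i in range(0, len(cleaned), 3))


# SEQ ID NO: 95, S2CP(30) nucleotide sequence, copied from the listing.
SEQ_95 = (
    "aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac ccagggtagc "
    "ctgattacct atagcggtgg tgcacgtggt"
)

# SEQ ID NO: 94, S2CP(30) amino acid sequence, in one-letter code.
SEQ_94 = "KPEKPGSKITGSSGNDTQGSLITYSGGARG"

protein = translate(SEQ_95)
print(len(protein), protein == SEQ_94)  # prints: 30 True
```

The same function can be applied to any other nucleotide/protein pair in the listing after converting the three-letter residues to one-letter code.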
<210> 96
<211> 229
<212> PRT
<213> Artificial Sequence
<220>
<223> HO-T1-SpyTag amino acid sequence
<400> 96
Met Asp His Ala Pro Glu Arg Phe Asp Ala Thr Pro Pro Ala Gly Glu
1 5 10 15
Pro Asp Arg Pro Ala Leu Gly Val Leu Glu Leu Thr Ser Ile Ala Arg
20 25 30
Gly Ile Thr Val Ala Asp Ala Ala Leu Lys Arg Ala Pro Ser Leu Leu
35 40 45
Leu Met Ser Arg Pro Val Ser Ser Gly Lys His Leu Leu Met Met Arg
50 55 60
Gly Gln Val Ala Glu Val Glu Glu Ser Met Ile Ala Ala Arg Glu Ile
65 70 75 80
Ala Gly Ala Gly Gly Gly Ser Gly Gly Ser Ala His Ile Val Met Val
85 90 95
Asp Ala Tyr Lys Pro Thr Lys Gly Gly Ser Gly Gly Ser Gly Ala Leu
100 105 110
Leu Asp Glu Leu Glu Leu Pro Tyr Ala His Glu Gln Leu Trp Arg Phe
115 120 125
Leu Asp Ala Pro Val Val Ala Asp Ala Trp Glu Glu Asp Thr Glu Ser
130 135 140
Val Ile Ile Val Glu Thr Ala Thr Val Cys Ala Ala Ile Asp Ser Ala
145 150 155 160
Asp Ala Ala Leu Lys Thr Ala Pro Val Val Leu Arg Asp Met Arg Leu
165 170 175
Ala Ile Gly Ile Ala Gly Lys Ala Phe Phe Thr Leu Thr Gly Glu Leu
180 185 190
Ala Asp Val Glu Ala Ala Ala Glu Val Val Arg Glu Arg Cys Gly Ala
195 200 205
Arg Leu Leu Glu Leu Ala Cys Ile Ala Arg Pro Val Asp Glu Leu Arg
210 215 220
Gly Arg Leu Phe Phe
225
<210> 97
<211> 687
<212> DNA
<213> Artificial Sequence
<220>
<223> HO-T1-SpyTag nucleotide sequence
<400> 97
atggatcatg ctccagaaag atttgatgct actcctccag ctggtgaacc agatagacca 60
gctttgggtg ttttggaatt gacttctatt gctagaggta ttaccgttgc tgatgctgct 120
ttgaaaagag caccatcttt gttgttgatg tccagaccag tttcttccgg taaacatttg 180
ttgatgatga gaggtcaagt tgccgaagtt gaagaatcta tgattgctgc tagagaaatt 240
gctggtgctg gtggtggttc aggtggttct gctcatatag ttatggttga tgcttacaag 300
ccaacaaaag gtggtagtgg tggatctggt gctttgttgg atgaattgga attgccatat 360
gctcacgaac aactttggag atttttggat gctccagttg ttgcagatgc ttgggaagaa 420
gatactgaat ccgttattat cgttgaaacc gctactgttt gtgctgctat tgattctgct 480
gatgcagcct taaaaactgc tcctgttgtt ttgagagata tgagattggc tattggtatt 540
gctggtaagg ctttctttac tttgactggt gaattggctg atgttgaagc tgctgctgaa 600
gttgttagag aaagatgtgg tgctagattg ctagaattgg catgtattgc aagaccagtt 660
gacgaattga gaggtaggtt gtttttc 687
<210> 98
<211> 339
<212> PRT
<213> Artificial Sequence
<220>
<223> GFP-SpyCatcher amino acid sequence
<400> 98
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Ser Lys
1 5 10 15
Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp
20 25 30
Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly
35 40 45
Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly
50 55 60
Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly
65 70 75 80
Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln His Asp Phe
85 90 95
Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe
100 105 110
Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu
115 120 125
Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys
130 135 140
Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser
145 150 155 160
His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val
165 170 175
Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala
180 185 190
Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu
195 200 205
Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Lys Leu Ser Lys Asp Pro
210 215 220
Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala
225 230 235 240
Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Gly Ser Gly Gly Ser
245 250 255
Asp Ser Ala Thr His Ile Lys Phe Ser Lys Arg Asp Glu Asp Gly Lys
260 265 270
Glu Leu Ala Gly Ala Thr Met Glu Leu Arg Asp Ser Ser Gly Lys Thr
275 280 285
Ile Ser Thr Trp Ile Ser Asp Gly Gln Val Lys Asp Phe Tyr Leu Tyr
290 295 300
Pro Gly Lys Tyr Thr Phe Val Glu Thr Ala Ala Pro Asp Gly Tyr Glu
305 310 315 320
Val Ala Thr Ala Ile Thr Phe Thr Val Asn Glu Gln Gly Gln Val Thr
325 330 335
Val Asn Gly
<210> 99
<211> 1017
<212> DNA
<213> Artificial Sequence
<220>
<223> GFP-SpyCatcher nucleotide sequence
<400> 99
atgggttctt ctcatcatca ccatcaccat tcttctggga tgtctaaagg tgaagaatta 60
ttcactggtg ttgtcccaat tttggttgaa ttagatggtg atgttaatgg tcacaaattt 120
tctgtctccg gtgaaggtga aggtgatgct acttacggta aattgacctt aaaatttatt 180
tgtactactg gtaaattgcc agttccatgg ccaaccttag tcactacttt aacttatggt 240
gttcaatgtt tttctagata cccagatcat atgaaacaac atgacttttt caagtctgcc 300
atgccagaag gttatgttca agaaagaact atttttttca aagatgacgg taactacaag 360
accagagctg aagtcaagtt tgaaggtgat accttagtta atagaatcga attaaaaggt 420
attgatttta aagaagatgg taacatttta ggtcacaaat tggaatacaa ctataactct 480
cacaatgttt acatcatggc tgacaaacaa aagaatggta tcaaagttaa cttcaaaatt 540
agacacaaca ttgaagatgg ttctgttcaa ttagctgacc attatcaaca aaatactcca 600
attggtgatg gtccagtctt gttaccagac aaccattact tatccactca atctaaatta 660
tccaaagatc caaacgaaaa gagagatcac atggtcttgt tagaatttgt tactgctgct 720
ggtattaccc atggtatgga tgaattgtac aaaggttctg gtggttctga ttctgctact 780
catattaagt tctccaagag ggacgaagat ggtaaagaat tggctggtgc aactatggaa 840
ttgagagatt cttctggtaa gaccatttcc acctggattt ctgatggtca agttaaggat 900
ttctacttgt acccaggtaa gtacactttc gttgaaactg ctgctccaga tggttatgaa 960
gttgctactg ctattacttt caccgtcaat gaacaaggtc aagtcactgt taatggt 1017
<210> 100
<211> 296
<212> PRT
<213> Artificial Sequence
<220>
<223> APEX2-S2CP(30) amino acid sequence
<400> 100
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Gly Lys
1 5 10 15
Ser Tyr Pro Thr Val Ser Ala Asp Tyr Gln Asp Ala Val Glu Lys Ala
20 25 30
Lys Lys Lys Leu Arg Gly Phe Ile Ala Glu Lys Arg Cys Ala Pro Leu
35 40 45
Met Leu Arg Leu Ala Phe His Ser Ala Gly Thr Phe Asp Lys Gly Thr
50 55 60
Lys Thr Gly Gly Pro Phe Gly Thr Ile Lys His Pro Ala Glu Leu Ala
65 70 75 80
His Ser Ala Asn Asn Gly Leu Asp Ile Ala Val Arg Leu Leu Glu Pro
85 90 95
Leu Lys Ala Glu Phe Pro Ile Leu Ser Tyr Ala Asp Phe Tyr Gln Leu
100 105 110
Ala Gly Val Val Ala Val Glu Val Thr Gly Gly Pro Lys Val Pro Phe
115 120 125
His Pro Gly Arg Glu Asp Lys Pro Glu Pro Pro Pro Glu Gly Arg Leu
130 135 140
Pro Asp Pro Thr Lys Gly Ser Asp His Leu Arg Asp Val Phe Gly Lys
145 150 155 160
Ala Met Gly Leu Thr Asp Gln Asp Ile Val Ala Leu Ser Gly Gly His
165 170 175
Thr Ile Gly Ala Ala His Lys Glu Arg Ser Gly Phe Glu Gly Pro Trp
180 185 190
Thr Ser Asn Pro Leu Ile Phe Asp Asn Ser Tyr Phe Thr Glu Leu Leu
195 200 205
Ser Gly Glu Lys Glu Gly Leu Leu Gln Leu Pro Ser Asp Lys Ala Leu
210 215 220
Leu Ser Asp Pro Val Phe Arg Pro Leu Val Asp Lys Tyr Ala Ala Asp
225 230 235 240
Glu Asp Ala Phe Phe Ala Asp Tyr Ala Glu Ala His Gln Lys Leu Ser
245 250 255
Glu Leu Gly Phe Ala Asp Ala Gly Ser Ser Lys Pro Glu Lys Pro Gly
260 265 270
Ser Lys Ile Thr Gly Ser Ser Gly Asn Asp Thr Gln Gly Ser Leu Ile
275 280 285
Thr Tyr Ser Gly Gly Ala Arg Gly
290 295
<210> 101
<211> 888
<212> DNA
<213> Artificial Sequence
<220>
<223> APEX2-S2CP(30) nucleotide sequence
<400> 101
atgggttctt ctcatcatca ccatcaccat tcttctggga tgggtaagtc ttacccaact 60
gtttctgctg attatcaaga tgctgttgaa aaggccaaga agaagttgag aggtttcatt 120
gctgaaaaaa gatgcgctcc attgatgttg agattggctt ttcattctgc tggtactttc 180
gataagggta caaaaactgg tggtccattc ggtactatca aacatccagc tgaattggct 240
cattcagcta acaatggttt ggatattgct gtcagattgc tggaaccatt gaaagccgaa 300
tttccaattt tgtcctacgc cgatttttac caattggctg gtgttgttgc agttgaagtt 360
acaggtggtc caaaagttcc atttcatcca ggtagagaag ataagccaga accaccacca 420
gaaggtagat tgccagatcc aacaaaaggt tctgatcact tgagagatgt tttcggtaaa 480
gctatgggtt tgactgatca agatattgtc gctttgtctg gtggtcatac aattggtgct 540
gctcacaaag aaagatcagg ttttgaaggt ccttggactt ctaacccatt gatctttgat 600
aactcttact tcaccgagtt gttgtccggt gaaaaagaag gtttgttgca attgccatct 660
gataaggctt tgttgtctga tccagttttc agaccattgg ttgataagta tgctgctgat 720
gaagatgctt tctttgctga ttacgctgaa gctcatcaaa agttgtctga attgggtttt 780
gctgatgctg gttcttctaa accggaaaaa ccaggtagca aaattaccgg tagcagcggc 840
aatgataccc agggtagcct gattacctat agcggtggtg cacgtggt 888
<210> 102
<211> 1070
<212> PRT
<213> Artificial Sequence
<220>
<223> LacZ-S2CP(30) amino acid sequence
<400> 102
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Thr Met
1 5 10 15
Ile Thr Asp Ser Leu Ala Val Val Leu Gln Arg Arg Asp Trp Glu Asn
20 25 30
Pro Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His Pro Pro Phe Ala
35 40 45
Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg Pro Ser Gln Gln
50 55 60
Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp Phe Pro Ala Pro
65 70 75 80
Glu Ala Val Pro Glu Ser Trp Leu Glu Cys Asp Leu Pro Glu Ala Asp
85 90 95
Thr Val Val Val Pro Ser Asn Trp Gln Met His Gly Tyr Asp Ala Pro
100 105 110
Ile Tyr Thr Asn Val Thr Tyr Pro Ile Thr Val Asn Pro Pro Phe Val
115 120 125
Pro Thr Glu Asn Pro Thr Gly Cys Tyr Ser Leu Thr Phe Asn Val Asp
130 135 140
Glu Ser Trp Leu Gln Glu Gly Gln Thr Arg Ile Ile Phe Asp Gly Val
145 150 155 160
Asn Ser Ala Phe His Leu Trp Cys Asn Gly Arg Trp Val Gly Tyr Gly
165 170 175
Gln Asp Ser Arg Leu Pro Ser Glu Phe Asp Leu Ser Ala Phe Leu Arg
180 185 190
Ala Gly Glu Asn Arg Leu Ala Val Met Val Leu Arg Trp Ser Asp Gly
195 200 205
Ser Tyr Leu Glu Asp Gln Asp Met Trp Arg Met Ser Gly Ile Phe Arg
210 215 220
Asp Val Ser Leu Leu His Lys Pro Thr Thr Gln Ile Ser Asp Phe His
225 230 235 240
Val Ala Thr Arg Phe Asn Asp Asp Phe Ser Arg Ala Val Leu Glu Ala
245 250 255
Glu Val Gln Met Cys Gly Glu Leu Arg Asp Tyr Leu Arg Val Thr Val
260 265 270
Ser Leu Trp Gln Gly Glu Thr Gln Val Ala Ser Gly Thr Ala Pro Phe
275 280 285
Gly Gly Glu Ile Ile Asp Glu Arg Gly Gly Tyr Ala Asp Arg Val Thr
290 295 300
Leu Arg Leu Asn Val Glu Asn Pro Lys Leu Trp Ser Ala Glu Ile Pro
305 310 315 320
Asn Leu Tyr Arg Ala Val Val Glu Leu His Thr Ala Asp Gly Thr Leu
325 330 335
Ile Glu Ala Glu Ala Cys Asp Val Gly Phe Arg Glu Val Arg Ile Glu
340 345 350
Asn Gly Leu Leu Leu Leu Asn Gly Lys Pro Leu Leu Ile Arg Gly Val
355 360 365
Asn Arg His Glu His His Pro Leu His Gly Gln Val Met Asp Glu Gln
370 375 380
Thr Met Val Gln Asp Ile Leu Leu Met Lys Gln Asn Asn Phe Asn Ala
385 390 395 400
Val Arg Cys Ser His Tyr Pro Asn His Pro Leu Trp Tyr Thr Leu Cys
405 410 415
Asp Arg Tyr Gly Leu Tyr Val Val Asp Glu Ala Asn Ile Glu Thr His
420 425 430
Gly Met Val Pro Met Asn Arg Leu Thr Asp Asp Pro Arg Trp Leu Pro
435 440 445
Ala Met Ser Glu Arg Val Thr Arg Met Val Gln Arg Asp Arg Asn His
450 455 460
Pro Ser Val Ile Ile Trp Ser Leu Gly Asn Glu Ser Gly His Gly Ala
465 470 475 480
Asn His Asp Ala Leu Tyr Arg Trp Ile Lys Ser Val Asp Pro Ser Arg
485 490 495
Pro Val Gln Tyr Glu Gly Gly Gly Ala Asp Thr Thr Ala Thr Asp Ile
500 505 510
Ile Cys Pro Met Tyr Ala Arg Val Asp Glu Asp Gln Pro Phe Pro Ala
515 520 525
Val Pro Lys Trp Ser Ile Lys Lys Trp Leu Ser Leu Pro Gly Glu Thr
530 535 540
Arg Pro Leu Ile Leu Cys Glu Tyr Ala His Ala Met Gly Asn Ser Leu
545 550 555 560
Gly Gly Phe Ala Lys Tyr Trp Gln Ala Phe Arg Gln Tyr Pro Arg Leu
565 570 575
Gln Gly Gly Phe Val Trp Asp Trp Val Asp Gln Ser Leu Ile Lys Tyr
580 585 590
Asp Glu Asn Gly Asn Pro Trp Ser Ala Tyr Gly Gly Asp Phe Gly Asp
595 600 605
Thr Pro Asn Asp Arg Gln Phe Cys Met Asn Gly Leu Val Phe Ala Asp
610 615 620
Arg Thr Pro His Pro Ala Leu Thr Glu Ala Lys His Gln Gln Gln Phe
625 630 635 640
Phe Gln Phe Arg Leu Ser Gly Gln Thr Ile Glu Val Thr Ser Glu Tyr
645 650 655
Leu Phe Arg His Ser Asp Asn Glu Leu Leu His Trp Met Val Ala Leu
660 665 670
Asp Gly Lys Pro Leu Ala Ser Gly Glu Val Pro Leu Asp Val Ala Pro
675 680 685
Gln Gly Lys Gln Leu Ile Glu Leu Pro Glu Leu Pro Gln Pro Glu Ser
690 695 700
Ala Gly Gln Leu Trp Leu Thr Val Arg Val Val Gln Pro Asn Ala Thr
705 710 715 720
Ala Trp Ser Glu Ala Gly His Ile Ser Ala Trp Gln Gln Trp Arg Leu
725 730 735
Ala Glu Asn Leu Ser Val Thr Leu Pro Ala Ala Ser His Ala Ile Pro
740 745 750
His Leu Thr Thr Ser Glu Met Asp Phe Cys Ile Glu Leu Gly Asn Lys
755 760 765
Arg Trp Gln Phe Asn Arg Gln Ser Gly Phe Leu Ser Gln Met Trp Ile
770 775 780
Gly Asp Lys Lys Gln Leu Leu Thr Pro Leu Arg Asp Gln Phe Thr Arg
785 790 795 800
Ala Pro Leu Asp Asn Asp Ile Gly Val Ser Glu Ala Thr Arg Ile Asp
805 810 815
Pro Asn Ala Trp Val Glu Arg Trp Lys Ala Ala Gly His Tyr Gln Ala
820 825 830
Glu Ala Ala Leu Leu Gln Cys Thr Ala Asp Thr Leu Ala Asp Ala Val
835 840 845
Leu Ile Thr Thr Ala His Ala Trp Gln His Gln Gly Lys Thr Leu Phe
850 855 860
Ile Ser Arg Lys Thr Tyr Arg Ile Asp Gly Ser Gly Gln Met Ala Ile
865 870 875 880
Thr Val Asp Val Glu Val Ala Ser Asp Thr Pro His Pro Ala Arg Ile
885 890 895
Gly Leu Asn Cys Gln Leu Ala Gln Val Ala Glu Arg Val Asn Trp Leu
900 905 910
Gly Leu Gly Pro Gln Glu Asn Tyr Pro Asp Arg Leu Thr Ala Ala Cys
915 920 925
Phe Asp Arg Trp Asp Leu Pro Leu Ser Asp Met Tyr Thr Pro Tyr Val
930 935 940
Phe Pro Ser Glu Asn Gly Leu Arg Cys Gly Thr Arg Glu Leu Asn Tyr
945 950 955 960
Gly Pro His Gln Trp Arg Gly Asp Phe Gln Phe Asn Ile Ser Arg Tyr
965 970 975
Ser Gln Gln Gln Leu Met Glu Thr Ser His Arg His Leu Leu His Ala
980 985 990
Glu Glu Gly Thr Trp Leu Asn Ile Asp Gly Phe His Met Gly Ile Gly
995 1000 1005
Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Gln Leu
1010 1015 1020
Ser Ala Gly Arg Tyr His Tyr Gln Leu Val Trp Cys Gln Lys Gly
1025 1030 1035
Ser Ser Lys Pro Glu Lys Pro Gly Ser Lys Ile Thr Gly Ser Ser
1040 1045 1050
Gly Asn Asp Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly Gly Ala
1055 1060 1065
Arg Gly
1070
<210> 103
<211> 3210
<212> DNA
<213> Artificial Sequence
<220>
<223> LacZ-S2CP(30) nucleotide sequence
<400> 103
atgggttctt ctcatcatca ccatcaccat tcttctggga tgaccatgat tacggattca 60
ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca acttaatcgc 120
cttgcagcac atcccccttt cgccagctgg cgtaatagcg aagaggcccg caccgatcgc 180
ccttcccaac agttgcgcag cctgaatggc gaatggcgct ttgcctggtt tccggcacca 240
gaagcggtgc cggaaagctg gctggagtgc gatcttcctg aggccgatac tgtcgtcgtc 300
ccctcaaact ggcagatgca cggttacgat gcgcccatct acaccaacgt gacctatccc 360
attacggtca atccgccgtt tgttcccacg gagaatccga cgggttgtta ctcgctcaca 420
tttaatgttg atgaaagctg gctacaggaa ggccagacgc gaattatttt tgatggcgtt 480
aactcggcgt ttcatctgtg gtgcaacggg cgctgggtcg gttacggcca ggacagtcgt 540
ttgccgtctg aatttgacct gagcgcattt ttacgcgccg gagaaaaccg cctcgcggtg 600
atggtgctgc gctggagtga cggcagttat ctggaagatc aggatatgtg gcggatgagc 660
ggcattttcc gtgatgtctc gttgctgcat aaaccgacta cacaaatcag cgatttccat 720
gttgccactc gctttaatga tgatttcagc cgcgctgtac tggaggctga agttcagatg 780
tgcggcgagt tgcgtgacta cctacgggta acagtttctt tatggcaggg tgaaacgcag 840
gtcgccagcg gcaccgcgcc tttcggcggt gaaattatcg atgagcgtgg tggttatgcc 900
gatcgcgtca cactacgtct gaacgtcgaa aacccgaaac tgtggagcgc cgaaatcccg 960
aatctctatc gtgcggtggt tgaactgcac accgccgacg gcacgctgat tgaagcagaa 1020
gcctgcgatg tcggtttccg cgaggtgcgg attgaaaatg gtctgctgct gctgaacggc 1080
aagccgttgc tgattcgagg cgttaaccgt cacgagcatc atcctctgca tggtcaggtc 1140
atggatgagc agacgatggt gcaggatatc ctgctgatga agcagaacaa ctttaacgcc 1200
gtgcgctgtt cgcattatcc gaaccatccg ctgtggtaca cgctgtgcga ccgctacggc 1260
ctgtatgtgg tggatgaagc caatattgaa acccacggca tggtgccaat gaatcgtctg 1320
accgatgatc cgcgctggct accggcgatg agcgaacgcg taacgcgaat ggtgcagcgc 1380
gatcgtaatc acccgagtgt gatcatctgg tcgctgggga atgaatcagg ccacggcgct 1440
aatcacgacg cgctgtatcg ctggatcaaa tctgtcgatc cttcccgccc ggtgcagtat 1500
gaaggcggcg gagccgacac cacggccacc gatattattt gcccgatgta cgcgcgcgtg 1560
gatgaagacc agcccttccc ggctgtgccg aaatggtcca tcaaaaaatg gctttcgcta 1620
cctggagaaa cgcgcccgct gatcctttgc gaatacgccc acgcgatggg taacagtctt 1680
ggcggtttcg ctaaatactg gcaggcgttt cgtcagtatc cccgtttaca gggcggcttc 1740
gtctgggact gggtggatca gtcgctgatt aaatatgatg aaaacggcaa cccgtggtcg 1800
gcttacggcg gtgattttgg cgatacgccg aacgatcgcc agttctgtat gaacggtctg 1860
gtctttgccg accgcacgcc gcatccagcg ctgacggaag caaaacacca gcagcagttt 1920
ttccagttcc gtttatccgg gcaaaccatc gaagtgacca gcgaatacct gttccgtcat 1980
agcgataacg agctcctgca ctggatggtg gcgctggatg gtaagccgct ggcaagcggt 2040
gaagtgcctc tggatgtcgc tccacaaggt aaacagttga ttgaactgcc tgaactaccg 2100
cagccggaga gcgccgggca actctggctc acagtacgcg tagtgcaacc gaacgcgacc 2160
gcatggtcag aagccggaca catcagcgcc tggcagcagt ggcgtctggc tgaaaacctc 2220
agcgtgacac tccccgccgc gtcccacgcc atcccgcatc tgaccaccag cgaaatggat 2280
ttttgcatcg agctgggtaa taagcgttgg caatttaacc gccagtcagg ctttctttca 2340
cagatgtgga ttggcgataa aaaacaactg ctgacgccgc tgcgcgatca gttcacccgt 2400
gcaccgctgg ataacgacat tggcgtaagt gaagcgaccc gcattgaccc taacgcctgg 2460
gtcgaacgct ggaaggcggc gggccattac caggccgaag cagcgttgtt gcagtgcacg 2520
gcagatacac ttgctgatgc ggtgctgatt acgaccgctc acgcgtggca gcatcagggg 2580
aaaaccttat ttatcagccg gaaaacctac cggattgatg gtagtggtca aatggcgatt 2640
accgttgatg ttgaagtggc gagcgataca ccgcatccgg cgcggattgg cctgaactgc 2700
cagctggcgc aggtagcaga gcgggtaaac tggctcggat tagggccgca agaaaactat 2760
cccgaccgcc ttactgccgc ctgttttgac cgctgggatc tgccattgtc agacatgtat 2820
accccgtacg tcttcccgag cgaaaacggt ctgcgctgcg ggacgcgcga attgaattat 2880
ggcccacacc agtggcgcgg cgacttccag ttcaacatca gccgctacag tcaacagcaa 2940
ctgatggaaa ccagccatcg ccatctgctg cacgcggaag aaggcacatg gctgaatatc 3000
gacggtttcc atatggggat tggtggcgac gactcctgga gcccgtcagt atcggcggaa 3060
ttccagctga gcgccggtcg ctaccattac cagttggtct ggtgtcaaaa aggttcttct 3120
aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac ccagggtagc 3180
ctgattacct atagcggtgg tgcacgtggt 3210
<210> 104
<211> 453
<212> DNA
<213> Artificial Sequence
<220>
<223> Promoter PGPM1 nucleotide sequence
<400> 104
gtgatgtcta agtaaccttt atggtatatt tcttaatgtg gaaagatact agcgcgcgca 60
cccacacaca agcttcgtct tttcttgaag aaaagaggaa gctcgctaaa tgggattcca 120
ctttccgttc cctgccagct gatggaaaaa ggttagtgga acgatgaaga ataaaaagag 180
agatccactg aggtgaaatt tcagctgaca gcgagtttca tgatcgtgat gaacaatggt 240
aacgagttgt ggctgttgcc agggagggtg gttctcaact tttaatgtat ggccaaatcg 300
ctacttgggt ttgttatata acaaagaaga aataatgaac tgattctctt cctccttctt 360
gtcctttctt aattctgttg taattacctt cctttgtaat tttttttgta attattcttc 420
ttaataatcc aaacaaacac acatattaca ata 453
<210> 105
<211> 432
<212> DNA
<213> Artificial Sequence
<220>
<223> Yeast transcriptional terminator TYPT31
<400> 105
gagatatttt gcagcagttg cgcacttgca tgtgaatgac tcttctcccc tttaattctg 60
tgctatattt ttacaatttt ctgctgacat atagtttata tacatataga acgcatatag 120
gaaattgaag taaacagaat acacaagtag aggccggtat gtacgacatt ttgcttacta 180
ctctttaaaa tcatcgtctt cttcgtcttc atcgtcttct tctttttcac catatcctac 240
atcatcttta gagcctgtgc taggttcctt cttgtctaat tcttctgcag tctttttata 300
gtcaattact ttgccgcgtg ttcttcttcc ggatgtgatg atattagagg tatcaatttc 360
tgccaaatcg tcctcttctt cttctccctc atttcccatc aatgcgtcta acttggcatc 420
gtccatatca ga 432
<210> 106
<211> 2971
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-S2CP(30) plasmid
<400> 106
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgagagacc gaattcgcgg ccgcttctag agcaatacgc aaaccgcctc 120
tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag 180
cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt 240
tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca 300
catactagag aaagaggaga aatactagat ggcttcctcc gaagacgtta tcaaagagtt 360
catgcgtttc aaagttcgta tggaaggttc cgttaacggt cacgagttcg aaatcgaagg 420
tgaaggtgaa ggtcgtccgt acgaaggtac ccagaccgct aaactgaaag ttaccaaagg 480
tggtccgctg ccgttcgctt gggacatcct gtccccgcag ttccagtacg gttccaaagc 540
ttacgttaaa cacccggctg acatcccgga ctacctgaaa ctgtccttcc cggaaggttt 600
caaatgggaa cgtgttatga acttcgaaga cggtggtgtt gttaccgtta cccaggactc 660
ctccctgcaa gacggtgagt tcatctacaa agttaaactg cgtggtacca acttcccgtc 720
cgacggtccg gttatgcaga aaaaaaccat gggttgggaa gcttccaccg aacgtatgta 780
cccggaagac ggtgctctga aaggtgaaat caaaatgcgt ctgaaactga aagacggtgg 840
tcactacgac gctgaagtta aaaccaccta catggctaaa aaaccggttc agctgccggg 900
tgcttacaaa accgacatca aactggacat cacctcccac aacgaagact acaccatcgt 960
tgaacagtac gaacgtgctg aaggtcgtca ctccaccggt gcttaataac gctgatagtg 1020
ctagtgtaga tcgctactag agccaggcat caaataaaac gaaaggctca gtcgaaagac 1080
tgggcctttc gttttatctg ttgtttgtcg gtgaacgctc tctactagag tcacactggc 1140
tcaccttcgg gtgggccttt ctgcgtttat atactagtag cggccgctgc agggtctctg 1200
gttcttctaa accggaaaaa ccaggtagca aaattaccgg tagcagcggc aatgataccc 1260
agggtagcct gattacctat agcggtggtg cacgtggtta gccgagacga ctgaccattt 1320
aaatcatacc tgacctccat agcagaaagt caaaagcctc cgaccggagg cttttgactt 1380
gatcggcacg taagaggttc caactttcac cataatgaaa taagatcact accgggcgta 1440
ttttttgagt tatcgagatt ttcaggagct aaggaagcta aaatgagcca tattcaacgg 1500
gaaacgtctt gctcgaggcc gcgattaaat tccaacatgg atgctgattt atatgggtat 1560
aaatgggctc gcgataatgt cgggcaatca ggtgcgacaa tctatcgatt gtatgggaag 1620
cccgatgcgc cagagttgtt tctgaaacat ggcaaaggta gcgttgccaa tgatgttaca 1680
gatgagatgg tcaggctaaa ctggctgacg gaatttatgc ctcttccgac catcaagcat 1740
tttatccgta ctcctgatga tgcatggtta ctcaccactg cgatcccagg gaaaacagca 1800
ttccaggtat tagaagaata tcctgattca ggtgaaaata ttgttgatgc gctggcagtg 1860
ttcctgcgcc ggttgcattc gattcctgtt tgtaattgtc cttttaacgg cgatcgcgta 1920
tttcgtctcg ctcaggcgca atcacgaatg aataacggtt tggttggtgc gagtgatttt 1980
gatgacgagc gtaatggctg gcctgttgaa caagtctgga aagaaatgca taagcttttg 2040
ccattctcac cggattcagt cgtcactcat ggtgatttct cacttgataa ccttattttt 2100
gacgagggga aattaatagg ttgtattgat gttggacgag tcggaatcgc agaccgatac 2160
caggatcttg ccatcctatg gaactgcctc ggtgagtttt ctccttcatt acagaaacgg 2220
ctttttcaaa aatatggtat tgataatcct gatatgaata aattgcagtt tcacttgatg 2280
ctcgatgagt ttttctaatg agggcccaaa tgtaatcacc tggctcacct tcgggtgggc 2340
ctttctgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca tcacaaaaat 2400
cgatgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca ggcgtttccc 2460
cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc 2520
gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag gtatctcagt 2580
tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt tcagcccgac 2640
cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca cgacttatcg 2700
ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg cggtgctaca 2760
gagttcttga agtggtggcc taactacggc tacactagaa gaacagtatt tggtatctgc 2820
gctctgctga agccagttac ctcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa 2880
ccaccgctgg tagcggtggt ttttttgttt gcaagcagca gattacgcgc agaaaaaaag 2940
gatctcaaga agatcctttg attttctacc g 2971
<210> 107
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-S2CP(30) nucleotide sequence of key ORF
<400> 107
aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac ccagggtagc 60
ctgattacct atagcggtgg tgcacgtggt 90
<210> 108
<211> 36
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-S2CP(30) amino acid sequence of key ORF
<400> 108
Lys Pro Glu Lys Pro Gly Lys Pro Glu Lys Pro Gly Ser Lys Ile Thr
1 5 10 15
Gly Ser Ser Gly Asn Asp Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly
20 25 30
Gly Ala Arg Gly
35
<210> 109
<211> 2631
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-APEX2-S2CP(30) plasmid
<400> 109
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgggttctt ctcatcatca ccatcaccat tcttctggga tgggtaagtc 120
ttacccaact gtttctgctg attatcaaga tgctgttgaa aaggccaaga agaagttgag 180
aggtttcatt gctgaaaaaa gatgcgctcc attgatgttg agattggctt ttcattctgc 240
tggtactttc gataagggta caaaaactgg tggtccattc ggtactatca aacatccagc 300
tgaattggct cattcagcta acaatggttt ggatattgct gtcagattgc tggaaccatt 360
gaaagccgaa tttccaattt tgtcctacgc cgatttttac caattggctg gtgttgttgc 420
agttgaagtt acaggtggtc caaaagttcc atttcatcca ggtagagaag ataagccaga 480
accaccacca gaaggtagat tgccagatcc aacaaaaggt tctgatcact tgagagatgt 540
tttcggtaaa gctatgggtt tgactgatca agatattgtc gctttgtctg gtggtcatac 600
aattggtgct gctcacaaag aaagatcagg ttttgaaggt ccttggactt ctaacccatt 660
gatctttgat aactcttact tcaccgagtt gttgtccggt gaaaaagaag gtttgttgca 720
attgccatct gataaggctt tgttgtctga tccagttttc agaccattgg ttgataagta 780
tgctgctgat gaagatgctt tctttgctga ttacgctgaa gctcatcaaa agttgtctga 840
attgggtttt gctgatgctg gttcttctaa accggaaaaa ccaggtagca aaattaccgg 900
tagcagcggc aatgataccc agggtagcct gattacctat agcggtggtg cacgtggtta 960
gccgagacga ctgaccattt aaatcatacc tgacctccat agcagaaagt caaaagcctc 1020
cgaccggagg cttttgactt gatcggcacg taagaggttc caactttcac cataatgaaa 1080
taagatcact accgggcgta ttttttgagt tatcgagatt ttcaggagct aaggaagcta 1140
aaatgagcca tattcaacgg gaaacgtctt gctcgaggcc gcgattaaat tccaacatgg 1200
atgctgattt atatgggtat aaatgggctc gcgataatgt cgggcaatca ggtgcgacaa 1260
tctatcgatt gtatgggaag cccgatgcgc cagagttgtt tctgaaacat ggcaaaggta 1320
gcgttgccaa tgatgttaca gatgagatgg tcaggctaaa ctggctgacg gaatttatgc 1380
ctcttccgac catcaagcat tttatccgta ctcctgatga tgcatggtta ctcaccactg 1440
cgatcccagg gaaaacagca ttccaggtat tagaagaata tcctgattca ggtgaaaata 1500
ttgttgatgc gctggcagtg ttcctgcgcc ggttgcattc gattcctgtt tgtaattgtc 1560
cttttaacgg cgatcgcgta tttcgtctcg ctcaggcgca atcacgaatg aataacggtt 1620
tggttggtgc gagtgatttt gatgacgagc gtaatggctg gcctgttgaa caagtctgga 1680
aagaaatgca taagcttttg ccattctcac cggattcagt cgtcactcat ggtgatttct 1740
cacttgataa ccttattttt gacgagggga aattaatagg ttgtattgat gttggacgag 1800
tcggaatcgc agaccgatac caggatcttg ccatcctatg gaactgcctc ggtgagtttt 1860
ctccttcatt acagaaacgg ctttttcaaa aatatggtat tgataatcct gatatgaata 1920
aattgcagtt tcacttgatg ctcgatgagt ttttctaatg agggcccaaa tgtaatcacc 1980
tggctcacct tcgggtgggc ctttctgcgt tgctggcgtt tttccatagg ctccgccccc 2040
ctgacgagca tcacaaaaat cgatgctcaa gtcagaggtg gcgaaacccg acaggactat 2100
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 2160
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct 2220
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 2280
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 2340
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 2400
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 2460
gaacagtatt tggtatctgc gctctgctga agccagttac ctcggaaaaa gagttggtag 2520
ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 2580
gattacgcgc agaaaaaaag gatctcaaga agatcctttg attttctacc g 2631
<210> 110
<211> 888
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-APEX2-S2CP(30) nucleotide sequence of key ORF
<400> 110
atgggttctt ctcatcatca ccatcaccat tcttctggga tgggtaagtc ttacccaact 60
gtttctgctg attatcaaga tgctgttgaa aaggccaaga agaagttgag aggtttcatt 120
gctgaaaaaa gatgcgctcc attgatgttg agattggctt ttcattctgc tggtactttc 180
gataagggta caaaaactgg tggtccattc ggtactatca aacatccagc tgaattggct 240
cattcagcta acaatggttt ggatattgct gtcagattgc tggaaccatt gaaagccgaa 300
tttccaattt tgtcctacgc cgatttttac caattggctg gtgttgttgc agttgaagtt 360
acaggtggtc caaaagttcc atttcatcca ggtagagaag ataagccaga accaccacca 420
gaaggtagat tgccagatcc aacaaaaggt tctgatcact tgagagatgt tttcggtaaa 480
gctatgggtt tgactgatca agatattgtc gctttgtctg gtggtcatac aattggtgct 540
gctcacaaag aaagatcagg ttttgaaggt ccttggactt ctaacccatt gatctttgat 600
aactcttact tcaccgagtt gttgtccggt gaaaaagaag gtttgttgca attgccatct 660
gataaggctt tgttgtctga tccagttttc agaccattgg ttgataagta tgctgctgat 720
gaagatgctt tctttgctga ttacgctgaa gctcatcaaa agttgtctga attgggtttt 780
gctgatgctg gttcttctaa accggaaaaa ccaggtagca aaattaccgg tagcagcggc 840
aatgataccc agggtagcct gattacctat agcggtggtg cacgtggt 888
<210> 111
<211> 296
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-APEX2-S2CP(30) amino acid sequence of key ORF
<400> 111
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Gly Lys
1 5 10 15
Ser Tyr Pro Thr Val Ser Ala Asp Tyr Gln Asp Ala Val Glu Lys Ala
20 25 30
Lys Lys Lys Leu Arg Gly Phe Ile Ala Glu Lys Arg Cys Ala Pro Leu
35 40 45
Met Leu Arg Leu Ala Phe His Ser Ala Gly Thr Phe Asp Lys Gly Thr
50 55 60
Lys Thr Gly Gly Pro Phe Gly Thr Ile Lys His Pro Ala Glu Leu Ala
65 70 75 80
His Ser Ala Asn Asn Gly Leu Asp Ile Ala Val Arg Leu Leu Glu Pro
85 90 95
Leu Lys Ala Glu Phe Pro Ile Leu Ser Tyr Ala Asp Phe Tyr Gln Leu
100 105 110
Ala Gly Val Val Ala Val Glu Val Thr Gly Gly Pro Lys Val Pro Phe
115 120 125
His Pro Gly Arg Glu Asp Lys Pro Glu Pro Pro Pro Glu Gly Arg Leu
130 135 140
Pro Asp Pro Thr Lys Gly Ser Asp His Leu Arg Asp Val Phe Gly Lys
145 150 155 160
Ala Met Gly Leu Thr Asp Gln Asp Ile Val Ala Leu Ser Gly Gly His
165 170 175
Thr Ile Gly Ala Ala His Lys Glu Arg Ser Gly Phe Glu Gly Pro Trp
180 185 190
Thr Ser Asn Pro Leu Ile Phe Asp Asn Ser Tyr Phe Thr Glu Leu Leu
195 200 205
Ser Gly Glu Lys Glu Gly Leu Leu Gln Leu Pro Ser Asp Lys Ala Leu
210 215 220
Leu Ser Asp Pro Val Phe Arg Pro Leu Val Asp Lys Tyr Ala Ala Asp
225 230 235 240
Glu Asp Ala Phe Phe Ala Asp Tyr Ala Glu Ala His Gln Lys Leu Ser
245 250 255
Glu Leu Gly Phe Ala Asp Ala Gly Ser Ser Lys Pro Glu Lys Pro Gly
260 265 270
Ser Lys Ile Thr Gly Ser Ser Gly Asn Asp Thr Gln Gly Ser Leu Ile
275 280 285
Thr Tyr Ser Gly Gly Ala Arg Gly
290 295
<210> 112
<211> 4953
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-LacZ-S2CTP plasmid
<400> 112
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atgggttctt ctcatcatca ccatcaccat tcttctggga tgaccatgat 120
tacggattca ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca 180
acttaatcgc cttgcagcac atcccccttt cgccagctgg cgtaatagcg aagaggcccg 240
caccgatcgc ccttcccaac agttgcgcag cctgaatggc gaatggcgct ttgcctggtt 300
tccggcacca gaagcggtgc cggaaagctg gctggagtgc gatcttcctg aggccgatac 360
tgtcgtcgtc ccctcaaact ggcagatgca cggttacgat gcgcccatct acaccaacgt 420
gacctatccc attacggtca atccgccgtt tgttcccacg gagaatccga cgggttgtta 480
ctcgctcaca tttaatgttg atgaaagctg gctacaggaa ggccagacgc gaattatttt 540
tgatggcgtt aactcggcgt ttcatctgtg gtgcaacggg cgctgggtcg gttacggcca 600
ggacagtcgt ttgccgtctg aatttgacct gagcgcattt ttacgcgccg gagaaaaccg 660
cctcgcggtg atggtgctgc gctggagtga cggcagttat ctggaagatc aggatatgtg 720
gcggatgagc ggcattttcc gtgatgtctc gttgctgcat aaaccgacta cacaaatcag 780
cgatttccat gttgccactc gctttaatga tgatttcagc cgcgctgtac tggaggctga 840
agttcagatg tgcggcgagt tgcgtgacta cctacgggta acagtttctt tatggcaggg 900
tgaaacgcag gtcgccagcg gcaccgcgcc tttcggcggt gaaattatcg atgagcgtgg 960
tggttatgcc gatcgcgtca cactacgtct gaacgtcgaa aacccgaaac tgtggagcgc 1020
cgaaatcccg aatctctatc gtgcggtggt tgaactgcac accgccgacg gcacgctgat 1080
tgaagcagaa gcctgcgatg tcggtttccg cgaggtgcgg attgaaaatg gtctgctgct 1140
gctgaacggc aagccgttgc tgattcgagg cgttaaccgt cacgagcatc atcctctgca 1200
tggtcaggtc atggatgagc agacgatggt gcaggatatc ctgctgatga agcagaacaa 1260
ctttaacgcc gtgcgctgtt cgcattatcc gaaccatccg ctgtggtaca cgctgtgcga 1320
ccgctacggc ctgtatgtgg tggatgaagc caatattgaa acccacggca tggtgccaat 1380
gaatcgtctg accgatgatc cgcgctggct accggcgatg agcgaacgcg taacgcgaat 1440
ggtgcagcgc gatcgtaatc acccgagtgt gatcatctgg tcgctgggga atgaatcagg 1500
ccacggcgct aatcacgacg cgctgtatcg ctggatcaaa tctgtcgatc cttcccgccc 1560
ggtgcagtat gaaggcggcg gagccgacac cacggccacc gatattattt gcccgatgta 1620
cgcgcgcgtg gatgaagacc agcccttccc ggctgtgccg aaatggtcca tcaaaaaatg 1680
gctttcgcta cctggagaaa cgcgcccgct gatcctttgc gaatacgccc acgcgatggg 1740
taacagtctt ggcggtttcg ctaaatactg gcaggcgttt cgtcagtatc cccgtttaca 1800
gggcggcttc gtctgggact gggtggatca gtcgctgatt aaatatgatg aaaacggcaa 1860
cccgtggtcg gcttacggcg gtgattttgg cgatacgccg aacgatcgcc agttctgtat 1920
gaacggtctg gtctttgccg accgcacgcc gcatccagcg ctgacggaag caaaacacca 1980
gcagcagttt ttccagttcc gtttatccgg gcaaaccatc gaagtgacca gcgaatacct 2040
gttccgtcat agcgataacg agctcctgca ctggatggtg gcgctggatg gtaagccgct 2100
ggcaagcggt gaagtgcctc tggatgtcgc tccacaaggt aaacagttga ttgaactgcc 2160
tgaactaccg cagccggaga gcgccgggca actctggctc acagtacgcg tagtgcaacc 2220
gaacgcgacc gcatggtcag aagccggaca catcagcgcc tggcagcagt ggcgtctggc 2280
tgaaaacctc agcgtgacac tccccgccgc gtcccacgcc atcccgcatc tgaccaccag 2340
cgaaatggat ttttgcatcg agctgggtaa taagcgttgg caatttaacc gccagtcagg 2400
ctttctttca cagatgtgga ttggcgataa aaaacaactg ctgacgccgc tgcgcgatca 2460
gttcacccgt gcaccgctgg ataacgacat tggcgtaagt gaagcgaccc gcattgaccc 2520
taacgcctgg gtcgaacgct ggaaggcggc gggccattac caggccgaag cagcgttgtt 2580
gcagtgcacg gcagatacac ttgctgatgc ggtgctgatt acgaccgctc acgcgtggca 2640
gcatcagggg aaaaccttat ttatcagccg gaaaacctac cggattgatg gtagtggtca 2700
aatggcgatt accgttgatg ttgaagtggc gagcgataca ccgcatccgg cgcggattgg 2760
cctgaactgc cagctggcgc aggtagcaga gcgggtaaac tggctcggat tagggccgca 2820
agaaaactat cccgaccgcc ttactgccgc ctgttttgac cgctgggatc tgccattgtc 2880
agacatgtat accccgtacg tcttcccgag cgaaaacggt ctgcgctgcg ggacgcgcga 2940
attgaattat ggcccacacc agtggcgcgg cgacttccag ttcaacatca gccgctacag 3000
tcaacagcaa ctgatggaaa ccagccatcg ccatctgctg cacgcggaag aaggcacatg 3060
gctgaatatc gacggtttcc atatggggat tggtggcgac gactcctgga gcccgtcagt 3120
atcggcggaa ttccagctga gcgccggtcg ctaccattac cagttggtct ggtgtcaaaa 3180
aggttcttct aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac 3240
ccagggtagc ctgattacct atagcggtgg tgcacgtggt tagccgagac gactgaccat 3300
ttaaatcata cctgacctcc atagcagaaa gtcaaaagcc tccgaccgga ggcttttgac 3360
ttgatcggca cgtaagaggt tccaactttc accataatga aataagatca ctaccgggcg 3420
tattttttga gttatcgaga ttttcaggag ctaaggaagc taaaatgagc catattcaac 3480
gggaaacgtc ttgctcgagg ccgcgattaa attccaacat ggatgctgat ttatatgggt 3540
ataaatgggc tcgcgataat gtcgggcaat caggtgcgac aatctatcga ttgtatggga 3600
agcccgatgc gccagagttg tttctgaaac atggcaaagg tagcgttgcc aatgatgtta 3660
cagatgagat ggtcaggcta aactggctga cggaatttat gcctcttccg accatcaagc 3720
attttatccg tactcctgat gatgcatggt tactcaccac tgcgatccca gggaaaacag 3780
cattccaggt attagaagaa tatcctgatt caggtgaaaa tattgttgat gcgctggcag 3840
tgttcctgcg ccggttgcat tcgattcctg tttgtaattg tccttttaac ggcgatcgcg 3900
tatttcgtct cgctcaggcg caatcacgaa tgaataacgg tttggttggt gcgagtgatt 3960
ttgatgacga gcgtaatggc tggcctgttg aacaagtctg gaaagaaatg cataagcttt 4020
tgccattctc accggattca gtcgtcactc atggtgattt ctcacttgat aaccttattt 4080
ttgacgaggg gaaattaata ggttgtattg atgttggacg agtcggaatc gcagaccgat 4140
accaggatct tgccatccta tggaactgcc tcggtgagtt ttctccttca ttacagaaac 4200
ggctttttca aaaatatggt attgataatc ctgatatgaa taaattgcag tttcacttga 4260
tgctcgatga gtttttctaa tgagggccca aatgtaatca cctggctcac cttcgggtgg 4320
gcctttctgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 4380
atcgatgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 4440
cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 4500
ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 4560
gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 4620
accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 4680
cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 4740
cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta tttggtatct 4800
gcgctctgct gaagccagtt acctcggaaa aagagttggt agctcttgat ccggcaaaca 4860
aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa 4920
aggatctcaa gaagatcctt tgattttcta ccg 4953
<210> 113
<211> 3210
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-LacZ-S2CTP nucleotide sequence of key ORF
<400> 113
atgggttctt ctcatcatca ccatcaccat tcttctggga tgaccatgat tacggattca 60
ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca acttaatcgc 120
cttgcagcac atcccccttt cgccagctgg cgtaatagcg aagaggcccg caccgatcgc 180
ccttcccaac agttgcgcag cctgaatggc gaatggcgct ttgcctggtt tccggcacca 240
gaagcggtgc cggaaagctg gctggagtgc gatcttcctg aggccgatac tgtcgtcgtc 300
ccctcaaact ggcagatgca cggttacgat gcgcccatct acaccaacgt gacctatccc 360
attacggtca atccgccgtt tgttcccacg gagaatccga cgggttgtta ctcgctcaca 420
tttaatgttg atgaaagctg gctacaggaa ggccagacgc gaattatttt tgatggcgtt 480
aactcggcgt ttcatctgtg gtgcaacggg cgctgggtcg gttacggcca ggacagtcgt 540
ttgccgtctg aatttgacct gagcgcattt ttacgcgccg gagaaaaccg cctcgcggtg 600
atggtgctgc gctggagtga cggcagttat ctggaagatc aggatatgtg gcggatgagc 660
ggcattttcc gtgatgtctc gttgctgcat aaaccgacta cacaaatcag cgatttccat 720
gttgccactc gctttaatga tgatttcagc cgcgctgtac tggaggctga agttcagatg 780
tgcggcgagt tgcgtgacta cctacgggta acagtttctt tatggcaggg tgaaacgcag 840
gtcgccagcg gcaccgcgcc tttcggcggt gaaattatcg atgagcgtgg tggttatgcc 900
gatcgcgtca cactacgtct gaacgtcgaa aacccgaaac tgtggagcgc cgaaatcccg 960
aatctctatc gtgcggtggt tgaactgcac accgccgacg gcacgctgat tgaagcagaa 1020
gcctgcgatg tcggtttccg cgaggtgcgg attgaaaatg gtctgctgct gctgaacggc 1080
aagccgttgc tgattcgagg cgttaaccgt cacgagcatc atcctctgca tggtcaggtc 1140
atggatgagc agacgatggt gcaggatatc ctgctgatga agcagaacaa ctttaacgcc 1200
gtgcgctgtt cgcattatcc gaaccatccg ctgtggtaca cgctgtgcga ccgctacggc 1260
ctgtatgtgg tggatgaagc caatattgaa acccacggca tggtgccaat gaatcgtctg 1320
accgatgatc cgcgctggct accggcgatg agcgaacgcg taacgcgaat ggtgcagcgc 1380
gatcgtaatc acccgagtgt gatcatctgg tcgctgggga atgaatcagg ccacggcgct 1440
aatcacgacg cgctgtatcg ctggatcaaa tctgtcgatc cttcccgccc ggtgcagtat 1500
gaaggcggcg gagccgacac cacggccacc gatattattt gcccgatgta cgcgcgcgtg 1560
gatgaagacc agcccttccc ggctgtgccg aaatggtcca tcaaaaaatg gctttcgcta 1620
cctggagaaa cgcgcccgct gatcctttgc gaatacgccc acgcgatggg taacagtctt 1680
ggcggtttcg ctaaatactg gcaggcgttt cgtcagtatc cccgtttaca gggcggcttc 1740
gtctgggact gggtggatca gtcgctgatt aaatatgatg aaaacggcaa cccgtggtcg 1800
gcttacggcg gtgattttgg cgatacgccg aacgatcgcc agttctgtat gaacggtctg 1860
gtctttgccg accgcacgcc gcatccagcg ctgacggaag caaaacacca gcagcagttt 1920
ttccagttcc gtttatccgg gcaaaccatc gaagtgacca gcgaatacct gttccgtcat 1980
agcgataacg agctcctgca ctggatggtg gcgctggatg gtaagccgct ggcaagcggt 2040
gaagtgcctc tggatgtcgc tccacaaggt aaacagttga ttgaactgcc tgaactaccg 2100
cagccggaga gcgccgggca actctggctc acagtacgcg tagtgcaacc gaacgcgacc 2160
gcatggtcag aagccggaca catcagcgcc tggcagcagt ggcgtctggc tgaaaacctc 2220
agcgtgacac tccccgccgc gtcccacgcc atcccgcatc tgaccaccag cgaaatggat 2280
ttttgcatcg agctgggtaa taagcgttgg caatttaacc gccagtcagg ctttctttca 2340
cagatgtgga ttggcgataa aaaacaactg ctgacgccgc tgcgcgatca gttcacccgt 2400
gcaccgctgg ataacgacat tggcgtaagt gaagcgaccc gcattgaccc taacgcctgg 2460
gtcgaacgct ggaaggcggc gggccattac caggccgaag cagcgttgtt gcagtgcacg 2520
gcagatacac ttgctgatgc ggtgctgatt acgaccgctc acgcgtggca gcatcagggg 2580
aaaaccttat ttatcagccg gaaaacctac cggattgatg gtagtggtca aatggcgatt 2640
accgttgatg ttgaagtggc gagcgataca ccgcatccgg cgcggattgg cctgaactgc 2700
cagctggcgc aggtagcaga gcgggtaaac tggctcggat tagggccgca agaaaactat 2760
cccgaccgcc ttactgccgc ctgttttgac cgctgggatc tgccattgtc agacatgtat 2820
accccgtacg tcttcccgag cgaaaacggt ctgcgctgcg ggacgcgcga attgaattat 2880
ggcccacacc agtggcgcgg cgacttccag ttcaacatca gccgctacag tcaacagcaa 2940
ctgatggaaa ccagccatcg ccatctgctg cacgcggaag aaggcacatg gctgaatatc 3000
gacggtttcc atatggggat tggtggcgac gactcctgga gcccgtcagt atcggcggaa 3060
ttccagctga gcgccggtcg ctaccattac cagttggtct ggtgtcaaaa aggttcttct 3120
aaaccggaaa aaccaggtag caaaattacc ggtagcagcg gcaatgatac ccagggtagc 3180
ctgattacct atagcggtgg tgcacgtggt 3210
<210> 114
<211> 1070
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-LacZ-S2CTP amino acid sequence of key ORF
<400> 114
Met Gly Ser Ser His His His His His His Ser Ser Gly Met Thr Met
1 5 10 15
Ile Thr Asp Ser Leu Ala Val Val Leu Gln Arg Arg Asp Trp Glu Asn
20 25 30
Pro Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His Pro Pro Phe Ala
35 40 45
Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg Pro Ser Gln Gln
50 55 60
Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp Phe Pro Ala Pro
65 70 75 80
Glu Ala Val Pro Glu Ser Trp Leu Glu Cys Asp Leu Pro Glu Ala Asp
85 90 95
Thr Val Val Val Pro Ser Asn Trp Gln Met His Gly Tyr Asp Ala Pro
100 105 110
Ile Tyr Thr Asn Val Thr Tyr Pro Ile Thr Val Asn Pro Pro Phe Val
115 120 125
Pro Thr Glu Asn Pro Thr Gly Cys Tyr Ser Leu Thr Phe Asn Val Asp
130 135 140
Glu Ser Trp Leu Gln Glu Gly Gln Thr Arg Ile Ile Phe Asp Gly Val
145 150 155 160
Asn Ser Ala Phe His Leu Trp Cys Asn Gly Arg Trp Val Gly Tyr Gly
165 170 175
Gln Asp Ser Arg Leu Pro Ser Glu Phe Asp Leu Ser Ala Phe Leu Arg
180 185 190
Ala Gly Glu Asn Arg Leu Ala Val Met Val Leu Arg Trp Ser Asp Gly
195 200 205
Ser Tyr Leu Glu Asp Gln Asp Met Trp Arg Met Ser Gly Ile Phe Arg
210 215 220
Asp Val Ser Leu Leu His Lys Pro Thr Thr Gln Ile Ser Asp Phe His
225 230 235 240
Val Ala Thr Arg Phe Asn Asp Asp Phe Ser Arg Ala Val Leu Glu Ala
245 250 255
Glu Val Gln Met Cys Gly Glu Leu Arg Asp Tyr Leu Arg Val Thr Val
260 265 270
Ser Leu Trp Gln Gly Glu Thr Gln Val Ala Ser Gly Thr Ala Pro Phe
275 280 285
Gly Gly Glu Ile Ile Asp Glu Arg Gly Gly Tyr Ala Asp Arg Val Thr
290 295 300
Leu Arg Leu Asn Val Glu Asn Pro Lys Leu Trp Ser Ala Glu Ile Pro
305 310 315 320
Asn Leu Tyr Arg Ala Val Val Glu Leu His Thr Ala Asp Gly Thr Leu
325 330 335
Ile Glu Ala Glu Ala Cys Asp Val Gly Phe Arg Glu Val Arg Ile Glu
340 345 350
Asn Gly Leu Leu Leu Leu Asn Gly Lys Pro Leu Leu Ile Arg Gly Val
355 360 365
Asn Arg His Glu His His Pro Leu His Gly Gln Val Met Asp Glu Gln
370 375 380
Thr Met Val Gln Asp Ile Leu Leu Met Lys Gln Asn Asn Phe Asn Ala
385 390 395 400
Val Arg Cys Ser His Tyr Pro Asn His Pro Leu Trp Tyr Thr Leu Cys
405 410 415
Asp Arg Tyr Gly Leu Tyr Val Val Asp Glu Ala Asn Ile Glu Thr His
420 425 430
Gly Met Val Pro Met Asn Arg Leu Thr Asp Asp Pro Arg Trp Leu Pro
435 440 445
Ala Met Ser Glu Arg Val Thr Arg Met Val Gln Arg Asp Arg Asn His
450 455 460
Pro Ser Val Ile Ile Trp Ser Leu Gly Asn Glu Ser Gly His Gly Ala
465 470 475 480
Asn His Asp Ala Leu Tyr Arg Trp Ile Lys Ser Val Asp Pro Ser Arg
485 490 495
Pro Val Gln Tyr Glu Gly Gly Gly Ala Asp Thr Thr Ala Thr Asp Ile
500 505 510
Ile Cys Pro Met Tyr Ala Arg Val Asp Glu Asp Gln Pro Phe Pro Ala
515 520 525
Val Pro Lys Trp Ser Ile Lys Lys Trp Leu Ser Leu Pro Gly Glu Thr
530 535 540
Arg Pro Leu Ile Leu Cys Glu Tyr Ala His Ala Met Gly Asn Ser Leu
545 550 555 560
Gly Gly Phe Ala Lys Tyr Trp Gln Ala Phe Arg Gln Tyr Pro Arg Leu
565 570 575
Gln Gly Gly Phe Val Trp Asp Trp Val Asp Gln Ser Leu Ile Lys Tyr
580 585 590
Asp Glu Asn Gly Asn Pro Trp Ser Ala Tyr Gly Gly Asp Phe Gly Asp
595 600 605
Thr Pro Asn Asp Arg Gln Phe Cys Met Asn Gly Leu Val Phe Ala Asp
610 615 620
Arg Thr Pro His Pro Ala Leu Thr Glu Ala Lys His Gln Gln Gln Phe
625 630 635 640
Phe Gln Phe Arg Leu Ser Gly Gln Thr Ile Glu Val Thr Ser Glu Tyr
645 650 655
Leu Phe Arg His Ser Asp Asn Glu Leu Leu His Trp Met Val Ala Leu
660 665 670
Asp Gly Lys Pro Leu Ala Ser Gly Glu Val Pro Leu Asp Val Ala Pro
675 680 685
Gln Gly Lys Gln Leu Ile Glu Leu Pro Glu Leu Pro Gln Pro Glu Ser
690 695 700
Ala Gly Gln Leu Trp Leu Thr Val Arg Val Val Gln Pro Asn Ala Thr
705 710 715 720
Ala Trp Ser Glu Ala Gly His Ile Ser Ala Trp Gln Gln Trp Arg Leu
725 730 735
Ala Glu Asn Leu Ser Val Thr Leu Pro Ala Ala Ser His Ala Ile Pro
740 745 750
His Leu Thr Thr Ser Glu Met Asp Phe Cys Ile Glu Leu Gly Asn Lys
755 760 765
Arg Trp Gln Phe Asn Arg Gln Ser Gly Phe Leu Ser Gln Met Trp Ile
770 775 780
Gly Asp Lys Lys Gln Leu Leu Thr Pro Leu Arg Asp Gln Phe Thr Arg
785 790 795 800
Ala Pro Leu Asp Asn Asp Ile Gly Val Ser Glu Ala Thr Arg Ile Asp
805 810 815
Pro Asn Ala Trp Val Glu Arg Trp Lys Ala Ala Gly His Tyr Gln Ala
820 825 830
Glu Ala Ala Leu Leu Gln Cys Thr Ala Asp Thr Leu Ala Asp Ala Val
835 840 845
Leu Ile Thr Thr Ala His Ala Trp Gln His Gln Gly Lys Thr Leu Phe
850 855 860
Ile Ser Arg Lys Thr Tyr Arg Ile Asp Gly Ser Gly Gln Met Ala Ile
865 870 875 880
Thr Val Asp Val Glu Val Ala Ser Asp Thr Pro His Pro Ala Arg Ile
885 890 895
Gly Leu Asn Cys Gln Leu Ala Gln Val Ala Glu Arg Val Asn Trp Leu
900 905 910
Gly Leu Gly Pro Gln Glu Asn Tyr Pro Asp Arg Leu Thr Ala Ala Cys
915 920 925
Phe Asp Arg Trp Asp Leu Pro Leu Ser Asp Met Tyr Thr Pro Tyr Val
930 935 940
Phe Pro Ser Glu Asn Gly Leu Arg Cys Gly Thr Arg Glu Leu Asn Tyr
945 950 955 960
Gly Pro His Gln Trp Arg Gly Asp Phe Gln Phe Asn Ile Ser Arg Tyr
965 970 975
Ser Gln Gln Gln Leu Met Glu Thr Ser His Arg His Leu Leu His Ala
980 985 990
Glu Glu Gly Thr Trp Leu Asn Ile Asp Gly Phe His Met Gly Ile Gly
995 1000 1005
Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Gln Leu
1010 1015 1020
Ser Ala Gly Arg Tyr His Tyr Gln Leu Val Trp Cys Gln Lys Gly
1025 1030 1035
Ser Ser Lys Pro Glu Lys Pro Gly Ser Lys Ile Thr Gly Ser Ser
1040 1045 1050
Gly Asn Asp Thr Gln Gly Ser Leu Ile Thr Tyr Ser Gly Gly Ala
1055 1060 1065
Arg Gly
1070
<210> 115
<211> 2199
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_P-GPM1 plasmid
<400> 115
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg gctgtgatgt ctaagtaacc tttatggtat atttcttaat gtggaaagat 120
actagcgcgc gcacccacac acaagcttcg tcttttcttg aagaaaagag gaagctcgct 180
aaatgggatt ccactttccg ttccctgcca gctgatggaa aaaggttagt ggaacgatga 240
agaataaaaa gagagatcca ctgaggtgaa atttcagctg acagcgagtt tcatgatcgt 300
gatgaacaat ggtaacgagt tgtggctgtt gccagggagg gtggttctca acttttaatg 360
tatggccaaa tcgctacttg ggtttgttat ataacaaaga agaaataatg aactgattct 420
cttcctcctt cttgtccttt cttaattctg ttgtaattac cttcctttgt aatttttttt 480
gtaattattc ttcttaataa tccaaacaaa cacacatatt acaatagatg cgagacgact 540
gaccatttaa atcatacctg acctccatag cagaaagtca aaagcctccg accggaggct 600
tttgacttga tcggcacgta agaggttcca actttcacca taatgaaata agatcactac 660
cgggcgtatt ttttgagtta tcgagatttt caggagctaa ggaagctaaa atgagccata 720
ttcaacggga aacgtcttgc tcgaggccgc gattaaattc caacatggat gctgatttat 780
atgggtataa atgggctcgc gataatgtcg ggcaatcagg tgcgacaatc tatcgattgt 840
atgggaagcc cgatgcgcca gagttgtttc tgaaacatgg caaaggtagc gttgccaatg 900
atgttacaga tgagatggtc aggctaaact ggctgacgga atttatgcct cttccgacca 960
tcaagcattt tatccgtact cctgatgatg catggttact caccactgcg atcccaggga 1020
aaacagcatt ccaggtatta gaagaatatc ctgattcagg tgaaaatatt gttgatgcgc 1080
tggcagtgtt cctgcgccgg ttgcattcga ttcctgtttg taattgtcct tttaacggcg 1140
atcgcgtatt tcgtctcgct caggcgcaat cacgaatgaa taacggtttg gttggtgcga 1200
gtgattttga tgacgagcgt aatggctggc ctgttgaaca agtctggaaa gaaatgcata 1260
agcttttgcc attctcaccg gattcagtcg tcactcatgg tgatttctca cttgataacc 1320
ttatttttga cgaggggaaa ttaataggtt gtattgatgt tggacgagtc ggaatcgcag 1380
accgatacca ggatcttgcc atcctatgga actgcctcgg tgagttttct ccttcattac 1440
agaaacggct ttttcaaaaa tatggtattg ataatcctga tatgaataaa ttgcagtttc 1500
acttgatgct cgatgagttt ttctaatgag ggcccaaatg taatcacctg gctcaccttc 1560
gggtgggcct ttctgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 1620
acaaaaatcg atgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 1680
cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 1740
acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 1800
atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 1860
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 1920
acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 1980
gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg 2040
gtatctgcgc tctgctgaag ccagttacct cggaaaaaga gttggtagct cttgatccgg 2100
caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 2160
aaaaaaagga tctcaagaag atcctttgat tttctaccg 2199
<210> 116
<211> 2430
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-T1-SpyTag plasmid
<400> 116
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgg atggatcatg ctccagaaag atttgatgct actcctccag ctggtgaacc 120
agatagacca gctttgggtg ttttggaatt gacttctatt gctagaggta ttaccgttgc 180
tgatgctgct ttgaaaagag caccatcttt gttgttgatg tccagaccag tttcttccgg 240
taaacatttg ttgatgatga gaggtcaagt tgccgaagtt gaagaatcta tgattgctgc 300
tagagaaatt gctggtgctg gtggtggttc aggtggttct gctcatatag ttatggttga 360
tgcttacaag ccaacaaaag gtggtagtgg tggatctggt gctttgttgg atgaattgga 420
attgccatat gctcacgaac aactttggag atttttggat gctccagttg ttgcagatgc 480
ttgggaagaa gatactgaat ccgttattat cgttgaaacc gctactgttt gtgctgctat 540
tgattctgct gatgcagcct taaaaactgc tcctgttgtt ttgagagata tgagattggc 600
tattggtatt gctggtaagg ctttctttac tttgactggt gaattggctg atgttgaagc 660
tgctgctgaa gttgttagag aaagatgtgg tgctagattg ctagaattgg catgtattgc 720
aagaccagtt gacgaattga gaggtaggtt gtttttctag ccgagacgac tgaccattta 780
aatcatacct gacctccata gcagaaagtc aaaagcctcc gaccggaggc ttttgacttg 840
atcggcacgt aagaggttcc aactttcacc ataatgaaat aagatcacta ccgggcgtat 900
tttttgagtt atcgagattt tcaggagcta aggaagctaa aatgagccat attcaacggg 960
aaacgtcttg ctcgaggccg cgattaaatt ccaacatgga tgctgattta tatgggtata 1020
aatgggctcg cgataatgtc gggcaatcag gtgcgacaat ctatcgattg tatgggaagc 1080
ccgatgcgcc agagttgttt ctgaaacatg gcaaaggtag cgttgccaat gatgttacag 1140
atgagatggt caggctaaac tggctgacgg aatttatgcc tcttccgacc atcaagcatt 1200
ttatccgtac tcctgatgat gcatggttac tcaccactgc gatcccaggg aaaacagcat 1260
tccaggtatt agaagaatat cctgattcag gtgaaaatat tgttgatgcg ctggcagtgt 1320
tcctgcgccg gttgcattcg attcctgttt gtaattgtcc ttttaacggc gatcgcgtat 1380
ttcgtctcgc acaggcgcaa tcacgaatga ataacggttt ggttggtgcg agtgattttg 1440
atgacgagcg taatggctgg cctgttgaac aagtctggaa agaaatgcat aagcttttgc 1500
cattctcacc ggattcagtc gtcactcatg gtgatttctc acttgataac cttatttttg 1560
acgaggggaa attaataggt tgtattgatg ttggacgagt cggaatcgca gaccgatacc 1620
aggatcttgc catcctatgg aactgcctcg gtgagttttc tccttcatta cagaaacggc 1680
tttttcaaaa atatggtatt gataatcctg atatgaataa attgcagttt cacttgatgc 1740
tcgatgagtt tttctaatga gggcccaaat gtaatcacct ggctcacctt cgggtgggcc 1800
tttctgcgtt gctggcgttt ttccataggc tccgcccccc tgacgagcat cacaaaaatc 1860
gatgctcaag tcagaggtgg cgaaacccga caggactata aagataccag gcgtttcccc 1920
ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga tacctgtccg 1980
cctttctccc ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg tatctcagtt 2040
cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga accccccgtt cagcccgacc 2100
gctgcgcctt atccggtaac tatcgtcttg agtccaaccc ggtaagacac gacttatcgc 2160
cactggcagc agccactggt aacaggatta gcagagcgag gtatgtaggc ggtgctacag 2220
agttcttgaa gtggtggcct aactacggct acactagaag aacagtattt ggtatctgcg 2280
ctctgctgaa gccagttacc tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 2340
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 2400
atctcaagaa gatcctttga ttttctaccg 2430
<210> 117
<211> 687
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-T1-SpyTag nucleotide sequence of key ORF
<400> 117
atggatcatg ctccagaaag atttgatgct actcctccag ctggtgaacc agatagacca 60
gctttgggtg ttttggaatt gacttctatt gctagaggta ttaccgttgc tgatgctgct 120
ttgaaaagag caccatcttt gttgttgatg tccagaccag tttcttccgg taaacatttg 180
ttgatgatga gaggtcaagt tgccgaagtt gaagaatcta tgattgctgc tagagaaatt 240
gctggtgctg gtggtggttc aggtggttct gctcatatag ttatggttga tgcttacaag 300
ccaacaaaag gtggtagtgg tggatctggt gctttgttgg atgaattgga attgccatat 360
gctcacgaac aactttggag atttttggat gctccagttg ttgcagatgc ttgggaagaa 420
gatactgaat ccgttattat cgttgaaacc gctactgttt gtgctgctat tgattctgct 480
gatgcagcct taaaaactgc tcctgttgtt ttgagagata tgagattggc tattggtatt 540
gctggtaagg ctttctttac tttgactggt gaattggctg atgttgaagc tgctgctgaa 600
gttgttagag aaagatgtgg tgctagattg ctagaattgg catgtattgc aagaccagtt 660
gacgaattga gaggtaggtt gtttttc 687
<210> 118
<211> 229
<212> PRT
<213> Artificial Sequence
<220>
<223> HcKan_O-HO-T1-SpyTag amino acid sequence of key ORF
<400> 118
Met Asp His Ala Pro Glu Arg Phe Asp Ala Thr Pro Pro Ala Gly Glu
1 5 10 15
Pro Asp Arg Pro Ala Leu Gly Val Leu Glu Leu Thr Ser Ile Ala Arg
20 25 30
Gly Ile Thr Val Ala Asp Ala Ala Leu Lys Arg Ala Pro Ser Leu Leu
35 40 45
Leu Met Ser Arg Pro Val Ser Ser Gly Lys His Leu Leu Met Met Arg
50 55 60
Gly Gln Val Ala Glu Val Glu Glu Ser Met Ile Ala Ala Arg Glu Ile
65 70 75 80
Ala Gly Ala Gly Gly Gly Ser Gly Gly Ser Ala His Ile Val Met Val
85 90 95
Asp Ala Tyr Lys Pro Thr Lys Gly Gly Ser Gly Gly Ser Gly Ala Leu
100 105 110
Leu Asp Glu Leu Glu Leu Pro Tyr Ala His Glu Gln Leu Trp Arg Phe
115 120 125
Leu Asp Ala Pro Val Val Ala Asp Ala Trp Glu Glu Asp Thr Glu Ser
130 135 140
Val Ile Ile Val Glu Thr Ala Thr Val Cys Ala Ala Ile Asp Ser Ala
145 150 155 160
Asp Ala Ala Leu Lys Thr Ala Pro Val Val Leu Arg Asp Met Arg Leu
165 170 175
Ala Ile Gly Ile Ala Gly Lys Ala Phe Phe Thr Leu Thr Gly Glu Leu
180 185 190
Ala Asp Val Glu Ala Ala Ala Glu Val Val Arg Glu Arg Cys Gly Ala
195 200 205
Arg Leu Leu Glu Leu Ala Cys Ile Ala Arg Pro Val Asp Glu Leu Arg
210 215 220
Gly Arg Leu Phe Phe
225
<210> 119
<211> 2178
<212> DNA
<213> Artificial Sequence
<220>
<223> HcKan_T-YPT31 plasmid
<400> 119
aagaaaggcc cacccgtgaa ggtgagccag tgagttgatt gcagtccagt tacgctggag 60
tccgtctcgt agcgagatat tttgcagcag ttgcgcactt gcatgtgaat gactcttctc 120
ccctttaatt ctgtgctata tttttacaat tttctgctga catatagttt atatacatat 180
agaacgcata taggaaattg aagtaaacag aatacacaag tagaggccgg tatgtacgac 240
attttgctta ctactcttta aaatcatcgt cttcttcgtc ttcatcgtct tcttcttttt 300
caccatatcc tacatcatct ttagagcctg tgctaggttc cttcttgtct aattcttctg 360
cagtcttttt atagtcaatt actttgccgc gtgttcttct tccggatgtg atgatattag 420
aggtatcaat ttctgccaaa tcgtcctctt cttcttctcc ctcatttccc atcaatgcgt 480
ctaacttggc atcgtccata tcagacctcc gagacgactg accatttaaa tcatacctga 540
cctccatagc agaaagtcaa aagcctccga ccggaggctt ttgacttgat cggcacgtaa 600
gaggttccaa ctttcaccat aatgaaataa gatcactacc gggcgtattt tttgagttat 660
cgagattttc aggagctaag gaagctaaaa tgagccatat tcaacgggaa acgtcttgct 720
cgaggccgcg attaaattcc aacatggatg ctgatttata tgggtataaa tgggctcgcg 780
ataatgtcgg gcaatcaggt gcgacaatct atcgattgta tgggaagccc gatgcgccag 840
agttgtttct gaaacatggc aaaggtagcg ttgccaatga tgttacagat gagatggtca 900
ggctaaactg gctgacggaa tttatgcctc ttccgaccat caagcatttt atccgtactc 960
ctgatgatgc atggttactc accactgcga tcccagggaa aacagcattc caggtattag 1020
aagaatatcc tgattcaggt gaaaatattg ttgatgcgct ggcagtgttc ctgcgccggt 1080
tgcattcgat tcctgtttgt aattgtcctt ttaacggcga tcgcgtattt cgtctcgctc 1140
aggcgcaatc acgaatgaat aacggtttgg ttggtgcgag tgattttgat gacgagcgta 1200
atggctggcc tgttgaacaa gtctggaaag aaatgcataa gcttttgcca ttctcaccgg 1260
attcagtcgt cactcatggt gatttctcac ttgataacct tatttttgac gaggggaaat 1320
taataggttg tattgatgtt ggacgagtcg gaatcgcaga ccgataccag gatcttgcca 1380
tcctatggaa ctgcctcggt gagttttctc cttcattaca gaaacggctt tttcaaaaat 1440
atggtattga taatcctgat atgaataaat tgcagtttca cttgatgctc gatgagtttt 1500
tctaatgagg gcccaaatgt aatcacctgg ctcaccttcg ggtgggcctt tctgcgttgc 1560
tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga tgctcaagtc 1620
agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 1680
tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 1740
cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 1800
ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 1860
ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 1920
ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 1980
ggtggcctaa ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc 2040
cagttacctc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag 2100
cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga 2160
tcctttgatt ttctaccg 2178
<210> 120
<211> 5404
<212> DNA
<213> Artificial Sequence
<220>
<223> pGAH-YPRCd15 plasmid
<400> 120
gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc 60
ccctgacgag catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact 120
ataaagatac caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct 180
gccgcttacc ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag 240
ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca 300
cgaacccccc gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa 360
cccggtaaga cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc 420
gaggtatgta ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag 480
aaggacagta tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg 540
tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca 600
gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc 660
tgacgctcag tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag 720
gatcttcacc tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata 780
tgagtaaact tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat 840
ctgtctattt cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg 900
ggagggctta ccatctggcc ccagtgctgc aatgataccg cggctcccac gctcaccggc 960
tccagattta tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc 1020
aactttatcc gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc 1080
gccagttaat agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc 1140
gtcgtttggt atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc 1200
ccccatgttg tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa 1260
gttggccgca gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat 1320
gccatccgta agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata 1380
gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca 1440
tagcagaact ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag 1500
gatcttaccg ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc 1560
agcatctttt actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc 1620
aaaaaaggga ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata 1680
ttattgaagc atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta 1740
gaaaaataaa caaatagggg ttccgcgcac atttccccga aaagtgccac ggtctctgtc 1800
agataacgcc aggcgccttt atatcatata attaagacac aaaaggataa aacaaaggtg 1860
ttaactattc tgcatactca ctatcgtaaa ctgtcctgca aatcgtgtaa atatgtattt 1920
catttttttt gcagtgaaaa aaggcatgta aaataccgca tcaagtaact ctactccgcc 1980
tgtggtttca agactaacgg cttgagacaa aatgggaaga aatgattgca gaaaagccat 2040
atgtgtaata gcaaaaagct ggatactgct taccagatgt ttaccttaat ttcttggtga 2100
attagagaag tacagaagtt ttactattaa tcccaccata gaaatttgta taggaaagta 2160
gtttattgga gttattggat atactgtgta aactatttct tgaaattgta atcttaagat 2220
gctcttctta ttctattaaa aatagaaaat gattttcata tttatttatt tatttatatt 2280
ttggcattac tcttcatcat ttttttccct ctaagaagct tcctttcttt ttataaggat 2340
aacaaaacca aaaggaatat tgggtcagat gaatggacgc gaatgcaaga cagaagtcca 2400
aatcacgtca agacaaagaa agaaagaaag aaaaactaac acattaatgt agttttaaaa 2460
tttcaaatcc gaacaacaga gcatagggtt tcgcaaatct ctacctggct cgaagcagcg 2520
gtatttcaca ccgcatagat ccgtcgagtt caagagaaaa aaaaagaaaa agcaaaaaga 2580
aaaaaggaaa gcgcgcctcg ttcagaatga cacgtataga atgatgcatt accttgtcat 2640
cttcagtatc atactgttcg tatacatact tactgacatt cataggtata catatataca 2700
catgtatata tatcgtatgc tgcagcttta aataatcggt gtcactacat aagaacacct 2760
ttggtggagg gaacatcgtt ggtaccattg ggcgaggtgg cttctcttat ggcaaccgca 2820
agagccttga acgcactctc actacggtga tgatcattct tgcctcgcag acaatcaacg 2880
tggagggtaa ttctgctagc ctctgcaaag ctttcaagaa aatgcgggat catctcgcaa 2940
gagagatctc ctactttctc cctttgcaaa ccaagttcga caactgcgta cggcctgttc 3000
gaaagatcta ccaccgctct ggaaagtgcc tcatccaaag gcgcaaatcc tgatccaaac 3060
ctttttactc cacgcacggc ccctagggcc tctttaaaag cttgaccgag agcaatcccg 3120
cagtcttcag tggtgtgatg gtcgtctatg tgtaagtcac caatgcactc aacgattagc 3180
gaccagccgg aatgcttggc cagagcatgt atcatatggt ccagaaaccc tatacctgtg 3240
tggacgttaa tcacttgcga ttgtgtggcc tgttctgcta ctgcttctgc ctctttttct 3300
gggaagatcg agtgctctat cgctagggga ccacccttta aagagatcgc aatctgaatc 3360
ttggtttcat ttgtaatacg ctttactagg gctttctgct ctgtcatctt tgccttcgtt 3420
tatcttgcct gctcattttt tagtatattc ttcgaagaaa tcacattact ttatataatg 3480
tataattcat tatgtgataa tgccaatcgc taagaaaaaa aaagagtcat ccgctaggtg 3540
gaaaaaaaaa aatgaaaatc attaccgagg cataaaaaaa tatagagtgt actagaggag 3600
gccaagagta atagaaaaag aaaattgcgg gaaaggactg tgttatgact tccctgtgca 3660
ccacctcagg cagagaacct agagacggca atacgcaaac cgcctctccc cgcgcgttgg 3720
ccgattcatt aatgcagctg gcacgacagg tttcccgact ggaaagcggg cagtgagcgc 3780
aacgcaatta atgtgagtta gctcactcat taggcacccc aggctttaca ctttatgctt 3840
ccggctcgta tgttgtgtgg aattgtgagc ggataacaat ttcacacata ctagagaaag 3900
aggagaaata ctagatggct tcctccgaag acgttatcaa agagttcatg cgtttcaaag 3960
ttcgtatgga aggttccgtt aacggtcacg agttcgaaat cgaaggtgaa ggtgaaggtc 4020
gtccgtacga aggtacccag accgctaaac tgaaagttac caaaggtggt ccgctgccgt 4080
tcgcttggga catcctgtcc ccgcagttcc agtacggttc caaagcttac gttaaacacc 4140
cggctgacat cccggactac ctgaaactgt ccttcccgga aggtttcaaa tgggaacgtg 4200
ttatgaactt cgaagacggt ggtgttgtta ccgttaccca ggactcctcc ctgcaagacg 4260
gtgagttcat ctacaaagtt aaactgcgtg gtaccaactt cccgtccgac ggtccggtta 4320
tgcagaaaaa aaccatgggt tgggaagctt ccaccgaacg tatgtacccg gaagacggtg 4380
ctctgaaagg tgaaatcaaa atgcgtctga aactgaaaga cggtggtcac tacgacgctg 4440
aagttaaaac cacctacatg gctaaaaaac cggttcagct gccgggtgct tacaaaaccg 4500
acatcaaact ggacatcacc tcccacaacg aagactacac catcgttgaa cagtacgaac 4560
gtgctgaagg tcgtcactcc accggtgctt aataacgctg atagtgctag tgtagatcgc 4620
tactagagcc aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg cctttcgttt 4680
tatctgttgt ttgtcggtga acgctctcta ctagagtcac actggctccg tctcatgagc 4740
gcttggaagg tcgggatgag catatacaag cactaagaag aacaatacag aactctacac 4800
ggtattattg tgctacaagc tcgagtaaaa ccgagtgttt tgacgatact aacgttgtta 4860
agaaagtaac ttgttatcaa actcattacc aacttgtgat taattggtga ataatatgat 4920
aattgtcgaa attccattgt tggtaaagcc tataatatta tgtatacaga ttatactaga 4980
aattctctcg agaatataag aatccccaaa attgaatcgg tatttctaca tactaatatt 5040
accattactt ctcctttcgt tttatatgtt tcattcctat tacattatcg atctttgcat 5100
ttcagcttcc attatatttg atgtctgttt tatgtcccca cgttacaccg catgtgacag 5160
tatactagta acatgagtgc taccgaatag atgacatttt agactttcat tccaacaact 5220
tggttgacag aatgttacgt accctatatc taatctatat gaggcctgaa tctaactgaa 5280
aggtggaatt tcagtaattt atcaagcttt aataagtttg ggtagtttaa ctgtgcaaaa 5340
aggtatttac cttacatact gaatcttgtc tgtttggtag cggctgcttt atgtcggaga 5400
gacc 5404
<210> 121
<211> 6264
<212> DNA
<213> Artificial Sequence
<220>
<223> pGAH-YPRCd15-GFP-SpyCatcher plasmid
<400> 121
gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc 60
ccctgacgag catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact 120
ataaagatac caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct 180
gccgcttacc ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag 240
ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca 300
cgaacccccc gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa 360
cccggtaaga cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc 420
gaggtatgta ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag 480
aaggacagta tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg 540
tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca 600
gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc 660
tgacgctcag tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag 720
gatcttcacc tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata 780
tgagtaaact tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat 840
ctgtctattt cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg 900
ggagggctta ccatctggcc ccagtgctgc aatgataccg cggctcccac gctcaccggc 960
tccagattta tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc 1020
aactttatcc gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc 1080
gccagttaat agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc 1140
gtcgtttggt atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc 1200
ccccatgttg tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa 1260
gttggccgca gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat 1320
gccatccgta agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata 1380
gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca 1440
tagcagaact ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag 1500
gatcttaccg ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc 1560
agcatctttt actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc 1620
aaaaaaggga ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata 1680
ttattgaagc atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta 1740
gaaaaataaa caaatagggg ttccgcgcac atttccccga aaagtgccac ggtctctgtc 1800
agataacgcc aggcgccttt atatcatata attaagacac aaaaggataa aacaaaggtg 1860
ttaactattc tgcatactca ctatcgtaaa ctgtcctgca aatcgtgtaa atatgtattt 1920
catttttttt gcagtgaaaa aaggcatgta aaataccgca tcaagtaact ctactccgcc 1980
tgtggtttca agactaacgg cttgagacaa aatgggaaga aatgattgca gaaaagccat 2040
atgtgtaata gcaaaaagct ggatactgct taccagatgt ttaccttaat ttcttggtga 2100
attagagaag tacagaagtt ttactattaa tcccaccata gaaatttgta taggaaagta 2160
gtttattgga gttattggat atactgtgta aactatttct tgaaattgta atcttaagat 2220
gctcttctta ttctattaaa aatagaaaat gattttcata tttatttatt tatttatatt 2280
ttggcattac tcttcatcat ttttttccct ctaagaagct tcctttcttt ttataaggat 2340
aacaaaacca aaaggaatat tgggtcagat gaatggacgc gaatgcaaga cagaagtcca 2400
aatcacgtca agacaaagaa agaaagaaag aaaaactaac acattaatgt agttttaaaa 2460
tttcaaatcc gaacaacaga gcatagggtt tcgcaaatct ctacctggct cgaagcagcg 2520
gtatttcaca ccgcatagat ccgtcgagtt caagagaaaa aaaaagaaaa agcaaaaaga 2580
aaaaaggaaa gcgcgcctcg ttcagaatga cacgtataga atgatgcatt accttgtcat 2640
cttcagtatc atactgttcg tatacatact tactgacatt cataggtata catatataca 2700
catgtatata tatcgtatgc tgcagcttta aataatcggt gtcactacat aagaacacct 2760
ttggtggagg gaacatcgtt ggtaccattg ggcgaggtgg cttctcttat ggcaaccgca 2820
agagccttga acgcactctc actacggtga tgatcattct tgcctcgcag acaatcaacg 2880
tggagggtaa ttctgctagc ctctgcaaag ctttcaagaa aatgcgggat catctcgcaa 2940
gagagatctc ctactttctc cctttgcaaa ccaagttcga caactgcgta cggcctgttc 3000
gaaagatcta ccaccgctct ggaaagtgcc tcatccaaag gcgcaaatcc tgatccaaac 3060
ctttttactc cacgcacggc ccctagggcc tctttaaaag cttgaccgag agcaatcccg 3120
cagtcttcag tggtgtgatg gtcgtctatg tgtaagtcac caatgcactc aacgattagc 3180
gaccagccgg aatgcttggc cagagcatgt atcatatggt ccagaaaccc tatacctgtg 3240
tggacgttaa tcacttgcga ttgtgtggcc tgttctgcta ctgcttctgc ctctttttct 3300
gggaagatcg agtgctctat cgctagggga ccacccttta aagagatcgc aatctgaatc 3360
ttggtttcat ttgtaatacg ctttactagg gctttctgct ctgtcatctt tgccttcgtt 3420
tatcttgcct gctcattttt tagtatattc ttcgaagaaa tcacattact ttatataatg 3480
tataattcat tatgtgataa tgccaatcgc taagaaaaaa aaagagtcat ccgctaggtg 3540
gaaaaaaaaa aatgaaaatc attaccgagg cataaaaaaa tatagagtgt actagaggag 3600
gccaagagta atagaaaaag aaaattgcgg gaaaggactg tgttatgact tccctgtgca 3660
ccacctcagg cagagaacct ggctgtgatg tctaagtaac ctttatggta tatttcttaa 3720
tgtggaaaga tactagcgcg cgcacccaca cacaagcttc gtcttttctt gaagaaaaga 3780
ggaagctcgc taaatgggat tccactttcc gttccctgcc agctgatgga aaaaggttag 3840
tggaacgatg aagaataaaa agagagatcc actgaggtga aatttcagct gacagcgagt 3900
ttcatgatcg tgatgaacaa tggtaacgag ttgtggctgt tgccagggag ggtggttctc 3960
aacttttaat gtatggccaa atcgctactt gggtttgtta tataacaaag aagaaataat 4020
gaactgattc tcttcctcct tcttgtcctt tcttaattct gttgtaatta ccttcctttg 4080
taattttttt tgtaattatt cttcttaata atccaaacaa acacacatat tacaatagat 4140
gggttcttct catcatcacc atcaccattc ttctgggatg tctaaaggtg aagaattatt 4200
cactggtgtt gtcccaattt tggttgaatt agatggtgat gttaatggtc acaaattttc 4260
tgtctccggt gaaggtgaag gtgatgctac ttacggtaaa ttgaccttaa aatttatttg 4320
tactactggt aaattgccag ttccatggcc aaccttagtc actactttaa cttatggtgt 4380
tcaatgtttt tctagatacc cagatcatat gaaacaacat gactttttca agtctgccat 4440
gccagaaggt tatgttcaag aaagaactat ttttttcaaa gatgacggta actacaagac 4500
cagagctgaa gtcaagtttg aaggtgatac cttagttaat agaatcgaat taaaaggtat 4560
tgattttaaa gaagatggta acattttagg tcacaaattg gaatacaact ataactctca 4620
caatgtttac atcatggctg acaaacaaaa gaatggtatc aaagttaact tcaaaattag 4680
acacaacatt gaagatggtt ctgttcaatt agctgaccat tatcaacaaa atactccaat 4740
tggtgatggt ccagtcttgt taccagacaa ccattactta tccactcaat ctaaattatc 4800
caaagatcca aacgaaaaga gagatcacat ggtcttgtta gaatttgtta ctgctgctgg 4860
tattacccat ggtatggatg aattgtacaa aggttctggt ggttctgatt ctgctactca 4920
tattaagttc tccaagaggg acgaagatgg taaagaattg gctggtgcaa ctatggaatt 4980
gagagattct tctggtaaga ccatttccac ctggatttct gatggtcaag ttaaggattt 5040
ctacttgtac ccaggtaagt acactttcgt tgaaactgct gctccagatg gttatgaagt 5100
tgctactgct attactttca ccgtcaatga acaaggtcaa gtcactgtta atggttagcg 5160
agatattttg cagcagttgc gcacttgcat gtgaatgact cttctcccct ttaattctgt 5220
gctatatttt tacaattttc tgctgacata tagtttatat acatatagaa cgcatatagg 5280
aaattgaagt aaacagaata cacaagtaga ggccggtatg tacgacattt tgcttactac 5340
tctttaaaat catcgtcttc ttcgtcttca tcgtcttctt ctttttcacc atatcctaca 5400
tcatctttag agcctgtgct aggttccttc ttgtctaatt cttctgcagt ctttttatag 5460
tcaattactt tgccgcgtgt tcttcttccg gatgtgatga tattagaggt atcaatttct 5520
gccaaatcgt cctcttcttc ttctccctca tttcccatca atgcgtctaa cttggcatcg 5580
tccatatcag acctctgagc gcttggaagg tcgggatgag catatacaag cactaagaag 5640
aacaatacag aactctacac ggtattattg tgctacaagc tcgagtaaaa ccgagtgttt 5700
tgacgatact aacgttgtta agaaagtaac ttgttatcaa actcattacc aacttgtgat 5760
taattggtga ataatatgat aattgtcgaa attccattgt tggtaaagcc tataatatta 5820
tgtatacaga ttatactaga aattctctcg agaatataag aatccccaaa attgaatcgg 5880
tatttctaca tactaatatt accattactt ctcctttcgt tttatatgtt tcattcctat 5940
tacattatcg atctttgcat ttcagcttcc attatatttg atgtctgttt tatgtcccca 6000
cgttacaccg catgtgacag tatactagta acatgagtgc taccgaatag atgacatttt 6060
agactttcat tccaacaact tggttgacag aatgttacgt accctatatc taatctatat 6120
gaggcctgaa tctaactgaa aggtggaatt tcagtaattt atcaagcttt aataagtttg 6180
ggtagtttaa ctgtgcaaaa aggtatttac cttacatact gaatcttgtc tgtttggtag 6240
cggctgcttt atgtcggaga gacc 6264
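Each record in the sequence listing above declares its length in the `<211>` field and then gives the residues in groups of ten, with a running position counter at the end of each line. As a rough consistency check, the sketch below parses one such record and compares the declared length with the number of residues actually present. The function name `parse_listing_record` and the 24-base toy record are hypothetical and are not taken from the listing; the layout is assumed to follow the pattern visible in the records above.

```python
import re


def parse_listing_record(text):
    """Return (declared_length, sequence) for one <210>...<400> record.

    Assumes the layout seen in the listing: a <211> length field, then a
    <400> line, then sequence lines that each end in a position counter.
    """
    declared = int(re.search(r"<211>\s*(\d+)", text).group(1))
    # Everything after the <400> tag; the first line only holds the SEQ ID number.
    body = text.split("<400>", 1)[1]
    residues = []
    for line in body.splitlines()[1:]:
        # Keep the alphabetic base groups, drop the numeric position counter.
        residues.extend(tok for tok in line.split() if tok.isalpha())
    return declared, "".join(residues)


# Hypothetical toy record, for illustration only (not a sequence from this patent).
toy_record = """<210> 999
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 999
gcaaaaggcc aggaaccgta 20
aaaa 24
"""

declared_length, sequence = parse_listing_record(toy_record)
print(declared_length, len(sequence), declared_length == len(sequence))
```

A mismatch between the two numbers would usually point to extraction damage (a dropped or duplicated line) rather than to an error in the filed sequence.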
Claims (24)
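1. A method for producing bacterial microcompartment virus-like particles (VLPs) carrying a cargo molecule, the method comprising
A) introducing into a host cell or organism one or more heterologous polynucleotides comprising
(i) a first sequence encoding bacterial microcompartment shell protomers; and
(ii) a second sequence encoding a cargo molecule fused to an encapsulation peptide, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO:1 (SKITGSSGNDTQGSLITYSGGARG) or SEQ ID NO:94 (KPEKPGSKITGSSGNDTQGSLITYSGGARG), or a functional variant thereof;
a) expressing the first sequence and the second sequence; and
b) forming microcompartments that encapsulate the cargo molecule; or
B) introducing into a host cell or organism one or more polynucleotides comprising
(i) a first sequence encoding bacterial microcompartment shell protomers; and
(ii) a second sequence encoding at least one of said protomers fused to a cargo molecule or a biochemical tag;
a) expressing the first sequence and the second sequence; and
b) forming microcompartments that express the cargo molecule on their outer surface, or
c) forming microcompartments that express the biochemical tag on their outer surface, wherein a cargo molecule comprising a complementary tag is capable of binding the biochemical tag.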
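2. The method according to claim 1, wherein the functional variant of the encapsulation peptide set forth in SEQ ID NO:1 comprises, at its amino terminus, 1, 2, 3, 4 or 5 additional amino acids from the amino terminus of SEQ ID NO:94, such variants being intermediates between the sequences SEQ ID NO:1 and SEQ ID NO:94.
3. The method according to claim 1 or 2, wherein the bacterial microcompartment protomers are derived from Halothiobacillus neapolitanus or Haliangium ochraceum.
4. The method according to claim 3, wherein the bacterial microcompartment protomers are CsoS1A (SEQ ID NO:2) and CsoS4A (SEQ ID NO:3) from Halothiobacillus neapolitanus; or HO-H (SEQ ID NO:4), HO-P (SEQ ID NO:5) and HO-T1 (SEQ ID NO:6) from Haliangium ochraceum, and variants thereof.
5. The method according to any one of claims 1 to 4, wherein the cargo molecule is at least one peptide, such as an enzyme and/or a fluorescent protein and/or an immunogenic peptide.
6. The method according to any one of claims 1 to 5, wherein the biochemical tag may be selected from Strep-Tag II (SII), the SpyCatcher/SpyTag (SC/ST) pair and the CC-Di-A/B (CCA/CCB) pair.
7. The method according to any one of claims 4 to 6, wherein expression of CsoS1A is controlled by promoter PT7; CsoS4A is controlled by promoter PCON5; HO-H is controlled by yeast promoter PTDH3; HO-P is controlled by yeast promoter PPYK1; and HO-T1 is controlled by yeast promoter PYEF3.
8. The method according to any one of claims 1 to 7, wherein the host organism is Escherichia coli or Saccharomyces cerevisiae.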
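9. An engineered bacterial microcompartment VLP carrying a cargo molecule, the engineered bacterial microcompartment VLP comprising:
i) bacterial microcompartment shell protomers and a cargo molecule fused to an encapsulation peptide, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO:1 or SEQ ID NO:94, or a functional variant thereof; or
ii) bacterial microcompartment shell protomers and a cargo molecule, wherein the cargo molecule is fused to a terminus of at least one of said protomers, or wherein at least one of said protomers is fused to a tag and a cargo molecule comprising a complementary tag is bound to the tag on the outer surface of the VLP.
10. The engineered VLP according to claim 9, wherein the functional variant of the encapsulation peptide set forth in SEQ ID NO:1 comprises, at its amino terminus, 1, 2, 3, 4 or 5 additional amino acids from the amino terminus of SEQ ID NO:94, such variants being intermediates between the sequences SEQ ID NO:1 and SEQ ID NO:94.
11. The engineered VLP according to claim 9 or 10, wherein the bacterial microcompartment protomers are derived from Halothiobacillus neapolitanus or Haliangium ochraceum.
12. The engineered VLP according to claim 11, wherein the bacterial microcompartment protomers are CsoS1A comprising the amino acid sequence set forth in SEQ ID NO:2 and CsoS4A comprising the amino acid sequence set forth in SEQ ID NO:3, from Halothiobacillus neapolitanus; or HO-H comprising the amino acid sequence set forth in SEQ ID NO:4, HO-P comprising the amino acid sequence set forth in SEQ ID NO:5 and HO-T1 comprising the amino acid sequence set forth in SEQ ID NO:6, from Haliangium ochraceum, and variants thereof.
13. The engineered VLP according to any one of claims 9 to 12, wherein the cargo molecule is at least one peptide, such as an enzyme and/or a fluorescent protein and/or an immunogenic peptide.
14. The engineered VLP according to any one of claims 9 to 13, wherein the biochemical tag is selected from Strep-Tag II (SII), the SpyCatcher/SpyTag (SC/ST) pair and the CC-Di-A/B (CCA/CCB) pair.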
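15. An isolated plasmid or vector nucleic acid, the isolated plasmid or vector nucleic acid comprising:
a) first DNA sequences encoding bacterial microcompartment shell protomers, each operably linked to a promoter, and
b) a second DNA sequence encoding a cargo molecule fused to an encapsulation peptide, operably linked to a promoter, wherein the encapsulation peptide comprises the amino acid sequence set forth in SEQ ID NO:1 or SEQ ID NO:94, or a functional variant thereof; or
c) first DNA sequences encoding bacterial microcompartment shell protomers, each operably linked to a promoter, and
d) a second DNA sequence encoding at least one of said protomers fused to a cargo molecule or a biochemical tag.
16. The isolated plasmid or vector according to claim 13, wherein the functional variant of the encapsulation peptide set forth in SEQ ID NO:1 comprises, at its amino terminus, 1, 2, 3, 4 or 5 additional amino acids from the amino terminus of SEQ ID NO:94, such variants being intermediates between the sequences SEQ ID NO:1 and SEQ ID NO:94.
17. The isolated plasmid or vector according to claim 15 or 16, wherein the bacterial microcompartment shell protomers, promoters, cargo molecule and tag are as defined in any one of claims 1 to 7.
18. The isolated plasmid or vector according to claim 17, wherein the DNA sequences encoding the bacterial microcompartment shell protomers, cargo molecule and tag have at least 70%, at least 80%, at least 90% or 100% identity to SEQ ID NOs:7-12 and 95, owing to the redundancy of the genetic code.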
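19. A composition or combination comprising at least one engineered VLP according to any one of claims 9 to 14, the composition or combination being for use in:
a) preventing or treating a disease in a subject; or
b) a biochemical process.
20. The composition or combination according to claim 19, wherein the at least one engineered VLP comprises an enzyme for converting a prodrug.
21. The combination according to claim 19 or 20, the combination comprising one or more additional therapeutic agents.
22. The composition or combination according to claim 19 or 21, the composition or combination being a vaccine.
23. Use of at least one engineered VLP according to any one of claims 9 to 14 in the manufacture of a medicament for preventing or treating a disease in a subject.
24. A method of prevention or treatment, the method comprising administering to a subject in need of such treatment an effective amount of the engineered VLP according to any one of claims 9 to 14.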
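Claims 2, 10 and 16 describe functional variants of SEQ ID NO:1 that carry 1 to 5 additional amino-terminal residues and are intermediates between SEQ ID NO:1 and SEQ ID NO:94. The sketch below enumerates such intermediates, using the two peptide sequences given in claim 1. It assumes the additional residues are the contiguous residues that immediately precede SEQ ID NO:1 within SEQ ID NO:94; that reading, and the helper name `intermediate_variants`, are illustrative assumptions rather than definitions from the patent.

```python
SEQ_ID_NO_1 = "SKITGSSGNDTQGSLITYSGGARG"
SEQ_ID_NO_94 = "KPEKPGSKITGSSGNDTQGSLITYSGGARG"


def intermediate_variants(short_peptide, long_peptide):
    """List peptides strictly between short_peptide and long_peptide.

    Each variant is short_peptide extended at its N-terminus by the last
    n residues of the extra N-terminal stretch found in long_peptide.
    """
    if not long_peptide.endswith(short_peptide):
        raise ValueError("long_peptide must end with short_peptide")
    extension = long_peptide[: len(long_peptide) - len(short_peptide)]
    # n = 1 .. len(extension) - 1, so neither end point is included.
    return [
        extension[len(extension) - n:] + short_peptide
        for n in range(1, len(extension))
    ]


variants = intermediate_variants(SEQ_ID_NO_1, SEQ_ID_NO_94)
for extra, peptide in enumerate(variants, start=1):
    print(extra, peptide)
```

Because SEQ ID NO:94 is SEQ ID NO:1 preceded by the six residues KPEKPG, this reading yields exactly five intermediates, matching the "1, 2, 3, 4 or 5" wording of the claims.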
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG10202010547W | 2020-10-23 | ||
SG10202010547W | 2020-10-23 | ||
PCT/SG2021/050639 WO2022086450A1 (en) | 2020-10-23 | 2021-10-21 | Bacterial microcompartment virus-like particles |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116615550A (zh) | 2023-08-18 |
Family
ID=81291750
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202180080176.0A (Pending) CN116615550A (zh) | Bacterial microcompartment virus-like particles | 2020-10-23 | 2021-10-21 |
Country Status (8)
Country | Link |
---|---|
US (1) | US20230302124A1 (zh) |
EP (1) | EP4232585A1 (zh) |
JP (1) | JP2023546670A (zh) |
KR (1) | KR20230088911A (zh) |
CN (1) | CN116615550A (zh) |
AU (1) | AU2021364272A1 (zh) |
CA (1) | CA3196412A1 (zh) |
WO (1) | WO2022086450A1 (zh) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2011094765A2 (en) * | 2010-02-01 | 2011-08-04 | The Regents Of The University Of California | A targeting signal for integrating proteins, peptides and biological molecules into bacterial microcompartments |
EP2574620A1 (en) * | 2011-09-28 | 2013-04-03 | University College Cork | Accumulation of metabolic products in bacterial microcompartments |
CN110438095B (zh) * | 2019-07-31 | 2023-09-05 | Wuhan Institute of Virology, Chinese Academy of Sciences | Synthesis and application of a novel nanoreactor vessel |
2021
- 2021-10-21 EP EP21883435.6A patent/EP4232585A1/en active Pending
- 2021-10-21 WO PCT/SG2021/050639 patent/WO2022086450A1/en active Application Filing
- 2021-10-21 CA CA3196412A patent/CA3196412A1/en active Pending
- 2021-10-21 CN CN202180080176.0A patent/CN116615550A/zh active Pending
- 2021-10-21 AU AU2021364272A patent/AU2021364272A1/en active Pending
- 2021-10-21 JP JP2023524512A patent/JP2023546670A/ja active Pending
- 2021-10-21 US US18/033,282 patent/US20230302124A1/en active Pending
- 2021-10-21 KR KR1020237016963A patent/KR20230088911A/ko unknown
Also Published As
Publication number | Publication date |
---|---|
WO2022086450A1 (en) | 2022-04-28 |
AU2021364272A1 (en) | 2023-06-08 |
JP2023546670A (ja) | 2023-11-07 |
US20230302124A1 (en) | 2023-09-28 |
CA3196412A1 (en) | 2022-04-28 |
KR20230088911A (ko) | 2023-06-20 |
EP4232585A1 (en) | 2023-08-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2021203937B2 (en) | Compositions and methods for rapid and dynamic flux control using synthetic metabolic valves | |
KR102370675B1 | Improved methods for modification of target nucleic acids | |
AU2016203445B2 (en) | Integration of a polynucleotide encoding a polypeptide that catalyzes pyruvate to acetolactate conversion | |
AU2023270322A1 (en) | Compositions and methods for modifying genomes | |
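KR102006527B1 | Vectors for expression of prostate-associated antigens | |
CN101939434B | DGAT genes from Yarrowia lipolytica for increased seed storage lipid production and altered fatty acid profiles in soybean | |
CN104838016B | Method for identifying a cell having an increased intracellular concentration of a specific metabolite compared to its wild type | |
KR20140099224A | Keto-isovalerate decarboxylase enzymes and methods of use thereof | |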
DK2443248T3 (en) | IMPROVEMENT OF LONG-CHAIN POLYUM Saturated OMEGA-3 AND OMEGA-6 FATTY ACID BIOS SYNTHESIS BY EXPRESSION OF ACYL-CoA LYSOPHOSPHOLIPID ACYL TRANSFERASES | |
KR20140092759A | Host cells and methods for production of isobutanol | |
KR20210080375A | Recombinant poxviruses for cancer immunotherapy | |
KR20180069081A | Compositions and methods for expressing multiple biologically active polypeptides from a single vector for treatment of cardiac conditions and other pathologies | |
CN107630029B | Candida utilis episomal expression vector and construction method and application thereof | |
CN109996874A | Heterologous production of 10-methylstearic acid | |
KR20210148270A | Methods for integrating polynucleotides into the genome of Bacillus using dual circular recombinant DNA constructs and compositions thereof | |
CN114945665A | Compositions and methods for enhanced protein production in Bacillus licheniformis | |
KR20210148269A | Methods for integrating a donor DNA sequence into the genome of Bacillus using linear recombinant DNA constructs and compositions thereof | |
CN107002070A | Co-expression plasmid | |
KR20080030956A | Treatment of disease using an improved regulated expression system | |
CN112553240A | Recombinant expression vector system, recombinant engineered bacterium, and preparation method and use thereof | |
CN115243701A | IgG variants for inducing an immune response without adjuvant | |
CN116615550A | Bacterial microcompartment virus-like particles | |
CN115209909A | Delivery compositions and methods | |
RU2730664C2 | Gene therapy DNA vector based on the gene therapy DNA vector VTvaf17, carrying a target gene selected from the group of genes ANG, ANGPT1, VEGFA, FGF1, HIF1α, HGF, SDF1, KLK4, PDGFC, PROK1, PROK2 for increasing the expression level of these target genes, method for its production and use, Escherichia coli strain SCS110-AF/VTvaf17-ANG, or Escherichia coli SCS110-AF/VTvaf17-ANGPT1, or Escherichia coli SCS110-AF/VTvaf17-VEGFA, or Escherichia coli SCS110-AF/VTvaf17-FGF1, or Escherichia coli SCS110-AF/VTvaf17-HIF1α, or Escherichia coli SCS110-AF/VTvaf17-HGF, or Escherichia coli SCS110-AF/VTvaf17-SDF1, or Escherichia coli SCS110-AF/VTvaf17-KLK4, or Escherichia coli SCS110-AF/VTvaf17-PDGFC, or Escherichia coli SCS110-AF/VTvaf17-PROK1, or Escherichia coli SCS110-AF/VTvaf17-PROK2, carrying the gene therapy DNA vector, method for its production, and method for industrial-scale production of the gene therapy DNA vector | |
US20040014158A1 (en) | Protein conjugates, methods, vectors, proteins and DNA for producing them, their use, and medicaments and vaccines containing a certain quantity of said protein conjugates |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||