WO2024118635A2 - Thermostable binding scaffolds - Google Patents
Thermostable binding scaffolds
- Publication number
- WO2024118635A2 (PCT/US2023/081399)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- seq
- sequence
- amino acid
- relative
- protein scaffold
- Prior art date
- 230000027455 binding Effects 0.000 title claims abstract description 85
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 338
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 337
- 235000001014 amino acid Nutrition 0.000 claims description 705
- 150000001413 amino acids Chemical class 0.000 claims description 703
- 230000035772 mutation Effects 0.000 claims description 463
- 238000006467 substitution reaction Methods 0.000 claims description 387
- 238000012217 deletion Methods 0.000 claims description 232
- 230000037430 deletion Effects 0.000 claims description 232
- 238000003780 insertion Methods 0.000 claims description 226
- 230000037431 insertion Effects 0.000 claims description 226
- 239000011347 resin Substances 0.000 claims description 156
- 229920005989 resin Polymers 0.000 claims description 155
- 238000000034 method Methods 0.000 claims description 40
- 108091033319 polynucleotide Proteins 0.000 claims description 29
- 102000040430 polynucleotide Human genes 0.000 claims description 29
- 239000002157 polynucleotide Substances 0.000 claims description 29
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 claims description 26
- 239000002245 particle Substances 0.000 claims description 26
- 239000011324 bead Substances 0.000 claims description 24
- 235000009582 asparagine Nutrition 0.000 claims description 22
- 235000018977 lysine Nutrition 0.000 claims description 22
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 20
- 229920001184 polypeptide Polymers 0.000 claims description 18
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 18
- 150000002669 lysines Chemical class 0.000 claims description 17
- 235000018417 cysteine Nutrition 0.000 claims description 16
- 150000001508 asparagines Chemical class 0.000 claims description 15
- 229910052747 lanthanoid Inorganic materials 0.000 claims description 14
- 150000002602 lanthanoids Chemical class 0.000 claims description 14
- 239000000203 mixture Substances 0.000 claims description 13
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 claims description 12
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 claims description 11
- 239000013598 vector Substances 0.000 claims description 11
- 108010090804 Streptavidin Proteins 0.000 claims description 10
- 125000000524 functional group Chemical group 0.000 claims description 10
- 102000004190 Enzymes Human genes 0.000 claims description 9
- 108090000790 Enzymes Proteins 0.000 claims description 9
- 229910052771 Terbium Inorganic materials 0.000 claims description 9
- 230000001590 oxidative effect Effects 0.000 claims description 9
- GZCRRIHWUXGPOV-UHFFFAOYSA-N terbium atom Chemical group [Tb] GZCRRIHWUXGPOV-UHFFFAOYSA-N 0.000 claims description 9
- 230000002285 radioactive effect Effects 0.000 claims description 8
- 102100036402 DAP3-binding cell death enhancer 1 Human genes 0.000 claims description 6
- 101000929221 Homo sapiens DAP3-binding cell death enhancer 1 Proteins 0.000 claims description 6
- 229960002685 biotin Drugs 0.000 claims description 6
- 235000020958 biotin Nutrition 0.000 claims description 6
- 239000011616 biotin Substances 0.000 claims description 6
- 238000012258 culturing Methods 0.000 claims description 6
- 229910052698 phosphorus Inorganic materials 0.000 claims description 6
- 229920001223 polyethylene glycol Polymers 0.000 claims description 5
- 150000003141 primary amines Chemical class 0.000 claims description 4
- 239000002202 Polyethylene glycol Substances 0.000 claims description 3
- 108091028664 Ribonucleotide Proteins 0.000 claims description 3
- 239000005547 deoxyribonucleotide Substances 0.000 claims description 3
- 125000002637 deoxyribonucleotide group Chemical group 0.000 claims description 3
- 239000007850 fluorescent dye Substances 0.000 claims description 3
- 125000003827 glycol group Chemical group 0.000 claims description 3
- 230000003100 immobilizing effect Effects 0.000 claims description 3
- 239000006249 magnetic particle Substances 0.000 claims description 3
- 239000002336 ribonucleotide Substances 0.000 claims description 3
- 125000002652 ribonucleotide group Chemical group 0.000 claims description 3
- 125000003396 thiol group Chemical group [H]S* 0.000 claims description 3
- 238000001042 affinity chromatography Methods 0.000 abstract description 48
- 239000012539 chromatography resin Substances 0.000 abstract description 11
- 238000011161 development Methods 0.000 abstract description 7
- 108010038196 saccharide-binding proteins Proteins 0.000 abstract description 7
- 239000000758 substrate Substances 0.000 abstract description 6
- 241000193468 Clostridium perfringens Species 0.000 abstract description 4
- 108010003272 Hyaluronate lyase Proteins 0.000 abstract description 4
- 102000001974 Hyaluronidases Human genes 0.000 abstract description 3
- 229960002773 hyaluronidase Drugs 0.000 abstract description 3
- 108020001580 protein domains Proteins 0.000 abstract description 3
- 229940024606 amino acid Drugs 0.000 description 574
- 235000018102 proteins Nutrition 0.000 description 121
- -1 1 to 20 amino acids Chemical class 0.000 description 91
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 90
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 75
- 238000000746 purification Methods 0.000 description 43
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 42
- 210000004027 cell Anatomy 0.000 description 42
- ZMXDDKWLCZADIW-UHFFFAOYSA-N N,N-Dimethylformamide Chemical compound CN(C)C=O ZMXDDKWLCZADIW-UHFFFAOYSA-N 0.000 description 39
- 239000003795 chemical substances by application Substances 0.000 description 35
- 239000006166 lysate Substances 0.000 description 33
- 239000005090 green fluorescent protein Substances 0.000 description 27
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 27
- 241000588724 Escherichia coli Species 0.000 description 25
- 239000000872 buffer Substances 0.000 description 23
- 238000010828 elution Methods 0.000 description 22
- 102000051619 SUMO-1 Human genes 0.000 description 19
- 108090000631 Trypsin Proteins 0.000 description 19
- 102000004142 Trypsin Human genes 0.000 description 19
- 239000000499 gel Substances 0.000 description 19
- 239000012588 trypsin Substances 0.000 description 19
- 101000879203 Caenorhabditis elegans Small ubiquitin-related modifier Proteins 0.000 description 18
- 238000004458 analytical method Methods 0.000 description 14
- 238000002844 melting Methods 0.000 description 14
- 230000008018 melting Effects 0.000 description 14
- UXVMQQNJUSDDNG-UHFFFAOYSA-L Calcium chloride Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 13
- 125000000539 amino acid group Chemical group 0.000 description 13
- 239000001110 calcium chloride Substances 0.000 description 13
- 235000011148 calcium chloride Nutrition 0.000 description 13
- 229910001628 calcium chloride Inorganic materials 0.000 description 13
- 230000029087 digestion Effects 0.000 description 13
- 230000000694 effects Effects 0.000 description 13
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 12
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 12
- 239000006228 supernatant Substances 0.000 description 12
- 238000004140 cleaning Methods 0.000 description 11
- 239000011230 binding agent Substances 0.000 description 10
- 238000004587 chromatography analysis Methods 0.000 description 10
- 238000004091 panning Methods 0.000 description 10
- 239000008188 pellet Substances 0.000 description 10
- 239000000243 solution Substances 0.000 description 10
- PXHVJJICTQNCMI-UHFFFAOYSA-N Nickel Chemical compound [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 9
- 125000003275 alpha amino acid group Chemical group 0.000 description 9
- 238000013459 approach Methods 0.000 description 9
- 238000006243 chemical reaction Methods 0.000 description 9
- RWSXRVCMGQZWBV-WDSKDSINSA-N glutathione Chemical compound OC(=O)[C@@H](N)CCC(=O)N[C@@H](CS)C(=O)NCC(O)=O RWSXRVCMGQZWBV-WDSKDSINSA-N 0.000 description 9
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 8
- 102000035195 Peptidases Human genes 0.000 description 8
- 108091005804 Peptidases Proteins 0.000 description 8
- 239000004365 Protease Substances 0.000 description 8
- 150000002019 disulfides Chemical class 0.000 description 8
- 229940088598 enzyme Drugs 0.000 description 8
- 238000002703 mutagenesis Methods 0.000 description 8
- 231100000350 mutagenesis Toxicity 0.000 description 8
- 230000007935 neutral effect Effects 0.000 description 8
- 230000002829 reductive effect Effects 0.000 description 8
- 239000012146 running buffer Substances 0.000 description 8
- 239000000523 sample Substances 0.000 description 8
- 238000012360 testing method Methods 0.000 description 8
- 229920000936 Agarose Polymers 0.000 description 7
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 7
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 7
- 125000000899 L-alpha-glutamyl group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C([H])([H])C([H])([H])C(O[H])=O 0.000 description 7
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 7
- 229920002684 Sepharose Polymers 0.000 description 7
- 229960001230 asparagine Drugs 0.000 description 7
- 239000008103 glucose Substances 0.000 description 7
- 239000013612 plasmid Substances 0.000 description 7
- 108020004414 DNA Proteins 0.000 description 6
- 239000004471 Glycine Substances 0.000 description 6
- 102100022662 Guanylyl cyclase C Human genes 0.000 description 6
- 101710198293 Guanylyl cyclase C Proteins 0.000 description 6
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 6
- 239000004472 Lysine Substances 0.000 description 6
- 101150071716 PCSK1 gene Proteins 0.000 description 6
- PZBFGYYEXUXCOF-UHFFFAOYSA-N TCEP Chemical compound OC(=O)CCP(CCC(O)=O)CCC(O)=O PZBFGYYEXUXCOF-UHFFFAOYSA-N 0.000 description 6
- 239000007983 Tris buffer Substances 0.000 description 6
- 235000009697 arginine Nutrition 0.000 description 6
- 238000003556 assay Methods 0.000 description 6
- 210000004899 c-terminal region Anatomy 0.000 description 6
- 238000005119 centrifugation Methods 0.000 description 6
- 239000003638 chemical reducing agent Substances 0.000 description 6
- 230000021615 conjugation Effects 0.000 description 6
- 230000018109 developmental process Effects 0.000 description 6
- 238000002474 experimental method Methods 0.000 description 6
- 238000011534 incubation Methods 0.000 description 6
- 239000002198 insoluble material Substances 0.000 description 6
- 239000002609 medium Substances 0.000 description 6
- 239000003960 organic solvent Substances 0.000 description 6
- 238000002823 phage display Methods 0.000 description 6
- 238000001542 size-exclusion chromatography Methods 0.000 description 6
- 238000011282 treatment Methods 0.000 description 6
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 6
- 108010039224 Amidophosphoribosyltransferase Proteins 0.000 description 5
- 101100478890 Caenorhabditis elegans smo-1 gene Proteins 0.000 description 5
- 108090000317 Chymotrypsin Proteins 0.000 description 5
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 5
- 239000007993 MOPS buffer Substances 0.000 description 5
- 230000009286 beneficial effect Effects 0.000 description 5
- OWMVSZAMULFTJU-UHFFFAOYSA-N bis-tris Chemical compound OCCN(CCO)C(CO)(CO)CO OWMVSZAMULFTJU-UHFFFAOYSA-N 0.000 description 5
- 239000003153 chemical reaction reagent Substances 0.000 description 5
- 229960002376 chymotrypsin Drugs 0.000 description 5
- 238000003776 cleavage reaction Methods 0.000 description 5
- 239000013078 crystal Substances 0.000 description 5
- 238000000326 densiometry Methods 0.000 description 5
- 238000002022 differential scanning fluorescence spectroscopy Methods 0.000 description 5
- 239000013604 expression vector Substances 0.000 description 5
- 238000002955 isolation Methods 0.000 description 5
- 239000000463 material Substances 0.000 description 5
- 230000035945 sensitivity Effects 0.000 description 5
- 230000003068 static effect Effects 0.000 description 5
- 108091093088 Amplicon Proteins 0.000 description 4
- 239000004475 Arginine Substances 0.000 description 4
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 4
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 4
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 4
- 239000007987 MES buffer Substances 0.000 description 4
- 239000012722 SDS sample buffer Substances 0.000 description 4
- 101150102102 SMT3 gene Proteins 0.000 description 4
- 101150096255 SUMO1 gene Proteins 0.000 description 4
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 4
- 101100408688 Schizosaccharomyces pombe (strain 972 / ATCC 24843) pmt3 gene Proteins 0.000 description 4
- 235000004279 alanine Nutrition 0.000 description 4
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 4
- 230000001580 bacterial effect Effects 0.000 description 4
- 238000012575 bio-layer interferometry Methods 0.000 description 4
- 229940098773 bovine serum albumin Drugs 0.000 description 4
- 230000008878 coupling Effects 0.000 description 4
- 238000010168 coupling process Methods 0.000 description 4
- 238000005859 coupling reaction Methods 0.000 description 4
- 230000006240 deamidation Effects 0.000 description 4
- 238000001506 fluorescence spectroscopy Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 230000004927 fusion Effects 0.000 description 4
- 229960003180 glutathione Drugs 0.000 description 4
- 230000006698 induction Effects 0.000 description 4
- 230000003993 interaction Effects 0.000 description 4
- 238000011068 loading method Methods 0.000 description 4
- 239000000178 monomer Substances 0.000 description 4
- FSYKKLYZXJSNPZ-UHFFFAOYSA-N sarcosine Chemical compound C[NH2+]CC([O-])=O FSYKKLYZXJSNPZ-UHFFFAOYSA-N 0.000 description 4
- 150000003384 small molecules Chemical class 0.000 description 4
- 239000001509 sodium citrate Substances 0.000 description 4
- NLJMYIDDQXHKNR-UHFFFAOYSA-K sodium citrate Chemical compound O.O.[Na+].[Na+].[Na+].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O NLJMYIDDQXHKNR-UHFFFAOYSA-K 0.000 description 4
- 241000894007 species Species 0.000 description 4
- 238000010186 staining Methods 0.000 description 4
- 238000005406 washing Methods 0.000 description 4
- NPDBDJFLKKQMCM-SCSAIBSYSA-N (2s)-2-amino-3,3-dimethylbutanoic acid Chemical compound CC(C)(C)[C@H](N)C(O)=O NPDBDJFLKKQMCM-SCSAIBSYSA-N 0.000 description 3
- PECYZEOJVXMISF-UHFFFAOYSA-N 3-aminoalanine Chemical compound [NH3+]CC(N)C([O-])=O PECYZEOJVXMISF-UHFFFAOYSA-N 0.000 description 3
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 3
- TXUWMXQFNYDOEZ-UHFFFAOYSA-N 5-(1H-indol-3-ylmethyl)-3-methyl-2-sulfanylidene-4-imidazolidinone Chemical compound O=C1N(C)C(=S)NC1CC1=CNC2=CC=CC=C12 TXUWMXQFNYDOEZ-UHFFFAOYSA-N 0.000 description 3
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 3
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 3
- 108010024636 Glutathione Proteins 0.000 description 3
- 108010053070 Glutathione Disulfide Proteins 0.000 description 3
- 102100040870 Glycine amidinotransferase, mitochondrial Human genes 0.000 description 3
- 101000893303 Homo sapiens Glycine amidinotransferase, mitochondrial Proteins 0.000 description 3
- 108060003951 Immunoglobulin Proteins 0.000 description 3
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 3
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 3
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 3
- PKFBJSDMCRJYDC-GEZSXCAASA-N N-acetyl-s-geranylgeranyl-l-cysteine Chemical compound CC(C)=CCC\C(C)=C\CC\C(C)=C\CC\C(C)=C\CSC[C@@H](C(O)=O)NC(C)=O PKFBJSDMCRJYDC-GEZSXCAASA-N 0.000 description 3
- DNIAPMSPPWPWGF-UHFFFAOYSA-N Propylene glycol Chemical compound CC(O)CO DNIAPMSPPWPWGF-UHFFFAOYSA-N 0.000 description 3
- 239000000427 antigen Substances 0.000 description 3
- 102000036639 antigens Human genes 0.000 description 3
- 108091007433 antigens Proteins 0.000 description 3
- 238000013378 biophysical characterization Methods 0.000 description 3
- 230000015572 biosynthetic process Effects 0.000 description 3
- 239000007975 buffered saline Substances 0.000 description 3
- 239000011575 calcium Substances 0.000 description 3
- 238000012512 characterization method Methods 0.000 description 3
- 238000010367 cloning Methods 0.000 description 3
- 230000007423 decrease Effects 0.000 description 3
- 238000011033 desalting Methods 0.000 description 3
- 238000010790 dilution Methods 0.000 description 3
- 239000012895 dilution Substances 0.000 description 3
- 238000011067 equilibration Methods 0.000 description 3
- YPZRWBKMTBYPTK-BJDJZHNGSA-N glutathione disulfide Chemical compound OC(=O)[C@@H](N)CCC(=O)N[C@H](C(=O)NCC(O)=O)CSSC[C@@H](C(=O)NCC(O)=O)NC(=O)CC[C@H](N)C(O)=O YPZRWBKMTBYPTK-BJDJZHNGSA-N 0.000 description 3
- 238000010438 heat treatment Methods 0.000 description 3
- 229960002885 histidine Drugs 0.000 description 3
- 235000014304 histidine Nutrition 0.000 description 3
- 238000001597 immobilized metal affinity chromatography Methods 0.000 description 3
- 102000018358 immunoglobulin Human genes 0.000 description 3
- 229930027917 kanamycin Natural products 0.000 description 3
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 3
- 229960000318 kanamycin Drugs 0.000 description 3
- 229930182823 kanamycin A Natural products 0.000 description 3
- 239000007788 liquid Substances 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- 229930182817 methionine Natural products 0.000 description 3
- 235000019799 monosodium phosphate Nutrition 0.000 description 3
- 102000039446 nucleic acids Human genes 0.000 description 3
- 108020004707 nucleic acids Proteins 0.000 description 3
- 150000007523 nucleic acids Chemical class 0.000 description 3
- 229920005862 polyol Polymers 0.000 description 3
- 150000003077 polyols Chemical class 0.000 description 3
- 238000001742 protein purification Methods 0.000 description 3
- 230000008929 regeneration Effects 0.000 description 3
- 238000011069 regeneration method Methods 0.000 description 3
- 239000012723 sample buffer Substances 0.000 description 3
- 230000007017 scission Effects 0.000 description 3
- AJPJDKMHJJGVTQ-UHFFFAOYSA-M sodium dihydrogen phosphate Chemical compound [Na+].OP(O)([O-])=O AJPJDKMHJJGVTQ-UHFFFAOYSA-M 0.000 description 3
- 229910000162 sodium phosphate Inorganic materials 0.000 description 3
- 239000007858 starting material Substances 0.000 description 3
- MRTPISKDZDHEQI-YFKPBYRVSA-N (2s)-2-(tert-butylamino)propanoic acid Chemical compound OC(=O)[C@H](C)NC(C)(C)C MRTPISKDZDHEQI-YFKPBYRVSA-N 0.000 description 2
- VWTFNYVAFGYEKI-QMMMGPOBSA-N (2s)-2-azaniumyl-3-(3,4-dimethoxyphenyl)propanoate Chemical compound COC1=CC=C(C[C@H](N)C(O)=O)C=C1OC VWTFNYVAFGYEKI-QMMMGPOBSA-N 0.000 description 2
- AXDLCFOOGCNDST-VIFPVBQESA-N (2s)-3-(4-hydroxyphenyl)-2-(methylamino)propanoic acid Chemical compound CN[C@H](C(O)=O)CC1=CC=C(O)C=C1 AXDLCFOOGCNDST-VIFPVBQESA-N 0.000 description 2
- SMWADGDVGCZIGK-ZJUUUORDSA-N (2s,5r)-5-phenylpyrrolidin-1-ium-2-carboxylate Chemical compound N1[C@H](C(=O)O)CC[C@@H]1C1=CC=CC=C1 SMWADGDVGCZIGK-ZJUUUORDSA-N 0.000 description 2
- WOXWUZCRWJWTRT-UHFFFAOYSA-N 1-amino-1-cyclohexanecarboxylic acid Chemical compound OC(=O)C1(N)CCCCC1 WOXWUZCRWJWTRT-UHFFFAOYSA-N 0.000 description 2
- DICMQVOBSKLBBN-UHFFFAOYSA-N 2-(cyclodecylamino)acetic acid Chemical compound OC(=O)CNC1CCCCCCCCC1 DICMQVOBSKLBBN-UHFFFAOYSA-N 0.000 description 2
- NPLBBQAAYSJEMO-UHFFFAOYSA-N 2-(cycloheptylazaniumyl)acetate Chemical compound OC(=O)CNC1CCCCCC1 NPLBBQAAYSJEMO-UHFFFAOYSA-N 0.000 description 2
- FUOOLUPWFVMBKG-UHFFFAOYSA-N 2-Aminoisobutyric acid Chemical compound CC(C)(N)C(O)=O FUOOLUPWFVMBKG-UHFFFAOYSA-N 0.000 description 2
- YDBPFLZECVWPSH-UHFFFAOYSA-N 2-[3-(diaminomethylideneamino)propylamino]acetic acid Chemical compound NC(=N)NCCCNCC(O)=O YDBPFLZECVWPSH-UHFFFAOYSA-N 0.000 description 2
- SNDPXSYFESPGGJ-UHFFFAOYSA-N 2-aminopentanoic acid Chemical compound CCCC(N)C(O)=O SNDPXSYFESPGGJ-UHFFFAOYSA-N 0.000 description 2
- JWYOAMOZLZXDER-UHFFFAOYSA-N 2-azaniumylcyclopentane-1-carboxylate Chemical compound NC1CCCC1C(O)=O JWYOAMOZLZXDER-UHFFFAOYSA-N 0.000 description 2
- GAUBNQMYYJLWNF-UHFFFAOYSA-N 3-(Carboxymethylamino)propanoic acid Chemical compound OC(=O)CCNCC(O)=O GAUBNQMYYJLWNF-UHFFFAOYSA-N 0.000 description 2
- 101000651036 Arabidopsis thaliana Galactolipid galactosyltransferase SFR2, chloroplastic Proteins 0.000 description 2
- OYPRJOBELJOOCE-UHFFFAOYSA-N Calcium Chemical compound [Ca] OYPRJOBELJOOCE-UHFFFAOYSA-N 0.000 description 2
- KRKNYBCHXYNGOX-UHFFFAOYSA-K Citrate Chemical compound [O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O KRKNYBCHXYNGOX-UHFFFAOYSA-K 0.000 description 2
- 108020004705 Codon Proteins 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- 238000002965 ELISA Methods 0.000 description 2
- UGJMXCAKCUNAIE-UHFFFAOYSA-N Gabapentin Chemical compound OC(=O)CC1(CN)CCCCC1 UGJMXCAKCUNAIE-UHFFFAOYSA-N 0.000 description 2
- 102000005720 Glutathione transferase Human genes 0.000 description 2
- 108010070675 Glutathione transferase Proteins 0.000 description 2
- PMMYEEVYMWASQN-DMTCNVIQSA-N Hydroxyproline Chemical compound O[C@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-DMTCNVIQSA-N 0.000 description 2
- AHLPHDHHMVZTML-BYPYZUCNSA-N L-Ornithine Chemical compound NCCC[C@H](N)C(O)=O AHLPHDHHMVZTML-BYPYZUCNSA-N 0.000 description 2
- ZGUNAGUHMKGQNY-ZETCQYMHSA-N L-alpha-phenylglycine zwitterion Chemical compound OC(=O)[C@@H](N)C1=CC=CC=C1 ZGUNAGUHMKGQNY-ZETCQYMHSA-N 0.000 description 2
- LRQKBLKVPFOOQJ-YFKPBYRVSA-N L-norleucine Chemical compound CCCC[C@H]([NH3+])C([O-])=O LRQKBLKVPFOOQJ-YFKPBYRVSA-N 0.000 description 2
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 2
- MLTRLIITQPXHBJ-BQBZGAKWSA-N Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O MLTRLIITQPXHBJ-BQBZGAKWSA-N 0.000 description 2
- 102000008300 Mutant Proteins Human genes 0.000 description 2
- 108010021466 Mutant Proteins Proteins 0.000 description 2
- KSPIYJQBLVDRRI-UHFFFAOYSA-N N-methylisoleucine Chemical compound CCC(C)C(NC)C(O)=O KSPIYJQBLVDRRI-UHFFFAOYSA-N 0.000 description 2
- RHGKLRLOHDJJDR-UHFFFAOYSA-N Ndelta-carbamoyl-DL-ornithine Natural products OC(=O)C(N)CCCNC(N)=O RHGKLRLOHDJJDR-UHFFFAOYSA-N 0.000 description 2
- 108091034117 Oligonucleotide Proteins 0.000 description 2
- AHLPHDHHMVZTML-UHFFFAOYSA-N Orn-delta-NH2 Natural products NCCCC(N)C(O)=O AHLPHDHHMVZTML-UHFFFAOYSA-N 0.000 description 2
- UTJLXEIPEHZYQJ-UHFFFAOYSA-N Ornithine Natural products OC(=O)C(C)CCCN UTJLXEIPEHZYQJ-UHFFFAOYSA-N 0.000 description 2
- 229920001213 Polysorbate 20 Polymers 0.000 description 2
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Chemical compound OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 2
- 108010077895 Sarcosine Proteins 0.000 description 2
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 2
- 108010003723 Single-Domain Antibodies Proteins 0.000 description 2
- 108010088160 Staphylococcal Protein A Proteins 0.000 description 2
- 239000012505 Superdex™ Substances 0.000 description 2
- 102000040945 Transcription factor Human genes 0.000 description 2
- 108091023040 Transcription factor Proteins 0.000 description 2
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 2
- 230000009824 affinity maturation Effects 0.000 description 2
- 239000003513 alkali Substances 0.000 description 2
- 125000003277 amino group Chemical group 0.000 description 2
- 229940067621 aminobutyrate Drugs 0.000 description 2
- BFNBIHQBYMNNAN-UHFFFAOYSA-N ammonium sulfate Chemical compound N.N.OS(O)(=O)=O BFNBIHQBYMNNAN-UHFFFAOYSA-N 0.000 description 2
- 229910052921 ammonium sulfate Inorganic materials 0.000 description 2
- 235000011130 ammonium sulphate Nutrition 0.000 description 2
- RWZYAGGXGHYGMB-UHFFFAOYSA-N anthranilic acid Chemical compound NC1=CC=CC=C1C(O)=O RWZYAGGXGHYGMB-UHFFFAOYSA-N 0.000 description 2
- 150000001484 arginines Chemical class 0.000 description 2
- IADUEWIQBXOCDZ-UHFFFAOYSA-N azetidine-2-carboxylic acid Chemical compound OC(=O)C1CCN1 IADUEWIQBXOCDZ-UHFFFAOYSA-N 0.000 description 2
- 238000009835 boiling Methods 0.000 description 2
- 229910052791 calcium Inorganic materials 0.000 description 2
- 125000002915 carbonyl group Chemical group [*:2]C([*:1])=O 0.000 description 2
- 239000006143 cell culture medium Substances 0.000 description 2
- 239000007795 chemical reaction product Substances 0.000 description 2
- 235000013477 citrulline Nutrition 0.000 description 2
- 229960002173 citrulline Drugs 0.000 description 2
- XVOYSCVBGLVSOL-UHFFFAOYSA-N cysteic acid Chemical compound OC(=O)C(N)CS(O)(=O)=O XVOYSCVBGLVSOL-UHFFFAOYSA-N 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000010494 dissociation reaction Methods 0.000 description 2
- 230000005593 dissociations Effects 0.000 description 2
- PMMYEEVYMWASQN-UHFFFAOYSA-N dl-hydroxyproline Natural products OC1C[NH2+]C(C([O-])=O)C1 PMMYEEVYMWASQN-UHFFFAOYSA-N 0.000 description 2
- 239000012149 elution buffer Substances 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 210000003527 eukaryotic cell Anatomy 0.000 description 2
- 239000000835 fiber Substances 0.000 description 2
- 239000010408 film Substances 0.000 description 2
- 108020001507 fusion proteins Proteins 0.000 description 2
- 102000037865 fusion proteins Human genes 0.000 description 2
- BTCSSZJGUNDROE-UHFFFAOYSA-N gamma-aminobutyric acid Chemical compound NCCCC(O)=O BTCSSZJGUNDROE-UHFFFAOYSA-N 0.000 description 2
- 239000011521 glass Substances 0.000 description 2
- 239000003102 growth factor Substances 0.000 description 2
- 150000002411 histidines Chemical class 0.000 description 2
- 229910052739 hydrogen Inorganic materials 0.000 description 2
- 239000001257 hydrogen Substances 0.000 description 2
- 230000002209 hydrophobic effect Effects 0.000 description 2
- 229960002591 hydroxyproline Drugs 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 230000002779 inactivation Effects 0.000 description 2
- 239000003446 ligand Substances 0.000 description 2
- 230000002934 lysing effect Effects 0.000 description 2
- 238000012423 maintenance Methods 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 229960001913 mecysteine Drugs 0.000 description 2
- 239000012528 membrane Substances 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 239000013642 negative control Substances 0.000 description 2
- 229910052759 nickel Inorganic materials 0.000 description 2
- 229910052757 nitrogen Inorganic materials 0.000 description 2
- 229960003104 ornithine Drugs 0.000 description 2
- 150000008300 phosphoramidites Chemical class 0.000 description 2
- HXEACLLIILLPRG-UHFFFAOYSA-N pipecolic acid Chemical compound OC(=O)C1CCCCN1 HXEACLLIILLPRG-UHFFFAOYSA-N 0.000 description 2
- JSSXHAMIXJGYCS-UHFFFAOYSA-N piperazin-4-ium-2-carboxylate Chemical compound OC(=O)C1CNCCN1 JSSXHAMIXJGYCS-UHFFFAOYSA-N 0.000 description 2
- 229920002704 polyhistidine Polymers 0.000 description 2
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 2
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 2
- 239000011148 porous material Substances 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 230000002797 proteolythic effect Effects 0.000 description 2
- 238000011084 recovery Methods 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000000717 retained effect Effects 0.000 description 2
- 230000002441 reversible effect Effects 0.000 description 2
- 238000007480 sanger sequencing Methods 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 229960001153 serine Drugs 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 239000002904 solvent Substances 0.000 description 2
- 238000003860 storage Methods 0.000 description 2
- 230000001225 therapeutic effect Effects 0.000 description 2
- 239000013638 trimer Substances 0.000 description 2
- 229960004441 tyrosine Drugs 0.000 description 2
- 239000004474 valine Substances 0.000 description 2
- JPZXHKDZASGCLU-LBPRGKRZSA-N β-(2-naphthyl)-alanine Chemical compound C1=CC=CC2=CC(C[C@H](N)C(O)=O)=CC=C21 JPZXHKDZASGCLU-LBPRGKRZSA-N 0.000 description 2
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O 0.000 description 1
- CWLQUGTUXBXTLF-RXMQYKEDSA-N (2r)-1-methylpyrrolidine-2-carboxylic acid Chemical compound CN1CCC[C@@H]1C(O)=O CWLQUGTUXBXTLF-RXMQYKEDSA-N 0.000 description 1
- YAXAFCHJCYILRU-RXMQYKEDSA-N (2r)-2-(methylamino)-4-methylsulfanylbutanoic acid Chemical compound CN[C@@H](C(O)=O)CCSC YAXAFCHJCYILRU-RXMQYKEDSA-N 0.000 description 1
- XLBVNMSMFQMKEY-SCSAIBSYSA-N (2r)-2-(methylamino)pentanedioic acid Chemical compound CN[C@@H](C(O)=O)CCC(O)=O XLBVNMSMFQMKEY-SCSAIBSYSA-N 0.000 description 1
- GDFAOVXKHJXLEI-GSVOUGTGSA-N (2r)-2-(methylamino)propanoic acid Chemical compound CN[C@H](C)C(O)=O GDFAOVXKHJXLEI-GSVOUGTGSA-N 0.000 description 1
- SCIFESDRCALIIM-SECBINFHSA-N (2r)-2-(methylazaniumyl)-3-phenylpropanoate Chemical compound CN[C@@H](C(O)=O)CC1=CC=CC=C1 SCIFESDRCALIIM-SECBINFHSA-N 0.000 description 1
- FDKWRPBBCBCIGA-REOHCLBHSA-N (2r)-2-azaniumyl-3-$l^{1}-selanylpropanoate Chemical compound [Se]C[C@H](N)C(O)=O FDKWRPBBCBCIGA-REOHCLBHSA-N 0.000 description 1
- WNNNWFKQCKFSDK-SCSAIBSYSA-N (2r)-2-azaniumylpent-4-enoate Chemical compound [O-]C(=O)[C@H]([NH3+])CC=C WNNNWFKQCKFSDK-SCSAIBSYSA-N 0.000 description 1
- CYZKJBZEIFWZSR-ZCFIWIBFSA-N (2r)-3-(1h-imidazol-5-yl)-2-(methylamino)propanoic acid Chemical compound CN[C@@H](C(O)=O)CC1=CN=CN1 CYZKJBZEIFWZSR-ZCFIWIBFSA-N 0.000 description 1
- CZCIKBSVHDNIDH-LLVKDONJSA-N (2r)-3-(1h-indol-3-yl)-2-(methylamino)propanoic acid Chemical compound C1=CC=C2C(C[C@@H](NC)C(O)=O)=CNC2=C1 CZCIKBSVHDNIDH-LLVKDONJSA-N 0.000 description 1
- BJBUEDPLEOHJGE-SRBOSORUSA-N (2r)-3-hydroxypyrrolidine-2-carboxylic acid Chemical compound OC1CCN[C@H]1C(O)=O BJBUEDPLEOHJGE-SRBOSORUSA-N 0.000 description 1
- AKCRVYNORCOYQT-RXMQYKEDSA-N (2r)-3-methyl-2-(methylazaniumyl)butanoate Chemical compound C[NH2+][C@H](C(C)C)C([O-])=O AKCRVYNORCOYQT-RXMQYKEDSA-N 0.000 description 1
- LNSMPSPTFDIWRQ-GSVOUGTGSA-N (2r)-4-amino-2-(methylamino)-4-oxobutanoic acid Chemical compound CN[C@@H](C(O)=O)CC(N)=O LNSMPSPTFDIWRQ-GSVOUGTGSA-N 0.000 description 1
- NTWVQPHTOUKMDI-RXMQYKEDSA-N (2r)-5-(diaminomethylideneamino)-2-(methylamino)pentanoic acid Chemical compound CN[C@@H](C(O)=O)CCCNC(N)=N NTWVQPHTOUKMDI-RXMQYKEDSA-N 0.000 description 1
- OZRWQPFBXDVLAH-RXMQYKEDSA-N (2r)-5-amino-2-(methylamino)pentanoic acid Chemical compound CN[C@@H](C(O)=O)CCCN OZRWQPFBXDVLAH-RXMQYKEDSA-N 0.000 description 1
- KSPIYJQBLVDRRI-NTSWFWBYSA-N (2r,3s)-3-methyl-2-(methylazaniumyl)pentanoate Chemical compound CC[C@H](C)[C@@H](NC)C(O)=O KSPIYJQBLVDRRI-NTSWFWBYSA-N 0.000 description 1
- KRHNXNZBLHHEIU-CRCLSJGQSA-N (2r,4s)-4-hydroxypiperidin-1-ium-2-carboxylate Chemical compound O[C@H]1CCN[C@@H](C(O)=O)C1 KRHNXNZBLHHEIU-CRCLSJGQSA-N 0.000 description 1
- FQRURPFZTFUXEZ-MRVPVSSYSA-N (2s)-2,3,3,3-tetrafluoro-2-(n-fluoroanilino)propanoic acid Chemical compound OC(=O)[C@](F)(C(F)(F)F)N(F)C1=CC=CC=C1 FQRURPFZTFUXEZ-MRVPVSSYSA-N 0.000 description 1
- NMDDZEVVQDPECF-LURJTMIESA-N (2s)-2,7-diaminoheptanoic acid Chemical compound NCCCCC[C@H](N)C(O)=O NMDDZEVVQDPECF-LURJTMIESA-N 0.000 description 1
- YPJJGMCMOHDOFZ-ZETCQYMHSA-N (2s)-2-(1-benzothiophen-3-ylamino)propanoic acid Chemical compound C1=CC=C2C(N[C@@H](C)C(O)=O)=CSC2=C1 YPJJGMCMOHDOFZ-ZETCQYMHSA-N 0.000 description 1
- LDUWTIUXPVCEQF-LURJTMIESA-N (2s)-2-(cyclopentylamino)propanoic acid Chemical compound OC(=O)[C@H](C)NC1CCCC1 LDUWTIUXPVCEQF-LURJTMIESA-N 0.000 description 1
- NVXKJPGRZSDYPK-JTQLQIEISA-N (2s)-2-(methylamino)-4-phenylbutanoic acid Chemical compound CN[C@H](C(O)=O)CCC1=CC=CC=C1 NVXKJPGRZSDYPK-JTQLQIEISA-N 0.000 description 1
- HOKKHZGPKSLGJE-VKHMYHEASA-N (2s)-2-(methylamino)butanedioic acid Chemical compound CN[C@H](C(O)=O)CC(O)=O HOKKHZGPKSLGJE-VKHMYHEASA-N 0.000 description 1
- FPDYKABXINADKS-LURJTMIESA-N (2s)-2-(methylazaniumyl)hexanoate Chemical compound CCCC[C@H](NC)C(O)=O FPDYKABXINADKS-LURJTMIESA-N 0.000 description 1
- HCPKYUNZBPVCHC-YFKPBYRVSA-N (2s)-2-(methylazaniumyl)pentanoate Chemical compound CCC[C@H](NC)C(O)=O HCPKYUNZBPVCHC-YFKPBYRVSA-N 0.000 description 1
- WTDHSXGBDZBWAW-QMMMGPOBSA-N (2s)-2-[cyclohexyl(methyl)azaniumyl]propanoate Chemical compound OC(=O)[C@H](C)N(C)C1CCCCC1 WTDHSXGBDZBWAW-QMMMGPOBSA-N 0.000 description 1
- IUYZJPXOXGRNNE-ZETCQYMHSA-N (2s)-2-[cyclopentyl(methyl)amino]propanoic acid Chemical compound OC(=O)[C@H](C)N(C)C1CCCC1 IUYZJPXOXGRNNE-ZETCQYMHSA-N 0.000 description 1
- PECGVEGMRUZOML-AWEZNQCLSA-N (2s)-2-amino-3,3-diphenylpropanoic acid Chemical compound C=1C=CC=CC=1C([C@H](N)C(O)=O)C1=CC=CC=C1 PECGVEGMRUZOML-AWEZNQCLSA-N 0.000 description 1
- DQLHSFUMICQIMB-VIFPVBQESA-N (2s)-2-amino-3-(4-methylphenyl)propanoic acid Chemical compound CC1=CC=C(C[C@H](N)C(O)=O)C=C1 DQLHSFUMICQIMB-VIFPVBQESA-N 0.000 description 1
- RXZQHZDTHUUJQJ-LURJTMIESA-N (2s)-2-amino-3-(furan-2-yl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=CO1 RXZQHZDTHUUJQJ-LURJTMIESA-N 0.000 description 1
- DFZVZEMNPGABKO-ZETCQYMHSA-N (2s)-2-amino-3-pyridin-3-ylpropanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=CN=C1 DFZVZEMNPGABKO-ZETCQYMHSA-N 0.000 description 1
- JADFNIWVWZWOFA-JTQLQIEISA-N (2s)-2-amino-4-(4-methoxyphenyl)butanoic acid Chemical compound COC1=CC=C(CC[C@H](N)C(O)=O)C=C1 JADFNIWVWZWOFA-JTQLQIEISA-N 0.000 description 1
- WNNNWFKQCKFSDK-BYPYZUCNSA-N (2s)-2-aminopent-4-enoic acid Chemical compound OC(=O)[C@@H](N)CC=C WNNNWFKQCKFSDK-BYPYZUCNSA-N 0.000 description 1
- KFHRMMHGGBCRIV-BYPYZUCNSA-N (2s)-2-azaniumyl-4-methoxybutanoate Chemical compound COCC[C@H](N)C(O)=O KFHRMMHGGBCRIV-BYPYZUCNSA-N 0.000 description 1
- FMUMEWVNYMUECA-LURJTMIESA-N (2s)-2-azaniumyl-5-methylhexanoate Chemical compound CC(C)CC[C@H](N)C(O)=O FMUMEWVNYMUECA-LURJTMIESA-N 0.000 description 1
- KWWFNGCKGYUCLC-RXMQYKEDSA-N (2s)-3,3-dimethyl-2-(methylamino)butanoic acid Chemical compound CN[C@H](C(O)=O)C(C)(C)C KWWFNGCKGYUCLC-RXMQYKEDSA-N 0.000 description 1
- LNSMPSPTFDIWRQ-VKHMYHEASA-N (2s)-4-amino-2-(methylamino)-4-oxobutanoic acid Chemical compound CN[C@H](C(O)=O)CC(N)=O LNSMPSPTFDIWRQ-VKHMYHEASA-N 0.000 description 1
- XJODGRWDFZVTKW-LURJTMIESA-N (2s)-4-methyl-2-(methylamino)pentanoic acid Chemical compound CN[C@H](C(O)=O)CC(C)C XJODGRWDFZVTKW-LURJTMIESA-N 0.000 description 1
- NWGZOALPWZDXNG-LURJTMIESA-N (2s)-5-(diaminomethylideneamino)-2-(dimethylamino)pentanoic acid Chemical compound CN(C)[C@H](C(O)=O)CCCNC(N)=N NWGZOALPWZDXNG-LURJTMIESA-N 0.000 description 1
- KSZFSNZOGAXEGH-BYPYZUCNSA-N (2s)-5-amino-2-(methylamino)-5-oxopentanoic acid Chemical compound CN[C@H](C(O)=O)CCC(N)=O KSZFSNZOGAXEGH-BYPYZUCNSA-N 0.000 description 1
- OZRWQPFBXDVLAH-YFKPBYRVSA-N (2s)-5-amino-2-(methylamino)pentanoic acid Chemical compound CN[C@H](C(O)=O)CCCN OZRWQPFBXDVLAH-YFKPBYRVSA-N 0.000 description 1
- LJRDOKAZOAKLDU-UDXJMMFXSA-N (2s,3s,4r,5r,6r)-5-amino-2-(aminomethyl)-6-[(2r,3s,4r,5s)-5-[(1r,2r,3s,5r,6s)-3,5-diamino-2-[(2s,3r,4r,5s,6r)-3-amino-4,5-dihydroxy-6-(hydroxymethyl)oxan-2-yl]oxy-6-hydroxycyclohexyl]oxy-4-hydroxy-2-(hydroxymethyl)oxolan-3-yl]oxyoxane-3,4-diol;sulfuric ac Chemical compound OS(O)(=O)=O.N[C@@H]1[C@@H](O)[C@H](O)[C@H](CN)O[C@@H]1O[C@H]1[C@@H](O)[C@H](O[C@H]2[C@@H]([C@@H](N)C[C@@H](N)[C@@H]2O)O[C@@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H](CO)O2)N)O[C@@H]1CO LJRDOKAZOAKLDU-UDXJMMFXSA-N 0.000 description 1
- KRHNXNZBLHHEIU-UHNVWZDZSA-N (2s,4r)-4-hydroxypiperidin-1-ium-2-carboxylate Chemical compound O[C@@H]1CCN[C@H](C(O)=O)C1 KRHNXNZBLHHEIU-UHNVWZDZSA-N 0.000 description 1
- RWSHAZYNUOFZMI-ZJUUUORDSA-N (2s,4r)-4-phenoxypyrrolidine-2-carboxylic acid Chemical compound C1N[C@H](C(=O)O)C[C@H]1OC1=CC=CC=C1 RWSHAZYNUOFZMI-ZJUUUORDSA-N 0.000 description 1
- GTODXOQKGULQFP-UWVGGRQHSA-N (2s,4s)-4-azaniumyl-1-benzoylpyrrolidine-2-carboxylate Chemical compound C1[C@@H](N)C[C@@H](C(O)=O)N1C(=O)C1=CC=CC=C1 GTODXOQKGULQFP-UWVGGRQHSA-N 0.000 description 1
- OYNANFOWNSGDJL-IMJSIDKUSA-N (2s,4s)-4-sulfanylpyrrolidin-1-ium-2-carboxylate Chemical compound OC(=O)[C@@H]1C[C@H](S)CN1 OYNANFOWNSGDJL-IMJSIDKUSA-N 0.000 description 1
- FSNCEEGOMTYXKY-SNVBAGLBSA-N (3r)-2,3,4,9-tetrahydro-1h-pyrido[3,4-b]indole-3-carboxylic acid Chemical compound N1C2=CC=CC=C2C2=C1CN[C@@H](C(=O)O)C2 FSNCEEGOMTYXKY-SNVBAGLBSA-N 0.000 description 1
- FSNCEEGOMTYXKY-JTQLQIEISA-N (3s)-2,3,4,9-tetrahydro-1h-pyrido[3,4-b]indole-3-carboxylic acid Chemical compound N1C2=CC=CC=C2C2=C1CN[C@H](C(=O)O)C2 FSNCEEGOMTYXKY-JTQLQIEISA-N 0.000 description 1
- XJLSEXAGTJCILF-RXMQYKEDSA-N (R)-nipecotic acid zwitterion Chemical compound OC(=O)[C@@H]1CCCNC1 XJLSEXAGTJCILF-RXMQYKEDSA-N 0.000 description 1
- FUVZDXDCPRQZSQ-UHFFFAOYSA-N 1,5,6,7-tetrahydroindazol-4-one Chemical compound O=C1CCCC2=C1C=NN2 FUVZDXDCPRQZSQ-UHFFFAOYSA-N 0.000 description 1
- QITDFMFPFGJWJY-UHFFFAOYSA-N 1-(2,2-diphenylethylamino)cyclopropane-1-carboxylic acid Chemical compound C=1C=CC=CC=1C(C=1C=CC=CC=1)CNC1(C(=O)O)CC1 QITDFMFPFGJWJY-UHFFFAOYSA-N 0.000 description 1
- NDMFETHQFUOIQX-UHFFFAOYSA-N 1-(3-chloropropyl)imidazolidin-2-one Chemical compound ClCCCN1CCNC1=O NDMFETHQFUOIQX-UHFFFAOYSA-N 0.000 description 1
- FVTVMQPGKVHSEY-UHFFFAOYSA-N 1-AMINOCYCLOBUTANE CARBOXYLIC ACID Chemical compound OC(=O)C1(N)CCC1 FVTVMQPGKVHSEY-UHFFFAOYSA-N 0.000 description 1
- NILQLFBWTXNUOE-UHFFFAOYSA-N 1-aminocyclopentanecarboxylic acid Chemical compound OC(=O)C1(N)CCCC1 NILQLFBWTXNUOE-UHFFFAOYSA-N 0.000 description 1
- PAJPWUMXBYXFCZ-UHFFFAOYSA-N 1-aminocyclopropanecarboxylic acid Chemical compound OC(=O)C1(N)CC1 PAJPWUMXBYXFCZ-UHFFFAOYSA-N 0.000 description 1
- HTTPGMNPPMMMOP-UHFFFAOYSA-N 1-azaniumyl-2,3-dihydroindene-1-carboxylate Chemical compound C1=CC=C2C(N)(C(O)=O)CCC2=C1 HTTPGMNPPMMMOP-UHFFFAOYSA-N 0.000 description 1
- KXZQYLBVMZGIKC-UHFFFAOYSA-N 1-pyridin-2-yl-n-(pyridin-2-ylmethyl)methanamine Chemical compound C=1C=CC=NC=1CNCC1=CC=CC=N1 KXZQYLBVMZGIKC-UHFFFAOYSA-N 0.000 description 1
- BLCJBICVQSYOIF-UHFFFAOYSA-N 2,2-diaminobutanoic acid Chemical compound CCC(N)(N)C(O)=O BLCJBICVQSYOIF-UHFFFAOYSA-N 0.000 description 1
- JTTIOYHBNXDJOD-UHFFFAOYSA-N 2,4,6-triaminopyrimidine Chemical compound NC1=CC(N)=NC(N)=N1 JTTIOYHBNXDJOD-UHFFFAOYSA-N 0.000 description 1
- OGNSCSPNOLGXSM-UHFFFAOYSA-N 2,4-diaminobutyric acid Chemical compound NCCC(N)C(O)=O OGNSCSPNOLGXSM-UHFFFAOYSA-N 0.000 description 1
- JTBWDIQPPZCDHV-UHFFFAOYSA-N 2-(1-azaniumylcyclohexyl)acetate Chemical compound [O-]C(=O)CC1([NH3+])CCCCC1 JTBWDIQPPZCDHV-UHFFFAOYSA-N 0.000 description 1
- WAAJQPAIOASFSC-UHFFFAOYSA-N 2-(1-hydroxyethylamino)acetic acid Chemical compound CC(O)NCC(O)=O WAAJQPAIOASFSC-UHFFFAOYSA-N 0.000 description 1
- ITYQPPZZOYSACT-UHFFFAOYSA-N 2-(2,2-dimethylpropylamino)acetic acid Chemical compound CC(C)(C)CNCC(O)=O ITYQPPZZOYSACT-UHFFFAOYSA-N 0.000 description 1
- UEQSFWNXRZJTKB-UHFFFAOYSA-N 2-(2,2-diphenylethylamino)acetic acid Chemical compound C=1C=CC=CC=1C(CNCC(=O)O)C1=CC=CC=C1 UEQSFWNXRZJTKB-UHFFFAOYSA-N 0.000 description 1
- PIINGYXNCHTJTF-UHFFFAOYSA-N 2-(2-azaniumylethylamino)acetate Chemical compound NCCNCC(O)=O PIINGYXNCHTJTF-UHFFFAOYSA-N 0.000 description 1
- XCDGCRLSSSSBIA-UHFFFAOYSA-N 2-(2-methylsulfanylethylamino)acetic acid Chemical compound CSCCNCC(O)=O XCDGCRLSSSSBIA-UHFFFAOYSA-N 0.000 description 1
- IAICFWDJMWEXAO-UHFFFAOYSA-N 2-(2-sulfanylethylamino)acetic acid Chemical compound OC(=O)CNCCS IAICFWDJMWEXAO-UHFFFAOYSA-N 0.000 description 1
- STMXJQHRRCPJCJ-UHFFFAOYSA-N 2-(3,3-diphenylpropylamino)acetic acid Chemical compound C=1C=CC=CC=1C(CCNCC(=O)O)C1=CC=CC=C1 STMXJQHRRCPJCJ-UHFFFAOYSA-N 0.000 description 1
- DZNWPRKEUQEDSZ-UHFFFAOYSA-N 2-(3-amino-2-oxopyridin-1-yl)acetic acid Chemical compound NC1=CC=CN(CC(O)=O)C1=O DZNWPRKEUQEDSZ-UHFFFAOYSA-N 0.000 description 1
- DHGYLUFLENKZHH-UHFFFAOYSA-N 2-(3-aminopropylamino)acetic acid Chemical compound NCCCNCC(O)=O DHGYLUFLENKZHH-UHFFFAOYSA-N 0.000 description 1
- OGAULEBSQQMUKP-UHFFFAOYSA-N 2-(4-aminobutylamino)acetic acid Chemical compound NCCCCNCC(O)=O OGAULEBSQQMUKP-UHFFFAOYSA-N 0.000 description 1
- KGSVNOLLROCJQM-UHFFFAOYSA-N 2-(benzylamino)acetic acid Chemical compound OC(=O)CNCC1=CC=CC=C1 KGSVNOLLROCJQM-UHFFFAOYSA-N 0.000 description 1
- IVCQRTJVLJXKKJ-UHFFFAOYSA-N 2-(butan-2-ylazaniumyl)acetate Chemical compound CCC(C)NCC(O)=O IVCQRTJVLJXKKJ-UHFFFAOYSA-N 0.000 description 1
- KQLGGQARRCMYGD-UHFFFAOYSA-N 2-(cyclobutylamino)acetic acid Chemical compound OC(=O)CNC1CCC1 KQLGGQARRCMYGD-UHFFFAOYSA-N 0.000 description 1
- OQMYZVWIXPPDDE-UHFFFAOYSA-N 2-(cyclohexylazaniumyl)acetate Chemical compound OC(=O)CNC1CCCCC1 OQMYZVWIXPPDDE-UHFFFAOYSA-N 0.000 description 1
- PNKNDNFLQNMQJL-UHFFFAOYSA-N 2-(cyclooctylazaniumyl)acetate Chemical compound OC(=O)CNC1CCCCCCC1 PNKNDNFLQNMQJL-UHFFFAOYSA-N 0.000 description 1
- DXQCCQKRNWMECV-UHFFFAOYSA-N 2-(cyclopropylazaniumyl)acetate Chemical compound OC(=O)CNC1CC1 DXQCCQKRNWMECV-UHFFFAOYSA-N 0.000 description 1
- PRVOMNLNSHAUEI-UHFFFAOYSA-N 2-(cycloundecylamino)acetic acid Chemical compound OC(=O)CNC1CCCCCCCCCC1 PRVOMNLNSHAUEI-UHFFFAOYSA-N 0.000 description 1
- HEPOIJKOXBKKNJ-UHFFFAOYSA-N 2-(propan-2-ylazaniumyl)acetate Chemical compound CC(C)NCC(O)=O HEPOIJKOXBKKNJ-UHFFFAOYSA-N 0.000 description 1
- TXHAHOVNFDVCCC-UHFFFAOYSA-N 2-(tert-butylazaniumyl)acetate Chemical compound CC(C)(C)NCC(O)=O TXHAHOVNFDVCCC-UHFFFAOYSA-N 0.000 description 1
- AWEZYTUWDZADKR-UHFFFAOYSA-N 2-[(2-amino-2-oxoethyl)azaniumyl]acetate Chemical compound NC(=O)CNCC(O)=O AWEZYTUWDZADKR-UHFFFAOYSA-N 0.000 description 1
- MNDBDVPDSHGIHR-UHFFFAOYSA-N 2-[(3-amino-3-oxopropyl)amino]acetic acid Chemical compound NC(=O)CCNCC(O)=O MNDBDVPDSHGIHR-UHFFFAOYSA-N 0.000 description 1
- IGVHXMVKIMYIRY-UHFFFAOYSA-N 2-[bis(3-aminopropyl)amino]acetic acid Chemical compound NCCCN(CC(O)=O)CCCN IGVHXMVKIMYIRY-UHFFFAOYSA-N 0.000 description 1
- UHQFXIWMAQOCAN-UHFFFAOYSA-N 2-amino-1,3-dihydroindene-2-carboxylic acid Chemical compound C1=CC=C2CC(N)(C(O)=O)CC2=C1 UHQFXIWMAQOCAN-UHFFFAOYSA-N 0.000 description 1
- AKVBCGQVQXPRLD-UHFFFAOYSA-N 2-aminooctanoic acid Chemical compound CCCCCCC(N)C(O)=O AKVBCGQVQXPRLD-UHFFFAOYSA-N 0.000 description 1
- CDULPPOISZOUTK-UHFFFAOYSA-N 2-azaniumyl-3,4-dihydro-1h-naphthalene-2-carboxylate Chemical compound C1=CC=C2CC(N)(C(O)=O)CCC2=C1 CDULPPOISZOUTK-UHFFFAOYSA-N 0.000 description 1
- KFHRMMHGGBCRIV-UHFFFAOYSA-N 2-azaniumyl-4-methoxybutanoate Chemical compound COCCC(N)C(O)=O KFHRMMHGGBCRIV-UHFFFAOYSA-N 0.000 description 1
- JTTHKOPSMAVJFE-UHFFFAOYSA-N 2-azaniumyl-4-phenylbutanoate Chemical compound OC(=O)C(N)CCC1=CC=CC=C1 JTTHKOPSMAVJFE-UHFFFAOYSA-N 0.000 description 1
- WRFPVMFCRNYQNR-UHFFFAOYSA-N 2-hydroxyphenylalanine Chemical compound OC(=O)C(N)CC1=CC=CC=C1O WRFPVMFCRNYQNR-UHFFFAOYSA-N 0.000 description 1
- PRNLNZMJMCUWNV-UHFFFAOYSA-N 2-piperidin-1-ium-2-ylacetate Chemical compound OC(=O)CC1CCCCN1 PRNLNZMJMCUWNV-UHFFFAOYSA-N 0.000 description 1
- UENRXLSRMCSUSN-UHFFFAOYSA-N 3,5-diaminobenzoic acid Chemical compound NC1=CC(N)=CC(C(O)=O)=C1 UENRXLSRMCSUSN-UHFFFAOYSA-N 0.000 description 1
- ZAQXSMCYFQJRCQ-UHFFFAOYSA-N 3-(1-adamantyl)-2-azaniumylpropanoate Chemical compound C1C(C2)CC3CC2CC1(CC(N)C(O)=O)C3 ZAQXSMCYFQJRCQ-UHFFFAOYSA-N 0.000 description 1
- AJHPGXZOIAYYDW-UHFFFAOYSA-N 3-(2-cyanophenyl)-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound CC(C)(C)OC(=O)NC(C(O)=O)CC1=CC=CC=C1C#N AJHPGXZOIAYYDW-UHFFFAOYSA-N 0.000 description 1
- XFXOLBNQYFRSLQ-UHFFFAOYSA-N 3-amino-2-naphthoic acid Chemical compound C1=CC=C2C=C(C(O)=O)C(N)=CC2=C1 XFXOLBNQYFRSLQ-UHFFFAOYSA-N 0.000 description 1
- WBXOONOXOHMGQW-UHFFFAOYSA-N 3-aminobicyclo[2.2.1]heptane-4-carboxylic acid Chemical compound C1CC2(C(O)=O)C(N)CC1C2 WBXOONOXOHMGQW-UHFFFAOYSA-N 0.000 description 1
- CKTUXQBZPWBFDX-UHFFFAOYSA-N 3-azaniumylcyclohexane-1-carboxylate Chemical compound NC1CCCC(C(O)=O)C1 CKTUXQBZPWBFDX-UHFFFAOYSA-N 0.000 description 1
- ALYNCZNDIQEVRV-PZFLKRBQSA-N 4-amino-3,5-ditritiobenzoic acid Chemical compound [3H]c1cc(cc([3H])c1N)C(O)=O ALYNCZNDIQEVRV-PZFLKRBQSA-N 0.000 description 1
- SHINASQYHDCLEU-UHFFFAOYSA-N 4-aminopyrrolidine-2-carboxylic acid Chemical compound NC1CNC(C(O)=O)C1 SHINASQYHDCLEU-UHFFFAOYSA-N 0.000 description 1
- WIPRZHGCPZSPLJ-UHFFFAOYSA-N 4-aminotetrahydro-2H-thiopyran-4-carboxylic acid Chemical compound OC(=O)C1(N)CCSCC1 WIPRZHGCPZSPLJ-UHFFFAOYSA-N 0.000 description 1
- DRNGLYHKYPNTEA-UHFFFAOYSA-N 4-azaniumylcyclohexane-1-carboxylate Chemical compound NC1CCC(C(O)=O)CC1 DRNGLYHKYPNTEA-UHFFFAOYSA-N 0.000 description 1
- PMQQFSDIECYOQV-UHFFFAOYSA-N 5,5-dimethyl-1,3-thiazolidin-3-ium-4-carboxylate Chemical compound CC1(C)SCNC1C(O)=O PMQQFSDIECYOQV-UHFFFAOYSA-N 0.000 description 1
- 229940117976 5-hydroxylysine Drugs 0.000 description 1
- ODHCTXKNWHHXJC-GSVOUGTGSA-N 5-oxo-D-proline Chemical compound OC(=O)[C@H]1CCC(=O)N1 ODHCTXKNWHHXJC-GSVOUGTGSA-N 0.000 description 1
- ODHCTXKNWHHXJC-VKHMYHEASA-N 5-oxo-L-proline Chemical compound OC(=O)[C@@H]1CCC(=O)N1 ODHCTXKNWHHXJC-VKHMYHEASA-N 0.000 description 1
- RYFOQDQDVYIEHN-UHFFFAOYSA-N 6-amino-2-(dimethylamino)hexanoic acid Chemical compound CN(C)C(C(O)=O)CCCCN RYFOQDQDVYIEHN-UHFFFAOYSA-N 0.000 description 1
- 210000002925 A-like Anatomy 0.000 description 1
- 229920001817 Agar Polymers 0.000 description 1
- 102000008102 Ankyrins Human genes 0.000 description 1
- 108010049777 Ankyrins Proteins 0.000 description 1
- 241000978166 Astrea Species 0.000 description 1
- IADUEWIQBXOCDZ-VKHMYHEASA-N Azetidine-2-carboxylic acid Natural products OC(=O)[C@@H]1CCN1 IADUEWIQBXOCDZ-VKHMYHEASA-N 0.000 description 1
- 239000005711 Benzoic acid Substances 0.000 description 1
- 101710201279 Biotin carboxyl carrier protein Proteins 0.000 description 1
- 125000001433 C-terminal amino-acid group Chemical group 0.000 description 1
- 101000755496 Canis lupus familiaris Transforming protein RhoA Proteins 0.000 description 1
- 101710132601 Capsid protein Proteins 0.000 description 1
- KXDHJXZQYSOELW-UHFFFAOYSA-N Carbamic acid Chemical group NC(O)=O KXDHJXZQYSOELW-UHFFFAOYSA-N 0.000 description 1
- 241001112695 Clostridiales Species 0.000 description 1
- 241000193403 Clostridium Species 0.000 description 1
- 241001509496 Clostridium celatum Species 0.000 description 1
- 241000910087 Clostridium nigeriense Species 0.000 description 1
- 108091035707 Consensus sequence Proteins 0.000 description 1
- 241001443882 Coprobacillus Species 0.000 description 1
- 241000699802 Cricetulus griseus Species 0.000 description 1
- 102000015833 Cystatin Human genes 0.000 description 1
- AHLPHDHHMVZTML-SCSAIBSYSA-N D-Ornithine Chemical compound NCCC[C@@H](N)C(O)=O AHLPHDHHMVZTML-SCSAIBSYSA-N 0.000 description 1
- FDKWRPBBCBCIGA-UWTATZPHSA-N D-Selenocysteine Natural products [Se]C[C@@H](N)C(O)=O FDKWRPBBCBCIGA-UWTATZPHSA-N 0.000 description 1
- 150000008574 D-amino acids Chemical class 0.000 description 1
- DZLNHFMRPBPULJ-GSVOUGTGSA-N D-thioproline Chemical compound OC(=O)[C@H]1CSCN1 DZLNHFMRPBPULJ-GSVOUGTGSA-N 0.000 description 1
- 238000012286 ELISA Assay Methods 0.000 description 1
- NIGWMJHCCYYCSF-UHFFFAOYSA-N Fenclonine Chemical compound OC(=O)C(N)CC1=CC=C(Cl)C=C1 NIGWMJHCCYYCSF-UHFFFAOYSA-N 0.000 description 1
- 102000016359 Fibronectins Human genes 0.000 description 1
- 108010067306 Fibronectins Proteins 0.000 description 1
- 102100032518 Gamma-crystallin B Human genes 0.000 description 1
- 101710092798 Gamma-crystallin B Proteins 0.000 description 1
- YIWFXZNIBQBFHR-LURJTMIESA-N Gly-His Chemical compound [NH3+]CC(=O)N[C@H](C([O-])=O)CC1=CN=CN1 YIWFXZNIBQBFHR-LURJTMIESA-N 0.000 description 1
- HVLSXIKZNLPZJJ-TXZCQADKSA-N HA peptide Chemical compound C([C@@H](C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@H]1N(CCC1)C(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HVLSXIKZNLPZJJ-TXZCQADKSA-N 0.000 description 1
- 101001128634 Homo sapiens NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 2, mitochondrial Proteins 0.000 description 1
- 101000724418 Homo sapiens Neutral amino acid transporter B(0) Proteins 0.000 description 1
- 108091006905 Human Serum Albumin Proteins 0.000 description 1
- 102000008100 Human Serum Albumin Human genes 0.000 description 1
- 102000009066 Hyaluronoglucosaminidase Human genes 0.000 description 1
- 241000235789 Hyperoartia Species 0.000 description 1
- SNDPXSYFESPGGJ-BYPYZUCNSA-N L-2-aminopentanoic acid Chemical compound CCC[C@H](N)C(O)=O SNDPXSYFESPGGJ-BYPYZUCNSA-N 0.000 description 1
- QUOGESRFPZDMMT-UHFFFAOYSA-N L-Homoarginine Natural products OC(=O)C(N)CCCCNC(N)=N QUOGESRFPZDMMT-UHFFFAOYSA-N 0.000 description 1
- LOOZZTFGSTZNRX-VIFPVBQESA-N L-Homotyrosine Chemical compound OC(=O)[C@@H](N)CCC1=CC=C(O)C=C1 LOOZZTFGSTZNRX-VIFPVBQESA-N 0.000 description 1
- 235000019766 L-Lysine Nutrition 0.000 description 1
- GDFAOVXKHJXLEI-UHFFFAOYSA-N L-N-Boc-N-methylalanine Natural products CNC(C)C(O)=O GDFAOVXKHJXLEI-UHFFFAOYSA-N 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- QWCKQJZIFLGMSD-VKHMYHEASA-N L-alpha-aminobutyric acid Chemical compound CC[C@H](N)C(O)=O QWCKQJZIFLGMSD-VKHMYHEASA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-N L-arginine Chemical compound OC(=O)[C@@H](N)CCCN=C(N)N ODKSFYDXXFIFQN-BYPYZUCNSA-N 0.000 description 1
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 1
- RHGKLRLOHDJJDR-BYPYZUCNSA-N L-citrulline Chemical compound NC(=O)NCCC[C@H]([NH3+])C([O-])=O RHGKLRLOHDJJDR-BYPYZUCNSA-N 0.000 description 1
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 1
- QUOGESRFPZDMMT-YFKPBYRVSA-N L-homoarginine Chemical compound OC(=O)[C@@H](N)CCCCNC(N)=N QUOGESRFPZDMMT-YFKPBYRVSA-N 0.000 description 1
- FFFHZYDWPBMWHY-VKHMYHEASA-N L-homocysteine Chemical compound OC(=O)[C@@H](N)CCS FFFHZYDWPBMWHY-VKHMYHEASA-N 0.000 description 1
- FMUMEWVNYMUECA-UHFFFAOYSA-N L-homoleucine Natural products CC(C)CCC(N)C(O)=O FMUMEWVNYMUECA-UHFFFAOYSA-N 0.000 description 1
- JTTHKOPSMAVJFE-VIFPVBQESA-N L-homophenylalanine Chemical compound OC(=O)[C@@H](N)CCC1=CC=CC=C1 JTTHKOPSMAVJFE-VIFPVBQESA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- JZKXXXDKRQWDET-QMMMGPOBSA-N L-m-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC(O)=C1 JZKXXXDKRQWDET-QMMMGPOBSA-N 0.000 description 1
- QEFRNWWLZKMPFJ-ZXPFJRLXSA-N L-methionine (R)-S-oxide Chemical compound C[S@@](=O)CC[C@H]([NH3+])C([O-])=O QEFRNWWLZKMPFJ-ZXPFJRLXSA-N 0.000 description 1
- QEFRNWWLZKMPFJ-UHFFFAOYSA-N L-methionine sulphoxide Natural products CS(=O)CCC(N)C(O)=O QEFRNWWLZKMPFJ-UHFFFAOYSA-N 0.000 description 1
- 125000000393 L-methionino group Chemical group [H]OC(=O)[C@@]([H])(N([H])[*])C([H])([H])C(SC([H])([H])[H])([H])[H] 0.000 description 1
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 102000019298 Lipocalin Human genes 0.000 description 1
- 108050006654 Lipocalin Proteins 0.000 description 1
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- CERQOIWHTDAKMF-UHFFFAOYSA-M Methacrylate Chemical compound CC(=C)C([O-])=O CERQOIWHTDAKMF-UHFFFAOYSA-M 0.000 description 1
- CYZKJBZEIFWZSR-LURJTMIESA-N N(alpha)-methyl-L-histidine Chemical compound CN[C@H](C(O)=O)CC1=CNC=N1 CYZKJBZEIFWZSR-LURJTMIESA-N 0.000 description 1
- CZCIKBSVHDNIDH-NSHDSACASA-N N(alpha)-methyl-L-tryptophan Chemical compound C1=CC=C2C(C[C@H]([NH2+]C)C([O-])=O)=CNC2=C1 CZCIKBSVHDNIDH-NSHDSACASA-N 0.000 description 1
- YDGMGEXADBMOMJ-LURJTMIESA-N N(g)-dimethylarginine Chemical compound CN(C)C(\N)=N\CCC[C@H](N)C(O)=O YDGMGEXADBMOMJ-LURJTMIESA-N 0.000 description 1
- WRUZLCLJULHLEY-UHFFFAOYSA-N N-(p-hydroxyphenyl)glycine Chemical compound OC(=O)CNC1=CC=C(O)C=C1 WRUZLCLJULHLEY-UHFFFAOYSA-N 0.000 description 1
- VKZGJEWGVNFKPE-UHFFFAOYSA-N N-Isobutylglycine Chemical compound CC(C)CNCC(O)=O VKZGJEWGVNFKPE-UHFFFAOYSA-N 0.000 description 1
- SCIFESDRCALIIM-UHFFFAOYSA-N N-Me-Phenylalanine Natural products CNC(C(O)=O)CC1=CC=CC=C1 SCIFESDRCALIIM-UHFFFAOYSA-N 0.000 description 1
- HOKKHZGPKSLGJE-GSVOUGTGSA-N N-Methyl-D-aspartic acid Chemical compound CN[C@@H](C(O)=O)CC(O)=O HOKKHZGPKSLGJE-GSVOUGTGSA-N 0.000 description 1
- NTWVQPHTOUKMDI-YFKPBYRVSA-N N-Methyl-arginine Chemical compound CN[C@H](C(O)=O)CCCN=C(N)N NTWVQPHTOUKMDI-YFKPBYRVSA-N 0.000 description 1
- GDFAOVXKHJXLEI-VKHMYHEASA-N N-methyl-L-alanine Chemical compound C[NH2+][C@@H](C)C([O-])=O GDFAOVXKHJXLEI-VKHMYHEASA-N 0.000 description 1
- XLBVNMSMFQMKEY-BYPYZUCNSA-N N-methyl-L-glutamic acid Chemical compound CN[C@H](C(O)=O)CCC(O)=O XLBVNMSMFQMKEY-BYPYZUCNSA-N 0.000 description 1
- KSPIYJQBLVDRRI-WDSKDSINSA-N N-methyl-L-isoleucine Chemical compound CC[C@H](C)[C@H](NC)C(O)=O KSPIYJQBLVDRRI-WDSKDSINSA-N 0.000 description 1
- YAXAFCHJCYILRU-YFKPBYRVSA-N N-methyl-L-methionine Chemical compound C[NH2+][C@H](C([O-])=O)CCSC YAXAFCHJCYILRU-YFKPBYRVSA-N 0.000 description 1
- SCIFESDRCALIIM-VIFPVBQESA-N N-methyl-L-phenylalanine Chemical compound C[NH2+][C@H](C([O-])=O)CC1=CC=CC=C1 SCIFESDRCALIIM-VIFPVBQESA-N 0.000 description 1
- AKCRVYNORCOYQT-YFKPBYRVSA-N N-methyl-L-valine Chemical compound CN[C@@H](C(C)C)C(O)=O AKCRVYNORCOYQT-YFKPBYRVSA-N 0.000 description 1
- WVMBPWMAQDVZCM-UHFFFAOYSA-N N-methylanthranilic acid Chemical compound CNC1=CC=CC=C1C(O)=O WVMBPWMAQDVZCM-UHFFFAOYSA-N 0.000 description 1
- CWLQUGTUXBXTLF-YFKPBYRVSA-N N-methylproline Chemical compound CN1CCC[C@H]1C(O)=O CWLQUGTUXBXTLF-YFKPBYRVSA-N 0.000 description 1
- 102100032194 NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 2, mitochondrial Human genes 0.000 description 1
- 102100028267 Neutral amino acid transporter B(0) Human genes 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- GEYBMYRBIABFTA-VIFPVBQESA-N O-methyl-L-tyrosine Chemical compound COC1=CC=C(C[C@H](N)C(O)=O)C=C1 GEYBMYRBIABFTA-VIFPVBQESA-N 0.000 description 1
- KNTFCRCCPLEUQZ-VKHMYHEASA-N O-methylserine Chemical compound COC[C@H](N)C(O)=O KNTFCRCCPLEUQZ-VKHMYHEASA-N 0.000 description 1
- UUKJGZSUMOVVOS-VXKWHMMOSA-N OC(=O)[C@@H]1C[C@@H](CN1)SC(c1ccccc1)(c1ccccc1)c1ccccc1 Chemical compound OC(=O)[C@@H]1C[C@@H](CN1)SC(c1ccccc1)(c1ccccc1)c1ccccc1 UUKJGZSUMOVVOS-VXKWHMMOSA-N 0.000 description 1
- MMZUHXUJQSMMAV-YOEHRIQHSA-N OC(=O)[C@@H]1C[C@H](Cc2ccc(cc2)-c2ccccc2)CN1 Chemical compound OC(=O)[C@@H]1C[C@H](Cc2ccc(cc2)-c2ccccc2)CN1 MMZUHXUJQSMMAV-YOEHRIQHSA-N 0.000 description 1
- 108010020346 Polyglutamic Acid Proteins 0.000 description 1
- 229940124158 Protease/peptidase inhibitor Drugs 0.000 description 1
- 238000011529 RT qPCR Methods 0.000 description 1
- 102000000395 SH3 domains Human genes 0.000 description 1
- 108050008861 SH3 domains Proteins 0.000 description 1
- 108700038981 SUMO-1 Proteins 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 108010076818 TEV protease Proteins 0.000 description 1
- 108020005038 Terminator Codon Proteins 0.000 description 1
- 102000002933 Thioredoxin Human genes 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- 102000044159 Ubiquitin Human genes 0.000 description 1
- 108090000848 Ubiquitin Proteins 0.000 description 1
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 1
- 238000002835 absorbance Methods 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- ODHCTXKNWHHXJC-UHFFFAOYSA-N pyroglutamic acid Natural products OC(=O)C1CCC(=O)N1 ODHCTXKNWHHXJC-UHFFFAOYSA-N 0.000 description 1
- 150000003926 acrylamides Chemical class 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 239000000654 additive Substances 0.000 description 1
- 230000000996 additive effect Effects 0.000 description 1
- 238000001261 affinity purification Methods 0.000 description 1
- 239000008272 agar Substances 0.000 description 1
- 230000002776 aggregation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 150000001294 alanine derivatives Chemical class 0.000 description 1
- 239000012670 alkaline solution Substances 0.000 description 1
- WNNNWFKQCKFSDK-UHFFFAOYSA-N allylglycine Chemical compound OC(=O)C(N)CC=C WNNNWFKQCKFSDK-UHFFFAOYSA-N 0.000 description 1
- DLAMVQGYEVKIRE-UHFFFAOYSA-N alpha-(methylamino)isobutyric acid Chemical compound CNC(C)(C)C(O)=O DLAMVQGYEVKIRE-UHFFFAOYSA-N 0.000 description 1
- 125000003368 amide group Chemical group 0.000 description 1
- UGJQDKYTAYNNBH-UHFFFAOYSA-N amino cyclopropanecarboxylate Chemical compound NOC(=O)C1CC1 UGJQDKYTAYNNBH-UHFFFAOYSA-N 0.000 description 1
- 239000012062 aqueous buffer Substances 0.000 description 1
- 125000000637 arginyl group Chemical group N[C@@H](CCCNC(N)=N)C(=O)* 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- YDGMGEXADBMOMJ-UHFFFAOYSA-N asymmetrical dimethylarginine Natural products CN(C)C(N)=NCCCC(N)C(O)=O YDGMGEXADBMOMJ-UHFFFAOYSA-N 0.000 description 1
- GFZWHAAOIVMHOI-UHFFFAOYSA-N azetidine-3-carboxylic acid Chemical compound OC(=O)C1CNC1 GFZWHAAOIVMHOI-UHFFFAOYSA-N 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 238000007413 biotinylation Methods 0.000 description 1
- 230000006287 biotinylation Effects 0.000 description 1
- JCZLABDVDPYLRZ-AWEZNQCLSA-N biphenylalanine Chemical compound C1=CC(C[C@H](N)C(O)=O)=CC=C1C1=CC=CC=C1 JCZLABDVDPYLRZ-AWEZNQCLSA-N 0.000 description 1
- PWLNAUNEAKQYLH-UHFFFAOYSA-N butyric acid octyl ester Natural products CCCCCCCCOC(=O)CCC PWLNAUNEAKQYLH-UHFFFAOYSA-N 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 150000001720 carbohydrates Chemical class 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 239000012501 chromatography medium Substances 0.000 description 1
- 239000007979 citrate buffer Substances 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 230000001268 conjugating effect Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 239000013068 control sample Substances 0.000 description 1
- 239000005289 controlled pore glass Substances 0.000 description 1
- ATDGTVJJHBUTRL-UHFFFAOYSA-N cyanogen bromide Chemical compound BrC#N ATDGTVJJHBUTRL-UHFFFAOYSA-N 0.000 description 1
- ORQXBVXKBGUSBA-UHFFFAOYSA-N cyclohexyl D-alanine Natural products OC(=O)C(N)CC1CCCCC1 ORQXBVXKBGUSBA-UHFFFAOYSA-N 0.000 description 1
- 108050004038 cystatin Proteins 0.000 description 1
- 230000002939 deleterious effect Effects 0.000 description 1
- YSMODUONRAFBET-UHFFFAOYSA-N delta-DL-hydroxylysine Natural products NCC(O)CCC(N)C(O)=O YSMODUONRAFBET-UHFFFAOYSA-N 0.000 description 1
- 239000003398 denaturant Substances 0.000 description 1
- 238000010217 densitometric analysis Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000007865 diluting Methods 0.000 description 1
- 239000003085 diluting agent Substances 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- 238000006471 dimerization reaction Methods 0.000 description 1
- 125000000118 dimethyl group Chemical group [H]C([H])([H])* 0.000 description 1
- 230000006334 disulfide bridging Effects 0.000 description 1
- 125000002228 disulfide group Chemical group 0.000 description 1
- 230000009189 diving Effects 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- YSMODUONRAFBET-UHNVWZDZSA-N erythro-5-hydroxy-L-lysine Chemical compound NC[C@H](O)CC[C@H](N)C(O)=O YSMODUONRAFBET-UHNVWZDZSA-N 0.000 description 1
- 238000012869 ethanol precipitation Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000007717 exclusion Effects 0.000 description 1
- 229920000370 gamma-poly(glutamate) polymer Polymers 0.000 description 1
- 238000002523 gelfiltration Methods 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 229960002743 glutamine Drugs 0.000 description 1
- 108010020688 glycylhistidine Proteins 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- NBZBKCUXIYYUSX-UHFFFAOYSA-N iminodiacetic acid Chemical compound OC(=O)CNCC(O)=O NBZBKCUXIYYUSX-UHFFFAOYSA-N 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 230000000415 inactivating effect Effects 0.000 description 1
- QNRXNRGSOJZINA-UHFFFAOYSA-N indoline-2-carboxylic acid Chemical compound C1=CC=C2NC(C(=O)O)CC2=C1 QNRXNRGSOJZINA-UHFFFAOYSA-N 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 238000007852 inverse PCR Methods 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 1
- 238000012933 kinetic analysis Methods 0.000 description 1
- HXEACLLIILLPRG-RXMQYKEDSA-N l-pipecolic acid Natural products OC(=O)[C@H]1CCCCN1 HXEACLLIILLPRG-RXMQYKEDSA-N 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 238000011031 large-scale manufacturing process Methods 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 210000004698 lymphocyte Anatomy 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- JZKXXXDKRQWDET-UHFFFAOYSA-N meta-tyrosine Natural products OC(=O)C(N)CC1=CC=CC(O)=C1 JZKXXXDKRQWDET-UHFFFAOYSA-N 0.000 description 1
- 229910052751 metal Inorganic materials 0.000 description 1
- 239000002184 metal Substances 0.000 description 1
- 150000002739 metals Chemical class 0.000 description 1
- 229960004452 methionine Drugs 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 235000013336 milk Nutrition 0.000 description 1
- 239000008267 milk Substances 0.000 description 1
- 210000004080 milk Anatomy 0.000 description 1
- 238000000329 molecular dynamics simulation Methods 0.000 description 1
- 238000002887 multiple sequence alignment Methods 0.000 description 1
- UUIQMZJEGPQKFD-UHFFFAOYSA-N n-butyric acid methyl ester Natural products CCCC(=O)OC UUIQMZJEGPQKFD-UHFFFAOYSA-N 0.000 description 1
- XJODGRWDFZVTKW-ZCFIWIBFSA-N n-methylleucine Chemical compound CN[C@@H](C(O)=O)CC(C)C XJODGRWDFZVTKW-ZCFIWIBFSA-N 0.000 description 1
- 101150079292 nagH gene Proteins 0.000 description 1
- 108010087904 neutravidin Proteins 0.000 description 1
- 238000007481 next generation sequencing Methods 0.000 description 1
- 238000010899 nucleation Methods 0.000 description 1
- CQYBNXGHMBNGCG-RNJXMRFFSA-N octahydroindole-2-carboxylic acid Chemical compound C1CCC[C@H]2N[C@H](C(=O)O)C[C@@H]21 CQYBNXGHMBNGCG-RNJXMRFFSA-N 0.000 description 1
- 239000012788 optical film Substances 0.000 description 1
- 210000001672 ovary Anatomy 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- YPZRWBKMTBYPTK-UHFFFAOYSA-N oxidized gamma-L-glutamyl-L-cysteinylglycine Natural products OC(=O)C(N)CCC(=O)NC(C(=O)NCC(O)=O)CSSCC(C(=O)NCC(O)=O)NC(=O)CCC(N)C(O)=O YPZRWBKMTBYPTK-UHFFFAOYSA-N 0.000 description 1
- 229960001639 penicillamine Drugs 0.000 description 1
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 1
- 239000006187 pill Substances 0.000 description 1
- 229920000724 poly(L-arginine) polymer Polymers 0.000 description 1
- 229920002401 polyacrylamide Polymers 0.000 description 1
- 108010011110 polyarginine Proteins 0.000 description 1
- 229940093429 polyethylene glycol 6000 Drugs 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 229920003053 polystyrene-divinylbenzene Polymers 0.000 description 1
- 239000002244 precipitate Substances 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 210000001236 prokaryotic cell Anatomy 0.000 description 1
- 238000000159 protein binding assay Methods 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 238000010791 quenching Methods 0.000 description 1
- 239000012070 reactive reagent Substances 0.000 description 1
- 102000005962 receptors Human genes 0.000 description 1
- 108010054624 red fluorescent protein Proteins 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 229940043230 sarcosine Drugs 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 235000016491 selenocysteine Nutrition 0.000 description 1
- 229940055619 selenocysteine Drugs 0.000 description 1
- ZKZBPNGNEQAJSX-UHFFFAOYSA-N selenocysteine Natural products [SeH]CC(N)C(O)=O ZKZBPNGNEQAJSX-UHFFFAOYSA-N 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 239000002002 slurry Substances 0.000 description 1
- 230000000087 stabilizing effect Effects 0.000 description 1
- 108010018381 streptavidin-binding peptide Proteins 0.000 description 1
- 238000012916 structural analysis Methods 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 229910052717 sulfur Inorganic materials 0.000 description 1
- 125000004434 sulfur atom Chemical group 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 230000008685 targeting Effects 0.000 description 1
- 238000002849 thermal shift Methods 0.000 description 1
- JOKIQGQOKXGHDV-UHFFFAOYSA-N thiomorpholine-3-carboxylic acid Chemical compound [O-]C(=O)C1CSCC[NH2+]1 JOKIQGQOKXGHDV-UHFFFAOYSA-N 0.000 description 1
- 108060008226 thioredoxin Proteins 0.000 description 1
- 229940094937 thioredoxin Drugs 0.000 description 1
- 229960002898 threonine Drugs 0.000 description 1
- FGMPLJWBKKVCDB-UHFFFAOYSA-N trans-L-hydroxy-proline Natural products ON1CCCC1C(O)=O FGMPLJWBKKVCDB-UHFFFAOYSA-N 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- 229960004799 tryptophan Drugs 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 238000011179 visual inspection Methods 0.000 description 1
- 238000003260 vortexing Methods 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- 210000005253 yeast cell Anatomy 0.000 description 1
- ORQXBVXKBGUSBA-QMMMGPOBSA-N β-cyclohexyl-alanine Chemical compound OC(=O)[C@@H](N)CC1CCCCC1 ORQXBVXKBGUSBA-QMMMGPOBSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1044—Preparation or screening of libraries displayed on scaffold proteins
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
- C07K1/14—Extraction; Separation; Purification
- C07K1/16—Extraction; Separation; Purification by chromatography
- C07K1/22—Affinity chromatography or related techniques based upon selective absorption processes
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/12—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria
- C07K16/1267—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria from Gram-positive bacteria
- C07K16/1282—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria from Gram-positive bacteria from Clostridium (G)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/18—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/18—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans
- C07K16/28—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants
- C07K16/2803—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants against the immunoglobulin superfamily
- C07K16/2818—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants against the immunoglobulin superfamily against CD28 or CD152
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/18—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans
- C07K16/28—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants
- C07K16/2851—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants against the lectin superfamily, e.g. CD23, CD72
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/18—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans
- C07K16/28—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants
- C07K16/2896—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants against molecules with a "CD"-designation, not provided for elsewhere
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/40—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against enzymes
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/44—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material not provided for elsewhere, e.g. haptens, metals, DNA, RNA, amino acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/24—Hydrolases (3) acting on glycosyl compounds (3.2)
- C12N9/2402—Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S- glycosyl compounds (3.2.1)
- C12N9/2474—Hyaluronoglucosaminidase (3.2.1.35), i.e. hyaluronidase
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2317/00—Immunoglobulins specific features
- C07K2317/90—Immunoglobulins specific features characterized by (pharmaco)kinetic aspects or by stability of the immunoglobulin
- C07K2317/94—Stability, e.g. half-life, pH, temperature or enzyme-resistance
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2318/00—Antibody mimetics or scaffolds
- C07K2318/20—Antigen-binding scaffold molecules wherein the scaffold is not an immunoglobulin variable region or antibody mimetics
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K5/00—Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof
- C07K5/04—Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof containing only normal peptide links
- C07K5/06—Dipeptides
- C07K5/06104—Dipeptides with the first amino acid being acidic
- C07K5/06113—Asp- or Asn-amino acid
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K5/00—Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof
- C07K5/04—Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof containing only normal peptide links
- C07K5/10—Tetrapeptides
- C07K5/1021—Tetrapeptides with the first amino acid being acidic
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12R—INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
- C12R2001/00—Microorganisms ; Processes using microorganisms
- C12R2001/01—Bacteria or Actinomycetales ; using bacteria or Actinomycetales
- C12R2001/145—Clostridium
Definitions
- Affinity chromatography with target-specific, immobilized capture agents is an established method of protein purification.
- a capture agent such as a protein, nucleic acid, or small molecule
- a solid support which can then be used to isolate a protein of interest from a complex mixture.
- the technique has been widely used at the laboratory scale for single-step purifications of diverse target proteins, including enzymes, transcription factors, growth factors and antibodies.
- The use of protein-based capture agents for AC in industrial applications has been less widespread because currently available approaches are incompatible with the temperatures, pH extremes, and solvents often needed for process-scale purification or are useful for only a limited number of targets.
- An exception is the purification of kg-quantities of antibodies with AC resins based on Staphylococcal Protein A.
- the development of Protein A resins highlights the use of protein engineering to improve the robustness of AC resins as well as some remaining limitations.
- Early versions of resins with wild-type Protein A captured antibodies with high selectivity and capacity from cell culture media feedstock but lost activity gradually after multiple cycles of cleaning in place with sodium hydroxide. Mutagenesis of Protein A yielded variants with increased resistance to sodium hydroxide treatment and higher binding capacity. Despite the widespread use of Protein A resins, they are nonetheless limited to the purification of antibodies.
- the conjugation of the antibody to the resin often results in heterogeneous coupling because of a lack of precise control over the sites of conjugation.
- chromatography must be performed under oxidizing conditions in order to preserve the disulfide bonds essential for maintaining antibody structure.
- Elution of the target also usually requires low or high pH conditions that are incompatible with some target proteins.
- the chief limitation of immuno-AC resins is their sensitivity to the sodium hydroxide solutions that are preferred for cleaning-in-place procedures. Because of these limitations, new capture agents are needed that can perform under the variety of extreme conditions necessary for robust target purification.
- the invention features a protein scaffold that includes framework regions and loop regions.
- the protein scaffold has the structure:
- A and B are each, independently, absent or include at least one amino acid
- F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 4;
- L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 5;
- L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 6;
- L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 7;
- L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 8;
- L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 9;
- L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 10;
- L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 11;
- L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 12.
- a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof includes a sequence having, for example, one insertion, two insertions, one deletion, two deletions, one substitution mutation, two substitution mutations, one insertion and one deletion, one insertion and one substitution mutation, or one deletion and one substitution mutation.
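The combinations listed above are the unordered selections of one or two edits drawn from three edit types. A minimal Python sketch that enumerates them follows; the names `EDIT_TYPES` and `edit_combinations` are illustrative and do not come from the source.

```python
from itertools import combinations_with_replacement

# The three edit types named in the text.
EDIT_TYPES = ("insertion", "deletion", "substitution")


def edit_combinations(max_edits=2):
    """Return every unordered combination of 1..max_edits edit types."""
    combos = []
    for n in range(1, max_edits + 1):
        # Repetition is allowed, so "two insertions" is a valid combination.
        combos.extend(combinations_with_replacement(EDIT_TYPES, n))
    return combos


combos = edit_combinations()
for combo in combos:
    print(" + ".join(combo))
```

With `max_edits=2` this yields three single edits and six pairs, matching the nine cases spelled out in the text.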
- the protein scaffold includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X1, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X2, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X3 relative to SEQ ID NO: 1, wherein:
- X is any amino acid except the amino acid in the equivalent position in SEQ ID NO: 1;
- X1 is any amino acid except R or S;
- X2 is any amino acid except P or K; and
- X3 is any amino acid except R or K.
- F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 4;
- F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 5;
- F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 6;
- F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 7;
- F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 8;
- F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 9;
- F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 10;
- F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 11; and
- F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 12.
- F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4);
- F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5);
- F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6);
- F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7);
- F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8);
- F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9);
- F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10);
- F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11); and
- F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12).
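The framework motifs above use a compact notation in which a parenthesized group such as (T/S) marks one position with alternative residues. One way to test a candidate sequence against such a motif is to translate it into a regular expression. The sketch below assumes only that notation; the helper `motif_to_regex` and the toy sequence are hypothetical and are not part of the source.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Translate a motif such as '(T/S)-LI-(H/R)' into a regex string."""
    parts = []
    for token in motif.split("-"):
        if token.startswith("(") and token.endswith(")"):
            # '(T/S)' becomes '[TS]': one position, several allowed residues.
            parts.append("[" + token[1:-1].replace("/", "") + "]")
        else:
            # Fixed residues such as 'LI' are matched literally.
            parts.append(token)
    return "".join(parts)


F1_MOTIF = "(T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W"  # SEQ ID NO: 4
F9_MOTIF = "LTFSEFA-(I/V)-VS"                     # SEQ ID NO: 12

f1_pattern = re.compile(motif_to_regex(F1_MOTIF))

# Toy sequence for illustration only; it is not taken from the source.
toy = "MATLIHTPGWAA"
match = f1_pattern.search(toy)
print(motif_to_regex(F1_MOTIF), match.start() if match else None)
```

This handles exact motif matches only. Variants with one or two insertions, deletions, or substitutions relative to a motif would need an approximate-matching step that is not shown here.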
- the protein scaffold includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X relative to SEQ ID NO: 1, wherein X is any amino acid.
- the protein scaffold includes at least one mutation selected from the group consisting of N807D, S809T, R812H, S813T, E814P, S815G, D818V, N822S, N825D, N832S, W836E, K857E, E858V, I859V, K860E, L861V, D862G, R865H, K870A, N871D, N880T, K881R, K883R, N890G, K897R, K901H, K908Q, E912D, S914D, and K922Q relative to SEQ ID NO: 1.
- the at least one mutation is K870X and/or N890X. In some embodiments, the at least one mutation is K870A and/or N890G. In some embodiments, the at least one mutation is K870A. In some embodiments, the at least one mutation is N890G.
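Mutation labels such as K870A name the original residue, its position in the reference numbering, and the replacement residue. A minimal sketch of parsing and applying such labels follows. The function names, the `first_position` parameter, and the toy fragment are assumptions made for illustration; the fragment is not SEQ ID NO: 1, which is not reproduced in this section.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    wild_type: str    # residue expected in the reference
    position: int     # position in the reference numbering
    replacement: str  # residue introduced by the mutation


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> Mutation:
    """Parse a label such as 'K870A' into its three parts."""
    m = _LABEL.match(label)
    if m is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Mutation(m.group(1), int(m.group(2)), m.group(3))


def apply_mutations(sequence: str, labels, first_position: int = 1) -> str:
    """Apply substitutions to a sequence whose first residue is numbered first_position."""
    residues = list(sequence)
    for label in labels:
        mut = parse_mutation(label)
        idx = mut.position - first_position
        if not 0 <= idx < len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[idx] != mut.wild_type:
            raise ValueError(f"{label}: found {residues[idx]} at position {mut.position}")
        residues[idx] = mut.replacement
    return "".join(residues)


# Hypothetical fragment numbered 870..890, with K at 870 and N at 890.
toy_fragment = "K" + "A" * 19 + "N"
mutated = apply_mutations(toy_fragment, ["K870A", "N890G"], first_position=870)
print(mutated)
```

Checking the wild-type residue before substituting guards against numbering mismatches between a label and the sequence it is applied to.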
- the protein scaffold includes at least 3 fewer lysines relative to SEQ ID NO: 1.
- the protein scaffold includes at least 3, 4, 5, 6, 7, 8, 9, or 10 fewer lysines relative to SEQ ID NO: 1.
- the protein scaffold includes at least 6 fewer lysines relative to SEQ ID NO: 1.
- the protein scaffold includes 9 fewer lysines relative to SEQ ID NO: 1.
- the protein scaffold does not include any lysines.
- the protein scaffold includes at least 3 fewer asparagines relative to SEQ ID NO: 1.
- the protein scaffold includes at least 3, 4, 5, 6, 7, or 8 fewer asparagines relative to SEQ ID NO: 1.
- the protein scaffold includes at least 5 fewer asparagines relative to SEQ ID NO: 1.
- the protein scaffold includes 7 fewer asparagines relative to SEQ ID NO: 1.
- the protein scaffold does not include any asparagines.
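The lysine and asparagine criteria above compare residue counts between a scaffold and the reference sequence. A minimal sketch of that comparison is given below. The function `fewer_residues` and both toy sequences are hypothetical; neither sequence is SEQ ID NO: 1 or a scaffold from the source.

```python
def fewer_residues(reference: str, variant: str, residue: str) -> int:
    """Return how many fewer copies of `residue` the variant has than the reference."""
    return reference.count(residue) - variant.count(residue)


# Toy sequences for illustration only (one-letter codes: K = lysine, N = asparagine).
reference = "MKNKAKNK"
variant = "MRDRAKNQ"

lysines_removed = fewer_residues(reference, variant, "K")
asparagines_removed = fewer_residues(reference, variant, "N")
print(lysines_removed, asparagines_removed)
```

A negative result would mean the variant has more copies of the residue than the reference, and a variant contains none of the residue when `variant.count(residue)` is zero.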
- A and B are each, independently, absent or at least one amino acid.
- each of A and B may, independently, be absent.
- A and B are each, independently, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1,000 or more amino acids.
- A and B are each, independently, from 0 to 1,000 amino acids, e.g., from 1 to 10 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids), from 10 to 100 amino acids (e.g., 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 amino acids), or from 100 to 1,000 amino acids (e.g., 100, 200, 300, 400, 500, 600, 700, 800, 900, or
- A and B are each, independently, absent or from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids).
- each of L1-L8 is, independently, from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids).
- each of L1-L8 is, independently, from 1 amino acid to 10 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids). In some embodiments, each of L1-L8 is, independently, from 3 amino acids to 10 amino acids. In some embodiments, each of L1-L8 is, independently, from 3 amino acids to 8 amino acids.
- L1 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
- L1 is from 0 to 5 amino acids (e.g., from 1 to 5 amino acids, e.g., 0, 1, 2, 3, 4, or 5 amino acids).
- L2 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L2 is from 1 amino acid to 16 amino acids (e.g., from 4 to 16 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 amino acids).
- L3 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L3 is 6 amino acids.
- L4 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L4 is from 0 to 5 amino acids (e.g., from 1 to 5 amino acids, e.g., 0, 1, 2, 3, 4, or 5 amino acids).
- L5 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L5 is 5 amino acids.
- L6 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L6 is from 3 to 6 amino acids (e.g., 3, 4, 5, or 6 amino acids).
- L7 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L7 is 4 or 5 amino acids. In some embodiments, L8 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L8 is from 4 to 6 amino acids (e.g., 4, 5, or 6 amino acids).
- L1 is 4 amino acids. In some embodiments, L2 is 7 amino acids. In some embodiments, L8 is 5 amino acids. In some embodiments, L1 is 4 amino acids, L2 is 7 amino acids, and/or L8 is 5 amino acids. In some embodiments, L1 is 4 amino acids, L2 is 7 amino acids, and L8 is 5 amino acids.
- L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid.
- X2 is V.
- L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid.
- L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid.
- L4 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid.
- L6 includes the sequence of: X1X2X3X4X5X6 (SEQ ID NO: 16), wherein each of X1-X6 is, independently, any amino acid.
- L8 includes at least two amino acids. In some embodiments, L8 includes at least one amino acid.
- L4 includes the sequence of: (G/D)-GGSS (SEQ ID NO: 17) or GDT or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 17 or GDT.
- L6 includes the sequence of TGAPAG (SEQ ID NO: 18) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 18.
- L4 includes the sequence of: (G/D)-GGSS (SEQ ID NO: 17) or GDT; and L6 includes the sequence of TGAPAG (SEQ ID NO: 18).
- L3 includes the sequence of: (E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D) (SEQ ID NO: 19) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 19.
- L5 includes the sequence of: LD-(G/N)-(E/S)-S (SEQ ID NO: 20) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 20.
- L7 includes at least one amino acid.
- L7 includes the sequence of ETPI-(S/E)-A (SEQ ID NO: 21) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 21.
- L3 includes the sequence of: (E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D) (SEQ ID NO: 19) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 19;
- L5 includes the sequence of: LD-(G/N)-(E/S)-S (SEQ ID NO: 20) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 20;
- L7 includes the sequence of ETPI-(S/E)-A (SEQ ID NO: 21) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 21.
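The degenerate loop motifs above, such as LD-(G/N)-(E/S)-S, can be read as position-wise sets of allowed residues. The following is a minimal sketch, assuming that reading, of how such a motif could be converted to a regular expression and checked against a candidate loop. The function names `motif_to_regex` and `matches_motif` are hypothetical and are not part of the source.

```python
import re

def motif_to_regex(motif: str) -> str:
    """Convert a motif such as 'LD-(G/N)-(E/S)-S' into the regex 'LD[GN][ES]S'."""
    parts = []
    for token in motif.split("-"):
        token = token.strip()
        if token.startswith("(") and token.endswith(")"):
            # '(G/N)' means either G or N is allowed at this position.
            choices = token[1:-1].split("/")
            parts.append("[" + "".join(choices) + "]")
        else:
            # A run of fixed residues, e.g. 'LD' or 'ETPI'.
            parts.append(token)
    return "".join(parts)

# Motifs copied from the text above.
MOTIFS = {
    "SEQ ID NO: 17": "(G/D)-GGSS",
    "SEQ ID NO: 19": "(E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D)",
    "SEQ ID NO: 20": "LD-(G/N)-(E/S)-S",
    "SEQ ID NO: 21": "ETPI-(S/E)-A",
}

def matches_motif(loop: str, motif: str) -> bool:
    """Return True if the whole loop is one concrete instance of the motif."""
    return re.fullmatch(motif_to_regex(motif), loop) is not None
```

Under this reading, the specific loops recited elsewhere in the text (EVVEVG, GGGSS, LDGES, ETPISA) are each one instance of the corresponding motif. Because the text says a loop "includes" the sequence, `re.search` could be substituted for `re.fullmatch` to allow flanking residues.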
- A includes the sequence of (D/N/H)-P. In some embodiments, A includes the sequence of DP.
- B includes the sequence of DELE (SEQ ID NO: 35).
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 22;
- L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 23;
- L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 24;
- L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 25;
- L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 26;
- L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 27;
- L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 28;
- L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 29;
- L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 30.
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
- L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
- L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
- L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
- L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
- L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
- L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
- L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
- L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
- F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 22;
- L1 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 23; L2 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 24;
- L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 31;
- F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 25;
- L4 comprises the sequence of: GGGSS (SEQ ID NO: 32) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 32;
- F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 26;
- L5 comprises the sequence of: LDGES (SEQ ID NO: 33) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 33;
- F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 27;
- L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 18;
- F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 28;
- L7 comprises the sequence of: ETPISA (SEQ ID NO: 34) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 34;
- F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 29;
- L8 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 30.
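The recurring condition "one or two amino acid insertions, deletions, substitution mutations, or a combination thereof" can be approximated computationally as an edit distance of at most two from the reference segment. The sketch below assumes that interpretation, in which each insertion, deletion, or substitution counts as one change; the source does not define how changes are counted. The helper names `edit_distance` and `within_allowed_changes` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-residue insertions,
    deletions, or substitutions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))
    for i, res_a in enumerate(a, start=1):
        curr = [i]
        for j, res_b in enumerate(b, start=1):
            cost = 0 if res_a == res_b else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def within_allowed_changes(candidate: str, reference: str, max_changes: int = 2) -> bool:
    """True if candidate differs from reference by at most max_changes edits."""
    return edit_distance(candidate, reference) <= max_changes

F1_REFERENCE = "TLIHTPGW"  # SEQ ID NO: 22, from the text above
```

For example, `within_allowed_changes("TLIHSPGW", F1_REFERENCE)` is true (one substitution), while a variant with three substitutions is rejected. Setting `max_changes=1` corresponds to the narrower embodiments that permit only one change.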
- F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 22;
- L1 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 23;
- L2 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 24;
- L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 31;
- F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 25;
- L4 comprises the sequence of: GGGSS (SEQ ID NO: 32) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 32;
- F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 26;
- L5 comprises the sequence of: LDGES (SEQ ID NO: 33) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 33;
- F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 27;
- L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 18;
- F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 28;
- L7 comprises the sequence of: ETPISA (SEQ ID NO: 34) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 34;
- F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 29;
- L8 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 30.
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
- L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
- L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
- L3 includes the sequence of: EVVEVG (SEQ ID NO: 31);
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
- L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
- L5 includes the sequence of: LDGES (SEQ ID NO: 33);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
- L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
- L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
- L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
- L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid;
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
- L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid;
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
- L3 includes the sequence of: EVVEVG (SEQ ID NO: 31);
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
- L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
- L5 includes the sequence of: LDGES (SEQ ID NO: 33);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
- L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
- L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
- L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid; and F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
- A includes the sequence of: DP;
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
- L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid;
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
- L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid;
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
- L3 includes the sequence of: EVVEVG (SEQ ID NO: 31);
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
- L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
- L5 includes the sequence of: LDGES (SEQ ID NO: 33);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
- L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
- L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
- L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid;
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30);
- B includes the sequence of: DELE (SEQ ID NO: 35).
- L1 includes the sequence of X1X2X3X4 (SEQ ID NO: 13), wherein each of X1, X3, and X4 is, independently, any amino acid, and X2 is V.
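The embodiments above describe the scaffold as alternating framework (F1-F9) and loop (L1-L8) segments, optionally flanked by A and B. The following is a minimal sketch, assuming the segments are simply concatenated in the order listed (A, F1, L1, F2, ..., L8, F9, B), of how a full sequence could be assembled around three variable loops. The function name `assemble_scaffold` and the placeholder loops in the usage note are hypothetical and are not binders from the source.

```python
# Framework and constant-loop sequences copied from the text above.
FRAMEWORKS = {
    "F1": "TLIHTPGW", "F2": "GSEADLLDGDDSTGVEY", "F3": "SLAGEFIGLDLG",
    "F4": "GIHFVIGAD", "F5": "DKWTRFRLEYS", "F6": "WTTIREYDH",
    "F7": "QDVIDEDF", "F8": "QYIRLTNLE", "F9": "LTFSEFAIVS",
}
CONSTANT_LOOPS = {"L3": "EVVEVG", "L4": "GGGSS", "L5": "LDGES",
                  "L6": "TGAPAG", "L7": "ETPISA"}
VARIABLE_LOOP_LENGTHS = {"L1": 4, "L2": 7, "L8": 5}

def assemble_scaffold(l1: str, l2: str, l8: str,
                      n_term: str = "DP", c_term: str = "DELE",
                      require_v_at_x2: bool = False) -> str:
    """Concatenate A, F1, L1, ..., L8, F9, B (assumed order) into one sequence."""
    loops = dict(CONSTANT_LOOPS, L1=l1, L2=l2, L8=l8)
    for name, length in VARIABLE_LOOP_LENGTHS.items():
        if len(loops[name]) != length:
            raise ValueError(f"{name} must be {length} residues long")
    if require_v_at_x2 and l1[1] != "V":
        # Optional constraint from the embodiment in which X2 of L1 is V.
        raise ValueError("X2 of L1 must be V")
    parts = [n_term]
    for i in range(1, 10):
        parts.append(FRAMEWORKS[f"F{i}"])
        if i <= 8:
            parts.append(loops[f"L{i}"])
    parts.append(c_term)
    return "".join(parts)
```

For instance, `assemble_scaffold("AVAA", "AAAAAAA", "AAAAA")` uses alanine-rich placeholder loops purely to illustrate the layout; a real library would vary the L1, L2, and L8 arguments.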
- a protein scaffold that includes a polypeptide having at least 80% (e.g., at least 85%, 90%, 95%, 97%, or 99%) sequence identity to SEQ ID NO: 3.
- the polypeptide includes the sequence of SEQ ID NO: 3.
- the polypeptide does not include the sequence of SEQ ID NO: 1.
- the polypeptide does not include the sequence of SEQ ID NO: 2.
- the polypeptide includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X1, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X2, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X3 relative to SEQ ID NO: 1, wherein:
- X is any amino acid except the amino acid in the equivalent position in SEQ ID NO: 1;
- X1 is any amino acid except R or S;
- X2 is any amino acid except P or K; and
- X3 is any amino acid except R or K.
- the protein scaffold further includes a mutation that adds a cysteine residue.
- the protein scaffold includes a first mutation that adds a first cysteine residue and a second mutation that adds a second cysteine residue.
- the first cysteine residue and the second cysteine residue form a disulfide bond under oxidizing conditions.
- the protein scaffold comprises at least one mutation selected from the group consisting of F806C, P808C, S845C, L855C, V858C, V861C, K878C, W879C, L884C, L888C, A904C, P905C, A906GC, G907C, I924C, L926C, N928C, L936C, I943C, and L948C.
- the protein scaffold comprises two or more mutations selected from the group consisting of F806C, P808C, S845C, L855C, V858C, V861C, K878C, W879C, L884C, L888C, A904C, P905C, A906GC, G907C, I924C, L926C, N928C, L936C, I943C, and L948C.
- the protein scaffold comprises a pair of cysteine mutations selected from the group consisting of K878C and G907C, K878C and A904C, V861C and I943C, P905C and L855C, S845C and L936C, W879C and N928C, L884C and L926C, F806C and L948C, V858C and L888C, K878C and G907C, K878C and A906GC, S845C and N928C, K878C and A904C, P808C and I943C, V861C and I924C, P808C and V861C, and I943C and L855C.
- the pair of cysteine mutations is selected from the group consisting of K878C and G907C, K878C and A904C, S845C and L936C, W879C and N928C, W879C and N928C, L884C and L926C, V858C and L888C, K878C and G907C, and K878C and A906GC (i.e., the substitution of alanine 906 with glycine and cysteine).
- the protein scaffold further includes a tag covalently attached to the scaffold.
- the tag is an affinity tag (e.g., a polyhistidine tag, e.g., 4, 5, 6, 7, 8, 9, or 10 histidines), an epitope tag, a covalent tag, or a protein tag.
- the tag is attached to the N-terminus or the C-terminus of the scaffold.
- the scaffold is conjugated to a functional group.
- the functional group includes biotin, streptavidin or a derivative of streptavidin, a polyethylene glycol moiety, a fluorescent dye, an enzyme, a radioactive moiety, a lanthanide, or a lanthanide binding motif.
- the scaffold is conjugated to a lanthanide or a lanthanide binding motif.
- the lanthanide is terbium.
- the scaffold is conjugated to a radioactive moiety.
- the radioactive moiety is an α or β emitter.
- the functional group is conjugated to a sulfhydryl group or a primary amine.
- polynucleotide encoding a protein scaffold as described herein, e.g., of any of the above embodiments.
- the polynucleotide is a ribonucleotide.
- the polynucleotide is a deoxyribonucleotide.
- featured is a vector that includes a polynucleotide as described herein.
- featured is a cell that includes a polynucleotide encoding the protein scaffold or a vector that includes the polynucleotide.
- a method of producing a protein scaffold as described herein includes the steps of (a) providing a cell transformed with a polynucleotide encoding the protein scaffold or a vector that includes the polynucleotide; (b) culturing the transformed cell under conditions for expressing the polynucleotide, wherein the culturing results in expression of the protein scaffold.
- the method may further include (c) isolating the protein scaffold or using the protein scaffold to bind a target.
- featured is a particle that includes the protein scaffold of any of the above embodiments.
- the particle is a magnetic particle.
- featured is a resin that includes a plurality of the particles, e.g., containing the protein scaffold.
- a column (e.g., a chromatography column) that includes the particles or the resin (e.g., conjugated to the scaffold).
- a method of purifying a target molecule from a plurality of molecules includes (a) providing a sample that includes a mixture of the target molecule and the plurality of molecules; (b) contacting the sample with the protein scaffold of any one of the above embodiments, wherein the scaffold specifically binds to the target molecule; and (c) separating the target molecule bound to the protein scaffold from the plurality of molecules.
- the step of separating includes immobilizing the protein scaffold.
- the protein scaffold is conjugated to a particle.
- the particle includes a magnetic bead.
- the protein scaffold is conjugated to a resin or monolith including a plurality of the particles.
- the Carbohydrate Binding Module Family 32 (CBM32) scaffold of SEQ ID NO: 1 is derived from a single protein domain of Clostridium perfringens hyaluronidase (NagH), a multi-domain enzyme consisting of 1627 amino acids.
- Amino acid residue 1 of SEQ ID NO: 1 corresponds to amino acid residue 807 of NagH
- amino acid residue 140 of SEQ ID NO: 1 corresponds to amino acid residue 946 of NagH.
- Amino acid positions and mutations described herein generally relate to the position on the corresponding full length NagH unless otherwise specified.
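Because mutations are named by full-length NagH numbering while SEQ ID NO: 1 covers NagH residues 807-946, converting between the two is a fixed offset of 806. The sketch below, with hypothetical helper names `parse_mutation` and `nagh_to_seq1`, parses labels such as K878C and maps the NagH position into SEQ ID NO: 1 numbering. Treating a multi-letter replacement (as in A906GC) as an allowed form is an assumption based on the text's own gloss of that mutation.

```python
import re
from typing import NamedTuple

NAGH_OFFSET = 806   # residue 1 of SEQ ID NO: 1 corresponds to NagH residue 807
SEQ1_LENGTH = 140   # residue 140 of SEQ ID NO: 1 corresponds to NagH residue 946

class Mutation(NamedTuple):
    wild_type: str       # one-letter code of the original residue
    nagh_position: int   # position in full-length NagH numbering
    replacement: str     # replacement residue(s), e.g. 'C' or 'GC'

_MUTATION_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z]+)")

def parse_mutation(label: str) -> Mutation:
    """Parse a label such as 'K878C'; stray spaces are ignored."""
    match = _MUTATION_PATTERN.fullmatch(label.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    return Mutation(match.group(1), int(match.group(2)), match.group(3))

def nagh_to_seq1(nagh_position: int) -> int:
    """Map a NagH position to the 1-based position in SEQ ID NO: 1."""
    position = nagh_position - NAGH_OFFSET
    if not 1 <= position <= SEQ1_LENGTH:
        raise ValueError(f"NagH position {nagh_position} lies outside SEQ ID NO: 1")
    return position
```

For example, K878C maps to position 72 of SEQ ID NO: 1. Positions outside 807-946 raise an error in this sketch, so labels referring to residues beyond that window would need separate handling.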
- constant region generally refers to a region of a binding scaffold that does not include the variable loop regions involved in target binding.
- a constant region may include a framework region (e.g., F1-F9) or a loop region (e.g., L3-L7) that is not one of the three loops mutagenized for target binding (e.g., L1, L2, and L8).
- a constant region may have sequence variability.
- non-naturally occurring amino acid means non-proteinogenic amino acids.
- non-naturally occurring amino acids include D-amino acids; an amino acid having an acetylaminomethyl group attached to a sulfur atom of a cysteine; a pegylated amino acid; the omega amino acids of the formula NH2(CH2)nCOOH where n is 2-6; neutral nonpolar amino acids, such as sarcosine, t-butyl alanine, t-butyl glycine, N-methyl isoleucine, and norleucine; oxymethionine; phenylglycine; citrulline; methionine sulfoxide; cysteic acid; ornithine; diaminobutyric acid; 3-aminoalanine; 3-hydroxy-D-proline; 2,4-diaminobutyric acid; 2-aminopentanoic acid; 2-aminooctanoic acid, 2-car
- amino acids are α-aminobutyric acid, α-amino-α-methylbutyrate, aminocyclopropane-carboxylate, aminoisobutyric acid, aminonorbornyl-carboxylate, L-cyclohexylalanine, cyclopentylalanine, L-N-methylleucine, L-N-methylmethionine, L-N-methylnorvaline, L-N-methylphenylalanine, L-N-methylproline, L-N-methylserine, L-N-methyltryptophan, D-ornithine, L-N-methylethylglycine, L-norleucine, α-methyl-aminoisobutyrate, α-methylcyclohexylalanine, D-α-methylalanine, D-α-methylarginine, D-α-methylasparagine, D-α-methylaspartate, D-α-methylcysteine
- amino acid residues may be charged or polar.
- Charged amino acids include arginine, lysine, aspartic acid, or glutamic acid, or non-naturally occurring analogs thereof.
- Polar amino acids include glutamine, asparagine, histidine, serine, threonine, tyrosine, methionine, or tryptophan, or non-naturally occurring analogs thereof. It is specifically contemplated that in some embodiments, a terminal amino group in the amino acid may be an amido group or a carbamate group.
- percent (%) identity refers to the percentage of amino acid residues of a candidate sequence, e.g., a protein scaffold, that are identical to the amino acid residues of a reference sequence, e.g., a wild-type CBM32 polypeptide, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent identity (i.e., gaps can be introduced in one or both of the candidate and reference sequences for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). Alignment for purposes of determining percent identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, ALIGN, or Megalign (DNASTAR) software.
- the percent amino acid sequence identity of a given candidate sequence to, with, or against a given reference sequence is calculated as follows:
- the percent amino acid sequence identity of the candidate sequence to the reference sequence would not equal the percent amino acid sequence identity of the reference sequence to the candidate sequence.
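The formula referred to above did not survive in this text. A common convention, assumed here rather than taken from the source, is 100 × (A/B), where A is the number of identically matched residues in the alignment and B is the number of residues in the reference sequence. The sketch below applies that convention to two sequences that have already been aligned (gaps written as `-`); the function name `percent_identity` is hypothetical, and the alignment itself would come from a tool such as BLAST.

```python
def percent_identity(aligned_candidate: str, aligned_reference: str, gap: str = "-") -> float:
    """Percent identity of a candidate to a reference, given a pairwise alignment.

    Assumed convention: 100 * A / B, where A counts aligned positions holding
    the same residue and B is the ungapped length of the reference sequence.
    """
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    identical = sum(
        1 for cand, ref in zip(aligned_candidate, aligned_reference)
        if cand == ref and cand != gap
    )
    reference_length = sum(1 for ref in aligned_reference if ref != gap)
    if reference_length == 0:
        raise ValueError("reference sequence is empty")
    return 100.0 * identical / reference_length
```

Under this convention the result depends on which sequence is treated as the reference: a 4-residue candidate that matches the first half of an 8-residue reference scores 50% against the reference, whereas the 8-residue sequence scores 100% against the 4-residue one. This asymmetry is consistent with the statement above about sequences of unequal length.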
- FIG. 1 is a schematic drawing showing an outline of the protein engineering campaign to produce a member of the nC-B class of nanoCLAMPs.
- the starting nanoCLAMP was anti-SUMO clone SMT3-A1, a member of the nC-A class of nanoCLAMPs.
- SMT3-A1 was mutated over 7 rounds.
- the end product is clone P2788, whose constant regions served as the basis for the nC-B class of nanoCLAMPs.
- FIG. 2 is a space filled model of P2788, an example of the nC-B class of nanoCLAMP with mutations in P2788 mapped on to CBM-32-3 crystal structure and the sequence of the constant region of the nC-A class of nanoCLAMPs.
- the constant regions of clone P2788 are the basis for the nC-B class of nanoCLAMPs. Side chains of mutated positions are labeled in green; side chains of variable loops are shown in red. Other side chains are shown in light gray. Backbone residues are shown in dark gray.
- the alignment compares the constant regions of the nC-A and nC-B classes of nanoCLAMPs.
- Residues denoted with black, bolded text represent nC-A positions where at least one mutation was tested (top row). Mutations were tested for 58% of positions in the constant regions (72 out of 124). Residues denoted with bolded text represent mutations in P2788 (bottom row). In P2788, 24% of positions in the constant regions are mutated relative to nC-A (30 out of 124).
- FIG. 3 is a model showing the superimposition of the crystal structures for CBM32-2 (basis for the nC-A class of nanoCLAMPs) and the AlphaFold model for P2788 (basis for the nC-B class).
- CBM32-2 (PDB accession 2W1Q) and P2788 were aligned with jFATCAT (rigid) on the RCSB server, resulting in a high TM-score (0.95).
- Backbone deviations are apparent and expected for the loops, which have different amino acid sequences.
- FIG. 4 is a graph showing differential scanning fluorimetry analysis of SMT3-A1 (nC-A class) and P2788 (nC-B class). Both clones show classically shaped melting curves with low initial fluorescence. The 30 mutations in P2788 increase its Tm by 24 °C relative to SMT3-A1.
- FIGS. 5A-5F are gels and a graph showing protease resistance of nanoCLAMPs of the nC-A and nC-B classes.
- FIG. 5A shows SDS-PAGE analysis of SMT3-A1 (nC-A class) and P2788 (same variable loops as SMT3-A1 but with constant regions of the nC-B class) after exposure to 16 hr incubation with trypsin or chymotrypsin.
- FIGS. 5B-5D show SDS-PAGE analysis of time course tryptic digestions of SMT3-A1, P2788, and P2808.
- P2788 and P2808 have the same constant regions (nC-B class), but different loops.
- FIG. 5E shows quantitative densitometry analysis of the time course-stained gels.
- FIG. 5F shows SDS-PAGE analysis of members of the nC-A class of nanoCLAMPs (SMT3-A1) and nC-B class (P2788, P2808, P2809, and P2811) following 16 h tryptic digestions.
- FIG. 6 is a set of size exclusion chromatograms showing monodispersity and melting temperature analysis of anti-SUMO nanoCLAMPs of the nC-B class. Size exclusion chromatography (left panel) and differential scanning fluorescence (right panel) of nanoCLAMPs P2808, P2809, and P2811.
- FIG. 7 is a graph showing dynamic binding capacity of SMT3-A1 resin (nC-A class) and P2808 resin (nC-B class). Breakthrough curves were generated by loading a solution of 0.2 mg/ml SUMO-GFP in PBS onto 0.6 ml of packed resin in a column (3 cm height x 5 mm ID) at a flow rate of 0.5 ml/min and measuring the fluorescence of the eluate. The percent fluorescence of the load was calculated by dividing the eluate fluorescence by the load fluorescence.
- Vx is the volume of eluate collected;
- Vdelay is the elution volume of the load under non-binding conditions;
- c is the concentration of target in the load; and
- Vresin is the volume of the packed resin in the column.
- The P2808 resin has a dynamic binding capacity of 10 mg/ml resin (240 nmol/ml resin).
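- The variable definitions above fit the usual breakthrough-curve relation for dynamic binding capacity, DBC = (Vx - Vdelay) x c / Vresin. That relation is an assumption here, because the equation itself is not reproduced in this text. The sketch below applies it with the load concentration (0.2 mg/ml) and resin volume (0.6 ml) stated for FIG. 7, and converts mg to nmol using the 42 kD mass given for SUMO-GFP. The eluate and delay volumes are hypothetical values chosen only for illustration.

```python
def dynamic_binding_capacity(v_x_ml, v_delay_ml, c_mg_per_ml, v_resin_ml):
    """Assumed relation: DBC = (Vx - Vdelay) * c / Vresin, in mg per ml resin."""
    return (v_x_ml - v_delay_ml) * c_mg_per_ml / v_resin_ml


def mg_to_nmol(mass_mg, mw_da):
    """Convert a protein mass in mg to nmol for a molecular weight in Da (g/mol)."""
    return mass_mg * 1e6 / mw_da


# Hypothetical volumes (not from the source); c and Vresin are as stated for FIG. 7.
dbc = dynamic_binding_capacity(v_x_ml=31.2, v_delay_ml=1.2,
                               c_mg_per_ml=0.2, v_resin_ml=0.6)
print(round(dbc, 1))                  # mg target per ml resin
print(round(mg_to_nmol(10, 42_000)))  # nmol per ml resin at 10 mg/ml and 42 kD
```

- With a 42 kD target, 10 mg/ml works out to roughly 238 nmol/ml, which agrees with the rounded 240 nmol/ml figure.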
- FIG. 9 is a graph showing the effect of sodium hydroxide treatment on binding capacity of nanoCLAMP capture agents of the nC-A and nC-B class.
- The binding capacities of resins with capture agents of the nC-A class (SMT3-A1, P1519, P1533) and the nC-B class (P2808, P2809, P2811) were determined after each of 22 cycles of purification of GFP-SUMO from a spiked E. coli lysate, followed by washing, eluting, and cleaning in place with 0.1 M NaOH (10 min contact time).
- The % of starting binding capacity was determined by dividing the eluate fluorescence by the load fluorescence.
- The selectivity was determined by analysis of the eluates on SDS-PAGE (FIG. 13).
- FIGS. 10A-10D are graphs and gels showing the effect of organic solvent and autoclaving on the binding capacity of resins made with nanoCLAMPs of the nC-B class.
- Resins P2808, P2809, and P2811 (nC-B class) and Resin SMT3-A1 (nC-A class) were incubated in 100% DMF for 2 h (FIGS. 10A and 10B) or autoclaved (105-minute liquid steam cycle, including 30 min exposure to 120 °C and 20 p.s.i.) (FIGS. 10C and 10D) and then re-equilibrated in fresh buffer and tested in affinity chromatography purification of SUMO-GFP from spiked E. coli lysate.
- Binding capacity % of untreated was determined by dividing the eluate fluorescence by the control (non-treated) eluate fluorescence. Specificity was determined by Coomassie staining of SDS-PAGE (FIGS. 10B and 10D).
- FIG. 11 is a graph showing kinetic thermal stability of nC-A (SMT3-A1) and nC-B (P2808, P2809, and P2811) nanoCLAMPs. NanoCLAMPs were heat treated, cooled, and centrifuged. The supernatant was tested for binding activity by biolayer interferometry. The percent of starting response was measured as the amplitude of binding divided by that obtained with the control sample (held at 20 °C during heat treatments).
- FIG. 12 is a gel showing static binding capacity of P2808 resin. Affinity resin prepared with P2808 (nC-B class) was incubated with a spiked E. coli lysate; washed; eluted with 3 M imidazole, pH 8; buffer exchanged; and quantified by A280.
- FIG. 13 is a set of gels showing the effect of sodium hydroxide treatment on specificity of nanoCLAMP capture agents of the nC-A and nC-B class.
- nC-A: SMT3-A1, P1519, P1533
- nC-B: P2808, P2809, P2811
- Sumo-binding nanoCLAMPs were covalently conjugated to 6% cross-linked agarose resin and then used to purify a Sumo-GFP fusion from a crude E. coli lysate.
- Each cycle consisted of a crude Sumo-GFP-spiked sample load, wash, elution with 3 M imidazole (collected), wash, 0.1 M NaOH regeneration (10 min contact time per cycle), and a 5 min refolding wash.
- The target protein in the eluate was quantified by fluorescence spectroscopy (FIG. 9), and the percent yield was calculated by dividing the fluorescence by the initial eluate fluorescence.
- The prominent band in the eluates of each gel is SUMO-GFP (42 kD).
- FIGS. 14A and 14B are a graph and a gel testing the stability of Resin P2808 (nC-B class) through >20 low pH elution cycles.
- SUMO-GFP was spiked into crude E. coli lysate, loaded onto P2808 resin, washed, and eluted with 0.1 M Citrate, pH 2.5, followed by regeneration with 0.1 N NaOH with 1 min contact time per cycle, and a re-equilibration wash for 5 min.
- FIG. 14A shows the target protein in the eluate, which was quantified by densitometry of the Coomassie stained SDS-PAGE gel of FIG. 14B because the fluorescence of the eluate was destroyed by the low pH. Percent yield was calculated by dividing the band density by the initial eluate band density. The purity of the eluted target from each cycle was assessed by SDS-PAGE of FIG. 14B.
- FIG. 15 is a graph showing nanoCLAMPs stably binding terbium (Tb).
- SMT3-A1: nC-A class
- P2808: nC-B class
- Negative control protein: recombinant SMT3
- The buffer-exchanged proteins were analyzed by time-resolved fluorescence 24 h post buffer exchange (Ex/Em: 350 nm/544 nm), with a 200 µsec delay.
- FIG. 16 is a model showing the front, back, top, and bottom faces of the nC-B class of nanoCLAMP.
- A, B, F1-F9, and L1-L8 are mapped on the clone P2808 sequence and 3D-modeled to illustrate the locations of each region of the scaffold.
- FIG. 17 is a model showing an alignment of P2808, an example of the nC-B class, in which Loop 1 is replaced with a (G4S)3 sequence or is removed.
- FIG. 18 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 2 is replaced with a (G4S)3 sequence or is removed.
- FIG. 19 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 4 is replaced with a (G4S)3 sequence or is removed.
- FIG. 20 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 6 is replaced with a (G4S)3 sequence or GG.
- FIG. 21 is a model showing the nC-B nanoCLAMP P2808 in which Loop 8 is replaced with a GGGGG (SEQ ID NO: 36), GGGG (SEQ ID NO: 37), GGG, GG, or G, or is removed.
- FIG. 22 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 8 is replaced with a (G4S)3 sequence or is removed.
- FIG. 23 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 3 is replaced with a (G4S)3 sequence or is removed.
- FIG. 24 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 5 is replaced with a (G4S)3 sequence or is removed.
- FIG. 25 is a model showing the nC-B nanoCLAMP P2808 in which Loop 7 is replaced with a GGGGG (SEQ ID NO: 36), GGGG (SEQ ID NO: 37), GGG, GG, or G, or is removed.
- FIG. 26 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 8 is replaced with a (G4S)3 sequence or G.
- FIG. 27 is a gel showing introduction of artificial disulfide bonds into clones P2808 and P2960.
- Purified proteins were treated with SDS sample buffer containing (first lane of each set) or lacking (second lane of each set) reducing agent (DTT).
- The presence of faster migrating species indicates disulfide bonding in samples lacking DTT, likely due to more compact folding and a smaller hydrodynamic radius.
- P2808 and P2960 contain no Cys residues and migrate at the same rate in oxidizing and reducing sample buffer, as expected.
- BSA, which contains 17 disulfide bonds, is included as a control for the activity of the reducing agent (DTT).
- FIG. 28 is a graph showing that artificial disulfide improves thermal stability of P3015 by 9 °C.
- The graph shows differential scanning fluorescence (DSF) analysis of melting temperature of reduced and oxidized P3015.
- Immunoaffinity chromatography is an established laboratory-scale technique for the isolation of target proteins with high yield and purity.
- Properties of antibodies and nanobodies often make immunoaffinity chromatography incompatible with conditions typical of many industrial-scale processes.
- The present invention features an antibody-mimetic scaffold, called a nanoCLAMP, that can be used in process-scale affinity chromatography.
- The 16 kD antibody mimetic is based on a bacterial, cysteine-free, β-sandwich protein with a structure analogous to immunoglobulin variable domains.
- The first generation of nanoCLAMPs generally showed high selectivity and affinity but also suffered from sensitivity to high temperature, digestion by proteases, and inactivation by alkali.
- The present invention solves this problem by engineering a plurality of mutations in the nanoCLAMP scaffold to improve the general robustness of nanoCLAMPs and resistance to extreme conditions.
- This mutated scaffold serves as the basis for an improved nanoCLAMP class, called the nC-B class.
- Phage display was used to generate hundreds of nC-B capture agents recognizing diverse targets.
- The resulting immunoaffinity capture agents typically had a Kd of < 80 nM, a Tm of > 70 °C, and a t1/2 in 0.1 mg/ml trypsin of > 20 hours.
- The nC-B capture agents also maintained their binding capacity and selectivity over 20 purification cycles, each including 10 minutes of cleaning in place with 0.1 M NaOH.
- Affinity chromatography resins made with nC-B capture agents supported efficient single-step purifications from crude mixtures.
- Target proteins could be eluted with either 3 M imidazole, pH 8 or 0.1 M sodium citrate, pH 2.5.
- Affinity chromatography resins with nC-B capture agents remained functional after exposure to 100% DMF and autoclaving.
- The robust nanoCLAMP scaffold described herein allows for the development of custom, high-performance affinity chromatography resins that are compatible with the harsh conditions of process-scale applications and adaptable to a wide diversity of target substrates.
- The scaffolds described herein are derived from the Carbohydrate Binding Module Family 32 (CBM32) protein domain of Clostridium perfringens hyaluronidase (NagH), a multi-domain enzyme consisting of 1627 amino acids.
- Amino acid residue 1 of SEQ ID NO: 1 corresponds to amino acid residue 807 of NagH, and amino acid residue 140 of SEQ ID NO: 1 corresponds to amino acid residue 946 of NagH.
- Amino acid positions and mutations described herein generally relate to the position on the corresponding full length NagH unless otherwise specified.
- The WT sequence of CBM32 is shown below:
- The protein scaffold does not retain carbohydrate binding activity, e.g., of the native CBM scaffold.
- Loop L1 corresponds to residues 817-820;
- loop L2 corresponds to residues 838-844; and
- loop L8 corresponds to residues 931-935.
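- The numbering convention above (residue 1 of SEQ ID NO: 1 is NagH residue 807, and residue 140 is NagH residue 946) amounts to a fixed offset of 806. As a minimal sketch, the helper names below are hypothetical and serve only to translate the stated loop boundaries between the two numbering schemes.

```python
NAGH_OFFSET = 806  # residue 1 of SEQ ID NO: 1 corresponds to NagH residue 807


def seq_to_nagh(pos):
    """Map a 1-based position in SEQ ID NO: 1 to full-length NagH numbering."""
    return pos + NAGH_OFFSET


def nagh_to_seq(pos):
    """Map a NagH residue number back to a 1-based position in SEQ ID NO: 1."""
    return pos - NAGH_OFFSET


# Variable loop boundaries in NagH numbering, as stated in the text.
LOOPS_NAGH = {"L1": (817, 820), "L2": (838, 844), "L8": (931, 935)}

loops_seq = {name: (nagh_to_seq(start), nagh_to_seq(end))
             for name, (start, end) in LOOPS_NAGH.items()}
loop_lengths = {name: end - start + 1
                for name, (start, end) in LOOPS_NAGH.items()}

print(loops_seq)
print(loop_lengths)
```

- The resulting loop lengths (4, 7, and 5 residues for L1, L2, and L8) match the loop lengths recited for these loops elsewhere in this description.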
- The original scaffold (nC-A) is shown below and contains a single M929L mutation relative to SEQ ID NO: 1.
- X denotes a variable loop residue, and each X may independently be any residue.
- nC-A Scaffold sequence SEQ ID NO: 2
- SEQ ID NO: 2 NPSLIRSESWXXXXGNEANLLDGDDNTGVWYXXXXXXXSLAGEF IGLDLGKEIKLDGIRFVIGKNGGGS SDKWNKFK
- The current scaffold (nC-B) described herein is based on the exemplary scaffold of SEQ ID NO: 3 shown below.
- X denotes a variable loop residue, and each X may independently be any residue.
- nC-B Scaffold sequence full length (SEQ ID NO: 3) DPTLIHTPGWXXXGSEADLLDGDDSTGVEYXXXXXXSLAGEF IGLDLGEWEVGGIHFVIGADGGGS SDKWTRFR LEYSLDGESWTT IREYDHTGAPAGQDVIDEDFETP I SAQYIRLTNLEXXXXXLTFSEFAIVSDELE
- The protein scaffolds described herein include (e.g., consist of) framework regions (F) and loop regions (L).
- The scaffolds generally have the structure of:
- F1-F9 correspond to framework regions 1-9
- L1-L8 correspond to loop regions 1-8.
- Framework regions and loop regions were selected based on where beta strands turn to loops or where beta strands show a sharp turn from the plane of the strand’s beta sheet (see FIGS. 2 and 16).
- The N- and C-termini of the scaffold, A and B, may each independently be present (e.g., contain one or more amino acids) or absent.
- F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 4;
- L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 5;
- L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 6;
- L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 7;
- L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 8;
- L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 9;
- L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 10;
- L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 11;
- L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 12.
- a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof includes a sequence having, for example, one insertion, two insertions, one deletion, two deletions, one substitution mutation, two substitution mutations, one insertion and one deletion, one insertion and one substitution mutation, or one deletion and one substitution mutation.
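- The nine combinations listed above are every way of choosing one or two edits from three edit types when order does not matter. A minimal sketch that enumerates them follows; the function name is hypothetical and the code is only an illustration of the counting.

```python
from itertools import combinations_with_replacement

EDIT_TYPES = ("insertion", "deletion", "substitution")


def edit_combinations(max_edits=2):
    """List every unordered selection of 1..max_edits edits from EDIT_TYPES."""
    combos = []
    for n in range(1, max_edits + 1):
        combos.extend(combinations_with_replacement(EDIT_TYPES, n))
    return combos


combos = edit_combinations()
for combo in combos:
    print(" + ".join(combo))
print(len(combos))  # 3 single edits plus 6 pairs
```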
- The protein scaffold includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X relative to SEQ ID NO: 1, wherein X is any amino acid.
- The protein scaffold includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X1, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X2, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X3 relative to SEQ ID NO: 1, wherein:
- X is any amino acid except the amino acid in the equivalent position in SEQ ID NO: 1;
- X1 is any amino acid except R or S;
- X2 is any amino acid except P or K; and
- X3 is any amino acid except R or K.
- F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 4;
- F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 5;
- F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 6;
- F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 7;
- F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 8;
- F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 9;
- F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 10;
- F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 11; and
- F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 12.
- F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4);
- F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5);
- F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6);
- F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7);
- F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8);
- F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9);
- F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10);
- F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11); and
- F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12).
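- The degenerate framework definitions above, in which alternatives at a position are written as (T/S), map directly onto regular-expression character classes. The sketch below is an illustrative assumption rather than a method described in this document: it converts each motif to a regex and checks that the exemplary framework sequences of SEQ ID NOs: 22-30 are instances of SEQ ID NOs: 4-12. The function names are hypothetical.

```python
import re


def motif_to_regex(motif):
    """Turn e.g. '(T/S)-LI-(H/R)' into the regex '[TS]LI[HR]'."""
    parts = []
    for token in motif.split("-"):
        if token.startswith("(") and token.endswith(")"):
            # Alternatives at one position become a character class.
            parts.append("[" + token[1:-1].replace("/", "") + "]")
        else:
            parts.append(token)  # fixed residues are kept literally
    return "".join(parts)


def matches(motif, sequence):
    """True if the whole sequence is an instance of the degenerate motif."""
    return re.fullmatch(motif_to_regex(motif), sequence) is not None


FRAMEWORK_MOTIFS = {  # SEQ ID NOs: 4-12
    "F1": "(T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W",
    "F2": "G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y",
    "F3": "S-(L/V)-AGEFIGLDLG",
    "F4": "G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N)",
    "F5": "DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS",
    "F6": "WTTI-(R/K/H/Q)-EYD-(H/K/R/Q)",
    "F7": "(Q/K)-DVI-(D/E)-E-(D/S)-F",
    "F8": "(Q/K/R)-YIRLTNLE",
    "F9": "LTFSEFA-(I/V)-VS",
}

EXEMPLARY = {  # SEQ ID NOs: 22-30
    "F1": "TLIHTPGW", "F2": "GSEADLLDGDDSTGVEY", "F3": "SLAGEFIGLDLG",
    "F4": "GIHFVIGAD", "F5": "DKWTRFRLEYS", "F6": "WTTIREYDH",
    "F7": "QDVIDEDF", "F8": "QYIRLTNLE", "F9": "LTFSEFAIVS",
}

for name, motif in FRAMEWORK_MOTIFS.items():
    print(name, motif_to_regex(motif), matches(motif, EXEMPLARY[name]))
```

- The check uses a full match because each framework region is compared on its own; searching for a motif inside a longer scaffold sequence would use `re.search` instead.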
- The protein scaffold includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X relative to SEQ ID NO: 1, wherein X is any amino acid.
- The protein scaffold includes at least one mutation selected from the group consisting of N807D, S809T, R812H, S813T, E814P, S815G, D818V, N822S, N825D, N832S, W836E, K857E, E858V, I859V, K860E, L861V, D862G, R865H, K870A, N871D, N880T, K881R, K883R, N890G, K897R, K901H, K908Q, E912D, S914D, and K922Q relative to SEQ ID NO: 1.
- The at least one mutation is K870X and/or N890X. In some embodiments, the at least one mutation is K870A and/or N890G. In some embodiments, the at least one mutation is K870A. In some embodiments, the at least one mutation is N890G.
- The protein scaffold includes at least 3 fewer lysines relative to SEQ ID NO: 1.
- The protein scaffold includes at least 3, 4, 5, 6, 7, 8, 9, or 10 fewer lysines relative to SEQ ID NO: 1.
- The protein scaffold includes at least 6 fewer lysines relative to SEQ ID NO: 1.
- The protein scaffold includes 9 fewer lysines relative to SEQ ID NO: 1.
- The protein scaffold does not include any lysines.
- The protein scaffold includes at least 3 fewer asparagines relative to SEQ ID NO: 1.
- The protein scaffold includes at least 3, 4, 5, 6, 7, or 8 fewer asparagines relative to SEQ ID NO: 1.
- The protein scaffold includes at least 5 fewer asparagines relative to SEQ ID NO: 1.
- The protein scaffold includes 7 fewer asparagines relative to SEQ ID NO: 1.
- The protein scaffold does not include any asparagines.
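- The lysine and asparagine counts above can be cross-checked against the specific substitution list (N807D through K922Q). The sketch below is illustrative only, with hypothetical helper names: it parses each mutation code and tallies the net change in a given residue type. It considers only the listed positions, so variable loop residues are outside its scope.

```python
import re

# The 30 specific substitutions listed relative to SEQ ID NO: 1.
MUTATIONS = (
    "N807D S809T R812H S813T E814P S815G D818V N822S N825D N832S "
    "W836E K857E E858V I859V K860E L861V D862G R865H K870A N871D "
    "N880T K881R K883R N890G K897R K901H K908Q E912D S914D K922Q"
).split()


def parse_mutation(code):
    """Split a code such as 'K870A' into (wild type, position, replacement)."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if m is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return m.group(1), int(m.group(2)), m.group(3)


def net_change(mutations, residue):
    """Residues of one type gained minus residues of that type lost."""
    parsed = [parse_mutation(code) for code in mutations]
    lost = sum(1 for wt, _, _ in parsed if wt == residue)
    gained = sum(1 for _, _, new in parsed if new == residue)
    return gained - lost


print(len(MUTATIONS))
print(net_change(MUTATIONS, "K"))  # net change in lysines
print(net_change(MUTATIONS, "N"))  # net change in asparagines
```

- Applying the full list removes nine lysines and seven asparagines without introducing any, which is consistent with the embodiments reciting 9 fewer lysines and 7 fewer asparagines relative to SEQ ID NO: 1.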
- A and B are each, independently, absent or at least one amino acid.
- Each of A and B may independently be absent.
- A and B are each, independently, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1,000 or more amino acids.
- A and B are each, independently, from 0 to 1,000 amino acids, e.g., from 1 to 10 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids), from 10 to 100 amino acids (e.g., 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 amino acids), or from 100 to 1,000 amino acids (e.g., 100, 200, 300, 400, 500, 600, 700, 800, 900, or
- A and B are each, independently, absent or from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids).
- Each of L1-L8 is, independently, from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids).
- Each of L1-L8 is, independently, from 1 amino acid to 10 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids). In some embodiments, each of L1-L8 is, independently, from 3 amino acids to 10 amino acids. In some embodiments, each of L1-L8 is, independently, from 3 amino acids to 8 amino acids.
- L1 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
- L1 is from 0 to 5 amino acids (e.g., from 1 to 5 amino acids, e.g., 0, 1, 2, 3, 4, or 5 amino acids).
- L2 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L2 is from 1 amino acid to 16 amino acids (e.g., from 4 to 16 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 amino acids).
- L3 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L3 is 6 amino acids.
- L4 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L4 is from 0 to 5 amino acids (e.g., from 1 to 5 amino acids, e.g., 0, 1, 2, 3, 4, or 5 amino acids).
- L5 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L5 is 5 amino acids.
- L6 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L6 is from 3 to 6 amino acids (e.g., 3, 4, 5, or 6 amino acids).
- L7 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L7 is 4 or 5 amino acids.
- L8 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L8 is from 4 to 6 amino acids (e.g., 4, 5, or 6 amino acids).
- L1 is 4 amino acids. In some embodiments, L2 is 7 amino acids. In some embodiments, L8 is 5 amino acids. In some embodiments, L1 is 4 amino acids, L2 is 7 amino acids, and/or L8 is 5 amino acids. In some embodiments, L1 is 4 amino acids, L2 is 7 amino acids, and L8 is 5 amino acids.
- L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid.
- X2 is V.
- L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid.
- L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid.
- L4 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid.
- L6 includes the sequence of: X1X2X3X4X5X6 (SEQ ID NO: 16), wherein each of X1-X6 is, independently, any amino acid.
- L8 includes at least two amino acids. In some embodiments, L8 includes at least one amino acid.
- L4 includes the sequence of: (G/D)-GGSS (SEQ ID NO: 17) or GDT or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 17 or GDT.
- L6 includes the sequence of TGAPAG (SEQ ID NO: 18) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 18.
- L4 includes the sequence of: (G/D)-GGSS (SEQ ID NO: 17) or GDT; and L6 includes the sequence of TGAPAG (SEQ ID NO: 18).
- L3 includes the sequence of: (E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D) (SEQ ID NO: 19) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 19.
- L5 includes the sequence of: LD-(G/N)-(E/S)-S (SEQ ID NO: 20) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 20.
- L7 includes at least one amino acid.
- L7 includes the sequence of ETPI-(S/E)-A (SEQ ID NO: 21) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 21.
- L3 includes the sequence of: (E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D) (SEQ ID NO: 19) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 19;
- L5 includes the sequence of: LD-(G/N)-(E/S)-S (SEQ ID NO: 20) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 20;
- L7 includes the sequence of ETPI-(S/E)-A (SEQ ID NO: 21 ) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations)
- A includes the sequence of (D/N/H)-P. In some embodiments, A includes the sequence of DP.
- B includes the sequence of DELE (SEQ ID NO: 35).
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 22;
- L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 23;
- L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 24;
- L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 25;
- L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 26;
- L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 27;
- L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 28;
- L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 29;
- L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 30.
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
- L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
- L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
- L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 1 1 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- amino acid e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 1 1 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
- L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 1 1 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
- L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 1 1 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
- L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 1 1 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
- L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 1 1 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
- L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 1 1 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
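The F1-L1-...-L8-F9 arrangement listed above can be sketched as a regular expression. This is only an illustration, assuming the nine framework sequences appear exactly as written and each loop is 0 to 20 standard amino acids; the names `FRAMEWORKS`, `LOOP`, `extract_loops`, and the example polypeptide (including its L1, L2, and L8 loop sequences) are hypothetical and not taken from the source.

```python
import re

# Framework sequences F1-F9 as listed above (SEQ ID NOs: 22-30).
FRAMEWORKS = [
    "TLIHTPGW", "GSEADLLDGDDSTGVEY", "SLAGEFIGLDLG", "GIHFVIGAD",
    "DKWTRFRLEYS", "WTTIREYDH", "QDVIDEDF", "QYIRLTNLE", "LTFSEFAIVS",
]

# Each loop L1-L8 is absent or 1 to 20 residues (20 standard amino acids assumed).
LOOP = "([ACDEFGHIKLMNPQRSTVWY]{0,20})"

# Joining the nine frameworks with the loop pattern yields eight capture groups.
SCAFFOLD_RE = re.compile(LOOP.join(FRAMEWORKS))


def extract_loops(polypeptide):
    """Return the eight loop sequences if F1-F9 occur in order, else None."""
    match = SCAFFOLD_RE.search(polypeptide)
    return list(match.groups()) if match else None


# Hypothetical example: loops L3-L7 reuse sequences named elsewhere in the text;
# the L1, L2, and L8 sequences are made up for illustration.
loops = ["AVAA", "GSGSGSG", "EVVEVG", "GGGSS", "LDGES", "TGAPAG", "ETPISA", "GSGSG"]
example = "DP" + "".join(f + l for f, l in zip(FRAMEWORKS, loops + [""]))
print(extract_loops(example))
```

An exact-match pattern like this does not cover the variants with insertions, deletions, or substitutions described in other embodiments.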
- F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 22;
- L1 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 23;
- L2 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 24;
- L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 31;
- F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 25;
- L4 comprises the sequence of: GGGSS (SEQ ID NO: 32) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 32;
- F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 26;
- L5 comprises the sequence of: LDGES (SEQ ID NO: 33) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 33;
- F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 27;
- L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 18;
- F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 28;
- L7 comprises the sequence of: ETPISA (SEQ ID NO: 34) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 34;
- F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 29;
- L8 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 30.
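One way to read "one or two amino acid insertions, deletions, substitution mutations, or a combination thereof" computationally is as a Levenshtein edit distance of at most 2 from the reference sequence. That reading is an assumption made for this sketch, not a definition given in the text; the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-residue insertions,
    deletions, or substitutions needed to turn sequence a into sequence b."""
    previous = list(range(len(b) + 1))
    for i, residue_a in enumerate(a, start=1):
        current = [i]
        for j, residue_b in enumerate(b, start=1):
            cost = 0 if residue_a == residue_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def within_two_edits(candidate, reference):
    """True if candidate differs from reference by at most two edits."""
    return edit_distance(candidate, reference) <= 2


F1 = "TLIHTPGW"  # SEQ ID NO: 22
print(within_two_edits("SLIRTPGW", F1))  # two substitutions
print(within_two_edits("SLIRSPGW", F1))  # three substitutions
```

The check compares a whole candidate region against the reference; because the regions are recited with "comprises", a stricter or looser comparison may be appropriate in practice.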
- F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 22;
- L1 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 23;
- L2 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 24;
- L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 31;
- F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 25;
- L4 comprises the sequence of: GGGSS (SEQ ID NO: 32) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 32;
- F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 26;
- L5 comprises the sequence of: LDGES (SEQ ID NO: 33) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 33;
- F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 27;
- L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 18;
- F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 28;
- L7 comprises the sequence of: ETPISA (SEQ ID NO: 34) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 34;
- F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 29;
- L8 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 30.
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
- L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
- L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
- L3 includes the sequence of: EVVEVG (SEQ ID NO: 31);
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
- L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
- L5 includes the sequence of: LDGES (SEQ ID NO: 33);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
- L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
- L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
- L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
- L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid;
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
- L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid;
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
- L3 includes the sequence of: EVVEVG (SEQ ID NO: 31);
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
- L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
- L5 includes the sequence of: LDGES (SEQ ID NO: 33);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
- L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
- L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
- L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid; and
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
- A includes the sequence of: DP;
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
- L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid;
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
- L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid;
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
- L3 includes the sequence of: EVVEVG (SEQ ID NO: 31);
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
- L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
- L5 includes the sequence of: LDGES (SEQ ID NO: 33);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
- L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
- L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
- L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid;
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30);
- DELE SEQ ID NO: 35.
- L1 includes the sequence of X1X2X3X4 (SEQ ID NO: 13), wherein each of X1, X3, and X4 is, independently, any amino acid, and X2 is V.
- a protein scaffold that includes a polypeptide having at least 80% (e.g., at least 85%, 90%, 95%, 97%, or 99%) sequence identity to SEQ ID NO: 3.
- the polypeptide includes the sequence of SEQ ID NO: 3.
- the polypeptide does not include the sequence of SEQ ID NO: 1.
- the polypeptide does not include the sequence of SEQ ID NO: 2.
- polypeptide having at least 85% (e.g., at least 90%, 95%, 97%, 99%, or 100%) sequence identity to a polypeptide of Table 9 or Table 10.
- polypeptide includes a sequence as set forth in Table 9 or Table 10.
- the polypeptide includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X1, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X2, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X3 relative to SEQ ID NO: 1, wherein:
- X is any amino acid except the amino acid in the equivalent position in SEQ ID NO: 1;
- X1 is any amino acid except R or S;
- X2 is any amino acid except P or K; and
- X3 is any amino acid except R or K.
- the protein scaffolds described herein, which contain 9 framework regions (i.e., F1-F9), may be optimized or swapped according to established biophysical techniques. Accordingly, the invention also features protein scaffolds containing 7 out of 9 or 8 out of 9 framework regions described herein. Based on the detailed structural analysis of the scaffold known in the art (see, e.g., Ficko-Blean et al. J. Mol. Bio.
- beta strands or a portion thereof (e.g., more than 2 residues of a given framework region, e.g., any one of F1-F9) of the core protein fold, while still maintaining structural integrity of the overall scaffold.
- the phage library could then be selected for library members that are thermostable and maintain binding to a target by selecting the library for those members that can withstand an incubation at >55°C without aggregation and for those members that can bind to an immobilized target.
- the isolated clones with these properties would represent scaffolds with a swapped out beta strand or portion thereof.
- the invention also contemplates protein scaffolds having at least 7, e.g., at least 8, of the following framework regions, wherein 7 of the 9 or 8 of the 9 framework regions have the following sequence:
- F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 4;
- F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 5;
- F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 6;
- F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 7;
- F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 8;
- F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 9;
- F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 10;
- F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 11; and
- F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 12.
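The degenerate notation used for SEQ ID NOs: 4-12, in which alternatives at a position are written as (A/B/C) and segments are joined by hyphens, maps naturally onto regular-expression character classes. The sketch below assumes that reading of the notation; `motif_to_regex` is a hypothetical helper name.

```python
import re


def motif_to_regex(motif):
    """Convert notation such as '(T/S)-LI-(H/R)' into a compiled regex.

    Parenthesized, slash-separated alternatives become a character class;
    all other hyphen-separated tokens are kept as literal residues.
    """
    parts = []
    for token in motif.split("-"):
        if token.startswith("(") and token.endswith(")"):
            parts.append("[" + token[1:-1].replace("/", "") + "]")
        else:
            parts.append(token)
    return re.compile("".join(parts))


F1_MOTIF = "(T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W"  # SEQ ID NO: 4
F6_MOTIF = "WTTI-(R/K/H/Q)-EYD-(H/K/R/Q)"        # SEQ ID NO: 9

f1_regex = motif_to_regex(F1_MOTIF)
print(f1_regex.pattern)
print(bool(f1_regex.fullmatch("TLIHTPGW")))  # SEQ ID NO: 22 fits the F1 motif
```

The specific framework sequences of SEQ ID NOs: 22-30 are each one member of the corresponding degenerate family, which a check like this can confirm.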
- the invention also contemplates protein scaffolds having at least 7, e.g., at least 8, of the following framework regions, wherein 7 of the 9 or 8 of the 9 framework regions have the following sequence:
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 22;
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 23;
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 24;
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 25;
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 26;
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 27;
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 28;
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 29; and
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 30.
- the invention also contemplates protein scaffolds having at least 7, e.g., at least 8, of the following framework regions, wherein 7 of the 9 or 8 of the 9 framework regions have the following sequence:
- F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
- F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
- F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
- F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
- F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
- F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
- F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
- F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
- F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
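The "at least 7 of 9" condition can be illustrated by counting how many of the listed framework sequences occur, in order, in a candidate polypeptide. This is a simplified sketch using exact substring matching; the function names, the "GG" linkers, and the poly-alanine placeholder strand are hypothetical and not from the source.

```python
FRAMEWORKS = {
    "F1": "TLIHTPGW",           # SEQ ID NO: 22
    "F2": "GSEADLLDGDDSTGVEY",  # SEQ ID NO: 23
    "F3": "SLAGEFIGLDLG",       # SEQ ID NO: 24
    "F4": "GIHFVIGAD",          # SEQ ID NO: 25
    "F5": "DKWTRFRLEYS",        # SEQ ID NO: 26
    "F6": "WTTIREYDH",          # SEQ ID NO: 27
    "F7": "QDVIDEDF",           # SEQ ID NO: 28
    "F8": "QYIRLTNLE",          # SEQ ID NO: 29
    "F9": "LTFSEFAIVS",         # SEQ ID NO: 30
}


def frameworks_present(polypeptide):
    """Return the names of framework regions found in F1-to-F9 order."""
    found = []
    position = 0
    for name, sequence in FRAMEWORKS.items():
        index = polypeptide.find(sequence, position)
        if index != -1:
            found.append(name)
            position = index + len(sequence)  # later regions must follow this one
    return found


def has_minimum_frameworks(polypeptide, minimum=7):
    """True if at least `minimum` of the nine framework regions are present."""
    return len(frameworks_present(polypeptide)) >= minimum


# Hypothetical candidate: F4 swapped for a placeholder strand, "GG" as linkers.
parts = dict(FRAMEWORKS, F4="AAAAAAAAA")
example = "GG".join(parts.values())
print(frameworks_present(example))
```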
- a protein scaffold that includes a polypeptide having at least 80% (e.g., at least 85%, 90%, 95%, 97%, or 99%) sequence identity to the framework regions (F1-F9) over the region of alignment corresponding to F1-F9 of the reference sequence (e.g., SEQ ID NO: 3).
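As a rough illustration of a percent-identity threshold, the sketch below computes identity over two sequences that have already been aligned, with "-" marking gaps. The choice of denominator (all alignment columns) is an assumption; identity definitions vary between tools, and this section does not state which one applies. `percent_identity` is a hypothetical name.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned to equal length, with '-' marking gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# F1 (SEQ ID NO: 22) against a single-substitution variant: 7 of 8 positions match.
print(percent_identity("TLIHTPGW", "SLIHTPGW"))
```

For full-length scaffolds the alignment itself would come from a dedicated alignment tool; only the final counting step is shown here.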
- the protein scaffold includes one or more non-natural amino acids.
- one or more of the framework regions includes a non-natural amino acid.
- one or more of the loop regions includes a non-natural amino acid.
- the protein scaffolds described herein may lack native cysteine residues. Accordingly, the scaffold may be mutagenized to introduce one or more cysteine residues into the scaffold (e.g., in one or more loop or framework regions). When two or more cysteine residues are introduced into the scaffold at nearby sites, the two cysteine residues may form a disulfide bridge, e.g., under oxidizing conditions. In some embodiments, the disulfide bridge enhances thermal stability of the protein scaffold.
- the protein scaffold includes a mutation that adds a cysteine residue. In some embodiments, the protein scaffold includes a first mutation that adds a first cysteine residue and a second mutation that adds a second cysteine residue. In some embodiments, the first cysteine residue and the second cysteine residue form a disulfide bond under oxidizing conditions.
- the protein scaffold comprises at least one mutation selected from the group consisting of F806C, P808C, S845C, L855C, V858C, V861C, K878C, W879C, L884C, L888C, A904C, P905C, A906GC, G907C, I924C, L926C, N928C, L936C, I943C, L948C.
- the protein scaffold comprises two or more mutations selected from the group consisting of F806C, P808C, S845C, L855C, V858C, V861C, K878C, W879C, L884C, L888C, A904C, P905C, A906GC, G907C, I924C, L926C, N928C, L936C, I943C, L948C.
- the protein scaffold comprises a pair of cysteine mutations selected from the group consisting of K878C and G907C, K878C and A904C, V861C and I943C, P905C and L855C, S845C and L936C, W879C and N928C, L884C and L926C, F806C and L948C, V858C and L888C, K878C and G907C, K878C and A906GC, S845C and N928C, K878C and A904C, P808C and I943C, V861C and I924C, P808C and V861C, and I943C and L855C.
- the pair of cysteine mutations is selected from the group consisting of K878C and G907C, K878C and A904C, S845C and L936C, W879C and N928C, W879C and N928C, L884C and L926C, V858C and L888C, K878C and G907C, and K878C and A906GC.
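Mutation labels such as K878C follow the common convention of original residue, position, replacement. A minimal sketch of parsing such a label and applying it to a sequence is shown below. The four-residue sequence and its starting position are made up for illustration (SEQ ID NO: 1 is not reproduced in this section), and the function names are hypothetical.

```python
import re

# Original residue, position number, then one or more replacement letters.
# Multi-letter replacements (e.g., the label A906GC in the text) are passed
# through as written, without interpretation.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z]+)$")


def parse_mutation(label):
    """Split a label such as 'K878C' into ('K', 878, 'C')."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_mutation(sequence, label, first_position):
    """Apply one mutation; first_position is the number of sequence[0]."""
    original, position, replacement = parse_mutation(label)
    index = position - first_position
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != original:
        raise ValueError(f"expected {original} at {position}, found {sequence[index]}")
    return sequence[:index] + replacement + sequence[index + 1:]


# Hypothetical fragment numbered from 806: F806, A807, P808, A809.
toy = "FAPA"
double_mutant = apply_mutation(apply_mutation(toy, "F806C", 806), "P808C", 806)
print(double_mutant)
```

Checking the original residue before substituting guards against numbering mismatches, which matter here because the positions are given relative to a reference sequence rather than to the scaffold's own first residue.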
- the protein scaffolds described herein may further include a tag.
- a tag may provide for ease of purification, detection or attachment of the protein scaffold.
- the tag may be covalently attached to the scaffold.
- A and/or B of the scaffold is or includes a tag.
- the tag is an affinity tag (e.g., a polyhistidine tag, e.g., 4, 5, 6, 7, 8, 9, or 10 histidines, e.g., Gly-His tags, e.g., AviTag, e.g., Calmodulin-tag, e.g., polyglutamate tag, e.g., polyarginine tag, e.g., SBP-tag).
- the tag is an epitope tag (e.g., ALFA-tag, C-tag, iCapTag, E-tag, FLAG-tag, HA-tag, Myc-tag, NE-tag, Rho1D4-tag, S-tag, Softag 1, Softag 3, Spot-tag, T7-tag, TC tag, Ty tag, V6 tag, VSV-tag or Xpress tag).
- the tag is a covalent protein tag (e.g., Isopeptag, SpyTag, SnoopTag, DogTag or SdyTag).
- the tag is a protein tag (e.g., biotin carboxyl carrier protein tag, glutathione-S-transferase (GST) tag, green fluorescent protein (GFP) tag, HaloTag, SNAP-tag, CLIP-tag, HUH-tag, maltose binding protein tag, Nus tag, thioredoxin tag, Fc tag, Designed Intrinsically Disordered tag, CRDSAT tag, SpyCatcher, SnoopCatcher, DogCatcher, SdyCatcher, or SUMO-tag).
- A and/or B of the scaffold includes an affinity tag, epitope tag, covalent peptide tag, or protein tag.
- the tag is attached to the N-terminus or the C-terminus of the scaffold.
- the scaffold is conjugated to a functional group.
- the functional group includes biotin, streptavidin or a derivative of streptavidin, a polyethylene glycol moiety, a fluorescent dye, an enzyme, a radioactive moiety, a lanthanide, or a lanthanide binding motif.
- the scaffold is conjugated to a lanthanide or a lanthanide binding motif.
- the lanthanide is terbium.
- the scaffold is conjugated to a radioactive moiety.
- the radioactive moiety is an α or β emitter.
- the functional group is conjugated to a sulfhydryl group or a primary amine (e.g., on a cysteine residue or a lysine).
- the protein scaffolds described herein may be encoded by a polynucleotide.
- the polynucleotide is a ribonucleotide.
- the polynucleotide is a deoxyribonucleotide.
- a vector that includes a polynucleotide encoding the protein scaffold.
- a cell that includes a polynucleotide encoding the protein scaffold or a vector that includes the polynucleotide.
- the polynucleotide or vector may include an expression element configured to drive expression of the protein scaffold.
- the cell may be a prokaryotic cell (e.g., E. coli).
- the cell may be a eukaryotic cell.
- the eukaryotic cell is a yeast cell (e.g., S. cerevisiae) or a mammalian cell (e.g., a Chinese hamster ovary (CHO) cell).
- the protein scaffold is secreted by the cell.
- the protein scaffold is expressed within the cell.
- Also featured herein is a method of producing a protein scaffold as described herein.
- the method includes the steps of providing a cell transformed with a polynucleotide encoding the protein scaffold or a vector that includes the polynucleotide and culturing the transformed cell under conditions for expressing the polynucleotide.
- the culturing step results in expression of the protein scaffold.
- the method may further include isolating the protein scaffold or using the protein scaffold to bind a target.
- the protein scaffolds described herein may be conjugated to a particle.
- the particle is a magnetic particle.
- a resin or monolith that includes a plurality of the particles, e.g., containing the protein scaffold.
- a column (e.g., a chromatography column).
- the scaffolds and methods of use thereof can use a surface linked to the protein scaffold, which is configured to bind its target.
- the surface of the resin refers to a part of a support structure (e.g., a substrate) that is accessible to contact with one or more target molecules.
- the shape, form, materials, and modifications of the surface of the resin can be selected from a range of options depending on the application.
- the surface of the resin is SEPHAROSE®.
- the surface of the resin is agarose.
- the surface of the resin can be substantially flat or planar. Alternatively, the surface of the resin can be rounded or contoured. Exemplary contours that can be included on a surface of the resin are wells, depressions, pillars, ridges, channels or the like.
- the surface of the resin is modified to contain channels, patterns, layers, or other configurations (e.g., a patterned surface).
- the surface can be in the form of a bead, box, column, cylinder, disc, dish (e.g., glass dish, PETRI dish), fiber, film, filter, microtiter plate (e.g., 96-well microtiter plate), multi-bladed stick, net, pellet, plate, ring, rod, roll, sheet, slide, stick, tray, tube, or vial.
- the surface can be a singular discrete body (e.g., a single tube, a single bead), any number of a plurality of surface bodies (e.g., a rack of 10 tubes, several beads), or combinations thereof (e.g., a tray that includes a plurality of microtiter plates, a column filled with beads, a microtiter plate filled with beads).
- a surface can include a membrane-based resin matrix.
- the surface of the resin includes a porous resin or a non-porous resin.
- porous resins can include additional agarose-based resins (e.g., cyanogen bromide activated SEPHAROSE® (GE); WORKBEADS™ 40 ACT and WORKBEADS™ 40/10000 ACT (Bioworks)), methacrylate (Tosoh 650M derivatives, etc.), polystyrene divinylbenzene (Life Tech Poros media/GE Source media), fractogel, polyacrylamide, silica, controlled pore glass, dextran derivatives, acrylamide derivatives, convective-interaction media (Sartorius), additional polymers, and combinations thereof.
- a surface can include one or more pores.
- pore sizes can be from 300 to 8,000 Angstroms, e.g., 500 to 4,000 Angstroms in size.
- a resin as described herein includes a plurality of particles.
- particle sizes are 5 µm to 500 µm, 20 µm to 300 µm, and 50 µm to 200 µm.
- particle size can be 50 µm, 60 µm, 70 µm, 80 µm, 90 µm, 100 µm, 110 µm, 120 µm, 130 µm, 140 µm, 150 µm, 160 µm, 170 µm, 180 µm, 190 µm, or 200 µm.
- a protein scaffold can be immobilized, coated on, bound to, stuck, adhered, or attached to any of the forms of surfaces described herein (e.g., bead, box, column, cylinder, disc, dish (e.g., glass dish, PETRI dish), fiber, film, filter, microtiter plate (e.g., 96-well microtiter plate), multi-bladed stick, net, pellet, plate, ring, rod, roll, sheet, slide, stick, tray, tube, or vial).
- a method of purifying a target molecule (e.g., from a plurality of molecules, e.g., from a crude lysate).
- the method includes providing a sample that includes a mixture of the target molecule and the plurality of molecules and contacting the sample with a protein scaffold as described herein.
- the protein scaffold may have previously been generated with loop regions that are specific to the desired target.
- the scaffold e.g., the loop regions of the scaffold
- the method further includes separating the target molecule bound to the protein scaffold from the plurality of molecules.
- the step of separating includes immobilizing the protein scaffold.
- the protein scaffold is conjugated to a particle.
- the particle includes a magnetic bead.
- the protein scaffold is conjugated to a resin or monolith as described herein.
- the scaffolds, particles, resins, and columns described herein are amenable to single-step purifications from crude mixtures.
- target proteins may be eluted with polyol, imidazole (e.g., 3 M imidazole, e.g., pH 8), or sodium citrate (e.g., 0.1 M sodium citrate, e.g., pH 2.5).
- the scaffolds, particles, resins, and columns may be cleaned, e.g., with an alkaline substance, e.g., NaOH, e.g., 0.1 M NaOH.
- resins or columns made with a protein scaffold as described herein may remain functional after exposure to dimethylformamide (DMF), e.g., 100% DMF and/or autoclaving.
- Example 1 Protein engineering of nanoCLAMP antibody-mimetics for use as affinity chromatography capture agents resistant to high temperature, trypsin, low pH, organic solvent, and sodium hydroxide
- Immunoaffinity chromatography is an established laboratory-scale technique for the isolation of target proteins with high yield and purity.
- properties of antibodies and nanobodies often make immunoaffinity chromatography incompatible with conditions typical of many industrial scale processes.
- a nanoCLAMP an antibody-mimetic
- the 16 kD antibody mimetic is based on a bacterial, cysteine-free, β-sandwich protein with a structure analogous to immunoglobulin variable domains.
- the first generation of nanoCLAMPs generally showed high selectivity and affinity but also suffered from sensitivity to high temperature, digestion by proteases, and inactivation by alkali.
- the resulting immunoaffinity capture agents typically had a Kd of ⁇ 80 nM, a Tm of > 70 °C, and a t1/2 in 0.1 mg/ml trypsin of > 20 hours.
- the nC-B capture agents also maintained their binding capacity and selectivity over 20 purification cycles, each including 10 minutes of cleaning in place with 0.1 M NaOH.
- Affinity chromatography resins made with nC-B capture agents supported efficient single-step purifications from crude mixtures. Target proteins could be eluted with either 3 M imidazole, pH 8 or 0.1 M sodium citrate, pH 2.5.
- affinity chromatography resins with nC-B capture agents remained functional after exposure to 100% DMF and autoclaving.
- the robust nC-B scaffold developed in this work enables the development of custom, high performance affinity chromatography resins compatible with the harsh conditions of process-scale applications.
- Affinity chromatography with target-specific, immobilized capture agents is an established method of protein purification.
- a capture agent such as a protein, nucleic acid, or small molecule
- a solid support which can then be used to isolate a protein of interest from a complex mixture.
- the technique has been widely used at the laboratory scale for single-step purifications of diverse target proteins, including enzymes, transcription factors, growth factors and antibodies.
- The use of protein-based capture agents for AC in industrial applications has been less widespread because currently available approaches are incompatible with the temperatures, pH extremes, and solvents often needed for process-scale purification or are useful for only a limited number of targets.
- An exception is the purification of kg-quantities of antibodies with AC resins based on Staphylococcal Protein A.
- the development of Protein A resins highlights the use of protein engineering to improve the robustness of AC resins as well as some remaining limitations.
- Early versions of resins with wild-type Protein A captured antibodies with high selectivity and capacity from cell culture media feedstock but lost activity gradually after multiple cycles of cleaning in place with sodium hydroxide. Mutagenesis of Protein A yielded variants with increased resistance to sodium hydroxide treatment and higher binding capacity. Despite the widespread use of Protein A resins, they are nonetheless limited to the purification of antibodies.
- the conjugation of the antibody to the resin often results in heterogeneous coupling because of a lack of precise control over the sites of conjugation.
- chromatography must be performed under oxidizing conditions in order to preserve the disulfide bonds essential for maintaining antibody structure.
- Elution of the target also usually requires low or high pH conditions that are incompatible with some target proteins.
- the chief limitation of immuno-AC resins is their sensitivity to the sodium hydroxide solutions that are preferred for cleaning-in-place procedures.
- antibody mimetics proteins that, like antibodies, can be produced to bind specific antigens with high affinity and specificity, but are not directly derived from the immune system of animals.
- antibody mimetics include those based on Protein A, gamma-b crystallin, ubiquitin, cystatin, lipocalins, ankyrin repeat motifs, SH3 domains, fibronectin, OB fold domains, lamprey variable lymphocyte receptors, minibodies, miniproteins, and Kunitz domains.
- Most of these antibody mimetics use animal-sparing phage display for their isolation and can be produced by microbial cells.
- nanoCLAMPs have the unusual and advantageous general property of releasing bound target protein in solutions of non-denaturing polyols and ammonium sulfate at neutral pH.
- Kd a variety of target proteins with Kd’s ranging from 1 to 100 nM before affinity maturation and from 10 to 1000 pM after affinity maturation (Suderman et al. 2017).
- Affinity chromatography media produced with nanoCLAMPs support single-step purifications to near homogeneity as assessed by Coomassie staining.
- the working binding capacity ranges from 5 to 200 nmol target protein per ml of packed beads.
- While these first generation nanoCLAMP resins have adequate selectivity and capacity for laboratory-scale purifications, they are suboptimal for process-scale purifications because of their moderate thermostability (Tm ranging from 45 to 60 °C), sensitivity to protease digestion (t1/2 ⁇ 1 h in 0.1 mg/ml trypsin), and moderate alkaline resistance (50% loss of activity after 12 cycles of incubation with 0.1 M NaOH).
- substitutions for asparagine was to increase the protein’s resistance to the alkaline solutions commonly used to sanitize industrial chromatography columns. Asparagine in certain contexts is susceptible to deamidation, and its removal has been shown to reduce the loss of protein binding activity in sodium hydroxide.
- SMT3-A1 consists of residues 807 to 946 of the 2nd Type 32 carbohydrate binding module of NagH from Clostridium perfringens with loops mutated in positions 817-820, 838-844, and 931-935 and selected for binding to yeast SUMO.
- nanoCLAMP amino acids based on the sequence of NagH.
- the volumes listed for Aggregated, Dimer, and Monomer SEC % correspond to elution volumes from a Superdex 75 SEC column.
- Rounds 1 through 4 focused on increasing Tm.
- Rounds 5 and 6 focused on removing potential protease cleavage sites while maintaining Tm and included many reversions to past rounds.
- Round 7 focused on removing remaining asparagines while maintaining Tm.
- mutations were tested for 58% of the amino acids in the constant regions (72 of 124 positions).
- the clone resulting from 7 rounds of mutagenesis is designated P2788.
- P2788 contains 30 mutations, representing approximately 24% of the positions in the constant regions, and includes a three-residue, C-terminal extension of residues from CBM32-2.
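- The coverage figures quoted above can be checked with a few lines of arithmetic. The sketch below only restates numbers given in the text (124 constant-region positions, 72 positions tested, 30 mutations in P2788); the variable and function names are illustrative.

```python
# Figures quoted in the text; names are illustrative only.
CONSTANT_POSITIONS = 124   # positions in the constant regions
POSITIONS_TESTED = 72      # positions at which mutations were tested
MUTATIONS_IN_P2788 = 30    # mutations present in clone P2788


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


tested_pct = percent(POSITIONS_TESTED, CONSTANT_POSITIONS)
mutated_pct = percent(MUTATIONS_IN_P2788, CONSTANT_POSITIONS)
print(f"positions tested: {tested_pct:.1f}%")   # about 58%
print(f"positions mutated: {mutated_pct:.1f}%")  # about 24%
```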
- the resulting mutations are broadly distributed throughout the primary sequence as shown in an alignment with the original sequence (FIG.
- the number of lysines and arginines representing potential trypsin cleavage sites was reduced from 11 in the constant regions of the starting protein (clone SMT3-A1) to 5 in the constant regions of the resulting protein (clone P2788).
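- Counting potential trypsin sites of this kind is easy to automate. The following is a minimal sketch that simply lists every lysine and arginine in a sequence; it ignores context effects (for example, a following proline), and the example fragment is hypothetical rather than a sequence from this document. The numbering offset of 807 is used only because the scaffold numbering starts there.

```python
def trypsin_site_positions(sequence: str, first_residue_number: int = 1) -> list[tuple[int, str]]:
    """Return (residue number, residue) for every Lys (K) or Arg (R) in the sequence."""
    return [
        (first_residue_number + offset, residue)
        for offset, residue in enumerate(sequence.upper())
        if residue in "KR"
    ]


# Hypothetical fragment for illustration; NOT a sequence from this document.
example_fragment = "GSKVTAERLNK"
sites = trypsin_site_positions(example_fragment, first_residue_number=807)
print(len(sites), sites)
```

- Running the same function over the constant regions of a starting and an engineered clone would give the before and after counts directly.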
- Three of the remaining arginines (R881, R897 and R925) are expected to be involved in salt bridges as identified by the ESBRI algorithm (Costantini et al. ESBRI: a web server for evaluating salt bridges in proteins. Bioinformation 3:137-138, 2008). For these residues, we were unable to identify any substitution mutations that did not destabilize the proteins.
- K883 substitution with arginine was beneficial, but several additional substitutions either resulted in a greater than 10 °C decrease in melting temperature or high initial fluorescence in DSF.
- for K878, which is universally conserved in the alignment, 10 of 10 substitution mutations resulted in proteins with high initial fluorescence in DSF.
- K878 Nζ forms hydrogen bonds with the carbonyl oxygens of P905 and G907 as determined by the RING 2.0 algorithm (Piovesan et al. Nucleic Acids Res 44:W367-374, 2016).
- N928 is universally conserved in the consensus alignment and buried in the 3D structure.
- N928 Nδ2 forms hydrogen bonds with the carbonyl oxygens of S845 and L846 as determined by the RING 2.0 algorithm.
- We chose not to attempt substitution mutations at N928 because of the low likelihood of deamidation based on its sequence context and the likely challenge of finding a substitution with a beneficial effect.
- nanoCLAMP-B, with the identifier “B” referring to the next variant of the original class of nanoCLAMPs.
- the first generation of nanoCLAMPs represented by SMT3-A1 and others is referred to as the “nC-A class” (nanoCLAMP-A with the identifier “A” referring to the first class of nanoCLAMPs).
- the nC-A class of nanoCLAMPs encompasses the first published nanoCLAMPs (Suderman et al. supra). Relative to NagH CpCBM32-2, nanoCLAMPs of the nC-A class have a M929L mutation that removes a methionine as well as amino acid differences in the variable loops.
- oligonucleotides constructed with phosphoramidite trimers were designed so that the variable regions encoded all amino acids except cysteine (omitted to avoid heterogeneous coupling to multiple cysteines), methionine (omitted to avoid the risk of inactivating oxidation), and lysine and arginine (omitted to avoid the addition of trypsin-cleavage sites).
- Position 818 was held constant with a valine, the wild-type amino acid, because valine and another small hydrophobic amino acid, isoleucine, appeared in one quarter of nanoCLAMPs from previous screens.
- the resulting library contained over 10^10 variants of nC-B nanoCLAMPs.
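- To put the library size in context, the theoretical sequence space of the randomized loops can be estimated. The sketch below assumes, as a simplification not stated explicitly in the text, that every loop position other than the fixed V818 is fully randomized over the 16 allowed amino acids; the loop lengths (4, 7, and 5 residues) follow the loop positions 817-820, 838-844, and 931-935.

```python
# Amino acids excluded from the trimer mixes per the text: Cys, Met, Lys, Arg.
EXCLUDED = set("CMKR")
ALLOWED = sorted(set("ACDEFGHIKLMNPQRSTVWY") - EXCLUDED)

# Loop lengths implied by the loop positions named in the text.
LOOP_LENGTHS = {"817-820": 4, "838-844": 7, "931-935": 5}
FIXED_POSITIONS = 1  # V818 is held constant

# Assumption: all remaining loop positions are fully randomized.
randomized_positions = sum(LOOP_LENGTHS.values()) - FIXED_POSITIONS
theoretical_diversity = len(ALLOWED) ** randomized_positions

LIBRARY_SIZE = 1e10  # lower bound quoted in the text
print(f"{len(ALLOWED)} residues over {randomized_positions} positions")
print(f"theoretical diversity: {theoretical_diversity:.2e}")
print(f"fraction sampled (at most): {LIBRARY_SIZE / theoretical_diversity:.1e}")
```

- Under these assumptions the physical library samples only a very small fraction of the possible loop sequences, which is typical for phage display libraries of this design.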
- nanoCLAMP SUMO binders were subcloned into a bacterial expression vector, expressed, purified by immobilized metal affinity chromatography (IMAC), and confirmed to be over 90% pure as estimated by SDS-PAGE (data not shown).
- the purified nanoCLAMPs were then screened for monodispersity by size exclusion chromatography and for Tm by DSF. Of the 18, seven (38%) were over 90% monomer. Of these, five had a melting temperature of greater than 73 °C, with four having melting temperatures greater than 99 °C (Table 4).
- the initial results with the SUMO test case suggest that the nC-B constant regions generally support the isolation of clones with high monodispersity and thermostability.
- nC-B nanoCLAMPs P2808, P2809, and P2811. These were selected to provide a diverse sample of binding loops. For the three clones, Loop 817-820 and Loop 931-935 did not have any apparent similarity. Except for V818, whose identity was fixed in the library, there were no identities in any positions of these loops. For Loop 838-844, clones P2808 and P2809 are identical in 5 of 7 positions while clone P2811 shows no identities with either.
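- The loop comparison described above amounts to counting identical residues at matching positions. A minimal sketch follows; the two 7-residue loops are hypothetical placeholders and are not the P2808 or P2809 sequences.

```python
def positional_identities(loop_a: str, loop_b: str) -> int:
    """Count positions at which two equal-length loops carry the same residue."""
    if len(loop_a) != len(loop_b):
        raise ValueError("loops must have the same length")
    return sum(a == b for a, b in zip(loop_a, loop_b))


# Hypothetical loops for illustration only.
loop_x = "ADYWTSG"
loop_y = "ADYFTSP"
shared = positional_identities(loop_x, loop_y)
print(f"identical in {shared} of {len(loop_x)} positions")
```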
- Tm for these nanoCLAMPs exceeds the quantifiable range for this assay.
- a functional binding assay to assess kinetic thermostability.
- to determine T50, we incubated samples of each nanoCLAMP at different temperatures, cooled and centrifuged the solutions, and then measured the binding activity remaining in the supernatant by biolayer interferometry.
- we define T50 as the temperature of a 5-minute heat challenge after which 50% of binding activity is irreversibly lost.
- the T50 ranged from ⁇ 85 °C for clone P2808 to > 100 °C for clones P2809 and P2811.
- the rank order and values are consistent with the Tm measured by DSF (FIG. 11).
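- One simple way to extract the temperature of 50% activity loss from a heat-challenge series is linear interpolation between the two temperatures that bracket the 50% point. The sketch below is an assumption about the analysis, not the authors' stated procedure, and the data points are invented for illustration.

```python
def t50(temps_c, fraction_active, threshold=0.5):
    """Interpolate the temperature at which remaining activity falls to `threshold`.

    Returns None if activity never drops below the threshold in the tested range.
    """
    points = list(zip(temps_c, fraction_active))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= threshold > a_hi:
            return t_lo + (a_lo - threshold) * (t_hi - t_lo) / (a_lo - a_hi)
    return None


# Invented example data: fraction of binding activity left after each challenge.
temps = [60, 70, 80, 90, 100]
labile = [1.0, 0.95, 0.8, 0.3, 0.05]
stable = [1.0, 1.0, 0.95, 0.9, 0.8]

print(t50(temps, labile))  # crosses 50% between 80 and 90 °C
print(t50(temps, stable))  # None: value lies above the tested range
```

- A `None` result corresponds to reporting the value as greater than the highest temperature tested.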
- FIGS. 5B-5D show a time course of a tryptic digest comparing nanoCLAMPs with different combinations of constant regions and variable loops to understand the contribution of each component to trypsin resistance.
- SMT3-A1 nC-A constant regions and original variable loops
- P2788 nC-B constant regions with the original variable loops from SMT3-A1
- P2808 nC-B constant regions with newly isolated variable loops.
- nC-B affinity resins affinity chromatography resins made with capture agents of the nC-B class of nanoCLAMPs.
- As a test case for the utility of nC-B affinity resins, we used P2808 as a capture agent for more detailed studies. To generate the affinity resin, the P2808 protein was expressed, purified by IMAC under denaturing conditions, conjugated to sulfhydryl-reactive 6% cross-linked agarose resin, and then refolded by rinsing with buffered saline.
- the bound target was washed and eluted with a purity of over 90% as estimated by densitometric analysis of Coomassie stained SDS-PAGE.
- the static binding capacity under these conditions was 11.5 mg/ml resin (277 nmol/ml).
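- The mass-based and molar capacities are linked by the molar mass of the bound target. The sketch below back-calculates the molar mass implied by the two quoted figures and converts between the units; the helper names are illustrative, and the implied value is only what the two numbers require, not a measured mass.

```python
def implied_molar_mass_da(mg_per_ml: float, nmol_per_ml: float) -> float:
    """Molar mass (g/mol) implied by a capacity given in both mg/ml and nmol/ml."""
    grams = mg_per_ml * 1e-3
    moles = nmol_per_ml * 1e-9
    return grams / moles


def mg_to_nmol(mg: float, molar_mass_da: float) -> float:
    """Convert a mass in mg to an amount in nmol for a given molar mass."""
    return mg * 1e-3 / molar_mass_da * 1e9


mass = implied_molar_mass_da(11.5, 277)
print(f"implied target molar mass: {mass / 1000:.1f} kDa")
print(f"11.5 mg corresponds to {mg_to_nmol(11.5, mass):.0f} nmol")
```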
- imidazole elution worked consistently with P2808, P2809, and P2811, as well as three different AC resins targeting SUMO, mCherry, and GFP with capture agents of the nC-A class.
- nanoCLAMPs of the nC-B class can generally serve as capture agents for resins capable of single-step affinity purification of targets to homogeneity. These resins are also compatible with cleaning-in-place protocols using 0.1 M sodium hydroxide for over 20 cycles, without loss of binding capacity or specificity. Further, in practice, cleaning-in-place cycles are not usually performed after each run, so the expected lifetime likely exceeds 100 cycles with the assumption of sanitation every 5th run.
- because nC-B resins were robust to the broad range of conditions tested so far, we decided to determine whether the resins could also retain binding and specificity after autoclaving.
- the resin made with the nC-A nanoCLAMP (SMT3-A1) did not bind any detectable target protein after autoclaving.
- the resins made with the nC-B nanoCLAMPs retained 25% to 45% binding capacity with specificity comparable to controls (FIGS. 10C and 10D).
- nC-B resins are compatible with NaOH cleaning-in-place, can be produced by bacterial expression of the capture reagent, and lack cysteine residues.
- nC-B resins are distinct in having been shown to exhibit resistance to boiling temperatures, trypsin, and organic solvent. Key performance parameters of nC-B resins and supporting results are summarized in Table 7.
- nC-B nanoCLAMP resins to extend Protein A-like levels of performance to a broad range of proteins beyond antibodies.
- the efficiency, low cost-of-manufacture, and reusability of nC-B resins have the potential to reduce the total cost of manufacture for process-scale purifications.
- the general compatibility of nanoCLAMPs with high temperature, organic solvent, and pH extremes may enable industrial applications where extreme conditions are required.
- nC-B nanoCLAMPs also support their use in applications beyond immunoaffinity chromatography.
- nanoCLAMPs have been used successfully in bioelectric and electrochemical sensors.
- the conditions at the surface of a biosensor represent a challenging environment where the improved stability of nC-B nanoCLAMPs may be enabling.
- conjugation of nanoCLAMPs to surfaces in DMF allows the use of reaction conditions compatible with reagents that have low solubility in aqueous buffers.
- the plasmid pET(SMT3-A1) containing nanoCLAMP SMT3-A1 was mutated by inverse PCR (Ochman et al. Genetics 120:621-623, 1988) by amplifying the plasmid with forward and reverse primers containing the mutation(s) of interest with 15-bp overlapping 5’ ends, purifying the amplicon, In-Fusion cloning the ends back together (In-Fusion HD Cloning Kit, Takara), and transforming chemically competent NEc1 E. coli (BL21(DE3) derivative with slyDΔ(His151-His196) from Nectagen, Inc.).
- Plasmids were purified by Qiagen miniprep kit (Qiagen) and mutations were verified by sequencing the purified plasmids by Sanger sequencing (Genewiz). Glycerol stocks of the plasmids in NEc1 cells were prepared for seeding expression cultures. Constructs for conjugating to Sulfolink resin (Thermo) coded for the nanoCLAMP with an N-terminal 6-His tag and a 13-amino acid C-terminal GS-linker followed by a Cys. Constructs for expressing nanoCLAMPs for biophysical characterization lacked the GS-linker and the C-terminal Cys to avoid dimerization issues due to disulfides.
- pET(SMT3-A1) Sequence SEQ ID NO: 38
- Glycerol stocks of NEc1 cells harboring nanoCLAMP expression vectors were used to inoculate 3 ml starter cultures of 2xYT/2% glucose (Glu)/100 µg/ml carbenicillin (CB) and grown overnight at 37 °C, 250 rpm. The overnight cultures were diluted 1:100 into 300 ml of Novagen Overnight Express Instant TB Medium/1% glycerol/CB and incubated 24 h, 30 °C, 250 rpm.
- Cells were pelleted at 10k x g, 10 min, 4 °C, and lysed with 30 ml 100 mM NaH2PO4, 10 mM Tris, 6 M GuHCl (QAB) pH 8.5, plus 1 mM TCEP (QAB-TCEP, pH 8.5) using a Polytron to homogenize.
- the insoluble material was pelleted at 15k x g, 20 min, 15 °C, and the cleared supernatant applied to Ni Sepharose 6 Fast Flow (Cytiva) and incubated rotating for 1 h to overnight.
- the beads were transferred to a column and washed with 3 CV QAB-TCEP, pH 8.5, then 3 CV QAB, pH 8.5.
- the protein was eluted with QAB, pH 8.5 + 250 mM imidazole and quantified by A280.
- the purity of the eluted protein was measured by SDS-PAGE on 12% NuPAGE Bis-Tris gels and Coomassie staining with Gel-Code Blue (after removing the GuHCl by cold ethanol precipitation). Yields for nanoCLAMPs were typically 150-300 mg/L culture, and purity was typically greater than 90%.
- the purified, denatured nanoCLAMPs in QAB, pH 8.5 were reduced with 2 mM TCEP if used after storage and conjugated to Sulfolink cross-linked, 6% beaded agarose (Thermo). Briefly, the resin was equilibrated with QAB, pH 8.5 + 5 mM EDTA and transferred to a column. The nanoCLAMP was adjusted to 8 mg/ml in a volume 2X the volume of the Sulfolink resin, and then incubated with the resin with rotation for 30 min. at room temperature. The resin was allowed to settle for 15 min., and the column drained to the top of the resin bed.
- the column was washed with QAB, pH 8.5 and then incubated with 50 mM L-Cys in QAB, pH 8.5 to quench for 15 min. with rotation.
- the column was allowed to settle, drained, and washed again with 6 M GuHCl, 20 mM Tris (QCB), pH 8.
- the nanoCLAMP was refolded on the resin by rinsing with 6 CV of 20 mM MOPS, 150 mM NaCl (MBS), pH 6.5 + 1 mM CaCl2.
- the target protein, SUMO-GFP (described above), was spiked into the lysate to a final concentration of 0.025 to 0.2 mg/ml, depending on the application, using a highly concentrated stock so that the total protein concentration remained unchanged.
- the spiked lysate was then incubated with 10 µl of the nanoCLAMP resin (packed volume) in a total volume of 1.4 ml, rotating at 4 °C for 1 h.
- the resin was precipitated by centrifugation and transferred to a small column.
- the resin was washed 4 times with 400 µl PBS, pH 7.4, and then eluted with 3 M imidazole, pH 8.
- the eluates were buffer exchanged twice with Zeba columns (7 kD MWCO, Thermo), and quantified by A280 or fluorescence using an iD5 plate reader.
- Glycerol stocks of NEc1 cells harboring nanoCLAMP expression vectors were used to inoculate 3 ml starter cultures of 2xYT/2% glucose (Glu)/100 µg/ml carbenicillin (CB) and grown overnight at 37 °C, 250 rpm. The overnight cultures were diluted 1:100 into 35 ml of Novagen Overnight Express Instant TB Medium/1% glycerol/CB and incubated 24 h, 30 °C, 250 rpm. Cells were pelleted and lysed with QCB, pH 8, and insoluble material removed by centrifugation at 15k x g, 20 min, 15 °C.
- the cleared lysate was incubated with Ni Sepharose 6 Fast Flow (Cytiva) for > 1 h rotating, room temperature, then transferred to 2 ml columns.
- the columns were washed with 6 x 1 ml QCB, pH 8, then refolded with 11 ml of 20 mM MOPS, 150 mM NaCl (MBS), 1 mM CaCl2, pH 8.
- the nanoCLAMPs were eluted with MBS, 1 mM CaCl2, 250 mM imidazole, pH 8, buffer exchanged to remove the imidazole using Zeba 7 kD MWCO desalting columns, and normalized to 1 mg/ml in MBS, 1 mM CaCl2, pH 6.5.
- as a target protein for panning the library NL-26, we prepared a biotinylated yeast SUMO construct (B-SUMO; P1068) in a pET expression vector and transformed it into BL21(DE3) E. coli harboring a constitutively expressed biotin ligase, BirA.
- An overnight starter culture was diluted 1:100 into 500 ml Novagen Overnight Express Instant TB Medium/1% glycerol/CB/CAM including 5 mM Biotin and incubated 24 h, 30 °C, 250 rpm. Following the induction, the cells were pelleted and the media discarded.
- the pellet was frozen at -80 °C, thawed on ice and resuspended at 5 ml/g pellet in MBS, pH 7.4 + Pierce Protease Inhibitor Tablet Mini and sonicated on ice for 10 min. at 50% duty cycle.
- Biotin was added to 100 µM and the lysed cells incubated at 37 °C, 30 min, 250 rpm to drive biotinylation to completion.
- the lysate was cleared by centrifugation at 30k x g, 20 min, 4 °C and the supernatant transferred to a 2.25 ml SMT3-A1 resin (Nectagen, Inc) packed column at 1 ml/min.
- the resin was washed with 25 ml MBS, pH 7.4 and the protein eluted with polyol elution buffer (PEB): 10 mM Tris, 1 mM EDTA, 0.75M ammonium sulfate, 40% propylene glycol, pH 7.9.
- the protein was desalted 2X into 50 mM Tris, pH 8 and stored as a 50% glycerol stock at -20 °C.
- Purified nanoCLAMPs were diluted to a final concentration of 0.18 mg/ml in MBS, 1 mM CaCl2, pH 6.5, centrifuged at 20k x g, 2 min, 4 °C, and the supernatants transferred to a clean tube.
- the samples were loaded into a 125 µl sample loop and injected onto a Superdex 75 10/300 GL column (GE Healthcare Life Sciences, Pittsburgh, PA) equilibrated in MBS, 1 mM CaCl2, pH 6.5 at a flow rate of 0.65 ml/min.
- the column was calibrated with Bio-Rad Gel Filtration Standard per manufacturer’s instructions.
- the melting temperature of purified nanoCLAMPs was determined using the GloMelt Thermal Shift Protein Stability Kit (Biotium) per manufacturer’s instructions. Briefly, purified nanoCLAMPs were adjusted to 1 mg/ml in MBS, 1 mM CaCl2, pH 6.5, diluted in half with 2X GloMelt (Biotium), aliquoted to a 384-well plate, and sealed with optical film. The plate was then heated in a QuantStudio 5 qPCR machine using the SYBR Green reporter with no passive reference. The heating profile was 25 °C for 2 min; ramp at 0.05 °C/sec to 99 °C; 99 °C for 2 min. Tm is defined as the inflection point in the unfolding curve.
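- The inflection point of a melt curve can be located as the maximum of its first derivative. The sketch below is an illustrative approach rather than the instrument software's algorithm: it assumes the dye signal rises as the protein unfolds, uses a synthetic two-state curve with a hypothetical midpoint of 75 °C, and also computes the ramp duration implied by the stated heating profile.

```python
import math


def melting_temperature(temps_c, fluorescence):
    """Return the temperature with the steepest signal increase (central differences)."""
    best_slope, best_temp = float("-inf"), None
    for i in range(1, len(temps_c) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps_c[i + 1] - temps_c[i - 1])
        if slope > best_slope:
            best_slope, best_temp = slope, temps_c[i]
    return best_temp


# Synthetic sigmoidal unfolding curve; the 75 °C midpoint is a made-up example value.
temps = [25 + 0.5 * i for i in range(149)]  # 25 °C to 99 °C in 0.5 °C steps
signal = [1 / (1 + math.exp(-(t - 75.0) / 2.0)) for t in temps]

tm = melting_temperature(temps, signal)
ramp_seconds = (99 - 25) / 0.05  # ramp from 25 to 99 °C at 0.05 °C/sec
print(f"Tm = {tm} °C, ramp lasts about {ramp_seconds / 60:.1f} min")
```

- Real melt curves are noisier than this synthetic example, so smoothing before differentiation is usually needed.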
- Digestions were performed by incubating the nanoCLAMPs at 0.25 mg/ml in a 20 µl reaction containing 0.1 mg/ml trypsin (Roche Cat 11418475001) or chymotrypsin (Roche Cat 11418467001) diluted in 1 mM HCl, such that the final HCl concentration in the reaction was 0.1 mM.
- CaCl2 was added to the reaction to 10 mM.
- the protein and remaining diluent buffer was MBS, pH 6.5.
- the pCombX phagemid template p2799 (Table 8) contained the N-terminal and C-terminal constant regions of the nC-B class separated by a stuffer region containing HindIII and SpeI cut sites. This template was digested with HindIII and SpeI, gel purified, and the plasmid region amplified with degenerate primers 1957T R and 1960T F, which added the N and C-terminal part of nC-B as well as the randomized loops L1 and L8, respectively.
- the primers listed with a T indicate they are degenerate primers constructed using phosphoramidite trimer mixes (Glen Research) of oligos (IDT) containing all amino acids except Cys, Met, Lys, and Arg.
- the short internal region of the P2788 was amplified using primers 1958T F and 1959 R, which added and randomized Loop L2.
- PCR was carried out with ClonAmp HiFi PCR Mix, according to manufacturer’s instructions (Takara Bio, Mountain View, CA). The reaction cycle was 98 °C for 10 sec, 65 °C for 10 sec, and 72 °C for 30 sec, repeated 30 times.
- NNN randomized codon
- the desalted DNA was then adjusted to 100 ng/µl with ddH2O and used to electroporate electrocompetent TG1 cells (Lucigen).
- Approximately 50 µl of DNA was added to 1.25 ml ice cold TG1 cells and pipetted up and down 4 times to mix on ice, after which 25 µl aliquots were transferred to 50 electroporation cuvettes (with 1 mm gaps) on ice.
- the cells were electroporated, and immediately quenched with 975 µl recovery media (Lucigen), pooled, and incubated at 37 °C, 250 rpm for 1 h.
- the library was infected by adding helper phage VCSM13 (Stratagene, Cat#200251) to 750 ml of culture at an MOI of 20 phage/cell, and incubating at 37 °C, 100 rpm for 30 min, then 250 rpm for an additional 30 min.
- the cells were pelleted at 7500 x g for 10 min, and the media discarded.
- the cells were resuspended in 1.2 L 2xYT/CB, 70 µg/ml kanamycin (KAN), and incubated 15 h at 30 °C, 250 rpm.
- the cells were combined, and 100 ml was centrifuged at 10k x g for 10 min.
- the phage-containing supernatant was transferred to clean tubes and precipitated by adding 37.5 ml of 5X PEG/NaCl (20% polyethylene glycol 6000/2.5 M NaCl), and incubated on ice for 25 min.
- the phage was pelleted at 13k x g, 25 min and the supernatant discarded.
- the phage was resuspended in 10 ml 20 mM NaH2PO4, 150 mM NaCl, pH 7.4 (PBS), then centrifuged at 15k x g for 15 min to remove insoluble material.
- the phage was precipitated a second time by adding 1/4 volume 5X PEG/NaCl, incubated on ice for 5 min, and pelleted at 13k x g, 10 min at 4 °C.
- To preclear the phage against beads alone, 1 ml of phage was prepared at a concentration of 2 x 10^13 phage/ml in 2% M-PBS-T, the block removed from the first set of beads, and the phage added to the beads and incubated 1 h, rotating. The magnet was applied, and the precleared phage removed and transferred to a clean tube. The magnet was applied, and this step repeated two times to ensure no carryover of beads bound to phage to the next step. Biotinylated target (B-SUMO) was added to the precleared phage to 100 nM final concentration and incubated rotating 1 h.
- Block was removed from the second set of beads, and the phage/B-SUMO mix was added to the beads to precipitate the biotinylated target and bound phage.
- the beads were washed 8X with PBS-T, 1 ml each, vortexing between each step and applying the magnet.
- the washed beads were eluted with 800 µl 0.1 M glycine, pH 2.0, 10 min rotating, the magnet applied, and the eluate transferred to 72 µl 2 M Tris base to neutralize.
- the cells were infected at 37 °C, 45 min, 175 rpm, and then expanded to 100 ml 2xYT/Glu/CB and incubated overnight at 30 °C, 250 rpm.
- the overnight cultures were harvested by measuring the OD600, centrifuging the cells at 10k x g for 10 min and then resuspending the cells to an OD600 of 75 in 2xYT/18% glycerol.
- 5 ml of 2xYT/Glu/CB was inoculated with 5 µl of the 75 OD600 glycerol stock and incubated at 37 °C, 250 rpm until the OD600 reached 0.5.
- the cells were superinfected at 20:1 phage:cell, mixed well, and incubated at 37 °C, 30 min, 150 rpm and then 30 min at 250 rpm.
- the cells were pelleted at 5500 x g, 10 min, the glucose-containing media discarded and the cells resuspended in 10 ml 2xYT/CB/KAN and incubated overnight at 30 °C, 250 rpm.
- the overnight phage prep was processed as described above.
- neutravidin-coated magnetic beads Spherotech
- the media can then be used directly in an ELISA assay (soluble expression-based monoclonal enzyme-linked immunosorbent assay: semELISA).
- streptavidin-coated microtiter plates (ThermoFisher) were rinsed 3 times with 200 µl PBS, and then coated with biotinylated target proteins at 2 µg/ml with 100 µl/well and incubated 1 h. For blank controls, a plate was incubated with 100 µl/well PBS. The coating solution was removed, and the plates blocked with 2% M-PBS-T. The block was removed and 50 µl of 4% M-PBS-T added to each well.
- each induction plate supernatant was transferred to the blank and protein-coated wells and pipetted 10 times to mix and incubated 1 h.
- the plates were washed 4 times with 200 µl PBS-T and the plates dumped and slapped on paper towels in between washes.
- 75 µl of 1:2000 dilution anti-FLAG-HRP (Sigma A8592) in 4% M-PBS-T was added to each well and incubated 1 h.
- the anti-FLAG-HRP was discarded, and the plates washed as before.
- the plates were developed by adding 75 µl TMB Ultra substrate (ThermoFisher) and analyzed for positive signals compared to controls.
- a packed volume of 0.6 ml of P2808 resin or SMT3-A1 resin was packed into a Tricorn 5/50 column (5 mm ID x 3.06 cm height) and equilibrated in 20 mM NaH2PO4, 150 mM NaCl, pH 7.4 (PBS) at 0.5 ml/min for 5 CV.
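- The stated bed dimensions can be checked against the packed volume by treating the bed as a cylinder. The sketch below does that and also estimates how long a 5 CV equilibration takes at 0.5 ml/min; the function name is illustrative.

```python
import math


def bed_volume_ml(inner_diameter_mm: float, bed_height_cm: float) -> float:
    """Volume of a cylindrical resin bed in ml (1 cm^3 = 1 ml)."""
    radius_cm = inner_diameter_mm / 10.0 / 2.0
    return math.pi * radius_cm ** 2 * bed_height_cm


column_volume = bed_volume_ml(inner_diameter_mm=5, bed_height_cm=3.06)
equilibration_min = 5 * column_volume / 0.5  # 5 CV at 0.5 ml/min

print(f"column volume: {column_volume:.2f} ml")
print(f"5 CV equilibration: {equilibration_min:.1f} min")
```

- The geometric volume of about 0.60 ml agrees with the packed volume given in the text.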
- the delay volume, Vdelay, was measured for the configuration at 0.5 ml.
- a cleared E. coli lysate was prepared by lysing a pellet of NEc1 E. coli (a derivative of BL21(DE3) with the C-terminal region of SlyD knocked out by recombineering, Nectagen, Inc) with BPER (Thermo) and removing insoluble material by centrifugation at 15k x g, 20 min, 4 °C.
- the cleared supernatant was diluted to a total protein concentration of roughly 3.3 mg/ml with PBS, pH 7.4 such that the BPER reagent was present at 20% vol/vol.
- the spiked lysate was loaded onto the column at 0.5 ml/min for indicated times, washed with 20 CV PBS, pH 7.4, then eluted with 3 M Imidazole, pH 8.
- Fractions containing eluted target were pooled and desalted 2X on Zeba 7 kD MWCO columns and the protein quantified by A280. Imidazole removal was verified by testing the A280 of elution buffer alone following 2X desalting.
- the cycle consisted of a 2 ml equilibration in running buffer at 1 ml/min, 0.5 ml load of spiked lysate at 0.5 ml/min, 3 ml wash with running buffer at 0.5 ml/min, 2 ml elution with 3 M imidazole, pH 8 (collected) at 0.5 ml/min, a 0.5 ml wash with running buffer at 0.5 ml/min, a cleaning in place cycle of 1.5 ml NaOH at 1 ml/min and then 2 ml at 0.2 ml/min (total contact time 10 min), and finally a refolding step with 5 ml running buffer at 1 ml/min.
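- The duration of each step in this cycle follows from volume divided by flow rate. The sketch below tabulates the steps as listed in the text; the step labels are illustrative. Note that the slow NaOH step alone takes 10 min, which matches the stated contact time, while the text does not say how the preceding fast NaOH step is counted.

```python
# (label, volume in ml, flow rate in ml/min), taken from the cycle described in the text.
STEPS = [
    ("equilibration", 2.0, 1.0),
    ("load", 0.5, 0.5),
    ("wash", 3.0, 0.5),
    ("elution", 2.0, 0.5),
    ("post-elution wash", 0.5, 0.5),
    ("NaOH, fast", 1.5, 1.0),
    ("NaOH, slow", 2.0, 0.2),
    ("refolding", 5.0, 1.0),
]


def step_minutes(volume_ml: float, flow_ml_per_min: float) -> float:
    """Pumping time for one step."""
    return volume_ml / flow_ml_per_min


durations = {label: step_minutes(vol, flow) for label, vol, flow in STEPS}
total_minutes = sum(durations.values())
naoh_pumping_minutes = durations["NaOH, fast"] + durations["NaOH, slow"]

for label, minutes in durations.items():
    print(f"{label:>18}: {minutes:4.1f} min")
print(f"total per cycle: {total_minutes:.1f} min")
```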
- the target concentration in the eluates was measured by fluorescence spectroscopy in duplicate on an iD5 plate reader (Molecular Devices), Ex/Em 485/535 nm.
- the eluates were analyzed by SDS-PAGE using NuPAGE gels as described above.
- the control and autoclaved resin were stored overnight at 4 °C.
- the DMF treated resin was rinsed 3X with fresh MBS, 1 mM CaCl2, pH 7.2 and then stored overnight at 4 °C.
- the next day all three of the sets of beads were rinsed with fresh buffer, and then incubated with 1.3 ml of E. coli lysate (prepared as described above) spiked with SUMO-GFP at 0.2 mg/ml, for 1 h, 4 °C, rotating.
- the resin was loaded into a small, tared column, rinsed 4 x 400 µl PBS, pH 7.4, then eluted 3 x 25 µl 3 M imidazole, pH 8.
- the fluorescence of the eluates was read on an iD5 plate reader in duplicate as described and the concentration determined by comparison with a standard curve of the target and compared to controls.
- the concentrations were normalized and analyzed by SDS PAGE as described above to assess purity.
- Example 2 Using the protein scaffold to target diverse antigens.
- Table 9 contains a subset of a much larger set of target-specific nanoCLAMPs, the majority of which possess the loop lengths of 4, 7, and 5 residues for loops 1, 2, and 8, respectively, as designed in library NL-26 (see above).
- To demonstrate the protein scaffold’s ability to tolerate various loop lengths, we only included those nanoCLAMPs in Table 9 that possess at least one loop with a different length than the designed length.
- In Table 10, we demonstrate the scaffold’s ability to support vast loop diversity by tabulating the amino acid sequences of nanoCLAMPs specific to several targets and show the diversity of loop sequences to a single target in several cases.
- Table 9. Sequences of binding scaffolds (nanoCLAMPs) with variable loop lengths
- Protein binding to the protein scaffold was demonstrated by incubating the proteins with terbium, removing unbound terbium by buffer exchange, and measuring time resolved fluorescence. Proteins were prepared at 30 µM in 20 mM MOPS, 150 mM NaCl (MBS), pH 6.5 and buffer exchanged to remove any unbound Ca. SMT3-A1 (nC-A), P2808 (nC-B), and a negative control protein (recombinant SMT3) were added to a 140 µl reaction in the same buffer so their final concentrations were 8.57 µM, and either CaCl2 or TbCl3 added to 300 µM.
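- The final protein concentration follows from the usual dilution relation C1·V1 = C2·V2, which holds whatever concentration unit is used. The sketch below back-calculates the stock volume implied by the stated stock concentration (30), final concentration (8.57), and reaction volume (140); that volume is an inference from these numbers and is not stated in the text.

```python
def stock_volume_needed(stock_conc: float, final_conc: float, final_volume: float) -> float:
    """Volume of stock required to reach final_conc in final_volume (C1*V1 = C2*V2)."""
    return final_conc * final_volume / stock_conc


def final_concentration(stock_conc: float, stock_volume: float, final_volume: float) -> float:
    """Concentration after diluting stock_volume of stock to final_volume."""
    return stock_conc * stock_volume / final_volume


# Inferred, not stated in the text: roughly 40 volume units of the 30-unit stock.
inferred_stock_volume = stock_volume_needed(30, 8.57, 140)
check = final_concentration(30, 40, 140)
print(f"stock volume: {inferred_stock_volume:.1f}, final concentration: {check:.2f}")
```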
- the loop, the flanking N-terminal, and the flanking C-terminal amino acid are shown in different shades.
- the structure was assessed qualitatively for the maintenance of the overall beta-sheet structures. Structures that maintained the overall beta-sheet structure were considered to maintain the overall fold. To explore short loop lengths, each loop was completely deleted and modeled as above. If the complete deletion did not impact the overall fold, no additional constructs were modeled. The complete deletion was aligned with Swiss PDB Viewer with the MagicFit function and then assessed qualitatively.
- a deletion was considered to result in disruption of the overall beta-sheet structure if a beta-strand secondary structure assignment was converted to a coil assignment or one or more beta strands lost association with an adjacent beta-strand. If the complete deletion resulted in a disruption of the overall beta-sheet structure, a deletion series was made starting with each wild-type amino acid replaced by G and then removing one G at a time. The construct in the deletion series with the shortest loop length that maintained the fold qualitatively was aligned and assessed. The results from these modelling experiments are shown in FIGS. 17-26.
- the third column indicates the length diversity observed across orthologs. The observation of variation correlates roughly with the modeling data.
- the well characterized SUMO binder P2808 was modeled with AlphaFold to rationally select adjacent residues on neighboring beta strands for substitution with Cys residues, with the aim of further stabilizing the protein by introducing a disulfide bond.
- Substitution mutations were chosen by visual inspection of the structure to identify residues whose side chains were located in the core of the protein, whose side chains were oriented towards each other, and whose alpha carbons were approximately the same distance apart as observed with natural disulfide bonds.
- AlphaFold modeling predicted that several of the selected substitution mutations would form disulfide bonds and that a few would not (Table 13).
- the refolded proteins were then eluted from the resin and tested for the presence of disulfide bonds by mobility shift on SDS PAGE under reducing vs oxidizing conditions. Because proteins possessing intramolecular disulfides remain more compact than those that do not, proteins with disulfides typically run faster in SDS-PAGE due to their smaller hydrodynamic radius. Thirteen of the 14 proteins predicted by AlphaFold to form disulfide bonds migrated faster on SDS PAGE in sample buffer lacking reducing agent than in sample buffer containing reducing agent. This observation is consistent with the agent reducing disulfide bonds in those proteins (FIG. 27). The proteins that appeared to possess disulfide bonds also had a band that ran similarly to the reduced form.
- lysines are expected to reduce susceptibility to trypsin and increase the specificity of labeling with amine-reactive reagents.
- These scaffolds have only a single primary amine, located at the N-terminal alpha amino group, and are expected to be modified specifically at this position by amine-reactive reagents.
- Clones P3013 and P3014 contain no asparagines, common sites of deamidation, so are expected to be less susceptible to deamidation.
- Tm of basis +/- 4C.
- the 2808 sequence was modeled in AlphaFold with pairs of cysteine substitutions.
- the version was mmseq, and the modeling was performed with relaxation.
- nanoCLAMPs were purified as described above, under denaturing conditions, except the refolding step was modified. Briefly, nanoCLAMPs were bound to Ni Sepharose 6 Fast Flow (Cytiva) in 6 M GuHCl, 20 mM Tris, pH 8 (QCB) + 5 mM TCEP. The resins were washed with 5 column volumes (CV) QCB + 1 mM TCEP, then 5 CV QCB (no TCEP).
- CV column volumes
- the proteins were then gradually refolded by washing with 10 CV QCB + 2 mM GSH/1 mM GSSG, then steps of 5 CVs each stepping the GuHCl down from 4, 3, 2, 1, and finally 0 M GuHCl by diluting with 20 mM MOPS, 150 mM NaCl (MBS), 1 mM CaCl2, 2 mM GSH/1 mM GSSG, pH 8. Each refolding step of 5 CVs was incubated for 30 min. The refolded protein was then washed in 10 CV MBS, pH 8, 1 mM CaCl2, and finally eluted with MBS, pH 8, 1 mM CaCl2, 250 mM imidazole.
- Proteins were normalized to 1 mg/ml in MBS, 1 mM CaCl2, pH 6.5. The proteins were diluted 10X into SDS sample buffer containing 50 mM DTT (reducing) and SDS sample buffer lacking reducing agent. The proteins were heated to 95 °C for 5 min, cooled and 1 µg separated on 12% NuPAGE BisTris gel (Thermo) with MES running buffer. Gels were stained with GelCode Blue (Thermo).
- DSF Differential scanning fluorimetry
- DSF was performed as described previously, except in the reducing case, TCEP was included in the DSC cocktail at 50 mM (final). The DSC program was performed as described, and the Tm measured at the inflection point of the curve.
- Phage library NL-26 contains clones of nanoCLAMPs with the wild-type nC-B framework as well as mutations resulting from errors in gene synthesis, PCR and phage propagation.
- the library was screened and analyzed to identify nanoCLAMPs that maintain the ability to bind their intended target and that contain one or more mutations in the framework regions.
- the analysis resulted in the identification of 105 nanoCLAMP variants, each recognizing one of four target antigens and each containing one or more mutations in the framework regions.
- the number of mutations identified in each framework and their position are summarized in Tables 14 and 15.
- a listing of individual enriched clones, targets and framework mutations is shown in Table 16. The identification of these variants in this non-exhaustive analysis indicates that each framework region can tolerate one or more mutations while maintaining the ability to be displayed on the phage surface and mediate binding to its target.
- the NL-26 phage library was panned against recombinant GFPMut2, Human Serum Albumin, mCherry, and TEV protease, as described above. Phage were enriched in two rounds, and approximately 200,000 clones from each round were DNA sequenced by next generation DNA sequencing with an Illumina MiSeq system. The sequencing reads were processed with PipeBio software to cluster and count like-sequences. Clones were identified that met the criteria of 1) showing greater than two-fold normalized enrichment from Round 1 to Round 2 and 2) having one or more mutations in the framework region. Table 14. Tolerance of Frameworks 1-9 to Mutations
- Table 16 Listing of individual enriched clones, target and framework mutations
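The two selection criteria stated for the NL-26 panning analysis (greater than two-fold normalized enrichment from Round 1 to Round 2, and one or more framework mutations) can be illustrated with a minimal sketch. The text does not define how enrichment was normalized; the sketch assumes normalization by total reads per round. The function names, the record layout, and the example clones are hypothetical and are not taken from the PipeBio workflow.

```python
def normalized_enrichment(count_r1, total_r1, count_r2, total_r2):
    """Ratio of a clone's read frequency in Round 2 to its frequency in Round 1.

    Assumption: "normalized" means dividing each count by the total reads
    sequenced in that round.
    """
    if count_r1 == 0:
        # No Round 1 reads: treat any Round 2 reads as unbounded enrichment.
        return float("inf") if count_r2 > 0 else 0.0
    return (count_r2 / total_r2) / (count_r1 / total_r1)


def select_clones(clones, total_r1, total_r2, min_fold=2.0):
    """Return names of clones with > min_fold enrichment and >= 1 framework mutation."""
    selected = []
    for clone in clones:
        fold = normalized_enrichment(clone["r1"], total_r1, clone["r2"], total_r2)
        if fold > min_fold and clone["framework_mutations"]:
            selected.append(clone["name"])
    return selected


# Hypothetical read counts, for illustration only.
example_clones = [
    {"name": "clone_a", "r1": 10, "r2": 50, "framework_mutations": ["K870A"]},
    {"name": "clone_b", "r1": 10, "r2": 50, "framework_mutations": []},
    {"name": "clone_c", "r1": 10, "r2": 15, "framework_mutations": ["N890G"]},
]
print(select_clones(example_clones, total_r1=200_000, total_r2=200_000))
```

Only the first hypothetical clone passes both filters: the second lacks a framework mutation and the third is enriched less than two-fold.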
Abstract
The present invention features thermostable protein binding scaffolds containing framework regions and variable loop regions that can be mutagenized to bind a desired target. The scaffolds are derived from the Carbohydrate Binding Module Family 32 (CBM32) protein domain of Clostridium perfringens hyaluronidase (NagH). The robust framework of the scaffolds described herein allows for the development of custom, high performance affinity chromatography resins compatible with the harsh conditions of process-scale applications that can be adaptable to a wide diversity of target substrates.
Description
THERMOSTABLE BINDING SCAFFOLDS
Sequence Listing
This application contains a Sequence Listing which has been filed electronically in Extensible Markup Language (XML) format and is hereby incorporated by reference in its entirety. Said XML copy, created on November 22, 2023, is named 51027-005W02_Sequence_Listing_11_22_23.XML and is 526,678 bytes in size.
Statement Regarding Federally Sponsored Research
This invention was made with government support under Grant No. 1 R43 GM143942-01, awarded by the National Institutes of Health. The government has certain rights in the invention.
Background of the Invention
Affinity chromatography (AC) with target-specific, immobilized capture agents is an established method of protein purification. In this technique, a capture agent, such as a protein, nucleic acid, or small molecule, is coupled to a solid support, which can then be used to isolate a protein of interest from a complex mixture. The technique has been widely used at the laboratory scale for single-step purifications of diverse target proteins, including enzymes, transcription factors, growth factors and antibodies.
Use of protein-based capture agents for AC in industrial applications has been less widespread because currently available approaches are incompatible with the temperatures, pH extremes, and solvents often needed for process-scale purification or are useful for only a limited number of targets. An exception is the purification of kg-quantities of antibodies with AC resins based on Staphylococcal Protein A. The development of Protein A resins highlights the use of protein engineering to improve the robustness of AC resins as well as some remaining limitations. Early versions of resins with wild-type Protein A captured antibodies with high selectivity and capacity from cell culture media feedstock but lost activity gradually after multiple cycles of cleaning in place with sodium hydroxide. Mutagenesis of Protein A yielded variants with increased resistance to sodium hydroxide treatment and higher binding capacity. Despite the widespread use of Protein A resins, they are nonetheless limited to the purification of antibodies.
For the process-scale purification of non-antibody targets, the use of AC is much less widespread than the use of Protein A to purify antibodies. For instance, non-protein, ligand-based approaches, such as small molecule substrate mimetics, are effective but are limited to specific enzyme classes and are difficult to use with a general protein of interest. Alternatively, specialized affinity resins, such as glutathione or nickel, require the addition of non-native tags, which cause downstream complications for proteins intended for therapeutic use. Immuno-AC with antibody- or nanobody-based capture agents is the most generally applicable approach and has widely been used to purify a diverse range of proteins at laboratory-scale. However, immuno-AC has some limitations. In general, the conjugation of the antibody to the resin often results in heterogeneous coupling because of a lack of precise control over the sites of conjugation. In addition, chromatography must be performed under oxidizing conditions in order to preserve the disulfide bonds essential for maintaining antibody structure. Elution of the target also usually requires low or high pH conditions that are incompatible with some target proteins. For process-scale applications, the chief limitation of immuno-AC resins is their sensitivity to the sodium hydroxide
solutions which are preferred for cleaning-in-place procedures. Because of these limitations, new capture agents are needed that can perform under a variety of extreme conditions necessary for robust target purification.
Summary of the Invention
In one aspect, the invention features a protein scaffold that includes framework regions and loop regions. The protein scaffold has the structure:
A-F1-L1-F2-L2-F3-L3-F4-L4-F5-L5-F6-L6-F7-L7-F8-L8-F9-B, wherein each of F1-F9 corresponds to framework regions 1-9; each of L1-L8 corresponds to loop regions 1-8;
A and B are each independently, absent or include at least one amino acid;
F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 4;
L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 5;
L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 6;
L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 7;
L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 8;
L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 9;
L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 10;
L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 11;
L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 12.
As described herein, a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof includes a sequence having, for example, one insertion, two insertions, one deletion, two deletions, one substitution mutation, two substitution mutations, one insertion and one deletion, one insertion and one substitution mutation, or one deletion and one substitution mutation.
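The A-F1-L1-...-F9-B layout lends itself to a pattern-matching illustration. The following is a minimal sketch, not a method described in this application: it transcribes the consensus framework sequences of SEQ ID NOs: 4-12 into regular-expression character classes and splits a candidate sequence into its framework and loop segments. It matches only exact consensus frameworks and does not attempt to cover the variants having one or two insertions, deletions, or substitutions. The names `FRAMEWORKS`, `build_scaffold_regex`, and `parse_scaffold` are hypothetical.

```python
import re

AA = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes

# Consensus framework patterns transcribed from SEQ ID NOs: 4-12;
# an alternative such as "(T/S)" in the text becomes the class "[TS]".
FRAMEWORKS = {
    "F1": "[TS]LI[HR][TS][PE][GS]W",
    "F2": "G[SNT]E[AS][DNSA]LLDGDD[SNT]TGV[EWA]Y",
    "F3": "S[LV]AGEFIGLDLG",
    "F4": "G[IV][HRYN]FVIG[AKR][DN]",
    "F5": "DKW[TNS][RK]F[RK]LEYS",
    "F6": "WTTI[RKHQ]EYD[HKRQ]",
    "F7": "[QK]DVI[DE]E[DS]F",
    "F8": "[QKR]YIRLTNLE",
    "F9": "LTFSEFA[IV]VS",
}


def build_scaffold_regex(max_loop=20):
    """Compile a regex for A-F1-L1-...-L8-F9-B with loops of 0..max_loop residues."""
    parts = [f"(?P<A>[{AA}]*)"]
    for i in range(1, 10):
        name = f"F{i}"
        parts.append(f"(?P<{name}>{FRAMEWORKS[name]})")
        if i < 9:
            parts.append(f"(?P<L{i}>[{AA}]{{0,{max_loop}}})")
    parts.append(f"(?P<B>[{AA}]*)")
    return re.compile("".join(parts))


def parse_scaffold(sequence):
    """Return a dict of segment name -> subsequence, or None if there is no match."""
    match = build_scaffold_regex().fullmatch(sequence)
    return match.groupdict() if match else None
```

Calling `parse_scaffold` on a sequence returns the segments keyed by `A`, `F1`, `L1`, and so on, so loop lengths can be read off with `len(result["L1"])`. Covering the one-or-two-mutation variants would need an approximate matcher rather than a plain regular expression.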
In some embodiments, the protein scaffold includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X1, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X2, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X3 relative to SEQ ID NO: 1, wherein:
X is any amino acid except the amino acid in the equivalent position in SEQ ID NO: 1;
X1 is any amino acid except R or S;
X2 is any amino acid except P or K; and
X3 is any amino acid except R or K.
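The substitution labels follow the usual wild-type residue, position, replacement convention (for example, K870A replaces the lysine at position 870 of SEQ ID NO: 1 with alanine). A minimal sketch of how the exclusions stated for X, X1, X2, and X3 in this embodiment could be checked is shown below. The function and constant names are hypothetical, and the check covers only the exclusion rules, not whether a position belongs to the recited group.

```python
import re

# Extra exclusions stated for X1 (position 815), X2 (position 860),
# and X3 (position 922) in the embodiment above.
EXTRA_EXCLUSIONS = {815: {"R", "S"}, 860: {"P", "K"}, 922: {"R", "K"}}


def parse_mutation(label):
    """Split a label such as 'K870A' into (wild_type, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def is_permitted(label):
    """True if the replacement is neither the wild-type residue nor an excluded one."""
    wild_type, position, replacement = parse_mutation(label)
    excluded = {wild_type} | EXTRA_EXCLUSIONS.get(position, set())
    return replacement not in excluded
```

For example, `is_permitted("S815G")` is true, whereas `is_permitted("S815R")` is false because R is excluded for X1.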
In some embodiments,
F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 4;
F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 5;
F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 6;
F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 7;
F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 8;
F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 9;
F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 10;
F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 11; and
F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 12.
In some embodiments, F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4);
F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5);
F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6);
F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7);
F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8);
F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9);
F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10);
F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11); and
F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12).
In some embodiments, the protein scaffold includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X relative to SEQ ID NO: 1, wherein X is any amino acid.
In some embodiments, the protein scaffold includes at least one mutation selected from the group consisting of N807D, S809T, R812H, S813T, E814P, S815G, D818V, N822S, N825D, N832S, W836E, K857E, E858V, I859V, K860E, L861V, D862G, R865H, K870A, N871D, N880T, K881R, K883R, N890G, K897R, K901H, K908Q, E912D, S914D, and K922Q relative to SEQ ID NO: 1.
In some embodiments, the at least one mutation is K870X and/or N890X. In some embodiments, the at least one mutation is K870A and/or N890G. In some embodiments, the at least one mutation is K870A. In some embodiments, the at least one mutation is N890G.
In some embodiments, the protein scaffold includes at least 3 fewer lysines relative to SEQ ID NO: 1. For example, in some embodiments, the protein scaffold includes at least 3, 4, 5, 6, 7, 8, 9, or 10 fewer lysines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold includes at least 6 fewer lysines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold includes 9 fewer lysines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold does not include any lysines.
In some embodiments, the protein scaffold includes at least 3 fewer asparagines relative to SEQ ID NO: 1. For example, in some embodiments, the protein scaffold includes at least 3, 4, 5, 6, 7, or 8 fewer asparagines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold includes at least 5 fewer asparagines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold includes 7 fewer asparagines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold does not include any asparagines.
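Counting how many fewer lysines or asparagines a variant has than a reference is a simple residue tally. The sketch below is illustrative only: `fewer_residues` is a hypothetical helper, and the two short sequences are invented placeholders, not SEQ ID NO: 1 or any scaffold from this application.

```python
def fewer_residues(reference, variant, residue):
    """Number of `residue` occurrences lost in the variant relative to the reference.

    A negative result means the variant has more of that residue.
    """
    return reference.count(residue) - variant.count(residue)


# Invented toy sequences, for illustration only.
toy_reference = "MKKNAKNK"  # 4 lysines, 2 asparagines
toy_variant = "MRRDARNQ"    # 0 lysines, 1 asparagine

print(fewer_residues(toy_reference, toy_variant, "K"))
print(fewer_residues(toy_reference, toy_variant, "N"))
```

A threshold such as "at least 3 fewer lysines" then reduces to checking `fewer_residues(reference, variant, "K") >= 3`.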
In some embodiments, A and B are each independently, absent or at least one amino acid. For example, each of A and B may independently be absent. In some embodiments, A and B are each independently, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1,000 or more amino acids. In some embodiments, A and B are each independently, from 0 to 1,000 amino acids, e.g., from 1 to 10 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids), from 10 to 100 amino acids (e.g., 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 amino acids), or from 100 to 1,000 amino acids (e.g., 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1,000 amino acids).
In some embodiments, A and B are each independently, absent or from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids).
In some embodiments, each of L1-L8 is, independently, from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids).
In some embodiments, each of L1-L8 is, independently, from 1 amino acid to 10 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids). In some embodiments, each of L1-L8 is, independently, from 3 amino acids to 10 amino acids. In some embodiments, each of L1-L8 is, independently, from 3 amino acids to 8 amino acids.
In some embodiments, L1 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L1 is from 0 to 5 amino acids (e.g., from 1 to 5 amino acids, e.g., 0, 1, 2, 3, 4, or 5 amino acids).
In some embodiments, L2 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L2 is from 1 amino acid to 16 amino acids (e.g., from 4 to 16 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 amino acids).
In some embodiments, L3 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L3 is 6 amino acids.
In some embodiments, L4 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L4 is from 0 to 5 amino acids (e.g., from 1 to 5 amino acids, e.g., 0, 1, 2, 3, 4, or 5 amino acids).
In some embodiments, L5 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L5 is 5 amino acids.
In some embodiments, L6 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L6 is from 3 to 6 amino acids (e.g., 3, 4, 5, or 6 amino acids).
In some embodiments, L7 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L7 is 4 or 5 amino acids.
In some embodiments, L8 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L8 is from 4 to 6 amino acids (e.g., 4, 5, or 6 amino acids).
In some embodiments, L1 is 4 amino acids. In some embodiments, L2 is 7 amino acids. In some embodiments, L8 is 5 amino acids. In some embodiments, L1 is 4 amino acids, L2 is 7 amino acids, and/or L8 is 5 amino acids. In some embodiments, L1 is 4 amino acids, L2 is 7 amino acids, and L8 is 5 amino acids.
In some embodiments, L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid. In some embodiments, X2 is V.
In some embodiments, L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid.
In some embodiments, L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid.
In some embodiments, L4 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid.
In some embodiments, L6 includes the sequence of: X1X2X3X4X5X6 (SEQ ID NO: 16), wherein each of X1-X6 is, independently, any amino acid.
In some embodiments, L8 includes at least two amino acids. In some embodiments, L8 includes at least one amino acid.
In some embodiments, L4 includes the sequence of: (G/D)-GGSS (SEQ ID NO: 17) or GDT or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 17 or GDT.
In some embodiments, L6 includes the sequence of TGAPAG (SEQ ID NO: 18) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 18.
In some embodiments, L4 includes the sequence of: (G/D)-GGSS (SEQ ID NO: 17) or GDT; and L6 includes the sequence of TGAPAG (SEQ ID NO: 18).
In some embodiments, L3 includes the sequence of: (E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D) (SEQ ID NO: 19) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 19.
In some embodiments, L5 includes the sequence of: LD-(G/N)-(E/S)-S (SEQ ID NO: 20) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 20.
In some embodiments, L7 includes at least one amino acid.
In some embodiments, L7 includes the sequence of ETPI-(S/E)-A (SEQ ID NO: 21) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 21.
In some embodiments, L3 includes the sequence of: (E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D) (SEQ ID NO: 19) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 19; L5 includes the sequence of: LD-(G/N)-(E/S)-S (SEQ ID NO: 20) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 20; and L7 includes the sequence of ETPI-(S/E)-A (SEQ ID NO: 21) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 21.
In some embodiments, A includes the sequence of (D/N/H)-P. In some embodiments, A includes the sequence of DP.
In some embodiments, B includes the sequence of DELE (SEQ ID NO: 35).
In some embodiments, F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 22;
L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 23;
L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 24;
L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 25;
L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 26;
L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 27;
L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 28;
L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 29;
L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 30.
In some embodiments, F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
In some embodiments, F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 22;
L1 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 23;
L2 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 24;
L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 31;
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 25;
L4 comprises the sequence of: GGGSS (SEQ ID NO: 32) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 32;
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 26;
L5 comprises the sequence of: LDGES (SEQ ID NO: 33) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 33;
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 27;
L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 18;
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 28;
L7 comprises the sequence of: ETPISA (SEQ ID NO: 34) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 34;
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 29;
L8 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 30.
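The criterion of "one or two amino acid insertions, deletions, substitution mutations, or a combination thereof" relative to a reference sequence can be read as an edit distance of at most two between a candidate segment and the reference. The following Python sketch illustrates that reading only; it assumes the candidate segment has already been isolated from the full scaffold, and the function names `edit_distance` and `within_allowed_mutations` are hypothetical and not part of the disclosure.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-residue insertions,
    deletions, or substitutions needed to convert sequence a into sequence b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a residue from a
                            curr[j - 1] + 1,      # insert a residue into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


F1_REFERENCE = "TLIHTPGW"  # SEQ ID NO: 22


def within_allowed_mutations(candidate: str, reference: str,
                             max_mutations: int = 2) -> bool:
    """True if candidate differs from reference by at most max_mutations edits."""
    return edit_distance(candidate, reference) <= max_mutations
```

Setting `max_mutations=1` gives the narrower "one amino acid insertion, deletion, or substitution mutation" variant of the same check.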
In some embodiments, F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 22;
L1 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 23;
L2 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 24;
L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31 ) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 31 ;
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 25;
L4 comprises the sequence of: GGGSS (SEQ ID NO: 32) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 32;
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 26;
L5 comprises the sequence of: LDGES (SEQ ID NO: 33) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 33;
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 27;
L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 18;
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 28;
L7 comprises the sequence of: ETPISA (SEQ ID NO: 34) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 34;
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 29;
L8 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 , 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 30.
In some embodiments, F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 includes the sequence of: EVVEVG (SEQ ID NO: 31 );
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 includes the sequence of: LDGES (SEQ ID NO: 33);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
In some embodiments, F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid;
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid;
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 includes the sequence of: EVVEVG (SEQ ID NO: 31 );
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 includes the sequence of: LDGES (SEQ ID NO: 33);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid; and
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
In some embodiments, A includes the sequence of: DP;
F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid;
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid;
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 includes the sequence of: EVVEVG (SEQ ID NO: 31 );
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 includes the sequence of: LDGES (SEQ ID NO: 33);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid;
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30); and
B includes the sequence of: DELE (SEQ ID NO: 35).
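The embodiment above fixes every framework and constant loop and leaves only L1, L2, and L8 variable. The following Python sketch expresses it as a sequence template. It assumes that the segments are joined from N-terminus to C-terminus in the order listed (A, F1, L1, F2, ..., L8, F9, B) and that "any amino acid" is limited to the 20 proteinogenic residues; both are simplifying assumptions for illustration, and the names `SEGMENTS`, `build_scaffold`, and `extract_variable_loops` are hypothetical.

```python
import re

ANY = "[ACDEFGHIKLMNPQRSTVWY]"  # assumption: 20 proteinogenic amino acids only

# (segment name, literal sequence or pattern), in the order listed above
SEGMENTS = [
    ("A", "DP"),
    ("F1", "TLIHTPGW"),            # SEQ ID NO: 22
    ("L1", ANY + "{4}"),           # SEQ ID NO: 13, variable
    ("F2", "GSEADLLDGDDSTGVEY"),   # SEQ ID NO: 23
    ("L2", ANY + "{7}"),           # SEQ ID NO: 14, variable
    ("F3", "SLAGEFIGLDLG"),        # SEQ ID NO: 24
    ("L3", "EVVEVG"),              # SEQ ID NO: 31
    ("F4", "GIHFVIGAD"),           # SEQ ID NO: 25
    ("L4", "GGGSS"),               # SEQ ID NO: 32
    ("F5", "DKWTRFRLEYS"),         # SEQ ID NO: 26
    ("L5", "LDGES"),               # SEQ ID NO: 33
    ("F6", "WTTIREYDH"),           # SEQ ID NO: 27
    ("L6", "TGAPAG"),              # SEQ ID NO: 18
    ("F7", "QDVIDEDF"),            # SEQ ID NO: 28
    ("L7", "ETPISA"),              # SEQ ID NO: 34
    ("F8", "QYIRLTNLE"),           # SEQ ID NO: 29
    ("L8", ANY + "{5}"),           # SEQ ID NO: 15, variable
    ("F9", "LTFSEFAIVS"),          # SEQ ID NO: 30
    ("B", "DELE"),                 # SEQ ID NO: 35
]

SCAFFOLD_PATTERN = re.compile(
    "".join(f"(?P<{name}>{pattern})" for name, pattern in SEGMENTS)
)


def build_scaffold(l1: str, l2: str, l8: str) -> str:
    """Join the constant segments with the supplied variable loops."""
    loops = {"L1": l1, "L2": l2, "L8": l8}
    return "".join(loops.get(name, pattern) for name, pattern in SEGMENTS)


def extract_variable_loops(sequence: str):
    """Return the L1, L2, and L8 loops if the sequence fits the template, else None."""
    match = SCAFFOLD_PATTERN.fullmatch(sequence)
    if match is None:
        return None
    return {name: match.group(name) for name in ("L1", "L2", "L8")}
```

For example, `build_scaffold("GVSA", "AAAAAAA", "SSSSS")` uses arbitrary placeholder loops (not sequences from the disclosure) and yields a sequence from which `extract_variable_loops` recovers the same three loops.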
In some embodiments, L1 includes the sequence of X1X2X3X4 (SEQ ID NO: 13), wherein each of X1, X3, and X4 is, independently, any amino acid, and X2 is V.
In another aspect, featured is a protein scaffold that includes a polypeptide having at least 80% (e.g., at least 85%, 90%, 95%, 97%, or 99%) sequence identity to SEQ ID NO: 3. In some embodiments, the polypeptide includes the sequence of SEQ ID NO: 3. In some embodiments, the polypeptide does not include the sequence of SEQ ID NO: 1. In some embodiments, the polypeptide does not include the sequence of SEQ ID NO: 2.
In some embodiments, the polypeptide includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X1, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X2, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X3 relative to SEQ ID NO: 1, wherein:
X is any amino acid except the amino acid in the equivalent position in SEQ ID NO: 1;
X1 is any amino acid except R or S;
X2 is any amino acid except P or K; and
X3 is any amino acid except R or K.
In some embodiments of any of the above aspects, the protein scaffold further includes a mutation that adds a cysteine residue. In some embodiments, the protein scaffold includes a first mutation that adds a first cysteine residue and a second mutation that adds a second cysteine residue. In some embodiments, the first cysteine residue and the second cysteine residue form a disulfide bond under oxidizing conditions.
In some embodiments, the protein scaffold comprises at least one mutation selected from the group consisting of F806C, P808C, S845C, L855C, V858C, V861C, K878C, W879C, L884C, L888C, A904C, P905C, A906GC, G907C, I924C, L926C, N928C, L936C, I943C, and L948C.
In some embodiments, the protein scaffold comprises two or more mutations selected from the group consisting of F806C, P808C, S845C, L855C, V858C, V861C, K878C, W879C, L884C, L888C, A904C, P905C, A906GC, G907C, I924C, L926C, N928C, L936C, I943C, and L948C.
In some embodiments, the protein scaffold comprises a pair of cysteine mutations selected from the group consisting of K878C and G907C, K878C and A904C, V861C and I943C, P905C and L855C, S845C and L936C, W879C and N928C, L884C and L926C, F806C and L948C, V858C and L888C, K878C and A906GC, S845C and N928C, P808C and I943C, V861C and I924C, P808C and V861C, and I943C and L855C.
In some embodiments, the pair of cysteine mutations is selected from the group consisting of K878C and G907C, K878C and A904C, S845C and L936C, W879C and N928C, L884C and L926C, V858C and L888C, and K878C and A906GC (i.e., the substitution of alanine 906 with glycine and cysteine).
In some embodiments of any of the above aspects, the protein scaffold further includes a tag covalently attached to the scaffold.
In some embodiments, the tag is an affinity tag (e.g., a polyhistidine tag, e.g., 4, 5, 6, 7, 8, 9, or 10 histidines), an epitope tag, a covalent tag, or a protein tag.
In some embodiments, the tag is attached to the N-terminus or the C-terminus of the scaffold.
In some embodiments, the scaffold is conjugated to a functional group. In some embodiments, the functional group includes biotin, streptavidin or a derivative of streptavidin, a polyethylene glycol moiety, a fluorescent dye, an enzyme, a radioactive moiety, a lanthanide, or a lanthanide binding motif.
In some embodiments, the scaffold is conjugated to a lanthanide or a lanthanide binding motif. In some embodiments, the lanthanide is terbium.
In some embodiments, the scaffold is conjugated to a radioactive moiety. In some embodiments, the radioactive moiety is an α or β emitter.
In some embodiments, the functional group is conjugated to a sulfhydryl group or a primary amine.
In another aspect, featured is a polynucleotide encoding a protein scaffold as described herein, e.g., of any of the above embodiments. In some embodiments, the polynucleotide is a ribonucleotide. In some embodiments, the polynucleotide is a deoxyribonucleotide.
In another aspect, featured is a vector that includes a polynucleotide as described herein.
In another aspect, featured is a cell that includes a polynucleotide encoding the protein scaffold or a vector that includes the polynucleotide.
In another aspect, featured is a method of producing a protein scaffold as described herein, e.g., of any of the above embodiments. The method includes the steps of (a) providing a cell transformed with a polynucleotide encoding the protein scaffold or a vector that includes the polynucleotide; and (b) culturing the transformed cell under conditions for expressing the polynucleotide, wherein the culturing results in expression of the protein scaffold. The method may further include (c) isolating the protein scaffold or using the protein scaffold to bind a target.
In another aspect, featured is a particle that includes the protein scaffold of any of the above embodiments. In some embodiments, the particle is a magnetic particle.
In another aspect, featured is a resin that includes a plurality of the particles, e.g., containing the protein scaffold.
In another aspect, featured is a column (e.g., a chromatography column) containing the particles or the resin, e.g., conjugated to the scaffold.
In another aspect, featured is a method of purifying a target molecule from a plurality of molecules. The method includes (a) providing a sample that includes a mixture of the target molecule and the plurality of molecules; (b) contacting the sample with the protein scaffold of any one of the above embodiments, wherein the scaffold specifically binds to the target molecule; and (c) separating the target molecule bound to the protein scaffold from the plurality of molecules.
In some embodiments, the step of separating includes immobilizing the protein scaffold.
In some embodiments, the protein scaffold is conjugated to a particle. In some embodiments, the particle includes a magnetic bead. In some embodiments, the protein scaffold is conjugated to a resin or monolith including a plurality of the particles.
Definitions
The Carbohydrate Binding Module Family 32 (CBM32) scaffold of SEQ ID NO: 1 is derived from a single protein domain of Clostridium perfringens hyaluronidase (NagH), a multi-domain enzyme consisting of 1627 amino acids. Amino acid residue 1 of SEQ ID NO: 1 corresponds to amino acid residue 807 of NagH, and amino acid residue 140 of SEQ ID NO: 1 corresponds to amino acid residue 946 of NagH. Amino acid positions and mutations described herein generally relate to the position on the corresponding full length NagH unless otherwise specified.
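Because mutations are named by full-length NagH position, converting a label such as K878C to a position within SEQ ID NO: 1 requires subtracting a fixed offset of 806 (NagH residue 807 is residue 1, and NagH residue 946 is residue 140). The following Python sketch illustrates that bookkeeping; the helper names `parse_mutation` and `nagh_to_seq1` are hypothetical, and the label grammar (one original residue, a position, then one or more replacement letters with an optional digit suffix) is an assumption inferred from labels used in this document.

```python
import re

NAGH_OFFSET = 806     # NagH residue 807 corresponds to residue 1 of SEQ ID NO: 1
SEQ1_LENGTH = 140     # NagH residue 946 corresponds to residue 140

# Assumed label grammar, e.g. "K878C", "A906GC", "S815X1"
MUTATION_RE = re.compile(r"([A-Z])(\d+)([A-Z]+\d*)")


def parse_mutation(label: str):
    """Split a label into (original residue, NagH position, replacement)."""
    match = MUTATION_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def nagh_to_seq1(nagh_position: int) -> int:
    """Convert a NagH residue number to a 1-based position in SEQ ID NO: 1."""
    index = nagh_position - NAGH_OFFSET
    if not 1 <= index <= SEQ1_LENGTH:
        raise ValueError(f"NagH position {nagh_position} lies outside SEQ ID NO: 1")
    return index
```

Under this mapping, `nagh_to_seq1(878)` gives position 72 of SEQ ID NO: 1. Positions outside 807 to 946 are rejected because they fall outside the domain as defined above.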
The term “constant region,” as used herein, generally refers to a region of a binding scaffold that does not include the variable loop regions involved in target binding. For example, a constant region may include a framework region (e.g., F1-F9) or a loop region (e.g., L3-L7) that is not one of the three loops mutagenized for target binding (e.g., L1, L2, and L8). A constant region may have sequence variability.
The term “non-naturally occurring amino acid,” as used herein, means non-proteinogenic amino acids. Examples of non-naturally occurring amino acids include D-amino acids; an amino acid having an acetylaminomethyl group attached to a sulfur atom of a cysteine; a pegylated amino acid; the omega amino acids of the formula NH2(CH2)nCOOH where n is 2-6, neutral nonpolar amino acids, such as sarcosine, t-butyl alanine, t-butyl glycine, N-methyl isoleucine, and norleucine; oxymethionine; phenylglycine; citrulline; methionine sulfoxide; cysteic acid; ornithine; diaminobutyric acid; 3- aminoalanine; 3-hydroxy-D-proline; 2,4-diaminobutyric acid; 2-aminopentanoic acid; 2-aminooctanoic acid, 2-carboxy piperazine; piperazine-2-carboxylic acid, 2-amino-4-phenylbutanoic acid; 3-(2- naphthyl)alanine, and hydroxyproline. Other amino acids are a-aminobutyric acid, a-amino-a- methylbutyrate, aminocyclopropane-carboxylate, aminoisobutyric acid, aminonorbornyl-carboxylate, L- cyclohexylalanine, cyclopentylalanine, L-N-methylleucine, L-N-methylmethionine, L-N-methylnorvaline, L- N-methylphenylalanine, L-N-methylproline, L-N-methylserine, L-N-methyltryptophan, D-ornithine, L-N- methylethylglycine, L-norleucine, a-methyl-aminoisobutyrate, a-methylcyclohexylalanine, D-a-
methylalanine, D-a-methylarginine, D-a-methylasparagine, D-a-methylaspartate, D-a-methylcysteine, D- a-methylglutarnine, D-a-methylhistidine, D-a-methylisoleucine, D-a-methylleucine, D-a-methyllysine, D-a- methylmethionine, D-a-methylornithine, D-a-methylphenylalanine, D-a-methylproline, D-a-methylserine, D-N-methylserine, D-a-methylthreonine, D-a-methyltryptophan, D-a-methyltyrosine, D-a-methylvaline, D- N-methylalanine, D-N-methylarginine, D-N-methylasparagine, D-N-methylaspartate, D-N-methylcysteine, D-N-rnethylglutamine, D-N-methylglutamate, D-N-methylhistidine, D-N-methylisoleucine, D-N- methylleucine, D-N-methyllysine, N-methylcyclohexylalanine, D-N-methylornithine, N-methylglycine, N- methylaminoisobutyrate, N-(1 -methylpropyl)glycine, N-(2-methylpropyl)glycine, D-N-methyltryptophan, D- N-methyltyrosine, D-N-methylvaline, y-aminobutyric acid, L-t-butylglycine, L-ethylglycine, L- homophenylalanine, L-a-methylarginine, L-a-methylaspartate, L-a-methylcysteine, L-a-methylglutamine, L-a-methylhistidine, L-a-methylisoleucine, L-a-methylleucine, L-a-methylmethionine, L-a-methylnorvaline, L-a-methylphenylalanine, L-a-methylserine, L-a-methyltryptophan, L-a-methylvaline, N-(N-(2,2- diphenylethyl) carbamylmethylglycine, 1 -carboxy-1 -(2,2-diphenyl-ethylamino) cyclopropane, 4- hydroxyproline, ornithine, 2-aminobenzoyl (anthraniloyl), D-cyclohexylalanine, 4-phenyl-phenylalanine, L- citrulline, a-cyclohexylglycine, L-1 ,2,3,4-tetrahydroisoquinoline-3-carboxylic acid, L-thiazolidine-4- carboxylic acid, L-homotyrosine, L-2-furylalanine, L-histidine (3-methyl), N-(3-guanidinopropyl)glycine, O- methyl-L-tyrosine, O-glycan-serine, meta-tyrosine, nor-tyrosine, L-N,N',N"-trimethyllysine, homolysine, norlysine, N-glycan asparagine, 7-hydroxy-1 ,2,3,4-tetrahydro-4-fluorophenylalanine, 4- methylphenylalanine, bis-(2-picolyl)amine, pentafluorophenylalanine, indoline-2-carboxylic acid, 2- aminobenzoic acid, 3-amino-2-naphthoic acid, asymmetric dimethylarginine, 
L-tetrahydroisoquinoline-1 - carboxylic acid, D-tetrahydroisoquinoline-1 -carboxylic acid, 1 -amino-cyclohexane acetic acid, D/L- allylglycine, 4-aminobenzoic acid, 1 -amino-cyclobutane carboxylic acid, 2 or 3 or 4-aminocyclohexane carboxylic acid, 1 -amino-1 -cyclopentane carboxylic acid, 1 -aminoindane-1 -carboxylic acid, 4-amino- pyrrolidine-2-carboxylic acid, 2-aminotetraline-2-carboxylic acid, azetidine-3-carboxylic acid, 4-benzyl- pyrolidine-2-carboxylic acid, tert-butylglycine, b-(benzothiazolyl-2-yl)-alanine, b-cyclopropyl alanine, 5,5- dimethyl-1 ,3-thiazolidine-4-carboxylic acid, (2R,4S)4-hydroxypiperidine-2-carboxylic acid, (2S,4S) and (2S,4R)-4-(2-naphthylmethoxy)-pyrolidine-2-carboxylic acid, (2S,4S) and (2S,4R)4-phenoxy-pyrrolidine-2- carboxylic acid, (2R,5S)and(2S,5R)-5-phenyl-pyrrolidine-2-carboxylic acid, (2S,4S)-4-amino-1 -benzoyl- pyrrolidine-2-carboxylic acid, t-butylalanine, (2S,5R)-5-phenyl-pyrrolidine-2-carboxylic acid, 1 - aminomethyl-cyclohexane-acetic acid, 3,5-bis-(2-amino)ethoxy-benzoic acid, 3,5-diamino-benzoic acid, 2- methylamino-benzoic acid, N-methylanthranylic acid, L-N-methylalanine, L-N-methylarginine, L-N- methylasparagine, L-N-methylaspartic acid, L-N-methylcysteine, L-N-methylglutamine, L-N- methylglutamic acid, L-N-methylhistidine, L-N-methylisoleucine, L-N-methyllysine, L-N-methylnorleucine, L-N-methylornithine, L-N-methylthreonine, L-N-methyltyrosine, L-N-methylvaline, L-N-methyl-t- butylglycine, L-norvaline, a-methyl-y-aminobutyrate, 4,4'-biphenylalanine, a-methylcylcopentylalanine, a- methyl-a-napthylalanine, a-methylpenicillamine, N-(4-aminobutyl)glycine, N-(2-aminoethyl)glycine, N-(3- aminopropyl)glycine, N-amino-a-methylbutyrate, a-napthylalanine, N-benzylglycine, N-(2- carbamylethyl)glycine, N-(carbamylmethyl)glycine, N-(2-carboxyethyl)glycine, N-(carboxymethyl)glycine, N-cyclobutylglycine, N-cyclodecylglycine, N-cycloheptylglycine, N-cyclohexylglycine, N-cyclodecylglycine, N-cylcododecylglycine, N-cyclooctylglycine, 
N-cyclopropylglycine, N-cycloundecylglycine, N-(2,2- diphenylethyl)glycine, N-(3,3-diphenylpropyl)glycine, N-(3-guanidinopropyl)glycine, N-(1 - hydroxyethyl)glycine, N-(hydroxyethyl))glycine, N-(imidazolylethyl))glycine, N-(3-indolylyethyl)glycine, N-
methyl-Y-aminobutyrate, D-N-methylmethionine, N-methylcyclopentylalanine, D-N-methylphenylalanine, D-N-methylproline, D-N-methylthreonine, N-(1 -methylethyl)glycine, N-methyl-napthylalanine, N- methylpenicillamine, N-(p-hydroxyphenyl)glycine, N-(thiomethyl)glycine, penicillamine, L-a-methylalanine, L-a-methylasparagine, L-a-methyl-t-butylglycine, L-methylethylglycine, L-a-methylglutamate, L-a- methylhomophenylalanine, N-(2-methylthioethyl)glycine, L-a-methyllysine, L-a-methylnorleucine, L-a- methylornithine, L-a-methylproline, L-a-methylthreonine, L-a-methyltyrosine, L-N-methyl- homophenylalanine, N-(N-(3,3-diphenylpropyl) carbamylmethylglycine, L-pyroglutamic acid, D- pyroglutamic acid, O-methyl-L-serine, O-methyl-L-homoserine, 5-hydroxylysine, a-carboxyglutamate, phenylglycine, L-pipecolic acid (homoproline), L-homoleucine, L-lysine (dimethyl), L-2-naphthylalanine, L- dimethyldopa or L-dimethoxy-phenylalanine, L-3-pyridylalanine, L-histidine (benzoyloxymethyl), N- cycloheptylglycine, L-diphenylalanine, O-methyl-L-homotyrosine, L-p-homolysine, O-glycan-threoine, Ortho-tyrosine, L-N,N'-dimethyllysine, L-homoarginine, neotryptophan, 3-benzothienylalanine, isoquinoline-3-carboxylic acid, diaminopropionic acid, homocysteine, 3,4-dimethoxyphenylalanine, 4- chlorophenylalanine, L-1 ,2,3,4-tetrahydronorharman-3-carboxylic acid, adamantylalanine, symmetrical dimethylarginine, 3-carboxythiomorpholine, D-1 ,2,3,4-tetrahydronorharman-3-carboxylic acid, 3- aminobenzoic acid, 3-amino-1 -carboxymethyl-pyridin-2-one, 1 -amino-1 -cyclohexane carboxylic acid, 2- aminocyclopentane carboxylic acid, 1 -amino-1 -cyclopropane carboxylic acid, 2-aminoindane-2-carboxylic acid, 4-amino-tetrahydrothiopyran-4-carboxylic acid, azetidine-2-carboxylic acid, b-(benzothiazol-2-yl)- alanine, neopentylglycine, 2-carboxymethyl piperidine, b-cyclobutyl alanine, allylglycine, diaminopropionic acid, homo-cyclohexyl alanine, (2S,4R)- 4-hydroxypiperidine-2-carboxylic acid, octahydroindole-2- carboxylic 
acid, (2S,4R) and (2S,4R)-4-(2-naphthyl), pyrrolidine-2-carboxylic acid, nipecotic acid, (2S,4R) and (2S,4S)-4-(4-phenylbenzyl) pyrrolidine-2-carboxylic acid, (3S)-1-pyrrolidine-3-carboxylic acid, (2S,4S)-4-tritylmercapto-pyrrolidine-2-carboxylic acid, (2S,4S)-4-mercaptoproline, t-butylglycine, N,N-bis(3-aminopropyl)glycine, 1-amino-cyclohexane-1-carboxylic acid, N-mercaptoethylglycine, and selenocysteine. In some embodiments, amino acid residues may be charged or polar. Charged amino acids include arginine, lysine, aspartic acid, or glutamic acid, or non-naturally occurring analogs thereof. Polar amino acids include glutamine, asparagine, histidine, serine, threonine, tyrosine, methionine, or tryptophan, or non-naturally occurring analogs thereof. It is specifically contemplated that in some embodiments, a terminal amino group in the amino acid may be an amido group or a carbamate group.
As used herein, the term “percent (%) identity” refers to the percentage of amino acid residues of a candidate sequence, e.g., a protein scaffold, that are identical to the amino acid residues of a reference sequence, e.g., a wild-type CBM32 polypeptide, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent identity (i.e., gaps can be introduced in one or both of the candidate and reference sequences for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). Alignment for purposes of determining percent identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, ALIGN, or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared. In some embodiments, the percent amino acid sequence identity of a given candidate sequence to, with, or against a given reference sequence (which can alternatively be phrased as a given candidate sequence that has or includes a certain percent amino acid sequence identity to, with, or against a given reference sequence) is calculated as follows:
100 × (A/B)
where A is the number of amino acid residues scored as identical in the alignment of the candidate sequence and the reference sequence, and B is the total number of amino acid residues in the reference sequence. In some embodiments where the length of the candidate sequence does not equal the length of the reference sequence, the percent amino acid sequence identity of the candidate sequence to the reference sequence would not equal the percent amino acid sequence identity of the reference sequence to the candidate sequence.
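The calculation above, including its asymmetry when the two sequences differ in length, can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned (e.g., by one of the programs named above) and are supplied as equal-length strings with "-" marking gaps; the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_candidate: str, aligned_reference: str) -> float:
    """Compute 100 x (A/B) for a pre-aligned pair of sequences.

    A = number of aligned positions with identical residues (gaps never count).
    B = total number of residues in the reference sequence (gaps excluded).
    """
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    identical = sum(
        1
        for cand, ref in zip(aligned_candidate, aligned_reference)
        if cand == ref and cand != "-"
    )
    reference_length = sum(1 for ref in aligned_reference if ref != "-")
    if reference_length == 0:
        raise ValueError("reference sequence has no residues")
    return 100.0 * identical / reference_length


# Illustrative alignment: an 8-residue sequence and a 6-residue truncation of it.
full_length = "TLIHTPGW"
truncated = "TLIHTP--"
```

With these inputs, `percent_identity(truncated, full_length)` uses B = 8 and `percent_identity(full_length, truncated)` uses B = 6, so the two directions give different values even though A = 6 in both cases.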
Brief Description of the Drawings
FIG. 1 is a schematic drawing showing an outline of the protein engineering campaign to produce a member of the nC-B class of nanoCLAMPs. The starting nanoCLAMP was anti-SUMO clone SMT3-A1 , a member of the nC-A class of nanoCLAMPs. SMT3A1 was mutated over 7 rounds. At the conclusion of each round, the performance of clone(s) combining different mutations from the round was assessed by DSF and SEC. The end product is clone P2788, whose constant regions served as the basis for the nC- B class of nanoCLAMPs.
FIG. 2 is a space filled model of P2788, an example of the nC-B class of nanoCLAMP with mutations in P2788 mapped on to CBM-32-3 crystal structure and the sequence of the constant region of the nC-A class of nanoCLAMPs. The constant regions of clone P2788 are the basis for the nC-B class of nanoCLAMPs. Side chains of mutated positions are labeled in green; side of chains of variable loops are shown in red. Other side chains are shown in light gray. Backbone residues are shown in dark gray. The alignment compares the constant regions of the nC-A and nC-B classes of nanoCLAMPs. Residues denoted with black, bolded text represent nC-A positions where at least one mutation was tested (top row). Mutations were tested for 58% of positions in the constant regions (72 out of 124). Residues denoted with bolded text represent mutations in P2788 (bottom row). In P2788, 24% of positions in the constant regions are mutated relative to nC-A (30 out of 124).
FIG. 3 is a model showing the superimposition of the crystal structures for CBM32-2 (basis for the nC-A class of nanoCLAMPs) and the AlphaFold model for P2788 (basis for the nC-B class). CBM32-2 (PDB accession 2W1Q) and P2788 were aligned with jFATCAT (rigid) on the RCSB server, resulting in a high TM-score (0.95). Backbone deviations are apparent and expected for the loops, which have different amino acid sequences.
FIG. 4 is a graph showing differential scanning fluorimetry analysis of SMT3-A1 (nC-A class) and P2788 (nC-B class). Both clones show classically shaped melting curves with low initial fluorescence. The 30 mutations in P2788 increase its Tm by 24 °C relative to SMT3-A1.
FIGS. 5A-5F are gels and a graph showing protease resistance of nanoCLAMPs of the nC-A and nC-B classes. FIG. 5A shows SDS-PAGE analysis of SMT3-A1 (nC-A class) and P2788 (same variable loops as SMT3-A1 but with constant regions of the nC-B class) after exposure to 16 hr incubation with trypsin or chymotrypsin. FIGS. 5B-5D show SDS-PAGE analysis of time course tryptic digestions of SMT3-A1, P2788 and P2808. P2788 and P2808 have the same constant regions (nC-B class), but different loops. P2788 and P2808 were resistant to tryptic digest for over 20 hours. FIG. 5E shows quantitative densitometry analysis of the time course-stained gels. FIG. 5F shows SDS-PAGE analysis of members of the nC-A class of nanoCLAMPs (SMT3-A1) and nC-B class (P2788, P2808, P2809, and P2811) following 16 h tryptic digestions.
FIG. 6 is a set of size exclusion chromatograms showing monodispersity and melting temperature analysis of anti-SUMO nanoCLAMPs of the nC-B class. Size exclusion chromatography (left panel) and differential scanning fluorescence (right panel) of nanoCLAMPs P2808, P2809 and P2811.
FIG. 7 is a graph showing dynamic binding capacity of SMT3-A1 resin (nC-A class) and P2808 resin (nC-B class). Breakthrough curves were generated by loading a solution of 0.2 mg/ml Sumo-GFP in PBS onto 0.6 ml of packed resin in a column (3 cm height x 5 mm ID) at a flowrate of 0.5 ml/min and measuring the fluorescence of the eluate. The percent fluorescence of the load was calculated by dividing the eluate fluorescence by the load fluorescence. The dynamic binding capacity (DBC) was calculated with the following formula: DBC = (Vx - Vdelay) * c / Vresin, where Vx is the volume of eluate collected, Vdelay is the elution volume of the load under non-binding conditions, c is the concentration of target in the load, and Vresin is the volume of the packed resin in the column. The P2808 resin has a dynamic binding capacity of 10 mg/ml resin (240 nmol/ml resin).
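The dynamic binding capacity formula given for FIG. 7 can be sketched in Python as follows. The load concentration (0.2 mg/ml) and resin volume (0.6 ml) are taken from the description above; the values of Vx and Vdelay are hypothetical and were chosen only so that the result reproduces the stated 10 mg/ml, and the 42 kD mass used for the molar conversion is the SUMO-GFP size given in the description of FIG. 13. Function and variable names are illustrative.

```python
def dynamic_binding_capacity(v_x_ml, v_delay_ml, c_mg_per_ml, v_resin_ml):
    """DBC = (Vx - Vdelay) * c / Vresin, in mg of target per ml of packed resin."""
    return (v_x_ml - v_delay_ml) * c_mg_per_ml / v_resin_ml


def mg_per_ml_to_nmol_per_ml(mg_per_ml, mw_kda):
    """Convert a mass capacity to a molar capacity (1 kDa equals 1 mg per micromole)."""
    return mg_per_ml / mw_kda * 1000.0


# Hypothetical breakthrough volumes (assumptions, not measured values).
dbc = dynamic_binding_capacity(v_x_ml=31.0, v_delay_ml=1.0,
                               c_mg_per_ml=0.2, v_resin_ml=0.6)
dbc_nmol = mg_per_ml_to_nmol_per_ml(dbc, mw_kda=42.0)
```

Under these assumptions the molar capacity comes out near 238 nmol/ml, consistent with the rounded figure of 240 nmol/ml resin.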
FIGS. 8A and 8B are gels showing performance of P2808 resin in single step affinity chromatography purification of GFP-SUMO from spiked lysates in resin-limiting and protein-limiting scenarios. Identical columns were loaded under conditions that were 33% above (FIG. 8A) or 58% below (FIG. 8B) the column's dynamic binding capacity. E. coli lysates with spiked-in target protein (SUMO-GFP) were used as the load. After loading and washing, bound proteins were eluted with 3 M imidazole pH 8. Total protein loaded on SDS-PAGE in FIG. 8A: Lysate = 32 µg, Spiked Lysate = 34 µg, FT = 47 µg, Eluate = 6 µg; FIG. 8B: Lysate = 17 µg, Spiked Lysate = 17 µg, FT = 21 µg, Wash = NA, Eluate = 3 µg. Metrics of purifications in FIGS. 8A and 8B are tabulated in Table 6.
FIG. 9 is a graph showing the effect of sodium hydroxide treatment on binding capacity of nanoCLAMP capture agents of the nC-A and nC-B class. The binding capacities of resins with capture agents of the nC-A class (SMT3-A1, P1519, P1533) and the nC-B class (P2808, P2809, P2811) were determined after each of 22 cycles of purification of GFP-SUMO from a spiked E. coli lysate, followed by washing, eluting, and cleaning in place with 0.1 M NaOH (10 min contact time). The % of starting binding capacity was determined by dividing the eluate fluorescence by the load fluorescence. The selectivity was determined by analysis of the eluates on SDS-PAGE (FIG. 13).
FIGS. 10A-10D are graphs and gels showing the effect of organic solvent and autoclaving on the binding capacity of resins made with nanoCLAMPs of the nC-B class. Resins P2808, P2809, and P2811 (nC-B class) and Resin SMT3-A1 (nC-A class) were incubated in 100% DMF for 2 h (FIGS. 10A and 10B) or autoclaved (105-minute liquid steam cycle, including 30 min exposure to 120 °C and 20 p.s.i.) (FIGS. 10C and 10D) and then re-equilibrated in fresh buffer and tested in affinity chromatography purification of SUMO-GFP from spiked E. coli lysate. Binding capacity % of untreated was determined by dividing the eluate fluorescence by the control (non-treated) eluate fluorescence. Specificity was determined by Coomassie staining of SDS-PAGE (FIGS. 10B and 10D).
FIG. 11 is a graph showing kinetic thermal stability of nC-A (SMT3-A1) and nC-B (P2808, P2809, and P2811) nanoCLAMPs. nanoCLAMPs were heat treated, cooled and centrifuged. The supernatant was tested for binding activity by biolayer interferometry. The percent of starting response was measured as the amplitude of binding divided by that obtained by the control sample (held at 20 °C during heat treatments).
FIG. 12 is a gel showing static binding capacity of P2808 resin. Affinity resin prepared with P2808 (nC-B class) was incubated with a spiked E. coli lysate; washed; eluted with 3 M imidazole, pH 8; buffer exchanged; and quantified by A280.
FIG. 13 is a set of gels showing the effect of sodium hydroxide treatment on specificity of nanoCLAMP capture agents of the nC-A and nC-B class. nC-A (SMT3-A1, P1519, P1533) and nC-B (P2808, P2809, P2811) Sumo-binding nanoCLAMPs were covalently conjugated to 6% cross-linked agarose resin and then used to purify a Sumo-GFP fusion from a crude E. coli lysate. Each cycle consisted of a crude Sumo-GFP-spiked sample load, wash, elution with 3 M imidazole (collected), wash, 0.1 M NaOH regeneration (10 min contact time per cycle), and a 5 min refolding wash. The target protein in the eluate was quantified by fluorescence spectroscopy (FIG. 9), and the percent yield calculated by dividing the fluorescence by the initial eluate fluorescence. The purity of the eluted target from each cycle was assessed by SDS-PAGE and stained with Coomassie. Cycle number is shown for each lane, L = Load, M = marker. The prominent band in the eluates of each gel is SUMO-GFP (42 kD).
FIGS. 14A and 14B are a graph and a gel testing the stability of Resin P2808 (nC-B class) through >20 low pH elution cycles. SUMO-GFP was spiked into crude E. coli lysate, loaded onto P2808 resin, washed, and eluted with 0.1 M Citrate, pH 2.5, followed by regeneration with 0.1 N NaOH with 1 min contact time per cycle, and a re-equilibration wash for 5 min. FIG. 14A shows the target protein in the eluate, which was quantified by densitometry of the Coomassie stained SDS-PAGE gel of FIG. 14B because the fluorescence of the eluate was destroyed by the low pH. Percent yield was calculated by dividing the band density by the initial eluate band density. The purity of the eluted target from each cycle was assessed by SDS-PAGE of FIG. 14B.
FIG. 15 is a graph showing nanoCLAMPs stably binding terbium (Tb). SMT3-A1 (nC-A class), P2808 (nC-B class), and a negative control protein (recombinant SMT3) were incubated with CaCl2 or TbCl3 overnight, and then buffer exchanged to remove unbound metals. The buffer exchanged proteins were analyzed by time resolved fluorescence 24 h post buffer exchange (Ex/Em: 350 nm/544 nm), 200 µsec delay.
FIG. 16 is a model showing the front, back, top, and bottom faces of the nC-B class of nanoCLAMP. A, B, F1-F9, and L1-L8 are mapped on clone P2808 sequence and 3D-modeled to illustrate the locations of each region of the scaffold.
FIG. 17 is a model showing an alignment of P2808, an example of the nC-B class, in which Loop 1 is replaced with a (G4S)3 sequence or is removed.
FIG. 18 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 2 is replaced with a (G4S)3 sequence or is removed.
FIG. 19 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 4 is replaced with a (G4S)3 sequence or is removed.
FIG. 20 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 6 is replaced with a (G4S)3 sequence or GG.
FIG. 21 is a model showing the nC-B nanoCLAMP P2808 in which Loop 8 is replaced with a GGGGG (SEQ ID NO: 36), GGGG (SEQ ID NO: 37), GGG, GG, or G, or is removed.
FIG. 22 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 8 is replaced with a (G4S)3 sequence or is removed.
FIG. 23 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 3 is replaced with a (G4S)3 sequence or is removed.
FIG. 24 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 5 is replaced with a (G4S)3 sequence or is removed.
FIG. 25 is a model showing the nC-B nanoCLAMP P2808 in which Loop 7 is replaced with a GGGGG (SEQ ID NO: 36), GGGG (SEQ ID NO: 37), GGG, GG, or G, or is removed.
FIG. 26 is a model showing an alignment of the nC-B nanoCLAMP P2808 in which Loop 8 is replaced with a (G4S)3 sequence or G.
FIG. 27 is a gel showing introduction of artificial disulfide bonds into clones P2808 and P2960. SDS-PAGE analysis of P2808 and P2960 variants, mutated to contain pairs of adjacent Cysteines, under oxidizing and reducing conditions. Purified proteins were treated with SDS sample buffer containing (first lane of each set) or lacking (second lane of each set) reducing agent (DTT). The presence of faster migrating species indicates disulfide bonding in samples lacking DTT, likely due to more compact folding and a smaller hydrodynamic radius. P2808 and P2960 contain no Cys residues and migrate at the same rate in oxidizing and reducing sample buffer, as expected. BSA, which contains 17 disulfide bonds, is included as a control for the activity of the reducing agent (DTT).
FIG. 28 is a graph showing that artificial disulfide improves thermal stability of P3015 by 9 °C. The graph shows differential scanning fluorescence (DSF) analysis of melting temperature of reduced and oxidized P3015.
Detailed Description
Immunoaffinity chromatography is an established laboratory-scale technique for the isolation of target proteins with high yield and purity. However, properties of antibodies and nanobodies often make immunoaffinity chromatography incompatible with conditions typical of many industrial scale processes. To overcome these limitations, the present invention features an antibody-mimetic scaffold, called a nanoCLAMP, that can be used in process-scale affinity chromatography. The 16 kD antibody mimetic is based on a bacterial, cysteine-free, β-sandwich protein with a structure analogous to immunoglobulin variable domains. Like antibodies and other antibody mimetics, the first generation of nanoCLAMPs generally showed high selectivity and affinity but also suffered from sensitivity to high temperature, digestion by proteases, and inactivation by alkali. The present invention solves this problem by engineering a plurality of mutations in the nanoCLAMP scaffold to improve the general robustness of nanoCLAMPs and resistance to extreme conditions.
This mutated scaffold serves as the basis for an improved nanoCLAMP class, called the nC-B class. Phage display was used to generate hundreds of nC-B capture agents recognizing diverse targets. The resulting immunoaffinity capture agents typically had a Kd of < 80 nM, a Tm of > 70 °C and a t1/2 in 0.1 mg/ml trypsin of > 20 hours. The nC-B capture agents also maintained their binding capacity and selectivity over 20 purification cycles, each including 10 minutes of cleaning in place with 0.1 M NaOH. Affinity chromatography resins made with nC-B capture agents supported efficient single-step purifications from crude mixtures. Target proteins could be eluted with either 3 M imidazole, pH 8 or 0.1 M sodium citrate, pH 2.5. Furthermore, affinity chromatography resins with nC-B capture agents remained functional after exposure to 100% DMF and autoclaving. The robust nanoCLAMP scaffold described herein allows for the development of custom, high performance affinity chromatography resins that are compatible with the harsh conditions of process-scale applications and can be adapted to a wide diversity of target substrates.
Protein Scaffolds
The scaffolds described herein are derived from the Carbohydrate Binding Module Family 32 (CBM32) protein domain of Clostridium perfringens hyaluronidase (NagH), a multi-domain enzyme consisting of 1627 amino acids. Amino acid residue 1 of SEQ ID NO: 1 corresponds to amino acid residue 807 of NagH, and amino acid residue 140 of SEQ ID NO: 1 corresponds to amino acid residue 946 of NagH. Amino acid positions and mutations described herein generally relate to the position on the corresponding full length NagH unless otherwise specified. The WT sequence of CBM32 is shown below:
CBM32 SEQ (SEQ ID NO: 1) NPSLIRSESWQVYEGNEANLLDGDDNTGVWYKTLNGDTSLAGEFIGLDLGKEIKLDGIRFVIGKNGGGSSDKWNKFKLEYSLDNESWTTIKEYDKTGAPAGKDVIEESFETPISAKYIRLTNMENINKWLTFSEFAIVSD
Previous work identified a scaffold in which three or five loop regions (L1, L2, and L8; or L1, L2, L4, L6, and L8) were mutagenized to form binders to diverse protein targets instead of carbohydrates, a property not expected for a carbohydrate binding module. In some embodiments, the protein scaffold does not retain carbohydrate binding activity, e.g., of the native CBM scaffold. Loop L1 corresponds to residues 817-820, loop L2 corresponds to residues 838-844, and loop L8 corresponds to residues 931-935. The original scaffold (nC-A) is shown below and contains a single M929L mutation relative to SEQ ID NO: 1. X denotes a variable loop residue, and each X may independently be any residue.
nC-A Scaffold sequence (SEQ ID NO: 2) NPSLIRSESWXXXXGNEANLLDGDDNTGVWYXXXXXXXSLAGEFIGLDLGKEIKLDGIRFVIGKNGGGSSDKWNKFKLEYSLDNESWTTIKEYDKTGAPAGKDVIEESFETPISAKYIRLTNLEXXXXXLTFSEFAIVSD
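The residue numbering used above can be illustrated with a short Python sketch. It assumes only what is stated in the text: residue 1 of SEQ ID NO: 1 is NagH residue 807, loops L1, L2, and L8 span NagH residues 817-820, 838-844, and 931-935, and the nC-A scaffold carries M929L. The helper names are hypothetical. The sketch extracts the wild-type loop residues and derives the nC-A scaffold string by masking the three loops with X.

```python
# SEQ ID NO: 1 (CBM32), written in chunks for readability.
CBM32 = (
    "NPSLIRSESW" "QVYE" "GNEANLLDGDDNTGVWY" "KTLNGDT" "SLAGEFIGLDLG"
    "KEIKLD" "GIRFVIGKN" "GGGSS" "DKWNKFKLEYS" "LDNES" "WTTIKEYDK"
    "TGAPAG" "KDVIEESF" "ETPISA" "KYIRLTNME" "NINKW" "LTFSEFAIVSD"
)
NAGH_OFFSET = 806  # residue 1 of SEQ ID NO: 1 corresponds to NagH residue 807
LOOPS = {"L1": (817, 820), "L2": (838, 844), "L8": (931, 935)}


def to_index(nagh_position: int) -> int:
    """Convert a full-length NagH position to a 0-based index into SEQ ID NO: 1."""
    return nagh_position - NAGH_OFFSET - 1


def loop_residues(seq: str, loop: str) -> str:
    """Return the residues of the named loop using NagH numbering."""
    start, end = LOOPS[loop]
    return seq[to_index(start):to_index(end) + 1]


def derive_nc_a(seq: str) -> str:
    """Apply M929L and replace the L1, L2, and L8 residues with X."""
    residues = list(seq)
    if residues[to_index(929)] != "M":
        raise ValueError("expected M at NagH position 929")
    residues[to_index(929)] = "L"
    for start, end in LOOPS.values():
        for position in range(start, end + 1):
            residues[to_index(position)] = "X"
    return "".join(residues)


nc_a = derive_nc_a(CBM32)
```

In this sketch the derived string matches the nC-A scaffold sequence of SEQ ID NO: 2 as written above, which is a convenient consistency check on the +806 offset.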
The current scaffold (nC-B) described herein is based on the exemplary scaffold of SEQ ID NO: 3 shown below. X denotes a variable loop residue, and each X may independently be any residue.
nC-B Scaffold sequence full length (SEQ ID NO: 3) DPTLIHTPGWXXXXGSEADLLDGDDSTGVEYXXXXXXXSLAGEFIGLDLGEVVEVGGIHFVIGADGGGSSDKWTRFRLEYSLDGESWTTIREYDHTGAPAGQDVIDEDFETPISAQYIRLTNLEXXXXXLTFSEFAIVSDELE
The protein scaffolds described herein include (e.g., consist of) framework regions (F) and loop regions (L). The scaffolds generally have the structure of:
A-F1-L1-F2-L2-F3-L3-F4-L4-F5-L5-F6-L6-F7-L7-F8-L8-F9-B.
F1-F9 correspond to framework regions 1-9, and L1-L8 correspond to loop regions 1-8. Framework regions and loop regions were selected based on where beta strands turn to loops or where beta strands show a sharp turn from the plane of the strand’s beta sheet (see FIGS. 2 and 16). The N- and C-termini of the scaffold, A and B, may each independently be present (e.g., contain one or more amino acids) or absent.
F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 4;
L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 5;
L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 6;
L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 7;
L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 8;
L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 9;
L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 10;
L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 11;
L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 12.
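The framework consensus sequences of SEQ ID NOs: 4-12 and the A-F1-L1-...-F9-B layout lend themselves to a pattern-matching sketch. The Python code below converts each consensus string, in which "(T/S)" means T or S, into a regular expression, joins the nine frameworks with loop groups of 0 to 20 residues, and splits a scaffold sequence into its regions. This is an illustration only: it matches the exact consensus and does not allow the one or two additional changes permitted above, the function names are hypothetical, and the example sequence is SEQ ID NO: 3 with the L3 stretch written as EVVEVG (the six residues implied by substitutions K857E through D862G), which is an assumed reading.

```python
import re

FRAMEWORKS = {
    "F1": "(T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W",
    "F2": "G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y",
    "F3": "S-(L/V)-AGEFIGLDLG",
    "F4": "G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N)",
    "F5": "DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS",
    "F6": "WTTI-(R/K/H/Q)-EYD-(H/K/R/Q)",
    "F7": "(Q/K)-DVI-(D/E)-E-(D/S)-F",
    "F8": "(Q/K/R)-YIRLTNLE",
    "F9": "LTFSEFA-(I/V)-VS",
}


def consensus_to_regex(consensus: str) -> str:
    """Turn '(T/S)-LI-...' into '[TS]LI...'; hyphens are separators only."""
    body = re.sub(r"\(([A-Z/]+)\)",
                  lambda m: "[" + m.group(1).replace("/", "") + "]",
                  consensus)
    return body.replace("-", "")


def scaffold_regex() -> "re.Pattern[str]":
    """Build one pattern for A-F1-L1-F2-...-L8-F9-B with named groups."""
    parts = ["(?P<A>.*?)"]
    for i in range(1, 10):
        key = f"F{i}"
        parts.append(f"(?P<{key}>{consensus_to_regex(FRAMEWORKS[key])})")
        if i < 9:
            parts.append(f"(?P<L{i}>.{{0,20}}?)")  # loop: absent or up to 20 residues
    parts.append("(?P<B>.*)")
    return re.compile("".join(parts))


def split_scaffold(seq: str):
    """Return a dict of region name to residues, or None if the layout does not match."""
    match = scaffold_regex().fullmatch(seq)
    return match.groupdict() if match else None


# Assumed reading of SEQ ID NO: 3 (X = variable loop residue; L3 written as EVVEVG).
NC_B = ("DPTLIHTPGWXXXXGSEADLLDGDDSTGVEYXXXXXXXSLAGEFIGLDLGEVVEVGGIHFVIGAD"
        "GGGSSDKWTRFRLEYSLDGESWTTIREYDHTGAPAGQDVIDEDFETPISAQYIRLTNLEXXXXX"
        "LTFSEFAIVSDELE")
regions = split_scaffold(NC_B)
```

With this input the sketch assigns DP to A and DELE to B, and reports loop lengths of 4, 7, and 5 residues for L1, L2, and L8.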
As described herein, a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof includes a sequence having, for example, one insertion, two insertions, one deletion, two deletions, one substitution mutation, two substitution mutations, one insertion and one deletion, one insertion and one substitution mutation, or one deletion and one substitution mutation.
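One way to test whether a candidate framework falls within "one or two amino acid insertions, deletions, substitution mutations, or a combination thereof" of a reference is to compute the edit (Levenshtein) distance between the two sequences. The Python sketch below is an assumption about how such a check might be implemented, not a method recited herein; note that the edit distance is the minimum number of changes, so a pair that could be described as one insertion plus one deletion may also be reachable by fewer changes. The reference used is the F4 sequence GIHFVIGAD (SEQ ID NO: 25).

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-residue insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, residue_a in enumerate(a, start=1):
        current = [i]
        for j, residue_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                            # deletion from a
                current[j - 1] + 1,                         # insertion into a
                previous[j - 1] + (residue_a != residue_b)  # substitution or match
            ))
        previous = current
    return previous[-1]


def within_two_changes(candidate: str, reference: str) -> bool:
    """True if the candidate is at most two edits away from the reference."""
    return edit_distance(candidate, reference) <= 2


F4_REFERENCE = "GIHFVIGAD"                                    # SEQ ID NO: 25
one_substitution = edit_distance("GIRFVIGAD", F4_REFERENCE)  # H to R
one_deletion = edit_distance("GIHFVIGA", F4_REFERENCE)       # C-terminal D removed
nc_a_f4 = edit_distance("GIRFVIGKN", F4_REFERENCE)           # F4 region of SEQ ID NO: 2
```

Under this measure the corresponding nC-A stretch GIRFVIGKN differs from SEQ ID NO: 25 by three substitutions and therefore falls outside the one-or-two-change window for that exact sequence.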
In some embodiments, the protein scaffold includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X relative to SEQ ID NO: 1, wherein X is any amino acid.
In some embodiments, the protein scaffold includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X1, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X2, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X3 relative to SEQ ID NO: 1, wherein:
X is any amino acid except the amino acid in the equivalent position in SEQ ID NO: 1;
X1 is any amino acid except R or S;
X2 is any amino acid except P or K; and
X3 is any amino acid except R or K.
In some embodiments,
F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 4;
F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 5;
F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 6;
F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 7;
F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 8;
F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 9;
F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 10;
F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 11; and
F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 12.
In some embodiments,
F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4);
F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5);
F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6);
F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7);
F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8);
F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9);
F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10);
F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11); and
F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12).
In some embodiments, the protein scaffold includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X relative to SEQ ID NO: 1, wherein X is any amino acid.
In some embodiments, the protein scaffold includes at least one mutation selected from the group consisting of N807D, S809T, R812H, S813T, E814P, S815G, D818V, N822S, N825D, N832S, W836E, K857E, E858V, I859V, K860E, L861V, D862G, R865H, K870A, N871D, N880T, K881R, K883R, N890G, K897R, K901H, K908Q, E912D, S914D, and K922Q relative to SEQ ID NO: 1.
In some embodiments, the at least one mutation is K870X and/or N890X. In some embodiments, the at least one mutation is K870A and/or N890G. In some embodiments, the at least one mutation is K870A. In some embodiments, the at least one mutation is N890G.
In some embodiments, the protein scaffold includes at least 3 fewer lysines relative to SEQ ID NO: 1. For example, in some embodiments, the protein scaffold includes at least 3, 4, 5, 6, 7, 8, 9, or 10 fewer lysines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold includes at least 6 fewer lysines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold includes 9 fewer lysines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold does not include any lysines.
In some embodiments, the protein scaffold includes at least 3 fewer asparagines relative to SEQ ID NO: 1. For example, in some embodiments, the protein scaffold includes at least 3, 4, 5, 6, 7, or 8 fewer asparagines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold includes at least 5 fewer asparagines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold includes 7 fewer asparagines relative to SEQ ID NO: 1. In some embodiments, the protein scaffold does not include any asparagines.
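The relationship between the listed substitutions and the lysine and asparagine counts can be checked with a short Python sketch. It applies the specific substitutions recited above to SEQ ID NO: 1 using full-length NagH numbering (index = position - 806) and then counts K and N residues before and after. The helper name is hypothetical. In SEQ ID NO: 1 as written above, position 818 lies in loop L1 and reads V rather than D, so the sketch reports D818V as not matching the wild-type residue and skips it rather than forcing the change.

```python
import re

# SEQ ID NO: 1 (CBM32), written in chunks for readability.
CBM32 = (
    "NPSLIRSESW" "QVYE" "GNEANLLDGDDNTGVWY" "KTLNGDT" "SLAGEFIGLDLG"
    "KEIKLD" "GIRFVIGKN" "GGGSS" "DKWNKFKLEYS" "LDNES" "WTTIKEYDK"
    "TGAPAG" "KDVIEESF" "ETPISA" "KYIRLTNME" "NINKW" "LTFSEFAIVSD"
)
NAGH_OFFSET = 806  # residue 1 of SEQ ID NO: 1 is NagH residue 807

SUBSTITUTIONS = (
    "N807D S809T R812H S813T E814P S815G D818V N822S N825D N832S "
    "W836E K857E E858V I859V K860E L861V D862G R865H K870A N871D "
    "N880T K881R K883R N890G K897R K901H K908Q E912D S914D K922Q"
).split()


def apply_substitutions(seq, substitutions):
    """Apply 'K870A'-style substitutions; return (new sequence, skipped entries)."""
    residues = list(seq)
    skipped = []
    for entry in substitutions:
        wild_type, position, new = re.fullmatch(r"([A-Z])(\d+)([A-Z])", entry).groups()
        index = int(position) - NAGH_OFFSET - 1
        if residues[index] != wild_type:
            skipped.append(entry)  # stated wild-type residue not found at this position
            continue
        residues[index] = new
    return "".join(residues), skipped


mutant, skipped = apply_substitutions(CBM32, SUBSTITUTIONS)
fewer_lysines = CBM32.count("K") - mutant.count("K")
fewer_asparagines = CBM32.count("N") - mutant.count("N")
```

With the substitutions that do match, the resulting sequence has 9 fewer lysines and 7 fewer asparagines than SEQ ID NO: 1, in agreement with the embodiments described above.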
In some embodiments, A and B are each, independently, absent or at least one amino acid. For example, each of A and B may independently be absent. In some embodiments, A and B are each, independently, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1,000 or more amino acids. In some embodiments, A and B are each, independently, from 0 to 1,000 amino acids, e.g., from 1 to 10 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids), from 10 to 100 amino acids (e.g., 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 amino acids), or from 100 to 1,000 amino acids (e.g., 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1,000 amino acids).
In some embodiments, A and B are each, independently, absent or from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids).
In some embodiments, each of L1-L8 is, independently, from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids).
In some embodiments, each of L1-L8 is, independently, from 1 amino acid to 10 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids). In some embodiments, each of L1-L8 is, independently, from 3 amino acids to 10 amino acids. In some embodiments, each of L1-L8 is, independently, from 3 amino acids to 8 amino acids.
In some embodiments, L1 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L1 is from 0 to 5 amino acids (e.g., from 1 to 5 amino acids, e.g., 0, 1, 2, 3, 4, or 5 amino acids).
In some embodiments, L2 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L2 is from 1 amino acid to 16 amino acids (e.g., from 4 to 16 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 amino acids).
In some embodiments, L3 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L3 is 6 amino acids.
In some embodiments, L4 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L4 is from 0 to 5 amino acids (e.g., from 1 to 5 amino acids, e.g., 0, 1, 2, 3, 4, or 5 amino acids).
In some embodiments, L5 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L5 is 5 amino acids.
In some embodiments, L6 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L6 is from 3 to 6 amino acids (e.g., 3, 4, 5, or 6 amino acids).
In some embodiments, L7 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L7 is 4 or 5 amino acids.
In some embodiments, L8 is from 1 amino acid to 20 amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids). In some embodiments, L8 is from 4 to 6 amino acids (e.g., 4, 5, or 6 amino acids).
In some embodiments, L1 is 4 amino acids. In some embodiments, L2 is 7 amino acids. In some embodiments, L8 is 5 amino acids. In some embodiments, L1 is 4 amino acids, L2 is 7 amino acids, and/or L8 is 5 amino acids. In some embodiments, L1 is 4 amino acids, L2 is 7 amino acids, and L8 is 5 amino acids.
In some embodiments, L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid. In some embodiments, X2 is V.
In some embodiments, L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid.
In some embodiments, L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid.
In some embodiments, L4 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid.
In some embodiments, L6 includes the sequence of: X1X2X3X4X5X6 (SEQ ID NO: 16), wherein each of X1-X6 is, independently, any amino acid. In some embodiments, L8 includes at least two amino acids. In some embodiments, L8 includes at least one amino acid.
In some embodiments, L4 includes the sequence of: (G/D)-GGSS (SEQ ID NO: 17) or GDT or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 17 or GDT.
In some embodiments, L6 includes the sequence of TGAPAG (SEQ ID NO: 18) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 18.
In some embodiments, L4 includes the sequence of: (G/D)-GGSS (SEQ ID NO: 17) or GDT; and L6 includes the sequence of TGAPAG (SEQ ID NO: 18).
In some embodiments, L3 includes the sequence of: (E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D) (SEQ ID NO: 19) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 19.
In some embodiments, L5 includes the sequence of: LD-(G/N)-(E/S)-S (SEQ ID NO: 20) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 20.
In some embodiments, L7 includes at least one amino acid.
In some embodiments, L7 includes the sequence of ETPI-(S/E)-A (SEQ ID NO: 21) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 21.
In some embodiments, L3 includes the sequence of: (E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D) (SEQ ID NO: 19) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 19; L5 includes the sequence of: LD-(G/N)-(E/S)-S (SEQ ID NO: 20) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 20; and L7 includes the sequence of ETPI-(S/E)-A (SEQ ID NO: 21) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 21.
In some embodiments, A includes the sequence of (D/N/H)-P. In some embodiments, A includes the sequence of DP.
In some embodiments, B includes the sequence of DELE (SEQ ID NO: 35).
In some embodiments, F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 22;
L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 23;
L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 24;
L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 25;
L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 26;
L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 27;
L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 28;
L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 29;
L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 30.
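One way to test whether a candidate region differs from a reference framework sequence by "one or two amino acid insertions, deletions, substitution mutations, or a combination thereof" is to compute the edit (Levenshtein) distance between the two sequences. The sketch below is illustrative only and rests on the assumption that each insertion, deletion, or substitution counts as one edit; the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-residue insertions,
    deletions, or substitutions needed to convert sequence a into b."""
    previous = list(range(len(b) + 1))
    for i, residue_a in enumerate(a, start=1):
        current = [i]
        for j, residue_b in enumerate(b, start=1):
            cost = 0 if residue_a == residue_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion from a
                    current[j - 1] + 1,      # insertion into a
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def within_two_changes(candidate: str, reference: str) -> bool:
    """True if the candidate equals the reference or differs by at most two edits."""
    return edit_distance(candidate, reference) <= 2


F1_REFERENCE = "TLIHTPGW"  # SEQ ID NO: 22, copied from the text above
```

A candidate such as `"SLIRTPGW"` (two substitutions relative to SEQ ID NO: 22) would pass `within_two_changes`, whereas a candidate with three substitutions would not.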
In some embodiments, F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
In some embodiments, F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 22;
L1 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 23;
L2 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 24;
L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31 ) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 31 ;
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 25;
L4 comprises the sequence of: GGGSS (SEQ ID NO: 32) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 32;
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 26;
L5 comprises the sequence of: LDGES (SEQ ID NO: 33) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 33;
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 27;
L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 18;
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 28;
L7 comprises the sequence of: ETPISA (SEQ ID NO: 34) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 34;
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 29;
L8 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 30.
In some embodiments, F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 22;
L1 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 23;
L2 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 24;
L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31 ) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 31 ;
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 25;
L4 comprises the sequence of: GGGSS (SEQ ID NO: 32) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 32;
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 26;
L5 comprises the sequence of: LDGES (SEQ ID NO: 33) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 33;
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 27;
L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 18;
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 28;
L7 comprises the sequence of: ETPISA (SEQ ID NO: 34) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 34;
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 29;
L8 is absent or comprises at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one amino acid insertion, deletion, or substitution mutation (e.g., one substitution mutation) relative to SEQ ID NO: 30.
In some embodiments, F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids);
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 includes the sequence of: EVVEVG (SEQ ID NO: 31);
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 includes the sequence of: LDGES (SEQ ID NO: 33);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 is absent or includes at least one amino acid (e.g., 1 to 20 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids); and
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
In some embodiments, F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid;
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid;
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 includes the sequence of: EVVEVG (SEQ ID NO: 31);
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 includes the sequence of: LDGES (SEQ ID NO: 33);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid; and
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
In some embodiments, A includes the sequence of: DP;
F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 includes the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid;
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 includes the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid;
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 includes the sequence of: EVVEVG (SEQ ID NO: 31);
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 includes the sequence of: GGGSS (SEQ ID NO: 32);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 includes the sequence of: LDGES (SEQ ID NO: 33);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 includes the sequence of: TGAPAG (SEQ ID NO: 18);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 includes the sequence of: ETPISA (SEQ ID NO: 34);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 includes the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid;
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30); and
B includes the sequence of: DELE (SEQ ID NO: 35).
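The embodiment above, in which L1, L2, and L8 are variable loops and the remaining regions are fixed, can be expressed as a single pattern for screening candidate sequences. The sketch below is illustrative only and rests on two assumptions not stated in this passage: that the regions are concatenated in the order A-F1-L1-F2-L2-F3-L3-F4-L4-F5-L5-F6-L6-F7-L7-F8-L8-F9-B, and that "any amino acid" is limited to the 20 canonical residues. The names `assemble` and `scaffold_pattern` are hypothetical.

```python
import re

# Assumption: "any amino acid" is restricted here to the 20 canonical residues.
ANY = "[ACDEFGHIKLMNPQRSTVWY]"

A, B = "DP", "DELE"  # B is SEQ ID NO: 35
FRAMEWORKS = [  # F1-F9, SEQ ID NOs: 22-30
    "TLIHTPGW", "GSEADLLDGDDSTGVEY", "SLAGEFIGLDLG", "GIHFVIGAD",
    "DKWTRFRLEYS", "WTTIREYDH", "QDVIDEDF", "QYIRLTNLE", "LTFSEFAIVS",
]
FIXED_LOOPS = ["EVVEVG", "GGGSS", "LDGES", "TGAPAG", "ETPISA"]  # L3-L7


def assemble(l1: str, l2: str, l8: str) -> str:
    """Concatenate A, F1, L1, ..., L8, F9, B in the assumed order."""
    loops = [l1, l2, *FIXED_LOOPS, l8]
    parts = [A]
    for framework, loop in zip(FRAMEWORKS, loops):  # pairs F1/L1 through F8/L8
        parts.extend([framework, loop])
    parts.extend([FRAMEWORKS[-1], B])
    return "".join(parts)


def scaffold_pattern() -> re.Pattern:
    """Pattern with 4, 7, and 5 variable residues in L1, L2, and L8."""
    return re.compile(assemble(ANY + "{4}", ANY + "{7}", ANY + "{5}"))
```

Calling `assemble` with concrete loop sequences yields a candidate polypeptide, and `scaffold_pattern().fullmatch(candidate)` checks that the fixed regions are intact.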
In some embodiments, L1 includes the sequence of X1X2X3X4 (SEQ ID NO: 13), wherein each of X1, X3, and X4 is, independently, any amino acid, and X2 is V.
In another aspect, featured is a protein scaffold that includes a polypeptide having at least 80% (e.g., at least 85%, 90%, 95%, 97%, or 99%) sequence identity to SEQ ID NO: 3. In some embodiments, the polypeptide includes the sequence of SEQ ID NO: 3. In some embodiments, the polypeptide does not include the sequence of SEQ ID NO: 1. In some embodiments, the polypeptide does not include the sequence of SEQ ID NO: 2.
In another aspect, featured is a polypeptide having at least 85% (e.g., at least 90%, 95%, 97%, 99%, or 100%) sequence identity to a polypeptide of Table 9 or Table 10. In some embodiments, the polypeptide includes a sequence as set forth in Table 9 or Table 10.
In some embodiments, the polypeptide includes at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X1, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X2, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X3 relative to SEQ ID NO: 1, wherein:
X is any amino acid except the amino acid in the equivalent position in SEQ ID NO: 1;
X1 is any amino acid except R or S;
X2 is any amino acid except P or K; and
X3 is any amino acid except R or K.
One of skill in the art would understand that the protein scaffolds described herein, which contain 9 framework regions (i.e., F1-F9), may be optimized or swapped according to established biophysical techniques. Accordingly, the invention also features protein scaffolds containing 7 out of 9 or 8 out of 9 framework regions described herein. Based on the detailed structural analysis of the scaffold known in the art (see, e.g., Ficko-Blean et al. J. Mol. Biol. 390: 208-220, 2009) and PDB ID 2w1q, one of skill in the art could swap out one or more beta strands or a portion thereof (e.g., more than 2 residues of a given framework region, e.g., any one of F1-F9) of the core protein fold, while still maintaining structural integrity of the overall scaffold. One could generate a phage library where each member of the library expresses a protein scaffold with loops that confer binding to a specific target and with randomized amino acids at each position in a specific beta strand. The phage library could then be selected for library members that are thermostable and maintain binding to a target by selecting the library for those members that can withstand an incubation at >55°C without aggregation and for those members that can bind to an immobilized target. The isolated clones with these properties would represent scaffolds with a swapped out beta strand or portion thereof.
Accordingly, in some embodiments, the invention also contemplates protein scaffolds having at least 7, e.g., at least 8, of the following framework regions, wherein 7 of the 9 or 8 of the 9 framework regions have the following sequence:
F1 includes the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 4;
F2 includes the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 5;
F3 includes the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 6;
F4 includes the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 7;
F5 includes the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 8;
F6 includes the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 9;
F7 includes the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 10;
F8 includes the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 11; and
F9 includes the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 12.
In other embodiments, the invention also contemplates protein scaffolds having at least 7, e.g., at least 8, of the following framework regions, wherein 7 of the 9 or 8 of the 9 framework regions have the following sequence:
F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 22;
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 23;
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 24;
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 25;
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 26;
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 27;
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 28;
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 29; and
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof (e.g., one or two substitution mutations) relative to SEQ ID NO: 30.
In other embodiments, the invention also contemplates protein scaffolds having at least 7, e.g., at least 8, of the following framework regions, wherein 7 of the 9 or 8 of the 9 framework regions have the following sequence:
F1 includes the sequence of: TLIHTPGW (SEQ ID NO: 22);
F2 includes the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
F3 includes the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
F4 includes the sequence of: GIHFVIGAD (SEQ ID NO: 25);
F5 includes the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
F6 includes the sequence of: WTTIREYDH (SEQ ID NO: 27);
F7 includes the sequence of: QDVIDEDF (SEQ ID NO: 28);
F8 includes the sequence of: QYIRLTNLE (SEQ ID NO: 29); and
F9 includes the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
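The "at least 7 of the 9" criterion above can be checked by counting how many of the recited framework sequences occur, in order, within a candidate polypeptide. The following is an illustrative sketch only: it searches for exact matches, uses a simple left-to-right scan, and the function names are hypothetical.

```python
# Framework sequences copied from the list above (SEQ ID NOs: 22-30).
FRAMEWORKS = {
    "F1": "TLIHTPGW",
    "F2": "GSEADLLDGDDSTGVEY",
    "F3": "SLAGEFIGLDLG",
    "F4": "GIHFVIGAD",
    "F5": "DKWTRFRLEYS",
    "F6": "WTTIREYDH",
    "F7": "QDVIDEDF",
    "F8": "QYIRLTNLE",
    "F9": "LTFSEFAIVS",
}


def frameworks_present(sequence: str) -> list[str]:
    """Names of framework regions found in the sequence, scanning left to right
    so that each match must start after the end of the previous match."""
    found = []
    cursor = 0
    for name, motif in FRAMEWORKS.items():
        index = sequence.find(motif, cursor)
        if index != -1:
            found.append(name)
            cursor = index + len(motif)
    return found


def has_at_least(sequence: str, minimum: int = 7) -> bool:
    """True if at least `minimum` of the nine framework regions are present."""
    return len(frameworks_present(sequence)) >= minimum
```

A scaffold in which one beta strand has been swapped out would report eight framework regions and still satisfy `has_at_least(sequence, 7)`.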
In another aspect, featured is a protein scaffold that includes a polypeptide having at least 80% (e.g., at least 85%, 90%, 95%, 97%, or 99%) sequence identity to the framework regions (F1-F9) over the region of alignment corresponding to F1-F9 of the reference sequence (e.g., SEQ ID NO: 3).
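Percent sequence identity over a region of alignment can be illustrated with a short calculation on two pre-aligned sequences. The sketch below is an assumption-laden example, not a definition from this document: it takes two gapped strings of equal length (with "-" as the gap character), counts identical aligned residues, and divides by the number of alignment columns. Other conventions for the denominator exist, and the alignment itself would be produced by a separate alignment tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored. A column counts as a
    match only when both sequences carry the same residue (not a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b) if not (x == "-" and y == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Restricting the two input strings to the alignment columns that correspond to F1-F9 of the reference gives an identity value over the framework regions only.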
In some embodiments, the protein scaffold includes one or more non-natural amino acids. In some embodiments, one or more of the framework regions includes a non-natural amino acid. In some embodiments, one or more of the loop regions includes a non-natural amino acid.
Cysteine Mutations and Disulfide Bridges
The protein scaffolds described herein may lack native cysteine residues. Accordingly, the scaffold may be mutagenized to introduce one or more cysteine residues into the scaffold (e.g., in one or more loop or framework regions). When two or more cysteine residues are introduced into the scaffold at nearby sites, the two cysteine residues may form a disulfide bridge, e.g., under oxidizing conditions. In some embodiments, the disulfide bridge enhances thermal stability of the protein scaffold.
In some embodiments, the protein scaffold includes a mutation that adds a cysteine residue. In some embodiments, the protein scaffold includes a first mutation that adds a first cysteine residue and a second mutation that adds a second cysteine residue. In some embodiments, the first cysteine residue and the second cysteine residue form a disulfide bond under oxidizing conditions.
In some embodiments, the protein scaffold comprises at least one mutation selected from the group consisting of F806C, P808C, S845C, L855C, V858C, V861C, K878C, W879C, L884C, L888C, A904C, P905C, A906GC, G907C, I924C, L926C, N928C, L936C, I943C, and L948C.
In some embodiments, the protein scaffold comprises two or more mutations selected from the group consisting of F806C, P808C, S845C, L855C, V858C, V861C, K878C, W879C, L884C, L888C, A904C, P905C, A906GC, G907C, I924C, L926C, N928C, L936C, I943C, and L948C.
In some embodiments, the protein scaffold comprises a pair of cysteine mutations selected from the group consisting of K878C and G907C, K878C and A904C, V861C and I943C, P905C and L855C, S845C and L936C, W879C and N928C, L884C and L926C, F806C and L948C, V858C and L888C, K878C and G907C, K878C and A906GC, S845C and N928C, K878C and A904C, P808C and I943C, V861C and I924C, P808C and V861C, and I943C and L855C.
In some embodiments, the pair of cysteine mutations is selected from the group consisting of K878C and G907C, K878C and A904C, S845C and L936C, W879C and N928C, W879C and N928C, L884C and L926C, V858C and L888C, K878C and G907C, and K878C and A906GC.
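The substitution labels above follow the convention of wild-type residue, position, and replacement residue (e.g., K878C replaces lysine at position 878 with cysteine). The sketch below illustrates how such labels could be parsed and applied to a reference sequence. It is illustrative only: the function names are hypothetical, the `first_position` parameter is an assumed way of mapping residue numbers onto a string index, and the five-residue sequence in the usage note is a toy fragment, not a sequence from this document.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str
    position: int
    replacement: str


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a single-residue label such as "K878C" (stray spaces are ignored)."""
    match = _LABEL.match(label.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a single-residue substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitutions(reference: str, labels: list[str], first_position: int = 1) -> str:
    """Apply substitutions to a reference whose first residue is numbered
    `first_position`, checking the wild-type residue before each change."""
    residues = list(reference)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - first_position
        if not 0 <= index < len(residues):
            raise ValueError(f"position {sub.position} is outside the reference")
        if residues[index] != sub.wild_type:
            raise ValueError(f"{label}: reference has {residues[index]} at {sub.position}")
        residues[index] = sub.replacement
    return "".join(residues)
```

With the toy fragment `"AKWLG"` numbered from position 877, `apply_substitutions("AKWLG", ["K878C", "W879C"], first_position=877)` replaces the second and third residues with cysteine. Labels that do not fit the single-residue pattern are rejected rather than guessed at.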
Tags and Functional Groups
The protein scaffolds described herein may further include a tag. A tag may provide for ease of purification, detection or attachment of the protein scaffold. The tag may be covalently attached to the scaffold. In some embodiments, A and/or B of the scaffold is or includes a tag.
In some embodiments, the tag is an affinity tag (e.g., a polyhistidine tag, e.g., 4, 5, 6, 7, 8, 9, or 10 histidines, e.g., Gly-His tags, e.g., AviTag, e.g., Calmodulin-tag, e.g., polyglutamate tag, e.g., polyarginine tag, e.g., SBP-tag).
In some embodiments, the tag is an epitope tag (e.g., ALFA-tag, C-tag, iCapTag, E-tag, FLAG-tag, HA-tag, Myc-tag, NE-tag, Rho1D4-tag, S-tag, Softag 1, Softag 3, Spot-tag, T7-tag, TC tag, Ty tag, V6 tag, VSV-tag or Xpress tag).
In some embodiments, the tag is a covalent protein tag (e.g., Isopeptag, SpyTag, SnoopTag, DogTag or SdyTag).
In some embodiments, the tag is a protein tag (e.g., biotin carboxyl carrier protein tag, glutathione-S-transferase (GST) tag, green fluorescent protein (GFP) tag, HaloTag, SNAP-tag, CLIP-tag, HUH-tag, maltose binding protein tag, Nus tag, thioredoxin tag, Fc tag, Designed Intrinsically Disordered tag, CRDSAT tag, SpyCatcher, SnoopCatcher, DogCatcher, SdyCatcher, or SUMO-tag).
In some embodiments, A and/or B of the scaffold includes an affinity tag, epitope tag, covalent peptide tag, or protein tag.
In some embodiments, the tag is attached to the N-terminus or the C-terminus of the scaffold.
In some embodiments, the scaffold is conjugated to a functional group. In some embodiments, the functional group includes biotin, streptavidin or a derivative of streptavidin, a polyethylene glycol moiety, a fluorescent dye, an enzyme, a radioactive moiety, a lanthanide, or a lanthanide binding motif.
In some embodiments, the scaffold is conjugated to a lanthanide or a lanthanide binding motif. In some embodiments, the lanthanide is terbium.
In some embodiments, the scaffold is conjugated to a radioactive moiety. In some embodiments, the radioactive moiety is an α or β emitter.
In some embodiments, the functional group is conjugated to a sulfhydryl group or a primary amine (e.g., on a cysteine residue or a lysine).
Polynucleotides, Vectors, and Cells
The protein scaffolds described herein may be encoded by a polynucleotide. In some embodiments, the polynucleotide is a ribonucleotide. In some embodiments, the polynucleotide is a deoxyribonucleotide. Also contemplated herein is a vector that includes a polynucleotide encoding the protein scaffold.
In other embodiments, featured is a cell that includes a polynucleotide encoding the protein scaffold or a vector that includes the polynucleotide. The polynucleotide or vector may include an expression element configured to drive expression of the protein scaffold. The cell may be a prokaryotic cell (e.g., E. coli). The cell may be a eukaryotic cell. In some embodiments, the eukaryotic cell is a yeast cell (e.g., S. cerevisiae) or a mammalian cell (e.g., a Chinese hamster ovary (CHO) cell). In some embodiments, the protein scaffold is secreted by the cell. In some embodiments, the protein scaffold is expressed within the cell. Such a cell (e.g., E. coli) may be lysed to provide a lysate that includes the protein scaffold.
Also featured herein is a method of producing a protein scaffold as described herein. The method includes the steps of providing a cell transformed with a polynucleotide encoding the protein scaffold or a vector that includes the polynucleotide and culturing the transformed cell under conditions for expressing the polynucleotide. The culturing step results in expression of the protein scaffold. The method may further include isolating the protein scaffold or using the protein scaffold to bind a target.
Particles, Resins, and Columns
The protein scaffolds described herein may be conjugated to a particle. In some embodiments, the particle is a magnetic particle. Also featured is a resin or monolith that includes a plurality of the particles, e.g., containing the protein scaffold. Also contemplated herein is a column (e.g., a chromatography column) containing the particles or the resin, e.g., conjugated to the scaffold.
The scaffolds and methods of use thereof can use a surface linked to the protein scaffold, which is configured to bind its target. The surface of the resin refers to a part of a support structure (e.g., a substrate) that is accessible to contact with one or more target molecules. The shape, form, materials, and modifications of the surface of the resin can be selected from a range of options depending on the application. In one embodiment, the surface of the resin is SEPHAROSE®. In one embodiment, the surface of the resin is agarose.
The surface of the resin can be substantially flat or planar. Alternatively, the surface of the resin can be rounded or contoured. Exemplary contours that can be included on a surface of the resin are wells, depressions, pillars, ridges, channels or the like.
In one embodiment, the surface of the resin is modified to contain channels, patterns, layers, or other configurations (e.g., a patterned surface). The surface can be in the form of a bead, box, column, cylinder, disc, dish (e.g., glass dish, PETRI dish), fiber, film, filter, microtiter plate (e.g., 96-well microtiter plate), multi-bladed stick, net, pellet, plate, ring, rod, roll, sheet, slide, stick, tray, tube, or vial. The surface can be a singular discrete body (e.g., a single tube, a single bead), any number of a plurality of surface bodies (e.g., a rack of 10 tubes, several beads), or combinations thereof (e.g., a tray that includes a plurality of microtiter plates, a column filled with beads, a microtiter plate filled with beads).
In some embodiments, a surface can include a membrane-based resin matrix. In some embodiments, the surface of the resin includes a porous resin or a non-porous resin. Examples of porous resins can include additional agarose-based resins (e.g., cyanogen bromide activated SEPHAROSE® (GE); WORKBEADS™ 40 ACT and WORKBEADS™ 40/10000 ACT (Bioworks)), methacrylate (e.g., Tosoh 650M derivatives), polystyrene divinylbenzene (e.g., Life Tech Poros media or GE Source media), fractogel, polyacrylamide, silica, controlled pore glass, dextran derivatives, acrylamide derivatives, convective-interaction media (Sartorius), additional polymers, and combinations thereof.
In some embodiments, a surface can include one or more pores. In some embodiments, pore sizes can be from 300 to 8,000 Angstroms, e.g., 500 to 4,000 Angstroms in size.
A resin as described herein includes a plurality of particles. Examples of particle sizes are 5 µm to 500 µm, 20 µm to 300 µm, and 50 µm to 200 µm. In some embodiments, particle size can be 50 µm, 60 µm, 70 µm, 80 µm, 90 µm, 100 µm, 110 µm, 120 µm, 130 µm, 140 µm, 150 µm, 160 µm, 170 µm, 180 µm, 190 µm, or 200 µm.
A protein scaffold can be immobilized, coated on, bound to, stuck, adhered, or attached to any of the forms of surfaces described herein (e.g., bead, box, column, cylinder, disc, dish (e.g., glass dish, PETRI dish), fiber, film, filter, microtiter plate (e.g., 96-well microtiter plate), multi-bladed stick, net, pellet, plate, ring, rod, roll, sheet, slide, stick, tray, tube, or vial).
Methods of Purification
Featured herein is a method of purifying a target molecule, e.g., from a plurality of molecules, e.g., from a crude lysate. The method includes providing a sample that includes a mixture of the target molecule and the plurality of molecules and contacting the sample with a protein scaffold as described herein. The protein scaffold may have previously been generated with loop regions that are specific to the desired target. The scaffold (e.g., the loop regions of the scaffold) specifically binds to the target molecule. The method further includes separating the target molecule bound to the protein scaffold from the plurality of molecules. In some embodiments, the step of separating includes immobilizing the protein scaffold. In some embodiments, the protein scaffold is conjugated to a particle. In some embodiments, the particle includes a magnetic bead. In some embodiments, the protein scaffold is conjugated to a resin or monolith as described herein.
The scaffolds, particles, resins, and columns described herein are amenable to single-step purifications from crude mixtures. For example, target proteins may be eluted with polyol, imidazole (e.g., 3 M imidazole, e.g., pH 8), or sodium citrate (e.g., 0.1 M sodium citrate, e.g., pH 2.5). The scaffolds, particles, resins, and columns may be cleaned, e.g., with an alkaline substance, e.g., NaOH, e.g., 0.1 M NaOH. Furthermore, resins or columns made with a protein scaffold as described herein may remain functional after exposure to dimethylformamide (DMF), e.g., 100% DMF, and/or autoclaving. These features allow reuse of the scaffold without loss of target binding capabilities.
Examples
Example 1. Protein engineering of nanoCLAMP antibody-mimetics for use as affinity chromatography capture agents resistant to high temperature, trypsin, low pH, organic solvent, and sodium hydroxide
Immunoaffinity chromatography is an established laboratory-scale technique for the isolation of target proteins with high yield and purity. However, properties of antibodies and nanobodies often make immunoaffinity chromatography incompatible with conditions typical of many industrial-scale processes. To overcome these limitations, we have optimized an antibody mimetic, called a nanoCLAMP, for use in process-scale affinity chromatography. The 16 kD antibody mimetic is based on a bacterial, cysteine-free, β-sandwich protein with a structure analogous to immunoglobulin variable domains. Like antibodies and other antibody mimetics, the first generation of nanoCLAMPs generally showed high selectivity and affinity but also suffered from sensitivity to high temperature, digestion by proteases, and inactivation by alkali. In this work, we address these limitations with a protein engineering campaign to improve the general robustness of nanoCLAMPs. Over 7 rounds of mutagenesis and screening, we tested 185 protein variants, with at least one mutation made in 58% of the positions in the nanoCLAMP’s constant regions (72 of 124). The campaign yielded a protein with mutations in 30 of 124 positions in the constant regions and dramatically improved resistance to extreme conditions. The mutant protein served as the basis for an improved nanoCLAMP class, called the nC-B class. Phage display was then used to generate several nC-B capture agents recognizing diverse targets. The resulting immunoaffinity capture agents typically had a Kd of < 80 nM, a Tm of > 70 °C and a t1/2 in 0.1 mg/ml trypsin of > 20 hours. The nC-B capture agents also maintained their binding capacity and selectivity over 20 purification cycles, each including 10 minutes of cleaning in place with 0.1 M NaOH. Affinity chromatography resins made with nC-B capture agents supported efficient single-step purifications from crude mixtures.
Target proteins could be eluted with either 3 M imidazole, pH 8 or 0.1 M sodium citrate, pH 2.5. Furthermore, affinity chromatography resins with nC-B capture agents remained functional after exposure to 100% DMF and autoclaving. The robust nC-B scaffold developed in this work enables the development of custom, high performance affinity chromatography resins compatible with the harsh conditions of process-scale applications.
Introduction
Affinity chromatography (AC) with target-specific, immobilized capture agents is an established method of protein purification. In this technique, a capture agent, such as a protein, nucleic acid, or small molecule, is coupled to a solid support, which can then be used to isolate a protein of interest from a complex mixture. The technique has been widely used at the laboratory scale for single-step purifications of diverse target proteins, including enzymes, transcription factors, growth factors and antibodies.
Use of protein-based capture agents for AC in industrial applications has been less widespread because currently available approaches are incompatible with the temperatures, pH extremes, and solvents often needed for process-scale purification or are useful for only a limited number of targets. An exception is the purification of kg-quantities of antibodies with AC resins based on Staphylococcal Protein A. The development of Protein A resins highlights the use of protein engineering to improve the robustness of AC resins as well as some remaining limitations. Early versions of resins with wild-type Protein A captured antibodies with high selectivity and capacity from cell culture media feedstock but lost activity gradually after multiple cycles of cleaning in place with sodium hydroxide. Mutagenesis of Protein A yielded variants with increased resistance to sodium hydroxide treatment and higher binding capacity. Despite the widespread use of Protein A resins, they are nonetheless limited to the purification of antibodies.
For the process-scale purification of non-antibody targets, the use of AC is much less widespread than the use of Protein A to purify antibodies. For instance, non-protein, ligand-based approaches, such as small molecule substrate mimetics, are effective but are limited to specific enzyme classes and are difficult to use with a general protein of interest. Alternatively, specialized affinity resins such as glutathione or nickel require the addition of non-native tags, which cause downstream complications for proteins intended for therapeutic use. Immuno-AC with antibody- or nanobody-based capture agents is the most generally applicable approach and has widely been used to purify a diverse range of proteins at laboratory scale. However, immuno-AC has some limitations. In general, the conjugation of the antibody to the resin often results in heterogeneous coupling because of a lack of precise control over the sites of conjugation. In addition, chromatography must be performed under oxidizing conditions in order to preserve the disulfide bonds essential for maintaining antibody structure. Elution of the target also usually requires low or high pH conditions that are incompatible with some target proteins. For process-scale applications, the chief limitation of immuno-AC resins is their sensitivity to the sodium hydroxide solutions that are preferred for cleaning-in-place procedures.
These limitations have motivated the development of AC resins based on antibody mimetics: proteins that, like antibodies, can be produced to bind specific antigens with high affinity and specificity, but are not directly derived from the immune system of animals. Examples of antibody mimetics include those based on Protein A, gamma-B crystallin, ubiquitin, cystatin, lipocalins, ankyrin repeat motifs, SH3 domains, fibronectin, OB fold domains, lamprey variable lymphocyte receptors, minibodies, miniproteins, and Kunitz domains. Most of these antibody mimetics use animal-sparing phage display for their isolation and can be produced by microbial cells. Many have a unique cysteine that supports homogeneous, site-specific coupling to sulfhydryl-reactive supports. However, few of the current antibody mimetics have been shown to enable elution near neutral pH or to be compatible with harsh conditions sometimes needed for process-scale procedures. The development of custom peptide and protein-based affinity chromatography platforms for the purification of non-antibody protein therapeutics is supported by some proprietary platforms, but the availability and technical details of these platforms are limited (e.g., Avitide, LigaTrap Technologies, Astrea Bioseparations, and Navigo Proteins).
We set out to develop a broadly available and generally applicable class of protein-based affinity capture reagents useful for industrial protein purification. Specifically, we sought to develop an AC capture agent technology with the potential to address a broad range of targets; enable single-step purifications from crude mixtures; elute targets at near neutral pH; and maintain function after exposure to high temperatures, organic solvents, proteases and pH extremes.
We previously developed an antibody mimetic based on the 16 kD, 2nd Type 32 carbohydrate binding module of the hyaluronoglucosaminidase NagH from Clostridium perfringens (NagH CpCBM32-2) (Suderman et al. Protein Expr. Purif. 134:114-124, 2017). This binding module is a monomeric β-sandwich domain with variable loops comparable to the complementarity-determining regions of the immunoglobulin variable domain. We named these antibody mimetics nanoCLAMPs (nano Clostridial Antibody Mimetic Proteins). nanoCLAMPs have the unusual and advantageous general property of releasing bound target protein in solutions of non-denaturing polyols and ammonium sulfate at neutral pH. We have isolated nanoCLAMPs recognizing a variety of target proteins with Kd’s ranging from 1 to 100 nM before affinity maturation and from 10 to 1000 pM after affinity maturation (Suderman et al. 2017). Affinity chromatography media produced with nanoCLAMPs support single-step purifications to near homogeneity as assessed by Coomassie staining. The working binding capacity ranges from 5 to 200 nmol target protein per ml of packed beads. While these first generation nanoCLAMP resins have adequate selectivity and capacity for laboratory-scale purifications, they are suboptimal for process-scale purifications because of their moderate thermostability (Tm ranging from 45 to 60 °C), sensitivity to protease digestion (t1/2 < 1 h in 0.1 mg/ml trypsin), and moderate alkaline resistance (50% loss of activity after 12 cycles of incubation with 0.1 M NaOH).
We set out to improve upon the performance of the first generation of nanoCLAMPs in order to improve their general utility for the process-scale purification of targets with AC. Toward this end, we undertook a multi-round protein engineering campaign to improve upon first generation nanoCLAMPs. In each round, we made site-directed mutations at specific positions, evaluated the mutations’ impact on thermostability and monodispersity, and then combined beneficial mutations to generate the basis for the next round. The end-product of the 7-round, >180 mutation campaign was a clone with significantly improved performance. We used this clone as the basis for the new “nC-B” class of nanoCLAMPs. Here we report on the campaign to develop the improved nC-B class, the generation and characterization of nC-B nanoCLAMPs against the exemplary protein yeast SUMO (SMT3), and the performance of nC-B nanoCLAMPs after repeated exposure to extreme conditions.
Results
Approach to improve nanoCLAMP performance with a multi-round campaign of site-directed mutagenesis.
We aimed to improve the thermal, proteolytic, and alkaline stability of the first generation of nanoCLAMPs by using consensus protein design, an approach for improving the thermostability of proteins. Our initial attempts at directly synthesizing several versions of the consensus sequence were unsuccessful and yielded only aggregated or multimeric proteins. Therefore, we decided to take an incremental approach of making single mutations, determining their effect alone and in combination, and working towards an improved protein in several rounds. We focused the effort on surface residues and loops and generally sought to remove lysine, arginine and asparagine where possible. The intent of making substitutions for lysine and arginine was to reduce the number of potential sites for cleavage by trypsin. The intent of making substitutions for asparagine was to increase the protein’s resistance to the alkaline solutions commonly used to sanitize industrial chromatography columns. Asparagine in certain contexts is susceptible to deamidation, and its removal has been shown to reduce the loss of protein binding activity in sodium hydroxide.
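The residue-removal goals described above can be tracked with a simple count of lysine, arginine, and asparagine in a candidate sequence. The following is a minimal sketch only; the function name and the toy sequence are hypothetical and are not taken from any nanoCLAMP sequence. It also treats every lysine or arginine as a potential trypsin site, which is a simplification that ignores sequence context.

```python
from collections import Counter


def count_liabilities(seq: str) -> dict:
    """Count residues that the campaign aimed to remove where possible."""
    counts = Counter(seq.upper())
    return {
        # Trypsin cleaves on the C-terminal side of Lys (K) and Arg (R).
        "trypsin_sites": counts["K"] + counts["R"],
        # Asn (N) can deamidate under alkaline conditions in some contexts.
        "asparagines": counts["N"],
    }


# Hypothetical toy sequence used only to exercise the function.
toy_sequence = "MKTNARNGSLK"
print(count_liabilities(toy_sequence))
```

Comparing the counts for a starting clone and a mutated clone gives a quick summary of how many candidate sites a round of mutagenesis removed.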
We refer to an individual nanoCLAMP as a “clone,” i.e., a specific isolate with unique binding loops. Our starting clone consisted of an anti-SMT3 nanoCLAMP, SMT3-A1 (Suderman et al. supra), which is capable of purifying SUMO-fusion proteins in a single step from complex lysates. The target of SMT3-A1 is a SUMO tag, which is widely used to improve the solubility and yield of proteins produced in E. coli, and can be cleaved to leave behind a native sequence. SMT3-A1 consists of residues 807 to 946 of the 2nd Type 32 carbohydrate binding module of NagH from Clostridium perfringens with loops mutated in positions 817-820, 838-844, and 931-935 and selected for binding to yeast SUMO. Throughout the paper we number nanoCLAMP amino acids based on the sequence of NagH. To identify evolutionarily conserved amino acids, we generated a multiple sequence alignment for 20 non-redundant BLAST hits selected to cover a range of similarity. The percent identity of the orthologs ranged from 58% for Clostridium nigeriense to 43% for Coprobacillus sp. AF21-8LB (Table 1).
Our workflow for the protein engineering campaign is summarized in FIG. 1. We made site-specific mutations and then purified each individual mutant protein for further biophysical assessment. For the initial assessment, we analyzed each mutant by differential scanning fluorimetry (DSF) and assessed two parameters, melting temperature and initial fluorescence. The rationale for including low initial fluorescence as a criterion for progression is that we previously observed a rough correlation between high initial fluorescence and the presence of soluble aggregates and multimers detected by size exclusion chromatography. Initial fluorescence in DSF may be caused by the binding of fluorophore to hydrophobic patches exposed prior to unfolding. After identifying beneficial mutations, we made several constructs with different combinations of the singly beneficial mutations, whose effects were usually, but not always, additive. As a secondary screen, we confirmed that the starting clone for each subsequent round of mutagenesis was monodisperse by size exclusion chromatography.
Optimized protein resulting from the mutagenesis campaign.
The results of 7 rounds of mutagenesis and evaluation are summarized in Table 2. The full list of mutations is provided in Table 3.
N = no usable data
*IF = Initial fluorescence in DSF assay. Criteria for L, M, H in IF assay: L = IF below 30% of amplitude of unfolding peak; M = 30% to 50% of amplitude of unfolding peak; H = above 50% of amplitude of unfolding peak.
The volumes listed for Aggregated, Dimer, and Monomer SEC % correspond to elution volumes from a Superdex 75 SEC column.
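The L, M, and H criteria for initial fluorescence can be expressed as a small classifier. This is a sketch under stated assumptions: the function and argument names are hypothetical, and values of exactly 30% or 50% are assigned to M, following the "30% to 50%" wording of the footnote.

```python
def classify_initial_fluorescence(initial: float, peak_amplitude: float) -> str:
    """Return 'L', 'M', or 'H' from the ratio of initial fluorescence
    to the amplitude of the unfolding peak in a DSF trace."""
    if peak_amplitude <= 0:
        raise ValueError("peak_amplitude must be positive")
    ratio = initial / peak_amplitude
    if ratio < 0.30:
        return "L"  # below 30% of the unfolding-peak amplitude
    if ratio <= 0.50:
        return "M"  # 30% to 50%, boundaries included (assumption)
    return "H"      # above 50%


# Hypothetical readings in arbitrary fluorescence units.
for reading in (10.0, 40.0, 75.0):
    print(reading, classify_initial_fluorescence(reading, 100.0))
```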
Rounds 1 through 4 focused on increasing Tm. Rounds 5 and 6 focused on removing potential protease cleavage sites while maintaining Tm and included many reversions to past rounds. Round 7 focused on removing remaining asparagines while maintaining Tm. Overall, mutations were tested for 58% of the amino acids in the constant regions (72 of 124 positions). The clone resulting from 7 rounds of mutagenesis is designated P2788. In all, P2788 contains 30 mutations, representing approximately 24% of the positions in the constant regions, and includes a three-residue, C-terminal extension of residues from CBM32-2. The resulting mutations are broadly distributed throughout the primary sequence as shown in an alignment with the original sequence (FIG. 2) and throughout the 3-D structure when mapped to the CBM32-2 crystal structure (PDB accession 2W1Q). Although the mutants all possessed the same binding loops as the initial clone, we expected and observed a gradual decline in target binding with an increasing number of mutations, many of which were adjacent to the binding loops. We speculate that the loss of binding was caused by shifts in the conformation of the binding loops (data not shown). Because our intent was to improve the stability of the nanoCLAMP and then isolate new binders, our workflow did not include a screen for SUMO binding.
The number of lysines and arginines representing potential trypsin cleavage sites was reduced from 11 in the constant regions of the starting protein (clone SMT3-A1) to 5 in the constant regions of the resulting protein (clone P2788). Three of the remaining arginines (R881, R897 and R925) are expected to be involved in salt bridges as identified by the ESBRI algorithm (Costantini et al. ESBRI: a web server for evaluating salt bridges in proteins. Bioinformation 3:137-138, 2008). For these residues, we were unable to identify any substitution mutations that did not destabilize the proteins. For another position, K883, substitution with arginine was beneficial, but several additional substitutions either resulted in a greater than 10 °C decrease in melting temperature or high initial fluorescence in DSF. For one remaining position, K878, which is universally conserved in the alignment, 10 of 10 substitution mutations resulted in proteins with high initial fluorescence in DSF. In the wild-type NagH CpCBM32-2 structure, K878 Nζ forms hydrogen bonds with the carbonyl oxygens of P905 and G907 as determined by the RING 2.0 algorithm (Piovesan, et al. Nucleic Acids Res 44:W367-374, 2016). For asparagine, the number was reduced from 8 in the original clone (SMT3-A1) to 1 in the mutated clone. The remaining asparagine N928 is universally conserved in the consensus alignment and buried in the 3D structure. N928 Nδ2 forms hydrogen bonds with the carbonyl oxygens of S845 and L846 as determined by the RING 2.0 algorithm. We chose not to attempt substitution mutations with N928 because of the low likelihood of deamidation based on its sequence context and the likely challenge of finding a substitution with a beneficial effect.
We next used AlphaFold to predict the 3-D structure of P2788 in order to assess the likelihood of gross changes in 3D-structure (Jumper et al. Nature 596:583-589, 2021). A 3D-alignment of the crystal structure of CBM32-2 and the predicted structure of P2788 was performed with the jFATCAT(rigid) algorithm (FIG. 3). As expected given the conservative substitutions, the high degree of sequence similarity, and the use of templates by the AlphaFold algorithm, the predicted structure of the constant regions of P2788 does not show gross deviations from the solved crystal structure of NagH CpCBM32-2. As expected, because of the differences in amino acid sequence, the variable loops show deviations, especially for the longest 838-844 loop. The overall similarity of the solved CBM32-2 structure versus the predicted structure of P2788 is high, even with the differences in loop sequences (TM-score = 0.95).
Compared with the starting protein, the Tm increased by 24 °C from 52 °C to 76 °C (FIG. 4). We next tested P2788’s resistance to digestion by trypsin. Following a 16-hour digestion in 0.1 mg/ml trypsin, no full length SMT3-A1 remained whereas no apparent digestion of P2788 had occurred as assessed by SDS-PAGE (FIG. 5A). A time course of trypsin digestion determined that the t1/2 increased from 3 hours for SMT3-A1 to > 16 hours for P2788 (FIGS. 5C-5E). While our protein engineering campaign deleted only a few surface-exposed chymotrypsin-cleavage sites, we also tested resistance to chymotrypsin as a reflection of general stability. Following a 16-hour digestion with 0.1 mg/ml chymotrypsin, about a fifth of P2788 appeared to remain full-length while SMT3-A1 was completely digested (FIG. 5A).
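One way a digestion half-life might be estimated from a time course is sketched below. The source does not state how t1/2 was derived, so the approach shown is an assumption: first-order decay of the full-length band, fitted by ordinary least squares on the logarithm of the fraction remaining. The function name and the data points are hypothetical and are not measurements from this work.

```python
import math


def half_life_hours(times_h, fraction_intact):
    """Fit ln(fraction) = a - k*t by least squares and return ln(2)/k.

    Assumes first-order decay; every fraction must be greater than zero.
    """
    logs = [math.log(f) for f in fraction_intact]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    slope = sxy / sxx
    if slope >= 0:
        return math.inf  # no measurable decay over the time course
    return math.log(2) / -slope


# Hypothetical densitometry values (fraction of full-length protein remaining).
times = [0.0, 1.0, 2.0, 4.0]
fractions = [1.0, 0.5, 0.25, 0.0625]
print(half_life_hours(times, fractions))
```

For a protein that shows no apparent digestion over the time course, the fit returns infinity, which corresponds to reporting the half-life as greater than the longest time point tested.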
We next sought to determine whether a new class of nanoCLAMPs based on the constant regions of P2788 could confer these properties to newly isolated clones. We call the P2788-derived class the “nC-B class” (nanoCLAMP-B with the identifier “B” referring to the next variant of the original class of nanoCLAMPs). The first generation of nanoCLAMPs represented by SMT3-A1 and others is referred to as the “nC-A class” (nanoCLAMP-A with the identifier “A” referring to the first class of nanoCLAMPs). For clarity, the nC-A class of nanoCLAMPs encompasses the first published nanoCLAMPs (Suderman et al. supra). Relative to NagH CpCBM32-2, nanoCLAMPs of the nC-A class have a M929L mutation that removes a methionine as well as amino acid differences in the variable loops.
Phage display library panning for improved SUMO capture agents of the nC-B class of nanoCLAMPs.
To confirm that the optimized constant regions of the nC-B class can support the general isolation of high affinity binders with improved thermal, proteolytic, and alkaline stability, we sought to isolate new nanoCLAMP binders containing the nC-B constant regions and a diversity of variable loops. We first constructed a phage display library with randomized binding loops in the context of the nC-B constant regions and panned the library for binders to yeast SUMO (SMT3). This library has the same three variable loops with randomized residues as the previous library from which SMT3-A1 was isolated but uses the nC-B constant regions instead of the nC-A constant regions. Degenerate oligonucleotides constructed with phosphoramidite trimers were designed so that the variable regions encoded all amino acids except cysteine (omitted to avoid heterogeneous coupling to multiple cysteines), methionine (omitted to avoid the risk of inactivating oxidation), and lysine and arginine (omitted to avoid the addition of trypsin-cleavage sites). Position 818 was held constant with a valine, the wild-type amino acid, because valine and another small hydrophobic amino acid, isoleucine, appeared in one quarter of nanoCLAMPs from previous screens. The resulting library contained over 10^10 variants of nC-B nanoCLAMPs.
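The library design above implies an upper bound on sequence diversity that can be computed directly. The sketch below assumes that every position in loops 817-820, 838-844, and 931-935 other than position 818 is fully randomized over the allowed amino acids and that loop lengths are fixed; neither assumption is confirmed in the text, and the variable names are hypothetical.

```python
ALL_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
EXCLUDED = set("CMKR")  # Cys, Met, Lys, Arg omitted from the variable regions
ALLOWED = ALL_AMINO_ACIDS - EXCLUDED

# Loop boundaries in NagH numbering (inclusive).
LOOPS = {"817-820": (817, 820), "838-844": (838, 844), "931-935": (931, 935)}
FIXED_POSITIONS = {818}  # held constant as valine


def randomized_positions(loops, fixed):
    """List loop positions that are assumed to be randomized."""
    positions = []
    for start, end in loops.values():
        positions.extend(p for p in range(start, end + 1) if p not in fixed)
    return positions


n_positions = len(randomized_positions(LOOPS, FIXED_POSITIONS))
theoretical_diversity = len(ALLOWED) ** n_positions
print(len(ALLOWED), n_positions, f"{theoretical_diversity:.2e}")
# prints: 16 15 1.15e+18
```

Under these assumptions the theoretical sequence space is far larger than a library of roughly 10^10 variants, so a physical library would sample only a small fraction of the possible loop combinations.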
Following the third round of panning of this library, we randomly selected 96 clones and screened them for target binding by semELISA, which yielded 93 confirmed positives. Of these, 40 were sequenced to identify 18 unique nanoCLAMP SUMO binders. These binders were subcloned into a bacterial expression vector, expressed, purified by immobilized metal affinity chromatography (IMAC), and confirmed to be over 90% pure as estimated by SDS-PAGE (data not shown). We then evaluated the purified nanoCLAMPs’ ability to function as affinity capture agents in a medium-throughput, small-scale depletion assay. In this assay, the nanoCLAMPs were conjugated to cross-linked agarose under denaturing conditions, refolded on the resin, and incubated with SUMO, and the quantity of unbound SUMO was measured by A280.
The purified nanoCLAMPs were then screened for monodispersity by size exclusion chromatography and Tm by DSF. Of the 18, seven (39%) were over 90% monomer. Of these, five had a melting temperature of greater than 73 °C, with four having melting temperatures greater than 99 °C (Table 4). The initial results with the SUMO test case suggest that the nC-B constant regions generally support the isolation of clones with high monodispersity and thermostability.
Characterization of binding affinity, thermostability and protease-resistance of nanoCLAMP capture agents of the nC-B class.
We chose three nC-B nanoCLAMPs (P2808, P2809, and P2811) for further characterization. These were selected to provide a diverse sample of binding loops. For the three clones, Loop 817-820 and Loop 931-935 did not have any apparent similarity. Except for V818, whose identity was fixed in the library, there were no identities in any positions of these loops. For Loop 838-844, clones P2808 and P2809 are identical in 5 of 7 positions while clone P2811 shows no identities with either.
All three nanoCLAMPs were produced in E. coli with yields > 150 mg/liter of shaken E. coli culture and used for biophysical characterization experiments.
We first checked the quaternary structure of nanoCLAMPs by size exclusion chromatography to confirm that subsequent results could be interpreted without confounding avidity effects from higher order multimers or aggregates. All three clones migrated as monodisperse monomers (FIG. 6). To rank these nanoCLAMPs by their affinity for SUMO, we used biolayer interferometry to measure their dissociation constants, which ranged from 5 to 80 nM (Table 5). We then measured the melting temperature of the nanoCLAMPs using DSF. P2808 had an apparent Tm of 73 °C. P2809 and P2811 had flat-line DSF curves with no melting transition apparent between 25 °C and 99 °C (FIG. 6). This observation suggests that the Tm for these nanoCLAMPs exceeds the quantifiable range for this assay. We corroborated this observation with a functional binding assay to assess kinetic thermostability. In this assay, we incubated samples of each nanoCLAMP at different temperatures, cooled and centrifuged the solutions, and then measured the binding activity remaining in the supernatant by biolayer interferometry. In this test, we define T50 as the temperature of a 5-minute heat challenge after which 50% of binding activity is irreversibly lost. The T50 ranged from ~85 °C for clone P2808 to > 100 °C for clones P2809 and P2811. The rank order and values are consistent with the Tm measured by DSF (FIG. 11). Clones P2809 and P2811, both with Tm’s > 100 °C, maintained greater than 90% activity after incubation at boiling temperatures for 5 minutes. The maintenance of binding activity indicates that the nanoCLAMPs remained in solution after heat treatment and did not irreversibly aggregate or precipitate. Taken together with the DSF data, the kinetic thermostability measurements suggest that these two nanoCLAMPs may remain folded up to, and possibly above, 99 °C.
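The definition of the 50% heat-challenge temperature given above can be illustrated with a linear interpolation between the two challenge temperatures that bracket 50% remaining activity. This is a sketch only: the interpolation method is an assumption, the function name is hypothetical, and the data points are invented for illustration rather than taken from the measurements reported here.

```python
def t50_from_heat_challenge(temps_c, activity_pct):
    """Interpolate the challenge temperature at which remaining binding
    activity crosses 50%. Returns None if activity never drops below 50%."""
    points = list(zip(temps_c, activity_pct))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 > a_hi:
            return t_lo + (a_lo - 50.0) * (t_hi - t_lo) / (a_lo - a_hi)
    return None


# Hypothetical 5-minute heat-challenge series (temperature in °C, % activity).
temps = [70.0, 80.0, 90.0, 100.0]
activity = [100.0, 90.0, 10.0, 0.0]
print(t50_from_heat_challenge(temps, activity))
```

A return value of None corresponds to a clone that retains more than 50% activity at every tested temperature, in which case the value can only be reported as greater than the highest challenge temperature.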
Table 5. Tm, T50, Kd, and trypsin resistance of selected anti-SUMO nanoCLAMPs
We next characterized the trypsin-resistance of the three clones. Clones P2808 and P2809 were highly resistant to digestion with 0.1 mg/ml trypsin, while clone P2811 was less resistant (FIG. 5F). FIGS. 5B-5D show a time course of a tryptic digest comparing nanoCLAMPs with different combinations of constant regions and variable loops to understand the contribution of each component to trypsin resistance. We tested the original clone SMT3-A1 (nC-A constant regions and original variable loops), P2788 (nC-B constant regions with the original variable loops from SMT3-A1), and P2808 (nC-B constant regions with newly isolated variable loops). Both P2788 and P2808 show a t1/2 of > 20 hours in 0.1 mg/ml trypsin, compared with ~4 hours for the original SMT3-A1 clone. Together with the observations of P2809 and P2811, these results indicate that trypsin-resistance depends upon the sequences of both the variable loops and constant regions. The observation of trypsin-resistance for 3 of 4 clones with diverse loop sequences and identical constant regions indicates that the nC-B constant regions can be generally used to isolate trypsin-resistant clones, in at least a first test-case.
Measurement of performance parameters with affinity chromatography using capture agents of the nC-B class of nanoCLAMPs.
For brevity, we will subsequently refer to affinity chromatography resins made with capture agents of the nC-B class of nanoCLAMPs as “nC-B affinity resins.” As a test case for the utility of nC-B affinity resins, we used P2808 as a capture agent for more detailed studies. To generate the affinity resin, the P2808 protein was expressed, purified by IMAC under denaturing conditions, conjugated to sulfhydryl-reactive 6% cross-linked agarose resin, and then refolded by rinsing with buffered saline. We prepared a test mixture of crude E. coli lysates spiked with a SUMO-GFP fusion protein, optimized purification conditions and assessed the resins’ selectivity and binding capacity.
In pilot experiments, we observed that polyol-elution at neutral pH was still possible, as with the original nanoCLAMP (Suderman et al. supra), but with qualitatively lower speed and yield (data not shown). As an alternative, we tested elution of the target protein with molar concentrations of imidazole, which has been used successfully for disrupting protein-peptide interactions and antibody-Protein A interactions.
A buffer with 3 M imidazole at pH 8 quickly and completely eluted SUMO from P2808 resin. As shown in FIG. 12, after incubation for 1 hour with an excess of target protein in a spiked E. coli lysate, the bound target was washed and eluted with a purity of over 90% as estimated by densitometric analysis of Coomassie-stained SDS-PAGE. The static binding capacity under these conditions was 11.5 mg/ml resin (277 nmol/ml). In subsequent experiments, we found that imidazole elution worked consistently with P2808, P2809, and P2811, as well as three different AC resins targeting SUMO, mCherry, and GFP with capture agents of the nC-A class. The observations suggest that imidazole elution is a general property of both classes of nanoCLAMPs.
With elution conditions established, we next measured the dynamic binding capacity of P2808 resin using a 0.6 ml packed column (0.5 cm ID x 3.06 cm) at a constant flowrate of 0.5 ml/min. We utilized the fluorescence of the SUMO-GFP target to determine the QB10, or the volume at which the eluate’s fluorescence equaled 10% of the load’s fluorescence (see Materials and Methods for calculations). The dynamic binding capacity of P2808 resin under these conditions was 10 mg/ml resin (240 nmol/ml resin) (FIG. 7). This capacity represents an approximate 70% increase over our original SMT3-A1 resin.
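The QB10 determination described above can be sketched as follows. This is only an illustration under stated assumptions, not the calculation given in Materials and Methods: the function name, the optional dead-volume correction, and the example readings are hypothetical, and linear interpolation between readings is an assumed choice.

```python
def dbc_at_10pct(volumes_ml, frac_of_load, load_mg_per_ml, resin_ml, dead_ml=0.0):
    """Estimate dynamic binding capacity (mg target per ml resin) at 10% breakthrough.

    volumes_ml      cumulative volume loaded at each reading
    frac_of_load    flow-through fluorescence divided by load fluorescence
    load_mg_per_ml  target concentration in the load
    resin_ml        packed bed volume
    dead_ml         system dead volume to subtract (assumed 0 if unknown)
    """
    threshold = 0.10
    points = list(zip(volumes_ml, frac_of_load))
    for (v0, f0), (v1, f1) in zip(points, points[1:]):
        if f0 < threshold <= f1:
            # Linear interpolation between the two readings that bracket 10%.
            v10 = v0 + (threshold - f0) * (v1 - v0) / (f1 - f0)
            return (v10 - dead_ml) * load_mg_per_ml / resin_ml
    raise ValueError("breakthrough never reached 10% of the load signal")


# Hypothetical readings for a 0.6 ml column loaded at 0.12 mg/ml target.
example = dbc_at_10pct(
    volumes_ml=[0, 20, 40, 60, 80],
    frac_of_load=[0.0, 0.0, 0.05, 0.15, 0.5],
    load_mg_per_ml=0.12,
    resin_ml=0.6,
)
```

With these invented readings the 10% point falls at 50 ml, giving about 10 mg/ml resin; the numbers were chosen only so that the result lands near the reported capacity.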
We next tested the efficiency of the SUMO affinity resin (P2808 resin) under flow conditions. A 0.6 ml packed column and a flowrate of 0.5 ml/min (linear flowrate = 153 cm/h) were used to test two regimes.
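The column figures quoted above follow from the stated geometry (0.5 cm ID x 3.06 cm bed) and the 0.5 ml/min flowrate. A short sketch of that arithmetic, with hypothetical function names:

```python
import math


def bed_volume_ml(inner_diameter_cm, bed_height_cm):
    """Packed bed volume of a cylindrical column (1 cm^3 = 1 ml)."""
    area_cm2 = math.pi * (inner_diameter_cm / 2) ** 2
    return area_cm2 * bed_height_cm


def linear_flow_cm_per_h(flow_ml_per_min, inner_diameter_cm):
    """Convert a volumetric flowrate to a linear flowrate through the column."""
    area_cm2 = math.pi * (inner_diameter_cm / 2) ** 2
    return flow_ml_per_min / area_cm2 * 60


volume = bed_volume_ml(0.5, 3.06)          # about 0.60 ml
velocity = linear_flow_cm_per_h(0.5, 0.5)  # about 153 cm/h
```

Both values agree with the 0.6 ml column volume and the 153 cm/h linear flowrate given in the text.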
In the capacity-excess regime, SUMO-GFP was spiked into an E. coli lysate at a low concentration (SUMO-GFP = 0.76% of total protein by weight) and loaded at 42% of the column’s dynamic binding capacity (36% of its static binding capacity). As assessed by densitometry of a Coomassie-stained SDS-PAGE gel, the purity was > 90%, and the yield was 90% (FIG. 8A).
In the capacity-limited regime, SUMO-GFP was spiked into an E. coli lysate at a high concentration (SUMO-GFP = 5.7% of total protein by weight) and loaded at 133% of the column’s dynamic binding capacity (116% of its static binding capacity) (FIG. 8B). The purity and yield were comparable to the capacity-excess regime, and both were estimated to be > 90%. The performance metrics for both purifications are summarized in Table 6.
1: DBC = 10 mg/ml resin (see FIG. 7)
2: SBC = 11.5 mg/ml resin (see FIG. 12)
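The two load levels are quoted against both capacities. The conversion between them can be sketched as below, using the DBC and SBC values from the footnotes; the function name is hypothetical.

```python
DBC_MG_PER_ML = 10.0   # dynamic binding capacity (FIG. 7)
SBC_MG_PER_ML = 11.5   # static binding capacity (FIG. 12)


def load_as_pct_of_sbc(pct_of_dbc):
    """Re-express a load given as % of DBC as % of SBC."""
    load_mg_per_ml_resin = pct_of_dbc / 100 * DBC_MG_PER_ML
    return load_mg_per_ml_resin / SBC_MG_PER_ML * 100


capacity_excess = load_as_pct_of_sbc(42)    # about 36.5
capacity_limited = load_as_pct_of_sbc(133)  # about 115.7
```

These values match the 36% and 116% stated in the text to within rounding of the reported capacities.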
Compatibility of nC-B affinity resins with repeated cycles of NaOH cleaning.
For large-scale production of industrial proteins or biologics, the re-use and cleaning of columns in place reduces manufacturing costs and maintains consistent performance. Sodium hydroxide solutions are commonly used for cleaning-in-place procedures, so we tested the general compatibility of the new nanoCLAMP resins with sodium hydroxide treatment. We tested 3 resins made with nC-A nanoCLAMPs and 3 resins made with nC-B nanoCLAMPs. Each cycle consisted of loading a SUMO-GFP target spiked into an E. coli lysate; a wash step; elution with 3 M imidazole, pH 8; 10 minutes of contact time with 0.1 M NaOH; and then a 5-minute re-equilibration step. The eluates were collected and analyzed for purity by SDS-PAGE, and the purified target was quantified by fluorescence spectroscopy (FIGS. 9 and 13). In the first few cycles, we were surprised to observe an improvement in binding capacity of 5 to 20% with nC-B resins. It is possible that the increase in binding results from the removal of inhibitory material by NaOH. For all three nC-B resins, the binding capacity plateaued over 20 cycles and remained at or above 100% of starting capacity. In contrast, we observed a steady reduction in binding capacity by 25% to 50% with all three nC-A resins. Taken together, these data indicate that nanoCLAMPs of the nC-B class can generally serve as capture agents for resins capable of single-step affinity purification of targets to homogeneity. These resins are also compatible with cleaning-in-place protocols using 0.1 M sodium hydroxide for over 20 cycles, without loss of binding capacity or specificity. Further, in practice, cleaning-in-place cycles are not usually performed after each run, so the expected lifetime likely exceeds 100 purification runs, assuming sanitation every 5th run.
Compatibility of nC-B affinity resins with repeated cycles of low pH elution.
Because elution with 3 M imidazole might be suboptimal for some applications, we also tested elution with a citrate buffer at pH 2.5. In these experiments, the resin was also cleaned with NaOH in between each cycle for 1 minute before re-equilibrating the resin with buffered saline. The nanoCLAMP maintained 100% of its binding capacity and specificity over more than 20 cycles of loading, elution, and regeneration (FIGS. 14A and 14B).
Compatibility of nC-B affinity resins with organic solvent and autoclaving.
We next tested the ability of the nC-B resins and the original SMT3-A1 resin to resist more extreme conditions. First, we tested the resins’ ability to recover selective binding capacity after exposure to 100% DMF. We incubated nanoCLAMP resin in 100% DMF for 2 hours, re-equilibrated with buffered saline, and then measured binding capacity. Of the four resins tested (the original SMT3-A1 resin and P2808 resin, P2809 resin, and P2811 resin), all retained over 85% binding capacity after treatment with DMF (FIG. 10A) and maintained their apparent selectivity as assessed qualitatively by SDS-PAGE (FIG. 10B).
Because the nC-B resins were robust to the broad range of conditions tested so far, we decided to determine whether the resins could also retain binding and specificity after autoclaving. We autoclaved the resin with a 105-minute liquid steam cycle, including 30-minute exposure to 120 °C and 20 p.s.i., re-equilibrated at room temperature, and then measured binding capacity. The resin made with the nC-A nanoCLAMP (SMT3-A1) did not bind any detectable target protein after autoclaving. In contrast, the resins made with the nC-B nanoCLAMPs (P2808, P2809, and P2811) retained 25% to 45% binding capacity with specificity comparable to controls (FIGS. 10C and 10D). The SMT3-A1 eluates were not analyzed in FIG. 10D because there was not enough protein in the autoclaved sample to prepare a normalized aliquot to compare to the control. To our knowledge, affinity chromatography resins based on nC-B nanoCLAMPs are the first protein-based affinity resins shown to retain significant binding capacity and specificity after autoclaving.
Discussion
We report on a protein engineering campaign yielding an improved class of nanoCLAMPs suitable as capture agents for process-scale affinity chromatography. Similar to Protein A resins or antibody-based immunoaffinity resins, resins made with the new nC-B class of nanoCLAMPs support single-step purifications from complex mixtures with high yield and fold-purification. Similar to Protein A resins, but unlike antibody-based resins, nC-B resins are compatible with NaOH cleaning-in-place, can be produced by bacterial expression of the capture reagent, and lack cysteine residues. Unlike either Protein A or antibody-based resins, nC-B resins are distinct in having been shown to exhibit resistance to boiling temperatures, trypsin, and organic solvent. Key performance parameters of nC-B resins and supporting results are summarized in Table 7.
Affinity resins based on Sulfolink crosslinked beaded agarose (Thermo)
This work provides a test case supporting the potential of nC-B nanoCLAMP resins to extend Protein A-like levels of performance to a broad range of proteins beyond antibodies. nC-B resins’ efficiency, low cost-of-manufacture and reusability have the potential to reduce the total cost of manufacture for process-scale purifications. In addition, nanoCLAMPs’ general compatibility with high temperature, organic solvent and pH extremes may enable industrial applications where extreme conditions are required.
The improved stability and performance of nC-B nanoCLAMPs also support their use in applications beyond immunoaffinity chromatography. nanoCLAMPs have been used successfully in bioelectric and electrochemical sensors. The conditions at the surface of a biosensor represent a challenging environment where the improved stability of nC-B nanoCLAMPs may be enabling. For example, conjugation of nanoCLAMPs to surfaces in DMF allows the use of reaction conditions compatible with reagents that have low solubility in aqueous buffers.
The performance characteristics demonstrated for the exemplary nanoCLAMPs in this work support further exploration of nanoCLAMPs’ potential to substitute for other capture agents in a broad range of applications, especially those procedures requiring tight, selective or reversible binding and exposure to extreme conditions.
Materials and Methods
Cloning of SMT3-A1 mutants
The plasmid pET(SMT3-A1) containing nanoCLAMP SMT3-A1 was mutated by inverse PCR (Ochman et al. Genetics 120:621-623, 1988) by amplifying the plasmid with forward and reverse primers containing the mutation(s) of interest with 15-bp overlapping 5’ ends, purifying the amplicon, In-Fusion cloning the ends back together (In-Fusion HD Cloning Kit, Takara), and transforming chemically competent NEc1 E. coli (BL21(DE3) derivative with slyDD(His151-His196) from Nectagen, Inc.). Plasmids were purified by Qiagen miniprep kit (Qiagen) and mutations were verified by sequencing the purified plasmids by Sanger sequencing (Genewiz). Glycerol stocks of the plasmids in NEc1 cells were prepared for seeding expression cultures. Constructs for conjugating to Sulfolink resin (Thermo) coded for the nanoCLAMP with an N-terminal 6-His tag and a 13-amino acid C-terminal GS-linker followed by a Cys. Constructs for expressing nanoCLAMPs for biophysical characterization lacked the GS-linker and the C-terminal Cys to avoid dimerization issues due to disulfides.
pET(SMT3-A1) Sequence (SEQ ID NO: 38)
TTCTTGAAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGT CAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCG CTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGT CGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATG
CTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGC CCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGG GC AAGAGC AAC T C GGT C GC C GC AT AC AC T AT T C T C AGAAT GAC T T GGT T GAGT AC T C AC C AGT C AC AGAAAAGC AT C T T AC GGAT GGC AT GAC AGT AAGAGAAT T AT GC AGT GC T GC C AT AAC C AT GAGT GAT AAC AC T GC GGC C AAC T T AC T T C TGAC AAC GAT C GGAGGAC C GAAGGAGC T AAC CGCTTTTTT GC AC AAC AT GGGGGAT CAT GT AAC TCGCCTTGATCG T TGGGAAC C GGAGC T GAAT GAAGC CAT AC C AAAC GAC GAGC GT GAC AC CACGATGCCTGCAGCAAT GGC AAC AAC GT T GC GC AAAC TAT T AAC T GGC GAAC TACTTACTCTAGCTTCCC GGC AAC AAT T AAT AGAC T GGAT GGAGGC GGAT AAA GTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGG GTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTC AGGC AAC TAT GGAT GAAC GAAAT AGAC AGAT C GC T GAGAT AGGT GC C T C AC TGAT T AAGC AT T GGT AAC T GT C AGAC CAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTT TGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAG GATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTT T GT T T GC C GGAT C AAGAGC T AC C AAC T C T T T T T C C GAAGGT AAC T GGC T T C AGC AGAGC GC AGAT AC C AAAT AC T GT CCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCC TGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAG GCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATA C C T AC AGC GT GAGC TAT GAGAAAGC GCCACGCTTCCC GAAGGGAGAAAGGC GGAC AGGT AT C C GGT AAGC GGC AGGG TCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCAC CTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTT TTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACC GTATTACCGCCTTT GAGT GAGC TGATACCGCTCGCCGCAGCC GAAC GAC C GAGC GC AGC GAGT C AGT GAGC GAGGAA 
GCGGAAGAGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATATATGGTGCACTCTC AGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATCGCTACGTGACTGGGTCATGGCTGCG CCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGT GACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGGCAGCTGCGGTAAAGCT CATCAGCGTGGTCGTGAAGCGATTCACAGATGTCTGCCTGTTCATCCGCGTCCAGCTCGTTGAGTTTCTCCAGAAGC GTTAATGTCTGGCTTCTGATAAAGCGGGCCATGTTAAGGGCGGTTTTTTCCTGTTTGGTCACTGATGCCTCCGTGTA AGGGGGAT TTCTGTTCATGGGGGTAATGATACCGAT GAAAC GAGAGAGGAT GC T C AC GAT AC GGGT T AC T GAT GAT G AACATGCCCGGTTACTGGAACGTTGTGAGGGTAAACAACTGGCGGTATGGATGCGGCGGGACCAGAGAAAAATCACT CAGGGTCAATGCCAGCGCTTCGTTAATACAGATGTAGGTGTTCCACAGGGTAGCCAGCAGCATCCTGCGATGCAGAT C CGGAAC AT AAT GGTGCAGGGCGCT GAC TTCCGCGTTTC C AGAC T T T AC GAAAC AC GGAAAC C GAAGAC CATTCATG TTGTTGCTCAGGTCGCAGACGTTTTGCAGCAGCAGTCGCTTCACGTTCGCTCGCGTATCGGTGATTCATTCTGCTAA CCAGTAAGGCAACCCCGCCAGCCTAGCCGGGTCCTCAACGACAGGAGCACGATCATGCGCACCCGTGGCCAGGACCC AACGCTGCCCGAGATGCGCCGCGTGCGGCTGCTGGAGATGGCGGACGCGATGGATATGTTCTGCCAAGGGTTGGTTT GCGCATTCACAGTTCTCCGCAAGAATTGATTGGCTCCAATTCTTGGAGTGGTGAATCCGTTAGCGAGGTGCCGCCGG CTTCCATTCAGGTCGAGGTGGCCCGGCTCCATGCACCGCGACGCAACGCGGGGAGGCAGACAAGGTATAGGGCGGCG CCTACAATCCATGCCAACCCGTTCCATGTGCTCGCCGAGGCGGCATAAATCGCCGTGACGATCAGCGGTCCAGTGAT CGAAGTTAGGCTGGTAAGAGCCGCGAGCGATCCTTGAAGCTGTCCCTGATGGTCGTCATCTACCTGCCTGGACAGCA T GGC C T GC AAC GC GGGC AT C C C GAT GC C GC C GGAAGC GAGAAGAAT C AT AATGGGGAAGGC C AT C C AGC C T C GC GT C GCGAACGCCAGCAAGACGTAGCCCAGCGCGTCGGCCGCCATGCCGGCGATAATGGCCTGCTTCTCGCCGAAACGTTT GGT GGC GGGAC C AGT GAC GAAGGC T T GAGC GAGGGC GT GC AAGAT T C C GAAT AC C GC AAGC GAC AGGC CGATCATCG TCGCGCTCCAGCGAAAGCGGTCCTCGCCGAAAATGACCCAGAGCGCTGCCGGCACCTGTCCTACGAGTTGCATGATA AAGAAGAC AGT C AT AAGT GC GGC GAC GATAGTCATGCCCCGCGCCCACC GGAAGGAGC T GAC T GGGT T GAAGGC T C T CAAGGGCATCGGTCGAGATCCCGGTGCCTAATGAGTGAGCTAACTTACATTAATTGCGTTGCGCTCACTGCCCGCTT TCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGG 
CGCCAGGGTGGTTTTTCTTTTCACCAGTGAGACGGGCAACAGCTGATTGCCCTTCACCGCCTGGCCCTGAGAGAGTT GCAGCAAGCGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGGTGGTTAACGGCGGGATATAACAT GAGCTGTCTTCGGTATCGTCGTATCCCACTACCGAGATATCCGCACCAACGCGCAGCCCGGACTCGGTAATGGCGCG CATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAGCATCGCAGTGGGAACGATGCCCTCATTCAGCATTTGCATGG TTTGTTGAAAACCGGACATGGCACTCCAGTCGCCTTCCCGTTCCGCTATCGGCTGAATTTGATTGCGAGTGAGATAT T T AT GC C AGC C AGC C AGAC GC AGAC GC GC C GAGAC AGAAC TTAATGGGCCCGC T AAC AGC GCGATTTGCT GGT GAC C CAATGCGACCAGATGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTGTTGATGGGTGTCTGGT C AGAGAC AT C AAGAAAT AAC GC C GGAAC AT T AGT GC AGGC AGC T T C C AC AGC AAT GGC AT C C T GGT C AT C C AGC GGA TAGTTAATGATCAGCCCACT GAC GCGTTGCGC GAGAAGAT TGTGCACCGCCGCTTTACAGGCTTC GAC GCCGCTTCG TTCTACCATCGACACCACCACGCTGGCACCCAGTTGATCGGCGCGAGATTTAATCGCCGCGACAATTTGCGACGGCG CGTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGACTGTTTGCCCGCCAGTTGTTGTGCCACGCGGTTG GGAATGTAATTCAGCTCCGCCATCGCCGCTTCCACTTTTTCCCGCGTTTTCGCAGAAACGTGGCTGGCCTGGTTCAC CACGCGGGAAACGGTCTGATAAGAGACACCGGCATACTCTGCGACATCGTATAACGTTACTGGTTTCACATTCACCA CCCTGAATTGACTCTCTTCCGGGCGCTATCATGCCATACCGCGAAAGGTTTTGCGCCATTCGATGGTGTCCGGGATC TCGACGCTCTCCCTTATGCGACTCCTGCATTAGGAAGCAGCCCAGTAGTAGGTTGAGGCCGTTGAGCACCGCCGCCG C AAGGAAT GGT GC AT GC AAGGAGAT GGC GC C C AAC AGT C C C C C GGC C AC GGGGC C T GC C AC C AT AC C C AC GC C GAAA CAAGCGCTCATGAGCCCGAAGTGGCGAGCCCGATCTTCCCCATCGGTGATGTCGGCGATATAGGCGCCAGCAACCGC
ACCTGTGGCGCCGGTGATGCCGGCCACGATGCGTCCGGCGTAGAGGATCGAGATCTCGATCCCGCGAAATTAATACG AC T C AC T AT AGGGGAAT T GT GAGC GGAT AAC AAT T C C C C T C T AGAAAT AAT T T T GT T T AAC T T T AAGAAGGAGAT AT ACCATGGGCAGCAGCCATCATCATCATCATCACAACCCTTCTTTAATTCGTTCTGAATCCTGGGAAGACATCAAAGG GAAT GAAGC C AAT T TAT T AGAT GGAGAC GAT AAC AC CGGTGTTTGGTATTT C AAC GAAGT T T T C T AC GAAT C T C T T G CAGGAGAATTCATTGGATTGGACTTAGGTAAGGAAATTAAATTGGATGGTATTCGTTTTGTTATTGGTAAGAATGGA GGCGGTAGTTCCGACAAATGGAACAAATTCAAGTTGGAGTACTCCCTGGATAACGAAAGTTGGACTACTATCAAAGA AT AC GAC AAGAC AGGGGC T C C T GC AGGGAAAGAT GT T AT T GAAGAAT C C T T CGAGAC TCCCATTTCCGC T AAGT AC A TTCGTCTGACTAATCTGGAAGACAAAATCCTGTTCCTGACTTTTAGTGAGTTTGCAATTGTGTCTGACGGTGGAGGT GGCAGCGGCGGTGGTGGCTCGGGTGGAGGGTGCTGAGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGC CACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAA CTATATCCGGATATCCCGCAAGAGGCCCGGCAGTACCGGCATAACCAAGCCTATGCCTACAGCATCCAGGGTGACGG TGCCGAGGATGACGATGAGCGCATTGTTAGATTTCATACACGGTGCCTGACTGCGTTAGCAATTTAACTGTGATAAA C T AC C GC AT T AAAGC T TAT C GAT GAT AAGC T GT C AAAC AT GAGAA
Expression and purification of nanoCLAMPs under denaturing conditions for conjugation to affinity chromatography resin (1 L scale)
Glycerol stocks of NEc1 cells harboring nanoCLAMP expression vectors (described above) were used to inoculate 3 ml starter cultures of 2xYT/2% glucose (Glu)/100 µg/ml carbenicillin (CB) and grown overnight at 37 °C, 250 rpm. The overnight cultures were diluted 1:100 into 300 ml of Novagen Overnight Express Instant TB Medium/1% glycerol/CB and incubated 24 h, 30 °C, 250 rpm. Cells were pelleted at 10k x g, 10 min, 4 °C, and lysed with 30 ml 100 mM NaH2PO4, 10 mM Tris, 6 M GuHCl (QAB) pH 8.5, plus 1 mM TCEP (QAB-TCEP, pH 8.5) using a Polytron to homogenize. The insoluble material was pelleted at 15k x g, 20 min, 15 °C, and the cleared supernatant applied to Ni Sepharose 6 Fast Flow (Cytiva) and incubated rotating for 1 h to overnight. The beads were transferred to a column and washed with 3 CV QAB-TCEP, pH 8.5, then 3 CV QAB, pH 8.5. The protein was eluted with QAB, pH 8.5 + 250 mM imidazole and quantified by A280. The purity of the eluted protein was measured by SDS-PAGE on 12% NuPAGE Bis-Tris gels and Coomassie staining with GelCode Blue (after removing the GuHCl by cold ethanol precipitation). Yields for nanoCLAMPs were typically 150-300 mg/L culture, and purity was typically greater than 90%.
The purified, denatured nanoCLAMPs in QAB, pH 8.5 were reduced with 2 mM TCEP if used after storage and conjugated to Sulfolink cross-linked, 6% beaded agarose (Thermo). Briefly, the resin was equilibrated with QAB, pH 8.5 + 5 mM EDTA and transferred to a column. The nanoCLAMP was adjusted to 8 mg/ml in a volume 2X the volume of the Sulfolink resin, and then incubated with the resin with rotation for 30 min at room temperature. The resin was allowed to settle for 15 min, and the column drained to the top of the resin bed. The column was washed with QAB, pH 8.5 and then incubated with 50 mM L-Cys in QAB, pH 8.5 to quench for 15 min with rotation. The column was allowed to settle, drained, and washed again with 6 M GuHCl, 20 mM Tris (QCB), pH 8. Finally, the nanoCLAMP was refolded on the resin by rinsing with 6 CV of 20 mM MOPS, 150 mM NaCl (MBS), pH 6.5 + 1 mM CaCl2.
Determination of static binding capacity of nanoCLAMP resins
A spiked E. coli lysate was prepared by pelleting the equivalent of an OD600 = 8 culture, discarding the supernatant and lysing the cells with BPER at 4 ml per g pellet, 20 min at room temperature with rotation. The insoluble material was removed by centrifugation, and the cleared lysate adjusted to a total protein concentration of approximately 1.87 mg/ml, 20% BPER in PBS, pH 7.4. The target protein, SUMO-GFP (described above), was spiked into the lysate to a final concentration of 0.025 to 0.2 mg/ml, depending on the application, using a highly concentrated stock so that the total protein concentration remained unchanged. The spiked lysate was then incubated with 10 µl of the nanoCLAMP resin (packed volume) in a total volume of 1.4 ml, rotating at 4 °C for 1 h. The resin was precipitated by centrifugation and transferred to a small column. The resin was washed 4 times with 400 µl PBS, pH 7.4, and then eluted with 3 M imidazole, pH 8. The eluates were buffer exchanged twice with Zeba columns (7 kD MWCO, Thermo), and quantified by A280 or fluorescence using an iD5 plate reader.
Expression and purification of nanoCLAMPs for biophysical characterization
Glycerol stocks of NEc1 cells harboring nanoCLAMP expression vectors (described above) were used to inoculate 3 ml starter cultures of 2xYT/2% glucose (Glu)/100 µg/ml carbenicillin (CB) and grown overnight at 37 °C, 250 rpm. The overnight cultures were diluted 1:100 into 35 ml of Novagen Overnight Express Instant TB Medium/1% glycerol/CB and incubated 24 h, 30 °C, 250 rpm. Cells were pelleted and lysed with QCB, pH 8, and insoluble material removed by centrifugation at 15k x g, 20 min, 15 °C. The cleared lysate was incubated with Ni Sepharose 6 Fast Flow (Cytiva) for > 1 h rotating, room temperature, then transferred to 2 ml columns. The columns were washed with 6 x 1 ml QCB, pH 8, then refolded with 11 ml of 20 mM MOPS, 150 mM NaCl (MBS), 1 mM CaCl2, pH 8. The nanoCLAMPs were eluted with MBS, 1 mM CaCl2, 250 mM imidazole, pH 8, buffer exchanged to remove the imidazole using Zeba 7 kD MWCO desalting columns, and normalized to 1 mg/ml in MBS, 1 mM CaCl2, pH 6.5.
Expression and purification of target proteins for panning and affinity chromatography
To prepare a target protein for panning the library NL-26, we prepared a biotinylated yeast SUMO construct (B-SUMO; P1068) in a pET expression vector and transformed it into BL21(DE3) E. coli harboring a constitutively expressed biotin ligase, BirA. An overnight starter culture was diluted 1:100 into 500 ml Novagen Overnight Express Instant TB Medium/1% glycerol/CB/CAM including 5 mM biotin and incubated 24 h, 30 °C, 250 rpm. Following the induction, the cells were pelleted and the media discarded. The pellet was frozen at -80 °C, thawed on ice and resuspended at 5 ml/g pellet in MBS, pH 7.4 + Pierce Protease Inhibitor Tablet Mini and sonicated on ice for 10 min at 50% duty cycle. Biotin was added to 100 µM and the lysed cells incubated at 37 °C, 30 min, 250 rpm to drive biotinylation to completion. The lysate was cleared by centrifugation at 30k x g, 20 min, 4 °C and the supernatant transferred to a 2.25 ml SMT3-A1 resin (Nectagen, Inc) packed column at 1 ml/min. The resin was washed with 25 ml MBS, pH 7.4 and the protein eluted with polyol elution buffer (PEB): 10 mM Tris, 1 mM EDTA, 0.75 M ammonium sulfate, 40% propylene glycol, pH 7.9. The protein was desalted 2X into 50 mM Tris, pH 8 and stored as a 50% glycerol stock at -20 °C.
B-SUMO sequence (P1068) (SEQ ID NO: 39)
MSDSEVNQEAKPEVKPEVKPETHINLKVSDGSSEIFFKIKKTTPLRRLMEAFAKRQGKEMDSLRFLYDGIR IQADQTPEDLDMEDNDIIEAHREQIGGGGGLNDIFEAQKIEWHE
To facilitate quantification of affinity chromatography target protein in eluates, we prepared a SUMO-GFP fusion (P10126RDG-1) that we could track by fluorescence spectroscopy. Briefly, the pET expression construct was transformed into NEc1 E. coli (Nectagen, Inc) and an overnight culture diluted 1:100 into 1 L Novagen Overnight Express Instant TB Medium/1% glycerol/CB and grown 24 h at 30 °C, 250 rpm. The cells were pelleted, the media removed, and the cells lysed with BPER with Universal Nuclease (Thermo). The insoluble material was removed by centrifugation and the cleared supernatant loaded onto a 5 ml Ni Sepharose 6 Fast Flow column (Cytiva) at 1.5 ml/min, and the resin washed with 100 ml 50 mM NaH2PO4, 300 mM NaCl, 20 mM imidazole, pH 8, and then eluted with the same buffer with 250 mM imidazole. Protein purity of both B-SUMO and SUMO-GFP was assessed by SDS-PAGE using 12% NuPAGE, Bis-Tris, MES running buffer under reducing conditions and stained with GelCode Blue (Thermo).
SUMO-GFP fusion (P10126RDG-1) (SEQ ID NO: 40)
MGSSHHHHHHSDSEVNQEAKPEVKPEVKPETHINLKVSDGSSEIFFKIKKTTPLRRLMEAFAKRQGKEMD SLRFLYDGIRIQADQTPEDLDMEDNDIIEAHREQIGGLYFQGSKGEELFTGVVPILVELDGDVNGHKFSVS GEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFAYGLQCFARYPDHMKQHDFFKSAMPEGYVQERTIFF KDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIE DGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYKIEGR GGKPIPNPLLGLDST
Analysis of monodispersity by size exclusion chromatography
Purified nanoCLAMPs were diluted to a final concentration of 0.18 mg/ml in MBS, 1 mM CaCl2, pH 6.5, centrifuged at 20k x g, 2 min, 4 °C, and the supernatants transferred to a clean tube. The samples were loaded into a 125 µl sample loop and injected onto a Superdex 75 10/300 GL column (GE Healthcare Life Sciences, Pittsburgh, PA) equilibrated in MBS, 1 mM CaCl2, pH 6.5 at a flowrate of 0.65 ml/min. The column was calibrated with Bio-Rad Gel Filtration Standard per manufacturer’s instructions.
Determination of melting temperature by differential scanning fluorimetry
The melting temperature of purified nanoCLAMPs was determined using GloMelt Thermal Shift Protein Stability Kit (Biotium) per manufacturer’s instructions. Briefly, purified nanoCLAMPs were adjusted to 1 mg/ml in MBS, 1 mM CaCl2, pH 6.5, diluted in half with 2X GloMelt (Biotium), aliquoted to a 384-well plate, and sealed with optical film. The plate was then heated in a QuantStudio 5 qPCR machine using SYBR Green reporter with no passive reference. The heating profile was 25 °C for 2 min; ramp at 0.05 °C/sec to 99 °C; 99 °C for 2 min. Tm is defined as the inflection point in the unfolding curve.
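Locating the inflection point of an unfolding curve can be sketched as finding the temperature where the first derivative of fluorescence with respect to temperature is greatest. The sketch below is an assumption about how such a Tm might be computed, not the instrument software's algorithm; it assumes fluorescence rises as the protein unfolds, and the sigmoid used as input is synthetic.

```python
import math


def melting_temperature(temps_c, fluorescence):
    """Return the temperature at the steepest rise of the unfolding curve."""
    best_slope, tm = float("-inf"), None
    for i in range(1, len(temps_c) - 1):
        # Central-difference estimate of dF/dT at reading i.
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (
            temps_c[i + 1] - temps_c[i - 1]
        )
        if slope > best_slope:
            best_slope, tm = slope, temps_c[i]
    return tm


# Synthetic two-state curve with a midpoint at 70 °C, sampled from 25 to 99 °C.
temps = [25 + 0.5 * i for i in range(149)]
curve = [1 / (1 + math.exp(-(t - 70) / 2)) for t in temps]
tm_estimate = melting_temperature(temps, curve)
```

For reference, the ramp from 25 °C to 99 °C at 0.05 °C/sec takes 1480 seconds, or roughly 25 minutes. Real traces are noisier than this synthetic curve and would normally be smoothed before differentiation.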
Determination of protease stability by digestion with trypsin and chymotrypsin
Digestions were performed by incubating the nanoCLAMPs at 0.25 mg/ml in a 20 µl reaction containing 0.1 mg/ml trypsin (Roche Cat 11418475001) or chymotrypsin (Roche Cat 11418467001) diluted in 1 mM HCl, such that the final HCl concentration in the reaction was 0.1 mM. CaCl2 was added to the reaction to 10 mM. The protein and remaining diluent buffer was MBS, pH 6.5. The reaction was incubated at 37 °C for the indicated times and stopped by adding 2 µl 10X Protease Arrest (G-Biosciences), analyzed on SDS-PAGE (12% NuPAGE, Bis-Tris, in MES buffer) in SDS sample buffer with reducing agent, and stained with GelCode Blue (Thermo). Densitometry was carried out using GelAnalyzer software to measure relative staining intensities of the full-length band.
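One way to turn a densitometry time course into a half-life is to assume first-order loss of the full-length band and fit a straight line to the log-transformed intensities. The text does not state how the reported half-lives were derived, so the model, the function name, and the example data below are assumptions for illustration only.

```python
import math


def half_life_hours(times_h, band_intensity):
    """Fit ln(intensity) = ln(I0) - k*t by least squares and return ln(2)/k."""
    ys = [math.log(v) for v in band_intensity]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, ys))
    k = -sxy / sxx  # first-order decay constant, per hour
    if k <= 0:
        return math.inf  # no measurable loss of the full-length band
    return math.log(2) / k


# Invented intensities that halve every 4 hours.
times = [0, 1, 2, 4, 8]
intensities = [100 * 0.5 ** (t / 4) for t in times]
estimate = half_life_hours(times, intensities)
```

With the invented data the fit returns a half-life of 4 hours. A band that shows little or no loss over the time course can only support a lower bound, as with the > 20 hour values reported in the results.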
Phage display library NL-26 construction for nC-B class
The pCombX phagemid template p2799 (Table 8) contained the N-terminal and C-terminal constant regions of the nC-B class separated by a stuffer region containing HindIII and SpeI cut sites. This template was digested with HindIII and SpeI, gel purified, and the plasmid region amplified with degenerate primers 1957T R and 1960T F, which added the N- and C-terminal parts of nC-B as well as the randomized loops L1 and L8, respectively. Primers listed with a T are degenerate oligos (IDT) constructed using phosphoramidite trimer mixes (Glen Research) encoding all amino acids except Cys, Met, Lys, and Arg. The short internal region of P2788 was amplified using primers 1958T F and 1959 R, which added and randomized Loop L2. PCR was carried out with ClonAmp HiFi PCR Mix, according to manufacturer’s instructions (Takara Bio, Mountain View, CA). The reaction cycle was 98 °C for 10 sec, 65 °C for 10 sec, and 72 °C for 30 sec, repeated 30 times. These two amplicons, which contained overlapping ends, were gel purified and cloned together by Gibson Assembly (described below), creating the nC-B construct with 3 variable loops: Loop L1 (3 residues: 817, 819, 820), Loop L2 (7 residues, 838-844), and Loop L8 (5 residues, 931-935), for a total of 15 variable residues in 3 loops.
NNN = randomized codon
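The library design above fixes the size of the theoretical sequence space: 15 randomized positions, each drawn from the 20 standard amino acids minus Cys, Met, Lys, and Arg. A short sketch of that count, with hypothetical variable names:

```python
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
EXCLUDED = set("CMKR")  # Cys, Met, Lys, Arg are left out of the trimer mix

allowed = [aa for aa in STANDARD_AMINO_ACIDS if aa not in EXCLUDED]
loop_lengths = {"L1": 3, "L2": 7, "L8": 5}

variable_positions = sum(loop_lengths.values())
theoretical_diversity = len(allowed) ** variable_positions
```

This gives 16 allowed residues at 15 positions, or 16^15 (about 1.15 x 10^18) possible loop combinations. The number of clones actually sampled is set by the number of transformants obtained, which is far smaller than this theoretical space.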
To clone the library components, 10 µg of the large amplicon and 7.86 µg of the short amplicon were combined in a 2 ml reaction containing 1000 µl of Gibson Assembly Master Mix (2X) (NEB) and incubated at 50 °C for 30 min and then put on ice. The ligated DNA was then purified and concentrated in one Nucleospin Gel and PCR Cleanup Kit (Macherey-Nagel) and eluted in 45 µl EB. The DNA was then desalted on a VSWP 0.025 µm membrane (EMD Millipore) on ddH2O for 40 min with a water change at 20 min. The desalted DNA was then adjusted to 100 ng/µl with ddH2O and used to electroporate electrocompetent TG1 cells (Lucigen). Approximately 50 µl of DNA was added to 1.25 ml ice cold TG1 cells and pipetted up and down 4 times to mix on ice, after which 25 µl aliquots were transferred to 50 electroporation cuvettes (with 1 mm gaps) on ice. The cells were electroporated, and immediately quenched with 975 µl recovery media (Lucigen), pooled, and incubated at 37 °C, 250 rpm for 1 h. To titer the library, 10 µl of recovered culture was serially diluted in 2xYT and 10 µl of each dilution spotted on 2xYT/glu/carb and incubated at 30 °C overnight. The remaining library was expanded to 3 L 2xYT/glu/carb and amplified overnight at 30 °C, 250 rpm. The next day, the library was pelleted at 10k x g, 10 min, 4 °C and the media discarded. The pellet was re-suspended to an OD600 of 75 in 2xYT/2% glucose/18% glycerol, aliquoted and stored at -80 °C.
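The spot-titer step above can be converted into an estimated number of transformants as sketched below. The colony count and dilution are invented, the function names are hypothetical, and the 50 ml pooled recovery volume is inferred from 50 cuvettes at roughly 1 ml each (25 µl cells plus 975 µl recovery media).

```python
def cfu_per_ml(colonies, dilution_factor, spotted_ul=10.0):
    """Colony-forming units per ml of the undiluted culture."""
    spotted_ml = spotted_ul / 1000.0
    return colonies * dilution_factor / spotted_ml


def total_transformants(colonies, dilution_factor, recovery_volume_ml, spotted_ul=10.0):
    """Scale the titer to the whole pooled recovery culture."""
    return cfu_per_ml(colonies, dilution_factor, spotted_ul) * recovery_volume_ml


# Hypothetical count: 12 colonies in the 10 µl spot of the 10^5-fold dilution.
pooled_recovery_ml = 50 * (25 + 975) / 1000.0
estimate = total_transformants(12, 1e5, pooled_recovery_ml)
```

With these invented numbers the estimate is 6 x 10^9 transformants; the actual library size is not stated in this section.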
Panning of nanoCLAMP library NL-26 (nC-B library)
For the first round of panning, 2.7 L of 2xYT medium with 2% glucose and 100 µg/ml carbenicillin (2xYT/Glu/CB) was inoculated with 3.6 ml of the NL-26 library glycerol stock (OD600 = 75), to an OD600 of approximately 0.1 and grown at 37 °C, 250 rpm until the OD600 reached 0.52. The library was infected by adding helper phage VCSM13 (Stratagene, Cat#200251) to 750 ml of culture at an MOI of 20 phage/cell, and incubating at 37 °C, 100 rpm for 30 min, then 250 rpm for an additional 30 min. The cells were pelleted at 7500 x g for 10 min, and the media discarded. The cells were resuspended in 1.2 L 2xYT/CB, 70 µg/ml kanamycin (KAN), and incubated 15 h at 30 °C, 250 rpm. The cells were combined, and 100 ml was centrifuged at 10k x g for 10 min. The phage-containing supernatant was transferred to clean tubes and precipitated by adding 37.5 ml of 5X PEG/NaCl (20% polyethylene glycol 6000/2.5 M NaCl), and incubated on ice for 25 min. The phage was pelleted at 13k x g, 25 min and the supernatant discarded. The phage was resuspended in 10 ml 20 mM NaH2PO4, 150 mM NaCl, pH 7.4 (PBS), then centrifuged at 15k x g for 15 min to remove insoluble material. The phage was precipitated a second time by adding 1/4 volume 5X PEG/NaCl, incubated on ice for 5 min, and pelleted at 13k x g, 10 min at 4 °C. The phage pellet was resuspended in 3 ml PBS and quantified by absorbance at 268 nm (A268 = 1 for a solution of 5 x 10^12 phage/ml).
Two sets of 100 µl of Dynabeads MyOne Streptavidin T1 (ThermoFisher Scientific) magnetic beads slurry were washed 2 x 1 ml with PBS-T (PBS with 0.05% Tween 20), applying magnet in between washes to remove the supernatant, and then blocked in 1 ml of 2% dry milk solution in PBS with 0.05% Tween 20 (2% M-PBS-T) for 1 h, rotating, at room temperature. To preclear the phage against beads alone, 1 ml of phage was prepared at a concentration of 2 x 10^13 phage/ml in 2% M-PBS-T, the block removed from the first set of beads, and the phage added to the beads and incubated 1 h, rotating. The magnet was applied, and the precleared phage removed and transferred to a clean tube. The magnet was applied, and this step repeated two times to ensure no carryover of beads bound to phage to the next step. Biotinylated target (B-SUMO) was added to the precleared phage to 100 nM final concentration and incubated rotating 1 h. Block was removed from the second set of beads, and the phage/B-SUMO mix was added to the beads to precipitate the biotinylated target and bound phage. The beads were washed 8X with PBS-T, 1 ml each, vortexing between each step and applying the magnet. The washed beads were eluted with 800 µl 0.1 M glycine, pH 2.0, 10 min rotating, the magnet applied, and the eluate transferred to 72 µl 2 M Tris base to neutralize. The neutralized phage was then added to 9 ml XL1-blue E. coli, which had been grown to OD600 = 0.435 and placed on ice. The cells were infected at 37 °C, 45 min, 175 rpm, and then expanded to 100 ml 2xYT/Glu/CB and incubated overnight at 30 °C, 250 rpm.
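The absorbance-based phage quantification uses a single conversion factor: an A268 of 1 corresponds to 5 x 10^12 phage/ml. A minimal sketch of the conversion in both directions, with hypothetical function names:

```python
import math

PHAGE_PER_ML_AT_A268_OF_1 = 5e12  # conversion factor stated in the text


def phage_per_ml(a268, dilution_factor=1.0):
    """Phage concentration of the stock from the A268 of a diluted reading."""
    return a268 * dilution_factor * PHAGE_PER_ML_AT_A268_OF_1


def a268_for_concentration(target_phage_per_ml):
    """Absorbance expected for a desired phage concentration."""
    return target_phage_per_ml / PHAGE_PER_ML_AT_A268_OF_1


later_round_input = phage_per_ml(0.8)          # about 4 x 10^12 phage/ml
preclear_a268 = a268_for_concentration(2e13)   # about 4.0
```

By this factor, the A268 = 0.8 input used in later panning rounds is about 4 x 10^12 phage/ml, and the 2 x 10^13 phage/ml used for the first pre-clear corresponds to an A268 of about 4, which in practice would usually be read on a diluted sample.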
The overnight cultures were harvested by measuring the OD600, centrifuging the cells at 10k x g for 10 min and then resuspending the cells to an OD600 of 75 in 2xYT/18% glycerol. To prepare phage for the next round of panning, 5 ml of 2xYT/Glu/CB was inoculated with 5 µl of the 75 OD600 glycerol stock and incubated at 37 °C, 250 rpm until the OD600 reached 0.5. The cells were superinfected at 20:1 phage:cell, mixed well, and incubated at 37 °C, 30 min, 150 rpm and then 30 min at 250 rpm. The cells were pelleted at 5500 x g, 10 min, the glucose-containing media discarded and the cells resuspended in 10 ml 2xYT/CB/KAN and incubated overnight at 30 °C, 250 rpm.
The overnight phage prep was processed as described above. The phage was then prepared at A268 = 0.8 in 2% M-PBS-T, and the panning and pre-clearing continued as described, except that in the second and third rounds, the biotinylated target concentration was reduced 10X per round. The number of washes after phage capture was also increased in the third round, to 12 washes. In round 2, neutravidin-coated magnetic beads (Spherotech) were used in place of streptavidin beads to reduce enrichment for streptavidin binders.
Qualitative semELISA of individual clones following panning.
At the end of the last panning round, individual colonies were plated on 2xYT/Glu/CB agar plates following the 45 min 150 rpm recovery at 37 °C of the infected XL1-Blue cells with the eluted phage. The next day, 95 colonies were inoculated into 400 µl 2xYT/Glu/CB in a 96-deep-well culture plate and grown overnight at 37 °C, 300 rpm to generate a master plate, to which glycerol was added to 18% for storage at -80 °C. To prepare an induction plate for the ELISA, 5 µl of each master-plate culture was inoculated into 400 µl fresh 2xYT/0.1% glucose/CB medium and incubated for 2.75 h at 37 °C, 300 rpm. IPTG was then added to 0.5 mM and the plates were incubated at 30 °C with 300 rpm shaking overnight. Because the phagemid contains an amber stop codon, some nanoCLAMP protein is produced without the pIII domain, even though XL1-Blue is a suppressor strain, resulting in the periplasmic localization of some nanoCLAMP, of which some percentage is ultimately secreted to the media. The media can then be used directly in an ELISA assay (soluble expression-based monoclonal enzyme-linked immunosorbent assay: semELISA). After the overnight induction, the plates were centrifuged at 1200 x g for 10 min to pellet the cells. Streptavidin-coated microtiter plates (ThermoFisher) were rinsed 3 times with 200 µl PBS, and then coated with biotinylated target proteins at 2 µg/ml with 100 µl/well and incubated 1 h. For blank controls, a plate was incubated with 100 µl/well PBS. The coating solution was removed, and the plates were blocked with 2% M-PBS-T. The block was removed and 50 µl of 4% M-PBS-T was added to each well. At this point 50 µl of each induction plate supernatant was transferred to the blank and protein-coated wells, pipetted 10 times to mix, and incubated 1 h. The plates were washed 4 times with 200 µl PBS-T, and the plates were dumped and slapped on paper towels in between washes.
After the washes, 75 µl of a 1:2000 dilution of anti-FLAG-HRP (Sigma A8592) in 4% M-PBS-T was added to each well and incubated 1 h. The anti-FLAG-HRP was discarded, and the plates were washed as before. The plates were developed by adding 75 µl TMB Ultra substrate (ThermoFisher) and analyzed for positive signals compared to controls. Positive clones were then grown up from the master plate by inoculating 1 ml 2xYT/2% glucose/100 µg/ml CB with 3 µl glycerol stock and incubating for at least 6 h at 37 °C, 250 rpm. The cells were then pelleted, and the media was discarded. Plasmid DNA was prepared from the pellets using the Qiaprep Spin Miniprep Kit, and the sequences were determined by Sanger sequencing at Genewiz (South Plainfield, NJ). The nanoCLAMP inserts from unique positive clones were amplified and cloned into the pET expression vector, described above.
Biolayer Interferometry of nanoCLAMPs
Kinetic analysis of interactions between nanoCLAMPs and biotinylated SUMO was carried out on an OctetRed96 using SAX streptavidin-coated sensor tips. The tips were transferred first to buffer (MBS, 1 mM CaCl2, pH 6.5 + 1% BSA) for 300 sec, then to B-SUMO at 2 mg/ml in buffer for 180 sec, then to buffer for 300 sec, then to at least 4 dilutions of nanoCLAMPs in buffer (association) for 200 sec, then to buffer (dissociation) for 500 sec. The cells were constantly vortexed at 1000 rpm at room temperature. The kinetics were fit to a 1:1 model and Kd calculated using global fit analysis.
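For orientation, the standard 1:1 (Langmuir) interaction model relates the dissociation constant to the rate constants as Kd = k_off / k_on. The sketch below evaluates that model for given rate constants; it does not reproduce the instrument's global fitting routine, and the function names and any numeric rate constants used with it are hypothetical.

```python
import math


def kd_from_rates(k_on, k_off):
    """Equilibrium dissociation constant (M) from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on


def association_response(t, conc, k_on, k_off, r_max):
    """Response at time t (s) during association at analyte concentration conc (M)."""
    k_obs = k_on * conc + k_off                      # observed rate constant
    r_eq = r_max * conc / (conc + k_off / k_on)      # plateau response at this conc
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r0, k_off):
    """Response at time t (s) after transfer to buffer, starting from response r0."""
    return r0 * math.exp(-k_off * t)


# Hypothetical rate constants, for illustration only.
print(kd_from_rates(k_on=1e5, k_off=1e-3))
```

In a global fit, a single k_on and k_off pair is fitted simultaneously to the association and dissociation curves of all analyte dilutions, rather than fitting each curve separately.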
Dynamic Binding Capacity of P2808 resin and SMT3-A1 resin
A packed volume of 0.6 ml of P2808 resin or SMT3-A1 resin was packed into a Tricorn 5/50 column (5 mm ID x 3.06 cm height) and equilibrated in 20 mM NaH2PO4, 150 mM NaCl, pH 7.4 (PBS) at 0.5 ml/min for 5 CV. The load, a Sumo-GFP fusion protein (MW = 41,559 g/mol) diluted to a concentration, c, of 0.2 mg/ml in PBS, was pumped through the system with the column on bypass and the eluate fluorescence measured to determine the total load fluorescence at Ex/Em 485/535 nm. The delay volume, Vdelay, was measured for the configuration at 0.5 ml. The load was then directed to the column and the volume Vx measured, where Vx is the volume at which the fluorescence of the eluate = 10% of that of the total load. The dynamic binding capacity, in units of mg/(ml resin), was then calculated as follows: DBC = (Vx - Vdelay)*c/(Vol Resin).
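The calculation above can be written out as a minimal sketch. The function and parameter names are illustrative, and the breakthrough volume used in the example (30.5 ml) is a made-up value rather than a measured result; the column dimensions, delay volume, and load concentration are those stated in the text.

```python
import math


def column_volume_ml(inner_diameter_cm, bed_height_cm):
    """Packed bed volume of a cylindrical column in ml (1 cm^3 = 1 ml)."""
    return math.pi * (inner_diameter_cm / 2.0) ** 2 * bed_height_cm


def dynamic_binding_capacity(v_x_ml, v_delay_ml, conc_mg_per_ml, resin_vol_ml):
    """DBC in mg per ml resin: (Vx - Vdelay) * c / (resin volume)."""
    if v_x_ml < v_delay_ml:
        raise ValueError("breakthrough volume must not be less than the delay volume")
    return (v_x_ml - v_delay_ml) * conc_mg_per_ml / resin_vol_ml


# 5 mm ID x 3.06 cm bed height corresponds to roughly 0.6 ml of packed resin.
print(round(column_volume_ml(0.5, 3.06), 3))

# Hypothetical 10% breakthrough volume of 30.5 ml, for illustration only.
print(dynamic_binding_capacity(v_x_ml=30.5, v_delay_ml=0.5,
                               conc_mg_per_ml=0.2, resin_vol_ml=0.6))
```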
Purification of Sumo-GFP from a spiked E. coli lysate by affinity chromatography with P2808 resin
A cleared E. coli lysate was prepared by lysing a pellet of NEc1 E. coli (a derivative of BL21 (DE3) with the C-terminal region of SlyD knocked out by recombineering, Nectagen, Inc) with BPER (Thermo) and removing insoluble material by centrifugation at 15 k x g, 20 min, 4 °C. The cleared supernatant was diluted to a total protein concentration of roughly 3.3 mg/ml with PBS, pH 7.4 such that the BPER reagent was present at 20% vol/vol. The target protein SUMO-GFP (MW = 41,559 g/mol) was spiked in to a final concentration of either 0.2 mg/ml or 0.025 mg/ml. The spiked lysate was loaded onto the column at 0.5 ml/min for the indicated times, washed with 20 CV PBS, pH 7.4, then eluted with 3 M imidazole, pH 8. Fractions containing eluted target were pooled and desalted 2X on Zeba 7 MWCO columns and the protein quantified by A280. Imidazole removal was verified by testing the A280 of elution buffer alone following 2X desalting. Spiked lysate, early wash fractions, and pooled elutions (post buffer exchange) were analyzed by NuPAGE SDS PAGE under reducing conditions, 12%, Bis-Tris in MES running buffer and stained with Gel Code Blue (Thermo).
Repeated AC purification cycles including cleaning in place (CIP) of resins using 0.1 M NaOH
Repeated affinity chromatography purifications were carried out on an FPLC with a small 50 µl (packed) column using running buffer 20 mM MOPS, 150 mM NaCl, 1 mM CaCl2, pH 7.2. The load consisted of Sumo-GFP spiked into a cleared E. coli lysate (described above in Purification of Sumo-GFP from a spiked E. coli lysate) at 0.1 mg/ml. The cycle consisted of a 2 ml equilibration in running buffer at 1 ml/min, 0.5 ml load of spiked lysate at 0.5 ml/min, 3 ml wash with running buffer at 0.5 ml/min, 2 ml elution with 3 M imidazole, pH 8 (collected) at 0.5 ml/min, a 0.5 ml wash with running buffer at 0.5 ml/min, a cleaning in place cycle of 1.5 ml NaOH at 1 ml/min and then 2 ml at 0.2 ml/min (total contact time 10 min), and finally a refolding step with 5 ml running buffer at 1 ml/min. The target concentration in the eluates was measured by fluorescence spectroscopy in duplicate on an iD5 plate reader (Molecular Devices) at Ex/Em 485/535 nm. The eluates were analyzed by SDS-PAGE using NuPAGE gels as described above.
Repeated AC purification cycles with low pH elution and short cleaning in place with NaOH
Repeated affinity chromatography purifications were carried out as described above, except the column was eluted with 0.1 M citrate, pH 2.5 instead of 3 M imidazole, pH 8. Also, the cleaning in place step with 0.1 M NaOH was shortened to 1 ml, at 1 ml/min (1 min contact time per cycle). Since the eluted SUMO-GFP was denatured by the low pH elution, the relative elution concentrations were compared using densitometry of the target bands on SDS PAGE.
Determination of effect of autoclaving or DMF incubation on SUMO binding resin binding capacity and specificity
For each resin tested, three 10 µl aliquots (packed vol) of resin were loaded into 1.5 ml screw cap tubes. To one, 1 ml DMF was added, and the tube was incubated at room temperature for 2 h. To the other two, 100 µl MBS, 1 mM CaCl2, pH 7.2 was added. One of these was autoclaved with its cap left slightly loose, on a 30 min liquid cycle, which sterilizes at around 120 °C and 20 psi for 30 min, and then slowly drops the pressure and the temperature over the next 90 min. The other set of resin was left on ice as a control. After two hours, all of the resins were brought to room temperature and centrifuged at 1 k x g, 1 min. The control and autoclaved resin were stored overnight at 4 °C. The DMF-treated resin was rinsed 3X with fresh MBS, 1 mM CaCl2, pH 7.2 and then stored overnight at 4 °C. The next day all three of the sets of beads were rinsed with fresh buffer, and then incubated with 1.3 ml of E. coli lysate (prepared as described above) spiked with SUMO-GFP at 0.2 mg/ml, for 1 h, 4 °C, rotating. The resin was loaded into a small, tared column, rinsed 4 x 400 ml PBS, pH 7.4, then eluted 3 x 25 ml 3 M imidazole, pH 8. The fluorescence of the eluates was read on an iD5 plate reader in duplicate as described and the concentration determined by comparison with a standard curve of the target and compared to controls. The concentrations were normalized and analyzed by SDS PAGE as described above to assess purity.
Example 2. Using the protein scaffold to target diverse antigens.
We selected protein scaffolds that bound to diverse protein targets. A summary of the protein scaffolds and their cognate targets is shown in Table 9 below. Table 9 contains a subset of a much larger set of target-specific nanoCLAMPs, the majority of which possess the loop lengths of 4, 7, and 5 residues for loops 1, 2, and 8, respectively, as designed in library NL-26 (see above). To demonstrate the protein scaffold's ability to tolerate various loop lengths, we included in Table 9 only those nanoCLAMPs that possess at least one loop with a length different from the designed length. In Table 10 we demonstrate the scaffold's ability to support vast loop diversity by tabulating the amino acid sequences of nanoCLAMPs specific to several targets, showing in several cases the diversity of loop sequences that bind a single target.
Table 9. Sequences of binding scaffolds (nanoCLAMPs) with variable loop lengths
*denotes a stop codon
Example 3. Terbium binding
Lanthanide binding to the protein scaffold was demonstrated by incubating the proteins with terbium, removing unbound terbium by buffer exchange, and measuring time-resolved fluorescence. Proteins were prepared at 30 µM in 20 mM MOPS, 150 mM NaCl (MBS), pH 6.5 and buffer exchanged to remove any unbound Ca. SMT3-A1 (nC-A), P2808 (nC-B), and a negative control protein (recombinant SMT3) were added to a 140 µl reaction in the same buffer so their final concentrations were 8.57 µM, and either CaCl2 or TbCl3 was added to 300 µM. The reactions were incubated at 4 °C for 16 h, and then buffer exchanged with Zeba 7 MWCO desalting columns into MBS, pH 6.5. The proteins were diluted to 0.5 µM in MBS, pH 6.5 and then 200 µl was analyzed in duplicate on an iD5 (Molecular Devices) plate reader using time-resolved fluorescence with Ex/Em: 350/544 nm, 200 microsecond delay. As shown in FIG. 15, P972 (nC-A) exhibited 9X greater fluorescence when incubated with terbium instead of calcium, and P2808 (nC-B) exhibited 23X greater fluorescence when incubated with terbium instead of calcium.
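The reaction setup above follows the usual dilution relation C1*V1 = C2*V2. As a small worked sketch (the helper name is illustrative, and both concentrations only need to share one unit), the stated stock concentration of 30, final concentration of 8.57, and reaction volume of 140 imply that about 40 volume units of protein stock went into each reaction:

```python
def stock_volume(c_stock, c_final, v_final):
    """Volume of stock needed so that v_final has concentration c_final.

    c_stock and c_final must share one unit; the result is in the unit of v_final.
    """
    if c_final > c_stock:
        raise ValueError("final concentration cannot exceed stock concentration")
    return c_final * v_final / c_stock


# Values from the text: 30 (stock), 8.57 (final), 140 (reaction volume).
print(round(stock_volume(30.0, 8.57, 140.0), 1))
```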
Example 4. Loop length modelling
We next undertook modelling experiments to ascertain the ability of the protein scaffold to maintain its secondary and tertiary structure while varying loop lengths of each of L1-L8. For each loop, the plasticity was modeled with P2808 as follows. To explore long loop lengths, each loop was replaced by a flexible 15-amino acid (G4S)3 linker (sequence: GGGGSGGGGSGGGGS (SEQ ID NO: 41)). The protein fold was modeled in AlphaFold mmseq without relaxation. The top result was aligned with P2808 in Swiss PDB Viewer with the MagicFit function.
The loop, the flanking N-terminal, and the flanking C-terminal amino acid are shown in different shades. The structure was assessed qualitatively for the maintenance of the overall beta-sheet structures. Structures that maintained the overall beta-sheet structure were considered to maintain the overall fold. To explore short loop lengths, each loop was completely deleted and modeled as above. If the complete deletion did not impact the overall fold, no additional constructs were modeled. The complete deletion was aligned with Swiss PDB Viewer with the MagicFit function and then assessed qualitatively.
A deletion was considered to result in disruption of the overall beta-sheet structure if a beta-strand secondary structure assignment was converted to a coil assignment or one or more beta strands lost association with an adjacent beta-strand. If the complete deletion resulted in a disruption of the overall beta-sheet structure, a deletion series was made starting with each wild-type amino acid replaced by G and then removing one G at a time. The construct in the deletion series with the shortest loop length that maintained the fold qualitatively was aligned and assessed. The results from these modelling experiments are shown in FIGS. 17-26.
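The construct designs described above (linker replacement, complete deletion, and the glycine deletion series) amount to simple string operations on the protein sequence. The sketch below illustrates them on a made-up toy sequence; the helper names and the 0-based, end-exclusive loop coordinates are assumptions for illustration, and the actual P2808 loop boundaries are not reproduced here. Structure prediction itself is outside the scope of the sketch.

```python
LINKER = "GGGGSGGGGSGGGGS"  # 15-residue (G4S)3 linker, SEQ ID NO: 41 in the text


def replace_loop(seq, start, end, replacement):
    """Replace seq[start:end] (0-based, end-exclusive) with `replacement`."""
    if not 0 <= start <= end <= len(seq):
        raise ValueError("loop coordinates out of range")
    return seq[:start] + replacement + seq[end:]


def deletion_series(seq, start, end):
    """Yield all-Gly loop variants, shortened one G at a time down to one residue."""
    loop_len = end - start
    for n_gly in range(loop_len, 0, -1):
        yield replace_loop(seq, start, end, "G" * n_gly)


# Toy example: hypothetical framework AAAA / loop LMNP / framework CCCC.
toy = "AAAALMNPCCCC"
print(replace_loop(toy, 4, 8, LINKER))    # long-loop construct
print(replace_loop(toy, 4, 8, ""))        # complete loop deletion
print(list(deletion_series(toy, 4, 8)))   # Gly deletion series
```

Each resulting sequence would then be submitted to the structure predictor and compared with the parent model.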
Our database of nanoCLAMP binders was searched for any clones whose loops deviated from the standard lengths of Loop 1: 4 amino acids, Loop 2: 7 amino acids, and Loop 8: 5 amino acids. The resulting clones are listed in Table 9 above. Tables 11 and 12 below summarize the lengths for clones observed with the nC-B and nC-A scaffolds. Table 11 shows top loops and Table 12 shows bottom loops.
The third column indicates the length diversity observed across orthologs. The observation of variation correlates roughly with the modeling data.
Table 11. Lengths observed for top loops in isolated nanoCLAMP clones or in alignment with NagH CBM32-2 in different species
Table 12. Lengths observed for bottom loops in alignment with NagH CBM32-2 in different species
Taken together, the modeling results, the observed loop lengths in isolated nanoCLAMPs, and natural variation in loop length across species indicate that the length and sequence of each of L1-L8 can be independently varied without disrupting the core fold of the protein.
Example 5. Introduction of artificial disulfides into the nC-B scaffold to improve stability
The well-characterized SUMO binder P2808 was modeled with AlphaFold to rationally select adjacent residues on neighboring beta strands for substitution with Cys residues, with the aim of further stabilizing the protein by introducing a disulfide bond. Substitution mutations were chosen by visual inspection of the structure to identify residues whose side chains were located in the core of the protein, whose side chains were oriented towards each other, and whose alpha carbons were approximately the same distance apart as observed with natural disulfide bonds. AlphaFold modeling predicted that several of the selected substitution mutations would form disulfide bonds and that a few would not (Table 13). We cloned, expressed, and purified 14 mutants predicted to form disulfides (12 mutants of P2808 and 2 mutants of a P2808 mutant, P2960, which has DGGGSS871-876GDT and DHTGAP900-905SST X and Y loops from C. celatum). The proteins were immobilized on Ni Sepharose 6 Fast Flow beads under denaturing and reducing conditions and refolded by reducing the concentration of denaturant over time, in the presence of reduced and oxidized glutathione to aid in formation of disulfide bonds. The refolded proteins were then eluted from the resin and tested for the presence of disulfide bonds by mobility shift on SDS PAGE under reducing vs oxidizing conditions. Because proteins possessing intramolecular disulfides remain more compact than those that do not, proteins with disulfides typically run faster in SDS-PAGE due to their smaller hydrodynamic radius. Thirteen of the 14 proteins predicted by AlphaFold to form disulfide bonds migrated faster on SDS PAGE in sample buffer lacking reducing agent than in sample buffer containing reducing agent. This observation is consistent with reduction of the disulfide bonds in those proteins by the reducing agent (FIG. 27). The proteins that appeared to possess disulfide bonds also had a band that ran similarly to the reduced form.
This observation may indicate a mixed population of proteins with and without disulfides. These preparations also contained a small percentage of higher molecular weight species, likely intermolecular disulfide bonded multimers, which largely disappeared upon reduction. The control proteins, P2808 and P2960, which contain no cysteines, migrated at the same rate in both the reducing and non-reducing buffer. Further, an unrelated disulfide-containing control protein, bovine serum albumin (BSA), showed the expected decrease in electrophoretic mobility in reducing buffers. Taken together, these results indicate that disulfide bonds can be successfully introduced into the nC-B scaffold in the twelve positive constructs in Table 13. Clones P3007, P3008, and P3009 contain no lysines. The absence of lysines is expected to reduce susceptibility to trypsin and increase the specificity of labeling with amine-reactive reagents. These scaffolds have only a single primary amine, located at the N-terminal alpha amino group, and are expected to be modified specifically at this position by amine-reactive reagents. Clones P3013 and P3014 contain no asparagines, common sites of deamidation, so are expected to be less susceptible to deamidation.
Neutral or better Tm is defined by Tm of basis, +/- 4 °C.
1 Clones with disulfide bonds and good thermal stability
To ensure that the thermal stability was not negatively affected by the mutations, we measured the melting temperature of the mutants and the parents using differential scanning fluorimetry (DSF). We defined neutral or better results as a Tm not more than 4 °C below that of the parent. Clones P3010, P3011, P3017, and P3021 had deleterious effects on thermal stability and were not pursued further. Of the 14 proteins tested, 10 had neutral or better effect on the melting temperature and were at least as thermally stable as the parent constructs (Table 13). One clone, P3015 with a disulfide bond between residues 884 and 926, showed an 8 °C improvement in thermal stability in oxidized vs. reduced conditions (FIG. 27).
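The "neutral or better" criterion stated above reduces to a single comparison. The sketch below is illustrative only: the function name is hypothetical and the Tm values in the example are made up, not data from Table 13.

```python
def neutral_or_better(tm_mutant_c, tm_parent_c, tolerance_c=4.0):
    """True if the mutant Tm is not more than `tolerance_c` below the parent Tm."""
    return tm_mutant_c >= tm_parent_c - tolerance_c


# Hypothetical melting temperatures in degrees C, for illustration only.
parent_tm = 70.0
for name, tm in [("mutant_a", 68.5), ("mutant_b", 66.0), ("mutant_c", 64.0)]:
    print(name, neutral_or_better(tm, parent_tm))
```

The comparison is inclusive, so a mutant exactly 4 °C below the parent counts as neutral under this reading of the definition.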
Materials and Methods
AlphaFold modeling of introduced disulfides
The 2808 sequence was modeled in AlphaFold with pairs of cysteine substitutions. The version was mmseq, and the modeling was performed with relaxation.
Analysis of disulfides by SDS PAGE
nanoCLAMPs were purified as described above, under denaturing conditions, except the refolding step was modified. Briefly, nanoCLAMPs were bound to Ni Sepharose 6 Fast Flow (Cytiva) in 6 M GuHCl, 20 mM Tris, pH 8 (QCB) + 5 mM TCEP. The resins were washed with 5 column volumes (CV) QCB + 1 mM TCEP, then 5 CV QCB (no TCEP). The proteins were then gradually refolded by washing with 10 CV QCB + 2 mM GSH/1 mM GSSG, then steps of 5 CVs each stepping the GuHCl down from 4, 3, 2, 1, and finally 0 M GuHCl by diluting with 20 mM MOPS, 150 mM NaCl (MBS), 1 mM CaCl2, 2 mM GSH/1 mM GSSG, pH 8. Each refolding step of 5 CVs was incubated for 30 min. The refolded protein was then washed in 10 CV MBS, pH 8, 1 mM CaCl2, and finally eluted with MBS, pH 8, 1 mM CaCl2, 250 mM imidazole. Proteins were normalized to 1 mg/ml in MBS, 1 mM CaCl2, pH 6.5. The proteins were diluted 10X into SDS sample buffer containing 50 mM DTT (reducing) and SDS sample buffer lacking reducing agent. The proteins were heated to 95 °C for 5 min, cooled, and 1 µg separated on 12% NuPAGE BisTris gel (Thermo) with MES running buffer. Gels were stained with GelCode Blue (Thermo).
Differential scanning fluorimetry (DSF)
DSF was performed as described previously, except in the reducing case, TCEP was included in the DSF cocktail at 50 mM (final). The DSF program was performed as described, and the Tm measured at the inflection point of the curve.
Example 6. Identification of Framework Variants that Maintain Binding Activity
Phage library NL-26 contains clones of nanoCLAMPs with the wild-type nC-B framework as well as mutations resulting from errors in gene synthesis, PCR and phage propagation. The library was screened and analyzed to identify nanoCLAMPs that maintain the ability to bind their intended target and that contain one or more mutations in the framework regions. The analysis resulted in the identification of 105 nanoCLAMP variants, each recognizing one of four target antigens and each containing one or more mutations in the framework regions. The number of mutations identified in each framework and their position are summarized in Tables 14 and 15. A listing of individual enriched clones, targets and framework mutations is shown in Table 16. The identification of these variants in this non-exhaustive analysis indicates that each framework region can tolerate one or more mutations while maintaining the ability to be displayed on the phage surface and mediate binding to its target.
Methods
The NL-26 phage library was panned against recombinant GFPMut2, Human Serum Albumin, mCherry, and TEV protease, as described above. Phage were enriched in two rounds, and approximately 200,000 clones from each round were DNA sequenced by next generation DNA sequencing with an Illumina MiSeq system. The sequencing reads were processed with PipeBio software to cluster and count like-sequences. Clones were identified that met the criteria of 1) showing greater than two-fold normalized enrichment from Round 1 to Round 2 and 2) having one or more mutations in the framework region.
Table 14. Tolerance of Frameworks 1-9 to Mutations
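The two selection criteria given in the Methods above can be sketched as follows. This is not the PipeBio workflow; the function names are hypothetical, and normalizing each clone's read count by the total reads of its round is an assumption about what "normalized enrichment" means here. The counts in the example are made up.

```python
def normalized_enrichment(count_r1, total_r1, count_r2, total_r2):
    """Fold change in clone frequency from Round 1 to Round 2.

    Assumption: frequency = clone read count / total reads in that round.
    """
    if count_r1 <= 0 or total_r1 <= 0 or total_r2 <= 0:
        raise ValueError("Round 1 count and both totals must be positive")
    return (count_r2 / total_r2) / (count_r1 / total_r1)


def passes_criteria(count_r1, total_r1, count_r2, total_r2,
                    n_framework_mutations, min_fold=2.0):
    """Greater than `min_fold` enrichment AND at least one framework mutation."""
    fold = normalized_enrichment(count_r1, total_r1, count_r2, total_r2)
    return fold > min_fold and n_framework_mutations >= 1


# Hypothetical read counts out of ~200,000 reads per round.
print(passes_criteria(10, 200_000, 50, 200_000, n_framework_mutations=1))
```

Clones absent from Round 1 have an undefined fold change under this definition, so the sketch rejects a zero Round 1 count rather than guessing how such clones were handled.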
Other Embodiments
While the invention has been described in connection with specific embodiments thereof, it will be understood that it is capable of further modifications and this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the invention that come within known or customary practice within the art to which the invention pertains and may be applied to the essential features hereinbefore set forth, and follows in the scope of the claims.
Other embodiments are within the claims.
Claims
1. A protein scaffold comprising the structure:
A-F1-L1-F2-L2-F3-L3-F4-L4-F5-L5-F6-L6-F7-L7-F8-L8-F9-B; wherein:
F1-F9 correspond to framework regions 1-9;
L1-L8 correspond to loop regions 1-8;
A and B are each independently, absent or comprise at least one amino acid;
F1 comprises the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 4;
L1 is absent or comprises at least one amino acid;
F2 comprises the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 5;
L2 is absent or comprises at least one amino acid;
F3 comprises the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 6;
L3 is absent or comprises at least one amino acid;
F4 comprises the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 7;
L4 is absent or comprises at least one amino acid;
F5 comprises the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 8;
L5 is absent or comprises at least one amino acid;
F6 comprises the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 9;
L6 is absent or comprises at least one amino acid;
F7 comprises the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 10;
L7 is absent or comprises at least one amino acid;
F8 comprises the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11 ) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 11 ;
L8 is absent or comprises at least one amino acid; and
F9 comprises the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 12; and
wherein the protein scaffold comprises at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X1, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X2, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X3 relative to SEQ ID NO: 1, wherein:
X is any amino acid except the amino acid in the equivalent position in SEQ ID NO: 1;
X1 is any amino acid except R or S;
X2 is any amino acid except P or K; and
X3 is any amino acid except R or K.
2. The protein scaffold of claim 1 , wherein:
F1 comprises the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 4;
F2 comprises the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 5;
F3 comprises the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 6;
F4 comprises the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 7;
F5 comprises the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 8;
F6 comprises the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 9;
F7 comprises the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 10;
F8 comprises the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11 ) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 11 ; and
F9 comprises the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 12.
3. The protein scaffold of claim 2, wherein:
F1 comprises the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4);
F2 comprises the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5);
F3 comprises the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6);
F4 comprises the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7);
F5 comprises the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8);
F6 comprises the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9);
F7 comprises the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10);
F8 comprises the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11 ); and
F9 comprises the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12).
4. The protein scaffold of any one of claims 1-3, wherein the protein scaffold comprises at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X relative to SEQ ID NO: 1, wherein X is any amino acid.
5. The protein scaffold of claim 4, wherein the protein scaffold comprises at least one mutation selected from the group consisting of N807D, S809T, R812H, S813T, E814P, S815G, D818V, N822S, N825D, N832S, W836E, K857E, E858V, I859V, K860E, L861V, D862G, R865H, K870A, N871D, N880T, K881R, K883R, N890G, K897R, K901H, K908Q, E912D, S914D, and K922Q relative to SEQ ID NO: 1.
6. The protein scaffold of claim 4, wherein at least one mutation is K870X and/or N890X.
7. The protein scaffold of claim 6, wherein at least one mutation is K870A and/or N890G.
8. The protein scaffold of any one of claims 1-7, comprising at least 3 fewer lysines relative to SEQ ID NO: 1.
9. The protein scaffold of claim 8, comprising at least 6 fewer lysines relative to SEQ ID NO: 1.
10. The protein scaffold of claim 9, comprising 9 fewer lysines relative to SEQ ID NO: 1.
11. The protein scaffold of claim 9, comprising no lysines.
12. The protein scaffold of any one of claims 1-11, comprising at least 3 fewer asparagines relative to SEQ ID NO: 1.
13. The protein scaffold of claim 12, comprising at least 5 fewer asparagines relative to SEQ ID NO: 1.
14. The protein scaffold of claim 13, comprising 7 fewer asparagines relative to SEQ ID NO: 1.
15. The protein scaffold of claim 13, comprising no asparagines.
16. The protein scaffold of any one of claims 1-15, wherein A and B are each independently, absent or from 1 amino acid to 20 amino acids.
17. The protein scaffold of any one of claims 1-16, wherein each of L1-L8 is, independently, from 1 amino acid to 20 amino acids.
18. The protein scaffold of claim 17, wherein each of L1-L8 is, independently, from 1 amino acid to 10 amino acids.
19. The protein scaffold of claim 18, wherein each of L1-L8 is, independently, from 3 amino acids to 10 amino acids.
20. The protein scaffold of claim 19, wherein each of L1-L8 is, independently, from 3 amino acids to 8 amino acids.
21. The protein scaffold of claim 20, wherein L1 is 4 amino acids, L2 is 7 amino acids, and/or L8 is 5 amino acids.
22. The protein scaffold of claim 21, wherein L1 comprises the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid.
23. The protein scaffold of claim 21 or 22, wherein L2 comprises the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid.
24. The protein scaffold of any one of claims 21-23, wherein L8 comprises the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid.
25. The protein scaffold of any one of claims 1-23, wherein L4 comprises the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid.
26. The protein scaffold of any one of claims 1-25, wherein L6 comprises the sequence of: X1X2X3X4X5X6 (SEQ ID NO: 16), wherein each of X1-X6 is, independently, any amino acid.
27. The protein scaffold of any one of claims 1-26, wherein L8 comprises at least two amino acids.
28. The protein scaffold of any one of claims 1 -27, wherein L4 comprises the sequence of: (G/D)-GGSS (SEQ ID NO: 17) or GDT or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 17 or GDT.
29. The protein scaffold of any one of claims 1 -28, wherein L6 comprises the sequence of TGAPAG (SEQ ID NO: 18) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 18.
30. The protein scaffold of any one of claims 1 -29, wherein:
L4 comprises the sequence of: (G/D)-GGSS (SEQ ID NO: 17) or GDT; and L6 comprises the sequence of TGAPAG (SEQ ID NO: 18).
31 . The protein scaffold of any one of claims 1 -30, wherein L3 comprises the sequence of: (E/K/S)-(V/E)- (V/I/T)-(E/K/P/S)-(V/L)-(G/D) (SEQ ID NO: 19) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 19.
32. The protein scaffold of any one of claims 1 -31 , wherein L5 comprises the sequence of: LD-(G/N)- (E/S)-S (SEQ ID NO: 20) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 20.
33. The protein scaffold of any one of claims 1 -32, wherein L7 comprises at least one amino acid.
34. The protein scaffold of any one of claims 1 -33, wherein L7 comprises the sequence of ETPI-(S/E)-A (SEQ ID NO: 21 ) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 21 .
35. The protein scaffold of any one of claims 32-34, wherein:
L3 comprises the sequence of: (E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D) (SEQ ID NO: 19) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 19;
L5 comprises the sequence of: LD-(G/N)-(E/S)-S (SEQ ID NO: 20) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 20; and
L7 comprises the sequence of ETPI-(S/E)-A (SEQ ID NO: 21 ) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 21.
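As an illustration only, the degenerate notation used in the claims above, in which a parenthesised group such as (G/N) lists the residues permitted at one position, maps naturally onto a regular expression. The following Python sketch is not part of the claimed subject matter. The function names `motif_to_regex` and `matches_motif` are hypothetical, and the sketch tests only exact matches to a motif, not the one- or two-mutation variants that the claims also recite.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Convert a claim-style motif such as 'LD-(G/N)-(E/S)-S' into a regex."""
    parts = []
    for token in motif.split("-"):
        token = token.strip()
        if token.startswith("(") and token.endswith(")"):
            # '(G/N)' lists the residues allowed at a single position.
            choices = token[1:-1].split("/")
            parts.append("[" + "".join(choices) + "]")
        else:
            # A bare run such as 'LD' or 'ETPI' is a fixed stretch of residues.
            parts.append(token)
    return re.compile("".join(parts))


def matches_motif(motif: str, sequence: str) -> bool:
    """Return True if the whole sequence is one instance of the motif."""
    return motif_to_regex(motif).fullmatch(sequence) is not None


# Motifs as written in the claims (SEQ ID NOs: 19, 20 and 21).
L3_MOTIF = "(E/K/S)-(V/E)-(V/I/T)-(E/K/P/S)-(V/L)-(G/D)"
L5_MOTIF = "LD-(G/N)-(E/S)-S"
L7_MOTIF = "ETPI-(S/E)-A"

if __name__ == "__main__":
    print(matches_motif(L3_MOTIF, "EVVEVG"))
    print(matches_motif(L5_MOTIF, "LDGES"))
    print(matches_motif(L7_MOTIF, "ETPISA"))
```

Because the claims use "comprises", a containment test (`search` in place of `fullmatch`) may be the closer reading when a longer loop is being checked.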
36. The protein scaffold of any one of claims 1 -35, wherein A comprises the sequence of (D/N/H)-P.
37. The protein scaffold of claim 36, wherein A comprises the sequence of DP.
38. The protein scaffold of any one of claims 1 -37, wherein B comprises the sequence of DELE (SEQ ID NO: 35).
39. The protein scaffold of any one of claims 1 -38, wherein:
F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 22;
L1 is absent or comprises at least one amino acid;
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 23;
L2 is absent or comprises at least one amino acid;
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 24;
L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31 ) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 31 ;
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 25;
L4 comprises the sequence of: GGGSS (SEQ ID NO: 32) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 32;
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 26;
L5 comprises the sequence of: LDGES (SEQ ID NO: 33) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 33;
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 27;
L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 18;
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 28;
L7 comprises the sequence of: ETPISA (SEQ ID NO: 34) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 34;
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 29;
L8 is absent or comprises at least one amino acid; and
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 30.
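The recurring phrase "one or two amino acid insertions, deletions, substitution mutations, or a combination thereof" can be approximated computationally as an edit distance of at most two. The Python sketch below makes that assumption explicit: it counts each single-residue insertion, deletion, or substitution as one edit (the Levenshtein distance). This is one plausible reading offered for illustration, not a construction of the claim language, and the helper names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-residue insertions,
    deletions, or substitutions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))
    for i, res_a in enumerate(a, start=1):
        curr = [i]
        for j, res_b in enumerate(b, start=1):
            cost = 0 if res_a == res_b else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def within_two_edits(candidate: str, reference: str) -> bool:
    """True if candidate equals reference or differs by one or two edits."""
    return edit_distance(candidate, reference) <= 2


F1_REFERENCE = "TLIHTPGW"  # SEQ ID NO: 22 as recited in claim 39

if __name__ == "__main__":
    for variant in ("TLIHTPGW", "TLIRTPGW", "SLIRTPGW", "SLIRSPGW"):
        print(variant, edit_distance(variant, F1_REFERENCE),
              within_two_edits(variant, F1_REFERENCE))
```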
40. The protein scaffold of claim 39, wherein:
F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 22;
L1 is absent or comprises at least one amino acid;
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 23;
L2 is absent or comprises at least one amino acid;
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 24;
L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31 ) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 31 ;
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 25;
L4 comprises the sequence of: GGGSS (SEQ ID NO: 32) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 32;
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 26;
L5 comprises the sequence of: LDGES (SEQ ID NO: 33) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 33;
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 27;
L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 18;
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 28;
L7 comprises the sequence of: ETPISA (SEQ ID NO: 34) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 34;
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 29;
L8 is absent or comprises at least one amino acid; and
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one amino acid insertion, deletion, or substitution mutation relative to SEQ ID NO: 30.
41. The protein scaffold of claim 40, wherein:
F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 is absent or comprises at least one amino acid;
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 is absent or comprises at least one amino acid;
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31 );
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 comprises the sequence of: GGGSS (SEQ ID NO: 32);
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 comprises the sequence of: LDGES (SEQ ID NO: 33);
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18);
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 comprises the sequence of: ETPISA (SEQ ID NO: 34);
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 is absent or comprises at least one amino acid; and
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
42. The protein scaffold of claim 41, wherein:
F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 comprises the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid;
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 comprises the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid;
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31 );
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 comprises the sequence of: GGGSS (SEQ ID NO: 32);
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 comprises the sequence of: LDGES (SEQ ID NO: 33);
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18);
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 comprises the sequence of: ETPISA (SEQ ID NO: 34);
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 comprises the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid; and
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
43. The protein scaffold of claim 42, wherein:
A comprises the sequence of: DP;
F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22);
L1 comprises the sequence of: X1X2X3X4 (SEQ ID NO: 13), wherein each of X1-X4 is, independently, any amino acid;
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
L2 comprises the sequence of: X1X2X3X4X5X6X7 (SEQ ID NO: 14), wherein each of X1-X7 is, independently, any amino acid;
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
L3 comprises the sequence of: EVVEVG (SEQ ID NO: 31 );
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25);
L4 comprises the sequence of: GGGSS (SEQ ID NO: 32);
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
L5 comprises the sequence of: LDGES (SEQ ID NO: 33);
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27);
L6 comprises the sequence of: TGAPAG (SEQ ID NO: 18);
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28);
L7 comprises the sequence of: ETPISA (SEQ ID NO: 34);
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29);
L8 comprises the sequence of: X1X2X3X4X5 (SEQ ID NO: 15), wherein each of X1-X5 is, independently, any amino acid;
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30); and
B comprises the sequence of: DELE (SEQ ID NO: 35).
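For orientation, the A-F1-L1-...-F9-B arrangement recited in claim 43 can be laid out as an ordered list of segments and concatenated. The Python sketch below does this with the fixed sequences recited in the claim, writing the variable loop positions of L1, L2, and L8 as the placeholder letter X. The helper names `assemble` and `segment_spans` are hypothetical, and the resulting string is an illustrative template rather than any sequence from the sequence listing.

```python
# Ordered segments of claim 43; 'X' marks a position that may be any amino acid.
SEGMENTS = [
    ("A", "DP"),
    ("F1", "TLIHTPGW"),
    ("L1", "XXXX"),
    ("F2", "GSEADLLDGDDSTGVEY"),
    ("L2", "XXXXXXX"),
    ("F3", "SLAGEFIGLDLG"),
    ("L3", "EVVEVG"),
    ("F4", "GIHFVIGAD"),
    ("L4", "GGGSS"),
    ("F5", "DKWTRFRLEYS"),
    ("L5", "LDGES"),
    ("F6", "WTTIREYDH"),
    ("L6", "TGAPAG"),
    ("F7", "QDVIDEDF"),
    ("L7", "ETPISA"),
    ("F8", "QYIRLTNLE"),
    ("L8", "XXXXX"),
    ("F9", "LTFSEFAIVS"),
    ("B", "DELE"),
]


def assemble(segments):
    """Concatenate the segments in order into one template string."""
    return "".join(seq for _, seq in segments)


def segment_spans(segments):
    """Map each non-empty segment name to its 1-based (start, end) position."""
    spans = {}
    start = 1
    for name, seq in segments:
        if seq:
            spans[name] = (start, start + len(seq) - 1)
        start += len(seq)
    return spans


if __name__ == "__main__":
    template = assemble(SEGMENTS)
    print(template)
    print(len(template), "residues,", template.count("X"), "variable positions")
    for name, (start, end) in segment_spans(SEGMENTS).items():
        print(f"{name}: {start}-{end}")
```

Keeping the segments as named pairs rather than one long string makes it straightforward to swap in a different loop, or to leave a loop empty where an earlier claim allows it to be absent.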
44. The protein scaffold of claim 42 or 43, wherein L1 comprises the sequence of X1X2X3X4 (SEQ ID NO: 13), wherein each of X1, X3, and X4 is, independently, any amino acid, and X2 is V.
45. A protein scaffold comprising a polypeptide having at least 80% sequence identity to SEQ ID NO: 3.
46. The protein scaffold of claim 45, wherein the polypeptide has at least 85%, 90%, 95%, 97%, or 99% sequence identity to SEQ ID NO: 3.
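A percent-identity threshold such as those in claims 45 and 46 is evaluated on an alignment of the two sequences. The Python sketch below assumes the alignment has already been produced by a separate tool and that gaps are written as "-". It uses one common convention (identical columns divided by the number of alignment columns); other conventions exist, and the claims do not specify which applies. The function names are hypothetical, and the example strings are short fragments chosen for illustration, not SEQ ID NO: 3.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Both inputs must have the same length; '-' denotes a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if percent identity is at least the given threshold (e.g. 80)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("TLIHTPGW", "SLIRTPGW"))
    print(meets_threshold("TLIH-PGW", "TLIHTPGW", 80.0))
```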
47. The protein scaffold of claim 45 or 46, wherein the polypeptide comprises at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X1, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X2, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X3 relative to SEQ ID NO: 1, wherein:
X is any amino acid except the amino acid in the equivalent position in SEQ ID NO: 1;
X1 is any amino acid except R or S;
X2 is any amino acid except P or K; and
X3 is any amino acid except R or K.
48. The protein scaffold of claim 47, wherein the protein scaffold comprises at least one mutation selected from the group consisting of N807X, S809X, R812X, S813X, E814X, S815X, D818X, N822X, N825X, N832X, W836X, K857X, E858X, I859X, K860X, L861X, D862X, R865X, K870X, N871X, N880X, K881X, K883X, N890X, K897X, K901X, K908X, E912X, S914X, and K922X relative to SEQ ID NO: 1, wherein X is any amino acid.
49. The protein scaffold of claim 48, wherein the protein scaffold comprises at least one mutation selected from the group consisting of N807D, S809T, R812H, S813T, E814P, S815G, D818V, N822S, N825D, N832S, W836E, K857E, E858V, I859V, K860E, L861V, D862G, R865H, K870A, N871D, N880T, K881R, K883R, N890G, K897R, K901H, K908Q, E912D, S914D, and K922Q relative to SEQ ID NO: 1.
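Substitution labels of the form N807D follow the usual convention of wild-type residue, position, and replacement residue. The Python sketch below parses such a label and applies it to a sequence, checking that the wild-type residue is actually present at the stated position. The names `Substitution`, `parse_substitution`, and `apply_substitution` are hypothetical. SEQ ID NO: 1 is not reproduced here, so the example uses a four-residue toy fragment (FNPS, numbered 806 to 809) that is merely consistent with the labels F806, N807, P808, and S809 appearing in the claims.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str    # residue expected in the reference sequence
    position: int     # position in the reference numbering
    replacement: str  # residue introduced by the mutation


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'N807D'; whitespace inside the label is ignored."""
    match = _LABEL.match("".join(label.split()))
    if match is None:
        raise ValueError(f"not a single-residue substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, sub: Substitution,
                       first_position: int = 1) -> str:
    """Apply a substitution to a sequence whose first residue carries the
    number first_position in the reference numbering."""
    index = sub.position - first_position
    if not 0 <= index < len(sequence):
        raise IndexError(f"position {sub.position} is outside the sequence")
    if sequence[index] != sub.wild_type:
        raise ValueError(
            f"expected {sub.wild_type} at {sub.position}, found {sequence[index]}")
    return sequence[:index] + sub.replacement + sequence[index + 1:]


if __name__ == "__main__":
    toy_fragment = "FNPS"  # hypothetical fragment covering positions 806-809
    sub = parse_substitution("N807D")
    print(sub)
    print(apply_substitution(toy_fragment, sub, first_position=806))
```

Checking the wild-type residue before substituting guards against an off-by-one error in the numbering offset, which is easy to make when a domain is numbered relative to a longer parent sequence.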
50. A protein scaffold comprising framework regions and loop regions, the protein scaffold comprising at least 7 framework regions from the following structure:
A-F1 -L1 -F2-L2-F3-L3-F4-L4-F5-L5-F6-L6-F7-L7-F8-L8-F9-B; wherein:
F1 -F9 correspond to framework regions 1 -9;
L1-L8 correspond to loop regions 1-8 that are each, independently, absent or comprise one or more amino acids;
A and B are each independently, absent or comprise at least one amino acid;
F1 comprises the sequence of: (T/S)-LI-(H/R)-(T/S)-(P/E)-(G/S)-W (SEQ ID NO: 4) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 4;
F2 comprises the sequence of: G-(S/N/T)-E-(A/S)-(D/N/S/A)-LLDGDD-(S/N/T)-TGV-(E/W/A)-Y (SEQ ID NO: 5) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 5;
F3 comprises the sequence of: S-(L/V)-AGEFIGLDLG (SEQ ID NO: 6) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 6;
F4 comprises the sequence of: G-(I/V)-(H/R/Y/N)-FVIG-(A/K/R)-(D/N) (SEQ ID NO: 7) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 7;
F5 comprises the sequence of: DKW-(T/N/S)-(R/K)-F-(R/K)-LEYS (SEQ ID NO: 8) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 8;
F6 comprises the sequence of: WTTI-(R/K/H/Q)-EYD-(H/K/R/Q) (SEQ ID NO: 9) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 9;
F7 comprises the sequence of: (Q/K)-DVI-(D/E)-E-(D/S)-F (SEQ ID NO: 10) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 10;
F8 comprises the sequence of: (Q/K/R)-YIRLTNLE (SEQ ID NO: 11 ) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 11 ; and
F9 comprises the sequence of: LTFSEFA-(I/V)-VS (SEQ ID NO: 12) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 12.
51 . The protein scaffold of claim 50, comprising at least 7 of the framework regions, wherein
F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 22;
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 23;
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 24;
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 25;
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 26;
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 27;
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 28;
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 29; and
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30) or a sequence having one or two amino acid insertions, deletions, substitution mutations, or a combination thereof relative to SEQ ID NO: 30.
52. The protein scaffold of claim 51 , comprising at least 7 of the framework regions, wherein
F1 comprises the sequence of: TLIHTPGW (SEQ ID NO: 22);
F2 comprises the sequence of: GSEADLLDGDDSTGVEY (SEQ ID NO: 23);
F3 comprises the sequence of: SLAGEFIGLDLG (SEQ ID NO: 24);
F4 comprises the sequence of: GIHFVIGAD (SEQ ID NO: 25);
F5 comprises the sequence of: DKWTRFRLEYS (SEQ ID NO: 26);
F6 comprises the sequence of: WTTIREYDH (SEQ ID NO: 27);
F7 comprises the sequence of: QDVIDEDF (SEQ ID NO: 28);
F8 comprises the sequence of: QYIRLTNLE (SEQ ID NO: 29); and
F9 comprises the sequence of: LTFSEFAIVS (SEQ ID NO: 30).
53. The protein scaffold of any one of claims 50-52, wherein the protein scaffold comprises at least 8 of the framework regions F1 -F9.
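Claims 50 to 53 turn on how many of the nine framework regions a scaffold contains. A rough way to count them, shown in the Python sketch below, is to search a candidate sequence for each framework in order from F1 to F9. The sketch assumes exact matches to the sequences of claim 52 only; it does not detect the one- or two-mutation variants or the degenerate motifs of claims 50 and 51. The function name `frameworks_found` and the glycine-linked test sequence are hypothetical.

```python
# Framework sequences as recited in claim 52 (SEQ ID NOs: 22-30).
FRAMEWORKS = {
    "F1": "TLIHTPGW",
    "F2": "GSEADLLDGDDSTGVEY",
    "F3": "SLAGEFIGLDLG",
    "F4": "GIHFVIGAD",
    "F5": "DKWTRFRLEYS",
    "F6": "WTTIREYDH",
    "F7": "QDVIDEDF",
    "F8": "QYIRLTNLE",
    "F9": "LTFSEFAIVS",
}


def frameworks_found(sequence: str) -> list:
    """Return the names of frameworks present in the sequence, in F1-F9 order.

    Each framework is searched for only downstream of the previous hit, so
    frameworks that occur out of order are not counted.
    """
    found = []
    cursor = 0
    for name, motif in FRAMEWORKS.items():
        index = sequence.find(motif, cursor)
        if index != -1:
            found.append(name)
            cursor = index + len(motif)
    return found


if __name__ == "__main__":
    # Hypothetical test sequence: all nine frameworks joined by 'GG' linkers.
    full = "GG".join(FRAMEWORKS.values())
    print(len(frameworks_found(full)))

    # The same construct with F4 left out still contains eight frameworks.
    without_f4 = "GG".join(seq for name, seq in FRAMEWORKS.items() if name != "F4")
    print(frameworks_found(without_f4))
```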
54. The protein scaffold of any one of claims 50-53, wherein the protein scaffold comprises at least 80%, 85%, 90%, 95%, 97%, or 99% sequence identity to the framework regions F1-F9 over one or more regions of alignment.
55. The protein scaffold of any one of claims 1 -54, further comprising a substitution mutation that adds a cysteine residue.
56. The protein scaffold of claim 55, wherein the protein scaffold comprises a first substitution mutation that adds a first cysteine residue and a second substitution mutation that adds a second cysteine residue.
57. The protein scaffold of claim 56, wherein the first cysteine residue and the second cysteine residue form a disulfide bond under oxidizing conditions.
58. The protein scaffold of any one of claims 55-57, wherein the protein scaffold comprises at least one mutation selected from the group consisting of F806C, P808C, S845C, L855C, V858C, V861C, K878C, W879C, L884C, L888C, A904C, P905C, A906GC, G907C, I924C, L926C, N928C, L936C, I943C, and L948C.
59. The protein scaffold of claim 58, wherein the protein scaffold comprises at least two mutations selected from the group consisting of F806C, P808C, S845C, L855C, V858C, V861C, K878C, W879C, L884C, L888C, A904C, P905C, A906GC, G907C, I924C, L926C, N928C, L936C, I943C, and L948C.
60. The protein scaffold of claim 59, wherein the protein scaffold comprises a pair of cysteine mutations selected from the group consisting of K878C and G907C, K878C and A904C, V861C and I943C, P905C and L855C, S845C and L936C, W879C and N928C, L884C and L926C, F806C and L948C, V858C and L888C, K878C and G907C, K878C and A906GC, S845C and N928C, K878C and A904C, P808C and I943C, V861C and I924C, P808C and V861C, and I943C and L855C.
61 . The protein scaffold of claim 60, wherein the pair of cysteine mutations is selected from the group consisting of K878C and G907C, K878C and A904C, S845C and L936C, W879C and N928C, W879C and N928C, L884C and L926C, V858C and L888C, K878C and G907C, and K878C and A906GC.
62. The protein scaffold of any one of claims 1 -61 , further comprising a tag covalently attached to the scaffold.
63. The protein scaffold of claim 62, wherein the tag is an affinity tag.
64. The protein scaffold of claim 62 or 63, wherein the tag is attached to the N-terminus or the C- terminus of the scaffold.
65. The protein scaffold of any one of claims 1 -64, wherein the scaffold is conjugated to a functional group.
66. The protein scaffold of claim 65, wherein the functional group comprises biotin, streptavidin or a derivative of streptavidin, a polyethylene glycol moiety, a fluorescent dye, an enzyme, a radioactive moiety, a lanthanide, or a lanthanide binding motif.
67. The protein scaffold of claim 66, wherein the lanthanide is terbium.
68. The protein scaffold of claim 66, wherein the radioactive moiety is an α or β emitter.
69. The protein scaffold of any one of claims 65-68, wherein the functional group is conjugated to a sulfhydryl group or a primary amine.
70. A polynucleotide encoding the protein scaffold of any one of claims 1 -69.
71 . The polynucleotide of claim 70, wherein the polynucleotide is a ribonucleotide.
72. The polynucleotide of claim 70, wherein the polynucleotide is a deoxyribonucleotide.
73. A vector comprising the polynucleotide of claim 71 or 72.
74. A cell comprising the polynucleotide of any one of claims 70-72 or the vector of claim 73.
75. A method of producing the protein scaffold of any one of claims 1 -69 comprising:
(a) providing a cell transformed with the polynucleotide of any one of claims 70-72 or the vector of claim 73;
(b) culturing the transformed cell under conditions for expressing the polynucleotide, wherein the culturing results in expression of the protein scaffold; and
(c) isolating the protein scaffold.
76. A particle comprising the protein scaffold of any one of claims 1 -69.
77. The particle of claim 76, wherein the particle is a magnetic particle.
78. A resin comprising a plurality of the particles of claim 76 or 77.
79. A column comprising the resin of claim 78.
80. A method of purifying a target molecule from a plurality of molecules, the method comprising:
(a) providing a sample comprising a mixture of the target molecule and the plurality of molecules;
(b) contacting the sample with the protein scaffold of any one of claims 1 -69, wherein the scaffold specifically binds to the target molecule; and
(c) separating the target molecule bound to the protein scaffold from the plurality of molecules.
81 . The method of claim 80, wherein the step of separating comprises immobilizing the protein scaffold.
82. The method of claim 81 , wherein the protein scaffold is conjugated to a particle.
83. The method of claim 82, wherein the particle comprises a magnetic bead.
84. The method of claim 82 or 83, wherein the protein scaffold is conjugated to a resin or monolith comprising a plurality of the particles.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202263428344P | 2022-11-28 | 2022-11-28 | |
US63/428,344 | 2022-11-28 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2024118635A2 (en) | 2024-06-06 |
WO2024118635A3 (en) | 2024-07-18 |
Family
ID=91324840
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2023/081399 WO2024118635A2 (en) | 2022-11-28 | 2023-11-28 | Thermostable binding scaffolds |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024118635A2 (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2000018922A2 (en) * | 1998-10-01 | 2000-04-06 | Incyte Genomics, Inc. | Human carbohydrate-associated proteins |
AU2015305220B2 (en) * | 2014-08-22 | 2021-01-28 | Nectagen, Inc. | Affinity proteins and uses thereof |
US11566346B2 (en) * | 2020-06-25 | 2023-01-31 | Philip David Rodley | Protein scaffold |
Also Published As
Publication number | Publication date |
---|---|
WO2024118635A3 (en) | 2024-07-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23898715 Country of ref document: EP Kind code of ref document: A2 |