US20100069300A1 - C-Type Lectin Fold as a Scaffold for Massive Sequence Variation
- Publication number
- US20100069300A1 (application US12/493,802)
- Authority
- US
- United States
- Prior art keywords
- xaa
- protein
- scaffold
- binding
- cell
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Links
- 102000003930 C-Type Lectins Human genes 0.000 title claims description 17
- 108090000342 C-Type Lectins Proteins 0.000 title claims description 17
- 230000027455 binding Effects 0.000 claims abstract description 165
- 238000000034 method Methods 0.000 claims abstract description 47
- 125000003275 alpha amino acid group Chemical group 0.000 claims abstract description 25
- 108090000623 proteins and genes Proteins 0.000 claims description 118
- 102000004169 proteins and genes Human genes 0.000 claims description 115
- 210000004027 cell Anatomy 0.000 claims description 103
- 125000000539 amino acid group Chemical group 0.000 claims description 56
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 30
- 229920001184 polypeptide Polymers 0.000 claims description 28
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 28
- 150000007523 nucleic acids Chemical class 0.000 claims description 22
- 239000002904 solvent Substances 0.000 claims description 20
- 102000039446 nucleic acids Human genes 0.000 claims description 19
- 108020004707 nucleic acids Proteins 0.000 claims description 19
- 210000004899 c-terminal region Anatomy 0.000 claims description 17
- 206010028980 Neoplasm Diseases 0.000 claims description 16
- 230000001580 bacterial effect Effects 0.000 claims description 15
- 201000011510 cancer Diseases 0.000 claims description 13
- 241000282414 Homo sapiens Species 0.000 claims description 10
- 231100000219 mutagenic Toxicity 0.000 claims description 9
- 230000003505 mutagenic effect Effects 0.000 claims description 9
- 238000002703 mutagenesis Methods 0.000 claims description 8
- 231100000350 mutagenesis Toxicity 0.000 claims description 8
- 239000003053 toxin Substances 0.000 claims description 6
- 231100000765 toxin Toxicity 0.000 claims description 6
- 230000035899 viability Effects 0.000 claims description 6
- 230000007423 decrease Effects 0.000 claims description 5
- 229940002612 prodrug Drugs 0.000 claims description 5
- 239000000651 prodrug Substances 0.000 claims description 5
- 102000002086 C-type lectin-like Human genes 0.000 claims description 4
- 108050009406 C-type lectin-like Proteins 0.000 claims description 4
- 230000001413 cellular effect Effects 0.000 claims description 4
- 230000000977 initiatory effect Effects 0.000 claims description 4
- 230000003247 decreasing effect Effects 0.000 claims description 3
- 230000003362 replicative effect Effects 0.000 claims description 2
- 230000008685 targeting Effects 0.000 claims description 2
- 210000004962 mammalian cell Anatomy 0.000 claims 1
- 108091008324 binding proteins Proteins 0.000 abstract description 123
- 239000003446 ligand Substances 0.000 abstract description 28
- 239000000203 mixture Substances 0.000 abstract description 13
- 102000014914 Carrier Proteins Human genes 0.000 abstract 4
- 102000023732 binding proteins Human genes 0.000 description 119
- 235000018102 proteins Nutrition 0.000 description 78
- 101710184528 Scaffolding protein Proteins 0.000 description 65
- 235000001014 amino acid Nutrition 0.000 description 26
- 150000001413 amino acids Chemical class 0.000 description 24
- 241000588807 Bordetella Species 0.000 description 23
- 229940024606 amino acid Drugs 0.000 description 23
- 230000000875 corresponding effect Effects 0.000 description 21
- 230000000670 limiting effect Effects 0.000 description 20
- 230000003993 interaction Effects 0.000 description 19
- 102000005962 receptors Human genes 0.000 description 18
- 108020003175 receptors Proteins 0.000 description 18
- 108060003951 Immunoglobulin Proteins 0.000 description 16
- 102000018358 immunoglobulin Human genes 0.000 description 16
- 239000000427 antigen Substances 0.000 description 14
- 102000036639 antigens Human genes 0.000 description 14
- 108091007433 antigens Proteins 0.000 description 14
- 239000000178 monomer Substances 0.000 description 14
- 108091033319 polynucleotide Proteins 0.000 description 13
- 102000040430 polynucleotide Human genes 0.000 description 13
- 239000002157 polynucleotide Substances 0.000 description 13
- 238000006467 substitution reaction Methods 0.000 description 13
- 108010087870 Mannose-Binding Lectin Proteins 0.000 description 12
- 102000009112 Mannose-Binding Lectin Human genes 0.000 description 12
- 239000002245 particle Substances 0.000 description 12
- 210000001519 tissue Anatomy 0.000 description 12
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 11
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 11
- 241001515965 unidentified phage Species 0.000 description 11
- 229910052739 hydrogen Inorganic materials 0.000 description 10
- 239000001257 hydrogen Substances 0.000 description 10
- 108010021711 pertactin Proteins 0.000 description 10
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 10
- 239000013638 trimer Substances 0.000 description 10
- 108010047041 Complementarity Determining Regions Proteins 0.000 description 9
- 101710167241 Intimin Proteins 0.000 description 9
- GFFGJBXGBJISGV-UHFFFAOYSA-N adenyl group Chemical group N1=CN=C2N=CNC2=C1N GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 9
- 201000010099 disease Diseases 0.000 description 9
- 108020004705 Codon Proteins 0.000 description 8
- 102100034343 Integrase Human genes 0.000 description 8
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 8
- 238000001514 detection method Methods 0.000 description 8
- 241000192656 Nostoc Species 0.000 description 7
- 241000192117 Trichodesmium erythraeum Species 0.000 description 7
- 125000004429 atom Chemical group 0.000 description 7
- 230000000694 effects Effects 0.000 description 7
- 125000004435 hydrogen atom Chemical group [H]* 0.000 description 7
- 230000002209 hydrophobic effect Effects 0.000 description 7
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 6
- 230000004075 alteration Effects 0.000 description 6
- 239000000835 fiber Substances 0.000 description 6
- 229940072221 immunoglobulins Drugs 0.000 description 6
- 230000003612 virological effect Effects 0.000 description 6
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 5
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 5
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 5
- 102000004856 Lectins Human genes 0.000 description 5
- 108090001090 Lectins Proteins 0.000 description 5
- 241000424623 Nostoc punctiforme Species 0.000 description 5
- 102000003800 Selectins Human genes 0.000 description 5
- 108090000184 Selectins Proteins 0.000 description 5
- 239000012634 fragment Substances 0.000 description 5
- 230000005661 hydrophobic surface Effects 0.000 description 5
- 239000002523 lectin Substances 0.000 description 5
- 230000001404 mediated effect Effects 0.000 description 5
- 230000004048 modification Effects 0.000 description 5
- 238000012986 modification Methods 0.000 description 5
- 230000001105 regulatory effect Effects 0.000 description 5
- 239000000126 substance Substances 0.000 description 5
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 4
- 241000606125 Bacteroides Species 0.000 description 4
- 241001608472 Bifidobacterium longum Species 0.000 description 4
- 241000588724 Escherichia coli Species 0.000 description 4
- 241000589892 Treponema denticola Species 0.000 description 4
- 241000607618 Vibrio harveyi Species 0.000 description 4
- 238000003556 assay Methods 0.000 description 4
- 229940009291 bifidobacterium longum Drugs 0.000 description 4
- 239000003153 chemical reaction reagent Substances 0.000 description 4
- 239000003795 chemical substances by application Substances 0.000 description 4
- 239000013078 crystal Substances 0.000 description 4
- 230000002538 fungal effect Effects 0.000 description 4
- 230000002068 genetic effect Effects 0.000 description 4
- 238000000111 isothermal titration calorimetry Methods 0.000 description 4
- 238000004519 manufacturing process Methods 0.000 description 4
- 125000003729 nucleotide group Chemical group 0.000 description 4
- 108010038196 saccharide-binding proteins Proteins 0.000 description 4
- 230000006641 stabilisation Effects 0.000 description 4
- 238000011105 stabilization Methods 0.000 description 4
- 239000013598 vector Substances 0.000 description 4
- 229930024421 Adenine Natural products 0.000 description 3
- 201000001320 Atherosclerosis Diseases 0.000 description 3
- 241000894006 Bacteria Species 0.000 description 3
- OYPRJOBELJOOCE-UHFFFAOYSA-N Calcium Chemical compound [Ca] OYPRJOBELJOOCE-UHFFFAOYSA-N 0.000 description 3
- 102000000844 Cell Surface Receptors Human genes 0.000 description 3
- 108010001857 Cell Surface Receptors Proteins 0.000 description 3
- 102000011022 Chorionic Gonadotropin Human genes 0.000 description 3
- 108010062540 Chorionic Gonadotropin Proteins 0.000 description 3
- 241001464430 Cyanobacterium Species 0.000 description 3
- 102100025137 Early activation antigen CD69 Human genes 0.000 description 3
- 101000934374 Homo sapiens Early activation antigen CD69 Proteins 0.000 description 3
- 101001109501 Homo sapiens NKG2-D type II integral membrane protein Proteins 0.000 description 3
- 125000000998 L-alanino group Chemical group [H]N([*])[C@](C([H])([H])[H])([H])C(=O)O[H] 0.000 description 3
- 125000000510 L-tryptophano group Chemical group [H]C1=C([H])C([H])=C2N([H])C([H])=C(C([H])([H])[C@@]([H])(C(O[H])=O)N([H])[*])C2=C1[H] 0.000 description 3
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 3
- 241000699666 Mus <mouse, genus> Species 0.000 description 3
- 102100022680 NKG2-D type II integral membrane protein Human genes 0.000 description 3
- 108091028043 Nucleic acid sequence Proteins 0.000 description 3
- 108020003564 Retroelements Proteins 0.000 description 3
- 229960000643 adenine Drugs 0.000 description 3
- 210000004556 brain Anatomy 0.000 description 3
- 239000011575 calcium Substances 0.000 description 3
- 229910052791 calcium Inorganic materials 0.000 description 3
- 150000001720 carbohydrates Chemical class 0.000 description 3
- 235000014633 carbohydrates Nutrition 0.000 description 3
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 3
- 239000003814 drug Substances 0.000 description 3
- 230000000369 enteropathogenic effect Effects 0.000 description 3
- 206010015037 epilepsy Diseases 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- 210000002216 heart Anatomy 0.000 description 3
- 229940084986 human chorionic gonadotropin Drugs 0.000 description 3
- 210000000987 immune system Anatomy 0.000 description 3
- 238000001727 in vivo Methods 0.000 description 3
- 210000004185 liver Anatomy 0.000 description 3
- 210000004072 lung Anatomy 0.000 description 3
- 238000002493 microarray Methods 0.000 description 3
- 239000002773 nucleotide Substances 0.000 description 3
- 239000013612 plasmid Substances 0.000 description 3
- 230000000644 propagated effect Effects 0.000 description 3
- 210000002307 prostate Anatomy 0.000 description 3
- 238000002864 sequence alignment Methods 0.000 description 3
- 230000009870 specific binding Effects 0.000 description 3
- 210000000952 spleen Anatomy 0.000 description 3
- 238000013518 transcription Methods 0.000 description 3
- 230000035897 transcription Effects 0.000 description 3
- 230000010415 tropism Effects 0.000 description 3
- 208000030507 AIDS Diseases 0.000 description 2
- 239000004475 Arginine Substances 0.000 description 2
- 206010003210 Arteriosclerosis Diseases 0.000 description 2
- 208000031212 Autoimmune polyendocrinopathy Diseases 0.000 description 2
- 102100036465 Autoimmune regulator Human genes 0.000 description 2
- 241000588832 Bordetella pertussis Species 0.000 description 2
- 241000282832 Camelidae Species 0.000 description 2
- 241000283707 Capra Species 0.000 description 2
- 206010009900 Colitis ulcerative Diseases 0.000 description 2
- 208000012239 Developmental disease Diseases 0.000 description 2
- 102000004190 Enzymes Human genes 0.000 description 2
- 108090000790 Enzymes Proteins 0.000 description 2
- 201000011240 Frontotemporal dementia Diseases 0.000 description 2
- 206010018364 Glomerulonephritis Diseases 0.000 description 2
- 208000030836 Hashimoto thyroiditis Diseases 0.000 description 2
- -1 His Chemical compound 0.000 description 2
- 101000928549 Homo sapiens Autoimmune regulator Proteins 0.000 description 2
- 101710198693 Invasin Proteins 0.000 description 2
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 2
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 2
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 2
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 2
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 2
- 241000209510 Liliopsida Species 0.000 description 2
- 101710185544 Maltose-binding periplasmic protein Proteins 0.000 description 2
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 2
- 101710144262 Maltotriose-binding protein Proteins 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- 208000036626 Mental retardation Diseases 0.000 description 2
- 241001465754 Metazoa Species 0.000 description 2
- 208000003250 Mixed connective tissue disease Diseases 0.000 description 2
- 108010077854 Natural Killer Cell Receptors Proteins 0.000 description 2
- 102000010648 Natural Killer Cell Receptors Human genes 0.000 description 2
- 208000009905 Neurofibromatoses Diseases 0.000 description 2
- 108091005461 Nucleic proteins Proteins 0.000 description 2
- 241000283973 Oryctolagus cuniculus Species 0.000 description 2
- 206010035226 Plasma cell myeloma Diseases 0.000 description 2
- 201000004681 Psoriasis Diseases 0.000 description 2
- 241000700159 Rattus Species 0.000 description 2
- 208000021386 Sjogren Syndrome Diseases 0.000 description 2
- 108091008874 T cell receptors Proteins 0.000 description 2
- 102000016266 T-Cell Antigen Receptors Human genes 0.000 description 2
- 201000006704 Ulcerative Colitis Diseases 0.000 description 2
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Chemical compound CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 2
- 241000607477 Yersinia pseudotuberculosis Species 0.000 description 2
- 208000007502 anemia Diseases 0.000 description 2
- 208000011775 arteriosclerosis disease Diseases 0.000 description 2
- 125000003118 aryl group Chemical group 0.000 description 2
- 201000000448 autoimmune hemolytic anemia Diseases 0.000 description 2
- 201000009771 autoimmune polyendocrine syndrome type 1 Diseases 0.000 description 2
- 229960002685 biotin Drugs 0.000 description 2
- 235000020958 biotin Nutrition 0.000 description 2
- 239000011616 biotin Substances 0.000 description 2
- 210000001185 bone marrow Anatomy 0.000 description 2
- 210000000481 breast Anatomy 0.000 description 2
- 206010008129 cerebral palsy Diseases 0.000 description 2
- 210000003679 cervix uteri Anatomy 0.000 description 2
- 210000001072 colon Anatomy 0.000 description 2
- 230000021615 conjugation Effects 0.000 description 2
- 235000018417 cysteine Nutrition 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 201000001981 dermatomyositis Diseases 0.000 description 2
- 238000006471 dimerization reaction Methods 0.000 description 2
- 208000035475 disorder Diseases 0.000 description 2
- 206010014665 endocarditis Diseases 0.000 description 2
- 210000003527 eukaryotic cell Anatomy 0.000 description 2
- 230000007717 exclusion Effects 0.000 description 2
- 102000037865 fusion proteins Human genes 0.000 description 2
- 108020001507 fusion proteins Proteins 0.000 description 2
- 238000012252 genetic analysis Methods 0.000 description 2
- 238000011331 genomic analysis Methods 0.000 description 2
- 208000015181 infectious disease Diseases 0.000 description 2
- 229960000310 isoleucine Drugs 0.000 description 2
- 150000002632 lipids Chemical class 0.000 description 2
- 238000012423 maintenance Methods 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 244000005700 microbiome Species 0.000 description 2
- 201000006417 multiple sclerosis Diseases 0.000 description 2
- 210000003205 muscle Anatomy 0.000 description 2
- 206010028417 myasthenia gravis Diseases 0.000 description 2
- 201000004931 neurofibromatosis Diseases 0.000 description 2
- 210000000056 organ Anatomy 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 210000000496 pancreas Anatomy 0.000 description 2
- 230000001717 pathogenic effect Effects 0.000 description 2
- 238000002823 phage display Methods 0.000 description 2
- 208000005987 polymyositis Diseases 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- AAEVYOVXGOFMJO-UHFFFAOYSA-N prometryn Chemical compound CSC1=NC(NC(C)C)=NC(NC(C)C)=N1 AAEVYOVXGOFMJO-UHFFFAOYSA-N 0.000 description 2
- 238000010188 recombinant method Methods 0.000 description 2
- 210000003079 salivary gland Anatomy 0.000 description 2
- 208000012672 seasonal affective disease Diseases 0.000 description 2
- 235000004400 serine Nutrition 0.000 description 2
- 210000003491 skin Anatomy 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 238000001370 static light scattering Methods 0.000 description 2
- 210000002784 stomach Anatomy 0.000 description 2
- 238000001356 surgical procedure Methods 0.000 description 2
- 201000000596 systemic lupus erythematosus Diseases 0.000 description 2
- 210000001550 testis Anatomy 0.000 description 2
- 229940124597 therapeutic agent Drugs 0.000 description 2
- 201000005060 thrombophlebitis Diseases 0.000 description 2
- 210000001541 thymus gland Anatomy 0.000 description 2
- 210000001685 thyroid gland Anatomy 0.000 description 2
- 238000002054 transplantation Methods 0.000 description 2
- 238000011282 treatment Methods 0.000 description 2
- 235000002374 tyrosine Nutrition 0.000 description 2
- 210000003932 urinary bladder Anatomy 0.000 description 2
- 210000004291 uterus Anatomy 0.000 description 2
- 239000004474 valine Substances 0.000 description 2
- 230000007923 virulence factor Effects 0.000 description 2
- 239000000304 virulence factor Substances 0.000 description 2
- BEJKOYIMCGMNRB-GRHHLOCNSA-N (2s)-2-amino-3-(4-hydroxyphenyl)propanoic acid;(2s)-2-amino-3-phenylpropanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1.OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BEJKOYIMCGMNRB-GRHHLOCNSA-N 0.000 description 1
- ZIIUUSVHCHPIQD-UHFFFAOYSA-N 2,4,6-trimethyl-N-[3-(trifluoromethyl)phenyl]benzenesulfonamide Chemical compound CC1=CC(C)=CC(C)=C1S(=O)(=O)NC1=CC=CC(C(F)(F)F)=C1 ZIIUUSVHCHPIQD-UHFFFAOYSA-N 0.000 description 1
- 208000033316 Acquired hemophilia A Diseases 0.000 description 1
- 241000251468 Actinopterygii Species 0.000 description 1
- 206010001052 Acute respiratory distress syndrome Diseases 0.000 description 1
- 208000026872 Addison Disease Diseases 0.000 description 1
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 1
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 1
- 208000024827 Alzheimer disease Diseases 0.000 description 1
- 208000000044 Amnesia Diseases 0.000 description 1
- 208000031091 Amnestic disease Diseases 0.000 description 1
- 206010002198 Anaphylactic reaction Diseases 0.000 description 1
- 241000272525 Anas platyrhynchos Species 0.000 description 1
- 206010002329 Aneurysm Diseases 0.000 description 1
- 206010002383 Angina Pectoris Diseases 0.000 description 1
- 206010002556 Ankylosing Spondylitis Diseases 0.000 description 1
- 102000000412 Annexin Human genes 0.000 description 1
- 108050008874 Annexin Proteins 0.000 description 1
- 235000002198 Annona diversifolia Nutrition 0.000 description 1
- 208000019901 Anxiety disease Diseases 0.000 description 1
- 208000003017 Aortic Valve Stenosis Diseases 0.000 description 1
- 206010003226 Arteriovenous fistula Diseases 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 208000004300 Atrophic Gastritis Diseases 0.000 description 1
- 208000023275 Autoimmune disease Diseases 0.000 description 1
- 208000012219 Autonomic Nervous System disease Diseases 0.000 description 1
- 108090001008 Avidin Proteins 0.000 description 1
- 208000035143 Bacterial infection Diseases 0.000 description 1
- 208000023328 Basedow disease Diseases 0.000 description 1
- 206010004552 Bicuspid aortic valve Diseases 0.000 description 1
- 102000015081 Blood Coagulation Factors Human genes 0.000 description 1
- 108010039209 Blood Coagulation Factors Proteins 0.000 description 1
- 102000004506 Blood Proteins Human genes 0.000 description 1
- 108010017384 Blood Proteins Proteins 0.000 description 1
- 102100028728 Bone morphogenetic protein 1 Human genes 0.000 description 1
- 108090000654 Bone morphogenetic protein 1 Proteins 0.000 description 1
- 241001466550 Bordetella virus BPP1 Species 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 208000004020 Brain Abscess Diseases 0.000 description 1
- 206010006811 Bursitis Diseases 0.000 description 1
- 208000004434 Calcinosis Diseases 0.000 description 1
- 108010045403 Calcium-Binding Proteins Proteins 0.000 description 1
- 206010007559 Cardiac failure congestive Diseases 0.000 description 1
- 208000031229 Cardiomyopathies Diseases 0.000 description 1
- 208000024172 Cardiovascular disease Diseases 0.000 description 1
- 206010007747 Cataract congenital Diseases 0.000 description 1
- 241000700199 Cavia porcellus Species 0.000 description 1
- 208000010693 Charcot-Marie-Tooth Disease Diseases 0.000 description 1
- 102000019034 Chemokines Human genes 0.000 description 1
- 108010012236 Chemokines Proteins 0.000 description 1
- 235000015256 Chionanthus virginicus Nutrition 0.000 description 1
- 206010008748 Chorea Diseases 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 102100031162 Collagen alpha-1(XVIII) chain Human genes 0.000 description 1
- 208000002330 Congenital Heart Defects Diseases 0.000 description 1
- 206010018325 Congenital glaucomas Diseases 0.000 description 1
- 208000011990 Corticobasal Degeneration Diseases 0.000 description 1
- 208000019736 Cranial nerve disease Diseases 0.000 description 1
- 206010011321 Craniorachischisis Diseases 0.000 description 1
- 208000020406 Creutzfeldt Jacob disease Diseases 0.000 description 1
- 208000003407 Creutzfeldt-Jakob Syndrome Diseases 0.000 description 1
- 208000010859 Creutzfeldt-Jakob disease Diseases 0.000 description 1
- 241000699800 Cricetinae Species 0.000 description 1
- 208000011231 Crohn disease Diseases 0.000 description 1
- 208000014311 Cushing syndrome Diseases 0.000 description 1
- 241000192700 Cyanobacteria Species 0.000 description 1
- 102000004127 Cytokines Human genes 0.000 description 1
- 108090000695 Cytokines Proteins 0.000 description 1
- WQZGKKKJIJFFOK-QTVWNMPRSA-N D-mannopyranose Chemical compound OC[C@H]1OC(O)[C@@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-QTVWNMPRSA-N 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 206010011891 Deafness neurosensory Diseases 0.000 description 1
- 206010012289 Dementia Diseases 0.000 description 1
- 208000016192 Demyelinating disease Diseases 0.000 description 1
- 208000020401 Depressive disease Diseases 0.000 description 1
- 206010012438 Dermatitis atopic Diseases 0.000 description 1
- 206010012442 Dermatitis contact Diseases 0.000 description 1
- 206010012565 Developmental glaucoma Diseases 0.000 description 1
- 208000032131 Diabetic Neuropathies Diseases 0.000 description 1
- 201000010374 Down Syndrome Diseases 0.000 description 1
- 206010013801 Duchenne Muscular Dystrophy Diseases 0.000 description 1
- 208000011345 Duchenne and Becker muscular dystrophy Diseases 0.000 description 1
- 206010013883 Dwarfism Diseases 0.000 description 1
- 108010024212 E-Selectin Proteins 0.000 description 1
- 102100023471 E-selectin Human genes 0.000 description 1
- 206010014561 Emphysema Diseases 0.000 description 1
- 206010062608 Endocarditis noninfective Diseases 0.000 description 1
- 108010079505 Endostatins Proteins 0.000 description 1
- 206010014950 Eosinophilia Diseases 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 206010015226 Erythema nodosum Diseases 0.000 description 1
- 206010015251 Erythroblastosis foetalis Diseases 0.000 description 1
- 208000032027 Essential Thrombocythemia Diseases 0.000 description 1
- 206010061846 Extradural abscess Diseases 0.000 description 1
- 102000001690 Factor VIII Human genes 0.000 description 1
- 108010054218 Factor VIII Proteins 0.000 description 1
- 108010087819 Fc receptors Proteins 0.000 description 1
- 102000009109 Fc receptors Human genes 0.000 description 1
- 241000282326 Felis catus Species 0.000 description 1
- 102000002090 Fibronectin type III Human genes 0.000 description 1
- 108050009401 Fibronectin type III Proteins 0.000 description 1
- 206010016654 Fibrosis Diseases 0.000 description 1
- 206010017533 Fungal infection Diseases 0.000 description 1
- 241000234271 Galanthus Species 0.000 description 1
- 241000234283 Galanthus nivalis Species 0.000 description 1
- 241000287828 Gallus gallus Species 0.000 description 1
- 208000036495 Gastritis atrophic Diseases 0.000 description 1
- 208000003736 Gerstmann-Straussler-Scheinker Disease Diseases 0.000 description 1
- 206010072075 Gerstmann-Straussler-Scheinker syndrome Diseases 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 208000024869 Goodpasture syndrome Diseases 0.000 description 1
- 201000005569 Gout Diseases 0.000 description 1
- 208000015023 Graves' disease Diseases 0.000 description 1
- 101000948764 Haemophilus phage HP1 (strain HP1c1) Uncharacterized 58.7 kDa protein in lys 3'region Proteins 0.000 description 1
- 206010019280 Heart failures Diseases 0.000 description 1
- 206010061201 Helminthic infection Diseases 0.000 description 1
- 208000035186 Hemolytic Autoimmune Anemia Diseases 0.000 description 1
- 208000031220 Hemophilia Diseases 0.000 description 1
- 208000009292 Hemophilia A Diseases 0.000 description 1
- 102000008949 Histocompatibility Antigens Class I Human genes 0.000 description 1
- 108010088652 Histocompatibility Antigens Class I Proteins 0.000 description 1
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 1
- 208000023105 Huntington disease Diseases 0.000 description 1
- 206010020649 Hyperkeratosis Diseases 0.000 description 1
- 206010020751 Hypersensitivity Diseases 0.000 description 1
- 206010020772 Hypertension Diseases 0.000 description 1
- 206010062767 Hypophysitis Diseases 0.000 description 1
- 208000010159 IgA glomerulonephritis Diseases 0.000 description 1
- 206010021263 IgA nephropathy Diseases 0.000 description 1
- 108010067060 Immunoglobulin Variable Region Proteins 0.000 description 1
- 102000017727 Immunoglobulin Variable Region Human genes 0.000 description 1
- 206010061218 Inflammation Diseases 0.000 description 1
- 102000015696 Interleukins Human genes 0.000 description 1
- 108010063738 Interleukins Proteins 0.000 description 1
- 208000011200 Kawasaki disease Diseases 0.000 description 1
- 208000001126 Keratosis Diseases 0.000 description 1
- 108010092694 L-Selectin Proteins 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 1
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- 125000000393 L-methionino group Chemical group [H]OC(=O)[C@@]([H])(N([H])[*])C([H])([H])C(SC([H])([H])[H])([H])[H] 0.000 description 1
- 102100033467 L-selectin Human genes 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 101500020501 Lachesis muta muta Bradykinin-potentiating peptide 3 Proteins 0.000 description 1
- 241000282838 Lama Species 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 208000024369 Libman-Sacks endocarditis Diseases 0.000 description 1
- 239000000232 Lipid Bilayer Substances 0.000 description 1
- 108060001084 Luciferase Proteins 0.000 description 1
- 239000005089 Luciferase Substances 0.000 description 1
- 206010025323 Lymphomas Diseases 0.000 description 1
- 206010025327 Lymphopenia Diseases 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 102000043129 MHC class I family Human genes 0.000 description 1
- 108091054437 MHC class I family Proteins 0.000 description 1
- FYYHWMGAXLPEAU-UHFFFAOYSA-N Magnesium Chemical compound [Mg] FYYHWMGAXLPEAU-UHFFFAOYSA-N 0.000 description 1
- 101710110798 Mannose-binding protein C Proteins 0.000 description 1
- 102100026553 Mannose-binding protein C Human genes 0.000 description 1
- 102000018697 Membrane Proteins Human genes 0.000 description 1
- 108010052285 Membrane Proteins Proteins 0.000 description 1
- 206010027202 Meningitis bacterial Diseases 0.000 description 1
- 206010027260 Meningitis viral Diseases 0.000 description 1
- 206010068836 Metabolic myopathy Diseases 0.000 description 1
- 208000003430 Mitral Valve Prolapse Diseases 0.000 description 1
- 208000019022 Mood disease Diseases 0.000 description 1
- 208000034578 Multiple myelomas Diseases 0.000 description 1
- 101100181099 Mus musculus Klra1 gene Proteins 0.000 description 1
- 208000031888 Mycoses Diseases 0.000 description 1
- 208000003926 Myelitis Diseases 0.000 description 1
- 201000003793 Myelodysplastic syndrome Diseases 0.000 description 1
- 208000009525 Myocarditis Diseases 0.000 description 1
- 206010028643 Myopathy endocrine Diseases 0.000 description 1
- 208000023137 Myotoxicity Diseases 0.000 description 1
- NSTPXGARCQOSAU-VIFPVBQESA-N N-formyl-L-phenylalanine Chemical compound O=CN[C@H](C(=O)O)CC1=CC=CC=C1 NSTPXGARCQOSAU-VIFPVBQESA-N 0.000 description 1
- 208000012902 Nervous system disease Diseases 0.000 description 1
- 108020004485 Nonsense Codon Proteins 0.000 description 1
- 208000001132 Osteoporosis Diseases 0.000 description 1
- 101710116435 Outer membrane protein Proteins 0.000 description 1
- 102000016610 Oxidized LDL Receptors Human genes 0.000 description 1
- 108010028191 Oxidized LDL Receptors Proteins 0.000 description 1
- 108010035766 P-Selectin Proteins 0.000 description 1
- 102100023472 P-selectin Human genes 0.000 description 1
- 206010033645 Pancreatitis Diseases 0.000 description 1
- 208000027099 Paranoid disease Diseases 0.000 description 1
- 208000030852 Parasitic disease Diseases 0.000 description 1
- 208000018737 Parkinson disease Diseases 0.000 description 1
- 208000000733 Paroxysmal Hemoglobinuria Diseases 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 241000009328 Perro Species 0.000 description 1
- 102100036050 Phosphatidylinositol N-acetylglucosaminyltransferase subunit A Human genes 0.000 description 1
- 102000015439 Phospholipases Human genes 0.000 description 1
- 108010064785 Phospholipases Proteins 0.000 description 1
- 241000425347 Phyla <beetle> Species 0.000 description 1
- 208000000609 Pick Disease of the Brain Diseases 0.000 description 1
- 108010089814 Plant Lectins Proteins 0.000 description 1
- 241000251575 Polyandrocarpa Species 0.000 description 1
- 206010036376 Postherpetic Neuralgia Diseases 0.000 description 1
- 241000288906 Primates Species 0.000 description 1
- 102000029797 Prion Human genes 0.000 description 1
- 108091000054 Prion Proteins 0.000 description 1
- 208000024777 Prion disease Diseases 0.000 description 1
- 101710093543 Probable non-specific lipid-transfer protein Proteins 0.000 description 1
- 206010037075 Protozoal infections Diseases 0.000 description 1
- 230000004570 RNA-binding Effects 0.000 description 1
- 206010037779 Radiculopathy Diseases 0.000 description 1
- 208000003782 Raynaud disease Diseases 0.000 description 1
- 208000012322 Raynaud phenomenon Diseases 0.000 description 1
- 101710146873 Receptor-binding protein Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- 208000033464 Reiter syndrome Diseases 0.000 description 1
- 208000013616 Respiratory Distress Syndrome Diseases 0.000 description 1
- 208000007014 Retinitis pigmentosa Diseases 0.000 description 1
- 206010039491 Sarcoma Diseases 0.000 description 1
- 206010039710 Scleroderma Diseases 0.000 description 1
- 206010048908 Seasonal allergy Diseases 0.000 description 1
- RJFAYQIBOAGBLC-BYPYZUCNSA-N Selenium-L-methionine Chemical group C[Se]CC[C@H](N)C(O)=O RJFAYQIBOAGBLC-BYPYZUCNSA-N 0.000 description 1
- RJFAYQIBOAGBLC-UHFFFAOYSA-N Selenomethionine Natural products C[Se]CCC(N)C(O)=O RJFAYQIBOAGBLC-UHFFFAOYSA-N 0.000 description 1
- 208000009966 Sensorineural Hearing Loss Diseases 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 201000001388 Smith-Magenis syndrome Diseases 0.000 description 1
- 201000010829 Spina bifida Diseases 0.000 description 1
- 208000029033 Spinal Cord disease Diseases 0.000 description 1
- 208000006097 Spinal Dysraphism Diseases 0.000 description 1
- 208000010112 Spinocerebellar Degenerations Diseases 0.000 description 1
- 108010090804 Streptavidin Proteins 0.000 description 1
- 208000006011 Stroke Diseases 0.000 description 1
- 206010042265 Sturge-Weber Syndrome Diseases 0.000 description 1
- 201000000002 Subdural Empyema Diseases 0.000 description 1
- NINIDFKCEFEMDL-UHFFFAOYSA-N Sulfur Chemical compound [S] NINIDFKCEFEMDL-UHFFFAOYSA-N 0.000 description 1
- 241000282898 Sus scrofa Species 0.000 description 1
- 201000009594 Systemic Scleroderma Diseases 0.000 description 1
- 206010042953 Systemic sclerosis Diseases 0.000 description 1
- 108010092262 T-Cell Antigen Receptors Proteins 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 206010043118 Tardive Dyskinesia Diseases 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 206010043561 Thrombocytopenic purpura Diseases 0.000 description 1
- 208000000323 Tourette Syndrome Diseases 0.000 description 1
- 208000016620 Tourette disease Diseases 0.000 description 1
- 241000589886 Treponema Species 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- 208000026911 Tuberous sclerosis complex Diseases 0.000 description 1
- 102100031988 Tumor necrosis factor ligand superfamily member 6 Human genes 0.000 description 1
- 108050002568 Tumor necrosis factor ligand superfamily member 6 Proteins 0.000 description 1
- 102000018594 Tumour necrosis factor Human genes 0.000 description 1
- 108050007852 Tumour necrosis factor Proteins 0.000 description 1
- 206010067584 Type 1 diabetes mellitus Diseases 0.000 description 1
- 108010039648 Type II Antifreeze Proteins Proteins 0.000 description 1
- 208000006038 Urogenital Abnormalities Diseases 0.000 description 1
- 206010046851 Uveitis Diseases 0.000 description 1
- 206010046996 Varicose vein Diseases 0.000 description 1
- 206010047115 Vasculitis Diseases 0.000 description 1
- 206010047249 Venous thrombosis Diseases 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 208000036142 Viral infection Diseases 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 201000007960 WAGR syndrome Diseases 0.000 description 1
- 201000011032 Werner Syndrome Diseases 0.000 description 1
- 208000008383 Wilms tumor Diseases 0.000 description 1
- 208000017733 acquired polycythemia vera Diseases 0.000 description 1
- 208000009621 actinic keratosis Diseases 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 208000009956 adenocarcinoma Diseases 0.000 description 1
- 210000000577 adipose tissue Anatomy 0.000 description 1
- 210000004100 adrenal gland Anatomy 0.000 description 1
- 208000024447 adrenal gland neoplasm Diseases 0.000 description 1
- 208000011341 adult acute respiratory distress syndrome Diseases 0.000 description 1
- 201000000028 adult respiratory distress syndrome Diseases 0.000 description 1
- 230000002411 adverse Effects 0.000 description 1
- 239000000556 agonist Substances 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 125000001931 aliphatic group Chemical group 0.000 description 1
- 230000007815 allergy Effects 0.000 description 1
- 150000001408 amides Chemical class 0.000 description 1
- 230000006986 amnesia Effects 0.000 description 1
- 206010002022 amyloidosis Diseases 0.000 description 1
- 206010002026 amyotrophic lateral sclerosis Diseases 0.000 description 1
- 208000003455 anaphylaxis Diseases 0.000 description 1
- 206010002320 anencephaly Diseases 0.000 description 1
- 238000002399 angioplasty Methods 0.000 description 1
- 208000008303 aniridia Diseases 0.000 description 1
- 230000002547 anomalous effect Effects 0.000 description 1
- 239000005557 antagonist Substances 0.000 description 1
- 230000036506 anxiety Effects 0.000 description 1
- 206010002906 aortic stenosis Diseases 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 208000006673 asthma Diseases 0.000 description 1
- 201000008937 atopic dermatitis Diseases 0.000 description 1
- 230000001363 autoimmune Effects 0.000 description 1
- 208000022362 bacterial infectious disease Diseases 0.000 description 1
- 201000009904 bacterial meningitis Diseases 0.000 description 1
- 244000052616 bacterial pathogen Species 0.000 description 1
- 208000018300 basal ganglia disease Diseases 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 208000021654 bicuspid aortic valve disease Diseases 0.000 description 1
- 239000012620 biological material Substances 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 210000000601 blood cell Anatomy 0.000 description 1
- 239000003114 blood coagulation factor Substances 0.000 description 1
- 210000001124 body fluid Anatomy 0.000 description 1
- 210000000988 bone and bone Anatomy 0.000 description 1
- 206010006451 bronchitis Diseases 0.000 description 1
- 230000002308 calcification Effects 0.000 description 1
- 125000002915 carbonyl group Chemical group [*:2]C([*:1])=O 0.000 description 1
- 208000005761 carcinoid heart disease Diseases 0.000 description 1
- 230000000747 cardiac effect Effects 0.000 description 1
- 206010007776 catatonia Diseases 0.000 description 1
- 150000001768 cations Chemical class 0.000 description 1
- 230000030833 cell death Effects 0.000 description 1
- 239000002458 cell surface marker Substances 0.000 description 1
- 210000003169 central nervous system Anatomy 0.000 description 1
- 208000015114 central nervous system disease Diseases 0.000 description 1
- 208000026106 cerebrovascular disease Diseases 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 201000001352 cholecystitis Diseases 0.000 description 1
- 208000012601 choreatic disease Diseases 0.000 description 1
- 208000016644 chronic atrophic gastritis Diseases 0.000 description 1
- 208000025302 chronic primary adrenal insufficiency Diseases 0.000 description 1
- 230000004087 circulation Effects 0.000 description 1
- 230000007882 cirrhosis Effects 0.000 description 1
- 208000019425 cirrhosis of liver Diseases 0.000 description 1
- 238000000975 co-precipitation Methods 0.000 description 1
- 238000004040 coloring Methods 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
- 208000028831 congenital heart disease Diseases 0.000 description 1
- 208000010247 contact dermatitis Diseases 0.000 description 1
- 210000004351 coronary vessel Anatomy 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 150000001945 cysteines Chemical class 0.000 description 1
- 231100000599 cytotoxic agent Toxicity 0.000 description 1
- 239000002619 cytotoxin Substances 0.000 description 1
- 230000003412 degenerative effect Effects 0.000 description 1
- 230000003111 delayed effect Effects 0.000 description 1
- 206010012601 diabetes mellitus Diseases 0.000 description 1
- 239000000032 diagnostic agent Substances 0.000 description 1
- 229940039227 diagnostic agent Drugs 0.000 description 1
- 208000016097 disease of metabolism Diseases 0.000 description 1
- 239000006185 dispersion Substances 0.000 description 1
- 238000002224 dissection Methods 0.000 description 1
- 150000002019 disulfides Chemical class 0.000 description 1
- 208000010118 dystonia Diseases 0.000 description 1
- 230000009881 electrostatic interaction Effects 0.000 description 1
- 230000002124 endocrine Effects 0.000 description 1
- 210000002889 endothelial cell Anatomy 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 239000002532 enzyme inhibitor Substances 0.000 description 1
- 229940125532 enzyme inhibitor Drugs 0.000 description 1
- 230000002327 eosinophilic effect Effects 0.000 description 1
- 201000000165 epidural abscess Diseases 0.000 description 1
- 230000001667 episodic effect Effects 0.000 description 1
- 210000002919 epithelial cell Anatomy 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 229960000301 factor viii Drugs 0.000 description 1
- 201000006061 fatal familial insomnia Diseases 0.000 description 1
- 208000001031 fetal erythroblastosis Diseases 0.000 description 1
- 238000000684 flow cytometry Methods 0.000 description 1
- 239000012530 fluid Substances 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 238000009472 formulation Methods 0.000 description 1
- 230000005714 functional activity Effects 0.000 description 1
- 125000000524 functional group Chemical group 0.000 description 1
- 244000053095 fungal pathogen Species 0.000 description 1
- 210000000232 gallbladder Anatomy 0.000 description 1
- 210000000609 ganglia Anatomy 0.000 description 1
- 230000002496 gastric effect Effects 0.000 description 1
- 210000001035 gastrointestinal tract Anatomy 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 150000004676 glycans Chemical class 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 208000002566 gonadal dysgenesis Diseases 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- 238000001631 haemodialysis Methods 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 208000019622 heart disease Diseases 0.000 description 1
- 208000018578 heart valve disease Diseases 0.000 description 1
- 230000000322 hemodialysis Effects 0.000 description 1
- 208000006454 hepatitis Diseases 0.000 description 1
- 231100000283 hepatitis Toxicity 0.000 description 1
- 208000013057 hereditary mucoepithelial dysplasia Diseases 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 108091008039 hormone receptors Proteins 0.000 description 1
- 230000005099 host tropism Effects 0.000 description 1
- 208000003906 hydrocephalus Diseases 0.000 description 1
- 230000005660 hydrophilic surface Effects 0.000 description 1
- 125000001165 hydrophobic group Chemical group 0.000 description 1
- 208000015210 hypertensive heart disease Diseases 0.000 description 1
- 208000003532 hypothyroidism Diseases 0.000 description 1
- 230000002989 hypothyroidism Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 230000001524 infective effect Effects 0.000 description 1
- 201000007119 infective endocarditis Diseases 0.000 description 1
- 208000027866 inflammatory disease Diseases 0.000 description 1
- 230000004054 inflammatory process Effects 0.000 description 1
- 208000014674 injury Diseases 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 229940047122 interleukins Drugs 0.000 description 1
- 230000009878 intermolecular interaction Effects 0.000 description 1
- 210000000936 intestine Anatomy 0.000 description 1
- 238000007917 intracranial administration Methods 0.000 description 1
- 238000007918 intramuscular administration Methods 0.000 description 1
- 238000007912 intraperitoneal administration Methods 0.000 description 1
- 238000007913 intrathecal administration Methods 0.000 description 1
- 238000001990 intravenous administration Methods 0.000 description 1
- 208000002551 irritable bowel syndrome Diseases 0.000 description 1
- 230000000302 ischemic effect Effects 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 206010023497 kuru Diseases 0.000 description 1
- 239000004816 latex Substances 0.000 description 1
- 229920000126 latex Polymers 0.000 description 1
- 230000000503 lectinlike effect Effects 0.000 description 1
- 208000032839 leukemia Diseases 0.000 description 1
- 108700041430 link Proteins 0.000 description 1
- 210000002751 lymph Anatomy 0.000 description 1
- 210000001165 lymph node Anatomy 0.000 description 1
- 210000004324 lymphatic system Anatomy 0.000 description 1
- 231100001023 lymphopenia Toxicity 0.000 description 1
- 239000011777 magnesium Substances 0.000 description 1
- 229910052749 magnesium Inorganic materials 0.000 description 1
- 108010022177 mannose binding protein A Proteins 0.000 description 1
- 201000007261 marantic endocarditis Diseases 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 230000002503 metabolic effect Effects 0.000 description 1
- 230000037353 metabolic pathway Effects 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 238000000386 microscopy Methods 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 230000036651 mood Effects 0.000 description 1
- 208000005264 motor neuron disease Diseases 0.000 description 1
- 210000000214 mouth Anatomy 0.000 description 1
- 208000001725 mucocutaneous lymph node syndrome Diseases 0.000 description 1
- 201000000585 muscular atrophy Diseases 0.000 description 1
- 201000006938 muscular dystrophy Diseases 0.000 description 1
- 206010028537 myelofibrosis Diseases 0.000 description 1
- 201000000050 myeloid neoplasm Diseases 0.000 description 1
- 230000002107 myocardial effect Effects 0.000 description 1
- 208000010125 myocardial infarction Diseases 0.000 description 1
- 208000031225 myocardial ischemia Diseases 0.000 description 1
- 230000032147 negative regulation of DNA repair Effects 0.000 description 1
- 208000018389 neoplasm of cerebral hemisphere Diseases 0.000 description 1
- 230000001613 neoplastic effect Effects 0.000 description 1
- 210000000653 nervous system Anatomy 0.000 description 1
- 230000001537 neural effect Effects 0.000 description 1
- 201000010193 neural tube defect Diseases 0.000 description 1
- 208000018360 neuromuscular disease Diseases 0.000 description 1
- 201000001119 neuropathy Diseases 0.000 description 1
- 230000007823 neuropathy Effects 0.000 description 1
- 208000016135 nonbacterial thrombotic endocarditis Diseases 0.000 description 1
- 238000006384 oligomerization reaction Methods 0.000 description 1
- 229920001542 oligosaccharide Polymers 0.000 description 1
- 150000002482 oligosaccharides Chemical class 0.000 description 1
- 201000008482 osteoarthritis Diseases 0.000 description 1
- 230000002611 ovarian Effects 0.000 description 1
- 210000001672 ovary Anatomy 0.000 description 1
- 230000003071 parasitic effect Effects 0.000 description 1
- 230000000849 parathyroid Effects 0.000 description 1
- 201000003045 paroxysmal nocturnal hemoglobinuria Diseases 0.000 description 1
- 210000003899 penis Anatomy 0.000 description 1
- 208000008494 pericarditis Diseases 0.000 description 1
- 208000029308 periodic paralysis Diseases 0.000 description 1
- 208000027232 peripheral nervous system disease Diseases 0.000 description 1
- 108700010839 phage proteins Proteins 0.000 description 1
- 239000008177 pharmaceutical agent Substances 0.000 description 1
- 239000000546 pharmaceutical excipient Substances 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 210000003635 pituitary gland Anatomy 0.000 description 1
- 210000002826 placenta Anatomy 0.000 description 1
- 239000003726 plant lectin Substances 0.000 description 1
- 238000007747 plating Methods 0.000 description 1
- 208000037244 polycythemia vera Diseases 0.000 description 1
- 210000002729 polyribosome Anatomy 0.000 description 1
- 229920001282 polysaccharide Polymers 0.000 description 1
- 239000005017 polysaccharide Substances 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 230000035935 pregnancy Effects 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 201000009395 primary hyperaldosteronism Diseases 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 201000002212 progressive supranuclear palsy Diseases 0.000 description 1
- 210000001236 prokaryotic cell Anatomy 0.000 description 1
- 230000002062 proliferating effect Effects 0.000 description 1
- 230000000069 prophylactic effect Effects 0.000 description 1
- 238000011321 prophylaxis Methods 0.000 description 1
- 230000012846 protein folding Effects 0.000 description 1
- 230000004853 protein function Effects 0.000 description 1
- 230000002797 proteolythic effect Effects 0.000 description 1
- 208000020016 psychiatric disease Diseases 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 206010061928 radiculitis Diseases 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 208000002574 reactive arthritis Diseases 0.000 description 1
- 230000009257 reactivity Effects 0.000 description 1
- 210000005084 renal tissue Anatomy 0.000 description 1
- 201000010384 renal tubular acidosis Diseases 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 201000003068 rheumatic fever Diseases 0.000 description 1
- 208000004124 rheumatic heart disease Diseases 0.000 description 1
- 206010039073 rheumatoid arthritis Diseases 0.000 description 1
- 238000002702 ribosome display Methods 0.000 description 1
- 201000000980 schizophrenia Diseases 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 238000010187 selection method Methods 0.000 description 1
- 229960002718 selenomethionine Drugs 0.000 description 1
- 231100000879 sensorineural hearing loss Toxicity 0.000 description 1
- 208000023573 sensorineural hearing loss disease Diseases 0.000 description 1
- 150000003355 serines Chemical class 0.000 description 1
- 125000003607 serino group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C(O[H])([H])[H] 0.000 description 1
- 210000002027 skeletal muscle Anatomy 0.000 description 1
- 210000000813 small intestine Anatomy 0.000 description 1
- 150000003384 small molecules Chemical class 0.000 description 1
- 210000000278 spinal cord Anatomy 0.000 description 1
- 230000002269 spontaneous effect Effects 0.000 description 1
- 230000000087 stabilizing effect Effects 0.000 description 1
- 238000007920 subcutaneous administration Methods 0.000 description 1
- 235000000346 sugar Nutrition 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 229910052717 sulfur Inorganic materials 0.000 description 1
- 239000011593 sulfur Substances 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 230000001225 therapeutic effect Effects 0.000 description 1
- 230000002537 thrombolytic effect Effects 0.000 description 1
- 210000002105 tongue Anatomy 0.000 description 1
- 210000003437 trachea Anatomy 0.000 description 1
- 230000008733 trauma Effects 0.000 description 1
- 208000009999 tuberous sclerosis Diseases 0.000 description 1
- 210000004881 tumor cell Anatomy 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 150000003668 tyrosines Chemical class 0.000 description 1
- 125000001493 tyrosinyl group Chemical group [H]OC1=C([H])C([H])=C(C([H])=C1[H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- 210000002700 urine Anatomy 0.000 description 1
- 210000001215 vagina Anatomy 0.000 description 1
- 208000027185 varicose disease Diseases 0.000 description 1
- 230000002792 vascular Effects 0.000 description 1
- 201000011531 vascular cancer Diseases 0.000 description 1
- 206010055031 vascular neoplasm Diseases 0.000 description 1
- 201000010044 viral meningitis Diseases 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P35/00—Antineoplastic agents
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/001—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof by chemical synthesis
Definitions
- This invention relates to a class of binding proteins with a range of binding specificities and affinities based upon variation at select amino acid positions within a scaffold.
- the variable positions may be readily modified to produce a variety of binding proteins with different binding specificities and affinities.
- This range of proteins may be screened to identify one or more as binding a target molecule of interest.
- Compositions comprising the binding proteins, as well as methods of using the binding proteins, are also provided.
- the amino acid sequence of a protein determines its secondary, tertiary, and quaternary structure to result in the protein's final three-dimensional (3D) shape.
- the shape and functional groups (side chains) of the amino acids therein define the protein's function.
- the portion of the protein responsible for the binding activity (the binding domain) must either be exposed, or be capable of being exposed, on an accessible surface of the protein facing the exterior solvent, so that it can interact with a binding target.
- the amino acid residues of the binding domain must be varied.
- the “variable region” or binding domain includes six loops clustered in space.
- the loops provide the 6 complementarity determining regions (CDRs) and are contained in two polypeptides, a heavy chain and a light chain, each carrying 3 CDRs (H1, H2, and H3 of the heavy chain and L1, L2, and L3 of the light chain).
- the amino acid residues of the variable regions orient the CDRs toward the exterior solvent environment to permit their interaction with an antigen.
- High sequence variability of the amino acid residues of the CDRs allows immunoglobulins as a class to bind a large variety of antigens.
- the CDRs and non-CDR portion of the variable region form an immunoglobulin fold to determine the structure of the loops and thereby maintain the overall structure of the immunoglobulin variable region, with proper orientation of the CDRs.
- To diversify the binding functionality of a binding protein and thus promote recognition of members of a diverse population of target molecules, amino acid variability is necessary. Interactions between a binding protein and its target molecule (the ligand) are usually non-covalent and yet often very tight (high affinity or avidity) and specific. The intermolecular interactions are defined by the amino acid residues of the protein's binding domain, which form a surface that fits hand-in-glove onto the surface of the ligand being bound. The two contacting surfaces must have complementarity via hydrogen bonding (at times mediated by a water molecule), charge interactions, alignment of attracting dipoles, hydrophobic-to-hydrophobic (van der Waals) interactions, and/or protrusions fitting with depressions.
- the binding domain is presented within the context of the framework made up by the rest of the immunoglobulin molecule.
- the framework, generally referred to as the immunoglobulin fold, forms the scaffold of the protein structure and functions to correctly present the binding domain.
- the framework restrains the 3D shape of the protein so that the amino acid residues of the binding domain are positioned in a manner to create the accessible specific binding site.
- the present invention is related to the discovery of a diversity-generating retroelement (DGR) belonging to a Bordetella bacteriophage.
- DGR diversity-generating retroelement
- the DGR has recently been shown to be capable of producing massive, targeted amino acid sequence variation in the phage's receptor-binding protein, the major tropism determinant (Mtd). See Liu, M. et al. “Reverse transcriptase-mediated tropism switching in Bordetella bacteriophage.” Science 295, 2091-4 (2002); Liu, M. et al.
- the Bordetella phage DGR utilizes a single copy of mtd followed by a nearly identical (90%), 134-bp direct repeat of the 3′ end of mtd (see FIG. 1 herein). Genetic information in this direct repeat, called the template repeat (TR) due to its invariance, is converted into a cDNA altered by random insertion of A, G, C, or T specifically at sites occupied by adenines in TR through the action of a DGR-encoded reverse transcriptase.
- TR template repeat
- the mutagenized sequence is then substituted into the variable region (VR) of mtd by a process known as mutagenic homing, thereby producing an Mtd variant.
- VR variable region
- the effect of the resulting amino acid variation in VR is to alter the binding specificity of Mtd and consequently host tropism for the phage.
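The flow described above (adenine-specific mutagenesis of a copy of TR, followed by substitution of that copy into VR) can be illustrated with a minimal Python sketch. The toy sequences, the `vr_start` offset, and the function names `mutagenize_template` and `home_into_gene` are hypothetical and chosen only for illustration. The sketch also assumes that each adenine is replaced by A, C, G, or T with equal probability, which is a simplification and not a measured property of the DGR-encoded reverse transcriptase.

```python
import random

BASES = "ACGT"


def mutagenize_template(tr: str, rng: random.Random) -> str:
    """Return a copy of TR in which every adenine is replaced by a random base.

    Non-adenine positions are copied unchanged, and TR itself is not modified.
    """
    return "".join(rng.choice(BASES) if base == "A" else base for base in tr)


def home_into_gene(gene: str, vr_start: int, variant: str) -> str:
    """Substitute the mutagenized copy for the VR segment of the gene."""
    vr_end = vr_start + len(variant)
    return gene[:vr_start] + variant + gene[vr_end:]


# Hypothetical toy sequences (not taken from mtd or its template repeat).
tr = "GGCAACTCGAACGGTTAC"
gene = "ATGCCCGGG" + "GGCTCCTCGGACGGTTTC"  # 9-base prefix followed by an 18-base VR

rng = random.Random(0)  # fixed seed so a run can be repeated
variant = mutagenize_template(tr, rng)
new_gene = home_into_gene(gene, vr_start=9, variant=variant)

print("TR      :", tr)
print("variant :", variant)
print("new gene:", new_gene)
```

Because only adenine positions are randomized, every non-adenine base of TR is carried into the variant unchanged, and the sequence upstream of VR is left intact.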
- Bvg-plus tropic phage-1 (BPP-1) infects only Bvg+ Bordetella, the pathogenic phase, since the Mtd-P1 variant expressed by this phage uses as its receptor the Bvg+-specific outer membrane protein, pertactin.
- BPP-1 Bvg-plus tropic phage-1
- When Bordetella encounters an ex vivo environment, it ceases expressing pertactin, becoming Bvg− as it concomitantly becomes resistant to infection by BPP-1 (see Uhl, M. A. & Miller, J. F. “Integration of multiple domains in a two-component sensor protein: the Bordetella pertussis BvgAS phosphorelay.” EMBO J 15, 1028-36 (1996)).
- Mtd variants such as Mtd-M1
- BMP Bvg-minus tropic phage
- Mtd variants such as Mtd-I1
- BIP Bvg-indiscriminant phage
- Mtd variants, such as Mtd-3c, that confer infectivity towards Bvg+ Bordetella but use an unknown receptor instead of pertactin have also been found.
- the molecular protein structure with which Mtd creates diverse receptor-binding sites and tolerates massive sequence variation was not known prior to the present invention.
- Mtd is found on the tails of Bordetella bacteriophage, which number 6 per phage particle. Based upon the discovery described herein, there appear to be 2 Mtd trimers per phage tail, and thereby 12 Mtd trimers per phage particle.
- the invention is based in part on the discovery of the unexpected structures of multiple Mtd variants.
- the basic structure is a pyramid-shaped homotrimer with variable amino acid residues organized along the pyramid base by a C-type lectin (CTL)-fold that creates a discrete receptor-binding site in each of the three monomers.
- the present invention thus provides the use of the CTL-fold, or portion thereof, as a scaffold to orient the side chains of variable amino acid residues toward the external solvent environment.
- the side chains of the variable amino acid residues define, in whole or in part, the three dimensional structure or shape of all or part of the binding site, which is attached to the scaffold through the alpha carbons of each variable amino acid residue.
- the present invention also provides for the use of CTL-folds as a scaffold for massive sequence variation of the variable amino acid residues, and thus the side chains thereof, in the manner exemplified by Bordetella bacteriophage.
- the availability of ~10^13 possible combinations of variable amino acid residue side chains in the binding site provides a highly diverse population of binding proteins with different specificities.
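To give a feel for where combinatorial diversity of this scale comes from, the sketch below lists the amino acids that a single codon can encode when every adenine in it may become any base, using the standard genetic code. The template codon AAC is a hypothetical choice for illustration, and the sketch does not reproduce the derivation of the ~10^13 figure, which the text does not give. The final line prints the unconstrained upper bound for 12 variable positions, 20^12.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def reachable_residues(template_codon: str) -> set:
    """Amino acids encodable when every adenine in the codon may become any base."""
    options = [BASES if base == "A" else base for base in template_codon]
    return {CODON_TABLE["".join(codon)] for codon in product(*options)}

# AAC is a hypothetical template codon chosen for illustration.
aac = reachable_residues("AAC")
print(len(aac), sorted(aac))

# Upper bound if each of 12 positions could hold any of the 20 amino acids.
print(f"{20 ** 12:.2e}")
```

A codon without adenines, such as GGC, stays fixed under this model, which matches the idea that only adenine-containing template positions contribute to variation.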
- the extraordinary diversity available in this localized portion of the binding site provided by the scaffold provides differing shapes and chemical reactivities suitable for binding to and operating on a wide range of target molecules.
- This level of diversity provided to the binding site of a CTL-fold by the present invention is paralleled only by the antigen binding region of immunoglobulins and T cell receptors in the immune system.
- binding proteins of the invention may be produced by modification of a single polypeptide chain to result in a highly diverse population of binding proteins.
- the single chain can be modified via recombinant methods, such as by recombinant use of the elements of the DGR of Bordetella bacteriophage.
- the scaffold, or backbone conformation, present in the CTL-fold has been observed to provide a stable structure for the presentation of a binding site.
- the CTL-fold has closely spaced N and C termini which are opposite the binding site of the fold.
- the invention provides for the use of the CTL-fold to present a binding site with variable residues that may be varied without compromising the maintenance of the structural integrity of the CTL-fold.
- the scaffold structure includes stabilization of loops in the binding site by two inserts and trimeric intertwining as well as other structures contributing to the CTL fold.
- the scaffold is similarly stabilized by the structures present in the scaffold, such as, but not limited to, the presence of disulfide bridges that contribute to the integrity of the CTL fold.
- the CTL-fold, therefore, provides a stable, highly tolerant scaffold for combinatorial display of the side chains of variable amino acid residues used to form all or part of a binding site.
- the availability of a scaffold to present diverse binding sites permits the generation of binding proteins with different specificities and affinities for binding a wide number of different target molecules, particularly biomolecules.
- the binding proteins may be used to bind, and thus detect, identify, localize or modify, such target molecules.
- the invention thus provides, in one aspect, for a protein scaffold comprising a variable binding site comprising the amino acid sequence
- the scaffold serves as a framework to present variable amino acid residues, the side chains of which form the binding site of the protein.
- the scaffold is derived from, and forms all or part of, a CTL-fold which displays or exposes the binding site to the external solvent environment.
- the invention includes the above sequence (wherein SEQ ID NO:1 constitutes all or part of the binding site of the scaffold) in a non-Mtd, CTL-fold as the scaffold.
- the scaffold may optionally be conjugated to another polypeptide or other molecule through residues distant from the binding site.
- the invention also provides a binding protein comprising a scaffold as described above.
- the binding specificity of the protein is determined by the variable binding site, and the protein comprises a scaffold comprising the amino acid sequence
- the side chains of the variable (Xaa) residues may form the whole of the binding site where no other side chains of the protein contribute to binding interactions with a target molecule bound by the protein.
- other side chains of the protein, such as those of other amino acid residues in the scaffold or superscaffold, may contribute to the binding interactions with a target molecule.
- the side chains of the variable residues only compose part of the binding site of the protein.
- non-limiting examples of a target molecule include a viral antigen, a bacterial antigen, a fungal antigen, an enzyme, an enzyme inhibitor, a cell surface molecule of any composition, a reporter molecule, a serum protein, and a receptor.
- where the target molecule is a viral antigen, it may be, but is not limited to, a polypeptide required for replication.
- the binding sites of the invention, like immunoglobulin binding sites, recognize proteins (including native, denatured, and proteolytic forms thereof as well as conformational determinants thereof); nucleic acids; polysaccharides (alone or as modifications on another molecule, such as a protein); lipids; and small chemical molecules (like haptens in the case of an antibody).
- the scaffold is extended at the Xaa 1 end by all or part of the sequence -Ala-Ala-Leu-Phe-Gly-Gly- (SEQ ID NO:2), wherein the extension may be by 1, 2, 3, 4, 5, or all 6 of the consecutive amino acid residues of SEQ ID NO:2 linked to Xaa 1 via the carboxyl end of the last Gly residue in SEQ ID NO:2.
- the scaffold is extended at the Xaa 12 end by all or part of the sequence -Gly-Ala-Arg-Gly-Val-Cys-Asp-His-Leu-Ile-Leu-Glu (SEQ ID NO:3), wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to Xaa 12 via the amino end of the first Gly residue in SEQ ID NO:3.
- the scaffold may also be extended at both ends by any combination of the above extensions at Xaa 1 and Xaa 12 followed by further optional extensions. Where all 12 amino acids of SEQ ID NO:3 are present in a scaffold, preferred embodiments of the invention have no further extension at the C terminus by additional amino acid residues.
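The extension rules above can be enumerated with a short sketch. It interprets an N-terminal extension as the last k residues of SEQ ID NO:2 (so that the final Gly is always the residue joined to Xaa 1) and a C-terminal extension as the first k residues of SEQ ID NO:3 (so that the first Gly is always joined to Xaa 12). The one-letter transcriptions, the function names, and the `[core]` placeholder standing in for the scaffold sequence are assumptions for illustration.

```python
SEQ_ID_NO_2 = "AALFGG"        # -Ala-Ala-Leu-Phe-Gly-Gly-
SEQ_ID_NO_3 = "GARGVCDHLILE"  # -Gly-Ala-Arg-Gly-Val-Cys-Asp-His-Leu-Ile-Leu-Glu

def n_terminal_extensions(seq):
    """No extension, then the last 1..len(seq) consecutive residues."""
    return [seq[len(seq) - k:] for k in range(len(seq) + 1)]

def c_terminal_extensions(seq):
    """No extension, then the first 1..len(seq) consecutive residues."""
    return [seq[:k] for k in range(len(seq) + 1)]

def extended_scaffolds(core):
    """Every combination of an N-terminal and a C-terminal extension."""
    return [
        n_ext + core + c_ext
        for n_ext in n_terminal_extensions(SEQ_ID_NO_2)
        for c_ext in c_terminal_extensions(SEQ_ID_NO_3)
    ]

# "[core]" is a hypothetical stand-in for the scaffold sequence itself.
CORE_PLACEHOLDER = "[core]"
variants = extended_scaffolds(CORE_PLACEHOLDER)
print(len(variants))
print(variants[0], variants[-1])
```

Counting the unextended case at each end, this gives 7 × 13 = 91 combinations before any further optional extensions are considered.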
- the superscaffold is composed of additional amino acids attached to a scaffold of the invention without adverse effect on the binding site contained therein.
- a binding protein of the invention is thus preferably composed of a binding site within a scaffold which is attached to a superscaffold.
- the superscaffold is composed of amino acids associated with the scaffold in naturally occurring sources of the scaffold, such as in naturally occurring polypeptides with a CTL-fold.
- the scaffold may be grafted onto a heterologous superscaffold, such as the superscaffold of another CTL-fold containing polypeptide, analogous to the grafting of mouse antibody CDRs onto a human antibody framework.
- Amino acid residues of the superscaffold may also serve to permit conjugation of the binding protein to another molecule.
- the superscaffold may be a polypeptide linker as a non-limiting example.
- the polypeptide linker may be of differing lengths and compositions.
- the superscaffold may also optionally constitute or comprise a dimerization or multimerization domain which permits organization of more than one scaffold in three dimensional space without covalent linkage, or optionally through one or more disulfide bonds in addition to non-covalent interactions.
- the superscaffold may be a linker molecule or linker polypeptide which covalently links a scaffold to another molecule, such as a second scaffold, which may be the same or different from the first scaffold.
- the superscaffold may comprise a transmembrane region or domain capable of tethering the scaffold in a lipid bilayer, such as at a cell surface.
- the superscaffold may be another protein molecule to form a fusion protein comprising a scaffold of the invention.
- a further aspect of the invention provides additional scaffolds and binding proteins comprising them.
- the scaffold is a CTL-fold containing a region with one or more variable residues, which region starts at the end of the β3 strand (or with the last residue thereof) and continues through any intervening secondary structures until, but preferably not including, the non-solvent exposed residues of, or before the start of, the β5 strand.
- the scaffold may comprise a variable region represented by the sequence
- sequences are optionally extended by all or part of SEQ ID NO:2, wherein the extension may be by 1, 2, 3, 4, 5, or all 6 of the consecutive amino acid residues therein linked to Xaa 1 via the carboxyl end of the last Gly residue in SEQ ID NO:2.
- these sequences are also optionally extended by all or part of SEQ ID NO:3, wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to the C terminal Xaa via the amino end of the first Gly residue in SEQ ID NO:3.
- the sequences may also be extended at both ends by any combination of the above extensions at Xaa 1 and Xaa 23 followed by further optional extensions. Where all 12 amino acids of SEQ ID NO:3 are present, preferred embodiments of the invention have no further extension at the C terminus.
- SEQ ID NO:4 containing sequences are preferably part of a scaffold as found in the CTL-fold portion of Mtd.
- the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
- the scaffold may comprise a cyanobacterium derived variable region represented by
- sequences are optionally extended by all or part of SEQ ID NO:2, wherein the extension may be by 1, 2, 3, 4, 5, or all 6 of the consecutive amino acid residues therein linked to Xaa 1 via the carboxyl end of the last Gly residue in SEQ ID NO:2.
- these sequences are also optionally extended by all or part of SEQ ID NO:3, wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to the C terminal Xaa via the amino end of the first Gly residue in SEQ ID NO:3.
- the sequence is extended at the C terminus by all or part of -Gly-Phe-Arg-Leu-Val-Ser-Phe-Pro-Pro-Arg-Thr-Leu-Glu- (SEQ ID NO:6), -Gly-Phe-Arg-Leu-Val-Ser-Phe-Pro-Pro-Arg-Thr-Pro-Glu- (SEQ ID NO:7), -Gly-Phe-Arg-Val-Val-Cys-Ala-Phe-Gly-Arg-Ile-Leu-Gln- (SEQ ID NO:8), or -Gly-Phe-Arg-Val-Val-Cys-Ala-Phe-Gly-Arg-Thr-Phe-Gln- (SEQ ID NO:9), wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or all 13 of the consecutive amino acid residues in any one of SEQ ID NOs:6-9 linked to the C terminal Xaa via the amino end of the first Gly residue in that sequence.
- the C terminus extension may also be by -Gly-Phe-Arg-Val-Ile-Ser-Ser-Ser-Pro-Val-Val-Ser-Gly-Phe-His-Ser- (SEQ ID NO:10), wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or all 16 of the consecutive amino acid residues linked to the C terminal Xaa via the amino end of the first Gly residue in SEQ ID NO:10; or by -Gly-Cys-Arg-Val-Val-Val-Val-Arg-Gly-Arg-Leu-Ser- (SEQ ID NO:11), wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to the C terminal Xaa via the amino end of the first Gly residue in SEQ ID NO:11.
- sequences may also be extended at both ends by any combination of the above extensions at Xaa 1 and Xaa 21 (or Xaa 22, Xaa 23, or Xaa 24) followed by further optional extensions. Where all the amino acids of any of SEQ ID NOs:3 or 6-11 are present, preferred embodiments of the invention have no further extension at the C terminus.
- SEQ ID NO:5 containing sequences are preferably part of a scaffold as found in the CTL-fold of a protein containing a cyanobacterium amino acid sequence as shown in FIG. 5.
- Those cyanobacterium CTL-fold containing proteins are from Trichodesmium erythraeum (preferably T.e. 1A, T.e. 1B, or T.e. 2); Nostoc sp. PCC 7120 (preferably N. PCC. 1, N. PCC. 2A, or N. PCC. 2B); or Nostoc punctiforme (preferably N.p. 1 or N.p. 2), and share both protein-level homology (as indicated in FIG. 5) and genetic similarity, because the coding regions for the proteins contain a corresponding TR.
- the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
- the invention also provides a Treponema denticola derived variable region comprising a sequence represented by
- sequence is optionally extended at the C terminus Leu by one or more residues in -Gly-Phe-Arg-Leu-Ala-Cys-Arg-Pro (SEQ ID NO:13) wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, or all 8 of the consecutive amino acid residues linked to the C terminal Leu via the amino end of the first Gly residue in SEQ ID NO:13. Where all 8 amino acids of SEQ ID NO:13 are present, preferred embodiments of the invention have no further extension at the C terminus.
- SEQ ID NO:12 containing sequences are preferably part of a scaffold as found in the CTL-fold of a Treponema denticola protein containing the corresponding T.d. amino acid sequence in FIG. 5.
- the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
- the invention further provides a scaffold comprising another phage derived variable region represented by
- sequence is optionally extended at the Xaa 12 end by one or more residues in -Gly-Phe-Arg-Pro-Ala-Phe-Phe-Val (SEQ ID NO:15) wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, or all 8 of the consecutive amino acid residues linked to Xaa 12 via the amino end of the first Gly residue in SEQ ID NO:15. Where all 8 amino acids of SEQ ID NO:15 are present, preferred embodiments of the invention have no further extension at the C terminus.
- SEQ ID NO:14 containing sequences are preferably part of a scaffold as found in the CTL-fold of a Vibrio harveyi ML phage protein (ORF35 encoded protein) containing the corresponding V.h. ML amino acid sequence in FIG. 5.
- the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
- the invention also provides a scaffold comprising a Bifidobacterium longum derived variable region represented by
- sequence is optionally extended at the Xaa 12 end by one or more residues in -Gly-Gly-Arg-Leu-Ser-Ala-Leu-Gly-Arg-Thr-Lys-Ala (SEQ ID NO:17) wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to Xaa 12 via the amino end of the first Gly residue in SEQ ID NO:17. Where all 12 amino acids of SEQ ID NO:17 are present, preferred embodiments of the invention have no further extension at the C terminus.
- SEQ ID NO:16 containing sequences are preferably part of a scaffold as found in the CTL-fold of a Bifidobacterium longum protein containing the corresponding B.l. amino acid sequence in FIG. 5.
- the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
- the invention also provides a scaffold comprising a Bacteroides thetaiotaomicron derived variable region represented by
- sequence is optionally extended at the Xaa 17 end by one or more residues in -Arg-Ala-Cys-Gly-Phe-Gly-Leu-Arg-Ser-Ser-Gln-Glu (SEQ ID NO:19) wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to Xaa 17 via the amino end of the first Arg residue in SEQ ID NO:19. Where all 12 amino acids of SEQ ID NO:19 are present, preferred embodiments of the invention have no further extension at the C terminus.
- SEQ ID NO:18 containing sequences are preferably part of a scaffold as found in the CTL-fold of a Bacteroides thetaiotaomicron protein containing the corresponding B.t. amino acid sequence in FIG. 5.
- the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
- the invention provides for the use of the region between the β3 and β5 strands of a CTL-fold as a variable region in which amino acids may be altered to produce novel binding sites with different specificities and avidities.
- the nucleic acid sequence encoding the CTL-fold of a CTL-fold containing protein may be operably linked to a template region (TR), and an IMH as needed, wherein the TR corresponds to all or part of the binding site in the CTL-fold and contains adenine residues that direct changes in the amino acid sequence of the binding site, and thus variable region, as described herein.
- Preferred embodiments of the invention include CTL-fold encoding nucleic acids with the Mtd IMH, or a functional fragment thereof, to direct alterations in the VR based on adenine residues in the functionally linked TR.
- a scaffold in a binding protein of the invention is preferably all or part of a CTL-fold that correctly orients the binding site contained therein.
- CTL-folds include that in Mtd as described herein as well as those classified as C-type lectin-like domains (CTLDs) and divergent CTLDs.
- Preferred regions of the CTL-fold in Mtd are residues 171-381 and residues 306-381 of SEQ ID NO:20.
- For residues 171-381, the size is analogous to that of recombinant single chain antibodies composed of a single variable domain (VHH), which remains a stable polypeptide with the antigen binding capability of the original variable region of the heavy chain (see Nanobodies™ by Ablynx).
- VHH are based on the antibodies lacking light chains that are found in Camelidae (camels and llamas).
- For residues 306-381, at least one region composed of residues 171-199, residues 237-263, residues 200-236, or residues 264-305 is preferably present in the fold as well. Particularly preferred is the presence of any two, any three, or all four of these regions.
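The combinations of regions named above can be listed with a short sketch. The residue ranges are taken from the text; the variable and function names are hypothetical. The sketch also checks one arithmetic point that follows from the stated ranges: the four optional regions tile residues 171-305 without gaps, so all four together with residues 306-381 span the same 211 residues as the 171-381 region.

```python
from itertools import combinations

CORE_REGION = (306, 381)
OPTIONAL_REGIONS = [(171, 199), (200, 236), (237, 263), (264, 305)]

def region_length(region):
    """Number of residues in an inclusive (start, end) range."""
    start, end = region
    return end - start + 1

def region_sets():
    """Yield the core region plus any one, two, three, or all four optional regions."""
    for count in range(1, len(OPTIONAL_REGIONS) + 1):
        for combo in combinations(OPTIONAL_REGIONS, count):
            yield combo + (CORE_REGION,)

all_sets = list(region_sets())
print(len(all_sets))  # number of distinct region combinations

largest = max(all_sets, key=len)
print(sum(region_length(r) for r in largest), region_length((171, 381)))
```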
- CTLD examples include those that bind Ca²⁺, such as carbohydrate recognition domains (CRDs), C-type lectin domains (which bind sugars), coagulation factor binding proteins, and IgE Fc receptor.
- Divergent CTLD examples include type II antifreeze proteins, oxidized LDL receptor, phospholipase receptors, and NK cell receptors (which bind MHC ligands).
- Other non-limiting examples include link protein modules, endostatin, and intimin.
- the CTL-fold is bacterial (including bacterial phages), human or mammalian in origin.
- Non-limiting examples include the selectins (see Lasky (1995) Annu. Rev. Biochem., 64:113-139), including E-selectin, L-selectin and P-selectin; mannose binding protein (MBP), including MBP-A and MBP-C; the natural killer (NK) receptor NKG2D; CD69; eosinophilic major basic protein (EMBP); tumour necrosis factor-stimulated gene-6 product (TSG-6); enteropathogenic E. coli (EPEC) intimin (the D3 domain therein is a CTL-fold); and Yersinia pseudotuberculosis invasin (the D5 domain is a CTL-fold).
- the side chains of the Xaa residues in the above sequences form a binding site, in whole or in part.
- N and C terminal ends of the sequences are optional amino acid sequences, or one of the ends is —H (a covalently bonded hydrogen atom), such as those that form a CTL-fold containing the binding site displayed in a solvent exposed portion of the fold.
- SEQ ID NO:21 containing sequences are preferably part of a scaffold as found in the CTL-fold of an MBP protein, preferably with a collagenous domain.
- the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
- a selectin derived variable region of the invention is represented by
- the side chains of the Xaa residues in the above sequences form a binding site, in whole or in part.
- N and C terminal ends of the sequences are optional amino acid sequences, or one of the ends is —H (a covalently bonded hydrogen atom), such as those that form a CTL-fold containing the binding site displayed in a solvent exposed portion of the fold.
- SEQ ID NO:22 containing sequences are preferably part of a scaffold as found in the CTL-fold of a selectin protein.
- the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
- the invention provides nucleic acid molecules, or polynucleotides, encoding the scaffolds and binding proteins as described herein.
- the nucleic acids or polynucleotides may be part of a nucleic acid vector or plasmid, optionally in a cell, preferably suitable for expression of the encoded protein.
- the scaffold is preferably all or part of a variable region (VR) in the nucleic acid molecule which is operably linked to an initiation of mutagenic homing (IMH) sequence and a template region (TR) as described below.
- nucleic acid molecules encoding the CTL-folds described above, but which do not have an operably linked IMH and/or TR components may be modified to be a nucleic acid molecule of the invention by attachment of the necessary functional nucleic acid components.
- the invention also provides a plurality, or library, of scaffolds or binding proteins as well as methods for their production.
- the invention provides a method of producing a plurality of scaffolds or proteins with different binding specificities, comprising expressing and replicating a nucleic acid molecule or polynucleotide encoding a scaffold or binding protein of the invention in a cell under conditions of mutagenic homing wherein said TR directs mutagenesis of variable residues within the variable region (VR) containing the scaffold.
- non-limiting examples of a plurality or library of scaffolds or binding proteins include those expressed as a phage display, ribosome display, polysome display, or cell surface display as well as those presented in an array or microarray format.
- the plurality is expressed as part of the tail fibers of Bordetella bacteriophages.
- the resultant plurality or library of scaffolds or binding proteins may be screened for binding against a target molecule of interest.
- the invention provides a method of selecting for binding comprising producing or providing a plurality, or library, of scaffolds or proteins in a plurality of cells as described above followed by selecting proteins which bind a molecule of interest after individually contacting each of said plurality of scaffolds or proteins (or phage particles, cells, or media containing them) with a target molecule of interest.
- the binding proteins in the plurality or library are in dimeric or other multimeric form.
- the invention also provides for identifying a multimeric form of a binding protein as having a greater avidity for the target molecule of interest than a monomeric form of the protein.
- the plurality or library of scaffolds or binding proteins may be screened for binding to any one of a multiplicity of target molecules as an additional method of the invention.
- the scaffolds or proteins may be contacted with multiple molecules, followed by selection and isolation of those scaffolds or proteins that bind at least one of the target molecules.
- the multiple target molecules may be in a mixture or disposed on an array or microarray as non-limiting examples. Other such examples include multiple molecules in or on a cell or tissue as well as multiple molecules immobilized on a solid support.
- the target molecules are preferably polypeptides, optionally modified by glycosylation, phosphorylation, or other post-translational modification; carbohydrates; lipids; or complex combinations thereof.
- the target molecules may be expressed on the exterior of phage or a virus, or a viable or non-viable cell of any phylum.
- the plurality or library of scaffolds or binding proteins is expressed on the exterior of phage, such as Bordetella bacteriophage.
- the invention provides methods of selecting for binding against a target ligand or molecule of interest by use of the plurality or library of phage particles.
- the plurality, or library, is provided and contacted with a target ligand or molecule of interest followed by selection of phage which bind the ligand or molecule, optionally by removal of phage which do not bind.
- the selected phage particles may be propagated followed by one or more additional rounds of contacting and selection, optionally under more stringent wash conditions, to “enrich” for phage expressing a scaffold or binding protein with greater affinity or avidity.
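The iterative selection described above can be pictured with a toy sketch. It assumes each phage variant carries a single hypothetical integer affinity score and that each round's wash is modelled as a threshold that rises by a fixed step; propagation between rounds is not modelled. None of the names or numbers come from the patent.

```python
def pan(library, rounds=3, base_threshold=3, step=2):
    """Keep variants whose score meets a threshold that rises each round.

    `library` maps a variant name to a hypothetical affinity score.
    Returns the surviving variants and the survivors after every round.
    """
    selected = dict(library)
    history = []
    for round_index in range(rounds):
        threshold = base_threshold + step * round_index  # more stringent wash
        selected = {
            name: score for name, score in selected.items() if score >= threshold
        }
        history.append(sorted(selected))
    return selected, history

# Hypothetical library; scores are illustrative only.
toy_library = {"variant_A": 1, "variant_B": 4, "variant_C": 6, "variant_D": 9}
enriched, history = pan(toy_library)
for round_number, survivors in enumerate(history, start=1):
    print(round_number, survivors)
```

Each round narrows the pool, so the variants left at the end are those that tolerate the most stringent conditions in this simplified model.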
- the polynucleotide encoding the scaffold or binding protein may be isolated from the selected phage and analyzed (e.g. sequenced), amplified or propagated to produce the scaffold or binding protein.
- the phage may have been expressing the protein in dimeric, trimeric or other multimeric form.
- Such selected phage may be used as sources of genes or gene fragments encoding binding protein molecules with the desired specificity and avidity.
- the selection methods of the invention may further include an additional determination of the scaffold or binding proteins, selected as described above, as binding or not binding to a second molecule. Scaffolds or binding proteins that bind a second molecule would be identified as non-specific for the target ligand or molecule of interest, while those that do not bind a second molecule would be identified as specific for the target ligand or molecule of interest relative to the second molecule.
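The specificity determination above amounts to splitting the selected binders by whether they also bind a second molecule. A minimal sketch follows, with hypothetical binder names and a hypothetical predicate `binds_second` standing in for the experimental readout.

```python
def classify_specificity(selected, binds_second):
    """Split selected binders by whether they also bind a second molecule.

    Binders that bind the second molecule are treated as non-specific;
    the rest are specific relative to that second molecule.
    """
    specific = [name for name in selected if not binds_second(name)]
    non_specific = [name for name in selected if binds_second(name)]
    return specific, non_specific

# Hypothetical binders already selected against the target of interest,
# and a hypothetical set of those that also bound the second molecule.
selected_binders = ["binder_1", "binder_2", "binder_3"]
cross_reactive = {"binder_2"}

specific, non_specific = classify_specificity(
    selected_binders, lambda name: name in cross_reactive
)
print(specific, non_specific)
```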
- the scaffolds and binding proteins of the invention may also be modified, such as by attachment of another moiety thereto.
- non-limiting examples of a moiety for attachment include a detectable label, a toxin, or an activatable pro-drug.
- Modified scaffolds and binding proteins may be used to target a cell which is bound thereby.
- a detectably labeled modified scaffold or binding protein may be used to detect a cell expressing a molecule bound by the binding site of the scaffold or protein. The molecule may be expressed on the cell surface, such that the scaffold or binding protein binds the exterior of the cell.
- the molecule may also be expressed within the cell, wherein the scaffold or binding protein binds after introduction into the interior of the cell, such as, but not limited to, cases where the cells have been permeabilized.
- Non-limiting examples of cells that may be detected include both prokaryotic and eukaryotic cells, including bacterial cells and higher eukaryotic cells from a multicellular organism.
- a modified scaffold or binding protein attached to a toxin, or pro-drug form thereof may be used to decrease the viability of, or to kill, cells which express a cell surface molecule bound by the modified scaffold or protein.
- the cells are cancer cells, such as those of a mammal, preferably a human.
- compositions comprising the scaffolds and binding proteins of the invention are provided.
- the compositions may be used for the practice of the methods disclosed herein, including diagnostic, prophylactic or therapeutic applications. Additionally, compositions comprising the nucleic acid molecules and polypeptides disclosed herein as well as materials for the expression thereof are provided. These compositions may be provided in the form of a kit for the expression and production of the scaffolds and proteins of the invention.
- FIG. 1 shows the organization of the Bordetella phage DGR containing a single copy of Mtd with its VR followed by a nearly identical (90%), 134-bp direct repeat of the VR called the template repeat (TR), which is invariant among Mtd variants.
- the amino acid sequence of VR in each of the five Mtd variants is shown in the upper box, together with the predicted amino acid sequence encoded by the corresponding nucleotide triplets of the TR in the lower box.
- the region corresponding to the initiation of mutagenic homing (IMH) sequence is underlined.
- FIG. 2A shows two representations of the intertwined, pyramid-shaped trimer structure of several Mtd variants.
- FIG. 2B shows a representation of an Mtd monomer and three domains therein: β-prism, intermediate domain containing the β-sandwich, and C-type lectin (CTL)-fold including the VR and the region corresponding to the IMH.
- FIG. 2C is a schematic showing regions of secondary structure in Mtd.
- FIG. 3A shows a representation of an Mtd CTL-fold.
- FIG. 3B shows a representation of 12 variable residues which are almost all solvent-exposed and organized into a receptor-binding site on the external face of the Mtd β2β3β4β4′ sheet.
- FIG. 3C shows a structural comparison of Mtd-P1, -3c, -M1, -I1, and -N1 used to determine that the main chain conformation of the CTL domain is remarkably consistent, despite half of the variable residues being on loop regions.
- FIG. 3D shows a representation of Serine-270 (S270) and Glutamate-267 (E267) from the second insert in the Mtd CTL-fold forming hydrogen bonds to the invariant VR residues Serine-351 (S351) and Serine-353 (S353), respectively, within the binding region.
- FIG. 3E shows that the β2β3 loop from one monomer hydrogen bonds to the invariant VR residue Arginine-354 (R354) and to main chain (scaffold) atoms of VR.
- FIG. 4 shows by means of molecular surface representations that Mtd-P1 (BPP-1) and Mtd-I1 (BIP-1) have highly hydrophobic binding sites, and that the continuity of the hydrophobic surface decreases successively for Mtd-3c (BPP-3), -M1 (BMP-1), and -N1 (BNP).
- the view is looking onto the base of pyramid-shaped Mtd, that is, the surface that binds the exposed binding surface of the target molecule.
- the variable amino acid residues are numbered on the surface of BPP-1.
- variable and invariant hydrophobic amino acid residues (Ala, Val, Leu, Ile, Phe, Tyr, Trp, and Met) are in green and yellow, respectively; and variable and invariant hydrophilic amino acid residues (Ser, Thr, Asn, Gln, Asp, Glu, His, Lys, Arg, and Cys) are in red and pink, respectively.
- the surface denoted ‘Invariant’ shows, using the same coloring scheme, the hydrophobic and hydrophilic surface surrounding the variable portion of the binding sites.
- FIG. 5 shows the structure-based sequence alignment of the β2β3β4β4′ sheet of the CTL-fold in Mtd-P1 and 12 variable proteins of putative DGRs, as discussed herein.
- Residues colored light gray correspond to variable residues in Mtd, and to those residues found to differ between VR and TR in genomic sequences of the other 12 proteins.
- Residues colored dark gray are those that could vary by an adenine-directed mechanism in these other proteins.
- Magenta corresponds to identical residues and yellow to residues conserved in chemical character. In assigning color, the grays take precedence over magenta and yellow, such that certain putatively variable residues are also identical or conserved.
- the 12 variable proteins of putative DGRs are from Vibrio harveyi ML phage (V.h. ML); Bifidobacterium longum (B.l.); Bacteroides thetaiotaomicron (B.t.); Treponema denticola (T.d.); Trichodesmium erythraeum 1A (T.e. 1A); Trichodesmium erythraeum 1B (T.e. 1B); Trichodesmium erythraeum #2 (T.e. 2); Nostoc sp. PCC 7120 #1 (N. PCC. 1); Nostoc sp. PCC 7120 #2A (N. PCC. 2A); Nostoc sp. PCC 7120 #2B (N. PCC. 2B); Nostoc punctiforme #1 (N.p. 1); and Nostoc punctiforme #2 (N.p. 2).
- This invention is based in part on X-ray crystal structures of four Mtd variants, each competent to promote infectivity and each having a different receptor specificity (Mtd-P1, -3c, -M1, and -I1).
- the structure of a fifth Mtd variant from a non-infective phage was also determined.
- the 1.5 Å resolution structure of Mtd-P1 was determined by multiwavelength anomalous dispersion using seleno-methionine substituted protein, and structures of other Mtd variants were determined by molecular replacement.
- the overall structures of these variants are nearly identical, indicating sequence variation within the VR causes no large conformational shifts.
- the Mtd variants are all seen to form an intertwined, pyramid-shaped trimer ( FIG. 2A ).
- the dimensions of the trimer correspond roughly to the size of knobs seen on the ends of Bordetella phage tail fibers (see Liu, M. et al. Genomic and genetic analysis of Bordetella bacteriophages encoding reverse transcriptase-mediated tropism-switching cassettes. J Bacteriol 186, 1503-17 (2004)).
- the extensive trimer interface buries more than 4,500 Å² of surface area in each monomer, consistent with an obligatory trimer and with trimeric association observed by static light scattering.
- the majority (69%) of the interface area is composed of non-polar residues.
- Each polypeptide is also joined to its neighbor via 20 hydrogen bonds, one electrostatic interaction (between Glu-234 and Arg-354), and at least one shared cation (magnesium or calcium at Phe-313 carbonyl).
- Mtd is composed of three domains (see FIG. 2B ).
- the N-terminal domains (residues 1-48) of each of the three monomers form a threefold symmetric β-prism, with each monomer contributing a four-stranded, antiparallel β-sheet flanked by a short α-helix.
- the β-prism is structurally similar to the pseudo-threefold symmetric β-prisms observed in monocot lectins (rmsd 2.4 Å, 60 Cα atoms, see Hester, G., Kaku, H., et al.
- the β-prism domain of each Mtd monomer is joined to the following intermediate domain by a short 3₁₀-helix (residues 49-54), which intertwines with equivalent 3₁₀-helices from other monomers. These connections cross such that the β-prism domain occupies a different face of the pyramid than the other domains.
- In contrast to the intimate trimeric association of the β-prism domain, the intermediate domain (residues 56-170) splays away from the trimer axis and makes little contact with other monomers.
- the intermediate domain is formed by an elaborated β-sandwich containing three- and four-stranded antiparallel sheets, with the three-stranded sheet making a near right-angle turn near its middle (see FIG. 2B ).
- the structure of the intermediate domain appears to constitute a novel fold.
- the N-terminal β-prism or intermediate β-sandwich domains are theorized to permit association of the individual monomers with each other as well as being possibly involved in tethering Mtd to the surface of Bordetella phage.
- the superscaffold of the proteins of the invention may thus include all or part of one or both of the β-prism and intermediate domains of Mtd, where the Mtd CTL-fold contains one scaffold of the invention. These superscaffold domains may be used to arrange and display the binding site of a scaffold of the invention as described herein.
- the Mtd C-terminal domain (residues 171-381), which constitutes more than half of Mtd and contains the VR, is unexpectedly found to have a C-type lectin (CTL)-fold (see Weis, W. I., et al. Structure of the calcium-dependent lectin domain from a rat mannose-binding protein determined by MAD phasing. Science 254, 1608-15 (1991); Drickamer, K. C-type lectin-like domains. Curr Opin Struct Biol 9, 585-90 (1999); and Holm, L. et al. Protein structure comparison by alignment of distance matrices. J Mol Biol 233, 123-38. (1993)). See FIG. 3A .
- MMBP mammalian mannose binding protein
- Mtd has no obvious amino acid sequence relationship to other convergently evolved CTL domains, such as the E. coli virulence factor intimin, but does have structural similarity as expected (rmsd 1.8 Å, 75 Cα atoms).
- the typical distinguishing features of the ~110-130 residue CTL-fold, as also seen in Mtd, are a two-stranded antiparallel β-sheet formed by the domain's N- and C-termini (β1β5) connected by two α-helices to a three-stranded, antiparallel β-sheet (β2β3β4), see FIG. 3A .
- These features are also generally present in other CTL-folds, which range from about 95 to about 150 residues, described herein for use in the practice of the invention.
- the β2 strand is uniquely twisted in Mtd such that it crosses over the β3 strand.
- Unique to Mtd are inserts (residues 200-236 and 264-305) that interrupt connections between β1 and α1 and between α2 and β2, respectively, as well as some additional short strands (β0 and β4′).
- the inserts have no regular secondary structure but do have specific conformations due to an extensive hydrogen bonding network, including to residues within the binding site. Without being bound by theory, and offered to advance the understanding of the present invention, it is possible that the inserts stabilize the VR as discussed below.
- the Mtd CTL-fold, and other analogous CTL-folds of similar structural arrangement, may be used as a scaffold in the practice of the present invention.
- the Mtd CTL-fold contains 12 residues that are variable.
- the 12 variable residues are almost all solvent-exposed and organized into a receptor-binding site on the external face of the β2β3β4β4′ sheet ( FIG. 3B ).
- This face is equivalent to the one in the CTL-fold proteins Ly49A (see Tormo, J., et al. Crystal structure of a lectin-like natural killer cell receptor bound to its MHC class I ligand. Nature 402, 623-31 (1999)) and intimin (Luo, Y. et al. Crystal structure of enteropathogenic Escherichia coli intimin-receptor complex. Nature 405, 1073-7 (2000); and Batchelor, M. et al.
- the variable residues, except for 348 and 369, are encoded by AAC codons in TR.
- Adenine-directed mutagenesis permits substitution of Asn encoded by AAC with 14 other residues, which cover the gamut of chemical character. For example, while adenine substitution of AAC cannot produce a codon for Trp, it can produce codons for Phe and Tyr. Likewise, while substitution cannot produce codons for Glu and Lys, it can produce codons for Asp and Arg (also His). Significantly, the use of the AAC codon rules out a nonsense codon being introduced.
- Adenine-substitution of the two non-AAC codons in TR, ACG encoding Thr-348 and ATC encoding Ile-369 can produce three other amino acids (Ser, Pro, Ala at 348; Val, Leu, Phe at 369).
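The residue counts stated for the AAC, ACG, and ATC codons can be checked by enumeration. What follows is a minimal sketch, assuming the standard genetic code and assuming that each adenine in a TR codon may be replaced by any of the four bases while non-adenine positions stay fixed. The function name `adenine_variants` is hypothetical and is not part of the disclosure.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (one-letter amino acid codes; '*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def adenine_variants(codon):
    """Return the set of residues reachable when every adenine in `codon`
    may become any base and all other positions are left unchanged.

    Assumption: this models adenine-directed mutagenesis as unrestricted
    substitution at adenine positions only.
    """
    options = [BASES if base == "A" else base for base in codon]
    return {CODON_TABLE["".join(bases)] for bases in product(*options)}


if __name__ == "__main__":
    aac = adenine_variants("AAC")
    print(len(aac - {"N"}))                  # 14 residues other than Asn
    print("*" in aac)                        # False: no stop codon is reachable
    print(sorted(adenine_variants("ACG")))   # ['A', 'P', 'S', 'T']
    print(sorted(adenine_variants("ATC")))   # ['F', 'I', 'L', 'V']
```

Because every codon derived from AAC ends in cytosine, and the three stop codons of the standard code end in adenine or guanine, no nonsense codon can arise under this model.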
- residue 348 is preferably hydrophobic to pack between the invariant residues Trp-307 and Trp-309 ( FIG. 3B ).
- the binding site in Mtd contains four invariant, solvent-exposed aromatic residues that are likely to contribute to interactions despite their status as amino acid residues of a scaffold as described herein. These are Trp-307 and Trp-345 at the center and periphery, respectively, of the binding site. Also at the periphery are the invariant residues Tyr-322 and Tyr-333, which come from the intertwining of an adjacent monomer's β2β3 loop into a neighbor's binding site ( FIG. 3B ). Altogether, the binding site including the variable and above invariant residues in Mtd-P1 presents ~900 Å² of exposed surface area.
- amino acids may be grouped based upon the similarities of their side chains and substituted for each other on this basis.
- a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine.
- the invention provides for the “conservative substitution” of one amino acid residue in a group by another amino acid residue in the same group.
- Other conservative amino acid substitution groups include, but are not limited to, valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
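The side-chain groups listed above lend themselves to a simple lookup. The sketch below is illustrative only: the dictionary name `SIDE_CHAIN_GROUPS` and the function `is_conservative` are hypothetical, and the groups are limited to those named in the text (no acidic group is listed there, so none is assumed here).

```python
# Side-chain groups exactly as enumerated in the text (three-letter codes).
# The additional pairings named in the text (Val-Leu-Ile, Phe-Tyr, Lys-Arg,
# Ala-Val, Asn-Gln) are all subsets of these groups.
SIDE_CHAIN_GROUPS = {
    "aliphatic": {"Gly", "Ala", "Val", "Leu", "Ile"},
    "aliphatic-hydroxyl": {"Ser", "Thr"},
    "amide-containing": {"Asn", "Gln"},
    "aromatic": {"Phe", "Tyr", "Trp"},
    "basic": {"Lys", "Arg", "His"},
    "sulfur-containing": {"Cys", "Met"},
}


def is_conservative(original, replacement):
    """Return True when two different residues share a listed side-chain group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in group and replacement in group
        for group in SIDE_CHAIN_GROUPS.values()
    )


if __name__ == "__main__":
    print(is_conservative("Val", "Leu"))  # True: both aliphatic
    print(is_conservative("Phe", "Lys"))  # False: aromatic vs. basic
```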
- the final portion of VR is encoded by the ‘initiation of mutagenic homing’ (IMH) sequence, which maintains the unidirectional flow of mutagenized genetic information from TR to VR.
- This region of VR is unaffected by adenine-directed mutagenesis and therefore invariant.
- Invariance at the nucleotide level is echoed at the protein level among Mtd variants, with β5 making close intra- and inter-molecular contacts within the central core of the trimer that would be potentially disrupted by variation.
- the IMH-encoded β5 strand of the protein may be part of a superscaffold as described herein while the nucleic acid encoding the β5 strand, or a portion thereof, serves as the IMH, which maintains the unidirectional flow of diversity generating information from TR to VR.
- the present invention provides a binding protein comprising a scaffold for presentation of a binding site with variable residues as described herein.
- the scaffolds and binding proteins of the invention may be substituted for antibodies, and antigen binding fragments thereof, or other affinity agents in detection or other affinity-based assays or in therapeutics as known in the art.
- the scaffold comprises all or part of a CTLD, the Mtd CTL-fold, or an Mtd-like CTL-fold.
- the scaffold would permit possible variation at one or more of the 12 variable residues described herein.
- the scaffold comprises all or part of another CTL-fold, including those of microbial proteins as described herein (see FIG. 5 and Example 3) as well as those of a selectin; MBP; NKG2D; CD69; EMBP; TSG-6; and intimin as described herein.
- by “binding site” it is meant the side chains of variable residues which define, in whole or in part, the three dimensional structure or shape which permits binding of the polypeptide attached to the side chains (through the alpha carbons of each variable residue) to a target molecule.
- a scaffold is a polypeptide which functionally presents the binding site defining variable residues (contained in said polypeptide) to interact with a target molecule bound by the binding site.
- Scaffolds of the invention that contain a binding site that is functionally presented to bind a target molecule are thus analogous to a Fv region of an antibody molecule and so may be used in analogous ways.
- a scaffold of the invention may be conjugated to another molecule as described herein, such as to form a fusion protein or to form a labeled scaffold.
- the scaffolds of the invention may also be viewed as comprising a variable region which contains a binding site of the invention.
- the relationship between a binding site, and thus a scaffold or binding protein of the invention, and a “target molecule” as used herein may also be described as the relationship between the members of a binding pair, wherein one member of the pair has an area on its surface or in a portion thereof which binds to the other member of the pair.
- the relationship may also be described as that between members of a specific binding pair, wherein one member of the pair has an area on its surface or in a portion thereof which specifically binds to the other member of the pair.
- the members of a pair may be referred to as ligand and anti-ligand (or ligand and receptor), either of which may be the scaffold or binding protein of the invention.
- a scaffold or binding protein of the invention may be viewed as a receptor that binds a ligand as the molecule of interest, or as a ligand that is bound by a receptor as the molecule of interest.
- a scaffold of the invention is at least about 40 amino acid residues.
- the scaffold may also be about 45, about 50, about 55, about 60, about 65, about 70, about 75, about 80, about 85, about 90, about 100, about 110, about 120, about 130, about 140, about 150, about 160, about 170, about 180, about 190, about 200, about 220, or about 230 or more amino acid residues.
- the scaffold in a binding protein of the invention is also preferably in the C-terminal half of the protein. More preferred is where the scaffold is within about 100, about 75, about 50, about 40, about 30, about 20, or about 10 amino acid residues of the C-terminus of the protein.
- the amino acid sequences that form the superscaffold are preferably those of non-CTL-fold regions naturally occurring in association with a CTL-fold.
- One non-limiting example is residues 1-170 of Mtd (SEQ ID NO:20).
- Other non-limiting examples include the oligomerization domains described by Drickamer (Ibid), including α-helical domains of mannose-binding protein (MBP), which domains form trimeric coiled coils; the β strand from the N terminus of the MBP CRD, optionally with the C-terminal β strand of the CRD and the C-terminal end of helix α2, which dimerize MBP when the α-helical coiled coil domain is absent; the N-terminal β strands of the Polyandrocarpa lectin, optionally with helix α2; loops from factors IX and X which permit the formation of a “head to head” interaction between two CTLDs with optional stabilization by an interchain
- the resultant multimers may be homomultimers, composed of scaffolds with the same binding activity, or heteromultimers, composed of scaffolds with more than one binding activity.
- the invention provides for homodimers, heterodimers, homotrimers, heterotrimers, as well as higher orders of homomeric and heteromeric proteins.
- Further non-limiting examples include the transmembrane and domains D0, D1, and/or D2 of EPEC intimin as well as the four Ig-like domains (D1-D4) of Y. pseudotuberculosis invasin.
- the binding proteins of the invention are thus made up of at least a scaffold containing a binding site as described herein.
- This combination may be non-naturally occurring in the sense that the binding site may be part of a variable region derived from a first CTL-fold that is inserted into the corresponding region of a second, and different, CTL-fold.
- the Mtd based binding site may be inserted in place of the corresponding region between the β3 and β5 strands of another CTL-fold as described herein.
- the binding proteins of the invention may thus be considered “recombinant”.
- Additional “recombinant” binding proteins include those comprising a superscaffold attached to the scaffold wherein the superscaffold is not derived from the same protein as the scaffold.
- the polypeptide sequence of the superscaffold is preferably that attached to a CTL-fold containing protein described herein.
- Further “recombinant” binding proteins include the multimeric forms of a superscaffold containing binding protein wherein the subunits of the multimeric form may be the same (to result in a homomultimer) or different (to result in a heteromultimer).
- a scaffold or binding protein of the invention is not an isolated form of a naturally occurring polypeptide, where isolated refers to a state of being substantially removed from, preferably entirely removed from, other polypeptides or biomolecules that are normally found with a naturally occurring polypeptide.
- a naturally occurring polypeptide is one produced by a living organism in the absence of manipulation or modification by human intervention.
- examples of human intervention include recombinant DNA methodology, mutagenesis by chemical or physical means, inhibition of DNA repair, or manipulation of genetics.
- the binding proteins of the invention are preferably recombinant proteins or otherwise the result of human intervention.
- a scaffold or binding protein produced by the recombinant methods described herein is not a naturally occurring polypeptide.
- “recombinant” refers to the alteration of a native nucleic acid or protein, or to modification by the introduction of a heterologous nucleic acid or protein, via human intervention.
- the term may refer to a cell derived from a cell so modified.
- recombinant cells express genes that are not found within the native (nonrecombinant) form of the cell or express native genes in an unnaturally overexpressed, under-expressed, or not expressed state.
- Preferred embodiments of the invention thus do not include naturally occurring Mtd proteins, such as those with SEQ ID NO:20 (Mtd-P1 or Bordetella phage BPP-1) or variations thereof having the amino acid sequences of Mtd-P3c, Mtd-M1, Mtd-I1, or Mtd-U1.
- Naturally occurring selectins; MBPs; NKG2D; CD69; EMBP; TSG-6; and intimin as well as naturally occurring sequences of CTL-fold containing proteins from Vibrio harveyi ML phage (V.h. ML); Bifidobacterium longum (B.l); Bacteroides thetaiotaomicron (B.t); Treponema denticola (T.d.); Trichodesmium erythraeum 1A (T.e. 1A); Trichodesmium erythraeum 1B (T.e. 1B); Trichodesmium erythraeum #2 (T.e. 2); Nostoc PCC ssp. 7120 #1 (N. PCC. 1); Nostoc PCC ssp. 7120 #2A (N. PCC. 2A); Nostoc PCC ssp. 7120 #2B (N. PCC.
- the invention also provides polynucleotides encoding the scaffolds and binding proteins described herein.
- the polynucleotides are preferably operably linked to a regulatory nucleic acid sequence that controls or regulates the expression of the coding polynucleotide in a cell or cell extract.
- a regulatory sequence refers to a region or sequence located upstream and/or downstream from the start of transcription that is involved in recognition and binding of RNA polymerase and other proteins to initiate transcription.
- the term includes a promoter for regulating start of transcription.
- the polynucleotide may be part of a vector or plasmid used to propagate or amplify the polynucleotide. Where the polynucleotide is operably linked to a regulatory nucleic acid sequence, presence in a vector or plasmid permits the expression of the encoded scaffold or binding protein. This permits production and isolation of large quantities of a scaffold or binding protein of the invention.
- the polynucleotide and regulatory sequence are operably linked to other sequences to form a diversity-generating retroelement (DGR) as described herein such that the variable residues of the binding site in the scaffold or binding protein may be readily diversified via a DGR.
- this aspect of the invention is advantageously applied to other CTL-folds and the binding sites contained therein where the region between the β3 and β5 strands is not a variable region until operably linked to a TR (and an IMH if necessary), as well as any other necessary components in cis or in trans, like reverse transcriptase activity as a non-limiting example, wherein the TR directs alterations of amino acid residues of the binding site, and thus variable region, as described herein.
- this means to create alterations in the binding site is limited by adenine directed mutagenesis as described herein.
- the invention also contemplates the use of traditional mutagenesis techniques for altering the binding specificity of the region between the β3 and β5 strands of a CTL-fold as described herein.
- the polynucleotide may also be part of a phage or bacterial genome and expressed on the surface of phage or bacteria.
- DGR as used herein includes the use of mutagenic homing wherein an IMH directs mutagenesis of variable residues within the variable region (VR) of a scaffold or binding protein of the invention through a functionally linked TR, which directs alterations of nucleotide residues in the VR based on the locations of adenine residues at corresponding positions in the related TR sequence, as well as any other necessary components in cis or in trans, like reverse transcriptase activity as a non-limiting example.
- DGR advantageously permits use of the phage or bacteria to form a library expressing a heterogeneous population of encoded scaffolds or binding proteins on the surfaces of individual organisms.
- population refers to a plurality of heterogeneous members which have similarities but at least two of which have different binding sites as described herein.
- a diversified population of phage may be used in a method to identify a scaffold or binding protein as binding to a target molecule of interest.
- target molecules include a cell surface molecule, optionally of a cancer cell, an epithelial cell, an endothelial cell, and a bacterial or fungal cell surface molecule.
- the scaffold or binding protein is expressed as part of the tail fiber in a bacteriophage particle.
- Such a method may comprise expressing a population of scaffolds or binding proteins on the surfaces of members of a library of phage particles (including as part of the tail fibers), of bacteria or of other cells; contacting the members of the library with a target molecule of interest, optionally immobilized; removing members that do not bind to the target; and selecting the library member(s) that bind the target molecule of interest.
- the selected members can be propagated to form another library of members for an additional round of screening or selection using the above method. This permits the enrichment of library member(s) that bind the target of interest and also provides a means to verify the selected member(s) as binding the target.
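The enrichment achieved by repeated rounds of selection and propagation can be illustrated with simple arithmetic. The sketch below is a toy model and not a procedure from the disclosure: it assumes that, in each round, a fixed fraction of target-binding members and a (smaller) fixed fraction of non-binding members survive the wash step, and that propagation preserves their proportions. The function name `enrich` and all parameter names and values are hypothetical.

```python
def enrich(fraction, retain_binder, retain_background, rounds):
    """Track the fraction of target-binding library members across rounds.

    fraction          -- starting fraction of members that bind the target (0..1)
    retain_binder     -- assumed per-round probability that a binder is kept
    retain_background -- assumed per-round probability that a non-binder is kept
    rounds            -- number of selection/propagation rounds

    Returns a list with the binder fraction before the first round and
    after each subsequent round.
    """
    history = [fraction]
    for _ in range(rounds):
        kept_binders = fraction * retain_binder
        kept_background = (1.0 - fraction) * retain_background
        # Renormalise: propagation is assumed to amplify all survivors equally.
        fraction = kept_binders / (kept_binders + kept_background)
        history.append(fraction)
    return history


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the trend.
    for round_number, value in enumerate(enrich(0.001, 0.5, 0.005, 3)):
        print(round_number, round(value, 4))
```

Under these assumptions the odds of drawing a binder are multiplied by `retain_binder / retain_background` in each round, which is why a few rounds can suffice when the wash step discriminates well.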
- the method further comprises isolating polynucleotides from the selected member(s).
- the phage library members are one form of a plurality, or family, of scaffolds or binding proteins of the invention.
- a selected or identified scaffold or binding protein may also be “evolved” by a variation of the above to select for enhanced binding to the same ligand or binding to a different ligand.
- One method for evolving a previously identified or selected scaffold or binding protein is to provide a polynucleotide encoding the scaffold or binding protein, allow it to undergo diversification as described herein to produce a library of variants; and select for a member of the library with enhanced binding to the same target molecule or with “gain of function” binding to another target molecule.
- the scaffold or binding protein will display specific binding affinity for a particular target, optionally with the functionality of blocking the binding of one or more other molecules to the target molecule.
- the scaffold or binding protein may also be able to stimulate or inhibit a metabolic pathway, to act as a signal or messenger, or to stimulate or inhibit cellular activity.
- a scaffold or binding protein can thus be used as an antagonist, an agonist, as well as a modulator of a cell surface ligand function.
- a scaffold or binding protein for an “orphan” receptor to which no natural ligand is known may also be generated.
- a scaffold or binding protein of the invention binds to a target molecule better by at least about 2×, more preferably about 5× or about 10×, than binding to background molecules that are present or used as non-specific control targets.
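The fold-over-background criterion can be expressed as a small helper. This is a sketch under stated assumptions: the text does not specify the assay or the units, so `target_signal` and `background_signal` are hypothetical measurements of binding in any consistent unit, and `specificity_tier` is an illustrative name.

```python
def specificity_tier(target_signal, background_signal):
    """Return the highest fold threshold (10, 5, or 2) met by the ratio of
    target binding to background binding, or None if the ratio is below 2.

    Both signals are assumed to be positive measurements in the same unit.
    """
    if background_signal <= 0:
        raise ValueError("background_signal must be positive")
    ratio = target_signal / background_signal
    for threshold in (10, 5, 2):  # check the most stringent tier first
        if ratio >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    print(specificity_tier(50.0, 10.0))  # 5
    print(specificity_tier(15.0, 10.0))  # None
```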
- the scaffolds and binding proteins of the invention may also be modified, such as by attachment of another moiety thereto.
- the moiety may be a label, optionally a detectable label, including a directly detectable label such as a radioactive isotope, a fluorescent label (Cy3 and Cy5 as non-limiting examples) or a particulate label.
- particulate labels include latex particles and colloidal gold particles.
- the label may be for indirect detection.
- Non-limiting examples include an enzyme, such as, but not limited to, luciferase, alkaline phosphatase, and horseradish peroxidase.
- Non-limiting examples include a molecule bound by another molecule, such as, but not limited to, biotin, the Fc portion of an antibody, an affinity peptide, or a purification tag.
- the label is covalently attached.
- the scaffold or binding protein may also be selected to bind antibodies from specific animals, e.g., goat, rabbit, mouse, etc., for use as a secondary reagent in assays using such antibodies as the primary detection agent.
- a scaffold or binding protein of the invention may be detected directly by use of a reagent that binds thereto.
- Non-limiting examples include an antibody, or functional fragment thereof, that binds a portion of the scaffold without interference of the binding site or that binds a portion of the superscaffold without interfering with the binding site.
- Such an antibody or fragment thereof is preferably labeled for detection as described herein and as known in the art.
- a ligand for a portion of the scaffold or the superscaffold, which binds to a region distinct from, and without interference to, the binding site may be used.
- the ligand is also preferably labeled for detection as provided herein and known in the art.
- Detection of a scaffold or binding protein of the invention may be advantageously used to detect the presence of a target molecule bound by the scaffold or binding protein. Such detection may also be used to detect the presence of a cell that expresses the ligand or molecule.
- Non-limiting detection assays to which the invention may be adapted include flow cytometry and fluorescence microscopy.
- a labeled scaffold or binding protein of the invention which specifically binds human chorionic gonadotropin (hCG), to the exclusion of other factors that are normally found therewith, may be used to detect hCG in human urine samples as an indicator of pregnancy, such as by use of a lateral flow device as known in the art.
- a labeled scaffold or binding protein of the invention may be used to detect a microorganism, such as pathogenic bacteria or fungi by binding to a cell surface molecule specific to the microorganism of interest, relative to other organisms normally found therewith.
- the invention also provides a method of detecting a cell, the method comprising contacting the cell with a scaffold or binding protein of the invention which binds a cell surface molecule specific to the cell and subsequently detecting the bound scaffold or binding protein.
- the cell is a bacterial or fungal cell, particularly pathogenic forms thereof.
- the cell may be associated with a disease or other unwanted condition, including, but not limited to a cancer cell or a virally infected cell.
- the invention provides for the use of a scaffold or binding protein as disclosed herein as a diagnostic agent, either in vitro or in vivo, based on its ability to bind to a tissue or disease associated target molecule.
- Tissue associated molecules are those that are expressed exclusively, or at a significantly higher level, in one or more tissue(s) compared to other tissues in an animal.
- Disease associated molecules are those that are expressed exclusively, or at a significantly higher level, in one or more diseased cells, diseased tissues, or bodily fluid in comparison to non-diseased cells, tissues, or fluids in an organism.
- Non-limiting tissue or disease associated molecules are discussed in Tables I and II of U.S. Patent Publication No 2002/0107215.
- Non-limiting examples of tissues where target ligands bound by the scaffolds and binding proteins of the invention may be found include liver, pancreas, adrenal gland, thyroid, salivary gland, pituitary gland, brain, spinal cord, lung, heart, breast, skeletal muscle, bone marrow, thymus, spleen, lymph node, colorectal, stomach, ovarian, small intestine, uterus, placenta, prostate, testis, colon, gastric, bladder, trachea, kidney, and adipose tissue.
- Non-limiting examples of cells and samples include tumor cells, tumor tissue samples, organ cells, blood cells, and cells of the skin, lung, heart, muscle, brain, mucosae, liver, intestine, spleen, stomach, lymphatic system, cervix, vagina, prostate, mouth, and tongue.
- Non-limiting examples of diseases include an autoimmune/inflammatory disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereos
- Exemplary diseases or conditions include, e.g., MS, SLE, ITP, IDDM, MG, CLL, CD, RA, Factor VIII Hemophilia, transplantation, arteriosclerosis, Sjogren's Syndrome, Kawasaki Disease, AHA, ulcerative colitis, multiple myeloma, Glomerulonephritis, seasonal allergies, and IgA Nephropathy; and a cardiovascular disorder such as congestive heart failure, ischemic heart disease, angina pectoris, myocardial infarction, hypertensive heart disease, degenerative valvular heart disease, calcific aortic valve stenosis, congenitally bicuspid aortic valve, mitral annular calcification, mitral valve prolapse, rheumatic fever and rheumatic heart disease, infective endocarditis, nonbacterial thrombotic endocarditis, endocarditis of systemic lupus erythematosus, carcinoid heart disease,
- a scaffold or binding protein is conjugated, optionally through a linker, to a toxin, pro-drug, or other molecule (e.g., a protein, nucleic acid, organic small molecule, etc.) suitable for use as a pharmaceutical or therapeutic agent.
- proteins include cytokines, chemokines, growth factors, interleukins, cell-surface proteins, extracellular domains, cell surface receptors, and cytotoxins.
- the conjugated scaffold or binding protein delivers the attached molecule to a location bound by the binding site of the scaffold or binding protein.
- Such forms of the invention may be used in a method of decreasing the viability of a cell, preferably a disease associated cell, such as a cancer cell or virally infected cell.
- the invention provides a method of targeting a cell expressing a cell surface molecule by use of a scaffold or binding protein of the invention. Such a method comprises contacting said cell with a scaffold or binding protein of the invention which binds said cell surface molecule.
- the scaffold or binding protein is one which preferably binds an external cell surface molecule of the cell with sufficient specificity to minimize undesirable binding to non-cancer cells.
- the scaffold or binding protein is one which preferably binds a viral antigen expressed on the external cell surface of an infected cell with sufficient specificity to minimize undesirable binding to non-infected cells.
- the invention also provides a method of decreasing the viability of a cell, said method comprising covalently linking a cellular toxin or pro-drug to a scaffold or binding protein of the invention and contacting the linked scaffold or binding protein with a cell comprising a cell surface molecule bound by the scaffold or binding protein to decrease the viability of the cell.
- the cell is a cancer cell, expressing a cell surface marker specific to the cancer cell as described above.
- the cell is a virally infected cell, expressing a viral antigen, on the cell surface, that is specific to virally infected cells as described above.
- the invention provides for the selection of a scaffold or binding protein which binds a cell surface molecule such that the binding of one or multiple scaffolds or binding proteins to the cell through the molecule triggers, or is sufficient to activate, a cell death program in the bound cell.
- a scaffold or binding protein is one that is analogous to Fas ligand or an antibody against Fas which triggers apoptosis of a cell upon binding to Fas expressed on the cell.
- the invention provides for the use of a scaffold or binding protein as disclosed herein as a therapeutic agent for use in the treatment of disease or other unwanted conditions.
- a scaffold or binding protein may be used in the prophylactic treatment of a disease or unwanted condition.
- the treatments of the invention include both in vivo and ex vivo administration.
- the scaffold or binding protein is formulated as a composition comprising a pharmaceutically acceptable excipient, optionally for delayed release (or slow release over time). Sterile formulations of a scaffold or binding protein are also contemplated.
- a scaffold or binding protein is typically administered or transferred directly to the cells to be treated or to the tissue site of interest via intramuscular, intradermal, subdermal, subcutaneous, oral, intraperitoneal, intrathecal, or intravenous procedures.
- a scaffold or binding protein can be placed within a cavity of the body, such as during surgery, or by inhalation, or vaginal or rectal administration.
- the contacted cells are returned or delivered to the site from which they were obtained or to another site in the subject to be treated. The subject need not be that from which the cells were obtained.
- the treated cells may be optionally grafted onto a tissue or organ before being returned or alternatively delivered to the blood or lymph system using standard delivery or transfusion techniques.
- Subjects that may be treated with a scaffold or binding protein of the invention include, but are not limited to, a mammal, including a human, primate, dog, cat, mouse, pig, cow, goat, rabbit, rat, guinea pig, hamster, horse, sheep; or a non-mammalian vertebrate such as a bird (e.g., a chicken or duck), or fish; or an invertebrate.
- compositions comprising a scaffold or binding protein disclosed herein.
- Non-limiting examples include attachment of a scaffold or binding protein to a surface, such as that of a tube, well, or dish; attachment to a matrix of an affinity material; or attachment to beads, a column, a solid support, or a microarray.
- kits comprising agents (like a scaffold or binding protein, or a library of scaffolds or binding proteins, described herein as non-limiting examples) for use in one or more methods as disclosed herein.
- kits, optionally comprising an agent with an identifying description or label or instructions relating to its use in the methods of the present invention, are provided.
- Such a kit may comprise containers, each with one or more of the various reagents (typically in concentrated form) or devices utilized in the methods.
- a set of instructions will also typically be included.
- Standards for calibrating the binding of a scaffold or binding protein to a ligand may also be included in the kits of the invention.
- kits of the invention may comprise one or more reagents for production of a library of scaffolds or binding proteins, such as that embodied in phage particles which express individual members of the library.
- kits may contain vectors, such as initial phage particles, and cells for their propagation and plating as well as expression of scaffolds or binding proteins.
- the inserts form hydrogen bonds to VR, including three to side chains of three invariant serines in VR.
- Ser-270 and Glu-267 from the second insert form hydrogen bonds to the invariant VR residues Ser-351 and Ser-353, respectively (FIG. 3D).
- main chain atoms of the first insert form hydrogen bonds to invariant VR residue Ser-365 (not depicted). These interactions are supplemented by hydrogen bonds between the inserts and main chain (scaffold) atoms of the VR.
- trimeric assembly contributes to stabilizing VR, specifically through contacts from a neighboring monomer's extensive β2β3 loop.
- the β2β3 loop from one monomer contributes not only the aforementioned invariant tyrosines (322 and 333) to a neighbor's binding site (FIG. 3B), but also hydrogen bonds to the invariant VR residue Arg-354 and to main chain (scaffold) atoms of VR (FIG. 3E).
- the β2β3 loop has the same intertwining conformation in all Mtd variants examined, being positioned over invariant residues (i.e., 351-356) in a neighbor's binding site.
- FIG. 4A shows that Mtd-P1 and Mtd-I1 have highly hydrophobic binding sites, and that the continuity of the hydrophobic surface decreases successively for Mtd-3c, -M1, and -N1, with this last one having nine TR-encoded, mostly hydrophilic residues ( FIG. 1 ).
- the binding sites of Mtd-P1 and -I1 accommodate four to five large, exposed hydrophobic residues, and although a preponderance of exposed hydrophobic surface is correlated with protein instability, both Mtd-P1 and -I1 are found to be highly stable proteins.
- the invariant area surrounding the binding site is largely hydrophilic, most likely aiding protein stability.
- Mtd-P1 and the Bordetella receptor pertactin.
- the pertactin ectodomain (Prn-E) was incubated with Mtd variants and found by a coprecipitation assay to associate most strongly with Mtd-P1 but also with Mtd-3c and Mtd-M1.
- Prn-E was not found to associate with Mtd-I1 or Mtd-N1.
- the three Mtd variants that are found to bind pertactin have in common the variable residue Tyr-359, previously shown by sequence comparison to be a consistent determinant for pertactin interaction.
- Despite each monomer providing a discrete binding site, the stoichiometry of association between Mtd and Prn-E is 3:1, as assessed by static light scattering. This may reflect steric occlusion of empty binding sites by elongated pertactin or pseudo-symmetric binding.
- the affinity of Mtd for Prn-E has a K_D of ~3 μM as measured by isothermal titration calorimetry (ITC). Because Bordetella phage has six tail fibers with each fiber appearing to have two Mtd trimers, the affinity is likely to translate to high avidity during infection. The ITC experiment also demonstrated that the endothermic interaction between the two molecules is entropically driven, as would be expected from the hydrophobic binding site of Mtd-P1.
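As a rough illustration of what a micromolar dissociation constant means energetically, the sketch below converts the reported K_D into a standard binding free energy using ΔG° = RT ln(K_D). The temperature (25 °C) and the 1 M standard state are assumptions made for this example, since the measurement conditions are not stated here, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temp_kelvin=298.15):
    """Standard binding free energy in kcal/mol from a dissociation
    constant in mol/L, relative to a 1 M standard state."""
    return R_KCAL * temp_kelvin * math.log(kd_molar)

# K_D of about 3 micromolar, as reported above; 298.15 K is assumed.
dg = binding_free_energy(3e-6)
print(f"Estimated binding free energy: {dg:.2f} kcal/mol")
```

Because ΔG° = ΔH − TΔS, a favorable (negative) ΔG° combined with an endothermic (positive) ΔH requires TΔS to exceed ΔH, which is what "entropically driven" means in this context.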
Landscapes
- Chemical & Material Sciences (AREA)
- Health & Medical Sciences (AREA)
- Organic Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medicinal Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Genetics & Genomics (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Gastroenterology & Hepatology (AREA)
- General Chemical & Material Sciences (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Pharmacology & Pharmacy (AREA)
- Animal Behavior & Ethology (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Peptides Or Proteins (AREA)
Abstract
This invention provides a class of binding proteins with a range of binding specificities and affinities based upon variation at select amino acid positions within a scaffold. The variable positions may be readily modified to produce a library of binding proteins with different binding specificities and affinities. The library may be screened to identify one or more as binding a ligand of interest. Compositions comprising the binding proteins, as well as methods of using the binding proteins are also provided.
Description
- This application is a continuation application of U.S. application Ser. No. 11/027,323 filed Dec. 31, 2004, now pending. The disclosure of the prior application is considered part of and is incorporated by reference in the disclosure of this application.
- This invention was made with government support under Grant Nos. T32 GM008326, F31 AI061840 and F32 AI49695 awarded by the National Institutes of Health. The government has certain rights in the invention.
- 1. Field of the Invention
- This invention relates to a class of binding proteins with a range of binding specificities and affinities based upon variation at select amino acid positions within a scaffold. The variable positions may be readily modified to produce a variety of binding proteins with different binding specificities and affinities. This range of proteins may be screened to identify one or more as binding a target molecule of interest. Compositions comprising the binding proteins, as well as methods of using the binding proteins are also provided.
- 2. Background Information
- The amino acid sequence of a protein determines its secondary, tertiary, and quaternary structure to result in the protein's final three-dimensional (3D) shape. The shape and functional groups (side chains) of the amino acids therein define the protein's function. In the case of a binding protein, the portion of the protein responsible for the binding activity (binding domain) must either be exposed, or be capable of being exposed, on an accessible surface of the protein exposed to the exterior solvent to provide for possible interaction with a binding target. Thus to vary the binding activity, the amino acid residues of the binding domain must be varied.
- With an immunoglobulin as an example of a familiar binding protein with specificity and affinity, the “variable region” or binding domain includes six loops clustered in space. The loops provide the 6 complementarity determining regions (CDRs) and are contained in two polypeptides, a heavy chain and a light chain, each carrying 3 CDRs (H1, H2, and H3 of the heavy chain and L1, L2, and L3 of the light chain). The amino acid residues of the variable regions orient the CDRs toward the exterior solvent environment to permit their interaction with an antigen. High sequence variability of the amino acid residues of the CDRs allows immunoglobulins as a class to bind a large variety of antigens. The CDRs and non-CDR portion of the variable region form an immunoglobulin fold to determine the structure of the loops and thereby maintain the overall structure of the immunoglobulin variable region, with proper orientation of the CDRs.
- But variability in the sequence of a protein, like an immunoglobulin, is often limited by the effects of variability on protein folding and the resulting final 3D shape. Amino acid residues with side chains that are not exposed to the exterior solvent are often limited in variability because as part of the protein's interior they must “fit” within the interior space as dictated by other amino acid residues. The protein can tolerate greater variability in residues with side chains oriented toward, and exposed to, the exterior solvent, given that they do not have to “fit” into an interior space constrained by other residues.
- To diversify the binding functionality of a binding protein and thus promote recognition of members of a diverse population of target molecules, amino acid variability is necessary. Interactions between a binding protein and its target molecule (the ligand) are usually non-covalent and yet often very tight (high affinity or avidity) and specific. The intermolecular interactions are defined by the amino acid residues of the protein's binding domain which form a surface that fits “hand-in-glove” like onto the surface of the ligand being bound. The two contacting surfaces must have complementarity via hydrogen bonding (at times mediated by a water molecule), charge interactions, alignment of attracting dipoles, hydrophobic to hydrophobic (van der Waals) interactions, and/or protrusions fitting with depressions.
- In the example of an immunoglobulin, the binding domain is presented within the context of the framework made up by the rest of the immunoglobulin molecule. The framework, generally referred to as the immunoglobulin fold, forms the scaffold of the protein structure and functions to correctly present the binding domain. The framework restrains the 3D shape of the protein so that the amino acid residues of the binding domain are positioned in a manner to create the accessible specific binding site.
- The usefulness of immunoglobulins as manipulable binding proteins is limited, however, by the nature of the immunoglobulin framework, which requires two polypeptides to form the complete ligand- or antigen-binding site. This results in a number of disadvantages: the need to manipulate rather large polypeptides; the need for complicated molecular cloning to diversify a binding site; and the complication of modifying six different CDRs. The consequences of these disadvantages include constraints on using phage display (see, for example, U.S. Pat. Nos. 5,223,409 and 5,571,698) to diversify immunoglobulins for the purpose of creating new binding or other functional activities.
- A number of attempts have been made to overcome the limitations of immunoglobulins. These include the use of a CTL4-like sandwich architecture as a framework for presenting randomized peptide sequences (see WO 00/60070); the use of fibronectin type III domains (see U.S. Pat. No. 6,818,418); the use of an “anticalin” (see WO 99/16873 and Beste et al. Proc. Natl. Acad. Sci., USA 96:1898-1903 (1999)); and even the use of single chain antibodies, optionally with a CH3 domain of an immunoglobulin to permit spontaneous dimerization.
- Citation of documents herein is not intended as an admission that any is pertinent prior art. All statements as to the date or representation as to the contents of documents are based on the information available to the applicant and do not constitute any admission as to the correctness of the dates or contents of the documents.
- The present invention is related to the discovery of a diversity-generating retroelement (DGR) belonging to a Bordetella bacteriophage. The DGR has recently been shown to be capable of producing massive, targeted amino acid sequence variation in the phage's receptor-binding protein, the major tropism determinant (Mtd). See Liu, M. et al. “Reverse transcriptase-mediated tropism switching in Bordetella bacteriophage.” Science 295, 2091-4 (2002); Liu, M. et al. “Genomic and genetic analysis of Bordetella bacteriophages encoding reverse transcriptase-mediated tropism-switching cassettes.” J Bacteriol 186, 1503-17 (2004); and Doulatov, S. et al. “Tropism switching in Bordetella bacteriophage defines a family of diversity-generating retroelements.” Nature 431, 476-81 (2004). This genetically programmed diversity, with ~10^13 different Mtd sequences possible, is rivaled in scale only by antibodies (immunoglobulins) and T cell receptors in the immune system (see Davis, M. M. & Bjorkman, P. J. “T-cell antigen receptor genes and T-cell recognition.” Nature 334, 395-402 (1988)).
- As noted above, whereas the immune system requires variability in numerous gene segments to achieve antigen-binding diversity, the Bordetella phage DGR utilizes a single copy of mtd followed by a nearly identical (90%), 134-bp direct repeat of the 3′ end of mtd (see FIG. 1 herein). Genetic information in this direct repeat, called the template repeat (TR) due to its invariance, is converted into a cDNA altered by random insertion of A, G, C, or T specifically at sites occupied by adenines in TR through the action of a DGR-encoded reverse transcriptase. The mutagenized sequence is then substituted into the variable region (VR) of mtd by a process known as mutagenic homing, thereby producing an Mtd variant. Due to the adenine dependency of the mutagenic process mediated by the DGR reverse transcriptase, 12 amino acid residues in VR, encoded by codons corresponding to nucleotide triplets in TR with adenine residues at non-wobble positions, are subject to variation at high frequency. The effect of the resulting amino acid variation in VR is to alter the binding specificity of Mtd and consequently the host tropism of the phage. These alterations are crucial to the phage's survival because its host, Bordetella, undergoes phase variation under different environmental conditions, and the expression patterns of bacterial cell surface receptors, such as pertactin, change with the phase. For example, Bvg-plus tropic phage-1 (BPP-1) infects only Bvg+ Bordetella, the pathogenic phase, since the Mtd-P1 variant expressed by this phage uses as its receptor the Bvg+-specific outer membrane protein, pertactin. When Bordetella encounters an ex vivo environment, it ceases expressing pertactin, becoming Bvg− as it concomitantly becomes resistant to infection by BPP-1 (see Uhl, M. A. & Miller, J. F. “Integration of multiple domains in a two-component sensor protein: the Bordetella pertussis BvgAS phosphorelay.” EMBO J 15, 1028-36 (1996)).
- However, the phage counters by producing Mtd variants, such as Mtd-M1, that use unknown receptors expressed exclusively by Bvg− Bordetella, thereby creating Bvg-minus tropic phage (BMP). Alternatively, Mtd variants, such as Mtd-I1, are produced that infect through unknown receptors expressed by both phases of Bordetella, thereby creating Bvg-indiscriminate phage (BIP). Mtd variants, such as Mtd-3c, that confer infectivity towards Bvg+ Bordetella but use an unknown receptor instead of pertactin have also been found. The molecular protein structure with which Mtd creates diverse receptor-binding sites and tolerates massive sequence variation was not known prior to the present invention.
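The adenine-dependent mutagenesis described above can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the codon `AAC` and the short `tr_fragment` are hypothetical inputs chosen for the example (not the actual TR sequence), the replacement base is drawn uniformly at random, and the function names are invented here. It uses the standard genetic code to list which amino acids become reachable when every adenine in a template codon may be replaced by any of the four bases.

```python
import random
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered T, C, A, G at each position.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def accessible_residues(template_codon):
    """Amino acids reachable when each A in the template codon
    may be copied as A, C, G, or T; other bases are kept as-is."""
    choices = [BASES if base == "A" else base for base in template_codon]
    return {CODON_TABLE["".join(codon)] for codon in product(*choices)}

def mutagenize(template, rng):
    """Copy a template, replacing each adenine with a random base
    (uniform choice is an assumption of this sketch)."""
    return "".join(rng.choice(BASES) if base == "A" else base
                   for base in template)

# A codon with adenines at both non-wobble positions (illustrative choice).
print(sorted(accessible_residues("AAC")))

tr_fragment = "AACGGCTCTAAC"  # hypothetical fragment, not the real TR
print(mutagenize(tr_fragment, random.Random(0)))
```

In this sketch a codon with adenines at its first two positions and a fixed wobble base can encode many different residues, whereas codons without adenines are copied unchanged, which mirrors how variation is confined to a defined subset of VR positions.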
- Mtd is found on the tails of Bordetella bacteriophage, which number 6 per phage particle. Based upon the discovery described herein, there appear to be 2 Mtd trimers per phage tail, and thereby 12 Mtd trimers per phage particle.
- The invention is based in part on the discovery of the unexpected structures of multiple Mtd variants. The basic structure is a pyramid-shaped homotrimer with variable amino acid residues organized along the pyramid base by a C-type lectin (CTL)-fold that creates a discrete receptor-binding site in each of the three monomers. The present invention thus provides the use of the CTL-fold, or portion thereof, as a scaffold to orient the side chains of variable amino acid residues toward the external solvent environment. The side chains of the variable amino acid residues define, in whole or in part, the three dimensional structure or shape of all or part of the binding site, which is attached to the scaffold through the alpha carbons of each variable amino acid residue.
- The present invention also provides for the use of CTL-folds as a scaffold for massive sequence variation of the variable amino acid residues, and thus the side chains thereof, in the manner exemplified by Bordetella bacteriophage. The availability of ~10^13 possible combinations of variable amino acid residue side chains in the binding site provides a highly diverse population of binding proteins with different specificities. The extraordinary diversity available in this localized portion of the binding site provided by the scaffold provides differing shapes and chemical reactivities suitable for binding to and operating on a wide range of target molecules. This level of diversity provided to the binding site of a CTL-fold by the present invention is paralleled only by the antigen binding region of immunoglobulins and T cell receptors in the immune system. But unlike those examples, the binding proteins of the invention may be produced by modification of a single polypeptide chain to result in a highly diverse population of binding proteins. The single chain can be modified via recombinant methods, such as by recombinant use of the elements of the DGR of Bordetella bacteriophage.
- The scaffold, or backbone conformation, present in the CTL-fold has been observed to provide a stable structure for the presentation of a binding site. As noted by Kogelberg et al. (Curr. Opin. Structural Biol., 11:635-643, 2001), the CTL-fold has closely spaced N and C termini which are opposite the binding site of the fold. Thus the invention provides for the use of the CTL-fold to present a binding site with variable residues that may be varied without compromising the maintenance of the structural integrity of the CTL-fold. In the case of Mtd, the scaffold structure includes stabilization of loops in the binding site by two inserts and trimeric intertwining as well as other structures contributing to the CTL fold. In the case of other CTL-folds, the scaffold is similarly stabilized by the structures present in the scaffold, such as, but not limited to, the presence of disulfide bridges that contribute to the integrity of the CTL fold. The CTL-fold, therefore, provides a stable, highly tolerant scaffold for combinatorial display of the side chains of variable amino acid residues used to form all or part of a binding site.
- The availability of a scaffold to present diverse binding sites permits the generation of binding proteins with different specificities and affinities for binding a wide number of different target molecules, particularly biomolecules. The binding proteins may be used to bind, and thus detect, identify, localize or modify, such target molecules.
- The invention thus provides, in one aspect, for a protein scaffold comprising a variable binding site comprising the amino acid sequence
- (SEQ ID NO: 1) -Xaa1-Trp-Xaa2-Xaa3-Xaa4-Ser-Xaa5-Ser-Gly-Ser-Arg-Ala-Ala-Xaa6-Trp-Xaa7-Xaa8-Gly-Pro-Ser-Xaa9-Ser-Xaa10-Ala-Xaa11-Xaa12-
- wherein each of Xaa1 to Xaa12 is independently any amino acid residue, the side chains of which form a binding site, in whole or in part.
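To make the layout of fixed and variable positions in SEQ ID NO: 1 concrete, the following sketch translates the three-letter motif into a regular expression over one-letter codes and pulls out the twelve Xaa residues from a candidate sequence. It is a minimal illustration only: the helper names and the `candidate` sequence are hypothetical, and matching a linear motif says nothing about whether a protein actually adopts a CTL-fold.

```python
import re

# One-letter codes for the fixed residues that appear in SEQ ID NO: 1.
THREE_TO_ONE = {"Trp": "W", "Ser": "S", "Gly": "G",
                "Arg": "R", "Ala": "A", "Pro": "P"}

SEQ_ID_NO_1 = ("Xaa1-Trp-Xaa2-Xaa3-Xaa4-Ser-Xaa5-Ser-Gly-Ser-Arg-Ala-Ala-"
               "Xaa6-Trp-Xaa7-Xaa8-Gly-Pro-Ser-Xaa9-Ser-Xaa10-Ala-Xaa11-Xaa12")

def motif_to_regex(motif):
    """Build a regex in which every Xaa becomes a capture group that
    accepts any of the 20 standard amino acids."""
    parts = []
    for token in motif.split("-"):
        if token.startswith("Xaa"):
            parts.append("([ACDEFGHIKLMNPQRSTVWY])")
        else:
            parts.append(THREE_TO_ONE[token])
    return re.compile("".join(parts))

SCAFFOLD_RE = motif_to_regex(SEQ_ID_NO_1)

def variable_residues(protein):
    """Return the 12 variable residues (Xaa1..Xaa12) of the first
    motif match, or None if the motif is absent."""
    match = SCAFFOLD_RE.search(protein)
    return list(match.groups()) if match else None

# Hypothetical sequence made up for this example.
candidate = "MK" + "YWNDTSGSGSRAAFWLHGPSYSQAEV" + "GA"
print(variable_residues(candidate))
```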
- The scaffold serves as a framework to present variable amino acid residues, the side chains of which form the binding site of the protein. Preferably, the scaffold is derived from, and forms all or part of, a CTL-fold which displays or exposes the binding site to the external solvent environment. Thus the invention includes the above sequence (wherein SEQ ID NO:1 constitutes all or part of the binding site of the scaffold) in a non-Mtd, CTL-fold as the scaffold. The scaffold may optionally be conjugated to another polypeptide or other molecule through residues distant from the binding site.
- In another aspect, the invention also provides a binding protein comprising a scaffold as described above. The binding specificity of the protein is determined by the variable binding site, and the protein comprises a scaffold comprising the amino acid sequence
- (SEQ ID NO: 1) -Xaa1-Trp-Xaa2-Xaa3-Xaa4-Ser-Xaa5-Ser-Gly-Ser-Arg-Ala-Ala-Xaa6-Trp-Xaa7-Xaa8-Gly-Pro-Ser-Xaa9-Ser-Xaa10-Ala-Xaa11-Xaa12-
- wherein each of Xaa1 to Xaa12 is independently any amino acid residue, the side chains of which form a binding site, in whole or in part, that determines the binding specificity of the protein; and
- at each of the Xaa1 and Xaa12 ends of the scaffold are amino acid sequences that form a superscaffold which displays said binding site in a solvent exposed portion of the protein, or one of the Xaa1 and Xaa12 ends of the scaffold is —H (a covalently bonded hydrogen atom) and the other end is an amino acid sequence that forms a superscaffold which displays said binding site in a solvent exposed portion of the protein.
- The side chains of the variable (Xaa) residues may form the whole of the binding site where no other side chains of the protein contribute to binding interactions with a target molecule bound by the protein. Alternatively, other side chains of the protein, such as those of other amino acid residues in the scaffold or superscaffold, may contribute to the binding interactions with a target molecule. In this case, the side chains of the variable residues only compose part of the binding site of the protein. Non-limiting examples of a target molecule include a viral antigen, a bacterial antigen, a fungal antigen, an enzyme, an enzyme inhibitor, a cell surface molecule of any composition, a reporter molecule, a serum protein, and a receptor. In the case of a viral antigen as a target molecule, it may be, but is not limited to, a polypeptide required for replication. Thus the binding sites of the invention, like immunoglobulin binding sites, recognize proteins (including native, denatured, and proteolytic forms thereof as well as conformational determinants thereof); nucleic acids; polysaccharides (alone or as modifications on another molecule, such as a protein); lipids; and small chemical molecules (like haptens in the case of an antibody).
- Optionally, the scaffold is extended at the Xaa1 end by all or part of the sequence -Ala-Ala-Leu-Phe-Gly-Gly- (SEQ ID NO:2), wherein the extension may be by 1, 2, 3, 4, 5, or all 6 of the consecutive amino acid residues of SEQ ID NO:2 linked to Xaa1 via the carboxyl end of the last Gly residue in SEQ ID NO:2. Alternatively, the scaffold is extended at the Xaa12 end by all or part of the sequence -Gly-Ala-Arg-Gly-Val-Cys-Asp-His-Leu-Ile-Leu-Glu (SEQ ID NO:3), wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to Xaa12 via the amino end of the first Gly residue in SEQ ID NO:3. The scaffold may also be extended at both ends by any combination of the above extensions at Xaa1 and Xaa12 followed by further optional extensions. Where all 12 amino acids of SEQ ID NO:3 are present in a scaffold, preferred embodiments of the invention have no further extension at the C terminus by additional amino acid residues.
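The optional extensions can be enumerated mechanically. The sketch below assumes one reading of the paragraph above: an N-terminal extension is a run of consecutive residues of SEQ ID NO:2 ending at its last Gly (which abuts Xaa1), and a C-terminal extension is a run of SEQ ID NO:3 beginning at its first Gly (which abuts Xaa12). The function names and the `core` sequence are hypothetical, and further optional extensions beyond these are not modeled.

```python
SEQ_ID_NO_2 = "AALFGG"        # Ala-Ala-Leu-Phe-Gly-Gly
SEQ_ID_NO_3 = "GARGVCDHLILE"  # Gly-Ala-Arg-Gly-Val-Cys-Asp-His-Leu-Ile-Leu-Glu

def n_terminal_extensions(ext=SEQ_ID_NO_2):
    """Suffixes of ext of length 0..len(ext); the last residue of
    each non-empty suffix is the Gly that links to Xaa1."""
    return [ext[len(ext) - n:] for n in range(len(ext) + 1)]

def c_terminal_extensions(ext=SEQ_ID_NO_3):
    """Prefixes of ext of length 0..len(ext); the first residue of
    each non-empty prefix is the Gly that links to Xaa12."""
    return [ext[:n] for n in range(len(ext) + 1)]

def extended_scaffolds(core):
    """All combinations of N- and C-terminal extensions around core,
    including the unextended core itself."""
    return [n + core + c
            for n in n_terminal_extensions()
            for c in c_terminal_extensions()]

# Hypothetical 26-residue core that fits the SEQ ID NO: 1 pattern.
core = "YWNDTSGSGSRAAFWLHGPSYSQAEV"
variants = extended_scaffolds(core)
print(len(variants))
```

Under this reading there are seven N-terminal choices (none, or 1 to 6 residues) and thirteen C-terminal choices (none, or 1 to 12 residues) for each core sequence.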
- The superscaffold is composed of additional amino acids attached to a scaffold of the invention without adverse effect on the binding site contained therein. A binding protein of the invention is thus preferably composed of a binding site within a scaffold which is attached to a superscaffold. Preferably, the superscaffold is composed of amino acids associated with the scaffold in naturally occurring sources of the scaffold, such as in naturally occurring polypeptides with a CTL-fold. Alternatively, the scaffold may be grafted onto a heterologous superscaffold, such as the superscaffold of another CTL-fold containing polypeptide, analogous to the grafting of mouse antibody CDRs onto a human antibody framework. Amino acid residues of the superscaffold may also serve to permit conjugation of the binding protein to another molecule. Thus the superscaffold may be a polypeptide linker as a non-limiting example. The polypeptide linker may be of differing lengths and compositions.
- The superscaffold may also optionally constitute or comprise a dimerization or multimerization domain which permits organization of more than one scaffold in three dimensional space without covalent linkage, or optionally through one or more disulfide bonds in addition to non-covalent interactions. Alternatively, the superscaffold may be a linker molecule or linker polypeptide which covalently links a scaffold to another molecule, such as a second scaffold, which may be the same or different from the first scaffold. Additionally, the superscaffold may comprise a transmembrane region or domain capable of tethering the scaffold in a lipid bilayer, such as at a cell surface. Further still, the superscaffold may be another protein molecule to form a fusion protein comprising a scaffold of the invention.
- A further aspect of the invention provides additional scaffolds and binding proteins comprising them. Generally, the scaffold is a CTL-fold containing a region with one or more variable residues, which region starts at the end of the β3 strand (or with the last residue thereof) and continues through any intervening secondary structures until, but preferably not including, the non-solvent exposed residues of, or before the start of, the β5 strand. Thus the scaffold may comprise a variable region represented by the sequence
- -Xaa1-Trp-Xaa2-Xaa3-Xaa4-Xaa5-Xaa6-Ser-Xaa7-Xaa8-Arg-Xaa9-Xaa10-Xaa11-Xaa12-Xaa13-Xaa14-Xaa15-Xaa16-Xaa17-Xaa18-Xaa19-Xaa20-Xaa21-Xaa22-Xaa23- (SEQ ID NO:4) wherein each Xaa is independently any amino acid residue but wherein Xaa5 is preferably Ser, Ala, or Pro, or a conservative substitution of any of these three residues; or Xaa7 is preferably Gly, Ala, or Leu, or a conservative substitution of any of these three residues; and/or Xaa8 is preferably Ser, Tyr, Phe, or Trp, or a conservative substitution of any of these four residues; or
- SEQ ID NO:4 wherein Xaa5 is Ser or wherein Xaa7 is Gly or wherein Xaa8 is Ser or wherein Xaa9 is Ala or wherein Xaa10 is Ala or wherein Xaa12 is Trp or wherein Xaa15 is Gly or wherein Xaa16 is Pro or wherein Xaa17 is Ser or wherein Xaa19 is Ser or wherein Xaa21 is Ala or any combination of the foregoing for Xaa5, Xaa7, Xaa8, Xaa9, Xaa10, Xaa12, Xaa15, Xaa16, Xaa17, Xaa19, and Xaa21. The side chains of the Xaa residues in the above sequences form a binding site, in whole or in part. At each of the N and C terminal ends of the sequences are optional amino acid sequences, or one of the ends is —H (a covalently bonded hydrogen atom), such as those that form a CTL-fold containing the binding site displayed in a solvent exposed portion of the fold.
- At the N terminus, these sequences are optionally extended by all or part of SEQ ID NO:2, wherein the extension may be by 1, 2, 3, 4, 5, or all 6 of the consecutive amino acid residues therein linked to Xaa1 via the carboxyl end of the last Gly residue in SEQ ID NO:2. At the C-terminus, these sequences are also optionally extended by all or part of SEQ ID NO:3, wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to the C terminal Xaa via the amino end of the first Gly residue in SEQ ID NO:3. The sequences may also be extended at both ends by any combination of the above extensions at Xaa1 and Xaa23 followed by further optional extensions. Where all 12 amino acids of SEQ ID NO:3 are present, preferred embodiments of the invention have no further extension at the C terminus.
- SEQ ID NO:4 containing sequences are preferably part of a scaffold as found in the CTL-fold portion of Mtd. Alternatively, the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
- Alternatively, the scaffold may comprise a cyanobacterium derived variable region represented by
- (SEQ ID NO: 5) Xaa1-Trp-Xaa2-Xaa3-Xaa4-Xaa5-Xaa6-Xaa7-Cys-Arg-Ser-Xaa8-Xaa9-Arg-Xaa10-Xaa11-Xaa12-Xaa13-Xaa14-Xaa15-Xaa16-Xaa17-Xaa18-Xaa19-Xaa20-Xaa21-,
- optionally with the addition of -Xaa22-, or -Xaa22-Xaa23-, or -Xaa22-Xaa23-Xaa24- at the C terminus, wherein each Xaa is independently any amino acid residue but wherein Xaa5 is preferably Ser, Ala, or Pro, or a conservative substitution of any of these three residues; or Xaa8 is Gly, Ala, or Leu, or a conservative substitution of any of these three residues; and/or Xaa9 is Ser, Tyr, Phe, or Trp, or a conservative substitution of any of these four residues. Again, the side chains of the Xaa residues in the above sequence form a binding site, in whole or in part. At each of the N and C terminal ends of the sequences are optional amino acid sequences, or one of the ends is —H (a covalently bonded hydrogen atom), such as those that form a CTL-fold containing the binding site displayed in a solvent exposed portion of the fold.
- At the N terminus, these sequences are optionally extended by all or part of SEQ ID NO:2, wherein the extension may be by 1, 2, 3, 4, 5, or all 6 of the consecutive amino acid residues therein linked to Xaa1 via the carboxyl end of the last Gly residue in SEQ ID NO:2. At the C-terminus, these sequences are also optionally extended by all or part of SEQ ID NO:3, wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to the C terminal Xaa via the amino end of the first Gly residue in SEQ ID NO:3. Alternatively, the sequence is extended at the C terminus by all or part of -Gly-Phe-Arg-Leu-Val-Ser-Phe-Pro-Pro-Arg-Thr-Leu-Glu- (SEQ ID NO:6), -Gly-Phe-Arg-Leu-Val-Ser-Phe-Pro-Pro-Arg-Thr-Pro-Glu- (SEQ ID NO:7), -Gly-Phe-Arg-Val-Val-Cys-Ala-Phe-Gly-Arg-Ile-Leu-Gln- (SEQ ID NO:8), or -Gly-Phe-Arg-Val-Val-Cys-Ala-Phe-Gly-Arg-Thr-Phe-Gln- (SEQ ID NO:9), wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or all 13 of the consecutive amino acid residues in any one of SEQ ID NOs:6-9 linked to the C terminal Xaa via the amino end of the first Gly residue in each SEQ ID NO. The C terminus extension may also be by -Gly-Phe-Arg-Val-Ile-Ser-Ser-Ser-Pro-Val-Val-Ser-Gly-Phe-His-Ser- (SEQ ID NO:10), wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or all 16 of the consecutive amino acid residues linked to the C terminal Xaa via the amino end of the first Gly residue in SEQ ID NO:10; or by -Gly-Cys-Arg-Val-Val-Val-Val-Arg-Gly-Arg-Leu-Ser- (SEQ ID NO:11), wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to the C terminal Xaa via the amino end of the first Gly residue in SEQ ID NO:11.
- The sequences may also be extended at both ends by any combination of the above extensions at Xaa1 and Xaa21 (or Xaa22, Xaa23, or Xaa24) followed by further optional extensions. Where all the amino acids of any of SEQ ID NOs:3 or 6-11 are present, preferred embodiments of the invention have no further extension at the C terminus.
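- The partial extensions described above (the first 1, 2, 3, ... or all residues of an extension sequence, linked through its first residue) can be enumerated mechanically. The following is a minimal Python sketch, not part of the disclosure; the helper name `extension_prefixes` and the constant name are hypothetical, and SEQ ID NO:11 is used only as an example input.

```python
# Illustrative sketch only: enumerate the optional C-terminal extensions,
# i.e. the first k consecutive residues of an extension sequence.
# The function and constant names are hypothetical.

# SEQ ID NO:11 as written in the text (three-letter codes, hyphen-separated).
SEQ_ID_NO_11 = "Gly-Cys-Arg-Val-Val-Val-Val-Arg-Gly-Arg-Leu-Ser"


def extension_prefixes(extension: str) -> list[str]:
    """Return every prefix (1 residue up to the full length) of an extension."""
    residues = [r for r in extension.strip("-").split("-") if r]
    return ["-".join(residues[:k]) for k in range(1, len(residues) + 1)]


if __name__ == "__main__":
    for prefix in extension_prefixes(SEQ_ID_NO_11):
        print(prefix)
```

Each returned string starts at the first Gly, matching the requirement that the extension is linked to the C terminal Xaa via the amino end of that residue.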
- SEQ ID NO:5 containing sequences are preferably part of a scaffold as found in the CTL-fold of a protein containing a cyanobacterium amino acid sequence as shown in
FIG. 5. Those cyanobacterium CTL-fold containing proteins are from Trichodesmium erythraeum (preferably T.e. 1A, T.e. 1B, or T.e. 2); Nostoc sp. PCC 7120 (preferably N. PCC. 1, N. PCC. 2A, or N. PCC. 2B); or Nostoc punctiforme (preferably N.p. 1 or N.p. 2) and have both protein-level homology (as indicated in FIG. 5) and genetic similarity, because the coding regions for the proteins contain a corresponding TR. Alternatively, the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein. - The invention also provides a Treponema denticola derived variable region comprising a sequence represented by
-
(SEQ ID NO: 12) Xaa1-Arg-Val-Xaa2-Arg-Gly-Gly-Xaa3-Trp-Xaa4-Xaa5- Xaa6-Ala-Xaa7-Xaa8-Cys-Xaa9-Val-Gly-Xaa10-Arg- Xaa11-Xaa12-Xaa13-Xaa14-Pro-Xaa15-Xaa16-Xaa17- Xaa18-Xaa19-Xaa20-Leu-, -
- wherein each Xaa is independently any amino acid residue and the side chains of the Xaa residues in the above sequence form a binding site, in whole or in part. At each of the N and C terminal ends of the sequences are optional amino acid sequences, or one of the ends is —H (a covalently bonded hydrogen atom), such as those that form a CTL-fold containing the binding site displayed in a solvent exposed portion of the fold.
- The sequence is optionally extended at the C terminus Leu by one or more residues in -Gly-Phe-Arg-Leu-Ala-Cys-Arg-Pro (SEQ ID NO:13) wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, or all 8 of the consecutive amino acid residues linked to the C terminal Leu via the amino end of the first Gly residue in SEQ ID NO:13. Where all 8 amino acids of SEQ ID NO:13 are present, preferred embodiments of the invention have no further extension at the C terminus.
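- A motif such as SEQ ID NO:12, in which each Xaa is independently any amino acid residue, can be turned into a search pattern for locating candidate variable regions in a protein sequence. The sketch below is illustrative only; `motif_to_regex` and `find_motif` are hypothetical helper names, and the assumption is that the protein is given in one-letter code over the 20 standard residues.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ANY_RESIDUE = "[ACDEFGHIKLMNPQRSTVWY]"

# SEQ ID NO:12 as written in the text.
SEQ_ID_NO_12 = (
    "Xaa1-Arg-Val-Xaa2-Arg-Gly-Gly-Xaa3-Trp-Xaa4-Xaa5-"
    "Xaa6-Ala-Xaa7-Xaa8-Cys-Xaa9-Val-Gly-Xaa10-Arg-"
    "Xaa11-Xaa12-Xaa13-Xaa14-Pro-Xaa15-Xaa16-Xaa17-"
    "Xaa18-Xaa19-Xaa20-Leu"
)


def motif_to_regex(motif: str) -> str:
    """Translate a hyphen-separated three-letter motif into a regex string.

    Every XaaN token becomes a wildcard over the 20 standard residues;
    every other token becomes its fixed one-letter code.
    """
    parts = []
    for token in motif.strip("-").split("-"):
        parts.append(ANY_RESIDUE if token.startswith("Xaa") else THREE_TO_ONE[token])
    return "".join(parts)


def find_motif(motif: str, protein: str) -> list[int]:
    """Return 0-based start positions of non-overlapping motif matches."""
    return [m.start() for m in re.finditer(motif_to_regex(motif), protein)]
```

A match only indicates that the invariant positions agree with the motif; it says nothing about whether the surrounding sequence adopts a CTL-fold.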
- SEQ ID NO:12 containing sequences are preferably part of a scaffold as found in the CTL-fold of a Treponema denticola protein containing the corresponding T.d. amino acid sequence in
FIG. 5 . Alternatively, the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein. - The invention further provides a scaffold comprising another phage derived variable region represented by
-
(SEQ ID NO: 14) -Gly-Gly-Gly-Leu-Trp-Cys-Arg-Asn-Tyr-Gly-Asp-Arg- Phe-Pro-Ile-Arg-Gly-Gly-Xaa1-Trp-Xaa2-Xaa3-Gly- Ser-Xaa4-Ala-Gly-Leu-Gly-Ala-Leu-Xaa5-Leu-Xaa6- Xaa7-Ala-Arg-Ser-Xaa8-Ser-Xaa9-Xaa10-Xaa11-Xaa12- -
- wherein each Xaa is independently any amino acid residue and the side chains of the Xaa residues in the above sequence form a binding site, in whole or in part. At each of the N and C terminal ends of the sequences are optional amino acid sequences, or one of the ends is —H (a covalently bonded hydrogen atom), such as those that form a CTL-fold containing the binding site displayed in a solvent exposed portion of the fold.
- The sequence is optionally extended at the Xaa12 end by one or more residues in -Gly-Phe-Arg-Pro-Ala-Phe-Phe-Val (SEQ ID NO:15) wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, or all 8 of the consecutive amino acid residues linked to Xaa12 via the amino end of the first Gly residue in SEQ ID NO:15. Where all 8 amino acids of SEQ ID NO:15 are present, preferred embodiments of the invention have no further extension at the C terminus.
- SEQ ID NO:14 containing sequences are preferably part of a scaffold as found in the CTL-fold of a Vibrio harveyi ML phage protein (ORF35 encoded protein) containing the corresponding V.h. ML amino acid sequence in
FIG. 5 . Alternatively, the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein. - The invention also provides a scaffold comprising a Bifidobacterium longum derived variable region represented by
-
(SEQ ID NO: 16) -Xaa1-Arg-Phe-Gly-Xaa2-Leu-Xaa3-Xaa4-Gly-Ala-Ala- Cys-Gly-Ala-Phe-Ala-Val-Xaa5-Leu-Xaa6-Xaa7-Xaa8- Leu-Ala-Xaa9-Arg-Xaa10-Trp-Xaa12- -
- wherein each Xaa is independently any amino acid residue and the side chains of the Xaa residues in the above sequence form a binding site, in whole or in part. At each of the N and C terminal ends of the sequences are optional amino acid sequences, or one of the ends is —H (a covalently bonded hydrogen atom), such as those that form a CTL-fold containing the binding site displayed in a solvent exposed portion of the fold.
- The sequence is optionally extended at the Xaa12 end by one or more residues in -Gly-Gly-Arg-Leu-Ser-Ala-Leu-Gly-Arg-Thr-Lys-Ala (SEQ ID NO:17) wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to Xaa12 via the amino end of the first Gly residue in SEQ ID NO:17. Where all 12 amino acids of SEQ ID NO:17 are present, preferred embodiments of the invention have no further extension at the C terminus.
- SEQ ID NO:16 containing sequences are preferably part of a scaffold as found in the CTL-fold of a Bifidobacterium longum protein containing the corresponding B.l. amino acid sequence in
FIG. 5. Alternatively, the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein. - Additionally, the invention provides a scaffold comprising a Bacteroides thetaiotaomicron derived variable region represented by
-
(SEQ ID NO: 18) -Xaa1-Gly-Xaa2-Cys-Trp-Ser-Ala-Val-Pro-Xaa3-Xaa4- Xaa5-Xaa6-Xaa7-Gly-Xaa8-Xaa9-Leu-Xaa10-Phe-Xaa11- Ser-Ser-Xaa12-Val-Xaa13-Pro-Leu-Xaa14-Xaa15-Xaa16- Xaa17- -
- wherein each Xaa is independently any amino acid residue and the side chains of the Xaa residues in the above sequence form a binding site, in whole or in part. At each of the N and C terminal ends of the sequences are optional amino acid sequences, or one of the ends is —H (a covalently bonded hydrogen atom), such as those that form a CTL-fold containing the binding site displayed in a solvent exposed portion of the fold.
- The sequence is optionally extended at the Xaa17 end by one or more residues in -Arg-Ala-Cys-Gly-Phe-Gly-Leu-Arg-Ser-Ser-Gln-Glu (SEQ ID NO:19) wherein the extension may be by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or all 12 of the consecutive amino acid residues linked to Xaa17 via the amino end of the first Arg residue in SEQ ID NO:19. Where all 12 amino acids of SEQ ID NO:19 are present, preferred embodiments of the invention have no further extension at the C terminus.
- SEQ ID NO:18 containing sequences are preferably part of a scaffold as found in the CTL-fold of a Bacteroides thetaiotaomicron protein containing the corresponding B.t. amino acid sequence in
FIG. 5 . Alternatively, the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein. - Additionally, the invention provides for the use of the region between the β3 and β5 strands of a CTL-fold as a variable region in which amino acids may be altered to produce novel binding sites with different specificities and avidities. Thus in an additional aspect of the invention, the nucleic acid sequence encoding the CTL-fold of a CTL-fold containing protein may be operably linked to a template region (TR), and an IMH as needed, wherein the TR corresponds to all or part of the binding site in the CTL-fold and contains adenine residues that direct changes in the amino acid sequence of the binding site, and thus variable region, as described herein. Preferred embodiments of the invention include CTL-fold encoding nucleic acids with the Mtd IMH, or a functional fragment thereof, to direct alterations in the VR based on adenine residues in the functionally linked TR.
- A scaffold in a binding protein of the invention is preferably all or part of a CTL-fold that correctly orients the binding site contained therein. Non-limiting examples of CTL-folds include that in Mtd as described herein as well as those classified as C-type lectin-like domains (CTLDs) and divergent CTLDs. Preferred regions of the CTL-fold in Mtd are residues 171-381 and residues 306-381 of SEQ ID NO:20. In the case of residues 171-381, the size is analogous to recombinant single chain antibodies composed of a single variable domain (VHH), which remains a stable polypeptide with the antigen binding capability of the original variable region of the heavy chain (see Nanobodies™ by Ablynx). These VHH are based on the antibodies lacking light chains that are found in Camelidae (camels and llamas). In the case of residues 306-381, at least one region composed of residues 171-199, residues 237-263, residues 200-236, or residues 264-305 is preferably present in the fold as well. Particularly preferred is the presence of any two, any three, or all four of these regions.
- CTLD examples include those that bind Ca2+, such as carbohydrate recognition domains (CRDs), C-type lectin domains (which bind sugars), coagulation factor binding proteins, and IgE Fc receptor. Divergent CTLD examples include type II antifreeze proteins, oxidized LDL receptor, phospholipase receptors, and NK cell receptors (which bind MHC ligands). Other non-limiting examples include link protein modules, endostatin, and intimin. For a review of the C-type lectin fold, see Drickamer, K. “C-type lectin-like domains.” Curr Opin Struct Biol 9, 585-90 (1999).
- Preferably, the CTL-fold is bacterial (including bacterial phages), human or mammalian in origin. Non-limiting examples include the selectins (see Lasky (1995) Annu. Rev. Biochem., 64:113-139), including E-selectin, L-selectin and P-selectin; mannose binding protein (MBP), including MBP-A and MBP-C; the natural killer (NK) receptor NKG2D; CD69; eosinophilic major basic protein (EMBP); tumour necrosis factor-stimulated gene-6 product (TSG-6); enteropathogenic E. coli (EPEC) intimin (the D3 domain therein is a CTL-fold); and Yersinia pseudotuberculosis invasin (the D5 domain is a CTL-fold).
- An MBP derived variable region of the invention is represented by
-
- -Xaa1-Xaa2-Gly-Xaa3-Trp-Asn-Asp-Xaa4-Xaa5-Cys-Xaa6-Xaa7-Xaa8- (SEQ ID NO:21) wherein each Xaa is independently any amino acid residue; or
- SEQ ID NO:21 wherein Xaa1 is Asp or wherein Xaa2 is Asn or wherein Xaa3 is Leu, Gln, His, or Lys or wherein Xaa4 is Ile, Val, or Asp or wherein Xaa5 is Ser, Pro, Val, or Ala or wherein Xaa6 is Gln, Asn, Arg, or His or wherein Xaa7 is Ala, Tyr, Arg, or Lys or wherein Xaa8 is Ser, Gln, Pro, or Arg or any combination of the foregoing for Xaa1 to Xaa8.
- The side chains of the Xaa residues in the above sequences form a binding site, in whole or in part. At each of the N and C terminal ends of the sequences are optional amino acid sequences, or one of the ends is —H (a covalently bonded hydrogen atom), such as those that form a CTL-fold containing the binding site displayed in a solvent exposed portion of the fold.
- SEQ ID NO:21 containing sequences are preferably part of a scaffold as found in the CTL-fold of an MBP protein, preferably with a collagenous domain. Alternatively, the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
- A selectin derived variable region of the invention is represented by
-
- -Xaa1-Xaa2-Xaa3-Xaa4-Xaa5-Xaa6-Xaa7-Gly-Xaa8-Trp-Asn-Asp-Xaa9-Xaa10-Cys-Xaa11-Xaa12-Xaa13- (SEQ ID NO:22) wherein each Xaa is independently any amino acid residue; or
- SEQ ID NO:22 wherein Xaa1 is Ile or wherein Xaa2 is Lys or wherein Xaa3 is Arg or wherein Xaa4 is Gln or wherein Xaa5 is Arg or wherein Xaa6 is Asp or wherein Xaa7 is Ser or wherein Xaa8 is Leu, Gln, His, or Lys or wherein Xaa9 is Ile, Val, or Asp or wherein Xaa10 is Ser, Pro, Val, or Ala or wherein Xaa11 is Gln, Asn, Arg, or His or wherein Xaa12 is Ala, Tyr, Arg, or Lys or wherein Xaa13 is Ser, Gln, Pro, or Arg or any combination of the foregoing for Xaa1 to Xaa13.
- The side chains of the Xaa residues in the above sequences form a binding site, in whole or in part. At each of the N and C terminal ends of the sequences are optional amino acid sequences, or one of the ends is —H (a covalently bonded hydrogen atom), such as those that form a CTL-fold containing the binding site displayed in a solvent exposed portion of the fold.
- SEQ ID NO:22 containing sequences are preferably part of a scaffold as found in the CTL-fold of a selectin protein. Alternatively, the sequences may be substituted for the corresponding sequence between the β3 and β5 strands of another CTL-fold as described herein.
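- The effect of the residue preferences listed for SEQ ID NO:22 on the number of distinct variable-region sequences can be illustrated with a short calculation. The sketch below is an illustration under stated assumptions, not a statement of the invention's scope: the names `PREFERRED_SEQ22` and `library_size` are hypothetical, unconstrained Xaa positions are assumed to range over the 20 standard amino acids, and "any combination of the foregoing" is modeled as a chosen subset of positions to which the listed preference is applied.

```python
from math import prod

# Preferred residues for Xaa1..Xaa13 of SEQ ID NO:22, transcribed from the text.
PREFERRED_SEQ22 = {
    1: ["Ile"], 2: ["Lys"], 3: ["Arg"], 4: ["Gln"],
    5: ["Arg"], 6: ["Asp"], 7: ["Ser"],
    8: ["Leu", "Gln", "His", "Lys"],
    9: ["Ile", "Val", "Asp"],
    10: ["Ser", "Pro", "Val", "Ala"],
    11: ["Gln", "Asn", "Arg", "His"],
    12: ["Ala", "Tyr", "Arg", "Lys"],
    13: ["Ser", "Gln", "Pro", "Arg"],
}


def library_size(preferred, constrained=None, alphabet_size=20):
    """Count distinct assignments of the Xaa positions.

    Positions listed in `constrained` are limited to their preferred residues;
    all other Xaa positions may be any of `alphabet_size` residues.
    By default every position is constrained.
    """
    if constrained is None:
        constrained = set(preferred)
    return prod(
        len(options) if position in constrained else alphabet_size
        for position, options in preferred.items()
    )


if __name__ == "__main__":
    print(library_size(PREFERRED_SEQ22))                     # all preferences applied
    print(library_size(PREFERRED_SEQ22, constrained={8, 9}))  # only Xaa8 and Xaa9 restricted
```

The invariant Gly, Trp, Asn, Asp, and Cys positions do not enter the count because they do not vary.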
- In a further aspect, the invention provides nucleic acid molecules, or polynucleotides, encoding the scaffolds and binding proteins as described herein. The nucleic acids or polynucleotides may be part of a nucleic acid vector or plasmid, optionally in a cell, preferably suitable for expression of the encoded protein. The scaffold is preferably all or part of a variable region (VR) in the nucleic acid molecule which is operably linked to an initiation of mutagenic homing (IMH) sequence and a template region (TR) as described below. Thus nucleic acid molecules encoding the CTL-folds described above, but which do not have an operably linked IMH and/or TR components, may be modified to be a nucleic acid molecule of the invention by attachment of the necessary functional nucleic acid components.
- The invention also provides a plurality, or library, of scaffolds or binding proteins as well as methods for their production. Thus, a method of producing a plurality of scaffolds or proteins with different binding specificities is disclosed, the method comprising expressing and replicating a nucleic acid molecule or polynucleotide encoding a scaffold or binding protein of the invention in a cell under conditions of mutagenic homing wherein said TR directs mutagenesis of variable residues within the variable region (VR) containing the scaffold. Non-limiting examples of a plurality or library of scaffolds or binding proteins include those expressed as a phage display, ribosome display, polysome display, or cell surface display as well as those presented as an array or microarray format. In some preferred embodiments, the plurality is expressed as part of the tail fibers of Bordetella bacteriophages.
- The resultant plurality or library of scaffolds or binding proteins may be screened for binding against a target molecule of interest. The invention provides a method of selecting for binding comprising producing or providing a plurality, or library, of scaffolds or proteins in a plurality of cells as described above followed by selecting proteins which bind a molecule of interest after individually contacting each of said plurality of scaffolds or proteins (or phage particles, cells, or media containing them) with a target molecule of interest. Optionally, the binding proteins in the plurality or library are in dimeric or other multimeric form. The invention also provides for identifying a multimeric form of a binding protein as having a greater avidity for the target molecule of interest than a monomeric form of the protein.
- Alternatively, the plurality or library of scaffolds or binding proteins may be screened for binding to any one of a multiplicity of target molecules as an additional method of the invention. The scaffolds or proteins are contacted with the multiple molecules, after which those scaffolds or proteins that bind at least one of the target molecules are selected and may be isolated. The multiple target molecules may be in a mixture or disposed on an array or microarray as non-limiting examples. Other such examples include multiple molecules in or on a cell or tissue as well as multiple molecules immobilized on a solid support. The target molecules are preferably polypeptides, optionally modified by glycosylation, phosphorylation, or other post-translational modification; carbohydrates; lipids; or complex combinations thereof. The target molecules may be expressed on the exterior of phage or a virus, or a viable or non-viable cell of any phylum. In some embodiments of the invention, the plurality or library of scaffolds or binding proteins is expressed on the exterior of phage, such as Bordetella bacteriophage.
- Where the members of a plurality or library of scaffolds or binding proteins are individually expressed on the exterior of individual phage particles, the invention provides methods of selecting for binding against a target ligand or molecule of interest by use of the plurality or library of phage particles. The plurality, or library, is provided and contacted with a target ligand or molecule of interest followed by selection of phage which bind the ligand or molecule, optionally by removal of phage which do not bind. The selected phage particles may be propagated followed by one or more additional rounds of contacting and selection, optionally under more stringent wash conditions, to “enrich” for phage expressing a scaffold or binding protein with greater affinity or avidity. The polynucleotide encoding the scaffold or binding protein may be isolated from the selected phage and analyzed (e.g. sequenced), amplified or propagated to produce the scaffold or binding protein. In cases of a binding protein, the phage may have been expressing the protein in dimeric, trimeric or other multimeric form. Such selected phage may be used as sources of genes or gene fragments encoding binding protein molecules with the desired specificity and avidity.
- The selection methods of the invention may further include an additional determination of the scaffold or binding proteins, selected as described above, as binding or not binding to a second molecule. Scaffolds or binding proteins that bind a second molecule would be identified as non-specific for the target ligand or molecule of interest, while those that do not bind a second molecule would be identified as specific for the target ligand or molecule of interest relative to the second molecule.
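- The specificity determination described above reduces to a simple rule: among members already selected for binding the target, those that also bind the second molecule are identified as non-specific, and those that do not are identified as specific relative to that second molecule. A minimal Python sketch of this bookkeeping follows; the function name, the clone labels, and the boolean inputs are hypothetical placeholders for whatever binding readout is actually used.

```python
# Illustrative sketch only: classify selected binders by their behavior
# toward a second molecule. Names and inputs are hypothetical.


def classify_specificity(
    binds_target: dict[str, bool],
    binds_second: dict[str, bool],
) -> dict[str, str]:
    """Label each target-binding member as 'specific' or 'non-specific'.

    Members that do not bind the target were not selected in the first place
    and are therefore left out of the result.
    """
    labels = {}
    for member, hit in binds_target.items():
        if not hit:
            continue
        labels[member] = "non-specific" if binds_second[member] else "specific"
    return labels


if __name__ == "__main__":
    target = {"clone_a": True, "clone_b": True, "clone_c": False}
    second = {"clone_a": False, "clone_b": True, "clone_c": False}
    print(classify_specificity(target, second))
```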
- The scaffolds and binding proteins of the invention may also be modified, such as by attachment of another moiety thereto. Non-limiting examples of a moiety for attachment include a detectable label or a toxin or activatable pro-drug. Modified scaffolds and binding proteins may be used to target a cell which is bound thereby. As a non-limiting example, a detectably labeled modified scaffold or binding protein may be used to detect a cell expressing a molecule bound by the binding site of the scaffold or protein. The molecule may be expressed on the cell surface, such that the scaffold or binding protein binds the exterior of the cell. The molecule may also be expressed within the cell, wherein the scaffold or binding protein binds after introduction into the interior of the cell, such as, but not limited to, cases where the cells have been permeabilized. Non-limiting examples of cells that may be detected include both prokaryotic and eukaryotic cells, including bacterial cells and higher eukaryotic cells from a multicellular organism.
- A modified scaffold or binding protein attached to a toxin, or pro-drug form thereof, may be used to decrease the viability of, or to kill, cells which express a cell surface molecule bound by the modified scaffold or protein. Preferably, the cells are cancer cells, such as those of a mammal, preferably a human.
- In additional aspects of the invention, compositions comprising the scaffolds and binding proteins of the invention are provided. The compositions may be used for the practice of the methods disclosed herein, including diagnostic, prophylactic or therapeutic applications. Additionally, compositions comprising the nucleic acid molecules and polypeptides disclosed herein as well as materials for the expression thereof are provided. These compositions may be provided in the form of a kit for the expression and production of the scaffolds and proteins of the invention.
- The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the drawings and detailed description, and from the claims.
-
FIG. 1 shows the organization of the Bordetella phage DGR containing a single copy of Mtd with its VR followed by a nearly identical (90%), 134-bp direct repeat of the VR called the template repeat (TR), which is invariant among Mtd variants. The amino acid sequence of VR in each of the five Mtd variants is shown in the upper box, together with the predicted amino acid sequence encoded by the corresponding nucleotide triplets of the TR in the lower box. The region corresponding to the initiation of mutagenic homing (IMH) sequence is underlined. -
FIG. 2A shows two representations of the intertwined, pyramid-shaped trimer structure of several Mtd variants. -
FIG. 2B shows a representation of an Mtd monomer and three domains therein: β-prism, intermediate domain containing the β-sandwich, and C-type lectin (CTL)-fold including the VR and the region corresponding to the IMH. -
FIG. 2C is a schematic showing regions of secondary structure in Mtd. -
FIG. 3A shows a representation of an Mtd CTL-fold. -
FIG. 3B shows a representation of 12 variable residues which are almost all solvent-exposed and organized into a receptor-binding site on the external face of the Mtd β2β3β4β4′ sheet. -
FIG. 3C shows a structural comparison of Mtd-P1, -3c, -M1, -I1, and -N1 used to determine that the main chain conformation of the CTL domain is remarkably consistent, despite half of the variable residues being on loop regions. -
FIG. 3D shows a representation of Serine-270 (S270) and Glutamate-267 (E267) from the second insert in the Mtd CTL-fold forming hydrogen bonds to the invariant VR residues Serine-351 (S351) and Serine-353 (S353), respectively, within the binding region. -
FIG. 3E shows that the β2β3 loop from one monomer hydrogen bonds to the invariant VR residue Arginine-354 (R354) and to main chain (scaffold) atoms of VR. -
FIG. 4 shows by means of molecular surface representations that Mtd-P1 (BPP-1) and Mtd-I1 (BIP-1) have highly hydrophobic binding sites, and that the continuity of the hydrophobic surface decreases successively for Mtd-3c (BPP-3), -M1 (BMP-1), and -N1 (BNP). The view is looking onto the base of pyramid-shaped Mtd, that is, the surface that binds the exposed binding surface of the target molecule. The variable amino acid residues (except for 348) are numbered on the surface of BPP-1. The variable and invariant hydrophobic amino acid residues (Ala, Val, Leu, Ile, Phe, Tyr, Trp, and Met) are in green and yellow, respectively; and variable and invariant hydrophilic amino acid residues (Ser, Thr, Asn, Gln, Asp, Glu, His, Lys, Arg, and Cys) are in red and pink, respectively. The surface denoted ‘Invariant’ shows, using the same coloring scheme, the hydrophobic and hydrophilic surface surrounding the variable portion of the binding sites. -
FIG. 5 shows the structure-based sequence alignment of the β2β3β4β4′ sheet of the CTL-fold in Mtd-P1 and 12 variable proteins of putative DGRs, as discussed herein. Residues colored light gray correspond to variable residues in Mtd, and to those residues found to differ between VR and TR in genomic sequences of the other 12 proteins. Residues colored dark gray are those that could vary by an adenine-directed mechanism in these other proteins. Magenta corresponds to identical residues and yellow to residues conserved in chemical character. In assigning color, the grays take precedence over magenta and yellow, such that certain putatively variable residues are also identical or conserved. Secondary structure elements (box for β-strand, and oval for 3₁₀-helix) for Mtd are denoted above the alignment, and the ‘GGXW’ motif is also denoted. The 12 variable proteins of putative DGRs are from Vibrio harveyi ML phage (V.h. ML); Bifidobacterium longum (B.l.); Bacteroides thetaiotaomicron (B.t.); Treponema denticola (T.d.); Trichodesmium erythraeum 1A (T.e. 1A); Trichodesmium erythraeum 1B (T.e. 1B); Trichodesmium erythraeum #2 (T.e. 2); Nostoc sp. PCC 7120 #1 (N. PCC. 1); Nostoc sp. PCC 7120 #2A (N. PCC. 2A); Nostoc sp. PCC 7120 #2B (N. PCC. 2B); Nostoc punctiforme #1 (N.p. 1); and Nostoc punctiforme #2 (N.p. 2). - This invention is based in part on X-ray crystal structures of four Mtd variants, each competent to promote infectivity and each having a different receptor specificity (Mtd-P1, -3c, -M1, and -I1). The structure of a fifth Mtd variant from a non-infective phage (see Mtd-N1 in
FIG. 1 ) was also determined. The 1.5 Å resolution structure of Mtd-P1 was determined by multiwavelength anomalous dispersion using seleno-methionine substituted protein, and structures of other Mtd variants were determined by molecular replacement. The overall structures of these variants are nearly identical, indicating sequence variation within the VR causes no large conformational shifts. - The Mtd variants are all seen to form an intertwined, pyramid-shaped trimer (
FIG. 2A). The dimensions of the trimer (height and base of ~90 Å and ~50 Å, respectively) correspond roughly to the size of knobs seen on the ends of Bordetella phage tail fibers (see Liu, M. et al. Genomic and genetic analysis of Bordetella bacteriophages encoding reverse transcriptase-mediated tropism-switching cassettes. J Bacteriol 186, 1503-17 (2004)). The extensive trimer interface buries more than 4,500 Å² of surface area in each monomer, consistent with an obligatory trimer and with trimeric association observed by static light scattering. The majority (69%) of the interface area is composed of non-polar residues. Each polypeptide is also joined to its neighbor via 20 hydrogen bonds, one electrostatic interaction (between Glu-234 and Arg-354), and at least one shared cation (magnesium or calcium at Phe-313 carbonyl). - Mtd is composed of three domains (see
FIG. 2B). At the apex of the pyramid, the N-terminal domains (residues 1-48) of each of the three monomers form a threefold symmetric β-prism, with each monomer contributing a four-stranded, antiparallel β-sheet flanked by a short α-helix. The β-prism is structurally similar to the pseudo-threefold symmetric β-prisms observed in monocot lectins (rmsd 2.4 Å, 60 Cα atoms, see Hester, G., Kaku, H., et al. Structure of mannose-specific snowdrop (Galanthus nivalis) lectin is representative of a new plant lectin family. Nat Struct Biol 2, 472-9 (1995)). However, the Mtd β-prism does not contain the spatial arrangement of residues required in monocot lectins, which bind carbohydrates without a CTL-fold. - The β-prism domain of each Mtd monomer is joined to the following intermediate domain by a short 3₁₀-helix (residues 49-54), which intertwines with equivalent 3₁₀-helices from other monomers. These connections cross such that the β-prism domain occupies a different face of the pyramid than the other domains.
- In contrast to the intimate trimeric association of the β-prism domain, the intermediate domain (residues 56-170) splays away from the trimer axis and makes little contact with other monomers. The intermediate domain is formed by an elaborated β-sandwich containing three- and four-stranded antiparallel sheets, with the three-stranded sheet making a near right-angle turn near its middle (see
FIG. 2B ). The structure of the intermediate domain appears to constitute a novel fold. Without being bound by theory, and offered to advance understanding of the invention, the N-terminal β-prism or intermediate β-sandwich domains are theorized to permit association of the individual monomers with each other as well as being possibly involved in tethering Mtd to the surface of Bordetella phage. - The superscaffold of the proteins of the invention may thus include all or part of one or both of the β-prism and intermediate domains of Mtd, where the Mtd CTL-fold contains one scaffold of the invention. These superscaffold domains may be used to arrange and display the binding site of a scaffold of the invention as described herein.
- The Mtd C-terminal domain (residues 171-381), which constitutes more than half of Mtd and contains the VR, is unexpectedly found to have a C-type lectin (CTL)-fold (see Weis, W. I., et al. Structure of the calcium-dependent lectin domain from a rat mannose-binding protein determined by MAD phasing. Science 254, 1608-15 (1991); Drickamer, K. C-type lectin-like domains. Curr Opin Struct Biol 9, 585-90 (1999); and Holm, L. et al. Protein structure comparison by alignment of distance matrices. J Mol Biol 233, 123-38. (1993)). See
FIG. 3A. Although originally named for calcium-dependent carbohydrate binding in mammalian mannose binding protein (MMBP, see Weis, W. I., et al. Structure of a C-type mannose-binding protein complexed with an oligosaccharide. Nature 360, 127-34 (1992)), different individual CTL-folds have been recognized to bind different ligands. - The similarity of Mtd to carbohydrate-binding CTL proteins, such as MMBP (1.5 Å rmsd, 60 Cα atoms), appears to be the result of convergent evolution. None of the 14 residues absolutely conserved in carbohydrate-binding CTL domains is found in Mtd, and neither are the residues required for calcium- and carbohydrate-binding. Likewise, none of the four disulfide-bond forming cysteines found in many CTL domains is found in Mtd, confirming that disulfides are not required for stability of CTL-folds. Furthermore, Mtd has no obvious amino acid sequence relationship to other convergently evolved CTL domains, such as the E. coli virulence factor intimin, but does have structural similarity as expected (rmsd 1.8 Å, 75 Cα atoms).
- The typical distinguishing features of the ~110-130 residue CTL-fold, as also seen in Mtd, are a two-stranded antiparallel β-sheet formed by the domain's N- and C-termini (β1β5) connected by two α-helices to a three-stranded, antiparallel β-sheet (β2β3β4), see
FIG. 3A. These features are also generally present in other CTL-folds, which range from about 95 to about 150 residues, described herein for use in the practice of the invention. The β2 strand is uniquely twisted in Mtd such that it crosses over the β3 strand. Unique to Mtd are inserts (residues 200-236 and 264-305) that interrupt connections between β1 and α1 and between α2 and β2, respectively, as well as some additional short strands (β0 and β4′). The inserts have no regular secondary structure but do have specific conformations due to an extensive hydrogen bonding network, including to residues within the binding site. Without being bound by theory, and offered to advance the understanding of the present invention, it is possible that the inserts stabilize the VR as discussed below. As noted above, the Mtd CTL-fold, and other analogous CTL-folds of similar structural arrangement, may be used as a scaffold in the practice of the present invention. - The Mtd CTL-fold contains 12 residues that are variable. The 12 variable residues are almost all solvent-exposed and organized into a receptor-binding site on the external face of the β2β3β4β4′ sheet (
FIG. 3B). This face is equivalent to the one in the CTL-fold proteins Ly49A (see Tormo, J., et al. Crystal structure of a lectin-like natural killer cell receptor bound to its MHC class I ligand. Nature 402, 623-31 (1999)) and intimin (Luo, Y. et al. Crystal structure of enteropathogenic Escherichia coli intimin-receptor complex. Nature 405, 1073-7 (2000); and Batchelor, M. et al. Structural basis for recognition of the translocated intimin receptor (Tir) by intimin from enteropathogenic Escherichia coli. EMBO J 19, 2452-64 (2000)) responsible for interaction with their respective targets, class I MHC molecules and Tir. Half of the 12 variable residues are located on regular secondary structure elements: three are located on β-strands (357 on β4; 368 and 369 on β4′), and three on a 3₁₀-helix that connects β3 to β4 (347, 348, and 350), see FIG. 3B. The other half of the variable residues occupy loop positions preceding the 3₁₀-helix (344 and 346) or connecting β4 to β4′ (359, 360, 364, and 366). - All variable residues, except for 348 and 369, are encoded by AAC codons in TR. Adenine-directed mutagenesis permits substitution of Asn encoded by AAC with 14 other residues, which cover the gamut of chemical character. For example, while adenine substitution of AAC cannot produce a codon for Trp, it can produce codons for Phe and Tyr. Likewise, while substitution cannot produce codons for Glu and Lys, it can produce codons for Asp and Arg (also His). Significantly, the use of the AAC codon rules out a nonsense codon being introduced. Adenine-substitution of the two non-AAC codons in TR, ACG encoding Thr-348 and ATC encoding Ile-369, can produce three other amino acids (Ser, Pro, Ala at 348; Val, Leu, Phe at 369). There appears to be no structural necessity for
residue 348 to be small, but 369 is preferably hydrophobic to pack between the invariant residues Trp-307 and Trp-309 (FIG. 3B ). - Along with these variable residues, the binding site in Mtd contains four invariant, solvent-exposed aromatic residues that are likely to contribute to interactions despite their status as amino acid residues of a scaffold as described herein. These are Trp-307 and Trp-345 at the center and periphery, respectively, of the binding site. Also at the periphery are the invariant residues Tyr-322 and Tyr-333, which come from the intertwining of an adjacent monomer's β2β3 loop into a neighbor's binding site (
FIG. 3B ). Altogether, the binding site including the variable and above invariant residues in Mtd-P1 presents ~900 Å² of exposed surface area. - In the practice of the invention, it is contemplated that “conservative amino acid substitutions” may be favored due to the interchangeability of residues having similar side chains. Thus amino acids may be grouped based upon the similarities of their side chains and substituted for each other on this basis. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. The invention provides for the “conservative substitution” of one amino acid residue in a group by another amino acid residue in the same group. Other conservative amino acid substitution groups include, but are not limited to, valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
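The codon arithmetic stated above for the AAC, ACG, and ATC codons can be checked with a short enumeration. The following is a minimal sketch only, not part of the disclosed methods. It assumes, for illustration, that each adenine in a TR codon may be replaced by any of the four bases while non-adenine positions are copied unchanged. The names `adenine_variants`, `reachable_residues`, and `protein_variants` are hypothetical, and the final count is simply the product over the 12 variable positions under that assumption.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def adenine_variants(codon):
    """Codons reachable when every adenine may become any base (assumed model)."""
    choices = [BASES if base == "A" else base for base in codon]
    return {"".join(p) for p in product(*choices)}


def reachable_residues(codon):
    """One-letter residues (or '*' for a stop) encoded by the reachable codons."""
    return {CODON_TABLE[c] for c in adenine_variants(codon)}


asn_outcomes = reachable_residues("AAC")  # variable positions encoded by AAC
thr_outcomes = reachable_residues("ACG")  # Thr-348
ile_outcomes = reachable_residues("ATC")  # Ile-369

# Ten AAC positions plus the two non-AAC positions (348 and 369).
protein_variants = len(asn_outcomes) ** 10 * len(thr_outcomes) * len(ile_outcomes)
print(sorted(asn_outcomes), protein_variants)
```

Under this assumed model, the enumeration reproduces the statements in the text: Asn plus 14 other residues and no stop codon from AAC, and three alternatives each at positions 348 and 369.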
- The final portion of VR, the β5 strand, is encoded by the ‘initiation of mutagenic homing’ (IMH) sequence, which maintains the unidirectional flow of mutagenized genetic information from TR to VR. This region of VR is unaffected by adenine-directed mutagenesis and therefore invariant. Invariance at the nucleotide level is echoed at the protein level among Mtd variants, with β5 making close intra- and inter-molecular contacts within the central core of the trimer that would be potentially disrupted by variation. Thus all or part of this IMH-encoded β5 strand of the protein may be part of a superscaffold as described herein while the nucleic acid encoding the β5 strand, or a portion thereof, serves as the IMH, which maintains the unidirectional flow of diversity generating information from TR to VR.
- Based in part on the foregoing, the present invention provides a binding protein comprising a scaffold for presentation of a binding site with variable residues as described herein. In a broad sense, the scaffolds and binding proteins of the invention may be substituted for antibodies, and antigen binding fragments thereof, or other affinity agents in detection or other affinity-based assays or in therapeutics as known in the art.
- In preferred embodiments, the scaffold comprises all or part of a CTLD, the Mtd CTL-fold, or an Mtd-like CTL-fold. In the case of the Mtd CTL-fold, the scaffold would permit possible variation at one or more of the 12 variable residues described herein. Alternatively, the scaffold comprises all or part of another CTL-fold, including those of microbial proteins as described herein (see
FIG. 5 and Example 3) as well as those of a selectin; MBP; NKG2D; CD69; EMBP; TSG-6; and intimin as described herein. By “binding site”, it is meant the side chains of variable residues which define, in whole or in part, the three dimensional structure or shape which permits binding of the polypeptide attached to the side chains (through the alpha carbons of each variable residue) to a target molecule. Thus a scaffold is a polypeptide which functionally presents the binding site defining variable residues (contained in said polypeptide) to interact with a target molecule bound by the binding site. Scaffolds of the invention that contain a binding site that is functionally presented to bind a target molecule are thus analogous to a Fv region of an antibody molecule and so may be used in analogous ways. As a non-limiting example, a scaffold of the invention may be conjugated to another molecule as described herein, such as to form a fusion protein or to form a labeled scaffold. The scaffolds of the invention may also be viewed as comprising a variable region which contains a binding site of the invention. - The relationship between a binding site, and thus a scaffold or binding protein of the invention, and a “target molecule” as used herein may also be described as the relationship between the members of a binding pair, wherein one member of the pair has an area on its surface or in a portion thereof which binds to the other member of the pair. The relationship may also be described as that between members of a specific binding pair, wherein one member of the pair has an area on its surface or in a portion thereof which specifically binds to the other member of the pair. The members of a pair may be referred to as ligand and anti-ligand (or ligand and receptor), either of which may be the scaffold or binding protein of the invention. 
The members of a pair are exemplified by other known, non-limiting examples, including antibody and antigen or hapten; biotin and avidin (or streptavidin); hormone and hormone receptor; immunoglobulin and protein A; and phosphorylated serine residues and annexin. Thus a scaffold or binding protein of the invention may be viewed as a receptor that binds a ligand as the molecule of interest, or as a ligand that is bound by a receptor as the molecule of interest.
- Preferably, a scaffold of the invention is at least about 40 amino acid residues. The scaffold may also be about 45, about 50, about 55, about 60, about 65, about 70, about 75, about 80, about 85, about 90, about 100, about 110, about 120, about 130, about 140, about 150, about 160, about 170, about 180, about 190, about 200, about 220, or about 230 or more amino acid residues.
- The scaffold in a binding protein of the invention is also preferably in the C-terminal half of the protein. More preferred is where the scaffold is within about 100, about 75, about 50, about 40, about 30, about 20, or about 10 amino acid residues of the C-terminus of the protein.
- Scaffolds containing a binding site may also be conjugated to a superscaffold as described herein to form a binding protein of the invention. A superscaffold of the invention of course does not interfere with the presentation of the binding site by the scaffold, although as explained herein, the superscaffold can serve to permit multimerization of scaffolds, and thus multimerization of binding sites in order to effect high avidity of the binding site comprised of multiple identical or non-identical lower affinity binding sites. Alternatively, the superscaffold can serve as a means, or a linker, to permit conjugation of another molecule to the scaffold and thus binding site through the structure of the superscaffold.
- The amino acid sequences that form the superscaffold are preferably those of non-CTL-fold regions naturally occurring in association with a CTL-fold. One non-limiting example is residues 1-170 of Mtd (SEQ ID NO:20). Other non-limiting examples include the oligomerization domains described by Drickamer (Ibid), including α-helical domains of mannose-binding protein (MBP), which domains form trimeric coiled coils; the β strand from the N terminus of the MBP CRD, optionally with the C-terminal β strand of the CRD and the C-terminal end of helix α2, which dimerize MBP when the α-helical coiled coil domain is absent; the N-terminal β strands of the Polyandrocarpa lectin, optionally with helix α2; and loops from factors IX and X which permit the formation of a “head to head” interaction between two CTLDs with optional stabilization by an interchain disulfide bond. Of course the resultant multimers may be homomultimers, composed of scaffolds with the same binding activity, or heteromultimers, composed of scaffolds with more than one binding activity. Thus the invention provides for homodimers, heterodimers, homotrimers, heterotrimers, as well as higher orders of homomeric and heteromeric proteins. Further non-limiting examples include the transmembrane and domains D0, D1, and/or D2 of EPEC intimin as well as the four Ig-like domains (D1-D4) of Y. pseudotuberculosis invasin.
- The binding proteins of the invention are thus made up of at least a scaffold containing a binding site as described herein. This combination may be non-naturally occurring in the sense that the binding site may be part of a variable region derived from a first CTL-fold that is inserted into the corresponding region of a second, and different, CTL-fold. Thus, as a non-limiting example, the Mtd based binding site may be inserted in place of the corresponding region between the β3 and β5 strands of another CTL-fold as described herein. The binding proteins of the invention may thus be considered “recombinant”. Additional “recombinant” binding proteins include those comprising a superscaffold attached to the scaffold wherein the superscaffold is not derived from the same protein as the scaffold. The polypeptide sequence of the superscaffold is preferably that attached to a CTL-fold containing protein described herein. Further “recombinant” binding proteins include the multimeric forms of a superscaffold containing binding protein wherein the subunits of the multimeric form may be the same (to result in a homomultimer) or different (to result in a heteromultimer).
- Preferably, a scaffold or binding protein of the invention is not an isolated form of a naturally occurring polypeptide, where isolated refers to a state of being substantially removed from, preferably entirely removed from, other polypeptides or biomolecules that are normally found with a naturally occurring polypeptide. A naturally occurring polypeptide is one produced by a living organism in the absence of manipulation or modification by human intervention. Non-limiting examples of human intervention include recombinant DNA methodology, mutagenesis by chemical or physical means, inhibition of DNA repair, or manipulation of genetics. Stated differently, the binding proteins of the invention are preferably recombinant proteins or otherwise the result of human intervention. Thus a scaffold or binding protein produced by the recombinant methods described herein, is not a naturally occurring polypeptide.
- The term “recombinant” refers to the alteration of a native nucleic acid or protein, or its modification by the introduction of a heterologous nucleic acid or protein, via human intervention. The term may also refer to a cell derived from a cell so modified. As a non-limiting example, recombinant cells express genes that are not found within the native (nonrecombinant) form of the cell or express native genes in an unnaturally overexpressed, under-expressed, or not expressed state.
- Preferred embodiments of the invention thus do not include naturally occurring Mtd proteins, such as those with SEQ ID NO:20 (Mtd-P1 or Bordetella phage BPP-1) or variations thereof having the amino acid sequences of Mtd-P3c, Mtd-M1, Mtd-I1, or Mtd-U1. Naturally occurring selectins; MBPs; NKG2D; CD69; EMBP; TSG-6; and intimin as well as naturally occurring sequences of CTL-fold containing proteins from Vibrio harveyi ML phage (V.h. ML); Bifidobacterium longum (B.l.); Bacteroides thetaiotaomicron (B.t.); Treponema denticola (T.d.);
Trichodesmium erythraeum 1A (T.e. 1A); Trichodesmium erythraeum 1B (T.e. 1B); Trichodesmium erythraeum #2 (T.e. 2); Nostoc sp. PCC 7120 #1 (N. PCC. 1); Nostoc sp. PCC 7120 #2A (N. PCC. 2A); Nostoc sp. PCC 7120 #2B (N. PCC. 2B); Nostoc punctiforme #1 (N.p. 1); and Nostoc punctiforme #2 (N.p. 2) having the corresponding sequences shown in FIG. 5 are also preferably not part of the present invention. These proteins are, however, disclosed as providing variable regions between the β3 and β5 strands of the CTL-fold contained therein for use in the presentation of a binding site as described herein. These proteins are also disclosed as providing CTL-folds for use with the binding sites and variable regions as described herein. - The invention also provides polynucleotides encoding the scaffolds and binding proteins described herein. The polynucleotides are preferably operably linked to a regulatory nucleic acid sequence that controls or regulates the expression of the coding polynucleotide in a cell or cell extract. A regulatory sequence refers to regions or sequences located upstream and/or downstream from the start of transcription that are involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. The term includes a promoter for regulating start of transcription.
- The polynucleotide may be part of a vector or plasmid used to propagate or amplify the polynucleotide. Where the polynucleotide is operably linked to a regulatory nucleic acid sequence, presence in a vector or plasmid permits the expression of the encoded scaffold or binding protein. This permits production and isolation of large quantities of a scaffold or binding protein of the invention.
- Alternatively, the polynucleotide and regulatory sequence are operably linked to other sequences to form a diversity-generating retroelement (DGR) as described herein such that the variable residues of the binding site in the scaffold or binding protein may be readily diversified via a DGR. While embodiments of the invention based upon the nucleic acids encoding the sequences shown in
FIG. 5 are readily used to diversify the binding sites contained therein, this aspect of the invention is advantageously applied to other CTL-folds and the binding sites contained therein where the region between the β3 and β5 strands is not a variable region until operably linked to a TR (and an IMH if necessary), as well as any other necessary components in cis or in trans, like reverse transcriptase activity as a non-limiting example, wherein the TR directs alterations of amino acid residues of the binding site, and thus variable region, as described herein. Of course, this means of creating alterations in the binding site is limited by adenine-directed mutagenesis as described herein. But the invention also contemplates the use of traditional mutagenesis techniques for altering the binding specificity of the region between the β3 and β5 strands of a CTL-fold as described herein. - The polynucleotide, preferably as part of a DGR, may also be part of a phage or bacterial genome and expressed on the surface of phage or bacteria. DGR as used herein includes the use of mutagenic homing wherein an IMH directs mutagenesis of variable residues within the variable region (VR) of a scaffold or binding protein of the invention through a functionally linked TR, which directs alterations of nucleotide residues in the VR based on the locations of adenine residues at corresponding positions in the related TR sequence, as well as any other necessary components in cis or in trans, like reverse transcriptase activity as a non-limiting example. Use of a DGR advantageously permits use of the phage or bacteria to form a library expressing a heterogeneous population of encoded scaffolds or binding proteins on the surfaces of individual organisms. The use of “population” refers to a plurality of heterogeneous members which have similarities but at least two of which have different binding sites as described herein.
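The unidirectional TR-to-VR flow described above, in which VR nucleotides change only at positions corresponding to adenines in the TR, may be pictured with the following minimal sketch. It is a simplification under stated assumptions and not a model of the actual retroelement machinery: the TR fragment shown is hypothetical and is not a sequence from FIG. 5, each adenine is assumed to be replaced uniformly at random by one of the four bases, and `home_tr_to_vr` and `adenine_positions` are illustrative names only.

```python
import random

BASES = "ACGT"


def home_tr_to_vr(tr, rng):
    """Return one VR variant: the TR copied base by base, adenines randomized."""
    return "".join(rng.choice(BASES) if base == "A" else base for base in tr)


def adenine_positions(tr):
    """Indices in the TR at which the corresponding VR nucleotide may vary."""
    return [i for i, base in enumerate(tr) if base == "A"]


# Hypothetical TR fragment (AAC, ACG, ATC, AAC); not taken from the patent.
TR = "AACACGATCAAC"
rng = random.Random(0)  # fixed seed so that the sketch is repeatable

# A small library of VR variants; the TR itself is never modified.
library = {home_tr_to_vr(TR, rng) for _ in range(1000)}
upper_bound = 4 ** len(adenine_positions(TR))
print(len(library), "distinct VR sequences out of at most", upper_bound)
```

Because only adenine positions vary, the nucleotide-level diversity of such a library is bounded by four raised to the number of adenines in the TR, and an adenine-free stretch, such as the IMH-encoded region, stays invariant.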
- A diversified population of phage may be used in a method to identify a scaffold or binding protein as binding to a target molecule of interest. Non-limiting examples of such target molecules include a cell surface molecule, optionally of a cancer cell, an epithelial cell, an endothelial cell, and a bacterial or fungal cell surface molecule. In some embodiments of the invention, the scaffold or binding protein is expressed as part of the tail fiber in a bacteriophage particle.
- Such a method may comprise expressing a population of scaffolds or binding proteins on the surfaces of members of a library of phage particles (including as part of the tail fibers), of bacteria or of other cells; contacting the members of the library with a target molecule of interest, optionally immobilized; removing members that do not bind to the target; and selecting the library member(s) that bind the target molecule of interest. Alternatively, the selected members can be propagated to form another library of members for an additional round of screening or selection using the above method. This permits the enrichment of library member(s) that bind the target of interest and also provides a means to verify the selected member(s) as binding the target. In some embodiments of the invention, the method further comprises isolating polynucleotides from the selected member(s). The phage library members are one form of a plurality, or family, of scaffolds or binding proteins of the invention.
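The enrichment obtained by repeating the above contact, removal, selection, and propagation steps can be illustrated with a toy calculation. This is a sketch under assumed values and is not data from the disclosure: the starting binder fraction, the per-round recovery of binders, and the per-round carry-over of non-binders are all hypothetical parameters, propagation is assumed to amplify all retained members equally, and `enrich` is an illustrative name.

```python
def enrich(binder_fraction, keep_binder, keep_background, rounds):
    """Fraction of binders in the library after each selection round (toy model)."""
    history = [binder_fraction]
    for _ in range(rounds):
        kept_binders = binder_fraction * keep_binder
        kept_background = (1.0 - binder_fraction) * keep_background
        total = kept_binders + kept_background
        if total == 0:
            raise ValueError("no library members survived the round")
        # Propagation is assumed to amplify all retained members equally.
        binder_fraction = kept_binders / total
        history.append(binder_fraction)
    return history


# Hypothetical inputs: 1 binder per million members, 50% binder recovery,
# and 0.1% carry-over of non-binding members per round.
history = enrich(1e-6, 0.5, 0.001, 3)
for round_number, fraction in enumerate(history):
    print(round_number, f"{fraction:.3g}")
```

In this simplified picture, each round multiplies the ratio of binders to non-binders by the ratio of the two retention values, which is why a small number of rounds may suffice when carry-over of non-binders is low.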
- A selected or identified scaffold or binding protein may also be “evolved” by a variation of the above to select for enhanced binding to the same ligand or binding to a different ligand. One method for evolving a previously identified or selected scaffold or binding protein is to provide a polynucleotide encoding the scaffold or binding protein, allow it to undergo diversification as described herein to produce a library of variants; and select for a member of the library with enhanced binding to the same target molecule or with “gain of function” binding to another target molecule.
- Of course chemically or genetically known target molecules or unknown target molecules may be used to select or identify a scaffold or binding protein of the invention. Prior information regarding a target molecule's structure is not required to isolate a scaffold or binding protein that binds it. Preferably, the scaffold or binding protein will display specific binding affinity for a particular target, optionally with the functionality of blocking the binding of one or more other molecules to the target molecule. In the case of a cell surface ligand, the scaffold or binding protein may also be able to stimulate or inhibit a metabolic pathway, to act as a signal or messenger, or to stimulate or inhibit cellular activity. A scaffold or binding protein can thus be used as an antagonist, an agonist, as well as a modulator of a cell surface ligand function. A scaffold or binding protein for an “orphan” receptor to which no natural ligand is known may also be generated.
- Unless otherwise defined herein, the use of “specifically binds” or “selectively binds” with respect to a scaffold or binding protein herein refers to binding interactions between the scaffold or binding protein and a first molecular entity that occur to the exclusion of interactions with a second molecular entity present with the first in a heterogeneous population of molecules or other biological materials. Generally, a scaffold or binding protein of the invention binds to a target molecule better by at least about 2×, more preferably about 5× or about 10×, than binding to background molecules that are present or used as non-specific control targets.
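The fold-over-background criterion above can be expressed as a simple ratio check. The sketch below is illustrative only: it assumes that binding to the target and to a non-specific control are measured as positive signals on the same scale, it treats the "about 2×", "about 5×", and "about 10×" levels as exact cut-offs for simplicity, and the function names are hypothetical.

```python
def fold_over_background(target_signal, background_signal):
    """Ratio of the target signal to the non-specific control signal."""
    if background_signal <= 0:
        raise ValueError("background signal must be positive")
    return target_signal / background_signal


def specificity_tier(fold):
    """Highest threshold named in the text (10, 5, or 2) met by the fold value."""
    for threshold in (10, 5, 2):
        if fold >= threshold:
            return threshold
    return None  # below the minimum of about 2x over background


# Hypothetical readings in arbitrary units: target versus control wells.
fold = fold_over_background(1200, 100)
print(fold, specificity_tier(fold))
```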
- The scaffolds and binding proteins of the invention may also be modified, such as by attachment of another moiety thereto. In some embodiments of the invention, the moiety may be a label, optionally a detectable label, including a directly detectable label such as a radioactive isotope, a fluorescent label (Cy3 and Cy5 as non-limiting examples) or a particulate label. Non-limiting examples of particulate labels include latex particles and colloidal gold particles. Alternatively, the label may be for indirect detection. Non-limiting examples include an enzyme, such as, but not limited to, luciferase, alkaline phosphatase, and horseradish peroxidase. Other non-limiting examples include a molecule bound by another molecule, such as, but not limited to, biotin, the Fc portion of an antibody, an affinity peptide, or a purification tag. Preferably, the label is covalently attached. The scaffold or binding protein may also be selected to bind antibodies from specific animals, e.g., goat, rabbit, mouse, etc., for use as a secondary reagent in assays using such antibodies as the primary detection agent.
- Alternatively, a scaffold or binding protein of the invention may be detected directly by use of a reagent that binds thereto. Non-limiting examples include an antibody, or functional fragment thereof, that binds a portion of the scaffold without interference of the binding site or that binds a portion of the superscaffold without interfering with the binding site. Such an antibody or fragment thereof is preferably labeled for detection as described herein and as known in the art. Alternatively, a ligand for a portion of the scaffold or the superscaffold, which binds to a region distinct from, and without interference to, the binding site may be used. The ligand is also preferably labeled for detection as provided herein and known in the art.
- Detection of a scaffold or binding protein of the invention may be advantageously used to detect the presence of a target molecule bound by the scaffold or binding protein. Such detection may also be used to detect the presence of a cell that expresses the ligand or molecule. Non-limiting detection assays to which the invention may be adapted include flow cytometry and fluorescence microscopy.
- As an alternative non-limiting example, a labeled scaffold or binding protein of the invention which specifically binds human chorionic gonadotropin (hCG), to the exclusion of other factors that are normally found therewith, may be used to detect hCG in human urine samples as an indicator of pregnancy, such as by use of a lateral flow device as known in the art. Alternatively, a labeled scaffold or binding protein of the invention may be used to detect a microorganism, such as pathogenic bacteria or fungi by binding to a cell surface molecule specific to the microorganism of interest, relative to other organisms normally found therewith.
- Thus the invention also provides a method of detecting a cell, the method comprising contacting the cell with a scaffold or binding protein of the invention which binds a cell surface molecule specific to the cell and subsequently detecting the bound scaffold or binding protein. Preferably, the cell is a bacterial or fungal cell, particularly pathogenic forms thereof. Alternatively, the cell may be associated with a disease or other unwanted condition, including, but not limited to, a cancer cell or a virally infected cell.
- Therefore, the invention provides for the use of a scaffold or binding protein as disclosed herein as a diagnostic agent, either in vitro or in vivo, based on its ability to bind to a tissue or disease associated target molecule. Tissue associated molecules are those that are expressed exclusively, or at a significantly higher level, in one or more tissue(s) compared to other tissues in an animal. Disease associated molecules are those that are expressed exclusively, or at a significantly higher level, in one or more diseased cells, diseased tissues, or bodily fluid in comparison to non-diseased cells, tissues, or fluids in an organism.
- Non-limiting tissue or disease associated molecules are discussed in Tables I and II of U.S. Patent Publication No. 2002/0107215. Non-limiting examples of tissues where target ligands bound by the scaffolds and binding proteins of the invention may be found include liver, pancreas, adrenal gland, thyroid, salivary gland, pituitary gland, brain, spinal cord, lung, heart, breast, skeletal muscle, bone marrow, thymus, spleen, lymph node, colorectal, stomach, ovarian, small intestine, uterus, placenta, prostate, testis, colon, gastric, bladder, trachea, kidney, and adipose tissue. Other non-limiting examples include tumor cells, tumor tissue sample, organ cells, blood cells, and cells of the skin, lung, heart, muscle, brain, mucosae, liver, intestine, spleen, stomach, lymphatic system, cervix, vagina, prostate, mouth, and tongue.
- Non-limiting examples of diseases include, but are not limited to, an autoimmune/inflammatory disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia; cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; a 
neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia; a developmental disorder such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis 
syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing loss. Exemplary diseases or conditions include, e.g., MS, SLE, ITP, IDDM, MG, CLL, CD, RA, Factor VIII Hemophilia, transplantation, arteriosclerosis, Sjogren's Syndrome, Kawasaki Disease, AHA, ulcerative colitis, multiple myeloma, Glomerulonephritis, seasonal allergies, and IgA Nephropathy; and a cardiovascular disorder such as congestive heart failure, ischemic heart disease, angina pectoris, myocardial infarction, hypertensive heart disease, degenerative valvular heart disease, calcific aortic valve stenosis, congenitally bicuspid aortic valve, mitral annular calcification, mitral valve prolapse, rheumatic fever and rheumatic heart disease, infective endocarditis, nonbacterial thrombotic endocarditis, endocarditis of systemic lupus erythematosus, carcinoid heart disease, cardiomyopathy, myocarditis, pericarditis, neoplastic heart disease, congenital heart disease, complications of cardiac transplantation, arteriovenous fistula, atherosclerosis, hypertension, vasculitis, Raynaud's disease, aneurysms, arterial dissections, varicose veins, thrombophlebitis and phlebothrombosis, vascular tumors, and complications of thrombolysis, balloon angioplasty, vascular replacement, and coronary artery bypass graft surgery.
- In other embodiments of the invention, a scaffold or binding protein is conjugated, optionally through a linker, to a toxin, pro-drug, or other molecule (e.g., a protein, nucleic acid, organic small molecule, etc.) suitable for use as a pharmaceutical or therapeutic agent. Non-limiting examples of proteins include cytokines, chemokines, growth factors, interleukins, cell-surface proteins, extracellular domains, cell surface receptors, and cytotoxins. The conjugated scaffold or binding protein delivers the attached molecule to a location bound by the binding site of the scaffold or binding protein. Such forms of the invention may be used in a method of decreasing the viability of a cell, preferably a disease associated cell, such as a cancer cell or virally infected cell. Stated differently, the invention provides a method of targeting a cell expressing a cell surface molecule by use of a scaffold or binding protein of the invention. Such a method comprises contacting said cell with a scaffold or binding protein of the invention which binds said cell surface molecule.
- In the case of a cancer cell, such as those of the cancers listed above, the scaffold or binding protein is one which preferably binds an external cell surface molecule of the cell with sufficient specificity to minimize undesirable binding to non-cancer cells. Similarly, in the case of a virally infected cell, the scaffold or binding protein is one which preferably binds a viral antigen expressed on the external cell surface of an infected cell with sufficient specificity to minimize undesirable binding to non-infected cells.
- Thus the invention also provides a method of decreasing the viability of a cell, said method comprising covalently linking a cellular toxin or pro-drug to a scaffold or binding protein of the invention and contacting the linked scaffold or binding protein with a cell comprising a cell surface molecule bound by the scaffold or binding protein to decrease the viability of the cell. Preferably, the cell is a cancer cell, expressing a cell surface marker specific to the cancer cell as described above. Alternatively, the cell is a virally infected cell, expressing a viral antigen, on the cell surface, that is specific to virally infected cells as described above.
- Alternatively, the invention provides for the selection of a scaffold or binding protein which binds a cell surface molecule such that the binding of one or multiple scaffolds or binding proteins to the cell through the molecule triggers, or is sufficient to activate, a cell death program in the bound cell. A non-limiting example of such a scaffold or binding protein is one that is analogous to Fas ligand or an antibody against Fas which triggers apoptosis of a cell upon binding to Fas expressed on the cell.
- Therefore, the invention provides for the use of a scaffold or binding protein as disclosed herein as a therapeutic agent for use in the treatment of disease or other unwanted conditions. Alternatively, a scaffold or binding protein may be used in the prophylactic treatment of a disease or unwanted condition. The treatments of the invention include both in vivo and ex vivo administration. Preferably, the scaffold or binding protein is formulated as a composition comprising a pharmaceutically acceptable excipient, optionally for delayed release (or slow release over time). Sterile formulations of a scaffold or binding protein are also contemplated.
- With respect to in vivo embodiments, a scaffold or binding protein is typically administered or transferred directly to the cells to be treated or to the tissue site of interest via intramuscular, intradermal, subdermal, subcutaneous, oral, intraperitoneal, intrathecal, or intravenous procedures. Alternatively, a scaffold or binding protein can be placed within a cavity of the body, such as during surgery, or by inhalation, or vaginal or rectal administration. With respect to ex vivo embodiments, the contacted cells are returned or delivered to the site from which they were obtained or to another site in the subject to be treated. The subject need not be that from which the cells were obtained. The treated cells may be optionally grafted onto a tissue or organ before being returned or alternatively delivered to the blood or lymph system using standard delivery or transfusion techniques.
- Subjects that may be treated with a scaffold or binding protein of the invention include, but are not limited to, a mammal, including a human, primate, dog, cat, mouse, pig, cow, goat, rabbit, rat, guinea pig, hamster, horse, sheep; or a non-mammalian vertebrate such as a bird (e.g., a chicken or duck), or fish; or an invertebrate.
- The invention also provides for compositions comprising a scaffold or binding protein disclosed herein. Non-limiting examples include attachment of a scaffold or binding protein to a surface, such as that of a tube, well, or dish; attachment to a matrix of an affinity material; or attachment to beads, a column, a solid support, or a microarray.
- The compositions and methods of the present invention are ideally suited for preparation of kits produced in accordance with well known procedures. The invention thus provides kits comprising agents (like a scaffold or binding protein, or a library of scaffolds or binding proteins, described herein as non-limiting examples) for use in one or more methods as disclosed herein. Such kits, optionally comprising an agent with an identifying description or label or instructions relating to their use in the methods of the present invention, are provided. Such a kit may comprise containers, each with one or more of the various reagents (typically in concentrated form) or devices utilized in the methods. A set of instructions will also typically be included. Standards for calibrating the binding of a scaffold or binding protein to a ligand may also be included in the kits of the invention.
- Alternatively, a kit of the invention may comprise one or more reagents for production of a library of scaffolds or binding proteins, such as that embodied in phage particles which express individual members of the library. Such kits may contain vectors, such as initial phage particles, and cells for their propagation and plating as well as expression of scaffolds or binding proteins.
- Having now generally described the invention, the same will be more readily understood through reference to the following examples which are provided by way of illustration, and are not intended to be limiting of the present invention, unless specified.
- The following examples are offered to illustrate, but not to limit the claimed invention.
- Structural comparisons of Mtd-P1, -3c, -M1, -I1, and -N1 were used to discover that the main chain conformation of the CTL domain is remarkably invariant, despite half of the variable residues being on loop regions (FIG. 3C). The binding site in these variants is well ordered, having average main chain B-factors ranging from ~9 Å² in Mtd-P1 to ~24 Å² in Mtd-M1 and with density visible for all but one side chain (Phe-346 in Mtd-I1). Providing stabilization to these loops in Mtd are two features unique to the Mtd CTL-fold, namely the two inserts and trimeric assembly.
- The inserts form hydrogen bonds to VR, including three to side chains of three invariant serines in VR. Ser-270 and Glu-267 from the second insert form hydrogen bonds to the invariant VR residues Ser-351 and Ser-353, respectively (FIG. 3D), and main chain atoms of the first insert form hydrogen bonds to invariant VR residue Ser-365 (not depicted). These interactions are supplemented by hydrogen bonds between the inserts and main chain (scaffold) atoms of the VR. Likewise, trimeric assembly contributes to stabilizing VR, specifically through contacts from a neighboring monomer's extensive β2β3 loop. The β2β3 loop from one monomer contributes not only the aforementioned invariant tyrosines (322 and 333) to a neighbor's binding site (FIG. 3B), but also hydrogen bonds to the invariant VR residue Arg-354 and to main chain (scaffold) atoms of VR (FIG. 3E). The β2β3 loop has the same intertwining conformation in all Mtd variants examined, being positioned over invariant residues (i.e., 351-356) in a neighbor's binding site.
- The binding sites of the five Mtd variants studied differ greatly in their pattern of hydrophobicities. FIG. 4A shows that Mtd-P1 and Mtd-I1 have highly hydrophobic binding sites, and that the continuity of the hydrophobic surface decreases successively for Mtd-3c, -M1, and -N1, with this last one having nine TR-encoded, mostly hydrophilic residues (FIG. 1). The binding sites of Mtd-P1 and -I1 accommodate four to five large, exposed hydrophobic residues, and although a preponderance of exposed hydrophobic surface is correlated with protein instability, both Mtd-P1 and -I1 are found to be highly stable proteins. The invariant area surrounding the binding site is largely hydrophilic, most likely aiding protein stability.
- To understand the basis of Mtd interactions with its ligand, a cell surface receptor, we characterized association between Mtd-P1 and the Bordetella receptor pertactin. The pertactin ectodomain (Prn-E) was incubated with Mtd variants and found by a coprecipitation assay to associate most strongly with Mtd-P1 but also with Mtd-3c and Mtd-M1. As a measure of specificity, Prn-E was not found to associate with Mtd-I1 or Mtd-N1. The three Mtd variants that are found to bind pertactin have in common the variable residue Tyr-359, previously shown by sequence comparison to be a consistent determinant for pertactin interaction. The presence of a tyrosine residue in the binding pocket is consistent with the presence of a number of hydrophobic surface-exposed patches on Prn-E (see Emsley, P., et al. Structure of Bordetella pertussis virulence factor P.69 pertactin. Nature 381, 90-2 (1996)). The maintenance of Prn affinity in some of these Mtd variants agrees with the relatively high frequency with which the phage adopts the BPP phenotype.
- Despite each monomer providing a discrete binding site, the stoichiometry of association between Mtd and Prn-E is 3:1, as assessed by static light scattering. This may reflect steric occlusion of empty binding sites by elongated pertactin or pseudo-symmetric binding. The affinity of Mtd for Prn-E has a KD of ~3 μM as measured by isothermal titration calorimetry (ITC). Because Bordetella phage has six tail fibers with each fiber appearing to have two Mtd trimers, the affinity is likely to translate to high avidity during infection. The ITC experiment also demonstrated that the endothermic interaction between the two molecules is entropically driven, as would be expected from the hydrophobic binding site of Mtd-P1. The affinity of Mtd-M1 for Prn-E is too low to be reliably measured by ITC, but a KD of ≥200 μM is estimated, suggesting that the boundary between a productive and nonproductive interaction lies between 3 and ≥200 μM.
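- As an illustrative aside, the dissociation constants reported above can be restated as standard binding free energies through the textbook relation ΔG° = RT ln(KD). The following minimal sketch assumes a temperature of 298.15 K and a 1 M standard state, neither of which is stated in the text; the function name `binding_free_energy_kj` is hypothetical and used only for illustration.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)

def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kJ/mol) from a dissociation constant.

    Uses dG = R*T*ln(KD) with a 1 M standard state.
    The default temperature is an assumption, not a value from the text.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0

# KD values quoted in the text: ~3 uM (Mtd-P1) and >=200 uM (Mtd-M1, estimated)
dg_p1 = binding_free_energy_kj(3e-6)
dg_m1 = binding_free_energy_kj(200e-6)

# Free-energy gap between the measured and the estimated affinity
gap = dg_m1 - dg_p1

print(f"Mtd-P1: {dg_p1:.1f} kJ/mol")
print(f"Mtd-M1 (upper bound on affinity): {dg_m1:.1f} kJ/mol")
print(f"Gap: {gap:.1f} kJ/mol")
```

Under these assumptions the Mtd-P1 value comes to roughly -31.5 kJ/mol, and the gap to the estimated Mtd-M1 value is roughly 10 kJ/mol. Because ΔG° = ΔH − TΔS, an endothermic interaction (ΔH > 0) with a negative ΔG° requires a TΔS term larger than ΔH, which is what "entropically driven" means here.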
- A number of other putative DGRs have been identified in phage and bacterial genomes. These resemble the Bordetella phage DGR in having sequence-related reverse transcriptases, similar arrangements of VR and TR, adenines constituting the main differences between VR and TR, and IMH-like elements at the end of VR. However, the putative variable proteins have no obvious sequence relationship to Mtd or other proteins. Because there appears to be no genetic requirement for VR and its IMH element to be positioned at the very C-terminus of a protein, the variations in positioning likely reflect the necessities of protein binding requirements as specified by the CTL-fold. Despite the low sequence identity among these proteins (~17%), we have been able to use the structure of Mtd along with considerations about variability to construct a sequence alignment consisting of the β2β3β4β4′ sheet of the CTL-fold (see FIG. 5). Most notably, the invariant Mtd binding site residue Trp-345 is seen to be present in a highly conserved ‘GGXW’ motif. Invariant residues (Ser-351, Ser-353, Arg-354) involved in loop stabilization, trimeric contacts, or both are also generally conserved. As in Mtd, residues differing between VR and TR or ones that could potentially vary through an adenine-directed mechanism in these proteins are located chiefly between the β3 and β4′ strands. These conclusions are bolstered by profile-based sequence alignment, which provides statistical confidence for the putative variable proteins from such diverse organisms as Treponema denticola, Vibrio harveyi ML phage, and the various cyanobacteria being related to Mtd and consequently having a CTL-fold.
- All references cited herein are hereby incorporated by reference in their entireties, whether previously specifically incorporated or not. As used herein, the terms “a”, “an”, and “any” are each intended to include both the singular and plural forms.
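- As an illustrative aside to the discussion of residues that could vary through an adenine-directed mechanism, the following minimal sketch enumerates the amino acids reachable from a single codon when each adenine may be replaced by any base. It assumes the standard genetic code and treats every substitution as equally possible; the example codon AAC and the function names `adenine_variants` and `reachable_residues` are hypothetical choices for illustration and are not taken from the sequences disclosed herein.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def adenine_variants(codon):
    """Return every codon obtained by letting each adenine become any base."""
    options = [BASES if base == "A" else base for base in codon.upper()]
    return {"".join(combo) for combo in product(*options)}

def reachable_residues(codon):
    """Return the set of residues ('*' = stop) encoded by the variant codons."""
    return {CODON_TABLE[variant] for variant in adenine_variants(codon)}

residues = reachable_residues("AAC")
print(sorted(residues))
print("stop codon reachable:", "*" in residues)
```

A codon without adenines, such as TGG, yields only its original residue under this model, which is consistent with adenine-free positions remaining invariant.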
- Having now fully described this invention, it will be appreciated by those skilled in the art that the same can be performed within a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While this invention has been described in connection with specific embodiments thereof, it will be understood that it is capable of further modifications. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth.
Claims (20)
1. A non-naturally occurring protein with binding specificity determined by a variable binding site, said protein comprising
a scaffold comprising the amino acid sequence
wherein each of Xaa1 to Xaa12 is independently any amino acid residue, the side chains of which form a binding site, in whole or in part, that determines the binding specificity of the protein; and
at each of the Xaa1 and Xaa12 ends of the scaffold are polypeptides that form a superscaffold which displays said binding site in a solvent exposed portion of the protein, or
one of the Xaa1 and Xaa12 ends of the scaffold is —H and the other end is a polypeptide that forms a superscaffold which displays said binding site in a solvent exposed portion of the protein.
2. The protein of claim 1 wherein said scaffold polypeptide is derived from a C-type lectin fold (CTL-fold).
3. The protein of claim 2 wherein said CTL-fold is a C-type lectin-like domain (CTLD) or a MTD like domain.
4. The protein of claim 1 wherein said scaffold is in the C-terminal half of the protein.
5. The protein of claim 4 wherein said scaffold is within about 100 amino acid residues or within about 50 amino acid residues of the C-terminus of the protein.
6. The protein of claim 1 wherein said scaffold comprises
7. The protein of claim 7 wherein said scaffold is about 44-45 amino acid residues in length.
8. A nucleic acid molecule encoding the protein of claim 1.
9. The nucleic acid molecule of claim 9 wherein said scaffold is all or part of a variable region (VR) operably linked to an initiation of mutagenic homing (IMH) sequence and a template region (TR).
10. A method of producing a plurality of proteins with different binding specificities, said method comprising
expressing and replicating the nucleic acid molecule of claim 10 in a cell under conditions of mutagenic homing wherein said TR directs mutagenesis of variable residues within said scaffold.
11. A method of selecting a protein with binding specificity for a molecule of interest, said method comprising
producing a plurality of proteins in a plurality of cells by the method of claim 11;
selecting proteins which bind a molecule of interest after individual contact of each of said plurality of proteins with said molecule of interest.
12. The method of claim 12 wherein said molecule of interest is a cell surface molecule.
13. The method of claim 13 wherein said molecule of interest is a cell surface molecule of a cancer or other mammalian cell or a bacterial cell surface molecule.
14. The protein of claim 1, further comprising a label attached to said protein.
15. The protein of claim 16 wherein said label is a covalently attached, directly detectable label.
16. The protein of claim 1, further comprising a cellular toxin or pro-drug attached to said protein.
17. A method of decreasing the viability of a cancer cell, said method comprising
covalently linking a cellular toxin or pro-drug to a protein selected by the method of claim 14; and
contacting said linked protein with a cancer cell comprising a cell surface molecule which binds said protein to decrease the viability of said cell.
18. The protein of claim 19 wherein said cancer cell is in a mammalian or human subject.
19. A method of detecting a bacterial cell, said method comprising
obtaining the protein selected by the method of claim 15;
contacting said protein with a bacterial cell comprising a cell surface molecule which binds said protein; and
detecting said protein on said bacterial cell.
20. A method of targeting a cell expressing a cell surface molecule, said method comprising
contacting said cell with a protein according to claim 1 which binds said cell surface molecule.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/493,802 US20100069300A1 (en) | 2004-12-31 | 2009-06-29 | C-Type Lectin Fold as a Scaffold for Massive Sequence Variation |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/027,323 US7749694B2 (en) | 2004-12-31 | 2004-12-31 | C-type lectin fold as a scaffold for massive sequence variation |
US12/493,802 US20100069300A1 (en) | 2004-12-31 | 2009-06-29 | C-Type Lectin Fold as a Scaffold for Massive Sequence Variation |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/027,323 Continuation US7749694B2 (en) | 2004-12-31 | 2004-12-31 | C-type lectin fold as a scaffold for massive sequence variation |
Publications (1)
Publication Number | Publication Date |
---|---|
US20100069300A1 true US20100069300A1 (en) | 2010-03-18 |
Family
ID=36648033
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/027,323 Expired - Fee Related US7749694B2 (en) | 2004-12-31 | 2004-12-31 | C-type lectin fold as a scaffold for massive sequence variation |
US12/493,802 Abandoned US20100069300A1 (en) | 2004-12-31 | 2009-06-29 | C-Type Lectin Fold as a Scaffold for Massive Sequence Variation |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/027,323 Expired - Fee Related US7749694B2 (en) | 2004-12-31 | 2004-12-31 | C-type lectin fold as a scaffold for massive sequence variation |
Country Status (3)
Country | Link |
---|---|
US (2) | US7749694B2 (en) |
EP (1) | EP1841880A4 (en) |
WO (1) | WO2006073971A2 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7749694B2 (en) * | 2004-12-31 | 2010-07-06 | The Regents Of The University Of California | C-type lectin fold as a scaffold for massive sequence variation |
CA2682444A1 (en) * | 2007-03-30 | 2008-10-09 | National Research Council Of Canada | Phage receptor binding proteins for antibacterial therapy and other novel uses |
JP2013507124A (en) * | 2009-10-09 | 2013-03-04 | アナフォア インコーポレイテッド | Polypeptide that binds IL-23R |
US20110086806A1 (en) * | 2009-10-09 | 2011-04-14 | Anaphore, Inc. | Polypeptides that Bind IL-23R |
US20230074139A1 (en) * | 2021-09-03 | 2023-03-09 | International Business Machines Corporation | Proactive maintenance for smart vehicle |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7585957B2 (en) * | 2004-08-03 | 2009-09-08 | The Regents Of The University Of California | Site specific system for generating diversity protein sequences |
US7749694B2 (en) * | 2004-12-31 | 2010-07-06 | The Regents Of The University Of California | C-type lectin fold as a scaffold for massive sequence variation |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5223409A (en) | 1988-09-02 | 1993-06-29 | Protein Engineering Corp. | Directed evolution of novel binding proteins |
DE19742706B4 (en) | 1997-09-26 | 2013-07-25 | Pieris Proteolab Ag | lipocalin muteins |
US6818418B1 (en) | 1998-12-10 | 2004-11-16 | Compound Therapeutics, Inc. | Protein scaffolds for antibody mimics and other binding proteins |
EP1163339A1 (en) | 1999-04-01 | 2001-12-19 | Innogenetics N.V. | A polypeptide structure for use as a scaffold |
- 2004
  - 2004-12-31 US US11/027,323 patent/US7749694B2/en not_active Expired - Fee Related
- 2005
  - 2005-12-28 EP EP05855715A patent/EP1841880A4/en not_active Withdrawn
  - 2005-12-28 WO PCT/US2005/047201 patent/WO2006073971A2/en active Application Filing
- 2009
  - 2009-06-29 US US12/493,802 patent/US20100069300A1/en not_active Abandoned
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090176654A1 (en) * | 2007-08-10 | 2009-07-09 | Protelix, Inc. | Universal fibronectin type III binding-domain libraries |
US20100152063A1 (en) * | 2007-08-10 | 2010-06-17 | Protelix, Inc. | Universal fibronectin type iii binding-domain libraries |
US20110124527A1 (en) * | 2007-08-10 | 2011-05-26 | Guido Cappuccilli | Universal fibronectin type iii binding-domain libraries |
US8470966B2 (en) | 2007-08-10 | 2013-06-25 | Protelica, Inc. | Universal fibronectin type III binding-domain libraries |
US8680019B2 (en) | 2007-08-10 | 2014-03-25 | Protelica, Inc. | Universal fibronectin Type III binding-domain libraries |
US8697608B2 (en) | 2007-08-10 | 2014-04-15 | Protelica, Inc. | Universal fibronectin type III binding-domain libraries |
US9376483B2 (en) | 2007-08-10 | 2016-06-28 | Protelica, Inc. | Universal fibronectin type III binding-domain libraries |
Also Published As
Publication number | Publication date |
---|---|
US7749694B2 (en) | 2010-07-06 |
EP1841880A2 (en) | 2007-10-10 |
WO2006073971A3 (en) | 2009-03-12 |
EP1841880A4 (en) | 2009-10-21 |
US20070275367A1 (en) | 2007-11-29 |
WO2006073971A2 (en) | 2006-07-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |