WO2003095619A2 - Protein complex purification - Google Patents

Protein complex purification

Info

Publication number
WO2003095619A2
Authority
WO
WIPO (PCT)
Prior art keywords
protein
complex
tag
affinity
bait
Prior art date
Application number
PCT/US2003/014511
Other languages
French (fr)
Other versions
WO2003095619A3 (en
Inventor
John J. Boniface
Vladimir Kery
John M. Peltier
Lawrence Weir
Paul B. Robbins
Manuel M. Rodriguez
Original Assignee
Prolexys Pharmaceuticals, Inc.
Priority date
Filing date
Publication date
Application filed by Prolexys Pharmaceuticals, Inc.
Priority to AU2003228944A (patent AU2003228944A1)
Publication of WO2003095619A2
Publication of WO2003095619A3
Priority to US10/984,958 (patent US7825227B2)

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803 General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6842 Proteomic analysis of subsets of protein mixtures with reduced complexity, e.g. membrane proteins, phosphoproteins, organelle proteins
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K1/00 General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
    • C07K1/14 Extraction; Separation; Purification
    • C07K1/16 Extraction; Separation; Purification by chromatography
    • C07K1/22 Affinity chromatography or related techniques based upon selective absorption processes
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K1/00 General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
    • C07K1/14 Extraction; Separation; Purification
    • C07K1/36 Extraction; Separation; Purification by a combination of two or more processes of different types
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K5/00 Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof
    • C07K5/04 Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof containing only normal peptide links
    • C07K5/10 Tetrapeptides
    • C07K5/1002 Tetrapeptides with the first amino acid being neutral
    • C07K5/1005 Tetrapeptides with the first amino acid being neutral and aliphatic
    • C07K5/101 Tetrapeptides with the first amino acid being neutral and aliphatic the side chain containing 2 to 4 carbon atoms, e.g. Val, Ile, Leu
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K5/00 Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof
    • C07K5/04 Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof containing only normal peptide links
    • C07K5/10 Tetrapeptides
    • C07K5/1021 Tetrapeptides with the first amino acid being acidic
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2500/00 Screening for compounds of potential therapeutic value

Definitions

  • the present invention relates to a method for purifying a protein complex from a cell, a cell lysate, tissue lysate or an organism and identifying interacting proteins contained in such complex.
  • yeast two-hybrid systems have been extensively employed to define interactions existing among proteins. The principles and methods of the yeast two-hybrid system have been described in detail elsewhere [Bartel and Fields (1997) The Yeast Two-Hybrid System. Oxford University Press, New York; Fields and Song (1989) Nature 340:245-246].
  • yeast two hybrid results in the identification of a pair of directly interacting proteins. For this reason, yeast two hybrid is not readily useful for identifying higher order structures in multiprotein complexes comprised of both direct and indirect protein associations.
  • the present invention provides an efficient method for purifying protein complexes from a cell, cell lysate, tissue lysate or organism by employing a combined set of affinity tags attached to a known protein which may be used as a bait to isolate any interacting proteins therewith.
  • the present invention provides efficient methods for purifying protein complexes from a cell, a cell or tissue lysate or an organism by employing a combined set of affinity tags of high affinity, specificity and ease of elution.
  • the proteins, or fragments thereof, purified according to the invention can then be identified and used as new targets for therapeutic intervention, as diagnostic tools, and as the basis for generating new animal and cell models.
  • the invention employs a known protein or a fragment thereof, termed "bait", which is modified to contain at least two affinity tags separated by a sequence of amino acids which is cleavable by a specific protease.
  • the modified bait protein is termed herein a "first binding component".
  • the modified portion of the bait protein can be present as an extension either at the amino terminus or the carboxyl terminus of the bait or at both termini.
  • one or more affinity tags can be inserted in the ORF of the bait.
  • the affinity tags in the first binding component serve as specific means for purifying a desired protein complex bound to the first binding component before and after a protease digestion.
  • the inclusion of the protease specific segment between the affinity tags can provide an enhanced specificity of the second affinity purification step since the protease cleavage can generate a tag sequence that is more selective for the second affinity ligand.
  • the design of the second affinity tag segment of the first binding component can be such that two or more tags are assembled in tandem possibly separated by specific protease sites different from that between the first affinity recognition sequence and the second affinity tag segment. This design allows significant flexibility in purification strategies with advantages that will be evident below.
  • the first binding component is a bait protein containing a biotinylation recognition sequence and a peptide affinity tag sequence of a hexapeptide comprised of six histidines (6HIS) separated by a protease cleavage sequence for TEV (Tobacco Etch Virus) protease.
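The layout of this embodiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text above names the three segments but gives no sequences, so the AviTag peptide, the TEV site `ENLYFQG` (cut between Q and G), the N-terminal placement of the tail, and the bait fragment are all assumed or hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed illustrative sequences (not given in the text above).
AVITAG = "GLNDIFEAQKIEWHE"   # commonly cited BirA biotinylation recognition peptide
TEV_SITE = "ENLYFQG"         # TEV protease cuts between Q and G
HIS6 = "HHHHHH"              # 6HIS peptide affinity tag


@dataclass
class FirstBindingComponent:
    """Tag tail (biotinylation site, TEV site, 6HIS) fused N-terminally to a bait."""
    bait: str

    def sequence(self) -> str:
        return AVITAG + TEV_SITE + HIS6 + self.bait

    def tev_cleave(self) -> Tuple[str, str]:
        """Return (fragment left on the avidin matrix, fragment released)."""
        seq = self.sequence()
        cut = seq.index(TEV_SITE) + len(TEV_SITE) - 1  # cut just before the final G
        return seq[:cut], seq[cut:]


component = FirstBindingComponent(bait="MEAIAKYDF")  # hypothetical bait fragment
on_beads, released = component.tev_cleave()
print(on_beads)   # biotinylated segment, stays on the avidin-like matrix
print(released)   # 6HIS-tagged bait, available for the second affinity step
```

In this sketch the biotinylated segment stays behind on the first matrix after cleavage, while the 6HIS tag travels with the bait, which is what makes a second, independent purification step possible.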
  • Any protein or a fragment thereof can serve as the bait.
  • the invention does not require any knowledge of the function of the bait protein and can thus serve as a general purification strategy for purifying any protein complex.
  • a method for purifying a protein complex from a cell, a cell or tissue lysate, or an organism containing the protein complex comprising the steps of: a) providing a first binding component comprising four parts: 1) a peptide segment having an affinity modifiable segment, 2) a protease specificity segment, 3) a peptide affinity tag, and 4) a bait; b) contacting the first binding component with the protein complex, whereby part 4) of the first binding component binds to said protein complex, thereby forming a bait-bound protein complex; c) modifying part 1) of the first binding component by attaching an affinity ligand thereto, thereby forming an affinity-tagged bait-bound protein complex; d) contacting the complex formed in step c) with a first affinity matrix specific for said affinity ligand thereby binding said complex to said matrix, and separating the complex from unbound material; e) contacting the complex formed in step c) with a protease that specifically cleaves part
  • the affinity tags useful in the invention include, but are not limited to, an amino acid recognition sequence of a biotin ligase, 6HIS hexapeptide, Strep-tag II™ (Sigma GenoSys), calmodulin binding peptide (CBP), any number of available epitopes with appropriate immunochemical reagents available (for example, hemagglutinin (HA), FLAG, MYC), as well as polypeptides to which immunological or nonimmunological (e.g. aptamers) reagents can be prepared.
  • the amino acid sequence of a biotinylation recognition sequence is particularly preferred since the affinity of biotin for streptavidin is one of the strongest noncovalent interactions known allowing for highly specific binding of a biotin-tagged protein to an affinity matrix of immobilized streptavidin or suitable avidin-like molecule.
  • protease specificity sequences useful for the invention include, but are not limited to, those of TEV, PreScission protease™ (Amersham Biosciences, Piscataway, NJ), enterokinase, thrombin, Factor Xa, and furins. Any amino acid sequence providing a recognition sequence for a protease having a high specificity is suitable for the invention. High specificity, as used herein, means that the protease cleaves only at the recognition sequence and that the recognition sequence is typically comprised of several amino acids or alternatively is specific to a unique secondary or tertiary structure formed by the recognition sequence.
  • examples of the bait protein include, but are not limited to, proliferating cell nuclear antigen (PCNA), histone deacetylase 1 (HDAC1), cyclin-dependent kinase inhibitor 1b (CDKN1b), N-ethylmaleimide-sensitive factor attachment protein (NAPA), cyclin-dependent kinase 5 (CDK5), and growth factor receptor-bound protein 2 (GRB2).
  • the first binding component i.e. , modified bait protein
  • the first binding component is prepared in vivo in a cell, or organism or in vitro or a combination of both. If the modified bait protein is expressed in a cell from which a desired protein complex is to be purified, those proteins and other cellular components with which it normally binds are purified together with the bait protein under conditions that do not disrupt the binding interactions between those proteins. Alternatively, the modified bait protein can be expressed in a transgenic animal and tissues or whole organisms can be used in the invention to purify protein complexes.
  • the first binding component can also be prepared recombinantly in bacteria or another host using methods well- known in the art. In this instance, the purified first binding component is mixed with a cell or tissue lysate of choice to form a protein complex in vitro, which is then purified according to the steps described herein.
  • the individual proteins that comprise the complexes purified according to the invention are identified by a variety of mass spectrometric methods which include an associated set of separation methods. The identification of the interacting proteins will provide new targets for the identification of useful pharmaceuticals and diagnostic tools.
  • Fig. 1 shows that the bait proteins in AviTag E24 (A) and E25 (B) vectors are expressed upon induction in E. coli. Equal amounts of cells uninduced (U) and induced (I) with arabinose were resuspended in the SDS-Laemmli sample buffer, the lysates were separated by SDS-PAGE and the gels were stained with Coomassie stain. The arrows indicate the location of the fusion proteins.
  • Fig. 2 shows the results of purification of E24 and E25 fusion protein constructs using Ni²⁺ affinity chromatography via the 6HIS affinity tag. Each sample was loaded on the gel in the amount of 5 µg of total protein. PS is the Mark12 protein standard (Invitrogen). Numbers at or above bands indicate the predicted molecular weight (kDa) for each protein.
  • Fig. 3 confirms the biotinylation of the E24 and E25 tagged proteins and their ability to bind to neutravidin beads.
  • Western blots are shown of samples [tags only (A), GRB2 (B), NAPA (C), CDKN1b (D)] taken before biotinylation, N, after biotinylation, B, and the supernatant, S, following the binding to UltraLink neutravidin beads (Pierce). PS indicates the Mark12 protein standards (Invitrogen). Aliquots of approximately 20 ng of each purified protein sample were loaded on the gel. The Western blots were developed using neutravidin-horseradish peroxidase complex and TMB (tetramethyl benzidine, Sigma) substrate.
  • Also shown in (E) is a gel-shift assay for the CDKN1b-E25 fusion protein. Approximately 2 µg of expressed protein, before biotinylation, N, or after biotinylation, B, were incubated in the presence or absence of 4 µg of neutravidin protein, NA, and then analyzed by SDS-PAGE under non-denaturing conditions (nonreducing, unboiled).
  • Fig. 4 demonstrates the efficiency of TEV protease in cleaving the E24 (A) and E25 (B) fusion proteins bound to neutravidin beads.
  • the starting uncleaved fusion proteins (U) and the total available uncleaved fusion protein bound to beads (T) are shown as controls. Gray and black arrows indicate the cleaved fusion protein and TEV protease, respectively.
  • TEV was used at two different concentrations (1X, 10X) as detailed in the Examples Section. Minus and plus signs refer to the absence or presence of cell lysate and thus bound interacting proteins.
  • Fig. 5 shows the results of SDS-PAGE analysis of the isolated protein complexes following two-step affinity purification. Shown are proteins either eluted from the second affinity matrix (Ni²⁺ beads) by a sarkosyl detergent or remaining on the beads following elution (Ni²⁺ beads). The "C" label indicates contamination, due to bead carry-over, of TEV protease, which also contains a 6HIS tag.

DETAILED DESCRIPTION OF THE INVENTION

  • protein complex designates a cluster of macromolecules comprising at least one protein wherein the cluster is stabilized by non-covalent bonds.
  • a protein complex can be comprised entirely of proteins or peptides, or it can include carbohydrates, lipids, glycolipids, nucleic acids, oligonucleotides, nucleoproteins, nucleosides, nucleoside phosphates, enzyme co-factors, porphyrins, metal ions and the like, or any biomolecule.
  • "bait" or "bait protein" are used synonymously herein and denote a peptide for which the nucleotide coding sequence is known in the art or is obtainable by employing methods well known in the art.
  • "first binding component" is typically, but not necessarily, a protein, and is used synonymously with "modified bait" herein.
  • the first binding component possesses two properties significant for the invention: (a) it can bind to another protein or protein complex in a cell, in vitro, or an intact organism, cell lysate or tissue lysate, and (b) it can bind to an affinity reagent or can be modified by attachment of an affinity ligand.
  • Property (a) is an inherent biological property of the first binding component.
  • Property (b) is conferred by adding to the native structure of the bait a peptide tail having three functional segments.
  • the first segment is herein termed as "affinity modifiable segment".
  • the affinity modifiable segment has an amino acid sequence providing specificity for covalent attachment of an affinity ligand.
  • Exemplified herein is an amino acid sequence of a biotinylation recognition sequence, which can be recognized by a biotin ligase (endogenous or added exogenously) to covalently attach a biotin molecule to the peptide.
  • the biotin moiety serves as an affinity ligand that specifically binds any avidin-like reagent with a high affinity for biotin, for example streptavidin or neutravidin, immobilized in a matrix usually in the form of a chromatography support or bead.
  • any amino acid sequence providing recognition for attachment of an affinity ligand can be employed.
  • AviTag serves as the biotinylation recognition sequence, and the E. coli BirA gene product as the biotin ligase.
  • protease specificity sequence contains an amino acid sequence providing a recognition site for a protease that specifically cleaves at or near the recognition sequence.
  • Many specific proteases are known, together with their recognition sequences, including, but not limited to, TEV protease, PreScission protease™ (Amersham Biosciences, Piscataway, NJ), enterokinase, clotting factors such as Factor Xa, furins, purins, and the like. It is preferred to employ the recognition sequence of a protease that does not cleave a peptide bond within the protein complex itself.
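One way to act on this preference is to scan a candidate bait sequence for the recognition motifs of the proteases under consideration before choosing one. The sketch below is a hedged illustration: the motif patterns are simplified, commonly cited consensus sites that do not come from the text above, the bait sequence is hypothetical, and a primary-sequence scan cannot rule out cleavage at non-consensus sites or in other members of the complex.

```python
import re

# Simplified consensus motifs; assumptions for illustration only.
PROTEASE_MOTIFS = {
    "TEV": r"ENLYFQ[GS]",
    "enterokinase": r"DDDDK",
    "Factor Xa": r"I[ED]GR",
    "thrombin": r"LVPRGS",
}


def internal_sites(sequence: str) -> dict:
    """Map each protease to the start positions of its motif in `sequence`."""
    return {
        name: [m.start() for m in re.finditer(motif, sequence)]
        for name, motif in PROTEASE_MOTIFS.items()
    }


def safe_proteases(sequence: str) -> list:
    """Proteases whose motif does not occur anywhere in `sequence`."""
    return [name for name, hits in internal_sites(sequence).items() if not hits]


bait = "MKTAYIEGRLLSDDDDKQW"  # hypothetical bait sequence
print(internal_sites(bait))
print(safe_proteases(bait))
```

For this made-up bait, the scan flags internal Factor Xa and enterokinase motifs, leaving TEV and thrombin as candidates for the linker site.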
  • the third functional segment is termed as "peptide affinity tag" and is a peptide having a sequence capable of being specifically bound to and eluted from one or more ligands of an affinity matrix such as a chromatography support or bead.
  • examples of a peptide affinity tag include an epitope, which can bind to a matrix-immobilized antibody, or a specific binding protein.
  • Preferred peptide affinity tags are those which are elutable from the affinity matrix by mild conditions unlikely to disrupt the protein complex and/or elute nonspecifically associated contaminants or that interact tightly with the affinity matrix such that specific elution conditions can be employed to preferentially elute the interacting proteins but retain some or all of the first binding component.
  • a peptide affinity tag is a 12 amino acid peptide, known as the Protein C tag in the art, which is recognized in a calcium dependent manner by the commercially available monoclonal antibody HPC4 (Roche Applied Science, Indianapolis, IN).
  • further affinity tags are the FLAG M1™ (Sigma Corp., St. Louis, MO) epitope and calmodulin binding peptide (CBP), whose respective affinity interactions are reversibly Ca²⁺-dependent.
  • the third functional segment can also be composed of two or more different affinity tags in tandem that may or may not be separated from each other by specific proteolytic sites that are different from that adjacent to the affinity modifiable segment. When the third functional segment is composed of multiple affinity tags they can be used in alternative or sequential affinity purification steps. When greater purity is desired, sequential affinity purification steps can be used. Alternative affinity purification steps can allow for customization of the purification (e.g. a bait or bait-complex may sterically hinder one affinity tag but not another).
  • the modifying peptide segment can be present as an extension of the amino terminus or the carboxy terminus of the bait or at both termini.
  • one or more affinity tags can be inserted in the ORF of the bait.
  • it is preferred that the modifying peptide be short, to minimize any effect it may have on normal binding properties of the bait to the protein complex.
  • the invention can also be practiced with the same affinity tag serving as the affinity segment more than once. In some cases it may be desirable to place tags at both the N and C-terminus with protease digestion sites located between the "bait" and the tags.
  • "second binding component" refers to a protein complex comprising one or more peptide affinity tags and a bait bound to a protein complex, which is formed after a protease digestion.
  • the present invention relates to methods for purifying protein complexes from a cell, a cell or tissue lysate or an organism by employing the first binding component described above, i.e., the bait protein containing at least two affinity tags separated by an amino acid sequence encoding a specific protease recognition site in such a way that at least two rounds of affinity purification can be carried out with a protease cleavage step occurring between one or more of the purification steps.
  • the isolation and identification of proteins bound to an exogenous bait protein introduced at relatively normal expression levels in cells, tissues or organisms has been a challenging problem.
  • the present invention solves this problem by incorporating a combined set of affinity tags that utilize high affinity, specificity and ease of elution.
  • Biotin ligase recognition sequences are small, allow for extremely high affinity isolation, and introduce minimal background due to low levels of endogenous ligands.
  • Biotin ligase or biotinylation recognition sequences in combination with protease recognition sequences and affinity tags create ideal constructs for the applications described.
  • novel use of a protease digestion step that can create the specificity required for an efficient second purification strategy, e.g., a calcium-dependent step that allows a gentle and specific elution.
  • the biotin-avidin system is described for use as an affinity modifiable segment.
  • the system involves the incorporation of a biotinylation recognition sequence (can be any one of a number of amino acid sequences specifically recognized and biotinylated by members of the biotin ligase family of enzymes), followed by a specific protease digestion site and then finally a hexapeptide 6HIS tag.
  • the third functional segment can be one or more specific peptide sequences that can serve as affinity tags, e.g., the Protein C tag, Strep-tag II™ (Sigma GenoSys), Hemagglutinin tag or FLAG recognition sequence.
  • the biotinylation recognition sequence can be a short peptide or a protein containing natural or unnatural amino acid sequences that will get biotinylated by a specific biotin ligase in vivo or in vitro.
  • a hexapeptide 6HIS and/or the Protein C tag is used as the peptide affinity tag.
  • the Protein C tag is a 12 amino acid peptide recognized in a calcium dependent fashion by the monoclonal antibody HPC4 (Roche, Indianapolis, IN). Additionally, the Protein C tag can be placed at the N or C-terminus of a protein or internally and still be recognized by HPC4. For this reason, a wide selection of protease cleavage sites can be incorporated and still permit purification using HPC4. It also means that a single antibody can be used for both the initial purification of the modified bait protein and the subsequent second- step purification of the multi-protein complex.
  • the TEV protease site is used as one of many possible proteases.
  • this design is flexible with regard to the type of protease used, it is advisable to select a highly specific protease, such as the TEV protease, to reduce inappropriate proteolysis.
  • the bait portion of the first binding component can be any protein or a fragment thereof.
  • the first binding component can be prepared by standard cloning procedures known to those in the art.
  • the first binding component could be prepared from cDNA libraries, or the like, without prior knowledge of the bait nucleic acid sequence. During such a "random" or “shotgun” approach, the identity of the bait protein could be determined later by mass spectrometry.
  • bait proteins that have been used to illustrate the invention, together with some of the proteins likely to be isolated as part of a protein complex isolatable by the method of the invention.
  • the proteins are listed by the standard names by which they are known in the art, and by which they are indexed in public databases, i.e., Genbank.
  • GRB2 (NM_002086) can serve as bait for Sos1, Shc, dynamin2 (see Table 3 for a more complete list).
  • NAPA (NM_003827) can serve as bait for syntaxins, VAMP, SNAP-23 (see Table
  • CDKN1b (NM_004064) can serve as bait for CDC2, CDK2, GRB2 (see Table 3 for a more complete list).
  • bait proteins are noted, together with some of the proteins likely to be isolated as part of a protein complex isolatable by the method of the invention.
  • eIF-4E (NM_001968) can serve as bait for purifying a protein complex that includes eIF-4A, eIF-4GI, MNK2, eIF-3 (itself a ten-subunit complex), ERK1/2 and possibly others.
  • Cyclin D1 (CCND1; NM_053056) can serve as bait for purifying a protein complex that includes CDK4, PCNA, p21/Cip1 (CDKN1A) and possibly others.
  • PCNA (NM_002592) can serve as bait for Hus1, Rad9 and possibly others.
  • the modified bait protein referred to herein as the first binding component can be prepared either in vivo or in vitro or a combination thereof as described in the Examples Section.
  • the modified bait protein can be expressed in prokaryotic cells (e.g., E. coli) by employing the standard protocols well known in the art [Makrides, S.C. (1996) Microbiol. Rev. 60:512-538; Baneyx, F. (1999) Curr. Opin. Biotech. 10:411-421].
  • the first binding component can be expressed in eukaryotic cells (e.g. , mammalian cells or yeast) [Logan A. C. et al. (2002) Curr.
  • the first binding component can be expressed in an organism by transgenic or "knock-in" methods, discussed in more detail in the Examples Section.
  • when the first binding component is prepared in a recombinant host (e.g., E. coli), purification of the recombinantly expressed protein could be performed efficiently using the affinity tag (e.g., 6HIS) contained therein by employing the standard biochemical approaches (e.g., Ni²⁺ beads/column). In this case it is preferable to retain all three functional segments. If biotinylation of the bait protein has not occurred during expression in the recombinant host organism, it can then be performed, in vitro, using a recombinantly expressed and purified product of the E. coli BirA biotin ligase gene, as an example.
  • the purified first binding component could then be mixed with a source of protein complex, such as a cell lysate, tissue lysate or organism lysate.
  • the protein complex is then isolated as described in Example 6.
  • the modified bait protein can also be expressed in cells or an organism containing possible ligands and possibly biotinylated in the cell or organism by an endogenous or exogenous biotin ligase, expressed normally or recombinantly introduced. This system is thus applicable for experiments where the modified, biotinylated bait is expressed in cells of many types (prokaryotic or eukaryotic). Mammalian cells or whole organisms (e.g., mice transgenic for the tagged first binding component) would offer an added advantage since they allow the isolation of multi-protein complexes after their formation, in situ.
  • An example of the use of a biotin tag and endogenous biotinylation in mammalian cells can be found in [Parrott M.B. et al. (2001) BBRC 281:993-1000; Parrott M.B. et al. (2002) Mol Ther. 1:96-104].
  • the biotin-tagged proteins, together with associated ligands of protein complexes, are isolated from the cell, tissue or organism lysates using an avidin-like affinity reagent.
  • Specific elution of biotin-tagged proteins from the affinity column is then performed by digestion with a protease (e.g. TEV protease).
  • the digestion step can serve several purposes: (1) it allows the specific elution of the bait and the associated protein complexes for immediate analysis or a second affinity purification step.
  • the immediate analysis of the elution from the first purification step may be advantageous for the identification of transiently interacting proteins that would normally be lost during multi-step purifications.
  • the biotin-avidin system may enable this because complexes are isolated and concentrated more quickly and may survive very rapid, yet stringent, washing prior to elution; (2) it can expose a second affinity tag, hitherto cryptic or sterically hindered, for use in a second round of purification; or (3) it can create the specificity required for the second step.
  • When a 6HIS sequence is used as the affinity tag, nickel-chelate bound beads allow for a second affinity purification step to remove contaminants remaining after the protease digestion.
  • the 6HIS tag is particularly useful when it is advantageous to remove all or some of the bait protein prior to mass spectrometry analysis, because the 6HIS-Ni²⁺ interaction survives relatively strong denaturing conditions. This is illustrated in Example 6.
  • a cryptic epitope can be created that is exposed upon the protease digestion step.
  • the digestion can create an N-terminal FLAG sequence that is specifically recognized by a calcium dependent antibody (e.g., SIGMA FLAG M1™). This antibody does not recognize the FLAG sequence if it is internal or C-terminal.
  • a second purification step can be done with immobilized anti-FLAG M1™ antibody.
  • the specific proteins can then be isolated by a second specific and gentle elution with a calcium chelator such as EGTA.
  • the digestion sites when combined with a FLAG epitope, should incorporate a sequence recognized by a protease that cleaves C-terminally and "exo" to its recognition sequence, i.e., between the recognition sequence and the peptide affinity tag.
  • proteases that do not cleave "exo" to their recognition sequence can also be used if the amino acids left behind following cleavage are part of or compatible with the FLAG M1 recognition sequence.
  • a third example of an affinity tag is the Protein C epitope (Roche, Indianapolis, IN), which is also recognized by a calcium dependent antibody. Since this epitope is not sensitive to its location (can be N-terminal, C-terminal or internal), this design is more flexible with regard to the protease used following the first affinity purification step.
  • the present invention has several characteristics that distinguish it from the conventional methods [Puig O. et al. (2001) Methods 24:218-229].
  • the first distinguishing characteristic is the use of the biotin-streptavidin interaction for the first step in the isolation of a complex bound to the bait. This interaction is seven orders of magnitude higher in affinity than the protein A-IgG system used in the TAP protocol (Kd of one Z domain binding to IgG is approximately 10⁻⁸ M [Braisted and Wells, (1996) Proc. Natl. Acad. Sci. USA. 93:5688] vs. a Kd of 10⁻¹⁵ M for biotin-streptavidin).
  • the first purification step requires the isolation of the protein complex when it is in its most dilute and contaminated state.
  • the higher binding affinity allows the isolation of biotinylated protein and associated ligands present at femtomolar or higher concentrations and permits a very stringent wash (if necessary) without loss of bait protein.
  • this also allows rapid isolation and concentration of protein complexes, minimizing losses of specific interactors and potentially enabling the analysis of transiently interacting proteins.
  • the published protein A-IgG system would only allow the isolation of proteins present at concentrations several logs higher, and would not allow as rapid an analysis or permit the use of such a stringent wash.
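  • As a rough illustration of the affinity argument above, the following sketch compares the two dissociation constants quoted in the text. It also estimates the fraction of tagged bait captured, assuming simple 1:1 equilibrium binding with the immobilized ligand in excess. That binding model, the function names and the 1 nM ligand concentration are assumptions made here for illustration and are not taken from the source.

```python
import math

KD_PROTEIN_A_IGG = 1e-8         # M, one Z domain binding IgG (value quoted in the text)
KD_BIOTIN_STREPTAVIDIN = 1e-15  # M (value quoted in the text)


def orders_of_magnitude(kd_weak: float, kd_tight: float) -> float:
    """Return log10 of the ratio between two dissociation constants."""
    return math.log10(kd_weak / kd_tight)


def fraction_bound(ligand_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of bait bound for 1:1 binding, ligand in excess."""
    return ligand_molar / (ligand_molar + kd_molar)


if __name__ == "__main__":
    gap = orders_of_magnitude(KD_PROTEIN_A_IGG, KD_BIOTIN_STREPTAVIDIN)
    print(f"affinity gap: {gap:.1f} orders of magnitude")

    free_sites = 1e-9  # hypothetical 1 nM of free binding sites on the support
    for name, kd in (("protein A-IgG", KD_PROTEIN_A_IGG),
                     ("biotin-streptavidin", KD_BIOTIN_STREPTAVIDIN)):
        print(f"{name}: fraction bound = {fraction_bound(free_sites, kd):.6f}")
```

Under these assumptions the weaker interaction captures under a tenth of the bait at 1 nM sites, while the tighter one captures essentially all of it, which is the qualitative point the text makes about dilute samples.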
  • the second improvement over the protein A-IgG system is that the consensus biotinylation sequences are short and thus potentially less disruptive to protein folding during expression and subsequent protein-protein interactions.
  • when the FLAG tag is used to practice the invention, this represents a system where the successful digestion with the protease can create the unique recognition sequence for the second purification step. This can have some advantages in creating additional specificity during the second purification step. For example, nonspecifically bound proteins that are cleaved and/or otherwise eluted during the digestion step are unlikely to meet the criteria for binding to the second affinity support.
  • anti-FLAG antibodies commercially available (M1, M2, M5; Sigma Corporation, St. Louis, MO).
  • M1 recognizes sequences as small as DYKD (SEQ ID NO: 1) or DYKDE (SEQ ID NO: 2), but binds to such sequences only when they are present on the N-terminus (unblocked by an initiation methionine or any other amino acid).
  • the FLAG sequence can be used with any protease that cleaves exo and C-terminal to its recognition site, thereby allowing the digestion-dependent creation of an N-terminal FLAG sequence from a previously M1-unreactive internal FLAG sequence.
  • Tags such as 6HIS and FLAG are also advantageous over biologically relevant tags such as CBP (used in combination with the protein A-Z domains system in Puig et al.), which can bind to calmodulin and calmodulin-containing protein complexes present in cells, cell lysates or tissue lysates, complicating their use.
  • Affinity tags such as 6HIS, FLAG and others previously mentioned do not interact as extensively with biologically relevant and promiscuous ligands like calmodulin.
  • the use of CBP may require the incorporation of EGTA to prevent CBP-calmodulin interactions during the binding reaction and isolation of the protein complexes. Under these circumstances, the CBP containing constructs are not compatible with interactions requiring calcium and other divalent metal ions.
  • Affinity tags such as 6HIS, FLAG and the others previously mentioned can be used in the presence of calcium, thus permitting the identification of calcium dependent interactions.
  • another advantage of 6HIS is that the 6HIS-nickel interaction is stable under conditions that are mildly denaturing to proteins and/or disrupt protein-protein interactions.
  • exploiting the 6HIS tag in the final purification step allows for a preferential elution of the unknown protein interactors over that of the first binding component.
  • n-lauroyl-sarcosine elution resulted in a significant percentage of the first binding component remaining on beads, but efficient elution of interacting proteins.
  • Alternative possible designs are as follows:
  • the Factor Xa protease targets the sequence: I(E or D)GR (SEQ ID NO:4)
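  • The digestion-dependent creation of an M1-reactive FLAG epitope can be sketched as follows. The sketch assumes that Factor Xa cleaves immediately C-terminal to the arginine of its I(E or D)GR motif, and it uses the minimal DYKD sequence as the M1 epitope. The construct layout, the flanking sequences and the function names are hypothetical and serve only to illustrate the principle.

```python
import re

# Factor Xa motif from the text: I(E or D)GR. Cleavage is assumed to occur
# directly after the Arg, i.e. "exo" and C-terminal to the motif.
FACTOR_XA_SITE = re.compile(r"I[ED]GR")
M1_MINIMAL_EPITOPE = "DYKD"  # shortest M1-reactive sequence cited in the text


def cleave_after_site(protein: str) -> list[str]:
    """Split a one-letter protein sequence after every Factor Xa motif."""
    fragments = []
    start = 0
    for match in FACTOR_XA_SITE.finditer(protein):
        fragments.append(protein[start:match.end()])
        start = match.end()
    fragments.append(protein[start:])
    return [fragment for fragment in fragments if fragment]


def m1_reactive(fragment: str) -> bool:
    """M1 binds FLAG only when the epitope sits at a free N-terminus."""
    return fragment.startswith(M1_MINIMAL_EPITOPE)


if __name__ == "__main__":
    # Hypothetical layout: upstream tag - Factor Xa site - internal FLAG - bait
    construct = "MSGLNDIFEAQKIEWHE" + "IEGR" + "DYKDDDDK" + "MEAIAKHDF"
    print("before digestion, M1-reactive:", m1_reactive(construct))
    for fragment in cleave_after_site(construct):
        print(fragment, "-> M1-reactive:", m1_reactive(fragment))
```

Only the fragment released downstream of the cleavage site starts with DYKD, so uncleaved or nonspecifically eluted material would not be expected to bind the second affinity support.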
  • the proteins or fragments thereof contained in the purified protein complexes can be characterized further by employing the standard techniques that are known in the art.
  • the individual proteins that comprise the complexes purified according to the invention are identified by a variety of mass spectrometric methods which include an associated set of separation methods.
  • Most current generations of mass spectrometers enable the rapid identification of known proteins by searching mass spectrometry data (peptide masses, and/or peptide fragment mass spectra) against a database of known sequences (predicted peptides masses and/or predicted peptide fragment mass spectra).
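  • The peptide-mass side of such a database search can be illustrated with a minimal sketch: an in silico tryptic digest of a candidate sequence, followed by matching of predicted monoisotopic peptide masses against observed masses. The cleavage rule (after K or R, not before P), the 0.01 Da tolerance and the function names are assumptions chosen for illustration. Real search engines also handle missed cleavages, modifications and fragment spectra.

```python
import re

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(protein: str) -> list[str]:
    """Cleave C-terminal to K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_masses(observed: list[float], protein: str,
                 tolerance: float = 0.01) -> list[str]:
    """Return predicted tryptic peptides whose mass matches an observed mass."""
    hits = []
    for peptide in tryptic_digest(protein):
        mass = peptide_mass(peptide)
        if any(abs(mass - obs) <= tolerance for obs in observed):
            hits.append(peptide)
    return hits


if __name__ == "__main__":
    print(tryptic_digest("MKRPAGK"))
    print(round(peptide_mass("GK"), 4))
    print(match_masses([203.127], "AAGKGK"))
```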
  • it is becoming much more routine to also be able to identify de novo amino acid sequences of peptides directly from peptide mass spectra and thus discover unknown proteins.
  • components of a protein complex would not have previously been isolated and may be known to exist only from genetic studies, or they can be previously unknown or known but unrecognized as components of a complex that interacts with the bait.
  • the method of the invention therefore serves as a method for isolating, purifying and characterizing novel proteins and for providing insight into their biological function.
  • protein complexes isolated by the method of the invention under varied physiological or pathological states will yield data as to how the composition of a complex varies in response to varied conditions. The data are then used as indicators of disease, as indicators of therapeutic efficacy, and for providing a rationale for novel therapies. Identifying novel proteins or novel interactions of known or novel proteins can lead to the identification of new members of disease-associated pathways or biochemical reactions. These proteins can be drug targets because of their interaction in such pathways or reactions whether or not they vary according to the "states" described above.
  • Embodiment #1 Key
  • biotinylation is optional in both embodiments if the first binding component is biotinylated in a cell or an organism by endogenous or exogenous biotin ligases.
  • NAPA Human N-ethylmaleimide-sensitive factor attachment protein
  • GRB2 Human Growth Factor Receptor Binding Protein 2
  • DNA constructs prepared for introduction into a prokaryotic or eukaryotic host will typically comprise a replication system (i.e. vector) recognized by the host, including the intended DNA fragment encoding the first binding component of the present invention, and will preferably also include transcription and translational initiation regulatory sequences operably linked to the polypeptide-encoding segment.
  • Expression systems may include, for example, an origin of replication or autonomously replicating sequence (ARS) and expression control sequences, a promoter, an enhancer and necessary processing information sites, such as ribosome-binding sites, RNA splice sites, polyadenylation sites, transcriptional terminator sequences, and mRNA stabilizing sequences.
  • Signal peptides may also be included where appropriate from secreted polypeptides of the same or related species, which allow the protein to cross and/or lodge in cell membranes or be secreted from the cell.
  • the construct may be joined to an amplifiable gene (e.g., DHFR) so that multiple copies of the gene may be made.
  • enhancer and other expression control sequences see also Enhancers and Eukaryotic Gene Expression (1983) Cold Spring Harbor Press, N. Y. While such expression vectors may replicate autonomously, they may less preferably replicate by being inserted into the genome of the host cell.
  • Expression and cloning vectors will likely contain a selectable marker, that is, a gene encoding a protein necessary for the survival or growth of a host cell transformed with the vector. Although such a marker gene may be carried on another polynucleotide sequence co-introduced into the host cell, it is most often contained on the cloning vector. Only those host cells into which the marker gene has been introduced will survive and/or grow under selective conditions. Typical selection genes encode proteins that (a) confer resistance to antibiotics (e.g., kanamycin, tetracycline, etc.) or other toxic substances; (b) complement auxotrophic deficiencies; or (c) supply critical nutrients not available from complex media. The choice of the proper selectable marker will depend on the host cell; appropriate markers for different hosts are known in the art.
  • Recombinant host cells in the present context, are those which have been genetically modified to contain an isolated DNA molecule of the instant invention.
  • the DNA can be introduced by any means known to the art which are appropriate for the particular type of cell, including without limitation, transformation, lipofection, electroporation or viral mediated transduction.
  • a DNA construct capable of enabling the expression of the first binding component of the invention or the nucleic acids encoding individual segments of the first binding component can be easily prepared by the art-known techniques such as cloning, hybridization screening and PCR.
  • Polymerase Chain Reaction (PCR) is a repetitive, enzymatic, primed synthesis of a nucleic acid sequence. This procedure is well known and commonly used by those skilled in this art [see Mullis, U.S. Patent Nos. 4,683,195, 4,683,202, and 4,800,159; Saiki et al. (1985) Science 230: 1350-1354].
  • PCR is based on the enzymatic amplification of a DNA fragment of interest that is flanked by two oligonucleotide primers that hybridize to opposite strands of the target sequence.
  • the primers are oriented with the 3' ends pointing towards each other. Repeated cycles of heat denaturation of the template, annealing of the primers to their complementary sequences, and extension of the annealed primers with a DNA polymerase result in the amplification of the segment defined by the 5' ends of the PCR primers. Since the extension product of each primer can serve as a template for the other primer, each cycle essentially doubles the amount of DNA template produced in the previous cycle.
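  • The doubling described above corresponds to exponential growth in the number of template copies. A minimal sketch, assuming the idealized relation N = N0 x (1 + E)^n with a per-cycle efficiency E between 0 and 1 (E = 1 is perfect doubling); the function name and the example values are illustrative only.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of PCR cycles.

    efficiency = 1.0 means every template is duplicated in every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, "cycles:", f"{pcr_copies(1, n):.3g}", "copies (ideal),",
              f"{pcr_copies(1, n, efficiency=0.9):.3g}", "copies at 90% efficiency")
```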
  • thermostable DNA polymerase such as Taq polymerase, which is isolated from the thermophilic bacterium Thermus aquaticus
  • the amplification process can be completely automated.
  • Other enzymes which can be used are known to those skilled in the art.
  • a person of ordinary skill in the art can prepare monoclonal or polyclonal antibodies specific for the complex or the proteins contained in the complex.
  • Monoclonal or polyclonal antibodies, preferably monoclonal, specifically reacting with a protein of interest can be made by methods well known in the art [see, e.g., Harlow and Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratories; Goding (1996) Monoclonal Antibodies: Principles and Practice, 3rd ed., Academic Press, San Diego, CA; and Ausubel et al. (1993) Current Protocols in Molecular Biology, Wiley Interscience/Greene Publishing, New York, NY].
  • Standard techniques for cloning, DNA isolation, amplification and purification, for enzymatic reactions involving DNA ligase, DNA polymerase, restriction endonucleases and the like, and various separation techniques are those known and commonly employed by those skilled in the art.
  • a number of standard techniques are described in Sambrook et al. (1989) Molecular Cloning, Second Edition, Cold Spring Harbor Laboratory, Plainview, New York; Maniatis et al. (1982) Molecular Cloning, Cold Spring Harbor Laboratory, Plainview, New York; Wu (ed.) (1993) Meth. Enzymol 218, Part I; Wu (ed.) (1979) Meth. Enzymol. 68; Wu et al. (eds.) (1983) Meth.
  • Example 1 Expression of the First Binding Component Containing Avitag in E. coli.
  • E24-TAG and E25-TAG contain the nucleotide sequences capable of expressing a fusion protein of the following general design:
  • the E. coli expression vectors E24 and E25 are based on the Invitrogen Gateway compatible vector pDEST15 (Cat # 11802-014) in which transcription is driven by the T7 promoter and which contain the sequence encoding Glutathione-S-Transferase (GST) at the N-terminus.
  • PCR products amplified from M07 were inserted into vector E04, which is a modified version of pDEST15 containing a tag for GST and CBP (calmodulin binding peptide) separated by a cleavage site for PreScission™ protease.
  • TEV TEV protease
  • Vector pGEX4T3/TAP and the PCR product were both digested with PinAI and Xho I and ligated together.
  • Construction of E24 and E25 was achieved by amplifying the tag region of M07 (see Example 8) with the following oligonucleotide and MACH. Reverse (see Example 8).
  • This primer binds to the region around the ATG start codon of Avitag and introduces an Nde I site (underlined) over the start codon.
  • the tag region of each vector was amplified with the replacement of the ps region with the Protein C region.
  • the resultant PCR amplicons were digested with Nde I and Xho I and sub-cloned into Nde I/Xho I digested vector pD15-E04 and transformed into DB3.1 bacterial cells. Positive colonies were identified and verified by DNA sequencing.
  • nucleotide sequences of the tag region are as follows:
  • In order to express the Avitag containing bait proteins, two ml of LB medium containing 100 µg/ml Ampicillin was inoculated with a single colony containing the desired expression vector and the culture was incubated at 37 °C in a shaker incubator. After 8 hours, one ml of the culture was diluted into 25 ml of fresh LB medium (100 µg/ml Ampicillin) and incubated overnight at 37 °C.
  • the overnight culture was diluted into 1 liter of LB medium containing 100 µg/ml of Ampicillin and an antifoam agent in a Nalgene centrifuge bottle to a final OD595 of 0.05, which was left immersed in a water bath at 30 °C with an airflow of 10 cc/min to the culture bottle in a Bactolift apparatus.
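  • The volume of overnight culture needed to reach the starting OD595 of 0.05 follows from the usual dilution relation C1 x V1 = C2 x V2. The sketch below assumes that optical density scales linearly with cell density over this range; the overnight OD of 2.0 used in the example is a hypothetical value, not one reported in the text.

```python
def inoculum_volume_ml(overnight_od: float, target_od: float,
                       final_volume_ml: float) -> float:
    """Volume of overnight culture that gives target_od in final_volume_ml."""
    if overnight_od <= target_od:
        raise ValueError("overnight culture must be denser than the target")
    return target_od * final_volume_ml / overnight_od


if __name__ == "__main__":
    # 1 liter culture at a starting OD595 of 0.05 (values from the text);
    # the overnight OD595 of 2.0 is an assumed example value.
    volume = inoculum_volume_ml(overnight_od=2.0, target_od=0.05,
                                final_volume_ml=1000.0)
    print(f"add {volume:.1f} ml of overnight culture")
```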
  • arabinose was added to a final concentration of 0.2 % (w/v).
  • the cells were harvested by centrifugation at 5000 rpm for 8 minutes in a Beckman JLA 8.1000 rotor.
  • the supernatant was removed from the pelleted cells and the wet weight of the cells was determined.
  • the cell pellet was resuspended in a 3-fold volume (v/w) of the lysis buffer (10 mM sodium phosphate, 150 mM NaCl, pH 7.2) and stored frozen in 50 ml conical tubes.
  • the frozen cell pellets were analyzed by SDS-PAGE as shown in Fig. 1. Expression varied, with GRB2 expressing to approximately 50% of total protein and NAPA to approximately 30%, while CDKN1b was undetectable by Coomassie stain.
  • the frozen cells were thawed in a 37 °C water bath and lysed by sonication in the presence of lysozyme (5 µg/ml) using a Virsonic 600 sonicator (20 sec pulse, 10 sec pause for at least 9 minutes). The supernatant was separated from the lysate by centrifugation for 30 minutes at 46,000 x g.
  • CDKN1b CDKN1b
  • Example 3 In vitro Biotinylation and Binding to Neutravidin Beads.
  • the degree of biotinylation and the efficiency of binding of the biotinylated proteins to the Neutravidin beads were determined using a Western blot in which the proteins were visualized using a streptavidin-HRP conjugate (ZYMED Laboratories, San Francisco, CA) (see Fig. 3). Analyses were performed on the baits before and after biotinylation, as well as on the supernatant after binding of the biotinylated baits to the beads. Approximately 40 ng of protein (both biotinylated and unbiotinylated) and 80 ng of supernatant were used for each analysis.
  • the proteins expressed in the vector E25 were biotinylated more efficiently than those expressed in the E24 vector.
  • the binding of the protein to the beads was found to be in excess of 80% (see Table 2 and Fig. 3).
  • Table 1 Composition of reaction mixtures for in vitro biotinylation of bacterially expressed first binding components in the E24 and E25 vectors.
  • Biomix A: 0.5 M bicine buffer, pH 8.3; Biomix B: 100 mM ATP, 100 mM MgOAc
  • the results for % biotinylation are reported relative to the total protein in the sample. From the gel-shift assay performed with CDKN1b, at least 80 to 90% of the specific CDKN1b band shifted, indicating that the % biotinylation for the pure protein is often higher than the results reported by the ELIFA.
  • the incubation mixtures were then washed 3 times with 1 ml HEGNS buffer and subjected to the TEV protease digestion step by adding 30 µl of HEGNS buffer containing 0.5 units (1x) or 5 units (10x) of TEV Protease (Invitrogen Corporation) per µg of the protein at 4°C for 1 hr.
  • the uncleaved proteins were eluted with 50% Acetonitrile with 0.1 % TFA prior to the digestion step.
  • TEV protease digestion was efficient but varied depending on the specific bait protein.
  • Mammalian whole cell lysate preparation. To provide a source of mammalian proteins to interact (in pulldown analyses) with purified mammalian bait proteins expressed in bacteria or another host organism, large scale cultures of mammalian cells (2-25 liters) were grown in suspension to densities of 2-5 x lOVml or ~80-90% confluency for adherent cultures. Once the suspension cultures reached their desired density, the cells were centrifuged at 3,000 x g for 10 minutes. The resulting cell pellet (or plate of adherent cells) was washed 1x with cold phosphate buffered saline (PBS).
  • the PBS was removed and cold lysis buffer was added at a volume of 10 µl per mg of wet cell weight or 1 ml per 15 cm plate of adherent cells.
  • the lysis buffer consists of 25 mM hydroxyethylpiperazine-N'-2-ethanesulfonic acid (HEPES) pH 7.5, 150 mM NaCl, 1% NP40, 10 mM MgCl2, 1 mM ethylene-diamine tetra acetate (EDTA), 10% glycerol, 1 mM dithiothreitol (DTT), and a protease inhibitor cocktail (Roche, Mannheim, Germany) added just prior to application (1 tablet/10 ml of buffer).
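  • For bench use, the lysis buffer composition can be converted into pipetting volumes with C1 x V1 = C2 x V2. In the sketch below the final concentrations are those stated in the text, whereas every stock concentration (1 M HEPES, 5 M NaCl, 10% NP40 and so on) is an assumption introduced for illustration, as are the function names.

```python
import math

# name: (final concentration from the text, ASSUMED stock concentration, unit)
RECIPE = {
    "HEPES pH 7.5": (25.0, 1000.0, "mM"),
    "NaCl": (150.0, 5000.0, "mM"),
    "NP40": (1.0, 10.0, "%"),
    "MgCl2": (10.0, 1000.0, "mM"),
    "EDTA": (1.0, 500.0, "mM"),
    "glycerol": (10.0, 100.0, "%"),
    "DTT": (1.0, 1000.0, "mM"),
}


def stock_volumes_ml(total_ml: float) -> dict[str, float]:
    """Volume of each stock (and of water) needed for total_ml of buffer."""
    volumes = {name: final / stock * total_ml
               for name, (final, stock, _unit) in RECIPE.items()}
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


def protease_inhibitor_tablets(total_ml: float) -> int:
    """One tablet per 10 ml of buffer (ratio from the text), rounded up."""
    return math.ceil(total_ml / 10.0)


if __name__ == "__main__":
    for name, volume in stock_volumes_ml(100.0).items():
        print(f"{name:>14}: {volume:6.2f} ml")
    print("protease inhibitor tablets:", protease_inhibitor_tablets(100.0))
```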
  • the suspension cell pellet was resuspended with gentle pipetting and slow speed vortexing.
  • the adherent cells were lysed (on ice) by adding the buffer directly to the culture plate and scraping the cells off of the dish.
  • the lysates were then transferred to centrifuge tubes and allowed to incubate at 4°C for 15 minutes. At 2 minute intervals during the 4°C incubation, the lysates were gently resuspended with mild shaking and/or slow speed vortexing. Following the incubation, the lysates were centrifuged at 27,000 x g for 15 minutes at 4°C to remove insoluble debris. The supernatant was then aliquoted into fresh centrifuge tubes, snap frozen in liquid nitrogen and then stored at -80°C.
  • HDAC histone deacetylase
  • the lysate is then made available for use in downstream analyses such as pulldown interaction studies.
  • Tissues used to date include whole mouse brain and mouse cerebellum. Modification to the following protocol for different organs can easily be made by those skilled in the art. For example, those skilled in the art will understand that more aggressive tissue disruption for muscle tissue than for brain is required because of the larger amount of connective tissue. Tissues were flash frozen in liquid nitrogen immediately following dissection. Frozen brain or cerebella were weighed. Mouse cerebella weighed approximately 70 mg, the striatum was approximately 30 mg and the cortex was about 80 mg.
  • Tissues were then mixed with cold homogenization buffer of 25 mM hydroxyethylpiperazine-N'-2-ethanesulfonic acid (HEPES) pH 7.5, 150 mM NaCl, 1% NP40, 10 mM MgCl2, 1 mM ethylene-diamine tetra acetate (EDTA), 10% glycerol, 1 mM dithiothreitol (DTT), and a protease inhibitor cocktail (Roche, Mannheim, Germany) at a ratio of 100 mg tissue to 1 ml of buffer.
  • the tissue was then homogenized on ice with a Dounce homogenizer (Wheaton brand) for 5 plunges with a loose pestle and 25 plunges with a tight pestle.
  • the resulting brain tissue homogenate was then centrifuged at 100,000 x g for 30 minutes. The resulting supernatant was used for pulldown assays according to the protocol for cell lysates. 100 mg of brain tissue typically yields approximately 10 mg of protein lysate.
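  • The ratios given above (1 ml of homogenization buffer per 100 mg of tissue, and roughly 10 mg of lysate protein per 100 mg of brain tissue) can be scaled to a dissected sample as sketched below. The yield figure is the approximate value from the text and will vary between tissues; the function name is illustrative.

```python
BUFFER_ML_PER_MG_TISSUE = 1.0 / 100.0    # 1 ml buffer per 100 mg tissue
PROTEIN_MG_PER_MG_TISSUE = 10.0 / 100.0  # about 10 mg protein per 100 mg brain tissue


def homogenization_plan(tissue_mg: float) -> dict[str, float]:
    """Buffer volume and approximate protein yield for a tissue sample."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    return {
        "buffer_ml": tissue_mg * BUFFER_ML_PER_MG_TISSUE,
        "expected_protein_mg": tissue_mg * PROTEIN_MG_PER_MG_TISSUE,
    }


if __name__ == "__main__":
    # Approximate tissue masses quoted in the text.
    for tissue, mass_mg in (("cerebellum", 70.0), ("striatum", 30.0), ("cortex", 80.0)):
        plan = homogenization_plan(mass_mg)
        print(f"{tissue}: {plan['buffer_ml']:.2f} ml buffer, "
              f"about {plan['expected_protein_mg']:.1f} mg protein")
```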
  • Mammalian cells expressing the modified bait proteins are prepared and harvested as described in Example 8. Harvested transfectants are then lysed according to the procedure described above for whole cell lysates using our optimized protocol. If biotinylation of the modified bait protein has not occurred by co-transfection or the creation of stable cell lines expressing the biotin ligase (e.g. BirA) [Parrott MB & Barry MA (2001) BBRC 281:993-1000; Parrott MB & Barry MA (2002) Mol Ther. 1:96-104], biotinylation is performed using the cell lysate according to the protocol for purified recombinant modified bait protein, except that the biotin concentration is reduced.
  • the amount of enzyme and biotin in the reaction can be adjusted according to the level of expression of the modified bait protein as determined by immunochemical assays specific to the second affinity tag segment. Concentrations of biotin as low as 10 to 20 nM can be used for efficient biotinylation and at these concentrations remaining free biotin does not interfere with subsequent binding to neutravidin beads. Cell lysate containing biotinylated bait and associated interacting proteins are then purified as described in Example 6 below.
  • HeLa cell lysate (5 mg/ml) prepared as in Example 5 was incubated at 4°C for 1.5 hrs with 15 µl of streptavidin beads containing 5 µg of each of the first binding components prepared herein, i.e., GRB2, NAPA, CDKN1b or tags-only proteins in the E24 or E25 vectors.
  • the samples were briefly centrifuged to pellet the beads, after which the lysate was removed. Beads were transferred to clean Eppendorf tubes and washed three times with 1 ml of pulldown HEGNS buffer (20 mM HEPES, pH 7.5, 150 mM NaCl, 10% glycerol, 0.1% NP-40).
  • Protein complexes were released from the beads using 100 µl of pulldown buffer containing 50 units of TEV protease and incubating for 1 hr at 4°C with rotation, following which the supernatant was removed. An additional 50 µl of pulldown buffer was added to the beads, gently mixed, and the supernatant removed to be combined with the previous supernatant. Imidazole, to a final concentration of 10 mM, was added to the supernatants, which were then incubated with a 10 µl volume of nickel-equilibrated Chelating-NTA Sepharose (Amersham Biosciences, Piscataway, NJ) beads at 4°C for 30 min.
  • the first binding components expressed in E. coli and purified according to the invention were capable of forming complexes, from which several interacting proteins were isolated.
  • the background level of nonspecific binding is extremely low.
  • Several of the most intense bands visible in the GRB2 pulldowns are known by us to be specific interacting proteins of GRB2 as reported in the literature (see also MS results below confirming this).
  • several isolated interacting proteins for GRB2 and NAPA show gel bands of similar intensity, indicating that binding proceeded with high efficiency for these baits. The very faint CDKN1b lanes indicate that the interactions with this bait were less favored. It is well known in the art that mass spectrometric identification of low abundance proteins in the presence of high abundance proteins is unfavorable.
  • Example 7 Mass Spectrometric Analyses for Protein Identification. Sample digestion
  • Protein mixtures isolated from a pulldown assay described in Example 6 were digested with trypsin in solution to produce a mixture of peptides. The procedure is as follows:
  • the protein solution was diluted with 120 µl of purified water, and 5 µl of 0.1 µg/µl trypsin solution was added thereto and incubated at 37°C for approximately 2 hours. After allowing the digested protein solution to cool to RT, 10 µl of 10% TFA was added to quench the trypsin. The resulting solution was then concentrated under vacuum to yield a final volume of approximately 100 µl in preparation for desalting.
  • the digested proteins were first desalted using a C-18 reverse phase cartridge (Michrom BioResources, Auburn, CA) to remove the salt from the digestion buffer. They were eluted with 95% AcN in water containing 0.1% TFA by volume. Afterwards the eluted sample was taken to dryness in a vacuum centrifuge and then reconstituted in 5 microliters of 2% aqueous AcN containing 0.5% aqueous acetic acid by volume. The reconstituted sample was then injected onto a 300 micron x 5 cm strong cation exchange (CEX) column (Vydac column, Western Analytical Services, Marietta, CA) which was eluted using 250 mM NH4OAc flowing at 4 µl/min.
  • the gradient ran from 0 to 35% NH4OAc in 40 minutes.
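  • The cation exchange gradient can be expressed as a function of time. The sketch below assumes that the gradient is linear and that the percentage refers to the fraction of the 250 mM NH4OAc eluent in the mobile phase; neither point is stated explicitly in the text, and the function names are illustrative.

```python
def gradient_percent(t_min: float, start: float = 0.0, end: float = 35.0,
                     duration_min: float = 40.0) -> float:
    """Percent eluent at time t for a linear gradient, clamped to the run."""
    t = min(max(t_min, 0.0), duration_min)
    return start + (end - start) * t / duration_min


def salt_concentration_mm(t_min: float, eluent_mm: float = 250.0) -> float:
    """Approximate NH4OAc concentration (mM) delivered at time t."""
    return gradient_percent(t_min) / 100.0 * eluent_mm


if __name__ == "__main__":
    for t in (0, 10, 20, 30, 40):
        print(f"t = {t:2d} min: {gradient_percent(t):5.2f}% eluent, "
              f"about {salt_concentration_mm(t):5.1f} mM NH4OAc")
```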
  • CEX-derived fractions were reconstituted in 5 µl of 1% aqueous TFA, loaded onto the autosampler (FAMOS autosampler, LC Packings, Sunnyvale, CA) of the ESI LC/MS/MS system (LC Packings Ultimate LC and either a Q Trap MS system, AB/MDS Sciex, Toronto, Canada, or an LCQ MS system, Thermo Finnigan, San Jose, CA), and injected onto a C-18 reverse phase trap cartridge (LC Packings) which was prewashed for 1 minute at a flow rate of 50 µl/min using 0.5% aqueous acetic acid.
  • the flow was then reversed and the peptides were back eluted onto a 75 micron x 15 cm C-18 reverse phase LC column (LC Packings) for separation of the peptides at a flow rate of 250 nl/min.
  • the loading buffer was 0.5 % aqueous acetic acid and the elution buffer was AcN.
  • MS data were collected in the information dependent mode in which one survey scan was acquired and then followed by the acquisition of three MS/MS spectra.
  • MALDI MS/MS data were acquired in a two step process in which the CEX-derived fractions were reconstituted in 5 µl of 1% aqueous TFA and loaded onto the autosampler of an LC system (LC Packings) that spots the LC effluent directly onto a MALDI target while simultaneously mixing the effluent with MALDI matrix. Spots were deposited every 15 seconds and a total of 144 spots were collected for each CEX fraction. The samples were then analyzed by MALDI MS (AB 4700 Proteomics Analyzer, Applied Biosystems, Foster City, CA) to identify all of the usable peptide signals, which were subsequently subjected to MS/MS analysis.
  • the proteins which were contained in the pulldown samples were identified by comparison of the MS/MS data to theoretical data derived from the human subset of the proteins contained in the NCBInr protein sequence database.
  • the NCBInr protein sequence database was obtained by downloading it from the NCBI website.
  • the matching of spectral data to the database was performed using a commercially available software package called Mascot which was purchased from Matrix Science (London, UK).
  • the information from the separate CEX fractions derived from a single pulldown sample was combined either before or after the database search to yield a single combined result from each pulldown.
  • the algorithm used by Mascot for protein identification uses a two step process in which MS/MS data is first assigned to multiple possible peptide ID's.
  • the Mascot algorithm then assembles the putative peptide ID's into the minimum number of protein ID's that can explain the raw data.
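  • The idea of explaining a set of peptide identifications with the minimum number of proteins can be illustrated with a greedy set-cover sketch. This is not Mascot's actual algorithm, which is proprietary and score-based; it is a simplified stand-in, and the protein and peptide names are hypothetical.

```python
def minimal_protein_set(protein_to_peptides: dict[str, set[str]]) -> list[str]:
    """Greedy approximation of the smallest protein list explaining all peptides.

    At each step, pick the protein that explains the most still-unexplained
    peptides; ties are broken alphabetically for reproducibility.
    """
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        best = max(sorted(protein_to_peptides),
                   key=lambda name: len(protein_to_peptides[name] & uncovered))
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen


if __name__ == "__main__":
    candidates = {
        "protA": {"pep1", "pep2", "pep3"},
        "protB": {"pep3", "pep4"},
        "protC": {"pep2"},  # fully explained by protA, so never selected
    }
    print(minimal_protein_set(candidates))
```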
  • the output from Mascot was then further filtered based on the following criteria.
  • the protein level data had subtracted from it known false positives that were determined to be non-specific interactors on the basis of control experiments using the tag-only construct for pull downs.
  • any proteins that were observed repeatedly across many pulldowns but not observed in the standard control experiment were also subtracted.
  • the proteins were ranked by the score assigned by Mascot and any proteins below a score of 60 were ignored.
  • peptide level results from each listed protein were then screened to make sure that they met a minimum quality. All known interactors were considered identified if they had multiple peptides on which the identification was based or if there was only a single peptide, then its score alone was greater than or equal to 60.
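  • The filtering criteria described above can be summarized in a short sketch. The data structure, the field names and the example hits are hypothetical. The sketch also applies the peptide-level rule (multiple peptides, or a single peptide scoring at least 60) to every hit, whereas the text states that rule for known interactors, so this is a simplification.

```python
from dataclasses import dataclass

SCORE_CUTOFF = 60  # protein and single-peptide score threshold from the text


@dataclass
class ProteinHit:
    name: str
    score: float                # protein-level score
    peptide_scores: list[float]  # scores of the supporting peptides


def filter_hits(hits: list[ProteinHit], control_proteins: set[str],
                frequent_proteins: set[str]) -> list[ProteinHit]:
    """Apply control subtraction, score cutoff and peptide-level checks."""
    kept = []
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        if hit.name in control_proteins:   # seen in the tag-only pulldown
            continue
        if hit.name in frequent_proteins:  # seen repeatedly across many pulldowns
            continue
        if hit.score < SCORE_CUTOFF:
            continue
        multiple_peptides = len(hit.peptide_scores) > 1
        strong_single = (len(hit.peptide_scores) == 1
                         and hit.peptide_scores[0] >= SCORE_CUTOFF)
        if multiple_peptides or strong_single:
            kept.append(hit)
    return kept


if __name__ == "__main__":
    hits = [
        ProteinHit("interactor1", 120, [50, 45]),
        ProteinHit("interactor2", 70, [70]),
        ProteinHit("weak_single", 65, [40]),
        ProteinHit("low_score", 30, [30]),
        ProteinHit("bead_binder", 200, [90, 80]),
        ProteinHit("sticky_protein", 90, [60, 55]),
    ]
    kept = filter_hits(hits, control_proteins={"bead_binder"},
                       frequent_proteins={"sticky_protein"})
    print([hit.name for hit in kept])
```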
  • Example 8 Expression of the First Binding Component Containing Avitag in Mammalian Cells.
  • mammalian expression vectors used herein are referred to as M08-TAG and M09-TAG and contain the nucleotide sequences capable of expressing a fusion protein of the following general design:
  • a total of six genes (PCNA, HDAC, CDKN1b, NAPA, CDK5, Tag-only) were expressed from three separate vectors designated as M07, M08, and M09, which encoded three variations of an N-terminal tag (see below for details) designed for tandem affinity purification.
  • the base vector was derived from an expression plasmid, pT-Rex-DEST30 purchased from Invitrogen Corporation (Carlsbad, CA).
  • pT-Rex-DEST30 gene expression is driven by the cytomegalovirus (CMV) immediate early promoter under the control of the tet operator sequence fused to the 3' end of the promoter.
  • the M07 and M08 vectors are similar in that the functional elements of the tag sequences are identical with the exception that the M08 vector encodes a triple unit repeat of four glycine residues and one serine residue to serve as a spacer to permit more efficient cleavage of bait proteins from the purification tags.
  • the M09 vector is similar to the M07 construct in that it does not contain the triplet repeat spacer, but diverges from the M07 vector due to the substitution of a Protein C binding domain in place of the PreScission protease cleavage site.
  • This design relies on the efficiency of the TEV protease for the removal of the bulk of the tag and affords additional flexibility for purification by incorporating the high affinity Protein C domain.
  • the annotated nucleotide sequences (from the start of the ORF to the beginning of the bait protein) are as follows:
  • GlyGlyGlyGlyGlySer GlyGlyGlyGlySer GlyGlyGlyGlySer Ala GGCGGCGGCGGCAGC GGCGGCGGCGGCAGC GGCGGCGGCGGCAGC GCGCGCGGCGGCAGC GCG
  • TEV CLEAVAGE SITE SPACER 6X His Tag GluAsnLeuTyrPheGlnGly SerSer Ala HisHisHisHis GAGAACCTGTACTTCCAGGGC AGCAGC GCT CATCACCATCAC PreScission CLEAVAGE SITE
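  • An annotated tag sequence of this kind can be checked by translating the nucleotide segments codon by codon and comparing the result with the amino acid annotation. The sketch below uses only the subset of the standard genetic code needed for the segments shown; the function name is illustrative, and codons outside this subset are deliberately rejected.

```python
# Subset of the standard genetic code covering the codons in the tag segments.
CODON_TABLE = {
    "GGC": "Gly", "AGC": "Ser", "GCG": "Ala", "GCT": "Ala",
    "GAG": "Glu", "AAC": "Asn", "CTG": "Leu", "TAC": "Tyr",
    "TTC": "Phe", "CAG": "Gln", "CAT": "His", "CAC": "His",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA segment into three-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("segment length must be a multiple of three")
    residues = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is not in the table subset")
        residues.append(CODON_TABLE[codon])
    return "".join(residues)


if __name__ == "__main__":
    segments = {
        "G4S spacer unit": "GGCGGCGGCGGCAGC",
        "TEV cleavage site": "GAGAACCTGTACTTCCAGGGC",
        "Ser2 spacer": "AGCAGC",
        "His tag segment": "CATCACCATCAC",
    }
    for label, dna in segments.items():
        print(f"{label}: {dna} -> {translate(dna)}")
```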
  • the tag configuration of the mammalian vectors was as follows:
  • M01 was made by amplifying by PCR using the following primers
  • TCCCTCGAGCCGTCGTCGTCATCCTTGTAGTC - XhoI-FLAG-R (SEQ ID NO: 27) using vector pAN5rfc.l FLAG as template.
  • Vector pT-REx-Dest30 and the PCR product were both digested with PinAI and Xho I and ligated together.
  • M07 was created by inserting a DNA fragment encoding "TEV - Ser2 spacer - 6x HIS tag - PS cleavage site" and oligonucleotides encoding AscI, EcoRV, SacII restriction sites on the 5' end and an XhoI site on the 3' end. Also included were an EcoR47III site between the Ser2 spacer and 6x HIS tag and a PmlI site between the 6x HIS tag and the PS cleavage site. The "top and bottom strand" oligos were annealed and cut with AscI/XhoI and cloned into an AscI/XhoI-cut M01 vector. (The AscI site is not regenerated to avoid creating a Proline). Cloned sequences were verified by DNA sequencing.
  • M08 was created by inserting a DNA fragment encoding (G4S)3 and oligonucleotides with an EcoRV site at the 5' end and a SacII site at the 3' end.
  • the "top and bottom strand" oligos were annealed and cut with EcoRV/SacII and cloned into an EcoRV/SacII-cut M07 (pMASH2) vector. (The SacII site is not regenerated). Cloned sequences were verified by DNA sequencing.
  • The mammalian expression vector M09 was made from the vector M07. The result of this construction was to introduce the coding sequence for the Protein C epitope tag and at the same time remove the PreScission cleavage site.
  • Construction of M09 was achieved by amplifying the tag region of M07 with the following oligonucleotide primers in a polymerase chain reaction (PCR).
  • This primer binds to a region 129bp upstream of the Avitag coding region.
  • This primer encodes the Protein C epitope tag region at the 5' end and the 3' region binds to the 6xHIS region as indicated in the design above.
  • The primer also encodes an XhoI site directly after the Protein C tag.
  • Each plasmid was transfected into 293 T-Rex cells (Invitrogen Corporation, Carlsbad, CA) using the following protocol: 24 hours prior to transfection, cells were plated at 90% density in 10 cm tissue culture plates. For DNA complex formation and transfection, 24 µg of M07, M08 or M09 plasmid DNA containing the nucleotide sequence encoding one of the six bait proteins as listed above was diluted in 1.5 ml of serum free media.
  • 60 µl Lipofectamine 2000 reagent (Invitrogen Corp., Carlsbad, CA) was diluted in 1.5 ml serum free media. After a five minute incubation period at room temperature, the DNA and transfection reagent dilutions were combined, mixed gently and allowed to incubate for 20 minutes at room temperature.
  • The Lipofectamine/DNA mixture was then added to the media of 293 T-Rex cells at ~90% density in a 10 cm plate, rocked gently and placed in an incubator at 37°C (5% CO2). Following a 4 hour incubation period, the media were changed and the cells were placed back in the incubator. 24 hours after the transfection, 250 µg/ml of geneticin antibiotic was added to the culture media to eliminate non-transfected cells.
  • The 293 T-Rex cells (Invitrogen Corp., Carlsbad, CA) are engineered to express the tet repressor protein, and thus protein expression from the M07, M08 and M09 vectors is restricted in the absence of tetracycline. Addition of tetracycline displaces the tet repressor protein and allows transcription from the CMV promoter. Seven days after transfection and multiple passaging of the cell populations to permit sustained expansion, transgene expression was induced by adding tetracycline at a final concentration of 10 µg/ml for a period of 48 hours.
  • The media were removed, the culture plates were rinsed once with cold phosphate buffered saline and protein lysates containing the first binding component expressed from vectors M07, M08 and M09 were harvested using a gentle lysis buffer, as described in Example 5, which was added directly to the plates.
  • The lysed cells were removed from the dishes using cell scrapers and the extracts were placed on ice for 15 minutes. Following this incubation, the lysates were centrifuged at 3,000 x g for 15 minutes to remove cell debris and insoluble matter. The remaining supernatant was aliquoted, snap frozen in liquid nitrogen and stored at -80°C.
  • Example 9 Use of Transgenic Animals to purify Protein complexes.
  • Transgenic animals provide enormous potential for the study of biological processes and the modeling of disease [Prosser, H. and Rastan, S., Trends Biotechnol. (2003) 21(5):224-32].
  • They are generated by the introduction of recombinant DNA expression constructs into a very early stage embryo which matures to become an adult organism. This is accomplished by microinjecting the DNA into a single embryonic stem cell which is then implanted into a blastocyst or multicell embryo. The blastocyst is then implanted into a pseudo-pregnant female where, if all goes well, the embryo develops into a healthy newborn.
  • The introduced DNA can be designed to integrate randomly into the genome, to integrate specifically in the genome and remove or replace existing sequences by homologous recombination as in a "knock-out", or to integrate specifically to introduce sequences as in a "knock-in."
  • Using existing technologies for the formation of transgenic animals it would be possible to introduce the coding sequences for a tagged protein or proteins into intact, living organisms including but not limited to mice, rats, rabbits or primates. Using this approach, the experimentalist would essentially be able to perform an in vivo pulldown experiment.
  • The primary advantages of this technique include:
  • a. the ability to provide all potential interacting proteins in the appropriate context;
  • b. the ability to analyze protein-protein interactions from each/any organ system independently;
  • c. the ability to examine protein-protein interactions over time and through every developmental stage of the host organism;
  • d. the ability to combine the study of tagged protein-protein interactions with established animal models of acquired and inherited disease.
  • Refinements of this approach could include the utilization of naturally or artificially regulated and/or tissue specific promoters to induce or direct expression of the tagged protein at a specific time, dose or in a desired subset of organismal tissues [Chandrasekaran et al. (1996) J Biol Chem. 271(45):28424-21].
  • This technique could be employed if a given protein were found to be toxic or not well tolerated in a host animal, or simply if the experimental design required specific expression levels or tissue restriction of expression.
  • A further embodiment of this approach could include the use of "knock-in" homologous recombination to specifically insert the preferred tag sequences in frame (N-terminally, C-terminally or internally) with a given gene such that the augmented gene sequence would encode the desired fusion peptide with minimal disruption of the native gene promoter/enhancer sequences [Yu et al. (2003) Neurosci. 23(6):2193-202].
  • The transgenic animals could be interbred to facilitate studies of one or more biological pathways in which it might be advantageous to have multiple tagged proteins present in the living organism.
  • A tissue lysate can be prepared from a transgenic animal as described in Example 5, a protein complex can be purified, and interactor proteins can be identified according to the invention disclosed herein.
  • The studies described in the above examples demonstrate that the present invention, i.e., a method of purifying a protein complex by using a modified bait protein containing affinity tags of high specificity separated by a specific protease cleavage site, can be used to purify protein complexes containing several interacting proteins as evidenced in Fig. 5 and in Table 3.


Abstract

The present invention provides a method for isolating protein complexes from a cell, a cell or tissue lysate or a whole organism by employing a combined set of affinity tags of high affinity, specificity and ease of elution. The method involves using a known protein modified to contain more than one affinity tag separated by one or more specific protease cleavage sites, as bait, to isolate any interacting proteins or fragments thereof. The proteins or fragments thereof contained in the isolated complex can then be identified and the interacting partners can be used as new targets for diagnostic tools or the basis for the development of new compounds for therapeutic drug intervention.

Description

PROTEIN COMPLEX PURIFICATION
CROSS-REFERENCE TO RELATED APPLICATIONS
The present application claims priority from United States Provisional Patent Application Serial No. 60/379,317 filed May 9, 2002, which is incorporated herein in its entirety, by reference.
BACKGROUND OF THE INVENTION
The present invention relates to a method for purifying a protein complex from a cell, a cell lysate, tissue lysate or an organism and identifying interacting proteins contained in such complex.
In living cells, complex processes are typically accomplished by highly specific binding interactions among functional cell components, most commonly involving one or more proteins. Understanding which proteins bind to one another, and under what circumstances, poses difficult unsolved problems. An approach to learning which proteins bind to each other to form protein complexes is to isolate functional protein complexes, or portions thereof, in order to identify their components. This approach poses severe technical obstacles if the results are to be meaningful.
Recent advances in human genomics research have led to rapid progress in the identification of novel genes. In biological and pharmaceutical research, there is a need to determine the functions of gene products. An important step in defining the function of a novel gene is to determine its interactions with other gene products in the appropriate context. Several approaches have been devised towards this goal of determining a novel gene product's physical interaction with other gene products. Specifically, yeast two-hybrid systems have been extensively employed to define interactions existing among proteins. The principles and methods of the yeast two-hybrid system have been described in detail elsewhere [Bartel and Fields (1997) The Yeast Two-Hybrid System. Oxford University Press, New York; Fields and Song (1989) Nature 340:245-246]. However, the number of false-positives and false-negatives arising from yeast two-hybrid screening is high and thus extensive experimentation is required to validate the interactions observed in this system. Moreover, a yeast two-hybrid experiment results in the identification of a pair of directly interacting proteins. For this reason, yeast two-hybrid is not readily useful for identifying higher order structures in multiprotein complexes comprised of both direct and indirect protein associations.
There is a continuing need for the discovery of additional protein-protein interactions that are physiologically relevant. The present invention provides an efficient method for purifying protein complexes from a cell, cell lysate, tissue lysate or organism by employing a combined set of affinity tags attached to a known protein which may be used as a bait to isolate any interacting proteins therewith. The advantages of the invention will be evident in the following description.
SUMMARY OF THE INVENTION
The present invention provides efficient methods for purifying protein complexes from a cell, a cell or tissue lysate or an organism by employing a combined set of affinity tags of high affinity, specificity and ease of elution. The proteins, or fragments thereof, purified according to the invention, can then be identified and used as new targets for therapeutic intervention, as diagnostic tools, and as the basis for generating new animal and cell models.
The invention employs a known protein or a fragment thereof, termed "bait", which is modified to contain at least two affinity tags separated by a sequence of amino acids which is cleavable by a specific protease. The modified bait protein is termed herein a "first binding component". The modified portion of the bait protein can be present as an extension either at the amino terminus or the carboxyl terminus of the bait or at both termini. Alternatively, one or more affinity tags can be inserted in the ORF of the bait. The affinity tags in the first binding component serve as specific means for purifying a desired protein complex bound to the first binding component before and after a protease digestion. The inclusion of the protease specific segment between the affinity tags can provide an enhanced specificity of the second affinity purification step since the protease cleavage can generate a tag sequence that is more selective for the second affinity ligand. The design of the second affinity tag segment of the first binding component can be such that two or more tags are assembled in tandem, possibly separated by specific protease sites different from that between the first affinity recognition sequence and the second affinity tag segment. This design allows significant flexibility in purification strategies with advantages that will be evident below.
An example of the first binding component is a bait protein containing a biotinylation recognition sequence and a peptide affinity tag sequence of a hexapeptide comprised of six histidines (6HIS) separated by a protease cleavage sequence for TEV (Tobacco Etch Virus) protease. Any protein or a fragment thereof can serve as the bait. The invention does not require any knowledge of the function of the bait protein and can thus serve as a general purification strategy for purifying any protein complex.
The following is the general design of the first binding component exemplified in the invention:
(Biotinylation recognition sequence)-(protease cleavage sequence)-(affinity tag)-(+/- protease cleavage sequence)-(+/- affinity tag)-(bait protein).
Provided herein is a method for purifying a protein complex from a cell, a cell or tissue lysate, or an organism containing the protein complex, comprising the steps of:
a) providing a first binding component comprising four parts: 1) a peptide segment having an affinity modifiable segment, 2) a protease specificity segment, 3) a peptide affinity tag, and 4) a bait;
b) contacting the first binding component with the protein complex, whereby part 4) of the first binding component binds to said protein complex, thereby forming a bait-bound protein complex;
c) modifying part 1) of the first binding component by attaching an affinity ligand thereto, thereby forming an affinity-tagged bait-bound protein complex;
d) contacting the complex formed in step c) with a first affinity matrix specific for said affinity ligand thereby binding said complex to said matrix, and separating the complex from unbound material;
e) contacting the complex formed in step c) with a protease that specifically cleaves part 2) of the first binding component thereby cleaving the first binding component and forming a second binding component comprising parts 3) and 4) bound to the protein complex, but not bound to the first affinity matrix, wherein a peptide affinity tag of part 3) is retained;
f) contacting a peptide affinity tag of the second binding component with a second affinity matrix specific for a peptide affinity tag of the second binding component thereby binding the protein complex to the second affinity matrix and separating the bait-bound protein complex from unbound material; and
g) removing the second binding component and bait-bound protein complex or its components from the second affinity matrix, whereby a protein complex is purified together with the bait.
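The order of steps a) through g) can be made explicit with a small state model. The following is a toy sketch only: the class, function and partner names are hypothetical, the protease site is treated as removed in full on cleavage (a simplification), and nothing here models binding chemistry.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Complex:
    """Toy model of the first binding component and its bound partners."""
    # Parts 1) to 4) of step a), ordered from the distal end toward the bait.
    parts: List[str] = field(default_factory=lambda: [
        "affinity_modifiable_segment", "protease_site",
        "peptide_affinity_tag", "bait"])
    partners: List[str] = field(default_factory=list)
    ligand_attached: bool = False
    matrix: Optional[str] = None  # "first", "second" or None (in solution)


def contact(c: Complex, partners: List[str]) -> None:
    c.partners = list(partners)              # step b)


def attach_ligand(c: Complex) -> None:
    c.ligand_attached = True                 # step c)


def bind_first_matrix(c: Complex) -> None:
    if not c.ligand_attached:
        raise RuntimeError("attach the affinity ligand before step d)")
    c.matrix = "first"                       # step d)


def cleave(c: Complex) -> None:
    if c.matrix != "first":
        raise RuntimeError("cleavage is modeled on the first matrix")
    cut = c.parts.index("protease_site")
    c.parts = c.parts[cut + 1:]              # step e): parts 3) and 4) remain
    c.matrix = None                          # released from the first matrix


def bind_second_matrix(c: Complex) -> None:
    if c.matrix is not None or "peptide_affinity_tag" not in c.parts:
        raise RuntimeError("a free complex retaining part 3) is required")
    c.matrix = "second"                      # step f)


def elute(c: Complex) -> List[str]:
    if c.matrix != "second":
        raise RuntimeError("nothing is bound to the second matrix")
    c.matrix = None                          # step g)
    return ["bait"] + c.partners


# Hypothetical run with two placeholder interactors.
c = Complex()
contact(c, ["partner_A", "partner_B"])
attach_ligand(c)
bind_first_matrix(c)
cleave(c)
bind_second_matrix(c)
print(elute(c))
```

The guards in each function encode the dependencies among the steps: the first matrix can capture the complex only after the ligand is attached, and the second matrix relies on the peptide affinity tag surviving the protease step.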
The affinity tags useful in the invention include, but are not limited to, an amino acid recognition sequence of a biotin ligase, 6HIS hexapeptide, Strep-tag II™ (Sigma GenoSys), calmodulin binding peptide (CBP), any number of available epitopes with appropriate immunochemical reagents available (for example, hemagglutinin (HA), FLAG, MYC), as well as polypeptides to which immunological or nonimmunological (e.g. aptamers) reagents can be prepared. The amino acid sequence of a biotinylation recognition sequence is particularly preferred since the affinity of biotin for streptavidin is one of the strongest noncovalent interactions known, allowing for highly specific binding of a biotin-tagged protein to an affinity matrix of immobilized streptavidin or suitable avidin-like molecule.
The protease specificity sequences useful for the invention include, but are not limited to, those of TEV, PreScission protease™ (Amersham Biosciences, Piscataway, NJ), enterokinase, thrombin, Factor Xa, and furins. Any amino acid sequence providing a recognition sequence for a protease having a high specificity is suitable for the invention. High specificity, as used herein, means that the protease cleaves only at the recognition sequence and that that recognition sequence is typically comprised of several amino acids or alternatively is specific to a unique secondary or tertiary structure formed by the recognition sequence.
Examples of the bait protein include, but are not limited to, proliferating cell nuclear antigen (PCNA), histone deacetylase 1 (HDAC1), cyclin-dependent kinase inhibitor 1b (CDKN1b), N-ethylmaleimide-sensitive factor attachment protein (NAPA), cyclin-dependent kinase 5 (CDK5), growth factor receptor-bound protein 2 (GRB2), eukaryotic initiation factor 4E (eIF-4E), and cyclin D1.
The first binding component (i.e., modified bait protein) is prepared in vivo in a cell, or organism or in vitro or a combination of both. If the modified bait protein is expressed in a cell from which a desired protein complex is to be purified, those proteins and other cellular components with which it normally binds are purified together with the bait protein under conditions that do not disrupt the binding interactions between those proteins. Alternatively, the modified bait protein can be expressed in a transgenic animal and tissues or whole organisms can be used in the invention to purify protein complexes. The first binding component can also be prepared recombinantly in bacteria or another host using methods well-known in the art. In this instance, the purified first binding component is mixed with a cell or tissue lysate of choice to form a protein complex in vitro, which is then purified according to the steps described herein.
The individual proteins that comprise the complexes purified according to the invention are identified by a variety of mass spectrometric methods which include an associated set of separation methods. The identification of the interacting proteins will provide new targets for the identification of useful pharmaceuticals and diagnostic tools.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 shows that the bait proteins in AviTag E24 (A) and E25 (B) vectors are expressed upon induction in E. coli. Equal amounts of cells uninduced (U) and induced (I) with arabinose were resuspended in the SDS-Laemmli sample buffer, the lysates were separated by SDS-PAGE and the gels were stained with Coomassie stain. The arrows indicate the location of the fusion proteins.
Fig. 2 shows the results of purification of E24 and E25 fusion protein constructs using Ni2+ affinity chromatography via the 6HIS affinity tag. Each sample was loaded on the gel in the amount of 5 μg of total protein. PS is the Mark12 protein standard (Invitrogen). Numbers at or above bands indicate the predicted molecular weight (kDa) for each protein.
Fig. 3 confirms the biotinylation of the E24 and E25 tagged proteins and their ability to bind to neutravidin beads. Western blots are shown of samples [tags only (A), GRB2 (B), NAPA (C), CDKN1b (D)] taken before biotinylation, N, after biotinylation, B, and the supernatant, S, following the binding to UltraLink neutravidin beads (Pierce). PS indicates the Mark12 Protein standards (Invitrogen). Aliquots of approximately 20 ng of each purified protein sample were loaded on the gel. The Western blots were developed using neutravidin-horseradish peroxidase complex and TMB (tetramethyl benzidine, Sigma) substrate. Also shown in (E) is a gel-shift assay for the CDKN1b-E25 fusion protein. Approximately 2 μg of expressed protein, before biotinylation, N, or after biotinylation, B, were incubated in the presence or absence of 4 μg of neutravidin protein, NA, and then analyzed by SDS-PAGE under non-denaturing conditions (nonreducing, unboiled).
Fig. 4 demonstrates the efficiency of TEV protease in cleaving the E24 (A) and E25 (B) fusion proteins bound to neutravidin beads. The starting uncleaved fusion proteins (U) and the total available uncleaved fusion protein bound to beads (T) are shown as controls. Gray and black arrows indicate the cleaved fusion protein and TEV protease, respectively. TEV was used at two different concentrations (1X, 10X) as detailed in the Examples Section. Minus and plus signs refer to the absence or presence of cell lysate and thus bound interacting proteins.
Fig. 5 shows the results of SDS-PAGE analysis of the isolated protein complexes following two-step affinity purification. Shown are proteins either eluted from the second affinity matrix (Ni2+ beads) by a sarkosyl detergent or remaining on the beads following elution (Ni2+ beads). The "C" label indicates contamination, due to bead carry-over, of TEV protease, which also contains a 6HIS tag.
DETAILED DESCRIPTION OF THE INVENTION
In general, the terms and phrases used herein have their art-recognized meaning, which can be found by reference to standard texts, journal references and contexts known to those skilled in the art. The following definitions are provided to clarify their specific use in the context of the invention.
The term "protein complex", as used herein, designates a cluster of macromolecules comprising at least one protein wherein the cluster is stabilized by non-covalent bonds. A protein complex can be comprised entirely of proteins or peptides, or it can include carbohydrates, lipids, glycolipids, nucleic acids, oligonucleotides, nucleoproteins, nucleosides, nucleoside phosphates, enzyme co-factors, porphyrins, metal ions and the like, or any biomolecule.
The term "bait" or "bait protein" is used synonymously herein and is a peptide for which the nucleotide coding sequence is known in the art or is obtainable by employing methods well known in the art.
The term "first binding component" is typically, but not necessarily, a protein, synonymously used as the "modified bait" herein. The first binding component possesses two properties significant for the invention: (a) it can bind to another protein or protein complex in a cell, in vitro, or an intact organism, cell lysate or tissue lysate, and (b) it can bind to an affinity reagent or can be modified by attachment of an affinity ligand. Property (a) is an inherent biological property of the first binding component. Property (b) is conferred by adding to the native structure of the bait a peptide tail having three functional segments.
The first segment is herein termed the "affinity modifiable segment". The affinity modifiable segment has an amino acid sequence providing specificity for covalent attachment of an affinity ligand. Exemplified herein is an amino acid sequence of a biotinylation recognition sequence, which can be recognized by a biotin ligase (endogenous or added exogenously) to covalently attach a biotin molecule to the peptide. The biotin moiety serves as an affinity ligand that specifically binds any avidin-like reagent with a high affinity for biotin, for example streptavidin or neutravidin, immobilized in a matrix usually in the form of a chromatography support or bead. In principle, any amino acid sequence providing recognition for attachment of an affinity ligand can be employed. We demonstrate this principle in the examples herein using the AviTag (Avidity, Denver, CO) as the biotinylation recognition sequence and the BirA E. coli gene product as the biotin ligase.
The second functional segment is termed as "protease specificity sequence" and contains an amino acid sequence providing a recognition site for a protease that specifically cleaves at or near the recognition sequence. Many specific proteases are known, together with their recognition sequences, including, but not limited to TEV protease, PreScission protease™ (Amersham Biosciences, Piscataway, NJ), enterokinase, clotting factors such as Factor Xa, furins, purins, and the like. It is preferred to employ the recognition sequence of a protease that does not cleave a peptide bond within the protein complex itself.
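The preference for a protease that does not cut within the complex itself can be screened for, at the sequence level, by scanning a candidate bait for internal recognition motifs. The sketch below is illustrative: the motif patterns are commonly cited consensus sequences supplied as assumptions (they are not given in this document), the bait fragment is invented, and a sequence scan says nothing about whether a site is accessible in the folded protein.

```python
import re

# Commonly cited recognition motifs (one-letter code). These consensus
# patterns are assumptions supplied for illustration only.
PROTEASE_MOTIFS = {
    "TEV": r"E..Y.Q[GS]",
    "PreScission": r"LEVLFQGP",
    "enterokinase": r"DDDDK",
    "thrombin": r"LVPRGS",
    "Factor Xa": r"I[ED]GR",
}


def internal_sites(sequence: str) -> dict:
    """Map each protease to the 0-based start positions of motif matches."""
    seq = sequence.upper()
    hits = {}
    for name, motif in PROTEASE_MOTIFS.items():
        # A zero-width lookahead reports overlapping matches as well.
        hits[name] = [m.start() for m in re.finditer(f"(?=(?:{motif}))", seq)]
    return hits


def compatible_proteases(sequence: str) -> list:
    """Proteases with no recognition motif found inside the sequence."""
    return [name for name, pos in internal_sites(sequence).items() if not pos]


# Hypothetical bait fragment, invented for this example.
bait = "MSDDDDKAAIEGRTT"
print(internal_sites(bait))
print(compatible_proteases(bait))
```

For this invented fragment the scan flags enterokinase and Factor Xa motifs, so under these assumptions TEV, PreScission or thrombin would be the candidates to consider.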
The third functional segment is termed the "peptide affinity tag" and is a peptide having a sequence capable of being specifically bound to and eluted from one or more ligands of an affinity matrix such as a chromatography support or bead. Typical examples of a peptide affinity tag include an epitope, which can bind to a matrix-immobilized antibody, or a specific binding protein. Preferred peptide affinity tags are those which are elutable from the affinity matrix by mild conditions unlikely to disrupt the protein complex and/or elute nonspecifically associated contaminants or that interact tightly with the affinity matrix such that specific elution conditions can be employed to preferentially elute the interacting proteins but retain some or all of the first binding component. Exemplified herein below is a hexapeptide 6HIS, which is known to specifically bind to a column of nickel (Ni2+) or cobalt (Co2+) with high affinity [Crowe et al. (1994) In Methods in Molecular Biology (Harwood, A. J., ed.), Vol. 31:371-387, Humana Press, Inc., Totowa; Porath et al. (1992) J. Protein Expr. Purif. 3:263-281]. Another example of a peptide affinity tag is a 12 amino acid peptide, known as the Protein C tag in the art, which is recognized in a calcium dependent manner by the commercially available monoclonal antibody HPC4 (Roche Applied Science, Indianapolis, IN). Other examples of affinity tags are the FLAG M1™ (Sigma Corp., St. Louis, MO) epitope and calmodulin binding peptide (CBP), whose respective affinity interactions are reversibly Ca2+ dependent. The third functional segment can also be composed of two or more different affinity tags in tandem that may or may not be separated from each other by specific proteolytic sites that are different from that adjacent to the affinity modifiable segment. When the third functional segment is composed of multiple affinity tags they can be used in alternative or sequential affinity purification steps.
When greater purity is desired, sequential affinity purification steps can be used. Alternative affinity purification steps can allow for customization of the purification (e.g. a bait or bait-complex may sterically hinder one affinity tag but not another).
The three functional segments described above are included in a single linear peptide bound to the bait. If the bait is itself a protein, the modifying peptide segment can be present as an extension of the amino terminus or the carboxy terminus of the bait or at both termini. Alternatively, one or more affinity tags can be inserted in the ORF of the bait. When the combined tags are present on the same side of the bait (each N-terminal, or each C-terminal), the functional segments of the modifying peptide are arranged in sequence, with the peptide affinity tag being proximal to the bait followed by the protease specificity sequence, then by the affinity modifiable segment, the latter being most distal. It is desirable that the modifying peptide be short, to minimize any effect it may have on normal binding properties of the bait to the protein complex. The invention can also be practiced with the same affinity tag serving as the affinity segment more than once. In some cases it may be desirable to place tags at both the N and C-terminus with protease digestion sites located between the "bait" and the tags.
The term "second binding component", as used herein, refers to a protein complex comprising one or more peptide affinity tags and a bait bound to a protein complex which is formed after a protease digestion.
The present invention relates to methods for purifying protein complexes from a cell, a cell or tissue lysate or an organism by employing the first binding component described above, i.e., the bait protein containing at least two affinity tags separated by an amino acid sequence encoding a specific protease recognition site in such a way that at least two rounds of affinity purification can be carried out with a protease cleavage step occurring between one or more of the purification steps.
The isolation and identification of proteins bound to an exogenous bait protein introduced at relatively normal expression levels in cells, tissues or organisms has been a challenging problem. The need to express the bait at low levels in order to avoid the introduction of aberrant interactions creates a situation where small amounts of protein are isolated from large numbers of cells (10^8 to 10^10) or equivalent cell or tissue lysate and thus in the presence of an abundance of contaminants. The present invention solves this problem by incorporating a combined set of affinity tags that utilize high affinity, specificity and ease of elution. Disclosed herein is a new combination of existing affinity tags that, when combined, are ideal for this application. In particular, we describe the use of biotin ligase recognition sequences that are small, allow for extremely high affinity isolation and introduce a minimal background due to low levels of endogenous ligands. Biotin ligase or biotinylation recognition sequences, in combination with protease recognition sequences and affinity tags, create ideal constructs for the applications described. Additionally, we describe in one embodiment the novel use of a protease digestion step that can create the specificity required for an efficient second purification strategy, e.g., a calcium-dependent step that allows a gentle and specific elution.
Here is the general design of the first binding component:
1st segment (Affinity modifiable sequence) - 2nd segment (A specific proteolytic cleavage site) - 3rd segment (One or more specific affinity tags with or without another proteolytic cleavage site between them and/or the bait)-(bait).
As an example of the invention, the biotin-avidin system is described for use as an affinity modifiable segment. Specifically, the system involves the incorporation of a biotinylation recognition sequence (which can be any one of a number of amino acid sequences specifically recognized and biotinylated by members of the biotin ligase family of enzymes), followed by a specific protease digestion site and then finally a hexapeptide 6HIS tag. However, the third functional segment can be one or more specific peptide sequences that can serve as affinity tags, e.g., the Protein C tag, Strep-tag II™ (Sigma GenoSys), Hemagglutinin tag or FLAG recognition sequence. The biotinylation recognition sequence can be a short peptide or a protein containing natural or unnatural amino acid sequences that will be biotinylated by a specific biotin ligase in vivo or in vitro.
The following are two possible designs of the first binding component:
(Biotinylation recognition sequence)-(TEV protease site)-(6HIS) -(bait protein) or
(Biotinylation recognition sequence)-(TEV protease site)-(Protein C tag)-(bait protein).
In the above designs, a hexapeptide 6HIS and/or the Protein C tag is used as the peptide affinity tag. The Protein C tag is a 12 amino acid peptide recognized in a calcium dependent fashion by the monoclonal antibody HPC4 (Roche, Indianapolis, IN). Additionally, the Protein C tag can be placed at the N or C-terminus of a protein or internally and still be recognized by HPC4. For this reason, a wide selection of protease cleavage sites can be incorporated and still permit purification using HPC4. It also means that a single antibody can be used for both the initial purification of the modified bait protein and the subsequent second-step purification of the multi-protein complex. In the above examples, the TEV protease site is used as one of many possible proteases. In general, since this design is flexible with regard to the type of protease used, it is advisable to select a highly specific protease, such as the TEV protease, to reduce inappropriate proteolysis.
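The two designs above can be written out at the sequence level to show what remains attached to the bait after TEV digestion. This is a minimal sketch under stated assumptions: ENLYFQG is the TEV site shown in the vector annotation of the Examples, the AviTag and Protein C sequences are the commonly published ones rather than sequences given here, the bait is a made-up placeholder, and any spacers are omitted.

```python
import re

# Illustrative sequences in one-letter code (see assumptions in the lead-in).
AVITAG = "GLNDIFEAQKIEWHE"     # biotinylation recognition sequence (assumed)
TEV_SITE = "ENLYFQG"           # TEV protease site
HIS6 = "HHHHHH"                # hexapeptide 6HIS tag
PROTEIN_C = "EDQVDPRLIDGK"     # 12-residue Protein C tag (assumed)
BAIT = "MKTAYIAKQR"            # hypothetical bait fragment


def build(tag: str, bait: str = BAIT) -> str:
    """Assemble (biotinylation sequence)-(TEV site)-(tag)-(bait)."""
    return AVITAG + TEV_SITE + tag + bait


def tev_cleave(protein: str):
    """Split at the first TEV site; TEV cuts between Gln and Gly/Ser."""
    match = re.search(r"E..Y.Q(?=[GS])", protein)
    if match is None:
        return None, protein
    return protein[:match.end()], protein[match.end():]


for label, tag in (("6HIS design", HIS6), ("Protein C design", PROTEIN_C)):
    released, retained = tev_cleave(build(tag))
    print(label, "| released:", released, "| retained:", retained)
```

In both cases the biotinylated segment is released with the first six residues of the TEV site, while the retained fragment begins with the single Gly left by cleavage, followed by the peptide affinity tag and the bait.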
The bait portion of the first binding component can be any protein or a fragment thereof. When the nucleic acid sequence encoding such protein or the fragment thereof, is known, a priori, the first binding component can be prepared by standard cloning procedures known to those in the art. Alternatively, the first binding component could be prepared from cDNA libraries, or the like, without prior knowledge of the bait nucleic acid sequence. During such a "random" or "shotgun" approach, the identity of the bait protein could be determined later by mass spectrometry.
Following is a list of bait proteins that have been used to illustrate the invention, together with some of the proteins likely to be isolated as part of a protein complex isolatable by the method of the invention. (The proteins are listed by the standard names by which they are known in the art, and by which they are indexed in public databases, e.g., Genbank.)
1. GRB2: (NM_002086) can serve as bait for Sos1, Shc, dynamin2 (see Table 3 for a more complete list).
2. NAPA: (NM_003827) can serve as bait for syntaxins, VAMP, SNAP-23 (see Table 3 for a more complete list).
3. CDKN1b: (NM_004064) can serve as bait for CDC2, CDK2, GRB2 (see Table 3 for a more complete list).
For the purpose of further illustration, the following additional bait proteins are noted, together with some of the proteins likely to be isolated as part of a protein complex isolatable by the method of the invention.
4. eIF-4E: (NM_001968) can serve as bait for purifying a protein complex that includes eIF-4A, eIF-4GI, MNK2, eIF-3 (itself a ten-subunit complex), ERK1/2 and possibly others.
5. Cyclin D1 (CCND1): (NM_053056) can serve as bait for purifying a protein complex that includes CDK4, PCNA, p21/Cip1 (CDKN1A) and possibly others.
6. PCNA: (NM_002592) can serve as bait for Hus1, Rad9 and possibly others.
The modified bait protein, referred to herein as the first binding component, can be prepared in vivo, in vitro or by a combination thereof, as described in the Examples Section. Once a construct capable of expressing a desired bait protein with the appropriate modifications is prepared, the modified bait protein can be expressed in prokaryotic cells (e.g., E. coli) by employing standard protocols well known in the art [Makrides, S.C. (1996) Microbiol. Rev. 60:512-538; Baneyx, F. (1999) Curr. Opin. Biotech. 10:411-421]. Alternatively, the first binding component can be expressed in eukaryotic cells (e.g., mammalian cells or yeast) [Logan A. C. et al. (2002) Curr. Opin. Biotechnol. 13:429-436; Regulier, E. (2002) Hum. Gene Ther. 13:1981-1990; Geisse, S. et al. (1996) Protein Expression and Purification 8:271-282]. Finally, the first binding component can be expressed in an organism by transgenic or "knock-in" methods, discussed in more detail in the Examples Section.
When the first binding component is prepared in a recombinant host (e.g., E. coli), purification of the recombinantly expressed protein can be performed efficiently using the affinity tag (e.g., 6HIS) contained therein by employing standard biochemical approaches (e.g., Ni²⁺ beads/column). In this case it is preferable to retain all three functional segments. If biotinylation of the bait protein has not occurred during expression in the recombinant host organism, it can then be performed in vitro using, as an example, a recombinantly expressed and purified product of the E. coli BirA biotin ligase gene. The purified first binding component can then be mixed with a source of protein complex, such as a cell lysate, tissue lysate or organism lysate. The protein complex is then isolated as described in Example 6. The modified bait protein can also be expressed in cells or an organism containing possible ligands, and possibly biotinylated in the cell or organism by an endogenous or exogenous biotin ligase, expressed normally or recombinantly introduced. This system is thus applicable for experiments where the modified, biotinylated bait is expressed in cells of many types (prokaryotic or eukaryotic). Mammalian cells or whole organisms (e.g., mice transgenic for the tagged first binding component) offer an added advantage since they allow the isolation of multi-protein complexes after their formation in situ. An example of the use of a biotin tag and endogenous biotinylation in mammalian cells can be found in [Parrott M.B. et al. (2001) BBRC 281:993-1000; Parrott M.B. et al. (2002) Mol Ther. 1:96-104]. The biotin-tagged proteins, together with associated ligands of protein complexes, are isolated from the cell, tissue or organism lysates using an avidin-like affinity reagent. Specific elution of biotin-tagged proteins from the affinity column is then performed by digestion with a protease (e.g., TEV protease).
At this stage the digestion step can serve several purposes: (1) it allows the specific elution of the bait and the associated protein complexes for immediate analysis or a second affinity purification step. The immediate analysis of the eluate from the first purification step may be advantageous for the identification of transiently interacting proteins that would normally be lost during multi-step purifications. The extremely high affinity of the biotin-avidin interaction may enable this because complexes are isolated and concentrated more quickly and may survive very rapid, yet stringent washing prior to elution; (2) it can expose a second affinity tag, hitherto cryptic or sterically hindered, for use in a second round of purification; or (3) it can create the specificity required for the second step. When a 6HIS sequence is used as the affinity tag, nickel-chelate bound beads allow for a second affinity purification step to remove contaminants remaining after the protease digestion. The 6HIS tag is particularly useful when it is advantageous to remove all or some of the bait protein prior to mass spectrometry analysis, because the 6HIS-Ni²⁺ interaction survives relatively strong denaturing conditions. This is illustrated in Example 6. If a FLAG sequence is used as the affinity tag, a cryptic epitope can be created that is exposed upon the protease digestion step. The digestion can create an N-terminal FLAG sequence that is specifically recognized by a calcium dependent antibody (e.g., SIGMA FLAG M1™). This antibody does not recognize the FLAG sequence if it is internal or C-terminal. In the presence of calcium a second purification step can be done with immobilized anti-FLAG M1™ antibody. The specific proteins can then be isolated by a second specific and gentle elution with a calcium chelator such as EGTA.
Ideally, when combined with a FLAG epitope, the digestion sites should incorporate a sequence recognized by a protease that cleaves C-terminally and "exo" to its recognition sequence, i.e., between the recognition sequence and the peptide affinity tag. Examples of this include enterokinase, Factor Xa and furins. Additionally, it is possible to use proteases that do not cleave "exo" to their recognition sequence, if the amino acids left behind following cleavage are part of or compatible with the FLAG M1 recognition sequence. A third example of an affinity tag is the Protein C epitope (Roche, Indianapolis, IN), which is also recognized by a calcium dependent antibody. Since this epitope is not sensitive to its location (it can be N-terminal, C-terminal or internal), this design is more flexible with regard to the protease used following the first affinity purification step.
The present invention has several characteristics that distinguish it from the conventional methods [Puig O. et al. (2001) Methods 24:218-229]. The first distinguishing characteristic is the use of the biotin-streptavidin interaction for the first step in the isolation of a complex bound to the bait. This interaction is seven orders of magnitude higher in affinity than the protein A-IgG system used in the TAP protocol (Kd of one Z domain binding to IgG is approximately 10⁻⁸ M [Braisted and Wells, (1996) Proc. Natl. Acad. Sci. USA. 93:5688] vs. a Kd of 10⁻¹⁵ M for biotin-streptavidin). This is critical because the first purification step requires the isolation of the protein complex when it is in its most dilute and contaminated state. The higher binding affinity allows the isolation of biotinylated protein and associated ligands present at femtomolar or higher concentrations and permits a very stringent wash (if necessary) without loss of bait protein. As discussed above, this also allows rapid isolation and concentration of protein complexes, minimizing losses of specific interactors and potentially enabling the analysis of transiently interacting proteins. The published protein A-IgG system would only allow the isolation of proteins present at concentrations several logs higher, and would not allow as rapid an analysis or permit the use of such a stringent wash. The second improvement over the protein A-IgG system is that the consensus biotinylation sequences are short and thus potentially less disruptive to protein folding during expression and to subsequent protein-protein interactions. If the FLAG tag is used to practice the invention, this represents a system where the successful digestion with the protease can create the unique recognition sequence for the second purification step. This can have some advantages in creating additional specificity during the second purification step.
For example, nonspecifically bound proteins that are cleaved and/or otherwise eluted during the digestion step are unlikely to meet the criteria for binding to the second affinity support. There are three anti-FLAG antibodies commercially available (M1, M2, M5, Sigma Corporation, St Louis, MO). Only the M1 antibody binds the FLAG sequence in a calcium dependent fashion and thus allows a specific and gentle elution. M1 recognizes sequences as small as DYKD (SEQ ID NO: 1) or DYKDE (SEQ ID NO: 2), but binds to such sequences only when they are present on the N-terminus (unblocked by an initiation Methionine or any other amino acid). Thus, the FLAG sequence can be used with any protease that cleaves exo and C-terminal to its recognition site, thereby allowing the digestion dependent creation of an N-terminal FLAG sequence from a previously M1 unreactive internal FLAG sequence. Tags such as 6HIS and FLAG are also advantageous over biologically relevant tags such as CBP (used in combination with the protein A-Z domains system in Puig et al.), which can bind to calmodulin and calmodulin-containing protein complexes present in cells, cell lysates or tissue lysates, complicating their use. Affinity tags such as 6HIS, FLAG and others previously mentioned do not interact as extensively with biologically relevant and promiscuous ligands like calmodulin. Additionally, the use of CBP may require the incorporation of EGTA to prevent CBP-calmodulin interactions during the binding reaction and isolation of the protein complexes. Under these circumstances, the CBP containing constructs are not compatible with interactions requiring calcium and other divalent metal ions. Affinity tags such as 6HIS, FLAG and the others previously mentioned can be used in the presence of calcium, thus permitting the identification of calcium dependent interactions.
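The affinity comparison made above for the first purification step (a Kd of approximately 10⁻⁸ M for a Z domain binding IgG versus 10⁻¹⁵ M for biotin-streptavidin) can be put into numbers with a minimal sketch. It assumes a simple 1:1 equilibrium in which the fraction of binding sites occupied is C / (C + Kd) for a free partner concentration C; avidity, kinetics and depletion are ignored, and the picomolar test concentration is a hypothetical value chosen only for illustration.

```python
import math

# Dissociation constants quoted in the text (molar).
KD_PROTEIN_A_IGG = 1e-8
KD_BIOTIN_STREPTAVIDIN = 1e-15


def orders_of_magnitude(kd_weak: float, kd_strong: float) -> float:
    """Difference in affinity expressed as log10 of the Kd ratio."""
    return math.log10(kd_weak / kd_strong)


def fraction_bound(free_concentration: float, kd: float) -> float:
    """Fractional occupancy for a simple 1:1 equilibrium (assumed model)."""
    return free_concentration / (free_concentration + kd)


# Hypothetical free concentration of tagged bait: 1 pM.
TEST_CONCENTRATION = 1e-12

affinity_gap = orders_of_magnitude(KD_PROTEIN_A_IGG, KD_BIOTIN_STREPTAVIDIN)
occupancy_protein_a = fraction_bound(TEST_CONCENTRATION, KD_PROTEIN_A_IGG)
occupancy_biotin = fraction_bound(TEST_CONCENTRATION, KD_BIOTIN_STREPTAVIDIN)

print(f"affinity gap: about {affinity_gap:.0f} orders of magnitude")
print(f"occupancy at 1 pM, protein A-IgG: {occupancy_protein_a:.2e}")
print(f"occupancy at 1 pM, biotin-streptavidin: {occupancy_biotin:.3f}")
```

Under this simplified model the weaker interaction is almost entirely unoccupied at picomolar concentrations while the stronger one is close to saturation, which is consistent with the argument that the first step benefits most from the higher affinity.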
As discussed above, another advantage of 6HIS is that the 6HIS-nickel interaction is stable under conditions that are mildly denaturing to proteins and/or disrupt protein-protein interactions. As demonstrated in this application, exploiting the 6HIS tag in the final purification step allows for a preferential elution of the unknown protein interactors over that of the first binding component. In this case, n-lauroyl-sarcosine elution resulted in a significant percentage of the first binding component remaining on beads, but efficient elution of interacting proteins. In general it is advantageous to reduce the amount of "known" protein (i.e., bait) present in a mixture prior to mass spectrometric analysis. Alternative possible designs are as follows:
(Biotinylation recognition sequence)-(protease recognition sequence)-v-DYKDDDD-(bait protein).
If a given protease cleaves the protein specifically as shown (v), this would create an N-terminal DYKDDDD (SEQ ID NO: 3) sequence that is recognizable by the FLAG M1 antibody.
A specific example using Factor Xa as the protease is shown below:
The Factor Xa protease targets the sequence: I(E or D)GR (SEQ ID NO: 4)
(Biotinylation recognition sequence)-I(E or D)GR-v-DYKDDDD-(bait protein).
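The Factor Xa design above can be sketched in a few lines of Python. This is an illustration only: it assumes cleavage occurs immediately C-terminal to the Arg of I(E or D)GR, as drawn in the design, and it models M1 reactivity as nothing more than "the fragment starts with DYKD". The construct is hypothetical; the bait is an arbitrary placeholder string and not a sequence from this application.

```python
import re
from typing import List

# Factor Xa recognition sequence I(E or D)GR (SEQ ID NO:4).
FACTOR_XA_SITE = re.compile(r"I[ED]GR")

# Shortest sequence the text says M1 recognizes (SEQ ID NO:1).
M1_MINIMAL_EPITOPE = "DYKD"


def cleave_factor_xa(protein: str) -> List[str]:
    """Split a one-letter protein sequence after every I(E/D)GR site."""
    fragments = []
    start = 0
    for match in FACTOR_XA_SITE.finditer(protein):
        fragments.append(protein[start:match.end()])
        start = match.end()
    fragments.append(protein[start:])
    return fragments


def m1_reactive(fragment: str) -> bool:
    """Simplified rule: M1 binds only a FLAG sequence at the free N-terminus."""
    return fragment.startswith(M1_MINIMAL_EPITOPE)


# Hypothetical construct: BirA recognition sequence (SEQ ID NO:6),
# Factor Xa site, FLAG sequence (SEQ ID NO:3), placeholder bait.
BIOTIN_TAG = "MSGLNDIFEAAQKIEWHE"
PLACEHOLDER_BAIT = "MEAIAKYDFK"  # arbitrary stand-in for a bait protein
construct = BIOTIN_TAG + "IEGR" + "DYKDDDD" + PLACEHOLDER_BAIT

fragments = cleave_factor_xa(construct)
print("intact construct M1-reactive:", m1_reactive(construct))
for fragment in fragments:
    print(fragment, "-> M1-reactive:", m1_reactive(fragment))
```

In this sketch the FLAG sequence is internal, and therefore unreactive, until digestion releases the fragment that begins with DYKDDDD, which is the digestion-dependent specificity described for the second purification step.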
The proteins or fragments thereof contained in the purified protein complexes can be characterized further by employing standard techniques that are known in the art. For example, the individual proteins that comprise the complexes purified according to the invention are identified by a variety of mass spectrometric methods, which include an associated set of separation methods. Most current generations of mass spectrometers enable the rapid identification of known proteins by searching mass spectrometry data (peptide masses and/or peptide fragment mass spectra) against a database of known sequences (predicted peptide masses and/or predicted peptide fragment mass spectra). Alternatively, it is becoming much more routine to identify de novo amino acid sequences of peptides directly from peptide mass spectra and thus discover unknown proteins. In many cases, components of a protein complex would not have previously been isolated and may be known to exist only from genetic studies, or they can be previously unknown, or known but unrecognized as components of a complex that interacts with the bait. The method of the invention therefore serves as a method for isolating, purifying and characterizing novel proteins and for providing insight into their biological function. In addition, protein complexes isolated by the method of the invention under varied physiological or pathological states will yield data as to how the composition of a complex varies in response to varied conditions. The data are then used as indicators of disease, as indicators of therapeutic efficacy, and for providing a rationale for novel therapies. Identifying novel proteins or novel interactions of known or novel proteins can lead to the identification of new members of disease-associated pathways or biochemical reactions.
These proteins can be drug targets because of their interaction in such pathways or reactions whether or not they vary according to the "states" described above.
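The database search described above (matching measured peptide masses against masses predicted from known sequences) can be illustrated with a toy example. This is a minimal sketch and not the search method used in this application: it assumes trypsin-like digestion (cleavage after K or R unless followed by P), standard monoisotopic residue masses, and a fixed mass tolerance. The two database entries and the tolerance are hypothetical.

```python
from typing import Dict, List

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056


def tryptic_digest(protein: str) -> List[str]:
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER_MASS


def count_matches(observed: List[float], protein: str, tolerance: float) -> int:
    """Number of observed masses explained by the protein's predicted peptides."""
    predicted = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(
        any(abs(mass - candidate) <= tolerance for candidate in predicted)
        for mass in observed
    )


def best_match(observed: List[float], database: Dict[str, str],
               tolerance: float = 0.01) -> str:
    """Database entry whose predicted peptides explain the most observed masses."""
    return max(database, key=lambda name: count_matches(observed, database[name], tolerance))


# Hypothetical two-entry database and simulated "observed" masses.
toy_database = {"entry_A": "MEAIAKYDFKATADDELSFK", "entry_B": "GASPVTKLLNRWHEK"}
observed_masses = [peptide_mass(p) for p in tryptic_digest(toy_database["entry_A"])]
print(best_match(observed_masses, toy_database))
```

Real search engines score matches statistically and also use fragment spectra; the counting scheme here is only meant to show the idea of comparing observed masses with predicted ones.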
In all cases it will be understood that a person ordinarily skilled in the art can add and optimize spacer amino acid sequences between any of the contiguous functional segments or between the bait protein and the peptide tail to allow efficient formation of multi-protein complexes, affinity purification and protease digestion.
The following diagrams generally illustrate operation of two embodiments of the invention.
Embodiment #1 [Diagram not reproduced. Key: biotin ligase recognition sequence; affinity tag; protease site; bait; biotin; specifically interacting proteins; nonspecifically interacting proteins. Steps shown: first affinity purification; second affinity purification.]
Embodiment #2 [Diagram not reproduced. Key: biotin ligase recognition sequence; affinity tag; protease site; bait; biotin; specifically interacting proteins; nonspecifically interacting proteins. Steps shown: biotinylation; first affinity purification; second affinity purification.]
In vitro biotinylation is optional in both embodiments if the first binding component is biotinylated in a cell or an organism by endogenous or exogenous biotin ligases.
The following are the nucleotide sequences encoding the bait proteins used to illustrate the invention.
Nucleotide Sequence encoding Human Cyclin-dependent Kinase Inhibitor 1b (CDKN1b) (Genbank Accession No. NM_004064, SEQ ID NO:10)
gtcagcctcc cttccaccgc catattgggc cactaaaaaa agggggctcg tcttttcggg gtgtttttct ccccctcccc tgtccccgct tgctcacggc tctgcgactc cgacgccggc aaggtttgga gagcggctgg gttcgcggga cccgcgggct tgcacccgcc cagactcgga cgggctttgc caccctctcc gcttgcctgg tcccctctcc tctccgccct cccgctcgcc agtccatttg atcagcggag actcggcggc cgggccgggg cttccccgca gcccctgcgc gctcctagag ctcgggccgt ggctcgtcgg ggtctgtgtc ttttggctcc gagggcagtc gctgggcttc cgagaggggt tcgggccgcg taggggcgct ttgttttgtt cggttttgtt tttttgagag tgcgagagag gcggtcgtgc agacccggga gaaagatgtc aaacgtgcga gtgtctaacg ggagccctag cctggagcgg a ggacgcca ggcaggcgga gcaccccaag ccctcggcct gcaggaacct cttcggcccg gtggaccacg aagagttaac ccgggacttg gagaagcact gcagagacat ggaagaggcg agccagcgca agtggaattt cgattttcag aatcacaaac ccctagaggg caagtacgag tggcaagagg tggagaaggg cagcttgccc gagttctact acagaccccc gcggcccccc aaaggtgcct gcaaggtgcc ggcgcaggag agccaggatg tcagcgggag ccgcccggcg gcgcctttaa ttggggctcc ggctaactct gaggacacgc atttggtgga cccaaagact gatccgtcgg acagccagac ggggttagcg gagcaatgcg caggaataag gaagcgacct gcaaccgacg attcttctac tcaaaacaaa agagccaaca gaacagaaga aaatgtttca gacggttccc caaatgccgg ttctgtggag cagacgccca agaagcctgg cctcagaaga cgtcaaacgt aaacagctcg aattaagaat atgtttcctt gtttatcaga tacatcactg cttgatgaag caaggaagat atacatgaaa attttaaaaa tacatatcgc tgacttcatg gaatggacat cctgtataag cactgaaaaa caacaacaca ataacactaa aattttaggc actcttaaat gatctgcctc taaaagcgtt ggatgtagca ttatgcaatt aggtttttcc ttatttgctt cattgtacta cctgtgtata tagtttttac cttttatgta gcacataaac tttggggaag ggagggcagg gtggggctga ggaactgacg tggagcgggg tatgaagagc ttgctttgat ttacagcaag tagataaata tttgacttgc atgaagagaa gcaattttgg ggaagggttt gaattgtttt ctttaaagat gtaatgtccc tttcagagac agctgatact tcatttaaaa aaatcacaaa aatttgaaca ctggctaaag ataattgcta tttattttta caagaagttt attctcattt gggagatctg gtgatctccc aagc atcta aagtttgtta gatagctgca tgtggctttt ttaaaaaagc aacagaaacc tatcctcact gccctcccca gtctctctta aagttggaat ttaccagtta attactcagc agaatggtga tcactccagg tagtttgggg caaaaatccg aggtgcttgg gagttttgaa tgttaagaat tgaccatctg cttttattaa atttgttgac aaaattttct cattttcttt tcacttcggg ctgtgtaaac acagtcaaaa taattctaaa tccctcgata tttttaaaga tctgtaagta acttcacatt aaaaaatgaa atatttttta atttaaagct tactctgtcc atttatccac aggaaagtgt tatttttaaa ggaaggttca tgtagagaaa agcacacttg taggataagt gaaatggata ctacatcttt aaacagtatt tcattgcctg tgtatggaaa aaccatttga agtgtacctg tgtacataac tctgtaaaaa cactgaaaaa ttatactaac ttatttatgt taaaagattt tttttaatct agacaatata caagccaaag tggcatgttt tgtgcatttg taaatgctgt gttgggtaga ataggttttc ccctcttttg ttaaataata tggctatgct taaaaggttg catactgagc caagtataat tttttgtaat gtgtgaaaaa gatgccaatt attgttacac attaagtaat caataaagaa aacttccata gctaaaaaaa aaaaaaaaaa
Nucleotide Sequence encoding Human N-ethylmaleimide-sensitive factor attachment protein (NAPA) (Genbank Accession No. NM_003827, SEQ ID NO:11)
gacgatacgc cgggcgcagg cgcagaagcc gcgcccgtcc gcggcgccgc cagccagggc ggaaacggct gcggcttcgc tagggacgca tgcgcgggtc ccttagtttt cgcgagataa cggtcgaaaa cgcgctcttg tcgatttcct gtagtgaatc aggcaccgga gtgcaggttc gggggtggaa tccttgggcc gctgggcaag cggcgagacc tggccagggc cagcgagccg aggacagagg gcgcacggag ggccgggccg cagccccggc cgcttgcaga ccccgccatg gacccgttcc tggtgctgct gcactcggtg tcgtccagcc tgtcgagcag cgagctgacc gagctcaagt tcctatgcct cgggcgcgtg ggcaagcgca agctggagcg cgtgcagagc ggcctagacc tcttctccat gctgctggag cagaacgacc tggagcccgg gcacaccgag ctcctgcgcg agctgctcgc ctccctgcgg cgccacgacc tgctgcggcg cgtcgacgac ttcgaggcgg gggcggcggc cggggccgcg cctggggaag aagacctgtg tgcagcattt aacgtcatat gtgataatgt ggggaaagat tggagaaggc tggctcgtca gctcaaagtc tcagacacca agatcgacag catcgaggac agataccccc gcaacctgac agagcgtgtg cgggagtcac tgagaatctg gaagaacaca gagaaggaga acgcaacagt ggcccacctg caggcccgtg acctccagaa caggagtggg gccatgtccc cgatgtcatg gaactcagac gcatctacct ccgaagcgtc ctgatgggcc gctgctttgc gctggtggac cacaggcatc tacacagcct ggactttggt tctctccagg aaggtagccc agcactgtga agacccagca ggaagccagg ctgagtgagc cacagaccac ctgcttctga actcaagctg cgtttattaa tgcctctccc gcaccaggcc gggcttgggc cctgcacaga tatttccatt tcttcctcac tatgacactg agcaagatct tgtctccact aaatgagctc ctgcgggagt agttggaaag ttggaaccgt gtccagcaca gaaggaatct gtgcagatga gcagtcacac tgttactcca cagcggagga gaccagctca gaggcccagg aatcggagcg aagcagagag gtggagaact gggatttgaa cccccgccat ccttcaccag agcccatgct caaccactgt ggcgttctgc tgcccctgca gttggcagaa aggatgtttt gtcccatttc cttggaggcc accgggacag acctggacac tagggtcagg cggggtgcgt ggtggggaga ggcatggctg gggtgggggt ggggagacct ggttggccgt ggtccagctc ttggcccctg tgtgagttga gtctcctctc tgagactgct aagtaggggc agtgatggtt gccaggacga attgagataa tatctgtgag gtgctgatga gtgattgaca cacagcactc tctaaatctt ccttgtgagg attatgggtc ctgcaattct acagtttctt actgttttgt atcaaaatca ctatctttct gataacagaa ttgccaaggc agcgggatct cgtatcttta aaaagcagtc ctcttattcc taaggtaatc ctattaaaac acagctttac aacttccata tcacaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa
Nucleotide Sequence encoding Human Growth Factor Receptor Binding Protein 2 (GRB2) (Genbank Accession No. NM_002086, SEQ ID NO:12)
gccagtgaat tcgggggctc agccctcctc cctcccttcc ccctgcttca ggctgctgag cactgagcag cgctcagaat ggaagccatc gccaaatatg acttcaaagc tactgcagac gacgagctga gcttcaaaag gggggacatc ctcaaggttt tgaacgaaga atgtgatcag aactggtaca aggcagagct taatggaaaa gacggcttca ttcccaagaa ctacatagaa atgaaaccac atccgtggtt ttttggcaaa atccccagag ccaaggcaga agaaatgctt agcaaacagc ggcacgatgg ggcctttctt atccgagaga gtgagagcgc tcctggggac ttctccctct ctgtcaagtt tggaaacgat gtgcagcact tcaaggtgct ccgagatgga gccgggaagt acttcctctg ggtggtgaag ttcaattctt tgaatgagct ggtggattat cacagatcta catctgtctc cagaaaccag cagatattcc tgcgggacat agaacaggtg ccacagcagc cgacatacgt ccaggccctc tttgactttg atccccagga ggatggagag ctgggcttcc gccggggaga ttttatccat gtcatggata actcagaccc caactggtgg aaaggagctt gccacgggca gaccggcatg tttccccgca attatgtcac ccccgtgaac cggaacgtct aagagtcaag aagcaattat ttaaagaaag tgaaaaatgt aaaacacata caaaagaatt aaacccacaa gctgcctctg acagcagcct gtgagggagt gcagaacacc tggccgggtc accctgtgac cctctcactt tggttggaac tttagggggt gggagggggc gttggattta aaaatgccaa aacttaccta taaattaaga agagttttta ttacaaattt tcactgctgc tcctctttcc cctcctttgt cttttttttc atcctttttt ctcttctgtc catcagtgca tgacgtttaa ggccacgtat agtcctagct gacgccaata ataaaaaaca agaaaccaaa aaaaaaaaac ccgaattca
The following is the summary of the amino acid recognition sequences disclosed herein with their sequence identifier numbers.
SEQ ID NO.   Sequence              Description
1            DYKD                  M1 recognition sequence
2            DYKDE                 M1 recognition sequence
3            DYKDDDD               M1 recognition sequence
4            I(E/D)GR              Factor Xa recognition sequence
5            EDQVDPRLIDGK          Protein C tag
6            MSGLNDIFEAAQKIEWHE    BirA recognition sequence
7            ENLYFQG               TEV protease recognition sequence
8            HHHHHH                hexahistidine, 6HIS
9            LEVLFQGP              PreScission recognition sequence
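As an illustration of how the sequences in this table combine into the first design described earlier, (Biotinylation recognition sequence)-(TEV protease site)-(6HIS)-(bait protein), the following sketch assembles a construct and models the protease elution step. Two points are assumptions not stated in the text: TEV protease is taken to cut between the Gln and Gly of ENLYFQG, which is its generally reported cleavage position, and the bait is an arbitrary placeholder. Spacer sequences are omitted.

```python
from typing import List, Tuple

# One-letter sequences taken from the table of SEQ ID NOs above.
SEGMENTS = {
    "BirA recognition sequence": "MSGLNDIFEAAQKIEWHE",  # SEQ ID NO:6
    "TEV protease site": "ENLYFQG",                      # SEQ ID NO:7
    "6HIS": "HHHHHH",                                    # SEQ ID NO:8
    "Protein C tag": "EDQVDPRLIDGK",                     # SEQ ID NO:5
}
TEV_SITE = SEGMENTS["TEV protease site"]


def assemble(segment_names: List[str], bait: str) -> str:
    """Concatenate the named segments N- to C-terminally, then the bait."""
    return "".join(SEGMENTS[name] for name in segment_names) + bait


def tev_cleave(construct: str) -> Tuple[str, str]:
    """Split at the first TEV site, between Q and G (assumed cut position)."""
    site_start = construct.find(TEV_SITE)
    if site_start < 0:
        return construct, ""
    cut = site_start + len(TEV_SITE) - 1
    return construct[:cut], construct[cut:]


PLACEHOLDER_BAIT = "MEAIAKYDFK"  # arbitrary stand-in for a bait protein
design_1 = assemble(
    ["BirA recognition sequence", "TEV protease site", "6HIS"], PLACEHOLDER_BAIT
)
retained_on_avidin, released = tev_cleave(design_1)
print("retained with biotin tag:", retained_on_avidin)
print("released fragment:", released)
```

The released fragment still carries the 6HIS tag in front of the bait, which is what makes a second, nickel-based purification step possible after protease elution.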
DNA constructs prepared for introduction into a prokaryotic or eukaryotic host will typically comprise a replication system (i.e. vector) recognized by the host, including the intended DNA fragment encoding the first binding component of the present invention, and will preferably also include transcription and translational initiation regulatory sequences operably linked to the polypeptide-encoding segment. Expression systems (expression vectors) may include, for example, an origin of replication or autonomously replicating sequence (ARS) and expression control sequences, a promoter, an enhancer and necessary processing information sites, such as ribosome-binding sites, RNA splice sites, polyadenylation sites, transcriptional terminator sequences, and mRNA stabilizing sequences. Signal peptides may also be included where appropriate from secreted polypeptides of the same or related species, which allow the protein to cross and/or lodge in cell membranes or be secreted from the cell.
An appropriate promoter and other necessary vector sequences will be selected so as to be functional in the host. Examples of workable combinations of cell lines and expression vectors are described in [Sambrook et al. (1989) vide infra; Ausubel et al. (Eds.) (1995) Current Protocols in Molecular Biology, Greene Publishing and Wiley Interscience, New York; and Metzger et al. (1988) Nature 334:31-36]. Many useful vectors for expression in bacteria, yeast, fungal, mammalian, insect, plant or other cells are well known in the art and may be obtained from vendors such as Invitrogen, Stratagene, New England Biolabs, Promega Biotech, and others. In addition, the construct may be joined to an amplifiable gene (e.g., DHFR) so that multiple copies of the gene may be made. For appropriate enhancer and other expression control sequences, see also Enhancers and Eukaryotic Gene Expression (1983) Cold Spring Harbor Press, N.Y. While such expression vectors may replicate autonomously, they may less preferably replicate by being inserted into the genome of the host cell.
Expression and cloning vectors will likely contain a selectable marker, that is, a gene encoding a protein necessary for the survival or growth of a host cell transformed with the vector. Although such a marker gene may be carried on another polynucleotide sequence co-introduced into the host cell, it is most often contained on the cloning vector. Only those host cells into which the marker gene has been introduced will survive and/or grow under selective conditions. Typical selection genes encode proteins that (a) confer resistance to antibiotics (e.g., kanamycin, tetracycline, etc.) or other toxic substances; (b) complement auxotrophic deficiencies; or (c) supply critical nutrients not available from complex media. The choice of the proper selectable marker will depend on the host cell; appropriate markers for different hosts are known in the art.
Recombinant host cells, in the present context, are those which have been genetically modified to contain an isolated DNA molecule of the instant invention. The DNA can be introduced by any means known to the art which are appropriate for the particular type of cell, including without limitation, transformation, lipofection, electroporation or viral mediated transduction.
A DNA construct capable of enabling the expression of the first binding component of the invention, or the nucleic acids encoding individual segments of the first binding component (e.g., affinity tags or protease recognition sequence), can be easily prepared by art-known techniques such as cloning, hybridization screening and PCR. Polymerase Chain Reaction (PCR) is a repetitive, enzymatic, primed synthesis of a nucleic acid sequence. This procedure is well known and commonly used by those skilled in this art [see Mullis, U.S. Patent Nos. 4,683,195, 4,683,202, and 4,800,159; Saiki et al. (1985) Science 230:1350-1354]. PCR is based on the enzymatic amplification of a DNA fragment of interest that is flanked by two oligonucleotide primers that hybridize to opposite strands of the target sequence. The primers are oriented with the 3' ends pointing towards each other. Repeated cycles of heat denaturation of the template, annealing of the primers to their complementary sequences, and extension of the annealed primers with a DNA polymerase result in the amplification of the segment defined by the 5' ends of the PCR primers. Since the extension product of each primer can serve as a template for the other primer, each cycle essentially doubles the amount of DNA template produced in the previous cycle. This results in the exponential accumulation of the specific target fragment, up to several million-fold in a few hours. By using a thermostable DNA polymerase such as Taq polymerase, which is isolated from the thermophilic bacterium Thermus aquaticus, the amplification process can be completely automated. Other enzymes which can be used are known to those skilled in the art.
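The statement that each cycle essentially doubles the template, giving up to several million-fold accumulation, can be checked with a short calculation. This is a minimal sketch of idealized exponential amplification; the per-cycle efficiency parameter is a hypothetical addition for illustration (1.0 corresponds to perfect doubling) and plateau effects are ignored.

```python
import math


def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase after a number of cycles; efficiency 1.0 means doubling."""
    return (1.0 + efficiency) ** cycles


def cycles_for_fold(target_fold: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles that reaches the target fold increase."""
    return math.ceil(math.log(target_fold) / math.log(1.0 + efficiency))


print(fold_amplification(20))   # perfect doubling for 20 cycles
print(cycles_for_fold(1e6))     # cycles needed for a million-fold increase
```

With perfect doubling, 20 cycles give 2^20 = 1,048,576-fold amplification, so a million-fold increase is reached within about 20 cycles; at lower efficiencies more cycles are needed.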
Once a protein complex is purified according to the invention, a person of ordinary skill in the art can prepare monoclonal or polyclonal antibodies specific for the complex or the proteins contained in the complex. Monoclonal or polyclonal antibodies, preferably monoclonal, specifically reacting with a protein of interest can be made by methods well known in the art [see, e.g., Harlow and Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratories; Goding (1996) Monoclonal Antibodies: Principles and Practice, 3rd ed., Academic Press, San Diego, CA; and Ausubel et al. (1993) Current Protocols in Molecular Biology, Wiley Interscience/Greene Publishing, New York, NY].
Standard techniques for cloning, DNA isolation, amplification and purification, for enzymatic reactions involving DNA ligase, DNA polymerase, restriction endonucleases and the like, and various separation techniques are those known and commonly employed by those skilled in the art. A number of standard techniques are described in Sambrook et al. (1989) Molecular Cloning, Second Edition, Cold Spring Harbor Laboratory, Plainview, New York; Maniatis et al. (1982) Molecular Cloning, Cold Spring Harbor Laboratory, Plainview, New York; Wu (ed.) (1993) Meth. Enzymol. 218, Part I; Wu (ed.) (1979) Meth. Enzymol. 68; Wu et al. (eds.) (1983) Meth. Enzymol. 100 and 101; Grossman and Moldave (eds.) Meth. Enzymol. 65; Miller (ed.) (1972) Experiments in Molecular Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York; Old and Primrose (1981) Principles of Gene Manipulation, University of California Press, Berkeley; Schleif and Wensink (1982) Practical Methods in Molecular Biology; Glover (ed.) (1985) DNA Cloning Vol. I and II, IRL Press, Oxford, UK; Hames and Higgins (eds.) (1985) Nucleic Acid Hybridization, IRL Press, Oxford, UK; Setlow and Hollaender (1979) Genetic Engineering: Principles and Methods, Vols. 1-4, Plenum Press, New York; and Ausubel et al. (1992) Current Protocols in Molecular Biology, Greene/Wiley, New York, NY.
The following examples illustrate the invention but are in no way intended to limit the scope of the invention:
Example 1: Expression of the First Binding Component Containing AviTag in E. coli.
The bacterial expression vectors exemplified herein are termed E24-TAG and E25-TAG and contain the nucleotide sequences capable of expressing a fusion protein of the following general design:
AviTag-(+/- spacer)-(TEV protease site)-(+/- spacer)-6HIS tag-(+/- spacer)-(+/-Protein C tag)-(bait protein).
Construction of E24 and E25 vectors
The E. coli expression vectors E24 and E25 are based on the Invitrogen Gateway compatible vector pDEST15 (Cat # 11802-014), in which transcription is driven by the T7 promoter and which contains the sequence encoding Glutathione-S-Transferase (GST) at the N-terminus. PCR products amplified from M07 (see Example 8) were inserted into vector E04, which is a modified version of pDEST15 containing a tag for GST and CBP (calmodulin binding peptide) separated by a cleavage site for PreScission™ protease.
The arrangement of tags in E. coli vectors was as follows:
E04   GST-ps-CBP
E24   AT-TEV-HIS-PrC
E25   AT-(G4S)2-TEV-HIS-PrC
E26   AT-(G4S)3-TEV-HIS-PrC
where
AT = AviTag
HIS = 6x Histidine tag
GST = Glutathione-S-transferase
TEV = TEV protease cleavage site
CBP = Calmodulin binding peptide
G4S = Spacer region consisting of a tandem repeat of 4 Glycines and a Serine
PrC = Protein C epitope tag
ps = PreScission™ cleavage site
E04 was made using the following PCR primers:
GGAACCGGTGAAGGAGATAGAACCATGTCCCCTATACTAGGTTATTG - PinAI-SD-GST-F (SEQ ID NO: 13)
TCCCTCGAGCCTGGTACCGAAAGTGCCCCGG - XhoI-CBP-R (SEQ ID NO: 14)
to amplify pGEX4T3/TAP (Amersham Biosciences, Piscataway, NJ). Vector pGEX4T3/TAP and the PCR product were both digested with PinAI and Xho I and ligated together.
Construction of E24 and E25 was achieved by amplifying the tag region of M07 (see Example 8) with the following oligonucleotide and MACH.Reverse (see Example 8).
MACH.Nde - (purchased from MWG, High Point, NC)
5'-CTACCGGTGAAGGAGATAGTCATATGTCCGGCCTGAACGAC-3' (SEQ ID NO: 15)
This primer binds to the region around the ATG start codon of AviTag and introduces an Nde I site (CATATG, underlined in the original) over the start codon.
Using these primers with M07 as the PCR template, the tag region of each vector was amplified with the replacement of the ps region with the Protein C region. The resultant PCR amplicons were digested with Nde I and Xho I, sub-cloned into Nde I/Xho I digested vector pD15-E04 and transformed into DB3.1 bacterial cells. Positive colonies were identified and verified by DNA sequencing.
Upon Sequencing of potential E25 clones it was discovered that a PCR error had deleted one of the G4S triplets in such away that it left a G4S doublet instead of the expected triplet. However, the G4S doublet is still in-frame with the downstream regions and so was named E25. Upon sequencing of a second batch of E25 clones a correct version containing the G4S triplet was identified and this was named E26.
The nucleotide sequences of the tag region (from the start of the ORF to the beginning of the bait protein) in the E24 and E25 vectors are as follows:
E24-TAG (SEQ ID NOS: 16 and 17)
AviTag
MetSerGlyLeuAsnAspIlePheGluAlaGlnLysIleGluTrpHisGlu GlyAlaIleSerAla
ATGTCCGGCCTGAACGACATCTTCGAGGCTCAGAAAATCGAATGGCACGAA GGCGCGATATCCGCG
TEV CLEAVAGE SITE    6x His tag    Protein C
GluAsnLeuTyrPheGlnGly SerSerAla HisHisHisHisHisHis GlySer GluAspGln
GAGAACCTGTACTTCCAGGGC AGCAGCGCT CATCACCATCACCATCAC GGGAGC GAAGATCAG
ValAspProArgLeuIleAspGlyLys
GTAGATCCACGGTTAATCGATGGTAAG
E25-TAG (SEQ ID NOS: 18 and 19)
AviTag
MetSerGlyLeuAsnAspIlePheGluAlaGlnLysIleGluTrpHisGlu GlyAlaIleSer
ATGTCCGGCCTGAACGACATCTTCGAGGCTCAGAAAATCGAATGGCACGAA GGCGCGATATCC
(G4S)2 Spacer    TEV CLEAVAGE SITE
GlyGlyGlyGlySerGlyGlyGlyGlySerGlyGlyGlyGlySer Ala GluAsnLeuTyrPheGlnGly
GGCGGCGGCGGCAGCGGCGGCGGCGGCAGCGGCGGCGGCGGCAGC GCG GAGAACCTGTACTTCCAGGGC
6x His Tag    Protein C
SerSerAla HisHisHisHisHisHis GlySer GluAspGlnValAspProArgLeuIleAspGlyLys
AGCAGCGCT CATCACCATCACCATCAC GGGAGC GAAGATCAGGTAGATCCACGGTTAATCGATGGTAAG
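The annotation of the E24-TAG listing can be cross-checked by translating the nucleotide segments with the standard genetic code. The following is a minimal sketch only; the segment labels and the helper name `translate` are illustrative choices and are not part of the original disclosure.

```python
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# E24-TAG nucleotide segments as printed in the listing (labels are illustrative).
E24_SEGMENTS = [
    ("AviTag", "ATGTCCGGCCTGAACGACATCTTCGAGGCTCAGAAAATCGAATGGCACGAA"),
    ("linker", "GGCGCGATATCCGCG"),
    ("TEV cleavage site", "GAGAACCTGTACTTCCAGGGC"),
    ("spacer", "AGCAGCGCT"),
    ("6x His tag", "CATCACCATCACCATCAC"),
    ("linker", "GGGAGC"),
    ("Protein C", "GAAGATCAGGTAGATCCACGGTTAATCGATGGTAAG"),
]

E24_TAG_DNA = "".join(seq for _, seq in E24_SEGMENTS)
E24_TAG_PROTEIN = translate(E24_TAG_DNA)

if __name__ == "__main__":
    for label, seq in E24_SEGMENTS:
        print(f"{label:18s} {translate(seq)}")
    print("full tag:", E24_TAG_PROTEIN)
```

Translating segment by segment makes it easy to see that every module of the tag remains in frame with the bait protein that follows it.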
Protein Expression in E. coli
The following constructs were expressed in E. coli.
Constructs     Baits contained
E24 and E25    Tag-only (control with no bait protein)
E24 and E25    GRB2
E24 and E25    NAPA
E24 and E25    CDKN1b
In order to express the Avitag containing bait proteins, two ml of LB medium containing 100 μg/ml Ampicillin was inoculated with a single colony containing the desired expression vector and the culture was incubated at 37 °C in a shaker incubator. After 8 hours, one ml of the culture was diluted into 25 ml of fresh LB medium (100 μg/ml Ampicillin) and incubated overnight at 37 °C. The overnight culture was diluted into 1 liter of LB medium containing 100 μg/ml of Ampicillin and an antifoam agent in a Nalgene centrifuge bottle to a final OD595 of 0.05. The bottle was then left immersed in a water bath at 30 °C, with an airflow of 10 cc/min to the culture bottle, in a Bactolift apparatus. When the culture reached an optical density of 0.7-0.9 at 595 nm, arabinose was added to a final concentration of 0.2% (w/v). After 3 hours of induction, the cells were harvested by centrifugation at 5000 rpm for 8 minutes in a Beckman JLA 8.1000 rotor. The supernatant was removed from the pelleted cells and the wet weight of the cells was determined. The cell pellet was resuspended in a 3-fold volume (v/w) of the lysis buffer (10 mM sodium phosphate, 150 mM NaCl, pH 7.2) and stored frozen in 50 ml conical tubes.
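The two volume calculations implied by this procedure (dilution of the overnight culture to a starting OD595 of 0.05, and resuspension of the pellet in a 3-fold volume of lysis buffer) can be sketched as follows. This is an illustration under stated assumptions: the overnight OD595 of 2.5 and the 4 g pellet are hypothetical values, optical density is assumed to scale linearly with dilution, and 1 g of wet cells is treated as 1 ml for the v/w ratio.

```python
def inoculum_volume_ml(overnight_od: float, target_od: float, final_volume_ml: float) -> float:
    """Volume of overnight culture needed so the diluted culture starts at target_od.

    Uses C1 * V1 = C2 * V2 with OD595 as the concentration proxy (assumed linear).
    """
    return target_od * final_volume_ml / overnight_od


def lysis_buffer_volume_ml(wet_weight_g: float, fold: float = 3.0) -> float:
    """Lysis buffer volume for a 3-fold (v/w) resuspension, taking 1 g as 1 ml."""
    return fold * wet_weight_g


if __name__ == "__main__":
    # Hypothetical overnight culture at OD595 = 2.5, diluted into 1 liter.
    print(inoculum_volume_ml(overnight_od=2.5, target_od=0.05, final_volume_ml=1000.0))
    # Hypothetical 4 g wet cell pellet.
    print(lysis_buffer_volume_ml(wet_weight_g=4.0))
```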
The frozen cell pellets were analyzed by SDS-PAGE as shown in Fig. 1. Expression varied: GRB2 was expressed at approximately 50% of total protein and NAPA at approximately 30%, while CDKN1b was undetectable by Coomassie stain.
Example 2: Purification of the Expressed Proteins.
For the first binding components expressed in E. coli, the frozen cells were thawed in a 37 °C water bath and lysed by sonication in the presence of lysozyme (5 μg/ml) using a Virsonic 600 sonicator (20 sec pulse, 10 sec pause for at least 9 minutes). The supernatant was separated from the lysate by centrifugation for 30 minutes at 46,000xg.
Purification of Expressed proteins on a Ni2+ column
In order to purify the expressed Avitag proteins, each well of a 24-well Whatman deep well plate (UNI-FILTER, 24 wells, 10ml, Polypropylene, Whatman GFC, MBPP 25-30, VWR International, West Chester, PA), placed on top of the Whatman vacuum manifold system, was loaded with 2 ml Ni-NTA resin (Invitrogen Corp.) and pre-equilibrated with binding buffer (50 mM Na2HPO4, 500 mM NaCl, 10 mM Imidazole, pH 8.0). Ten ml of the supernatant (cell extract) was added to each well and a vacuum was applied slowly to filter away the cell extract without disrupting the binding of tagged fusion protein to nickel. An additional 10 ml of the binding buffer was added to each well and then drawn through under vacuum. After two more rounds of washing with the binding buffer in the same manner, the fusion proteins were eluted with 5 ml of elution buffer (50 mM Na2HPO4, 500 mM NaCl, 400 mM Imidazole, pH 7.4). The eluted proteins were dialyzed in 1 L of TBS (Tris buffered saline, 20 mM Tris, pH 7.5, 150 mM sodium chloride) for 2 hrs. The concentrations of the protein were determined at each step.
As shown in Fig. 2, certain tagged proteins (Avitag containing tag-only, Grb2, NAPA, CDKN1b) were purified well with a single step on the Ni2+ column. There was no significant difference between the E24 and E25 constructs in terms of the purification efficiency on the Ni2+ column. However, in general, vector E25 resulted in better protein yield than vector E24.
Example 3: In vitro Biotinylation and Binding to Neutravidin Beads.
Approximately 150 μg of purified protein was incubated for 30 minutes at 30 °C in a reaction mix containing an equimolar amount of biotin and 21 μg of purified recombinant biotin ligase, the gene product of BirA (Avidity, Boulder, CO) in a total volume of 300 μl. The composition of each reaction mix for each specific protein can be found in Table 1 below. Due to the low molecular weight of the tag-only protein, a lower amount of protein was used to maintain the equimolar ratio with biotin. The 300 μl biotinylation reaction mix was then incubated for 20 minutes with 100 μl Ultralink Neutravidin beads (Pierce, Rockford, IL). The supernatant was removed and the beads were washed 3x with TEV cleavage buffer (50 mM Tris/HCl, 0.5 mM EDTA, pH 8.0) plus 10% Glycerol and stored overnight at 4 °C.
The degree of biotinylation and the efficiency of binding of the biotinylated proteins to the Neutravidin beads were determined using a Western blot in which the proteins were visualized using a streptavidin-HRP conjugate (ZYMED Laboratories, San Francisco, CA) (see Fig. 3). Analyses were performed on the baits before and after biotinylation, as well as on the supernatant after binding of the biotinylated baits to the beads. Approximately 40 ng of protein (both biotinylated and unbiotinylated) and 80 ng of supernatant were used for each analysis.
As can be seen in Table 2 and Fig. 3, in general, the proteins expressed in the vector E25 were biotinylated more efficiently than those expressed in the E24 vector. The binding of the protein to the beads was found to be in excess of 80% (see Table 2 and Fig. 3).
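The equimolar relationship between protein and biotin, and the reason a smaller mass of the low molecular weight tag-only protein gives the same molar amount, can be illustrated with a short calculation. This is a sketch under assumptions: the molecular weights of 30 kDa (a bait fusion) and 6 kDa (tag-only) are hypothetical round numbers, not values taken from Table 1; the molecular weight of biotin is taken as 244.31 g/mol.

```python
BIOTIN_MW_G_PER_MOL = 244.31  # d-biotin, C10H16N2O3S


def nmol_from_ug(mass_ug: float, mw_kda: float) -> float:
    """Convert a protein mass to nmol (1 kDa corresponds to 1 ug per nmol)."""
    return mass_ug / mw_kda


def ug_from_nmol(amount_nmol: float, mw_kda: float) -> float:
    """Protein mass in ug that corresponds to a given molar amount."""
    return amount_nmol * mw_kda


def biotin_ug_for_equimolar(protein_nmol: float) -> float:
    """Mass of biotin (ug) equimolar to the given amount of protein."""
    return protein_nmol * BIOTIN_MW_G_PER_MOL / 1000.0


def concentration_um(amount_nmol: float, volume_ul: float) -> float:
    """Concentration in uM of amount_nmol dissolved in volume_ul."""
    return amount_nmol / volume_ul * 1000.0


if __name__ == "__main__":
    bait_nmol = nmol_from_ug(150.0, mw_kda=30.0)       # hypothetical 30 kDa bait fusion
    print("bait:", bait_nmol, "nmol")
    print("biotin:", biotin_ug_for_equimolar(bait_nmol), "ug")
    print("in 300 ul:", concentration_um(bait_nmol, 300.0), "uM")
    # Same molar amount of a hypothetical 6 kDa tag-only protein needs far less mass.
    print("tag-only:", ug_from_nmol(bait_nmol, mw_kda=6.0), "ug")
```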
Table 1. Composition of reaction mixtures for in vitro biotinylation of bacterially expressed first binding components in the E24 and E25 vectors.
Figure imgf000031_0001
Biomix A: 0.5 M bicine buffer, pH 8.3
Biomix B: 100 mM ATP, 100 mM MgOAc
Table 2. Efficiency of the in vitro biotinylation and the binding of biotinylated E24, E25 tag-containing proteins evaluated by an ELIFA assay.
Figure imgf000032_0001
The results for % biotinylation are reported relative to the total protein in the sample. In the gel-shift assay performed with CDKN1b, at least 80 to 90% of the specific CDKN1b band shifted, indicating that % biotinylation for the pure protein is often higher than the results reported by the ELIFA.
Example 4: TEV Protease Digestion.
The following were subjected to the protease digestion steps below:
Constructs     Baits contained
E24 and E25    Tag-only (control with no bait protein)
E24 and E25    GRB2
E24 and E25    NAPA
E24 and E25    CDKN1b
Approximately 5 μg of each biotinylated bait protein on neutravidin beads was washed with pulldown buffer, HEGNS (150 mM NaCl, 20 mM HEPES, pH 7.5, 0.1 mM EDTA, 10% Glycerol, 0.1% NP-40, 1 mM DTT) and incubated with and without 310 μl of HeLa cell lysate (5 mg/ml), prepared as described in Example 5, at 4 °C for 1 hr. The incubation mixtures were then washed 3 times with 1 ml HEGNS buffer and subjected to the TEV protease digestion step by adding 30 μl of HEGNS buffer containing 0.5 units (1x) or 5 units (10x) of TEV Protease (Invitrogen Corporation) per μg of the protein at 4 °C for 1 hr. As a control, the uncleaved proteins were eluted with 50% Acetonitrile with 0.1% TFA prior to the digestion step. As shown in Fig. 4, in general, TEV protease digestion was efficient but varied depending on the specific bait protein. The presence of cell lysate under conditions that allow the formation of protein complexes with the first binding component bound to the beads did not affect the efficiency of proteolytic cleavage. Our experience with other TAP systems involving larger fusion tags indicates that this is a major advantage of this particular tag design.
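The enzyme doses used in this example follow directly from the stated units-per-μg ratios. A minimal sketch (the function name is illustrative only):

```python
def tev_units(protein_ug: float, units_per_ug: float) -> float:
    """Total TEV protease units for a given mass of bead-bound bait protein."""
    return protein_ug * units_per_ug


if __name__ == "__main__":
    bait_ug = 5.0  # approximately 5 ug of biotinylated bait per reaction
    for label, ratio in (("1x", 0.5), ("10x", 5.0)):
        print(label, tev_units(bait_ug, ratio), "units in 30 ul of HEGNS buffer")
```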
Example 5: Preparation of Cell and Tissue Lysates.
Mammalian whole cell lysate preparation
To provide a source of mammalian proteins to interact (in pulldown analyses) with purified mammalian bait proteins expressed in bacteria or another host organism, large scale cultures of mammalian cells (2-25 liters) were grown in suspension to densities of 2-5 x 10[illegible]/ml, or to ~80-90% confluency for adherent cultures. Once the suspension cultures reached their desired density, the cells were centrifuged at 3,000xg for 10 minutes. The resulting cell pellet (or plate of adherent cells) was washed 1x with cold phosphate buffered saline (PBS). The PBS was removed and cold lysis buffer was added at a volume of 10 μl per mg of wet cell weight or 1 ml per 15 cm plate of adherent cells. The lysis buffer consists of 25 mM hydroxyethylpiperazine-N'-2-ethanesulfonic acid (HEPES) pH 7.5, 150 mM NaCl, 1% NP40, 10 mM MgCl2, 1 mM ethylene-diamine tetra acetate (EDTA), 10% Glycerol, 1 mM dithiothreitol (DTT), and a protease inhibitor cocktail (Roche, Mannheim, Germany) added just prior to application (1 tablet/10 ml of buffer). Once the lysis buffer was added, the suspension cell pellet was resuspended with gentle pipetting and slow speed vortexing. The adherent cells were lysed (on ice) by adding the buffer directly to the culture plate and scraping the cells off of the dish. The lysates were then transferred to centrifuge tubes and allowed to incubate at 4 °C for 15 minutes. At 2 minute intervals during the 4 °C incubation, the lysates were gently resuspended with mild shaking and/or slow speed vortexing. Following the incubation, the lysates were centrifuged at 27,000xg for 15 minutes at 4 °C to remove insoluble debris. The supernatant was then aliquoted into fresh centrifuge tubes, snap frozen in liquid nitrogen and then stored at -80 °C.
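For illustration, the lysis buffer can be made up from concentrated stocks using C1 x V1 = C2 x V2, and the volume to add can be scaled from the wet cell weight at 10 μl per mg. The final concentrations below are those stated above; the stock concentrations are assumptions chosen for the example and are not specified in this disclosure.

```python
# name: (final concentration, ASSUMED stock concentration), both in the unit shown.
LYSIS_BUFFER = {
    "HEPES pH 7.5 (mM)": (25.0, 1000.0),
    "NaCl (mM)": (150.0, 5000.0),
    "NP40 (%)": (1.0, 10.0),
    "MgCl2 (mM)": (10.0, 1000.0),
    "EDTA (mM)": (1.0, 500.0),
    "Glycerol (%)": (10.0, 50.0),
    "DTT (mM)": (1.0, 1000.0),
}


def recipe_ml(total_ml: float) -> dict:
    """Stock volumes (ml) for total_ml of buffer; the remainder is water."""
    volumes = {name: final / stock * total_ml for name, (final, stock) in LYSIS_BUFFER.items()}
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


def lysis_volume_ul(wet_weight_mg: float, ul_per_mg: float = 10.0) -> float:
    """Lysis buffer volume for a suspension cell pellet (10 ul per mg wet weight)."""
    return wet_weight_mg * ul_per_mg


if __name__ == "__main__":
    # One protease inhibitor tablet is added per 10 ml just before use.
    for name, volume in recipe_ml(10.0).items():
        print(f"{name:20s} {volume:.2f} ml")
    print(lysis_volume_ul(500.0), "ul for a hypothetical 500 mg pellet")
```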
As quality control measures to assess the abundance and functionality of the proteins within the whole cell lysates we first determined the protein concentration using a Bradford colorimetric assay from Bio-Rad (Hercules, CA). Once the concentration of the lysate preparation was established, we tested the lysates with one or more of the following assays:
a. histone deacetylase (HDAC) activity assay to assess nuclear protein activity;
b. alkaline phosphatase assay to assess cytoplasmic protein function;
c. western blots for expression of specific proteins of interest such as but not limited to HDAC.
Once found to perform satisfactorily in the above assays, the lysate is then made available for use in downstream analyses such as pulldown interaction studies.
In attempts to identify a whole cell lysis protocol which provided sufficient quantities (and concentrations) of biologically active proteins to serve as interactors in pulldown studies, we tested nine separate protocols derived from commercial sources or culled from journal articles. The protocols and the resulting lysates were initially evaluated using four primary criteria: total protein yield per mg of wet cell weight; protein purity as assessed by refractive index and optical density measurements at A260, A280 and A600; biological activity of proteins of the nuclear compartment as assessed by histone deacetylase activity; and biological activity of proteins of the cytoplasmic compartment as assessed by alkaline phosphatase activity. Five of the nine protocols were discarded due to either poor protein yields, poor biological activity or because they utilized detergents or compounds which were found to be detrimental to or incompatible with our downstream mass spectrometry processes. The remaining four frontrunners from the aforementioned evaluations were further examined using our most stringent test: mass spectrometric analysis of a pulldown. The mass spectrometry data were carefully analyzed to determine which buffer/protocol facilitated the identification of the largest number of curated and novel interacting proteins, the highest strength or quality of protein identifications and the fewest non-specific interactions (cleanliness of the data).
As a result of these experiments, a buffer and protocol that we developed clearly outperformed all others. The protocol was modified from that described by De Rooij and Bos [Oncogene (1997) 14:623-625]. Modifications of the published protocol include the addition of 1 mM dithiothreitol, which mimics the reducing environment that normally exists inside mammalian cells, and the increase of the force of centrifugation from 18,000xg to 27,000xg to remove excess debris. These modifications were made to maintain protein stability and reduce nonspecific binding, respectively, thereby improving downstream mass spectrometry results. In summary, we have found that this protocol maximizes the yield of total protein per mg of wet cell weight, the biological activity of the resulting proteins and the identification of protein-protein interactions by mass spectrometry. To our knowledge, neither the original nor the modified protocol has been used for the mass spectrometry analysis of protein-protein interactions. It was this reagent that was used to generate the data put forth in this application.
Tissue Lysate Preparation
Tissues used to date include whole mouse brain and mouse cerebellum. Modification to the following protocol for different organs can easily be made by those skilled in the art. For example, those skilled in the art will understand that more aggressive tissue disruption is required for muscle tissue than for brain because of the larger amount of connective tissue. Tissues were flash frozen in liquid nitrogen immediately following dissection. Frozen brain or cerebella were weighed. Mouse cerebella weighed approximately 70 mg, the striatum was approximately 30 mg and the cortex was about 80 mg. Tissues were then mixed with cold homogenization buffer of 25 mM hydroxyethylpiperazine-N'-2-ethanesulfonic acid (HEPES) pH 7.5, 150 mM NaCl, 1% NP40, 10 mM MgCl2, 1 mM ethylene-diamine tetra acetate (EDTA), 10% Glycerol, 1 mM dithiothreitol (DTT), and a protease inhibitor cocktail (Roche, Mannheim, Germany) at a ratio of 100 mg tissue to 1 ml of buffer. The tissue was then homogenized on ice with a Dounce homogenizer (Wheaton brand) for 5 plunges with a loose pestle and 25 plunges with a tight pestle. The resulting brain tissue homogenate was then centrifuged at 100,000xg for 30 minutes. The resulting supernatant was used for pulldown assays according to the protocol for cell lysates. 100 mg of brain tissue typically yields approximately 10 mg of protein lysate.
Preparation of cell lysates from mammalian expression system
Mammalian cells expressing the modified bait proteins are prepared and harvested as described in Example 8. Harvested transfectants are then lysed according to the procedure described above for whole cell lysates using our optimized protocol. If biotinylation of the modified bait protein has not occurred by co-transfection or the creation of stable cell lines expressing the biotin ligase (e.g. BirA) [Parrott MB & Barry MA (2001) BBRC 281, 993-1000; Parrott MB & Barry MA (2002) Mol Ther. 1: 96-104], biotinylation is performed using the cell lysate according to the protocol for purified recombinant modified bait protein, except that the biotin concentration is reduced. The amount of enzyme and biotin in the reaction can be adjusted according to the level of expression of the modified bait protein as determined by immunochemical assays specific to the second affinity tag segment. Concentrations of biotin as low as 10 to 20 nM can be used for efficient biotinylation, and at these concentrations remaining free biotin does not interfere with subsequent binding to neutravidin beads. Cell lysate containing biotinylated bait and associated interacting proteins is then purified as described in Example 6 below.
Example 6: Isolation of Protein Complexes ("Pulldown Assay")
One ml of HeLa cell lysate (5 mg/ml) prepared as in Example 5 was incubated at 4 °C for 1.5 hrs with 15 μl of streptavidin beads containing 5 μg of each of the first binding components prepared herein, i.e., GRB2, NAPA, CDKN1b or tags-only proteins in the E24 or E25 vectors. The samples were briefly centrifuged to pellet the beads, after which the lysate was removed. Beads were transferred to clean Eppendorf tubes and washed three times with 1 ml of pulldown HEGNS buffer (20 mM HEPES, pH 7.5, 150 mM NaCl, 10% glycerol, 0.1% NP-40). Protein complexes were released from the beads using 100 μl of pulldown buffer containing 50 units of TEV protease and incubating for 1 hr at 4 °C with rotation, following which the supernatant was removed. An additional 50 μl of pulldown buffer was added to the beads, gently mixed, and the supernatant removed to be combined with the previous supernatant. Imidazole, to a final concentration of 10 mM, was added to the supernatants, which were then incubated with a 10 μl volume of nickel equilibrated Chelating-NTA Sepharose (Amersham Biosciences, Piscataway, NJ) beads at 4 °C for 30 min. The supernatant was removed from the nickel beads, after which the beads were washed three times with pulldown buffer containing 10 mM imidazole. Proteins associated with the bait were eluted in three fractions of 50 μl of pulldown buffer containing 0.8% (w/v) n-lauroyl-sarcosine and the eluates combined. 15 μl of each sample was analyzed by SDS PAGE and the remainder was precipitated with ethanol in preparation for mass spectrometric analysis.
As can be seen in Fig. 5, the first binding components expressed in E. coli and purified according to the invention were capable of forming complexes, from which several interacting proteins were isolated. In general the background level of nonspecific binding is extremely low. Several of the most intense bands visible in the GRB2 pulldowns are known by us to be specific interacting proteins of GRB2 as reported in the literature (see also MS results below confirming this). Additionally, several isolated interacting proteins for GRB2 and NAPA show gel bands of similar intensity, indicating that binding proceeded with high efficiency for these baits. The very faint CDKN1b lanes indicate that the interactions with this bait were less favored. It is well known in the art that mass spectrometric identification of low abundance proteins in the presence of high abundance proteins is unfavorable. By comparing the intensity of bands present in the eluent (left side) versus the protein remaining on the beads after elution (right side), it is clear that the mild detergent elution preferentially releases interacting proteins and leaves the modified bait protein and TEV protease (also 6HIS tagged) present on the beads. This combination of properties of the present invention yielded a well-normalized mixture of low-abundance interacting proteins that would have been difficult or impossible to identify using the conventional methods. This was particularly enabling for the CDKN1b pulldown, in which the interactions were less favored, but from which interacting proteins were still identified (see Table 3).
Example 7: Mass Spectrometric Analyses for Protein Identification.
Sample digestion
Protein mixtures isolated from a pulldown assay described in Example 6 were digested with trypsin in solution to produce a mixture of peptides. The procedure is as follows:
An approximately 10 μg protein pellet was dissolved in 10 μl of 50% aqueous acetonitrile (AcN)/0.1% trifluoroacetic acid (TFA) solution to form a protein solution, and 50 μl of an 8 M aqueous Urea/0.2 M NH4HCO3 solution was added thereto. Next, 10 μl of 45 mM aqueous dithiothreitol (DTT) solution was added to the protein solution and the resulting solution was mixed with a vortex mixer and then incubated for approximately 15 minutes at 60 °C. After cooling the protein solution to room temperature (about 5 minutes), 10 μl of a 100 mM iodoacetamide solution was added, and the resulting solution was then incubated at RT for approximately 15 minutes in the dark. The protein solution was diluted with 120 μl of purified water, and 5 μl of 0.1 μg/μl Trypsin solution was added thereto and incubated at 37 °C for approximately 2 hours. After allowing the digested protein solution to cool to RT, 10 μl of 10% TFA was added to quench the trypsin. The resulting solution was then concentrated under vacuum to yield a final volume of approximately 100 μl in preparation for desalting.
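The reagent additions in the digestion procedure determine the urea concentration at the time trypsin is added and the enzyme-to-substrate ratio. The following sketch simply tallies the volumes stated above; it assumes the volumes are additive and that no volume is lost during the incubations.

```python
# (reagent, volume in ul) in order of addition, up to and including trypsin.
ADDITIONS_UL = [
    ("50% AcN / 0.1% TFA", 10.0),
    ("8 M urea / 0.2 M NH4HCO3", 50.0),
    ("45 mM DTT", 10.0),
    ("100 mM iodoacetamide", 10.0),
    ("purified water", 120.0),
    ("0.1 ug/ul trypsin", 5.0),
]

PROTEIN_UG = 10.0  # approximate mass of the protein pellet

total_ul = sum(volume for _, volume in ADDITIONS_UL)
urea_molar = 8.0 * 50.0 / total_ul       # urea concentration during digestion
trypsin_ug = 0.1 * 5.0                   # ug of trypsin added
substrate_per_enzyme = PROTEIN_UG / trypsin_ug
dtt_nmol = 45.0 * 10.0                   # mM x ul = nmol
iodoacetamide_nmol = 100.0 * 10.0

if __name__ == "__main__":
    print(f"digest volume: {total_ul:.0f} ul")
    print(f"urea during digestion: {urea_molar:.2f} M")
    print(f"trypsin:protein = 1:{substrate_per_enzyme:.0f} (w/w)")
    print(f"iodoacetamide:DTT molar ratio = {iodoacetamide_nmol / dtt_nmol:.2f}")
```

Under these assumptions the dilution with water brings the urea concentration to roughly 2 M before trypsin is added.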
CEX Fractionation of digested proteins
The digested proteins were first desalted using a C-18 reverse phase cartridge (Michrom BioResources, Auburn CA) to remove the salt from the digestion buffer. They were eluted with 95% AcN in water which contains 0.1 % TFA by volume. Afterwards the eluted sample was taken to dryness in a vacuum centrifuge and then reconstituted in 5 microliters of 2% aqueous AcN containing 0.5% aqueous acetic acid by volume. The reconstituted sample was then injected onto a 300 micron x 5 cm strong cation exchange (CEX) column (Vydac column, Western Analytical Services, Marietta, CA) which was eluted using 250 mM NH4OAc flowing at 4 μl/min. The gradient ran from 0 to 35% NH4OAc in 40 minutes. For electrospray LC/MS/MS analysis, six fractions were collected and for MALDI LC/MS/MS 3 fractions were collected. The collected fractions were taken to dryness in a vacuum centrifuge.
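The salt gradient described above can be expressed as a simple linear function of time. In the sketch below, the gradient parameters (0 to 35% of 250 mM NH4OAc over 40 minutes at 4 μl/min) are taken from the text, whereas the division of the run into fractions of equal duration is an assumption made for illustration, since the text states only the number of fractions.

```python
GRADIENT_MIN = 40.0      # duration of the gradient
MAX_PERCENT_B = 35.0     # final percentage of the NH4OAc eluent
STOCK_MM = 250.0         # concentration of the NH4OAc eluent
FLOW_UL_PER_MIN = 4.0


def nh4oac_mm(t_min: float) -> float:
    """NH4OAc concentration (mM) delivered at time t_min of the linear gradient."""
    t = min(max(t_min, 0.0), GRADIENT_MIN)
    percent_b = MAX_PERCENT_B * t / GRADIENT_MIN
    return STOCK_MM * percent_b / 100.0


def fraction_windows(n_fractions: int) -> list:
    """(start, end) times in minutes, ASSUMING fractions of equal duration."""
    width = GRADIENT_MIN / n_fractions
    return [(i * width, (i + 1) * width) for i in range(n_fractions)]


if __name__ == "__main__":
    for n in (6, 3):  # six fractions for ESI, three for MALDI
        for start, end in fraction_windows(n):
            volume_ul = (end - start) * FLOW_UL_PER_MIN
            print(f"{n} fractions: {start:5.1f}-{end:5.1f} min, "
                  f"{nh4oac_mm(start):5.1f}-{nh4oac_mm(end):5.1f} mM, {volume_ul:.1f} ul")
```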
ESI LC/MS/MS Analysis
The CEX derived fractions were reconstituted in 5 μl of 1% aqueous TFA, loaded onto the autosampler (FAMOS autosampler, LC Packings, Sunnyvale, CA) of the ESI LC/MS/MS system (LC Packings Ultimate LC and either a Q Trap MS system, AB/MDS Sciex, Toronto, Canada, or an LCQ MS system, Thermo Finnigan, San Jose, CA), and injected onto a C-18 reverse phase trap cartridge (LC Packings) which was prewashed for 1 minute at a flow rate of 50 μl/min using 0.5% aqueous acetic acid. The flow was then reversed and the peptides were back eluted onto a 75 micron x 15 cm C-18 reverse phase LC column (LC Packings) for separation of the peptides at a flow rate of 250 nl/min. The loading buffer was 0.5% aqueous acetic acid and the elution buffer was AcN. MS data were collected in the information dependent mode in which one survey scan was acquired and then followed by the acquisition of three MS/MS spectra.
MALDI LC/MS/MS Analysis
MALDI MS/MS data were acquired in a two step process in which the CEX-derived fractions were reconstituted in 5 μl of 1 % aqueous TFA and loaded onto the autosampler of an LC system (LC Packings) that spots the LC effluent directly onto a MALDI target while simultaneously mixing the effluent with MALDI matrix. Spots were deposited every 15 seconds and a total of 144 spots were collected for each CEX fraction. The samples were then analyzed by MALDI MS (AB 4700 Proteomics Analyzer, Applied Biosystems, Foster City, CA) to identify all of the usable peptide signals which were subsequently subjected to MS/MS analysis.
Protein identification by database searching
The proteins which were contained in the pulldown samples were identified by comparison of the MS/MS data to theoretical data derived from the human subset of the proteins contained in the NCBInr protein sequence database. The NCBInr protein sequence database was obtained by downloading it from the NCBI website. The matching of spectral data to the database was performed using a commercially available software package called Mascot which was purchased from Matrix Science (London, UK). Depending on the instrument that was used to generate the MS/MS data, the information from the separate CEX fractions derived from a single pulldown sample was combined either before or after the database search to yield a single combined result from each pulldown.
The algorithm used by Mascot for protein identification uses a two step process in which MS/MS data is first assigned to multiple possible peptide ID's. The Mascot algorithm then assembles the putative peptide ID's into the minimum number of protein ID's that can explain the raw data. The output from Mascot was then further filtered based on the following criteria. The protein level data had subtracted from it known false positives that were determined to be non-specific interactors on the basis of control experiments using the tag-only construct for pull downs. Secondly, any proteins that were observed repeatedly across many pulldowns but not observed in the standard control experiment were also subtracted. Next, the proteins were ranked by the score assigned by Mascot and any proteins below a score of 60 were ignored.
The peptide level results from each listed protein were then screened to make sure that they met a minimum quality. All known interactors were considered identified if they had multiple peptides on which the identification was based or if there was only a single peptide, then its score alone was greater than or equal to 60.
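The post-Mascot filtering steps described above can be sketched as a small function. This is an illustrative reading of the stated criteria, not the software actually used: the record type `ProteinHit`, the function `filter_hits`, and all protein names and scores in the example are hypothetical, and the peptide-level rule is applied here to every remaining hit for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinHit:
    name: str
    score: float                                  # protein-level Mascot score
    peptide_scores: list = field(default_factory=list)


def filter_hits(hits, control_proteins, frequent_proteins, min_score=60.0):
    """Apply the four filters in the order given in the text."""
    kept = []
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        if hit.name in control_proteins:          # seen in tag-only control pulldowns
            continue
        if hit.name in frequent_proteins:         # seen repeatedly across many pulldowns
            continue
        if hit.score < min_score:                 # protein score below 60 is ignored
            continue
        multiple = len(hit.peptide_scores) >= 2
        strong_single = len(hit.peptide_scores) == 1 and hit.peptide_scores[0] >= min_score
        if multiple or strong_single:             # minimum peptide-level quality
            kept.append(hit)
    return kept


if __name__ == "__main__":
    hits = [
        ProteinHit("protein_A", 210.0, [70.0, 65.0, 75.0]),
        ProteinHit("protein_B", 64.0, [64.0]),
        ProteinHit("protein_C", 45.0, [45.0]),
        ProteinHit("sticky_1", 300.0, [80.0, 90.0]),
        ProteinHit("frequent_1", 120.0, [60.0, 60.0]),
    ]
    kept = filter_hits(hits, control_proteins={"sticky_1"}, frequent_proteins={"frequent_1"})
    print([hit.name for hit in kept])
```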
Summarized in Table 3 are the results of the mass spectrometric analysis of the samples prepared in Example 7. Many proteins that have been described in the literature as interacting with these three modified baits were identified with high confidence and are listed in column 3. For each identified interacting protein, a representative PubMed ID is given for a published paper that describes experimental evidence in support of these results.
Column 5 indicates whether the interaction between the listed proteins and the baits is known to be direct (D), indirect (I) or undetermined (U). The observation of indirect partners of the exemplified bait proteins indicates that the present invention isolates complexes unapproachable by yeast two-hybrid and related methods.
Table 3. Summary of the interacting proteins identified using the invention.
Figure imgf000041_0001
Example 8: Expression of the First Binding Component Containing Avitag in Mammalian Cells.
The mammalian expression vectors used herein are referred to as M08-TAG and M09-TAG and contain the nucleotide sequences capable of expressing a fusion protein of the following general design:
AviTag-(+/- spacer)-(TEV protease site)-(+/- spacer)-6HIS tag-(+/- spacer)-(+/- PreScission cleavage site)-(+/- Protein C tag)-(bait protein).
A total of six genes (PCNA, HDAC, CDKN1b, NAPA, CDK5, Tag-only) were expressed from three separate vectors designated as M07, M08, and M09, which encoded three variations of an N-terminal tag (see below for details) designed for tandem affinity purification. The base vector was derived from an expression plasmid, pT-Rex-DEST30 purchased from Invitrogen Corporation (Carlsbad, CA). In pT-Rex-DEST30, gene expression is driven by the cytomegalovirus (CMV) immediate early promoter under the control of the tet operator sequence fused to the 3' end of the promoter.
The M07 and M08 vectors are similar in that the functional elements of the tag sequences are identical with the exception that the M08 vector encodes a triple unit repeat of four glycine residues and one serine residue to serve as a spacer to permit more efficient cleavage of bait proteins from the purification tags.
The M09 vector is similar to the M07 construct in that it does not contain the triplet repeat spacer, but diverges from the M07 vector due to the substitution of a Protein C binding domain in place of the PreScission protease cleavage site. This design relies on the efficiency of the TEV protease for the removal of the bulk of the tag and affords additional flexibility for purification by incorporating the high affinity Protein C domain. The annotated nucleotide sequences (from the start of the ORF to the beginning of the bait protein) are as follows:
M07-TAG (SEQ ID NOS: 20 and 21)
AviTag SEQUENCE
MetSerGlyLeuAsnAspIlePheGluAlaGlnLysIleGluTrpHisGlu GlyAlaIleSerAla
ATGTCCGGCCTGAACGACATCTTCGAGGCTCAGAAAATCGAATGGCACGAA GGCGCGATATCCGCG
TEV CLEAVAGE SITE    SPACER    6X His Tag
GluAsnLeuTyrPheGlnGly SerSer Ala HisHisHisHisHisHis Val
GAGAACCTGTACTTCCAGGGC AGCAGC GCT CATCACCATCACCATCAC GTG
PreScission CLEAVAGE SITE    BAIT SEQ
LeuGluValLeuPheGlnGlyPro XXXXXXX
CTGGAAGTTCTGTTCCAGGGGCCC NNNNNNN
M08-TAG (SEQ ID NOS: 22 and 23)
AviTag SEQUENCE
MetSerGlyLeuAsnAspIlePheGluAlaGlnLysIleGluTrpHisGlu GlyAlaIleSer
ATGTCCGGCCTGAACGACATCTTCGAGGCTCAGAAAATCGAATGGCACGAA GGCGCGATATCC
Gly4-Ser Spacer
GlyGlyGlyGlySer GlyGlyGlyGlySer GlyGlyGlyGlySer Ala
GGCGGCGGCGGCAGC GGCGGCGGCGGCAGC GGCGGCGGCGGCAGC GCG
TEV CLEAVAGE SITE    SPACER    6X His Tag
GluAsnLeuTyrPheGlnGly SerSer Ala HisHisHisHisHisHis Val
GAGAACCTGTACTTCCAGGGC AGCAGC GCT CATCACCATCACCATCAC GTG
PreScission CLEAVAGE SITE
LeuGluValLeuPheGlnGlyPro
CTGGAAGTTCTGTTCCAGGGGCCC
M09-TAG (SEQ ID NOS: 24 and 25)
AviTag SEQUENCE
MetSerGlyLeuAsnAspIlePheGluAlaGlnLysIleGluTrpHisGlu GlyAlaIleSerAla
ATGTCCGGCCTGAACGACATCTTCGAGGCTCAGAAAATCGAATGGCACGAA GGCGCGATATCCGCG
TEV CLEAVAGE SITE    SPACER    6X His Tag
GluAsnLeuTyrPheGlnGlySerSerAlaHisHisHisHisHisHis
AGCAGCGCTCATCACCATCACCATCACGAGAACCTGTACTTCCAGGGC
Protein C
GlySerGluAspGlnValAspProArgLeuIleAspGlyLys
GGGAGCGAAGATCAGGTAGATCCACGGTTAATCGATGGTAAG
The tag configuration of the mammalian vectors was as follows:
M01 AT-ek-FLAG
M07 AT-TEV-HIS-PS
M08 AT-(G4S)3-TEV-HIS-PS.
M09 AT-TEV-HIS-PrC
Where
AT = AviTag (Avidity, Denver, CO)
FLAG = FLAG epitope
TEV = Tobacco Etch Virus protease cleavage site
ek = enterokinase cleavage site
HIS = 6x Histidine tag
PrC = Protein C epitope Tag
PS = PreScission Cleavage site
(G4S)n = spacer region containing -Gly-Gly-Gly-Gly-Ser- with "n" repeats
M01 was made by PCR amplification using the following primers:
GGAACCGGTGAAGGAGATAGAACCATGTCCGGCCTGAACGAC - PinAI-SD-Avi-F (SEQ ID NO: 26)
TCCCTCGAGCCGTCGTCGTCATCCTTGTAGTC - XhoI-FLAG-R (SEQ ID NO: 27)
using vector pAN5rfc.l FLAG as template. Vector pT-REx-Dest30 and the PCR product were both digested with PinAI and Xho I and ligated together.
M07 was created by inserting a DNA fragment encoding "TEV - Ser2 spacer - 6x HIS tag - PS cleavage site" and oligonucleotides encoding AscI, EcoRV, SacII restriction sites on the 5' end and an XhoI site on the 3' end. Also included were an EcoR47III site between the Ser2 spacer and 6x HIS tag and a PmlI site between the 6x HIS tag and the PS cleavage site. The "top and bottom strand" oligos were annealed, cut with AscI/XhoI and cloned into an AscI/XhoI-cut M01 vector. (The AscI site is not regenerated to avoid creating a Proline). Cloned sequences were verified by DNA sequencing.
M08 was created by inserting a DNA fragment encoding (G4S)3 and oligonucleotides with an EcoRV site at the 5' end and a SacII site at the 3' end. The "top and bottom strand" oligos were annealed, cut with EcoRV/SacII and cloned into an EcoRV/SacII-cut M07 (pMASH2) vector. (The SacII site is not regenerated). Cloned sequences were verified by DNA sequencing. The mammalian expression vector M09 was made from the vector M07. The result of this construction was to introduce the coding sequence for the Protein C epitope tag and at the same time remove the PreScission cleavage site.
Construction of M09 was achieved by amplifying the tag region of M07 with the following oligonucleotide primers in a polymerase chain reaction (PCR).
MACH. Forward - (purchased from MWG, High Point, NC)
5'-GACGAGCTCGTTTAGTGAACCGTCAGATCGCCTGGAGACGCC-3' (SEQ ID NO: 28)
This primer binds to a region 129bp upstream of the Avitag coding region.
MACH. Reverse - (purchased from MWG, High Point, NC)
5'-CATGACGAGCTAGCTAGCCTCGAGCTTACCATCGATTAACCGTGGATCTACCTGATCTTCGCTCCCGTGATGGTGATGGTGATGAGCGCTGCTGCC-3' (SEQ ID NO: 29)
This primer encodes the Protein C epitope tag region at the 5' end and the 3' region binds to the 6xHIS region as indicated in the design above. The primer also encodes an Xho I site directly after the Protein C tag.
Using these two primers with M07 as the PCR template, the sequences were amplified and the resultant PCR amplicon was digested with PinAI (the restriction site is 15 bp upstream of the Avitag ATG codon) and XhoI (the site is within the MACH. Reverse primer) and sub-cloned into the PinAI/XhoI-digested vector backbone M01 and transformed into DB3.1 bacterial cells. Positive colonies were identified and verified by DNA sequencing. The result of this procedure was the replacement of the PS region with the Protein C region.
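What the MACH. Reverse primer encodes can be read off by reverse-complementing it and translating with the standard genetic code. The sketch below does this in plain Python; the reading frame (frame 0 of the reverse complement) is an assumption chosen because it yields an uninterrupted 6x HIS run, and the helper names are illustrative.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# MACH. Reverse primer (SEQ ID NO: 29), written 5'->3'.
MACH_REVERSE = (
    "CATGACGAGCTAGCTAGCCTCGAGCTTACCATCGATTAACCGTGGATCTACCTGATCTTC"
    "GCTCCCGTGATGGTGATGGTGATGAGCGCTGCTGCC"
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate DNA in frame 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


coding_strand = reverse_complement(MACH_REVERSE)
protein = translate(coding_strand)
print(protein)
```

In this frame the translation contains a run of six histidines followed by the sequence EDQVDPRLIDGK, which matches the commonly cited Protein C epitope, and then the Leu-Glu pair encoded by the XhoI site. SEQ ID NO: 5 is not reproduced in this section, so the epitope comparison should be treated as an assumption.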
Transfection
Each plasmid was transfected into 293 T-Rex cells (Invitrogen Corporation, Carlsbad, CA) using the following protocol: 24 hours prior to transfection, cells were plated at 90% density in 10 cm tissue culture plates. For DNA complex formation and transfection, 24 μg of M07, M08 or M09 plasmid DNA containing the nucleotide sequence encoding one of the six bait proteins listed above was diluted in 1.5 ml of serum-free media. 60 μl of Lipofectamine 2000 reagent (Invitrogen Corp., Carlsbad, CA) was diluted in 1.5 ml of serum-free media. After a five-minute incubation period at room temperature, the DNA and transfection reagent dilutions were combined, mixed gently and allowed to incubate for 20 minutes at room temperature. The Lipofectamine/DNA mixture was then added to the media of 293 T-Rex cells at ~90% density in a 10 cm plate, rocked gently and placed in an incubator at 37°C (5% CO2). Following a 4 hour incubation period, the media were changed and the cells were returned to the incubator. 24 hours after the transfection, 250 μg/ml of geneticin antibiotic was added to the culture media to eliminate non-transfected cells.
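The per-plate quantities in this protocol scale linearly when several plates are transfected at once. The following sketch encodes the stated amounts for one 10 cm plate and scales them; the `overage` parameter (extra volume to cover pipetting loss) and all names are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass
class TransfectionMix:
    dna_ug: float              # plasmid DNA, micrograms
    lipofectamine_ul: float    # Lipofectamine 2000, microliters
    dna_diluent_ml: float      # serum-free media for the DNA dilution
    reagent_diluent_ml: float  # serum-free media for the reagent dilution


# Amounts stated in the protocol for one 10 cm plate.
PER_PLATE = TransfectionMix(24.0, 60.0, 1.5, 1.5)


def scale_mix(n_plates: int, overage: float = 0.0) -> TransfectionMix:
    """Scale the per-plate mix to n_plates, optionally adding fractional overage."""
    if n_plates < 1:
        raise ValueError("n_plates must be at least 1")
    factor = n_plates * (1.0 + overage)
    return TransfectionMix(
        PER_PLATE.dna_ug * factor,
        PER_PLATE.lipofectamine_ul * factor,
        PER_PLATE.dna_diluent_ml * factor,
        PER_PLATE.reagent_diluent_ml * factor,
    )


def reagent_to_dna_ratio(mix: TransfectionMix) -> float:
    """Microliters of reagent per microgram of DNA."""
    return mix.lipofectamine_ul / mix.dna_ug


print(reagent_to_dna_ratio(PER_PLATE))
print(scale_mix(3))
```

The stated amounts correspond to 2.5 μl of reagent per μg of DNA, and scaling preserves that ratio.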
Protein Expression and Harvesting
The 293 T-Rex cells (Invitrogen Corp., Carlsbad, CA) are engineered to express the tet repressor protein, and thus protein expression from the M07, M08 and M09 vectors is repressed in the absence of tetracycline. Addition of tetracycline displaces the tet repressor protein and allows transcription from the CMV promoter. Seven days after transfection and multiple passaging of the cell populations to permit sustained expansion, transgene expression was induced by adding tetracycline at a final concentration of 10 μg/ml for a period of 48 hours. Following the period of induction, the media were removed, the culture plates were rinsed once with cold phosphate buffered saline, and protein lysates containing the first binding component expressed from vectors M07, M08 and M09 were harvested using a gentle lysis buffer, as described in Example 5, which was added directly to the plates. The lysed cells were removed from the dishes using cell scrapers and the extracts were placed on ice for 15 minutes. Following this incubation, the lysates were centrifuged at 3,000 x g for 15 minutes to remove cell debris and insoluble matter. The remaining supernatant was aliquoted, snap frozen in liquid nitrogen and stored at -80°C.
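Two routine calculations sit behind this step: the volume of tetracycline stock needed to reach the final concentration, and the rotor speed that gives 3,000 x g. The sketch below uses the usual dilution relation and the standard relative centrifugal force formula, RCF = 1.118e-5 x radius (cm) x rpm^2. The stock concentration, culture volume and rotor radius are placeholder assumptions, not values from the specification, and the final concentration is read here as 10 μg/ml.

```python
import math


def stock_volume_ul(final_ug_per_ml: float, culture_ml: float,
                    stock_mg_per_ml: float) -> float:
    """Volume of stock (in microliters) to add to reach the final concentration.

    Ignores the small volume contributed by the stock itself.
    1 mg/ml equals 1 microgram per microliter.
    """
    return final_ug_per_ml * culture_ml / stock_mg_per_ml


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_for_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a target RCF at a given radius."""
    return math.sqrt(rcf / (1.118e-5 * radius_cm))


# Assumed values: 10 ml of media per plate, 5 mg/ml stock, 10 cm rotor radius.
print(stock_volume_ul(final_ug_per_ml=10, culture_ml=10, stock_mg_per_ml=5))
print(round(rpm_for_rcf(3000, radius_cm=10)))
```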
Example 9: Use of Transgenic Animals to purify Protein complexes.
Transgenic animals provide enormous potential for the study of biological processes and the modeling of disease [Prosser, H. and Rastan, S., Trends Biotechnol. (2003) 21(5): 224-32]. Typically, they are generated by the introduction of recombinant DNA expression constructs into a very early stage embryo which matures to become an adult organism. This is accomplished by microinjecting the DNA into a single embryonic stem cell which is then implanted into a blastocyst or multicell embryo. The blastocyst is then implanted into a pseudo-pregnant female where, if all goes well, the embryo develops into a healthy newborn. The introduced DNA can be designed to integrate randomly into the genome, to integrate specifically in the genome and remove or replace existing sequences by homologous recombination as in a "knock-out", or to integrate specifically to introduce sequences as in a "knock-in." Using existing technologies for the formation of transgenic animals, it would be possible to introduce the coding sequences for a tagged protein or proteins into intact, living organisms including but not limited to mice, rats, rabbits or primates. Using this approach, the experimentalist would essentially be able to perform an in vivo pulldown experiment. The primary advantages of this technique include:
a. the ability to provide all potential interacting proteins in the appropriate context; b. the ability to analyze protein-protein interactions from each/any organ system independently; c. the ability to examine protein-protein interactions over time and through every developmental stage of the host organism; d. the ability to combine the study of tagged protein-protein interactions with established animal models of acquired and inherited disease.
Refinements of this approach could include the utilization of naturally or artificially regulated and/or tissue specific promoters to induce or direct expression of the tagged protein at a specific time, dose or in a desired subset of organismal tissues [Chandrasekaran et al. (1996) J Biol Chem. 271(45):28424-21]. This technique could be employed if a given protein were found to be toxic or not well tolerated in a host animal, or simply if the experimental design required specific expression levels or tissue restriction of expression.
Similarly, it should be possible to capitalize on the ability to perform a "knock-out" recombination reaction where an endogenous gene or genes could be replaced by a tagged wild type or mutant version of the same gene [Chandrasekaran et al. (1996) supra]. This would eliminate any potential effects or competition produced by the presence of the endogenous gene product and permit clearer analysis of the tagged protein behavior [Lees-Miller et al. (2003) Mol. Cell Biol. 23(6): 1856-1862]. A technically simpler variation of this approach, which would obviate the need for homologous recombination, would be to combine the use of a transgenic animal expressing a tagged molecule of interest with the use of a gene expression "knock-down" strategy such as, but not limited to, RNA interference (RNAi) or anti-sense, thereby eliminating or reducing the influence of the native gene product [Forler et al. (2003) Nat. Biotechnol. 21(1):89-92].
A further embodiment of this approach could include the use of "knock-in" homologous recombination to specifically insert the preferred tag sequences in frame (N-terminally, C-terminally or internally) with a given gene such that the augmented gene sequence would encode the desired fusion peptide with minimal disruption of the native gene promoter/enhancer sequences [Yu et al. (2003) Neurosci. 23(6):2193-202].
Finally, once created, the transgenic animals could be interbred to facilitate studies of one or more biological pathways in which it might be advantageous to have multiple tagged proteins present in the living organism.
A tissue lysate can be prepared from a transgenic animal as described in Example 5 and a protein complex can be purified and interactor proteins can be identified according to the invention disclosed herein.
In summary, the studies described in the above examples demonstrate that the present invention, i.e., a method of purifying a protein complex by using a modified bait protein containing affinity tags of high specificity separated by a specific protease cleavage site, can be used to purify protein complexes containing several interacting proteins as evidenced in Fig. 5 and in Table 3.
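The order of operations in the two-step purification can be illustrated with a toy bookkeeping model. It is only a sketch: it tracks which tag segments remain on the bait after each step and does not model binding chemistry, yields or non-specific background. All class, function and protein names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Complex:
    """A bait fusion plus the proteins bound to it."""
    tags: list                      # tag segments still attached, in order
    bait: str
    interactors: set = field(default_factory=set)


def affinity_step(sample: list, handle: str) -> list:
    """Keep only complexes carrying `handle`; everything else is washed away."""
    return [c for c in sample if isinstance(c, Complex) and handle in c.tags]


def protease_cleave(bound: list, site: str) -> list:
    """Release complexes by cutting at `site`.

    The site and every segment before it stay on the first matrix;
    fusions lacking the site are not released.
    """
    released = []
    for c in bound:
        if site in c.tags:
            cut = c.tags.index(site)
            released.append(Complex(c.tags[cut + 1:], c.bait, set(c.interactors)))
    return released


lysate = [
    Complex(["biotin", "TEV", "6xHIS"], "bait", {"partner_A", "partner_B"}),
    "unbound_protein_1",  # free proteins with no affinity handle
    "unbound_protein_2",
]

first_pass = affinity_step(lysate, "biotin")    # first affinity matrix
eluate = protease_cleave(first_pass, "TEV")     # specific protease release
second_pass = affinity_step(eluate, "6xHIS")    # second affinity matrix

for c in second_pass:
    print(c.bait, c.tags, sorted(c.interactors))
```

The point of the model is that the second tag survives cleavage, so the released complex can be captured again on a different matrix while the first affinity handle stays behind.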
The foregoing exemplary descriptions and the illustrative preferred embodiments of the present invention have been explained in the drawings and described in detail, with varying modifications and alternative embodiments being taught. While the invention has been so shown, described and illustrated, it should be understood by those skilled in the art that equivalent changes in form and detail may be made therein without departing from the true spirit and scope of the invention, and that the scope of the invention is to be limited only to the claims except as precluded by the prior art. Moreover, the invention as disclosed herein, may be suitably practiced in the absence of the specific elements which are disclosed herein.
All references cited in the present application are incorporated in their entirety herein by reference to the extent not inconsistent herewith. Abbreviations and nomenclature, where employed, are deemed standard in the field and commonly used in professional journals such as those cited herein.

Claims

We claim:
1. A method for purifying a protein complex from a cell, a cell or tissue lysate, or an organism containing the protein complex, comprising the steps of:
a) providing a first binding component comprising four parts: 1) a peptide segment having an affinity modifiable segment, 2) a protease specificity segment, 3) a peptide affinity tag, and 4) a bait;
b) contacting the first binding component with the protein complex, whereby part 4) of the first binding component binds to said protein complex, thereby forming a bait-bound protein complex;
c) modifying part 1) of the first binding component by attaching an affinity ligand thereto, thereby forming an affinity-tagged bait-bound protein complex;
d) contacting the complex formed in step c) with a first affinity matrix specific for said affinity ligand thereby binding said complex to said matrix, and separating the complex from unbound material;
e) contacting the complex formed in step c) with a protease that specifically cleaves part 2) of the first binding component thereby cleaving the first binding component and forming a second binding component comprising parts 3) and 4) bound to the protein complex, but not bound to the first affinity matrix, wherein a peptide affinity tag of part 3) is retained;
f) contacting a peptide affinity tag of the second binding component with a second affinity matrix specific for a peptide affinity tag of the second binding component thereby binding the protein complex to the second affinity matrix and separating the bait-bound protein complex from unbound material, and;
g) removing the second binding component and bait-bound protein complex or its components from the second affinity matrix, whereby a protein complex is purified together with the bait.
2. The method of claim 1 wherein said affinity modifiable segment comprises an amino acid sequence of the biotinylation recognition sequence.
3. The method of claim 2 wherein said biotinylation recognition sequence consists of the sequence as shown in SEQ ID NO: 6.
4. The method of claim 1 wherein said protease specificity segment comprises an amino acid sequence cleavable by a protease selected from a group consisting of TEV protease, enterokinase, thrombin, a purin, a furin, and Factor Xa.
5. The method of claim 2 wherein said protease specificity segment comprises an amino acid sequence cleavable by a protease selected from a group consisting of TEV protease, enterokinase, thrombin, a purin, a furin, and Factor Xa.
6. The method of claim 5 wherein said protease specificity segment comprises an amino acid sequence cleavable by TEV protease.
7. The method of claim 6 wherein said amino acid sequence consists of the sequence as shown in SEQ ID NO: 7.
8. The method of claim 1 wherein said peptide affinity tag is an epitope tag.
9. The method of claim 2 wherein said peptide affinity tag is an epitope tag.
10. The method of claim 6 wherein said peptide affinity tag is an epitope tag.
11. The method of claim 1 wherein said peptide affinity tag is a Protein C tag.
12. The method of claim 2 wherein said peptide affinity tag is a Protein C tag.
13. The method of claim 6 wherein said peptide affinity tag is a Protein C tag.
14. The method of claim 1 wherein said peptide tag is 6HIS hexapeptide.
15. The method of claim 2 wherein said peptide tag is 6HIS hexapeptide.
16. The method of claim 6 wherein said peptide tag is 6HIS hexapeptide.
17. The method of claim 13 wherein said Protein C tag comprises the amino acid sequence as shown in SEQ ID NO: 5.
18. The method of claim 10 wherein the epitope tag is a FLAG tag.
19. The method of claim 1 wherein the bait comprises an amino acid sequence of a protein or a fragment thereof selected from the group consisting of eIF4E, cyclin D1, GRB2, PCNA, HDAC1, CDKN1b, NAPA, and CDK5.
20. The method of claim 2 wherein the bait comprises an amino acid sequence of a protein or a fragment thereof selected from the group consisting of eIF4E, cyclin D1, GRB2, PCNA, HDAC1, CDKN1b, NAPA, and CDK5.
21. An isolated protein complex according to claim 1.
22. An isolated antibody selectively immunoreactive with the protein complex of claim 1.
23. A method of diagnosing a physiological disorder in an animal comprising assaying for the presence of the protein complex of claim 1 in a cell or tissue extract.
24. A method for screening for a drug candidate capable of modulating the interaction of a protein of the protein complex of claim 1, comprising:
a) purifying a protein complex from a cell, a cell or tissue lysate, or an organism exposed to the drug candidate to form a first protein complex;
b) purifying a protein complex from a cell, a cell or tissue lysate, or an organism that is not exposed to the drug candidate to form a second protein complex;
c) measuring the amount of a protein of said first and said second complex; and,
d) comparing the amount of the protein of said first complex with the amount of the protein of the second complex, wherein if the amount of the protein of said first complex is greater than, or less than the amount of the protein of said second complex, a drug candidate for modulating the interaction of a protein of said protein complex is identified.
25. The method of claim 24, wherein said screening is an in vitro screening.
26. The method of claim 24, wherein said complex is measured by binding with an antibody specific for said protein complex.
27. A method for purifying a protein complex from a cell, a cell or tissue lysate, or an organism containing the protein complex, comprising the steps of:
a) providing a first binding component comprising four parts: 1) a peptide segment having an affinity modifiable segment, 2) a protease specificity segment, 3) a peptide affinity tag, and 4) a bait, wherein said affinity modifiable segment is tagged with an affinity ligand;
b) contacting the first binding component of step a) with the protein complex, whereby part 4) of the first binding component binds to said protein complex to form an affinity-tagged bait-bound protein complex;
c) contacting the complex formed in step b) with a first affinity matrix specific for said affinity ligand thereby binding said complex to said matrix, and separating the complex from unbound material;
d) contacting the complex formed in step b) with a protease that specifically cleaves part 2) of the first binding component thereby cleaving the first binding component and forming a second binding component comprising parts 3) and 4) bound to the protein complex, but not bound to the first affinity matrix, wherein a peptide affinity tag of part 3) is retained;
e) contacting a peptide affinity tag of the second binding component with a second affinity matrix specific for a peptide affinity tag of the second binding component thereby binding the protein complex to the second affinity matrix and separating the bait-bound protein complex from unbound material, and;
f) removing the second binding component and protein complex or its components from the second affinity matrix, whereby a protein complex is purified together with the bait.
28. The method of claim 27 wherein said affinity modifiable segment comprises an amino acid sequence of the biotinylation recognition sequence.
29. The method of claim 28 wherein said biotinylation recognition sequence consists of the sequence as shown in SEQ ID NO: 6.
30. The method of claim 27 wherein said protease specificity segment comprises an amino acid sequence cleavable by a protease selected from a group consisting of TEV protease, enterokinase, thrombin, a purin, a furin, and Factor Xa.
31. The method of claim 28 wherein said protease specificity segment comprises an amino acid sequence cleavable by a protease selected from a group consisting of TEV protease, enterokinase, thrombin, a purin, a furin, and Factor Xa.
32. The method of claim 31 wherein said protease specificity segment comprises an amino acid sequence cleavable by TEV protease.
33. The method of claim 32 wherein said amino acid sequence consists of the sequence as shown in SEQ ID NO: 7.
34. The method of claim 27 wherein said peptide affinity tag is an epitope tag.
35. The method of claim 28 wherein said peptide affinity tag is an epitope tag.
36. The method of claim 32 wherein said peptide affinity tag is an epitope tag.
37. The method of claim 27 wherein said peptide affinity tag is a Protein C tag.
38. The method of claim 28 wherein said peptide affinity tag is a Protein C tag.
39. The method of claim 32 wherein said peptide affinity tag is a Protein C tag.
40. The method of claim 27 wherein said peptide affinity tag is 6HIS hexapeptide.
41. The method of claim 28 wherein said peptide affinity tag is 6HIS hexapeptide.
42. The method of claim 32 wherein said peptide affinity tag is 6HIS hexapeptide.
43. The method of claim 42 wherein said Protein C tag comprises the amino acid sequence as shown in SEQ ID NO: 5.
44. The method of claim 36 wherein the epitope tag is a FLAG tag.
45. The method of claim 27 wherein the bait comprises an amino acid sequence of a protein or a fragment thereof selected from the group consisting of eIF4E, cyclin D1, GRB2, PCNA, HDAC1, CDKN1b, NAPA, and CDK5.
46. The method of claim 28 wherein the bait comprises an amino acid sequence of a protein or a fragment thereof selected from the group consisting of eIF4E, cyclin D1, GRB2, PCNA, HDAC1, CDKN1b, NAPA, and CDK5.
47. An isolated protein complex according to claim 27.
48. An isolated antibody selectively immunoreactive with the protein complex of claim 27.
49. A method of diagnosing a physiological disorder in an animal comprising assaying for the presence of the protein complex of claim 27 in a cell or tissue extract.
50. A method for screening for a drug candidate capable of modulating the interaction of a protein of the protein complex of claim 27, comprising:
a) purifying a protein complex from a cell, a cell or tissue lysate, or an organism exposed to the drug candidate to form a first protein complex;
b) purifying a protein complex from a cell, a cell or tissue lysate, or an organism that is not exposed to the drug candidate to form a second protein complex;
c) measuring the amount of a protein of said first and said second complex; and,
d) comparing the amount of the protein of said first complex with the amount of the protein of the second complex, wherein if the amount of the protein of said first complex is greater than, or less than the amount of the protein of said second complex, a drug candidate for modulating the interaction of a protein of said protein complex is identified.
51. The method of claim 50, wherein said screening is an in vitro screening.
52. The method of claim 50, wherein said complex is measured by binding with an antibody specific for said given protein complex.
53. A first binding component comprising a peptide segment having an affinity modifiable segment, a protease specificity segment, a peptide affinity tag, and a bait.
54. The method of claim 1 wherein, after step d), a step of separating a component of the complex formed in step c) is followed, whereby the component of the affinity-tagged bait-bound protein complex is purified.
55. The method of claim 24 or 50 wherein the amount of a protein of a said first and second complex is measured by mass spectrometry.
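The comparison recited in the screening claims, in which protein amounts in a complex purified with and without drug exposure are compared, can be sketched as follows. The fold-change threshold is an assumption added for illustration (the claims only require that the amounts differ), and the protein names and amounts are placeholders, not measured data.

```python
def changed_proteins(treated: dict, untreated: dict, min_fold: float = 1.5) -> dict:
    """Flag proteins whose amount differs between the two purified complexes.

    treated / untreated map protein name -> measured amount (arbitrary units).
    A protein is flagged when it is present in only one complex or when
    the ratio of the larger to the smaller amount is at least min_fold.
    """
    flagged = {}
    for protein in sorted(set(treated) | set(untreated)):
        a = treated.get(protein, 0.0)
        b = untreated.get(protein, 0.0)
        if a == b:
            continue
        high, low = max(a, b), min(a, b)
        if low == 0 or high / low >= min_fold:
            flagged[protein] = "increased" if a > b else "decreased"
    return flagged


# Placeholder amounts for illustration only.
with_drug = {"interactor_1": 40.0, "interactor_2": 100.0, "interactor_3": 5.0}
without_drug = {"interactor_1": 100.0, "interactor_2": 105.0}

print(changed_proteins(with_drug, without_drug))
```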
PCT/US2003/014511 2002-05-09 2003-05-09 Protein complex purification WO2003095619A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2003228944A AU2003228944A1 (en) 2002-05-09 2003-05-09 Protein complex purification
US10/984,958 US7825227B2 (en) 2002-05-09 2004-11-09 Method for purification of a protein complex and identification of its components

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US37931702P 2002-05-09 2002-05-09
US60/379,317 2002-05-09

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US10/984,958 Continuation-In-Part US7825227B2 (en) 2002-05-09 2004-11-09 Method for purification of a protein complex and identification of its components

Publications (2)

Publication Number Publication Date
WO2003095619A2 (en) 2003-11-20
WO2003095619A3 (en) 2004-04-15

Family

ID=29420510

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2003/014511 WO2003095619A2 (en) 2002-05-09 2003-05-09 Protein complex purification

Country Status (2)

Country Link
AU (1) AU2003228944A1 (en)
WO (1) WO2003095619A2 (en)

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ROE SIMON: 'Protein Purification Applications', 2001, OXFORD UNIVERSITY PRESS article BREWER ET AL.: 'Fusion protein purification methods', pages 1 - 18, XP002973245 Second Edition *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1976992A1 (en) * 2005-12-22 2008-10-08 Henry M. Krause Methods and compositions for the detection and isolation of ligands
EP1976992A4 (en) * 2005-12-22 2009-05-06 Henry M Krause Methods and compositions for the detection and isolation of ligands
US8236937B2 (en) 2005-12-22 2012-08-07 Indanio Bioscience Inc. Methods and compositions for the detection and isolation of ligands

Also Published As

Publication number Publication date
AU2003228944A8 (en) 2003-11-11
AU2003228944A1 (en) 2003-11-11
WO2003095619A3 (en) 2004-04-15

Similar Documents

Publication Publication Date Title
Lamla et al. The Nano-tag, a streptavidin-binding peptide for the purification and detection of recombinant proteins
US7329506B2 (en) Apparatuses and methods for determining protease activity
Eriksson et al. Quantitative membrane proteomics applying narrow range peptide isoelectric focusing for studies of small cell lung cancer resistance mechanisms
Layfield et al. Purification of poly‐ubiquitinated proteins by S5a‐affinity chromatography
US20080213760A1 (en) Compositions and methods for protein isolation
US7825227B2 (en) Method for purification of a protein complex and identification of its components
US20100297667A1 (en) Method for diagnosis of disease using quantitative monitoring of protein tyrosine phosphatase
US20070275416A1 (en) Affinity marker for purification of proteins
US20020102741A1 (en) Methods for systematic identification of protein - protein interactions
Kumada et al. High biological activity of a recombinant protein immobilized onto polystyrene
WO2003095619A2 (en) Protein complex purification
JP2007139787A (en) Method for identification and relative determination of protein based on selective isolation of rrnk peptide for simplifying complicated proteinic mixture
WO2008025558A2 (en) Affinity marker comprising a first and a second tag and its use
JP2005098830A (en) Method for screening protein interaction substance by mass spectrometry
Pacholczyk et al. Epitope and mimotope for an antibody to the Na, K‐ATPase
Séraphin et al. 17 Tandem Affinity Purification to Enhance Interacting Protein Identification
US20040265921A1 (en) Intein-mediated attachment of ligands to proteins for immobilization onto a support
WO1997041438A1 (en) Methods of modulating t-cell activation
WO2015072507A1 (en) Method for identifying polyubiquitinated substrate
Peltier et al. An integrated strategy for the discovery of drug targets by the analysis of protein–protein interactions
KR102516595B1 (en) A polypeptide specifically binding to N-terminal arginylated protein and uses thereof
EP1281716B1 (en) Purification process based on enzyme/tagged-peptide binding
Gevaert 17. A Strong Cation Exchange Chromatography Protocol for Examining N-Terminal Proteoforms
WO2004035783A9 (en) Protein complexes of the tumor-necrosis-factor-alpha (tnf-alpha) signalling pathway
WO2003102196A1 (en) Vectors for expression of biotinylated proteins in mammalian cells, and their use for identification of protein-nucleic acid interactions in vivo

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 10984958

Country of ref document: US

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP