WO2012123119A1 - Methods for the identification and characterization of proteins interacting with histone tails and of compounds interacting with said proteins - Google Patents

Methods for the identification and characterization of proteins interacting with histone tails and of compounds interacting with said proteins

Info

Publication number
WO2012123119A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
histone tail
histone
interacting
given
Prior art date
Application number
PCT/EP2012/001149
Other languages
French (fr)
Inventor
Gerard Drewes
Gerard Joberty
Original Assignee
Cellzome Ag
Priority date
Filing date
Publication date
Application filed by Cellzome Ag
Publication of WO2012123119A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6875 Nucleoproteins
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2500/00 Screening for compounds of potential therapeutic value
    • G01N2500/04 Screening involving studying the effect of compounds C directly on molecule A (e.g. C are potential ligands for a receptor A, or potential substrates for an enzyme A)

Definitions

  • the present invention relates to methods for the identification and characterization of proteins interacting with histone tails as well as to the identification and characterization of compounds interacting with said proteins.
  • Proteins interacting with histones have gained an increasing interest in the past years due to their potential to modify chromatin structure and thereby to make DNA accessible or inaccessible for RNA polymerase complexes and transcription factors.
  • proteins interacting with the histone tails because they have been shown to have a significant influence on the regulation of gene expression.
  • Chromatin is the state in which DNA is packaged within the cell.
  • the nucleosome is the unit of chromatin and it consists of an octamer of the four core histone proteins (dimers of histone H3, H4, H2A and H2B) around which 147 base pairs of DNA are wrapped.
  • the histones are small, basic proteins rich in arginine and lysine residues resulting in a high affinity for DNA.
  • Histone H3 and H4 are among the most conserved proteins known.
  • Each of the core histones has a histone fold domain and a flexible N-terminal tail (H2A and H2B also have C-terminal tails) which contains sites for covalent modification that are important for chromatin function.
  • histone modifications are acetylation of lysines, methylation of lysines and arginines, phosphorylation of serines, threonines and tyrosines, ubiquitinylation of lysines and transformation of arginine to citrulline. With the exception of the citrullination, all these modifications are reversible.
  • Acetylation by the creation of a negative charge has a tendency to displace the tail from the nucleosome and the DNA and favours gene transcription. Methylation does not create a new charge and can promote or inhibit transcription. Phosphorylation often has a negative impact on the role of nearby histone marks.
  • the covalent posttranslational modifications (PTMs) of histone tails are added by enzymes classified as “writers” (e.g. acetyltransferases) or removed by enzymes classified as “erasers” (e.g. deacetylases). Proteins binding to and recognizing these modifications are referred to as “readers” (e.g. bromodomain proteins that bind to acetylated lysine).
  • HATs histone acetyl transferases
  • HDACs histone deacetylases
  • Methyl groups are added by protein methyl transferases (PMTs) and removed by protein demethylases (PDMs).
  • Lysine residues can be mono-, di- or tri-methylated by protein lysine methyltransferases (PKMTs) and arginine residues can be mono- or di-methylated by protein arginine methyl transferases (PRMTs).
  • the dimethylation on arginine can be asymmetrical or symmetrical.
  • Lysine and arginine methylation of histone tails can either activate or repress gene transcription.
  • H3S10 histone 3
  • Ubiquitin moieties can be added to histones (mono- or poly-ubiquitination) by protein ubiquitin ligases (E3 ligases) and removed by protein ubiquitin carboxyl-terminal hydrolases (UCHs).
  • E3 ligases protein ubiquitin ligases
  • UCHs protein ubiquitin carboxyl-terminal hydrolases
  • sumoylation is a large modification and shows some similarity to ubiquitylation. All four core histones can be sumoylated and specific sites have been identified on H4, H2A, and H2B. Sumoylation antagonizes both acetylation and ubiquitylation, which occur on the same lysine residue, and consequently represses transcription in yeast (Nathan et al., 2006. Genes Dev. 20(8):966-976).
  • the combinatorial histone mark pattern of each histone tail forms a “histone code” for the assembly of large epigenetic protein complexes containing histone mark “writers”, “erasers” and “readers”. These protein complexes are thought to determine the transcriptional fate of a given gene (Ruthenburg et al., 2007. Nat. Rev. Mol. Cell Biol. 8(12):983-994). Recent research revealed that the misregulation of histone modification (misreading, miswriting and miserasing of histone marks) can cause epigenetic alterations leading to pathogenic conditions like cancer and inflammation.
  • the proteins mixed lineage leukemia (MLL) and enhancer of zeste 2 (EZH2) catalyse the methylation of histone H3 lysine 4 (H3K4) and H3 lysine 27 (H3K27), respectively, which represent two of the most important histone methylation marks.
  • MLL rearrangement and deregulation of EZH2 are among the most common mutations in leukemia and solid tumours.
  • PHD plant homeodomain
  • HDAC inhibitors are in preclinical development and clinical trials for the treatment of a wide variety of diseases including cancer, inflammatory, cardiac, and neurodegenerative diseases (Bolden et al., 2006. Nat. Rev. Drug Discov. 5(9):769-784; Haberland et al., 2009. Nat. Rev. Genet. 10(1):32-42). It is expected that the development of selective HDAC inhibitors targeting only one member of the HDAC family should lead to improved efficacy and drug safety compared to non-selective "pan-HDAC inhibitors" (Kalin et al., 2009. Curr. Opin. Chem. Biol. 13:1-9; Balasubramanian et al., 2009. Cancer Lett. 280(2):211-21).
  • HDAC activity can be measured using purified recombinant enzyme in solution-based assays with acetylated peptide substrates (Blackwell et al., 2008. Life Sciences 82(21-22):1050-1058).
  • HDACs typically require the availability of purified or recombinant HDACs. However, not all HDACs can be produced with sufficient enzymatic activity to allow for inhibitor screening (Blackwell et al., 2008. Life Sciences 82(21-22):1050-1058). In addition, some preparations of HDACs expressed in insect cells are contaminated with endogenous insect HDACs, making the interpretation of assay results ambiguous.
  • Another prerequisite for the identification of selective HDAC inhibitors, although not necessary in all instances, is a method that allows the target selectivity of these molecules to be determined. For example, it can be intended to provide molecules that bind to and inhibit a particular drug target but do not interact with a closely related target, inhibition of which could lead to unwanted side effects. Conventionally, large panels of individual enzyme assays are used to assess the inhibitory effect of a compound for HDACs (Khan et al., 2008. Biochem. J. 409(2):581-9; Blackwell et al., 2008. Life Sci. 82(21-22):1050-1058).
  • the invention relates to a method for the identification of a protein capable of interacting with a given histone tail, comprising the steps of a) providing a protein preparation containing a protein of interest, b) contacting the protein preparation with the given histone tail under conditions allowing the binding of the protein to said histone tail, and c) characterizing the protein bound to the histone tail.
  • the histone tail is labeled and after the contacting the histone tail-protein complex is purified using said label.
  • the label is biotin and the histone tail-protein complex is purified with the help of streptavidin.
  • step c) includes the steps of c1) eluting the protein from the histone tail, and c2) characterizing the protein.
  • said characterizing the protein includes the identification of the protein.
  • said protein is characterized by mass spectrometry or immunodetection.
  • step b) is performed in the presence of varying concentrations of a non-labeled histone tail.
  • the amount of protein is determined in step c) and wherein a detection of a reduced amount of protein with increasing concentrations of the non-labeled histone tail is indicative of a specific binding of the protein to the histone tail.
  • the invention relates to a method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of a) identifying a protein capable of interacting with a given histone tail according to the above methods of the invention, b) providing a protein preparation containing said protein, c) contacting the protein preparation with said histone tail immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, d) incubating the complex with a given compound, and e) determining whether the compound is able to separate the protein from the immobilized histone tail.
  • step e) includes the detection of separated protein or the determination of the amount of separated protein.
  • the present invention relates to a method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of a) identifying a protein capable of interacting with a given histone tail according to the above methods of the invention, b) providing a protein preparation containing said protein, c) contacting the protein preparation with said histone tail immobilized on a solid support and with a given compound under conditions allowing the formation of a complex between the histone tail and the protein, and d) detecting the complex formed in step c),
  • step d) said detecting is performed by determining the amount of the complex.
  • steps a) to d) are performed with several protein preparations in order to test different compounds.
  • the present invention relates to a method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of: a) identifying a protein capable of interacting with a given histone tail according to the above methods of the invention, b) providing two aliquots of a protein preparation containing said protein, c) contacting one aliquot with said histone tail immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, d) contacting the other aliquot with said histone tail immobilized on a solid support and with a given compound under conditions allowing the formation of a complex between the histone tail and the protein, and e) determining the amount of the complex formed in steps c) and d).
  • the present invention relates to a method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of: a) identifying a protein capable of interacting with a given histone tail according to the above methods of the invention, b) providing two aliquots comprising each at least one cell containing said protein, c) incubating one aliquot with a given compound, d) harvesting the cells of each aliquot, e) lysing the cells in order to obtain protein preparations, f) contacting the protein preparations with said histone tail immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, and g) determining the amount of the complex formed in each aliquot in step f),
  • a reduced amount of the complex formed in the aliquot incubated with the compound in comparison to the aliquot not incubated with the compound indicates that the protein is a target of the compound.
  • the amount of the complex is determined by separating the protein from the immobilized histone tail and subsequent detection of separated protein or subsequent determination of the amount of separated protein.
  • said protein is detected or the amount of said protein is determined by mass spectrometry or immunodetection methods, preferably with an antibody directed against said protein. In a preferred embodiment, said methods are performed as a medium or high throughput screening.
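The two-aliquot comparison described above (one aliquot incubated with the compound, one without) can be sketched numerically as follows. This is only an illustrative sketch: the function names and the 50 percent threshold are assumptions made for the example and are not taken from the application, which does not specify a cut-off.

```python
def percent_reduction(control_amount: float, treated_amount: float) -> float:
    """Percent drop in histone tail-protein complex relative to the untreated aliquot."""
    if control_amount <= 0:
        raise ValueError("control amount must be positive")
    return 100.0 * (control_amount - treated_amount) / control_amount


def is_candidate_target(control_amount: float,
                        treated_amount: float,
                        threshold_percent: float = 50.0) -> bool:
    """Flag the protein as a possible target of the compound.

    threshold_percent is a hypothetical cut-off chosen for illustration only.
    """
    return percent_reduction(control_amount, treated_amount) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical signal intensities (e.g. from mass spectrometry or an immunoassay).
    print(percent_reduction(200.0, 50.0))
    print(is_candidate_target(200.0, 180.0))
```

In practice the amounts would come from replicate measurements, and a statistical test would be more appropriate than a fixed threshold.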
  • said compound is selected from the group consisting of synthetic compounds, or organic synthetic drugs, more preferably small molecule organic drugs, and natural small molecule compounds.
  • the solid support is selected from the group consisting of agarose, modified agarose, sepharose beads (e.g. NHS-activated sepharose), latex, cellulose, and ferro- or ferrimagnetic particles.
  • the histone tail is non-covalently coupled to the solid support.
  • the provision of a protein preparation includes the steps of harvesting at least one cell containing the protein and lysing the cell.
  • the steps of the formation of the complex between the histone tail and the protein are performed under essentially physiological conditions.
  • at least one of the amino acids of the histone tail has been further modified.
  • the present invention also relates to a method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, wherein instead of the identification of a protein according to the method of the invention (step a) in the respective methods of the invention) the histone tail is one of the histone tails as shown in one of the Tables 1 to 35 and the protein is one of the proteins corresponding to the respective histone tail as shown in one of the Tables 1 to 35.
  • all embodiments described above for the methods of the invention where in step a) a protein is identified according to the invention apply also for the methods where a specific histone tail and a specific protein are used.
  • the present invention relates both to methods for the identification of proteins capable of binding to a given histone tail as well as to methods for the identification of compounds being capable of binding to said proteins. In the following, preferred embodiments and definitions for the methods of the invention are discussed.
  • the term “protein” also includes enzymes that can add or remove histone tail modifications and proteins that can bind to modified histone tails and proteins associated with said enzymes or binding proteins.
  • the term “protein” also includes peptides or oligopeptides with or without posttranslational modifications such as glycosylation, ubiquitinylation, methylation or the like.
  • the term “histone tail” denotes the flexible aminoterminal regions of the core histones (H2A, H2B, H3, H4) and the flexible carboxyterminal regions of histones H2A and H2B that extend beyond the surface of the nucleosome (Jufvas et al., 2011.
  • the histone tail comprises the aminoterminal amino acids at position 1 to 40 of histone H2A (SGRG QGGKARAKAKTRSSRAGLQFPVGRVHRLLRKGNYS), more preferred amino acids 1 to 30, and most preferred 1 to 21, of said amino acid sequence.
  • the histone tail comprises the aminoterminal amino acids at position 1 to 40 of histone H2B (PEPSKSAPAPKXGSKXAITKAQKXDGKKRKRSRKESYSIY), more preferred amino acids 1 to 30, and most preferred 1 to 21, of said amino acid sequence.
  • the histone tail comprises the aminoterminal amino acids at position 1 to 40 of histone H3 (ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPHR), more preferred amino acids 1 to 30, and most preferred 1 to 21, of said amino acid sequence.
  • the histone tail comprises the aminoterminal amino acids at position 1 to 40 of histone H4 (SGRG GGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARR), more preferred amino acids 1 to 30, and most preferred 1 to 21, of said amino acid sequence.
  • histone H4 SGRG GGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARR
  • any peptide located within the 40 aminoterminal amino acid residues of a histone tail. Preferably, this peptide comprises 5 to 20 amino acid residues.
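The preferred tail lengths (positions 1 to 40, 1 to 30 and 1 to 21) and the 5 to 20 residue peptides located within the first 40 residues can be illustrated with simple string slicing. The sketch below uses the histone H3 sequence exactly as given above; the helper names are hypothetical and chosen only for this example.

```python
from typing import Iterator, Tuple

# Aminoterminal residues 1 to 40 of histone H3, as quoted in the text.
H3_TAIL_1_40 = "ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPHR"


def tail_prefix(sequence: str, length: int) -> str:
    """Return residues 1..length (1-based, inclusive) of the tail."""
    if not 1 <= length <= len(sequence):
        raise ValueError("length must lie within the sequence")
    return sequence[:length]


def peptide_windows(sequence: str,
                    min_len: int = 5,
                    max_len: int = 20) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, peptide) for every window of min_len..max_len residues.

    Positions are 1-based and inclusive, matching the numbering used in the text.
    """
    for size in range(min_len, max_len + 1):
        for start in range(len(sequence) - size + 1):
            yield start + 1, start + size, sequence[start:start + size]


if __name__ == "__main__":
    for n in (40, 30, 21):
        print(n, tail_prefix(H3_TAIL_1_40, n))
    print(next(peptide_windows(H3_TAIL_1_40)))
```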
  • the histone tail comprises the carboxyterminal amino acid residues of histones H2A and H2B.
  • At least one of the amino acids of the histone tail has been further modified (Jufvas et al., 2011. PLoS One 6(1):e15960).
  • At least one lysine is acetylated (Kac) on the free epsilon amino group.
  • At least one lysine is mono-methylated (Kme1), di-methylated (Kme2) or tri-methylated (Kme3).
  • At least one arginine is mono-methylated (Rme1), asymmetrically di-methylated (Rme2a) or symmetrically di-methylated (Rme2s).
  • At least one serine is phosphorylated (Sp).
  • At least one threonine is phosphorylated (Tp).
  • At least one tyrosine is phosphorylated (Yp).
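The modification shorthand listed above (Kac, Kme1 to Kme3, Rme1, Rme2a, Rme2s, Sp, Tp, Yp) can be captured in a small lookup that checks whether a mark is compatible with the residue at a given position. This is a minimal sketch; the `Mark` class, the `annotate` helper and the bracketed output format are assumptions introduced for illustration and do not appear in the application.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set

# Marks permitted per residue, following the notation used in the text.
ALLOWED_MARKS: Dict[str, Set[str]] = {
    "K": {"ac", "me1", "me2", "me3"},
    "R": {"me1", "me2a", "me2s"},
    "S": {"p"},
    "T": {"p"},
    "Y": {"p"},
}


@dataclass(frozen=True)
class Mark:
    position: int  # 1-based position in the peptide
    kind: str      # e.g. "ac", "me3", "p"


def annotate(sequence: str, marks: Iterable[Mark]) -> str:
    """Return the peptide with modified residues written as [Kme3], [Sp], etc."""
    by_position: Dict[int, str] = {}
    for mark in marks:
        if not 1 <= mark.position <= len(sequence):
            raise ValueError(f"position {mark.position} is outside the peptide")
        residue = sequence[mark.position - 1]
        if mark.kind not in ALLOWED_MARKS.get(residue, set()):
            raise ValueError(f"{residue}{mark.position} cannot carry '{mark.kind}'")
        by_position[mark.position] = mark.kind
    return "".join(
        f"[{res}{by_position[i]}]" if i in by_position else res
        for i, res in enumerate(sequence, start=1)
    )


if __name__ == "__main__":
    # First ten residues of the H3 tail with K4 tri-methylated and S10 phosphorylated.
    print(annotate("ARTKQTARKS", [Mark(4, "me3"), Mark(10, "p")]))
```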
  • the histone tail may be labeled, e.g. with biotin.
  • the histone tail may be labeled with biotin at the amino-terminus or carboxy-terminus. This enables further purification of the histone tail-protein complex, which then facilitates the characterization of the protein.
  • the biotin group is attached to the carboxyterminus of aminoterminal histone tails.
  • the biotin group is attached to the aminoterminus of carboxyterminal histone tails.
  • step b) may be performed in the presence of varying concentrations of a non-labeled histone tail. This enables the identification of a specific binding of the protein to the histone tail, because if with increased concentration of the non-labeled histone tail reduced amounts of protein are detected, this is an indication of a specific binding.
  • the protein specifically interacts with the histone tail.
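The competition criterion above (less protein recovered as the concentration of non-labeled histone tail increases) can be expressed as a simple check on a concentration series. The sketch below is one possible reading of that criterion: the strict monotonic test and the `min_drop` parameter are assumptions made for illustration, since the application does not define a numerical threshold.

```python
from typing import Sequence


def is_specific_binding(competitor_conc: Sequence[float],
                        bound_protein: Sequence[float],
                        min_drop: float = 0.5) -> bool:
    """Return True if bound protein falls as non-labeled tail concentration rises.

    min_drop is a hypothetical minimum fractional loss of signal between the
    lowest and highest competitor concentrations.
    """
    if len(competitor_conc) != len(bound_protein) or len(competitor_conc) < 2:
        raise ValueError("need two or more paired measurements")
    # Order the measurements by competitor concentration.
    signals = [signal for _, signal in sorted(zip(competitor_conc, bound_protein))]
    if signals[0] <= 0:
        return False
    # The signal must never rise from one concentration to the next.
    non_increasing = all(later <= earlier
                         for earlier, later in zip(signals, signals[1:]))
    # The highest competitor concentration must remove enough of the signal.
    drop = 1.0 - signals[-1] / signals[0]
    return non_increasing and drop >= min_drop


if __name__ == "__main__":
    # Hypothetical amounts of protein recovered at 0, 1, 10 and 100 units of competitor.
    print(is_specific_binding([0, 1, 10, 100], [100, 80, 40, 10]))
```

Real measurements are noisy, so a fitted dose-response curve would usually be preferable to a strict monotonic test.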
  • the histone tail may be immobilized on a microarray as known in the art.
  • such microarrays are commercially available or can be produced by known methods (Voigt and Reinberg, 2011. Chembiochem. 12(2):236-252).
  • said microarray contains a plurality of histone tails.
  • HDAC histone deacetylase
  • histone deacetylase means enzymes that remove acetyl groups from histones or other substrate proteins. These enzymes are known in the art.
  • the expression "protein” relates to both human and other proteins of this family.
  • the expression especially includes functionally active derivatives thereof, or functionally active fragments thereof, or homologues thereof, or variants encoded by a nucleic acid that hybridizes to the nucleic acid encoding said protein under low stringency conditions.
  • these low stringency conditions include hybridization in a buffer comprising 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% BSA, 100 µg/ml denatured salmon sperm DNA, and 10% (wt/vol) dextran sulfate for 18-20 hours at 40°C, washing in a buffer consisting of 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS for 1-5 hours at 55°C, and washing in a buffer consisting of 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS for 1.5 hours at 60°C.
  • the expression "protein” includes mutant forms of said protein.
  • the methods of the present invention can be performed with any protein preparation as a starting material, as long as the respective protein is present in the preparation.
  • Examples include a liquid mixture of several proteins, a cell lysate, a partial cell lysate which contains not all proteins present in the original cell (for example a nuclear extract) or a combination of several cell lysates.
  • the term "protein preparation” also includes dissolved purified protein. Preferably, said protein is endogenously produced by said cell.
  • the term "endogenously” means that the respective cell expresses said protein without being transfected with a protein-encoding nucleic acid. This ensures that the protein is present, as much as possible, in its natural environment, especially it is contained in a naturally occurring protein complex as discussed above.
  • cell preparation refers to any preparation containing at least one cell with the desired properties. Suitable cell preparations are described below.
  • the presence of the protein in a protein preparation of interest can be detected on Western blots probed with antibodies that are specifically directed against said protein.
  • MS mass spectrometry
  • Cell lysates or partial cell lysates can be obtained by isolating cell organelles (e.g. nucleus, mitochondria, ribosomes, golgi etc.) first and then preparing protein preparations derived from these organelles. Methods for the isolation of cell organelles are known in the art (Chapter 4.2 Purification of Organelles from Mammalian Cells in "Current Protocols in Protein Science", Editors: John E. Coligan, Ben M. Dunn, Hidde L. Ploegh, David W. Speicher, Paul T. Wingfield; Wiley, ISBN: 0-471-14098-8).
  • cell organelles e.g. nucleus, mitochondria, ribosomes, golgi etc.
  • protein preparations can be prepared by fractionation of cell extracts thereby enriching specific types of proteins such as cytoplasmic, nuclear or membrane proteins (Chapter 4.3 Subcellular Fractionation of Tissue Culture Cells in "Current Protocols in Protein Science", Editors: John E. Coligan, Ben M. Dunn, Hidde L. Ploegh, David W. Speicher, Paul T. Wingfield; Wiley, ISBN: 0-471-14098-8). Methods for the preparation of nuclear extracts are known in the art (Dignam et al., 1983. Nucleic Acids Res. 11(5):1475-1489).
  • the provision of a protein preparation includes the steps of harvesting at least one cell containing the protein and lysing the cell.
  • suitable cells for this purpose as well as for the cell preparations used as the starting material in one aspect of the present invention are those cells or tissues where the protein is endogenously expressed.
  • cells isolated from peripheral blood represent a suitable biological material.
  • Procedures for the preparation and culture of human lymphocytes and lymphocyte subpopulations obtained from peripheral blood (PBLs) are widely known (W.E. Biddison, Chapter 2.2 "Preparation and culture of human lymphocytes" in Current Protocols in Cell Biology, 1998, John Wiley & Sons, Inc.).
  • density gradient centrifugation is a method for the separation of lymphocytes from other blood cell populations (e.g. erythrocytes and granulocytes).
  • Human lymphocyte subpopulations can be isolated via their specific cell surface receptors which can be recognized by monoclonal antibodies.
  • the physical separation method involves coupling of these antibody reagents to magnetic beads which allow the enrichment of cells that are bound by these antibodies (positive selection).
  • primary human cells, cultured cell lines e.g. MOLT-4 cells, Jurkat, Ramos, HeLa, HL-60 or K-562 cells
  • MOLT-4 cells e.g., MOLT-4 cells, Jurkat, Ramos, HeLa, HL-60 or K-562 cells
  • the cell is part of a cell culture system and methods for the harvest of a cell out of a cell culture system are known in the art (literature supra).
  • the choice of the cell will mainly depend on the expression of the protein, since it has to be ensured that the protein is principally present in the cell of choice.
  • methods like Western blot, PCR-based nucleic acid detection methods, Northern blots and DNA-microarray methods ("DNA chips") might be suitable in order to determine whether a given protein of interest is present in the cell.
  • the protein preparation may be a preparation containing the protein which has been recombinantly produced. Methods for the production of recombinant proteins in prokaryotic and eukaryotic cells are widely established (Chapter 5 Production of Recombinant Proteins in "Current Protocols in Protein Science", Editors: John E. Coligan, Ben M. Dunn, Hidde L. Ploegh, David W. Speicher, Paul T. Wingfield; Wiley, 1995, ISBN: 0-471-14098-8).
  • the choice of the cell may also be influenced by the purpose of the study. If the in vivo efficacy for a given drug needs to be analyzed then cells or tissues may be selected in which the desired therapeutic effect occurs (e.g. B-cells). By contrast, for the elucidation of protein targets mediating unwanted side effects the cell or tissue may be analysed in which the side effect is observed (e.g. cardiomyocytes, vascular smooth muscle or epithelium cells). Furthermore, it is envisaged within the present invention that the cell containing the protein may be obtained from an organism, e.g. by biopsy. Corresponding methods are known in the art. For example, a biopsy is a diagnostic procedure used to obtain a small amount of tissue, which can then be examined microscopically or with biochemical methods. Biopsies are important to diagnose, classify and stage a disease, but also to evaluate and monitor drug treatment.
  • the lysis is performed simultaneously.
  • the cell is first harvested and then separately lysed.
  • Lysis of different cell types and tissues can be achieved by homogenizers (e.g. Potter-homogenizer), ultrasonic disintegrators, enzymatic lysis, detergents (e.g. NP-40, Triton X-100, CHAPS, SDS), osmotic shock, repeated freezing and thawing, or a combination of these methods.
  • the protein preparation containing the protein is contacted with a histone tail immobilized on a solid support thereby allowing the formation of complex between the histone tail and the protein.
  • compounds are identified which interfere with the binding between the histone tail and the protein present in a cell or protein preparation.
  • solid support relates to every undissolved support being able to immobilize a small molecule ligand on its surface.
  • the solid support may be selected from the group consisting of agarose, modified agarose, sepharose beads (e.g. NHS-activated sepharose), latex, cellulose, and ferro- or ferrimagnetic particles.
  • the histone tail may be coupled to the solid support either covalently or non-covalently.
  • Non-covalent binding includes binding via biotin to streptavidin matrices, avidin matrices or neutravidin matrices.
  • biotin is covalently conjugated to the histone tail and interacts non-covalently with streptavidin which is bound directly to the solid support.
  • the biotin label and the histone tail peptide are separated by a linker (for example hexanoic acid and 1 to 5 amino acid residues).
  • a linker for example hexanoic acid and 1 to 5 amino acid residues.
  • these amino acid residues are lysine or glycine.
  • the linker may be cleavable to facilitate the release of the proteins from the solid support.
  • the cleavage may be achieved by enzymatic cleavage (e.g. TEV protease) or treatment with suitable chemical methods.
  • the histone tail is non-covalently coupled to the solid support.
  • the term "allowing the formation of a complex” includes all conditions under which such formation is possible. Conditions allowing the formation of said complexes are known in the art. The skilled person will know which conditions can be applied in order to enable the formation of said complex. This includes the possibility of having the solid support on an immobilized phase and pouring the lysate or protein preparation onto it. In another preferred embodiment, it is also included that the solid support is in a particulate form and mixed with the cell lysate. Such conditions are known to the person skilled in the art.
  • the steps of the formation of said complex are performed under essentially physiological conditions.
  • the physical state of proteins within cells is described in Petty, 1998 (Howard R. Petty, Chapter 1, Unit 1.5 in: Juan S. Bonifacino, Mary Dasso, Joe B. Harford, Jennifer Lippincott-Schwartz, and Kenneth M. Yamada (eds.) Current Protocols in Cell Biology, Copyright © 2003 John Wiley & Sons, Inc. All rights reserved. DOI 10.1002/0471143030.cb0101s00; Online Posting Date: May, 2001; Print Publication Date: October, 1998).
  • Essentially physiological conditions are inter alia those conditions which are present in the original, unprocessed sample material. They include the physiological protein concentration, pH, salt concentration, buffer capacity and post-translational modifications of the proteins involved.
  • the term "essentially physiological conditions” does not require conditions identical to those in the original living organism, wherefrom the sample is derived, but essentially cell-like conditions or conditions close to cellular conditions. The person skilled in the art will, of course, realize that certain constraints may arise due to the experimental set-up which will eventually lead to less cell-like conditions.
  • the eventually necessary disruption of cell walls or cell membranes when taking and processing a sample from a living organism may require conditions which are not identical to the physiological conditions found in the organism.
  • Suitable variations of physiological conditions for practicing the methods of the invention will be apparent to those skilled in the art and are encompassed by the term "essentially physiological conditions” as used herein.
  • the term "essentially physiological conditions" relates to conditions close to physiological conditions, as e.g. found in natural cells, but does not necessarily require that these conditions are identical.
  • "essentially physiological conditions" may comprise 50-200 mM NaCl or KCl, pH 6.5-8.5, 20-37°C, and 0.001-10 mM divalent cation (e.g. Mg++, Ca++); more preferably about 150 mM NaCl or KCl, pH 7.2 to 7.6, 5 mM divalent cation and often include 0.01-1.0 percent non-specific protein (e.g. BSA).
  • a non-ionic detergent can often be present, usually at about 0.001 to 2%, typically 0.05-0.2% (volume/volume).
  • buffered aqueous conditions may be applicable: 10-250 mM NaCl, 5-50 mM Tris-HCl, pH 5-8, with optional addition of divalent cation(s) and/or metal chelators and/or non-ionic detergents.
  • "essentially physiological conditions" mean a pH of from 6.5 to 7.5, preferably from 7.0 to 7.5, and / or a buffer concentration of from 10 to 50 mM, preferably from 25 to 50 mM, and / or a concentration of monovalent salts (e.g. Na or K) of from 120 to 170 mM, preferably 150 mM.
  • Divalent salts e.g. Mg or Ca
  • the buffer is selected from the group consisting of Tris-HCl or HEPES.
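  • As a rough illustration, the numeric ranges quoted above (pH 6.5 to 7.5, buffer concentration 10 to 50 mM, monovalent salt 120 to 170 mM) can be written as a simple range check. The sketch below is hypothetical: the names `BufferConditions`, `RANGES` and `out_of_range` do not come from the application, and because the text joins the criteria with "and / or", reporting every criterion separately is only one possible reading.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BufferConditions:
    """Hypothetical container for the three quantities named in the text."""
    ph: float
    buffer_mm: float            # buffer concentration in mM
    monovalent_salt_mm: float   # e.g. Na or K salt concentration in mM


# Ranges copied from the definition of "essentially physiological conditions".
RANGES = {
    "ph": (6.5, 7.5),
    "buffer_mm": (10.0, 50.0),
    "monovalent_salt_mm": (120.0, 170.0),
}


def out_of_range(conditions: BufferConditions) -> List[str]:
    """Return the names of all quantities that fall outside the quoted ranges."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= getattr(conditions, name) <= high
    ]
```

  • An empty list means that all three quoted ranges are met; for example, a buffer with 420 mM NaCl would be reported under `monovalent_salt_mm`.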
  • washing is part of the knowledge of the person skilled in the art.
  • the washing serves to remove non-bound components of the cell lysate from the solid support.
  • Nonspecific (e.g. simple ionic) binding interactions can be minimized by adding low levels of detergent or by moderate adjustments to salt concentrations in the wash buffer.
  • a binding between a ligand i.e. a histone tail in the context of the present invention
  • a protein or protein complex i.e. a histone tail in the context of the present invention
  • these methods include in situ methods where the binding is assessed without separating the protein or protein complex from the ligand.
  • anti-protein antibodies can be used in combination with the ALPHAScreen technology where the excitation of a donor bead at 680 nm produces singlet oxygen which can diffuse to an acceptor bead undergoing a chemiluminescent reaction (Glickman et al., 2002. J. Biomol. Screen. 7(1):3-10).
  • the binding between the histone tail and the protein is determined by separating bound protein from the histone tail and subsequent determination of the protein. This subsequent determination of the protein may either be the detection of the protein in the eluate or the determination of its amount.
  • a binding between the histone tail and the protein preferably indicates that the compound does not completely inhibit the binding.
  • the compound is presumably a strong interactor with the protein, which is indicative for its therapeutic potential.
  • the amount is determined, the less protein can be detected in the eluate, the stronger the respective compound interacts with the protein, which is indicative for its therapeutic potential.
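  • The relationship described above (the less protein is detected in the eluate, the stronger the compound interacts with the protein) can be expressed as a residual binding fraction relative to a compound-free control. The following is a minimal sketch under that assumption; the function names and the use of a vehicle-treated control signal are illustrative and are not prescribed by the application.

```python
def residual_binding(signal_with_compound: float, signal_vehicle: float) -> float:
    """Fraction of protein still captured on the histone tail in the presence of compound.

    Both arguments are protein amounts (any consistent unit, e.g. a reporter ion
    area or a blot intensity); signal_vehicle is the compound-free control.
    """
    if signal_vehicle <= 0:
        raise ValueError("control signal must be positive")
    return signal_with_compound / signal_vehicle


def percent_inhibition(signal_with_compound: float, signal_vehicle: float) -> float:
    """Percentage by which the compound reduces protein capture."""
    return 100.0 * (1.0 - residual_binding(signal_with_compound, signal_vehicle))
```

  • Under this reading, a value close to 100 corresponds to little or no protein in the eluate, and a negative value would correspond to a compound that enhances binding.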
  • separating means every action which destroys the interactions between the histone tail and the protein. This includes in a preferred embodiment the elution of the protein from the histone tail.
  • the elution can be achieved by using nonspecific reagents (ionic strength, pH value, detergents). Such non-specific methods for destroying the interaction are principally known in the art and depend on the nature of the ligand enzyme interaction.
  • change of ionic strength, the pH value, the temperature or incubation with detergents are suitable methods to dissociate the target enzymes from the immobilized compound.
  • the application of an elution buffer can dissociate binding partners by extremes of pH value (high or low pH; e.g. lowering pH by using 0.1 M citrate, pH 2-3), change of ionic strength (e.g. high salt concentration using NaI, KI, MgCl2, or KCl), polarity reducing agents which disrupt hydrophobic interactions (e.g.
  • the solid support has preferably to be separated from the released material.
  • the individual methods for this depend on the nature of the solid support and are known in the art. If the support material is contained within a column the released material can be collected as column flowthrough. In case the support material is mixed with the lysate components (so called batch procedure) an additional separation step such as gentle centrifugation may be necessary and the released material is collected as supernatant.
  • magnetic beads can be used as solid support so that the beads can be eliminated from the sample by using a magnetic device.
  • Methods for the detection of proteins or for the determination of their amounts include physico-chemical methods such as protein sequencing (e.g. Edman degradation), analysis by mass spectrometry methods or immunodetection methods employing antibodies directed against the protein.
  • mass spectrometry or immunodetection methods are used in the context of the methods of the invention.
  • the mass spectrometry analysis is performed in a quantitative manner, for example by using iTRAQ technology (isobaric tags for relative and absolute quantification) or cICAT (cleavable isotope-coded affinity tags) (Wu et al., 2006. J. Proteome Res. 5, 651-658).
  • the mass spectrometry analysis is performed in a quantitative manner, for example by using the TMT technology.
  • the TMT reagents are a set of multiplexed, amine-specific, stable isotope reagents that can label peptides in up to six different biological samples enabling simultaneous identification and quantitation of peptides.
  • the characterization by mass spectrometry is performed by the identification of proteotypic peptides of the protein.
  • the idea is that the protein is digested with proteases and the resulting peptides are determined by MS.
  • As a result, peptide frequencies for peptides from the same source protein differ by a great degree, the most frequently observed peptides that "typically" contribute to the identification of this protein being termed "proteotypic peptides". Therefore, a proteotypic peptide as used in the present invention is an experimentally well observable peptide that uniquely identifies a specific protein or protein isoform.
  • the characterization is performed by comparing the proteotypic peptides obtained in the course of practicing the methods of the invention with known proteotypic peptides. Since, when using fragments prepared by protease digestion for the identification of a protein in MS, usually the same proteotypic peptides are observed for a given protein, it is possible to compare the proteotypic peptides obtained for a given sample with the proteotypic peptides already known for the protein.
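  • The digestion and comparison steps described above can be sketched in a few lines. This is an illustrative example only: it assumes trypsin as the protease and the commonly used rule that trypsin cleaves C-terminally of K or R unless the next residue is P; the function names and the example sequence are hypothetical and are not taken from the application.

```python
import re
from typing import Iterable, List


def tryptic_peptides(sequence: str, min_length: int = 1) -> List[str]:
    """In silico digest: split after K or R, except when followed by P."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    # A cleavage site at the very end of the sequence yields an empty piece; drop it.
    return [piece for piece in pieces if len(piece) >= min_length]


def matching_proteotypic(observed: Iterable[str], known: Iterable[str]) -> List[str]:
    """Observed peptides that are also in a list of known proteotypic peptides."""
    return sorted(set(observed) & set(known))


# Hypothetical sequence, used only to show the cleavage rule.
peptides = tryptic_peptides("MKTAYIAKQRPLSGR")
```

  • In practice the list of known proteotypic peptides would come from earlier experiments or a curated repository; here it is simply a parameter.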
  • the characterization of the protein, or the detection of the protein, or the determination of the amount of the protein is carried out by quantitative mass spectrometry.
  • Suitable immunodetection methods include but are not limited to Western blots, ELISA assays, sandwich ELISA assays and antibody arrays or a combination thereof.
  • the establishment of such assays is known in the art (Chapter 11, Immunology, pages 11-1 to 11-30 in: Short Protocols in Molecular Biology. Fourth Edition, Edited by F.M. Ausubel et al., Wiley, New York, 1999).
  • assays can not only be configured in a way to detect and quantify the protein of interest, but also to analyse posttranslational modification patterns of the protein such as phosphorylation, acetylation, methylation, ubiquitination or sumoylation.
  • the identification methods of the invention involve the use of compounds which are tested for their ability to be a compound interacting with the protein.
  • such a compound can be every molecule which is able to interact with the protein and to modulate its binding to the histone tail.
  • the compound is able to inhibit partially or completely the binding of the protein to the histone tail.
  • the compound is able to enhance the binding of the protein to the histone tail leading to a stabilization of the histone tail - protein complex.
  • said compound is selected from the group consisting of synthetic or naturally occurring chemical compounds or organic synthetic drugs, more preferably small molecule organic drugs or natural small molecule compounds.
  • said compound is identified starting from a library containing such compounds. Then, in the course of the present invention, such a library is screened.
  • small molecules are preferably not proteins or nucleic acids (e.g. siRNA or morpholino oligonucleotide).
  • small molecules exhibit a molecular weight of less than 1000 Da, more preferred less than 750 Da, most preferred less than 500 Da.
  • the compound of the present invention is not a nucleic acid or a protein. More preferably, the compound of the present invention is not a peptide.
  • a "library” according to the present invention relates to a (mostly large) collection of (numerous) different chemical entities that are provided in a sorted manner that enables both a fast functional analysis (screening) of the different individual entities, and at the same time provide for a rapid identification of the individual entities that form the library. Examples are collections of tubes or wells or spots on surfaces that contain chemical compounds that can be added into reactions with one or more defined potentially interacting partners in a high-throughput fashion. After the identification of a desired "positive" interaction of both partners, the respective compound can be rapidly identified due to the library construction. Libraries of synthetic and natural origins can either be purchased or designed by the skilled artisan.
  • Solid-phase chemistry is said to become an efficient tool for this optimisation process, and recent advances in this field are highlighted in this review article.
  • the current drug discovery processes in many pharmaceutical companies require large and growing collections of high quality lead structures for use in high throughput screening assays. Collections of small molecules with diverse structures and "drug-like" properties have, in the past, been acquired by several means: by archive of previous internal lead optimisation efforts, by purchase from compound vendors, and by union of separate collections following company mergers.
  • high throughput/combinatorial chemistry is described as being an important component in the process of new lead generation, the selection of library designs for synthesis and the subsequent design of library members has evolved to a new level of challenge and importance.
  • the protein preparation is first incubated with the compound and then contacted with the histone tail.
  • the simultaneous incubation is equally preferred (competitive binding assay).
  • the protein containing protein preparation is preferably first incubated with the compound for 10 to 60 minutes, more preferred 30 to 45 minutes at a temperature of 4°C to 37°C, more preferred 4°C to 25°C, most preferred 4°C.
  • compounds are used at concentrations ranging from 1 nM to 100 µM, preferably from 10 nM to 10 µM.
  • the second step, contacting with the ligand is preferably performed for 10 to 60 minutes at 4°C.
  • the protein containing protein preparation is preferably simultaneously incubated with the compound and the histone tail for 30 to 120 minutes, more preferred 60 to 120 minutes at a temperature of 4°C to 37°C, more preferred 4°C to 25°C, most preferred 4°C.
  • compounds are used at concentrations ranging from 1 nM to 100 µM, preferably from 10 nM to 10 µM.
  • the methods of the invention may be performed with several protein preparations in order to test different compounds. This embodiment is especially interesting in the context of medium or high throughput screenings.
  • the identification methods of the invention are performed as a medium or high throughput screening.
  • medium throughput screening may refer to multiple tests performed in parallel, preferably in a 96-well format, which means that 2 to 96 tests are performed in parallel.
  • High throughput screening may refer to multiple tests performed in parallel, preferably in a 384 or 1536 well format, which means that 2 to 384 or 2 to 1536 tests are performed in parallel.
  • the interaction compound identified according to the present invention may be further characterized by determining whether it has an effect on the protein, for example on its enzymatic activity (Khan et al., 2008. Biochem. J. 409(2):581-9; Blackwell et al., 2008. Life Sci. 82(21-22):1050-1058).
  • the compounds identified according to the present invention may further be optimized (lead optimisation). This subsequent optimisation of such compounds is often accelerated because of the structure-activity relationship (SAR) information encoded in these lead generation libraries. Lead optimisation is often facilitated due to the ready applicability of high-throughput chemistry (HTC) methods for follow-up synthesis.
  • HTC high-throughput chemistry
  • the invention further relates to a method for the preparation of a pharmaceutical composition comprising the steps of a) identifying a protein interacting compound as described above, and b) formulating the interacting compound to a pharmaceutical composition.
  • the obtained pharmaceutical composition can be used for the prevention or treatment of diseases where the respective protein plays a role.
  • the invention relates to a method for the identification of a compound being capable of interacting with a given protein capable of interacting with a given histone tail, comprising the steps of a) providing a protein preparation containing said protein, b) contacting the protein preparation with said histone tail being immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, c) incubating the complex with a given compound, and d) determining whether the compound is able to separate the protein from the immobilized histone tail.
  • the present invention relates to a method for the identification of a compound being capable of interacting with a given protein capable of interacting with a given histone tail, comprising the steps of a) providing a protein preparation containing said protein, b) contacting the protein preparation with said histone tail being immobilized on a solid support and with a given compound under conditions allowing the formation of a complex between the histone tail and the protein, and c) detecting the complex formed in step b),
  • the present invention relates to a method for the identification of a compound being capable of interacting with a given protein capable of interacting with a given histone tail, comprising the steps of: a) providing two aliquots of a protein preparation containing said protein, b) contacting one aliquot with said histone tail being immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, c) contacting the other aliquot with said histone tail immobilized on a
  • the present invention relates to a method for the identification of a compound being capable of interacting with a given protein capable of interacting with a given histone tail, comprising the steps of: a) providing two aliquots comprising each at least one cell containing said protein, b) incubating one aliquot with a given compound, c) harvesting the cells of each aliquot, d) lysing the cells in order to obtain protein preparations, e) contacting the protein preparations with said histone tail being immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, and f) determining the amount of the complex formed in each aliquot in step e),
  • the given histone tail may have the sequence A[Rme]TKQTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 1.
  • the given histone tail may have the sequence A[Rme2a][pT]KQTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 2.
  • the given histone tail may have the sequence A[Rme2a][pT][Kme3]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 3.
  • the given histone tail may have the sequence A[Rme2a]T[Kme3]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 4.
  • the given histone tail may have the sequence AR[pT][Kme3]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 5.
  • the given histone tail may have the sequence ART[Kme]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 6.
  • the given histone tail may have the sequence ART[Kme2]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 7.
  • the given histone tail may have the sequence ART[Kme3]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 8.
  • the given histone tail may have the sequence ART[Kme3]QTAR[Kac]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 9.
  • the given histone tail may have the sequence ART[Kme3]QTAR[Kme3]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 10.
  • the given histone tail may have the sequence ARTKQTAR[Kac]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 11.
  • the given histone tail may have the sequence ARTKQTAR[Kac][pS]TGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 12.
  • the given histone tail may have the sequence ARTKQTAR[Kac][pS][pT]GGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 13.
  • the given histone tail may have the sequence ARTKQTAR[Kme]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 14.
  • the given histone tail may have the sequence ARTKQTAR[Kme2]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 15.
  • the given histone tail may have the sequence ARTKQTAR[Kme3]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 16.
  • the given histone tail may have the sequence ARTKQTAR[Kme3][pS]TGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 17.
  • the given histone tail may have the sequence ARTKQTAR[Kme3][pS][pT]GGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 18.
  • the given histone tail may have the sequence ARTKQTAR[Kme3]S[pT]GGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 19.
  • the given histone tail may have the sequence ARTKQTARK[pS][pT]GGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 20.
  • the given histone tail may have the sequence ARTKQTARKS[pT]GGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 21.
  • the given histone tail may have the sequence [pS]G[Rme2a]G[Kac]GGKGLGKGGAKRHRKV and the given protein may be selected from the proteins depicted in Table 22.
  • the given histone tail may have the sequence SG[Rme2a]G[Kac]GGKGLGKGGAKRHRKV and the given protein may be selected from the proteins depicted in Table 23.
  • the given histone tail may have the sequence SGRG[Kac]GGKGLGKGGAKRHRKV and the given protein may be selected from the proteins depicted in Table 24.
  • the given histone tail may have the sequence SGRG[Kac]GG[Kac]GLGKGGAKRHRKV and the given protein may be selected from the proteins depicted in Table 25.
  • the given histone tail may have the sequence SGRG[Kac]GG[Kac]GLG[Kac]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 26.
  • the given histone tail may have the sequence SGRG[Kac]GG[Kac]GLG[Kac]GGA[Kac]RHRKV and the given protein may be selected from the proteins depicted in Table 27.
  • the given histone tail may have the sequence SGRG[Kac]GGKGLG[Kac]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 28.
  • the given histone tail may have the sequence SGRG[Kac]GGKGLGKGGA[Kac]RHRKV and the given protein may be selected from the proteins depicted in Table 29.
  • the given histone tail may have the sequence SGRGKGG[Kac]GLGKGGAKRHRKV and the given protein may be selected from the proteins depicted in Table 30.
  • the given histone tail may have the sequence SGRGKGG[Kac]GLG[Kac]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 31.
  • the given histone tail may have the sequence SGRGKGG[Kac]GLG[Kme3]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 32.
  • the given histone tail may have the sequence SGRGKGGKGLG[Kac]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 33.
  • the given histone tail may have the sequence SGRGKGGKGLG[Kme2]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 34.
  • the given histone tail may have the sequence SGRGKGGKGLG[Kme3]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 35.
  • all embodiments described above for the methods of the invention where in step a) a protein is identified according to the invention apply also for the methods where a specific histone tail and a specific protein is used.
  • the histone tail may be biotinylated as described above.
  • affinity matrix refers to the immobilized ligand (histone tail) as defined in the present application.
  • TAF15 TAF15 RNA polymerase II, TATA box binding protein (TBP)-associated factor
  • TBP TATA box binding protein
  • IPI00872107.1 ILF2 interleukin enhancer binding factor 2, 45kDa
  • IPI00045051.3 PURB purine-rich element binding protein B
  • IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
  • IPI00872107.1 ILF2 interleukin enhancer binding factor 2, 45kDa
  • GTF2A2 general transcription factor IIA, 2, 12kDa
  • IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
  • CD3EAP CD3e molecule, epsilon associated protein
  • CD3EAP CD3e molecule, epsilon associated protein
  • GTF2A1 general transcription factor IIA, 1, 19/37kDa
  • HNRNPA2B1 heterogeneous nuclear ribonucleoprotein A2/B1
  • IPI00872107.1 ILF2 interleukin enhancer binding factor 2, 45kDa
  • GTF2E2 general transcription factor IIE, polypeptide 2, beta 34kDa
  • GTF2F1 general transcription factor IIF, polypeptide 1, 74kDa
  • GTF2F2 general transcription factor IIF, polypeptide 2, 30kDa
  • IPI00642971.3 EEF1D eukaryotic translation elongation factor 1 delta (guanine nucleotide exchange protein)
  • IPI00005367.1 PRKAG2 protein kinase, AMP-activated, gamma 2 non-catalytic subunit
  • IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
  • FBR-MuSV Finkel-Biskis-Reilly murine sarcoma virus
  • DCTN1 dynactin 1
  • Table 31: Proteins interacting with histone tail SGRGKGG[Kac]GLG[Kac]GGAKRHRKV
  • IPI00644127.1 IARS isoleucyl-tRNA synthetase
  • TAF3 TAF3 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 140kDa
  • TAF5 TAF5 RNA polymerase II, TATA box binding protein (TBP)-associated factor
  • TAF7 TAF7 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 55kDa
  • TAF4 TAF4 RNA polymerase II, TATA box binding protein (TBP)-associated factor
  • TAF3 TAF3 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 140kDa
  • PELP1 proline, glutamate and leucine rich protein 1
  • IGF2BP1 insulin-like growth factor 2 mRNA binding protein 1
  • Table 32: Proteins interacting with histone tail SGRGKGG[Kac]GLG[Kme3]GGAKRHRKV
  • IPI00872107.1 ILF2 interleukin enhancer binding factor 2, 45kDa
  • IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
  • Cell lysate (nuclear extract of human HL-60 cells) was contacted with immobilized histone tails.
  • the beads with captured proteins were separated from the lysate and bead-bound proteins were eluted in SDS sample buffer and subsequently separated by SDS-polyacrylamide gel electrophoresis.
  • the gel was stained with colloidal Coomassie and stained areas of each gel lane were cut out and subjected to in-gel proteolytic digestion with trypsin.
  • Peptides originating from the different histone tail beads and the lysate control were labeled with isobaric tagging reagents (TMT reagents, Thermo Fisher).
  • the TMT reagents are a set of multiplexed, amine-specific, stable isotope reagents that can label peptides in up to six different biological samples enabling simultaneous identification and quantitation of peptides.
  • the combined samples were fractionated using reversed-phase chromatography at pH 11 and fractions were subsequently analyzed with a nano-flow liquid chromatography system coupled online to a tandem mass spectrometer (LC-MS/MS) experiment followed by reporter ion quantification in the MS/MS spectra (Ross et al., 2004. Mol. Cell. Proteomics 3(12):1154-1169; Dayon et al., 2008.
  • LC-MS/MS tandem mass spectrometer
  • Tables 1 to 21 show the proteins interacting with individual H3 histone tails which are listed in Table 36.
  • Tables 22 to 35 depict the proteins interacting with individual H4 histone tails which are listed in Table 37. Sequence accession numbers are defined by the International Protein Index (IPI) (Kersey et al., 2004. Proteomics 4(7): 1985-1988).
  • IPI International Protein Index
  • Biotinylated histone tails were purchased from Alta Bioscience (Birmingham, UK) and solubilized at a concentration of 0.8 mM in 10 mM Tris-HCl buffer, pH 7.4 (Sigma-Aldrich T2663, St. Louis, MO, USA). Histone tails were incubated with Streptavidin agarose beads for 30 minutes (Thermo Fisher Scientific 20361, Waltham, MA, USA). For each sample 14 µl of 0.8 mM histone tail solution and 25 µl of beads were used. The beads were then centrifuged for 5 minutes at 1,200 rpm (Heraeus 75004375, Hanau, Germany) and the supernatant was removed.
  • Rme mono-methylated arginine
  • Rme2a asymmetrical di-methylated arginine
  • Kme mono-methylated lysine
  • Kme2 di-methylated lysine
  • Kme3 tri-methylated lysine
  • Kac acetylated lysine
  • pS phosphorylated serine
  • pT phosphorylated threonine
  • ahx aminohexanoic acid
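  • The bracket notation used for the histone tail sequences (for example A[Rme2a][pT]KQTARKSTGGKAPRKQLA) can be read mechanically with the abbreviations listed above. The parser below is a hypothetical sketch: the names `MODIFICATIONS`, `parse_tail` and `bare_sequence` are illustrative, each bracketed tag is assumed to stand for exactly one modified residue, and ahx is left out because it is not written in brackets in the sequences shown.

```python
import re
from typing import List, Optional, Tuple

# Tag -> unmodified one-letter residue, following the abbreviation list.
MODIFICATIONS = {
    "Rme": "R", "Rme2a": "R",
    "Kme": "K", "Kme2": "K", "Kme3": "K", "Kac": "K",
    "pS": "S", "pT": "T",
}

# Either a bracketed tag such as [Kme3] or a single upper-case residue letter.
_TOKEN = re.compile(r"\[(\w+)\]|([A-Z])")


def parse_tail(notation: str) -> List[Tuple[int, str, Optional[str]]]:
    """Return (position, residue, modification tag or None) for each residue."""
    residues: List[Tuple[int, str, Optional[str]]] = []
    offset = 0
    while offset < len(notation):
        match = _TOKEN.match(notation, offset)
        if match is None:
            raise ValueError(f"unexpected character at offset {offset}: {notation[offset]!r}")
        tag, plain = match.group(1), match.group(2)
        if tag is not None:
            if tag not in MODIFICATIONS:
                raise ValueError(f"unknown modification tag: {tag}")
            residues.append((len(residues) + 1, MODIFICATIONS[tag], tag))
        else:
            residues.append((len(residues) + 1, plain, None))
        offset = match.end()
    return residues


def bare_sequence(notation: str) -> str:
    """Unmodified amino acid sequence underlying a modified tail."""
    return "".join(residue for _, residue, _ in parse_tail(notation))
```

  • Because any stray space or incomplete tag raises a `ValueError`, a parser of this kind can also serve as a consistency check on transcribed sequences.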
  • HL-60 cells ATCC CCL-240, Manassas, VA, USA
  • spinner flasks Integra Bioscience 182101, Zizers, Switzerland
  • IMDM medium Invitrogen 21980.065, Carlsbad, CA, USA
  • fetal calf serum PAA Laboratories 15/101, Pasching, Austria
  • Washed cells were centrifuged for 5 minutes (first wash) or 10 minutes (second wash) at 2,370 rpm (Heraeus 75004375).
  • the cell pellet was resuspended in 4 volumes of hypotonic buffer (10 mM TRIS-Cl, pH 7.4, 1.5 mM MgCl2 (Sigma M-1028), 10 mM KCl, 25 mM NaF (Sigma S7920), 1 mM Na3VO4 (Sigma S6508), 1 mM DTT (Biomol 04010, Plymouth Meeting, PA, USA)).
  • the cells were allowed to swell for 3 minutes (swelling checked under
  • the supernatant was discarded and the pellet was resuspended in 2 volumes of hypotonic buffer supplemented with protease inhibitors.
  • the cells were homogenized by 10 to 15 strokes in a homogenizer (VWR SCERSP885300-0015, Radnor, PA, USA) and the homogenate was centrifuged for 10 minutes at 3,300 rpm.
  • the supernatant was discarded and the nuclei were washed in 3 volumes of hypotonic buffer supplemented with protease inhibitors (1 tablet for 25 ml; Roche 13137200, Basel, Switzerland) and centrifuged for 15 minutes at 10,000 rpm in a SLA-600TC rotor (Sorvall 74503).
  • the pellet was resuspended in 1 volume of extraction buffer (50 mM TRIS-Cl, pH 7.4, 1.5 mM MgCl2, 20 % glycerol (Merck Z835091), 420 mM NaCl (Sigma S5150), 25 mM NaF, 1 mM Na3VO4, 1 mM DTT, 400 units/ml of DNase I (Sigma D4527), and protease inhibitors (1 tablet for 25 ml)) and then homogenized first with 20 strokes with a homogenizer and then by 30 minutes gentle mixing at 4°C. The homogenate was then centrifuged for 30 minutes at 10,000 rpm in a SLA-600TC rotor.
  • the supernatant was diluted in dilution buffer (1.8 ml buffer per 1 ml supernatant; 50 mM TRIS-Cl, pH 7.4, 3.9 mM EDTA (Sigma E7889), 25 mM NaF, 1 mM Na3VO4, 0.6 % Igepal CA-630 (Sigma, 13021), 1 mM DTT and protease inhibitors (1 tablet for 25 ml)).
  • the lysate was centrifuged for 1 hour at 33,500 rpm in a Ti50.2 rotor (Beckman Coulter LE90K, 392052, Brea, CA, USA) and the supernatant was frozen in liquid nitrogen and stored at -80°C. After thawing of the nuclear lysate the protein concentration was adjusted to 5 mg/ml.
  • the final buffer composition was 50 mM TRIS pH 7.4, 5% Glycerol, 150 mM NaCl, 25 mM NaF, 2.5 mM EDTA, 0.4% Igepal CA-630, 1 mM DTT and protease inhibitors (1 tablet for 25 ml lysate). The lysate was then subjected to ultracentrifugation at 33,500 rpm for 20 minutes in a Ti50.2 rotor.
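  • The dilution step (1.8 ml dilution buffer per 1 ml supernatant) can be checked with a simple volume-weighted mixing calculation. The sketch below is illustrative only: `mix_concentration` is a hypothetical helper, and the calculation assumes that the supernatant has the nominal composition of the extraction buffer (420 mM NaCl, no EDTA, no Igepal), which ignores any contribution from the nuclear pellet itself.

```python
def mix_concentration(conc_a: float, vol_a: float, conc_b: float, vol_b: float) -> float:
    """Concentration after mixing two solutions (same concentration unit, same volume unit)."""
    return (conc_a * vol_a + conc_b * vol_b) / (vol_a + vol_b)


SUPERNATANT_ML = 1.0
DILUTION_BUFFER_ML = 1.8

# Supernatant value first (nominal extraction buffer), dilution buffer value second.
nacl_mm = mix_concentration(420.0, SUPERNATANT_ML, 0.0, DILUTION_BUFFER_ML)
edta_mm = mix_concentration(0.0, SUPERNATANT_ML, 3.9, DILUTION_BUFFER_ML)
igepal_percent = mix_concentration(0.0, SUPERNATANT_ML, 0.6, DILUTION_BUFFER_ML)

print(f"NaCl   {nacl_mm:.0f} mM")
print(f"EDTA   {edta_mm:.2f} mM")
print(f"Igepal {igepal_percent:.2f} %")
```

  • Under these assumptions the three values come out close to the 150 mM NaCl, 2.5 mM EDTA and 0.4% Igepal CA-630 stated for the final buffer.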
  • the lysate was precleared with Poly-L lysine agarose beads for 90 minutes. For 25 mg of protein in the lysate 600 µl of beads were used (Sigma P6893-50). The histone tail beads were then added to the lysate for 2 hours. The beads were centrifuged and loaded into a purification column (Mobicol, Mobitec M1002, Goettingen, Germany).
  • the beads were washed first with 10 ml of buffer (50 mM TRIS pH 7.4, 5% Glycerol, 150 mM NaCl, 25 mM NaF, 2.5 mM EDTA, 0.4% Igepal CA-630, 1 mM DTT and protease inhibitors) and then 5 ml of buffer with half the concentration of detergent (0.2% Igepal). Bound proteins were eluted with 50 µl of loading buffer (Nupage, Invitrogen NP0007) at 50°C for 30 minutes.
  • Gel-separated proteins were digested in-gel essentially following a previously described procedure (Shevchenko et al., 1996, Anal. Chem. 68:850-858). Briefly, gel-separated proteins were excised from the gel using a clean scalpel, destained twice using 100 µl 5 mM triethylammonium bicarbonate buffer (TEAB; Sigma T7408) and 40% ethanol in water and dehydrated with absolute ethanol. Proteins were subsequently digested in-gel with porcine trypsin (Promega) at a protease concentration of 10 ng/µl in 5 mM TEAB. Digestion was allowed to proceed for 4 hours at 37°C and the reaction was subsequently stopped using 5 µl 5% formic acid. Gel plugs were extracted twice with 20 µl 1% formic acid and three times with increasing concentrations of acetonitrile. Peptide extracts were subsequently pooled with acidified digest supernatants and dried in a vacuum centrifuge.
  • the peptide extracts corresponding to the different aliquots treated with different concentrations of compound 1 were labeled with variants of the isobaric tagging reagent as shown in Table 1 (TMT sixplex Label Reagent Set, part number 90066, Thermo Fisher Scientific Inc., Rockford, IL 61105 USA).
  • the TMT reagents are a set of multiplexed, amine-specific, stable isotope reagents that can label peptides on amino groups in up to six different biological samples enabling simultaneous identification and quantification of peptides.
  • the TMT reagents were used according to instructions provided by the manufacturer.
  • the samples were resuspended in 10 µl 50 mM TEAB solution, pH 8.5 and 10 µl acetonitrile were added.
  • the TMT reagent was dissolved in acetonitrile to a final concentration of 24 mM and 10 µl of reagent solution were added to the sample.
  • the labeling reaction was performed at room temperature for one hour on a horizontal shaker and stopped by adding 5 µl of 100 mM TEAB and 100 mM glycine in water.
  • the labeled samples were then combined, dried in a vacuum centrifuge and resuspended in 60% 200mM TEAB / 40% acetonitrile.
  • Peptide samples were injected into a capillary LC system (CapLC, Waters) and separated using a reversed phase C18 column (X-Bridge 1 mm x 150 mm, Waters, USA). Gradient elution was performed at a flow-rate of 50 μl/min.
  • Solvent A: 20 mM ammonium formate, pH 11.
  • Solvent B: 20 mM ammonium formate, pH 11, 60% acetonitrile. Fractions of 1 min were automatically collected throughout the separation range (Micro-fraction collector, Sunchrom, Germany) and pooled to yield a total of 16 peptide fractions.
  • LTQ-Orbitrap XL and Orbitrap Velos instruments were operated with XCalibur 2.0/2.1 software. Intact peptides were detected in the Orbitrap at 30,000 resolution. Internal calibration was performed using the ion signal from (Si(CH3)2O)6H+ at m/z 445.120025 (Olsen et al., 2005. Mol. Cell Proteomics 4, 2010-2021). Data dependent tandem mass spectra were generated for up to six peptide precursors using a combined CID/HCD approach (Kocher et al., 2009. J. Proteome Res. 8, 4743-4752). For CID, up to 5000 ions (Orbitrap XL) or up to 3000 ions (Orbitrap Velos) were accumulated in the ion trap within a maximum ion accumulation time of 200 msec.
  • Mascot™ 2.0 (Matrix Science) was used for protein identification using 10 ppm mass tolerance for peptide precursors and 0.8 Da (CID) tolerance for fragment ions. Carbamidomethylation of cysteine residues and iTRAQ/TMT modification of lysine residues were set as fixed modifications and S,T,Y phosphorylation, methionine oxidation, N-terminal acetylation of proteins and iTRAQ/TMT modification of peptide N-termini were set as variable modifications.
  • The search database consisted of a customized version of the IPI protein sequence database combined with a decoy version of this database created using a script supplied by Matrix Science (Elias et al., 2005. Nat.
  • Centroided iTRAQ/TMT reporter ion signals were computed by the XCalibur software operating the instruments and extracted from MS data files using customized scripts. Only peptides unique for identified proteins were used for relative protein quantification. Furthermore, spectra used for quantification were filtered according to the following criteria: Mascot ion score > 15, signal to background ratio of the precursor ion > 4, s2i > 0.5 (Savitski et al., 2010. J. Am. Soc. Mass Spectrom. 21(10):1668-79). Reporter ion intensities were multiplied by the ion accumulation time, yielding an area value proportional to the number of reporter ions present in the mass analyzer.
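The spectrum filter and the area calculation described in the last paragraph can be sketched as follows. This is a minimal illustration, not the customized scripts referred to in the text: the `Spectrum` class, its field names and the function names are hypothetical, and only the three thresholds and the multiplication by the ion accumulation time are taken from the description.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Spectrum:
    """Hypothetical record for one quantified tandem mass spectrum."""
    mascot_score: float            # Mascot ion score of the peptide match
    signal_to_background: float    # precursor ion signal to background ratio
    s2i: float                     # s2i value (Savitski et al., 2010)
    accumulation_time_ms: float    # ion accumulation time
    reporter_intensities: Tuple[float, ...]  # one value per reporter channel


def passes_filter(spectrum: Spectrum) -> bool:
    """Apply the three strict thresholds stated in the text."""
    return (
        spectrum.mascot_score > 15
        and spectrum.signal_to_background > 4
        and spectrum.s2i > 0.5
    )


def reporter_areas(spectrum: Spectrum) -> Tuple[float, ...]:
    """Multiply each reporter ion intensity by the ion accumulation time."""
    return tuple(
        intensity * spectrum.accumulation_time_ms
        for intensity in spectrum.reporter_intensities
    )


def quantifiable_areas(spectra: List[Spectrum]) -> List[Tuple[float, ...]]:
    """Return area values for the spectra that pass the filter."""
    return [reporter_areas(s) for s in spectra if passes_filter(s)]
```

In this sketch the comparisons are strict, so a spectrum with a Mascot ion score of exactly 15 is excluded, which matches the "> 15" wording above.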

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Biomedical Technology (AREA)
  • Urology & Nephrology (AREA)
  • Hematology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • Cell Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Food Science & Technology (AREA)
  • Medicinal Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Peptides Or Proteins (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present invention relates to methods for the identification of proteins capable of interacting with a given histone tail as well as of compounds interacting with said proteins.

Description

Methods for the identification and characterization of proteins interacting with histone tails and of compounds interacting with said proteins
The present invention relates to methods for the identification and characterization of proteins interacting with histone tails as well as to the identification and characterization of compounds interacting with said proteins. Proteins interacting with histones have gained an increasing interest in the past years due to their potential to modify chromatin structure and thereby to make DNA accessible or inaccessible for RNA polymerase complexes and transcription factors. Of particular interest are proteins interacting with the histone tails, because they have been shown to have a significant influence on the regulation of gene expression.
Chromatin is the state in which DNA is packaged within the cell. The nucleosome is the unit of chromatin and it consists of an octamer of the four core histone proteins (dimers of histone H3, H4, H2A and H2B) around which 147 base pairs of DNA are wrapped. The histones are small, basic proteins rich in arginine and lysine residues resulting in a high affinity for DNA. Histone H3 and H4 are among the most conserved proteins known. Each of the core histones has a histone fold domain and a flexible N-terminal tail (H2A and H2B also have C-terminal tails) which contains sites for covalent modification that are important for chromatin function. Most of these modifications occur on the N-terminal unstructured part of the histone - particularly on histone H3 and H4 - that protrudes like a tail from the nucleosome, hence the name of "histone tails" (Kouzarides, 2007. Cell 128(4):693-705).
Examples of histone modifications are acetylation of lysines, methylation of lysines and arginines, phosphorylation of serines, threonines and tyrosines, ubiquitinylation of lysines and transformation of arginine to citrulline. With the exception of the citrullination, all these modifications are reversible. Acetylation, by neutralizing the positive charge of the lysine side chain, has a tendency to displace the tail from the nucleosome and the DNA and favours gene transcription. Methylation does not alter the charge and can promote or inhibit transcription. Phosphorylation often has a negative impact on the role of nearby histone marks. Therefore, depending on the nature of a mark and its presence or absence on a particular histone tail residue, gene transcription will be promoted or repressed. Chromatin immunoprecipitation experiments have shown that specific histone marks are enriched at promoters of actively transcribed genes, such as methylated lysine 4 on histone H3 or acetylated lysine 5 on histone H4, whereas other marks are enriched in the poorly transcribed heterochromatin regions, like methylated lysines 9 and 27 on histone H3 (Barski et al., 2007. Cell 129(4):823-837).
The covalent posttranslational modifications (PTMs) of histone tails are added by enzymes classified as "writers" (e.g. acetyltransferases) or removed by enzymes classified as "erasers" (e.g. deacetylases). Proteins binding to and recognizing these modifications are referred to as "readers" (e.g. bromodomain proteins that bind to acetylated lysine).
The addition of acetyl groups is catalyzed by histone acetyl transferases (HATs) and their removal by histone deacetylases (HDACs). Acetylation of histone tails is almost invariably associated with the activation of transcription whereas the reversal of acetylation correlates with transcriptional repression.
Methyl groups are added by protein methyl transferases (PMTs) and removed by protein demethylases (PDMs). Lysine residues can be mono-, di- or tri-methylated by protein lysine methyltransferases (PKMTs) and arginine residues can be mono- or di-methylated by protein arginine methyl transferases (PRMTs). Importantly, the dimethylation on arginine can be asymmetrical or symmetrical. Lysine and arginine methylation of histone tails can either activate or repress gene transcription.
Phosphorylation is regulated by kinases (addition of phosphate groups) and phosphatases (removal of phosphate groups). The phosphorylation of serine residue 10 of histone H3 (H3S10) by the Aurora-B kinase has been implicated in cell cycle progression and cancer whereas the phosphatase PP1 is thought to dephosphorylate H3S10.
Ubiquitin moieties can be added to histones (mono- or poly-ubiquitination) by protein ubiquitin ligases (E3 ligases) and removed by protein ubiquitin carboxyl-terminal hydrolases (UCHs). Like ubiquitylation, sumoylation is a large modification and shows some similarity to ubiquitylation. All four core histones can be sumoylated and specific sites have been identified on H4, H2A, and H2B. Sumoylation antagonizes both acetylation and ubiquitylation, which occur on the same lysine residue, and consequently represses transcription in yeast (Nathan et al., 2006. Genes Dev. 20(8):966-976).
These posttranslational histone modifications are recognized ("read") by proteins containing domains that bind specifically to a particular mark. Bromodomain containing proteins recognize acetylated lysine. MBT domains recognize mono- and di-methylated lysines. Di- and tri-methylated lysines are read by TUDOR, CHROMO and PHD domains. TUDOR domains also recognize symmetrically dimethylated arginine whereas PHD domains are also able to bind to mono-methylated, unmethylated and acetylated lysines. These different proteins interact with transcription factors and regulators to control transcriptional initiation, elongation and termination as well as other functions like DNA methylation or DNA repair. Therefore the combinatorial histone mark pattern of each histone tail forms a "histone code" for the assembly of large epigenetic protein complexes containing histone mark "writers", "erasers" and "readers". These protein complexes are thought to determine the transcriptional fate of a given gene (Ruthenburg et al., 2007. Nat. Rev. Mol. Cell Biol. 8(12):983-994). Recent research revealed that the misregulation of histone modification (misreading, miswriting and miserasing of histone marks) can cause epigenetic alterations leading to pathogenic conditions like cancer and inflammation. As examples of writers, the proteins mixed lineage leukemia (MLL) and enhancer of zeste 2 (EZH2) catalyse the methylation of histone H3 lysine 4 (H3K4) and H3 lysine 27 (H3K27), respectively, which represent two of the most important histone methylation marks. MLL rearrangement and deregulation of EZH2 are among the most common mutations in leukemia and solid tumours. Several plant homeodomain (PHD) finger-containing proteins have recently been identified as readers of trimethylation of H3K4 (H3K4me3). Misinterpretation of H3K4me3 by leukemia-associated translocations of PHD finger factors (NUP98-JARID1A or NUP98-PHF23) is crucial for the induction of leukemia.
Somatic mutations of ING1, a PHD finger factor, interfere with the reading of H3K4me3 and associate with the development of squamous cell carcinoma, head and neck squamous cell carcinoma and melanoma (Chi et al., 2010. Nat. Rev. Cancer. 10(7):457-69). The link of histone modifying enzymes (and proteins that recognize these modifications) to human disease suggests that these proteins represent promising emerging drug targets (Copeland et al., 2010. Curr. Opin. Chem. Biol. 14(4):505-510). For example, the approval of the drug vorinostat (Zolinza, Merck) by the FDA for the treatment of cutaneous T-cell lymphoma in October 2006 significantly increased the interest in developing inhibitors for the class of enzymes known as histone deacetylases (HDACs).
Several HDAC inhibitors are in preclinical development and clinical trials for the treatment of a wide variety of diseases including cancer, inflammatory, cardiac, and neurodegenerative diseases (Bolden et al., 2006. Nat. Rev. Drug Discov. 5(9):769-784; Haberland et al., 2009. Nat. Rev. Genet. 10(1):32-42). It is expected that the development of selective HDAC inhibitors targeting only one member of the HDAC family should lead to improved efficacy and drug safety compared to non-selective "pan-HDAC inhibitors" (Kalin et al., 2009. Curr. Opin. Chem. Biol. 13:1-9; Balasubramanian et al., 2009. Cancer Lett. 280(2):211-21).
The majority of studies of HDAC activity are performed using in vitro biochemical assays. These assays are also used for the identification of HDAC inhibitors (Hauser et al., 2009. Curr. Top. Med. Chem. 9(3):227-234). For example, HDAC activity can be measured using purified recombinant enzyme in solution-based assays with acetylated peptide substrates (Blackwell et al., 2008. Life Sciences 82(21-22):1050-1058).
Typically, these assays require the availability of purified or recombinant HDACs. However, not all HDACs can be produced with sufficient enzymatic activity to allow for inhibitor screening (Blackwell et al., 2008. Life Sciences 82(21-22):1050-1058). In addition, some preparations of HDACs expressed in insect cells are contaminated with endogenous insect HDACs, making the interpretation of assay results ambiguous.
Another prerequisite for the identification of selective HDAC inhibitors, although not necessary in all instances, is a method that allows the target selectivity of these molecules to be determined. For example, it can be intended to provide molecules that bind to and inhibit a particular drug target but do not interact with a closely related target, the inhibition of which could lead to unwanted side effects. Conventionally, large panels of individual enzyme assays are used to assess the inhibitory effect of a compound on HDACs (Khan et al., 2008. Biochem. J. 409(2):581-9; Blackwell et al., 2008. Life Sci. 82(21-22):1050-1058).
Methods for selectivity profiling are also necessary for proteins that bind to histone modifications (readers). The histone binding selectivity of purified recombinant yeast bromodomains was assessed by using the native core histones in an overlay assay. In addition, N-terminally biotinylated lysine-acetylated histone peptides were spotted on a streptavidin-coated nitrocellulose membrane in a dot blot assay (Zhang et al., 2010. PLoS One 5(1):e8903).
In view of the above, there is a need for identifying proteins being capable of interacting with histone tails. Furthermore, there is a need for providing effective tools and methods for the identification and selectivity profiling of compounds capable of interacting with proteins capable of interacting with a given histone tail.
Summary of the Invention
In a first aspect, the invention relates to a method for the identification of a protein capable of interacting with a given histone tail, comprising the steps of a) providing a protein preparation containing a protein of interest, b) contacting the protein preparation with the given histone tail under conditions allowing the binding of the protein to said histone tail, and c) characterizing the protein bound to the histone tail.
In a preferred embodiment, the histone tail is labeled and after the contacting the histone tail-protein complex is purified using said label.
In a preferred embodiment, the label is biotin and the histone tail-protein complex is purified with the help of streptavidin.
In a preferred embodiment, step c) includes the steps of c1) eluting the protein from the histone tail, and c2) characterizing the protein. In a preferred embodiment, said characterizing the protein includes the identification of the protein.
In a preferred embodiment, said protein is characterized by mass spectrometry or immunodetection.
In a preferred embodiment, step b) is performed in the presence of varying concentrations of a non-labeled histone tail. In a preferred embodiment, the amount of protein is determined in step c) and a detection of a reduced amount of protein with increasing concentrations of the non-labeled histone tail is indicative of a specific binding of the protein to the histone tail.
In a further aspect, the invention relates to a method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of a) identifying a protein capable of interacting with a given histone tail according to the above methods of the invention, b) providing a protein preparation containing said protein, c) contacting the protein preparation with said histone tail immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, d) incubating the complex with a given compound, and e) determining whether the compound is able to separate the protein from the immobilized histone tail.
In a preferred embodiment, step e) includes the detection of separated protein or the determination of the amount of separated protein.
In a preferred embodiment, said separated protein is detected or the amount of said separated protein is determined by mass spectrometry or immunodetection methods, preferably with an antibody directed against the protein.
In a further aspect, the present invention relates to a method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of a) identifying a protein capable of interacting with a given histone tail according to the above methods of the invention, b) providing a protein preparation containing said protein, c) contacting the protein preparation with said histone tail immobilized on a solid support and with a given compound under conditions allowing the formation of a complex between the histone tail and the protein, and d) detecting the complex formed in step c).
In a preferred embodiment, in step d) said detecting is performed by determining the amount of the complex.
In a preferred embodiment, steps a) to d) are performed with several protein preparations in order to test different compounds.
In a further aspect, the present invention relates to a method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of: a) identifying a protein capable of interacting with a given histone tail according to the above methods of the invention, b) providing two aliquots of a protein preparation containing said protein, c) contacting one aliquot with said histone tail immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, d) contacting the other aliquot with said histone tail immobilized on a solid support and with a given compound under conditions allowing the formation of a complex between the histone tail and the protein, and e) determining the amount of the complex formed in steps c) and d).
In a further aspect, the present invention relates to a method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of: a) identifying a protein capable of interacting with a given histone tail according to the above methods of the invention, b) providing two aliquots comprising each at least one cell containing said protein, c) incubating one aliquot with a given compound, d) harvesting the cells of each aliquot, e) lysing the cells in order to obtain protein preparations, f) contacting the protein preparations with said histone tail immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, and g) determining the amount of the complex formed in each aliquot in step f).
In a preferred embodiment, a reduced amount of the complex formed in the aliquot incubated with the compound in comparison to the aliquot not incubated with the compound indicates that the protein is a target of the compound.
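The comparison between the aliquot incubated with the compound and the aliquot not incubated with it can be sketched as a simple ratio. This is only an illustration under stated assumptions: the function names are hypothetical, the amounts may be in any consistent unit (for example a reporter ion area or a blot signal), and the 0.5 cut-off is an arbitrary example value, since the text speaks only of a "reduced amount" and specifies no threshold.

```python
def relative_complex_amount(amount_with_compound: float,
                            amount_without_compound: float) -> float:
    """Return the complex amount with compound relative to the untreated aliquot."""
    if amount_without_compound <= 0:
        raise ValueError("the untreated aliquot must contain a measurable amount")
    return amount_with_compound / amount_without_compound


def is_candidate_target(amount_with_compound: float,
                        amount_without_compound: float,
                        threshold: float = 0.5) -> bool:
    """Flag the protein as a candidate target if the complex amount is reduced.

    `threshold` is an assumed example value, not a figure from the text.
    """
    ratio = relative_complex_amount(amount_with_compound, amount_without_compound)
    return ratio < threshold
```

A ratio near 1 means the compound did not noticeably affect complex formation in this sketch, while a ratio well below 1 corresponds to the reduced complex amount described above.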
In a preferred embodiment, the amount of the complex is determined by separating the protein from the immobilized histone tail and subsequent detection of separated protein or subsequent determination of the amount of separated protein.
In a preferred embodiment, said protein is detected or the amount of said protein is determined by mass spectrometry or immunodetection methods, preferably with an antibody directed against said protein. In a preferred embodiment, said methods are performed as a medium or high throughput screening.
In a preferred embodiment, said compound is selected from the group consisting of synthetic compounds, or organic synthetic drugs, more preferably small molecule organic drugs, and natural small molecule compounds.
In a preferred embodiment, the solid support is selected from the group consisting of agarose, modified agarose, sepharose beads (e.g. NHS-activated sepharose), latex, cellulose, and ferro- or ferrimagnetic particles.
In a preferred embodiment, the histone tail is non-covalently coupled to the solid support.
In a preferred embodiment, the provision of a protein preparation includes the steps of harvesting at least one cell containing the protein and lysing the cell.
In a preferred embodiment, the steps of the formation of the complex between the histone tail and the protein are performed under essentially physiological conditions. In a preferred embodiment, at least one of the amino acids of the histone tail has been further modified.
The present invention also relates to a method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, wherein instead of the identification of a protein according to the method of the invention (step a) in the respective methods of the invention) the histone tail is one of the histone tails as shown in one of the Tables 1 to 35 and the protein is one of the proteins corresponding to the respective histone tail as shown in one of the Tables 1 to 35. In preferred embodiments, all embodiments described above for the methods of the invention where in step a) a protein is identified according to the invention apply also for the methods where a specific histone tail and a specific protein are used.
Detailed Description
The present invention relates both to methods for the identification of proteins capable of binding to a given histone tail as well as to methods for the identification of compounds being capable of binding to said proteins. In the following, preferred embodiments and definitions for the methods of the invention are discussed.
With the help of the present invention, it is possible to very efficiently identify proteins being capable of interacting with a given histone tail, as shown in the examples of the present invention. In the context of the present invention, it is possible to identify proteins which directly interact with the histone tail as well as proteins which interact indirectly, i.e. which are part of a larger protein complex which itself interacts with the histone tail.
Consequently, according to the present invention, the term "protein" also includes enzymes that can add or remove histone tail modifications and proteins that can bind to modified histone tails and proteins associated with said enzymes or binding proteins. In this context, the term "protein" also includes peptides or oligopeptides with or without posttranslational modifications such as glycosylation, ubiquitinylation, methylation or the like.
According to the invention, the term "histone tail" denotes the flexible aminoterminal regions of the core histones (H2A, H2B, H3, H4) and the flexible carboxyterminal regions of histones H2A and H2B that extend beyond the surface of the nucleosome (Jufvas et al., 2011. PLoS One 6(1):e15960).
Preferably, the histone tail comprises the aminoterminal amino acids at position 1 to 40 of histone H2A (SGRGKQGGKARAKAKTRSSRAGLQFPVGRVHRLLRKGNYS), more preferred amino acids 1 to 30, and most preferred 1 to 21, of said amino acid sequence.
Preferably, the histone tail comprises the aminoterminal amino acids at position 1 to 40 of histone H2B (PEPSKSAPAPKXGSKXAITKAQKXDGKKRKRSRKESYSIY), more preferred amino acids 1 to 30, and most preferred 1 to 21, of said amino acid sequence.
Preferably, the histone tail comprises the aminoterminal amino acids at position 1 to 40 of histone H3 (ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPHR), more preferred amino acids 1 to 30, and most preferred 1 to 21, of said amino acid sequence.
Preferably, the histone tail comprises the aminoterminal amino acids at position 1 to 40 of histone H4 (SGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARR), more preferred amino acids 1 to 30, and most preferred 1 to 21, of said amino acid sequence.
Equally preferred is any peptide located within the 40 aminoterminal amino acid residues of a histone tail. Preferably, this peptide comprises 5 to 20 amino acid residues.
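As a worked illustration of the ranges given above, the following sketch takes the 40-residue histone H3 sequence quoted in the text and derives the preferred aminoterminal stretches (1 to 40, 1 to 30, 1 to 21) as well as every peptide of 5 to 20 residues located within the tail. The function names are hypothetical and chosen only for this example.

```python
from typing import Iterator, Tuple

# Aminoterminal residues 1 to 40 of histone H3 as quoted in the description.
H3_TAIL = "ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPHR"


def tail_prefix(sequence: str, length: int) -> str:
    """Return residues 1..length of the tail (1-based, inclusive)."""
    if not 1 <= length <= len(sequence):
        raise ValueError("length must lie within the tail sequence")
    return sequence[:length]


def tail_peptides(sequence: str, min_len: int = 5,
                  max_len: int = 20) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, peptide) for all peptides of min_len..max_len residues.

    Positions are 1-based and inclusive, following the numbering in the text.
    """
    for length in range(min_len, max_len + 1):
        for start in range(len(sequence) - length + 1):
            yield start + 1, start + length, sequence[start:start + length]


if __name__ == "__main__":
    for n in (40, 30, 21):
        print(n, tail_prefix(H3_TAIL, n))
    print(sum(1 for _ in tail_peptides(H3_TAIL)), "candidate peptides")
```

The same functions apply unchanged to the other tail sequences given above.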
Equally preferred, the histone tail comprises the carboxyterminal amino acid residues of histones H2A and H2B.
Preferably, at least one of the amino acids of the histone tail has been further modified (Jufvas et al., 2011. PLoS One 6(1):e15960).
Preferably, at least one lysine is acetylated (Kac) on the free epsilon amino group.
Preferably, at least one lysine is mono-methylated (Kme1), di-methylated (Kme2) or tri-methylated (Kme3).
Preferably, at least one arginine is mono-methylated (Rme1), asymmetrically di-methylated (Rme2a) or symmetrically di-methylated (Rme2s).
Preferably, at least one serine is phosphorylated (Sp).
Preferably, at least one threonine is phosphorylated (Tp).
Preferably, at least one tyrosine is phosphorylated (Yp).
Equally preferred is a combination of several modifications as defined above on one histone tail peptide.
In a preferred embodiment, the histone tail may be labeled, e.g. with biotin. The histone tail may be labeled with biotin at the amino-terminus or carboxy-terminus. This enables further purification of the histone tail-protein complex, which in turn facilitates the characterization of the protein.
Preferably, the biotin group is attached to the carboxyterminus of aminoterminal histone tails.
Preferably, the biotin group is attached to the aminoterminus of carboxyterminal histone tails.
Methods for the synthesis of a labeled histone tail are known in the art and include biotinylation (Wysocka 2006. Methods 40(4):339-343; Schulze and Mann, 2004. J. Biol. Chem. 279(11):10756-64).
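The modification abbreviations defined above (Kac, Kme1 to Kme3, Rme1, Rme2a, Rme2s, Sp, Tp, Yp) and the terminal biotin label can be captured in a small data structure for describing a given histone tail peptide. The sketch below is purely illustrative: the function name, the bracket notation in the output string and the "biotin-" prefix and "-biotin" suffix are assumptions made for this example and are not a notation used by the source.

```python
from typing import Dict, Optional

# Residues on which each mark is defined, following the abbreviations in the text.
ALLOWED_RESIDUES = {
    "ac": "K",     # Kac
    "me1": "KR",   # Kme1, Rme1
    "me2": "K",    # Kme2
    "me3": "K",    # Kme3
    "me2a": "R",   # Rme2a
    "me2s": "R",   # Rme2s
    "p": "STY",    # Sp, Tp, Yp
}


def annotate_tail(sequence: str,
                  modifications: Optional[Dict[int, str]] = None,
                  biotin_terminus: Optional[str] = None) -> str:
    """Return an annotated peptide string.

    `modifications` maps 1-based positions to marks; `biotin_terminus` is
    None, "N" or "C". Marks on an unsuitable residue raise ValueError.
    """
    modifications = modifications or {}
    for position in modifications:
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position {position} is outside the peptide")
    parts = []
    for position, residue in enumerate(sequence, start=1):
        mark = modifications.get(position)
        if mark is None:
            parts.append(residue)
        elif residue not in ALLOWED_RESIDUES.get(mark, ""):
            raise ValueError(f"{mark} is not defined for {residue}{position}")
        else:
            parts.append(f"[{residue}{mark}]")
    peptide = "".join(parts)
    if biotin_terminus is None:
        return peptide
    if biotin_terminus == "N":
        return "biotin-" + peptide
    if biotin_terminus == "C":
        return peptide + "-biotin"
    raise ValueError("biotin_terminus must be None, 'N' or 'C'")
```

For example, `annotate_tail("ARTKQTARKS", {4: "me3"}, biotin_terminus="C")` describes the first ten residues of the H3 tail with trimethylated lysine 4 and a carboxyterminal biotin.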
Furthermore, methods for the purification of protein complexes comprising a labeled histone tail as well as the protein are known in the art (Vermeulen et al., 2010. Cell 142(6):967-980).
In a preferred embodiment of the method for the identification of a protein, step b) may be performed in the presence of varying concentrations of a non-labeled histone tail. This enables the identification of a specific binding of the protein to the histone tail: if reduced amounts of protein are detected with increasing concentrations of the non-labeled histone tail, this is an indication of a specific binding.
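The competition read-out described above can be expressed as a simple check on a series of measurements. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the amounts may be in any consistent unit, and the requirement of at least a 50% reduction at the highest competitor concentration is an arbitrary example value, because the text gives no numerical criterion.

```python
from typing import Sequence


def indicates_specific_binding(competitor_concentrations: Sequence[float],
                               bound_amounts: Sequence[float],
                               min_reduction: float = 0.5) -> bool:
    """Check whether the bound protein decreases as the competitor increases.

    Returns True if the bound amount never rises with increasing concentration
    of non-labeled histone tail and drops by at least `min_reduction` (an
    assumed example value) between the lowest and highest concentration.
    """
    if len(competitor_concentrations) != len(bound_amounts):
        raise ValueError("one bound amount is needed per concentration")
    if len(bound_amounts) < 2:
        raise ValueError("at least two concentrations are needed")
    # Order the measurements by competitor concentration.
    ordered = sorted(zip(competitor_concentrations, bound_amounts))
    amounts = [amount for _, amount in ordered]
    if amounts[0] <= 0:
        return False
    non_increasing = all(later <= earlier
                         for earlier, later in zip(amounts, amounts[1:]))
    reduced_enough = amounts[-1] <= (1.0 - min_reduction) * amounts[0]
    return non_increasing and reduced_enough
```

A protein whose bound amount stays roughly constant across the competitor series would not be flagged by this check, consistent with binding that is not specific to the histone tail.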
Consequently, in a preferred embodiment, the protein specifically interacts with the histone tail.
The histone tail may be immobilized on a microarray as known in the art. Such microarrays are commercially available or can be produced by known methods (Voigt and Reinberg, 2011. Chembiochem. 12(2):236-252). Preferably, said microarray contains a plurality of histone tails.
Once a protein has been identified which is capable of interacting with a given histone tail, it is, according to the invention, possible to identify compounds interacting with said proteins. This is in principle possible with every method known in the art for the identification of compounds interacting with proteins, but in a preferred embodiment the methods as depicted in the claims are used.
With the help of the present invention, it is especially possible to identify specific histone deacetylases being capable of interacting with histone tails. According to the present invention, the expression "HDAC" or "histone deacetylase" means enzymes that remove acetyl groups from histones or other substrate proteins. These enzymes are known in the art.
According to the present invention, the expression "protein" relates to both human and other proteins of this family. The expression especially includes functionally active derivatives thereof, or functionally active fragments thereof, or homologues thereof, or variants encoded by a nucleic acid that hybridizes to the nucleic acid encoding said protein under low stringency conditions. Preferably, these low stringency conditions include hybridization in a buffer comprising 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% BSA, 100 μg/ml denatured salmon sperm DNA, and 10% (wt/vol) dextran sulfate for 18-20 hours at 40°C, washing in a buffer consisting of 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS for 1-5 hours at 55°C, and washing in a buffer consisting of 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS for 1.5 hours at 60°C.
Moreover, according to the present invention, the expression "protein" includes mutant forms of said protein.
The methods of the present invention can be performed with any protein preparation as a starting material, as long as the respective protein is present in the preparation. Examples include a liquid mixture of several proteins, a cell lysate, a partial cell lysate which contains not all proteins present in the original cell (for example a nuclear extract) or a combination of several cell lysates. The term "protein preparation" also includes dissolved purified protein. Preferably, said protein is endogenously produced by said cell.
In the context of the present invention, the term "endogenously" means that the respective cell expresses said protein without being transfected with a protein-encoding nucleic acid. This ensures that the protein is present, as much as possible, in its natural environment, especially it is contained in a naturally occurring protein complex as discussed above.
In another aspect of the invention, aliquots of a cell preparation are provided as the starting material. In the context of the present invention, the term "cell preparation" refers to any preparation containing at least one cell with the desired properties. Suitable cell preparations are described below.
The presence of the protein in a protein preparation of interest can be detected on Western blots probed with antibodies that are specifically directed against said protein. Alternatively, also mass spectrometry (MS) could be used to detect the protein (see below).
Cell lysates or partial cell lysates can be obtained by isolating cell organelles (e.g. nucleus, mitochondria, ribosomes, golgi etc.) first and then preparing protein preparations derived from these organelles. Methods for the isolation of cell organelles are known in the art (Chapter 4.2 Purification of Organelles from Mammalian Cells in "Current Protocols in Protein Science", Editors: John.E. Coligan, Ben M. Dunn, Hidde L. Ploegh, David W. Speicher, Paul T. Wingfield; Wiley, ISBN: 0-471 -14098-8).
In addition, protein preparations can be prepared by fractionation of cell extracts thereby enriching specific types of proteins such as cytoplasmic, nuclear or membrane proteins (Chapter 4.3 Subcellular Fractionation of Tissue Culture Cells in "Current Protocols in Protein Science", Editors: John.E. Coligan, Ben M. Dunn, Hidde L. Ploegh, David W. Speicher, Paul T. Wingfield; Wiley, ISBN: 0-471-14098-8). Methods for the preparation of nuclear extracts are known in the art (Dignam et al., 1983. Nucleic Acids Res. 1 1 (5): 1475- 1489). In a preferred embodiment of the methods of the invention, the provision of a protein preparation includes the steps of harvesting at least one cell containing the protein and lysing the cell. In a preferred embodiment, suitable cells for this purpose as well as for the cell preparations used as the starting material in one aspect of the present invention are those cells or tissues where the protein is endogenously expressed.
In a preferred embodiment, cells isolated from peripheral blood represent a suitable biological material. Procedures for the preparation and culture of human lymphocytes and lymphocyte subpopulations obtained from peripheral blood (PBLs) are widely known (W.E Biddison, Chapter 2.2 "Preparation and culture of human lymphocytes" in Current Protocols in Cell Biology, 1998, John Wiley & Sons, Inc.). For example, density gradient centrifugation is a method for the separation of lymphocytes from other blood cell populations (e.g. erythrocytes and granulocytes). Human lymphocyte subpopulations can be isolated via their specific cell surface receptors which can be recognized by monoclonal antibodies. The physical separation method involves coupling of these antibody reagents to magnetic beads which allow the enrichment of cells that are bound by these antibodies (positive selection).
As an alternative to primary human cells, cultured cell lines (e.g. MOLT-4 cells, Jurkat, Ramos, HeLa, HL-60 or K-562 cells) can be used.
In a preferred embodiment, the cell is part of a cell culture system and methods for the harvest of a cell out of a cell culture system are known in the art (literature supra).
The choice of the cell will mainly depend on the expression of the protein, since it has to be ensured that the protein is principally present in the cell of choice. In order to determine whether a given cell is a suitable starting system for the methods of the invention, methods like Western blots, PCR-based nucleic acid detection methods, Northern blots and DNA microarray methods ("DNA chips") might be suitable in order to determine whether a given protein of interest is present in the cell. Furthermore, the protein preparation may be a preparation containing the protein which has been recombinantly produced. Methods for the production of recombinant proteins in prokaryotic and eukaryotic cells are widely established (Chapter 5 Production of Recombinant Proteins in "Current Protocols in Protein Science", Editors: John E. Coligan, Ben M. Dunn, Hidde L. Ploegh, David W. Speicher, Paul T. Wingfield; Wiley, 1995, ISBN: 0-471-14098-8).
The choice of the cell may also be influenced by the purpose of the study. If the in vivo efficacy for a given drug needs to be analyzed then cells or tissues may be selected in which the desired therapeutic effect occurs (e.g. B-cells). By contrast, for the elucidation of protein targets mediating unwanted side effects the cell or tissue may be analysed in which the side effect is observed (e.g. cardiomyocytes, vascular smooth muscle or epithelium cells). Furthermore, it is envisaged within the present invention that the cell containing the protein may be obtained from an organism, e.g. by biopsy. Corresponding methods are known in the art. For example, a biopsy is a diagnostic procedure used to obtain a small amount of tissue, which can then be examined microscopically or with biochemical methods. Biopsies are important to diagnose, classify and stage a disease, but also to evaluate and monitor drug treatment.
It is encompassed within the present invention that by the harvest of the at least one cell, the lysis is performed simultaneously. However, it is equally preferred that the cell is first harvested and then separately lysed.
Methods for the lysis of cells are known in the art (Karwa and Mitra: Sample preparation for the extraction, isolation, and purification of Nucleic Acids; chapter 8 in "Sample Preparation Techniques in Analytical Chemistry", Wiley 2003, Editor: Somenath Mitra, print ISBN: 0471328456; online ISBN: 0471457817). Lysis of different cell types and tissues can be achieved by homogenizers (e.g. Potter-homogenizer), ultrasonic disintegrators, enzymatic lysis, detergents (e.g. NP-40, Triton X-100, CHAPS, SDS), osmotic shock, repeated freezing and thawing, or a combination of these methods. According to the methods of the invention, the protein preparation containing the protein is contacted with a histone tail immobilized on a solid support thereby allowing the formation of a complex between the histone tail and the protein. In the context of the present invention, compounds are identified which interfere with the binding between the histone tail and the protein present in a cell or protein preparation.
Throughout the invention, the term "solid support" relates to every undissolved support being able to immobilize a small molecule ligand on its surface. The solid support may be selected from the group consisting of agarose, modified agarose, sepharose beads (e.g. NHS-activated sepharose), latex, cellulose, and ferro- or ferrimagnetic particles.
Methods and strategies for choosing appropriate solid supports and for coupling compounds to said solid supports are known in the art (Wong, Shan S. Chemistry of protein conjugation and cross-linking (1991), CRC Press, Inc. ISBN 0-8493-5886-8 Chapter 12: Conjugation of proteins to solid matrices, pages 295-318).
The histone tail may be coupled to the solid support either covalently or non-covalently. Non-covalent binding includes binding via biotin to streptavidin matrices, avidin matrices or neutravidin matrices. In this embodiment biotin is covalently conjugated to the histone tail and interacts non-covalently with streptavidin which is bound directly to the solid support.
Preferably, the biotin label and the histone tail peptide are separated by a linker (for example hexanoic acid and 1 to 5 amino acid residues). Preferably, these amino acid residues are lysine or glycine.
Preferably, the linker may be cleavable to facilitate the release of the proteins from the solid support. The cleavage may be achieved by enzymatic cleavage (e.g. TEV protease) or treatment with suitable chemical methods.
Preferably, the histone tail is non-covalently coupled to the solid support. The term "allowing the formation of a complex" includes all conditions under which such formation is possible. Conditions allowing the formation of said complexes are known in the art. The skilled person will know which conditions can be applied in order to enable the formation of said complex. This includes the possibility of having the solid support on an immobilized phase and pouring the lysate or protein preparation onto it. In another preferred embodiment, it is also included that the solid support is in a particulate form and mixed with the cell lysate. Such conditions are known to the person skilled in the art.
In a preferred embodiment, the steps of the formation of said complex are performed under essentially physiological conditions. The physical state of proteins within cells is described in Petty, 1998 (Howard R. Petty, Chapter 1, Unit 1.5 in: Juan S. Bonifacino, Mary Dasso, Joe B. Harford, Jennifer Lippincott-Schwartz, and Kenneth M. Yamada (eds.) Current Protocols in Cell Biology. Copyright © 2003 John Wiley & Sons, Inc. All rights reserved. DOI: 10.1002/0471143030.cb0101s00. Online Posting Date: May, 2001. Print Publication Date: October, 1998).
The contacting under essentially physiological conditions has the advantage that the interactions between the histone tail, the protein and optionally the compound reflect as much as possible the natural conditions. "Essentially physiological conditions" are inter alia those conditions which are present in the original, unprocessed sample material. They include the physiological protein concentration, pH, salt concentration, buffer capacity and post-translational modifications of the proteins involved. The term "essentially physiological conditions" does not require conditions identical to those in the original living organism, wherefrom the sample is derived, but essentially cell-like conditions or conditions close to cellular conditions. The person skilled in the art will, of course, realize that certain constraints may arise due to the experimental set-up which will eventually lead to less cell-like conditions. For example, the eventually necessary disruption of cell walls or cell membranes when taking and processing a sample from a living organism may require conditions which are not identical to the physiological conditions found in the organism. Suitable variations of physiological conditions for practicing the methods of the invention will be apparent to those skilled in the art and are encompassed by the term "essentially physiological conditions" as used herein. In summary, it is to be understood that the term "essentially physiological conditions" relates to conditions close to physiological conditions, as e. g. found in natural cells, but does not necessarily require that these conditions are identical.
For example, "essentially physiological conditions" may comprise 50-200 mM NaCl or KCl, pH 6.5-8.5, 20-37°C, and 0.001-10 mM divalent cation (e.g. Mg++, Ca++); more preferably about 150 mM NaCl or KCl, pH 7.2 to 7.6, 5 mM divalent cation and often include 0.01-1.0 percent non-specific protein (e.g. BSA). A non-ionic detergent (Tween, NP-40, Triton X-100) can often be present, usually at about 0.001 to 2%, typically 0.05-0.2% (volume/volume). For general guidance, the following buffered aqueous conditions may be applicable: 10-250 mM NaCl, 5-50 mM Tris-HCl, pH 5-8, with optional addition of divalent cation(s) and/or metal chelators and/or non-ionic detergents.
Preferably, "essentially physiological conditions" mean a pH of from 6.5 to 7.5, preferably from 7.0 to 7.5, and / or a buffer concentration of from 10 to 50 mM, preferably from 25 to 50 mM, and / or a concentration of monovalent salts (e.g. Na or K) of from 120 to 170 mM, preferably 150 mM. Divalent salts (e.g. Mg or Ca) may further be present at a concentration of from 1 to 5 mM, preferably 1 to 2 mM, wherein more preferably the buffer is selected from the group consisting of Tris-HCl or HEPES. The skilled person will appreciate that between the individual steps of the methods of the invention, washing steps may be necessary. Such washing is part of the knowledge of the person skilled in the art. The washing serves to remove non-bound components of the cell lysate from the solid support. Nonspecific (e.g. simple ionic) binding interactions can be minimized by adding low levels of detergent or by moderate adjustments to salt concentrations in the wash buffer.
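The preferred ranges stated above can be checked mechanically when a binding or wash buffer is being planned. The following is a minimal sketch only: the `Buffer` class, the field names and the `out_of_range` function are hypothetical names introduced for illustration, and the sketch adopts the strict reading that all of the "and / or" conditions are applied together.

```python
from dataclasses import dataclass

# Preferred ranges quoted in the text (pH, buffer concentration, monovalent salt).
# The dictionary layout and key names are illustrative assumptions.
PREFERRED_RANGES = {
    "ph": (6.5, 7.5),
    "buffer_mm": (10.0, 50.0),
    "monovalent_salt_mm": (120.0, 170.0),
}
# Divalent salts are optional in the text, so they are only checked when present.
DIVALENT_RANGE_MM = (1.0, 5.0)


@dataclass
class Buffer:
    ph: float
    buffer_mm: float
    monovalent_salt_mm: float
    divalent_salt_mm: float = 0.0  # 0 means no divalent salt was added


def out_of_range(buffer: Buffer) -> list[str]:
    """Return the names of all parameters that fall outside the preferred ranges."""
    problems = []
    for name, (low, high) in PREFERRED_RANGES.items():
        if not low <= getattr(buffer, name) <= high:
            problems.append(name)
    if buffer.divalent_salt_mm > 0:
        low, high = DIVALENT_RANGE_MM
        if not low <= buffer.divalent_salt_mm <= high:
            problems.append("divalent_salt_mm")
    return problems
```

For instance, a buffer at pH 7.4 with 50 mM buffer substance, 150 mM monovalent salt and 1.5 mM divalent salt yields an empty list, whereas a buffer at pH 8.0 with 300 mM salt is flagged for both parameters.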
Methods for determining whether a binding between a ligand (i.e. a histone tail in the context of the present invention) and a protein or protein complex has occurred are known in the art. These methods include in situ methods where the binding is assessed without separating the protein or protein complex from the ligand. For example, anti-protein antibodies can be used in combination with the ALPHAScreen technology where the excitation of a donor bead at 680 nm produces singlet oxygen which can diffuse to an acceptor bead undergoing a chemiluminescent reaction (Glickman et al., 2002. J. Biomol. Screen. 7(1):3-10). In a preferred embodiment of some aspects according to the invention, the binding between the histone tail and the protein is determined by separating bound protein from the histone tail and subsequent determination of the protein. This subsequent determination of the protein may either be the detection of the protein in the eluate or the determination of its amount.
In general, in the methods of the invention, a binding between the histone tail and the protein preferably indicates that the compound does not completely inhibit the binding. On the other hand, if no binding takes place in the presence of the compound, the compound is presumably a strong interactor with the protein, which is indicative for its therapeutic potential. In case that the amount is determined, the less protein can be detected in the eluate, the stronger the respective compound interacts with the protein, which is indicative for its therapeutic potential.
Consequently, in a preferred embodiment of the methods of the invention, a reduced binding observed for the aliquot incubated with the compound in comparison to the aliquot not incubated with the compound indicates that the protein is a target of the compound. According to the invention, separating means every action which destroys the interactions between the histone tail and the protein. This includes in a preferred embodiment the elution of the protein from the histone tail. The elution can be achieved by using nonspecific reagents (ionic strength, pH value, detergents). Such non-specific methods for destroying the interaction are principally known in the art and depend on the nature of the ligand-enzyme interaction. Principally, change of ionic strength, the pH value, the temperature or incubation with detergents are suitable methods to dissociate the target enzymes from the immobilized compound. The application of an elution buffer can dissociate binding partners by extremes of pH value (high or low pH; e.g. lowering pH by using 0.1 M citrate, pH 2-3), change of ionic strength (e.g. high salt concentration using NaI, KI, MgCl2, or KCl), polarity reducing agents which disrupt hydrophobic interactions (e.g. dioxane or ethylene glycol), or denaturing agents (chaotropic salts or detergents such as sodium dodecyl sulfate, SDS; Review: Subramanian A., 2002, Immunoaffinity chromatography). In some cases, the solid support has preferably to be separated from the released material. The individual methods for this depend on the nature of the solid support and are known in the art. If the support material is contained within a column the released material can be collected as column flowthrough. In case the support material is mixed with the lysate components (so called batch procedure) an additional separation step such as gentle centrifugation may be necessary and the released material is collected as supernatant.
Alternatively magnetic beads can be used as solid support so that the beads can be eliminated from the sample by using a magnetic device.
Methods for the detection of proteins or for the determination of their amounts are known in the art and include physico-chemical methods such as protein sequencing (e.g. Edman degradation), analysis by mass spectrometry methods or immunodetection methods employing antibodies directed against the protein.
Throughout the invention, if an antibody is used in order to detect a respective protein, a specific antibody may be used (Wu and Olson, 2002. J. Clin. Invest. 109(10): 1327-1333). As indicated above, such antibodies are known in the art. Furthermore, the skilled person is aware of methods for producing the same.
Preferably, mass spectrometry or immunodetection methods are used in the context of the methods of the invention.
The identification of proteins with mass spectrometric analysis (mass spectrometry) is known in the art (Shevchenko et al., 1996, Analytical Chemistry 68: 850-858; Mann et al., 2001. Annual Review of Biochemistry 70, 437-473) and is further illustrated in the example section.
Preferably, the mass spectrometry analysis is performed in a quantitative manner, for example by using iTRAQ technology (isobaric tags for relative and absolute quantification) or cICAT (cleavable isotope-coded affinity tags) (Wu et al., 2006. J. Proteome Res. 5, 651-658). Equally preferred, the mass spectrometry analysis is performed in a quantitative manner, for example by using the TMT technology. The TMT reagents are a set of multiplexed, amine-specific, stable isotope reagents that can label peptides in up to six different biological samples enabling simultaneous identification and quantitation of peptides. The combined samples are analyzed with a nano-flow liquid chromatography system coupled online to a tandem mass spectrometer (LC-MS/MS) experiment followed by reporter ion quantitation in the MS/MS spectra (Ross et al., 2004. Mol. Cell. Proteomics 3(12):1154-1169; Dayon et al., 2008. Anal. Chem. 80(8):2921-2931; Thompson et al., 2003. Anal. Chem. 75(8):1895-1904).
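The reporter ion quantitation mentioned above can be illustrated by a small calculation. This is a sketch under stated assumptions rather than the procedure used in the examples: the function name `protein_ratios`, the channel labels, and the choice to sum reporter intensities over all peptides of a protein before forming ratios are assumptions made here for illustration.

```python
def protein_ratios(peptide_intensities: list[dict[str, float]],
                   reference_channel: str) -> dict[str, float]:
    """Relative protein abundance per labeling channel.

    peptide_intensities holds one dict per quantified peptide of the same
    protein, mapping a channel label to its reporter ion intensity.
    Summing intensities per channel is one simple aggregation choice
    (an assumption of this sketch, not a statement of the source).
    """
    totals: dict[str, float] = {}
    for peptide in peptide_intensities:
        for channel, intensity in peptide.items():
            totals[channel] = totals.get(channel, 0.0) + intensity

    reference_total = totals.get(reference_channel, 0.0)
    if reference_total <= 0:
        raise ValueError("reference channel has no signal")

    # Each channel is expressed relative to the reference sample.
    return {channel: total / reference_total for channel, total in totals.items()}
```

With the reference channel carrying the sample without compound, a ratio of 0.5 in another channel would mean that half as much protein was recovered in that sample.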
According to a further preferred embodiment of the present invention, the characterization by mass spectrometry (MS) is performed by the identification of proteotypic peptides of the protein. The idea is that the protein is digested with proteases and the resulting peptides are determined by MS. As a result, peptide frequencies for peptides from the same source protein differ by a great degree, the most frequently observed peptides that "typically" contribute to the identification of this protein being termed "proteotypic peptide". Therefore, a proteotypic peptide as used in the present invention is an experimentally well observable peptide that uniquely identifies a specific protein or protein isoform. According to a preferred embodiment, the characterization is performed by comparing the proteotypic peptides obtained in the course of practicing the methods of the invention with known proteotypic peptides. Since, when using fragments prepared by protease digestion for the identification of a protein in MS, usually the same proteotypic peptides are observed for a given protein, it is possible to compare the proteotypic peptides obtained for a given sample with the proteotypic peptides already known for the protein.
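The comparison of observed peptides with already known proteotypic peptides amounts to a set lookup. The following minimal sketch assumes a hypothetical in-memory table of known proteotypic peptides; the function name, the protein names and the peptide strings are invented placeholders and not real sequences or data.

```python
def match_proteins(observed: set[str],
                   known: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each protein to the observed peptides that are among its
    known proteotypic peptides; proteins without a hit are omitted."""
    matches = {}
    for protein, proteotypic_peptides in known.items():
        hits = observed & proteotypic_peptides
        if hits:
            matches[protein] = hits
    return matches


# Placeholder table: protein name -> known proteotypic peptides (toy strings).
known_peptides = {
    "protein_A": {"PEPTIDEK", "SAMPLER"},
    "protein_B": {"ANOTHERK"},
}
observed_peptides = {"SAMPLER", "UNKNOWNK"}
result = match_proteins(observed_peptides, known_peptides)
```

In this toy case only `protein_A` is reported, because one of its known peptides was observed while none of the peptides of `protein_B` was.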
Accordingly, in a preferred embodiment of the methods of the present invention, the characterization of the protein, or the detection of the protein, or the determination of the amount of the protein is carried out by quantitative mass spectrometry.
Suitable immunodetection methods include but are not limited to Western blots, ELISA assays, sandwich ELISA assays and antibody arrays or a combination thereof. The establishment of such assays is known in the art (Chapter 11, Immunology, pages 11-1 to 11-30 in: Short Protocols in Molecular Biology. Fourth Edition, Edited by F.M. Ausubel et al., Wiley, New York, 1999).
These assays can not only be configured in a way to detect and quantify the protein of interest, but also to analyse posttranslational modification patterns of the protein such as phosphorylation, acetylation, methylation, ubiquitination or sumoylation.
As detailed above, the identification methods of the invention involve the use of compounds which are tested for their ability to be a compound interacting with the protein.
Principally, according to the present invention, such a compound can be every molecule which is able to interact with the protein and to modulate its binding to the histone tail. Preferably, the compound is able to inhibit partially or completely the binding of the protein to the histone tail. Equally preferred, the compound is able to enhance the binding of the protein to the histone tail leading to a stabilization of the histone tail - protein complex.
Preferably, said compound is selected from the group consisting of synthetic or naturally occurring chemical compounds or organic synthetic drugs, more preferably small molecule organic drugs or natural small molecule compounds. Preferably, said compound is identified starting from a library containing such compounds. Then, in the course of the present invention, such a library is screened.
Such small molecules are preferably not proteins or nucleic acids (e.g. siRNA or morpholino oligonucleotide). Preferably, small molecules exhibit a molecular weight of less than 1000 Da, more preferred less than 750 Da, most preferred less than 500 Da.
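The three molecular weight limits can be applied as a simple filter when a compound collection is being annotated. The sketch below is illustrative only; the function name `size_tier` and the tier labels are assumptions introduced here.

```python
from typing import Optional

# Upper molecular weight bounds (Da) quoted in the text, strictest first.
# The tier labels are illustrative names, not terms defined by the source.
TIERS = [(500.0, "most preferred"), (750.0, "more preferred"), (1000.0, "preferred")]


def size_tier(molecular_weight_da: float) -> Optional[str]:
    """Return the strictest tier whose upper bound the compound stays below,
    or None if the compound is 1000 Da or heavier."""
    for upper_bound, label in TIERS:
        if molecular_weight_da < upper_bound:
            return label
    return None
```

Because the limits are stated as "less than", a compound of exactly 500 Da falls into the next tier in this sketch.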
Preferably, the compound of the present invention is not a nucleic acid or a protein. More preferably, the compound of the present invention is not a peptide. A "library" according to the present invention relates to a (mostly large) collection of (numerous) different chemical entities that are provided in a sorted manner that enables both a fast functional analysis (screening) of the different individual entities, and at the same time provide for a rapid identification of the individual entities that form the library. Examples are collections of tubes or wells or spots on surfaces that contain chemical compounds that can be added into reactions with one or more defined potentially interacting partners in a high-throughput fashion. After the identification of a desired "positive" interaction of both partners, the respective compound can be rapidly identified due to the library construction. Libraries of synthetic and natural origins can either be purchased or designed by the skilled artisan.
Examples of the construction of libraries are provided in, for example, Breinbauer R, Manger M, Scheck M, Waldmann H. Natural product guided compound library development. Curr. Med. Chem. 2002; 9(23):2129-2145, wherein natural products are described that are biologically validated starting points for the design of combinatorial libraries, as they have a proven record of biological relevance. This special role of natural products in medicinal chemistry and chemical biology can be interpreted in the light of new insights about the domain architecture of proteins gained by structural biology and bioinformatics. In order to fulfill the specific requirements of the individual binding pocket within a domain family it may be necessary to optimise the natural product structure by chemical variation. Solid-phase chemistry is said to become an efficient tool for this optimisation process, and recent advances in this field are highlighted in this review article. The current drug discovery processes in many pharmaceutical companies require large and growing collections of high quality lead structures for use in high throughput screening assays. Collections of small molecules with diverse structures and "drug-like" properties have, in the past, been acquired by several means: by archive of previous internal lead optimisation efforts, by purchase from compound vendors, and by union of separate collections following company mergers. Although high throughput/combinatorial chemistry is described as being an important component in the process of new lead generation, the selection of library designs for synthesis and the subsequent design of library members has evolved to a new level of challenge and importance. The potential benefits of screening multiple small molecule compound library designs against multiple biological targets offers substantial opportunity to discover new lead structures. 
In a preferred embodiment of the methods of the invention, the protein preparation is first incubated with the compound and then contacted with the histone tail. However, the simultaneous incubation is equally preferred (competitive binding assay). In case that the incubation with the compound is first, the protein-containing protein preparation is preferably first incubated with the compound for 10 to 60 minutes, more preferred 30 to 45 minutes at a temperature of 4°C to 37°C, more preferred 4°C to 25°C, most preferred 4°C. Preferably compounds are used at concentrations ranging from 1 nM to 100 μM, preferably from 10 nM to 10 μM. The second step, contacting with the ligand, is preferably performed for 10 to 60 minutes at 4°C.
In case of simultaneous incubation, the protein-containing protein preparation is preferably simultaneously incubated with the compound and the histone tail for 30 to 120 minutes, more preferred 60 to 120 minutes at a temperature of 4°C to 37°C, more preferred 4°C to 25°C, most preferred 4°C. Preferably compounds are used at concentrations ranging from 1 nM to 100 μM, preferably from 10 nM to 10 μM.
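Where a compound is to be tested at several concentrations within the stated range, a logarithmically spaced series is a common choice. The text does not prescribe a dilution scheme, so the following is a sketch under that assumption; the function name and its parameters are hypothetical.

```python
import math


def concentration_series_nm(low_nm: float = 1.0,
                            high_nm: float = 100_000.0,
                            points_per_decade: int = 1) -> list[float]:
    """Log-spaced concentrations in nM from low_nm to high_nm inclusive.

    The defaults span 1 nM to 100 uM (100,000 nM), the broad range given
    in the text; log spacing itself is an assumption of this sketch.
    """
    if low_nm <= 0 or high_nm <= low_nm or points_per_decade < 1:
        raise ValueError("expected 0 < low_nm < high_nm and points_per_decade >= 1")
    steps = round(math.log10(high_nm / low_nm) * points_per_decade)
    return [low_nm * 10 ** (i / points_per_decade) for i in range(steps + 1)]
```

Calling `concentration_series_nm(10.0, 10_000.0, 2)` would cover the narrower preferred range of 10 nM to 10 μM with two points per decade.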
Furthermore, the methods of the invention may be performed with several protein preparations in order to test different compounds. This embodiment is especially interesting in the context of medium or high throughput screenings.
Preferably, the identification methods of the invention are performed as a medium or high throughput screening. In this context, medium throughput screening may refer to multiple tests performed in parallel, preferably in a 96-well format, which means that 2 to 96 tests are performed in parallel. High throughput screening may refer to multiple tests performed in parallel, preferably in a 384 or 1536 well format, which means that 2 to 384 or 2 to 1536 tests are performed in parallel.
The interaction compound identified according to the present invention may be further characterized by determining whether it has an effect on the protein, for example on its enzymatic activity (Khan et al., 2008. Biochem. J. 409(2):581-9; Blackwell et al., 2008. Life Sci. 82(21-22):1050-1058).
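When tests are laid out in the plate formats named above, each test needs a well position. The sketch below assumes the customary geometries of 8 x 12, 16 x 24 and 32 x 48 wells and a row-by-row fill order; the function names and the fill order are assumptions made here, not requirements of the methods.

```python
# Rows x columns for the plate formats named in the text (customary geometries).
PLATE_LAYOUTS = {96: (8, 12), 384: (16, 24), 1536: (32, 48)}
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def row_label(row: int) -> str:
    """Rows 0-25 are A-Z; rows 26-31 (1536-well plates) are AA-AF."""
    if row < 26:
        return LETTERS[row]
    return "A" + LETTERS[row - 26]


def well_name(index: int, plate_format: int) -> str:
    """Map a zero-based test index to a well name, filling row by row."""
    rows, columns = PLATE_LAYOUTS[plate_format]
    if not 0 <= index < rows * columns:
        raise ValueError("test index does not fit on this plate format")
    row, column = divmod(index, columns)
    return f"{row_label(row)}{column + 1}"
```

Under these assumptions the first test on any plate is placed in well A1 and the last test on a 96-well plate in well H12.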
The compounds identified according to the present invention may further be optimized (lead optimisation). This subsequent optimisation of such compounds is often accelerated because of the structure-activity relationship (SAR) information encoded in these lead generation libraries. Lead optimisation is often facilitated due to the ready applicability of high-throughput chemistry (HTC) methods for follow-up synthesis. An example for lead optimization of HDAC inhibitors was reported (Remiszewski et al., 2003. J. Med. Chem. 46(21):4609-4624).
The invention further relates to a method for the preparation of a pharmaceutical composition comprising the steps of a) identifying a protein interacting compound as described above, and b) formulating the interacting compound to a pharmaceutical composition.
Methods for the formulation of identified compounds are known in the art. Furthermore, it is known in the art how to administer such pharmaceutical compositions.
The obtained pharmaceutical composition can be used for the prevention or treatment of diseases where the respective protein plays a role.
In a further aspect, the invention relates to a method for the identification of a compound being capable of interacting with a given protein capable of interacting with a given histone tail, comprising the steps of a) providing a protein preparation containing said protein, b) contacting the protein preparation with said histone tail being immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, c) incubating the complex with a given compound, and d) determining whether the compound is able to separate the protein from the immobilized histone tail.
In a further aspect, the present invention relates to a method for the identification of a compound being capable of interacting with a given protein capable of interacting with a given histone tail, comprising the steps of a) providing a protein preparation containing said protein, b) contacting the protein preparation with said histone tail being immobilized on a solid support and with a given compound under conditions allowing the formation of a complex between the histone tail and the protein, and c) detecting the complex formed in step b).
In a further aspect, the present invention relates to a method for the identification of a compound being capable of interacting with a given protein capable of interacting with a given histone tail, comprising the steps of: a) providing two aliquots of a protein preparation containing said protein, b) contacting one aliquot with said histone tail being immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, c) contacting the other aliquot with said histone tail immobilized on a solid support and with a given compound under conditions allowing the formation of a complex between the histone tail and the protein, and d) determining the amount of the complex formed in steps b) and c).
In a further aspect, the present invention relates to a method for the identification of a compound being capable of interacting with a given protein capable of interacting with a given histone tail, comprising the steps of: a) providing two aliquots comprising each at least one cell containing said protein, b) incubating one aliquot with a given compound, c) harvesting the cells of each aliquot, d) lysing the cells in order to obtain protein preparations, e) contacting the protein preparations with said histone tail being immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, and f) determining the amount of the complex formed in each aliquot in step e).
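In the two-aliquot methods, the amounts of complex determined with and without the compound are compared, and reduced binding in the presence of the compound indicates an interaction. A minimal sketch of that comparison follows; the function names are hypothetical, and the amounts may be in any unit as long as both aliquots are measured in the same way.

```python
def residual_binding(amount_with_compound: float,
                     amount_without_compound: float) -> float:
    """Fraction of complex remaining in the compound-treated aliquot
    relative to the untreated aliquot (1.0 means no effect)."""
    if amount_without_compound <= 0:
        raise ValueError("the untreated aliquot must contain detectable complex")
    return amount_with_compound / amount_without_compound


def percent_inhibition(amount_with_compound: float,
                       amount_without_compound: float) -> float:
    """Reduction of binding in percent; negative values would indicate
    enhanced binding, i.e. a stabilized histone tail - protein complex."""
    return 100.0 * (1.0 - residual_binding(amount_with_compound,
                                           amount_without_compound))
```

For example, if the compound-treated aliquot yields 25 units of complex and the untreated aliquot 100 units, the residual binding is 0.25 and the inhibition is 75 percent.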
According to the invention, the given histone tail may have the sequence A[Rme]TKQTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 1.
According to the invention, the given histone tail may have the sequence A[Rme2a][pT]KQTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 2.
According to the invention, the given histone tail may have the sequence A[Rme2a][pT][Kme3]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 3.
According to the invention, the given histone tail may have the sequence A[Rme2a]T[Kme3]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 4.
According to the invention, the given histone tail may have the sequence AR[pT][Kme3]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 5.
According to the invention, the given histone tail may have the sequence ART[Kme]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 6.
According to the invention, the given histone tail may have the sequence ART[Kme2]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 7.
According to the invention, the given histone tail may have the sequence ART[Kme3]QTARKSTGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 8.
According to the invention, the given histone tail may have the sequence ART[Kme3]QTAR[Kac]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 9.
According to the invention, the given histone tail may have the sequence ART[Kme3]QTAR[Kme3]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 10.
According to the invention, the given histone tail may have the sequence ARTKQTAR[Kac]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 11.
According to the invention, the given histone tail may have the sequence ARTKQTAR[Kac][pS]TGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 12.
According to the invention, the given histone tail may have the sequence ARTKQTAR[Kac][pS][pT]GGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 13.
According to the invention, the given histone tail may have the sequence ARTKQTAR[Kme]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 14.
According to the invention, the given histone tail may have the sequence ARTKQTAR[Kme2]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 15.
According to the invention, the given histone tail may have the sequence ARTKQTAR[Kme3]STGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 16.
According to the invention, the given histone tail may have the sequence ARTKQTAR[Kme3][pS]TGGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 17.
According to the invention, the given histone tail may have the sequence ARTKQTAR[Kme3][pS][pT]GGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 18.
According to the invention, the given histone tail may have the sequence ARTKQTAR[Kme3]S[pT]GGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 19.
According to the invention, the given histone tail may have the sequence ARTKQTARK[pS][pT]GGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 20.
According to the invention, the given histone tail may have the sequence ARTKQTARKS[pT]GGKAPRKQLA and the given protein may be selected from the proteins depicted in Table 21.
According to the invention, the given histone tail may have the sequence [pS]G[Rme2a]G[Kac]GGKGLGKGGAKRHRKV and the given protein may be selected from the proteins depicted in Table 22.
According to the invention, the given histone tail may have the sequence SG[Rme2a]G[Kac]GGKGLGKGGAKRHRKV and the given protein may be selected from the proteins depicted in Table 23.
According to the invention, the given histone tail may have the sequence SGRG[Kac]GGKGLGKGGAKRHRKV and the given protein may be selected from the proteins depicted in Table 24.
According to the invention, the given histone tail may have the sequence SGRG[Kac] GG [Kac] GLGKGG AKRHRKV and the given protein may be selected from the proteins depicted in Table 25. According to the invention, the given histone tail may have the sequence SGRG[Kac]GG[Kac]GLG[Kac]GG AKRHRKV and the given protein may be selected from the proteins depicted in Table 26.
According to the invention, the given histone tail may have the sequence SGRG[Kac]GG[Kac]GLG[Kac]GGA[Kac]RHRKV and the given protein may be selected from the proteins depicted in Table 27.
According to the invention, the given histone tail may have the sequence SGRG[Kac]GGKGLG[Kac]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 28.
According to the invention, the given histone tail may have the sequence SGRG[Kac]GGKGLGKGGA[Kac]RHRKV and the given protein may be selected from the proteins depicted in Table 29.
According to the invention, the given histone tail may have the sequence SGRGKGG[Kac]GLGKGGAKRHRKV and the given protein may be selected from the proteins depicted in Table 30.
According to the invention, the given histone tail may have the sequence SGRGKGG[Kac]GLG[Kac]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 31.
According to the invention, the given histone tail may have the sequence SGRGKGG[Kac]GLG[Kme3]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 32.
According to the invention, the given histone tail may have the sequence SGRGKGGKGLG[Kac]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 33.
According to the invention, the given histone tail may have the sequence SGRGKGGKGLG[Kme2]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 34.
According to the invention, the given histone tail may have the sequence SGRGKGGKGLG[Kme3]GGAKRHRKV and the given protein may be selected from the proteins depicted in Table 35.
In preferred embodiments, all embodiments described above for the methods of the invention in which a protein is identified in step a) also apply to the methods in which a specific histone tail and a specific protein are used.
In particular, the histone tail may be biotinylated as described above.
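The bracket notation used for the histone tail sequences above can be read mechanically: a bracketed token such as [Kme3], [Kac] or [Rme2a] marks a modified residue, and [pS] or [pT] marks a phosphorylated serine or threonine. The following is a minimal illustrative sketch in Python, not part of the claimed methods. The names `Residue`, `parse_tail` and `describe_modifications` are hypothetical, and the labels it emits (for example "K9me3") are only a convenience format assumed for this example.

```python
import re
from typing import List, NamedTuple, Optional


class Residue(NamedTuple):
    position: int                # 1-based position within the peptide
    amino_acid: str              # one-letter amino acid code
    modification: Optional[str]  # e.g. "me3", "ac", "me2a", "p", or None


# One token is either [pX] (phosphorylated residue), [Xsuffix] (other
# modification, e.g. [Kme3]) or a plain upper-case residue letter.
_TOKEN = re.compile(r"\[p([A-Z])\]|\[([A-Z])([a-z0-9]+)\]|([A-Z])")


def parse_tail(sequence: str) -> List[Residue]:
    """Split a histone tail string into residues with their modifications."""
    residues: List[Residue] = []
    index = 0
    while index < len(sequence):
        match = _TOKEN.match(sequence, index)
        if match is None:
            # Stray spaces or damaged brackets are rejected, not skipped.
            raise ValueError(
                f"unexpected character {sequence[index]!r} at offset {index}"
            )
        phospho, modified, suffix, plain = match.groups()
        if phospho:
            amino_acid, modification = phospho, "p"
        elif modified:
            amino_acid, modification = modified, suffix
        else:
            amino_acid, modification = plain, None
        residues.append(Residue(len(residues) + 1, amino_acid, modification))
        index = match.end()
    return residues


def describe_modifications(residues: List[Residue]) -> List[str]:
    """Return labels such as 'K9me3' for every modified residue."""
    return [
        f"{r.amino_acid}{r.position}{r.modification}"
        for r in residues
        if r.modification is not None
    ]


if __name__ == "__main__":
    tail = parse_tail("ARTKQTAR[Kme3]STGGKAPRKQLA")
    print(len(tail), describe_modifications(tail))
```

Because the parser raises an error on any character it does not recognise, it can also serve as a simple consistency check on transcribed sequences.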
The invention is further described by the following examples, which are intended to illustrate, but not to limit, the present invention. Where the term "affinity matrix" is used in the following examples, it refers to the immobilized ligand (histone tail) as defined in the present application.
Tables for H3 histone tails (Tables 1 to 21)
Table 1: Proteins interacting with histone tail A[Rme]TKQTARKSTGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00296724.2 KLHL36 kelch-like 36 (Drosophila)
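Each row of the tables in this section consists of an accession number, a protein name and a free-text protein description. As a hedged illustration only, the sketch below shows one way such rows could be read into records and compared between tables. The names `Interactor`, `parse_row` and `shared_names` are hypothetical, and the accession pattern (the letters IPI, eight digits, a dot and a version number) is an assumption based on how the accessions appear in these tables.

```python
import re
from typing import Iterable, List, NamedTuple


class Interactor(NamedTuple):
    accession: str    # e.g. "IPI00296724.2"
    name: str         # protein name / gene symbol, e.g. "KLHL36"
    description: str  # free-text protein description


# Assumed row layout: accession, protein name, then the description.
_ROW = re.compile(r"^(IPI\d{8}\.\d+)\s+(\S+)\s+(.+)$")


def parse_row(line: str) -> Interactor:
    """Parse one table row; raise ValueError if it does not fit the layout."""
    match = _ROW.match(line.strip())
    if match is None:
        raise ValueError(f"not a recognisable table row: {line!r}")
    return Interactor(*match.groups())


def shared_names(first: Iterable[Interactor],
                 second: Iterable[Interactor]) -> List[str]:
    """Return the protein names that occur in both tables, sorted."""
    return sorted({r.name for r in first} & {r.name for r in second})


if __name__ == "__main__":
    row = parse_row("IPI00296724.2 KLHL36 kelch-like 36 (Drosophila)")
    print(row.accession, row.name, row.description)
```

A row whose accession does not match the assumed pattern is rejected, which makes transcription damage in an accession number easy to spot.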
Table 2: Proteins interacting with histone tail A[Rme2a][pT]KQTARKSTGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00644079.2 HNRNPU heterogeneous nuclear ribonucleoprotein U (scaffold attachment factor A)
IPI00221012.7 USP9X ubiquitin specific peptidase 9, X-linked
IPI00304692.1 RBMX RNA binding motif protein, X-linked
IPI00012340.1 SFRS9 serine/arginine-rich splicing factor 9
IPI00844578.1 DHX9 DEAH (Asp-Glu-Ala-His) box polypeptide 9
Table 3: Proteins interacting with histone tail A[Rme2a][pT][Kme3]QTARKSTGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00384109.4 BPTF bromodomain PHD finger transcription factor
IPI00018140.3 SYNCRIP synaptotagmin binding, cytoplasmic RNA interacting protein
IPI00298731.2 PPP1R10 protein phosphatase 1, regulatory (inhibitor) subunit 10
IPI00873762.1 TAF15 TAF15 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 68kDa
IPI00061523.4 THAP11 THAP domain containing 11
IPI00027988.1 CTCF CCCTC-binding factor (zinc finger protein)
IPI00015953.3 DDX21 DEAD (Asp-Glu-Ala-Asp) box polypeptide 21
IPI00159969.3 REST RE1-silencing transcription factor
IPI00479191.2 HNRNPH1 heterogeneous nuclear ribonucleoprotein H1 (H)
IPI00878252.1 GTF3C2 general transcription factor IIIC, polypeptide 2, beta 110kDa
IPI00023407.4 NCKAP1L NCK-associated protein 1-like
IPI00903062.1 CYFIP1 cytoplasmic FMR1 interacting protein 1
IPI00472164.2 WASF2 WAS protein family, member 2
IPI00789699.2 CYFIP2 cytoplasmic FMR1 interacting protein 2
IPI00480028.2 ABI1 abl-interactor 1
IPI00183208.3 FBXO22 F-box protein 22
IPI00436705.7 MORC3 MORC family CW-type zinc finger 3
IPI00000001.2 STAU1 staufen, RNA binding protein, homolog 1 (Drosophila)
IPI00169430.2 STRBP spermatid perinuclear RNA binding protein
IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
IPI00784044.1 MCCC2 methylcrotonoyl-CoA carboxylase 2 (beta)
IPI00107113.3 UTP14A UTP14, U3 small nucleolar ribonucleoprotein, homolog A (yeast)
IPI00844578.1 DHX9 DEAH (Asp-Glu-Ala-His) box polypeptide 9
IPI00304692.1 RBMX RNA binding motif protein, X-linked
IPI00019812.1 PPP5C protein phosphatase 5, catalytic subunit
IPI00384333.2 TCF12 transcription factor 12
IPI00418313.3 ILF3 interleukin enhancer binding factor 3, 90kDa
IPI00000686.2 RBM19 RNA binding motif protein 19
IPI00184376.4 C9ORF126 suppressor of cancer cell invasion
IPI00455982.1 HMGXB4 HMG box domain containing 4
IPI00304187.8 RBM28 RNA binding motif protein 28
IPI00031801.4 CSDA cold shock domain protein A
IPI00784224.1 ZFR zinc finger RNA binding protein
IPI00031679.2 C3ORF26 chromosome 3 open reading frame 26
IPI00398625.5 HRNR hornerin
IPI00154975.3 DNAJC9 DnaJ (Hsp40) homolog, subfamily C, member 9
IPI00396015.5 ACACA acetyl-CoA carboxylase alpha
IPI00018906.6 ZFP64 zinc finger protein 64 homolog (mouse)
IPI00019733.1 RAE1 RAE1 RNA export 1 homolog (S. pombe)
IPI00872107.1 ILF2 interleukin enhancer binding factor 2, 45kDa
IPI00292975.4 RBM27 RNA binding motif protein 27
IPI00844014.1 C9ORF114 chromosome 9 open reading frame 114
IPI00032533.3 WDR18 WD repeat domain 18
IPI00410264.2 WDR32 DDB1 and CUL4 associated factor 10
IPI00011274.3 HNRPDL heterogeneous nuclear ribonucleoprotein D-like
IPI00646361.3 NUP214 nucleoporin 214kDa
IPI00872597.1 ZNF326 zinc finger protein 326
IPI00217630.1 DHX37 DEAH (Asp-Glu-Ala-His) box polypeptide 37
IPI00640322.4 ZNF567 zinc finger protein 567
IPI00045051.3 PURB purine-rich element binding protein B
IPI00410067.1 ZC3HAV1 zinc finger CCCH-type, antiviral 1
IPI00012199.1 CCDC86 coiled-coil domain containing 86
IPI00328840.9 THOC4 THO complex 4
IPI00010200.3 YTHDC2 YTH domain containing 2
IPI00010746.1 PTDSS1 phosphatidylserine synthase 1
IPI00477181.1 RNF220 ring finger protein 220
IPI00000398.2 NUSAP1 nucleolar and spindle associated protein 1
IPI00008575.3 KHDRBS1 KH domain containing, RNA binding, signal transduction associated 1
IPI00007755.3 RAB21 RAB21, member RAS oncogene family
IPI00658000.2 IGF2BP3 insulin-like growth factor 2 mRNA binding protein 3
IPI00294891.5 NOP2 NOP2 nucleolar protein homolog (yeast)
IPI00031554.1 DDX50 DEAD (Asp-Glu-Ala-Asp) box polypeptide 50
[Table content reproduced only as an image in the original publication: imgf000038_0001]
Table 4: Proteins interacting with histone tail A[Rme2a]T[Kme3]QTARKSTGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00384109.4 BPTF bromodomain PHD finger transcription factor
IPI00018140.3 SYNCRIP synaptotagmin binding, cytoplasmic RNA interacting protein
IPI00027988.1 CTCF CCCTC-binding factor (zinc finger protein)
IPI00153032.1 LTV1 LTV1 homolog (S. cerevisiae)
IPI00292894.5 TSR1 TSR1, 20S rRNA accumulation, homolog (S. cerevisiae)
IPI00221091.9 RPS15A ribosomal protein S15a
IPI00436705.7 MORC3 MORC family CW-type zinc finger 3
IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
IPI00384333.2 TCF12 transcription factor 12
IPI00183208.3 FBXO22 F-box protein 22
IPI00000001.2 STAU1 staufen, RNA binding protein, homolog 1 (Drosophila)
IPI00019812.1 PPP5C protein phosphatase 5, catalytic subunit
IPI00304692.1 RBMX RNA binding motif protein, X-linked
IPI00169430.2 STRBP spermatid perinuclear RNA binding protein
IPI00107113.3 UTP14A UTP14, U3 small nucleolar ribonucleoprotein, homolog A (yeast)
IPI00418313.3 ILF3 interleukin enhancer binding factor 3, 90kDa
IPI00455982.1 HMGXB4 HMG box domain containing 4
IPI00398625.5 HRNR hornerin
IPI00844578.1 DHX9 DEAH (Asp-Glu-Ala-His) box polypeptide 9
IPI00032533.3 WDR18 WD repeat domain 18
IPI00031679.2 C3ORF26 chromosome 3 open reading frame 26
IPI00304187.8 RBM28 RNA binding motif protein 28
IPI00031801.4 CSDA cold shock domain protein A
IPI00794659.1 RPS20 ribosomal protein S20
IPI00019733.1 RAE1 RAE1 RNA export 1 homolog (S. pombe)
IPI00640322.4 ZNF567 zinc finger protein 567
IPI00872597.1 ZNF326 zinc finger protein 326
IPI00872107.1 ILF2 interleukin enhancer binding factor 2, 45kDa
IPI00000686.2 RBM19 RNA binding motif protein 19
IPI00784224.1 ZFR zinc finger RNA binding protein
IPI00477181.1 RNF220 ring finger protein 220
IPI00217030.10 RPS4X ribosomal protein S4, X-linked
IPI00215780.5 RPS19 ribosomal protein S19
IPI00154975.3 DNAJC9 DnaJ (Hsp40) homolog, subfamily C, member 9
IPI00012199.1 CCDC86 coiled-coil domain containing 86
IPI00018906.6 ZFP64 zinc finger protein 64 homolog (mouse)
IPI00221089.5 RPS13 ribosomal protein S13
IPI00844014.1 C9ORF114 chromosome 9 open reading frame 114
IPI00184376.4 C9ORF126 suppressor of cancer cell invasion
IPI00012750.3 RPS25 ribosomal protein S25
IPI00300222.6 BBX bobby sox homolog (Drosophila)
Table 5: Proteins interacting with histone tail AR[pT][Kme3]QTARKSTGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00384109.4 BPTF bromodomain PHD finger transcription factor
IPI00018140.3 SYNCRIP synaptotagmin binding, cytoplasmic RNA interacting protein
IPI00013174.2 RBM14 RNA binding motif protein 14
IPI00418797.4 POLR1B polymerase (RNA) I polypeptide B, 128kDa
IPI00298731.2 PPP1R10 protein phosphatase 1, regulatory (inhibitor) subunit 10
IPI00477313.3 HNRNPC heterogeneous nuclear ribonucleoprotein C (C1/C2)
IPI00004353.1 GTF2A2 general transcription factor IIA, 2, 12kDa
IPI00153032.1 LTV1 LTV1 homolog (S. cerevisiae)
IPI00292894.5 TSR1 TSR1, 20S rRNA accumulation, homolog (S. cerevisiae)
IPI00221091.9 RPS15A ribosomal protein S15a
IPI00436705.7 MORC3 MORC family CW-type zinc finger 3
IPI00106955.3 C11ORF84 chromosome 11 open reading frame 84
IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
IPI00300222.6 BBX bobby sox homolog (Drosophila)
IPI00000001.2 STAU1 staufen, RNA binding protein, homolog 1 (Drosophila)
IPI00455982.1 HMGXB4 HMG box domain containing 4
IPI00019812.1 PPP5C protein phosphatase 5, catalytic subunit
IPI00169430.2 STRBP spermatid perinuclear RNA binding protein
IPI00304692.1 RBMX RNA binding motif protein, X-linked
IPI00418313.3 ILF3 interleukin enhancer binding factor 3, 90kDa
IPI00107113.3 UTP14A UTP14, U3 small nucleolar ribonucleoprotein, homolog A (yeast)
IPI00844578.1 DHX9 DEAH (Asp-Glu-Ala-His) box polypeptide 9
IPI00031801.4 CSDA cold shock domain protein A
IPI00384333.2 TCF12 transcription factor 12
IPI00184376.4 C9ORF126 suppressor of cancer cell invasion
IPI00872107.1 ILF2 interleukin enhancer binding factor 2, 45kDa
IPI00005477.3 TWISTNB TWIST neighbor
IPI00012199.1 CCDC86 coiled-coil domain containing 86
IPI00872597.1 ZNF326 zinc finger protein 326
IPI00217030.10 RPS4X ribosomal protein S4, X-linked
IPI00018906.6 ZFP64 zinc finger protein 64 homolog (mouse)
IPI00215780.5 RPS19 ribosomal protein S19
IPI00640322.4 ZNF567 zinc finger protein 567
IPI00032533.3 WDR18 WD repeat domain 18
IPI00000686.2 RBM19 RNA binding motif protein 19
IPI00031679.2 C3ORF26 chromosome 3 open reading frame 26
IPI00784224.1 ZFR zinc finger RNA binding protein
IPI00221089.5 RPS13 ribosomal protein S13
IPI00477181.1 RNF220 ring finger protein 220
IPI00154975.3 DNAJC9 DnaJ (Hsp40) homolog, subfamily C, member 9
IPI00012750.3 RPS25 ribosomal protein S25
IPI00293078.1 DDX27 DEAD (Asp-Glu-Ala-Asp) box polypeptide 27
IPI00011268.2 RALY RNA binding protein, autoantigenic (hnRNP-associated with lethal yellow homolog (mouse))
Table 6: Proteins interacting with histone tail ART[Kme]QTARKSTGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00394657.1 KAT5 K(lysine) acetyltransferase 5
IPI00063434.2 PHF23 PHD finger protein 23
IPI00012788.1 CD3EAP CD3e molecule, epsilon associated protein
IPI00005477.3 TWISTNB TWIST neighbor
IPI00027462.1 S100A9 S100 calcium binding protein A9
IPI00219129.9 NQO2 NAD(P)H dehydrogenase, quinone 2
IPI00784265.1 NOP16 NOP16 nucleolar protein homolog (yeast)
IPI00436705.7 MORC3 MORC family CW-type zinc finger 3
IPI00418313.3 ILF3 interleukin enhancer binding factor 3, 90kDa
[Table content reproduced only as an image in the original publication: imgf000041_0001]
Table 8: Proteins interacting with histone tail ART[Kme3]QTARKSTGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00012788.1 CD3EAP CD3e molecule, epsilon associated protein
IPI00418797.4 POLR1B polymerase (RNA) 1 polypeptide B, 128kDa
IPI00106955.3 C11ORF84 chromosome 11 open reading frame 84
IPI00436705.7 MORC3 MORC family CW-type zinc finger 3
IPI00455982.1 HMGXB4 HMG box domain containing 4
IPI00784265.1 NOP16 NOP16 nucleolar protein homolog (yeast)
IPI00005477.3 TWISTNB TWIST neighbor
IPI00300222.6 BBX bobby sox homolog (Drosophila)
IPI00219129.9 NQO2 NAD(P)H dehydrogenase, quinone 2
IPI00179298.5 HUWE1 HECT, UBA and WWE domain containing 1
IPI00418313.3 ILF3 interleukin enhancer binding factor 3, 90kDa
IPI00644937.1 RHOXF2 Rhox homeobox family, member 2
IPI00018140.3 SYNCRIP synaptotagmin binding, cytoplasmic RNA interacting protein
IPI00844578.1 DHX9 DEAH (Asp-Glu-Ala-His) box polypeptide 9
IPI00304692.1 RBMX RNA binding motif protein, X-linked
[Table content reproduced only as an image in the original publication: imgf000042_0001]
Table 10: Proteins interacting with histone tail ART[Kme3]QTAR[Kme3]STGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00885104.1 MPHOSPH8 M-phase phosphoprotein 8
IPI00384109.4 BPTF bromodomain PHD finger transcription factor
IPI00890837.1 SMCHD1 structural maintenance of chromosomes flexible hinge domain containing 1
IPI00384109.4 BPTF bromodomain PHD finger transcription factor
IPI00641109.1 ZMYM3 zinc finger, MYM-type 3
IPI00013174.2 RBM14 RNA binding motif protein 14
IPI00018140.3 SYNCRIP synaptotagmin binding, cytoplasmic RNA interacting protein
IPI00004350.1 GTF2A1 general transcription factor IIA, 1, 19/37kDa
IPI00436705.7 MORC3 MORC family CW-type zinc finger 3
IPI00106955.3 C11ORF84 chromosome 11 open reading frame 84
IPI00848276.1 C10ORF18 chromosome 10 open reading frame 18
IPI00442098.1 PPP2R5C protein phosphatase 2, regulatory subunit B', gamma
IPI00790098.2 C3ORF63 chromosome 3 open reading frame 63
IPI00554737.3 PPP2R1A protein phosphatase 2, regulatory subunit A, alpha
IPI00455982.1 HMGXB4 HMG box domain containing 4
IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
IPI00300222.6 BBX bobby sox homolog (Drosophila)
IPI00008380.1 PPP2CA protein phosphatase 2, catalytic subunit, alpha isozyme
IPI00057035.3 TRMT61B tRNA methyltransferase 61 homolog B (S. cerevisiae)
IPI00640322.4 ZNF567 zinc finger protein 567
IPI00747327.3 ZNF195 zinc finger protein 195
IPI00645116.3 ZNF12 zinc finger protein 12
IPI00888777.1 ZNF853 zinc finger protein 853
IPI00031801.4 CSDA cold shock domain protein A
IPI00000001.2 STAU1 staufen, RNA binding protein, homolog 1 (Drosophila)
IPI00184376.4 C9ORF126 suppressor of cancer cell invasion
IPI00304692.1 RBMX RNA binding motif protein, X-linked
IPI00639924.1 C19ORF68 chromosome 19 open reading frame 68
IPI00169430.2 STRBP spermatid perinuclear RNA binding protein
IPI00418313.3 ILF3 interleukin enhancer binding factor 3, 90kDa
IPI00019812.1 PPP5C protein phosphatase 5, catalytic subunit
IPI00402209.3 ADNP2 ADNP homeobox 2
IPI00384333.2 TCF12 transcription factor 12
IPI00470596.3 ZNF638 zinc finger protein 638
IPI00397734.4 ZNF460 zinc finger protein 460
IPI00790342.1 RPL6 ribosomal protein L6
IPI00107113.3 UTP14A UTP14, U3 small nucleolar ribonucleoprotein, homolog A (yeast)
IPI00026087.1 BANF1 barrier to autointegration factor 1
Table 11: Proteins interacting with histone tail ARTKQTAR[Kac]STGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00106955.3 C11ORF84 chromosome 11 open reading frame 84
[Table content reproduced only as an image in the original publication: imgf000044_0001]
[Table content reproduced only as an image in the original publication: imgf000044_0002]
Table 14: Proteins interacting with histone tail ARTKQTAR[Kme]STGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00218326.1 SP100 SP100 nuclear antigen
IPI00303832.6 RTF1 Rtf1, Paf1/RNA polymerase II complex component, homolog (S. cerevisiae)
IPI00017763.6 NAP1L4 nucleosome assembly protein 1-like 4
IPI00006167.1 PPM1G protein phosphatase, Mg2+/Mn2+ dependent, 1G
IPI00018500.1 ARID3A AT rich interactive domain 3A (BRIGHT-like)
IPI00024719.1 HAT1 histone acetyltransferase 1
IPI00008422.5 SMARCAD1 SWI/SNF-related, matrix-associated actin-dependent regulator of chromatin, subfamily a, containing DEAD/H box 1
IPI00479789.2 C1ORF103 chromosome 1 open reading frame 103
IPI00442098.1 PPP2R5C protein phosphatase 2, regulatory subunit B', gamma
IPI00550815.1 PRR14 proline rich 14
IPI00554824.1 SGOL1 shugoshin-like 1 (S. pombe)
IPI00165393.1 ANP32E acidic (leucine-rich) nuclear phosphoprotein 32 family, member E
IPI00025849.1 ANP32A acidic (leucine-rich) nuclear phosphoprotein 32 family, member A
IPI00012795.3 EIF3I eukaryotic translation initiation factor 3, subunit I
IPI00290460.3 EIF3G eukaryotic translation initiation factor 3, subunit G
IPI00102069.5 EIF3M eukaryotic translation initiation factor 3, subunit M
IPI00013068.1 EIF3E eukaryotic translation initiation factor 3, subunit E
IPI00654777.2 EIF3F eukaryotic translation initiation factor 3, subunit F
IPI00033143.1 EIF3K eukaryotic translation initiation factor 3, subunit K
IPI00646839.1 EIF3C eukaryotic translation initiation factor 3, subunit C
IPI00029012.1 EIF3A eukaryotic translation initiation factor 3, subunit A
IPI00465233.1 EIF3L eukaryotic translation initiation factor 3, subunit L
IPI00647650.3 EIF3H eukaryotic translation initiation factor 3, subunit H
IPI00106955.3 C11ORF84 chromosome 11 open reading frame 84
IPI00329820.2 ACTL8 actin-like 8
Table 15: Proteins interacting with histone tail ARTKQTAR[Kme2]STGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00024719.1 HAT1 histone acetyltransferase 1
IPI00440502.3 BRD2 bromodomain containing 2
IPI00885104.1 MPHOSPH8 M-phase phosphoprotein 8
IPI00221172.2 UBR7 ubiquitin protein ligase E3 component n-recognin 7 (putative)
IPI00890837.1 SMCHD1 structural maintenance of chromosomes flexible hinge domain containing 1
IPI00006167.1 PPM1G protein phosphatase, Mg2+/Mn2+ dependent, 1G
IPI00303832.6 RTF1 Rtf1, Paf1/RNA polymerase II complex component, homolog (S. cerevisiae)
IPI00017763.6 NAP1L4 nucleosome assembly protein 1-like 4
IPI00903259.1 SUPT5H suppressor of Ty 5 homolog (S. cerevisiae)
IPI00018500.1 ARID3A AT rich interactive domain 3A (BRIGHT-like)
IPI00243221.2 NRD1 nardilysin (N-arginine dibasic convertase)
IPI00442098.1 PPP2R5C protein phosphatase 2, regulatory subunit B', gamma
IPI00025849.1 ANP32A acidic (leucine-rich) nuclear phosphoprotein 32 family, member A
IPI00165393.1 ANP32E acidic (leucine-rich) nuclear phosphoprotein 32 family, member E
IPI00550815.1 PRR14 proline rich 14
IPI00554824.1 SGOL1 shugoshin-like 1 (S. pombe)
IPI00554737.3 PPP2R1A protein phosphatase 2, regulatory subunit A, alpha
IPI00290460.3 EIF3G eukaryotic translation initiation factor 3, subunit G
IPI00012795.3 EIF3I eukaryotic translation initiation factor 3, subunit I
IPI00102069.5 EIF3M eukaryotic translation initiation factor 3, subunit M
IPI00013068.1 EIF3E eukaryotic translation initiation factor 3, subunit E
IPI00654777.2 EIF3F eukaryotic translation initiation factor 3, subunit F
IPI00465233.1 EIF3L eukaryotic translation initiation factor 3, subunit L
IPI00029012.1 EIF3A eukaryotic translation initiation factor 3, subunit A
IPI00646839.1 EIF3C eukaryotic translation initiation factor 3, subunit C
IPI00647650.3 EIF3H eukaryotic translation initiation factor 3, subunit H
IPI00014874.3 ZNF562 zinc finger protein 562
IPI00329820.2 ACTL8 actin-like 8
IPI00465054.2 THUMPD1 THUMP domain containing 1
Table 16: Proteins interacting with histone tail ARTKQTAR[Kme3]STGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00221172.2 UBR7 ubiquitin protein ligase E3 component n-recognin 7 (putative)
IPI00885104.1 MPHOSPH8 M-phase phosphoprotein 8
IPI00440502.3 BRD2 bromodomain containing 2
IPI00890837.1 SMCHD1 structural maintenance of chromosomes flexible hinge domain containing 1
IPI00017763.6 NAP1L4 nucleosome assembly protein 1-like 4
IPI00006167.1 PPM1G protein phosphatase, Mg2+/Mn2+ dependent, 1G
IPI00903259.1 SUPT5H suppressor of Ty 5 homolog (S. cerevisiae)
IPI00018500.1 ARID3A AT rich interactive domain 3A (BRIGHT-like)
IPI00303832.6 RTF1 Rtf1, Paf1/RNA polymerase II complex component, homolog (S. cerevisiae)
IPI00008422.5 SMARCAD1 SWI/SNF-related, matrix-associated actin-dependent regulator of chromatin, subfamily a, containing DEAD/H box 1
IPI00442098.1 PPP2R5C protein phosphatase 2, regulatory subunit B', gamma
IPI00554737.3 PPP2R1A protein phosphatase 2, regulatory subunit A, alpha
IPI00550815.1 PRR14 proline rich 14
IPI00554824.1 SGOL1 shugoshin-like 1 (S. pombe)
IPI00165393.1 ANP32E acidic (leucine-rich) nuclear phosphoprotein 32 family, member E
IPI00025849.1 ANP32A acidic (leucine-rich) nuclear phosphoprotein 32 family, member A
IPI00290460.3 EIF3G eukaryotic translation initiation factor 3, subunit G
IPI00012795.3 EIF3I eukaryotic translation initiation factor 3, subunit I
IPI00013068.1 EIF3E eukaryotic translation initiation factor 3, subunit E
IPI00102069.5 EIF3M eukaryotic translation initiation factor 3, subunit M
IPI00465233.1 EIF3L eukaryotic translation initiation factor 3, subunit L
IPI00654777.2 EIF3F eukaryotic translation initiation factor 3, subunit F
IPI00029012.1 EIF3A eukaryotic translation initiation factor 3, subunit A
IPI00033143.1 EIF3K eukaryotic translation initiation factor 3, subunit K
IPI00646839.1 EIF3C eukaryotic translation initiation factor 3, subunit C
IPI00106955.3 C11ORF84 chromosome 11 open reading frame 84
IPI00014874.3 ZNF562 zinc finger protein 562
IPI00465054.2 THUMPD1 THUMP domain containing 1
IPI00055954.4 WDR43 WD repeat domain 43
IPI00031030.1 APLP2 amyloid beta (A4) precursor-like protein 2
IPI00329820.2 ACTL8 actin-like 8
IPI00848276.1 C10ORF18 chromosome 10 open reading frame 18
IPI00790098.2 C3ORF63 chromosome 3 open reading frame 63
Table 17: Proteins interacting with histone tail ARTKQTAR[Kme3][pS]TGGKAPRKQLA
Accession number    Protein name    Protein description
IPI00885104.1 MPHOSPH8 M-phase phosphoprotein 8
IPI00787650.2 ATRX alpha thalassemia/mental retardation syndrome X-linked
IPI00025890.4 C12ORF30 N(alpha)-acetyltransferase 25, NatB auxiliary subunit
IPI00790098.2 C3ORF63 chromosome 3 open reading frame 63
IPI00410039.1 PPHLN1 periphilin 1
IPI00514336.1 BGLAP bone gamma-carboxyglutamate (gla) protein
[Table content reproduced only as an image in the original publication: imgf000047_0001]
Table 19: Proteins interacting with histone tail ARTKQTAR[Kme3]S[pT]GGKAPRKQLA
Accession number    Protein name    Protein description
IPI00221172.2 UBR7 ubiquitin protein ligase E3 component n-recognin 7 (putative)
IPI00885104.1 MPHOSPH8 M-phase phosphoprotein 8
IPI00023672.6 EZH1 enhancer of zeste homolog 1 (Drosophila)
IPI00890837.1 SMCHD1 structural maintenance of chromosomes flexible hinge domain containing 1
IPI00442098.1 PPP2R5C protein phosphatase 2, regulatory subunit B', gamma
IPI00554737.3 PPP2R1A protein phosphatase 2, regulatory subunit A, alpha
IPI00554824.1 SGOL1 shugoshin-like 1 (S. pombe)
IPI00008380.1 PPP2CA protein phosphatase 2, catalytic subunit, alpha isozyme
IPI00106955.3 C11ORF84 chromosome 11 open reading frame 84
IPI00848276.1 C10ORF18 chromosome 10 open reading frame 18
IPI00790098.2 C3ORF63 chromosome 3 open reading frame 63
IPI00410039.1 PPHLN1 periphilin 1
Table 20: Proteins interacting with histone tail ARTKQTARK[pS][pT]GGKAPRKQLA
Accession number    Protein name    Protein description
IPI00419373.1 HNRNPA3 heterogeneous nuclear ribonucleoprotein A3
IPI00644079.2 HNRNPU heterogeneous nuclear ribonucleoprotein U (scaffold attachment factor A)
IPI00334587.1 HNRNPAB heterogeneous nuclear ribonucleoprotein A/B
IPI00011913.1 HNRNPA0 heterogeneous nuclear ribonucleoprotein A0
IPI00874030.3 HNRNPA2B1 heterogeneous nuclear ribonucleoprotein A2/B1
IPI00644055.3 HNRNPR heterogeneous nuclear ribonucleoprotein R
IPI00398625.5 HRNR hornerin
IPI00784044.1 MCCC2 methylcrotonoyl-CoA carboxylase 2 (beta)
IPI00480028.2 ABI1 abl-interactor 1
IPI00397801.4 FLG2 filaggrin family member 2
IPI00908949.1 TARS threonyl-tRNA synthetase
IPI00304692.1 RBMX RNA binding motif protein, X-linked
IPI00106491.3 MRTO4 mRNA turnover 4 homolog (S. cerevisiae)
IPI00008529.1 RPLP2 ribosomal protein, large, P2
IPI00472164.2 WASF2 WAS protein family, member 2
IPI00410264.2 WDR32 DDB1 and CUL4 associated factor 10
IPI00008557.5 IGF2BP1 insulin-like growth factor 2 mRNA binding protein 1
IPI00639842.1 KRI1 KRI1 homolog (S. cerevisiae)
IPI00221089.5 RPS13 ribosomal protein S13
IPI00658000.2 IGF2BP3 insulin-like growth factor 2 mRNA binding protein 3
IPI00872107.1 ILF2 interleukin enhancer binding factor 2, 45kDa
Table 21: Proteins interacting with histone tail ARTKQTARKS[pT]GGKAPRKQLA
Accession number    Protein name    Protein description
IPI00719073.1 CHD8 chromodomain helicase DNA binding protein 8
IPI00025849.1 ANP32A acidic (leucine-rich) nuclear phosphoprotein 32 family, member A
IPI00303832.6 RTF1 Rtf1, Paf1/RNA polymerase II complex component, homolog (S. cerevisiae)
IPI00903259.1 SUPT5H suppressor of Ty 5 homolog (S. cerevisiae)
IPI00006167.1 PPM1G protein phosphatase, Mg2+/Mn2+ dependent, 1G
IPI00646839.1 EIF3C eukaryotic translation initiation factor 3, subunit C
IPI00102069.5 EIF3M eukaryotic translation initiation factor 3, subunit M
IPI00013068.1 EIF3E eukaryotic translation initiation factor 3, subunit E
IPI00029012.1 EIF3A eukaryotic translation initiation factor 3, subunit A
IPI00465233.1 EIF3L eukaryotic translation initiation factor 3, subunit L
IPI00647650.3 EIF3H eukaryotic translation initiation factor 3, subunit H
IPI00719752.2 EIF3B eukaryotic translation initiation factor 3, subunit B
IPI00012795.3 EIF3I eukaryotic translation initiation factor 3, subunit I
IPI00183002.6 PPP1R12A protein phosphatase 1, regulatory (inhibitor) subunit 12A
[Table content reproduced only as an image in the original publication: imgf000049_0001]
Tables for H4 histone tails (Tables 22 to 35)
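[Table content reproduced only as an image in the original publication: imgf000050_0001]
[Table content reproduced only as an image in the original publication: imgf000050_0002]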
Table 24: Proteins interacting with histone tail SGRG[Kac]GGKGLGKGGAKRHRKV
Accession number    Protein name    Protein description
IPI00019981.1 GTF2E2 general transcription factor IIE, polypeptide 2, beta 34kDa
IPI00017450.2 GTF2F1 general transcription factor IIF, polypeptide 1, 74kDa
IPI00876896.1 GTF2F2 general transcription factor IIF, polypeptide 2, 30kDa
IPI00010252.3 TRIM33 tripartite motif containing 33
IPI00218326.1 SP100 SP100 nuclear antigen
IPI00005666.1 JMJD2A lysine (K)-specific demethylase 4A
IPI00304875.2 HIRIP3 HIRA interacting protein 3
IPI00477535.6 ERCC5 excision repair cross-complementing rodent repair deficiency, complementation group 5
IPI00434623.1 MBD2 methyl-CpG binding domain protein 2
IPI00006715.3 RAD21 RAD21 homolog (S. pombe)
IPI00009958.6 COPS5 COP9 constitutive photomorphogenic homolog subunit 5 (Arabidopsis)
IPI00032851.1 COPZ1 coatomer protein complex, subunit zeta 1
IPI00171844.3 COPS4 COP9 constitutive photomorphogenic homolog subunit 4 (Arabidopsis)
IPI00783982.1 COPG coatomer protein complex, subunit gamma
IPI00220219.6 COPB2 coatomer protein complex, subunit beta 2 (beta prime)
IPI00298520.3 ARCN1 archain 1
IPI00465132.4 COPE coatomer protein complex, subunit epsilon
IPI00646493.1 COPA coatomer protein complex, subunit alpha
IPI00000875.7 EEF1G eukaryotic translation elongation factor 1 gamma
IPI00642971.3 EEF1D eukaryotic translation elongation factor 1 delta (guanine nucleotide exchange protein)
IPI00893918.1 VARS valyl-tRNA synthetase
IPI00299254.3 EIF5B eukaryotic translation initiation factor 5B
IPI00844214.1 CTNNBL1 catenin, beta like 1
IPI00220503.9 DCTN2 dynactin 2 (p50)
IPI00852584.3 RBM33 RNA binding motif protein 33
IPI00328293.2 SRRM1 serine/arginine repetitive matrix 1
IPI00170786.1 WBP11 WW domain binding protein 11
IPI00016177.1 MNT MAX binding protein
IPI00027009.2 PACSIN2 protein kinase C and casein kinase substrate in neurons 2
IPI00218922.5 SEC63 SEC63 homolog (S. cerevisiae)
IPI00290204.1 SNRNP70 small nuclear ribonucleoprotein 70kDa (U1)
IPI00022774.3 VCP valosin containing protein
IPI00299517.3 FAM175B family with sequence similarity 175, member B
IPI00641788.1 SNRPC small nuclear ribonucleoprotein polypeptide C
IPI00872359.1 DCTN1 dynactin 1
IPI00007402.3 IPO7 importin 7
IPI00294943.1 ARIH1 ariadne homolog, ubiquitin-conjugating enzyme E2 binding protein, 1 (Drosophila)
IPI00029468.1 ACTR1A ARP1 actin-related protein 1 homolog A, centractin alpha (yeast)
IPI00166500.3 PIAS4 protein inhibitor of activated STAT, 4
IPI00022471.7 HMHA1 histocompatibility (minor) HA-1
IPI00300060.4 WDR70 WD repeat domain 70
IPI00004962.2 GOLIM4 golgi integral membrane protein 4
IPI00012382.3 SNRPA small nuclear ribonucleoprotein polypeptide A
IPI00337397.1 NUP98 nucleoporin 98kDa
IPI00170778.4 FNBP4 formin binding protein 4
IPI00006298.1 PPIG peptidylprolyl isomerase G (cyclophilin G)
IPI00170429.3 OGFOD1 2-oxoglutarate and iron-dependent oxygenase domain containing 1
Table 25: Proteins interacting with histone tail SGRG[Kac]GG[Kac]GLGKGGAKRHRKV
Accession number    Protein name    Protein description
IPI00759680.2 BRD9 bromodomain containing 9
IPI00879403.1 SETD5 SET domain containing 5
IPI00304925.5 HSPA1A heat shock 70kDa protein 1A
IPI00024568.2 GLTSCR1 glioma tumor suppressor candidate region gene 1
IPI00220871.4 RPL37 ribosomal protein L37
IPI00013257.1 SSBP4 single stranded DNA binding protein 4
IPI00741537.3 BAT2L proline-rich coiled-coil 2B
IPI00102575.4 ATAD5 ATPase family, AAA domain containing 5
IPI00440727.1 BRD4 bromodomain containing 4
IPI00017381.1 RFC4 replication factor C (activator 1) 4, 37kDa
IPI00031514.1 RFC5 replication factor C (activator 1) 5, 36.5kDa
IPI00017412.1 RFC2 replication factor C (activator 1) 2, 40kDa
[Table content reproduced only as an image in the original publication: imgf000052_0001]
Table 27: Proteins interacting with histone tail SGRG[Kac]GG[Kac]GLG[Kac]GGA[Kac]RHRKV
Accession number    Protein name    Protein description
IPI00023177.4 CHAF1A chromatin assembly factor 1, subunit A (p150)
IPI00011857.1 CHAF1B chromatin assembly factor 1, subunit B (p60)
IPI00292135.1 LBR lamin B receptor
IPI00745300.2 NAT11 N(alpha)-acetyltransferase 40, NatD catalytic subunit, homolog (S. cerevisiae)
IPI00292135.1 LBR lamin B receptor
IPI00304596.3 NONO non-POU domain containing, octamer-binding
IPI00010740.1 SFPQ splicing factor proline/glutamine-rich
IPI00298731.2 PPP1R10 protein phosphatase 1, regulatory (inhibitor) subunit 10
IPI00477619.2 C19ORF2 chromosome 19 open reading frame 2
IPI00004968.1 PRPF19 PRP19/PSO4 pre-mRNA processing factor 19 homolog (S. cerevisiae)
IPI00607584.1 MYBBP1A MYB binding protein (P160) 1a
IPI00294241.2 GTF3A general transcription factor IIIA
IPI00020903.4 AFF2 AF4/FMR2 family, member 2
IPI00017617.1 DDX5 DEAD (Asp-Glu-Ala-Asp) box polypeptide 5
IPI00025178.3 DENND2C DENN/MADD domain containing 2C
IPI00398625.5 HRNR hornerin
IPI00453476.2 PGAM1 phosphoglycerate mutase 1 (brain)
IPI00847637.1 MAP7D3 MAP7 domain containing 3
IPI00647892.1 DLGAP5 discs, large (Drosophila) homolog-associated protein 5
IPI00792100.1 C14ORF166 chromosome 14 open reading frame 166
IPI00550689.3 C22ORF28 chromosome 22 open reading frame 28
IPI00293655.3 DDX1 DEAD (Asp-Glu-Ala-Asp) box polypeptide 1
IPI00027898.3 C21ORF70 chromosome 21 open reading frame 70
IPI00396321.1 LRRC59 leucine rich repeat containing 59
IPI00293260.5 DNAJC10 DnaJ (Hsp40) homolog, subfamily C, member 10
IPI00024568.2 GLTSCR1 glioma tumor suppressor candidate region gene 1
IPI00167861.3 SMG5 Smg-5 homolog, nonsense mediated mRNA decay factor (C. elegans)
IPI00215780.5 RPS19 ribosomal protein S19
IPI00008137.1 ZNF295 zinc finger protein 295
IPI00642617.3 ZNF3 zinc finger protein 3
IPI00916222.1 LSM5 LSM5 homolog, U6 small nuclear RNA associated (S. cerevisiae)
IPI00165189.5 IFT81 intraflagellar transport 81 homolog (Chlamydomonas)
IPI00644079.2 HNRNPU heterogeneous nuclear ribonucleoprotein U (scaffold attachment factor A)
IPI00555647.3 SFRS10 transformer 2 beta homolog (Drosophila)
IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
IPI00171903.2 HNRNPM heterogeneous nuclear ribonucleoprotein M
IPI00008433.4 RPS5 ribosomal protein S5
IPI00005367.1 PRKAG2 protein kinase, AMP-activated, gamma 2 non-catalytic subunit
IPI00008575.3 KHDRBS1 KH domain containing, RNA binding, signal transduction associated 1
IPI00236556.1 MPO myeloperoxidase
IPI00003801.1 PATZ1 POZ (BTB) and AT hook containing zinc finger 1
IPI00433169.2 ARHGEF1 Rho guanine nucleotide exchange factor (GEF) 1
IPI00794659.1 RPS20 ribosomal protein S20
IPI00784224.1 ZFR zinc finger RNA binding protein
IPI00418313.3 ILF3 interleukin enhancer binding factor 3, 90kDa
IPI00784366.1 AP2B1 adaptor-related protein complex 2, beta 1 subunit
IPI00440727.1 BRD4 bromodomain containing 4
IPI00102575.4 ATAD5 ATPase family, AAA domain containing 5
IPI00017381.1 RFC4 replication factor C (activator 1) 4, 37kDa
IPI00031514.1 RFC5 replication factor C (activator 1) 5, 36.5kDa
IPI00031521.1 RFC3 replication factor C (activator 1) 3, 38kDa
Table 28: Proteins interacting with histone tail SGRG[Kac]GGKGLG[Kac]GGAKRHRKV
Accession number   Protein name   Protein description
IPI00398625.5 HRNR hornerin
IPI00018120.1 DAP3 death associated protein 3
IPI00791157.1 RPS17 ribosomal protein S17
IPI00022460.2 ZNF592 zinc finger protein 592
IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
IPI00418313.3 ILF3 interleukin enhancer binding factor 3, 90kDa
IPI00102425.1 RPRD1A regulation of nuclear pre-mRNA domain containing 1A
Table 29: Proteins interacting with histone tail SGRG[Kac]GGKGLGKGGA[Kac]RHRKV
Accession number   Protein name   Protein description
IPI00013721.2 PRPF4B PRP4 pre-mRNA processing factor 4 homolog B (yeast)
IPI00013415.1 RPS7 ribosomal protein S7
IPI00790342.1 RPL6 ribosomal protein L6
IPI00019770.3 FAU Finkel-Biskis-Reilly murine sarcoma virus (FBR-MuSV) ubiquitously expressed
Table 30: Proteins interacting with histone tail SGRGKGG[Kac]GLGKGGAKRHRKV
Accession number   Protein name   Protein description
IPI00477535.6 ERCC5 excision repair cross-complementing rodent repair deficiency, complementation group 5
IPI00299254.3 EIF5B eukaryotic translation initiation factor 5B
IPI00016177.1 MNT MAX binding protein
IPI00220503.9 DCTN2 dynactin 2 (p50)
IPI00872359.1 DCTN1 dynactin 1
Table 31: Proteins interacting with histone tail SGRGKGG[Kac]GLG[Kac]GGAKRHRKV
Accession number   Protein name   Protein description
IPI00010252.3 TRIM33 tripartite motif containing 33
IPI00759680.2 BRD9 bromodomain containing 9
IPI00102575.4 ATAD5 ATPase family, AAA domain containing 5
IPI00167535.6 EP400 E1A binding protein p400
IPI00013452.9 EPRS glutamyl-prolyl-tRNA synthetase
IPI00004860.2 RARS arginyl-tRNA synthetase
IPI00644127.1 IARS isoleucyl-tRNA synthetase
IPI00011916.1 JTV1 aminoacyl tRNA synthetase complex-interacting multifunctional protein 2
IPI00793201.1 SCYE1 aminoacyl tRNA synthetase complex-interacting multifunctional protein 1
IPI00026665.2 QARS glutaminyl-tRNA synthetase
IPI00216951.2 DARS aspartyl-tRNA synthetase
IPI00307092.2 KARS lysyl-tRNA synthetase
IPI00024568.2 GLTSCR1 glioma tumor suppressor candidate region gene 1
IPI00003588.1 EEF1E1 eukaryotic translation elongation factor 1 epsilon 1
Table 32: Proteins interacting with histone tail SGRGKGG[Kac]GLG[Kme3]GGAKRHRKV
Accession number   Protein name   Protein description
IPI00853240.1 TAF3 TAF3 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 140kDa
IPI00298925.2 TAF5 TAF5 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 100kDa
IPI00018111.1 TAF7 TAF7 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 55kDa
IPI00413755.1 TAF4 TAF4 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 135kDa
IPI00853240.1 TAF3 TAF3 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 140kDa
IPI00400834.1 MED13L mediator complex subunit 13-like
IPI00021924.1 H1FX H1 histone family, member X
IPI00217466.3 HIST1H1D histone cluster 1, H1d
IPI00477313.3 HNRNPC heterogeneous nuclear ribonucleoprotein C (C1/C2)
IPI00787089.1 PELP1 proline, glutamate and leucine rich protein 1
IPI00018140.3 SYNCRIP synaptotagmin binding, cytoplasmic RNA interacting protein
IPI00418313.3 ILF3 interleukin enhancer binding factor 3, 90kDa
IPI00008557.5 IGF2BP1 insulin-like growth factor 2 mRNA binding protein 1
IPI00552587.1 GADD45GIP1 growth arrest and DNA-damage-inducible, gamma interacting protein 1
IPI00784044.1 MCCC2 methylcrotonoyl-CoA carboxylase 2 (beta)
IPI00009680.3 MRPL44 mitochondrial ribosomal protein L44
IPI00856098.1 RRBP1 ribosome binding protein 1 homolog 180kDa (dog)
IPI00304692.1 RBMX RNA binding motif protein, X-linked
IPI00844578.1 DHX9 DEAH (Asp-Glu-Ala-His) box polypeptide 9
IPI00789699.2 CYFIP2 cytoplasmic FMR1 interacting protein 2
IPI00215966.3 RPP14 ribonuclease P/MRP 14kDa subunit
IPI00032881.2 MRPS23 mitochondrial ribosomal protein S23
IPI00398625.5 HRNR hornerin
IPI00011268.2 RALY RNA binding protein, autoantigenic (hnRNP-associated with lethal yellow homolog (mouse))
IPI00795922.1 MRPS28 mitochondrial ribosomal protein S28
IPI00022403.1 MRPL13 mitochondrial ribosomal protein L13
IPI00872107.1 ILF2 interleukin enhancer binding factor 2, 45kDa
IPI00784224.1 ZFR zinc finger RNA binding protein
IPI00013452.9 EPRS glutamyl-prolyl-tRNA synthetase
IPI00006987.1 DDX24 DEAD (Asp-Glu-Ala-Asp) box polypeptide 24
IPI00022002.6 MRPS27 mitochondrial ribosomal protein S27
Figure imgf000056_0001
Table 34: Proteins interacting with histone tail SGRGKGGKGLG[Kme2]GGAKRHRKV
Accession number   Protein name   Protein description
IPI00880117.1 APOBEC3B apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3B
IPI00398625.5 HRNR hornerin
Table 35: Proteins interacting with histone tail SGRGKGGKGLG[Kme3]GGAKRHRKV
Accession number   Protein name   Protein description
IPI00400834.1 MED13L mediator complex subunit 13-like
IPI00217466.3 HIST1H1D histone cluster 1, H1d
IPI00021924.1 H1FX H1 histone family, member X
IPI00217467.3 HIST1H1C histone cluster 1, H1c
IPI00784044.1 MCCC2 methylcrotonoyl-CoA carboxylase 2 (beta)
IPI00215966.3 RPP14 ribonuclease P/MRP 14kDa subunit
IPI00008557.5 IGF2BP1 insulin-like growth factor 2 mRNA binding protein 1
IPI00789699.2 CYFIP2 cytoplasmic FMR1 interacting protein 2
IPI00032881.2 MRPS23 mitochondrial ribosomal protein S23
IPI00418313.3 ILF3 interleukin enhancer binding factor 3, 90kDa
IPI00398625.5 HRNR hornerin
IPI00011268.2 RALY RNA binding protein, autoantigenic (hnRNP-associated with lethal yellow homolog (mouse))
IPI00001134.3 RBM7 RNA binding motif protein 7
IPI00306661.3 KCMF1 potassium channel modulatory factor 1
IPI00480028.2 ABI1 abl-interactor 1
IPI00009680.3 MRPL44 mitochondrial ribosomal protein L44
IPI00304692.1 RBMX RNA binding motif protein, X-linked
Examples
Principle of the assay
Cell lysate (nuclear extract of human HL-60 cells) was contacted with immobilized histone tails. The beads with captured proteins were separated from the lysate, and bead-bound proteins were eluted in SDS sample buffer and subsequently separated by SDS-polyacrylamide gel electrophoresis. The gel was stained with colloidal Coomassie, and the stained areas of each gel lane were cut out and subjected to in-gel proteolytic digestion with trypsin. Peptides originating from the different histone tail beads and the lysate control were labeled with isobaric tagging reagents (TMT reagents, Thermo Fisher). The TMT reagents are a set of multiplexed, amine-specific, stable isotope reagents that can label peptides in up to six different biological samples, enabling simultaneous identification and quantitation of peptides. The combined samples were fractionated using reversed-phase chromatography at pH 11, and the fractions were subsequently analyzed with a nano-flow liquid chromatography system coupled online to a tandem mass spectrometer (LC-MS/MS), followed by reporter ion quantification in the MS/MS spectra (Ross et al., 2004. Mol. Cell. Proteomics 3(12):1154-1169; Dayon et al., 2008. Anal. Chem. 80(8):2921-2931; Thompson et al., 2003. Anal. Chem. 75(8):1895-1904). Further experimental protocols can be found in WO2006/134056 and previous publications (Bantscheff et al., 2007. Nat. Biotechnol. 25, 1035-1044; Bantscheff et al., 2011. Nat. Biotechnol. 29, 255-265).
Results
Tables 1 to 21 show the proteins interacting with individual H3 histone tails which are listed in Table 36. Tables 22 to 35 depict the proteins interacting with individual H4 histone tails which are listed in Table 37. Sequence accession numbers are defined by the International Protein Index (IPI) (Kersey et al., 2004. Proteomics 4(7): 1985-1988).
Protocols
1. Preparation of the histone tail beads (affinity matrix)
Biotinylated histone tails were purchased from Alta Bioscience (Birmingham, UK) and solubilized at a concentration of 0.8 mM in 10 mM Tris-HCl buffer, pH 7.4 (Sigma-Aldrich T2663, St. Louis, MO, USA). Histone tails were incubated for 30 minutes with streptavidin agarose beads (Thermo Fisher Scientific 20361, Waltham, MA, USA). For each sample, 14 μl of 0.8 mM histone tail solution and 25 μl of beads were used. The beads were then centrifuged for 5 minutes at 1,200 rpm (Heraeus 75004375, Hanau, Germany) and the supernatant was removed.
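As a quick check of the loading implied by these volumes, the amount of peptide offered per sample follows directly from the stated concentration and volume. The snippet below is only a sketch of that arithmetic; the function name is illustrative, and the result is the amount of peptide added, not the amount actually bound to the beads.

```python
def peptide_amount_nmol(conc_mM: float, volume_ul: float) -> float:
    """Amount of substance in nmol for a solution given in mM and microliters."""
    # mM x ul = (1e-3 mol/l) x (1e-6 l) = 1e-9 mol, i.e. nmol
    return conc_mM * volume_ul


amount = peptide_amount_nmol(0.8, 14)   # about 11.2 nmol of histone tail per sample
per_ul_beads = amount / 25              # about 0.45 nmol offered per ul of beads
print(f"{amount:.1f} nmol peptide, {per_ul_beads:.2f} nmol per ul beads")
```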
Table 36: Histone H3(1-21) tails used in this study.
(Abbreviations: Rme, mono-methylated arginine; Rme2a, asymmetrical di-methylated arginine; Kme, mono-methylated lysine; Kme2, di-methylated lysine; Kme3, tri-methylated lysine; Kac, acetylated lysine; pS, phosphorylated serine; pT, phosphorylated threonine; H3(1-21); ahx, aminohexanoic acid)
Figure imgf000059_0001
Table 37: Histone H4(1-21) tails used in this study.
(Abbreviations: Rme, mono-methylated arginine; Rme2a, asymmetrical di-methylated arginine; Kme, mono-methylated lysine; Kme2, di-methylated lysine; Kme3, tri-methylated lysine; Kac, acetylated lysine; pS, phosphorylated serine; pT, phosphorylated threonine; ahx, aminohexanoic acid)
Peptide sequence   Histone tail name
SGRGKGGKGLGKGGAKRHRKV-ahx-K-Biotin   H4(1-21)-biotin
[pS]G[Rme2a]G[Kac]GGKGLGKGGAKRHRKV-ahx-K-Biotin   H4(1-21), pS1, R3me2a, K5ac-biotin
SG[Rme2a]G[Kac]GGKGLGKGGAKRHRKV-ahx-K-Biotin   H4(1-21), R3me2a, K5ac-biotin
SGRG[Kac]GGKGLGKGGAKRHRKV-ahx-K-Biotin   H4(1-21), K5ac-biotin
SGRG[Kac]GG[Kac]GLGKGGAKRHRKV-ahx-K-Biotin   H4(1-21), K5ac, K8ac-biotin
SGRG[Kac]GG[Kac]GLG[Kac]GGAKRHRKV-ahx-K-Biotin   H4(1-21), K5ac, K8ac, K12ac-biotin
SGRG[Kac]GG[Kac]GLG[Kac]GGA[Kac]RHRKV-ahx-K-Biotin   H4(1-21), K5ac, K8ac, K12ac, K16ac-biotin
SGRG[Kac]GGKGLG[Kac]GGAKRHRKV-ahx-K-Biotin   H4(1-21), K5ac, K12ac-biotin
SGRG[Kac]GGKGLGKGGA[Kac]RHRKV-ahx-K-Biotin   H4(1-21), K5ac, K16ac-biotin
SGRGKGG[Kac]GLGKGGAKRHRKV-ahx-K-Biotin   H4(1-21), K8ac-biotin
SGRGKGG[Kac]GLG[Kac]GGAKRHRKV-ahx-K-Biotin   H4(1-21), K8ac, K12ac-biotin
SGRGKGG[Kac]GLG[Kme3]GGAKRHRKV-ahx-K-Biotin   H4(1-21), K8ac, K12me3-biotin
SGRGKGGKGLG[Kac]GGAKRHRKV-ahx-K-Biotin   H4(1-21), K12ac-biotin
SGRGKGGKGLG[Kme2]GGAKRHRKV-ahx-K-Biotin   H4(1-21), K12me2-biotin
SGRGKGGKGLG[Kme3]GGAKRHRKV-ahx-K-Biotin   H4(1-21), K12me3-biotin
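The bracket notation used for the peptide sequences maps directly onto the position-based tail names in the second column (for example K5ac or R3me2a). The following is a minimal sketch of that mapping, assuming the input is the 21-residue tail sequence only, without the ahx-K-Biotin linker; the function and pattern names are illustrative and not part of the described method.

```python
import re

# Matches either a bracketed modified residue such as [Kac], [Rme2a] or [pS],
# or a single unmodified one-letter residue.
TOKEN = re.compile(r"\[(p?)([A-Z])([a-z0-9]*)\]|([A-Z])")


def modifications(peptide: str) -> list[str]:
    """Return position-annotated modifications, e.g. ['K5ac', 'K8ac']."""
    found = []
    position = 0
    for match in TOKEN.finditer(peptide):
        position += 1  # every token, modified or not, is one residue
        prefix, residue, suffix, plain = match.groups()
        if plain is not None:
            continue  # unmodified residue, nothing to report
        if prefix:
            # phosphorylated residues carry a leading 'p', e.g. [pS] -> pS1
            found.append(f"p{residue}{position}")
        else:
            found.append(f"{residue}{position}{suffix}")
    return found


print(modifications("[pS]G[Rme2a]G[Kac]GGKGLGKGGAKRHRKV"))
```

Running the sketch on the sequences in the table is a simple way to cross-check that each sequence and its tail name agree.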
2. Cell culture
HL-60 cells (ATCC CCL-240, Manassas, VA, USA) were grown in spinner flasks (Integra Bioscience 182101, Zizers, Switzerland) in IMDM medium (Invitrogen 21980.065, Carlsbad, CA, USA) supplemented with 20% fetal calf serum (PAA Laboratories 15/101, Pasching, Austria) up to a cell concentration of 1 x 10^6 cells/ml.
3. Preparation of cell lysates (nuclear extracts)
Cells were harvested by centrifugation for 6 minutes at 2,370 rpm (Sorvall R12BP, Newtown, CO, USA) and washed twice in Phosphate Buffered Saline (137 mM NaCl (Sigma S5150), 2.7 mM KCl (Merck 1.04936), 8 mM Na2HPO4 (Sigma S7907), 1.46 mM KH2PO4 (Sigma P5504)). Washed cells were centrifuged for 5 minutes (first wash) or 10 minutes (second wash) at 2,370 rpm (Heraeus 75004375). The cell pellet was resuspended in 4 volumes of hypotonic buffer (10 mM TRIS-Cl, pH 7.4, 1.5 mM MgCl2 (Sigma M-1028), 10 mM KCl, 25 mM NaF (Sigma S7920), 1 mM Na3VO4 (Sigma S6508), 1 mM DTT (Biomol 04010, Plymouth Meeting, PA, USA)). The cells were allowed to swell for 3 minutes (swelling checked under the microscope) and then centrifuged for 5 minutes at 2,350 rpm (Heraeus). The supernatant was discarded and the pellet was resuspended in 2 volumes of hypotonic buffer supplemented with protease inhibitors. The cells were homogenized by 10 to 15 strokes in a homogenizer (VWR SCERSP885300-0015, Radnor, PA, USA) and the homogenate was centrifuged for 10 minutes at 3,300 rpm. The supernatant was discarded and the nuclei were washed in 3 volumes of hypotonic buffer supplemented with protease inhibitors (1 tablet for 25 ml; Roche 13137200, Basel, Switzerland) and centrifuged for 15 minutes at 10,000 rpm in a SLA-600TC rotor (Sorvall 74503). The pellet was resuspended in 1 volume of extraction buffer (50 mM TRIS-Cl, pH 7.4, 1.5 mM MgCl2, 20% glycerol (Merck Z835091), 420 mM NaCl (Sigma S5150), 25 mM NaF, 1 mM Na3VO4, 1 mM DTT, 400 units/ml of DNAseI (Sigma D4527), and protease inhibitors (1 tablet for 25 ml)) and then homogenized, first with 20 strokes in a homogenizer and then by 30 minutes of gentle mixing at 4°C. The homogenate was then centrifuged for 30 minutes at 10,000 rpm in a SLA-600TC rotor.
The supernatant was diluted in dilution buffer (1.8 ml buffer per 1 ml supernatant; 50 mM TRIS-Cl, pH 7.4, 3.9 mM EDTA (Sigma E7889), 25 mM NaF, 1 mM Na3VO4, 0.6% Igepal CA-630 (Sigma I3021), 1 mM DTT and protease inhibitors (1 tablet for 25 ml)). After 10 minutes of incubation on ice, the lysate was centrifuged for 1 hour at 33,500 rpm in a Ti50.2 rotor (Beckman Coulter LE90K, 392052, Brea, CA, USA) and the supernatant was frozen in liquid nitrogen and stored at -80°C. After thawing of the nuclear lysate, the protein concentration was adjusted to 5 mg/ml. The final buffer composition was 50 mM TRIS pH 7.4, 5% glycerol, 150 mM NaCl, 25 mM NaF, 2.5 mM EDTA, 0.4% Igepal CA-630, 1 mM DTT and protease inhibitors (1 tablet for 25 ml lysate). The lysate was then subjected to ultracentrifugation at 33,500 rpm for 20 minutes in a Ti50.2 rotor.
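The 1.8:1 dilution ratio can be related to the stated final buffer composition with a simple mixing calculation. The sketch below assumes that the supernatant has the nominal composition of the extraction buffer and that volumes are additive; both are simplifying assumptions, and the function name is illustrative. Only components present in just one of the two buffers are shown.

```python
def diluted(conc_extract: float, conc_diluent: float,
            vol_extract: float = 1.0, vol_diluent: float = 1.8) -> float:
    """Concentration after mixing two solutions, assuming additive volumes."""
    total = vol_extract + vol_diluent
    return (conc_extract * vol_extract + conc_diluent * vol_diluent) / total


nacl_mM = diluted(420, 0)      # NaCl comes only from the extraction buffer
edta_mM = diluted(0, 3.9)      # EDTA comes only from the dilution buffer
igepal_pct = diluted(0, 0.6)   # Igepal CA-630 comes only from the dilution buffer

print(f"NaCl {nacl_mM:.0f} mM, EDTA {edta_mM:.2f} mM, Igepal {igepal_pct:.2f} %")
```

Under these assumptions the result is close to the 150 mM NaCl, 2.5 mM EDTA and 0.4% Igepal CA-630 given for the final buffer.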
4. Capturing of proteins from cell lysate
The lysate was precleared with Poly-L-lysine agarose beads for 90 minutes. For 25 mg of protein in the lysate, 600 μl of beads were used (Sigma P6893-50). The histone tail beads were then added to the lysate for 2 hours. The beads were centrifuged and loaded into a purification column (Mobicol, Mobitec M1002, Goettingen, Germany). The beads were washed first with 10 ml of buffer (50 mM TRIS pH 7.4, 5% glycerol, 150 mM NaCl, 25 mM NaF, 2.5 mM EDTA, 0.4% Igepal CA-630, 1 mM DTT and protease inhibitors) and then with 5 ml of buffer with half the concentration of detergent (0.2% Igepal). Bound proteins were eluted with 50 μl of loading buffer (Nupage, Invitrogen NP0007) at 50°C for 30 minutes. 25 μl of the eluate was alkylated with iodoacetamide (2.8 μl of 200 mg/ml in water; Sigma I-1149) for 30 minutes at room temperature and loaded on a gel for short separation (Criterion XT 4-12% Bis-Tris Gel; Criterion XT MOPS Running buffer; Bio-Rad 345-0123 & 161-0788, Hercules, CA, USA) prior to mass spectrometry analysis. Each gel was loaded with four purifications of differently modified histone tails and one purification of the non-modified equivalent tail. The last lane was loaded with 3 μg of precleared lysate.
5. Protein identification and quantitation by mass spectrometry
5.1 Protein digestion prior to mass spectrometric analysis
Gel-separated proteins were digested in-gel essentially following a previously described procedure (Shevchenko et al., 1996, Anal. Chem. 68:850-858). Briefly, gel-separated proteins were excised from the gel using a clean scalpel, destained twice using 100 μl 5 mM triethylammonium bicarbonate buffer (TEAB; Sigma T7408) and 40% ethanol in water and dehydrated with absolute ethanol. Proteins were subsequently digested in-gel with porcine trypsin (Promega) at a protease concentration of 10 ng/μl in 5 mM TEAB. Digestion was allowed to proceed for 4 hours at 37°C and the reaction was subsequently stopped using 5 μl 5% formic acid. Gel plugs were extracted twice with 20 μl 1% formic acid and three times with increasing concentrations of acetonitrile. Peptide extracts were subsequently pooled with acidified digest supernatants and dried in a vacuum centrifuge.
5.2 TMT labeling of peptide extracts
The peptide extracts corresponding to the different aliquots treated with different concentrations of compound 1 were labeled with variants of the isobaric tagging reagent as shown in Table 38 (TMT sixplex Label Reagent Set, part number 90066, Thermo Fisher Scientific Inc., Rockford, IL 61105, USA). The TMT reagents are a set of multiplexed, amine-specific, stable isotope reagents that can label peptides on amino groups in up to six different biological samples, enabling simultaneous identification and quantification of peptides. The TMT reagents were used according to the instructions provided by the manufacturer. The samples were resuspended in 10 μl of 50 mM TEAB solution, pH 8.5, and 10 μl of acetonitrile were added. The TMT reagent was dissolved in acetonitrile to a final concentration of 24 mM and 10 μl of reagent solution were added to the sample. The labeling reaction was performed at room temperature for one hour on a horizontal shaker and stopped by adding 5 μl of 100 mM TEAB and 100 mM glycine in water. The labeled samples were then combined, dried in a vacuum centrifuge and resuspended in 60% 200 mM TEAB / 40% acetonitrile. 2 μl of a 2.5% NH2OH solution in water were added and incubated for 15 min, and finally the reaction was stopped by addition of 10 μl of 20% formic acid in water. After freeze-drying, samples were resuspended in 50 μl of 0.1% formic acid in water.
Table 38: Labeling of peptides with TMT isobaric tagging reagents
Gel lane   Histone tail sample (HT)   TMT6 reagent
1   HT1   126
2   HT2   127
3   HT3   128
4   HT4   129
5   HT5   130
6   lysate control   131
5.3 Peptide fractionation using high pH reversed phase chromatography
Peptide samples were injected into a capillary LC system (CapLC, Waters) and separated using a reversed-phase C18 column (X-Bridge 1 mm x 150 mm, Waters, USA). Gradient elution was performed at a flow rate of 50 μl/min with solvent A (20 mM ammonium formate, pH 11) and solvent B (20 mM ammonium formate, pH 11, 60% acetonitrile). One-minute fractions were automatically collected throughout the separation range (Micro-fraction collector, Sunchrom, Germany) and pooled to yield a total of 16 peptide fractions.
Table 39: Peptide fractionation by gradient elution
Figure imgf000063_0001
5.4 LC-MS/MS analysis
Samples were dried in vacuo and resuspended in 0.1% formic acid in water, and aliquots of the sample were injected into a nano-LC system (Eksigent 1D+) coupled to LTQ-Orbitrap mass spectrometers (Thermo-Finnigan). Peptides were separated on custom 50 cm x 75 μm (ID) reversed phase columns (Reprosil) at 40°C. Gradient elution was performed from 2% acetonitrile to 40% acetonitrile in 0.1% formic acid over 2 hrs (solvent A was 0.1% formic acid and solvent B was 70% acetonitrile in 0.1% formic acid).
LTQ-Orbitrap XL and Orbitrap Velos instruments were operated with XCalibur 2.0/2.1 software. Intact peptides were detected in the Orbitrap at 30,000 resolution. Internal calibration was performed using the ion signal from (Si(CH3)2O)6H+ at m/z 445.120025 (Olsen et al., 2005. Mol. Cell Proteomics 4, 2010-2021). Data dependent tandem mass spectra were generated for up to six peptide precursors using a combined CID/HCD approach (Kocher et al., 2009. J. Proteome Res. 8, 4743-4752). For CID, up to 5000 ions (Orbitrap XL) or up to 3000 ions (Orbitrap Velos) were accumulated in the ion trap within a maximum ion accumulation time of 200 msec.
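The calibrant m/z quoted above can be reproduced from standard monoisotopic masses, which also gives a convenient way to express a measured deviation in ppm. The following is a sketch only: the mass values are rounded standard constants entered by hand, and the helper names are illustrative rather than part of any instrument software.

```python
# Monoisotopic masses in unified atomic mass units (rounded standard values).
MASS = {"Si": 27.9769265, "C": 12.0, "H": 1.00782503, "O": 15.9949146}
PROTON = 1.00727647


def polysiloxane_mz(n_units: int = 6) -> float:
    """m/z of the singly protonated polysiloxane (Si(CH3)2O)n."""
    unit = MASS["Si"] + 2 * MASS["C"] + 6 * MASS["H"] + MASS["O"]
    return n_units * unit + PROTON


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


lock_mass = polysiloxane_mz()
print(f"{lock_mass:.5f}")  # agrees with 445.120025 to within rounding of the constants
```

A lock-mass correction would, in essence, shift the measured masses so that `ppm_error` for this background ion approaches zero in each scan.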
Table 40: Peptide elution off the LC system
Figure imgf000064_0001
5.5 Peptide and protein identification
Mascot™ 2.0 (Matrix Science) was used for protein identification using 10 ppm mass tolerance for peptide precursors and 0.8 Da (CID) tolerance for fragment ions. Carbamidomethylation of cysteine residues and iTRAQ/TMT modification of lysine residues were set as fixed modifications, and S,T,Y phosphorylation, methionine oxidation, N-terminal acetylation of proteins and iTRAQ/TMT modification of peptide N-termini were set as variable modifications. The search database consisted of a customized version of the IPI protein sequence database combined with a decoy version of this database created using a script supplied by Matrix Science (Elias et al., 2005. Nat. Methods 2, 667-675). Protein identifications were accepted as follows: i) For single spectrum to sequence assignments, we required this assignment to be the best match, a minimum Mascot score of 31 and a 10x difference of this assignment over the next best assignment. Based on these criteria, the decoy search results indicated <1% false discovery rate (FDR); ii) For multiple spectrum to sequence assignments and using the same parameters, the decoy search results indicated <0.1% false discovery rate. For protein quantification, a minimum of 2 sequence assignments matching to unique peptides was required. FDR for quantified proteins was <<0.1%.
5.6 Peptide and protein quantification
Centroided iTRAQ/TMT reporter ion signals were computed by the XCalibur software operating the instrument and extracted from MS data files using customized scripts. Only peptides unique for identified proteins were used for relative protein quantification. Furthermore, spectra used for quantification were filtered according to the following criteria: Mascot ion score > 15, signal to background ratio of the precursor ion > 4, s2i > 0.5 (Savitski et al., 2010. J. Am. Soc. Mass Spectrom. 21(10):1668-79). Reporter ion intensities were multiplied with the ion accumulation time, yielding an area value proportional to the number of reporter ions present in the mass analyzer. For compound competition binding experiments, fold changes are reported based on reporter ion areas in comparison to vehicle control and were calculated using a sum-based bootstrap algorithm. Fold changes were corrected for isotope purity as described and adjusted for interference caused by co-eluting nearly isobaric peaks as estimated by the s2i measure (Savitski et al., 2010. J. Am. Soc. Mass Spectrom. 21(10):1668-79).
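The spectrum filter and the conversion of reporter ion intensities into area values can be illustrated with a short sketch. It assumes a simple in-memory record per spectrum; the class, field and function names are hypothetical and do not correspond to the customized scripts mentioned above. The plain per-channel sum stands in for the sum-based bootstrap algorithm, and the isotope purity and s2i interference corrections are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Spectrum:
    mascot_ion_score: float
    precursor_signal_to_background: float
    s2i: float
    reporter_intensities: dict[str, float]  # keyed by TMT channel, e.g. "126"
    ion_accumulation_time_ms: float


def passes_filter(s: Spectrum) -> bool:
    """Apply the three stated thresholds for quantifiable spectra."""
    return (s.mascot_ion_score > 15
            and s.precursor_signal_to_background > 4
            and s.s2i > 0.5)


def reporter_areas(s: Spectrum) -> dict[str, float]:
    """Intensity times ion accumulation time gives an area per channel."""
    return {channel: intensity * s.ion_accumulation_time_ms
            for channel, intensity in s.reporter_intensities.items()}


def summed_areas(spectra: list[Spectrum]) -> dict[str, float]:
    """Sum the areas of all spectra that pass the filter, per channel."""
    totals: dict[str, float] = {}
    for s in filter(passes_filter, spectra):
        for channel, area in reporter_areas(s).items():
            totals[channel] = totals.get(channel, 0.0) + area
    return totals
```

For one protein, `summed_areas` would be called with the spectra of its unique peptides; ratios between channels then give relative amounts across the labeled samples.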

Claims

1. A method for the identification of a protein capable of interacting with a given histone tail, comprising the steps of a) providing a protein preparation containing a protein of interest, b) contacting the protein preparation with the given histone tail under conditions allowing the binding of the protein to said histone tail, and c) characterizing the protein bound to the histone tail.
2. The method of claim 1, wherein the histone tail is labeled and after the contacting the histone tail-protein complex is purified using said label, in particular wherein the label is biotin and the histone tail-protein complex is purified with the help of streptavidin.
3. The method of claim 1, wherein step c) includes the steps of c1) eluting the protein from the histone tail, and c2) characterizing the protein.
4. The method of any of claims 1 or 2, wherein said characterizing the protein includes the identification of the protein.
5. The method of any of claims 1 to 4, wherein said protein is characterized by mass spectrometry or immunodetection.
6. The method of any of claims 1 to 5, wherein step b) is performed in the presence of varying concentrations of a non-labeled histone tail, in particular wherein the amount of protein is determined in step c) and wherein a detection of a reduced amount of protein with increasing concentrations of the non-labeled histone tail is indicative for a specific binding of the protein to the histone tail.
7. A method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of a) identifying a protein capable of interacting with a given histone tail according to any of claims 1 to 6, b) providing a protein preparation containing said protein, c) contacting the protein preparation with said histone tail immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, d) incubating the complex with a given compound, and e) determining whether the compound is able to separate the protein from the immobilized histone tail.
8. The method of claim 7, wherein step e) includes the detection of separated protein or the determination of the amount of separated protein, in particular wherein said separated protein is detected or the amount of said separated protein is determined by mass spectrometry or immunodetection methods, preferably with an antibody directed against the protein.
9. A method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of a) identifying a protein capable of interacting with a given histone tail according to any of claims 1 to 6, b) providing a protein preparation containing said protein, c) contacting the protein preparation with said histone tail immobilized on a solid support and with a given compound under conditions allowing the formation of a complex between the histone tail and the protein, and d) detecting the complex formed in step c).
10. The method of claim 9 wherein in step d) said detecting is performed by determining the amount of the complex.
11. The method of any of claims 9 or 10, wherein steps a) to d) are performed with several protein preparations in order to test different compounds.
12. A method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of: a) identifying a protein capable of interacting with a given histone tail according to any of claims 1 to 6, b) providing two aliquots of a protein preparation containing said protein, c) contacting one aliquot with said histone tail immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, d) contacting the other aliquot with said histone tail immobilized on a solid support and with a given compound under conditions allowing the formation of a complex between the histone tail and the protein, and e) determining the amount of the complex formed in steps c) and d).
13. A method for the identification of a compound being capable of interacting with a protein capable of interacting with a given histone tail, comprising the steps of: a) identifying a protein capable of interacting with a given histone tail according to any of claims 1 to 6, b) providing two aliquots comprising each at least one cell containing said protein, c) incubating one aliquot with a given compound, d) harvesting the cells of each aliquot, e) lysing the cells in order to obtain protein preparations, f) contacting the protein preparations with said histone tail immobilized on a solid support under conditions allowing the formation of a complex between the histone tail and the protein, and g) determining the amount of the complex formed in each aliquot in step f).
14. The method of any of claims 12 or 13, wherein a reduced amount of the complex formed in the aliquot incubated with the compound in comparison to the aliquot not incubated with the compound indicates that the protein is a target of the compound, in particular wherein said protein is detected or the amount of said protein is determined by mass spectrometry or immunodetection methods, preferably with an antibody directed against said protein.
15. The method of any of claims 12 to 14, wherein the amount of the complex is determined by separating the protein from the immobilized histone tail and subsequent detection of separated protein or subsequent determination of the amount of separated protein, in particular wherein said protein is detected or the amount of said protein is determined by mass spectrometry or immunodetection methods, preferably with an antibody directed against said protein.
16. The method of any of claims 1 to 15, performed as a medium or high throughput screening.
17. The method of any of claims 1 to 16, wherein said compound is selected from the group consisting of synthetic compounds, or organic synthetic drugs, more preferably small molecule organic drugs, and natural small molecule compounds.
18. The method of any of claims 1 to 17, wherein the solid support is selected from the group consisting of agarose, modified agarose, sepharose beads (e.g. NHS-activated sepharose), latex, cellulose, and ferro- or ferrimagnetic particles.
19. The method of any of claims 1 to 18, wherein the provision of a protein preparation includes the steps of harvesting at least one cell containing the protein and lysing the cell.
20. The method of any of claims 1 to 19, wherein at least one of the amino acids of the histone tail has been further modified.
21. The method of any of claims 7 to 20, wherein instead of step a) the histone tail is one of the histone tails as shown in one of the Tables 1 to 35, and the protein is one of the proteins corresponding to the respective histone tail as shown in one of the Tables 1 to 35.
22. The method of any of claims 1 to 21, wherein the characterization of the protein, or the detection of the protein, or the determination of the amount of the protein is carried out by quantitative mass spectrometry.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP11002203.5 2011-03-17
EP11002203 2011-03-17

Publications (1)

Publication Number Publication Date
WO2012123119A1 true WO2012123119A1 (en) 2012-09-20

Family

ID=45855708

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2012/001149 WO2012123119A1 (en) 2011-03-17 2012-03-14 Methods for the identification and characterization of proteins interacting with histone tails and of compounds interacting with said proteins

Country Status (1)

Country Link
WO (1) WO2012123119A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013127011A1 (en) 2012-02-27 2013-09-06 British Columbia Cancer Agency Branch Reprogramming effector protein interactions to correct epigenetic defects in cancer
WO2015108955A3 (en) * 2014-01-15 2015-11-12 The Board Of Regents Of The University Of Texas System Targeting of pelp1 in cancer therapy
WO2017054832A1 (en) * 2015-10-02 2017-04-06 University Of Copenhagen Small molecules blocking histone reader domains
CN110222798A (en) * 2019-04-24 2019-09-10 昆明理工大学 One kind is based on the improved tail code sequence recognition methods again of the random recognition methods of tail code
US10441644B2 (en) 2015-05-05 2019-10-15 The Regents Of The University Of California H3.3 CTL peptides and uses thereof

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006134056A1 (en) 2005-06-14 2006-12-21 Cellzome Ag Process for the identification of novel enzyme interacting compounds

Non-Patent Citations (52)

* Cited by examiner, † Cited by third party
Title
"Current Protocols in Protein Science", 1995, WILEY, article "Recombinant Proteins"
"Current Protocols in Protein Science", WILEY, article "Purification of Organelles from Mammalian Cells"
"Current Protocols in Protein Science", WILEY, article "Subcellular Fractionation of Tissue Culture Cells"
"Short Protocols in Molecular Biology", 1999, WILEY, pages: 11 - 1,11-30
BALASUBRAMANIAN ET AL., CANCER LETT., vol. 280, no. 2, 2009, pages 211 - 21
BANTSCHEFF ET AL., NAT. BIOTECHNOL., vol. 25, 2007, pages 1035 - 1044
BANTSCHEFF ET AL., NAT. BIOTECHNOL., vol. 29, 2011, pages 255 - 265
BARSKI ET AL., CELL, vol. 129, no. 4, 2007, pages 823 - 837
BLACKWELL ET AL., LIFE SCI., vol. 82, no. 21-22, 2008, pages 1050 - 1058
BLACKWELL ET AL., LIFE SCIENCES, vol. 82, no. 21-22, 2008, pages 1050 - 1058
BOLDEN ET AL., NAT. REV. DRUG DISCOV., vol. 5, no. 9, 2006, pages 769 - 784
BREINBAUER R; MANGER M; SCHECK M; WALDMANN H: "Natural product guided compound library development", CURR. MED. CHEM., vol. 9, no. 23, 2002, pages 2129 - 2145
CHI ET AL., NAT. REV. CANCER., vol. 10, no. 7, 2010, pages 457 - 69
COPELAND ET AL., CURR. OPIN. CHEM. BIOL., vol. 14, no. 4, 2010, pages 505 - 510
DAYON ET AL., ANAL. CHEM., vol. 80, no. 8, 2008, pages 2921 - 2931
DIGNAM ET AL., NUCLEIC ACIDS RES., vol. 11, no. 5, 1983, pages 1475 - 1489
ELIAS ET AL., NAT. METHODS, vol. 2, 2005, pages 667 - 675
GLICKMAN ET AL., J. BIOMOL. SCREEN., vol. 7, no. 1, 2002, pages 3 - 10
HABERLAND ET AL., NAT. REV. GENET., vol. 10, no. 1, 2009, pages 32 - 42
HAUSER ET AL., CURR. TOP. MED. CHEM., vol. 9, no. 3, 2009, pages 227 - 234
HOWARD R. PETTY: "Current Protocols in Cell Biology", 2003, JOHN WILEY & SONS, INC.
JUFVAS ET AL., PLOS ONE, vol. 6, no. 1, 2011, pages 5960
KALIN ET AL., CURR. OPIN. CHEM. BIOL., vol. 13, 2009, pages 1 - 9
KARWA; MITRA: "Sample Preparation Techniques in Analytical Chemistry", 2003, WILEY, article "Sample preparation for the extraction, isolation, and purification of Nucleic Acids"
KERSEY ET AL., PROTEOMICS, vol. 4, no. 7, 2004, pages 1985 - 1988
KHAN ET AL., BIOCHEM. J., vol. 409, no. 2, 2008, pages 581 - 9
KOCHER ET AL., J. PROTEOME RES., vol. 8, 2009, pages 4743 - 4752
KOUZARIDES, CELL, vol. 128, no. 4, 2007, pages 693 - 705
MANN ET AL., ANNUAL REVIEW OF BIOCHEMISTRY, vol. 70, 2001, pages 437 - 473
NATHAN ET AL., GENES DEV., vol. 20, no. 8, 2006, pages 966 - 976
OLSEN ET AL., MOL. CELL PROTEOMICS, vol. 4, 2005, pages 2010 - 2021
REMISZEWSKI ET AL., J. MED. CHEM., vol. 46, no. 21, 2003, pages 4609 - 4624
ROSS ET AL., MOL. CELL. PROTEOMICS, vol. 3, no. 12, 2004, pages 1154 - 1169
RUTHENBURG ET AL., NAT. REV. MOL. CELL BIOL., vol. 8, no. 12, 2007, pages 983 - 994
SAVITSKI ET AL., J. AM. SOC. MASS SPECTROM., vol. 21, no. 10, 2010, pages 1668 - 79
SCHULZE; MANN, J. BIOL. CHEM., vol. 279, no. 11, 2004, pages 10756 - 64
SHEVCHENKO ET AL., ANAL. CHEM., vol. 68, 1996, pages 850 - 858
SHEVCHENKO ET AL., ANALYTICAL CHEMISTRY, vol. 68, 1996, pages 850 - 858
STEWARD MELISSA M ET AL: "Molecular regulation of H3K4 trimethylation by ASH2L, a shared subunit of MLL complexes.", NATURE STRUCTURAL & MOLECULAR BIOLOGY, vol. 13, no. 9, September 2006 (2006-09-01), pages 852 - 854, XP002662146, ISSN: 1545-9993 *
SUBRAMANIAN A., IMMUNOAFFINITY CHROMATOGRAPHY, 2002
THOMPSON ET AL., ANAL. CHEM., vol. 75, no. 8, 2003, pages 1895 - 1904
VERMEULEN ET AL., CELL, vol. 142, no. 6, 2010, pages 967 - 980
VOIGT; REINBERG, CHEMBIOCHEM, vol. 12, no. 2, 2011, pages 236 - 252
W.E. BIDDISON: "Current Protocols in Cell Biology", 1998, JOHN WILEY & SONS, INC., article "Preparation and culture of human lymphocytes"
WONG, SHAN S.: "Chemistry of protein conjugation and cross-linking", 1991, CRC PRESS, INC., pages: 295 - 318
WU ET AL., J. PROTEOME RES., vol. 5, 2006, pages 651 - 658
WU; OLSON, J. CLIN. INVEST., vol. 109, no. 10, 2002, pages 1327 - 1333
WYSOCKA JOANNA ET AL: "WDR5 associates with histone H3 methylated at K4 and is essential for H3 K4 methylation and vertebrate development.", CELL, vol. 121, no. 6, 17 June 2005 (2005-06-17), pages 859 - 872, XP002662147, ISSN: 0092-8674 *
WYSOCKA, METHODS, vol. 40, no. 4, 2006, pages 339 - 343
ZHANG ET AL., PLOS ONE, vol. 5, no. 1, 2010, pages E8903
ZHANG QIANG ET AL: "Biochemical profiling of histone binding selectivity of the yeast bromodomain family.", PLOS ONE, vol. 5, no. 1, E8903, 2010, pages 1 - 10, XP002662145, ISSN: 1932-6203 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013127011A1 (en) 2012-02-27 2013-09-06 British Columbia Cancer Agency Branch Reprogramming effector protein interactions to correct epigenetic defects in cancer
US9552457B2 (en) 2012-02-27 2017-01-24 British Columbia Cancer Agency Branch Reprogramming effector protein interactions to correct epigenetic defects in cancer
WO2015108955A3 (en) * 2014-01-15 2015-11-12 The Board Of Regents Of The University Of Texas System Targeting of pelp1 in cancer therapy
US10682388B2 (en) 2014-01-15 2020-06-16 The Board Of Regents Of The University Of Texas System Targeting of PELP1 in cancer therapy
US10441644B2 (en) 2015-05-05 2019-10-15 The Regents Of The University Of California H3.3 CTL peptides and uses thereof
US10849965B2 (en) 2015-05-05 2020-12-01 The Regents Of The University Of California H3.3 CTL peptides and uses thereof
US11185577B2 (en) 2015-05-05 2021-11-30 The Regents Of The University Of California H3.3 CTL peptides and uses thereof
US11925679B2 (en) 2015-05-05 2024-03-12 The Regents Of The University Of California H3.3 CTL peptides and uses thereof
WO2017054832A1 (en) * 2015-10-02 2017-04-06 University Of Copenhagen Small molecules blocking histone reader domains
US10961289B2 (en) 2015-10-02 2021-03-30 The University Of Copenhagen Small molecules blocking histone reader domains
CN110222798A (en) * 2019-04-24 2019-09-10 昆明理工大学 One kind is based on the improved tail code sequence recognition methods again of the random recognition methods of tail code
CN110222798B (en) * 2019-04-24 2021-08-13 昆明理工大学 Tail code sequence re-identification method improved based on tail code random identification method

Similar Documents

Publication Publication Date Title
Peters et al. Partitioning and plasticity of repressive histone methylation states in mammalian chromatin
Scarpin et al. Parallel global profiling of plant TOR dynamics reveals a conserved role for LARP1 in translation
Wang et al. Identifying dynamic interactors of protein complexes by quantitative mass spectrometry
Rendleman et al. New insights into the cellular temporal response to proteostatic stress
Dhami et al. Dynamic methylation of Numb by Set8 regulates its binding to p53 and apoptosis
Bonenfant et al. Analysis of dynamic changes in post-translational modifications of human histones during cell cycle by mass spectrometry
Jung et al. Quantitative mass spectrometry of histones H3.2 and H3.3 in Suz12-deficient mouse embryonic stem cells reveals distinct, dynamic post-translational modifications at Lys-27 and Lys-36
Carlson et al. Emerging technologies to map the protein methylome
Chen et al. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases
EP2419739B1 (en) Method for quantifying modified peptides
Mehus et al. Quantitation of human metallothionein isoforms: a family of small, highly conserved, cysteine-rich proteins
Seth et al. Molecular portrait of high productivity in recombinant NS0 cells
Eriksson et al. Quantitative membrane proteomics applying narrow range peptide isoelectric focusing for studies of small cell lung cancer resistance mechanisms
DK2191016T3 (en) INSULATION OF PROTEIN FACTORS BINDING DIRECT OR INDIRECT WITH NUCLEIC ACIDS
Repetto et al. Exploring the nuclear proteome of Medicago truncatula at the switch towards seed filling
WO2012123119A1 (en) Methods for the identification and characterization of proteins interacting with histone tails and of compounds interacting with said proteins
Carrier et al. Phosphoproteome and transcriptome of RA-responsive and RA-resistant breast cancer cell lines
Jones et al. ABPP-HT-high-throughput activity-based profiling of deubiquitylating enzyme inhibitors in a cellular context
Lund et al. Quantitative analysis of global protein lysine methylation by mass spectrometry
Gouw et al. In vivo stable isotope labeling of fruit flies reveals post-transcriptional regulation in the maternal-to-zygotic transition
EP2464967B1 (en) Methods for the identification and characterization of hdac interacting compounds
Plazas-Mayorca et al. Quantitative proteomics reveals direct and indirect alterations in the histone code following methyltransferase knockdown
Ensinck et al. The yeast RNA methylation complex consists of conserved yet reconfigured components with m6A-dependent and independent roles
Yoshida et al. Detection of ubiquitination activity and identification of ubiquitinated substrates using TR-TUBE
Solovyeva et al. Integrative proteogenomics for differential expression and splicing variation in a DM1 mouse model

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 12709538

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 12709538

Country of ref document: EP

Kind code of ref document: A1