WO2018058023A2 - Methods for identifying ligand-binding proteins - Google Patents

Methods for identifying ligand-binding proteins

Info

Publication number
WO2018058023A2
Authority
WO
WIPO (PCT)
Prior art keywords
ligand
proteins
sample
protein
features
Prior art date
Application number
PCT/US2017/053207
Other languages
English (en)
Other versions
WO2018058023A3 (fr)
Inventor
Andrey Bondarenko
Nathan A. Yates
Steven James MULLETT
Harris B. BELL-TEMIN
Original Assignee
University Of Pittsburgh - Of The Commonwealth System Of Higher Education
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University Of Pittsburgh - Of The Commonwealth System Of Higher Education
Priority to US16/336,079 (patent US11567074B2)
Publication of WO2018058023A2
Publication of WO2018058023A3

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/566 Immunoassay; Biospecific binding assay; Materials therefor using specific carrier or receptor proteins as ligand binding reagents where possible specific carrier or receptor proteins are classified with their target compounds
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/94 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving narcotics or drugs or pharmaceuticals, neurotransmitters or associated receptors
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/88 Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86
    • G01N2030/8809 Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86 analysis specially adapted for the sample
    • G01N2030/8813 Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86 analysis specially adapted for the sample biological materials
    • G01N2030/8831 Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86 analysis specially adapted for the sample biological materials involving peptides or proteins

Definitions

  • This disclosure relates to methods for identifying proteins capable of binding a ligand.
  • FASTPP: fast parallel proteolysis
  • DARTS: drug affinity responsive target stability
  • CETSA: cellular thermal shift assay
  • identifying a protein capable of binding a ligand comprising: (a) contacting the ligand with two or more samples comprising a plurality of proteins in a solution; (b) differentiating the proteins bound to the ligand ("bound proteins") from the proteins that are not bound to the ligand ("unbound proteins") in each sample; (c) denaturing and digesting the bound proteins to form a plurality of peptides in each sample; (d) quantifying a plurality of molecular features contained in the plurality of peptides in each sample, wherein the molecular features are defined as having a mass to charge ratio, retention time, and peak intensity as measured by mass spectrometry; (e) ranking the molecular features that exhibit a statistically significant difference in quantity between the samples contacted with the ligand and a sample that is not contacted with the ligand ("statistically significant molecular feature"); (f) identifying one or more amino acid sequences of the
  • step (a) comprises solubilizing the proteins using a surfactant, a detergent, or any combination thereof.
  • the detergent comprises octylglucyl pyranoside or dodecyl maltoside.
  • step (b) comprises heating each sample to a temperature such that the solubility of the bound protein is different in the sample contacted with the ligand than the solubility of that same protein in a sample not contacted with the ligand.
  • each sample is heated to a temperature of from about 40 °C to about 65 °C.
  • each sample is heated from about 48 °C to about 56 °C.
  • each sample is heated to a temperature of about 56 °C.
  • step (b) comprises titrating each sample with a solution to lower the dielectric constant such that the solubility of the bound protein is different in the sample contacted with the ligand than the solubility of that same protein in a sample not contacted with the ligand.
  • each sample is titrated with acetone or methanol.
  • the plurality of peptides of step (c) are analyzed using nano-scale liquid chromatographic tandem mass spectrometry prior to step (d).
  • step (e) comprises using differential mass spectrometry.
  • the methods further comprise assigning the molecular features to an isotope group characterized by a chemical formula and an isotope distribution.
  • ranking the statistically significant molecular features comprises statistical and practical filtering.
  • the statistically significant molecular features that are determined to still be significant based upon the statistical and practical filtering are highly ranked.
  • the statistical filtering comprises t-tests.
  • the practical filtering comprises excluding any statistically significant molecular features that are not present in at least two-thirds of the samples contacted with the ligand.
  • the practical filtering comprises excluding any statistically significant molecular features that were the only significant features in a single isotope group with a p value of less than about 0.01 based on the statistical filtering.
  • step (e) comprises using the CHORUS web application for storing, sharing, visualizing, and analyzing spectrometry files.
  • step (g) comprises comparing the amino acid sequences of the statistically significant molecular features with a protein database and identifying which proteins of the protein database contain the statistically significant molecular features.
  • FIG. 1 is a schematic illustration of the differential intensity screening and ranking of unknown protein targets (DISRUPT) workflow.
  • Samples of equal protein amount and concentration, in groups of n > 6, are treated with drug and vehicle.
  • Drug binding stabilizes proteins in solution.
  • samples are heated.
  • Drug stabilized proteins are less likely to denature and agglutinate due to heating than vehicle treated proteins, a difference that can be observed following centrifugation.
  • the remaining soluble proteins are digested to peptides, leaving a greater number of peptides from heat stabilized proteins in solution.
  • Samples are run serially on a nano liquid chromatography high-resolution mass spectrometer, resulting in chromatograms of individual isotopic peptides, or "features."
  • Differential mass spectrometry tools quantify these features and align them across all mass spectrometer runs, allowing for quantification of drug binding effect across files. Applying statistical and practical filters, hundreds of thousands of features can be sorted and ranked, resulting in a highly confident list of prospective drug-binding proteins.
  • FIG. 2 is a graphical illustration of the denaturation curves of all proteins in a K562 lysate. Lysates were heated to various temperatures across the denaturation curve. The reduction of intensity was calculated at the feature level for all identified peptides, and the number of peptides whose protein intensities decreased by 50-90% were calculated. The number of peptides in this range was calculated for peptides that were in this range at multiple temperatures and at a single temperature, and the data was plotted. It was determined that a temperature of 56 °C would have the greatest information content for the largest subset of proteins.
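  • As a non-limiting illustration, the temperature selection described for FIG. 2 can be sketched as counting, at each temperature, the features whose intensity fell by 50-90% relative to an unheated baseline, and then choosing the temperature with the highest count. The function name, the inclusive bounds, and the intensity values below are hypothetical and are not data from the disclosure.

```python
from typing import Dict, List


def informative_counts(heated: Dict[float, List[float]],
                       baseline: List[float],
                       low: float = 0.50,
                       high: float = 0.90) -> Dict[float, int]:
    """Count, per temperature, the features whose intensity dropped by 50-90%."""
    counts = {}
    for temperature, intensities in heated.items():
        n = 0
        for value, reference in zip(intensities, baseline):
            if reference <= 0:
                continue  # no baseline signal, so no reduction can be computed
            reduction = 1.0 - value / reference
            if low <= reduction <= high:
                n += 1
        counts[temperature] = n
    return counts


# Hypothetical intensities for four features (assumed values for illustration).
baseline = [100.0, 200.0, 50.0, 80.0]
heated = {
    48.0: [95.0, 180.0, 20.0, 78.0],
    56.0: [40.0, 60.0, 10.0, 70.0],
    64.0: [5.0, 10.0, 1.0, 30.0],
}
counts = informative_counts(heated, baseline)
best_temperature = max(counts, key=counts.get)
```

  • In this toy data set the 56 °C condition places the most features in the informative range; with real data the outcome depends entirely on the measured denaturation curves.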
  • FIG. 3 is a schematic illustration of the major data types, computational steps, and cloud-computing platform that facilitates the label-free differential mass spectrometry (dMS) data analysis.
  • the differential mass spectrometry data analysis workflow has been integrated into a single cloud-based platform that supports the efficient and scalable analysis of high-resolution LC-MS data.
  • FIG. 3 shows illustrations horizontally from right to left of the data types (top) that are transformed by computational services (middle) that are executed on distributed CPUs, and the resulting information (bottom) that is stored by the system.
  • a publicly accessible instance of the dMS platform is available at www.chorusproject.org along with the data and results reported in this manuscript.
  • FIG. 4 is a table comparing the CHORUS quantification precision and manual quantification methods. 20 features across a wide intensity range were blindly selected in decreasing steps of feature intensity across 14 pooled samples run non-consecutively.
  • FIGS. 5A-C are graphical illustrations of the DISRUPT analysis of staurosporine treated K562 cells compared to the control.
  • FIG. 5B is a volcano plot showing 140 features that meet the following selection criteria: n>2 significant features per isotope group where p < 0.01.
  • FIG. 5C depicts box and whiskers plots showing the minimum and maximum intensity as well as the 25th and 75th percentile of intensities across all replicates of features of greatest significance and the corresponding peptide amino acid sequence that was identified by tandem mass spectrometry.
  • FIG. 5C shows data for hundreds of thousands of high-resolution MS features and the precise selection and ranking of a small number of highly significant features that correspond to proteins that are known to bind staurosporine.
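  • As a non-limiting illustration, the coordinates of a single feature on a volcano plot can be sketched as a log2 fold change on the horizontal axis and a negative log10 p value on the vertical axis. The disclosure does not state how fold change is computed; the ratio of group means used here, the function name, and the sample intensities are assumptions.

```python
import math
from statistics import mean


def volcano_point(treated, control, p_value):
    """Return (log2 fold change, -log10 p) for one molecular feature."""
    fold_change = mean(treated) / mean(control)  # assumed definition of fold change
    return math.log2(fold_change), -math.log10(p_value)


# Hypothetical intensities for one feature in treated and vehicle samples.
x, y = volcano_point([400.0, 420.0, 380.0], [100.0, 90.0, 110.0], 0.0001)
meets_p_cutoff = y > -math.log10(0.01)  # equivalent to p < 0.01
```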
  • FIGS. 6A-G represent annotated mass spectra for identified features with p < 0.0001 from the lysates treated with staurosporine.
  • FIGS. 7A-C are graphical illustrations of a DISRUPT analysis of mdivi-1 treated A2780cis immortalized cancer cells.
  • FIG. 7A is a volcano plot showing 182,171 features quantified in 12 samples using identical selection criteria as in FIG. 3.
  • FIG. 7B shows the data reduction as described in FIG. 3 that excludes all but 218 features with p < 0.01. Eight of nine features found to have a significance of p < 0.0001 and a positive fold change were identified as peptides with amino acid sequences GPSEAPSGQA,
  • FIG. 7C shows box plots of numbered features that were identified as DPP3 peptides with p < 0.01, showing significantly increased protein intensity in samples treated with mdivi-1. Data points marked with an asterisk were additional isotope group members for the identified peptides.
  • FIGS. 8A-G represent annotated mass spectra for identified features with p < 0.0001 from the cancer cells treated with mdivi-1.
  • FIGS. 9A-E represent the biochemical validation that mdivi-1 binds to, and inhibits, the function of DPP3.
  • FIG. 9A is a western blotting analysis of DPP3 with and without mdivi-1 as a function of temperature showing a significant increase in signal with
  • FIG. 9B represents thermal shift curves over a wide temperature range from Western blotting analysis for DPP3, β-actin, and DRP1; a strong shift was seen for DPP3, and no shifts were observed for β-actin or DRP1.
  • FIG. 9C shows the structure of mdivi-1, a thioxodihydroquinazolinone compound that is a prospective treatment for cisplatin resistant cancers.
  • FIG. 9D shows the control data for fluorescent activity assay showing alteration of DPP3 function with mdivi-1 treatment, thus establishing that mdivi-1 absent DPP3 caused no significant change in fluorescence level of peptide substrate.
  • the addition of mdivi-1 10 minutes post reaction showed no significant difference from DPP3 with substrate but without mdivi-1.
  • FIG. 9E shows DPP3 activity was measured using a fluorescent polypeptide; the N-terminal fluorophore is cleaved by DPP3, reducing the fluorescence in the sample.
  • the assay demonstrates that mdivi-1 has an IC50 of 70 nM against DPP3 activity.
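  • As a non-limiting illustration, an IC50 can be estimated from a dose-response series by interpolating on a log10 concentration scale between the two points that bracket 50% activity. The disclosure does not state how its IC50 was derived; this interpolation is a simple stand-in for a full curve fit, and the concentrations and activities below are hypothetical and are not intended to reproduce the reported value.

```python
import math


def ic50_by_interpolation(concentrations, activities):
    """Estimate IC50 from points sorted by increasing concentration.

    Activities are percentages of the uninhibited control. Returns None
    when no adjacent pair of points brackets 50% activity.
    """
    points = list(zip(concentrations, activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            fraction = (a_lo - 50.0) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose-response series (concentrations in nM, activity in %).
estimate_nm = ic50_by_interpolation([1.0, 10.0, 100.0, 1000.0],
                                    [95.0, 80.0, 40.0, 10.0])
```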
  • FIG. 10 is a graphical illustration of a cross validation demonstration of ranking improvement through greater sample number.
  • FIG. 11 is a schematic showing steps involved in the CHORUS cloud computing analytical workflow.
  • DISRUPT: differential intensity screening and ranking of unknown protein targets
  • DISRUPT ranks hundreds of thousands of high-resolution mass spectrometry features by statistical significance. This methodology was validated by profiling the binding of staurosporine, a non-specific protein kinase inhibitor, to cytosolic proteins and prioritizing a ranked list of statistically significant signals that were subsequently identified as canonical interactions. Mdivi-1, a putative treatment for ovarian tumors that have shown resistance to cisplatin, with no known molecular target or mode of action, was then studied [22-24]. DISRUPT ranked dipeptidyl peptidase 3 (DPP3) as the top putative target of mdivi-1, and this finding was subsequently validated by orthogonal quantitative and functional assays.
  • DPP3: dipeptidyl peptidase 3
  • identifying a protein capable of binding a ligand comprising: (a) contacting the ligand with two or more samples comprising a plurality of proteins in a solution; (b) differentiating the proteins bound to the ligand ("bound proteins") from the proteins that are not bound to the ligand ("unbound proteins") in each sample; (c) denaturing and digesting the bound proteins to form a plurality of peptides in each sample; (d) quantifying a plurality of molecular features contained in the plurality of peptides in each sample, wherein the molecular features are defined as having a mass to charge ratio, retention time, and intensity as measured by mass spectrometry; (e) ranking the molecular features that exhibit a statistically significant difference in quantity between the samples contacted with the ligand and a sample that is not contacted with the ligand ("statistically significant molecular feature"); (f) identifying one or more amino acid sequences of the statistical
  • the methods provided herein can take advantage of ligand/protein stabilization by disordering the system through heat or chemical denaturation.
  • a ligand-bound protein is a stable protein, so it can withstand more heat or chemical denaturation prior to agglutination/precipitation.
  • Methods of using heat or chemical denaturation to alter the stability of proteins are well-known in the art. For example, as disclosed in U.S. Patent Application No. 2015/0133336, which is hereby incorporated by reference in its entirety, a critical feature of CETSA methodology is the use of heat to alter the stability of the protein. As noted above, chemical denaturation can also be used to change the stability of the protein.
  • the methods provided herein can rely on the known changes in conformational stability that accompany ligand binding to membrane proteins during detergent solubilization. A different proportion of a protein will solubilize in the same amount of detergent depending on whether or not it is bound to a ligand. This difference can be utilized to identify which proteins bind the ligand.
  • the methods provided herein allow for an analysis of normally insoluble proteins to determine whether the insoluble proteins are targets of a ligand by solubilizing the normally insoluble proteins with the use of a surfactant, detergent, or any combination thereof.
  • Suitable, non-limiting examples of surfactants that can be used in the methods provided herein include those that are well-known in the art.
  • suitable surfactants are those that maintain native states, that do not interfere with protein ligand binding, that are mass spectrometry compatible, and that do not interfere with protein/protein binding in existing complexes.
  • Suitable detergents that can be used in the methods provided herein include detergents that maintain native states, that do not interfere with protein ligand binding, that are mass spectrometry compatible, and that do not interfere with protein/protein binding in existing complexes.
  • suitable detergents include octylglucyl pyranoside, dodecyl maltoside, and the like.
  • the detergent comprises octylglucyl pyranoside.
  • the detergent comprises dodecyl maltoside.
  • the detergent comprises a combination of octylglucyl pyranoside and dodecyl maltoside.
  • differentiating the bound proteins from the unbound proteins comprises heating each sample to a temperature such that the solubility of the bound protein is different in the sample contacted with the ligand than the solubility of that same protein in a sample not contacted with the ligand.
  • each sample is heated to a temperature of from about 40 °C to about 65 °C, or any amount in-between these two values, such as but not limited to about 41 °C, about 42 °C, about 43 °C, about 44 °C, about 45 °C, about 46 °C, about 47 °C, about 48 °C, about 49 °C, about 50 °C, about 51 °C, about 52 °C, about 53 °C, about 54 °C, about 55 °C, about 56 °C, about 57 °C, about 58 °C, about 59 °C, about 60 °C, about 61 °C, about 62 °C, about 63 °C, about 64 °C, or about 65 °C.
  • each sample is heated from about 48 °C to about 56 °C.
  • each sample is heated to a temperature of about 56 °C.
  • differentiating the bound proteins from the unbound proteins comprises titrating each sample with a solution to lower the dielectric constant such that the solubility of the bound protein is different in the sample contacted with the ligand than the solubility of that same protein in a sample not contacted with the ligand.
  • the sample is titrated with acetone. In some embodiments, the sample is titrated with methanol. In some embodiments, the sample is titrated with a combination of acetone and methanol.
  • Other non-limiting examples of titrating solutions to lower the dielectric constant of the unbound proteins are well known in the art.
  • Non-limiting examples of such methods include thermofluor binding to hydrophobic regions exposed during the melting of a protein, which are exposed at differential rates in ligand bound and non-ligand bound proteins; thiol specific dyes binding to cysteine exposed during the melting of a protein; and employing ligands tagged for enrichment or visualization, such as 5-his, glutathione-s-transferase, or fluorescent tag and bead immobilized ligands.
  • the bound proteins and unbound proteins in each sample are differentiated by solubilizing or maintaining the solubility of the bound proteins while the unbound proteins remain non-solubilized or precipitate out of solution.
  • the non-solubilized, unbound proteins are removed from each sample by methods well-known in the art.
  • the non-solubilized unbound proteins are removed from the sample using centrifugation. Centrifugation is a well-known technique in the art to remove any precipitates from a solution.
  • the non-solubilized unbound proteins are removed from the sample using filtration.
  • Filtration involves filtering away the non-solubilized unbound proteins from the solubilized bound proteins.
  • the non-solubilized unbound proteins are removed from the sample using both centrifugation and filtration, including centrifugation followed by filtration or filtration followed by centrifugation.
  • the bound proteins are denatured and digested to form a plurality of peptides, which may be quantified through well-known methods of proteomic analysis.
  • the peptides of the plurality may be analyzed using mass spectrometry techniques.
  • Mass spectrometry techniques for identifying peptides and/or proteins are well known in the art.
  • U.S. Patent No. 6,906,320, "Mass Spectrometry Data Analysis Techniques," incorporated herein by reference in its entirety, discloses methods for identification of peptides and/or proteins using mass spectrometry.
  • the mass spectrometry techniques may be used to identify molecular features contained in the plurality of peptides. In some embodiments, these molecular features may be defined by a mass-to-charge ratio, retention time, and peak intensity.
  • the mass spectrometry techniques comprise nano-scale liquid chromatographic tandem mass spectrometry.
  • the peptides are separated by reversed-phase liquid chromatography, ionized by electrospray ionization, and analyzed with a hybrid ion trap/high-resolution Fourier transform mass spectrometer or a high-resolution quadrupole time-of-flight (QTOF) instrument.
  • QTOF: quadrupole time-of-flight instrument
  • high resolution mass spectra of a peptide may be measured at a frequency of approximately 1 Hz.
  • the mass spectra of each peptide may be used to categorize the plurality of peptides into distinct molecular features defined by a mass-to-charge ratio, retention time, and peak intensity.
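  • As a non-limiting illustration, a molecular feature defined by a mass-to-charge ratio, retention time, and peak intensity can be sketched as a small record built from one ion's chromatographic trace. Taking the retention time and intensity at the trace apex is an assumption, as are the class name, the function name, and the trace values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MolecularFeature:
    mz: float              # mass-to-charge ratio
    retention_time: float  # seconds, taken at the chromatographic apex (assumed)
    peak_intensity: float  # intensity at the apex (assumed)


def feature_from_trace(mz: float, trace: List[Tuple[float, float]]) -> MolecularFeature:
    """Summarize a list of (retention_time, intensity) points as one feature."""
    apex_time, apex_intensity = max(trace, key=lambda point: point[1])
    return MolecularFeature(mz, apex_time, apex_intensity)


# Hypothetical trace sampled about once per second, as with ~1 Hz spectra.
trace = [(1200.0, 1.0e4), (1201.0, 5.0e4), (1202.0, 9.0e4), (1203.0, 4.0e4)]
feature = feature_from_trace(523.2841, trace)
```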
  • molecular features that exhibit a statistically significant difference in quantity between the samples contacted with the ligand and a sample that is not contacted with the ligand are ranked.
  • the bound proteins undergo a solubility change such that there is a difference in the concentration of the proteins that bind to the ligand in the samples contacted with the ligand as compared to the concentration of the same proteins in the samples not contacted with the ligand.
  • the proteins that exhibit a difference in concentration between the samples contacted with the ligand and the samples not contacted with the ligand are the proteins that bind to the ligand.
  • the concentration of the proteins that bind to the ligand is higher in the samples contacted with the ligand than in the samples not contacted with the ligand.
  • the concentration of the proteins that bind to the ligand is lower in the samples contacted with the ligand than in the samples not contacted with the ligand.
  • the ranking comprises the use of differential mass spectrometry.
  • the spectrometry RAW files from the mass spectrometer are supplied to an analysis unit.
  • the analysis unit can be a component of the mass spectrometer or can be a stand-alone device that electronically receives the RAW files.
  • the analysis unit is a cloud-based system that can interface with other local or cloud-based systems.
  • the ranking comprises using the CHORUS web application for storing, sharing, visualizing, and analyzing spectrometry files.
  • CHORUS is available online at www.chorusproject.org.
  • the analysis unit can include a filtering unit.
  • the filtering unit can remove noise and background signals from the RAW files.
  • the analysis unit can also include an image processing unit.
  • the image processing unit can identify the peaks within the filtered spectrometry files.
  • the ranking by the analysis unit can also include assigning the molecular features to an isotope group characterized by a chemical formula and an isotope distribution.
  • the analysis unit can also align the spectrometry files by retention time and accurate mass, at the level of individual features within isotope envelopes, using a proprietary alignment algorithm. The aligned data can then be searched against a human reference protein database (Uniprot human ref 20150303.fasta, via Comet) using a 10 parts per million peptide precursor mass tolerance.
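  • As a non-limiting illustration, a parts-per-million precursor mass tolerance can be sketched as the relative difference between an observed and a theoretical mass-to-charge ratio, scaled by one million. This is only the arithmetic of the tolerance check, not the Comet search itself; the function names and example values are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tolerance_ppm: float = 10.0) -> bool:
    """True when the absolute mass error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


# Hypothetical precursor values: roughly 8 ppm and 12 ppm away from 500.0.
accepted = within_tolerance(500.004, 500.0)
rejected = within_tolerance(500.006, 500.0)
```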
  • the analysis unit is configured to rank the statistically significant molecular features using statistical and practical filtering.
  • the statistical filtering can include t-tests. Other methods of statistical filtering are well-known in the art.
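  • As a non-limiting illustration, a two-sample t-test on the intensities of one feature can be sketched with the standard library alone. The disclosure does not specify which t-test variant is used; Welch's unequal-variance form is assumed here, the p value is obtained by numerically integrating the t density, and the intensities are hypothetical. The numerical integration loses accuracy for extremely small p values, so a statistics library would normally be preferred in practice.

```python
import math
from statistics import mean, variance


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution (math.gamma limits df to a few hundred)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: float, steps: int = 2000) -> float:
    """Two-sided p value using Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def welch_t_test(a, b):
    """Return (t, degrees of freedom, two-sided p); assumes nonzero variance."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_sided_p(t, df)


# Hypothetical feature intensities in ligand-treated and vehicle samples.
t, df, p = welch_t_test([100.0, 102.0, 98.0, 101.0, 99.0, 100.0],
                        [50.0, 52.0, 48.0, 51.0, 49.0, 50.0])
```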
  • the practical filtering conducted by the analysis unit can include excluding any statistically significant molecular features that are not present in at least two-thirds of the samples contacted with the ligand. In some embodiments, the practical filtering comprises excluding any statistically significant molecular features that were the only significant features in a single isotope group with a p value of less than about 0.01 based on the statistical filtering.
  • the statistically significant molecular features that are determined to still be significant based upon the statistical and practical filtering are highly ranked by the analysis unit.
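  • As a non-limiting illustration, the practical filtering and ranking described above can be sketched as follows: keep features with p below 0.01, drop those detected in fewer than two-thirds of the ligand-treated samples, drop those that are the only significant feature in their isotope group, and sort the remainder by p value. Counting group members before the presence filter is one interpretation of the text; the class name, field names, and example values are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredFeature:
    feature_id: str
    isotope_group: str
    p_value: float
    treated_intensities: List[float]  # one per treated sample; 0.0 = not detected


def rank_features(features: List[ScoredFeature], alpha: float = 0.01) -> List[ScoredFeature]:
    """Apply the statistical cutoff and both practical filters, then rank by p."""
    significant = [f for f in features if f.p_value < alpha]
    group_sizes = Counter(f.isotope_group for f in significant)
    kept = []
    for f in significant:
        detected = sum(1 for value in f.treated_intensities if value > 0.0)
        in_two_thirds = 3 * detected >= 2 * len(f.treated_intensities)
        has_group_support = group_sizes[f.isotope_group] >= 2
        if in_two_thirds and has_group_support:
            kept.append(f)
    return sorted(kept, key=lambda f: f.p_value)


# Hypothetical scored features from six ligand-treated samples.
features = [
    ScoredFeature("f1", "A", 0.00005, [9.0, 8.0, 9.5, 8.7, 9.1, 8.9]),
    ScoredFeature("f2", "A", 0.002, [4.0, 0.0, 4.2, 3.9, 4.1, 4.3]),
    ScoredFeature("f3", "B", 0.001, [7.0, 7.1, 6.9, 7.2, 7.0, 6.8]),
    ScoredFeature("f4", "B", 0.2, [3.0, 3.1, 2.9, 3.2, 3.0, 2.8]),
    ScoredFeature("f5", "C", 0.003, [5.0, 0.0, 0.0, 5.1, 0.0, 4.9]),
    ScoredFeature("f6", "C", 0.004, [6.0, 6.1, 5.9, 6.2, 6.0, 5.8]),
]
ranked = rank_features(features)
```

  • Here f3 is excluded as the sole significant member of its isotope group and f5 is excluded for being detected in only half of the treated samples.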
  • a significance threshold can be set.
  • the analysis unit can compare the molecular features to the significance threshold to determine whether the molecular features are significant following the statistical and practical filtering.
  • the amino acid sequences of the statistically significant molecular features are identified using processes well-known in the art. Such processes include, but are not limited to, peptide mass fingerprinting and data-dependent MS/MS acquisition followed by a computational search against an in silico digest of the proteome. In some embodiments, the process includes data-dependent MS/MS acquisition followed by a computational search against an in silico digest of the proteome.
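  • As a non-limiting illustration, an in silico digest of a protein sequence can be sketched with a regular expression. The disclosure does not name the protease; trypsin-like cleavage (after K or R, but not before P) is assumed here, missed cleavages are not modeled, and the sequence and minimum length are hypothetical.

```python
import re
from typing import List


def tryptic_digest(sequence: str, min_length: int = 6) -> List[str]:
    """Split after K or R unless the next residue is P; drop short peptides."""
    peptides = re.split(r"(?<=[KR])(?!P)", sequence)
    return [peptide for peptide in peptides if len(peptide) >= min_length]


# Hypothetical protein sequence, not taken from any database.
peptides = tryptic_digest("MAGSEKLLDPTRPAAVNYKSW")
```

  • The resulting peptide list is what a database search would compare against the measured features, typically after computing theoretical masses.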
  • the statistically significant molecular features may be a part of a protein capable of binding the ligand.
  • identifying the protein that is capable of binding the ligand comprises comparing the amino acid sequences of the statistically significant molecular features with a protein database and identifying which proteins of the protein database contain the statistically significant molecular features.
  • the statistically significant molecular features that may be a part of a protein capable of binding the ligand include the highly ranked statistically significant molecular features.
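  • As a non-limiting illustration, comparing identified peptide sequences with a protein database can be sketched as a substring search that reports which database proteins contain each peptide. The protein names, sequences, and peptides below are hypothetical; a production search would use an indexed database and handle peptides shared among several proteins.

```python
from typing import Dict, List


def proteins_containing(peptides: List[str],
                        database: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each protein name to the identified peptides found in its sequence."""
    hits = {}
    for name, sequence in database.items():
        matched = [peptide for peptide in peptides if peptide in sequence]
        if matched:
            hits[name] = matched
    return hits


# Hypothetical two-protein database and three identified peptides.
database = {
    "protein_A": "MAGSEKLLDPTRPAAVNYKSW",
    "protein_B": "MTTQLLDKGGSAAVR",
}
hits = proteins_containing(["MAGSEK", "LLDPTRPAAVNYK", "GGSAAVR"], database)
candidates = sorted(hits, key=lambda name: len(hits[name]), reverse=True)
```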
  • the similarity between amino acid sequences is expressed in terms of sequence identity. Sequence identity is frequently measured in terms of percentage identity (or similarity or homology); the higher the percentage, the more similar the two sequences are. Homologs or variants of a polypeptide will possess a relatively high degree of sequence identity when aligned using standard methods.
  • The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al., J. Mol. Biol. 215:403, 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, Bethesda, MD) and on the internet, for use in connection with the sequence analysis programs blastp, blastn, blastx, tblastn and tblastx.
  • Homologs and variants of a protein are typically characterized by possession of at least about 75%, for example at least about 80%, about 90%, about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity counted over the full-length alignment with the amino acid sequence of the antibody using the NCBI Blast 2.0, gapped blastp set to default parameters.
  • the Blast 2 sequences function is employed using the default BLOSUM62 matrix set to default parameters (gap existence cost of 11, and a per residue gap cost of 1).
  • the alignment should be performed using the Blast 2 sequences function, employing the PAM30 matrix set to default parameters (open gap 9, extension gap 1 penalties). Proteins with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% sequence identity. When less than the entire sequence is being compared for sequence identity, homologs and variants will typically possess at least 80% sequence identity over short windows of 10-20 amino acids, and may possess sequence identities of at least 85%, at least 90%, or at least 95% depending on their similarity to the reference sequence.
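  • As a non-limiting illustration, percentage identity over an existing pairwise alignment can be sketched as the number of identical columns divided by the alignment length. This does not perform the alignment and is not a substitute for BLAST; using the alignment length (gaps included) as the denominator is an assumption, and the sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]  # ignore gap-only columns
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical alignments: one substitution in ten columns, one gap in six.
identity_substitution = percent_identity("ACDEFGHIKL", "ACDEYGHIKL")
identity_with_gap = percent_identity("ACD-FG", "ACDEFG")
```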
  • a computer may have one or more input and output devices. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output and speakers or other sound generating devices for audible
  • Examples of input devices that can be used for a user interface include keyboards, and pointing devices, such as mice, touch pads, and digitizing tablets.
  • a computer may receive input information through speech recognition or in other audible format.
  • Such computers may be interconnected by one or more networks in any suitable form, including a local area network or a wide area network, such as an enterprise network, an intelligent network (IN) or the Internet.
  • networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.
  • a computer employed to implement at least a portion of the functionality described herein may comprise a memory, one or more processing units (also referred to herein simply as "processors"), one or more communication interfaces, one or more display units, and one or more user input devices.
  • the memory may comprise any computer-readable media, and may store computer instructions (also referred to herein as "processor-executable
  • the processing unit(s) may be used to execute the instructions.
  • the communication interface(s) may be coupled to a wired or wireless network, bus, or other communication means and may therefore allow the computer to transmit communications to and/or receive communications from other devices.
  • the display unit(s) may be provided, for example, to allow a user to view various information in connection with execution of the instructions.
  • the user input device(s) may be provided, for example, to allow the user to make manual adjustments, make selections, enter data or various other information, and/or interact in any of a variety of manners with the processor during execution of the instructions.
  • the various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine.
  • inventive concepts may be embodied as a computer readable storage medium (or multiple computer readable storage media) (e.g., a computer memory, one or more floppy discs, compact discs, optical discs, magnetic tapes, flash memories, circuit configurations in Field Programmable Gate Arrays or other semiconductor devices, or other non-transitory medium or tangible computer storage medium) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement the various embodiments of the invention discussed above.
  • the computer readable medium or media can be transportable, such that the program or programs stored thereon can be loaded onto one or more different computers or other processors to implement various aspects of the present invention as discussed above.
  • the terms "program" or "software" are used herein in a generic sense to refer to any type of computer code or set of computer-executable instructions that can be employed to program a computer or other processor to implement various aspects of embodiments as discussed above. Additionally, it should be appreciated that according to one aspect, one or more computer programs that when executed perform methods of the present invention need not reside on a single computer or processor, but may be distributed in a modular fashion amongst a number of different computers or processors to implement various aspects of the present invention.
  • Computer-executable instructions may be in many forms, such as program modules, executed by one or more computers or other devices.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • functionality of the program modules may be combined or distributed as desired in various embodiments.
  • data structures may be stored in computer-readable media in any suitable form.
  • data structures may be shown to have fields that are related through location in the data structure. Such relationships may likewise be achieved by assigning storage for the fields with locations in a computer-readable medium that conveys relationship between the fields.
  • any suitable mechanism may be used to establish a relationship between information in fields of a data structure, including through the use of pointers, tags or other mechanisms that establish relationship between data elements.
  • inventive concepts may be embodied as one or more methods, of which an example has been provided.
  • the acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.
  • an "isolated" biological component such as a nucleic acid, peptide or protein
  • nucleic acids, peptides and proteins which have been “isolated” thus include nucleic acids and proteins purified by standard purification methods.
  • the term also embraces nucleic acids, peptides and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids.
  • An isolated cell type has been substantially separated from other cell types, such as a different cell type that occurs in an organ.
  • a purified cell or component can be at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% pure.
  • "protein" (also equivalent to a "polypeptide") refers to a polymer in which the monomers are amino acid residues that are joined together through amide bonds. When the amino acids are alpha-amino acids, either the L-optical isomer or the D-optical isomer can be used, the L-isomers being preferred.
  • the terms "polypeptide" or "protein" as used herein are intended to encompass any amino acid sequence and include modified sequences such as glycoproteins.
  • the term "polypeptide" or "protein" is specifically intended to cover naturally occurring proteins, as well as those that are recombinantly or synthetically produced.
  • "soluble" refers to a form of a protein or polypeptide that is not inserted into a cell membrane or a form of a protein or polypeptide that remains in solution. In some embodiments, a form of a protein or polypeptide that remains in solution has not precipitated out of the solution.
  • ligand refers to a small molecule that binds to a larger molecule.
  • a ligand includes a small molecule pharmaceutical drug.
  • a statistically significant molecular feature refers to a molecular feature that exhibits a statistically significant difference in quantity between the samples contacted with a ligand and a sample that is not contacted with a ligand.
  • a statistically significant molecular feature is a molecular feature whose increase in quantity in the samples contacted with the ligand compared to the sample not contacted with the ligand is caused by something other than random chance.
  • a statistically significant molecular feature is a molecular feature whose decrease in quantity in the samples contacted with the ligand compared to the sample not contacted with the ligand is caused by something other than random chance.
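The definition above can be made concrete with a small sketch that scores each feature by comparing ligand-treated replicates with untreated replicates and then orders the features by that score. The excerpts here do not name a specific statistical test, so the use of Welch's t statistic is an assumption, as are the function names and the example intensities. No p-values or multiple-testing correction are computed in this sketch.

```python
import math
from statistics import mean, variance


def welch_t(treated: list[float], control: list[float]) -> float:
    """Welch's t statistic for two independent groups with unequal variances."""
    se = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    # Degenerate case (no variance in either group): treated here as uninformative.
    return (mean(treated) - mean(control)) / se if se > 0 else 0.0


def rank_features(features: dict[str, tuple[list[float], list[float]]]) -> list[tuple[str, float]]:
    """Order features by the magnitude of their t statistic, most extreme first."""
    scored = [(name, welch_t(t, c)) for name, (t, c) in features.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical intensities: (ligand-treated replicates, untreated replicates).
features = {
    "feature_A": ([10.1, 10.3, 9.9, 10.2], [10.0, 10.2, 10.1, 9.8]),
    "feature_B": ([15.2, 15.6, 15.4, 15.1], [10.1, 10.4, 9.9, 10.2]),
    "feature_C": ([7.0, 7.3, 6.8, 7.1], [9.0, 9.2, 8.9, 9.1]),
}
ranked = rank_features(features)
```

Ranking on the absolute value keeps both increases (positive t) and decreases (negative t) in quantity near the top of the list, matching the two cases in the definition.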
  • CHORUS refers to a web application available online at
  • the present technology is further illustrated by the following examples, which should not be construed as limiting in any way.
  • the examples described herein exemplify the use of DISRUPT to identify protein targets of staurosporine and mdivi-1, an investigational compound with unknown oncologic activity.
  • the examples described herein demonstrate that top ranked features associate with canonical targets for staurosporine and DPP3 for mdivi-1.
  • FIG. 3 outlines the major computational steps in the dMS workflow.
  • One or more computational steps of the dMS workflow can be performed from the above described analysis unit.
  • a key attribute of the DISRUPT platform is the analysis of large volumes of nLC-MS data, without common data reduction steps that limit analyses to identified peptides or isotopic labels.
  • An image processing service removed noise and performed peak detection to produce a list of features that are defined by an accurate mass-to-charge ratio, retention time, and relative intensity.
  • the isotope grouping (IG) service assigned features to isotope groups that share a common chemical formula and isotope distribution. The merge-across-files service aligns and extracts expression data for each feature from all of the samples that were analyzed in an experiment.
  • MS/MS spectra were matched to features based on retention time and accurate m/z, and searched against a reference protein database to yield peptide and protein identification assignment.
  • the cloud-based implementation of dMS provided the massive scalability that supports the analysis of large data cubes containing tens to hundreds of samples.
  • coefficients of variation (CV)
  • FIGS. 6A-G are identical to FIGS. 6A-G.
  • Mdivi-1 is a small-molecule member of a class of thioxodihydroquinazolinones that exhibits robust killing of a wide range of tumor cells, including cisplatin-resistant ovarian cancer cells from patients who are refractory to cisplatin treatment [22-24]. Mdivi-1 has a molecular weight of 353.22 and has the following chemical structure:
  • Equal aliquots of A2780cis cells were treated with 20 µM mdivi-1 and DMSO, respectively, split into six technical replicates per condition, and heated to 56 °C for ten minutes to stimulate differential agglutination of drug-bound and unbound proteins. All samples were processed as described in the staurosporine experiment, and CHORUS was used to create and export a data cube that contained 522,600 aligned features in 218,388 isotope groups. Data-dependent MS/MS data were analyzed, resulting in the identification of 35,369 features linked to 5488 unique peptide sequences that were associated with 947 proteins.
  • VLLEAGEGLVTITPTTGSDGRPDAR deriving from dipeptidyl peptidase 3, DPP3, using MS/MS spectra acquired during the initial dMS experiment.
  • the 6 remaining highly significant features were identified as peptides EVDGEGKPYYEVR,
  • DISRUPT provides a selective and unbiased approach for identifying and ranking novel drug targets without the use of a priori knowledge or protein identification.
  • the blot was also probed for DRP1 and showed no evidence of a thermal shift with mdivi-1 treatment.
  • DPP3 activity was measured using a polypeptide cleavage assay that measures a decrease in fluorescence due to the removal of an N-terminally bound fluorophore. Fluorescence was measured as a function of dose, as shown in FIG. 9E; the control and quench assays for the functional assay are shown in FIG. 9D. DPP3 showed a significant reduction in activity when exposed to mdivi-1, with an estimated IC50 of 70 nM.
  • n = 3 data files
  • the 8 most significant DPP3 features had a median statistical ranking of 341, with a 25th to 75th percentile range of 106 to 722.5.
  • analysis of the full data set returned a median rank of 4.5 for the eight most significant DPP3 features, with a 25th to 75th percentile range of 2.75 to 6.25.
  • K562 cells (staurosporine experiments), obtained from ATCC, Manassas, VA
  • Cells were harvested at a density of 2 to 3 x 10^6 cells per mL and centrifuged at 300 x g in 20 mL ice-cold phosphate-buffered saline (PBS) pH 7.4 to pellet cells.
  • the cell pellet was resuspended in 5 mL ice-cold PBS (mdivi-1) or kinase buffer
  • the nanoLC-ESI-MS/MS analysis was performed on an UltiMate 3000 nanoLC (Dionex, Sunnyvale, CA). 3 µg of tryptic peptides were injected via autosampler onto a 25 cm x 75 µm ID reversed-phase column packed with 3 µm Reprosil (New Objective, Boston, MA) heated to 50 °C. Peptides were separated and eluted on a gradient from 1% acetonitrile to 28% acetonitrile in 0.1% formic acid over 70 minutes at 300 nL/min. Samples were injected online into a Velos Pro mass spectrometer using a data-dependent top 5 method in positive mode, with spray voltage set at 1.9 kV.
  • Full scan spectra were acquired in the range of m/z 350-1400 at 60,000 resolution using an automatic gain control target of 1e6, excluding charge states of +1.
  • the top 5 most intense ions were fragmented using higher-energy collisional dissociation at 32.5% normalized collision energy and an MSn ion target of 5e4, before being excluded for 60 seconds.
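The data-dependent selection rule described above (top 5 ions, +1 charge states excluded, 60 seconds of dynamic exclusion) can be sketched in software as follows. This is only an illustration of the selection logic, not a description of how the instrument implements it. The function name, the tuple layout, and the m/z matching tolerance are assumptions.

```python
def select_precursors(scan, now, excluded, top_n=5, exclusion_s=60.0, mz_tol=0.01):
    """Pick precursors for fragmentation from one full scan.

    scan: list of (mz, charge, intensity) tuples.
    now: acquisition time of the scan in seconds.
    excluded: dict mapping an excluded m/z to the time its exclusion expires
              (updated in place).
    """
    # Release ions whose exclusion window has elapsed.
    for mz in [m for m, expiry in excluded.items() if expiry <= now]:
        del excluded[mz]
    # Skip singly charged ions and anything still on the exclusion list.
    candidates = [p for p in scan
                  if p[1] != 1 and not any(abs(p[0] - m) <= mz_tol for m in excluded)]
    chosen = sorted(candidates, key=lambda p: p[2], reverse=True)[:top_n]
    for mz, _, _ in chosen:
        excluded[mz] = now + exclusion_s
    return chosen


# Hypothetical full scan: seven ions, one of them singly charged.
scan = [(400.0, 2, 900.0), (410.0, 1, 2000.0), (420.0, 2, 800.0), (430.0, 3, 700.0),
        (440.0, 2, 600.0), (450.0, 2, 500.0), (460.0, 2, 400.0)]
exclusion_list = {}
first = select_precursors(scan, now=0.0, excluded=exclusion_list)
second = select_precursors(scan, now=30.0, excluded=exclusion_list)
third = select_precursors(scan, now=61.0, excluded=exclusion_list)
```

In this example the second call can only pick the one ion not yet fragmented, and the third call, made after the 60 second window, picks the original five again.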
  • Peaks were formed based on a local maximum and its immediate surroundings. Background "snow" noise was filtered out.
  • the criterion was a very small peak size (1 pixel in RT). Remaining peaks were filtered using various criteria such as shape and size. A list of features was formed from the good peaks. Cleaned and smoothed images were produced.
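A minimal sketch of the peak-forming step, assuming the data are held as a small two-dimensional intensity image with retention time along the rows and m/z along the columns. The function name, the noise threshold, and the example image are hypothetical. The shape-based filtering, smoothing, and image cleaning mentioned above are not reproduced; only the local-maximum test and the removal of peaks that are 1 pixel wide in RT are shown.

```python
def detect_peaks(image: list[list[float]], noise: float) -> list[tuple[int, int, float]]:
    """Return (rt_index, mz_index, intensity) for local maxima spanning more than 1 RT pixel."""
    n_rt, n_mz = len(image), len(image[0])
    peaks = []
    for r in range(n_rt):
        for c in range(n_mz):
            value = image[r][c]
            if value <= noise:
                continue
            # Compare against the immediate surroundings (8-neighbourhood).
            neighbours = [image[i][j]
                          for i in range(max(0, r - 1), min(n_rt, r + 2))
                          for j in range(max(0, c - 1), min(n_mz, c + 2))
                          if (i, j) != (r, c)]
            if any(v >= value for v in neighbours):
                continue  # not a strict local maximum (ties are rejected)
            # RT extent: contiguous pixels above the noise level in this m/z column.
            extent = 1
            i = r - 1
            while i >= 0 and image[i][c] > noise:
                extent += 1
                i -= 1
            i = r + 1
            while i < n_rt and image[i][c] > noise:
                extent += 1
                i += 1
            if extent > 1:  # drop 1-pixel "snow"
                peaks.append((r, c, value))
    return peaks


# Hypothetical 5 x 4 intensity image (rows: RT scans, columns: m/z bins).
image = [
    [0, 0, 0, 0],
    [0, 3, 0, 0],
    [0, 9, 0, 7],
    [0, 4, 0, 0],
    [0, 0, 0, 0],
]
peaks = detect_peaks(image, noise=1.0)
```

Here the intensity 9 apex is kept because it spans three RT pixels, while the isolated intensity 7 pixel is discarded as snow.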
  • Peaks were sorted by descending intensity and isotope groups were assembled starting from the highest peak. All possible charges were tried (1 to 7 typically) and any adjacent peaks were identified in appropriate locations (calculated based on charge and starting RT and m/z). Isotope group candidates were validated using theoretical intensity distributions, and the best candidate was chosen.
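The greedy assembly described above can be sketched as follows. Several simplifications are assumptions of this sketch and not part of the source: isotope peaks are searched only toward higher m/z from the seed, the best charge is chosen as the one that collects the most peaks, and the validation against theoretical intensity distributions is omitted. The function name, tolerances, and example peaks are hypothetical.

```python
ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference in Da


def group_isotopes(peaks, max_charge=7, mz_tol=0.01, rt_tol=0.2):
    """Greedy isotope grouping. peaks: list of (mz, rt, intensity) tuples.

    Returns a list of (charge, sorted member m/z values), seeded from the most
    intense unassigned peak first.
    """
    order = sorted(range(len(peaks)), key=lambda i: peaks[i][2], reverse=True)
    used = set()
    groups = []
    for seed in order:
        if seed in used:
            continue
        seed_mz, seed_rt, _ = peaks[seed]
        best_charge, best_members = 1, [seed]
        for charge in range(1, max_charge + 1):
            members = [seed]
            step = 1
            while True:
                # Expected location of the next isotope peak for this charge.
                target = seed_mz + step * ISOTOPE_SPACING / charge
                hit = next((i for i in order
                            if i not in used and i not in members
                            and abs(peaks[i][0] - target) <= mz_tol
                            and abs(peaks[i][1] - seed_rt) <= rt_tol), None)
                if hit is None:
                    break
                members.append(hit)
                step += 1
            if len(members) > len(best_members):
                best_charge, best_members = charge, members
        used.update(best_members)
        groups.append((best_charge, sorted(peaks[i][0] for i in best_members)))
    return groups


# Hypothetical peaks: a doubly charged isotope envelope plus one unrelated peak.
peaks = [(500.0, 30.0, 1000.0), (500.5017, 30.0, 600.0),
         (501.0034, 30.1, 200.0), (620.0, 45.0, 150.0)]
groups = group_isotopes(peaks)
```

Note that the charge 1 hypothesis also finds a partner peak in this example (at +1.00335), which is why a real implementation needs a validation step such as the theoretical intensity check mentioned above.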
  • RT shift was calculated for pairs of files; shift density distributions were built and, for each RT, the shift that maximizes the density was chosen. Files were aligned using the calculated RT shift curve. Isotope group feature correlations were used to match features among files within the RT shift window. These steps are diagrammed in FIG. 11.
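The idea of choosing the shift that maximizes a density can be illustrated with a histogram of retention time differences between features of matching m/z in two files. This sketch estimates a single global shift, whereas the text above describes a shift chosen for each RT (a shift curve); applying the same function within RT windows would be one way to approximate that. The function name, tolerances, bin width, and example features are assumptions.

```python
from collections import Counter


def estimate_rt_shift(ref, other, mz_tol=0.01, max_shift=5.0, bin_width=0.1):
    """Most frequent RT difference (other - ref) among features with matching m/z.

    ref, other: lists of (mz, rt) tuples from two files.
    """
    histogram = Counter()
    for mz_a, rt_a in ref:
        for mz_b, rt_b in other:
            delta = rt_b - rt_a
            if abs(mz_b - mz_a) <= mz_tol and abs(delta) <= max_shift:
                # Bin the candidate shift; the fullest bin approximates the density mode.
                histogram[round(delta / bin_width)] += 1
    if not histogram:
        return 0.0
    best_bin, _ = histogram.most_common(1)[0]
    return best_bin * bin_width


# Hypothetical features: three agree on a shift of about +0.5, one is an outlier.
ref_file = [(400.2, 10.0), (512.3, 20.0), (633.1, 30.0), (700.4, 40.0)]
other_file = [(400.2, 10.5), (512.3, 20.5), (633.1, 30.5), (700.4, 38.0)]
shift = estimate_rt_shift(ref_file, other_file)
```

Taking the mode of the distribution, instead of the mean, keeps a few mismatched feature pairs from dragging the estimate.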
  • Lysates of A2780cis cells were prepared identically to the DISRUPT experiments outlined above.
  • the soluble protein was separated on Tris-glycine gels (Invitrogen).
  • the separated proteins were blotted onto a polyvinylidene difluoride (PVDF) membrane and nonspecific binding was blocked overnight at 4°C in phosphate-buffered saline containing 0.1% Tween 20 and 10% nonfat dry milk (blocking buffer).
  • Membranes were incubated with primary antibodies, DPP3 (GeneTex), Drp1 (BD Biosciences), and β-actin (Sigma-Aldrich), in blocking buffer overnight at 4 °C.
  • Membranes were then washed and incubated in peroxidase conjugated anti-rabbit IgG (Sigma-Aldrich) or anti-mouse IgG (Sigma-Aldrich) secondary antibody for 1 h at room temperature. Membranes were developed using
  • the robust and accurate ranking of the DISRUPT method may be attributed to 1) the use of multiple technical replicates at a single temperature, and 2) the use of differential mass spectrometry that prioritizes the quantification of high-resolution full-scan data over the identification of peptides by tandem mass spectrometry.
  • An effective discovery platform or screening technology must be able to discriminate true positives from false positives.
  • DISRUPT has a novel experimental design; rather than investigate samples in singlicate across a wide temperature range, we use greater sample sizes for heightened statistical sensitivity.
  • the ranking accuracy of the DISRUPT method is a result of the emphasis that differential mass spectrometry places on high-resolution full scan data.
  • the dMS approach does not rely on the acquisition of MS/MS spectra and obviates the need for newer hybrid instrumentation that emphasizes MS/MS scan speeds. It allows for identification of significant features following quantification and statistical analysis, providing true unbiased discovery of unknown proteins.
  • the methodology examines all data acquired by the mass spectrometer, not just the data associated with an MS/MS scan event. By doing so, the dynamic range of the experiment is defined by the dynamic range of the high-resolution mass spectrometer, in this case a hybrid orbital ion trap, not the limited range of identifiable features from data-dependent acquisition.
  • the dMS workflow eliminates the burden of acquiring and searching large numbers of MS/MS spectra for proteins that do not show a significant change in relative abundance.
  • the identification of significant features is a simple matter of acquiring MS/MS spectra for a far smaller number of specific features, after significance has been established.
  • dMS interrogates all features independently; other methodologies rely on identification and combine features into peptides or proteins prior to quantification. Data analysis at the feature level is unbiased by identification accuracy and has the advantage that noisy and clean signals are not combined.
  • CHORUS data structures typically contain approximately 500,000 features and built-in statistical analysis tools can analyze all of this data and output the subset of statistically significant features.
  • a dMS analysis of a pooled sample was used to measure the coefficients of variation of features across a 10^4 range of intensity; these matched those generated by manual integration.
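For reference, the coefficient of variation of a feature is its standard deviation across replicate measurements divided by its mean. The short sketch below computes it as a percentage using the sample standard deviation. The function name and the replicate intensities are hypothetical and are not taken from the experiments described here.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities: list[float]) -> float:
    """Sample standard deviation divided by the mean, as a percentage."""
    m = mean(intensities)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * stdev(intensities) / m


# Hypothetical replicate intensities for one feature in a pooled sample.
replicates = [98.0, 102.0, 100.0, 101.0, 99.0]
cv = coefficient_of_variation(replicates)
```

Because the CV is normalized by the mean, it can be compared across features whose intensities differ by orders of magnitude.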
  • the DISRUPT methodology may prove more difficult to implement than other thermal shift methodologies.
  • the DISRUPT methodology places great importance on sample reproducibility in both preparation and analysis. Samples are run serially on the mass spectrometer, not in parallel as employed by Savitski et al.; serial sample runs require very highly reproducible chromatography for CHORUS to be able to align features accurately over multiple samples. In addition, since samples are not multiplexed, large multifactorial experiments can grow to take significant amounts of instrument time, although this is somewhat mitigated by the use of fewer temperatures than the CETSA method as currently described in the literature.
  • DISRUPT could easily adjust to include automation and scale into current drug discovery platforms [6]. It is a powerful tool to reexamine current drugs with unclear or unknown modes of action or adverse effects caused by unknown off-target proteins.

Abstract

The present invention relates to methods for identifying a protein that is able to bind to a ligand, the method comprising: (a) contacting the ligand with two or more samples comprising a plurality of proteins in a solution; (b) separating the proteins bound to the ligand ("bound proteins") from the proteins that are not bound to the ligand ("unbound proteins") in each sample; (c) denaturing and digesting the bound proteins to form a plurality of peptides in each sample; (d) quantifying a plurality of molecular features contained in the plurality of peptides in each sample, the molecular features being defined as having a mass-to-charge ratio, a retention time, and a peak intensity as measured by mass spectrometry; and (e) ranking the molecular features that exhibit a statistically significant difference in quantity between the samples contacted with the ligand and a sample that is not contacted with the ligand ("statistically significant molecular feature").
PCT/US2017/053207 2016-09-26 2017-09-25 Methods for identifying proteins that bind ligands WO2018058023A2 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/336,079 US11567074B2 (en) 2016-09-26 2017-09-25 Methods for identifying proteins that bind ligands

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201662399777P 2016-09-26 2016-09-26
US62/399,777 2016-09-26

Publications (2)

Publication Number Publication Date
WO2018058023A2 true WO2018058023A2 (fr) 2018-03-29
WO2018058023A3 WO2018058023A3 (fr) 2018-05-11

Family

ID=61690052

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2017/053207 WO2018058023A2 (fr) 2016-09-26 2017-09-25 Methods for identifying proteins that bind ligands

Country Status (2)

Country Link
US (1) US11567074B2 (fr)
WO (1) WO2018058023A2 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
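CN109521104A (zh) * 2018-08-03 2019-03-26 Northwestern Polytechnical University Method for identifying adhesives used in kingfisher-feather (diancui) cultural relics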

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
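CN112924694A (zh) * 2019-12-06 2021-06-08 Dalian Institute of Chemical Physics, Chinese Academy of Sciences Non-discriminatory method for analyzing protein thermal stability
CN117665082A (zh) * 2022-08-26 2024-03-08 Dalian Institute of Chemical Physics, Chinese Academy of Sciences Method for detecting proteins whose energy state has changed, or the affinity between a ligand and a protein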

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5462863A (en) * 1989-02-09 1995-10-31 Development Center For Biotechnology Isolation of Hepatitis B surface antigen from transformed yeast cells
EP1614140A4 (en) 2003-04-02 2008-05-07 Merck & Co Inc Mass spectrometry data analysis techniques
GB201106548D0 (en) * 2011-04-18 2011-06-01 Evitraproteoma Ab A method for determining ligand binding to a target protein using a thermal shift assay
US9523693B2 (en) 2011-04-18 2016-12-20 Biotarget Engagement Interest Group Ab Methods for determining ligand binding to a target protein using a thermal shift assay

Also Published As

Publication number Publication date
US11567074B2 (en) 2023-01-31
US20190227062A1 (en) 2019-07-25
WO2018058023A3 (fr) 2018-05-11

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17854064; Country of ref document: EP; Kind code of ref document: A2)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 17854064; Country of ref document: EP; Kind code of ref document: A2)