WO2021122834A1 - Method of protein detection - Google Patents

Info

Publication number
WO2021122834A1
Authority
WO
WIPO (PCT)
Prior art keywords
ions
sample
protein
mass spectrometry
peptide
Application number
PCT/EP2020/086548
Other languages
French (fr)
Inventor
Rodion DEMIN
Andrew L. OLSON
Elie FUX
Steven Craig BREMMER
Original Assignee
Basf Plant Science Company Gmbh
Application filed by Basf Plant Science Company Gmbh
Publication of WO2021122834A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803 General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6848 Methods of protein analysis involving mass spectrometry

Definitions

  • the present invention relates to methods for detecting one or more proteins in a biological sample.
  • the present invention provides a method for detecting one or more proteins in a biological sample.
  • proteome analysis has provided much valuable data for the investigation of the structure, function, and control of biologic systems and processes.
  • The global analysis of gene expression at the protein level is termed “proteomics.”
  • the traditional method for proteome analysis combines protein separation by liquid chromatography (LC) with mass spectrometric (MS) or tandem mass spectrometric (MS/MS) analysis, termed ‘protein mass spectrometry’.
  • LC liquid chromatography
  • MS mass spectrometric
  • MS/MS tandem mass spectrometry
  • amino acid sequence information is collected in a tandem mass spectrometer and is correlated with protein sequence databases.
  • the tandem mass spectrometer initially selects (either automatically or controlled by the operator) the mass of a specific peptide ion for a second stage of mass spectrometry. After the peptide is selectively energized, it collides with an inert gas (collision-induced dissociation [CID]) or with chemically active molecules (ETD or HCD).
  • CID collision-induced dissociation
  • ETD electron-transfer dissociation
  • the goal is to induce, on average, a single peptide bond breakage per molecule.
  • the masses of the resulting fragment ions are recorded and contain the amino acid sequence information for the peptide.
  • Such data can be complicated because at least 2 ion series representing sequencing inward from both the N- and C-termini are concurrently present in each spectrum. For this reason, sophisticated computer algorithms have been developed to aid in sequence identification on the basis of fragmentation spectra.
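  • As an illustration of the two ion series described above, the following minimal Python sketch computes singly charged N-terminal (b) and C-terminal (y) fragment ion m/z values for a short peptide. It is not taken from the application: the function name `fragment_ions`, the example peptide, and the restriction to unmodified residues and charge 1+ are assumptions made for illustration. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard values, unmodified residues).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton
WATER = 18.010565   # mass of H2O


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i] covers the first i+1 residues (N-terminal series);
    y_ions[i] covers the last i+1 residues (C-terminal series).
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ions("GAK")  # hypothetical example peptide
    print("b ions:", [round(v, 4) for v in b])
    print("y ions:", [round(v, 4) for v in y])
```

  • Because both series appear in the same spectrum, each b ion has a complementary y ion; the two m/z values sum to the neutral peptide mass plus two proton masses, which is one of the relations that sequence-identification software can exploit.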
  • the technique generates sequence information for many peptides from a protein and enables the redundant and unambiguous identification of the protein from the database.
  • a key step in protein mass spectrometry is the step of selecting the specific peptide ion for second stage of mass spectrometry.
  • Too many ions selected will result in data which is difficult to interpret and hence lower the quality of the data analysis, while too few peptide ions will result in a reduction in the sensitivity of protein mass spectrometry, which is a particularly important consideration when attempting to detect the presence of low abundant proteins in complex samples.
  • the present invention provides a sensitive detection method for the determination of the presence of one or more known low abundant polypeptides in a protein sample comprising: i) performing a liquid chromatography (LC) coupled tandem mass spectrometry (MS/MS) analysis (LC-MS) of the peptides to separate peptides and detect peptide ions, ii) selecting during separation in the mass spectrometry the parent peptide ions of a specific mass to charge ratio (m/z value) (Selected Parent Ions), preferably within a predefined retention time range or window, iii) fragmenting the Selected Parent Ions to generate fragment ions and subsequently performing mass spectrometry analysis to generate a Fragment Ion Data set, and iv) comparing the fragment ion data set with a reference spectral library or theoretically calculated spectra from a protein database to identify said polypeptide.
  • a reference spectral library for the low abundant polypeptides, a data set comprising a preferred retention range for said polypeptide, preferably with a prescribed charge state for said polypeptide, or a data set comprising a lower signal intensity value threshold
  • the sample is in one embodiment a biological sample, a complex peptide mixture, or a proteomics sample.
  • the method of the invention can comprise a first step of preparing the biological sample for protein mass-spectrometry; the preparation can include but is not limited to one or more methods selected from the group consisting of: cell lysis, extract fractionation, depletion of abundant proteins, enrichment of target proteins, dialysis, desalting, protein digestion, and peptide separation.
  • the intensity value threshold for the Selected Parent Ions is lower than the value in a method wherein the parent ions are not selected by a predefined retention time and/or having only multiple charged states.
  • an inclusion list of a parent ion data set is prepared from known peptides, from an analysis of samples containing the peptide to be determined, or from existing spectral libraries.
  • the parent ion fragmentation and mass spectrometry analysis is performed using tandem mass spectrometry (MS/MS), preferably QToF.
  • the peptides in the sample are provided by peptide digestion of protein in the biological sample, e.g. by enzymatic or chemical digestion, for example by in-solution or in-gel digestion, e.g. by trypsin digestion.
  • the present invention relates to a method, wherein the sample used in the method of the invention can be derived from biological material, e.g. from a plant, a plant part, a plant organ, a plant tissue, or a plant cell.
  • the present invention provides an improved protein mass spectrometry method which can detect the presence of known low abundant proteins in complex samples and still provide accurate data analysis.
  • a key step in protein mass spectrometry is the step of selecting the specific peptide ion for a second stage of mass spectrometry. Too many ions selected will result in data which is difficult to interpret and hence lower the quality of the data analysis, while too few peptide ions will result in a reduction in the sensitivity of protein mass spectrometry, which is particularly important for detecting the presence of low abundant proteins in complex samples.
  • the present invention relates to a highly sensitive detection method for the determination of the presence of one or more known polypeptides in a protein sample comprising: i) performing a liquid chromatography (LC) coupled tandem mass spectrometry (MS/MS) analysis (LC-MS) of the peptides to separate peptides and detect peptide ions, ii) selecting during separation in the mass spectrometry (MS) the parent peptide ions of a specific mass to charge ratio (m/z value) (Selected Parent Ions), iii) fragmenting the Selected Parent Ions to generate fragment ions and subsequently performing mass spectrometry analysis to generate a Fragment Ion Data set, and iv) comparing the fragment ion data set with a reference spectral library or theoretically calculated spectra from a protein database to identify said polypeptide.
  • the polypeptide in question is of rare presence or low abundance in the sample. In another embodiment, the polypeptide in question is highly abundant but found in the spectrum in an area of high noise or complexity.
  • the inventors of the present invention have developed an improved method of protein mass spectrometry wherein during separation of the peptides in the mass spectrometry (MS) the parent peptide ions (Selected Parent Ions) are selected for a specific mass to charge ratio, for example within a pre-defined retention time range.
  • the selection of a peptide ion (which is termed herein the ‘parent ion’) is dependent on the parent ion exceeding an intensity value threshold.
  • the method of the invention now allows for the intensity value threshold to be varied.
  • the intensity value threshold is lowered to increase the sensitivity of the detection method compared to a method in which the Parent Peptide Ions are not pre-selected within a pre-defined retention time range or, for example, it is increased in an area of high noise.
  • the operator of the method of the invention can select specific thresholds for specific parent ion data.
  • the Parent Peptide Ions can have a pre-defined single and/or multiple charged charge state.
  • the present invention has several advantages.
  • the pre-defined specific mass to charge ratio for example in combination with a pre-defined retention time window allows a fast peptide separation, reducing the running time of an analysis of a complex protein sample.
  • the received dataset is less complex, such that the bioinformatic efforts are reduced.
  • samples can be analyzed with a high threshold during the experiment, allowing data reduction and quicker analysis. If data in a pre-defined mass-to-charge-ratio range are analyzed, an automation of the data analysis is possible. Further, the method results in a lower false positive rate. Further, a charge state determination is not required. Therefore, the operator of the method of the invention can select particular regions of the mass spectral data for more detailed analysis. Specific aspects of the method of the invention are described below.
  • the operator can choose to reduce the intensity value threshold at particular regions of the mass spectrum, defined by a selected mass to charge ratio or specific mass to charge ratios, by reference to an ‘inclusion list’ which is populated with parent ion data for known peptides derived from known proteins.
  • the operator can select a higher sensitivity for this selection parameter of the protein mass spectrometry to detect the presence of low abundant known protein in complex biological samples.
  • the sensitivity of the method is increased if the threshold for the signal intensity detected in the mass spectrum is reduced.
  • the operator can select a lower sensitivity for this selection parameter of the protein mass spectrometry to detect the presence of low abundant known protein in complex biological samples.
  • the sensitivity of the method is decreased if the noise for the signal intensity detected in the mass spectrum is high.
  • Schematic diagrams showing the method of the invention are provided in Figure 1.
  • the method of the invention allows ions listed in an inclusion list comprising the m/z values of the parent ions (the “Parent Ion Data Set”), e.g. in particular regions of the mass spectrum, to be selected for more detailed analysis and hence to detect the presence of low abundant known protein.
  • the inclusion list or “Parent Ion Data Set” comprises the mass to charge ratios (m/z values) of the peptides that are selected for separation in the MS.
  • the “Parent Ion Data Set” can comprise the retention time, and/or intensity value of the parent ion.
  • an inclusion list can comprise retention times (RT) and parent ion masses (m/z, mass to charge) values of fragmented peptides derived from the known protein.
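  • The inclusion list described above can be pictured as a small table of target m/z values with optional retention time windows. The following Python sketch is one possible representation, not the applicants' implementation: the class `InclusionEntry`, the function `find_entry`, the peptide labels, all numeric values, and the 10 ppm mass tolerance are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InclusionEntry:
    """One parent ion on the inclusion list (all fields illustrative)."""
    peptide: str         # label of the known peptide
    mz: float            # target mass-to-charge ratio of the parent ion
    rt_start_min: float  # start of the retention time window, in minutes
    rt_end_min: float    # end of the retention time window, in minutes


def find_entry(inclusion_list, observed_mz, rt_min, tol_ppm=10.0):
    """Return the first entry matching an observed ion, or None.

    An ion matches when its m/z lies within tol_ppm of the target and
    its retention time falls inside the entry's window.
    """
    for entry in inclusion_list:
        ppm_error = abs(observed_mz - entry.mz) / entry.mz * 1e6
        in_window = entry.rt_start_min <= rt_min <= entry.rt_end_min
        if ppm_error <= tol_ppm and in_window:
            return entry
    return None


if __name__ == "__main__":
    targets = [
        InclusionEntry("peptide_1", 500.0, 10.0, 12.0),
        InclusionEntry("peptide_2", 650.5, 25.0, 27.5),
    ]
    print(find_entry(targets, 500.004, 11.0))  # inside tolerance and window
    print(find_entry(targets, 500.004, 13.0))  # outside the RT window
```

  • No charge state is stored in this sketch, which mirrors the embodiment in which the charge state of the listed peptides is not determined.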
  • the present invention also relates to the method of the invention whereby the charge state of the peptides of the inclusion list or Parent Ion Data Set is not determined, for example, to achieve a higher probability of the selection of the peptide ion of interest for fragmentation.
  • the method of the invention has particular utility in detecting the presence of known proteins in complex biological samples, such as a proteome extracted from biological materials, including from extracts from whole cells, tissue, organs etc. derived from microorganisms, plants or animals.
  • Other uses of the method of the invention include the detection of low abundant proteins in biological samples such as biological medicines.
  • the sample can be a biological sample, a complex peptide mixture, or a complex protein sample.
  • the method of the present invention can comprise a first step of preparing the biological sample for protein mass-spectrometry; the preparation can include but is not limited to one or more of the methods selected from the group consisting of: cell lysis, extract fractionation, depletion of abundant proteins, enrichment of target proteins, dialysis, desalting, protein digestion, and peptide separation.
  • the present disclosure demonstrates the method of the invention by detecting the presence of low abundant protein in a protein mixture.
  • the terms “have”, “comprise” or “include” or any arbitrary grammatical variations thereof are used in a non-exclusive way. Thus, these terms may both refer to a situation in which, besides the feature introduced by these terms, no further features are present in the entity described in this context and to a situation in which one or more further features are present.
  • the expressions “A has B”, “A comprises B” and “A includes B” may both refer to a situation in which, besides B, no other element is present in A (i.e. a situation in which A solely and exclusively consists of B) and to a situation in which, besides B, one or more further elements are present in entity A, such as element C, elements C and D or even further elements.
  • the terms “preferably”, “more preferably”, “most preferably”, “particularly”, “more particularly”, “specifically”, “more specifically” or similar terms are used in conjunction with optional features, without restricting further possibilities.
  • features introduced by these terms are optional features and are not intended to restrict the scope of the claims in any way.
  • the invention may, as the skilled person will recognize, be performed by using alternative features.
  • features introduced by “in an embodiment of the invention” or similar expressions are intended to be optional features, without any restriction regarding further embodiments of the invention, without any restrictions regarding the scope of the invention and without any restriction regarding the possibility of combining the features introduced in such way with other optional or non- optional features of the invention.
  • the term “about” relates to the indicated value with the commonly accepted technical precision in the relevant field, preferably relates to the indicated value ±20%, more preferably ±10%, most preferably ±5%.
  • the “retention time” value is the characteristic time it takes for a particular analyte to pass through the system (from the injection unit through the column to the detector) under set conditions. Hence chromatography is used to assign a specific retention time to a specific peptide in the analyzed sample.
  • the “mass-to-charge ratio (m/z)” value spectrum is a plot of an ion signal as a function of the ion’s mass-to-charge ratio. These spectra are used to determine the elemental and/or isotopic signature of a sample, the masses of particles and of molecules, and to elucidate the chemical structures of molecules, such as peptides and other chemical compounds, as well as the relative amount of different chemical compounds within a sample.
  • the “intensity” value is any value which reflects a measured signal intensity.
  • the signal intensity preferably, directly or indirectly correlates with the abundance of an ion as detected by the appropriate detection apparatus. This value is typically expressed as “counts per second”.
  • the “known protein” is any protein for which the amino acid sequence is known to the user of the method of the invention. As can be appreciated, this can include all proteins which may be present in a biological sample, as well as all proteins known from databases of protein amino acid sequences.
  • the “parent ion data set” or Parent Ion Data Set includes data relating to parent ions as measured from chromatography coupled mass spectrometry analysis. Each data set comprises the m/z, and can comprise further data, e.g. the retention time and intensity value of the parent ion.
  • the inclusion list may comprise the Parent Ion Data of the peptides to be analyzed.
  • the “mass spectrum” or “mass spectra” is a plot or graph of the intensity value, and hence the abundance of ions, as distributed according to the m/z value.
  • the m/z value is on the x-axis and the intensity value is on the y-axis.
  • the spectrum provides a representation of the relationship of the m/z value and the number of ions in a sample.
  • a ‘peak’ relates to the number of ions having a certain m/z value as can be seen in the mass spectra graph.
  • the method of the invention is particularly helpful to determine if a specific polypeptide or a fragment thereof is present in a biological sample.
  • Biological samples comprise a large variety of proteins and peptides. It is often very difficult to identify a polypeptide or a fragment that is rare or low abundant in said biological sample.
  • Proteins are macromolecules, which are built of amino acids. The amino acid sequence determines the chemical and physical properties of proteins. There are proteins which are secreted and are in fluids of organisms (e.g. blood) or the environment (e.g. proteins produced by bacteria). Other proteins are embedded in cells or tissue. Accordingly, the present invention relates to a method, wherein the sample used in the method of the invention can be derived from biological material, e.g. body fluid or a liquid of an extract from tissues, organs or cells, including extracts e.g. from a plant, a plant part, a plant organ, a plant tissue, or a plant cell.
  • Proteins are isolated from cells or tissue using homogenization (mechanical, ultrasonic, etc.) in buffers that support the solubility of proteins. Hydrophobic proteins with poor solubility can be solubilized using detergents. Proteins in solution can be directly used for analytics or purified further. Further purification can occur by precipitation or different chromatographic techniques (size exclusion chromatography, cation/anion exchange, immunoaffinity chromatography, etc.).
  • the method of the present invention can comprise a preparation step, e.g. a first step, of preparing the biological sample for protein mass-spectrometry.
  • the preparation includes but is not limited to one or more methods selected from the group consisting of: cell lysis, extract fractionation, depletion of abundant proteins, enrichment of target proteins, dialysis, desalting, protein digestion, and peptide separation.
  • the protein is then prepared as peptides for subsequent use in the method of the invention.
  • Methods of preparing peptides for use in chromatography coupled mass spectrometry analysis are well known in the art. Since mass spectrometry analysis of intact proteins is difficult owing to limited resolution capability, proteins can be degraded by specific enzymes to peptides. Peptides then represent the corresponding protein in the biological sample and can be easily analyzed by mass spectrometry. Trypsin, which cleaves after arginine and lysine residues, is commonly used for protein digests. Endoproteinase Glu-C and Lys-C are used alternatively or in combination with trypsin.
  • The resulting peptide mixture can be analyzed directly or purified further or enriched.
  • the peptides in the sample are provided by peptide digestion of protein in the biological sample, e.g. by enzymatic or chemical digestion, for example by in-solution or in-gel digestion, e.g. by trypsin digestion.
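  • The cleavage rule stated above (trypsin cleaves after arginine and lysine) can be sketched in Python as an in-silico digest. This is a simplified illustration only: the function name `tryptic_digest` and the example sequence are hypothetical, and the sketch assumes complete cleavage with no missed cleavages and ignores any further sequence-dependent exceptions.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after every lysine (K) or arginine (R).

    Simplifying assumptions: complete digestion, no missed cleavages,
    and no sequence-context exceptions to the cleavage rule.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing residues after the last cleavage site.
        peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    # Hypothetical sequence, used only to show the splitting behaviour.
    print(tryptic_digest("MKTAYIAKQRQISFVK"))
```

  • The peptides predicted this way for a known protein are candidates for an inclusion list, once their m/z values have been calculated or measured.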
  • the method of the invention comprises a step in which a chromatography coupled mass spectrometry analysis of the peptides is performed, for example a liquid chromatography (LC) coupled mass spectrometry (MS) analysis (LC-MS).
  • the method of the invention comprises a step of performing a liquid-chromatography (LC) coupled mass spectrometry (MS) analysis (LC-MS) of the peptides to separate peptides and detect peptide ions.
  • chromatography coupled mass spectrometry as used herein relates to mass spectrometry which is coupled to a prior chromatographic separation of the peptide(s) comprised by the samples to be investigated.
  • Chromatography is a laboratory technique for the separation of a mixture.
  • the mixture is dissolved in a fluid called the mobile phase, which carries it through a structure holding another material called the stationary phase.
  • the various constituents of the mixture travel at different speeds, causing them to separate.
  • the separation is based on differential partitioning between the mobile and stationary phases. Subtle differences in a peptide’s partition coefficient result in differential retention on the stationary phase and thus affect the separation.
  • Suitable techniques for separation to be used preferably in accordance with the present invention include all chromatographic and/or electrophoretic separation techniques such as liquid chromatography (LC).
  • HPLC high-performance liquid chromatography
  • UPLC ultra-performance liquid chromatography
  • CE capillary electrophoresis
  • LC, UPLC and/or HPLC are chromatographic techniques to be envisaged by the method of the present invention. Suitable devices for such determination of analyte(s) are well known in the art.
  • the peptides are then analyzed by mass spectrometry.
  • Mass spectrometry is an analytical technique that ionizes chemical species and sorts the ions based on their mass-to-charge ratio prior to detection of the ion signal (determining the intensity of the ion beam or counting the ions within a certain time). Mass spectrometry is used in many different fields and is applied to pure samples as well as complex mixtures.
  • a sample which is liquid or solid, is ionized, for example by proton transfer, e.g. from solvent. This may cause some of the sample's molecules to be converted into charged ions, termed “full scan ions”. These full scan ions are then separated according to their mass-to-charge ratio, typically by accelerating them and subjecting them to an electric and/or magnetic field: full scan ions of the same mass- to-charge ratio (m/z) will undergo the same amount of deflection.
  • the full scan ions are detected by a mechanism capable of detecting charged particles, for example an appliance including an electron multiplier. Results are displayed as spectra of the relative abundance of detected full scan ions as a function of the mass-to-charge ratio.
  • mass spectrometry is used to assign one or a group of specific m/z of an ion or ions generated from a specific peptide in the analyzed sample, due to the ionization process.
  • Mass spectrometry as used herein encompasses all techniques which allow for the determination of the molecular weight (i.e. the mass) or a mass variable corresponding to a peptide to be determined in accordance with the present invention.
  • mass spectrometry is used, in particular liquid-chromatography coupled mass spectrometry (LC-MS).
  • liquid-chromatography (LC) coupled mass spectrometry (MS) analysis shall encompass but not be limited to liquid-chromatography (LC) coupled mass spectrometry (MS), direct infusion mass spectrometry or Fourier-transform ion-cyclotron-resonance mass spectrometry (FT-ICR-MS), capillary-electrophoresis mass spectrometry (CE-MS), high-performance liquid-chromatography coupled mass spectrometry (HPLC-MS), quadrupole mass spectrometry, any sequentially coupled mass spectrometry, such as MS-MS or MS-MS-MS, and ion mobility mass spectrometry or time of flight mass spectrometry (TOF).
  • CE-MS capillary-electrophoresis mass spectrometry
  • HPLC-MS high-performance liquid-chromatography coupled mass spectrometry
  • the chromatography coupled mass spectrometry analysis is performed by liquid chromatography.
  • the “Parent Ion Data Set” comprises the mass to charge ratio for selection and, preferably, the retention time and/or intensity threshold value of the parent ion.
  • the intensity value is a measured signal intensity.
  • the method of the invention comprises the step of selecting during separation in the MS the parent peptide ions (Selected Parent Ions). The selection can occur, for example, within a pre-defined retention time window or range.
  • the method of the invention is particularly useful to detect the presence of one or more known highly abundant polypeptides in the sample.
  • the operator may set the chromatography coupled mass spectrometry apparatus to have a high intensity value threshold so as to increase the likelihood that parent ions derived from real peptides are analyzed in the method of the invention, or to decrease the number of measured peptides in this area, e.g. if only high abundant peptides shall be measured, for example as an internal control within a measurement, or if the peptide in question is known to be found in an area of the spectrum in which a large amount of noise is also found at that RT range.
  • the method of the invention is particularly useful to detect the presence of one or more known low abundant polypeptides in the sample. Accordingly, for a certain RT range and certain m/z values the operator may set the chromatography coupled mass spectrometry apparatus to have a low intensity value threshold so as to increase the sensitivity of the method of the invention by ensuring that parent ions derived from low abundant peptides are analyzed in the method of the invention, or if it is known that there will be a low amount of noise at that RT range.
  • the method of the invention is performed without determination of a charge state of the peptides.
  • the method of the invention can also be performed with a determination of the charge state.
  • a single and/or multiple-charged charge state is pre-defined for the polypeptide in question.
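  • Where a charge state is pre-defined, the expected m/z of a parent ion follows from the neutral peptide mass. The Python sketch below uses the standard relation m/z = (M + z·m_p)/z for ions formed by adding z protons; the function names `mz_for_charge` and `candidate_mz`, the example mass, and the assumption of positive-mode protonation are illustrative and not taken from the application.

```python
PROTON = 1.007276  # mass of a proton in daltons


def mz_for_charge(neutral_mass, charge):
    """m/z of a peptide ion carrying `charge` protons (assumes positive mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON) / charge


def candidate_mz(neutral_mass, charges=(1, 2, 3)):
    """Map each pre-defined charge state to its expected m/z value."""
    return {z: mz_for_charge(neutral_mass, z) for z in charges}


if __name__ == "__main__":
    # Hypothetical neutral monoisotopic mass of 1000.0 Da.
    for z, mz in candidate_mz(1000.0).items():
        print(f"charge {z}+: m/z {mz:.4f}")
```

  • Listing the m/z values for several plausible charge states is one way to target a peptide without determining its charge state during the measurement.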
  • the method of the invention includes a step in which only parent ions which are derived from specific peptides present in the sample are analysed. Accordingly, in an embodiment of the invention, the intensity value threshold varies for the parent ion data set for certain RT windows or ranges.
  • the intensity value threshold is increased relative to the intensity value threshold assigned across the mass spectrum.
  • the intensity value threshold is expressed as 100
  • the intensity value threshold of a specific mass to charge ratio (m/z value) may be increased to 120, 140, 160, 180, 200, 300, 500 or more.
  • the intensity value threshold is reduced relative to the intensity value threshold assigned across the mass spectrum.
  • the intensity value threshold is expressed as 100
  • the intensity value threshold of a specific mass to charge ratio (m/z value) may be decreased to 80, 60, 50, 40, 30, 20, 15, 10, 5, 4, 3, 2, 1 or 0.
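  • The relative scale above, with the spectrum-wide threshold expressed as 100, can be read as a percentage applied to a baseline. The following Python sketch shows that reading; the function names `effective_threshold` and `threshold_for`, the baseline of 1000 counts per second, the target m/z values, and the 0.01 m/z matching tolerance are all hypothetical.

```python
def effective_threshold(baseline, relative_value):
    """Scale the baseline threshold, where a relative value of 100 means unchanged."""
    return baseline * relative_value / 100.0


def threshold_for(mz, baseline, overrides, tol=0.01):
    """Return the intensity threshold that applies at a given m/z.

    `overrides` maps target m/z values to relative values (100 = baseline).
    Any m/z without an override keeps the baseline threshold.
    """
    for target_mz, relative_value in overrides.items():
        if abs(mz - target_mz) <= tol:
            return effective_threshold(baseline, relative_value)
    return baseline


if __name__ == "__main__":
    baseline = 1000.0                       # hypothetical counts per second
    overrides = {500.0: 20, 750.25: 300}    # lowered at 500.0, raised at 750.25
    for mz in (500.005, 750.25, 600.0):
        print(mz, threshold_for(mz, baseline, overrides))
```

  • Lowering the relative value for a listed m/z admits weaker parent ions for fragmentation, while raising it suppresses selection in a noisy region.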
  • in a further embodiment of the invention, the intensity value threshold for a Selected Parent Ion data set is selected from an inclusion list comprising parent ion data sets of known peptides.
  • an ‘inclusion list’ of parent ion data sets of known peptides is used in the method of the invention.
  • This inclusion list can be prepared by the operator of the method of the invention by subjecting known protein samples to the method of the invention for peptides whose presence shall be determined by the method of the invention, e.g. low abundant polypeptides.
  • the peptides may not have been isolated from a biological sample.
  • the peptides can be considered as ‘control’ or ‘reference samples’.
  • the presence of the low abundant peptides is detected.
  • the operator of the method of the invention can prepare an inclusion list from data available in the art and known to the skilled person, e.g. as shown in the peptide atlas or published, e.g.
  • the operator may set the chromatography coupled mass spectrometry apparatus to have a low intensity value threshold to increase the sensitivity of the method of the invention by ensuring parent ions derived from real peptides are analyzed in the method of the invention.
  • the method of the invention allows for parent ions, and hence peptides, derived from low abundant protein to be detected which would have not been detected when the intensity value threshold is set at a higher value. Therefore, this step of the method of the invention increases sensitivity.
  • the method of the invention comprises a further step of fragmenting the Selected Parent Ions to generate fragment ions and subsequently performing mass spectrometry analysis to generate a Fragment Ion Data set.
  • parent ions which exceed the intensity value threshold are selected for subsequent analysis by the generation of fragment ions and subsequent mass spectrometry analysis.
  • Methods of generating fragment ions from parent ions are well known in the art.
  • fragment ions may be generated by collision-induced dissociation, ion-molecule reaction, photodissociation, or another process.
  • the resulting ions are then separated and detected in a second stage of mass spectrometry.
  • the method of the invention can be performed in a tandem mass spectrometer.
  • A tandem mass spectrometer can include one or more physical mass analyzers that perform two or more mass analyses.
  • a mass analyzer of a tandem mass spectrometer can include, but is not limited to, a time-of-flight (TOF), a triple quadrupole, an ion trap, a linear ion trap, an orbitrap, or an Ion Cyclotron Resonance mass analyzer (orbitrap as well as ICR are Fourier transform MS).
  • A tandem mass spectrometer can also include a separation device.
  • the separation device can perform a separation technique that includes, but is not limited to, liquid chromatography, capillary electrophoresis, or ion mobility. As an alternative, ion mobility can be used in combination with liquid chromatography separation techniques.
  • the parent ion fragmentation and mass spectrometry analysis in step iv) is performed using MS/MS, preferably QToF.
  • This fragment ion data set may comprise (i) retention time, m/z and/or intensity values of the parent ion, and (ii) m/z and/or intensity values of the fragment ion.
  • the method of the invention comprises a step of comparing the parent ion data set and the fragment ion data set with a reference library from known peptides to identify the known protein.
  • the “parent ion data set” comprises for example the retention time, m/z and intensity value of the parent ion
  • the fragment ion data set comprises for example (i) retention time, m/z and intensity values of the parent ion, and (ii) m/z and intensity values of the fragment ion.
  • the method of the invention uses data analysis techniques as described below, which are implemented on a computer system with elements including a processor, data storage, and input/output devices and connections as known to a person of skill. While features of the data analysis techniques are implemented in software on a computer-readable medium, a person of skill, with reference to this description, can prepare the appropriate computer-readable code for a computer system on which the embodiment is implemented, and as such software code and pseudo-code are not provided herein. It will be appreciated that various hardware and/or software combinations may be used to implement different embodiments.
  • the data analysis comprises the use of the parent ion data set and the fragment ion data set, and data mining of reference spectra libraries.
  • Reference spectra libraries of peptides may be generated for synthetic peptides and/or from prior MS analyses performed on the biological sample under investigation. Similarly, the reference spectra libraries of peptides may be generated from synthetic peptide references and/or from prior MS analyses of peptide analytes. Importantly, once the reference libraries have been generated, they can be used perpetually. Hence, in an embodiment of the invention, the reference library comprises parent ion data and the fragment ion data set prepared from standards of the known protein, from existing spectral libraries, or computationally generated by applying empirical or a priori fragmentation or modification rules to the known protein.
  • fragment ions and parent ions can be assigned to known peptides and from subsequent analysis to known proteins.
  • the confidence in the protein identification can be scored, for example, based on the mass accuracy and/or the relative intensities of the acquired product ion fragments compared to those of the reference (or predicted) fragmentation spectrum, on the number of matched fragments, and on the similarity of the chromatographic characteristics (co-elution, peak shape, etc.) of the extracted ion traces of these fragments. Probabilities for the identifications can be determined, for example, by searching (and scoring) similarly for decoy precursor fragment ions from the same LC-MS dataset.
  • the relative quantification can be performed by integration of the product ion traces across the chromatographic elution of the precursor. In various embodiments, use is made of differently isotopically labeled reference analytes (similarly identified, quantified and scored) to achieve absolute quantification of the corresponding precursors of interest.
  • a further embodiment of the invention is optional and comprises calculating a score that represents how well the parent ion data set and the fragment ion data set fit the reference library data.
  • Peptide annotation is performed by comparing the m/z ratio of each ion (in the full scan as well as in MS/MS) and the retention time of the analyte with the values contained in the library.
  • If the measured mass is within the range expected by the user (e.g. within 5 ppm of the library value) and the measured retention time is within the expected range (e.g. +/- 0.1 min), then the ion is annotated as a match to the ion contained in the library.
  • providing means that the at least one biological sample is provided in a manner suitable for determining the protein content comprised by said biological sample. Accordingly, providing as used herein also refers to carrying out suitable pre-treatments, i.e. most preferably concentration or fractioning of the sample and/or extraction of the sample. Depending on the technique which is used to determine the protein content comprised by said biological sample, additional pre-treatments may be required.
  • biological sample relates to a sample comprising a biological material
  • biological material preferably, includes any substance or mixture of substances produced by a cell, preferably including substances and mixtures of substances produced by such biological material.
  • the sample is derived from biological material, e.g. from an extract derived from microorganisms or multicellular organisms, e.g. plant, plant part, plant organ, plant tissue, or plant cell.
  • the biological material comprises a multitude of proteins of a cell.
  • the biological sample is a sample of a material comprising a non-defined mixture of proteins, such as a cell culture medium comprising serum, a spent cell culture medium, a bodily fluid of an organism, tissue of an organism, and the like.
  • the biological sample is a cell culture sample from archaebacterial, bacterial, and/or eukaryotic cells, wherein said cell culture sample preferably comprises cells and/or spent culture medium; preferably, in such case, the biological sample is a sample of cultured bacterial, fungal, plant (such as a dicot or monocot plant, more preferably a crop plant), algae, human or animal cells and/or spent medium of said cells.
  • the biological sample is a sample of and/or spent culture medium from E. coli cells, Paenibacillus cells, Basfia succiniciproducens cells, Corynebacterium glutamicum, Lactobacillus, Bacillus acidopullulyticus cells, Bacillus amyloliquefaciens cells, Bacillus lentus cells, Bacillus licheniformis cells, Bacillus subtilis cells, Aspergillus niger cells, Aspergillus oryzae cells, Chrysosporium lucknowense cells, Myceliophthora thermophila cells, Penicillium chrysogenum cells, Penicillium funiculosum cells, Rhizomucor miehei cells, Schizophyllum commune cells, Trichoderma harzianum cells, Trichoderma longibrachiatum cells, Trichoderma reesei cells, yeast cells, Saccharomyces cerevisiae cells, Schizosaccharomyces pombe cells
  • the term "plant” relates to a whole plant, a plant part, a plant organ, a plant tissue, or a plant cell.
  • the term includes, preferably, seeds, shoots, stems, leaves, roots (including tubers), and flowers.
  • the term "plant” relates to a member of the clade Archaeplastida.
  • Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, preferably Tracheophyta, more preferably Spermatophytina, most preferably monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs selected from the list comprising Acer spp., Actinidia spp., Abelmoschus spp., Agave sisalana, Agropyron spp., Agrostis stolonifera, Allium spp., Amaranthus spp., Ammophila arenaria, Ananas comosus, Annona spp., Apium graveolens, Arachis spp, Artocarpus spp., Asparagus officinalis, Avena spp.
  • Viridiplantae preferably Tracheophyta, more preferably Spermatophytina
  • Avena sativa e.g. Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida
  • Averrhoa carambola e.g. Bambusa sp.
  • Benincasa hispida Bertholletia excelsa
  • Beta vulgaris Brassica spp.
  • Brassica napus e.g. Brassica napus, Brassica rapa ssp.
  • the plant cell, plant or plant part is a rice cell, rice plant, rice plant part, or rice seed.
  • the sample is a sample from a multicellular organism. More preferably, the sample comprises a bodily fluid of an organism and/or a tissue of an organism.
  • the biological sample is a sample of an animal, preferably a vertebrate, more preferably a mammal. More preferably, the biological sample is a sample of an egg, a (preferably non-human) embryo, or a complete non-human organism, e.g. an insect, a nematode, or a laboratory animal.
  • the biological sample is or comprises a sample of a body fluid, a sample from a tissue or an organ, or a sample of wash/rinse fluid or a swab or smear obtained from an outer or inner body surface.
  • samples of stool, urine, saliva, sputum, tears, cerebrospinal fluid, blood, serum, plasma, lymph or lacrimal fluid are encompassed as biological samples by the method of the present invention.
  • biological samples can be obtained by use of brushes, (cotton) swabs, spatula, rinse/wash fluids, punch biopsy devices, puncture of cavities with needles or lancets, or by surgical instrumentation.
  • biological samples obtained by well-known techniques including, in an embodiment, scrapes, swabs or biopsies are also included as samples of the present invention.
  • Cell-free fluids may be obtained from the body fluids or the tissues or organs by lysing techniques such as homogenization and/or by separating techniques such as filtration or centrifugation. It is to be understood that a sample may be further processed in order to carry out the method of the present invention. Particularly, cells may be removed from the sample by methods and means known in the art. More preferably, the biological sample is a sample of a body fluid, preferably a blood, plasma, or serum sample.
  • the biological sample is a tissue sample, preferably a sample of liver tissue, heart tissue, prostate tissue, pancreas tissue, brain tissue, kidney tissue, adipose tissue, gut, skeleton tissue, lung tissue, bladder, breast tissue, cecum and/or skin tissue, such as dermal layer, comprising the epidermis and/or corium and/or subcutis.
  • the biological sample is a sample of an alga or plant, preferably of a monocotyledonous or dicotyledonous plant. More preferably, said biological sample is a tissue sample, preferably leaf tissue, root tissue, shoot tissue, stem tissue, reproductive tissue (such as flower tissue or pollen) and/or seed tissue and/or liquid comprising exudate thereof and/or volatile compounds released therefrom.
  • Figure 1: Example of intensity threshold variation dependent on mass to charge selection
  • the threshold was reduced and the presence of a low abundant polypeptide could be shown.
  • MS Qual/Quant QC Mix was used as a model system for complex peptide mixture including low abundant peptides for HPLC-MS/MS analysis.
  • Identification of proteins can be carried out with database searches using mass spectrometry data.
  • a protease digests the protein of interest to peptides.
  • Proteases cleave peptide bonds with a well-defined specificity, resulting in a peptide mixture which can be analyzed using mass spectrometry.
  • the complexity of the peptide mixture is reduced through chromatographic separation prior to the introduction of the sample to the mass spectrometer.
  • the determination of the mass-to-charge ratio (m/z) of the peptides in the mixture allows their assignment to the theoretically calculated peptide masses of known proteins in sequence databases. Additionally, peptides can be fragmented in a mass spectrometer.
  • peptide sequences from the database are used to calculate a theoretical fragmentation pattern for comparison with the experimental results.
  • the use of peptide fragmentation data increases the confidence of protein identification.
  • Peptide-protein matches are ranked using a scoring system based on the number of peptides detected in a particular protein sequence and on their fragmentation spectrum.
  • MS Qual/Quant QC Mix from Sigma-Aldrich was used as a test sample for analysis.
  • An Agilent 1100 Series µ-HPLC coupled to a QTOF 5600+ mass spectrometer from Sciex (Framingham, USA) was used as the HPLC-MS/MS system.
  • Mascot Server software version 2.6.1 (Matrix Science Ltd, London, UK) was used as a database search engine. Mass spectrometer raw files were converted into peak lists using Mascot Daemon software (version 2.6.0) with AB Sciex Converter (version 1.3).
  • MS Qual/Quant QC Mix was re-dissolved in 500 µl of 0.1% formic acid.
  • The sample was diluted 1:10 and 5 µl of the sample were injected into the HPLC-MS/MS system.
  • Peptides were separated on a C18 reversed-phase chromatography column coupled directly to the electrospray ion source of the QTOF 5600+ mass spectrometer.
  • time-of-flight (TOF) spectra were acquired at intervals of a few microseconds, detecting peptide precursor ions in different charge states. The 20 most intense precursor peaks of each TOF spectrum were selected one by one using the quadrupole mass filter and fragmented using the collision-induced dissociation capability of the mass spectrometer within intervals.
  • The mass spectrometer output file possibly contains fragmentation spectra of peptides from MS Qual/Quant QC Mix proteins.
  • fragmentation spectra in raw data format (.wiff format) were converted into a Mascot Generic Format file (.mgf format), which is a peak list, and used for Mascot searches.
  • Mascot software examines every fragmentation spectrum and matches it to the theoretically calculated fragmentation spectra from protein databases.
  • MS/MS spectra search settings are shown in Table 7. The decoy database is a reversed SwissProt database.
  • MS Qual/Quant QC Mix contains six proteins in different amounts (Table 1). Protein identification experiment 1 with a low intensity threshold leads to the identification of five proteins, however with an FDR of 6.76 % (Table 5). An increased intensity threshold leads to the identification of two high abundant proteins with high confidence (experiment 2, Table 5). In both cases the identification of the low abundant Peptidyl-prolyl cis-trans isomerase A protein was not possible. Peptide ion masses and retention times of Peptidyl-prolyl cis-trans isomerase A were used to build an inclusion list (Table 4). Peptide ion masses are calculated using a theoretical tryptic digest (Figure 1, Table 2). Retention time was taken from previous experiments. The time window for acquisition was chosen as +/- 90 seconds.
  • The intensity threshold for the inclusion list was set to 100 as in experiment 1. Use of the inclusion list leads to the successful identification of the low abundant Peptidyl-prolyl cis-trans isomerase A and two high abundant proteins with an FDR of 0 % (Table 5). Table 6 shows that two peptides were sufficient for unambiguous identification of Peptidyl-prolyl cis-trans isomerase A. Table 1: Protein composition of MS Qual/Quant QC Mix.
  • Table column headers (table body not reproduced): Measurement | IDA Threshold | Inclusion List | Inclusion List Threshold | Number of queries | Peptide matches above homology or identity threshold | MS Qual/Quant QC Mix proteins | False discovery rate [%]
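
The false discovery rates quoted for these experiments rest on a search against a reversed (decoy) database. The following is only a rough sketch of that idea, assuming the FDR is estimated as the ratio of decoy matches to target matches above a score threshold; the search engine's own calculation may differ, and the function names `reverse_decoys` and `estimate_fdr_percent` are hypothetical.

```python
from typing import Dict, Iterable


def reverse_decoys(proteins: Dict[str, str]) -> Dict[str, str]:
    """Build a decoy database by reversing every target sequence."""
    return {"DECOY_" + name: seq[::-1] for name, seq in proteins.items()}


def estimate_fdr_percent(target_scores: Iterable[float],
                         decoy_scores: Iterable[float],
                         threshold: float) -> float:
    """Assumed estimate: decoy matches / target matches above the threshold."""
    targets = sum(1 for s in target_scores if s >= threshold)
    decoys = sum(1 for s in decoy_scores if s >= threshold)
    if targets == 0:
        return 0.0
    return 100.0 * decoys / targets


# Illustrative scores only; these are not data from the experiments above.
fdr = estimate_fdr_percent([50.0, 40.0, 30.0, 25.0], [35.0, 10.0], threshold=25.0)
```

Raising the score threshold typically lowers the estimated FDR at the cost of fewer accepted matches.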

Abstract

The present invention provides a method for detecting one or more proteins in a biological sample. In one aspect, the present invention provides a sensitive detection method for the determination of the presence of one or more known low abundant polypeptides in a protein sample comprising: performing a liquid chromatography (LC) coupled tandem mass spectrometry (MS/MS) analysis (LC-MS) of the peptides to separate peptides and detect peptide ions, selecting during separation in the mass spectrometry the parent peptide ions of a specific mass to charge ratio (m/z value) (Selected Parent Ions), preferably, within a predefined retention time range or window, fragmenting the Selected Parent Ions to generate fragment ions and subsequently performing mass spectrometry analysis to generate a Fragment Ion Data set, and comparing the fragment ion data set with a reference spectral library or theoretically calculated spectra from a protein database to identify said polypeptide. For example, a reference spectral library for the low abundant polypeptides, a data set comprising a preferred retention range for said polypeptide, preferably with a prescribed charge state for said polypeptide, or a data set comprising a lower signal intensity value threshold, is provided. According to the invention, the sample can be a biological sample, e.g. a complex peptide mixture, or a proteomics sample.

Description

METHOD OF PROTEIN DETECTION
FIELD OF THE INVENTION
The present invention relates to methods for detecting one or more proteins in a biological sample.
BACKGROUND OF THE INVENTION
The present invention provides a method for detecting one or more proteins in a biological sample. In recent years proteome analysis has provided much valuable data for the investigation of the structure, function, and control of biological systems and processes.
The global analysis of gene expression at the protein level is termed “proteomics.” The traditional method for proteome analysis combines protein separation by liquid chromatography (LC) with mass spectrometric (MS) or tandem mass spectrometric (MS/MS) analysis, termed ‘protein mass spectrometry’. Typically, amino acid sequence information is collected in a tandem mass spectrometer and is correlated with protein sequence databases. The tandem mass spectrometer initially selects (either automatically or controlled by the operator) the mass of a specific peptide ion for a second stage of mass spectrometry. After the peptide is selectively energized, it collides with an inert gas (collision-induced dissociation [CID]) or with chemically active molecules (ETD or HCD). The goal is to induce, on average, a single peptide bond breakage per molecule. The masses of the resulting fragment ions are recorded and contain the amino acid sequence information for the peptide. Such data can be complicated because at least 2 ion series representing sequencing inward from both the N- and C-termini are concurrently present in each spectrum. For this reason, sophisticated computer algorithms have been developed to aid in sequence identification on the basis of fragmentation spectra. The technique generates sequence information for many peptides from a protein and enables the redundant and unambiguous identification of the protein from the database. As can be appreciated, a key step in protein mass spectrometry is the step of selecting the specific peptide ion for the second stage of mass spectrometry. Too many ions selected will result in data which is difficult to interpret and hence lower the quality of the data analysis, while too few peptide ions will result in a reduction in the sensitivity of protein mass spectrometry, which is a particularly important consideration when attempting to detect the presence of low abundant proteins in complex samples.
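
The two ion series mentioned above are commonly the b ions (N-terminal fragments) and the y ions (C-terminal fragments). As a minimal sketch, assuming singly charged fragments, unmodified residues and standard monoisotopic masses, their m/z values could be computed as follows. The function name `by_ions` is hypothetical and not part of the disclosure.

```python
from typing import List, Tuple

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O
PROTON = 1.007276   # mass of a proton


def by_ions(peptide: str) -> Tuple[List[float], List[float]]:
    """Return singly charged b-ion and y-ion m/z values (b1.., y1..)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    running = 0.0
    for m in masses[:-1]:            # N-terminal fragments b1 .. b(n-1)
        running += m
        b_ions.append(running + PROTON)
    running = 0.0
    for m in reversed(masses[1:]):   # C-terminal fragments y1 .. y(n-1)
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


# Example with a short, purely illustrative peptide.
b_series, y_series = by_ions("GAK")
```

Matching observed fragment masses against both series at once is what makes manual interpretation hard and motivates the search algorithms described above.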
Hence there is a need in the art for improved methods of protein mass spectrometry which can detect the presence of low abundant proteins in complex samples while enabling accurate data analysis.
SUMMARY OF THE INVENTION
In one aspect, the present invention provides a sensitive detection method for the determination of the presence of one or more known low abundant polypeptides in a protein sample comprising: i) performing a liquid chromatography (LC) coupled tandem mass spectrometry (MS/MS) analysis (LC-MS) of the peptides to separate peptides and detect peptide ions, ii) selecting during separation in the mass spectrometry the parent peptide ions of a specific mass to charge ratio (m/z value) (Selected Parent Ions), preferably, within a predefined retention time range or window, iii) fragmenting the Selected Parent Ions to generate fragment ions and subsequently performing mass spectrometry analysis to generate a Fragment Ion Data set, and iv) comparing the fragment ion data set with a reference spectral library or theoretically calculated spectra from a protein database to identify said polypeptide.
In one embodiment a reference spectral library for the low abundant polypeptides, a data set comprising a preferred retention range for said polypeptide, preferably with a prescribed charge state for said polypeptide, or a data set comprising a lower signal intensity value threshold, is provided. According to the invention, the sample is in one embodiment a biological sample, a complex peptide mixture, or a proteomics sample. Accordingly, the method of the invention can comprise a first step of preparing the biological sample for protein mass spectrometry; the preparation can include but is not limited to one or more methods selected from the group consisting of: cell lysis, extract fractionation, depletion of abundant proteins, enrichment of target proteins, dialysis, desalting, protein digestion, and peptide separation.
In one embodiment, the intensity value threshold for the Selected Parent Ions is lower than the value in a method wherein the parent ions are not selected by a predefined retention time and/or having only multiple charged states. Further, in one embodiment, an inclusion list of a parent ion data set is prepared from known peptides, from an analysis of samples containing the peptide to be determined, or from existing spectral libraries.
In one embodiment, in the method of the invention, the parent ion fragmentation and mass spectrometry analysis is performed using tandem mass spectrometry (MS/MS), preferably QToF. For example, the peptides in the sample are provided by digestion of protein in the biological sample, e.g. by enzymatic or chemical digestion, for example by in-solution digestion and in-gel digestion, e.g. by trypsin digestion. Further, the present invention relates to a method, wherein the sample used in the method of the invention can be derived from biological material, e.g. from a plant, a plant part, a plant organ, a plant tissue, or a plant cell.
DETAILED DESCRIPTION AND DEFINITIONS
The present invention provides an improved protein mass spectrometry method which can detect the presence of known low abundant proteins in complex samples and still provide accurate data analysis.
Typically, during mass spectra-based protein content analysis all masses in the sample are analyzed. Accordingly, all masses that have a specific intensity threshold and a specific charge state are selected for fragmentation, resulting in a high false positive rate if the threshold is low. A high threshold results in more correct identification with fewer false positive hits but a reduced sensitivity, as masses below the threshold are missed. Typically, the measurement of charge states with multiple ionization stages reduces the sensitivity of the measurement, as the determination of a low abundant peptide can fail because the amount of the multiply ionized fragments is under the detection limit. These methods do not allow a fast and precise identification of the presence of a low abundant peptide in the complex sample. As discussed above, a key step in protein mass spectrometry is the step of selecting the specific peptide ion for a second stage of mass spectrometry. Too many ions selected will result in data which is difficult to interpret and hence lower the quality of the data analysis, while too few peptide ions will result in a reduction in the sensitivity of protein mass spectrometry, which is particularly important for detecting the presence of low abundant proteins in complex samples.
Accordingly, the present invention relates to a highly sensitive detection method for the determination of the presence of one or more known polypeptides in a protein sample comprising: i) performing a liquid chromatography (LC) coupled tandem mass spectrometry (MS/MS) analysis (LC-MS) of the peptides to separate peptides and detect peptide ions, ii) selecting during separation in the mass spectrometry (MS) the parent peptide ions of a specific mass to charge ratio (m/z value) (Selected Parent Ions), iii) fragmenting the Selected Parent Ions to generate fragment ions and subsequently performing mass spectrometry analysis to generate a Fragment Ion Data set, and iv) comparing the fragment ion data set with a reference spectral library or theoretically calculated spectra from a protein database to identify said polypeptide.
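
The comparison in step iv) can be pictured as a tolerance match between measured ions and library entries. The sketch below is illustrative only: the `LibraryIon` class, the function names, and the default tolerances (5 ppm and 0.1 min, chosen as example values) are assumptions and not an implementation prescribed by the method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LibraryIon:
    peptide: str    # label of the reference peptide (hypothetical)
    mz: float       # reference m/z value
    rt_min: float   # reference retention time in minutes


def ppm_error(measured_mz: float, reference_mz: float) -> float:
    """Relative mass deviation in parts per million."""
    return (measured_mz - reference_mz) / reference_mz * 1e6


def annotate(measured_mz: float, measured_rt_min: float,
             library: List[LibraryIon],
             ppm_tol: float = 5.0, rt_tol_min: float = 0.1) -> Optional[LibraryIon]:
    """Return the first library ion matching in both m/z and retention time."""
    for ion in library:
        mass_ok = abs(ppm_error(measured_mz, ion.mz)) <= ppm_tol
        rt_ok = abs(measured_rt_min - ion.rt_min) <= rt_tol_min
        if mass_ok and rt_ok:
            return ion
    return None


# Illustrative library with a single made-up entry.
library = [LibraryIon(peptide="PEPTIDEK", mz=500.0, rt_min=12.0)]
match = annotate(500.002, 12.05, library)   # about 4 ppm off, 0.05 min off
```

Requiring both criteria at once keeps an isobaric ion that elutes at a different time from being annotated as the library peptide.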
In one embodiment of the invention, the polypeptide in question is of rare presence or low abundancy in the sample. In another embodiment, the polypeptide in question is highly abundant but found in the spectrum in an area of high noise or complexity.
Thus, the inventors of the present invention have developed an improved method of protein mass spectrometry wherein during separation of the peptides in the mass spectrometry (MS) the parent peptide ions (Selected Parent Ions) are selected for a specific mass to charge ratio, for example within a pre-defined retention time range. Usually, the selection of a peptide ion (which is termed herein the ‘parent ion’) is dependent on the parent ion exceeding an intensity value threshold. The method of the invention now allows for the intensity value threshold to be varied. For example, the intensity value threshold is lowered to increase the sensitivity of the detection method compared to a method in which the Parent Peptide Ions are not pre-selected within a pre-defined retention time range or, for example, it is increased in an area of high noise. In this way, the operator of the method of the invention can select specific thresholds for specific parent ion data. The Parent Peptide Ions can have a pre-defined single and/or multiple charge state.
The present invention has several advantages. The pre-defined specific mass to charge ratio, for example in combination with a pre-defined retention time window, allows a fast peptide separation, reducing the running time of an analysis of a complex protein sample. The received dataset is less complex, such that the bioinformatic efforts are reduced. Advantageously, samples can be analyzed with a high threshold during the experiment, allowing data reduction and quicker analysis. If data in a pre-defined mass-to-charge-ratio range are analyzed, an automation of the data analysis is possible. Further, the method results in a lower false positive rate. Further, a charge state determination is not required. Therefore, the operator of the method of the invention can select particular regions of the mass spectral data for more detailed analysis. Specific aspects of the method of the invention are described below. In particular, in an embodiment of the method of the invention, the operator can choose to reduce the intensity value threshold at particular regions of the mass spectrum defined by a selected mass to charge ratio or specific mass to charge ratios by reference to an ‘inclusion list’ which is populated with parent ion data for known peptides derived from known proteins. In this way, the operator can select a higher sensitivity for this selection parameter of the protein mass spectrometry to detect the presence of low abundant known protein in complex biological samples. Accordingly, in the method of the present invention, the sensitivity of the method is increased if the threshold for the signal intensity detected in the mass spectrum is reduced. Also, the operator can select a lower sensitivity for this selection parameter of the protein mass spectrometry to detect the presence of low abundant known protein in complex biological samples.
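
The variable threshold described above can be sketched as a per-ion lookup: an ion that falls within the m/z tolerance and retention time window of an inclusion entry is judged against that entry's threshold instead of the default one. This is a simplified illustration, not instrument control code; `InclusionEntry`, `select_parent_ions`, the m/z tolerance and all numeric values are assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class InclusionEntry:
    mz: float            # parent ion m/z of interest
    rt_s: float          # expected retention time in seconds
    rt_window_s: float   # accepted deviation from rt_s in seconds
    threshold: float     # intensity threshold applied to this entry


def select_parent_ions(peaks: List[Tuple[float, float]], scan_rt_s: float,
                       inclusion: List[InclusionEntry],
                       default_threshold: float,
                       mz_tol: float = 0.01) -> List[float]:
    """Return the m/z values of survey-scan peaks selected for fragmentation.

    peaks is a list of (m/z, intensity) pairs from one survey scan.
    """
    selected = []
    for mz, intensity in peaks:
        threshold = default_threshold
        for entry in inclusion:
            in_mz = abs(mz - entry.mz) <= mz_tol
            in_rt = abs(scan_rt_s - entry.rt_s) <= entry.rt_window_s
            if in_mz and in_rt:
                # The entry may lower or raise the threshold for this region.
                threshold = entry.threshold
                break
        if intensity >= threshold:
            selected.append(mz)
    return selected


# Illustrative values: one weak ion of interest among two other peaks.
inclusion = [InclusionEntry(mz=600.30, rt_s=1200.0, rt_window_s=90.0, threshold=100.0)]
peaks = [(450.20, 5000.0), (600.305, 150.0), (700.10, 150.0)]
inside_window = select_parent_ions(peaks, 1230.0, inclusion, default_threshold=1000.0)
outside_window = select_parent_ions(peaks, 1400.0, inclusion, default_threshold=1000.0)
```

Inside the retention time window the weak ion at m/z 600.305 is selected, while the equally weak ion at m/z 700.10 is not, which is the intended effect of restricting the lowered threshold to listed ions.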
Accordingly, in the method of the present invention, the sensitivity of the method is decreased if the noise for the signal intensity detected in the mass spectrum is high.
Schematic diagrams showing the method of the invention are provided in Figure 1. Here it can be seen that existing methods of protein mass spectrometry would have failed to detect the presence of product ions derived from low abundant known proteins. In contrast, the method of the invention allows product ions listed in an inclusion list comprising the m/z values of the parent ions (the “Parent Ion Data Set”), e.g. in particular regions of the mass spectrum, to be selected for more detailed analysis, and hence allows the presence of low abundant known protein to be detected. The inclusion list or “Parent Ion Data Set” comprises the mass to charge ratios (m/z values) of the peptides that are selected for separation in the MS. Also, the “Parent Ion Data Set” can comprise the retention time and/or intensity value of the parent ion. Thus, an inclusion list can comprise retention times (RT) and parent ion mass (m/z, mass to charge) values of fragmented peptides derived from the known protein. Thus, a determination of the charge state as in methods of the prior art is, for example, not performed.
Further, according to the present invention, it is not necessary to determine the charge state of the peptide ions of the inclusion list. In standard methods, often only peptide ions that have multiple charges are characterized. The determination of charge states of low abundant peptides is difficult and cannot be performed easily, or at all, if a particular threshold is not met. The difficulties with the determination of the isotope pattern of low abundant peptides may not allow the charge state of the peptide to be derived. Accordingly, the present invention also relates to the method of the invention whereby the charge state of the peptides of the inclusion list or Parent Ion Data Set is not determined, for example, to achieve a higher probability of the selection of the peptide ion of interest for fragmentation.
While not wishing to be limited to particular uses, the method of the invention has particular utility in detecting the presence of known proteins in complex biological samples, such as a proteome extracted from biological materials, including from extracts from whole cells, tissue, organs etc. derived from microorganisms, plants or animals. Other uses of the method of the invention include the detection of low abundant proteins in biological samples such as biological medicines. Accordingly, in the method of the invention, the sample can be a biological sample, a complex peptide mixture, or a complex protein sample. Thus, the method of the present invention can comprise a first step of preparing the biological sample for protein mass spectrometry; the preparation can include but is not limited to one or more of the methods selected from the group consisting of: cell lysis, extract fractionation, depletion of abundant proteins, enrichment of target proteins, dialysis, desalting, protein digestion, and peptide separation.
By way of example, the present disclosure demonstrates the method of the invention by detecting the presence of low abundant protein in a protein mixture.
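
One way to populate such an inclusion list is from a theoretical tryptic digest of the known protein, combined with retention times known from earlier runs. The following sketch makes several assumptions that are not stated in the text: cleavage after K or R except before P, no missed cleavages, unmodified residues, and m/z values listed for a few candidate charge states. All names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def tryptic_digest(sequence: str) -> List[str]:
    """Cleave C-terminal to K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])   # C-terminal remainder
    return peptides


def peptide_mz(peptide: str, charge: int) -> float:
    """m/z of the protonated peptide at the given charge state."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def build_inclusion_list(sequence: str, known_rt_s: Dict[str, float],
                         charges: Sequence[int] = (2, 3)
                         ) -> List[Tuple[str, int, float, float]]:
    """Return (peptide, charge, m/z, retention time) for peptides with a known RT."""
    entries = []
    for pep in tryptic_digest(sequence):
        if pep in known_rt_s:
            for z in charges:
                entries.append((pep, z, peptide_mz(pep, z), known_rt_s[pep]))
    return entries


# Made-up sequence and retention time, for illustration only.
entries = build_inclusion_list("AKGRPLK", {"AK": 300.0}, charges=(1, 2))
```

Listing several candidate charge states per peptide is one possible way to avoid relying on a charge state determination at acquisition time.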
Terminology
As used herein, the following terminology is applied to the various features of the invention.
As used in the following, the terms “have”, “comprise” or “include” or any arbitrary grammatical variations thereof are used in a non-exclusive way. Thus, these terms may both refer to a situation in which, besides the feature introduced by these terms, no further features are present in the entity described in this context and to a situation in which one or more further features are present. As an example, the expressions “A has B”, “A comprises B” and “A includes B” may both refer to a situation in which, besides B, no other element is present in A (i.e. a situation in which A solely and exclusively consists of B) and to a situation in which, besides B, one or more further elements are present in entity A, such as element C, elements C and D or even further elements.
Further, as used in the following, the terms "preferably", "more preferably", "most preferably", "particularly", "more particularly", "specifically", "more specifically" or similar terms are used in conjunction with optional features, without restricting further possibilities. Thus, features introduced by these terms are optional features and are not intended to restrict the scope of the claims in any way. The invention may, as the skilled person will recognize, be performed by using alternative features. Similarly, features introduced by "in an embodiment of the invention" or similar expressions are intended to be optional features, without any restriction regarding further embodiments of the invention, without any restrictions regarding the scope of the invention and without any restriction regarding the possibility of combining the features introduced in such way with other optional or non-optional features of the invention. Moreover, if not otherwise indicated, the term "about" relates to the indicated value with the commonly accepted technical precision in the relevant field, preferably relates to the indicated value ± 20%, more preferably ± 10%, most preferably ± 5%.
The “retention time” value is the characteristic time it takes for a particular analyte to pass through the system (from the injection unit through the column to the detector) under set conditions. Hence chromatography is used to assign a specific retention time to a specific peptide in the analyzed sample.
The “mass-to-charge ratio (m/z)” value spectrum is a plot of an ion signal as a function of the ion’s mass-to-charge ratio. These spectra are used to determine the elemental and/or isotopic signature of a sample, the masses of particles and of molecules, and to elucidate the chemical structures of molecules, such as peptides and other chemical compounds, as well as the relative amount of different chemical compounds within a sample.
The “intensity” value is any value which reflects a measured signal intensity. The signal intensity, preferably, directly or indirectly correlates with the abundance of an ion as detected by the appropriate detection apparatus. This value is typically expressed as “counts per second”.
The “known protein” is any protein for which the amino acid sequence is known to the user of the method of the invention. As can be appreciated, this can include all proteins which may be present in a biological sample, as well as all proteins known from databases of protein amino acid sequences.
The “parent ion data set” or Parent Ion Data Set includes data relating to parent ions as measured from chromatography coupled mass spectrometry analysis. Each data set comprises the m/z, and can comprise further data, e.g. the retention time and intensity value of the parent ion. For example, the inclusion list may comprise the Parent Ion Data of the peptides to be analyzed.
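Purely as an illustration, an inclusion list entry as defined above could be held in a small record with a mandatory m/z value and optional retention time and intensity value. The class name, field names and numeric values below are hypothetical and are not taken from the disclosure; note that no charge state is stored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ParentIonEntry:
    """One entry of a hypothetical inclusion list ("Parent Ion Data Set")."""
    mz: float                                    # mass-to-charge ratio (always present)
    retention_time_min: Optional[float] = None   # optional retention time, in minutes
    intensity: Optional[float] = None            # optional intensity value, counts per second


def build_inclusion_list(entries: List[ParentIonEntry]) -> List[ParentIonEntry]:
    """Return the entries sorted by m/z; a charge state is neither stored nor needed."""
    return sorted(entries, key=lambda entry: entry.mz)


# Illustrative values only.
inclusion_list = build_inclusion_list([
    ParentIonEntry(mz=785.84, retention_time_min=32.5),
    ParentIonEntry(mz=523.77),
])
```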
The “mass spectrum” or “mass spectra” is a plot or graph of the intensity value, and hence the abundance of ions, as distributed according to the m/z value. In usual practice, and as used herein, the m/z value is on the x-axis and the intensity value is the y-axis. Hence the spectrum provides a representation of the relationship of the m/z value and the number of ions in a sample. In this context, a ‘peak’ relates to the number of ions having a certain m/z value as can be seen in the mass spectra graph.
Preparation of peptides derived from known proteins
The method of the invention is particularly helpful to determine if a specific polypeptide or a fragment thereof is present in a biological sample. Biological samples comprise a large variety of proteins and peptides. It is often very difficult to identify a polypeptide or a fragment that is rare or low abundant in said biological sample.
Proteins are macromolecules, which are built of amino acids. The amino acid sequence determines the chemical and physical properties of proteins. There are proteins which are secreted and are present in fluids of organisms (e.g. blood) or in the environment (e.g. proteins produced by bacteria). Other proteins are embedded in cells or tissue. Accordingly, the present invention relates to a method, wherein the sample used in the method of the invention can be derived from biological material, e.g. body fluid or a liquid of an extract from tissues, organs or cells, including extracts e.g. from a plant, a plant part, a plant organ, a plant tissue, or a plant cell. Proteins are isolated from cells or tissue using homogenization (mechanical, ultrasonic, etc.) in buffers which support the solubility of proteins. Hydrophobic proteins with poor solubility can be solubilized using detergents. Proteins in solution can be directly used for analytics or purified further. Further purification can occur by precipitation or different chromatographic techniques (size exclusion chromatography, cation/anion exchange, immune-affinity chromatography, etc.). If the sample comprises proteins and other complex molecules from a biological sample or a fraction thereof, the method of the present invention can comprise a preparation step, e.g. a first step, of preparing the biological sample for protein mass-spectrometry. The preparation includes but is not limited to one or more methods selected from the group consisting of: cell lysis, extract fractionation, depletion of abundant proteins, enrichment of target proteins, dialysis, desalting, protein digestion, and peptide separation.
Following the isolation of protein from biological samples, the protein is then prepared as peptides for subsequent use in the method of the invention. Methods of preparing peptides for use in chromatography coupled mass spectrometry analysis are well known in the art. Since mass spectrometry analysis of intact proteins is difficult owing to limited resolution capability, proteins can be degraded by specific enzymes to peptides. Peptides then represent the corresponding protein in the biological sample and can be easily analyzed by mass spectrometry. Trypsin, which cleaves after Arginine and Lysine residues, is commonly used for protein digests. Endoproteinase Glu-C and Lys-C are used alternatively or in combination with Trypsin. The resulting peptide mixture can be analyzed directly or purified further or enriched. For example, the peptides in the sample are provided by digestion of protein in the biological sample, e.g. by enzymatic or chemical digestion, for example in-solution digestion and in-gel digestion, e.g. by trypsin digestion.
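The tryptic cleavage rule stated above (cleavage after Arginine and Lysine) can be sketched as a simple in-silico digest. This is a simplified illustration, assuming complete cleavage and ignoring missed cleavages and any sequence-context effects; the function name and the example sequence are made up for the purpose.

```python
from typing import List


def tryptic_digest(sequence: str) -> List[str]:
    """Split a protein sequence after every K (lysine) or R (arginine) residue."""
    peptides: List[str] = []
    current: List[str] = []
    for residue in sequence.upper():
        current.append(residue)
        if residue in ("K", "R"):
            # Cleavage site reached: close the current peptide.
            peptides.append("".join(current))
            current = []
    if current:
        # C-terminal peptide that does not end in K or R.
        peptides.append("".join(current))
    return peptides


print(tryptic_digest("MKTAYIAKQRQISFVK"))
# prints ['MK', 'TAYIAK', 'QR', 'QISFVK']
```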
Peptide separation and data generation
The method of the invention comprises a step in which a chromatography coupled mass spectrometry analysis of the peptides is performed, for example a liquid chromatography (LC) coupled mass spectrometry (MS) analysis (LC-MS). For example, the method of the invention comprises a step of performing a liquid-chromatography (LC) coupled mass spectrometry (MS) analysis (LC-MS) of the peptides to separate peptides and detect peptide ions.
The term "chromatography coupled mass spectrometry" as used herein relates to mass spectrometry which is coupled to a prior chromatographic separation of the peptide(s) comprised by the samples to be investigated.
Chromatography is a laboratory technique for the separation of a mixture. The mixture is dissolved in a fluid called the mobile phase, which carries it through a structure holding another material called the stationary phase. The various constituents of the mixture travel at different speeds, causing them to separate. The separation is based on differential partitioning between the mobile and stationary phases. Subtle differences in a peptide’s partition coefficient result in differential retention on the stationary phase and thus affect the separation.
Suitable techniques for separation to be used preferably in accordance with the present invention, therefore, include all chromatographic and/or electrophoretic separation techniques such as liquid chromatography (LC). The term liquid chromatography as used herein shall encompass but not be limited to techniques such as high-performance liquid chromatography (HPLC), ultra-performance liquid chromatography (UPLC), thin layer chromatography, size exclusion, affinity chromatography and capillary electrophoresis (CE). Most preferably, LC, UPLC and/or HPLC are chromatographic techniques to be envisaged by the method of the present invention. Suitable devices for such determination of analyte(s) are well known in the art.
Following the chromatography stage, the peptides are then analyzed by mass spectrometry.
Mass spectrometry (MS) is an analytical technique that ionizes chemical species and sorts the ions based on their mass-to-charge ratio prior to detection of the ion signal (determining the intensity of the ion beam or counting the ions within a certain time). Mass spectrometry is used in many different fields and is applied to pure samples as well as complex mixtures.
In a typical MS procedure of the invention, a sample, which is liquid or solid, is ionized, for example by proton transfer, e.g. from solvent. This may cause some of the sample's molecules to be converted into charged ions, termed “full scan ions”. These full scan ions are then separated according to their mass-to-charge ratio, typically by accelerating them and subjecting them to an electric and/or magnetic field: full scan ions of the same mass-to-charge ratio (m/z) will undergo the same amount of deflection. The full scan ions are detected by a mechanism capable of detecting charged particles, for example an appliance including an electron multiplier. Results are displayed as spectra of the relative abundance of detected full scan ions as a function of the mass-to-charge ratio. Hence mass spectrometry is used to assign one or a group of specific m/z of an ion or ions generated from a specific peptide in the analyzed sample, due to the ionization process.
Mass spectrometry as used herein encompasses all techniques which allow for the determination of the molecular weight (i.e. the mass) or a mass variable corresponding to a peptide to be determined in accordance with the present invention. Preferably, mass spectrometry is used, in particular, liquid-chromatography coupled mass spectrometry (LC-MS). The term liquid-chromatography (LC) coupled mass spectrometry (MS) analysis (LC-MS) shall encompass but not be limited to liquid-chromatography (LC) coupled mass spectrometry (MS), direct infusion mass spectrometry or Fourier-transform ion-cyclotron-resonance mass spectrometry (FT-ICR-MS), capillary-electrophoresis mass spectrometry (CE-MS), high-performance liquid-chromatography coupled mass spectrometry (HPLC-MS), quadrupole mass spectrometry, any sequentially coupled mass spectrometry, such as MS-MS or MS-MS-MS, and ion mobility mass spectrometry or time of flight mass spectrometry (TOF).
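As a worked illustration of how one peptide can give rise to a group of m/z values, the sketch below computes the m/z of a peptide of neutral monoisotopic mass M carrying z protons, m/z = (M + z x 1.007276) / z. It assumes positive-mode ionization by protonation only; the function names and the example mass are hypothetical. Pre-computing such values for several charge states is one conceivable way to fill an inclusion list without determining the charge state during acquisition.

```python
from typing import Dict, Iterable

PROTON_MASS = 1.007276  # mass of a proton in Da


def mz_for_charge(neutral_mass: float, charge: int) -> float:
    """m/z of a molecule of the given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def candidate_mz_values(neutral_mass: float,
                        charges: Iterable[int] = (1, 2, 3)) -> Dict[int, float]:
    """Map each assumed charge state to the corresponding m/z value."""
    return {z: mz_for_charge(neutral_mass, z) for z in charges}


# Illustrative neutral mass of 1000.0 Da.
candidates = candidate_mz_values(1000.0)
```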
In an embodiment of the invention, the chromatography coupled mass spectrometry analysis is performed by liquid chromatography.
For this step of the method of the invention, a selection criterion for separation of the parent peptide ions of a specific mass to charge ratio is provided, the “Parent Ion Data Set”. As discussed above, the “Parent Ion Data Set” comprises the mass to charge ratio for selection and, preferably, the retention time and/or intensity threshold value of the parent ion.
As stated above, the intensity value is a measured signal intensity. The signal intensity, preferably, directly or indirectly correlates with the abundance of an ion as detected by the appropriate detection apparatus. This value is typically expressed as “counts per second”.
It can be appreciated that in chromatography coupled mass spectrometry analysis of samples within the mass spectra there will be peaks which are generated by real peptides, and peaks which are generated by random electrical or chemical ‘noise’. This phenomenon is well known in the art and is termed ‘background noise’ or simply ‘noise’. This can be problematic in subsequent data analysis when assigning parent ions to specific peptides. There are several different approaches by which the background noise can be excluded from further analysis. One method is to choose an ‘intensity value threshold’ such that only parent ions which exceed a specified quantity limit progress to subsequent analysis. As can be appreciated, the value of the intensity threshold will vary between different chromatography coupled mass spectrometry apparatuses. Accordingly, the method of the invention allows for the intensity value threshold to be varied according to known parent ion data. In this way, the operator of the method of the invention can select specific thresholds for specific parent ion data and select a specific set of parent ions.
Hence, rather than an intensity value threshold being assigned across the full mass spectrum, specific values can be selected for particular regions of the spectrum, for example, for certain m/z values. Since mass spectra are acquired during peptide separation in short temporal intervals, specific values can be selected for a certain retention time (RT) range and m/z values. Accordingly, in one embodiment, the method of the invention comprises the step of selecting during separation in the MS the parent peptide ions (Selected Parent Ions). The selection can occur, for example, within a pre-defined retention time window or range. The method of the invention is particularly useful to detect the presence of one or more known highly abundant polypeptides in the sample. Accordingly, the operator may set the chromatography coupled mass spectrometry apparatus to have a high intensity value threshold so as to increase the likelihood that parent ions derived from real peptides are analyzed in the method of the invention, or to decrease the number of measured peptides in this area, e.g. if only high abundant peptides shall be measured, for example as an internal control within a measurement, or if the peptide in question is known to be found in an area of the spectrum in which a large amount of noise at the certain RT range is also found.
The method of the invention is particularly useful to detect the presence of one or more known low abundant polypeptides in the sample. Accordingly, for a certain RT range and m/z values the operator may set the chromatography coupled mass spectrometry apparatus to have a low intensity value threshold so as to increase the sensitivity of the method of the invention by ensuring that parent ions derived from low abundant peptides are analyzed in the method of the invention, or if it is known that there will be a low amount of noise at that certain RT range.
In one embodiment, the method of the invention is performed without determination of a charge state of the peptides. The method of the invention can also be performed with a determination of the charge state. In one embodiment, for the polypeptide in question a single and/or multiple-charged charge state is pre-defined.
Hence, by selecting parent ions which exceed an intensity value threshold, the method of the invention includes a step in which only parent ions which are derived from specific peptides present in the sample are analysed. Accordingly, in an embodiment of the invention, the intensity value threshold varies for the parent ion data set for certain RT windows or ranges.
By ‘varied according to the parent ion data’ we include where the intensity value threshold is increased relative to the intensity value threshold assigned across the mass spectrum. Thus, if the intensity value threshold is expressed as 100, then the intensity value threshold of a specific mass to charge ratio (m/z value) may be increased to 120, 140, 160, 180, 200, 300, 500 or more. As will be understood by the skilled person, the higher the intensity value threshold selected for a particular region of the mass spectrum, the fewer parent ions will be analyzed in the method of the invention.
By ‘varied according to the parent ion data’ we also include where the intensity value threshold is reduced relative to the intensity value threshold assigned across the mass spectrum. Thus, if the intensity value threshold is expressed as 100, then the intensity value threshold of a specific mass to charge ratio (m/z value) may be decreased to 80, 60, 50, 40, 30, 20, 15, 10, 5, 4, 3, 2, 1 or 0. At 0, there will be no intensity value threshold and hence every parent ion in that region of the mass spectrum will be analyzed in the method of the invention.
A further embodiment of the invention is wherein the intensity value threshold for a Selected Parent Ion data set is selected from an inclusion list comprising parent ion data sets of known peptides. In this embodiment of the invention, an ‘inclusion list’ of parent ion data sets of known peptides is used in the method of the invention. This inclusion list can be prepared by the operator of the method of the invention by subjecting known protein samples to the method of the invention for peptides whose presence shall be determined by the method of the invention, e.g. low abundant polypeptides. The peptides may not have been isolated from a biological sample. In one embodiment, the peptides can be considered as ‘control’ or ‘reference samples’. In another embodiment, the presence of the low abundant peptides is detected. Alternatively, the operator of the method of the invention can prepare an inclusion list from data available in the art and known to the skilled person, e.g. as shown in the peptide atlas or published, e.g.
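The region-specific thresholding described above can be sketched as follows, assuming a default threshold of 100 across the spectrum and operator-defined regions (m/z range plus RT window) that raise or lower it. All names and numbers are hypothetical. The comparison uses "greater than or equal" so that a threshold of 0 admits every parent ion in the region, as stated above.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Ion = Tuple[float, float, float]  # (m/z, retention time in min, intensity)


@dataclass(frozen=True)
class ThresholdRegion:
    """A region of the spectrum with its own intensity value threshold."""
    mz_min: float
    mz_max: float
    rt_min: float
    rt_max: float
    threshold: float


def threshold_for(mz: float, rt: float,
                  regions: Sequence[ThresholdRegion], default: float) -> float:
    """Return the threshold of the first region containing (mz, rt), else the default."""
    for region in regions:
        if region.mz_min <= mz <= region.mz_max and region.rt_min <= rt <= region.rt_max:
            return region.threshold
    return default


def select_parent_ions(ions: Sequence[Ion], regions: Sequence[ThresholdRegion],
                       default: float = 100.0) -> List[Ion]:
    """Keep the ions whose intensity reaches the threshold that applies to them."""
    return [ion for ion in ions
            if ion[2] >= threshold_for(ion[0], ion[1], regions, default)]


regions = [
    ThresholdRegion(500.0, 510.0, 30.0, 35.0, threshold=10.0),   # lowered: low abundant target
    ThresholdRegion(800.0, 900.0, 0.0, 60.0, threshold=300.0),   # raised: noisy region
]
ions = [(505.3, 32.0, 40.0), (650.0, 20.0, 40.0), (850.0, 10.0, 150.0), (650.0, 20.0, 120.0)]
selected = select_parent_ions(ions, regions)
```

In this made-up example the ion at m/z 505.3 is selected although its intensity of 40 is below the default of 100, while the ion at m/z 850.0 is rejected although its intensity of 150 is above it.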
• http://www.peptideatlas.org/ (http://pubs.acs.org/doi/10.1021/acs.jproteome.7b00467)
• Published data (often supplemental data): http://www.mcponline.org/content/11/3/M111.013722.long
When used in the method of the invention, the operator may set the chromatography coupled mass spectrometry apparatus to have a low intensity value threshold to increase the sensitivity of the method of the invention by ensuring parent ions derived from real peptides are analyzed in the method of the invention. In this way, the method of the invention allows for parent ions, and hence peptides, derived from low abundant protein to be detected which would have not been detected when the intensity value threshold is set at a higher value. Therefore, this step of the method of the invention increases sensitivity.
Fragmentation of selected parent ions
The method of the invention comprises a further step of fragmenting the Selected Parent Ions to generate fragment ions and subsequently performing mass spectrometry analysis to generate a Fragment Ion Data Set. According to the method of the invention, parent ions which exceed the intensity value threshold are selected for subsequent analysis by the generation of fragment ions and subsequent mass spectrometry analysis. Methods of generating fragment ions from parent ions are well known in the art. For example, fragment ions may be generated by collision-induced dissociation, ion-molecule reaction, photodissociation, or another process.
The resulting ions are then separated and detected in a second stage of mass spectrometry. Hence the method of the invention can be performed in a tandem mass spectrometer.
A tandem mass spectrometer can include one or more physical mass analyzers that perform two or more mass analyses. A mass analyzer of a tandem mass spectrometer (MS/MS) can include, but is not limited to, a time-of-flight (TOF), a triple quadrupole, an ion trap, a linear ion trap, an orbitrap, or an Ion Cyclotron Resonance mass analyzer (orbitrap as well as ICR are Fourier transform MS). A tandem mass spectrometer can also include a separation device. The separation device can perform a separation technique that includes, but is not limited to, liquid chromatography, capillary electrophoresis, or ion mobility. As an alternative, ion mobility can be used in combination with liquid chromatography separation techniques.
In an embodiment of the method of the invention, the parent ion fragmentation and mass spectrometry analysis in step iv) is performed using MS/MS, preferably QToF.
From this step of the method of the invention, a fragment ion data set is then known. This fragment ion data set may comprise (i) retention time, m/z and/or intensity values of the parent ion, and (ii) m/z and/or intensity values of the fragment ion.
Comparing data sets with a reference library
The method of the invention comprises a step of comparing the parent ion data set and the fragment ion data set with a reference library from known peptides to identify the known protein. As stated above, the “parent ion data set” comprises for example the retention time, m/z and intensity value of the parent ion, and the fragment ion data set comprises for example (i) retention time, m/z and intensity values of the parent ion, and (ii) m/z and intensity values of the fragment ion.
The method of the invention uses data analysis techniques which are described below and are implemented on a computer system, with elements including processor, data storage, and input/output devices and connections as known to a person of skill. While features of the data analysis techniques are implemented in software on a computer readable medium, a person of skill, with reference to this description, can prepare the appropriate computer-readable code for a computer system on which the embodiment is implemented, and as such software code and pseudo-code are not provided herein. It will be appreciated that various hardware and/or software combinations may be used to implement different embodiments. The data analysis comprises the use of the parent ion data set and the fragment ion data set, and data mining of reference spectra libraries. Reference spectra libraries of peptides may be generated for synthetic peptides and/or from prior MS analyses performed on the biological sample under investigation. Similarly, the reference spectra libraries of peptides may be generated from synthetic peptide references and/or from prior MS analyses of analyte peptides. Importantly, once the reference libraries have been generated, they can be used perpetually. Hence in an embodiment of the invention the reference library comprises parent ion data and the fragment ion data set prepared from standards of the known protein, from existing spectral libraries, or computationally generated by applying empirical or a priori fragmentation or modification rules to the known protein.
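Although no code forms part of the disclosure, the idea of generating library entries computationally from a known sequence can be illustrated with one widely used a priori rule set, namely singly protonated b and y ions of an unmodified peptide. The choice of this ion series is an assumption made for illustration; the function name is hypothetical, and the residue masses are standard monoisotopic values.

```python
from typing import List, Tuple

# Monoisotopic residue masses in Da (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565   # Da
PROTON_MASS = 1.007276   # Da


def theoretical_b_y_ions(peptide: str) -> Tuple[List[float], List[float]]:
    """Return m/z lists (b1..b(n-1), y1..y(n-1)) for singly charged fragments."""
    masses = [RESIDUE_MASS[residue] for residue in peptide.upper()]
    n = len(masses)
    # b_k: first k residues plus one proton.
    b_ions = [sum(masses[:k]) + PROTON_MASS for k in range(1, n)]
    # y_k: last k residues plus water plus one proton.
    y_ions = [sum(masses[-k:]) + WATER_MASS + PROTON_MASS for k in range(1, n)]
    return b_ions, y_ions


b_ions, y_ions = theoretical_b_y_ions("GAK")  # illustrative tripeptide
```

For a peptide ending in lysine, the first y ion computed this way lies at m/z 147.113, which is the value commonly observed for tryptic peptides with a C-terminal lysine.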
From the analysis of the parent ion data set and the fragment ion data set and the reference library, fragment ions and parent ions can be assigned to known peptides and from subsequent analysis to known proteins.
The confidence in the protein identification can be scored, for example, based on the mass accuracy and/or the relative intensities of the acquired product ion fragments compared to that of the reference (or predicted) fragmentation spectrum, on the number of matched fragments, on the similar chromatographic characteristics (co-elution, peak shape, etc.) of the extracted ion traces of these fragments. Probabilities for the identifications can be determined, for example, by searching (and scoring) similarly for decoy precursor fragment ions from the same LC-MS dataset. The relative quantification can be performed by integration of the product ions traces across the chromatographic elution of the precursor. In various embodiments, use is made of differently isotopically labeled reference analytes (similarly identified, quantified and scored) to achieve absolute quantification of the corresponding precursors of interest.
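The integration of a product ion trace across the chromatographic elution can be sketched with the trapezoidal rule. This is only one of several possible integration schemes, chosen here for simplicity; the function name and the data points are hypothetical.

```python
from typing import Sequence


def integrate_trace(times: Sequence[float], intensities: Sequence[float]) -> float:
    """Trapezoidal area under an extracted ion trace (intensity versus retention time)."""
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        # Area of the trapezoid between two consecutive time points.
        area += 0.5 * (intensities[i] + intensities[i - 1]) * (times[i] - times[i - 1])
    return area


# Illustrative traces for the same product ion in a sample and in a reference run.
sample_area = integrate_trace([0.0, 1.0, 2.0], [0.0, 10.0, 0.0])
reference_area = integrate_trace([0.0, 1.0, 2.0], [0.0, 20.0, 0.0])
relative_amount = sample_area / reference_area
```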
Hence a further embodiment of the invention is optional and comprises calculating a score that represents how well the parent ion data set and the fragment ion data set fit to the reference library data.
Peptide annotation is performed by comparing the m/z ratio from each ion (ion full scan as well as in MS/MS) contained in the library and the retention time of the analyte. When the mass measured is within the expected range of the user (e.g. < 5 ppm deviation compared to the library) and the retention time measured is within the expected range (e.g. +/- 0.1 min), then the ion is annotated as a match to the ion contained in the library.
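One conceivable score of this kind, given here only as a sketch, is the cosine similarity between the relative intensities of the acquired fragments and those of the reference spectrum. It assumes that the two intensity lists have already been aligned fragment by fragment; it is not the scoring used by any particular software package, and the function name is hypothetical.

```python
import math
from typing import Sequence


def cosine_score(acquired: Sequence[float], reference: Sequence[float]) -> float:
    """Cosine similarity between two aligned fragment intensity vectors."""
    if len(acquired) != len(reference):
        raise ValueError("intensity vectors must be aligned and of equal length")
    dot = sum(a * r for a, r in zip(acquired, reference))
    norm_acquired = math.sqrt(sum(a * a for a in acquired))
    norm_reference = math.sqrt(sum(r * r for r in reference))
    if norm_acquired == 0.0 or norm_reference == 0.0:
        return 0.0  # no signal on one side: treat as no similarity
    return dot / (norm_acquired * norm_reference)
```

A value close to 1 indicates that the acquired fragments have the same relative intensities as the reference, irrespective of the absolute signal level.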
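The annotation rule can be written down directly, using the example tolerances given above (less than 5 ppm mass deviation and +/- 0.1 min retention time deviation). The function names are hypothetical, and the mass deviation is computed as (measured - library) / library x 10^6.

```python
def ppm_error(measured_mz: float, library_mz: float) -> float:
    """Signed mass deviation of the measured m/z from the library m/z, in ppm."""
    return (measured_mz - library_mz) / library_mz * 1e6


def is_match(measured_mz: float, measured_rt: float,
             library_mz: float, library_rt: float,
             max_ppm: float = 5.0, rt_tolerance_min: float = 0.1) -> bool:
    """Annotate an ion as a match if both m/z and retention time are within tolerance."""
    mass_ok = abs(ppm_error(measured_mz, library_mz)) < max_ppm
    rt_ok = abs(measured_rt - library_rt) <= rt_tolerance_min
    return mass_ok and rt_ok
```

For instance, an ion measured at m/z 500.001 and 30.05 min matches a library entry at m/z 500.000 and 30.00 min (2 ppm, 0.05 min), whereas an ion measured at m/z 500.010 does not (20 ppm).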
The annotation of several ions provides independent indications that a given peptide is present in the matrix. Using the strategies outlined above, and other alternatives which are known to the skilled person, the method provides the identification of proteins present in the biological sample.
There are various software packages available which can be used in this step of the method of the invention. Simply by way of example, in the accompanying examples the inventors used Mascot server software (Matrix Science Limited, Version 2.6.1) (http://www.matrixscience.com/server.html).
Sample preparation
The term "providing" as used herein means that the at least one biological sample is provided in a manner suitable for determining the protein content comprised by said biological sample. Accordingly, providing as used herein also refers to carrying out suitable pre-treatments, i.e. most preferably concentration or fractioning of the sample and/or extraction of the sample. Depending on the technique which is used to determine the protein content comprised by said biological sample, additional pre-treatments may be required.
Biological sample
The term "biological sample", as used herein, relates to a sample comprising a biological material, wherein the term "biological material", preferably, includes any substance or mixture of substances produced by a cell, preferably including substances and mixtures of substances produced by such biological material. Accordingly, in the method of the invention, the sample is derived from biological material, e.g. from an extract derived from microorganisms or multicellular organisms, e.g. plant, plant part, plant organ, plant tissue, or plant cell. Preferably, the biological material comprises a multitude of proteins of a cell. Preferably, the biological sample is a sample of a material comprising a non-defined mixture of proteins, such as a cell culture medium comprising serum, a spent cell culture medium, a bodily fluid of an organism, tissue of an organism, and the like. Thus, preferably, the biological sample is a cell culture sample from archaebacterial, bacterial, and/or eukaryotic cells, wherein said cell culture sample preferably comprises cells and/or spent culture medium; preferably, in such case, the biological sample is a sample of cultured bacterial, fungal, plant (such as a dicot or monocot plant, more preferably a crop plant), algae, human or animal cells and/or spent medium of said cells.
Most preferably, the biological sample is a sample of and/or spent culture medium from E. coli cells, Paenibacillus cells, Basfia succiniciproducens cells, Corynebacterium glutamicum, Lactobacillus, Bacillus acidopullulyticus cells, Bacillus amyloliquefaciens cells, Bacillus lentus cells, Bacillus licheniformis cells, Bacillus subtilis cells, Aspergillus niger cells, Aspergillus oryzae cells, Chrysosporium lucknowense cells, Myceliophthora thermophila cells, Penicillium chrysogenum cells, Penicillium funiculosum cells, Rhizomucor miehei cells, Schizophyllum commune cells, Trichoderma harzianum cells, Trichoderma longibrachiatum cells, Trichoderma reesei cells, yeast cells, Saccharomyces cerevisiae cells, Schizosaccharomyces pombe cells, Pichia pastoris cells, Kluyveromyces lactis cells, Kluyveromyces fragilis cells, Candida rugosa cells, Candida lipolytica cells, Candida antarctica cells, CHO cells (Chinese hamster ovary cells), liver cells, hepatocytes, kidney cells, kidney cancer cells, pancreatic cells, pancreatic cancer cells, cardiac cells, cardiac cancer cells, endothelial cells, endothelial cancer cells, fibroblasts, lung cells, lung cancer cells, bladder cells, bladder cancer cells, breast cells, breast cancer cells, colon cells, colon cancer cells, ovarian cells, ovarian cancer cells, duodenum cells, duodenum cancer cells, bile duct cells, bile duct cancer cells, stem cells or skin cells.
As used herein, the term "plant" relates to a whole plant, a plant part, a plant organ, a plant tissue, or a plant cell. Thus, the term includes, preferably, seeds, shoots, stems, leaves, roots (including tubers), and flowers. Preferably, the term "plant" relates to a member of the clade Archaeplastida. Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, preferably Tracheophyta, more preferably Spermatophytina, most preferably monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs selected from the list comprising Acer spp., Actinidia spp., Abelmoschus spp., Agave sisalana, Agropyron spp., Agrostis stolonifera, Allium spp., Amaranthus spp., Ammophila arenaria, Ananas comosus, Annona spp., Apium graveolens, Arachis spp., Artocarpus spp., Asparagus officinalis, Avena spp. (e.g. Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Averrhoa carambola, Bambusa sp., Benincasa hispida, Bertholletia excelsea, Beta vulgaris, Brassica spp. (e.g. Brassica napus, Brassica rapa ssp. [canola, oilseed rape, turnip rape]), Cadaba farinosa, Camellia sinensis, Canna indica, Cannabis sativa, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea spp., Ceiba pentandra, Cichorium endivia, Cinnamomum spp., Citrullus lanatus, Citrus spp., Cocos spp., Coffea spp., Colocasia esculenta, Cola spp., Corchorus sp., Coriandrum sativum, Corylus spp., Crataegus spp., Crocus sativus, Cucurbita spp., Cucumis spp., Cynara spp., Daucus carota, Desmodium spp., Dimocarpus longan, Dioscorea spp., Diospyros spp., Echinochloa spp., Elaeis (e.g. Elaeis guineensis, Elaeis oleifera), Eleusine coracana, Eragrostis tef, Erianthus sp., Eriobotrya japonica, Eucalyptus sp., Eugenia uniflora, Fagopyrum spp., Fagus spp., Festuca arundinacea, Ficus carica, Fortunella spp., Fragaria spp., Ginkgo biloba, Glycine spp. (e.g. Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g. Helianthus annuus), Hemerocallis fulva, Hibiscus spp., Hordeum spp. (e.g. Hordeum vulgare), Ipomoea batatas, Juglans spp., Lactuca sativa, Lathyrus spp., Lens culinaris, Linum usitatissimum, Litchi chinensis, Lotus spp., Luffa acutangula, Lupinus spp., Luzula sylvatica, Lycopersicon spp. (e.g. Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma spp., Malus spp., Malpighia emarginata, Mammea americana, Mangifera indica, Manihot spp., Manilkara zapota, Medicago sativa, Melilotus spp., Mentha spp., Miscanthus sinensis, Momordica spp., Morus nigra, Musa spp., Nicotiana spp., Olea spp., Opuntia spp., Ornithopus spp., Oryza spp. (e.g. Oryza sativa, Oryza latifolia), Panicum miliaceum, Panicum virgatum, Passiflora edulis, Pastinaca sativa, Pennisetum sp., Persea spp., Petroselinum crispum, Phalaris arundinacea, Phaseolus spp., Phleum pratense, Phoenix spp., Phragmites australis, Physalis spp., Pinus spp., Pistacia vera, Pisum spp., Poa spp., Populus spp., Prosopis spp., Prunus spp., Psidium spp., Punica granatum, Pyrus communis, Quercus spp., Raphanus sativus, Rheum rhabarbarum, Ribes spp., Ricinus communis, Rubus spp., Saccharum spp., Salix sp., Sambucus spp., Secale cereale, Sesamum spp., Sinapis sp., Solanum spp. (e.g. Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Sorghum bicolor, Spinacia spp., Syzygium spp., Tagetes spp., Tamarindus indica, Theobroma cacao, Trifolium spp., Tripsacum dactyloides, Triticosecale rimpaui, Triticum spp. (e.g. Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), Tropaeolum minus, Tropaeolum majus, Vaccinium spp., Vicia spp., Vigna spp., Viola odorata, Vitis spp., Zea mays, Zizania palustris, Ziziphus spp., amongst others. Preferably, the plant cell, plant or plant part is a rice cell, rice plant, rice plant part, or rice seed.
Preferably, the sample is a sample from a multicellular organism. More preferably, the sample comprises a bodily fluid of an organism and/or a tissue of an organism. Preferably, the biological sample is a sample of an animal, preferably a vertebrate, more preferably a mammal. More preferably, the biological sample is a sample of an egg, a preferably non-human embryo, or a complete non-human organism, e.g. an insect, a nematode, or a laboratory animal. Preferably, the biological sample is or comprises a sample of a body fluid, a sample from a tissue or an organ, or a sample of wash/rinse fluid or a swab or smear obtained from an outer or inner body surface. Preferably, samples of stool, urine, saliva, sputum, tears, cerebrospinal fluid, blood, serum, plasma, lymph or lacrimal fluid are encompassed as biological samples by the method of the present invention. In particular in multicellular organisms, biological samples can be obtained by use of brushes, (cotton) swabs, spatula, rinse/wash fluids, punch biopsy devices, puncture of cavities with needles or lancets, or by surgical instrumentation. However, biological samples obtained by well-known techniques including, in an embodiment, scrapes, swabs or biopsies are also included as samples of the present invention. Cell-free fluids may be obtained from the body fluids or the tissues or organs by lysing techniques such as homogenization and/or by separating techniques such as filtration or centrifugation. It is to be understood that a sample may be further processed in order to carry out the method of the present invention. Particularly, cells may be removed from the sample by methods and means known in the art. More preferably, the biological sample is a sample of a body fluid, preferably a blood, plasma, or serum sample.
Also more preferably, the biological sample is a tissue sample, preferably a sample of liver tissue, heart tissue, prostate tissue, pancreas tissue, brain tissue, kidney tissue, adipose tissue, gut, skeleton tissue, lung tissue, bladder, breast tissue, cecum and/or skin tissue, such as dermal layer, comprising the epidermis and/or corium and/or subcutis. Also, preferably, the biological sample is a sample of an alga or plant, preferably of a monocotyledonous or dicotyledonous plant. More preferably, said biological sample is a tissue sample, preferably leaf tissue, root tissue, shoot tissue, stem tissue, reproductive tissue (such as flower tissue or pollen) and/or seed tissue and/or liquid comprising exudate thereof and/or volatile compounds released thereof.
Figures:
Figure 1: Example of intensity threshold variation depending on the mass-to-charge selection
For specific m/z values (inclusion list), the threshold was reduced and the presence of a low abundant polypeptide could be shown.
Examples:
Detection of low abundant protein in a protein mixture.
MS Qual/Quant QC Mix was used as a model system for complex peptide mixture including low abundant peptides for HPLC-MS/MS analysis.
Introduction
Identification of proteins can be carried out with database searches using mass spectrometry data. Generally, as a first step a protease digests the protein of interest to peptides. Proteases cleave peptide bonds with a well-defined specificity, resulting in a peptide mixture which can be analyzed using mass spectrometry. The complexity of the peptide mixture is reduced through chromatographic separation prior to the introduction of the sample to the mass spectrometer. The determination of mass-to-charge ratio (m/z) of the peptides in mixture allows their assignment to the theoretically calculated peptide masses of known proteins in sequence databases. Additionally, peptides can be fragmented in a mass spectrometer. In this case the peptide sequences from the database are used to calculate a theoretical fragmentation pattern for comparison with the experimental results. The use of peptide fragmentation data increases the confidence of protein identification. Peptide-protein matches are ranked using a scoring system based on the number of peptides detected in a particular protein sequence and on their fragmentation spectrum.
However, the protein identification approach described above is limited in its ability to identify low abundant proteins with high confidence. A low intensity threshold for the selection of peptide precursors during Information Dependent Acquisition (IDA) increases the False Discovery Rate (FDR). Here we propose to use peptide ion masses calculated from a theoretical digest or taken from experimental data (spectral libraries or test experiments), together with the retention time (RT), if available, for so-called Inclusion Lists. In such cases, a higher intensity threshold can be used for the analysis of the major protein components in the sample, and a lower intensity threshold and a retention time window can be used for the peptides on the Inclusion List. Low abundant peptides on the Inclusion List are preferentially selected for fragmentation, but only within a short time window. The corresponding proteins can therefore be identified with high confidence, as reflected by a low FDR value.
Materials and methods
MS Qual/Quant QC Mix from Sigma-Aldrich was used as the test sample for analysis. An Agilent 1100 Series µ-HPLC coupled to a QTOF 5600+ mass spectrometer from Sciex (Framingham, USA) was used as the HPLC-MS/MS system. Mascot Server software version 2.6.1 (Matrix Science Ltd, London, UK) was used as the database search engine. Mass spectrometer raw files were converted into peak lists using Mascot Daemon software (version 2.6.0) with AB Sciex Converter (version 1.3).
Methods used
MS Qual/Quant QC Mix was re-dissolved in 500 µl of 0.1% formic acid. The sample was diluted 1:10 and 5 µl of the sample were injected into the HPLC-MS/MS system. Peptides were separated on a C18 reversed-phase chromatography column coupled directly to the electrospray ion source of the QTOF 5600+ mass spectrometer. During chromatographic separation, time-of-flight (TOF) spectra were acquired at intervals of a few microseconds, detecting peptide precursor ions in different charge states. The 20 most intense precursor peaks of each TOF spectrum were selected one by one using the quadrupole mass filter and fragmented using the collision-induced dissociation capability of the mass spectrometer within intervals. Other IDA criteria are shown in Table 3. The mass spectrometer output file may contain fragmentation spectra of peptides from the MS Qual/Quant QC Mix proteins. The fragmentation spectra in raw data format (.wiff format) were converted into a Mascot Generic Format file (.mgf format), which is a peak list, and used for Mascot searches. The Mascot software examines every fragmentation spectrum and matches it to the theoretically calculated fragmentation spectra from protein databases. MS/MS spectra search settings are shown in Table 7. The decoy database is a reversed SwissProt database.
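The idea of a reversed decoy database can be illustrated with a minimal Python sketch. It assumes that protein entries are held in a plain dictionary mapping an accession to a sequence; the function name `make_reversed_decoys`, the `REV_` prefix and the accession `EXAMPLE` are hypothetical. The search engine performs its own decoy search when the Decoy option is selected, so this only shows the principle.

```python
def make_reversed_decoys(proteins, prefix="REV_"):
    """Return one decoy entry (the reversed sequence) per target protein."""
    return {prefix + accession: sequence[::-1]
            for accession, sequence in proteins.items()}


# Hypothetical one-entry "database"; the sequence is the peptide VSFELFADK.
targets = {"EXAMPLE": "VSFELFADK"}
decoys = make_reversed_decoys(targets)
print(decoys)
```

Matches against such decoy entries cannot be genuine, so their number gives an estimate of how many matches against the real database are random.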
Results and discussion
MS Qual/Quant QC Mix contains six proteins in different amounts (Table 1). Protein identification experiment 1, with a low intensity threshold, leads to the identification of five proteins, but with an FDR of 6.76 % (Table 5). An increased intensity threshold leads to the identification of two high abundant proteins with high confidence (experiment 2, Table 5). In both cases the identification of the low abundant protein Peptidyl-prolyl cis-trans isomerase A was not possible. Peptide ion masses and retention times of Peptidyl-prolyl cis-trans isomerase A were used to build an inclusion list (Table 4). Peptide ion masses were calculated using a theoretical tryptic digest (Figure 1, Table 2). Retention times were taken from previous experiments. The time window for acquisition was chosen as +/- 90 seconds. The intensity threshold for the inclusion list was set to 100, as in experiment 1. Use of the inclusion list leads to the successful identification of the low abundant Peptidyl-prolyl cis-trans isomerase A and two high abundant proteins with an FDR of 0 % (Table 5). Table 6 shows that two peptides were sufficient for the unambiguous identification of Peptidyl-prolyl cis-trans isomerase A.
Table 1: Protein composition of MS Qual/Quant QC Mix.
Protein | Name | UniProt Accession Number | Calculated MW (Da) | Analysed protein amount [fmol]
1 | Carbonic Anhydrase I | P00915 | 28739 | 100
2 | Carbonic Anhydrase II | P00918 | 29115 | 100
3 | NAD(P)H dehydrogenase | P15559 | 30736 | 20
4 | C-reactive Protein | P02741 | 23047 | 20
5 | Catalase | P04040 | 59625 | 4
6 | Peptidyl-prolyl cis-trans isomerase A | P62937 | 20176 | 4
Table 2: Theoretically calculated trypsin digest of Peptidyl-prolyl cis-trans isomerase A (P62937)
Start - End | Peptide | Mr | [M+2H]++
155 - 155 | K | 146.1 | 74.1
1 - 1 | M | 149.1 | 75.5
132 - 133 | VK | 245.2 | 123.6
149 - 151 | NGK | 317.2 | 159.6
152 - 154 | TSK | 334.2 | 168.1
29 - 31 | VPK | 342.2 | 172.1
145 - 148 | FGSR | 465.2 | 233.6
45 - 49 | GFGYK | 570.3 | 286.1
70 - 76 | HNGTGGK | 669.3 | 335.7
126 - 131 | HVVFGK | 685.4 | 343.7
77 - 82 | SIYGEK | 695.3 | 348.7
38 - 44 | ALSTGEK | 704.4 | 353.2
50 - 55 | GSCFHR | 705.3 | 353.7
32 - 37 | TAENFR | 736.4 | 369.2
119 - 125 | TEWLDGK | 847.4 | 424.7
20 - 28 | VSFELFADK | 1054.5 | 528.3
156 - 165 | ITIADCGQLE | 1061.5 | 531.8
83 - 91 | FEDENFILK | 1153.6 | 577.8
134 - 144 | EGMNIVEAMER | 1277.6 | 639.8
56 - 69 | IIPGFMCQGGDFTR | 1540.7 | 771.4
2 - 19 | VNPTVFFDIAVDGEPLGR | 1945.0 | 973.5
1 - 19 | MVNPTVFFDIAVDGEPLGR | 2076.0 | 1039.0
92 - 118 | HTGPGILSMANAGPNTNGSQFFICTAK | 2733.3 | 1367.7
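A theoretical tryptic digest of this kind can be sketched in a few lines of Python. The sketch assumes cleavage C-terminal to K or R, but not before P (a common convention that the text does not state), unmodified residues, and standard monoisotopic residue masses; the function names are hypothetical. The test sequence covers residues 20 to 37 of P62937 as listed in Table 2.

```python
# Monoisotopic residue masses in Da for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added to the sum of the residues
PROTON = 1.007276   # mass of one proton


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass (Mr) of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def mass_to_charge(peptide, charge=2):
    """m/z of the peptide carrying `charge` protons."""
    return (monoisotopic_mass(peptide) + charge * PROTON) / charge


for peptide in tryptic_digest("VSFELFADKVPKTAENFR"):
    print(peptide, round(monoisotopic_mass(peptide), 1),
          round(mass_to_charge(peptide), 1))
```

For the three peptides in this stretch the computed values agree with the Mr and doubly charged m/z values listed in Table 2 to one decimal place.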
Table 3: General IDA settings
IDA Parameter | Value
Most intense precursor peaks | 20
With charge state | 2 to 5
Exclude isotopes | 4 Da
Mass tolerance | 50 ppm
Exclude former target ions | for 30 seconds after 2 occurrences
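How these criteria interact can be sketched as follows. This is an illustrative model only, not the instrument's acquisition software: the class and function names are hypothetical, the intensity threshold is passed in as a parameter, and the isotope exclusion (4 Da) is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    mz: float
    intensity: float
    charge: int


class DynamicExclusion:
    """Exclude an m/z for `duration_s` once it was selected `max_occurrences` times."""

    def __init__(self, max_occurrences=2, duration_s=30.0, tol_ppm=50.0):
        self.max_occurrences = max_occurrences
        self.duration_s = duration_s
        self.tol_ppm = tol_ppm
        self.history = []  # entries of [mz, selection count, excluded until (s)]

    def _find(self, mz):
        for entry in self.history:
            if abs(entry[0] - mz) / mz * 1e6 <= self.tol_ppm:
                return entry
        return None

    def is_excluded(self, mz, time_s):
        entry = self._find(mz)
        return entry is not None and time_s < entry[2]

    def record(self, mz, time_s):
        entry = self._find(mz)
        if entry is None:
            entry = [mz, 0, float("-inf")]
            self.history.append(entry)
        entry[1] += 1
        if entry[1] >= self.max_occurrences:
            entry[1] = 0
            entry[2] = time_s + self.duration_s


def select_precursors(peaks, time_s, exclusion, threshold,
                      top_n=20, charges=range(2, 6)):
    """Pick the most intense eligible peaks of one survey spectrum."""
    candidates = [p for p in peaks
                  if p.intensity >= threshold
                  and p.charge in charges
                  and not exclusion.is_excluded(p.mz, time_s)]
    candidates.sort(key=lambda p: p.intensity, reverse=True)
    selected = candidates[:top_n]
    for p in selected:
        exclusion.record(p.mz, time_s)
    return selected
```

A precursor that has been fragmented twice is skipped for the next 30 seconds, which gives less intense co-eluting peptides a chance to be selected.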
Table 4: IDA Inclusion List
Mass [Da] | Retention Time [min]
424.7 | 16
528.3 | 19.7
577.8 | 18.6
639.8 | 19.7
For 180 seconds; Intensity 100 cps; Peptides TEWLDGK, VSFELFADK, FEDENFILK and EGMNIVEAMER
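The effect of such an inclusion list on precursor selection can be sketched as below. The sketch uses three of the Table 4 entries, the +/- 90 second window and the 100 cps inclusion threshold from the text; the m/z matching tolerance of 0.05 and the general threshold of 1000 used in the usage lines are illustrative assumptions, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InclusionEntry:
    mz: float       # m/z of the doubly charged peptide ion
    rt_min: float   # expected retention time in minutes


# Three of the four entries of Table 4.
INCLUSION_LIST = [
    InclusionEntry(424.7, 16.0),   # TEWLDGK
    InclusionEntry(577.8, 18.6),   # FEDENFILK
    InclusionEntry(639.8, 19.7),   # EGMNIVEAMER
]


def on_inclusion_list(mz, rt_min, entries, mz_tol=0.05, half_window_s=90.0):
    """True if the ion matches an entry in m/z and elutes inside its RT window."""
    for entry in entries:
        in_mass = abs(mz - entry.mz) <= mz_tol
        in_time = abs(rt_min - entry.rt_min) * 60.0 <= half_window_s
        if in_mass and in_time:
            return True
    return False


def passes_threshold(mz, intensity, rt_min, entries,
                     general_threshold, inclusion_threshold=100.0):
    """Apply the lower threshold only to inclusion-list ions in their RT window."""
    if on_inclusion_list(mz, rt_min, entries):
        return intensity >= inclusion_threshold
    return intensity >= general_threshold


# A weak ion of 150 cps is accepted only where the inclusion list applies.
print(passes_threshold(577.79, 150.0, 18.0, INCLUSION_LIST, general_threshold=1000.0))
print(passes_threshold(577.79, 150.0, 25.0, INCLUSION_LIST, general_threshold=1000.0))
```

Restricting the lower threshold to a narrow m/z and retention time window keeps the number of additional low-intensity spectra small, which is the mechanism the text relies on to keep the FDR low.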
Table 5: P62937 Identification results
Measurement | IDA Threshold | Inclusion List | Inclusion List Threshold | Number of queries | Peptide matches above homology or identity threshold (SwissProt) | Peptide matches above homology or identity threshold (Decoy) | MS Qual/Quant QC Mix proteins identified | False discovery rate [%]
1 | 100 | no | - | 305 | 74 | 5 | 1, 2, 3, 4, 5 | 6.76
2 | 2000 | no | - | 33 | 24 | 0 | 1, 2 | 0
3 | 2000 | yes | 100 | 34 | 24 | 0 | 1, 2, 6 | 0
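The FDR values in Table 5 are consistent with the ratio of decoy matches to SwissProt matches, expressed in percent. The short sketch below assumes that definition; the function name is hypothetical.

```python
def false_discovery_rate(target_matches, decoy_matches):
    """FDR in percent: decoy matches divided by target database matches."""
    if target_matches == 0:
        return 0.0
    return 100.0 * decoy_matches / target_matches


# (measurement, SwissProt matches, decoy matches) as listed in Table 5
for measurement, target, decoy in [(1, 74, 5), (2, 24, 0), (3, 24, 0)]:
    print(measurement, round(false_discovery_rate(target, decoy), 2))
```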
Table 6: P62937 Identification results of measurement 3
Match 3.
P62937 Mass: 18001 Score: 61 Matches: 2(2) Sequences: 2(2) Peptidyl-prolyl cis-trans isomerase A OS=Homo sapiens OX=9606 GN=PPIA PE=1
Query | Observed | Mr (expt) | Mr (calc) | Delta | Miss | Score | Expect | Rank | Unique | Peptide
19 | 528.2736 | 1054.5327 | 1054.5335 | -0.0008 | 0 | 50 | 6.7e-05 | 1 | U | R.VSFELFADK.V
21 | 577.7899 | 1153.5652 | 1153.5655 | -0.0004 | 0 | 26 | 0.013 | 1 | U | K.FEDENFILK.H
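The relation between the observed m/z, the experimental mass and the mass deviation for these two peptide matches can be reproduced with a short sketch. It assumes that both ions are doubly charged, which is consistent with the observed values being about half of the experimental masses; the function names are hypothetical.

```python
PROTON = 1.007276  # mass of one proton in Da


def experimental_mr(observed_mz, charge):
    """Neutral mass derived from an observed m/z and its charge state."""
    return observed_mz * charge - charge * PROTON


def mass_error(observed_mz, charge, calculated_mr):
    """Return (Mr expt, deviation in Da, deviation in ppm)."""
    mr_expt = experimental_mr(observed_mz, charge)
    delta_da = mr_expt - calculated_mr
    delta_ppm = delta_da / calculated_mr * 1e6
    return mr_expt, delta_da, delta_ppm


# (observed m/z, calculated Mr) for the two matched peptides
for observed, calculated in [(528.2736, 1054.5335), (577.7899, 1153.5655)]:
    mr_expt, delta_da, delta_ppm = mass_error(observed, 2, calculated)
    print(f"{mr_expt:.4f} {delta_da:+.4f} Da {delta_ppm:+.2f} ppm")
```

Both deviations are well below the peptide tolerance of 0.02 Da used in the search.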
References
1. MS Qual/Quant QC Mix Product Information sheet (https://www.sigmaaldrich.com/content/dam/sigma-aldrich/docs/Sigma/Datasheet/9/msqc1dat.pdf)
2. Perkins, DN, Pappin, DJ, Creasy, DM and Cottrell, JS, Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis, 20(18) 3551-67 (1999)
Table 7: Mascot search parameters
Parameter | Entry
User name | <mascot_user_full_name>
User email | <mascot_user_email>
Databases | SwissProt
Enzyme | Trypsin
Max. missed cleavages | 0
Quantification | None
Taxonomy | Homo sapiens
Fixed modifications |
Variable modifications | Oxidation (M)
Peptide tol. +/- (Da) | 0.02 Da
MS/MS tol. +/- (Da) | 0.02 Da
# 13C |
Peptide charge | 2+, 3+ and 4+
Monoisotopic/Average | Monoisotopic
Data format | Mascot generic
Precursor m/z |
Instrument | ESI-QUAD-TOF
Error tolerant search | unchecked
Decoy | checked
Report top | Auto

Claims

1. A highly sensitive detection method for the determination of the presence of one or more known polypeptides in a protein sample comprising: i) performing a liquid-chromatography (LC) coupled tandem mass spectrometry (MS/MS) analysis (LC-MS) of the peptides to separate peptides and detect peptide ions, ii) selecting during separation in the mass spectrometry the parent peptide ions of a specific mass to charge ratio (m/z value) (Selected Parent Ions), preferably, within a pre-defined retention time range, iii) fragmenting the Selected Parent Ions to generate fragment ions and subsequently performing mass spectrometry analysis to generate a Fragment Ion Data set, and iv) identifying the polypeptide by comparing the Fragment Ion Data set with a reference spectral library or theoretically calculated spectra from a protein database.
2. The method of claim 1, wherein the charge state determination is not performed.
3. The method of claim 1 or 2, wherein a low threshold in the intensity value is used.
4. The method of any one of claims 1 to 3 for the identification of low abundant polypeptides, comprising providing a reference spectral library for the low abundant polypeptides, a data set comprising a preferred retention range for said polypeptide, preferably with a prescribed charge state for said polypeptide, and/or a data set comprising a lower signal intensity value threshold.
5. The method of any one of claims 1 to 4, wherein the sample is a biological sample, a complex peptide mixture, or a proteomics sample.
6. The method of any one of claims 1 to 5, comprising a first step of preparing the biological sample for protein mass-spectrometry, wherein the preparation includes but is not limited to one or more methods selected from the group consisting of:
cell lysis, extract fractionation, depletion of abundant proteins, enrichment of target proteins, dialysis, desalting, protein digestion, and peptide separation.
7. The method of claim 4 wherein the determined intensity value threshold for the Selected Parent Ions is lower than the intensity value threshold used in a method in which the parent ions are not selected by a predefined retention time and/or having only multiply charged charge states.
8. The method of any of the previous claims wherein the parent ion fragmentation and mass spectrometry analysis in step iii) is performed using Tandem mass spectrometry (MS/MS), preferably QToF.
9. The method of any one of claims 3 to 8, wherein the peptides are provided by enzymatic or chemical digestion of protein in the biological sample.
10. The method of any one of the claims 1 to 9, wherein the sample is derived from biological material, e.g. from an extract derived from microorganisms or multicellular organisms, e.g. plant, plant part, plant organ, plant tissue, or plant cell.
PCT/EP2020/086548 2019-12-19 2020-12-16 Method of protein detection WO2021122834A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP19218235 2019-12-19
EP19218235.0 2019-12-19

Publications (1)

Publication Number Publication Date
WO2021122834A1 true WO2021122834A1 (en) 2021-06-24

Family

ID=69147423

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2020/086548 WO2021122834A1 (en) 2019-12-19 2020-12-16 Method of protein detection

Country Status (1)

Country Link
WO (1) WO2021122834A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114965662A (en) * 2022-07-25 2022-08-30 广东工业大学 Chemical substance annotation method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110288779A1 (en) * 2010-05-24 2011-11-24 Agilent Technologies, Inc. System and method of data-dependent acquisition by mass spectrometry
US20140199716A1 (en) * 2013-01-17 2014-07-17 The Regents Of The University Of California Isotopic recoding for targeted tandem mass spectrometry
US9905405B1 (en) * 2017-02-13 2018-02-27 Thermo Finnigan Llc Method of generating an inclusion list for targeted mass spectrometric analysis

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MS QUAL/QUANT QC MIX PRODUCT INFORMATION SHEET, Retrieved from the Internet <URL:https://www.sigmaaldrich.com/content/dam/sigma-aldrich/docs/Sigma/Datasheet/9/msqc1dat.pdf>
PERKINS, DNPAPPIN, DJCREASY, DMCOTTRELL, JS: "Probability-based protein identification by searching sequence databases using mass spectrometry data", ELECTROPHORESIS, vol. 20, no. 18, 1999, pages 3551 - 67, XP002319572, DOI: 10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2
VINCENT C. CHEN ET AL: "Targeted identification of phosphorylated peptides by off-line HPLC-MALDI-MS/MS using LC retention time prediction", JOURNAL OF MASS SPECTROMETRY, vol. 43, no. 12, 1 December 2008 (2008-12-01), pages 1649 - 1658, XP055139170, ISSN: 1076-5174, DOI: 10.1002/jms.1450 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20833794

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20833794

Country of ref document: EP

Kind code of ref document: A1