CN115605976A - Charge state determination for single ion detection events - Google Patents

Publication number
CN115605976A
Authority
CN
China
Prior art keywords
pulse
peak
ions
charge state
charge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180035275.7A
Other languages
Chinese (zh)
Inventor
P·鲁米恩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DH Technologies Development Pte Ltd
Original Assignee
DH Technologies Development Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by DH Technologies Development Pte Ltd

Classifications

    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01J ELECTRIC DISCHARGE TUBES OR DISCHARGE LAMPS
    • H01J49/00 Particle spectrometers or separator tubes
    • H01J49/0027 Methods for using particle spectrometers
    • H01J49/0036 Step by step routines describing the handling of the data generated during a measurement
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20 Identification of molecular entities, parts thereof or of chemical compositions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics

Landscapes

  • Chemical & Material Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)

Abstract

Methods and systems for identifying or classifying charge states of detected ions. An example method for classifying charge states of detected ions may include generating a pulse for each ion of a plurality of ions detected by a detector, wherein each pulse has a pulse feature (e.g., pulse height, width, or area); generating a pulse-feature distribution for the generated pulses; and generating an identification of a charge state of one or more ions of the plurality of ions based on the pulse-feature distribution.

Description

Charge state determination for single ion detection events
Cross-Reference to Related Applications
This application was filed as a PCT International patent application on May 14, 2021, and claims priority to U.S. patent application Ser. No. 63/024,987, filed on May 14, 2020, the entire disclosure of which is incorporated herein by reference.
Background
As a general overview, mass spectrometry (MS) is an analytical technique for detecting and quantifying compounds based on the analysis of mass-to-charge ratio (m/z) values for ions formed by those compounds. MS involves ionization of one or more compounds of interest from a sample, generation of precursor ions, and mass analysis of the precursor ions. Tandem mass spectrometry or mass spectrometry/mass spectrometry (MS/MS) involves ionizing one or more compounds of interest from a sample, selecting one or more precursor ions of the one or more compounds, fragmenting the one or more precursor ions into product ions, and mass analyzing the product ions.
Both MS and MS/MS can provide qualitative and quantitative information. The measured precursor or product ion spectra can be used to identify the molecule of interest. The intensities of the precursor and product ions can also be used to quantify the amount of compound present in the sample.
Mass spectrometry techniques often utilize the mass-to-charge ratio (m/z) of detected ions to generate mass spectrometry data. However, the actual charge or mass of the detected ions is often not directly measurable. Thus, some overlap of detected ions may occur in certain scenarios. For example, singly charged ions with a given mass may appear in the mass spectrum to have the same mass-to-charge ratio as doubly charged ions with twice that mass. This problem may be generally referred to as the peak overlap problem.
In top-down mass spectrometry (MS) protein analysis, for example, overlap of mass or mass-to-charge (m/z) peaks in mass spectra is a significant problem. In this type of analysis, a very wide variety of different fragments or product ions are generated, including product ions having a length of 1-200 amino acids and having 1-50 different charge states. The product ion peaks strongly overlap each other in a single spectrum. Furthermore, the overlap can be so extensive that even mass spectrometers with the highest mass resolution (Fourier transform ion cyclotron resonance (FT-ICR) or Orbitrap) cannot deconvolute these overlapping peaks. Thus, large product ions are often lost in top-down protein analysis, limiting the sequence coverage of large proteins. International publication WO2020/157720, published on August 6, 2020, and international publication WO2019/197983, published on October 17, 2019, both provide additional discussion of top-down MS protein analysis and related challenges.
Disclosure of Invention
In one aspect, the present technology relates to a method of classifying a charge state of a detected ion, the method comprising: generating a pulse for each ion of a plurality of ions detected by a detector, wherein each pulse has a pulse feature; generating a pulse-feature distribution for the generated pulses; and generating an identification of a charge state of one or more ions of the plurality of ions based on the pulse-feature distribution.
In an example, the pulse-feature distribution is a graph of probability versus pulse feature. In another example, the pulse feature is at least one of a pulse height, a pulse width, or a pulse area. In yet another example, the pulse feature is a pulse height, and the pulse height is a maximum voltage of the pulse. In yet another example, the detector is an electron multiplier detector and the detector is configured to detect primarily single ion events. In yet another example, generating the identification of the charge state includes comparing the pulse-feature distribution to a reference pulse-feature distribution. In yet another example, the ions detected by the detector are generated by ionization of a sample, and the reference pulse-feature distribution is identified based on known features of the sample.
In another example, the generated identification includes a probability of a charge state. In another example, the method further comprises generating a deconvolved mass spectrum for the detected ions based on the identification of the charge state, wherein one axis of the mass spectrum is mass rather than mass-to-charge (m/z). In yet another example, the plurality of ions are grouped into different groups in the m/z domain, and identification of charge states based on pulse-feature distributions is performed for each group. In yet another example, the grouping step includes generating a mass spectrum based on the plurality of detected ions; identifying a first peak in the mass spectrum, wherein the first peak has a mass-to-charge (m/z) value; and grouping ions within an m/z range based on the m/z value of the first peak. In yet another example, the grouping step includes selecting a first subset of the plurality of ions into a first intensity band and selecting a second subset of the plurality of ions into a second intensity band; generating a first mass spectrum for the first intensity band; generating a second mass spectrum for the second intensity band; identifying a first peak in at least one of the mass spectra, wherein the first peak has an m/z value; and grouping ions within an m/z range based on the m/z value of the first peak.
In another example, the method further comprises generating a second pulse-feature distribution for ions in the m/z range, and generating the identification comprises: determining that ions forming the first pulse-feature distribution have a first charge state; and determining that the ions forming the second pulse-feature distribution have a second charge state. In another example, the method further includes determining one or more isotopes corresponding to one or more ions forming the first peak based on the identification of the charge state. In yet another example, generating the identification of the charge state includes comparing the first pulse-feature distribution to a reference pulse-feature distribution. In yet another example, ions detected by the detector are generated by ionization of a sample, and the reference pulse-feature distribution is identified based on known features of the sample. In yet another example, the generated identification includes a probability of a charge state. In another example, the method is performed as part of a top-down protein analysis.
In another example, the method further comprises identifying a second peak; determining a consistent m/z distance based on at least the first peak and the second peak; and identifying the first peak and the second peak as forming a feature; wherein generating the identification of the charge state is based on the consistent distance. In another example, the step of identifying the peaks forming the feature includes comparing pulse-feature distributions of the peaks and selecting peaks having substantially the same pulse-feature distribution. In yet another example, the comparison of the pulse-feature distributions is performed by calculating the Euclidean distance between the pulse-feature distributions and comparing it to a predetermined threshold. In yet another example, the method further includes identifying missing peaks corresponding to the feature based on the consistent distance. In yet another example, the method further comprises generating a deconvolved mass spectrum for the detected ions based on the identification of the charge state of the ions, wherein one axis of the mass spectrum is mass rather than m/z. In another example, the pulse feature is a maximum voltage of the pulse.
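The Euclidean-distance comparison mentioned above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes each peak's pulse-height distribution has already been binned into a normalized histogram on a shared set of bins, and the function names, example histograms, and the threshold of 0.1 are hypothetical values chosen for illustration.

```python
import math


def euclidean_distance(dist_a, dist_b):
    """Euclidean distance between two histograms defined on the same bins."""
    if len(dist_a) != len(dist_b):
        raise ValueError("distributions must share the same binning")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(dist_a, dist_b)))


def same_feature(dist_a, dist_b, threshold=0.1):
    """Treat two peaks as belonging to one feature when their
    distributions differ by no more than the (assumed) threshold."""
    return euclidean_distance(dist_a, dist_b) <= threshold


# Hypothetical normalized pulse-height histograms for three peaks.
peak_1 = [0.10, 0.40, 0.30, 0.20]
peak_2 = [0.12, 0.38, 0.31, 0.19]
peak_3 = [0.40, 0.30, 0.20, 0.10]

pair_12 = same_feature(peak_1, peak_2)  # similar shapes
pair_13 = same_feature(peak_1, peak_3)  # clearly different shapes
```

In this sketch peaks 1 and 2 would be grouped into one feature, while peak 3 would not. A suitable threshold in practice would depend on the number of events per peak and on the binning.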
In another aspect, the present technology relates to a mass analysis system. The mass analysis system includes a detector configured to generate a pulse for each ion detected by the detector; a processor; and a memory storing instructions configured to, when executed by the processor, cause the system to perform a set of operations. The operations include generating a pulse for each ion of a plurality of ions striking the detector, wherein each pulse has a pulse feature; generating a pulse-feature distribution for the generated pulses; and generating an identification of a charge state of one or more ions of the plurality of ions based on the pulse-feature distribution. In an example, the detector is an electron multiplier detector. In another example, the mass analysis system further comprises an ion source apparatus, a dissociation apparatus, and a mass analyzer.
In another aspect, the present technology relates to a method for classifying a charge state of detected ions, the method comprising detecting, using a processor, a transient time domain signal induced on an image-charge detector of a mass analyzer as a result of oscillation of a plurality of ions in the mass analyzer; converting the transient time domain signal into a plurality of frequency domain (FD) peaks corresponding to ions of the plurality of ions; generating an FD-peak-feature distribution for the FD peaks; and generating an identification of a charge state of one or more ions of the plurality of ions based on the FD-peak-feature distribution.
In an example, the FD-peak-feature distribution is a plot of probability versus FD-peak feature. In another example, the FD-peak feature is peak intensity. In yet another example, generating the identification of the charge state includes comparing the FD-peak-feature distribution to a reference FD-peak-feature distribution. In another example, ions detected by the detector are generated by ionization of a sample, and the reference FD-peak-feature distribution is identified based on known features of the sample. In yet another example, the generated identification includes a probability of a charge state. In yet another example, the method further comprises generating a deconvolved mass spectrum for the detected ions based on the identification of the charge state, wherein one axis of the mass spectrum is mass rather than mass-to-charge (m/z).
In another aspect, the present technology relates to a method for classifying a charge state of a detected ion. The method includes generating a pulse for each ion of a plurality of ions detected by a detector, wherein each pulse has a pulse feature; generating a pulse-feature distribution for the generated pulses; identifying a coarse charge state based on the pulse-feature distribution; identifying a peak pair of a first ion peak and a second ion peak such that the peaks have adjacent charge states; and determining a refined charge state of the second ion peak based on the m/z value of the first ion peak, the m/z value of the second ion peak, and the mass of the charge carrier.
In an example, the coarse charge state identification is accurate to within a range of possible charge states for at least one peak forming the pair, and at least one charge state from the range is adjacent to the charge state identified for the second peak. In another example, the method further comprises accepting the refined charge state identification if the refined charge state is within a certain threshold of an integer. In yet another example, a third peak having an adjacent charge state is identified, the third peak forms a pair with at least one of the peaks, and the charge state identifications for the common peak are matched in the two pairs. In yet another example, the method further comprises determining a charge state of the first ion peak based on the determined charge state of the second ion peak. In yet another example, the method further comprises obtaining the mass of the charge carrier based on known characteristics of a sample ionized to generate the plurality of ions.
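The refinement step for a pair of adjacent charge states can be worked through with a short sketch. It assumes, for illustration only, that both peaks come from the same neutral mass M, that each charge is carried by one charge carrier of mass m_c (a proton is used here as the default), and that the lower-m/z peak carries exactly one more charge than the higher-m/z peak. Under those assumptions, (z + 1)(mz_low - m_c) = z(mz_high - m_c), which gives z = (mz_low - m_c) / (mz_high - mz_low). The function names and the 0.1 acceptance tolerance are hypothetical.

```python
PROTON_MASS = 1.007276  # Da; assumed charge carrier for this sketch


def mz_of(neutral_mass, charge, carrier_mass=PROTON_MASS):
    """m/z of an ion with `charge` charge carriers attached."""
    return neutral_mass / charge + carrier_mass


def refine_charge(mz_low, mz_high, carrier_mass=PROTON_MASS, tol=0.1):
    """Charge of the higher-m/z peak, assuming the lower-m/z peak has
    exactly one more charge. Returns None when the estimate is not
    within `tol` of a positive integer."""
    if mz_high <= mz_low:
        raise ValueError("mz_high must be greater than mz_low")
    z_est = (mz_low - carrier_mass) / (mz_high - mz_low)
    z_int = round(z_est)
    if z_int < 1 or abs(z_est - z_int) > tol:
        return None
    return z_int


# Synthetic peaks from a 10 kDa species at charge states 11+, 10+ and 13+.
adjacent = refine_charge(mz_of(10000.0, 11), mz_of(10000.0, 10))
not_adjacent = refine_charge(mz_of(10000.0, 13), mz_of(10000.0, 10))
```

The integer test alone does not prove that two peaks are adjacent: some non-adjacent pairs also yield a near-integer estimate, which is one reason a coarse charge state from the pulse-feature distribution is useful as a consistency check.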
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Additional aspects, features and/or advantages of the examples will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the disclosure.
Drawings
Non-limiting and non-exhaustive examples are described with reference to the following figures.
FIG. 1A depicts an example system for performing mass spectrometry.
FIG. 1B depicts an example plot of ion pulses.
FIG. 1C depicts an example mass spectrum separated into different bands or channels based on ion pulse intensity.
FIG. 2 depicts a graph of an example pulse-height distribution.
FIG. 3A depicts an example method for charge state assignment.
FIG. 3B depicts another example method for charge state assignment.
FIG. 3C depicts another example method for charge state assignment.
FIG. 3D depicts another example method for charge state assignment.
FIG. 3E depicts another example method for charge state assignment.
FIG. 3F depicts another example method for charge state assignment.
FIG. 4 depicts another example method for charge state assignment.
FIG. 5 depicts an example magnified mass spectrum.
FIG. 6 depicts an example peak detection operation for banded mass spectra.
FIG. 7 depicts an example pulse-height distribution for a selected peak from FIG. 6.
FIG. 8 depicts an example pairwise similarity matrix.
FIG. 9 depicts an example calculation of the most plausible charge state.
FIG. 10 depicts an example diagram for performing computations for feature attributes.
FIG. 11 depicts an example graph for adjacent charge state peaks.
FIG. 12 depicts an example plot of transient time domain signals measured by an image-charge detector.
FIG. 13 depicts an example system that includes an image-charge detector.
Detailed Description
As discussed briefly above, peak overlap of detected ions is problematic for analysis of MS results. To address this peak overlap problem, one solution is to determine or infer the charge state of the ions forming the peak. By determining the charge state, the mass of the ions can then be accounted for and ions from different species can be distinguished from each other. Further, multiple peaks in the mass spectrum may represent isotope clusters or features. However, in some cases, it may not be clear which peaks belong to which cluster. The charge state identification technique may also address the identification of the correct peak for such clusters.
Analog-to-digital conversion (ADC) banding methods have previously been proposed for separating ions based on their intensity. One such example of a banding method is disclosed in international publication WO2020/157720 (the '720 publication), published on August 6, 2020, which is incorporated herein by reference in its entirety. This separation based on ion intensity facilitates charge state separation, thereby increasing peak capacity. However, the described method does not teach how to separate ions based on their charge state. This leads to two problems: first, signals from the same species are diluted among multiple data channels, and second, no way is proposed of constructing a deconvolved mass spectrum that facilitates subsequent data interpretation. Accordingly, an improved method is desired that can assign charge states to individual ion detection events.
One such approach has recently been published in the following paper: Kafader et al., "Multiplexed mass spectrometry of individual ions improves measurement of proteoforms and their complexes," Nature Methods, Vol. 17, pages 391-394 (2020). However, the method described in this paper is limited to mass spectrometers with detection systems in which the detected signal has a deterministic relationship to the measured charge (e.g., image-charge induced detectors). Thus, the method described in this paper does not teach, among other things, how to make a charge assignment for each individual ion measurement event of a mass spectrometer based on a detection system in which the measured signal has a probabilistic relationship with the measured charge (e.g., an electron multiplier based detection system).
In some such new classes of acquisition strategies, direct identification of charge states is attempted before the corresponding signals are added together to the mass spectrum (see Kafader et al). However, such strategies are not applicable to charge state assignment in electron multiplier detection systems. For systems based on electron multiplier detection systems, in many cases, each charge state has no unique detector response, but rather a unique pulse height distribution (more generally an intensity distribution) that is specific to each charge state and m/z value.
The present techniques allow the charge state of an ion to be determined or inferred from the features of the pulse generated by the detector when the ion is detected. To this end, the present technique generates a distribution of pulse features for a plurality of detected ions. Pulse features may include pulse height, pulse width, or pulse area, among other possible features. The distribution of the pulse features forms a unique profile based on the charge state of the detected ions. Thus, the charge state can be determined from the pulse-feature distribution. Once the charge state of an ion is determined, the mass of the ion can be determined based on the m/z of the ion, and the ion can be distinguished from other ions. Finally, compounds analyzed by MS techniques can be identified based on the determined charge state of the detected ions. Thus, by identifying and/or assigning the charge state of the ions, the measurement capabilities of the mass analysis instrument are improved. The accuracy of the mass analysis instrument can similarly be improved.
FIG. 1A depicts an example mass analysis system 100 for performing mass spectrometry techniques. In some examples, system 100 may be a mass spectrometer. The example system 100 includes an ion source apparatus 101, a dissociation apparatus 102, a mass analyzer 103, a detector 104, and computing elements, such as a processor 105 and a memory 106. For example, the ion source apparatus 101 may be an electrospray ion source (ESI) apparatus. The ion source apparatus 101 is shown as part of the mass spectrometer but may be a separate apparatus. For example, the dissociation apparatus 102 may be an electron-based dissociation (ExD) device or a collision induced dissociation (CID) device. Electron-based dissociation (ExD), ultraviolet photodissociation (UVPD), infrared multiphoton dissociation (IRMPD), and collision induced dissociation (CID) are often used as fragmentation techniques for tandem mass spectrometry (MS/MS). ExD may include, but is not limited to, electron capture dissociation (ECD) or electron transfer dissociation (ETD). CID is the most common dissociation technique in tandem mass spectrometers. As described above, in top-down and middle-down proteomics, intact or digested proteins are ionized and subjected to tandem mass spectrometry. For example, ECD is a dissociation technique that preferentially dissociates peptide and protein backbones. Thus, this technique is an ideal tool for analyzing peptide or protein sequences using top-down and middle-down proteomics approaches.
The mass analyzer 103 may be any type of mass analyzer suitable for the desired technique, such as a time-of-flight (TOF), ion trap, or quadrupole mass analyzer. The detector 104 may be any suitable detector for detecting ions and generating the signals discussed herein. For example, the detector 104 may include an electron multiplier detector, which may include analog-to-digital conversion (ADC) circuitry. The detector 104 may also be an image-charge induced detector. The detector 104 generates detection pulses for the detected ions.
The computing elements of system 100, such as processor 105 and memory 106, may be included in the mass spectrometer itself, located near the mass spectrometer, or located remotely from the mass spectrometer. In general, the computing elements of the system may be in electronic communication with the detector 104 such that the computing elements are capable of receiving signals generated from the detector 104. Processor 105 may include multiple processors and may include any type of suitable processing component for processing signals and generating the results discussed herein. Depending on the exact configuration, memory 106 (storing, among other things, mass analysis programs and instructions to perform the operations disclosed herein) may be volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.) or some combination of the two. Other computing elements may also be included in the system 100. For example, system 100 may include storage devices (removable and/or non-removable) including, but not limited to, solid-state devices, magnetic or optical disks or tape. System 100 may also have input device(s) such as a touch screen, keyboard, mouse, pen, voice input, etc., and/or output device(s) such as a display, speakers, printer, etc. One or more communication connections, such as Local Area Networks (LANs), Wide Area Networks (WANs), point-to-point, Bluetooth, RF, etc., may also be incorporated into system 100.
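As a small worked illustration of how a charge state resolves the peak overlap, the sketch below converts an m/z value into a neutral mass for two candidate charge states. It assumes each charge is carried by a proton of mass 1.007276 Da, which is an assumption of this sketch rather than something stated here, and the function name is hypothetical.

```python
PROTON_MASS = 1.007276  # Da; assumed charge carrier for this sketch


def neutral_mass(mz, charge, carrier_mass=PROTON_MASS):
    """Neutral mass M from m/z = (M + charge * carrier_mass) / charge."""
    return charge * (mz - carrier_mass)


# Two ions detected at the same m/z but with different charge states
# correspond to very different masses once the charge is known.
mz = 517.0
mass_3plus = neutral_mass(mz, 3)
mass_7plus = neutral_mass(mz, 7)
```

With these assumptions, the 3+ ion corresponds to roughly 1548 Da and the 7+ ion to roughly 3612 Da, so ions that coincide on the m/z axis separate cleanly on a mass axis.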
FIG. 1B depicts an example plot 110 of ion pulses generated from a detector, such as an electron multiplier detector. The y-axis represents intensity and the x-axis represents time. The intensity may be in units of voltage. For example, for an electron multiplier detector, the output of the detector may be based on the voltage of the detected electrons (often expressed in millivolts (mV)).
In FIG. 1B, three pulses are depicted: a first pulse 111, a second pulse 112, and a third pulse 113. Each of pulses 111-113 represents a different single ion arriving at the detector. Pulses 111-113 can be digitized and a peak can be found from each digitized pulse. Intensity (or peak height) and arrival time pairs may be calculated and stored for each pulse. Rectangles 131, 132 and 133 represent the intensity or pulse height of the respective pulses.
Each pulse may be characterized by a pulse feature. Pulse features may include pulse height, pulse width, and/or area under the curve of the pulse. The pulse height of each pulse is indicated by rectangles 131, 132 and 133. The pulse height may be a maximum pulse height for the respective pulse, and the pulse height may have units of voltage. The pulse width may be measured at any point of the pulse, but one measure of the pulse width is the full width at half maximum (FWHM). The pulse width may have units of time. The area under the curve of each pulse may be generated by integrating the corresponding pulse signal.
The pulse features can be used to separate the detected ions into different bands. FIG. 1C depicts an example mass spectrum 150 that is separated into different bands or channels based on an ion pulse feature, specifically maximum pulse height. A first mass spectrum 160 is generated for detected ions having a maximum pulse height between 10-20 mV. A second mass spectrum 170 is generated for detected ions having a maximum pulse height between 20-30 mV. A third mass spectrum 180 is generated for detected ions having a maximum pulse height between 30-40 mV. Further details regarding such separation and banding are discussed in the '720 publication. As discussed above, while there are benefits to separating ions into different bands, such separation alone does not allow for the identification of the charge state of a particular ion. As discussed further herein, the pulse features can be used to generate a distribution profile that allows for charge state classification of ions.
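The three pulse measures described above can be computed from a digitized pulse roughly as follows. This is a minimal sketch under stated assumptions: the pulse is given as equal-length lists of sample times and voltages, the FWHM is found by linear interpolation between samples, and the area uses the trapezoidal rule. The function names and the triangular test pulse are hypothetical.

```python
def _crossing(t0, v0, t1, v1, level):
    """Time at which the line through (t0, v0) and (t1, v1) reaches `level`."""
    return t0 + (level - v0) * (t1 - t0) / (v1 - v0)


def pulse_features(times, volts):
    """Return (height, fwhm, area) for one digitized single-ion pulse."""
    apex = max(range(len(volts)), key=volts.__getitem__)
    height = volts[apex]
    half = height / 2.0

    # Walk left from the apex to the last sample still at or above half height.
    i = apex
    while i > 0 and volts[i - 1] >= half:
        i -= 1
    left = times[i] if i == 0 else _crossing(
        times[i - 1], volts[i - 1], times[i], volts[i], half)

    # Walk right from the apex in the same way.
    j = apex
    while j < len(volts) - 1 and volts[j + 1] >= half:
        j += 1
    right = times[j] if j == len(volts) - 1 else _crossing(
        times[j], volts[j], times[j + 1], volts[j + 1], half)

    # Trapezoidal integration of the pulse signal.
    area = sum((times[k + 1] - times[k]) * (volts[k] + volts[k + 1]) / 2.0
               for k in range(len(times) - 1))
    return height, right - left, area


# Hypothetical triangular pulse: time in ns, voltage in mV.
height, fwhm, area = pulse_features([0.0, 1.0, 2.0, 3.0, 4.0],
                                    [0.0, 10.0, 20.0, 10.0, 0.0])
```

For the triangular pulse the sketch yields a height of 20 mV, an FWHM of 2 ns, and an area of 40 mV·ns. Real detector pulses would typically need baseline subtraction before these measures are taken, which is omitted here.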
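The banding of detection events by maximum pulse height can be sketched as below. The event format (an m/z value paired with a pulse height in mV), the function name, and the contiguous 10 mV band edges are assumptions made for illustration and are not taken from the '720 publication.

```python
def split_into_bands(events, edges):
    """Group m/z values by pulse-height band.

    events: iterable of (mz, pulse_height_mv) pairs.
    edges:  ascending band edges; band b covers edges[b] <= height < edges[b + 1].
    Events outside all bands are dropped.
    """
    bands = [[] for _ in range(len(edges) - 1)]
    for mz, pulse_height in events:
        for b in range(len(edges) - 1):
            if edges[b] <= pulse_height < edges[b + 1]:
                bands[b].append(mz)
                break
    return bands


# Hypothetical detection events: (m/z, maximum pulse height in mV).
events = [(517.1, 12.0), (517.2, 25.0), (600.3, 19.9),
          (517.0, 33.0), (700.0, 45.0)]
bands = split_into_bands(events, edges=[10.0, 20.0, 30.0, 40.0])
```

Each inner list could then be histogrammed over m/z to form the mass spectrum for that band. Note how ions near m/z 517 end up spread across all three bands, which is the signal dilution that motivates assigning charge states rather than only banding.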
FIG. 2 depicts a graph 200 of an example pulse-feature distribution. The pulse-feature distribution in graph 200 is based on pulse height. Thus, the pulse-feature distribution may be referred to as a pulse-height distribution or an intensity distribution. In the figure, the x-axis represents the pulse height and the y-axis represents the probability or frequency of detection. For example, a higher probability value indicates that ions having a corresponding pulse height are detected more frequently.
A first pulse-height distribution 202 and a second pulse-height distribution 204 are depicted in the graph 200. As can be seen from the graph 200, the pulse-height distributions overlap, but the first pulse-height distribution 202 has a different profile than the second pulse-height distribution 204. The difference in profile shape is primarily due to the difference in charge state of the detected ions forming the corresponding pulse-height distribution. For example, the detected ions forming the first pulse-height distribution 202 correspond to 3+ charged ions, while the detected ions forming the second pulse-height distribution 204 correspond to 7+ charged ions. Thus, once the various pulse-height distributions have been established or generated, it may be possible to determine the charge state of any single detected ion by determining which pulse-height distribution the corresponding ion pulse fits.
As some additional detail, the pulse-height distributions 202, 204 were generated for product ions having very similar m/z values of approximately 517. The product ions were generated by top-down ECD analysis of carbonic anhydrase 2 (CA2). As discussed above, and as seen in graph 200, the pulse-height distributions resulting from different charge states can overlap significantly. In the case of such an overlap, a single intensity data point is not sufficient, and there is a high chance that any charge state determination based on a single intensity data point will be incorrect.
However, for a set of single-ion detection events originating from the same species, it is sometimes possible to infer the charge state. This can be achieved either by comparing the pulse-height distribution of such a set with the pulse-height distributions of known compounds with similar m/z and known charge states, or by using additional information that is generally available based on the nature of the sample being analyzed. An example of such additional information is a unique isotope pattern that, if resolved by the mass spectrometer, encodes charge state information in the relative positions of its peaks in m/z space. Alternatively, a charge state distribution can be used for this purpose, which also encodes charge state information in relative positions in m/z space. The present disclosure describes a class of methods that group similar detection events into sets, identify a charge state for each set, and subsequently assign that charge state to the individual detection events in the set.
In an example, a data set containing pulse-height distributions for different charge states and m/z values is collected. The pulse-height distribution for a group of ions may be used to assign charge states to the individual detection events. For detection events corresponding to an unknown class, a set of detection events may be selected based on their relative proximity in m/z space. A pulse-height distribution can be calculated for such a set. This pulse-height distribution can then be matched to known pulse-height distributions from similar m/z values and the "best" match selected. As will be appreciated, the term "best" identifies the option determined to be relatively optimal given the acquired data and the determination effort applied. The best matching charge state is then inferred for each detection event and/or corresponding ion from the set. Alternatively or additionally, a model may be built based on well-characterized data for intensity distributions at different m/z values, and the charge state may be predicted based on this model. This may be achieved, for example, using machine learning techniques, where the training data set may contain annotated data collected a priori.
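A pulse-height distribution of the kind plotted in graph 200 can be approximated by a normalized histogram. The sketch below is one simple way to build it, assuming equal-width bins over a fixed pulse-height range; the function name, bin count, and example values are hypothetical.

```python
def pulse_height_distribution(heights, lo, hi, n_bins):
    """Normalized histogram of pulse heights on n_bins equal bins in [lo, hi).

    Returns the fraction of in-range events per bin (values sum to 1),
    or all zeros when no event falls in range.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for h in heights:
        if lo <= h < hi:
            # min() guards against float rounding at the upper edge.
            counts[min(int((h - lo) / width), n_bins - 1)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


# Hypothetical pulse heights in mV for ions grouped around one m/z value.
heights = [12.0, 14.0, 18.0, 22.0, 25.0, 38.0]
dist = pulse_height_distribution(heights, lo=10.0, hi=40.0, n_bins=3)
```

Here half of the events fall in the lowest bin. In practice far more events and finer bins would be needed before the shape of the distribution becomes informative about charge state.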
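Why a single intensity value is weak evidence while a set of values is not can be illustrated with a simple likelihood calculation. This sketch is an illustration only and not a method stated in this disclosure: it assumes two hypothetical reference histograms over four pulse-height bins, equal prior probability for each charge state, and independent detection events that all share one charge state.

```python
import math

# Hypothetical reference pulse-height histograms (probability per bin).
REFERENCES = {
    3: [0.40, 0.30, 0.20, 0.10],  # assumed shape for 3+ ions
    7: [0.10, 0.20, 0.30, 0.40],  # assumed shape for 7+ ions
}


def charge_posterior(bin_indices, references=REFERENCES):
    """Posterior probability of each charge state for a set of events.

    bin_indices: the pulse-height bin of each event in the set.
    Assumes equal priors and nonzero reference probabilities.
    """
    log_like = {z: sum(math.log(p[b]) for b in bin_indices)
                for z, p in references.items()}
    top = max(log_like.values())
    weights = {z: math.exp(v - top) for z, v in log_like.items()}
    norm = sum(weights.values())
    return {z: w / norm for z, w in weights.items()}


single = charge_posterior([1])                      # one detection event
group = charge_posterior([0, 1, 0, 2, 1, 0, 3, 1])  # eight events together
```

With these made-up references, one event in bin 1 favors 3+ with a probability of only 0.6, whereas the eight events taken together favor 3+ with a probability above 0.97.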
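The matching step can be sketched as a nearest-reference search restricted to similar m/z values. The sketch assumes a hypothetical reference library of dictionaries holding a charge state, an m/z value, and a normalized histogram on shared bins; the squared-difference score and the 5 m/z window are illustrative choices, and other similarity measures could be substituted.

```python
def best_matching_charge(observed, mz, references, mz_window=5.0):
    """Charge state of the closest reference distribution near `mz`.

    observed:   normalized pulse-height histogram for the set of events.
    references: list of dicts with keys "charge", "mz" and "dist".
    Returns None when no reference lies within `mz_window` of `mz`.
    """
    candidates = [r for r in references if abs(r["mz"] - mz) <= mz_window]
    if not candidates:
        return None

    def squared_distance(ref):
        return sum((o - p) ** 2 for o, p in zip(observed, ref["dist"]))

    return min(candidates, key=squared_distance)["charge"]


# Hypothetical reference library.
library = [
    {"charge": 3, "mz": 517.0, "dist": [0.40, 0.30, 0.20, 0.10]},
    {"charge": 7, "mz": 517.2, "dist": [0.10, 0.20, 0.30, 0.40]},
    {"charge": 5, "mz": 900.0, "dist": [0.12, 0.20, 0.30, 0.38]},
]

observed = [0.12, 0.22, 0.28, 0.38]
charge_near_517 = best_matching_charge(observed, 517.1, library)
charge_near_300 = best_matching_charge(observed, 300.0, library)
```

In this example the 5+ reference at m/z 900 is numerically the closest shape, but it is excluded by the m/z window, so the 7+ reference is selected. The inferred charge state would then be assigned to every detection event in the set.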
In another example, in a first step, the collected data for each detection event may be summed to form a conventional mass spectrum or alternatively a plurality of mass spectra based on their intensities. In the second step, an algorithm is used for feature extraction and feature charge state assignment, where the features are isotope clusters or charge state clusters corresponding to the same molecule. In a third step, a determination of a set of ion detection events corresponding to those features and subsequent inference of charge states is performed.
In another example, a data acquisition and processing strategy is presented that combines information about the pulse-height distribution of individual groups of ions with additional information about sample properties, such as isotopic patterns or charge state distributions, to identify a set of detection events originating from similar ions, and then assign charge states to the individual ions.
Fig. 3A depicts an example method 300 for charge state assignment. At operation 301, a pulse is generated for each ion of a plurality of ions detected by a detector. For example, when an ion strikes an electron multiplier detector, a pulse is generated. The generated pulses may be similar to those depicted in fig. 1B and discussed above. Each pulse may have, or be described by, its respective pulse characteristics. The detected ions may be ions from ionization of a sample studied or analyzed by mass analysis techniques. At operation 302, one or more pulse-characteristic distributions are generated based on the characteristics of the pulses. For example, a pulse-characteristic distribution may be generated for a particular pulse characteristic, such as pulse height. Thus, the pulse-characteristic distribution(s) generated in operation 302 may be a pulse-height distribution similar to that depicted in fig. 2 and discussed above.
Based on the pulse-characteristic distribution(s) generated in operation 302, an identification of a charge state of one or more ions of the plurality of ions may be generated at operation 303. For example, the generated pulse-characteristic distribution may be compared to a set of reference pulse-characteristic distributions having known charge states to determine the closest match to the generated pulse-characteristic distribution. The ions forming the generated pulse-characteristic distribution may then be assigned the charge state associated with the closest-matching reference pulse-characteristic distribution. The number of reference pulse-characteristic distributions to which the generated pulse-characteristic distribution is compared may be limited or reduced based on external information about known characteristics of the sample being analyzed and/or on the m/z values of the ions forming the generated pulse-characteristic distribution. For example, a subset of the reference pulse-characteristic distributions may exist for a particular m/z value or range, and/or a subset of the reference pulse-characteristic distributions may correspond to a particular isotope, compound, and/or sample.
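The comparison against reference distributions can be sketched as a nearest-match search. The following minimal Python sketch assumes that every distribution has already been binned over the same pulse-height bins and normalized to sum to 1, and it assumes the Euclidean distance as the similarity measure. The function names and the reference values are hypothetical and serve only as an illustration.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equally binned distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must use the same bins")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def closest_charge_state(measured, references):
    """Return (charge_state, distance) for the reference nearest to `measured`.

    `references` maps a known charge state to its reference distribution.
    """
    return min(
        ((z, euclidean(measured, ref)) for z, ref in references.items()),
        key=lambda item: item[1],
    )


# Hypothetical reference distributions over four pulse-height bins.
references = {
    3: [0.50, 0.30, 0.15, 0.05],
    7: [0.10, 0.20, 0.40, 0.30],
}
measured = [0.12, 0.22, 0.38, 0.28]

charge, distance = closest_charge_state(measured, references)
print(charge, round(distance, 3))
```

Restricting `references` to the entries collected at a similar m/z before calling `closest_charge_state` would correspond to the reduction of the reference set described above.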
In other examples, a machine learning model (e.g., a neural network) may be trained based on reference pulse-characteristic distributions with known charge states. The generated pulse-characteristic distribution may be provided as input to the trained machine learning model. The trained machine learning model processes the input pulse-characteristic distribution, and the output of the trained machine learning model indicates a charge state, or a likely charge state, corresponding to the generated pulse-characteristic distribution. For example, the output of the machine learning model may be an indication of the charge state and/or an indication of the reference pulse-characteristic distribution that most closely matches the generated pulse-characteristic distribution. In some examples, the input to the trained machine learning model may also include the m/z values or m/z ranges of the ions forming the generated pulse-characteristic distribution. Additionally or alternatively, the input may also include external data about the sample, such as the expected compound or isotope type. In other examples, different machine learning models may be trained for different types of samples, and the machine learning model for the type of sample being analyzed may be selected for analyzing the generated pulse-characteristic distribution.
The identification or assignment of charge states may also or alternatively be based on grouping peaks together as features and then analyzing the relative distances between the grouped peaks. This charge assignment process is discussed in more detail below with respect to method 3000 in fig. 3D.
Returning to the method 300 in fig. 3A, at operation 304, a mass spectrum is generated for the detected ions. The mass spectrum may be generated based on the charge state(s) identified in operation 303. For example, based on the known charge states, overlapping peaks can be resolved or otherwise indicated in the mass spectrum. As another example, because the charge states of the ions used to form the mass spectrum are known or identified in operation 303, a deconvolved mass spectrum may be generated for the detected ions. For a deconvolved mass spectrum, one axis of the mass spectrum may be mass rather than mass-to-charge ratio (m/z). At operation 305, a compound, or an amount of a compound, in the sample corresponding to the detected ions may be identified. The compound or amount of compound may be identified from the mass spectrum generated in operation 304 and/or from the charge state(s) identified in operation 303.
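Once a charge state has been assigned to a detection event, its position on a mass axis follows from the definition of m/z. The sketch below assumes positive ions whose charge carriers are protons, so that the mass without charge carriers is M = z((m/z) - X) with X the proton mass. The function names and example values are hypothetical.

```python
PROTON_MASS = 1.007276  # Da; assumes protons are the charge carriers


def neutral_mass(mz, z, carrier_mass=PROTON_MASS):
    """Mass of the ion without charge carriers: M = z * (m/z - X)."""
    if z <= 0:
        raise ValueError("charge state must be a positive integer")
    return z * (mz - carrier_mass)


def deconvolve(events):
    """Map (m/z, charge state) pairs onto a mass axis."""
    return [neutral_mass(mz, z) for mz, z in events]


# Two hypothetical ions of the same molecule observed in different charge states.
events = [(1001.007276, 3), (429.578705, 7)]
masses = deconvolve(events)
print([round(m, 2) for m in masses])
```

Both events land at essentially the same mass, which is how peaks from different charge states collapse onto a single peak in a deconvolved spectrum.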
Fig. 3B depicts another example method 310 for charge state assignment. At operation 311, a pulse is generated for each ion of the plurality of ions detected by the detector. Operation 311 may be substantially the same as or similar to operation 301 discussed above. The generated pulses may be stored in a memory for subsequent analysis. At operation 312, a mass spectrum is generated based on the m/z values of the detected ions. At operation 313, peaks in the mass spectrum may be selected and/or identified. For example, peaks may be automatically identified by a peak finding algorithm. In other examples, the peaks may be identified and/or selected based on manual input received from a user. The identified peaks have associated m/z positions or values. The m/z position may be the center position of the peak and/or the centroid or weighted average m/z position on the peak.
At operation 314, one or more pulse-characteristic distributions are generated for the ions forming the peaks identified in operation 313. The ions that form a peak may be ions identified via a peak finding algorithm. The ions may also be ions within a particular m/z range of the m/z value of the identified peak. For example, the m/z range may be selected based on the characteristics of the peak and/or may be a preset range (e.g., a fixed m/z value). As an example, the m/z range may be from the beginning m/z value of the peak (i.e., where the peak begins) to the end m/z value of the peak (i.e., where the peak ends).
For each ion that forms the peak and/or is within the m/z range, the corresponding pulse may be accessed. The pulses may be plotted or stored in a manner that indicates probability or frequency versus a pulse characteristic (e.g., pulse height). For example, a graph similar to that of FIG. 2 may be generated, or an array of pulse-characteristic and probability/frequency pairs may be generated and/or stored. One or more pulse-characteristic distributions may then be identified or generated for the pulses. For example, if the peak is formed by ions having different charge states, multiple pulse-characteristic distributions may be generated.
At operation 315, charge states of one or more ions forming the identified peak are identified and/or assigned based on the pulse-characteristic distributions generated in operation 314. Operation 315 may be similar to operation 303 in fig. 3A, and the charge state may be identified in a similar manner as discussed above. Based on the identified and/or assigned charge states, a mass spectrum may be generated and/or a compound or amount of compound may be identified, as discussed above with reference to operations 304 and 305.
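The array of pulse-characteristic and probability pairs mentioned above can be sketched as a simple histogram over the events that fall within the m/z range of a peak. In this minimal Python sketch the event layout (m/z, pulse height), the bin width, and all values are hypothetical.

```python
from collections import Counter


def pulse_height_distribution(events, mz_start, mz_end, bin_width):
    """Return sorted (bin_lower_edge, probability) pairs for one peak.

    `events` is an iterable of (m/z, pulse_height) pairs, one per detected ion.
    Only events with mz_start <= m/z <= mz_end contribute.
    """
    heights = [h for mz, h in events if mz_start <= mz <= mz_end]
    if not heights:
        return []
    # Label each pulse by the lower edge of the bin it falls into.
    counts = Counter(int(h // bin_width) * bin_width for h in heights)
    total = len(heights)
    return [(edge, counts[edge] / total) for edge in sorted(counts)]


# Hypothetical detection events: (m/z, pulse height in mV).
events = [(517.85, 22.0), (517.86, 27.5), (517.85, 34.0),
          (517.84, 36.1), (518.40, 55.0)]
dist = pulse_height_distribution(events, 517.80, 517.90, bin_width=10)
print(dist)
```

The last event lies outside the selected m/z range and therefore does not contribute to the distribution.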
Fig. 3C depicts another example method 320 for charge state assignment. At operation 321, a pulse is generated for each ion of the plurality of ions detected by the detector. Operation 321 may be the same as or similar to operation 301 discussed above. At operation 322, the detected ions are grouped according to their respective pulse characteristics. As an example, the pulse characteristic of pulse height may be used, and ions may be grouped into intensity bands based on their respective pulse heights. For example, a first subset of the plurality of ions may be grouped into a first intensity band and a second subset of the plurality of ions may be grouped into a second intensity band based on the pulse characteristic of each ion.
At operation 323, a mass spectrum may be generated for each intensity band. For example, where two intensity bands are used, a first mass spectrum may be generated for the first intensity band and a second mass spectrum may be generated for the second intensity band. Each mass spectrum may be similar to the mass spectrum depicted in fig. 1C and discussed above.
At operation 324, one or more peaks are identified from the mass spectra generated in operation 323. Peak identification and/or selection may be performed in the same or similar manner as in operation 313 discussed above. By generating a separate mass spectrum for each intensity band before identifying the peak(s), background noise may be reduced or removed. For example, one or more of the intensity bands may exhibit a sharper signal and/or better-defined peaks than a single aggregate mass spectrum for all ions. As such, the peaks may be identified more accurately.
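The grouping into intensity bands and the per-band spectra of operations 322 and 323 can be sketched as follows. This is a minimal illustration that assumes each detection event is stored as an (m/z, pulse height) pair and that a spectrum is a simple count per m/z bin. The function name, band edges, and bin width are hypothetical.

```python
from collections import defaultdict


def banded_spectra(events, band_edges, mz_bin):
    """Build one histogram-style mass spectrum per pulse-height band.

    `events`: iterable of (m/z, pulse_height) pairs.
    `band_edges`: ascending boundaries, e.g. [20, 30, 40] gives the bands
    [20, 30) and [30, 40). Events outside every band are ignored.
    Returns {(low, high): {mz_bin_index: ion_count}}.
    """
    bands = list(zip(band_edges[:-1], band_edges[1:]))
    spectra = {band: defaultdict(int) for band in bands}
    for mz, height in events:
        for low, high in bands:
            if low <= height < high:
                spectra[(low, high)][round(mz / mz_bin)] += 1
                break
    return {band: dict(counts) for band, counts in spectra.items()}


# Hypothetical detection events: (m/z, pulse height in mV).
events = [(517.85, 22.0), (517.85, 27.5), (517.85, 34.0),
          (518.00, 36.0), (518.00, 75.0)]
spectra = banded_spectra(events, band_edges=[20, 30, 40], mz_bin=0.01)
print(spectra)
```

Peak detection can then be run on each band separately, while the stored events remain available for building distributions across all bands.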
At operation 325, one or more pulse-characteristic distributions may be generated for ions forming the peak(s) identified in operation 324 and/or within an m/z range of the m/z value(s) of the identified peak(s). For example, a first pulse-characteristic distribution corresponding to ions having a first charge state and a second pulse-characteristic distribution corresponding to ions having a second charge state may be generated in operation 325. The pulse-characteristic distributions may be generated as discussed above, for example, with reference to operations 314 and 302.
At operation 326, an identification of the charge state of the ions forming the identified peak(s) is generated based on at least one of the pulse-characteristic distribution(s) generated in operation 325. Identifying the charge state from the pulse-characteristic distribution(s) may be performed as discussed above. For example, operation 326 may be similar to operation 303 in fig. 3A, and the charge state may be identified in a similar manner as discussed above. Based on the identified and/or assigned charge states, a mass spectrum may be generated and/or a compound may be identified, as discussed above with reference to operations 304 and 305.
Fig. 3D depicts another example method 3000 for charge state assignment. The description of FIG. 3D and method 3000 below sets out an example set of step-by-step operations that may be employed for each task, together with the results of the corresponding steps for a selected example data set. In this example, data from a top-down ECD experiment on carbonic anhydrase 2 (CA2) was used. Fig. 5 depicts an example zoomed-in portion, or slice, of a mass spectrum 500 of the data from the experiment. More specifically, FIG. 5 depicts the mass spectrum 500 zoomed to the m/z range 517-520 for the top-down ECD experiment on CA2.
Returning to FIG. 3D, in step 3010, data from the detector is recorded such that, for each detection event, a pair consisting of an intensity and the corresponding m/z value is recorded and stored. For example, for each pulse generated by the detector (due to a detected ion), at least one pulse characteristic (e.g., pulse height, pulse width, pulse area) and the m/z value of the corresponding detected ion may be stored as a pair. The data may be generated or recorded as discussed above and/or according to the methods described in the '720 publication.
In step 3020, the recorded data is summed into a single mass spectrum or into multiple mass spectra based on intensity, as described above and/or in the '720 publication. In this example, the data is summed to form a plurality of mass spectra corresponding to different intensity bands. Fig. 6 depicts an example of such banded mass spectra. For example, fig. 6 depicts a plurality of banded mass spectra 600, comprising: (1) a first mass spectrum 602 for ions having a corresponding pulse height between 20-30 mV; (2) a second mass spectrum 604 for ions having a corresponding pulse height between 30-40 mV; (3) a third mass spectrum 606 for ions having a corresponding pulse height between 40-50 mV; and (4) a fourth mass spectrum 608 for ions having a corresponding pulse height between 50-60 mV.
Returning to fig. 3D, in step 3030, a peak detection operation is performed. Various algorithms may be employed for this step, as will be appreciated by those skilled in the art. For example, a Continuous Wavelet Transform (CWT) algorithm may be used. The highlighting/shading in fig. 6 shows the result of the peak detection process based on the continuous wavelet transform algorithm. For example, each peak detected by the peak detection algorithm is highlighted/shaded. The line through each peak depicted in FIG. 6 indicates the m/z value of the respective peak. The outer boundary of the shading/highlighting may indicate the m/z range of the peak.
In step 3040, a pulse-characteristic distribution is calculated for each peak using the detection events filtered by their proximity to the peak apex. For example, all pulses corresponding to ions in the highlighted region of a particular peak in fig. 6 may be used to generate the pulse-characteristic distribution. While multiple banded mass spectra may be generated and used for peak identification (to improve the accuracy of the peak detection process), the pulses used to generate the pulse-characteristic distribution are taken from all of the banded mass spectra. For example, if the peak selected to generate the pulse-characteristic distribution is the peak located at about m/z 518.0 in the 30-40 mV mass spectrum 604 in FIG. 6, then the pulses used to generate the pulse-characteristic distribution include pulses corresponding to ions from all of the banded mass spectra 602-608 whose m/z values are close to the peak's m/z value.
The generated pulse-characteristic distribution may be a pulse-height distribution. An example of the calculated pulse-height distributions for the multiple peaks in fig. 6 is shown in fig. 7. As can be seen in fig. 7, two clusters of pulse-height distributions are formed: a first cluster 702 and a second cluster 704. The first cluster 702 corresponds to ions having a first charge state and the second cluster 704 corresponds to ions having a second charge state. Although the pulse-height distributions in each cluster are not exactly the same, two distinct groupings of profiles or distributions are clearly visible in the graph in fig. 7.
Returning to fig. 3D, in step 3050, the pulse-characteristic distributions generated in step 3040 may be compared to each other. The comparison may be performed to group or cluster the pulse-characteristic distributions. Ions and/or peaks belonging to different isotopes may be grouped together based on that grouping or clustering. One way to perform this comparison is to calculate the relative distance between the pulse-characteristic distributions. To this end, appropriate binning may be employed, and each pulse-characteristic distribution may be represented as a vector containing the probability for each intensity range. A Euclidean distance, or any other suitable norm, may be calculated for each pair of vectors representing the intensity distributions. An example of the calculated pair-wise Euclidean distances is shown in the table 800 depicted in fig. 8. Peaks having corresponding pulse-characteristic distributions with relative distances less than a predefined threshold may be grouped together to form a feature. In the exemplary analysis, peaks are grouped into a feature if the relative distance between every pair of pulse-characteristic distributions within the group is less than 0.1, which forms two separate groups of peaks corresponding to two unique features. For example, two peaks identified in a mass spectrum may have corresponding pulse-characteristic distributions that are very similar to each other (i.e., in the same cluster). This similarity indicates that the peaks are likely to be formed by ions having the same charge state. Thus, the peaks may be grouped together as a feature and considered part of an isotope cluster.
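The pairwise comparison and grouping of step 3050 can be sketched as below. The sketch assumes distributions that are already binned and normalized, and it assumes a simple greedy strategy in which a peak joins the first group whose members are all closer than the threshold. The greedy strategy, the peak labels, and the distribution values are illustrative assumptions; only the 0.1 threshold and the Euclidean distance come from the text.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equally binned distributions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def group_peaks(distributions, threshold=0.1):
    """Greedily group peaks whose distributions are mutually closer than `threshold`.

    `distributions` maps a peak label to its binned, normalized distribution.
    A peak joins the first group in which it is within `threshold` of every
    member; otherwise it starts a new group (a new feature).
    """
    groups = []
    for label, dist in distributions.items():
        for group in groups:
            if all(euclidean(dist, distributions[m]) < threshold for m in group):
                group.append(label)
                break
        else:
            groups.append([label])
    return groups


# Hypothetical pulse-height distributions for four detected peaks.
distributions = {
    "peak_a": [0.50, 0.30, 0.15, 0.05],
    "peak_c": [0.10, 0.20, 0.40, 0.30],
    "peak_b": [0.48, 0.32, 0.15, 0.05],
    "peak_d": [0.12, 0.20, 0.38, 0.30],
}
features = group_peaks(distributions)
print(features)
```

A greedy pass depends on the order of the peaks; a hierarchical clustering with complete linkage would be an order-independent alternative.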
Returning to fig. 3D, in step 3060, the charge state of each feature is identified on the assumption that its peaks form an isotope cluster. To accomplish this, the peaks grouped into a feature in step 3050 may first be sorted in ascending or descending order of m/z. The distance between adjacent peaks of the feature (e.g., the m/z distance between a first peak and a second peak) may then be calculated in m/z space. A consistent distance may then be selected, either as the most frequent (and therefore most likely) distance for the particular feature or as the smallest distance, provided that it properly accounts for the other observed distances (e.g., the other distances are multiples of the consistent distance). The distance between adjacent isotopes in an isotope cluster is inversely proportional to the charge of the feature, so the charge of the feature can be inferred from those distances.
An example of such a calculation is shown in fig. 9. Fig. 9 depicts two features in which peaks have been grouped based on the similarity of their corresponding pulse-characteristic distributions. For the first feature, four peaks have been grouped together. For the second feature, three peaks have been grouped together. The peak position and the m/z distance between the peaks are shown for each respective peak. A predicted charge state may be generated from each distance by taking its reciprocal. For example, the reciprocal of 0.142 is 7.04 (i.e., 1/0.142 = 7.04), and the reciprocal of 0.334 is 2.99 (i.e., 1/0.334 = 2.99). The most frequently predicted charge or distance may then be used to identify or assign the charge state. For example, for the first feature, the most common predicted charge state based on the m/z distances is about 7. Since the charge state must be an integer, the consistent charge state is determined to be 7. A similar calculation performed on the second feature yields a consistent charge state of 3. Notably, the distance between peaks 2 and 3 of the first feature differs from the consistent distance; the handling of this difference is discussed further below.
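The reciprocal-distance calculation can be sketched as follows. The peak positions below are hypothetical values chosen only so that the spacings match those quoted in the text (0.142, about 0.143, 0.286, and 0.334); they are not taken from fig. 9. Selecting the most frequent rounded reciprocal is one of the two selection strategies described above.

```python
from collections import Counter


def predicted_charges(peak_mzs):
    """Rounded reciprocal of the m/z spacing for each pair of adjacent peaks."""
    ordered = sorted(peak_mzs)
    spacings = [b - a for a, b in zip(ordered, ordered[1:])]
    return [round(1.0 / d) for d in spacings]


def consistent_charge(peak_mzs):
    """Most frequently predicted charge across the adjacent-peak spacings."""
    counts = Counter(predicted_charges(peak_mzs))
    return counts.most_common(1)[0][0]


# Hypothetical peak positions reproducing the spacings quoted in the text.
feature_1 = [517.566, 517.708, 517.851, 518.137]
feature_2 = [518.003, 518.337, 518.671]

print(predicted_charges(feature_1), consistent_charge(feature_1))
print(predicted_charges(feature_2), consistent_charge(feature_2))
```

The third spacing of the first feature predicts a different charge because a peak is missing from the cluster, which is the situation addressed in step 3070.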
Returning to fig. 3D, method 3000 may continue with step 3070, where missing peaks may be calculated. This step may be performed using a variety of strategies. For example, to find missing internal peaks, the algorithm first finds distances between adjacent peaks that are greater than the consistent distance and are substantially a multiple of the consistent distance. The multiplication factor N is then calculated as N = (measured distance)/(consistent distance). An additional N-1 peaks may then be inserted between those adjacent peaks, at their respective positions, to form a complete isotope cluster.
An example of using such an algorithm may be demonstrated with reference to fig. 9. In fig. 9, the distance between peaks 2 and 3 in the first feature is 0.286, which is greater than the distance between the other peaks in the first feature. The consistent distance of the first feature is about 0.143. Using the above calculation, N = 0.286/0.143 = 2. Thus, one (i.e., N-1) peak may be inserted between peaks 2 and 3. This additional peak is added after this step to complete the isotope cluster. The peak is placed at an m/z position of 517.994 (calculated by averaging the m/z values of the peaks labeled 2 and 3).
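The insertion of missing internal peaks can be sketched as below. The sketch assumes that inserted peaks are spaced evenly across the gap and that a gap qualifies when its ratio to the consistent distance is within a tolerance of an integer; both the tolerance and the peak positions are hypothetical, with the positions chosen to reproduce the 0.286 gap and the 517.994 result quoted above.

```python
def fill_internal_peaks(peak_mzs, consistent_distance, tolerance=0.1):
    """Insert evenly spaced peaks into gaps spanning N consistent distances.

    A gap qualifies when measured/consistent is within `tolerance` of an
    integer N >= 2; N - 1 peaks are then inserted at equal steps.
    """
    ordered = sorted(peak_mzs)
    if not ordered:
        return []
    filled = [ordered[0]]
    for left, right in zip(ordered, ordered[1:]):
        ratio = (right - left) / consistent_distance
        n = round(ratio)
        if n >= 2 and abs(ratio - n) <= tolerance:
            step = (right - left) / n
            filled.extend(left + step * k for k in range(1, n))
        filled.append(right)
    return filled


# Hypothetical positions for the four peaks of the first feature.
feature_1 = [517.566, 517.708, 517.851, 518.137]
completed = fill_internal_peaks(feature_1, consistent_distance=0.143)
print([round(mz, 3) for mz in completed])
```

For N = 2 the even spacing places the new peak at the average of its two neighbors, matching the averaging used in the example above.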
Another algorithm, which is not limited to finding only internal peaks, may alternatively or additionally be employed to search for missing peaks. This algorithm calculates the positions of possible adjacent peaks and then extracts the recorded signals corresponding to these positions. This signal is further processed to form a pulse-characteristic distribution, which can then be compared to one (or alternatively an average) of the pulse-characteristic distributions calculated for the peaks in the group, using methods similar to those described in steps 3050-3060. Any peak identified in this way is then added to the peak list of the feature.
Step 3080 may include performing an analysis to find overlapping peaks among the plurality of features. This step can be accomplished by comparing the peak positions in each feature and addressing the peaks at substantially the same positions. For example, the newly found peak from previous step 3070 with an m/z of 517.994 is in substantially the same position as peak 0 from feature 2 (see fig. 9) with an m/z of 518.003. Overlapping peaks can be identified by finding the m/z difference or distance between peaks and comparing it to an overlap threshold. If the m/z distance is below the threshold, the peaks may be considered overlapping.
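The overlap check of step 3080 can be sketched as a pairwise comparison of peak positions across features. The overlap threshold of 0.02 and all peak positions other than 517.994 and 518.003 are hypothetical values used for illustration.

```python
def find_overlaps(features, threshold):
    """Return (feature_a, mz_a, feature_b, mz_b) for peaks closer than `threshold`.

    `features` maps a feature name to its list of peak m/z positions.
    Only peaks belonging to different features are compared.
    """
    names = sorted(features)
    overlaps = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for mz_a in features[a]:
                for mz_b in features[b]:
                    if abs(mz_a - mz_b) < threshold:
                        overlaps.append((a, mz_a, b, mz_b))
    return overlaps


# Hypothetical peak lists; 517.994 is the peak inserted in step 3070.
features = {
    "feature 1": [517.566, 517.708, 517.851, 517.994, 518.137],
    "feature 2": [518.003, 518.337, 518.671],
}
overlaps = find_overlaps(features, threshold=0.02)
print(overlaps)
```

In practice the threshold would likely be tied to the instrument resolution at the m/z in question rather than fixed.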
Once overlapping peaks have been identified, an additional step of finding the contribution of each feature to the overlapping peaks may be performed. To this end, a system of linear equations may be written and solved approximately with or without constraints. Such constraints may include requiring that each contribution be non-negative. This step may be accomplished, for example, using a non-negative least squares approximation algorithm.
At step 3090, a detection event (e.g., an ion corresponding to a pulse) may be assigned to a feature. Step 3090 may be performed using, for example, the following algorithm. First, using the consistent distance from step 3070 and one or more additional instrument parameters (such as instrument resolution), an m/z distribution can be modeled for each peak of the feature. Second, using a consistent pulse-characteristic distribution for the feature (which can be calculated as the average of the pulse-characteristic distributions of all non-overlapping peaks) and the m/z distribution from the first part of this step, two values can be calculated that reflect the probability that the detection event belongs to the feature. These values are the value of the modeled m/z distribution at the m/z position of the detection event and, similarly, the value of the consistent intensity distribution attributed to the feature at the intensity of the detection event. The product of those values is a score, which can be compared against a threshold to attribute the detection event to the feature.
An example of calculating such a score may be provided with reference to fig. 10, which depicts a modeled peak 1000 and a consistent pulse-characteristic distribution 1050. The detected ion to be assigned has an m/z value of 517.994 and a pulse height of 34. Those values are indicated by vertical lines in the figure. The m/z value of 517.994 intersects the modeled peak at an intensity of 0.31 (note that, for the modeled peak, the intensity has been normalized to 1). The pulse height of 34 intersects the consistent pulse-characteristic distribution at 0.25. Thus, the score may be calculated as 0.0775 (i.e., 0.31 × 0.25 = 0.0775).
In the case of multiple overlapping features, the feature that yields the highest score is selected. Additional constraints that balance the total contribution of the features to overlapping peaks may also be set and implemented. Additional or alternative algorithms may also be implemented for this step to determine the most appropriate feature to which a detection event corresponding to a detected ion should be assigned. Such algorithms may estimate the probability that a detection event belongs to a feature, such as by using a Bayesian framework. In a final step of method 3000, the charge state of the feature is assigned to the detection event.
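The score calculation and the selection among overlapping features can be sketched as below. The Gaussian peak shape normalized to 1 at its apex is an assumed model for the m/z distribution, and the assignment threshold and the score for the second feature are hypothetical; the 0.31 and 0.25 values are those quoted in the example above.

```python
import math


def modeled_peak(mz, center, fwhm):
    """Gaussian peak shape normalized to 1 at its apex (an assumed model)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((mz - center) / sigma) ** 2)


def score(mz_value, height_value):
    """Product of the modeled-peak value and the pulse-height probability."""
    return mz_value * height_value


def assign(feature_scores, threshold):
    """Pick the highest-scoring feature, or None if it is below `threshold`."""
    best = max(feature_scores, key=feature_scores.get)
    return best if feature_scores[best] >= threshold else None


# Values quoted in the text for the first feature; the second is hypothetical.
feature_scores = {
    "feature 1": score(0.31, 0.25),
    "feature 2": score(0.40, 0.05),
}
chosen = assign(feature_scores, threshold=0.05)
print(chosen, round(feature_scores[chosen], 4))
```

The detection event would then receive the charge state of the chosen feature, or remain unassigned when no feature reaches the threshold.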
Fig. 3E depicts another example method 3200 for charge state assignment. At operation 3202, a pulse is generated for each ion of the plurality of ions detected by the detector. For example, when an ion strikes an electron multiplier detector, a pulse is generated. The generated pulses may be similar to those depicted in fig. 1B and discussed above. Each pulse may have, or be described by, its respective pulse characteristics. The detected ions may be ions from ionization of a sample studied or analyzed by mass analysis techniques. At operation 3204, one or more pulse-characteristic distributions are generated based on the characteristics of the pulses. For example, a pulse-characteristic distribution may be generated for a particular pulse characteristic, such as pulse height. Thus, the pulse-characteristic distribution(s) generated in operation 3204 may be a pulse-height distribution similar to that depicted in fig. 2 and discussed above.
At operation 3206, peaks of ions having adjacent charge states are identified based on the pulse-characteristic distributions. Identifying peaks having adjacent charge states can include determining an estimated, or coarse, charge state based on the pulse-characteristic distribution. The coarse charge state identification may be accurate only to within a range of possible charge states for at least one peak of a pair, where at least one charge state in that range is adjacent to the charge state identified for the other peak of the pair.
An example of ion peaks with adjacent charge states is shown in example diagram 1100 in fig. 11. The first peak is located at an m/z position of $(m/z)_1$, and the second peak is located at an m/z position of $(m/z)_2$. The charge states of those peaks may be estimated based on the pulse-characteristic distributions generated in operation 3204.
At operation 3208, the charge states of the ions forming the peaks may be further determined based on the following equations:
Equation (1):
$$(m/z)_2 = \frac{M + z_2 X}{z_2}$$
Equation (2):
$$(m/z)_1 = \frac{M + (z_2 + 1)\,X}{z_2 + 1}$$
Equation (3):
$$z_2 = \frac{(m/z)_1 - X}{(m/z)_2 - (m/z)_1}$$
Equation 1 expresses the relationship between the m/z position of the second peak ($(m/z)_2$), the mass of the ion without charge carriers ($M$), the charge state of the second peak ($z_2$), and the mass of the charge carrier ($X$). Equation 2 expresses the relationship between the m/z position of the first peak ($(m/z)_1$), the mass of the ion without charge carriers ($M$), the charge state of the second peak ($z_2$), and the mass of the charge carrier ($X$). It is noted that Equation 2 assumes that the charge state difference between the first peak and the second peak is 1. Thus, in other examples where the estimated charge state difference between the first peak and the second peak is not 1, the 1 in Equation 2 is replaced with that value.
Equation 3, which is based on Equations 1 and 2, expresses the relationship between the charge state of the ions in the second peak ($z_2$), the m/z position of the second peak ($(m/z)_2$), the m/z position of the first peak ($(m/z)_1$), and the mass of the charge carrier ($X$). Each of the values on which $z_2$ depends is measured by the detector or is known from the ionized sample and/or the sample preparation process. For example, a common charge carrier is a proton, which has a mass of about 1 atomic mass unit (AMU). Thus, the charge state ($z_2$) of the ions forming the second peak can be determined using Equation 3. Based on the determined charge state ($z_2$) of the ions forming the second peak, the charge state ($z_1$) of the ions forming the first peak can be determined. The determined charge state may be a refined charge state as compared to the initially estimated or determined coarse charge state.
The refined charge state should be an integer or near an integer, such as within a threshold of an integer. If it is not, the coarse or refined charge state identification may be incorrect. Thus, the coarse and/or refined charge state identification may be accepted only if the refined charge state is within a certain threshold of an integer. Otherwise, the coarse charge state can be re-estimated and the method performed again with the corrected coarse charge state. Further, to potentially increase confidence in the assignment, a third peak having an adjacent charge state may be identified. The third peak forms a pair with at least one of the first two peaks, and the charge state identifications for the common peak should match in both pairs.
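The refinement and the integer check can be sketched as below. The sketch assumes that the first peak carries more charges than the second (so that it lies at lower m/z), which gives z_2 = delta((m/z)_1 - X)/((m/z)_2 - (m/z)_1) for a charge state difference of delta, and it assumes protons as charge carriers. The tolerance, the function names, and the example mass are hypothetical.

```python
PROTON_MASS = 1.007276  # Da; assumes protons are the charge carriers


def refine_charge(mz_1, mz_2, carrier_mass=PROTON_MASS, delta=1):
    """Refined charge state z_2 of the second (higher-m/z) peak.

    Assumes the first peak carries `delta` more charges than the second,
    so that mz_1 < mz_2.
    """
    if mz_2 <= mz_1:
        raise ValueError("expected the second peak at higher m/z than the first")
    return delta * (mz_1 - carrier_mass) / (mz_2 - mz_1)


def accept(z_refined, tolerance=0.05):
    """Return the nearest integer charge if z_refined is close enough, else None."""
    nearest = round(z_refined)
    if nearest >= 1 and abs(z_refined - nearest) <= tolerance:
        return nearest
    return None


# Hypothetical molecule of mass 29000 Da observed at charge states 11 and 10.
M = 29000.0
mz_1 = M / 11 + PROTON_MASS
mz_2 = M / 10 + PROTON_MASS

z_2 = accept(refine_charge(mz_1, mz_2))
z_1 = z_2 + 1 if z_2 is not None else None
print(z_2, z_1)
```

A `None` result corresponds to the case described above in which the coarse charge state is re-estimated and the calculation repeated.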
Fig. 3F depicts another example method 3300 for charge state assignment. Method 3300 utilizes an image-charge detector. Unlike ADC detectors, image-charge detectors detect oscillations of ions in a mass analyzer. Fig. 12 depicts an example plot of a transient time-domain signal measured by an image-charge detector, the transient time-domain signal including a component from each of a plurality of ions oscillating in a mass analyzer. In order to decompose the transient time-domain signal measured by the image-charge detector into individual components, the transient time-domain signal is converted into a frequency-domain signal. Suitable transformation methods include, but are not limited to, a Fourier transform or a wavelet transform. A peak in the frequency-domain signal corresponds to each ion of the plurality of ions oscillating in the mass analyzer. To generate a mass spectrum, the frequency-domain peaks are converted to m/z peaks using well-known formulas that depend on the particular type of mass analyzer.
For an image-charge detector, the intensity of the frequency-domain signal or peak is proportional to the charge state of the underlying ion, similar to the relationship between the pulses and the charge state described above. Thus, the intensity or other characteristics of the frequency-domain (FD) peak may be used to generate a distribution similar to the pulse-characteristic distribution discussed above. The distribution generated from the characteristics of the FD peaks may be referred to as an FD-peak-characteristic distribution, or as an FD-peak-intensity distribution where the intensity of the FD peak is used as the characteristic of interest. The FD-peak-characteristic distribution can then be used to determine the charge state in substantially the same manner as the pulse-characteristic distribution.
Returning to fig. 3F, at operation 3302, a transient time domain signal induced on an image charge detector of the mass analyzer by the oscillation of the plurality of ions in the mass analyzer is detected. At operation 3304, the transient time-domain signal is converted into a plurality of frequency-domain (FD) peaks. Each frequency domain peak may correspond to an ion of the plurality of ions.
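The conversion of operation 3304 can be illustrated with a synthetic transient. The sketch below uses a naive discrete Fourier transform in place of an optimized FFT and a simple local-maximum search in place of a dedicated peak finder. The signal, the oscillation frequencies, the amplitudes (standing in for relative charge), and the height cutoff are all hypothetical.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Naive discrete Fourier transform; magnitudes for bins 0..N/2-1."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2)
    ]


def find_fd_peaks(magnitudes, min_height):
    """Indices of local maxima whose magnitude is at least `min_height`."""
    return [
        k for k in range(1, len(magnitudes) - 1)
        if magnitudes[k] >= min_height
        and magnitudes[k] > magnitudes[k - 1]
        and magnitudes[k] > magnitudes[k + 1]
    ]


# Synthetic transient: two ions oscillating at 20 and 45 cycles per record,
# with amplitudes 3 and 7 standing in for their relative charge.
N = 256
transient = [
    3 * math.sin(2 * math.pi * 20 * t / N) + 7 * math.sin(2 * math.pi * 45 * t / N)
    for t in range(N)
]
mags = dft_magnitudes(transient)
peaks = find_fd_peaks(mags, min_height=0.1 * max(mags))
print(peaks, round(mags[45] / mags[20], 3))
```

The ratio of the two peak magnitudes reflects the ratio of the amplitudes in the transient, which is the property that allows FD-peak intensities to be used in place of pulse heights.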
At operation 3306, one or more FD-peak-characteristic distributions are generated. For example, an FD-peak-characteristic distribution may be generated for a particular FD-peak characteristic, such as intensity. At operation 3308, an identification of a charge state of one or more ions of the plurality of ions detected in operation 3302 is generated based on the one or more FD-peak-characteristic distributions generated in operation 3306. Identifying charge states based on FD-peak-characteristic distributions can be performed using any of the methods described herein for pulse-characteristic distributions. For example, the FD-peak-characteristic distribution may be used in place of the pulse-characteristic distribution.
At operation 3310, a mass spectrum is generated for the detected ions. The mass spectrum may be generated based on the charge state(s) identified in operation 3308. For example, overlapping peaks may be resolved or otherwise indicated in the mass spectrum using the known charge states. As another example, because the charge states of the ions used to form the mass spectrum are known or identified in operation 3308, a deconvolved mass spectrum may be generated for the detected ions. For a deconvolved mass spectrum, one axis of the mass spectrum may be mass rather than mass-to-charge ratio (m/z). At operation 3312, a compound, or an amount of a compound, in the sample corresponding to the detected ions may be identified. The compound or amount of compound may be identified from the mass spectrum generated in operation 3310 and/or from the charge state(s) identified in operation 3308.
Fig. 13 depicts an example system 1300 that includes an image-charge detector 1318. The system of fig. 13 includes a mass spectrometer 1310 and computing components including a memory and processor 1320. The computing elements of the system (such as the processor 1320 and memory) may be included in the mass spectrometer itself, located near the mass spectrometer, or located remotely from the mass spectrometer. In general, a computing element of the system may be in electronic communication with the detector 1318 such that the computing element is capable of receiving a signal generated from the detector 1318. Processor 1320 may include multiple processors and may include any type of suitable processing component for processing signals and generating the results discussed herein.
Mass spectrometer 1310 includes mass analyzer 1317. The mass analyzer 1317 includes an image-charge detector 1318. The image-charge detector 1318 generates, for detected ions, an oscillating signal or transient time domain signal whose amplitude is proportional to the ion charge state. The mass analyzer 1317 may be any type of mass analyzer that can detect ions using an image-charge detector, including but not limited to an Electrostatic Linear Ion Trap (ELIT), FT-ICR, or orbitrap mass analyzer. The mass analyzer 1317 is shown as an ELIT in fig. 13, and the image-charge detector 1318 is shown as a pickup electrode of the ELIT.
The mass analyzer 1317 detects a transient time domain signal 1319 induced on the image-charge detector 1318 by the oscillation of the plurality of ions in the mass analyzer 1317. The plurality of ions are transmitted through the mass spectrometer 1310 to the mass analyzer 1317. Processor 1320 converts transient time domain signal 1319 into a plurality of frequency domain pulses or peaks 1321. Each frequency domain peak corresponds to an ion of the plurality of ions. For example, processor 1320 converts transient time-domain signal 1319 into the plurality of frequency-domain peaks 1321 using a Fourier transform.
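The conversion from a transient to frequency-domain peaks can be sketched as follows. This is an illustrative example only: it uses a naive discrete Fourier transform written with the Python standard library, a synthetic transient with two hypothetical ions, and a simple local-maximum peak picker. The names `dft_magnitudes` and `find_peaks` and the threshold are assumptions, not part of the described system.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Single-sided DFT magnitudes, scaled so a sinusoid of amplitude A
    that falls exactly on a bin gives a magnitude of about A.
    (The DC and Nyquist bins are over-scaled by 2; they are not used here.)
    """
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc) * 2 / n)
    return mags


def find_peaks(mags, threshold):
    """Return (bin_index, magnitude) for local maxima above threshold."""
    return [(k, mags[k]) for k in range(1, len(mags) - 1)
            if mags[k] > threshold
            and mags[k] >= mags[k - 1] and mags[k] >= mags[k + 1]]


# Synthetic transient: two hypothetical ions oscillating at bins 20 and 45,
# the second inducing twice the amplitude (e.g. twice the charge).
n = 256
transient = [1.0 * math.sin(2 * math.pi * 20 * t / n)
             + 2.0 * math.sin(2 * math.pi * 45 * t / n) for t in range(n)]
peaks = find_peaks(dft_magnitudes(transient), threshold=0.5)
```

Each returned peak carries a frequency bin (related to m/z through the analyzer's calibration, which is not modeled here) and an intensity that could feed the FD-peak-characteristic distribution.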
The processor 1320 may compare the intensity of each of the plurality of frequency domain peaks 1321 to two or more different predetermined intensity ranges corresponding to two or more different charge state ranges. The processor 1320 may store each frequency domain peak in one of two or more data sets 1322 corresponding to two or more predetermined intensity ranges based on the comparison. Processor 1320 may create a mass spectrum based on the frequency domain peaks and/or the identified charge states discussed herein.
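The comparison against predetermined intensity ranges can be sketched as a simple binning step. The range boundaries, labels, and peak values below are hypothetical, and the helper name `sort_peaks_by_intensity` is an assumption introduced for illustration.

```python
def sort_peaks_by_intensity(peaks, intensity_ranges):
    """Store each (m/z, intensity) peak in the data set whose intensity
    range contains it. Ranges are (label, low, high) with low <= I < high.
    Peaks outside every range are left unassigned.
    """
    datasets = {label: [] for label, _, _ in intensity_ranges}
    for mz, intensity in peaks:
        for label, low, high in intensity_ranges:
            if low <= intensity < high:
                datasets[label].append((mz, intensity))
                break
    return datasets


# Hypothetical intensity ranges, each standing for a charge state range.
ranges = [("z=1-5", 0.0, 50.0), ("z=6-10", 50.0, 100.0)]
fd_peaks = [(800.2, 12.0), (950.7, 74.5), (1010.3, 49.9), (1200.0, 130.0)]
datasets = sort_peaks_by_intensity(fd_peaks, ranges)
```

A separate mass spectrum could then be built from each data set, so that peaks from different charge state ranges no longer overlap in one spectrum.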
In various embodiments, during acquisition, processor 1320 converts transient time-domain signal 1319 into a plurality of frequency-domain peaks 1321, compares the intensity of each frequency-domain peak to two or more different predetermined intensity ranges, and stores each frequency-domain peak in one of two or more data sets 1322. In an alternative embodiment, after acquisition, processor 1320 converts transient time-domain signal 1319 into a plurality of frequency-domain peaks 1321, compares the intensity of each frequency-domain peak to two or more different predetermined intensity ranges, and stores each frequency-domain peak in one of two or more data sets 1322.
As discussed above, if multiple copies of the same ion oscillate in the mass analyzer 1317 at the same time, the measured intensity may not be proportional to the charge state. Thus, in various embodiments, the mass spectrometer 1310 transmits ions to the mass analyzer 1317 such that the mass analyzer 1317 includes only a single ion having a particular m/z and charge state at any given time.
In various embodiments, the system of fig. 13 further comprises an ion source apparatus 1311. The ion source apparatus 1311 may be, for example, an electrospray ion source (ESI) apparatus. The ion source apparatus 1311 is shown as part of the mass spectrometer 1310 in fig. 13, but may be a separate apparatus.
In various embodiments, mass spectrometer 1310 further includes a dissociation device. The dissociation device may be, but is not limited to, an ExD device 1315 or a CID device 1313. The dissociation device may be used, for example, for top-down protein analysis.
In top-down protein analysis, the ion source apparatus 1311 ionizes proteins of a sample, producing a plurality of precursor ions for the proteins in an ion beam. The dissociation device dissociates a plurality of precursor ions in the ion beam to produce a plurality of product ions having different charge states in the ion beam. As described above, the mass spectrometer 1310 transmits the plurality of product ions to the mass analyzer 1317 such that the plurality of product ions are the plurality of ions transmitted by the mass spectrometer 1310 to the mass analyzer 1317.
In various embodiments, the processor 1320 is used to control or provide instructions to the ion source apparatus 1311 and the mass spectrometer 1310 and to analyze the collected data. The processor 1320 controls or provides instructions by, for example, controlling one or more voltage, current, or pressure sources (not shown).
Fig. 4 depicts another example method 400 for charge state assignment. The method 400 may be particularly useful if the isotope distribution is not sufficiently resolved for at least a plurality of peaks. Operations 401-404 may be substantially the same as operations 3010-3040. At operation 405, the charge state of a feature may be estimated. The feature may be formed by a plurality of peaks sharing similar pulse-characteristic distributions, and the grouping of peaks to form the feature may be performed as discussed above with respect to operations 3050-3060. At operation 406, for each possible charge state estimated in operation 405, neighboring peaks may be identified and the likelihood of the neighboring peaks may be scored. At operation 407, the most likely candidate is selected based on the scoring performed at operation 406. Detection events may then be assigned to the features in operation 408, and a feature charge state may be assigned to the detection events based on the charge state of the features.
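Operations 406 and 407 can be illustrated with a rough sketch. It assumes, as one possible reading, that the "neighboring peaks" of a candidate charge state z are the peaks of the same species at charge states z-1 and z+1, and that the score is simply the number of expected neighbors found within a tolerance. The scoring rule, tolerance, function names, and data are all hypothetical.

```python
PROTON_MASS = 1.007276  # Da, assumed charge carrier


def expected_neighbors(mz, z, carrier_mass=PROTON_MASS):
    """m/z values expected for charge states z+1 and z-1 of the same mass."""
    mass = (mz - carrier_mass) * z
    neighbors = [mass / (z + 1) + carrier_mass]
    if z > 1:
        neighbors.append(mass / (z - 1) + carrier_mass)
    return neighbors


def score_candidates(mz, candidates, observed_mz, tol=0.05):
    """Score each candidate z by how many expected neighbors are observed."""
    scores = {}
    for z in candidates:
        scores[z] = sum(
            any(abs(obs - exp) <= tol for obs in observed_mz)
            for exp in expected_neighbors(mz, z))
    return scores


# Hypothetical species of mass 10000 Da observed at charge states 9, 10, 11.
observed = [10000.0 / z + PROTON_MASS for z in (9, 10, 11)]
scores = score_candidates(observed[1], candidates=[8, 9, 10, 11, 12],
                          observed_mz=observed)
best = max(scores, key=scores.get)  # most likely candidate (operation 407)
```

In practice the candidate list would come from the coarse estimate of operation 405, and a real score would likely weight neighbors by intensity and by similarity of their pulse-characteristic distributions.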
In another example, the methods in fig. 3A-3E may be combined with the method in fig. 4. In such an example, peaks with well resolved isotope patterns may be processed using the strategy described in method 3000 of fig. 3D. For remaining peaks where the isotope pattern is substantially unresolved, the strategy in fig. 4 may be used. In such a combination, an additional step may be taken to determine whether each peak found is isotopically resolved. For example, this step may compare the peak width of the detected feature to the characteristic peak width of the mass spectrometer to separate resolved features from unresolved features.
For all these methods, three outcomes can generally be envisaged, and their utility can be practically identical. First, using information about each detection event (e.g., charge state), a deconvoluted mass spectrum may be generated using the formula $(m/z - m_p) \cdot z$, where $z$ is the determined charge state and $m_p$ is the proton mass. Second, this information can be used to construct a set of spectra for each charge state covering the entire m/z range or a segment of the m/z range. Third, this information can be used to generate a list of individual features, each feature having m/z and z assigned to it. The compounds and/or the amount of compounds present in the analyzed sample may then be determined or generated based on the identified features.
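The first outcome, converting each detection event to mass with (m/z - m_p) * z, can be shown with a short worked sketch. The event list, the 1 Da bin width, and the helper names are hypothetical; the proton mass is the only physical constant used.

```python
PROTON_MASS = 1.007276  # Da


def neutral_mass(mz, z, carrier_mass=PROTON_MASS):
    """Mass from m/z and charge state: (m/z - m_p) * z."""
    return (mz - carrier_mass) * z


def deconvolute(events, bin_width=1.0):
    """Build a mass-axis histogram from (m/z, z) detection events."""
    spectrum = {}
    for mz, z in events:
        mass_bin = round(neutral_mass(mz, z) / bin_width) * bin_width
        spectrum[mass_bin] = spectrum.get(mass_bin, 0) + 1
    return spectrum


# Hypothetical detection events: three charge states of a 10000 Da species
# and one event from a 1000 Da species.
events = [(1001.007276, 10), (910.098185, 11), (1112.118387, 9),
          (501.007276, 2)]
spectrum = deconvolute(events)
```

Events that appear at three different m/z values collapse onto a single mass bin once their charge states are known, which is the point of the deconvoluted spectrum.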
For all of the described embodiments, the step of assigning a charge state to an individual detection event may be replaced by, or may include, assigning a probability that the detection event originates from the ions forming a feature, and thus a probability that the detection event is associated with the charge state assigned to that feature. Such probabilities may be computed using, for example, a Bayesian framework. In a subsequent step of building up the representative mass spectrum (either in whole or in part), the contribution from each detection event may then be distributed proportionally between the features it may represent and their respective positions within the mass spectrum.
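One way such a Bayesian assignment could look is sketched below. It assumes, purely for illustration, that each feature's pulse characteristic follows a Gaussian with a known mean and standard deviation, and that priors reflect the relative abundance of the features. The model, names, and numbers are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def feature_posteriors(value, features):
    """Posterior probability that a detection event with the given pulse
    characteristic belongs to each feature (Bayes' rule, Gaussian likelihood).
    """
    weighted = {name: f["prior"] * gaussian_pdf(value, f["mean"], f["sd"])
                for name, f in features.items()}
    total = sum(weighted.values())
    if total == 0.0:
        # No feature explains the event; fall back to equal shares.
        return {name: 1.0 / len(features) for name in features}
    return {name: w / total for name, w in weighted.items()}


# Two hypothetical overlapping features with different charge states.
features = {
    "feature_A (z=5)": {"prior": 0.5, "mean": 10.0, "sd": 2.0},
    "feature_B (z=10)": {"prior": 0.5, "mean": 20.0, "sd": 2.0},
}
ambiguous = feature_posteriors(15.0, features)  # halfway between the means
clear = feature_posteriors(11.0, features)      # close to feature_A
```

The posteriors can be used directly as the proportional contributions of the event: an ambiguous event adds a fraction of a count to each feature's position in the spectrum rather than a whole count to one of them.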
While the present teachings are described in conjunction with various embodiments, the present teachings are not intended to be limited to these embodiments. On the contrary, the present teachings encompass various alternatives, modifications, and equivalents, as will be appreciated by those of skill in the art.
For example, aspects of the present disclosure are described above with reference to block diagrams and/or operational illustrations of methods, systems, and computer program products according to aspects of the disclosure. The functions/acts noted in the blocks may occur out of the order shown in any flow diagram. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Additionally, as used herein and in the claims, the phrase "at least one of element A, element B, or element C" is intended to convey any of the following: element A, element B, element C, elements A and B, elements A and C, elements B and C, and elements A, B, and C.
The description and illustrations of one or more aspects provided herein are not intended to limit or define the scope of the claimed disclosure in any way. The aspects, examples, and details provided in this application are considered sufficient to convey possession and enable others to make and use the best mode of the claimed disclosure. The claimed disclosure should not be construed as limited to any aspect, example, or detail provided in this application. Whether shown and described in combination or separately, the various features (structures and methods) are intended to be selectively included or omitted to produce embodiments having specific features. Having been provided with the description and illustration of the present application, those skilled in the art may devise variations, modifications, and alternative aspects that fall within the spirit of the broader aspects of the general inventive concept embodied in the present application, without departing from the broader scope of the disclosure as claimed.

Claims (40)

1. A method for classifying a charge state of a detected ion, the method comprising:
generating a pulse for each ion of a plurality of ions detected by a detector, wherein each pulse has a pulse characteristic;
generating a pulse-characteristic distribution for the generated pulses; and
generating an identification of a charge state of one or more ions of the plurality of ions based on the pulse-characteristic distribution.
2. The method of claim 1, wherein the pulse-characteristic distribution is a graph of probability versus pulse characteristic.
3. The method of any of claims 1-2, wherein the pulse characteristic is at least one of pulse height, pulse width, or pulse area.
4. The method of claim 3, wherein the pulse characteristic is a pulse height, and the pulse height is a maximum voltage of the pulse.
5. The method of any one of claims 1-4, wherein the detector is an electron multiplier detector and the detector is configured to detect primarily single ion events.
6. The method of any of claims 1-5, wherein generating the identification of the charge state comprises comparing the pulse-characteristic distribution to a reference pulse-characteristic distribution.
7. The method of claim 6, wherein the ions detected by the detector are generated by ionization of a sample, and the reference pulse-characteristic distribution is identified based on known characteristics of the sample.
8. The method of any of claims 1-7, wherein the generated identification comprises a probability of a charge state.
9. The method of any of claims 1-8, further comprising generating a deconvoluted mass spectrum for the detected ions based on the identification of the charge state, wherein one axis of the mass spectrum is mass rather than mass per charge (m/z).
10. The method of any of claims 1-9, wherein the plurality of ions are grouped into different groups using an m/z domain, and the identification of the charge state based on the pulse-characteristic distribution is performed for each group.
11. The method of claim 10, wherein the grouping step comprises:
generating a mass spectrum based on the plurality of detected ions;
identifying a first peak in the mass spectrum, wherein the first peak has a mass per charge (m/z) value; and
grouping ions within a mass per charge (m/z) range based on the m/z value of the first peak.
12. The method of claim 10, wherein the grouping step comprises:
selecting a first subset of the plurality of ions into a first intensity band and a second subset of the plurality of ions into a second intensity band;
generating a first mass spectrum for a first intensity band;
generating a second mass spectrum for a second intensity band;
identifying a first peak in at least one of the mass spectra, wherein the first peak has a mass per charge (m/z) value; and
grouping ions within a mass per charge (m/z) range based on the m/z value of the first peak.
13. The method of claim 10, further comprising generating a second pulse-characteristic distribution for ions in the m/z range, and wherein generating the identification comprises:
determining that ions forming the first pulse-characteristic distribution have a first charge state; and
determining that ions forming the second pulse-characteristic distribution have a second charge state.
14. The method of claim 10, further comprising, based on the identification of the charge state, determining one or more isotopes corresponding to the one or more ions that form the first peak.
15. The method of any of claims 10-14, wherein generating the identification of the charge state comprises comparing the first pulse-characteristic distribution to a reference pulse-characteristic distribution.
16. The method of claim 15, wherein the ions detected by the detector are generated by ionization of a sample, and the reference pulse-characteristic distribution is identified based on known characteristics of the sample.
17. The method of any of claims 10-16, wherein the generated identification comprises a probability of a charge state.
18. The method of any one of claims 1-17, wherein the method is performed as part of a top-down protein analysis.
19. The method of any of claims 11-18, further comprising:
identifying a second peak;
determining a consistent m/z distance based on at least the first peak and the second peak;
identifying that the first peak and the second peak form a feature; and
wherein generating the identification of the charge state is based on the consistent distance.
20. The method of claim 19, wherein the step of identifying that the peaks form a feature comprises comparing pulse-characteristic distributions of the peaks and selecting peaks having substantially the same pulse-characteristic distribution.
21. The method of claim 20, wherein the comparison of pulse-characteristic distributions is performed by calculating the Euclidean distance between the pulse-characteristic distributions and comparing it to a predetermined threshold.
22. The method of any of claims 19-21, further comprising, based on the consistent distance, identifying missing peaks corresponding to features.
23. The method of any of claims 19-21, further comprising, based on the identification of the charge state of the ions, generating a deconvoluted mass spectrum for the detected ions, wherein one axis of the mass spectrum is mass rather than m/z.
24. The method of any one of claims 1-23, wherein the pulse characteristic is a maximum voltage of the pulse.
25. A mass analysis system comprising:
a detector configured to generate a pulse for each ion detected by the detector;
a processor;
a memory storing instructions configured to, when executed by the processor, cause the system to perform a set of operations comprising:
generating a pulse for each ion of the plurality of ions striking the electron multiplier detector, wherein each pulse has a pulse characteristic;
generating a pulse-characteristic distribution for the generated pulses; and
generating an identification of a charge state of one or more ions of the plurality of ions based on the pulse-characteristic distribution.
26. The mass analysis system of claim 25, wherein the detector is an electron multiplier detector.
27. A mass analysis system as claimed in any of claims 25 to 26, wherein the mass analysis system further comprises an ion source apparatus, a dissociation apparatus and a mass analyser.
28. A method for classifying a charge state of a detected ion, the method comprising:
detecting, using a processor, a transient time domain signal induced on an image-charge detector of a mass analyzer due to oscillation of a plurality of ions in the mass analyzer;
converting the transient time domain signal into a plurality of Frequency Domain (FD) peaks corresponding to ions of the plurality of ions;
generating an FD-peak-characteristic distribution for the FD peaks; and
generating an identification of a charge state of one or more ions of the plurality of ions based on the FD-peak-characteristic distribution.
29. The method of claim 28, wherein the FD-peak-characteristic distribution is a plot of probability versus FD-peak characteristic.
30. The method of any one of claims 28-29, wherein the FD-peak characteristic is peak intensity.
31. The method of any of claims 28-30, wherein generating the identification of the charge state comprises comparing the FD-peak-characteristic distribution to a reference FD-peak-characteristic distribution.
32. The method of any of claims 28-31, wherein the ions detected by the detector are generated by ionization of a sample, and the reference FD-peak-characteristic distribution is identified based on known characteristics of the sample.
33. The method of any of claims 28-32, wherein the generated identification comprises a probability of a charge state.
34. The method of any of claims 28-33, further comprising, based on the identification of the charge state, generating a deconvolved mass spectrum for the detected ions, wherein one axis of the mass spectrum is mass rather than mass per charge (m/z).
35. A method for classifying a charge state of a detected ion, the method comprising:
generating a pulse for each ion of the plurality of ions detected by the detector, wherein each pulse has a pulse characteristic;
generating a pulse-characteristic distribution for the generated pulses;
identifying a coarse charge state based on the pulse-characteristic distribution;
identifying a peak pair of a first ion peak and a second ion peak such that the peaks have adjacent charge states; and
determining a refined charge state of the second ion peak based on the m/z value of the first ion peak, the m/z value of the second ion peak, and the mass of the charge carriers.
36. The method of claim 35, wherein the coarse charge state identification is accurate to within a range of possible charge states for at least one peak of the pair, and at least one charge state from the range is adjacent to the charge state identified for the other peak of the pair.
37. The method of any of claims 35-36, further comprising accepting the refined charge state identification if the refined charge state is within a certain threshold of an integer.
38. The method of any of claims 35-37, wherein a third peak having adjacent charge states is identified and forms a pair with at least one of the peaks, and the charge state identifications for the common peaks match in the two pairs.
39. The method of any of claims 35-38, further comprising determining a charge state of a first ion peak based on the determined charge state of a second ion peak.
40. The method of any of claims 35-39, further comprising obtaining a mass of charge carriers based on known characteristics of a sample ionized to generate the plurality of ions.
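The refinement recited in claims 35-37 and 39 follows from simple algebra, sketched here under stated assumptions. If two peaks come from the same mass M with charge carrier mass m_c, the first at charge z (m/z value x1 = M/z + m_c) and the second at charge z+1 (x2 = M/(z+1) + m_c), then z + 1 = (x1 - m_c) / (x1 - x2). The function name, the tolerance, and the example values are hypothetical, and the charge carrier is assumed to be a proton.

```python
PROTON_MASS = 1.007276  # Da, assumed charge carrier mass


def refine_charge(mz_first, mz_second, carrier_mass=PROTON_MASS, tol=0.05):
    """Refined charge state of the second (lower m/z) peak of an adjacent pair.

    Returns the integer charge state if the computed value lies within
    `tol` of an integer, otherwise None (identification not accepted).
    """
    if mz_first <= mz_second:
        raise ValueError("first peak must have the higher m/z")
    z_second = (mz_first - carrier_mass) / (mz_first - mz_second)
    nearest = round(z_second)
    if nearest < 2 or abs(z_second - nearest) > tol:
        return None
    return nearest


# Hypothetical pair from a 10000 Da species at charge states 10 and 11.
z_second = refine_charge(1001.007276, 910.098185)
z_first = z_second - 1  # charge state of the first peak
rejected = refine_charge(1001.007276, 940.0)  # spacing fits no integer charge
```

The integer check is what makes the refinement self-validating: a pair that is not actually an adjacent-charge pair will usually give a non-integer value and be rejected.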
CN202180035275.7A 2020-05-14 2021-05-14 Charge state determination for single ion detection events Pending CN115605976A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202063024987P 2020-05-14 2020-05-14
US63/024,987 2020-05-14
PCT/IB2021/000330 WO2021229301A1 (en) 2020-05-14 2021-05-14 Charge state determination of a single ion detection event

Publications (1)

Publication Number Publication Date
CN115605976A true CN115605976A (en) 2023-01-13

Family

ID=76943044

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180035275.7A Pending CN115605976A (en) 2020-05-14 2021-05-14 Charge state determination for single ion detection events

Country Status (4)

Country Link
EP (1) EP4150658A1 (en)
JP (1) JP2023526246A (en)
CN (1) CN115605976A (en)
WO (1) WO2021229301A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10381206B2 (en) * 2015-01-23 2019-08-13 California Institute Of Technology Integrated hybrid NEMS mass spectrometry
JP6899560B2 (en) * 2017-05-23 2021-07-07 株式会社島津製作所 Mass spectrometric data analyzer and mass spectrometric data analysis program
JP7326324B2 (en) 2018-04-10 2023-08-15 ディーエイチ テクノロジーズ デベロップメント プライベート リミテッド Top-down analysis of antibodies in mass spectrometry
US11848181B2 (en) 2019-01-31 2023-12-19 Dh Technologies Development Pte. Ltd. Acquisition strategy for top-down analysis with reduced background and peak overlapping

Also Published As

Publication number Publication date
JP2023526246A (en) 2023-06-21
WO2021229301A1 (en) 2021-11-18
EP4150658A1 (en) 2023-03-22

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination