Method and Apparatus for Determining a Composition of a Spectrum
Background
A spectrum is a range over which one or more measurable properties of a physical phenomenon, such as the frequency of sound or electromagnetic radiation, or the mass of specific kinds of particles, can vary. The study of spectra is termed spectroscopy. Various spectroscopy techniques exist, including Mass Spectrometry (MS), Nuclear Magnetic Resonance (NMR) and Raman Spectroscopy.
An example apparatus for mass spectrometry is a quadrupole mass spectrometer (QMS), although it will be realised that other MS apparatus exist. In the QMS, ions from a sample are injected into a quadrupole mass filter to which direct and alternating voltages are applied. Stable ions are able to pass through the filter to reach a detector. A spectral peak at a given mass is output by the detector. An amplitude or height of the peak is proportional to a concentration of ions in the sample at the given mass.
Problems can arise in spectroscopy when ions of similar mass arrive at the detector. In this case the output of the detector is indicative of two or more spectral peaks at similar mass which at least partly overlap. Alternatively or additionally, problems can arise in determining whether a peak at a single mass exists at very low amplitude in relation to noise present in the signal output by the detector. Although improvements have been made in producing instruments with very low noise and reducing overlap between peaks at a given separation, these instruments tend to have a large size and/or high cost.
It is an object of embodiments of the invention to at least mitigate one or more of the problems of the prior art.
Statements of Invention
According to an aspect of the present invention there is provided methods and apparatus as set forth in the appended claims.
According to an aspect of the present invention there is provided a computer-implemented method of determining a composition of a spectrum recorded at a detector, comprising determining a first estimate of at least one characteristic of a spectrum, simulating the spectrum at a detector based on the first estimate, wherein in said simulation an amplitude of one or more peaks in the spectrum is constrained to be positive, comparing the simulated spectrum with a spectrum recorded at the detector, and determining an updated estimate of the spectrum based on the comparison.
The spectrum may comprise zero or more peaks. Optionally the spectrum comprises one or more peaks and the amplitude of each peak is constrained to be positive.
The method may comprise determining an initial estimate of at least one initial characteristic of the spectrum. The method may comprise iteratively performing the steps of simulating the spectrum, comparing the simulated spectrum and determining the updated estimate of the spectrum.
The method may comprise determining a second estimate of at least one characteristic of the spectrum, simulating the spectra at the detector based on the first and second estimates, selecting one of the first and second estimates based on the comparison between the simulated spectra and the spectrum recorded at the detector.
The selecting of one of the first and second estimates may be further based on one or both of a probability of moving from the first estimate to the second estimate and from the second estimate to the first estimate of the spectrum.
The updated estimate of the spectrum may be based upon one or more statistical rules. The one or more statistical rules may define one or more of a likelihood of adding a peak to the spectrum, a likelihood of removing a peak from the spectrum and a likelihood of modifying one or more attributes of peaks forming the spectrum.
A computer-implemented method of determining a composition of a spectrum recorded at a detector, comprising providing a first estimate of at least one characteristic of a spectrum, determining a second estimate of the at least one characteristic of the spectrum, simulating spectra at a detector corresponding to the first estimate and the second estimate, wherein said simulation includes injecting ions into a simulation of a detection apparatus comprising the detector according to a Monte-Carlo method, and selecting one of the first and second estimates based on the simulation and a spectrum recorded at a detector. In said Monte-Carlo method the ions may be injected into the detection apparatus randomised in one or both of space and time.
The method may comprise determining an initial estimate of at least one initial characteristic of the spectrum, and iteratively performing the steps of determining the second estimate, simulating the spectrum, and selecting one of the first and second estimates.
The selected estimate may be utilised as the first estimate in a following iteration of the method.
The method may comprise comparing the simulated spectra with the spectrum recorded at the detector, and selecting one of the first and second estimates based on the comparison. The selecting is optionally further based on one or both of a probability of moving from the first estimate to the second estimate and from the second estimate to the first estimate of the spectrum.
In said simulation an amplitude of the one or more peaks may be constrained to be positive.
One or both of the first and second estimates may comprise one or more peaks present in the spectrum and an attribute of the one or more peaks.
According to another aspect of the present invention there is provided computer software which, when executed by a computer, is arranged to perform a method according to an aspect of the invention. The computer software may be stored on a computer readable medium. The computer software may be tangibly stored on a computer readable medium.
A computing apparatus comprising a memory and at least one processor, the memory storing computer executable instructions which, when executed by the at least one processor, perform a method according to an aspect of the invention.
Brief Description of the Drawings
Embodiments of the invention will now be described by way of example only, with reference to the accompanying figures, in which:
Figure 1 shows a method according to an embodiment of the invention;
Figure 2 is a schematic illustration of a quadrupole mass spectrometer; and
Figure 3 is an illustration of experimental and simulated spectra for hydrogen and helium according to an embodiment of the invention.
Detailed Description of Embodiments of the Invention
Figure 1 illustrates a method 100 according to an embodiment of the invention. The method 100 is a computer-implemented method of determining one or more aspects of a measured spectrum. The spectrum may be that output by a quadrupole mass spectrometer (QMS), although it will be realised that embodiments of the invention are not limited in this respect and that the spectrum may be determined in other ways such as Mass Spectrometry (MS), Nuclear Magnetic Resonance (NMR), Raman Spectroscopy, etc. However, for the purpose of illustration, embodiments will be described in relation to QMS.
A QMS 200 is schematically illustrated in Figure 2. The QMS 200 comprises a source of ions 210 which are injected into the QMS 200, as the skilled person will appreciate. The QMS comprises a number of electrodes, which are excited to provide a quadrupole electric field, which functions as a mass filter via application of a suitable combination of variable amplitude direct (U) and variable amplitude alternating (V) voltages from a voltage generator 230. For a given combination of U, V, frequency and electrode geometry ions of a given mass are stable (resonant) and others not. Stable ions reach a detector 240 where they form a spectral peak at the given mass (mass spectrum). The peak height is proportional to the concentration of ions in the sample at that mass.
Returning to Figure 1, step 110 comprises determining a set of parameters. Step 110 comprises determining an estimate of the spectrum. The spectrum may comprise one or more peaks, although it will be realised that the spectrum may not comprise any peaks. That is, the spectrum may only comprise noise. The spectrum is parameterised by a vector Θ. The vector Θ has a dimension of zero or more. In the case of the vector having a dimension of zero the spectrum does not comprise any peaks. The vector Θ defines parameters for each peak forming the spectrum. For each peak in the spectrum one or more parameters define the respective peak. The one or more parameters per peak may comprise one or both of a position and width of each peak. The vector Θ may be determined not to include parameters defining an amplitude of each peak, such as a relative or absolute amplitude of the respective peak. In this case, the amplitude of each peak is determined in a subsequent step, as will be explained. In embodiments related to mass spectrometry, such as the exemplary explained embodiment, the vector Θ may define a ratio of mass over charge for each peak. Thus, given Θ, a shape of each peak forming the spectrum is defined.
In some embodiments of step 110, the vector Θ is a random set of initial parameters θ₀, wherein the subscript 0 is indicative of the initial nature of the parameters, i.e. for a zero-th iteration. The initial parameters may be sampled from an initial probability density function (pdf) p₀(θ₀). In an example embodiment, samples from the initial pdf are constrained to consist of one peak. However, it will be realised that other numbers of initial peaks may be selected. The position of the initial peak may be determined to be distributed across an extent of the mass over charge range for the spectrum, such that the position lies within Δ, wherein Δ is an interval of valid values for the peak position.
In step 110 one or more attributes of each peak are determined. For example, a position of each peak is selected in step 110. As the described embodiment relates to QMS, it is only necessary to select the position of each peak in the spectrum. The shape of each peak is related to parameters that include a U/V voltage ratio of the detector, as will be appreciated by the skilled person. However, in other non-QMS embodiments, other attributes of each peak may be selected, such as the width of each peak. The number of peaks and the respective position of each peak may be selected as a respective random, or pseudo-random, value, as will be appreciated.
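The initialisation of step 110 can be sketched as follows. This is a minimal illustration only: the uniform draw over the interval Δ, the function name `sample_initial_parameters` and the example interval bounds are assumptions made for the sketch and are not taken from the specification.

```python
import random

def sample_initial_parameters(delta, n_peaks=1, rng=random):
    """Draw an initial parameter vector: one m/q position per peak.

    delta is a (low, high) interval of valid peak positions. Drawing
    uniformly over delta is an assumption made for this sketch.
    """
    low, high = delta
    if not low < high:
        raise ValueError("delta must satisfy low < high")
    # Sorted positions make later comparison of parameter sets easier.
    return sorted(rng.uniform(low, high) for _ in range(n_peaks))

rng = random.Random(0)
# Example embodiment: the initial estimate consists of a single peak.
theta_0 = sample_initial_parameters((1.0, 5.0), rng=rng)
```

Passing `n_peaks=0` yields an empty vector, which corresponds to a spectrum comprising only noise.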
In step 120 a further set of parameters is determined. The further set of parameters may be referred to as a new set of parameters. The new parameters may represent an updated estimate, or guess, as to a composition of peaks forming a spectrum. In this sense, the first set of parameters Θ may represent a current set of parameters. The new parameters may be defined by a second, updated, vector θ. The second vector θ may comprise more or fewer parameters than the first vector Θ, i.e. more or fewer peaks may be defined by the new set of parameters.
The new set of parameters may be based upon the current set of parameters Θ. The new set of parameters may be determined according to one or more statistical rules. In one embodiment, a peak is removed from the current set of parameters Θ with a predetermined probability, which may be 10% although it will be realised that other probabilities may be used. In one embodiment, a peak is added to the current set of parameters Θ with a predetermined probability, which may be 10% although it will be realised that other probabilities may be used. Furthermore, in one embodiment, one or more attributes of the peaks forming the current parameters Θ are modified with a predetermined probability, which may be 80% although it will be realised that other probabilities may be used. The probabilities for all rules sum to 100%, as will be appreciated. It will be realised that the removal of peaks from the current set of parameters Θ is performed with respect to a minimum number of peaks. The minimum number of peaks may be zero, such that a peak may not be removed from a spectrum that does not comprise any peaks. In the case of a peak being added, the one or more attribute(s) of that added peak are determined in step 120. For example, the position of the new peak is determined. Similarly, where the attributes of the current peaks are modified, a predetermined modification may be applied. The predetermined modification may be an addition of noise to the current parameters Θ. The noise may be zero-mean Gaussian noise having a predetermined variance, such as 1. Thus, following step 120 the new set of parameters θ is determined.
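The statistical rules of step 120 can be sketched as a single proposal function. The 10%/10%/80% split and the unit-variance Gaussian noise follow the example values above; drawing the position of an added peak uniformly over an interval `delta`, choosing the removed peak at random, and perturbing every peak position in the modify move are assumptions made for this illustration.

```python
import random

P_ADD, P_REMOVE = 0.10, 0.10   # example probabilities from the text
NOISE_STD = 1.0                # zero-mean Gaussian noise with variance 1

def propose(theta, delta, rng):
    """Return a new parameter list derived from the current list theta."""
    low, high = delta
    u = rng.random()
    new = list(theta)
    if u < P_ADD:
        # Add a peak; a uniform position over delta is an assumption.
        new.append(rng.uniform(low, high))
    elif u < P_ADD + P_REMOVE:
        # Remove a randomly chosen peak, respecting the minimum of zero peaks.
        if new:
            new.pop(rng.randrange(len(new)))
    else:
        # Modify: perturb each peak position with Gaussian noise.
        new = [x + rng.gauss(0.0, NOISE_STD) for x in new]
    return sorted(new)

rng = random.Random(1)
theta_new = propose([2.0, 4.0], (1.0, 5.0), rng)
```

The sketch leaves the current list untouched and returns a copy, so the current parameters remain available if the new set is later discarded.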
In step 120 a probability of moving to the new set of parameters θ given the current set of parameters Θ is determined. That is, a value of this forward probability is determined in step 120. The probability p may be determined from P and N, where p is the probability, P is the predetermined probability of adding, removing or retaining the same number of peaks, such as 10%, as discussed above, and N is the number of peaks prior to the addition or removal.
Similarly, a reverse probability of moving from the new set of parameters θ to the old set of parameters Θ is determined.
In step 130 spectra are generated. The spectra are generated based on each of the current set of parameters Θ and the new set of parameters θ. In embodiments where each vector comprises elements each defining a respective peak, a single peak is generated for the i-th element of each of Θ and θ, where i may be between 1 and the number of peaks defined by the respective set of parameters. A single-peak spectrum is thus obtained for the i-th element of the new set of parameters θ. In step 130 an output of Monte-Carlo simulation is used to determine one or more templates for predicting a shape of each peak given at least some of the parameters. The Monte-Carlo simulation may be performed offline.
As will be appreciated, a QMS mass spectrum consists of an ion current (y-axis) plotted against a mass (m) to charge (q) position (x-axis). In the Monte-Carlo simulation used in step 130, ions of a given mass to charge ratio, which may be selected by a user, are injected by simulation into a QMS mass filter model. A large number of ions (e.g. 100,000, although it will be realised that other numbers may be used) are injected at each mass point (x-axis position), thereby simulating the action of an ion source in a real QMS instrument. Ions are determined to start from a random spatial position over a disc corresponding to an ion source aperture, and their injection is randomised in time (i.e. occurs at any point of an applied voltage waveform). The random injection of ions in space and time justifies the term 'Monte Carlo' to describe the nature of the simulation. For each individual ion, the force (F) on the ion is calculated from a knowledge of the electric field (E) at each point according to F = qE, where q is the charge on the ion. E may be calculated from a geometry of the instrument by field plotting routines (e.g. finite difference methods). From F/m an acceleration a is determined at each point in space; the ion velocity v is obtained by numerical integration of a, and the trajectory s as a function of time by numerical integration of v. Hence, for each individual ion its trajectory in space and time is calculated and it may be determined whether said ion is stable (and forms part of the mass spectrum) or unstable (and is rejected from the QMS). Repeating for each of the ions (e.g. 100,000 ions) allows a mass spectrum to be simulated for a given set of instrument conditions, geometry, electrode size and spacings, voltage excitation on electrodes and input ion energy.
The model provides a method of computation to determine the individual trajectories of large numbers of ions injected from the ion source 210 into the quadrupole mass filter (QMF) 220. A simulated mass scan is produced, which may comprise at least 10^5 ions injected into the quadrupole model at at least some, or each, point on the mass scale. The model provides an accurate, physics-based, forward model of mass filter behaviour and is able to predict the mass spectral peak shapes.
Ions from the source 210 are assumed to originate at any point on a circular disk centred on a quadrupole axis and set at right angles to the axis. A quadrupole field starts immediately when the ions leave the source 210. The radii of the source and exit disks may be different and may be varied freely. In the simulation each ion is generated at a point in the source disk selected at random, with no correlation between the points used for successive ions; all positions are equally probable, which corresponds to uniform source illumination. The time of origin in the source is also selected randomly; that is, ions enter the quadrupole at random values of the phase of the radio frequency voltage, i.e. the alternating voltage, used to operate the filter. Because of the random nature of the ion injection in space and time, the simulation may be considered a Monte-Carlo type simulation. A result of such simulation is illustrated in Figure 3.
Ions may be simulated to travel through the filter 220 with constant velocity in the z direction. This is because the fringe fields may be ignored and all the electric fields experienced by ions are at right angles to the z direction; therefore there is no component of force to change the velocity in the z direction. At any time when the magnitude of either the x or y coordinate of an ion exceeds a filter radius, r0, the ion is rejected. Ions that pass through an exit aperture form the received signal at the detector 240. Ions may be traced through the filter 220 by determining their motion in the hyperbolic field. Their travel may be divided into small time intervals and their motion over each small time interval computed using the local field they experience; the field is a function of time because the applied voltage may include an AC component. In some embodiments the motion may be approximated using a fourth order Runge-Kutta algorithm.
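The ion tracing described above can be sketched in a few lines. The sketch assumes an ideal hyperbolic field written in the standard dimensionless Mathieu form, x'' = -(a - 2q cos 2ξ)x and y'' = +(a - 2q cos 2ξ)y with ξ = ωt/2, where a and q here are the usual Mathieu stability parameters proportional to U and V respectively (q is not the ion charge). Lengths are in units of the filter radius r0, ions start with zero transverse velocity, and the exit aperture is ignored; all function names and default values are illustrative assumptions.

```python
import math
import random

def mathieu_rhs(xi, state, a, q):
    """Derivative of (x, vx, y, vy) in an ideal quadrupole field."""
    x, vx, y, vy = state
    k = a - 2.0 * q * math.cos(2.0 * xi)
    return (vx, -k * x, vy, k * y)

def rk4_step(xi, state, h, a, q):
    """Advance the state by one fourth-order Runge-Kutta step of size h."""
    k1 = mathieu_rhs(xi, state, a, q)
    k2 = mathieu_rhs(xi + h / 2, [s + h / 2 * k for s, k in zip(state, k1)], a, q)
    k3 = mathieu_rhs(xi + h / 2, [s + h / 2 * k for s, k in zip(state, k2)], a, q)
    k4 = mathieu_rhs(xi + h, [s + h * k for s, k in zip(state, k3)], a, q)
    return [s + h / 6 * (p + 2 * r + 2 * t + w)
            for s, p, r, t, w in zip(state, k1, k2, k3, k4)]

def ion_transmitted(a, q, rng, source_radius=0.1, rf_cycles=20, steps_per_cycle=40):
    """Trace one ion; return True if it never leaves the filter radius."""
    # Uniform random start position on the source disc.
    r = source_radius * math.sqrt(rng.random())
    angle = rng.uniform(0.0, 2.0 * math.pi)
    state = [r * math.cos(angle), 0.0, r * math.sin(angle), 0.0]
    # Random RF phase at injection; one RF cycle spans pi in xi.
    xi = rng.uniform(0.0, math.pi)
    h = math.pi / steps_per_cycle
    # Constant axial velocity means a fixed transit time of rf_cycles cycles.
    for _ in range(rf_cycles * steps_per_cycle):
        state = rk4_step(xi, state, h, a, q)
        xi += h
        if abs(state[0]) > 1.0 or abs(state[2]) > 1.0:
            return False  # ion exceeded the filter radius and is rejected
    return True

def transmission(a, q, n_ions, seed=0):
    """Fraction of randomly injected ions that survive the filter."""
    rng = random.Random(seed)
    return sum(ion_transmitted(a, q, rng) for _ in range(n_ions)) / n_ions
```

Calling `transmission` over a grid of operating points, with far more ions per point than are practical in pure Python, would trace out a simulated peak shape of the kind used as a template in step 130.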
In step 140 the two spectra are each compared against a measured spectrum 150. The measured spectrum 150 is that measured by the QMS detector. Thus the measured spectrum 150 is provided as an input to step 140. The comparison in step 140 is based upon a General Linear Model (GLM). However, in embodiments of the invention, an amplitude of the peaks, A, is constrained to be positive.
Figure 3 illustrates a simulated spectrum 310 comprising three peaks produced by Monte-Carlo simulation as described above according to an embodiment of the invention and a measured spectrum 310 comprising two peaks measured experimentally. Reference numerals in Figure 3 specifically indicate a peak in the spectrum corresponding to helium.
In embodiments of the invention, an integral over a Student's t pdf is calculated (Equation 3), in which the GLM is defined such that:
M - 1 is the dimensionality of Θ, representing the number of peaks that Θ hypothesises;
d is the measured spectral data of length N;
the GLM basis is a matrix formed from the single-peak spectra and a vector 1, where 1 is a vector of the same size as d with all the elements equal to unity, used to model any offset present, although the skilled person will realise that other values can be used; and
γ₀ and δ are hyper-parameters, which may be assumed to be equal to unity.
The integral in Equation 3 may be approximated via numerical integration, wherein a sample is taken a predetermined number of times from the Student's t pdf defined in Equation 3. The predetermined number of times may be 1000, although other values may be used. A fraction of the samples that satisfy the above-mentioned constraint that all peaks have positive amplitude is then calculated. Note that the bottom-most element of A relates to an offset, which is allowed to be negative.
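The sampling approximation can be illustrated as follows. This is a sketch under stated assumptions: the location `mean`, the per-element `scale` and the degrees of freedom `dof` are hypothetical inputs standing in for the quantities defined by Equation 3, and a diagonal scale is used for brevity where the pdf implied by the GLM would in general have a full covariance.

```python
import math
import random

def sample_student_t_vector(mean, scale, dof, rng):
    """Draw one sample from a multivariate Student's t pdf (diagonal scale)."""
    # A shared chi-square draw turns Gaussian draws into a multivariate t.
    chi2 = rng.gammavariate(dof / 2.0, 2.0)
    stretch = math.sqrt(dof / chi2)
    return [m + s * rng.gauss(0.0, 1.0) * stretch for m, s in zip(mean, scale)]

def positive_fraction(mean, scale, dof, n_samples=1000, seed=0):
    """Fraction of samples whose peak amplitudes are all positive.

    The last element is the offset, which is allowed to be negative.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        amplitudes = sample_student_t_vector(mean, scale, dof, rng)
        if all(a > 0.0 for a in amplitudes[:-1]):
            hits += 1
    return hits / n_samples

# One peak amplitude well above zero plus a negative offset.
fraction = positive_fraction([10.0, -3.0], [0.1, 1.0], dof=5)
```

A hypothesised peak whose amplitude is centred on zero satisfies the constraint in roughly half of the samples, which is how the fraction penalises peaks that the data do not support.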
In step 160 it is determined whether to accept the new set of parameters θ. The determination in step 160 is based upon the measured spectrum 150 and the current and new sets of parameters. In particular, the determination in step 160 is based upon a probability that the measured spectrum 150 corresponds to each of the current and new sets of parameters respectively, and a probability of moving to the new set of parameters θ given the current set of parameters Θ, and vice-versa. The determination may be made in embodiments of the invention according to Equation 6, which yields a value η.
The new set of parameters θ is accepted, thus becoming the current parameters for a future iteration of the method, if η is greater than a threshold value. The threshold value may vary between each iteration of the method in some embodiments. The threshold may be a random number drawn from a uniform distribution between zero and one. If η is less than or equal to the threshold value then the current set of parameters Θ is retained in step 170, i.e. the new set of parameters θ is discarded.
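The accept or reject decision of step 160 can be sketched as below. The threshold test follows the description above. The form of `acceptance_value` is an assumption: it is the ratio used in a Metropolis-Hastings sampler, chosen because it combines exactly the four quantities named in the text, and it should not be read as a reproduction of Equation 6.

```python
import random

def acceptance_value(like_new, like_cur, p_reverse, p_forward):
    """Metropolis-Hastings-style ratio (an assumed form of eta).

    like_new, like_cur: probability of the measured spectrum under the
    new and current parameter sets respectively.
    p_forward: probability of moving from the current to the new set.
    p_reverse: probability of moving from the new to the current set.
    """
    return (like_new * p_reverse) / (like_cur * p_forward)

def accept(eta, rng):
    """Accept the new parameters if eta exceeds a uniform [0, 1) threshold."""
    return eta > rng.random()

rng = random.Random(0)
eta = acceptance_value(like_new=0.2, like_cur=0.1, p_reverse=0.5, p_forward=0.5)
keep_new = accept(eta, rng)
```

Because the threshold is drawn afresh at each iteration, a new set of parameters that fits worse than the current set is still accepted occasionally, which lets the method explore alternative peak compositions.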
If the new set of parameters θ is not accepted then the current set of parameters Θ is logged or stored in step 180. Alternatively, if the new set of parameters θ is accepted, then the new set of parameters is logged or stored in step 180. Thus, over repeated iterations of the method 100, a set of parameters is stored at each iteration of step 180. As indicated by the arrow between steps 180 and 120, the stored set of parameters is provided to step 120 as the current set of parameters for a next iteration of the method 100.
The diversity of the sets of stored parameters is indicative of uncertainty of the parameters of the measured spectrum, including uncertainty related to whether the measured spectrum contains a peak at a given position or not. This captures uncertainty relating to whether a low amplitude peak is present and whether a peak is present in close proximity to another, such that embodiments of the invention can achieve enhanced detection of low amplitude peaks and improved resolution of closely located peaks.
A most likely number of peaks present in the measured spectrum can be identified by identifying a number of peaks that occurs most frequently in the set of stored parameters. To produce a single estimated output, an average may then be determined across the samples with that number of peaks once the list of peak positions has been sorted in order of ascending mass over charge. The samples may also be manipulated to derive an estimate of the amplitude corresponding to each peak (step 190). It is innovative that, in steps 190 and 140, the constraint is enforced that the amplitude relates to a physical abundance and is therefore positive.
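The summary of the stored parameter sets can be sketched as follows, assuming each stored set is a list of peak positions; the function name `summarise` and the example values are illustrative only.

```python
from collections import Counter

def summarise(samples):
    """Return (most frequent peak count, mean sorted positions for that count)."""
    counts = Counter(len(s) for s in samples)
    n_peaks = counts.most_common(1)[0][0]
    # Sort each sample by ascending m/q so that peaks align across samples.
    chosen = [sorted(s) for s in samples if len(s) == n_peaks]
    means = [sum(column) / len(chosen) for column in zip(*chosen)]
    return n_peaks, means

# Hypothetical stored sets: three with two peaks, one with a single peak.
stored = [[2.0, 4.1], [4.0, 2.1], [3.0], [1.9, 3.9]]
n_peaks, positions = summarise(stored)
```

Sorting before averaging matters because the order of peaks within a stored set is arbitrary; without it, the mean would mix positions belonging to different peaks.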
Embodiments of the present invention provide a method of determining a composition of a spectrum by iteratively simulating a spectrum and comparing the simulated spectrum with a measured spectrum. Advantageously the composition of the measured spectrum may be determined with increased accuracy.
It will be appreciated that embodiments of the present invention can be realised in the form of hardware, software or a combination of hardware and software. Any such software may be stored in the form of volatile or non-volatile storage such as, for example, a storage device like a ROM, whether erasable or rewritable or not, or in the form of memory such as, for example, RAM, memory chips, device or integrated circuits or on an optically or magnetically readable medium such as, for example, a CD, DVD, magnetic disk or magnetic tape. It will be appreciated that the storage devices and storage media are embodiments of machine-readable storage that are suitable for storing a program or programs that, when executed, implement embodiments of the present invention. Accordingly, embodiments provide a program comprising code for implementing a system or method as claimed in any preceding claim and a machine readable storage storing such a program. Still further, embodiments of the present invention may be conveyed electronically via any medium such as a communication signal carried over a wired or wireless connection and embodiments suitably encompass the same.
All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.
The invention is not restricted to the details of any foregoing embodiments. The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed. The claims should not be construed to cover merely the foregoing embodiments, but also any embodiments which fall within the scope of the claims.