GB2561879A - Spectroscopic analysis - Google Patents

Spectroscopic analysis

Info

Publication number
GB2561879A
Authority
GB
United Kingdom
Prior art keywords
pls
sample
model
data
spectrum data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB1706666.3A
Other versions
GB2561879B (en)
GB201706666D0 (en)
Inventor
Brewster Victoria
Toy Andrew
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Protea Ltd
Original Assignee
Protea Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Protea Ltd
Priority to GB1706666.3A
Publication of GB201706666D0
Publication of GB2561879A
Application granted
Publication of GB2561879B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G01N21/3504 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light for analysing gases, e.g. multi-gas analysis
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01J ELECTRIC DISCHARGE TUBES OR DISCHARGE LAMPS
    • H01J49/00 Particle spectrometers or separator tubes
    • H01J49/0027 Methods for using particle spectrometers
    • H01J49/0036 Step by step routines describing the handling of the data generated during a measurement
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G01N2021/3595 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using FTIR
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2201/00 Features of devices classified in G01N21/00
    • G01N2201/12 Circuits of general importance; Signal processing
    • G01N2201/129 Using chemometrical methods
    • G01N2201/1293 Using chemometrical methods resolving multicomponent spectra

Abstract

A method, software and apparatus to perform such, comprising receiving raw spectral data 1, pre-processing 2 said raw data to remove undesired components and qualitatively processing 3 the pre-processed spectral data to identify one or more components in the sample. When a spectral component is identified it is subtracted from the raw data to produce modified raw spectral data that is then pre-processed 2 and processed 3 as before. The method may comprise quantitative processing 4 to determine the concentration of identified components, this may use a Partial Least Squares model. Pre-processing may comprise baseline correction, mean centring or water spectrum subtraction. Qualitative processing may build a Partial Least Squares Discriminant Analysis (PLS-DA) model that may be constructed iteratively, identify clusters and define their boundaries. Qualitative processing may classify a component in a chemical group with a second classification step to identify the component. These classifications may correspond to PLS-DA model clusters.

Description

(56) Documents Cited:
WO 2012/036970 A1
Meyer et al., Qualitative and quantitative mixture analysis by library search: infrared analysis of mixtures of carbohydrates, Analytica Chimica Acta, 281, pp. 161-171, Elsevier, 1993
Westerhuis et al., Multivariate paired data analysis: multilevel PLSDA versus OPLSDA, Metabolomics, 6, pp. 119-128, Springer, [online]. Published 28 Oct 2009. Available from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2834771/
(71) Applicant(s):
Protea Ltd
Prosperity Court, MIDDLEWICH, Cheshire, CW10 0GD, United Kingdom
(72) Inventor(s):
Victoria Brewster
Andrew Toy
(74) Agent and/or Address for Service:
Ogive Intellectual Property
Westgate, North Cave, Brough, HU15 2NJ, United Kingdom
(58) Field of Search:
INT CL G01N
Other: EPODOC, WPI, Patents Fulltext, NPL, XPAIP, XPESP, XPI3E, XPIEE, XPIOP, XPSPRNG
(54) Title of the Invention: Spectroscopic analysis
Abstract Title: Analysing spectroscopic data taken from a gas or liquid sample to determine the identity of one or more components
At least one drawing originally filed was informal and the print reproduced here is taken from a later filed formal copy.
[Drawings, sheets 1/10 to 10/10: Figs. 1 to 13. Recoverable labels: Fig. 2 flowchart boxes "Raw IR Spectrum", "Data Pre-Processing", "Qualitative Analysis: What species are there?", "Subtract hits and re-analyse" and "Re-analyse"; Fig. 3 panel titles "Sample Spectrum & Water Spectrum" and "Sample Spectrum with water subtracted"; Fig. 12 horizontal axis "Time", 13:00:00 to 14:10:00.]
Spectroscopic Analysis
Technical Field
The invention relates to spectroscopic analysis and to a method and apparatus therefor, and in particular to a method of performing spectroscopic analysis of Infra-Red (IR) spectra of a sample or of mass spectrum data.
Background
For the purpose of the following description and for illustration, spectroscopic analysis of gas-phase Infra-Red (IR) spectra will be discussed. However, it will be appreciated by persons skilled in the art that the techniques are applicable to samples in other forms, e.g. liquid or vapours of a solid phase.
A gas stream may be required to be analysed to determine the types and quantities of gases therein. Typically, after a gas stream has been sampled it is subject to IR analysis, which produces an IR absorption spectrum, an example of which is shown in Figure 1.
The spectrum in Figure 1 (referred to herein as “raw spectrum data”) is a graphical representation of how much IR light is absorbed at each wavelength. It will be appreciated that such raw spectrum data is an “absorption” spectrum and not a transmission spectrum. As is well known, when a sample is illuminated with IR light the molecules in the sample interact with the photons from the IR light, inducing a series of molecular vibrations. Each different type of chemical bond will vibrate at a different frequency, and will absorb IR light which has a matching frequency. Therefore, each peak in the spectrum corresponds to a different molecular bond vibration. As all bond types have a different vibrational frequency, they will absorb different frequencies of IR light and hence it can be determined which atoms/bonds are present in a sample and in what quantities by observing the absorption spectrum. The spectrum in Fig. 1 shows a generalised gas sample containing at least one gas. Whereas the spectrum in Fig. 1 shows four peaks, it will be understood that a sample of one gas can have many more peaks, and a mixture of four gases can have fewer than four peaks, depending on the gases, whether they are IR active, and what bonds are present.
US2007242275A1 discloses a spectroscopic detection system (e.g. a compact, portable multiple gas analyser) for monitoring ambient air for toxic chemical substances (including various nerve and blister agents as well as toxic industrial chemicals at low or sub part per billion (ppb) levels). The system can comprise a Fourier Transform Infrared (FTIR) spectrometer, a gas sample cell, a detector, an embedded processor, a display, power supplies, an air pump, heating elements, and other components on-board the unit with an air intake to collect a sample and an electronic communications port to interface with external devices. A detection method includes flowing a sample of ambient air through the sample cell, and optimizing the volume of the sample cell and the number of passes of a beam of radiation in the folded path to maximize throughput of the beam of radiation propagating in the sample cell to detect a trace gas having a concentration of less than about 500 ppb in the sample of ambient air. An alternative detection method comprises: measuring a first signal of the beam of radiation propagating through a sample of ambient air at a first pressure in the sample cell; pressurizing the sample cell with ambient air to a second pressure; measuring a second signal of the beam of radiation propagating through the sample of ambient air at the second pressure in the sample cell; and combining the first signal and the second signal to determine a signal indicative of the presence of a trace gas.
A problem is encountered with known techniques when there are many different gases in a gas stream that are required to be identified and quantified. Known ways of analysing a gas stream typically require (i) prior knowledge of the expected gases, (ii) generation of specific calibration methods to detect them, (iii) calibration of apparatus, (iv) subsequent analysis and interpretation of the IR spectra, and (v) final re-iterations of the method to cover all observed gases. Performing all of these tasks can be very time consuming, as well as expensive in terms of man-hours of a skilled spectroscopist. There is a need to provide efficient processor-based techniques for performing the necessary functions in an expedited, and preferably real time, manner.
A further problem is that, in the prior art, when an unknown or unexpected gas appears in a gas stream, the user would need, in an offline procedure, to identify the new gas, collect spectra of the new gas, and then build analytical models which include this new gas to calibrate the gas analyser.
It is broadly an object of the present invention to address one or more of the above mentioned disadvantages of the previously known ways of providing spectroscopic analysis.
Summary
What is required is an apparatus and method for providing spectroscopic analysis which may reduce or minimise at least some of the above-mentioned problems.
According to a first aspect of the invention there is provided a method of determining an identity, and optionally a concentration, of one or more components in a gas or liquid sample by spectral analysis, the method comprising, in sequence:
receiving raw spectrum data for the sample;
pre-processing the raw spectrum data, to obtain pre-processed spectrum data from which the spectra of known undesired components have been removed; and
performing qualitative processing on the pre-processed spectrum data to determine therefrom the identity of the one or more components in the sample;
wherein identifying a component comprises:
subtracting the spectrum of an identified component from the raw spectrum data, or previously produced modified raw spectrum data, to produce modified raw spectrum data, and returning to the pre-processing step to reanalyse the modified raw spectrum data.
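The identify, subtract and re-analyse loop of this aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: a cosine-similarity match against a reference library stands in for the qualitative (PLS-DA) step, pre-processing is reduced to removing a constant offset, and the names `identify_components`, `library`, `threshold` and `max_passes` are hypothetical.

```python
from math import sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = sqrt(dot(a, a) * dot(b, b))
    return dot(a, b) / norm if norm > 0 else 0.0


def preprocess(spectrum):
    """Placeholder pre-processing (assumption): remove a constant offset."""
    floor = min(spectrum)
    return [v - floor for v in spectrum]


def identify_components(raw, library, threshold=0.5, max_passes=10):
    """Repeatedly match, subtract the hit from the working data, and re-analyse."""
    working = list(raw)          # raw, then "modified raw", spectrum data
    identified = []
    for _ in range(max_passes):
        pre = preprocess(working)
        candidates = [name for name in library if name not in identified]
        if not candidates:
            break
        # Stand-in for qualitative processing: best library match by cosine score.
        best = max(candidates, key=lambda name: cosine(pre, library[name]))
        if cosine(pre, library[best]) < threshold:
            break                # nothing left that resembles a known component
        ref = library[best]
        scale = dot(pre, ref) / dot(ref, ref)   # least-squares amount of the hit
        working = [w - scale * r for w, r in zip(working, ref)]
        identified.append(best)
    return identified


# Hypothetical three-entry library with non-overlapping peaks.
library = {
    "gas_A": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    "gas_B": [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    "gas_C": [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
}
raw = [0.0, 2.0, 0.0, 1.0, 0.0, 0.0]   # 2 x gas_A + 1 x gas_B
print(identify_components(raw, library))  # ['gas_A', 'gas_B']
```

Subtracting each hit before the next pass means weaker components are matched against data from which the stronger ones have already been removed, which is the benefit the text attributes to successively reduced spectral data.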
An advantage is that the analysis and identification of components is based on successively improved/reduced spectral data, and that component identification can thus be performed by a processor in an expedited manner and potentially with improved accuracy. The time taken to identify components can be significantly reduced, and in certain scenarios the process can be performed in real time.
Preferably the method further includes performing quantitative processing comprising using the determined one or more identities to thereby determine, for the one or more identified components in the sample, the concentration of that component. In this manner, a list of components and their respective concentrations is gradually and accurately built.
Preferably, performing quantitative processing comprises building a quantitative Partial Least Squares (PLS) model of components of the sample.
Preferably, performing qualitative processing comprises, using a PLS-DA algorithm having as inputs the raw spectrum data or modified raw spectrum data, and a database of reference spectrum data, building a PLS-DA model of components of the sample. A large/expanding database of reference spectrum data facilitates rapidly determining the identity of a component.
Preferably, building the PLS-DA model comprises iteratively identifying a component of the sample and updating the PLS-DA model to characterise components therein as identified or non-identified. Iterative identification facilitates characterisation of a component as identified.
Preferably, building the PLS-DA model comprises spatially arranging identifiers of the components within the PLS-DA model as the components are identified. Preferably, building the PLS-DA model comprises identifying clusters of identifiers of components spatially arranged within the PLS-DA model using the PLS-DA algorithm. Preferably, building the PLS-DA model comprises incorporating one or more boundaries within the PLS-DA model so as to define respective separations of two clusters. The formation of plural components into clusters, and consequent derivation of boundaries between clusters, facilitates the identification of an unknown component, using PLS-DA analysis.
In one embodiment, performing qualitative processing comprises, for the one or more components, performing a first classification operation to determine, based on the raw spectrum data or previously produced modified raw spectrum data, a chemical group to which the component belongs, and performing, based on the chemical group determined and the raw spectrum data or previously produced modified raw spectrum data, a second classification operation to determine the identity of the component. Preferably, the first classification operation is such as to position the identifier of an unknown component within a cluster of a first type, corresponding to a chemical group, within the PLS-DA model; and the second classification operation is such as to position the identifier of the unknown component within a cluster of a second type, corresponding to a chemical species, within the PLS-DA model. Thus, using a first version of the PLS-DA model, the identification of a component is narrowed down to a chemical group, and then, using a second version of the PLS-DA model, the species within the group is identified. This avoids processing a large amount of data at the outset in one extensive operation, and means that species identification is performed using a smaller data set/model, thus expediting the process.
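The two-stage classification can be illustrated with a small sketch. It is an assumption-laden stand-in rather than the PLS-DA procedure itself: each reference is represented by a hypothetical two-dimensional score vector, and a nearest-centroid rule replaces the PLS-DA cluster boundaries. The group and species names, the score values and the function `classify_two_stage` are invented for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest(sample, labelled):
    """Return the label whose vector lies closest to the sample."""
    return min(labelled, key=lambda name: sq_dist(sample, labelled[name]))


def classify_two_stage(sample, library):
    """Stage 1 picks the chemical group; stage 2 picks the species within it."""
    group_centroids = {
        group: centroid(list(members.values()))
        for group, members in library.items()
    }
    group = nearest(sample, group_centroids)
    # Only the chosen group's references are consulted in the second stage,
    # so the species search runs on a smaller data set.
    species = nearest(sample, library[group])
    return group, species


# Hypothetical score-space positions of reference spectra.
library = {
    "alkanes": {"methane": [1.0, 0.0], "ethane": [1.2, 0.2]},
    "alcohols": {"methanol": [-1.0, 0.1], "ethanol": [-1.2, -0.1]},
}
print(classify_two_stage([1.15, 0.15], library))  # ('alkanes', 'ethane')
```

The second stage never touches references outside the selected group, which mirrors the stated motivation of avoiding one extensive operation over the whole library.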
Preferably, said pre-processing comprises one or both of baseline correction and mean centring of the raw spectrum data. Preferably, said pre-processing comprises subtracting a water spectrum from the raw spectrum data. With undesirable components such as water removed, faster and/or more accurate processing is facilitated.
Preferably, after performing quantitative processing the method further comprises: determining whether the raw spectrum data for the sample meet one or more quality criteria, and if not, returning to the pre-processing step to reanalyse the raw spectrum data, and if the one or more quality criteria are met, outputting analysis results as a list of components and, associated with each component, a respective concentration.
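One way such a check could work is sketched below. The quality criterion is not specified at this point in the text, so the sketch assumes one: the root-mean-square of the residual left after subtracting the fitted component spectra from the raw data must fall below a tolerance. The names `residual_rms`, `report_or_retry` and `tolerance` are hypothetical.

```python
from math import sqrt


def residual_rms(raw, concentrations, library):
    """RMS of raw minus the sum of concentration-weighted reference spectra."""
    fitted = [
        sum(conc * library[name][i] for name, conc in concentrations.items())
        for i in range(len(raw))
    ]
    return sqrt(sum((r - f) ** 2 for r, f in zip(raw, fitted)) / len(raw))


def report_or_retry(raw, concentrations, library, tolerance=0.05):
    """Return the results list if the fit is good enough, else None (re-analyse)."""
    if residual_rms(raw, concentrations, library) > tolerance:
        return None
    return sorted(concentrations.items())


# Hypothetical unit-concentration reference spectra and a measured spectrum.
library = {"gas_A": [0.0, 1.0, 0.0, 0.0], "gas_B": [0.0, 0.0, 1.0, 0.0]}
raw = [0.0, 2.0, 1.0, 0.0]

print(report_or_retry(raw, {"gas_A": 2.0, "gas_B": 1.0}, library))
print(report_or_retry(raw, {"gas_A": 2.0}, library))  # gas_B unexplained
```

A `None` return corresponds to going back to the pre-processing step; a list return corresponds to outputting each component with its concentration.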
Preferably, the raw spectrum data comprises Infra-Red (IR) spectrum data of a gas.
Alternatively the raw spectrum data comprises mass spectrum data.
According to a second aspect of the invention, there is provided a software application operable to perform a method according to any of the features of the first aspect of the invention.
According to a third aspect of the invention, there is provided an apparatus for performing spectroscopic analysis of a sample, comprising: a sample cell for containing, in use, the sample; a light source and light detector, for directing light onto the sample and for receiving light therefrom and thereby obtaining raw spectrum data; and processing circuitry, coupled to at least the light detector, the processing circuitry being configured to implement the method of any of claims 1 to 15 of the appended claims.
At least in embodiments, the invention provides a method of performing spectral analysis, and provides an automated method for identifying and quantifying gases within a complex mix of IR spectral data. The method uses full spectrum analysis to qualitatively identify gases, and combines this identification with quantitative modelling to determine their concentration. The design of the method is such that it will be continually checking and improving the analysis models as data is being collected.
Such a method provides the advantage that it removes the requirement for a trained spectroscopist to analyse the captured data, which significantly reduces the cost of ownership of gas analysers.
It will be appreciated that the method of the invention is suitably implemented in software that is run on a computer device. A further advantage is that the software is able to separate the IR spectrum of a gas sample, work out what gases are present, and determine their concentration without requiring knowledge of the expected gases beforehand.
At least in embodiments, with the method of the invention the software identifies the new gas in real time, pulls the appropriate spectra from the library (database) and automatically builds the new models and carries on data acquisition. Beneficially, in effect, the method of the invention identifies unknown or unexpected gases without requiring calibration of a gas analyser, and provides immediate gas concentration values that can be reported or acted upon.
According to an alternative characterisation of the invention, there is provided a method of performing spectroscopic analysis of raw spectrum data, comprising:
i) pre-processing the raw spectrum data,
ii) building a qualitative PLS-DA model using a database of reference spectrum data,
iii) performing a PLS-DA analysis of the raw spectrum data to cluster/classify the data into one or more groups,
iv) subtracting an identified component from the raw spectrum, and
v) re-analysing the raw spectrum data by repeating steps iii) and iv).
According to an alternative characterisation of the invention, there is provided a method of determining the identities and concentrations of components in a gas or liquid sample by spectral analysis, the method comprising, in sequence:
receiving raw spectrum data for the sample;
pre-processing the raw spectrum data, to obtain pre-processed spectrum data from which the spectra of undesired components have been removed;
performing qualitative processing on the pre-processed spectrum data to determine therefrom the identity of the or each component in the sample; and
performing quantitative processing using the determined identity or identities to thereby determine, for the or each identified component in the sample, the concentration of that component.
According to an alternative characterisation of the invention, there is provided a method of determining an identity and a concentration of one or more components in a gas or liquid sample by spectral analysis, the method comprising, in sequence:
receiving raw spectrum data for the sample;
pre-processing the raw spectrum data, to obtain pre-processed spectrum data from which the spectra of known undesired components have been removed; and
performing qualitative processing on the pre-processed spectrum data to determine therefrom the identity of the one or more components in the sample;
wherein identifying a component comprises:
subtracting the spectrum of an identified component from the raw spectrum data, or previously produced modified raw spectrum data, to produce modified raw spectrum data, and returning to the pre-processing step to reanalyse the modified raw spectrum data.
Any preferred or optional features of one aspect or characterisation of the invention may be a preferred or optional feature of other aspects or characterisations of the invention.
Brief Description of the Drawings
Other features of the invention will be apparent from the following description of preferred embodiments shown by way of example only with reference to the accompanying drawings, in which:
Figure 1 shows a graph of raw spectral data for a gas sample comprising four components (gases);
Figure 2 shows steps of a method for identifying and quantifying the components in a gas sample, according to an embodiment of the invention;
Figure 3 shows sample spectra obtained during the pre-processing step of Fig. 2;
Figure 4 shows clustering of spectra of gas components by the PLS-DA algorithm in the qualitative analysis step of Fig. 2;
Figure 5 schematically illustrates matrix representations used in the qualitative analysis step of Fig. 2;
Figure 6 shows clustering of spectra of gas components by the PLS-DA algorithm in the qualitative analysis step of Fig. 2, with the presence of an unknown component;
Figure 7 shows clustering of spectra of gas components by the PLS-DA algorithm in the qualitative analysis step of Fig. 2, illustrating two-stage classification according to an embodiment of the invention;
Figure 8 shows a comparison of the concentrations predicted for each spectrum by a PLS model to the actual known concentrations in the qualitative analysis step of Fig. 2, according to an embodiment of the invention;
Figure 9 is a schematic of a process for calculating PLS models in the qualitative analysis step of Fig. 2, according to an embodiment of the invention;
Figure 10 shows a method of wavelength (“wavenumber band”) selection for each gas for use with an automatic PLS model in the qualitative analysis step of Fig. 2, according to an embodiment;
Figure 11 shows spectra obtained, including a residual spectrum, for use in the analysis check step of Fig. 2, according to an embodiment of the invention;
Figure 12 shows graphically the results from the method of Fig. 2 — a list of gas species identified and their associated concentration values, illustrating a trend of “concentration” vs “time”; and
Figure 13 is a schematic block diagram of an apparatus for performing the method of Figure 2, according to an embodiment of the invention.
Detailed Description
In the embodiment described below, the method is for analysing IR spectra to identify and quantify each species of gas in an automated manner.
Referring initially to Fig. 13, this is a schematic block diagram of an apparatus for identifying and quantifying each species of gas, according to an embodiment of the invention.
The apparatus 1310 is for monitoring and/or detecting gas components in a gas sample. In some embodiments, spectra of a liquid sample, or of vapours of a solid or liquid substance, can be detected. The apparatus 1310 can be an absorption spectrometer and/or can be a Fourier Transform Infrared (FTIR) spectrometer. In the embodiment illustrated, the apparatus 1310 includes a source 1314, an interferometer 1318, a sample cell 1322, a source for a gas sample 1326, a detector 1330, a processor 1334, a display 1338, and a housing 1342. In various embodiments, the apparatus 1310 can be used to detect a gas component in a short period of time with few, if any, false positives or negatives.
In various embodiments, the source 1314 can provide a beam of radiation (e.g., an infrared beam of radiation). In one embodiment, the source is a glowbar, which is an inert solid heated to about 1000 °C to generate blackbody radiation. The glowbar can be formed from silicon carbide and can be electrically powered. The spectral range of the system can be between about 600 cm⁻¹ and about 5000 cm⁻¹. The spectral resolution of the system can be 1 cm⁻¹, 2 cm⁻¹, 4 cm⁻¹, 8 cm⁻¹ or 16 cm⁻¹. However, the library referenced below is at 1 cm⁻¹. In one embodiment, the detection system can record a higher resolution spectrum of a trace gas upon detection of the gas components. The higher resolution spectrum can aid identification of the gas components.
In various embodiments, the source 1314 of radiation and the interferometer 1318 can comprise a single instrument. In some embodiments, the interferometer 1318 is a Michelson interferometer, commonly known in the art. In one embodiment, the interferometer 1318 is a BRIK interferometer available from MKS Instruments, Inc. (Wilmington, Mass.).
In one embodiment, the interferometer 1318 can be a module including a source of radiation, a fixed mirror, a movable mirror, an optics module, and a detector module (e.g., the detector 1330). The interferometer module can measure all optical frequencies produced by its source and transmitted through a sample (e.g., the sample 1326 contained within the sample cell 1322).
In various embodiments, an absorption spectrometer is used to record an optical signal, and information about the trace species is derived from the signal transmitted through the sampling region. For example, an absorption spectrum or a differential spectrum can be used.
In various embodiments, the sample cell 1322 can be a folded path and/or a multiple pass absorption cell. The sample cell 1322 can include an aluminium housing enclosing a system of optical components. In some embodiments, the sample cell 1322 is a folded-path optical analysis gas cell as described in U.S. Pat. No. 5,440,143.
In various embodiments, the source of the sample of gas 1326 can be ambient air. The sample cell 1322 or a gas sampling system can collect surrounding air and introduce it to a sampling region of the sample cell 1322. The sample of gas can be introduced to the sample cell 1322 at a predetermined flow rate using a flow system including an inlet 1346 and an outlet 1350 of the sample cell 1322.
In various embodiments, the detector 1330 can be an infrared detector. In some embodiments, the detector 1330 is a cooled detector. For example, the detector 1330 can be a cryogen cooled detector (e.g., a mercury cadmium telluride (MCT) detector), a Stirling cooled detector, or a Peltier cooled detector. In one embodiment, the detector is a deuterated triglycine sulfate (DTGS) detector. In one embodiment, the detector is a 0.5 mm Stirling-cooled MCT detector with a 16 µm cut-off, which can provide the sensitivity required for detecting a trace gas. The relative responsivity (i.e., ratio of responsivity as a function of wavelength) of the Stirling-cooled MCT detector is at least 80% throughout the main wavelength region of interest (e.g., 8.3-12.5 µm). In addition, the D* value of the Stirling-cooled MCT detector can be at least 3 × 10¹⁰ cm Hz^(1/2) W⁻¹. The D* can be defined as the inverse of the detector noise equivalent power multiplied by the square-root of the active element area.
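Written out in symbols, the stated definition of D* reads as below. The symbols A (active element area) and NEP (noise equivalent power) are introduced here for illustration, and NEP is assumed to be expressed per unit bandwidth, in W Hz^(-1/2), which is what makes the units come out as quoted.

```latex
D^{*} = \frac{\sqrt{A}}{\mathrm{NEP}},
\qquad
[D^{*}] = \frac{\mathrm{cm}}{\mathrm{W\,Hz^{-1/2}}} = \mathrm{cm\,Hz^{1/2}\,W^{-1}}
```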
The processor 1334 can receive signals from the detector 1330 and identify a trace gas by its spectral fingerprint or provide a relative or absolute concentration for the particular material within the sample. The processor 1334 can be, for example, signal processing hardware and quantitative analysis software that runs on a personal computer. The processor 1334 can include a processing unit and/or memory. The processor 1334 can continuously acquire and process spectra while computing the concentration of multiple gas components within a sample, as described in detail hereinafter. The processor 1334 can transmit information, such as the identity of the trace gas, a spectrum of the trace gas, and/or the concentration of the trace gas, to a display 1338. The processor 1334 can save spectrum concentration time histories in graphical and tabular formats and measured spectrum and spectral residuals, and these can be displayed as well. The processor 1334 can collect and save various other data for reprocessing or review at a later time. The display 1338 can be a cathode ray tube display, light emitting diode (LED) display, flat screen display, or other suitable display known in the art.
Returning to Figure 2, this shows steps of a method for identifying and quantifying the components in a gas sample, according to an embodiment of the invention. This may be performed, for example, using the apparatus of Fig. 13. The method will be described in terms of the various steps thereof, in sequence. It will be appreciated that the steps may be performed in a different order, and need not necessarily be performed in the order shown in Figure 2.
Step 1: Receiving raw IR Spectrum
The method includes receiving the raw IR spectrum data, step 1. An example of the raw IR spectrum data is shown in Figure 1, as noted earlier, which is produced from a Fourier Transform Infrared (FTIR) spectrometer containing a gas absorption cell. It will be appreciated that every gas has a different IR spectrum, but it will also be appreciated that inert or homonuclear diatomic gases have no IR spectrum. Furthermore, such raw spectrum data is an “absorption” spectrum and not a transmission spectrum.
Step 2: Pre-processing
The method includes performing data pre-processing (step 2) to clean up the IR spectrum ready for the subsequent qualitative analysis step, step 3. The data pre-processing step 2 includes a “baseline correction” to remove drifts in the spectrum baseline, and “mean centring”. The baseline is shown in Figure 1 at 10, and should be at zero, but may drift due to environmental factors such as pressure and temperature of the gas sample, as well as performance drop-offs of the instrument. The baseline correction pushes the baseline 10 back to zero. Mean centring is a known technique to standardise or normalise the data, and requires the mean spectrum to be calculated (by adding the individual parts of the spectrum and dividing by the number of individual parts to obtain the mathematical mean) and then subtracted from each sample spectrum to give mean centred data. The invention also includes an optional water subtraction pre-processing step. Water has a strong and wide infrared response, and therefore when there are high levels of water in a gas sample the spectral response from water can mask those from other species and hinder identification. Preferably, the spectrum is analysed for water in the first instance and, if significant levels of water are found to be present, the water is then subtracted.
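By way of illustration only, the baseline correction and mean centring described above may be sketched as follows. The description does not specify how the baseline is estimated; the sketch assumes a constant offset estimated from points known to be free of absorption, and all function names and data values are hypothetical.

```python
def baseline_correct(spectrum, baseline_points):
    """Shift a spectrum so the mean of its assumed absorption-free points is zero."""
    offset = sum(spectrum[i] for i in baseline_points) / len(baseline_points)
    return [a - offset for a in spectrum]

def mean_centre(spectra):
    """Subtract the mean spectrum (point-by-point average) from every spectrum."""
    n = len(spectra)
    mean_spectrum = [sum(column) / n for column in zip(*spectra)]
    return [[a - m for a, m in zip(s, mean_spectrum)] for s in spectra]

# Two made-up spectra with drifted baselines; index 2 holds an absorption peak.
raw = [[0.12, 0.10, 0.50, 0.11],
       [0.02, 0.00, 0.80, 0.01]]
corrected = [baseline_correct(s, baseline_points=[0, 1, 3]) for s in raw]
centred = mean_centre(corrected)
```

After mean centring, each wavenumber point averages to zero across the set of spectra, which is the form expected by the PLS-based steps that follow.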
Figure 3 shows sample spectra obtained during the pre-processing step of Fig. 2, in the subtraction of the water spectrum, before the spectrum is used in step 3.
In Figure 3 the graph A is an IR sample spectrum obtained from a gas sample. The graph B shows an IR water spectrum in hatching (diagonally downwards left to right), which has been obtained from a library and which has been superimposed onto the IR sample spectrum. The graphs at C show the IR water spectrum in hatching (diagonally downwards left to right), which has been scaled to the IR sample spectrum using a reference wavenumber 1917.98 of the IR water spectrum. The left-hand graph C shows the complete sample spectrum and water spectrum, and the right-hand graph shows a small part of the left-hand graph. The graph D shows the IR sample spectrum with the IR water spectrum subtracted. Producing the graph D prevents the sample spectrum from being masked by the water spectrum.
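By way of illustration only, the scaling and subtraction of the library water spectrum may be sketched as follows. The sketch assumes that the absorbance of the sample at the reference wavenumber is due to water alone and that the library water spectrum is non-zero there; the function name and the data are hypothetical.

```python
def subtract_water(wavenumbers, sample, water, ref_wavenumber=1917.98):
    """Scale the library water spectrum to the sample at the reference
    wavenumber, then subtract it. Returns (water-free spectrum, scale)."""
    # Index of the grid point nearest to the reference wavenumber.
    i = min(range(len(wavenumbers)),
            key=lambda k: abs(wavenumbers[k] - ref_wavenumber))
    scale = sample[i] / water[i]
    return [s - scale * w for s, w in zip(sample, water)], scale

# Made-up four-point spectra: the sample is 2.5 x water plus one analyte band.
wavenumbers = [1900.0, 1917.98, 1950.0, 2100.0]
water = [0.0, 0.2, 0.1, 0.0]
sample = [0.0, 0.5, 0.25, 0.3]
dry, scale = subtract_water(wavenumbers, sample, water)
```

In this made-up case only the analyte band at the last point remains after subtraction.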
Step 3: Qualitative Analysis
The qualitative step 3 determines the different types of gas (or “gas components”) that are present in the spectrum by analysing the pre-processed sample spectrum with a Partial Least Squares Discriminant Analysis (PLS-DA) algorithm. Preferably, this relies on a spectral database (i.e. library) of gas species (currently, the inventors have obtained over 300) which are used to make PLS-DA models which are then used to qualify unknowns from sample spectra. In effect, this embodiment of the invention defines the PLS-DA algorithm as described below and the calibration data from (over 300) gas species are used to model the IR sample spectrum.
Figure 4 shows clustering of spectra of gas components by PLS-DA algorithm in the qualitative analysis step of Fig. 2. Partial Least Squares - Discriminant Analysis (PLS-DA) is a particular case of PLS which finds a linear function (i.e. the lines in Figures 4a and 4b) which best describes or separates the classes of data (each class of data defining a particular gas type). Preferably, the invention uses PLS-DA to cluster/classify multivariate data (such as FTIR spectra) when the gas components (e.g. acetone, MEK, acetyl acetone, MIBK, etc.) within the samples are already known (this may be termed “Supervised Clustering”). Whereas PLS-DA is a known way of processing data, embodiments of the invention use the PLS-DA together with subtracting hits and re-analysing (i.e. steps 3 and 2 shown in Figure 2) as discussed below, to provide enhanced processing/results.
PLS-DA is based on the classical known Partial Least Squares Regression (PLSR) algorithm; however, in PLS-DA the “response variable” or “dependent variable” is replaced by a “dummy variable” which describes the class characteristics of the gas. The PLS-DA algorithm is very similar to the known PLSR algorithm except that the Y data (shown in Fig 4 along the y-axis) are discrete classes rather than continuous data, such as a gas concentration. The aim of PLS-DA is to determine linear discriminators (i.e. the lines in Fig 4a and 4b) which separate the classes of gas.
Figure 4(a) shows a simple two class theoretical example of how PLS-DA operates. Figure 4 shows a PLS-DA “scores plot” for two spectra from two types of gas, whereby the x-axis represents scores from a first factor, and the y-axis represents scores from a second factor. The scores plot decomposes the IR spectrum data from the two gases into useful information. In Figure 4a the x-axis represents the FTIR spectra, and the y-axis represents the “dummy variable” in the data. Each symbol in Figure 4 represents a sample or spectrum but the axes are the calculated scores for each spectrum. The x axis shows the scores for the first factor and the y axis the scores for the second factor. Figure 4(a) shows that for a set of I samples, the X data matrix represents a set of J spectra which form two specific groups or classes. Thus, in Fig 4a, J = 13; in other words, there are 13 spectra represented in Fig 4a.
The vector y of length I represents a numerical label for each sample according to its group or class membership. In a system with two class groups, the I_A spectra which are members of group A would have a label of +1 and the I_B samples which belong to group B would have a label of -1.
PLS-DA is derived from PLSR, and therefore involves forming a regression model between X and y. In PLSR y is a set of continuous numbers, such as concentration values. In PLS-DA it contains discrete numbers, i.e. for the two class example above the y data would be discrete numbers at 2 levels, one level for group A and one for group B.
The known PLS1 algorithm is then used to build the regression model.
Once the model is built (i.e. defining the PLS-DA algorithm and using the calibration data from, e.g., over 300 gas species to model the IR sample spectrum) it is then possible to predict the value of Y for new and future samples using the relationship between X and y explained by the model. The decision as to which group a new spectrum belongs is determined by the calculated Ŷ values for that sample. For the two class example stated above, group A is assigned the label +1 and group B is assigned the label -1. A Ŷ value above zero returns a prediction of group A, and a value below zero a prediction of group B.
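By way of illustration only, the two-class decision rule described above may be sketched with a single-factor PLS1 model, written here with NumPy. The calibration spectra are made up (group A absorbs at the second point, group B at the fourth), the function names are hypothetical, and a practical model would normally use more than one factor.

```python
import numpy as np

def fit_pls1_one_factor(X, y):
    """Fit a single-factor PLS1 model; returns (x_mean, y_mean, beta)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = Xc.T @ yc                    # weights: covariance of each variable with y
    w = w / np.linalg.norm(w)
    t = Xc @ w                       # scores of the calibration spectra
    q = (t @ yc) / (t @ t)           # regression of y on the scores
    return x_mean, y_mean, w * q

def predict_class(model, X_new):
    """Label each new spectrum A if its predicted y is above zero, else B."""
    x_mean, y_mean, beta = model
    y_hat = (X_new - x_mean) @ beta + y_mean
    return ["A" if v > 0 else "B" for v in y_hat]

X = np.array([[0.0, 1.0, 0.0, 0.1], [0.0, 0.9, 0.0, 0.0], [0.0, 1.1, 0.0, 0.1],
              [0.0, 0.1, 0.0, 1.0], [0.0, 0.0, 0.0, 0.9], [0.0, 0.1, 0.0, 1.1]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])   # +1 = group A, -1 = group B
model = fit_pls1_one_factor(X, y)
labels = predict_class(model, np.array([[0.0, 0.95, 0.0, 0.05],
                                        [0.0, 0.05, 0.0, 1.00]]))
```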
When there are more than two classes in a PLS-DA model, as there are when using PLS-DA as a qualitative clustering method for component identification, y is extended so that instead of a vector it becomes the matrix, Y. In the Y matrix each sample is given a numerical label to indicate whether it belongs to a particular class.
Figure 5 schematically illustrates matrix representations using the qualitative analysis step of Fig. 2. For example, in a 4 group/class model - A, B, C and D - Y is a data matrix of 4 columns in which each column would represent the class membership of that group. So the first column of the Y matrix describes the class membership of group A. All members of group A are assigned +1 and members of B, C and D are assigned -1. The second column describes membership of group B, hence B is +1 and A, C and D are -1. That is, +1 denotes being in the group represented by that column and -1 not being in that group.
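By way of illustration only, the construction of the Y matrix for a four-class model may be sketched as follows; the function name and the sample labels are hypothetical.

```python
def dummy_matrix(labels, classes):
    """One row per sample, one column per class: +1 if the sample belongs
    to the class represented by that column, otherwise -1."""
    return [[1 if label == c else -1 for c in classes] for label in labels]

classes = ["A", "B", "C", "D"]
sample_labels = ["A", "A", "B", "C", "D"]
Y = dummy_matrix(sample_labels, classes)
# The first column of Y is the y vector for the group A model, and so on.
first_column = [row[0] for row in Y]
```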
Next, four separate PLS1-DA models are built as described above, one for each column/group. In one arrangement, for the four groups there are four PLS-DA models, which are superimposed as shown in Figure 6. In an alternative arrangement, for the four groups there are four PLS-DA models that are plotted and assessed separately, as discussed below.
When the PLS-DA models are used for multiple classes it is usually found that the linear PLS-DA separators do not intersect in a way which clearly divides the classes. There is often ambiguity for samples, as shown in the theoretical example of Fig. 4 (b), where some samples fall into multiple classes and others do not fall into any particular class. The result of this is that a more complicated decision rule for determining class membership is needed.
Figure 6 shows clustering of spectra of gas components by PLS-DA algorithm in the qualitative analysis step of Fig. 2, with the presence of an unknown component. It will be appreciated that Figure 6 is a schematic model and is not based on real data or analysis. As shown in Figure 6, when an ‘unknown’ or sample spectrum has been added into a qualitative model, an assessment is made of how closely related to each class cluster (i.e. compound group) this spectrum is, in order to populate a list of possible ‘hits’ (i.e. to identify the gas component). To do this, the Euclidean distance from the centroid 602 of each group is measured, as shown by the arrows 604 in Fig. 6. It will be appreciated that Fig. 6 shows a scores plot with six types of ketone, i.e. Acetone, MEK, Acetyl acetone, MIBK, Propanone, Cyclohexanone. The centre of each cluster, the centroid, is located in the scores plot and the straight line distance from this point to the unknown spectrum (see Fig. 6) is measured/calculated. Hits are then ranked in accordance with these measurements. The shortest distance is 1, the second smallest 2, and so on, and from this enumeration a class membership decision can be determined.
In an alternative arrangement to the superimposing technique shown in Figure 6 and as discussed in the preceding paragraph, the multiple individual models are used separately (i.e. without superimposing them). With this arrangement the straight line distance in the scores plot between the unknown and the centre of each cluster in each individual model is measured and ranked. Hits are then ranked in accordance with these measurements (i.e. the shortest distance being 1, the second smallest 2, and so on) and from this enumeration a class membership decision can be determined.
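By way of illustration only, the ranking of hits by straight line distance from each cluster centroid may be sketched as follows. The scores are made-up two-factor coordinates and the function name is hypothetical.

```python
import math

def rank_hits(unknown, class_scores):
    """Return (class name, distance) pairs sorted by the Euclidean distance
    between the unknown's scores and each class centroid, nearest first."""
    distances = {}
    for name, points in class_scores.items():
        centroid = [sum(axis) / len(points) for axis in zip(*points)]
        distances[name] = math.dist(unknown, centroid)
    return sorted(distances.items(), key=lambda item: item[1])

# Made-up scores (factor 1, factor 2) for three calibration clusters.
class_scores = {
    "Acetone": [(1.0, 1.0), (1.2, 0.8)],
    "MEK": [(4.0, 4.0), (4.2, 3.8)],
    "MIBK": [(-3.0, 2.0), (-3.2, 2.2)],
}
hits = rank_hits((1.5, 1.2), class_scores)
ranked_names = [name for name, _ in hits]   # rank 1 first, rank 2 second, ...
```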
The method described above works well for simple data sets in which there are up to about 10 class groups. The inventors have discovered, however, that when the full library of over 300 gas species is used the initial model is too large and contains too many very different types of compound. The model can become overwhelmed by the number of classes and finds clustering difficult. Accordingly, in a preferred embodiment of the invention, a two-step classification process is used.
Figure 7 shows clustering of spectra of gas components by PLS-DA algorithm in the qualitative analysis step of Fig. 2, illustrating two-stage classification according to an embodiment of the invention. In this two-step process a model is first built which identifies the type/class of component, denoted by a first cluster 702. In this case the class information would not be the name of the component but the name of the chemical group to which that component belongs, i.e., Alcohol, Alkane, Aldehyde, etc.
Once a chemical group has been identified, the gas component is then reanalysed by a secondary PLS-DA model which only contains data from components within that group, i.e. to establish a species, denoted by a second cluster 704. In Fig. 7, two PLS-DA scores plots are shown for real data.
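By way of illustration only, the control flow of the two-step classification may be sketched as follows. For brevity a nearest-centroid rule stands in for each PLS-DA model, and the same made-up score coordinates are used at both stages; in practice each stage would have its own PLS-DA model and scores. All names and values are hypothetical.

```python
import math

def nearest(unknown, centroids):
    """Name of the centroid closest to the unknown (stand-in for a PLS-DA model)."""
    return min(centroids, key=lambda name: math.dist(unknown, centroids[name]))

def classify_two_stage(unknown, group_centroids, species_centroids_by_group):
    """Stage 1 picks the chemical group; stage 2 picks a species within it."""
    group = nearest(unknown, group_centroids)
    species = nearest(unknown, species_centroids_by_group[group])
    return group, species

group_centroids = {"Alkane": (0.0, 0.0), "Alcohol": (10.0, 0.0)}
species_centroids_by_group = {
    "Alkane": {"Methane": (-1.0, 0.0), "Ethane": (1.0, 0.0)},
    "Alcohol": {"Methanol": (9.0, 0.0), "Ethanol": (11.0, 0.0)},
}
result = classify_two_stage((-0.8, 0.1), group_centroids, species_centroids_by_group)
```

The second stage only ever compares the unknown against species from the group selected in the first stage, which keeps each model small.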
In a spectrum which is from a mixture of gas components, it is likely that the most prevalent component is identified first. According to embodiments of the invention, the identified component is then subtracted from the raw sample spectrum and the new spectrum is then re-analysed using the same model. The spectrum of the component to be subtracted is automatically scaled before subtraction so that the peak heights match those of the sample spectrum. It will be understood that this automatic scaling is a similar process to the scaling of the water spectrum before subtraction described above.
The cycle of subtracting hits and re-analysing the spectrum (i.e., modified spectrum data) is then repeated (as shown in Fig. 2 by repeating the steps 2 and 3) until all gases (gas components) in the spectrum have been identified, or until no further identification is possible. The method provides an iterative process whereby the predominant or prevalent species in the sample is identified, automatically subtracted and the spectrum is re-analysed.
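By way of illustration only, the cycle of identifying, scaling, subtracting and re-analysing may be sketched as follows. The identification step is represented by a simple stand-in (largest overlap with a library spectrum) rather than the PLS-DA models, the pre-processing step is omitted, and the library, threshold and function names are hypothetical.

```python
def strongest_match(residual, library, threshold=0.01):
    """Stand-in identifier: best-overlapping library entry, or None when
    nothing above the threshold remains in the spectrum."""
    if max(residual) < threshold:
        return None
    return max(library,
               key=lambda n: sum(r * a for r, a in zip(residual, library[n])))

def identify_mixture(sample, library, identify, max_components=10):
    """Repeatedly identify a component, scale its reference spectrum to the
    sample at the reference's strongest peak, subtract it, and re-analyse."""
    residual, found = list(sample), []
    for _ in range(max_components):
        name = identify(residual, library)
        if name is None:
            break                                  # no further identification possible
        ref = library[name]
        peak = max(range(len(ref)), key=lambda i: ref[i])
        scale = residual[peak] / ref[peak]         # match peak heights
        residual = [r - scale * a for r, a in zip(residual, ref)]
        found.append(name)
    return found, residual

library = {"methane": [1.0, 0.0, 0.0, 0.0], "acetone": [0.0, 0.0, 1.0, 0.5]}
sample = [3.0, 0.0, 1.0, 0.5]                      # 3 x methane + 1 x acetone
found, residual = identify_mixture(sample, library, strongest_match)
```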
The model 1 shown in Fig. 7 is relatively messy as indicated by the 11.26% number on the x-axis which shows data variance, and has about a 50% success rate for identifying the gas from the unknown spectrum due to overlap of the chemical groups, even when the centroid method is used as discussed above. The model 2 shown in Fig. 7 is significantly better at identifying the unknown spectrum as indicated by the 54.46% number on the x-axis. It will be understood that the model 1 is a first broad PLS-DA model where the functional (chemical) groups/classes are plotted (i.e. an alkane was identified in this example), and model 2 is a second detailed PLS-DA model which identifies the gas (species; i.e. the alkane was identified as methane in this example).
It will be understood that whereas Fig. 6 shows a theoretical model with six types of ketone, embodiments of the invention model over 300 gas species, an example of which for six species is shown in Fig. 7. Furthermore, it will be appreciated that a gas can only be identified if it is contained in the database of the 300 gases. However, where a gas type is not in the database, useful information about the unknown gas is still obtained by the PLS-DA method because the chemical group (e.g. alkane) would be identified (i.e. from the model 1 shown in Figure 7).
Step 3 of the invention uses PLS-DA modelling, which is less complex than prior art ways of analysing large data sets. Such prior art ways of analysing gas sample data may use statistical analysis, which can quickly become too complex and lead to lengthy statistical calculations. Such PLS-DA is a relatively simple and quick way of analysing the data to provide grouped data as discussed above. Providing quick grouped data is useful when a gas stream is required to be analysed in real time.
Step 4: Quantitative Analysis
The next step, step 4, is to perform quantitative analysis, whereby each identified gas is compared to a calibration set of spectra containing multiple spectra of each species of gas at varying concentrations.
Figure 8 shows a comparison of the concentrations predicted for each spectrum by a PLS model to the actual known concentrations in the quantitative analysis step of Fig. 2, according to an embodiment of the invention. The data forming the basis of the plot in Fig. 8 is given in Table 1.
Table 1
Spectrum | Actual (ppm) | Predicted (ppm) | Difference (ppm) | % Difference | RSS
Methyl Isobutyl Ketone 10ppm 120C 6_4m SPC | 10.00 | 12.81 | 2.81 | 0.06 | 0.0001
Methyl Isobutyl Ketone 50ppm 120C 6_4m SPC | 50.00 | 53.14 | 3.24 | 0.06 | 0.0000
Methyl Isobutyl Ketone 1000ppm 1200C 6_4m SPC | 1000.00 | 996.20 | -3.80 | -0.08 | 0.0054
Methyl Isobutyl Ketone 5000ppm 1200C 6_4m SPC | 5000.00 | 4965.22 | -34.78 | -0.70 | 0.1598
Zero spectrum 2 spc | 0.00 | 3.89 | 3.89 | 0.08 | 0.0000
Zero spectrum 3 spc | 0.00 | 3.15 | 3.15 | 0.06 | 0.0000
The calibration set is in a database of several thousand spectra of different gas concentrations (i.e. the same database of 300 gas species, but with multiple concentrations of each). The quantitative analysis step automatically builds a quantitative Partial Least Squares (PLS) model for each gas species identified by the qualitative analysis using these spectra. The use of a PLS model for mapping gas concentration data onto gas species data will be known to persons skilled in the art.
The previous paragraph refers to a quantitative Partial Least Squares (PLS) model, but other models may be used such as Classical Least Squares (CLS), or Inverse Least Squares (ILS) etc. In other words, alternative quantification analysis methods can be applied at this stage. However, the Applicants have determined that a PLS model provides the best analysis results in this embodiment.
The PLS algorithm used for quantification is very similar to the PLS-DA algorithm described above; however, instead of discrete classes the Y block data is continuous data. For this quantitative modelling, the X data is the calibration spectra from the library and the Y data is the associated concentrations. In a similar manner to the qualitative modelling, the aim of the quantitative model is to describe the relationship between X and Y so that the model can then be used on a new set of X to predict a new Y (i.e. the concentration results). The PLS algorithm does this by finding factors (i.e. combinations of variables), which describe the covariance (how variables change in relation to each other) and achieve correlation relating X to Y. The algorithm attempts to find factors which will maximise the variance in the spectra, X, but only the variance which is relevant for predicting the concentrations, Y, of that particular gas. The output of this is a comparison of the concentrations predicted by the model for each spectrum to the actual known concentrations, as seen in Figure 8.
Figure 9 shows a basic schematic for calculating the PLS models. Firstly, the X and Y data are split into two smaller data matrices, the scores and the loadings, using a Principal Component Analysis (PCA) decomposition. The scores describe how samples are related to each other and the loadings how variables relate to each other. The algorithm then calculates the β coefficients which describe the relationship between X and Y. It will be understood that the β coefficients are dimensionless numbers. Finally, the X and Y residuals are calculated. This is then repeated for each PLS factor, where the first factor will describe the most variance in the data and the second factor the second largest variance and so on. In a perfect system only 1 factor should be needed to describe the relationship between X and Y, however, noise, interference and nonlinearities in the data make this not so.
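By way of illustration only, the factor-by-factor calculation of scores, loadings, residuals and β coefficients may be sketched with a NIPALS-style PLS1 routine, written here with NumPy. The exact variant used in the embodiment is not specified beyond the name PLS1, so this routine, its function names and its made-up calibration data (one analyte band shape plus one interferent) are assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_factors):
    """NIPALS-style PLS1. Returns (x_mean, y_mean, beta)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean          # X and Y residuals, deflated per factor
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = E.T @ f
        w = w / np.linalg.norm(w)          # weight vector for this factor
        t = E @ w                          # scores: how samples relate
        p = E.T @ t / (t @ t)              # loadings: how variables relate
        c = (f @ t) / (t @ t)              # regression of y on the scores
        E = E - np.outer(t, p)             # remove what this factor explains
        f = f - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q) # beta coefficients relating X to y
    return x_mean, y_mean, beta

def pls1_predict(model, x):
    x_mean, y_mean, beta = model
    return (x - x_mean) @ beta + y_mean

analyte = np.array([0.0, 0.002, 0.01, 0.003])      # absorbance per ppm (made up)
interferent = np.array([0.5, 0.0, 0.0, 0.1])
conc = np.array([0.0, 10.0, 50.0, 100.0, 500.0])   # Y data, ppm
amount = np.array([1.0, 0.0, 2.0, 1.0, 0.0])       # interferent level per sample
X = np.outer(conc, analyte) + np.outer(amount, interferent)
model = pls1_fit(X, conc, n_factors=2)
predicted = pls1_predict(model, 250.0 * analyte + 1.5 * interferent)
```

With this noise-free, linear data two factors reproduce the concentration essentially exactly; with real spectra more factors and a validation step would be expected.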
Step 4 includes a method to automatically build these PLS quantification models for each gas based on the output from step 3. The result from step 3, i.e. the qualitative analysis, is a list of gases which are present in the sample. The software then builds a calibration set of data for the quantitative modelling by pulling from the master library multiple spectra for each of the gases listed from step 3, at various concentrations. The concentration information is written in the name of each spectrum file, and therefore the Y data for the model can be automatically read into the software. It will be apparent to persons skilled in the art that other techniques may be used. What is important is that the concentration values are stored in the calibration spectra files in some manner, be it filename, metadata, etc. The data is automatically pre-processed before the model is built, involving the mean centring and baseline correction as described previously.
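By way of illustration only, reading the concentration from the name of a calibration spectrum file may be sketched as follows. The naming pattern is inferred from the spectrum names listed in Table 1 and is an assumption, as is the function name.

```python
import re

def concentration_from_name(name):
    """Extract the concentration in ppm from a spectrum name such as
    'Methyl Isobutyl Ketone 10ppm 120C 6_4m SPC'. Names beginning with
    'Zero spectrum' are treated as 0 ppm (an assumed convention)."""
    if name.lower().startswith("zero spectrum"):
        return 0.0
    match = re.search(r"(\d+(?:\.\d+)?)\s*ppm", name, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"no concentration found in {name!r}")
    return float(match.group(1))

names = ["Methyl Isobutyl Ketone 10ppm 120C 6_4m SPC",
         "Methyl Isobutyl Ketone 50ppm 120C 6_4m SPC",
         "Zero spectrum 2 spc"]
y_data = [concentration_from_name(n) for n in names]
```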
The calibration set is then randomly split by the software into a calibration set and a validation set. As PLS is a supervised method it can be subject to bias, and as such any model produced should be validated with data that is previously unseen by the model. The data is therefore split into two sets: a calibration set, used to build the model, and a validation set, used to test the model.
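By way of illustration only, a random split into calibration and validation sets may be sketched as follows. The validation fraction of 25% and the fixed seed are assumptions made so that the example is repeatable; the description does not specify either.

```python
import random

def split_calibration_validation(items, validation_fraction=0.25, seed=0):
    """Shuffle the items and hold back a fraction for validation.
    Returns (calibration set, validation set)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]

spectra_ids = [f"spectrum_{i}" for i in range(8)]   # hypothetical identifiers
calibration, validation = split_calibration_validation(spectra_ids)
```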
Figure 10 shows a method of wavelength (“wavenumber band”) selection for each gas for use with an automatic PLS model in the quantitative analysis step of Fig. 2, according to an embodiment. The automatic PLS model provides a method of automatic wavelength selection for each gas using a defined algorithm for “wavenumber band” selection. When quantitative analysis is performed by PLS, the full FTIR spectrum is typically not used; instead, a region of the spectrum is selected which is the most appropriate to use for determining the concentration of the gas species in question. The decision of which wavenumber band to use is typically made by an expert user; however, in order to facilitate expedited model building, a method has been developed by the inventors for this to be determined by the software. An iterative sequence of logical steps is applied to the calibration spectra in order to establish the optimum wavenumber band to use. The flow diagram in Fig. 10 outlines the steps in this.
The process begins by taking the highest concentration calibration spectrum for the component being modelled (step s1002), and then finding the maximum absorption in the spectrum (step s1004). It is then determined (step s1006) whether the absorption is over 1.5 absorption units. If so, the next highest absorption in the spectrum is taken (step s1008), and the test (step s1006) repeated.
If, at step s1006, it is determined that the absorption is not over 1.5 absorption units, the absorption is compared with that of all other spectra of other components in the calibration set (step s1010). A determination is then made (step s1012) as to whether there are any bands above 1.5 absorption units in the same place. If so, the absorption is discounted and the next highest absorption in the spectrum is taken (step s1016), and the test at step s1006 repeated for this.
On the other hand, if, at step s1012, it is determined that there are no bands above 1.5 absorption units in the same place, then the current band is adopted for use (step s1014).
These steps allow the software to decide on a single wavenumber point (i.e. a single spectrum data point) which lies within a suitable region, and the wavenumber band is selected from this by adding 50 wavenumber points to either side of the single point. This whole region is checked, and if there is an absorption over 1.5 Absorption Units (AU) within this region the software adjusts the start and end points of the wavenumber band accordingly to remove any absorbances over 1.5 AU. It has been established that it is not necessary to go beyond 1.5 AU, which is considered to be the maximum upper limit.
If the wavenumber band is cropped to less than 10 cm^-1 in size, then it is defined as unsuitable. If a suitable band of more than 10 wavenumber points cannot be found within this region, then processing returns to the beginning of the flow chart and the band selection process is started again.
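By way of illustration only, the band selection logic of Fig. 10 may be sketched as follows. The sketch works in wavenumber points rather than cm^-1, and crops the band by growing it outwards from the chosen point until an absorbance over 1.5 AU is met; this is one possible reading of the cropping step, and the function name and data are hypothetical.

```python
LIMIT = 1.5   # absorption units

def select_band(target, others, half_width=50, min_points=10):
    """Return (start, end) indices of a band for the target spectrum, or None.
    Candidate points are tried from the highest absorbance downwards."""
    order = sorted(range(len(target)), key=lambda i: target[i], reverse=True)
    for centre in order:
        if target[centre] > LIMIT:
            continue                  # saturated: take the next highest absorption
        if any(o[centre] > LIMIT for o in others):
            continue                  # another component is over the limit here
        lo = centre                   # grow up to half_width points either side,
        while lo > 0 and centre - lo < half_width and target[lo - 1] <= LIMIT:
            lo -= 1                   # stopping at any absorbance over the limit
        hi = centre
        while (hi < len(target) - 1 and hi - centre < half_width
               and target[hi + 1] <= LIMIT):
            hi += 1
        if hi - lo + 1 > min_points:
            return lo, hi
    return None

# Made-up spectrum: flat at 0.1 AU, saturated at points 20-30, usable peak at 120.
target = [0.1] * 200
for i in range(20, 31):
    target[i] = 2.0
target[120] = 1.2
clear = [0.1] * 200
band = select_band(target, [clear])
```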
In embodiments, the quantitative analysis step 4 also automatically chooses the optimum number of PLS factors to use in the model. To achieve this the model is built with the calibration data and then tested with validation data as discussed below at step 5, and the error from the calibration predictions and the validation predictions is then calculated. The calibration error value will always decrease as the number of factors increases, but after a point the explained variance will not be true variance and the model will be over-trained and give poor validation results. For most models, the validation error will initially decrease as the number of factors increases, but will then either begin to increase again or plateau. The optimum number of factors will be the number that has the lowest validation error.
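By way of illustration only, choosing the number of factors from a list of validation errors may be sketched as follows. The error values are made up to show the typical behaviour described above, and the function name is hypothetical.

```python
def best_factor_count(validation_errors):
    """validation_errors[k] is the validation error of the model built with
    k + 1 factors; return the factor count with the lowest validation error
    (the smallest such count if there is a tie)."""
    best_index = min(range(len(validation_errors)),
                     key=validation_errors.__getitem__)
    return best_index + 1

calibration_errors = [4.8, 2.0, 1.1, 0.7, 0.4]   # always decreasing
validation_errors = [5.0, 2.1, 1.4, 1.6, 2.0]    # falls, then rises again
n_factors = best_factor_count(validation_errors)
```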
Finally, once the model is built, the software checks the ‘goodness’ or quality of the model. This checking of the model is preferably done by calculating the R² value for the “predicted concentration” vs “actual concentration”. In embodiments, this R² value needs to be greater than 0.95. It is also checked that the model is not over-fitted, and the model accuracy is checked by means of the Standard Error of Validation (SEV): this is the root mean square of the difference between the predicted results and the actual results. When the range of a model is higher, this value will naturally be higher, and therefore this is calculated as a % of range for the model checks; in embodiments the SEV needs to be less than 2% of range to pass the model checks. E.g., if a model range is 100 ppm the SEV cannot be greater than 2 ppm, and for a model ranged at 1000 ppm it can be up to 20 ppm. If either of these criteria is not met, the most likely reason would be an inappropriate choice of wavenumber band, and therefore the processing returns to the wavenumber band selection stage, the next most appropriate wavenumber band after the last one which was tried is chosen, and the model building and checking process is repeated.
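By way of illustration only, the two model checks may be sketched as follows, using the Actual and Predicted columns of Table 1 as input and a model range of 5000 ppm. R² is computed here as the coefficient of determination, which is an assumption since the description does not give its formula; the function name is hypothetical.

```python
import math

def model_checks(actual, predicted, model_range, r2_min=0.95, sev_max_pct=2.0):
    """Return (passed, r2, sev_pct) for predicted vs actual concentrations."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    sev = math.sqrt(ss_res / n)             # root mean square of the differences
    sev_pct = 100.0 * sev / model_range     # expressed as % of the model range
    return (r2 > r2_min and sev_pct < sev_max_pct), r2, sev_pct

actual = [10.00, 50.00, 1000.00, 5000.00, 0.00, 0.00]
predicted = [12.81, 53.14, 996.20, 4965.22, 3.89, 3.15]
passed, r2, sev_pct = model_checks(actual, predicted, model_range=5000.0)
```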
A set of PLS quantification models is produced, using the method described above. One PLS model will be built for each gas species to be quantified. This set of models is then used to analyse the sample spectra and produce a predicted gas concentration value.
Steps 5 and 6: Analysis Check and Re-analyse
When a sample spectrum is analysed using the PLS models built in step 4, the quality of the analysis is then checked.
Figure 11 shows spectra obtained, including a residual spectrum 1110, for use in the analysis check step of Fig. 2, according to an embodiment of the invention. Also shown in Figure 11 is a sample spectrum 1112, and a predicted spectrum 1114.
In step 5, the residual spectrum is calculated and some statistical tests applied to it. The spectral weightings from the PLS model are added together to give the predicted spectrum. The spectrum predicted by the model is then subtracted from the actual sample spectrum to give the residual spectrum; therefore, the residual spectrum is what is left over that is not accounted for by the model. In a perfect analysis, the residual would be a flat line, but in reality this tends to be noise. Lots of peaks or large peaks in the residual spectrum indicate that there is something (an unknown component) in the gas sample which is not being accounted for by the model, and therefore processing needs to return to steps 2 and 3 in order to qualify what is in the gas sample again, as shown at step 6 in Figure 2. An example residual spectrum from a sample with an interfering species which was not accounted for in the model is shown in Figure 11.
The residual spectrum is an array of data which often needs a trained user to interpret. For decision by the processor, the residual spectrum is converted into a single indicative number to which a Pass/Fail criterion can be applied. This is done by calculating the Root Mean Residual Sum of Squares (RMRSS), i.e. the square root of the mean of the squared residuals across the wavenumber range. The optimum value for this in a perfect system would be zero, and, according to an embodiment, the maximum allowable value is 0.006. A value lower than 0.006 results in an analysis check pass, and the results would be reported as described in step 7. A value higher than this indicates that the sample data contains a gas, or multiple gases, not accounted for in the calibration set. In this case the sample spectrum is returned to steps 2 and 3 for the qualitative analysis to be repeated. The method preferably continually checks the RMRSS so that, if a new gas appears in the gas stream, it is recognised that there is a change, the new species are identified and the models adjusted accordingly.
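By way of illustration only, the Pass/Fail check on the residual spectrum may be sketched as follows. The RMRSS is taken here as the square root of the mean of the squared residuals, which is an interpretation of the definition given above; the spectra and function names are hypothetical.

```python
import math

def rmrss(sample, predicted):
    """Root of the mean squared residual across the wavenumber range."""
    residual = [s - p for s, p in zip(sample, predicted)]
    return math.sqrt(sum(r * r for r in residual) / len(residual))

def analysis_passes(sample, predicted, limit=0.006):
    """True if the residual is small enough for the results to be reported;
    False means the spectrum is returned for qualitative re-analysis."""
    return rmrss(sample, predicted) < limit

predicted = [0.100, 0.200, 0.300, 0.400]
noisy_sample = [0.101, 0.199, 0.301, 0.400]      # residual is noise only
interfered_sample = [0.100, 0.250, 0.300, 0.400] # unmodelled peak at one point
ok = analysis_passes(noisy_sample, predicted)
needs_reanalysis = not analysis_passes(interfered_sample, predicted)
```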
Step 7: Reported Results
The results from the method of Fig. 2 are a list of gas species which have been identified and the associated concentration values thereof.
Figure 12 shows graphically the results from the method of Fig. 2 for several gas species identified, graphs illustrating a trend of “concentration” vs “time”. As the FTIR gas analyser collects and analyses data continuously on-line, there is also a trend of “concentration” vs “time”, so the user can see how gas compositions and concentrations are varying from spectrum to spectrum.
The results of the method also suitably include a list of gas species identified and their associated concentration values, the data for which are given in Table 2.
Table 2
Name | Units | Value | R1 | R2
H2O | % | 1.3430 | 31.50 | 40.00
CO2 | % | 0.0482 | 5.11 | 20.00
CO | ppm | 0.3494 | 60.00 | 2000.00
NO | ppm | -0.1656 | 60.00 | 600.00
NO2 | ppm | 0.2648 | 25.00 | 490.00
N2O | ppm | 0.1845 | 25.00 | 102.00
SO2 | ppm | -1.0752 | 26.00 | 446.00
NH3 | ppm | -0.0965 | 13.00 | 46.00
HCl | ppm | -0.4280 | 9.00 | 615.00
CH4 | ppm | 2.3091 | 21.00 | 1048.00
Oxygen | Vol% | 20.9011 | 21.00 | 21.00
The above method of the invention is described as being applied to gas phase IR spectra. However, it will be appreciated that the invention could be applied to non-gas optical spectroscopic measurement, and/or mass spectrum data. The method of the invention is, to an extent, limited by the library of data (i.e. the database) of spectra which is input into the system. Whereas the library (database) used in the above examples is all gas FTIR, the data analysis method itself is not specific to this. If a library of liquid measurements or mass spectra were available, the methods could be applied to that data too.

Claims (19)

1. A method of determining an identity, and optionally a concentration, of one or more components in a gas or liquid sample by spectral analysis, the method comprising, in sequence:
receiving raw spectrum data for the sample;
pre-processing the raw spectrum data, to obtain pre-processed spectrum data from which the spectra of known undesired components have been removed; and
performing qualitative processing on the pre-processed spectrum data to determine therefrom the identity of the one or more components in the sample;
wherein identifying a component comprises:
subtracting the spectrum of an identified component from the raw spectrum data, or previously produced modified raw spectrum data, to produce modified raw spectrum data, and
returning to the pre-processing step to reanalyse the modified raw spectrum data.
2. A method according to claim 1, and further including performing quantitative processing comprising using the determined one or more identities to thereby determine, for the one or more identified components in the sample, the concentration of that component.
3. A method according to claim 2, wherein performing quantitative processing comprises building a quantitative Partial Least Squares (PLS) model of components of the sample.
4. A method according to claim 1, 2 or 3, wherein performing qualitative processing comprises:
using a Partial Least Squares Discriminant Analysis (PLS-DA) algorithm having as inputs the raw spectrum data or modified raw spectrum data, and a database of reference spectrum data, and
building a PLS-DA model of components of the sample.
5. A method according to claim 4, wherein building the PLS-DA model comprises iteratively identifying a component of the sample and updating the PLS-DA model to characterise components therein as identified or non-identified.
6. A method according to claim 4 or 5, wherein building the PLS-DA model comprises spatially arranging identifiers of the components within the PLS-DA model as the components are identified.
7. A method according to claim 6, wherein building the PLS-DA model comprises identifying clusters of identifiers of components spatially arranged within the PLS-DA model using the PLS-DA algorithm.
8. A method according to claim 6 or 7, wherein building the PLS-DA model comprises incorporating one or more boundaries within the PLS-DA model so as to define respective separations of two clusters.
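Claims 4 to 8 describe a PLS-DA model in which identifiers are arranged spatially, grouped into clusters and separated by boundaries. A minimal two-class sketch of that idea, using a single latent component, is given below. Everything specific in it is an assumption made for illustration: the sample identifiers, the class labels, the four-channel reference spectra, and the choice of a midpoint between cluster centres as the boundary.

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical reference database: identifier -> (class label, spectrum).
references = {
    "ref_A1": ("alkane",  [0.1, 1.0, 0.1, 0.0]),
    "ref_A2": ("alkane",  [0.0, 0.9, 0.2, 0.1]),
    "ref_B1": ("alcohol", [0.1, 0.1, 0.2, 1.0]),
    "ref_B2": ("alcohol", [0.0, 0.2, 0.1, 0.9]),
}

ids = list(references)
labels = [references[i][0] for i in ids]
X = [references[i][1] for i in ids]
n, m = len(X), len(X[0])

# Centre the spectra and a 0/1 class-membership response.
x_mean = [sum(row[j] for row in X) / n for j in range(m)]
y = [1.0 if lab == "alcohol" else 0.0 for lab in labels]
y_mean = sum(y) / n
Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]

# First PLS weight vector: direction of maximum covariance with the class response.
w = [sum(Xc[i][j] * (y[i] - y_mean) for i in range(n)) for j in range(m)]
norm = sqrt(dot(w, w))
w = [v / norm for v in w]

# "Spatial arrangement": each identifier gets a position (score) on the latent axis.
scores = {ident: dot(row, w) for ident, row in zip(ids, Xc)}

# Clusters: one centre per class; boundary: midpoint between the two centres.
centres = {}
for lab in set(labels):
    members = [scores[i] for i, l in zip(ids, labels) if l == lab]
    centres[lab] = sum(members) / len(members)
boundary = sum(centres.values()) / 2

def classify(spectrum):
    """Project a new spectrum and report which side of the boundary it falls on."""
    t = dot([a - mu for a, mu in zip(spectrum, x_mean)], w)
    low, high = sorted(centres, key=centres.get)
    return high if t > boundary else low

unknown_class = classify([0.05, 0.15, 0.1, 0.8])
print(scores, boundary, unknown_class)
```

With more than two classes or more latent components the positions become points in a plane or higher-dimensional space, and the boundaries become lines or surfaces between pairs of clusters.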
9. A method according to any of the preceding claims, wherein performing qualitative processing comprises, for the one or more components, performing a first classification operation to determine, based on the raw spectrum data or previously produced modified raw spectrum data, a chemical group to which the component belongs, and
performing, based on the chemical group determined and the raw spectrum data or previously produced modified raw spectrum data, a second classification operation to determine the identity of the component.
10. A method according to claim 9, when dependent on claim 6, wherein:
the first classification operation is such as to position the identifier of an unknown component within a cluster of a first type, corresponding to a chemical group, within the PLS-DA model; and
the second classification operation is such as to position the identifier of the unknown component within a cluster of a second type, corresponding to a chemical species, within the PLS-DA model.
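The two-stage classification of claims 9 and 10 (chemical group first, then species within that group) can be illustrated with the sketch below. It deliberately swaps in cosine similarity against reference spectra in place of the PLS-DA model, purely to keep the example short; the nested library, its group and species names and the spectra are hypothetical.

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / sqrt(dot(a, a) * dot(b, b))

def centroid(spectra):
    """Channel-wise mean of a list of spectra."""
    return [sum(col) / len(spectra) for col in zip(*spectra)]

def classify_two_stage(spectrum, library):
    # First classification: which chemical group does the spectrum resemble most?
    group = max(
        library,
        key=lambda g: cosine(spectrum, centroid(list(library[g].values()))),
    )
    # Second classification: which species within that group?
    species = max(
        library[group],
        key=lambda s: cosine(spectrum, library[group][s]),
    )
    return group, species

# Hypothetical reference library: group -> species -> spectrum.
library = {
    "alkanes": {
        "methane": [1.0, 0.2, 0.0, 0.0],
        "ethane":  [0.8, 0.6, 0.0, 0.0],
    },
    "alcohols": {
        "methanol": [0.0, 0.0, 1.0, 0.3],
        "ethanol":  [0.0, 0.0, 0.7, 0.9],
    },
}

result = classify_two_stage([0.78, 0.62, 0.02, 0.0], library)
print(result)
```

Restricting the second stage to the group chosen in the first stage means the species-level comparison only has to separate chemically similar candidates.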
11. A method according to any of the preceding claims, wherein said pre-processing comprises one or both of baseline correction and mean centring of the raw spectrum data.
12. A method according to any of the preceding claims, wherein said pre-processing comprises subtracting a water spectrum from the raw spectrum data.
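The pre-processing operations named in claims 11 and 12 can be sketched as three small functions. The specific choices are assumptions for illustration only: a straight-line baseline through the two end channels, a least-squares scale factor for the water reference, and centring of a single spectrum about its own mean intensity (centring each channel across a calibration set is the other common reading). The seven-channel spectra are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def baseline_correct(spectrum):
    """Subtract a straight line drawn through the first and last channels."""
    n = len(spectrum)
    y0, y1 = spectrum[0], spectrum[-1]
    return [s - (y0 + (y1 - y0) * i / (n - 1)) for i, s in enumerate(spectrum)]

def subtract_water(spectrum, water):
    """Subtract the water reference, scaled by least squares; returns (result, scale)."""
    k = max(dot(spectrum, water) / dot(water, water), 0.0)
    return [s - k * w for s, w in zip(spectrum, water)], k

def mean_centre(spectrum):
    mu = sum(spectrum) / len(spectrum)
    return [s - mu for s in spectrum]

# Hypothetical data: sloping baseline + 2 units of water + 3 units of analyte.
water = [0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
analyte = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
raw = [0.1 + 0.05 * i + 2 * w + 3 * a
       for i, (w, a) in enumerate(zip(water, analyte))]

flat = baseline_correct(raw)
dry, water_scale = subtract_water(flat, water)
centred = mean_centre(dry)
print(water_scale, centred)
```

The end-point baseline only works when the first and last channels are free of absorption features, which holds for the toy data but would need checking on real spectra.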
13. A method according to any of the preceding claims, when dependent on claim 2 or 3, and further comprising:
determining whether the raw spectrum data for the sample meet one or more quality criteria, and if not, returning to the pre-processing step to reanalyse the raw spectrum data, and if the one or more quality criteria are met, outputting analysis results as a list of components and, associated with each component, a respective concentration.
14. A method according to any preceding claim, wherein the raw spectrum data comprises Infra-Red (IR) spectrum data of a gas.
15. A method according to any of claims 1 to 13, wherein the raw spectrum data comprises mass spectrum data.
16. A software application operable to perform a method according to any preceding claim.
17. An apparatus for performing spectroscopic analysis of a sample, comprising:
a sample cell for containing, in use, the sample;
a light source and light detector, for directing light onto the sample and for receiving light therefrom and thereby obtaining raw spectrum data; and
processing circuitry, coupled to at least the light detector, the processing circuitry being configured to implement the method of any of claims 1 to 15.
18. A method of performing spectroscopic analysis as substantially described herein with reference to Figures 1 - 12 of the accompanying drawings.
19. An apparatus for performing spectroscopic analysis as substantially described herein with reference to Figure 13 of the accompanying drawings.
GB1706666.3A 2017-04-27 2017-04-27 Spectroscopic analysis Active GB2561879B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1706666.3A GB2561879B (en) 2017-04-27 2017-04-27 Spectroscopic analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1706666.3A GB2561879B (en) 2017-04-27 2017-04-27 Spectroscopic analysis

Publications (3)

Publication Number Publication Date
GB201706666D0 GB201706666D0 (en) 2017-06-14
GB2561879A true GB2561879A (en) 2018-10-31
GB2561879B GB2561879B (en) 2020-05-20

Family

ID=59011031

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1706666.3A Active GB2561879B (en) 2017-04-27 2017-04-27 Spectroscopic analysis

Country Status (1)

Country Link
GB (1) GB2561879B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110907383A (en) * 2019-11-22 2020-03-24 光钙(上海)高科技有限公司 Gas detection method based on Michelson infrared spectrum technology
CN112102898B (en) * 2020-09-22 2022-09-23 安徽大学 Method and system for identifying mode of spectrogram in solid fermentation process of vinegar grains
CN113267466B (en) * 2021-04-02 2023-02-03 中国科学院合肥物质科学研究院 Fruit sugar degree and acidity nondestructive testing method based on spectral wavelength optimization
CN114414519B (en) * 2022-01-24 2023-07-25 上海理工大学 Method for detecting type and concentration of heavy metal in water body

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012036970A1 (en) * 2010-09-13 2012-03-22 Mks Instruments, Inc. Monitoring, detecting and quantifying chemical compounds in a gas sample stream

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Meyer et al., "Qualitative and quantitative mixture analysis by library search: infrared analysis of mixtures of carbohydrates", Analytica Chimica Acta, 281, pp. 161-171, Elsevier, 1993 *
Westerhuis et al., Multivariate paired data analysis: multilevel PLSDA versus OPLSDA, Metabolomics, 6, pp. 119-128, Springer, [online]. Published 28 Oct 2009. Available from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2834771/ *

Also Published As

Publication number Publication date
GB2561879B (en) 2020-05-20
GB201706666D0 (en) 2017-06-14

Similar Documents

Publication Publication Date Title
US7251037B2 (en) Method to reduce background noise in a spectrum
TWI468666B (en) Monitoring, detecting and quantifying chemical compounds in a sample
US20060197957A1 (en) Method to reduce background noise in a spectrum
JP6089345B2 (en) Multicomponent regression / multicomponent analysis of temporal and / or spatial series files
GB2561879A (en) Spectroscopic analysis
WO2010106712A1 (en) Etching apparatus, analysis apparatus, etching treatment method, and etching treatment program
US10557792B2 (en) Spectral modeling for complex absorption spectrum interpretation
US10718713B2 (en) Unknown sample determining method, unknown sample determining instrument, and unknown sample determining program
JP6245387B2 (en) Three-dimensional spectral data processing apparatus and processing method
CN112834485B (en) Non-calibration method for quantitative analysis of laser-induced breakdown spectroscopy elements
EP3428620B1 (en) Gas analysis apparatus, program for gas analysis apparatus, and gas analysis method
JP7395570B2 (en) Chemical analysis equipment and methods
CN113340874B (en) Quantitative analysis method based on combination ridge regression and recursive feature elimination
US8082111B2 (en) Optical emission spectroscopy qualitative and quantitative analysis method
US20220252516A1 (en) Spectroscopic apparatus and methods for determining components present in a sample
EP3892985A1 (en) System and computer-implemented method for extrapolating calibration spectra
CN111044504B (en) Coal quality analysis method considering uncertainty of laser-induced breakdown spectroscopy
Perez-Guaita et al. Improving the performance of hollow waveguide-based infrared gas sensors via tailored chemometrics
CN108982401B (en) Method for analyzing single component flow from infrared absorption spectrum of mixed gas
JP3020626B2 (en) Engine exhaust gas analyzer using Fourier transform infrared spectrometer
CN111965166A (en) Rapid measurement method for biomass briquette characteristic index
CN112595706A (en) Laser-induced breakdown spectroscopy variable selection method and system
WO2018158801A1 (en) Spectral data feature extraction device and method
JPH04268439A (en) Quantitative analysis using ftir
WO2019012773A1 (en) Gas analysis device, program for gas analysis device, and gas analysis method