EP1305600A2 - Combinative multivariate calibration that enhances prediction ability through removal of over-modeled regions - Google Patents

Combinative multivariate calibration that enhances prediction ability through removal of over-modeled regions

Info

Publication number
EP1305600A2
Authority
EP
European Patent Office
Prior art keywords
matrix
factors
calibration
residual
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP01952581A
Other languages
German (de)
English (en)
Inventor
Kevin H. Hazen
Suresh Thennadil
Timothy L. Ruchti
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sensys Medical Inc
Original Assignee
Sensys Medical Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-08-01
Filing date
2001-07-09
Publication date
2003-05-02
Priority claimed from US09/630,201 (US6871169B1)
Application filed by Sensys Medical Inc
Publication of EP1305600A2
Legal status: Withdrawn

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00: Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17: Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25: Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31: Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35: Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G01N21/359: Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using near infrared light

Definitions

  • The invention relates to multivariate analysis of spectral signals. More particularly, the invention relates to a method of multivariate analysis of a spectral signal that allows a wavelength or spectral region to be modeled with just enough factors to fully model the analytical signal, without incorporating noise into the model through excess factors.
  • Multivariate analysis is a well-established tool for extracting a spectroscopic signal, usually quite small, of a target analyte from a data matrix in the presence of noise, instrument variations, environmental effects and interfering components.
  • Various methods and devices have been described that employ multivariate analysis to determine an analyte signal. For example, R. Barnes, J. Brasch, D. Purdy, W. Loughheed, Non-invasive determination of analyte concentration in body of mammals, U.S. Patent No.
  • PCR: principal component regression
  • PLS: partial least squares regression
  • One well-documented problem with multivariate analysis is that noise in the data may be incorporated into the model. This is especially true when too many factors are employed in the development of the model. The modeling of this noise results in subsequent prediction matrices with erroneously high error levels. See, for example, H. Martens, T. Naes, Multivariate Calibration, John Wiley & Sons, 1989; or K. Beebe, B. Kowalski, An Introduction to Multivariate Calibration and Analysis, Anal. Chem. 59, 1007A-1017A (1987). Complicating this issue is the fact that a few factors may fully model a given spectral region, while additional factors may be required to model another set of wavelengths. For example, a few factors may model a region having:
  • instrument drift changes spectral response over time
  • An iterative, combinative PCR algorithm allows a different number of factors to be applied to different wavelengths or spectral regions during modeling of a matrix of calibration spectra.
  • A three-factor model is applied over a given spectral region, wherein concentrations of a target analyte are known.
  • The residual of the three-factor model is calculated and used as the input for an additional five-factor model.
  • Before the additional five factors are applied, some of the wavelengths are removed, with the result that a three-factor model is applied over the first spectral region and an eight-factor model over the second region. This analysis of residuals may be repeated such that a one- to n-factor model may be applied to any given wavelength; that is, any number of factors may be employed to model any given frequency.
  • The scores matrices of the individual models are concatenated to arrive at a final scores matrix for the entire calibration matrix.
  • Principal component regression is employed to regress the calibration matrix against the vector of analyte concentrations to derive a calibration model, the model comprising a vector of calibration coefficients.
  • A method of predicting concentration of a target analyte from a prediction data set comprising a matrix of sample spectra applies the above calibration to a final scores matrix for the sample matrix to generate a vector of predicted values for target analyte concentration.
  • The sample matrix is iteratively modeled in the same fashion as the calibration matrix, with the final scores matrix being a concatenation of the individual scores matrices for the various spectral regions or wavelengths.
  • Figure 1 is a schematic diagram of the steps of an iterative, combinative PCR algorithm, according to the invention.
  • Figure 2 shows an assortment of exemplary spectra from a set of spectra generated for a calibration data set, according to the invention.
  • Figure 3 provides a plot of the square of a mean residual spectrum calculated from a set of residual calibration spectra, according to the invention.
  • Figure 4 shows a plot of the first three loadings from an initial PCR iteration, according to the invention.
  • Figure 5 plots SEP (Standard Error of Prediction) versus noise for Standard PCR and Modified PCR, according to the invention.
  • Figure 6 plots the relative error level of Standard PCR and Modified PCR as a function of noise, according to the invention.
  • Figure 7 plots SEP (standard error of prediction) and uncertainty error as a function of the number of factors used to model a signal, according to the invention.
  • The invention provides an iterative, combinative PCR (Principal Component Regression) algorithm that allows each wavelength or spectral region of a spectral signal to be modeled with just enough factors to fully model the analytical signal, without incorporating noise into the model through excess factors.
  • Each wavelength or spectral region may utilize its own number of factors independently of other wavelengths or spectral regions.
  • A novel multivariate model incorporating the invented algorithm is also provided.
  • The iterative, combinative PCR algorithm allows a different number of factors to be applied to different wavelengths, or regions, of a spectral signal.
  • A three-factor model is applied over a given spectral region.
  • The exemplary embodiment is provided only as a description, and is not intended to be limiting.
  • The residual of the three-factor model is calculated and used as input for a further five-factor model.
  • Some of the wavelengths used for the three-factor model are removed.
  • The removed wavelengths constitute a first spectral region.
  • The remaining wavelengths constitute a second spectral region.
  • A three-factor model is applied over the first region and an eight-factor model over the second region.
  • $X$ is a matrix of absorbance spectra comprising i samples and k variables (in this case, wavelengths).
  • $y$ is a vector of concentrations of a target analyte, where the concentrations are independently determined (i samples x 1).
  • Wavelength selection is initially performed on $X$ (10, Figure 1).
  • Wavelength selection is again employed using $X_3$ as the input matrix (15).
  • Steps 1-3 must go through $n$ iterations (20) to generate $T_n$ and $V_n$.
  • The final scores matrix, $T_{all}$, is generated by concatenating all of the Ts: $T_{all} = [T_1 \; T_2 \; \cdots \; T_n]$.
  • Loading vectors may not be concatenated, since the vectors are of different lengths.
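The residual-chaining procedure described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the program of Table 1: the function names, the `stages` argument (one set of column indices and one factor count per iteration), the uncentered SVD, and the least-squares fit with an offset column are all choices made here for illustration. Wavelength selection itself is taken as given.

```python
import numpy as np

def _loadings(X, n_factors):
    # Loadings (wavelengths x n_factors) from an uncentered SVD of X.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:n_factors].T

def _scores(X, stages, loadings):
    # Project spectra stage by stage; each stage models the residual left
    # by the previous one, restricted to that stage's wavelengths.
    resid = np.array(X, dtype=float)  # working copy, samples x wavelengths
    blocks = []
    for (cols, _), V in zip(stages, loadings):
        sub = resid[:, cols]
        T = sub @ V
        blocks.append(T)
        resid[:, cols] = sub - T @ V.T  # residual feeds the next stage
    return np.hstack(blocks)  # scores are concatenated; loadings are not

def fit_combinative_pcr(X, y, stages):
    """stages: list of (column_indices, n_factors), one entry per iteration."""
    resid = np.array(X, dtype=float)
    loadings = []
    for cols, n_factors in stages:
        sub = resid[:, cols]
        V = _loadings(sub, n_factors)
        loadings.append(V)
        resid[:, cols] = sub - (sub @ V) @ V.T
    T_all = _scores(X, stages, loadings)
    A = np.hstack([np.ones((T_all.shape[0], 1)), T_all])  # offset column
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    return loadings, b

def predict_combinative_pcr(X, stages, loadings, b):
    # Prediction spectra are modeled with the stored calibration loadings.
    T_all = _scores(X, stages, loadings)
    A = np.hstack([np.ones((T_all.shape[0], 1)), T_all])
    return A @ b
```

With `stages = [(all_columns, 3), (second_region_columns, 5)]`, where the second index set is a subset of the first, the wavelengths dropped before the second iteration are modeled with three factors and the remaining ones with eight in total, as in the exemplary embodiment.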
  • Table 1 below provides a MATLAB program implementing the invented iterative, combinative PCR algorithm.
  • % Pre a Matlab file named PCR_Noise_xx that contains the variables % X_cali matrix of calibration spectra in column format
  • % column 1 counter (related to varying noise, matrix size, or
  • % other parameters such as size of the matrix may be varied by
  • T2 = X_pred' * W; % generate new score matrix (for prediction) based upon prediction spectra
  • SEC = generate_SE_c(Y_cali_ref, Y_cali);
  • SEP = generate_SE_c(Y_pred_ref, Y_pred);
  • T_all_p = [ones(m,1) T_all_p]; % 1's to allow nonzero offset
  • An alternative embodiment of the modified PCR algorithm applies PCR with a set number of factors to all spectral regions requiring that number of factors.
  • A separate PCR model with a second number of factors may be applied to individual wavelengths or spectral regions requiring that number of factors.
  • The process may be repeated such that one to n factors are applied for each spectral region or wavelength.
  • Scores may be concatenated as above with calibration coefficients being generated and predictions being performed as in steps 8 and 9.
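This alternative embodiment might look as follows in Python with NumPy. It is a sketch only: the function names, the `regions` argument, the uncentered SVD, and the least-squares regression with an offset column are assumptions introduced here. Each region gets its own decomposition with its own factor count, and only the scores are concatenated.

```python
import numpy as np

def _design(X, regions, loadings):
    # Concatenate per-region scores and prepend a column of ones (offset).
    X = np.asarray(X, dtype=float)
    blocks = [X[:, cols] @ V for (cols, _), V in zip(regions, loadings)]
    return np.hstack([np.ones((X.shape[0], 1))] + blocks)

def fit_regionwise_pcr(X, y, regions):
    """regions: list of (column_indices, n_factors); widths may differ."""
    X = np.asarray(X, dtype=float)
    loadings = []
    for cols, n_factors in regions:
        _, _, vt = np.linalg.svd(X[:, cols], full_matrices=False)
        loadings.append(vt[:n_factors].T)  # loadings stay per region
    b, *_ = np.linalg.lstsq(_design(X, regions, loadings), y, rcond=None)
    return loadings, b

def predict_regionwise_pcr(X, regions, loadings, b):
    return _design(X, regions, loadings) @ b
```

Unlike the iterative form, no residual is passed between models here, so the regions can be fitted in any order.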
  • Computer-simulated near-IR spectral data sets of serum are utilized to demonstrate the feasibility of the combinative PCR algorithm described above.
  • Phantom Serum Spectra Generation: Near-IR absorbance spectra of water, albumin, triglycerides, cholesterol, glucose and urea, each at a concentration of 1 g/dL, at 37.0 °C and a 1 mm pathlength, were generated from spectra collected in transmission mode on a NICOLET 860 IR Spectrometer, supplied by Nicolet Instrument Corporation of Madison, WI, followed by multivariate curve resolution analysis. The pure component spectra were used to generate phantom serum spectra by adding the absorbances of the constituent components, where the concentration of each constituent was randomly selected from the concentration ranges in Table 2, below. Noise proportional to the resulting spectral absorbance at each wavelength was then added, with the standard deviation of the added noise being a percentage of the total absorbance, thus yielding spectra with increased noise levels at higher absorbance levels.
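The spectrum generation just described can be illustrated with a short NumPy sketch. The function name and arguments are hypothetical, and two details are assumptions not stated in the text: concentrations are drawn uniformly within each range, and the added noise is Gaussian. The pure component spectra and the concentration ranges are supplied by the caller.

```python
import numpy as np

def phantom_spectra(pure_spectra, conc_ranges, n_samples, noise_level, rng):
    """pure_spectra: components x wavelengths, absorbance per unit concentration.
    conc_ranges: components x 2 array of (low, high) concentrations.
    noise_level: noise standard deviation as a fraction of the absorbance."""
    conc_ranges = np.asarray(conc_ranges, dtype=float)
    low, high = conc_ranges[:, 0], conc_ranges[:, 1]
    # Assumed: uniform sampling within each component's range.
    conc = rng.uniform(low, high, size=(n_samples, len(low)))
    # Absorbances of the constituents add linearly.
    clean = conc @ np.asarray(pure_spectra, dtype=float)
    # Assumed Gaussian noise; its spread scales with the absorbance itself,
    # so strongly absorbing wavelengths are the noisiest.
    noisy = clean + rng.standard_normal(clean.shape) * noise_level * clean
    return noisy, conc
```

Calling this once per noise level, with `noise_level` stepped from 0.0002 to 0.006, would give the kind of prediction sets used in the study.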
  • Prediction spectra consisted of 50 sets of 80 independent spectra at each of 30 noise levels, where one standard deviation of the noise varied from 0.0002 to 0.006 times the absorbance.
  • The SEPs reported for noise levels from 0.0002 to 0.006 are the mean SEPs of the 50 independent prediction sets at each of the noise levels.
  • The conventional PCR algorithm was utilized to analyze the generated spectral data sets.
  • Wavelength optimization through wavelength selection was performed on the calibration and monitoring data sets, with the standard error of the monitoring data set used as the metric for optimization.
  • Wavelength optimization for the standard PCR algorithm resulted in the spectral ranges of 1100 to 1862 nm and 1978 to 2218 nm being selected, which, as would be expected, corresponds to removal of the high water absorbance regions about 1900 and 2500 nm.
  • Wavelength optimization was again performed on the calibration and monitoring data sets, with the standard error of prediction of the monitoring data set used as the metric for optimization.
  • Wavelength optimization was performed with each PCR iteration.
  • A total of three PCR factors were utilized with the spectral regions 1100 to 1886 nm and 1980 to 2378 nm.
  • Standard PCR utilized a long-wavelength cutoff of 2218 nm that excluded the sharply increasing water absorbance band that leads to higher noise levels, while the first iteration of the modified PCR algorithm had a long-wavelength cutoff at 2378 nm that includes more of this high-noise region.
  • Traditional PCR removed a larger region about the 1950 nm water absorbance band compared to the first iteration of the modified PCR algorithm.
  • The modified PCR algorithm is able to incorporate these noisy regions, since only a three-factor model is utilized in these regions.
  • The square of the resulting mean residual spectrum is plotted in Figure 3.
  • The large square of the residual serves as one basis for wavelength selection.
  • The loadings provide a second basis for wavelength selection.
  • Figure 4 presents the three spectral loadings generated in the first iteration of PCR.
  • The first loading 41 is dominated by water, while the second spectral loading 42 shows considerable structure in the combination band region corresponding to protein.
  • The third loading 43 shows considerable noise about 1950 and 2350 nm.
  • The second PCR iteration utilized an additional four factors, yielding a total of seven factors for the remaining spectral regions.
  • The remaining spectral region continues to include the glucose absorbance band located at 2272 nm that was excluded from the traditional PCR algorithm.
  • The standard PCR algorithm generated its optimal prediction abilities based largely upon the three glucose absorbance bands in the first overtone region from 1500 to 1800 nm.
  • The first overtone includes smaller and broader absorption features that require additional factors to be properly modeled.
  • The large number of factors required for this region, plus the limitation of a fixed number of factors for all wavelengths imposed by conventional PCR, necessitated the removal of some of the glucose-containing information in the combination band region, to avoid later factors adding undue amounts of noise into the model from that region.
  • The calibration exemplifies several well-known phenomena. It is generally known that signal, noise and pathlength considerations often dictate that the signal-rich combination band spectral region from 2000 to 2500 nm be excluded from multivariate analyses that include the first overtone region from 1450 to 1900 nm. Many additional factors are required to fully model the smaller and more overlapped analytical signals in the first overtone region, with the result that the combination band spectral region from 2000 to 2500 nm would be over-modeled were it to be included. Such limitation is dictated by traditional multivariate methods that require a single number of factors for the entire spectral region being analyzed. Thus, the inventive PCR algorithm allows signal in a noisy region such as the combination band to be analyzed at the same time as signal in a more complex region like the first overtone. As the following discussion reveals, applying the inventive algorithm leads to smaller standard errors of prediction (SEP).
  • SEP: standard error of prediction
  • Prediction: Using the calibrations developed with the conventional PCR algorithm and the inventive PCR algorithm, prediction results were obtained on the generated prediction sets. As previously indicated, in the prediction data sets, 50 sets of 80 spectra were generated at each of 30 noise levels from 0.0002 to 0.006 times the absorbance level. The mean SEP for the original PCR 51 and modified PCR algorithm 52 are presented in Figure 5 for each of the noise levels tested. In all cases, the mean SEP of the modified PCR algorithm was lower than that of the standard PCR algorithm. In comparing the traditional PCR results to the modified PCR results, it is evident that the percent relative increase in error by the traditional PCR algorithm varied from 17 to 45% (Figure 6).
  • Figures 5 and 6 clearly show that the inventive PCR algorithm resulted in smaller prediction errors compared to the conventional PCR algorithm.
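The two quantities compared in Figures 5 and 6 can be written down in a few lines of Python. This is a hedged sketch: the source does not show the body of `generate_SE_c`, so the root-mean-square form below is an assumption (other conventions divide by n - 1 or correct for bias), and taking the modified algorithm's SEP as the reference for the relative increase is likewise assumed. Both function names are illustrative.

```python
import math

def standard_error(reference, predicted):
    # Root mean squared difference between reference and predicted values.
    if len(reference) == 0 or len(reference) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(total / len(reference))

def percent_relative_increase(sep_standard, sep_modified):
    # Error of the standard approach relative to the modified one, in percent.
    return 100.0 * (sep_standard - sep_modified) / sep_modified
```

For instance, a standard-PCR SEP 1.17 times the modified-PCR SEP corresponds to a 17% relative increase, the low end of the range reported above.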
  • the two PCR approaches produce similar results.
  • the two PCR approaches perform similarly.
  • The observed SEP versus the number of factors used to model the system classically decreases with initial factors to a local minimum and then increases slowly with additional factors. This is because the calibration model captures the analytical signal with early factors and increasingly captures the spectral noise with later factors, to the point of over-modeling the system.
  • The observed SEP may be broken up into two components according to equation 2:
  • SEP is the standard error of prediction,
  • SE_sig is the standard error of the signal, and
  • SE_uncert is the standard error due to the modeled uncertainty.
  • If no noise or uncorrelated components of the sample were present in the model or in subsequent prediction spectra, the standard error of the modeled signal would continue to decrease with an increasing number of factors until the system was completely modeled, as shown by the dashed and dotted line 70 in Figure 7. However, in generation of the calibration model, both signal and uncertainty (noise) are modeled into the system. Additional factors increase the uncertainty modeled into the system. This results in the classical decrease in the SEP with early factors as the signal is modeled, a local minimum in the SEP, and finally an increase in the SEP as the system is over-modeled with additional factors.
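The SEP-versus-factors behavior described here is what makes the factor count tunable against a monitoring set. The NumPy sketch below traces that curve for ordinary single-region PCR and picks its minimum. It is illustrative only: the function names, the uncentered SVD, the offset column, and the root-mean-square error are assumptions made here, not details given in the source.

```python
import numpy as np

def sep_by_factor_count(X_cal, y_cal, X_mon, y_mon, max_factors):
    """Monitoring-set standard error for 1..max_factors PCR factors."""
    _, _, vt = np.linalg.svd(X_cal, full_matrices=False)
    seps = []
    for n in range(1, max_factors + 1):
        V = vt[:n].T  # first n loadings from the calibration spectra
        A_cal = np.hstack([np.ones((X_cal.shape[0], 1)), X_cal @ V])
        b, *_ = np.linalg.lstsq(A_cal, y_cal, rcond=None)
        A_mon = np.hstack([np.ones((X_mon.shape[0], 1)), X_mon @ V])
        err = A_mon @ b - y_mon
        seps.append(float(np.sqrt(np.mean(err ** 2))))
    return seps

def best_factor_count(seps):
    # Factor count (1-based) at the minimum of the SEP curve.
    return int(np.argmin(seps)) + 1
```

On noisy data the returned list typically falls, bottoms out, and then creeps upward as later factors begin to fit noise; running the same search separately on each spectral region is one way to arrive at region-specific factor counts.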
  • The invention finds particular utility in various spectroscopy applications; for example, predicting concentration of analytes such as glucose from noninvasive near-IR spectra performed on live subjects. While the invention has been described herein with respect to near-IR spectroscopy, the methods of the invention are equally applicable to data matrices of any kind.
  • Spectroscopic techniques may include UV/VIS/NIR/IR as well as techniques such as AA/NMR/MS.
  • The invention is not limited to spectroscopic techniques but may include chromatographic techniques such as GC/LC or combinations of chromatographic and spectroscopic techniques such as GC/MS or GC/IR. Additionally, the invention finds application in almost any field that relies on multivariate analysis techniques, the social sciences, for example.

Landscapes

  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biochemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a novel multivariate model for analyzing absorbance spectra that allows each wavelength or spectral region to be modeled with just enough factors to fully model the analytical signal without incorporating noise through the use of excess factors. Each wavelength or spectral region is modeled using its own number of factors, independently of other wavelengths or spectral regions. An iterative, combinative principal component regression (PCR) algorithm allows a different number of factors to be applied to different wavelengths. In one embodiment of the invention, a three-factor model is applied over a given spectral region. The residual of this three-factor model is calculated and used as the input for an additional five-factor model. Before the additional five-factor model is applied, some of the wavelengths are removed. This results in a three-factor model over the first region and an eight-factor model over the second region. This analysis of residuals may be repeated such that a one- to n-factor model may be applied to any given wavelength; that is, any number of factors may be used to model any given frequency or spectral region. A method of predicting the concentration of a target analyte from sample spectra applies a calibration developed using the inventive PCR algorithm to a matrix of sample spectra to generate a vector of predicted concentrations for the target analyte.
EP01952581A 2000-08-01 2001-07-09 Combinative multivariate calibration that enhances prediction ability through removal of over-modeled regions Withdrawn EP1305600A2 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US630201 2000-08-01
US09/630,201 US6871169B1 (en) 1997-08-14 2000-08-01 Combinative multivariate calibration that enhances prediction ability through removal of over-modeled regions
PCT/US2001/021703 WO2002010726A2 (fr) 2000-08-01 2001-07-09 Combinative multivariate calibration that enhances prediction ability through removal of over-modeled regions

Publications (1)

Publication Number Publication Date
EP1305600A2 (fr) 2003-05-02

Family

ID=24526210

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01952581A Withdrawn EP1305600A2 (fr) 2000-08-01 2001-07-09 Combinative multivariate calibration that enhances prediction ability through removal of over-modeled regions

Country Status (4)

Country Link
EP (1) EP1305600A2 (fr)
AU (1) AU2001273314A1 (fr)
TW (1) TW576918B (fr)
WO (1) WO2002010726A2 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
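CN102192889B (zh) * 2010-03-08 2012-11-21 上海富科思分析仪器有限公司 Correction method for ultraviolet-visible absorption spectra of a fiber-optic in-situ drug dissolution/release tester
CN105203498A (zh) * 2015-09-11 2015-12-30 天津工业大学 LASSO-based variable selection method for near-infrared spectra
CN113703282B (zh) * 2021-08-02 2022-09-06 联芯集成电路制造(厦门)有限公司 Photomask thermal expansion correction method
CN113607683B (zh) * 2021-08-09 2024-09-06 天津九光科技发展有限责任公司 Automatic modeling method for quantitative analysis of near-infrared spectra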

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DK39792D0 (da) * 1992-03-25 1992-03-25 Foss Electric As Method for determining a component
US5379764A (en) * 1992-12-09 1995-01-10 Diasense, Inc. Non-invasive determination of analyte concentration in body of mammals
US6040578A (en) * 1996-02-02 2000-03-21 Instrumentation Metrics, Inc. Method and apparatus for multi-spectral analysis of organic blood analytes in noninvasive infrared spectroscopy

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0210726A3 *

Also Published As

Publication number Publication date
AU2001273314A1 (en) 2002-02-13
WO2002010726A2 (fr) 2002-02-07
WO2002010726A3 (fr) 2002-04-25
TW576918B (en) 2004-02-21

Similar Documents

Publication Publication Date Title
US6871169B1 (en) Combinative multivariate calibration that enhances prediction ability through removal of over-modeled regions
US7620674B2 (en) Method and apparatus for enhanced estimation of an analyte property through multiple region transformation
CN107440684B (zh) Method and apparatus for predicting the concentration of an analyte
Pedersen et al. Near-infrared absorption and scattering separated by extended inverted signal correction (EISC): Analysis of near-infrared transmittance spectra of single wheat seeds
Roger et al. EPO–PLS external parameter orthogonalisation of PLS application to temperature-independent measurement of sugar content of intact fruits
US6711503B2 (en) Hybrid least squares multivariate spectral analysis methods
Kalivas Interrelationships of multivariate regression methods using eigenvector basis sets
Tan et al. Wavelet analysis applied to removing non‐constant, varying spectroscopic background in multivariate calibration
AU738441B2 (en) Method and apparatus for generating basis sets for use in spectroscopic analysis
US5459677A (en) Calibration transfer for analytical instruments
Chen et al. Simultaneous wavelength selection and outlier detection in multivariate regression of near-infrared spectra
Liu et al. Multi-spectrometer calibration transfer based on independent component analysis
Karstang et al. Multivariate prediction and background correction using local modeling and derivative spectroscopy
Gributs et al. Parsimonious calibration models for near-infrared spectroscopy using wavelets and scaling functions
Tan et al. Improvement of a standard-free method for near-infrared calibration transfer
EP1305600A2 (fr) Combinative multivariate calibration that enhances prediction ability through removal of over-modeled regions
Gemperline Developments in nonlinear multivariate calibration
Alrezj et al. Digital bandstop filtering in the quantitative analysis of glucose from near‐infrared and midinfrared spectra
Chen et al. A new hybrid strategy for constructing a robust calibration model for near-infrared spectral analysis
CN113795748A (zh) Method for configuring a spectrometry device
Malik et al. Support vector regression with digital band pass filtering for the quantitative analysis of near‐infrared spectra
Jouan-Rimbaud et al. Calibration line adjustment to facilitate the use of synthetic calibration samples in near-infrared spectrometric analysis of pharmaceutical production samples
Bärring et al. Optimizing meta-parameters in continuous piecewise direct standardization
CN114166764A (zh) Method and device for constructing a spectral feature model based on characteristic wavelength screening
Ni et al. The relationship between net analyte signal/preprocessing and orthogonal signal correction algorithms

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20030213

AK Designated contracting states

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: SENSYS MEDICAL, INC.

17Q First examination report despatched

Effective date: 20080630

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20081111