WO2005036198A1 - Diagnosis of prion diseases and classification of samples by mme and/or mlle - Google Patents

Info

Publication number
WO2005036198A1
WO2005036198A1 PCT/GB2004/004219 GB2004004219W WO2005036198A1 WO 2005036198 A1 WO2005036198 A1 WO 2005036198A1 GB 2004004219 W GB2004004219 W GB 2004004219W WO 2005036198 A1 WO2005036198 A1 WO 2005036198A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
diagnostic
spectral
nmr
predetermined condition
Prior art date
Application number
PCT/GB2004/004219
Other languages
English (en)
Inventor
Yulan Wang
Huiru TANG
John Christopher Lindon
Maurice John Sauer
Original Assignee
Imperial Innovations Limited
Secretary Of State For Environment, Food And Rural Affairs
Priority date
Filing date
Publication date
Application filed by Imperial Innovations Limited, Secretary Of State For Environment, Food And Rural Affairs
Publication of WO2005036198A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01R MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R33/00 Arrangements or instruments for measuring magnetic variables
    • G01R33/20 Arrangements or instruments for measuring magnetic variables involving magnetic resonance
    • G01R33/44 Arrangements or instruments for measuring magnetic variables involving magnetic resonance using nuclear magnetic resonance [NMR]
    • G01R33/46 NMR spectroscopy
    • G01R33/465 NMR spectroscopy applied to biological material, e.g. in vitro testing
    • G01R33/4625 Processing of acquired signals, e.g. elimination of phase errors, baseline fitting, chemometric analysis

Definitions

  • This invention pertains generally to the field of metabonomics and, more particularly, to chemometric methods for the analysis of chemical, biochemical, and biological data, for example, spectral data such as nuclear magnetic resonance (NMR) spectra, and to their applications, including, e.g., classification, diagnosis, and prognosis, particularly in the context of prion diseases, especially transmissible spongiform encephalopathies (TSEs) such as Creutzfeldt-Jakob Disease (CJD), Bovine Spongiform Encephalopathy (BSE), and, especially, scrapie.
  • Ranges are often expressed herein as from "about" one particular value, and/or to "about" another particular value. When such a range is expressed, another embodiment includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by the use of the antecedent "about," it will be understood that the particular value forms another embodiment.
  • Transmissible Spongiform Encephalopathies (TSEs) are prion diseases affecting animals and man which are caused by unconventional infectious agents.
  • The individual agents responsible for specific TSEs have not been completely characterised, although their overall composition appears to be very similar.
  • The hallmark of a TSE disease is misshapen protein molecules that clump together and accumulate in brain tissue. It is currently believed that these misshapen prion proteins have the ability to cause other proteins of the same type to also change shape.
  • Perhaps the most widely recognised spongiform encephalopathy of animals is scrapie, which is a fatal progressive neurological disorder of sheep. Scrapie has been reported in many countries and has been recognised in British sheep flocks for over two centuries, having been first recorded in 1732. It is notably absent in Australia and New Zealand. Scrapie affects sheep and goats naturally and can be transmitted experimentally to several animal species.
  • Bovine spongiform encephalopathy (also known as mad cow disease) was recognised as a potential member of the scrapie family of diseases as soon as it was discovered in the United Kingdom in November 1986.
  • BSE occurs in adult animals in both sexes, typically in four and five year olds. It is a neurological disease involving pronounced changes in mental state, abnormalities of posture and movement and of sensation. The clinical disease usually lasts for several weeks and it is characteristically progressive and fatal.
  • An additional concern is the possibility of BSE infection occurring in sheep.
  • The clinical signs of scrapie and experimental BSE in sheep are very similar, and there is a possibility that BSE in sheep may have occurred but been masked by scrapie.
  • Kuru, Creutzfeldt-Jakob Disease (CJD) and its rarer variant, Gerstmann-Sträussler-Scheinker (GSS) disease, are examples of spongiform encephalopathies found in humans. Like the animal disorders, they are progressive and universally fatal. Kuru at one time was common in certain tribes in New Guinea although not seen elsewhere. Transfer of infection was thought to occur through ceremonial handling and ingestion of human brains affected by the disease. As the disease is not vertically transmitted, the incidence of kuru has decreased dramatically since these practices ceased around 1956. Several cases do still occur each year in patients over thirty years of age; this is consistent with an incubation period of 30 years or more.
  • CJD, or more correctly sporadic CJD, is a rare disease of man which affects about one person per million each year and is prevalent worldwide. There is currently no single diagnostic test for CJD; the only way to confirm a diagnosis of CJD is by brain biopsy or autopsy.
  • FTIR: Fourier transform infrared
  • Biosystems can conveniently be viewed at several levels of bio-molecular organisation based on biochemistry, i.e., genetic and gene expression (genomic and transcriptomic), protein and signalling (proteomic) and metabolic control and regulation (metabonomic). There are also important cellular ionic regulation variations that relate to genetic, proteomic and metabolic activities, and systematic studies on these even at the cellular and sub-cellular level should also be investigated to complete the full description of the bio-molecular organisation of a bio-system.
  • Although genomic and proteomic methods may be useful aids, for example, in drug development, they do suffer from substantial limitations.
  • Metabonomics is conventionally defined as "the quantitative measurement of the multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification" (see, for example, Nicholson et al., 1999). This concept has arisen primarily from the application of ¹H NMR spectroscopy to study the metabolic composition of biofluids, cells, and tissues and from studies utilising pattern recognition (PR), expert systems and other chemoinformatic tools to interpret and classify complex NMR-generated metabolic data sets. Metabonomic methods have the potential, ultimately, to determine the entire dynamic metabolic make-up of an organism.
  • Each level of bio-molecular organisation requires a series of analytical bio-technologies appropriate to the recovery of the individual types of bio-molecular data.
  • Genomic, proteomic and metabonomic technologies by definition generate massive data sets which require appropriate multivariate statistical tools (chemometrics, bioinformatics) for data mining and to extract useful biological information.
  • These data exploration tools also allow the inter-relationships between multivariate data sets from the different technologies to be investigated; they facilitate dimension reduction and extraction of latent properties, and allow multidimensional visualisation.
  • A pathological condition or a xenobiotic may act at the pharmacological level only and hence may not affect gene regulation or expression directly.
  • Significant disease or toxicological effects may be completely unrelated to gene switching.
  • For example, exposure to ethanol in vivo may cause many changes in gene expression but none of these events explains drunkenness.
  • In such cases, genomic and proteomic methods are likely to be ineffective.
  • All disease or drug-induced pathophysiological perturbations result in disturbances in the ratios and concentrations, binding or fluxes of endogenous biochemicals, either by direct chemical reaction or by binding to key enzymes or nucleic acids that control metabolism. If these disturbances are of sufficient magnitude, effects will result which will affect the efficient functioning of the whole organism.
  • Metabolites in biofluids are in dynamic equilibrium with those inside cells and tissues and, consequently, abnormal cellular processes in tissues of the whole organism following a toxic insult or as a consequence of disease will be reflected in altered biofluid compositions.
  • Biofluids: fluids secreted, excreted, or otherwise derived from an organism.
  • Biofluids provide a unique window into an organism's biochemical status since the composition of a given biofluid is a consequence of the function of the cells that are intimately concerned with the fluid's manufacture and secretion.
  • The composition of a particular fluid (e.g., urine, blood plasma, milk, etc.)
  • The composition and condition of an organism's tissues are also indicators of the organism's biochemical status.
  • A xenobiotic is a substance (e.g., compound, composition) which is administered to an organism, or to which the organism is exposed.
  • Xenobiotics are chemical, biochemical or biological species (e.g., compounds) which are not normally present in that organism, or are normally present in that organism, but not at the level obtained following administration/exposure.
  • Examples of xenobiotics include drugs, formulated medicines and their components (e.g., vaccines, immunological stimulants, inert carrier vehicles), infectious agents, pesticides, herbicides, substances present in foods (e.g., plant compounds administered to animals), and substances present in the environment.
  • A disease state pertains to a deviation from the normal healthy state of the organism.
  • Disease states include, but are not limited to, bacterial, viral, and parasitic infections; cancer in all its forms; degenerative diseases (e.g., arthritis, multiple sclerosis); trauma (e.g., as a result of injury); organ failure (including diabetes); cardiovascular disease (e.g., atherosclerosis, thrombosis); and inherited diseases caused by genetic composition (e.g., sickle-cell anaemia).
  • A genetic modification pertains to alteration of the genetic composition of an organism.
  • Examples of genetic modifications include, but are not limited to: the incorporation of a gene or genes into an organism from another species; increasing the number of copies of an existing gene or genes in an organism; removal of a gene or genes from an organism; and rendering a gene or genes in an organism non-functional.
  • Biofluids often exhibit very subtle changes in metabolite profile in response to external stimuli. This is because the body's cellular systems attempt to maintain homeostasis (constancy of internal environment), for example, in the face of cytotoxic challenge. One means of achieving this is to modulate the composition of biofluids. Hence, even when cellular homeostasis is maintained, subtle responses to disease or toxicity are expressed in altered biofluid composition. However, dietary, diurnal and hormonal variations may also influence biofluid compositions, and it is clearly important to differentiate these effects if correct biochemical inferences are to be drawn from their analysis.
  • Metabonomics offers a number of distinct advantages (over genomics and proteomics) in a clinical setting: firstly, it can often be performed on standard preparations (e.g., of serum, plasma, urine, etc.), circumventing the need for specialist preparations of cellular RNA and protein required for genomics and proteomics, respectively. Secondly, many of the risk factors already identified (e.g., levels of various lipids in blood) are small molecule metabolites which will contribute to the metabonomic dataset.
  • NMR spectroscopy (see, for example, Nicholson et al., 1989).
  • Intact tissues have been successfully analysed using magic-angle-spinning ¹H NMR spectroscopy (see, for example, Moka et al., 1998; Tomlins et al., 1998).
  • The NMR spectrum of a biofluid provides a metabolic fingerprint or profile of the organism from which the biofluid was obtained, and this metabolic fingerprint or profile is characteristically changed by a disease, toxic process, or genetic modification.
  • NMR spectra may be collected for various states of an organism (e.g., pre-dose and various times post-dose, for one or more xenobiotics, separately or in combination; healthy (control) and diseased animal; unmodified (control) and genetically modified animal).
  • Each compound or class of compound produces characteristic changes in the concentrations and patterns of endogenous metabolites in biofluids that provide information on the sites and basic mechanisms of the toxic process.
  • NMR-based metabonomics over genomics or proteomics
  • Reanalysis of the same sample by ¹H NMR spectroscopy results in a typical coefficient of variation for the measurement of peak intensities in a spectrum of less than 5% across the whole range of peaks.
  • The value of each peak intensity will therefore typically lie in the range 0.95 to 1.05 of the true value.
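As a hedged illustration of the reproducibility figure quoted above, the following sketch computes the coefficient of variation (sample standard deviation divided by mean) for one peak measured repeatedly. The intensity values and the function name `coefficient_of_variation` are invented for illustration and are not data from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """Return the sample standard deviation divided by the mean."""
    return stdev(intensities) / mean(intensities)


# Hypothetical repeat measurements of one peak intensity (arbitrary units).
repeat_intensities = [100.0, 102.0, 98.0, 101.0, 99.0]
cv = coefficient_of_variation(repeat_intensities)
print(f"CV = {cv:.1%}")
```

A value below 0.05 for every peak would correspond to the "less than 5%" figure given in the text.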
  • The intrinsic accuracy of NMR provides a distinct advantage when applying pattern recognition techniques.
  • The multivariate nature of the NMR data means that classification of samples is possible using a combination of descriptors even when one descriptor is not sufficient, because of the inherently low analytical variation in the data. All biological fluids and tissues have their own characteristic physico-chemical properties, and these affect the types of NMR experiment that may be usefully employed.
  • One major advantage of using NMR spectroscopy to study complex biomixtures is that measurements can often be made with minimal sample preparation (usually with only the addition of 5-10% D₂O) and a detailed analytical profile can be obtained on the whole biological sample. Sample volumes are small, typically 0.3 to 0.5 mL for standard probes, and as low as 3 µL for microprobes. Acquisition of simple NMR spectra is rapid and efficient using flow-injection technology. It is usually necessary to suppress the water NMR resonance.
  • Biofluids are not chemically stable and for this reason care should be taken in their collection and storage. For example, cell lysis in erythrocytes can easily occur. If a substantial amount of D₂O has been added, then it is possible that certain ¹H NMR resonances will be lost by H/D exchange. Freeze-drying of biofluid samples also causes the loss of volatile components such as acetone. Biofluids are also very prone to microbiological contamination, especially fluids, such as urine, which are difficult to collect under sterile conditions. Many biofluids contain significant amounts of active enzymes, either normally or due to a disease state or organ damage, and these enzymes may alter the composition of the biofluid following sampling.
  • Samples should be stored deep frozen to minimise the effects of such contamination.
  • Sodium azide is usually added to urine at the collection point to act as an antimicrobial agent.
  • Metal ions and/or chelating agents (e.g., EDTA)
  • Endogenous metal ions (e.g., Ca²⁺, Mg²⁺ and Zn²⁺)
  • Chelating agents (e.g., free amino acids, especially glutamate, cysteine, histidine and aspartate; citrate)
  • The analytical problem usually involves the detection of "trace" amounts of analytes in a very complex matrix of potential interferences. It is, therefore, critical to choose a suitable analytical technique for the particular class of analyte of interest in the particular biomatrix which could be, for example, a biofluid or a tissue. High resolution NMR spectroscopy (in particular ¹H NMR) appears to be particularly appropriate.
  • The main advantages of using ¹H NMR spectroscopy in this area are the speed of the method (with spectra being obtained in 5 to 10 minutes), the requirement for minimal sample preparation, and the fact that it provides a non-selective detector for all metabolites in the biofluid regardless of their structural type, provided only that they are present above the detection limit of the NMR experiment and that they contain non-exchangeable hydrogen atoms.
  • The speed advantage is of crucial importance in this area of work, as the clinical condition of a patient may require rapid diagnosis and can change very rapidly, so that correspondingly rapid changes must be made to the therapy provided.
  • NMR studies of body fluids should ideally be performed at the highest magnetic field available to obtain maximal dispersion and sensitivity, and most ¹H NMR studies have been performed at 400 MHz or greater.
  • As the magnetic field strength increases, the number of resonances that can be resolved in a biofluid increases and, although this has the effect of solving some assignment problems, it also poses new ones.
  • There are still important problems of spectral interpretation that arise due to compartmentation and binding of small molecules in the organised macromolecular domains that exist in some biofluids such as blood plasma and bile. All this complexity need not reduce the diagnostic capabilities and potential of the technique, but demonstrates the problems of biological variation and the influence of variation on diagnostic certainty.
  • NMR spectra of urine are identifiably altered in situations where damage has occurred to the kidney or liver. It has been shown that specific and identifiable changes can be observed which distinguish the organ that is the site of a toxic lesion. It is also possible to focus on particular parts of an organ, such as the cortex of the kidney, and even, in favourable cases, on very localised parts of the cortex.
  • Pattern recognition (PR) methods can be used to reduce the complexity of data sets, to generate scientific hypotheses and to test hypotheses.
  • Pattern recognition methods have been used widely to characterise many different types of problem ranging for example over linguistics, fingerprinting, chemistry and psychology. In the context of the methods described herein, pattern recognition is the use of multivariate statistics, both parametric and non-parametric, to analyse spectroscopic data, and hence to classify samples and to predict the value of some dependent variable based on a range of observed measurements. There are two main approaches.
  • One set of methods is termed "unsupervised"; these simply reduce data complexity in a rational way and also produce display plots which can be interpreted by the human eye.
  • The other approach is termed "supervised", whereby a training set of samples with known class or outcome is used to produce a mathematical model, which is then evaluated with independent validation data sets.
  • Unsupervised PR methods are used to analyse data without reference to any other independent knowledge, for example, without regard to the identity or nature of a xenobiotic or its mode of action.
  • Examples of unsupervised pattern recognition methods include principal component analysis (PCA), hierarchical cluster analysis (HCA), and nonlinear mapping (NLM).
  • Principal components are new variables created from linear combinations of the starting variables with appropriate weighting coefficients.
  • The properties of these PCs are such that: (i) each PC is orthogonal to (uncorrelated with) all other PCs, and (ii) the first PC contains the largest part of the variance of the data set (information content) with subsequent PCs containing correspondingly smaller amounts of variance.
  • PCA, a dimension reduction technique, takes m objects or samples, each described by values in K dimensions (descriptor vectors), and extracts a set of eigenvectors, which are linear combinations of the descriptor vectors.
  • The eigenvectors and eigenvalues are obtained by diagonalisation of the covariance matrix of the data.
  • The eigenvectors can be thought of as a new set of orthogonal plotting axes, called principal components (PCs).
  • The extraction of the systematic variations in the data is accomplished by projection and modelling of the variance and covariance structure of the data matrix.
  • The primary axis is a single eigenvector describing the largest variation in the data, and is termed principal component one (PC1).
  • Subsequent PCs, ranked by decreasing eigenvalue, describe successively less variability.
  • The variation in the data that has not been described by the PCs is called residual variance and signifies how well the model fits the data.
  • The projections of the descriptor vectors onto the PCs are defined as scores, which reveal the relationships between the samples or objects.
  • One graphical representation is the "scores plot" or eigenvector projection, in which objects or samples having similar descriptor vectors will group together in clusters.
  • Another graphical representation is called a loadings plot, and this connects the PCs to the individual descriptor vectors, and displays both the importance of each descriptor vector to the interpretation of a PC and the relationship among descriptor vectors in that PC.
  • A loading value is simply the cosine of the angle which the original descriptor vector makes with the PC. Descriptor vectors which fall close to the origin in this plot carry little information in the PC, while descriptor vectors distant from the origin (high loading) are important in interpretation.
  • A plot of the first two or three PC scores gives the "best" representation, in terms of information content, of the data set in two or three dimensions, respectively.
  • A plot of the first two principal component scores, PC1 and PC2, provides the maximum information content of the data in two dimensions.
  • Such PC maps can be used to visualise inherent clustering behaviour, for example, for drugs and toxins based on similarity of their metabonomic responses and hence mechanism of action. Of course, the clustering information might be in lower PCs and these have also to be examined.
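The PCA steps described above (diagonalising the covariance matrix, ranking eigenvectors by decreasing eigenvalue, and projecting samples to obtain scores) can be sketched as follows. This is a minimal illustration that assumes NumPy is available; the synthetic two-group data set and the names `pca`, `spectra`, `scores`, `loadings` and `explained` are invented here and do not come from the study.

```python
import numpy as np


def pca(data, n_components=2):
    """PCA by diagonalising the covariance matrix of mean-centred data."""
    centred = data - data.mean(axis=0)
    covariance = np.cov(centred, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # ascending order
    order = np.argsort(eigenvalues)[::-1]                   # decreasing eigenvalue
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    loadings = eigenvectors[:, :n_components]   # K descriptors x components
    scores = centred @ loadings                 # m samples x components
    explained = eigenvalues[:n_components] / eigenvalues.sum()
    return scores, loadings, explained


# Hypothetical data: 20 samples described by 5 descriptors, in two groups
# that differ in the first two descriptors only.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.1, size=(10, 5))
group_b = rng.normal(0.0, 0.1, size=(10, 5)) + np.array([1.0, 1.0, 0.0, 0.0, 0.0])
spectra = np.vstack([group_a, group_b])

scores, loadings, explained = pca(spectra)
print("fraction of variance in PC1, PC2:", explained)
```

Plotting `scores[:, 0]` against `scores[:, 1]` would give a scores plot, and plotting the rows of `loadings` would give the corresponding loadings plot.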
  • Hierarchical Cluster Analysis, another unsupervised pattern recognition method, permits the grouping of data points which are similar by virtue of being "near" to one another in some multidimensional space.
  • Individual data points may be, for example, the signal intensities for particular assigned peaks in an NMR spectrum.
  • The similarity matrix is scanned for the closest pair of points.
  • The pair of points are reported with their separation distance, and then the two points are deleted and replaced with a single combined point. The process is then repeated iteratively until only one point remains.
  • A number of different methods may be used to determine how two clusters will be joined, including the nearest neighbour method (also known as the single link method), the furthest neighbour method, and the centroid method (including centroid link, incremental link, median link, group average link, and flexible link variations).
  • The reported connectivities are then plotted as a dendrogram (a tree-like chart which allows visualisation of clustering), showing sample-sample connectivities versus increasing separation distance (or, equivalently, versus decreasing similarity).
  • The dendrogram has the property that the branch lengths are proportional to the distances between the various clusters, and hence the length of the branches linking one sample to the next is a measure of their similarity. In this way, similar data points may be identified algorithmically.
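The iterative merging procedure described above can be sketched in a few lines. This is a minimal illustration using the nearest neighbour (single link) rule only, chosen here for brevity; the sample coordinates and the function name `single_link_clustering` are hypothetical.

```python
from itertools import combinations
from math import dist


def single_link_clustering(points):
    """Agglomerative clustering with the nearest neighbour (single link) rule.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    which is the information a dendrogram displays.
    """
    def separation(pair):
        a, b = pair
        return min(dist(points[i], points[j]) for i in a for j in b)

    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Scan every pair of clusters for the closest pair.
        a, b = min(combinations(clusters, 2), key=separation)
        merges.append((a, b, separation((a, b))))
        # Delete the two clusters and replace them with one combined cluster.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical two-peak intensity vectors for five samples.
samples = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2), (10.0, 0.0)]
merge_history = single_link_clustering(samples)
for a, b, d in merge_history:
    print(a, b, round(d, 3))
```

Each tuple in `merge_history` corresponds to one join in the dendrogram, at a height equal to the reported separation distance.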
  • Non-linear mapping is a simple concept which involves calculation of the distances between all of the points in the original K dimensions. This is followed by construction of a map of points in 2 or 3 dimensions where the sample points are placed in random positions or at values determined by a prior principal components analysis. The least squares criterion is used to move the sample points in the lower dimension map to fit the inter-point distances in the lower dimension space to those in the K dimensional space. Non-linear mapping is therefore an approximation to the true inter-point distances, but points close in K-dimensional space should also be close in 2 or 3 dimensional space (see, for example, Brown et al., 1996; Farrant et al., 1992).
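A minimal sketch of the least squares fitting step follows, assuming NumPy is available. It minimises an unweighted sum of squared distance differences by plain gradient descent, which is a simplification chosen for illustration and not necessarily the criterion used in the cited references; the data, step size, iteration count and function names are all assumptions.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def stress(target, low):
    """Least squares misfit between map distances and original distances."""
    return ((pairwise_distances(low) - target) ** 2).sum() / 2.0  # each pair once


def nonlinear_map(data, start, n_iter=500, step=0.01):
    """Move the map points by gradient descent to reduce the stress."""
    target = pairwise_distances(data)
    low = start.copy()
    for _ in range(n_iter):
        diff = low[:, None, :] - low[None, :, :]
        d = np.sqrt((diff ** 2).sum(axis=-1))
        np.fill_diagonal(d, 1.0)          # avoid dividing by zero on the diagonal
        weights = (d - target) / d
        np.fill_diagonal(weights, 0.0)    # a point exerts no pull on itself
        gradient = 2.0 * (weights[:, :, None] * diff).sum(axis=1)
        low -= step * gradient
    return low


data = np.random.default_rng(1).normal(size=(8, 4))    # hypothetical K = 4 data
start = np.random.default_rng(0).normal(size=(8, 2))   # random starting map
mapped = nonlinear_map(data, start)
target = pairwise_distances(data)
print("stress before:", stress(target, start), "after:", stress(target, mapped))
```

Passing PCA scores as `start` instead of random positions would correspond to the alternative initialisation mentioned in the text.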
  • Supervised methods allow the quantitative description of the multivariate boundaries that characterise and separate each class, for example, each class of xenobiotic in terms of its metabolic effects. It is also possible to obtain confidence limits on any predictions, for example, a level of probability to be placed on the goodness of fit (see, for example, Kowalski et al., 1986). The robustness of the predictive models can also be checked using cross-validation, by leaving out selected samples from the analysis.
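Cross-validation by leaving out samples can be sketched as below. The example leaves out one sample at a time and uses a simple K-nearest neighbour classifier as a stand-in model; that choice, the toy data, the class labels and the function names are assumptions made for illustration, and any supervised method could be substituted.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Predict the majority class among the k nearest training samples."""
    nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Leave each sample out in turn, predict it, and report the hit rate."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical two-descriptor data for two classes of sample.
xs = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 0.8),
      (3.0, 3.1), (2.9, 3.0), (3.1, 2.9), (3.0, 3.2)]
ys = ["control"] * 4 + ["infected"] * 4
accuracy = leave_one_out_accuracy(xs, ys)
print("leave-one-out accuracy:", accuracy)
```

A model whose leave-one-out accuracy falls well below its accuracy on the training set is likely to be over-fitted.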
  • Expert systems may operate to generate a variety of useful outputs, for example, (i) classification of the sample as "normal" or "abnormal" (this is a useful tool in the control of spectrometer automation, e.g., using sequential flow injection NMR spectroscopy); (ii) classification of the target organ for toxicity and site of action within the tissue where, in certain cases, mechanism of toxic action may also be classified; and (iii) identification of the biomarkers of a pathological disease condition or toxic effect for the particular compound under study. For example, a sample can be classified as belonging to a single class of toxicity, to multiple classes of toxicity (more than one target organ), or to no class.
  • Examples of supervised pattern recognition methods include the following: soft independent modelling of class analysis (SIMCA) (see, for example, Wold, 1976); partial least squares analysis (PLS) (see, for example, Wold, 1966; Joreskog, 1982; Frank, 1984; Bro, R, 1997); linear discriminant analysis (LDA) (see, for example, Nillson, 1965); K-nearest neighbour analysis (KNN) (see, for example, Brown et al., 1996); artificial neural networks (ANN) (see, for example, Wasserman, 1989; Anker et al.,
  • PNNs: probabilistic neural networks
  • RI: rule induction
  • Bayesian methods (see, for example, Bretthorst, 1990a, 1990b, 1988).
  • Pattern recognition methods have been applied to the analysis of metabonomic data.
  • One aspect of the present invention pertains to a method of classifying a sample/ classifying a subject/diagnosing a subject, as described herein (see, e.g., claims 1-76).
  • One aspect of the present invention pertains to a method of identifying a diagnostic species, or a combination of a plurality of diagnostic species, for a predetermined condition associated with a prion disease, as described herein (see, e.g., claims 77-100).
  • One aspect of the present invention pertains to a (novel) diagnostic species, for a predetermined condition associated with a prion disease, identified by a method as described herein (see, e.g., claim 101).
  • One aspect of the present invention pertains to a diagnostic species, or a combination of a plurality of diagnostic species, for a predetermined condition associated with a prion disease, e.g., identified by a method as described herein, for use in a method of classification (see, e.g., claim 102).
  • One aspect of the present invention pertains to a method of classification which relies upon (or employs) a diagnostic species, or a combination of a plurality of diagnostic species, for a predetermined condition associated with a prion disease, e.g., identified by a method as described herein (see, e.g., claim 103).
  • One aspect of the present invention pertains to use of a diagnostic species, or a combination of a plurality of diagnostic species, for a predetermined condition associated with a prion disease, e.g., identified by a method of classification as described herein, e.g., in a method of classification (see, e.g., claim 104).
  • One aspect of the present invention pertains to an assay for use in a method of classification, which assay relies upon a diagnostic species, or a combination of a plurality of diagnostic species, for a predetermined condition associated with a prion disease, e.g., identified by a method as described herein (see, e.g., claim 105).
  • One aspect of the present invention pertains to use of an assay in a method of classification, which assay relies upon a diagnostic species, or a combination of a plurality of diagnostic species, for a predetermined condition associated with a prion disease, e.g., identified by a method as described herein (see, e.g., claim 106).
  • One aspect of the present invention pertains to a diagnostic species, or a combination of a plurality of diagnostic species, selected from lactate, glucose, glycoproteins, glycerophosphorylcholine (GPC), trimethylamine-N-oxide (TMAO), and alanine for use in a method of classification (e.g., diagnosis of a prion disease) (see, e.g., claim 107).
  • One aspect of the present invention pertains to a method of classification (e.g., diagnosis of a prion disease) which relies upon a diagnostic species, or a combination of a plurality of diagnostic species, selected from lactate, glucose, glycoproteins, glycerophosphorylcholine (GPC), trimethylamine-N-oxide (TMAO), and alanine (see, e.g., claim 108).
  • One aspect of the present invention pertains to use of a diagnostic species, or a combination of a plurality of diagnostic species, selected from lactate, glucose, glycoproteins, glycerophosphorylcholine (GPC), trimethylamine-N-oxide (TMAO), and alanine, in a method of classification (e.g., diagnosis of a prion disease) (see, e.g., claim 109).
  • One aspect of the present invention pertains to an assay for use in a method of classification (e.g., diagnosis of a prion disease), which assay relies upon a diagnostic species, or a combination of a plurality of diagnostic species, selected from lactate, glucose, glycoproteins, glycerophosphorylcholine (GPC), trimethylamine-N-oxide (TMAO), and alanine (see, e.g., claim 110).
  • One aspect of the present invention pertains to use of an assay in a method of classification (e.g., diagnosis of a prion disease), which assay relies upon a diagnostic species, or a combination of a plurality of diagnostic species, selected from lactate, glucose, glycoproteins, glycerophosphorylcholine (GPC), trimethylamine-N-oxide (TMAO), and alanine (see, e.g., claim 111).
  • Figure 1 is a PCA scores plot (component 1 vs. component 2) for the study described in the Examples. Controls (circles, •), pre-clinical manifestation (open squares, □), post-clinical manifestation (filled squares, ■).
  • Figure 2 is the corresponding PCA loadings plot (component 1 v. component 2).
  • Figure 3 is a PCA scores plot (component 2 vs. component 3) for the study described in the Examples. Controls (circles, •), pre-clinical manifestation (open squares, □), post-clinical manifestation (filled squares, ■).
  • Figure 4 is the corresponding PCA loadings plot (component 2 v. component 3).
  • Figure 5 is a PLS-DA scores plot (component 2 v. component 3) for the study described in the Examples. Controls (filled circles, •), pre-clinical manifestation (open circles, ○), post-clinical manifestation (open squares, □).
  • Figure 6 is the corresponding PLS-DA loadings plot (component 2 v. component 3).
  • Figure 7 is the corresponding PLS-DA variable importance plot (VIP).
  • Figure 8 shows a typical ¹H CPMG NMR spectrum of sheep serum with some peak assignments marked.
  • Figure 9 is a PCA scores plot (component 1 vs. component 2) of the CPMG NMR serum spectral data from control and dose-challenged animals: controls (filled squares, ■); dose-challenged in June and sampled in August (open squares, □); dose-challenged in June and sampled in October (filled diamonds, ◆); dose-challenged in June and sampled in December (open circles, ○); dose-challenged in June and sampled in January (stars, *).
  • NMR or MS spectrum provides a fingerprint or profile for the sample to which it pertains.
  • Such spectra represent a measure of all NMR/MS detectable species present in the sample (rather than a select few) and also, to some extent, interactions between these species. As such, these spectra are characterised by a high data density which, heretofore, has not been fully exploited.
  • test data e.g., test spectra (and therefore the associated samples and subjects, if applicable) according to one or more distinguishing criteria, at a discrimination level never before achieved.
  • these methods facilitate the identification of the particular combination of amounts of (e.g., endogenous) species which are invariably associated with the presence of the condition.
  • These combinations (patterns) which typically comprise many (often small) uncorrelated variances which together are diagnostic, are encoded within the high data density of the NMR/MS spectra. The methods described herein permit their identification and subsequent use for classification.
  • a part of that variance may be associated with a given molecule (a biomarker), the level of which varies consistently as a result of the condition under study.
  • the remainder of the variance may be due to differences in the levels of other molecules which give peaks in that integral region but which are unrelated to the condition under study (e.g., individual to individual differences such as dietary factors, age, gender, etc.).
  • the methods described herein which employ pattern recognition techniques, permit identification of that NMR peak intensity which is related to the condition under study, even though only a small part of the variance in a spectral region (bucket) may be related to the condition under study.
  • the identification power is enhanced by the application of data filtering techniques (e.g., orthogonal signal correction, OSC) which can lower the influence of buckets with variance unrelated to the condition of interest.
  • one aspect of the present invention pertains to improved methods for the analysis of chemical, biochemical, and biological data, for example spectra, for example, nuclear magnetic resonance (NMR), mass spectra (MS), and other types of spectra.
  • TSEs transmissible spongiform encephalopathies
  • BSE bovine spongiform encephalopathy
  • CJD Creutzfeldt-Jakob Disease
  • GSS Gerstmann-Straussler-Scheinker
  • metabonomic analysis can distinguish between individuals with and without scrapie.
  • Novel diagnostic biomarkers for scrapie have been identified, and associated methods for diagnosis have been described.
  • One aspect of the present invention pertains to a method of classifying a sample, as described herein.
  • One aspect of the present invention pertains to a method of classifying a subject by classifying a sample from said subject, wherein said method of classifying a sample is as described herein.
  • One aspect of the present invention pertains to a method of diagnosing a subject by classifying a sample from said subject, wherein said method of classifying a sample is as described herein.
  • One aspect of the present invention pertains to a method of classifying a sample, said method comprising the step of relating NMR spectral intensity at one or more predetermined diagnostic spectral windows (e.g., a predetermined diagnostic spectral window, or a combination of a plurality of predetermined diagnostic spectral windows) for said sample with a predetermined condition associated with a prion disease.
  • the sample is: a sample from a subject
  • the predetermined condition is: a predetermined condition of said subject.
  • the relating with a predetermined condition is: relating with the presence or absence of a predetermined condition.
  • the relating of NMR spectral intensity is: relating a modulation of NMR spectral intensity, relative to a control value.
  • the method is a method of classifying a sample from a subject, said method comprising the step of relating a modulation of NMR spectral intensity, relative to a control value, at one or more predetermined diagnostic spectral windows for said sample with the presence or absence of a predetermined condition associated with a prion disease.
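By way of illustration only, the step of relating a modulation of NMR spectral intensity, relative to a control value, at a spectral window might be sketched as follows. The window limits, the toy spectra, the 20% threshold, and all function names are hypothetical and are not taken from the specification; the sketch sums the intensity inside one chemical-shift window and expresses it as a fractional change from a control value.

```python
def window_intensity(ppm, intensity, lo, hi):
    """Sum the intensity of all points whose chemical shift lies in [lo, hi] ppm."""
    return sum(y for x, y in zip(ppm, intensity) if lo <= x <= hi)


def modulation(sample_value, control_value):
    """Fractional change of a sample value relative to a control value."""
    return (sample_value - control_value) / control_value


def classify(sample_value, control_value, threshold=0.2):
    """Hypothetical rule: report the condition as present when the window
    intensity differs from the control by more than `threshold` (assumed)."""
    change = modulation(sample_value, control_value)
    return "present" if abs(change) > threshold else "absent"


# Toy spectra (invented values) and a hypothetical window at 1.30-1.36 ppm.
ppm = [1.28, 1.30, 1.33, 1.36, 1.40]
test_spectrum = [0.5, 2.0, 4.0, 2.0, 0.5]
control_spectrum = [0.5, 1.0, 2.0, 1.0, 0.5]

test_value = window_intensity(ppm, test_spectrum, 1.30, 1.36)
control_value = window_intensity(ppm, control_spectrum, 1.30, 1.36)
result = classify(test_value, control_value)
```

A single fixed threshold is used here only to keep the sketch short; where a combination of windows is used, the same window values would more naturally be passed to a multivariate model.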
  • One aspect of the present invention pertains to a method of classifying a subject, said method comprising the step of relating NMR spectral intensity at one or more predetermined diagnostic spectral windows for a sample from said subject with a predetermined condition associated with a prion disease of said subject.
  • the relating with a predetermined condition is: relating with the presence or absence of a predetermined condition.
  • the relating of NMR spectral intensity is: relating a modulation of NMR spectral intensity, relative to a control value.
  • the method is a method of classifying a subject, said method comprising the step of relating a modulation of NMR spectral intensity, relative to a control value, at one or more predetermined diagnostic spectral windows for a sample from said subject with the presence or absence of a predetermined condition associated with a prion disease of said subject.
  • One aspect of the present invention pertains to a method of diagnosing a predetermined condition associated with a prion disease of a subject, said method comprising the step of relating NMR spectral intensity at one or more predetermined diagnostic spectral windows for a sample from said subject with said predetermined condition of said subject.
  • the relating with said predetermined condition is relating with the presence or absence of said predetermined condition.
  • the relating of NMR spectral intensity is: relating a modulation of NMR spectral intensity, relative to a control value.
  • the method is a method of diagnosing a predetermined condition associated with a prion disease of a subject, said method comprising the step of relating a modulation of NMR spectral intensity, relative to a control value, at one or more predetermined diagnostic spectral windows for a sample from said subject with the presence or absence of said predetermined condition of said subject.
  • One aspect of the present invention pertains to a method of classifying a sample, said method comprising the step of relating the amount of, or relative amount of one or more diagnostic species (e.g., a diagnostic species, or a combination of a plurality of diagnostic species) present in said sample with a predetermined condition associated with a prion disease.
  • the sample is: a sample from a subject
  • the predetermined condition is: a predetermined condition of said subject.
  • the relating with a predetermined condition is: relating with the presence or absence of a predetermined condition.
  • the relating the amount of, or relative amount of one or more diagnostic species present in said sample is: relating a modulation of the amount of, or relative amount of one or more diagnostic species present in said sample, as compared to a control sample.
  • the method is a method of classifying a sample from a subject, said method comprising the step of relating a modulation of the amount of, or relative amount of one or more diagnostic species present in said sample, as compared to a control sample, with the presence or absence of a predetermined condition associated with a prion disease of said subject.
  • One aspect of the present invention pertains to a method of classifying a subject, said method comprising the step of relating the amount of, or relative amount of one or more diagnostic species present in a sample from said subject with a predetermined condition associated with a prion disease of said subject.
  • the relating with a predetermined condition is: relating with the presence or absence of a predetermined condition.
  • the relating the amount of, or relative amount of one or more diagnostic species present in a sample is: relating a modulation of the amount of, or relative amount of one or more diagnostic species present in a sample, as compared to a control sample.
  • the method is a method of classifying a subject, said method comprising the step of relating a modulation of the amount of, or relative amount of one or more diagnostic species present in a sample from said subject, as compared to a control sample, with the presence or absence of a predetermined condition associated with a prion disease of said subject.
  • Diagnosing a Subject By Amount of Diagnostic Species
  • One aspect of the present invention pertains to a method of diagnosing a predetermined condition associated with a prion disease of a subject, said method comprising the step of relating the amount of, or relative amount of one or more diagnostic species present in a sample from said subject with said predetermined condition of said subject. In one embodiment, the relating with said predetermined condition is: relating with the presence or absence of said predetermined condition.
  • the relating the amount of, or relative amount of one or more diagnostic species present in a sample is: relating a modulation of the amount of, or relative amount of one or more diagnostic species present in a sample, as compared to a control sample.
  • the method is a method of diagnosing a predetermined condition associated with a prion disease of a subject, said method comprising the step of relating a modulation of the amount of, or relative amount of one or more diagnostic species present in a sample from said subject, as compared to a control sample, with the presence or absence of said predetermined condition of said subject.
  • the step of "relating" involves the use of a predictive mathematical model; wherein the model is formed by applying a modelling method to modelling data; wherein the modelling data comprises a plurality of data sets for modelling samples of known class. A test sample/subject, represented by test data, is then classified, using the model, as being a member of one of the known classes.
  • the modelling data comprises at least one data set for each of a plurality of modelling samples; wherein said modelling samples define a class group consisting of a plurality of classes; wherein each of said modelling samples is of a known class selected from said class group.
  • a test sample/subject, represented by test data, is then classified, using the model, as being a member of one class selected from said class group.
  • said class group comprises classes associated with said predetermined condition (e.g., presence, absence, degree, etc.).
  • said class group comprises exactly two classes. In one embodiment, said class group comprises exactly two classes: presence of said predetermined condition; and absence of said predetermined condition.
  • replicate NMR spectra may be taken for each sample.
  • samples may be taken from each subject.
  • each of the subjects, and therefore each of the samples and each of the spectra are of known class (e.g., presence or absence of predetermined condition).
  • the modelling data might comprise 525 NMR spectra, that is: 5 replicate NMR spectra recorded for each of 3 serum samples taken from each of 20 sheep with scrapie, and each of 15 healthy control sheep.
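The arithmetic of this example (5 replicates × 3 samples × 35 animals = 525 spectra) can be checked with a short sketch that enumerates one labelled record per spectrum. The record layout and the identifier names are hypothetical and serve only to show how each spectrum inherits the known class of its subject.

```python
from itertools import product

REPLICATES = 5          # replicate NMR spectra per serum sample
SAMPLES_PER_ANIMAL = 3  # serum samples per sheep
N_SCRAPIE = 20          # sheep with scrapie
N_CONTROL = 15          # healthy control sheep

# Each animal carries a known class; names are illustrative only.
animals = [(f"scrapie-{i}", "presence") for i in range(1, N_SCRAPIE + 1)]
animals += [(f"control-{i}", "absence") for i in range(1, N_CONTROL + 1)]

# One record per spectrum: animal x sample x replicate.
records = [
    {"animal": name, "sample": s, "replicate": r, "class": label}
    for (name, label), s, r in product(
        animals, range(1, SAMPLES_PER_ANIMAL + 1), range(1, REPLICATES + 1)
    )
]

n_spectra = len(records)
n_presence = sum(1 for rec in records if rec["class"] == "presence")
n_absence = n_spectra - n_presence
```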
  • said modelling method is a multivariate statistical analysis modelling method.
  • said modelling method is a multivariate statistical analysis modelling method which employs a pattern recognition method.
  • said modelling method is, or employs PCA.
  • said modelling method is, or employs PLS.
  • said modelling method is, or employs PLS-DA.
  • said modelling method includes a step of data filtering.
  • said modelling method includes a step of orthogonal data filtering.
  • said modelling method includes a step of OSC.
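As a minimal sketch of one of the modelling methods named above (PCA), the first principal component of a small table of bucketed intensities can be obtained by mean-centring and power iteration. This is a simplified stand-in written for illustration, not the software or procedure used in the Examples; the data values are invented, the function names are hypothetical, and PLS-DA and OSC are not shown.

```python
import math


def mean_centre(rows):
    """Subtract the column mean from every variable (bucket)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def first_principal_component(rows, n_iter=200):
    """Return (loadings, scores) for PC1 using power iteration on X^T X."""
    x = mean_centre(rows)
    n_vars = len(x[0])
    p = [1.0 / math.sqrt(n_vars)] * n_vars  # arbitrary starting guess
    for _ in range(n_iter):
        t = [sum(a * b for a, b in zip(row, p)) for row in x]   # scores t = X p
        p = [sum(t[i] * x[i][j] for i in range(len(x)))          # p = X^T t
             for j in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]                                # unit length
    scores = [sum(a * b for a, b in zip(row, p)) for row in x]
    return p, scores


# Invented data: rows are samples, columns are three spectral buckets.
# The first three rows play the role of controls, the last three of cases.
data = [
    [1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [0.9, 1.9, 3.1],
    [2.0, 2.0, 3.0], [2.1, 2.1, 2.9], [1.9, 1.9, 3.1],
]
loadings, scores = first_principal_component(data)
```

In this toy case the two groups separate along the first component and the loading is dominated by the first bucket, which mirrors how scores and loadings plots are read together.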
  • said model takes account of one or more diagnostic species.
  • Modelling Data
  • modelling data e.g., modelling data sets
  • said modelling data comprise spectral data.
  • said modelling data comprise both spectral data and non-spectral data (and are referred to as "composite data").
  • said modelling data comprise NMR spectral data.
  • said NMR spectral data comprises ¹H NMR spectral data and/or ¹³C NMR spectral data.
  • said NMR spectral data comprises ¹H NMR spectral data.
  • said modelling data comprise both NMR spectral data and MS spectral data.
  • said modelling data comprise spectra.
  • said modelling data are spectra.
  • said modelling data comprise both NMR spectral data and non-spectral data.
  • said non-spectral data is non-spectral clinical data.
  • said modelling data comprises a plurality of data sets for modelling samples of known class.
  • said modelling data comprises at least one data set for each of a plurality of modelling samples. In one embodiment, said modelling data comprises exactly one data set for each of a plurality of modelling samples.
  • test sample/subject is represented by test data (e.g., a test data set or sets).
  • test data is as defined above for modelling data.
  • said test data comprise both NMR spectral data and MS spectral data.
  • said test data comprises a plurality of test data sets for a test sample (e.g., replicate spectra for a single test sample).
  • said test data comprises at least one data set for each of a plurality of test samples (e.g., replicate spectra for each of a number of test samples, each from the same test subject).
  • said test data comprises exactly one data set for each of a plurality of test samples (e.g., one spectrum for each of a number of test samples, each from the same test subject).
  • many aspects of the present invention pertain to methods of classifying things, for example, a sample, a subject, etc.
  • the thing is classified, that is, it is associated with an outcome, or, more specifically, it is assigned membership to a particular class (i.e., it is assigned class membership), and is said "to be of," "to belong to," "to be a member of," a particular class.
  • Classification is made (i.e., class membership is assigned) on the basis of diagnostic criteria.
  • the step of considering such diagnostic criteria, and assigning class membership is described by the word "relating," for example, in the phrase "relating NMR spectral intensity at one or more predetermined diagnostic spectral windows for said sample (i.e., diagnostic criteria) with the presence or absence of a predetermined condition (i.e., class membership)."
  • the presence of a predetermined condition is one class, and the absence of a predetermined condition is another class; in such cases, classification (i.e., assignment to one of these classes) is equivalent to diagnosis.
  • many methods of the present invention involve assigning class membership, for example, to one of one or more classes, for example, to one of the two classes: (i) presence of a predetermined condition, or (ii) absence of a predetermined condition.
  • a condition is "predetermined" in the sense that it is the condition in respect to which the invention is practised; a condition is predetermined by a step of selecting a condition for consideration, study, etc.
  • condition relates to a state which is, in at least one respect, distinct from the state of normality, as determined by a suitable control population.
  • the predetermined condition is a prion disease, such as a transmissible spongiform encephalopathy (TSE), such as, for example, scrapie, bovine spongiform encephalopathy (BSE), kuru, Creutzfeldt-Jakob Disease (CJD), and Gerstmann-Straussler-Scheinker (GSS) disease.
  • a method of diagnosis may be considered to be a method of prognosis.
  • the phrases "at risk of," "predisposition towards," and the like indicate a probability of being classified/diagnosed (or being able to be classified/diagnosed) with the predetermined condition which is greater (e.g., 1.5x, 2x, 5x, 10x, etc.) than for the corresponding control.
  • a time period e.g., within the next 5 years, 10 years, 20 years, etc.
  • a subject who is 2x more likely to be diagnosed with the predetermined condition within the next 5 years, as compared to a suitable control, is "at risk of" that condition.
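The "2x more likely" comparison can be expressed as a ratio of two probabilities of diagnosis over the same time period. The sketch below is illustrative only: the counts are invented, the function names are hypothetical, and the 1.5x default threshold is simply the smallest of the example multiples given above.

```python
def relative_risk(diagnosed_group, n_group, diagnosed_control, n_control):
    """Probability of diagnosis in a group divided by that in its control."""
    return (diagnosed_group / n_group) / (diagnosed_control / n_control)


def at_risk(ratio, threshold=1.5):
    """Hypothetical rule: 'at risk' when the ratio meets the chosen multiple."""
    return ratio >= threshold


# Invented counts: 10 of 100 diagnosed within 5 years versus 5 of 100 controls.
ratio = relative_risk(10, 100, 5, 100)
flag = at_risk(ratio)
```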
  • the degree of a condition for example, the progress or phase of a disease, or a recovery therefrom.
  • the degree of a condition may refer to how temporally advanced the condition is.
  • Another example of a degree of a condition relates to its maximum severity (e.g., a disease can be classified as mild, moderate, or severe).
  • Yet another example of a degree of a condition relates to the nature of the condition (e.g., anatomical site, extent of tissue involvement, etc.).
  • sample e.g., a particular sample under study ("study sample").
  • a sample may be in any suitable form.
  • the sample may be in any form which is compatible with the particular type of spectroscopy, and therefore may be, as appropriate, homogeneous or heterogeneous, comprising one or a combination of, for example, a gas, a liquid, a liquid crystal, a gel, and a solid.
  • Samples which originate from an organism may be in vivo; that is, not removed from or separated from the organism.
  • said sample is an in vivo sample.
  • the sample may be circulating blood, which is "probed" in situ, in vivo, for example, using NMR methods.
  • Samples which originate from an organism may be ex vivo; that is, removed from or separated from the organism (e.g., an ex vivo blood sample, an ex vivo urine sample).
  • said sample is an ex vivo sample.
  • said sample is an ex vivo blood or blood-derived sample. In one embodiment, said sample is an ex vivo blood sample. In one embodiment, said sample is an ex vivo plasma sample. In one embodiment, said sample is an ex vivo serum sample. In one embodiment, said sample is an ex vivo urine sample.
  • said sample is removed from or separated from an/said organism, and is not returned to said organism (e.g., an ex vivo blood sample, an ex vivo urine sample).
  • said sample is removed from or separated from an/said organism, and is returned to said organism (i.e., "in transit") (e.g., as with dialysis methods).
  • said sample is an ex vivo in transit sample.
  • samples include: a whole organism (living or dead, e.g., a living human); a part or parts of an organism (e.g., a tissue sample, an organ); a pathological tissue such as a tumour; a tissue homogenate (e.g., a liver microsome fraction); an extract prepared from an organism or a part of an organism (e.g., a tissue sample extract, such as perchloric acid extract); an in vitro tissue, such as a spheroid; a suspension of a particular cell type (e.g., hepatocytes); an excretion, secretion, or emission from an organism (especially a fluid); material which is administered and collected (e.g., dialysis fluid); material which develops as a function of pathology (e.g., a cyst, blisters); and, supernatant from a cell culture.
  • fluid samples include, for example, blood plasma, blood serum, whole blood, urine, (gall bladder) bile, cerebrospinal fluid, milk, saliva, mucus, nasal fluids, sweat, gastric juice, pancreatic juice, seminal fluid, prostatic fluid, seminal vesicle fluid, seminal plasma, amniotic fluid, foetal fluid, follicular fluid, synovial fluid, aqueous humour, ascites fluid, cystic fluid, blister fluid, and cell suspensions; and extracts thereof.
  • tissue samples include liver, kidney, prostate, brain, gut, blood, blood cells, skeletal muscle, heart muscle, lymphoid, bone, cartilage, and reproductive tissues.
  • blood sample pertains to a sample of whole blood.
  • blood-derived sample pertains to an ex vivo sample derived from the blood of the subject under study.
  • blood and blood-derived samples include, but are not limited to, whole blood (WB), blood plasma (including, e.g., fresh frozen plasma (FFP)), blood serum, blood fractions, plasma fractions, serum fractions, blood fractions comprising red blood cells (RBC), platelets (PLT), leukocytes, etc., and cell lysates including fractions thereof (for example, cells, such as red blood cells, white blood cells, etc., may be harvested and lysed to obtain a cell lysate).
  • blood and blood-derived samples e.g., plasma, serum
  • blood is collected from subjects using conventional techniques (e.g., from the ante-cubital fossa), typically pre-prandially.
  • the method used to prepare the blood fraction should be reproduced as carefully as possible from one subject to the next. It is important that the same or similar procedure be used for all subjects. It may be preferable to prepare serum (as opposed to plasma or other blood fractions) for two reasons: (a) the preparation of serum is more reproducible from individual to individual than the preparation of plasma, and (b) the preparation of plasma requires the addition of anticoagulants (e.g., EDTA, citrate, or heparin) which will be visible in the NMR metabonomic profile and may reduce the data density available.
  • a typical method for the preparation of serum suitable for analysis by the methods described herein is as follows: 10 mL of blood is drawn from the antecubital fossa of an individual who had fasted overnight, using an 18 gauge butterfly needle. The blood is immediately dispensed into a polypropylene tube and allowed to clot at room temperature for 3 hours. The clotted blood is then subjected to centrifugation (e.g., 4,500 x g for 5 minutes) and the serum supernatant removed to a clean tube. If necessary, the centrifugation step can be repeated to ensure the serum is efficiently separated from the clot. The serum supernatant may be analysed "fresh" or it may be stored frozen for later analysis.
  • a typical method for the preparation of plasma suitable for analysis by the methods described herein is as follows: high quality platelet-poor plasma is made by drawing the blood using a 19 gauge butterfly needle without the use of a tourniquet from the antecubital fossa. The first 2 mL of blood drawn is discarded and the remainder is rapidly mixed and aliquoted into Diatube H anticoagulant tubes (Becton Dickinson). After gentle mixing by inversion the anticoagulated blood is cooled on ice for 15 minutes then subjected to centrifugation to pellet the cells and platelets (approximately 1,200 x g for 15 minutes).
  • the platelet poor plasma supernatant is carefully removed, drawing off the middle third of the supernatant and discarding the upper third (which may contain floating platelets) and the lower third which is too close to the readily disturbed platelet layer on the top of the cell pellet.
  • the plasma may then be aliquoted and stored frozen at -20°C or colder, and then thawed when required for assay.
  • Samples may be analysed immediately ("fresh"), or may be frozen and stored (e.g., at -80°C) ("fresh frozen") for future analysis. If frozen, samples are completely thawed prior to NMR analysis.
  • said sample is a blood sample or a blood-derived sample.
  • said sample is a blood sample.
  • said sample is a blood plasma sample.
  • said sample is a blood serum sample.
  • urine refers to whole (or intact) urine, whether in vivo (e.g., foetal urine) or ex vivo, e.g., by excretion or catheterisation.
  • urine-derived sample pertains to an ex vivo sample derived from the urine of the subject under study (e.g., obtained by dilution, concentration, addition of additives, solvent- or solid-phase extraction, etc.). Analysis may be performed using, for example, fresh urine; urine which has been frozen and then thawed; urine which has been dried (e.g., freeze-dried) and then reconstituted, e.g., with water or D₂O.
  • Methods for the collection, handling, storage, and pre-analysis preparation of many classes of sample, especially biological samples (e.g., biofluids) are well known in the art. See, for example, Lindon et al., 1999.
  • said sample is a urine sample or a urine-derived sample. In one embodiment, said sample is a urine sample.
  • samples are, or originate from, or are drawn or derived from, an organism (e.g., subject, patient).
  • the organism may be as defined below.
  • the organism is an animal.
  • the organism is a mammal.
  • a placental mammal, a marsupial (e.g., kangaroo, wombat), a monotreme (e.g., duckbilled platypus), a rodent (e.g., a guinea pig, a hamster, a rat, a mouse), murine (e.g., a mouse), a lagomorph (e.g., a rabbit), avian (e.g., a bird), canine (e.g., a dog), feline (e.g., a cat), equine (e.g., a horse), porcine (e.g., a pig), ovine (e.g., a sheep), bovine (e.g., a cow), a primate, simian (e.g., a monkey or ape), a monkey (e.g., marmoset,
  • the organism may be in any of its forms of development, for example, a foetus.
  • the organism is a food animal.
  • the organism (e.g., subject, patient) is ovine (e.g., a sheep), bovine (e.g., a cow), or a human.
  • the organism (e.g., subject, patient) is ovine (e.g., a sheep).
  • the organism (e.g., subject, patient) is bovine (e.g., a cow).
  • the organism (e.g., subject, patient) is a human.
  • the subject may be characterised by one or more criteria, for example, sex, age (e.g., 40 years or more, etc.), ethnicity, medical history, lifestyle (e.g., smoker, non-smoker), hormonal status (e.g., pre-menopausal, post-menopausal), etc.
  • population refers to a group of organisms (e.g., subjects, patients). If desired, a population (e.g., of humans) may be selected according to one or more of the criteria listed above.
  • the principal nucleus studied in biomedical NMR spectroscopy is the proton or ¹H nucleus. This is the most sensitive of all naturally occurring nuclei.
  • the chemical shift range is about 10 ppm for organic molecules.
  • ¹³C NMR spectroscopy using either the naturally abundant 1.1% ¹³C nuclei or employing isotopic enrichment is useful for identifying metabolites.
  • the ¹³C chemical shift range is about 200 ppm.
  • Other nuclei find special application. These include ¹⁵N (in natural abundance or enriched), ¹⁹F for studies of drug metabolism, and ³¹P for studies of endogenous phosphate biochemistry either in vitro or in vivo.
  • the FID can be multiplied by a mathematical function to improve the signal-to-noise ratio or reduce the peak line widths. The expert operator has choice over such parameters.
  • the FID is then often filled by a number of zeros and then subjected to Fourier transformation. After this conversion from time-dependent data to frequency dependent data, it is necessary to phase the spectrum so that all peaks appear upright - this is done using two parameters by visual inspection on screen (now automatic routines are available with reasonable success). At this point the spectrum baseline can be curved. To remedy this, one defines points in the spectrum where no peaks appear and these are taken to be baseline.
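The first stages of the processing described above (multiplication of the FID by a mathematical function, zero filling, and Fourier transformation) can be sketched on a synthetic signal. Everything in this sketch is an assumption made for illustration: the FID is simulated, the window is a simple exponential with an arbitrary 1 Hz line broadening, and a naive discrete Fourier transform is used so that only the standard library is needed. Phasing and baseline correction are not shown.

```python
import cmath
import math


def apodize(fid, dwell, lb_hz):
    """Exponential window: multiply point k by exp(-pi * lb * t_k).
    This improves signal-to-noise at the cost of broader lines."""
    return [v * math.exp(-math.pi * lb_hz * k * dwell) for k, v in enumerate(fid)]


def zero_fill(fid, size):
    """Append zeros so that the FID has `size` points."""
    return list(fid) + [0j] * (size - len(fid))


def dft(signal):
    """Naive discrete Fourier transform (time domain to frequency domain)."""
    n = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


# Synthetic FID: a single resonance 100 Hz from the carrier, decaying with a
# 50 ms time constant, sampled every 1 ms (all values invented).
dwell = 0.001
fid = [
    cmath.exp(2j * cmath.pi * 100.0 * k * dwell) * math.exp(-k * dwell / 0.05)
    for k in range(64)
]

spectrum = dft(zero_fill(apodize(fid, dwell, lb_hz=1.0), 128))
magnitudes = [abs(v) for v in spectrum]
peak_index = magnitudes.index(max(magnitudes))
# Points in the first half of the transform map to positive frequencies.
peak_hz = peak_index / (len(spectrum) * dwell)
```

The magnitude spectrum is used here only to avoid the phasing step; in practice the phased real part would be displayed.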
  • An NMR spectrum consists of a series of digital data points with a y value (relating to signal strength) as a function of equally spaced x-values (frequency). These data point values run over the whole of the spectrum. Individual peaks in the spectrum are identified by the spectroscopist or automatically by software and the area under each peak is determined either by integration (summation of the y values of all points over the peak) or by curve fitting. A peak can be a single resonance or a multiplet of resonances corresponding to a single type of nucleus in a particular chemical environment (e.g., the two protons ortho to the carboxyl group in benzoic acid). Integration is also possible of the three dimensional peak volumes in 2-dimensional NMR spectra.
  • the intensity of a peak in an NMR spectrum is proportional to the number of nuclei giving rise to that peak (if the experiment is conducted under conditions where each successive accumulated free induction decay (FID) is taken starting at equilibrium). Also, the relative intensity of peaks from different analytes in the same sample is proportional to the concentration of that analyte (again if equilibrium prevails at the start of each scan).
  • NMR spectral intensity refers to some measure related to the NMR peak area, and may be absolute or relative.
  • NMR spectral intensity may be, for example, a combination of a plurality of NMR spectral intensities, e.g., a linear combination of a plurality of NMR spectral intensities.
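A minimal sketch of these three notions (peak area by summation, relative intensity, and a linear combination of intensities) is given below. The spectrum values, the peak limits, the peak names, and the weights are all invented for illustration and do not correspond to any assignment in the specification.

```python
def peak_area(intensity, first, last):
    """Integrate a peak by summing the y values of points first..last inclusive."""
    return sum(intensity[first:last + 1])


def relative_intensities(areas):
    """Express each area as a fraction of the total of all listed areas."""
    total = sum(areas.values())
    return {name: area / total for name, area in areas.items()}


def linear_combination(areas, weights):
    """Weighted sum of several spectral intensities."""
    return sum(weights[name] * areas[name] for name in weights)


# Invented digitised spectrum containing two peaks.
intensity = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 6.0, 2.0, 0.0]
areas = {
    "peak_a": peak_area(intensity, 1, 3),
    "peak_b": peak_area(intensity, 6, 8),
}
relative = relative_intensities(areas)
combined = linear_combination(areas, {"peak_a": 2.0, "peak_b": 1.0})
```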
  • NMR spectroscopic techniques can be classified according to the number of frequency axes and these include 1D-, 2D-, and 3D-NMR.
  • 1D spectra include, for example, single pulse; water-peak eliminated either by saturation or non-excitation; spin-echo, such as CPMG (i.e., edited on the basis of spin-spin relaxation); diffusion-edited; and selective excitation of specific spectral regions.
  • 2D spectra include, for example, J-resolved (JRES); ¹H-¹H correlation methods, such as NOESY, COSY, TOCSY and variants thereof; heteronuclear correlation including direct detection methods, such as HETCOR, and inverse-detected methods, such as ¹H-¹³C HMQC, HSQC, HMBC.
  • 3D spectra include many variants, all of which are combinations of 2D methods, e.g., HMQC-TOCSY.
  • NMR spectroscopic techniques can also be combined with magic-angle-spinning (MAS) in order to study samples other than isotropic liquids, such as tissues, which are characterised by anisotropic composition.
  • Preferred nuclei include ¹H and ¹³C.
  • Preferred techniques for use in the present invention include water-peak eliminated, spin-echo such as CPMG, diffusion edited, JRES, COSY, TOCSY, HMQC, HSQC, and HMBC.
  • the ¹H observation frequency is from about 200 MHz to about 900 MHz, more typically from about 400 MHz to about 900 MHz, yet more typically from about 500 MHz to about 750 MHz.
  • ¹H observation frequencies of 500 and 600 MHz may be particularly preferred. Instruments with the following ¹H observation frequencies are/were commercially available: 200, 250, 270 (discontinued), 300, 360 (discontinued), 400, 500, 600, 700, 750, 800, and 900 MHz.
  • NMR spectra can be measured in solid, liquid, liquid crystal or gas states over a range of temperatures from 120 K to 420 K and outside this range with specialised equipment.
  • NMR analysis of biofluids is performed in the liquid state with a sample temperature of from about 274 K to about 328 K, but more typically from about 283 K to about 321 K.
  • An example of a typical temperature is about 300 K.
  • biofluid samples are diluted with solvent prior to NMR analysis. This is done for a variety of reasons, including: to lessen solution viscosity, to control the pH of the solution, and to allow addition of reagents and reference materials.
  • An example of a typical dilution solvent is a solution of 0.9% by weight of sodium chloride in D₂O.
  • the D₂O lessens the overall concentration of H₂O and eases the technical requirements in the suppression of the solvent water NMR resonance, necessary for optimum detection of metabolite NMR signals.
  • the deuterium nuclei of the D₂O also provide an NMR signal for locking the magnetic field, enabling the exact co-registration of successive scans.
  • the dilution ratio is from about 1:50 to about 5:1 by volume, but more typically from about 1:20 to about 1:1 by volume.
  • An example of a typical dilution ratio is 3:7 by volume (e.g., 150 µL sample, 350 µL solvent), typical for conventional 5 mm NMR tubes and for flow-injection NMR spectroscopy.
  • Typical sample volumes for NMR analysis are from about 50 µL (e.g., for microprobes) to about 2 mL.
  • An example of a typical sample volume is about 500 µL.
  • NMR peak positions are measured relative to that of a known standard compound, usually added directly to the sample.
  • A commonly used standard is a partially deuterated form of TSP, 3-trimethylsilyl-[2,2,3,3-²H₄]-propionate sodium salt. For biofluids containing high levels of proteins, this substance is not suitable since it binds to proteins and shows a broadened NMR line.
  • Added formate anion (e.g., as a salt) can be used in such cases, as for blood plasma.
  • NMR spectra are typically acquired, and subsequently handled, in digitised form.
  • Conventional methods of spectral pre-processing of (digital) spectra are well known, and include, where applicable, signal averaging, Fourier transformation (and other transformation methods), phase correction, baseline correction, smoothing, and the like (see, for example, Lindon et al., 1980).
  • a typical ¹H NMR spectrum is recorded as signal intensity versus chemical shift (δ), which ranges from about δ 0 to δ 10.
  • the spectrum in digital form comprises about 10,000 to 100,000 data points.
  • it is often desirable to compress this data, for example, by a factor of about 10 to 100, to about 1000 data points.
  • the chemical shift axis, δ, is "segmented" into "buckets" or "bins" of a specific length.
  • For a 1-D ¹H NMR spectrum which spans the range from δ 0 to δ 10, using a bucket length, Δδ, of 0.04 yields 250 buckets, for example, δ 10.0-9.96, δ 9.96-9.92, δ 9.92-9.88, etc., usually reported by their midpoint, for example, δ 9.98, δ 9.94, δ 9.90, etc.
  • the signal intensity within a given bucket may be averaged or integrated, and the resulting value reported. In this way, a spectrum with, for example, 100,000 original data points can be compressed to an equivalent spectrum with, for example, 250 data points.
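The bucketing of a 1-D spectrum described above can be sketched as follows. This is an illustrative sketch only: the function name `bucket_spectrum` and the synthetic flat spectrum are assumptions, and the buckets are returned from low to high chemical shift rather than in the high-to-low order used in the text.

```python
import numpy as np

def bucket_spectrum(ppm, intensity, lo=0.0, hi=10.0, width=0.04):
    """Integrate intensities into equal-width chemical-shift buckets.

    Returns (midpoints, bucket_sums), ordered from low to high ppm.
    Points outside [lo, hi] are ignored.
    """
    ppm = np.asarray(ppm, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    n_buckets = int(round((hi - lo) / width))
    edges = np.linspace(lo, hi, n_buckets + 1)
    # index of the bucket each data point falls into (upper edge goes to last bucket)
    idx = np.clip(np.digitize(ppm, edges) - 1, 0, n_buckets - 1)
    inside = (ppm >= lo) & (ppm <= hi)
    sums = np.bincount(idx[inside], weights=intensity[inside], minlength=n_buckets)
    mids = 0.5 * (edges[:-1] + edges[1:])
    return mids, sums

# Hypothetical flat spectrum with 100,000 points, compressed to 250 buckets
ppm = np.linspace(0.0, 10.0, 100_000)
signal = np.ones_like(ppm)
mids, sums = bucket_spectrum(ppm, signal)
```

Summation is used here; averaging within each bucket, which the text also permits, would divide each sum by the number of points in that bucket.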
  • a similar approach can be applied to 2-D spectra, 3-D spectra, and the like.
  • For 2-D spectra, the "bucket" approach may be extended to a "patch"; for 3-D spectra, it may be extended to a "volume." For example, a 2-D ¹H NMR spectrum which spans the range from δ 0 to δ 10 on both axes, using a patch of Δδ 0.1 x Δδ 0.1, yields 10,000 patches. In this way, a spectrum with perhaps 10⁸ original data points can be compressed to an equivalent spectrum of 10⁴ data points.
  • the equivalent spectrum may be referred to as "a spectral data set,” “a data set comprising spectral data,” etc.
  • certain spectral regions carry no real diagnostic information, or carry conflicting biochemical information, and it is often useful to remove these "redundant" regions before performing detailed analysis.
  • either the data points in the redundant regions are deleted, or the data in the redundant regions are replaced with zero values.
  • NMR data is handled as a data matrix.
  • each row in the matrix corresponds to an individual sample (often referred to as a "data vector"), and the entries in the columns are, for example, spectral intensity of a particular data point, at a particular δ or Δδ (often referred to as "descriptors").
  • Multivariate projection methods such as principal component analysis (PCA) and partial least squares analysis (PLS), are so-called scaling sensitive methods.
  • Scaling and weighting may be used to place the data in the correct metric, based on knowledge and experience of the studied system, and therefore reveal patterns already inherently present in the data.
  • missing data (for example, gaps in column values) may be replaced or "filled" with, for example, the mean value of a column ("mean fill"); a random value ("random fill"); or a value based on a principal component analysis ("principal component fill").
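A "mean fill" can be sketched as follows, assuming missing entries are encoded as NaN (an assumption of this example; the text does not say how gaps are represented). The function name `mean_fill` is hypothetical.

```python
import numpy as np

def mean_fill(X):
    """Replace NaN entries with the mean of the observed values in the same column."""
    X = np.array(X, dtype=float)           # copy, so the input is left untouched
    col_means = np.nanmean(X, axis=0)      # column means, ignoring NaN
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

X_filled = mean_fill([[1.0, 2.0],
                      [np.nan, 4.0],
                      [3.0, np.nan]])
```

A column with no observed values at all has no mean to fill with and would need separate handling.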
  • Translation of the descriptor coordinate axes can be useful. Examples of such translation include normalisation and mean centring.
  • Normalisation may be used to remove sample-to-sample variation. Many normalisation approaches are possible, and they can often be applied at any of several points in the analysis. Usually, normalisation is applied after redundant spectral regions have been removed.
  • each spectrum is normalised (scaled) by a factor of 1/A, where A is the sum of the absolute values of all of the descriptors for that spectrum.
  • each data vector has the same length, specifically, 1. For example, if the sum of the absolute values of intensities for each bucket in a particular spectrum is 1067, then the intensity for each bucket for this particular spectrum is scaled by 1/1067.
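The normalisation by 1/A can be sketched as follows. The function name and the example intensities are hypothetical; the first row is chosen so that its absolute sum is 1067, matching the worked example above.

```python
import numpy as np

def normalise_to_unit_sum(X):
    """Scale each row (spectrum) by 1/A, where A is the sum of the
    absolute values of all descriptors in that row."""
    X = np.asarray(X, dtype=float)
    A = np.abs(X).sum(axis=1, keepdims=True)
    return X / A

X_norm = normalise_to_unit_sum([[1000.0, 60.0, 7.0],
                                [2.0, 2.0, 4.0]])
```

The sketch assumes no spectrum is identically zero, since A would then be zero.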
  • Mean centring may be used to simplify interpretation. Usually, for each descriptor, the average value of that descriptor for all samples is subtracted. In this way, the mean of a descriptor coincides with the origin, and all descriptors are "centred" at zero. For example, if the average intensity at δ 10.0-9.96, for all spectra, is 1.2 units, then the intensity at δ 10.0-9.96, for all spectra, is reduced by 1.2 units.
  • in unit variance scaling (UV scaling), data are scaled to equal variance.
  • the value of each descriptor is scaled by 1/StDev, where StDev is the standard deviation for that descriptor for all samples. For example, if the standard deviation at δ 10.0-9.96, for all spectra, is 2.5 units, then the intensity at δ 10.0-9.96, for all spectra, is scaled by 1/2.5 or 0.4.
  • Unit variance scaling may be used to reduce the impact of "noisy" data. For example, some metabolites in biofluids show a strong degree of physiological variation (e.g., diurnal variation, dietary-related variation) that is unrelated to any pathophysiological process. Without unit variance scaling, these noisy metabolites may dominate subsequent analysis.
  • Pareto scaling is, in some sense, intermediate between mean centering and unit variance scaling. In effect, smaller peaks in the spectra can influence the model to a higher degree than for the mean centered case. Also, the loadings are, in general, more interpretable than for unit variance based models.
  • the value of each descriptor is scaled by 1/sqrt(StDev), where StDev is the standard deviation for that descriptor for all samples. In this way, each descriptor has a variance numerically equal to its initial standard deviation.
  • the pareto scaling may be performed, for example, on raw data or mean centered data.
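Mean centring, unit variance scaling and Pareto scaling, as described above, can be sketched column-wise as follows. The function names and the small data matrix are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption; the text does not specify which estimator is meant.

```python
import numpy as np

def mean_centre(X):
    """Subtract the column (descriptor) mean from every sample."""
    return X - X.mean(axis=0)

def uv_scale(X):
    """Unit variance scaling: divide each column by its standard deviation."""
    return X / X.std(axis=0, ddof=1)

def pareto_scale(X):
    """Pareto scaling: divide each column by the square root of its standard deviation."""
    return X / np.sqrt(X.std(axis=0, ddof=1))

# Rows are samples, columns are descriptors (illustrative values only)
X = np.array([[1.0, 10.0],
              [2.0, 30.0],
              [4.0, 80.0],
              [7.0, 20.0]])
X_uv = uv_scale(mean_centre(X))
X_par = pareto_scale(mean_centre(X))
```

After Pareto scaling, the variance of each column equals its original standard deviation, which is the property stated in the text.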
  • Logarithmic scaling may be used to assist interpretation when data have a positive skew and/or when data spans a large range, e.g., several orders of magnitude. Usually, for each descriptor, the value is replaced by the logarithm of that value. For example, the intensity at δ 10.0-9.96 is replaced by the logarithm of the intensity at δ 10.0-9.96, for all spectra.
  • in range scaling, each descriptor is divided by the range of that descriptor for all samples. In this way, all descriptors have the same range, that is, 1. For example, if, at δ 10.0-9.96, for all spectra, the largest value is 87 units and the smallest value is 1, then the range is 86 units, and the intensity at δ 10.0-9.96, for all spectra, is divided by 86 units. However, this method is sensitive to the presence of outlier points.
  • each data vector is mean centred and unit variance scaled. This technique is very useful because each descriptor is then weighted equally and, in the case of NMR descriptors, large and small peaks are treated with equal emphasis. This can be important for metabolites present at very low, but still detectable, levels.
  • the variance weight of a single parameter is calculated as the ratio of the inter-class variances to the sum of the intra-class variances.
  • a large value means that this variable is discriminating between the classes. For example, if the samples are known to fall into two classes (e.g., a training set), it is possible to examine the mean and variance of each descriptor. If a descriptor has very different mean values and a small variance, then it will be good at separating the classes.
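One possible reading of the variance weight is sketched below: for each descriptor, the variance of the class means divided by the sum of the within-class variances. This interpretation, the function name `variance_weight`, and the example data are all assumptions; other definitions of inter-class variance would give differently scaled values.

```python
import numpy as np

def variance_weight(X, labels):
    """Per-descriptor ratio of the variance of the class means (inter-class)
    to the sum of the within-class variances (intra-class)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    means = np.array([X[labels == c].mean(axis=0) for c in classes])
    within = sum(X[labels == c].var(axis=0, ddof=1) for c in classes)
    between = means.var(axis=0, ddof=1)
    return between / within

# Column 0 separates the two classes well; column 1 does not.
X = np.array([[1.0, 5.0], [1.2, 1.0], [0.8, 9.0],
              [3.0, 4.0], [3.2, 8.0], [2.8, 2.0]])
labels = np.array([0, 0, 0, 1, 1, 1])
weights = variance_weight(X, labels)
```

The first descriptor has very different class means and a small spread, so it receives the larger weight.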
  • Feature weighting is a more general description of variance weighting, where not only the mean and standard deviation of each descriptor are calculated, but other well known weighting factors, such as the Fisher weight, are used.
Multivariate Statistical Analysis
  • multivariate statistics analysis methods including pattern recognition methods, are often the most convenient and efficient way to analyse complex data, such as NMR spectra.
  • such analysis methods may be used to identify, for example diagnostic spectral windows and/or diagnostic species, for a particular condition under study.
  • Such analysis methods may be used to form a predictive model, and then use that model to classify test data.
  • one convenient and particularly effective method of classification employs multivariate statistical analysis modelling, first to form a model (a "predictive mathematical model") using data ("modelling data") from samples of known class (e.g., from subjects known to have, or not have, a particular condition), and second to classify an unknown sample (e.g., "test data”), as having, or not having, that condition.
  • pattern recognition methods include, but are not limited to, Principal Component Analysis (PCA) and Partial Least Squares-Discriminant Analysis (PLS-DA).
  • PCA is a bilinear decomposition method used for overviewing "clusters" within multivariate data.
  • the data are represented in K-dimensional space (where K is equal to the number of variables) and reduced to a few principal components (or latent variables) which describe the maximum variation within the data, independent of any knowledge of class membership (i.e., "unsupervised”).
  • the principal components are displayed as a set of "scores" (t) which highlight clustering, trends, or outliers, and a set of "loadings" (p) which highlight the influence of input variables on t (see, for example, Kowalski et al., 1986).
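The decomposition into scores and loadings can be sketched with a singular value decomposition. This is an illustrative sketch, not the software used in the source; the function name `pca` and the random data are assumptions.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA of a mean-centred copy of X via SVD.

    Returns scores T (N x A) and loadings P (K x A), so that the
    centred data are approximated by T @ P.T.
    """
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * S[:n_components]   # projections of samples on each PC
    P = Vt[:n_components].T                      # contribution of each variable to each PC
    return T, P

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))     # 20 samples, 50 spectral buckets (synthetic)
T, P = pca(X, n_components=2)
```

No class information enters the calculation, which is what makes the method unsupervised.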
  • PLS-DA is a supervised multivariate method yielding latent variables describing maximum separation between known classes of samples.
  • PLS-DA is based on PLS which is the regression extension of the PCA method explained earlier.
  • the calculated PLS components will thereby be more focused on describing the variation separating the classes in X if this information is present in the data. From an interpretation point of view all the features of PLS can be used, which means that the variation can be interpreted in terms of scores (t,u), loadings (p,c), PLS weights (w) and regression coefficients (b).
  • the fact that a regression is carried out against a known class separation means that the PLS-DA is a supervised method and that the class membership has to be known prior to the actual modelling. Once a model is calculated and validated it can be used for prediction of class membership for "new" unknown samples.
  • Judgement of class membership is done on basis of predicted class membership (Ypred), predicted scores (tpred) and predicted residuals (DmodXpred) using statistical significance limits for the decision. See, for example, Sjostrom et al., 1986; Stahle et al., 1987.
  • the variation between the objects in X is described by the X-scores, T, and the variation in the Y-block regressed against is described in the Y-scores, U.
  • the Y-block is a "dummy vector or matrix" describing the class membership of each observation. Basically, what PLS does is to maximize the covariance between T and U.
  • a PLS weight vector, w is calculated, containing the influence of each X-variable on the explanation of the variation in Y. Together the weight vectors will form a matrix, W, containing the variation in X that maximizes the covariance between the scores T and U for each calculated component.
  • weights, W contain the variation in X that is correlated to the class separation described in Y.
  • the Y-block matrix of weights is designated C.
  • a matrix of X-loadings, P, is also calculated. Apart from interpretation, these loadings are used to perform the proper decomposition of X.
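The quantities named above (scores T, weights W, Y-weights C, loadings P) can be sketched for a single class-dummy response using a NIPALS-style PLS calculation. This is a simplified sketch under stated assumptions: one response column only, internal mean centring, and a hypothetical function name `pls1`; it is not the implementation used in the source.

```python
import numpy as np

def pls1(X, y, n_components=2):
    """NIPALS-style PLS for a single response column y (e.g. a 0/1 class dummy).

    X and y are mean-centred internally. Returns T (scores), P (X-loadings),
    W (weights) and c (Y-weights), one column/entry per component.
    """
    E = X - X.mean(axis=0)
    f = y - y.mean()
    T, P, W, c = [], [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w = w / np.linalg.norm(w)      # direction in X of maximum covariance with y
        t = E @ w                      # X-scores
        ca = (f @ t) / (t @ t)         # Y-weight for this component
        p = E.T @ t / (t @ t)          # X-loadings, used to deflate X
        E = E - np.outer(t, p)
        f = f - ca * t
        T.append(t); P.append(p); W.append(w); c.append(ca)
    return np.column_stack(T), np.column_stack(P), np.column_stack(W), np.array(c)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
X[10:, :5] += 2.0                       # synthetic class difference in five buckets
y = np.repeat([0.0, 1.0], 10)           # dummy vector describing class membership
T, P, W, c = pls1(X, y)
```

With a single response, the Y-scores are simply the response rescaled, so they are not returned separately here.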
  • Spurious or irregular data in spectra are preferably identified and removed.
  • Common reasons for irregular data include spectral artefacts such as poor phase correction, poor baseline correction, poor chemical shift referencing, poor water suppression, and biological effects such as bacterial contamination, shifts in the pH of the biofluid, toxin- or disease-induced biochemical response, and other conditions, e.g., pathological conditions, which have metabolic consequences, e.g., diabetes.
  • Outliers are identified in different ways depending on the method of analysis used. For example, when using principal component analysis (PCA), small numbers of samples lying far from the rest of the replicate group can be identified by eye as outliers.
  • a more objective means of identification for PCA is to use Hotelling's T² test, which is the multivariate version of the well known Student's t test used in univariate statistics. For any given sample, the T² value can be calculated and this is compared with a standard value within which a chosen fraction (e.g., 95%) of the samples would normally lie. Samples with T² values substantially outside this limit can then be flagged as outliers.
  • a confidence level (e.g., 95%) is selected and the region of multivariate space corresponding to confidence values above this limit is determined. This region can be displayed graphically in several different ways (for example by plotting the critical T² ellipse on a PCA scores plot). Any samples falling outside the high confidence region are flagged as potential outliers.
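The T² calculation can be sketched from a PCA scores matrix as follows. The sketch assumes mean-centred, mutually uncorrelated scores (as PCA scores are), so that each component can be scaled by its own variance; the critical value is left as an input because its exact form depends on the chosen confidence level and F-distribution. The function names, the synthetic scores, and the limit of 10.0 are hypothetical.

```python
import numpy as np

def hotelling_t2(T):
    """Hotelling's T^2 per observation from a scores matrix T (N x A):
    sum over components of the squared score divided by that component's variance."""
    T = np.asarray(T, dtype=float)
    return np.sum(T ** 2 / T.var(axis=0, ddof=1), axis=1)

def flag_outliers(T, t2_limit):
    """Flag observations whose T^2 exceeds a critical value supplied by the caller."""
    return hotelling_t2(T) > t2_limit

rng = np.random.default_rng(0)
scores = rng.normal(size=(30, 2))
scores[0] = [15.0, 15.0]                 # one deliberately extreme sample
scores = scores - scores.mean(axis=0)
t2 = hotelling_t2(scores)
flags = flag_outliers(scores, t2_limit=10.0)   # illustrative limit, not a computed critical value
```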
  • DModX is the perpendicular distance of an object to the principal component (or to the plane or hyperplane made up by two or more principal components). In the SIMCA software, DModX is calculated as $\mathrm{DModX} = v \sqrt{\sum_{k} e_{k}^{2} / (K - A)}$, where:
  • e is the residual for a single observation
  • K is the number of original variables in the data set
  • A is the number of principal components in the model
  • v is a correction factor, based on the number of observations (N) and the number of principal components (A), and is slightly larger than one.
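A distance-to-model calculation consistent with the definitions above can be sketched as follows, assuming the usual form: the square root of the residual sum of squares divided by (K − A), multiplied by v. The function name `dmodx` is hypothetical, and v is left as an input (default 1.0) rather than derived from N and A, since its exact expression is not given here.

```python
import numpy as np

def dmodx(E, n_components, v=1.0):
    """Absolute distance to model for each observation.

    E            : (N x K) residual matrix left after fitting A components
    n_components : A, the number of components in the model
    v            : correction factor (slightly larger than one); supplied by the caller
    """
    E = np.asarray(E, dtype=float)
    K = E.shape[1]
    return v * np.sqrt(np.sum(E ** 2, axis=1) / (K - n_components))

# One observation, five variables, one-component model (illustrative residuals)
d = dmodx(np.array([[3.0, 4.0, 0.0, 0.0, 0.0]]), n_components=1)
```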
  • outliers in this direction are not as severe as those occurring in the score direction but should always be carefully examined before making a decision whether to include them in the modelling or not.
  • all outliers are thoroughly investigated, for example, by examining the contributing loadings and distance to model (DModX) as well as visually inspecting the original NMR spectrum for deviating features, before removing them from the model.
  • Outlier detection by automatic algorithm is a possibility using the features of scores and residual distance to model (DModX) described above.
  • the distance to the model in Y (DmodY) can also be calculated in the same way.
  • filtering methods include the regression of descriptor variables against an index based on sample class to eliminate variables with low correlation to the predefined classes.
  • Related methods include target rotation (see, e.g., Kvalheim et al., 1989) and PCT filtering (see, e.g., Sun, 1997). In these methods, the removed variation is not necessarily completely uncorrelated with sample class (i.e., orthogonal).
  • latent variables which are orthogonal to some variation or class index of interest are removed by "orthogonal filtering."
  • variation in the data which is not correlated to (i.e., is orthogonal to) the class separating variation of interest may be removed.
  • Such methods are, in general, more efficient than non-orthogonal filtering methods.
  • one such orthogonal filtering method is Orthogonal Signal Correction (OSC).
  • the class identity is used as a response vector, Y, to describe the variation between the sample classes.
  • the OSC method locates the longest vector describing the variation between the samples which is not correlated with the Y-vector, and removes it from the data matrix.
  • the resultant dataset has been filtered to allow pattern recognition focused on the variation correlated to features of interest within the sample population, rather than non-correlated, orthogonal variation.
  • OSC is a method for spectral filtering that solves the problem of unwanted systematic variation in the spectra by removing components, latent variables, orthogonal to the response calibrated against.
  • in PLS, the weights, w, are calculated to maximise the covariance between X and Y.
  • in OSC, by contrast, the weights, w, are calculated to minimise the covariance between X and Y, which is the same as calculating components as close to orthogonal to Y as possible.
  • OSC can be described as a bilinear decomposition of the spectral matrix, X, in a set of scores, T**, and a set of corresponding loadings, P**, containing variation orthogonal to the response, Y.
  • the unexplained part or the residuals, E, is equal to the filtered X-matrix, X_osc, containing less unwanted variation.
  • the decomposition is described by the following equation: $X = T^{**} P^{**\prime} + E$
  • the OSC procedure starts by calculation of the first latent variable or principal component describing the variation in the data, X.
  • the calculation is done according to the NIPALS algorithm.
  • the first score vector, t, which is a summary of the between-sample variation in X, is then orthogonalized against the response (Y), giving the orthogonalized score vector t*:
  • $t^{*} = (I - Y (Y'Y)^{-1} Y')\, t$
  • the estimate or updated score vector t** is then again orthogonalized to Y, and the iteration proceeds until t** has converged. This will ensure that t** will converge towards the longest vector orthogonal to response Y, still giving a good description of the variation in X.
  • the data, X can then be described as the score, t**, orthogonal to Y, times the corresponding loading vector p**, plus the unexplained part, the residual, E.
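The removal of one orthogonal component can be sketched as follows. This is a simplified illustration, not the algorithm as implemented in the source: the weights that map X onto the orthogonalized score are estimated here by ordinary least squares, which is an assumption of this sketch (the OSC literature typically uses a PLS regression for that step), and the function name `osc_one_component` and the synthetic data are hypothetical.

```python
import numpy as np

def osc_one_component(X, y, max_iter=100, tol=1e-10):
    """Remove one latent variable orthogonal to y from X.

    X (N x K) and y (N,) are assumed to be mean-centred already.
    Returns (X_osc, t, p): filtered data, orthogonal score, and its loading.
    """
    # starting score: first principal component of X
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    t = U[:, 0] * S[0]
    for _ in range(max_iter):
        # t* = (I - y (y'y)^-1 y') t : remove the part of t correlated with y
        t_star = t - y * (y @ t) / (y @ y)
        # weights w such that X w approximates t* (least squares; an assumption)
        w = np.linalg.lstsq(X, t_star, rcond=None)[0]
        t_new = X @ w                  # updated score, expressed through X
        converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
        t = t_new
        if converged:
            break
    p = X.T @ t / (t @ t)              # loading for the orthogonal score
    return X - np.outer(t, p), t, p    # residual E is the filtered matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
X = X - X.mean(axis=0)
y = np.repeat([0.0, 1.0], 10)
y = y - y.mean()
X_osc, t, p = osc_one_component(X, y)
```

With more variables than samples, the least-squares step reproduces the orthogonalized score almost exactly, so convergence is reached after very few iterations; this is also why practical implementations restrict that step.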
  • orthogonal signal correction (OSC) can be used to optimize the separation, thus improving the performance of subsequent multivariate pattern recognition analysis and enhancing the predictive power of the model.
  • An example of a typical OSC process includes the following steps: (a) ¹H NMR data are segmented using AMIX, normalised, and optionally scaled and/or mean centered. The default for orthogonal filtering of spectral data is to use only mean centered data, which means that the mean for each variable (spectral bucket) is subtracted from each single variable in the data matrix. (b) a response vector (y) describing the class separating variation is created by assigning class membership to each sample. (c) one latent variable orthogonal to the response vector (y) is removed according to the OSC algorithm. (d) if desired, the removed orthogonal variation can be viewed and interpreted in terms of scores (T) and loadings (P). (e) the filtered data matrix, which contains less variation not correlated to class separation, is next used for further multivariate modelling after optional scaling and/or mean centering.
  • any particular model is only as good as the data used to formulate it. Therefore, it is preferable that all modelling data and test data are obtained under the same (or similar) conditions and using the same (or similar) experimental parameters.
  • Such conditions and parameters include, for example, sample type (e.g., plasma, serum), sample collection and handling protocol, sample dilution, NMR analysis (e.g., type, field strength/frequency, temperature), and data-processing (e.g., referencing, baseline correction, normalisation).
  • models may be built for a particular sub-group of cases, e.g., according to any of the parameters mentioned above (e.g., field strength/frequency), or others, such as sex, age, ethnicity, medical history, lifestyle (e.g., smoker, nonsmoker), hormonal status (e.g., pre-menopausal, post-menopausal).
  • the quality of the model improves as the amount of modelling data increases. Nonetheless, as shown in the examples below, even relatively small sets of modelling data (e.g., about 50-100 subjects) are sufficient to achieve a confident classification (e.g., diagnosis).
  • a typical unsupervised modelling process includes the following steps: (a) optionally scaling and/or mean centering modelling data; (b) classifying data (e.g., as control or positive, e.g., diseased); (c) fitting the model (e.g., using PCA, PLS-DA); (d) identifying and removing outliers, if any; (e) re-fitting the model; (f) optionally repeating (c), (d), and (e) as necessary.
  • optionally, data filtering (e.g., orthogonal filtering, such as OSC) is performed following step (d) and before step (e).
  • An example of a typical PLS-DA modelling process, using OSC filtered data includes the following steps: (a) OSC filtered data is optionally scaled and/or mean centered. (b) a response vector (y) describing the class separating variation is created by assigning class membership to all samples. (c) a PLS regression model is calculated between the OSC filtered data and the response vector (y). The calculated latent variables or PLS components will be focused on describing maximum separation between the known classes.
  • the model is interpreted by viewing scores (T), loadings (P), PLS weights (W), PLS coefficients (B) and residuals (E). Together they will function as a means for describing the separation between the classes as well as provide an explanation to the observed separation.
  • the model may be verified using data for samples of known class which were not used to calculate the model. In this way, the ability of the model to accurately predict classes may be tested. This may be achieved, for example, in the method above, with the following additional step: (e) a set of external samples, with known class belonging, which were not used in the (e.g., PLS) model calculation is used for validation of the model's predictive ability. The prediction results are investigated, for example, in terms of predicted response
  • the model may then be used to classify test data, of unknown class.
  • the test data are numerically pre-processed in the same manner as the modelling data.
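Pre-processing test data "in the same manner" as the modelling data usually means reusing the centring and scaling parameters estimated from the modelling data, rather than re-estimating them from the test samples. A minimal sketch follows; the class name `Preprocessor`, the choice of unit variance scaling, and the synthetic data are assumptions of this example.

```python
import numpy as np

class Preprocessor:
    """Stores column means and standard deviations from the modelling data
    and applies the same centring and scaling to any later data."""

    def fit(self, X_model):
        self.mean_ = X_model.mean(axis=0)
        self.std_ = X_model.std(axis=0, ddof=1)
        return self

    def transform(self, X):
        return (X - self.mean_) / self.std_

rng = np.random.default_rng(1)
X_model = rng.normal(loc=5.0, scale=2.0, size=(60, 8))   # samples of known class
X_test = rng.normal(loc=5.0, scale=2.0, size=(5, 8))     # samples of unknown class
prep = Preprocessor().fit(X_model)
Z_model = prep.transform(X_model)
Z_test = prep.transform(X_test)
```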
  • the data matrix (X) is built up by N observations (samples, rats, patients, etc.) and K variables (spectral buckets carrying the biomarker information in terms of ¹H NMR resonances).
  • in PCA, the N×K matrix (X) is decomposed into a few latent variables or principal components (PCs) describing the systematic variation in the data. Since PCA is a bilinear decomposition method, each PC can be divided into two vectors, scores (t) and loadings (p). The scores can be described as the projection of each observation on to each PC and the loadings as the contribution of each variable (spectral bucket) to the PC expressed in terms of direction.
  • any clustering of observations (samples) along a direction found in scores plots can be explained by identifying which variables (spectral buckets) have high loadings for this particular direction in the scores.
  • a high loading is defined as a variable (spectral bucket) that changes between the observations in a systematic way showing a trend which matches the sample positions in the scores plot.
  • Each spectral bucket with a high loading, or a combination thereof, is defined by its ¹H NMR chemical shift position; this is its diagnostic spectral window. These chemical shift values then allow the skilled NMR spectroscopist to examine the original NMR spectra and identify the molecules giving rise to the peaks in the relevant buckets; these are the biomarkers. This is typically done using a combination of standard 1- and 2-dimensional NMR methods.
  • the loadings plot shows points which are labelled according to the bucket chemical shift. This is the ¹H NMR spectroscopic chemical shift which corresponds to the centre of the bucket. This bucket defines a diagnostic spectral window. Given a list of these bucket identifiers, the skilled NMR spectroscopist then re-examines the ¹H NMR spectra and identifies, within the bucket width, which of several possible NMR resonances are changed between the two classes.
  • the important resonance is characterised in terms of exact chemical shift, intensity, and peak multiplicity.
  • in PLS-DA, which is a regression extension of the PCA method, the options for interpretation are more extensive than in the PCA case.
  • PLS-DA performs a regression between the data matrix (X) and a "dummy matrix" (Y) containing the class membership information (e.g., samples may be assigned the value 1 for healthy and 2 for diseased classes).
  • the calculated PLS components will describe the maximum covariance between X and Y which in this case is the same as maximum separation between the known classes in X.
  • the interpretation of scores (t) and loadings (p) is the same in PLS-DA as in PCA.
  • Interpretation of the PLS weights (w) for each component provides an explanation of the variables in X correlated to the variation in Y.
  • regression coefficients (b) can also be used for discovery and interpretation of biomarkers.
  • the regression coefficients (b) in PLS-DA provide a summary of which variables in X (spectral buckets) are most important in terms of both describing variation in X and correlating to Y. This means that variables (spectral buckets) with high regression coefficients are important for separating the known classes in X, since the Y matrix against which it is correlated only contains information on the class identity of each sample.
  • the scores plot is examined to identify important loadings, diagnostic spectral windows, relevant NMR resonances, and ultimately the associated biomarkers.
  • a variable importance plot (VIP) is another method of evaluating the significance of loadings in causing a separation of class of sample in a scores plot.
  • the VIP is a squared function of PLS weights, and therefore only positive numerical values are encountered; in addition, for a given model, there is only one set of VIP-values. Variables with a VIP value of greater than 1 are considered most influential for the model.
  • the VIP shows each loading in a decreasing order of importance for class separation based on the PLS regression against class variable.
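A VIP calculation for a single-response PLS model can be sketched as follows, using the commonly quoted formula in which each component's squared, normalised weights are weighted by the Y-variance that component explains. The formula choice, the function name `vip`, and the synthetic inputs are assumptions of this sketch rather than details taken from the source.

```python
import numpy as np

def vip(W, T, c):
    """Variable importance in the projection for a single-response PLS model.

    W : (K x A) PLS weights
    T : (N x A) X-scores
    c : (A,)    Y-weights
    """
    K = W.shape[0]
    ssy = (c ** 2) * np.sum(T ** 2, axis=0)       # Y-variance explained per component
    w_norm = W / np.linalg.norm(W, axis=0)        # unit-length weight vectors
    return np.sqrt(K * ((w_norm ** 2) @ ssy) / ssy.sum())

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 2))       # hypothetical weights for 6 buckets, 2 components
T = rng.normal(size=(10, 2))      # hypothetical scores for 10 samples
c = np.array([0.8, 0.3])
vip_values = vip(W, T, c)
ranking = np.argsort(vip_values)[::-1]   # buckets in decreasing order of importance
```

Because the squared VIP values average to one across all variables, a threshold of 1 separates above-average from below-average contributors.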
  • a (w*c) plot is another diagnostic plot obtained from a PLS-DA analysis. It shows which descriptors are mainly responsible for class separation.
  • the (w*c) parameters are an attempt to describe the total variable correlations in the model, i.e., between the descriptors (e.g., NMR intensities in buckets), between the NMR descriptors and the class variables, and between class variables if they exist (in the present two class case, where samples are assigned by definition to class 1 and class 2 there is no correlation).
  • each bar represents a spectral region (e.g., 0.04 ppm) and shows how the ¹H NMR profile of one class of samples differs from the ¹H NMR profile of a second class of samples.
  • a positive value on the x-axis indicates there is a relatively greater concentration of metabolite (assigned using NMR chemical shift assignment tables) in one class as compared to the other class, and a negative value on the x-axis indicates a relatively lower concentration in one class as compared to the other class.
  • the analysis methods described herein can be applied to a single sample, or alternatively, to a timed series of samples. These samples may be taken relatively close together in time (e.g., daily) or less frequently (e.g., monthly or yearly).
  • the timed series of samples may be used for one or more purposes, e.g., to make sequential diagnoses, applying the same classification method as if each sample were a single sample. This will allow greater confidence in the diagnosis compared to obtaining a single sample for the patient, or alternatively to monitor temporal changes in the subject (e.g., changes in the underlying condition being diagnosed, treated, etc.).
  • the timed series of samples can be collectively treated as a single dataset increasing the information density of the input dataset and hence increasing the power of the analysis method to identify weaker patterns.
  • the timed series of samples can be collectively processed to yield a single dataset in which the temporal changes (e.g., in each bin) are included as an extra list of variables (e.g., as in composite data sets).
  • Temporal changes in the amount of (e.g., endogenous) diagnostic species may greatly improve the ability of the analysis method to accurately classify patterns (especially when patterns are weak).
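  • One way to include temporal changes as extra variables can be sketched as follows. This is an assumption-laden illustration: the function name is hypothetical, and the choice to append simple differences between consecutive time points to the most recent bucketed spectrum is only one of several possible encodings.

```python
def with_temporal_changes(series):
    """Build a single data vector from a timed series of bucketed spectra.

    series: list of spectra in time order, each a list of bin intensities.
    Returns the latest spectrum followed by the per-bin changes between
    consecutive time points (the "extra list of variables").
    """
    if len(series) < 2:
        raise ValueError("need at least two time points")
    features = list(series[-1])
    for earlier, later in zip(series, series[1:]):
        features.extend(b - a for a, b in zip(earlier, later))
    return features

# Hypothetical series: three time points, two bins each.
vector = with_temporal_changes([[1.0, 2.0], [1.5, 1.0], [2.0, 0.5]])
print(vector)
```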
  • Statistical batch processing can be divided into two levels of multivariate modelling.
  • the lower or the observation level is usually based on Partial Least Squares (PLS) regression against time (or any other index describing process maturity), whereas the upper or batch level consists of a PCA based on the scores from the lower level PLS model.
  • PLS can also be used in the upper level to correlate the matrix based on the lower level scores with the end properties of the separate batches. This is common in industrial applications where properties of the end product are used as a description of quality.
  • the evolution of the studied process with time can be monitored and interpreted in terms of PLS scores and loadings.
  • the calculated components will be focused on the evolution with time.
  • the fact that the calculated PLS components are orthogonal to each other means that it is possible to detect independent time (maturity) profiles and also to interpret which measured variables are causing these profiles. Confidence limits are used for detection of deviating behaviour of any spectra at any time point for some optional significance level, usually 95% and/or 99%.
  • the residuals, expressed as distance to the model, are, at the lower level, another important tool for detecting outlying batches or deviating behaviour for a specific batch at a specific time point.
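  • The use of confidence limits on lower level scores can be sketched as follows. The sketch assumes approximately normally distributed scores, so that the mean plus or minus 1.96 standard deviations approximates a 95% limit (2.58 for 99%); the function names and the reference data are hypothetical.

```python
import statistics

def score_limits(reference_scores, z=1.96):
    """reference_scores[b][t]: lower level score of reference batch b at time point t.

    Returns one (lower, upper) pair per time point, computed as mean +/- z * SD
    across the reference batches (a normal-theory approximation).
    """
    limits = []
    for column in zip(*reference_scores):
        mean = statistics.mean(column)
        sd = statistics.stdev(column)
        limits.append((mean - z * sd, mean + z * sd))
    return limits

def deviating_time_points(new_scores, limits):
    """Indices of time points at which a new batch falls outside the limits."""
    return [t for t, (s, (lo, hi)) in enumerate(zip(new_scores, limits))
            if not lo <= s <= hi]

# Hypothetical scores for three reference batches at three time points.
reference = [[0.0, 1.0, 2.0], [0.2, 1.2, 2.2], [-0.2, 0.8, 1.8]]
limits = score_limits(reference)
print(deviating_time_points([0.1, 1.9, 2.1], limits))
```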
  • the upper level or batch level provides the possibility to just look at the difference between the separate batches. This is done by using the lower level scores including all time points for each batch as new variables describing each single batch and then performing a PCA on this new data matrix.
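  • The construction of the batch level data matrix from the lower level scores can be sketched as follows. The function name and the nested-list layout are assumptions made for illustration; the PCA that would subsequently be performed on the resulting matrix is not shown.

```python
def unfold_batches(lower_scores):
    """Unfold lower level scores into one row of new variables per batch.

    lower_scores[batch][time_point] is a list of component scores for one
    observation. Each output row concatenates the scores of all time points
    of a batch, in time order.
    """
    n_time = {len(batch) for batch in lower_scores}
    if len(n_time) != 1:
        # Batch modelling assumes batches of equal duration.
        raise ValueError("all batches must have the same number of time points")
    return [[s for obs in batch for s in obs] for batch in lower_scores]

# Hypothetical example: two batches, two time points, two components.
matrix = unfold_batches([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(matrix)
```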
  • the features of scores, loadings and DmodX are used in the same way as for ordinary PCA analysis, with the exception that the upper level loadings can be traced back down to the lower level for a more detailed explanation in the original loadings.
  • Predictions for "new" batches can be done on both levels of the batch model.
  • On the upper level prediction of single batch behaviour can be done in terms of scores and DmodX.
  • the definition of a batch process, and also a requirement for batch modelling, is a process where all batches have equal duration and are synchronised according to sample collection. For example, samples taken from a cohort of animals at identical fixed time points to monitor the effects of an administered xenobiotic substance.
  • the advantage of using batch modelling for such studies is the possibility of detecting known, or discovering new, metabolic processes which evolve with time in the lower level scores, and also the identification of the actual metabolites involved in the different processes from the contributing lower level loadings.
  • the lower level analysis also makes it possible to differentiate between single observations (e.g., individual animals at specific time points).
  • Applications for the lower level modelling include, for example, distinguishing between undosed controls and dosed animals in terms of metabolic effects of dosing in certain time points; and creating models for normality and using the models as a classification tool for new samples, e.g., as normal or abnormal. This may be achieved using a PLS prediction of the new sample's class using the model describing normality. Decisions can then be made on basis of the combination of the predicted scores and residuals (DmodX).
  • An automated expert system can be used for early fault detection in the lower level batch modelling, and this can be used to further enhance the analysis procedure and improve efficiency.
  • the upper level provides the possibility of making predictions of new animals using the existing model. Abnormal animals can then be detected by judging predicted scores and residuals (DmodX) together. Since the upper level model is based on the lower level scores, the interpretation of an animal predicted to be abnormal can be traced back to the original lower level scores and loadings as well as the original raw variables making up the NMR spectra. Combining the upper and lower level for prediction of the status of a new animal, the classification can be based on four parameters: upper level scores and residuals (DmodX) and lower level scores and residuals (DmodX). This demonstrates that batch modelling is an efficient tool for determining if an animal is normal or abnormal, and if the latter, why and when it deviates from normality.
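  • Judging a new observation by its score and its residual together can be sketched as follows. This is a simplified stand-in, not the modelling software's procedure: it fits a one-component PCA-style model of normality by power iteration and reports the raw residual length, whereas DmodX as usually reported is additionally normalised. All names and data are hypothetical.

```python
import math

def fit_one_component(X, iters=200):
    """Fit a one-component PCA-style model (mean and loading) to the rows of X."""
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - mean[j] for j in range(p)] for row in X]
    v = [1.0 / math.sqrt(p)] * p  # starting guess for the loading vector
    for _ in range(iters):
        t = [sum(r[j] * v[j] for j in range(p)) for r in Xc]          # scores
        v = [sum(t[i] * Xc[i][j] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return mean, v

def score_and_residual(x, mean, loading):
    """Return the predicted score and the residual distance to the model."""
    xc = [a - m for a, m in zip(x, mean)]
    t = sum(a * b for a, b in zip(xc, loading))
    resid = [a - t * b for a, b in zip(xc, loading)]
    return t, math.sqrt(sum(r * r for r in resid))

# Hypothetical "normal" observations lying close to a single direction.
normal = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.1], [3.0, 3.0, -0.1], [4.0, 4.0, 0.0]]
mean, loading = fit_one_component(normal)
print(score_and_residual([2.0, 2.0, 0.0], mean, loading))  # fits the model
print(score_and_residual([2.0, 2.0, 3.0], mean, loading))  # large residual
```

  • An observation would be flagged as abnormal if either its score or its residual distance falls outside limits established from the normal observations.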
  • the term "composite data set" pertains to a spectrum (or data vector) which comprises spectral data (e.g., NMR spectral data, e.g., an NMR spectrum) as well as at least one other datum or data vector.
  • Examples of other data vectors include, e.g., one or more other NMR spectral data, e.g., NMR spectra, e.g., obtained for the same sample using a different NMR technique; other types of spectral data, e.g., other types of spectra, e.g., mass spectra, numerical representations of images, etc.; data obtained for another sample of the same sample type (e.g., blood, urine, tissue, tissue extract), but obtained from the subject at a different timepoint; data obtained for another sample of a different sample type (e.g., blood, urine, tissue, tissue extract) for the same subject; and the like.
  • Clinical parameters which are suitable for use in composite methods include, but are not limited to, the following:
  • many of the methods of the present invention involve relating NMR spectral intensity at one or more predetermined diagnostic spectral windows with a predetermined condition. Examples of methods for identifying one or more suitable diagnostic spectral windows for a given condition, using, for example, pattern recognition methods, are described herein.
  • the term "diagnostic spectral window" pertains to a narrow range of chemical shift (Δδ) values encompassing an index value, δr (that is, δr falls within the range Δδ).
  • Each index value, and its associated spectral window, define a range of chemical shift (Δδ) in which the NMR spectral intensity is indicative of the presence of one or more chemical species.
  • the diagnostic spectral window refers to a chemical shift patch (Δδ1, Δδ2) which encompasses an index value, [δr1, δr2].
  • the diagnostic spectral window refers to a chemical shift volume (Δδ1, Δδ2, Δδ3) which encompasses an index value, [δr1, δr2, δr3].
  • δ 0.04, and δ 1.28-1.32).
  • the breadth of the range, |Δδ|, is determined largely by the spectroscopic parameters, such as field strength/frequency, temperature, sample viscosity, etc.
  • the breadth of the range is often chosen to encompass a typical spin-coupled multiplet pattern. For peaks whose position varies with sample pH, the breadth of the range may be widened to encompass the expected range of positions.
  • the breadth is from about δ 0.001 to about δ 0.2.
  • the breadth is from about δ 0.005 to about δ 0.1.
  • the breadth is from about δ 0.005 to about δ 0.08.
  • the breadth is from about δ 0.01 to about δ 0.08.
  • the breadth is from about δ 0.02 to about δ 0.08.
  • the breadth is from about δ 0.005 to about δ 0.06.
  • the breadth is from about δ 0.01 to about δ 0.06.
  • the breadth is from about δ 0.02 to about δ 0.06.
  • the breadth is about δ 0.04.
  • the breadth is equal to the "bucket" or "bin" width. In one embodiment, the breadth is equal to an integer multiple of the "bucket" or "bin" width.
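  • The relationship between an index value, the breadth of the range and the spectral intensity in the window can be sketched as follows. The sketch assumes, for illustration only, that the window is centred on the index value and that the intensity is obtained by summing digitised data points; the function name and the spectrum are hypothetical.

```python
def window_intensity(shifts, intensities, index_value, breadth):
    """Sum the intensity of all points whose chemical shift lies within
    index_value +/- breadth / 2 (window assumed centred on the index value)."""
    lo = index_value - breadth / 2
    hi = index_value + breadth / 2
    return sum(i for d, i in zip(shifts, intensities) if lo <= d <= hi)

# Hypothetical digitised spectrum: chemical shifts (ppm) and intensities.
shifts = [1.25, 1.29, 1.30, 1.31, 1.35]
intensities = [5.0, 1.0, 2.0, 3.0, 7.0]

# Window with index value 1.30 and breadth 0.04, i.e. roughly 1.28 to 1.32 ppm.
print(window_intensity(shifts, intensities, 1.30, 0.04))
```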
  • the diagnostic spectral windows are determined in relation to the condition under study, the precise index values for such windows may vary in accordance with the experimental parameters employed, for example, the digital resolution in the original spectra, the width of the buckets used, the temperature of the spectral data acquisition, etc.
  • the exact composition of the sample e.g., biofluid, tissue, etc.
  • the observation frequency will have an effect because of different degrees of peak overlap and of first/second order nature of spectra.
  • said one or more predetermined diagnostic spectral windows is: a single predetermined diagnostic spectral window.
  • said one or more predetermined diagnostic spectral windows is: a plurality of predetermined diagnostic spectral windows. In practice, this may be preferred.
  • Although the theoretical limit on the number of predetermined diagnostic spectral windows is a function of the data density (e.g., the number of variables, e.g., buckets), typically the number of predetermined diagnostic spectral windows is from 1 to about 30. It is possible for the actual number to be in any sub-range within these general limits. Examples of lower limits include 1, 2, 3, 4, 5, 6, 8, 10, and 15. Examples of upper limits include 3, 4, 5, 6, 8, 10, 15, 20, 25, and 30.
  • the number is from 1 to about 20. In one embodiment the number is from 1 to about 15. In one embodiment the number is from 1 to about 10. In one embodiment the number is from 1 to about 8. In one embodiment the number is from 1 to about 6. In one embodiment the number is from 1 to about 5. In one embodiment the number is from 1 to about 4. In one embodiment the number is from 1 to about 3. In one embodiment the number is 1 or 2.
  • said one or more predetermined diagnostic spectral windows is: a plurality of diagnostic spectral windows; and, said NMR spectral intensity at one or more predetermined diagnostic spectral windows is: a combination of a plurality of NMR spectral intensities, each of which is NMR spectral intensity for one of said plurality of predetermined diagnostic spectral windows.
  • said combination is a linear combination.
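  • A linear combination of the NMR spectral intensities at several diagnostic spectral windows can be sketched as follows. The coefficients, intercept and decision threshold are hypothetical placeholders; in practice they would come from a fitted model.

```python
def linear_combination(intensities, coefficients, intercept=0.0):
    """Combine the intensities of several diagnostic windows into one value."""
    if len(intensities) != len(coefficients):
        raise ValueError("one coefficient is required per spectral window")
    return intercept + sum(c * i for c, i in zip(coefficients, intensities))

def classify(score, threshold=0.0):
    """Assign a class from the combined value (threshold is an assumption)."""
    return "condition present" if score > threshold else "condition absent"

# Hypothetical coefficients for two windows: one raised, one lowered in the condition.
coefficients = [0.5, -1.0]
score = linear_combination([4.0, 1.0], coefficients)
print(score, classify(score))
```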
  • At least one of said one or more predetermined diagnostic spectral windows encompasses a chemical shift value for an NMR resonance of a diagnostic species (e.g., a ¹H NMR resonance of a diagnostic species).
  • each of a plurality of said one or more predetermined diagnostic spectral windows encompasses a chemical shift value for an NMR resonance of a diagnostic species (e.g., a ¹H NMR resonance of a diagnostic species).
  • each of said one or more predetermined diagnostic spectral windows encompasses a chemical shift value for an NMR resonance of a diagnostic species (e.g., a ¹H NMR resonance of a diagnostic species).
  • said one or more predetermined diagnostic spectral windows are associated with one or more diagnostic species.
  • index values and the associated diagnostic spectral windows, primarily reflect one or more of the species described in Table 2, Table 3, Table 4, and Table 5, below.
  • said predetermined diagnostic spectral windows are defined by one or more index values, δr, corresponding to the bucket regions listed in Table 2, Table 3, Table 4, and Table 5, below.
  • said predetermined diagnostic spectral windows are defined by one or more index values, δr, corresponding to the bucket regions listed in Table 2, Table 3, Table 4, and Table 5, below, for: acetate; amino acids; arginine; lactate; lipids (fatty acyl groups); lysine; sugars (mainly glucose); taurine.
  • said predetermined diagnostic spectral windows are defined by one or more index values, δr, corresponding to the bucket regions listed in Table 2, Table 3, Table 4, and Table 5, below, for: lactate; acetate; glycoproteins (e.g., N-acetyl groups thereof); sugars; and amino acids (e.g., α-CH groups thereof).
  • said predetermined diagnostic spectral windows are defined by one or more index values, δr, corresponding to the bucket regions listed in Table 2, Table 3, Table 4, and Table 5, below, for: glycoproteins (e.g., N-acetyl groups thereof); sugars; and amino acids (e.g., α-CH groups thereof).
  • said predetermined diagnostic spectral windows are defined by one or more index values, δr, corresponding to the bucket regions listed in Table 2, Table 3, Table 4, and Table 5, below, for: lactate, glucose, glycoproteins (e.g., N-acetyl groups thereof), glycerophosphorylcholine (GPC), trimethylamine-N-oxide (TMAO), and alanine.
  • said predetermined diagnostic spectral windows are defined by one or more index values, δr, corresponding to the bucket regions listed in Table 2, Table 3, Table 4, and Table 5, below, for: lactate, glucose, glycoproteins (e.g., N-acetyl groups thereof), and glycerophosphorylcholine (GPC).
  • said predetermined diagnostic spectral windows are defined by one or more index values, δr, corresponding to the bucket regions listed in Table 2, Table 3, Table 4, and Table 5, below, for: lactate, glycerophosphorylcholine (GPC), trimethylamine-N-oxide (TMAO), glycoproteins (e.g., N-acetyl groups thereof), and alanine.
  • said predetermined diagnostic spectral windows are defined by one or more index values, δr, corresponding to the bucket regions listed in Table 2, Table 3, Table 4, and Table 5, below, and breadth of the range value,
  • said predetermined diagnostic spectral windows are defined by one or more index values, δr, corresponding to the bucket regions listed in Table 2, Table 3, Table 4, and Table 5, below, and which are determined using the NMR experimental parameters set forth in the Examples.
Diagnostic Species and Biomarkers
  • the index values, and the associated diagnostic spectral windows define ranges of chemical shift in which NMR spectral intensity is indicative of the presence of one or more chemical species, one or more of which are diagnostic species (e.g., biomarkers), for example, for a condition (e.g., indication) under study.
  • said one or more diagnostic species are endogenous diagnostic species.
  • said one or more diagnostic species are associated with NMR spectral intensity at predetermined diagnostic spectral windows.
  • said one or more diagnostic species are a plurality of diagnostic species (i.e., a combination of diagnostic species).
  • said one or more diagnostic species is a single diagnostic species.
  • the term "endogenous species" pertains to chemical species which originated from the subject under study, for example, which were present in the sample of the subject.
  • Once an index value, and its associated diagnostic spectral window, is identified (e.g., by the application of modelling methods as described herein), it is often possible to identify one or more putative biomarkers which give rise to NMR spectral intensity in that particular window.
  • the (e.g., integrated) NMR spectral intensity in a particular spectral window is the sum of the spectral intensity for all of the NMR peaks in that window.
  • a particular spectral window e.g., bucket
  • the relevant peak(s) are then assigned.
  • Such assignments may be made, for example, by reference to published data; by comparison with spectra of authentic materials; by standard addition of an authentic reference standard to the sample; by separating the individual component, e.g., by using HPLC-NMR and identifying it using NMR and mass spectrometry. Additional confirmation of assignments is usually sought from the application of other NMR methods, including, for example, 2-dimensional (2D) NMR methods.
  • concentrations of candidate chemical species are measured by another specific method (e.g., ELISA, chromatography, RIA, etc.) and compared with the spectral intensity observed in the relevant diagnostic spectral window, and any correlation noted. This will reveal how much of the variance in the diagnostic spectral window is contributed by the candidate chemical species. This may also reveal that suspected diagnostic species are, in fact, not highly correlated with the condition under examination.
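  • The comparison between independently measured concentrations and the intensity in a diagnostic spectral window can be sketched with a Pearson correlation, whose square estimates the fraction of the variance in the window intensity accounted for by a linear dependence on the candidate species. The function name and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: concentrations from a specific assay and window intensities.
concentration = [1.0, 2.0, 3.0, 4.0, 5.0]
window_signal = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(concentration, window_signal)
print(r, r ** 2)  # r squared: share of variance explained by the candidate
```

  • A low value of r squared would suggest that other species contribute most of the intensity in that window.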
  • the methods described herein also facilitate the identification of species (often referred to as biomarkers or diagnostic species) which are indicative (e.g., diagnostic) of a particular condition.
  • particular metabolites e.g., in blood, urine, etc.
  • One aspect of the present invention pertains to a method of identifying such diagnostic species (e.g., biomarkers), as described herein.
  • One aspect of the present invention pertains to a method of identifying a diagnostic species, or a combination of a plurality of diagnostic species, for a predetermined condition, said method comprising the steps of: (a) applying a multivariate statistical analysis method to experimental data; wherein said experimental data comprises at least one data set comprising experimental parameters measured for each of a plurality of experimental samples; wherein said experimental samples define a class group consisting of a plurality of classes; wherein at least one of said plurality of classes is a class associated with said predetermined condition, e.g., a class associated with the presence of said predetermined condition; wherein at least one of said plurality of classes is a class not associated with said predetermined condition, e.g., a class associated with the absence of said predetermined condition; wherein each of said experimental samples is of known class selected from said class group; and:
  • one or more of said critical experimental parameters is a spectral parameter (i.e., a critical experimental spectral parameter); and said identifying and matching steps are: (b) identifying one or more critical experimental spectral parameters; and, (c) matching each of one or more of said one or more critical experimental spectral parameters with a spectral feature, e.g., a spectral peak; and matching one or more of said spectral peaks with said diagnostic species; or:
  • said multivariate statistical analysis method is a multivariate statistical analysis method which employs a pattern recognition method.
  • said multivariate statistical analysis method is, or employs, PCA.
  • said multivariate statistical analysis method is, or employs, PLS.
  • said multivariate statistical analysis method is, or employs, PLS-DA.
  • said multivariate statistical analysis method includes a step of data filtering.
  • said multivariate statistical analysis method includes a step of orthogonal data filtering.
  • said multivariate statistical analysis method includes a step of OSC.
  • said experimental parameters comprise spectral data.
  • said experimental parameters comprise both spectral data and non-spectral data (and are referred to as "composite experimental data").
  • said experimental parameters comprise NMR spectral data.
  • said experimental parameters comprise both NMR spectral data and non-NMR spectral data.
  • said NMR spectral data comprises ¹H NMR spectral data and/or ¹³C NMR spectral data.
  • said NMR spectral data comprises ¹H NMR spectral data.
  • said NMR spectral data comprises CPMG NMR spectral data.
  • said non-spectral data is non-spectral clinical data.
  • said non-NMR spectral data is non-spectral clinical data.
  • said critical experimental parameters are spectral parameters.
  • said class group comprises classes associated with said predetermined condition (e.g., presence, absence, degree, etc.).
  • said class group comprises exactly two classes.
  • said class group comprises exactly two classes: presence of said predetermined condition; and absence of said predetermined condition.
  • said class associated with said predetermined condition is a class associated with the presence of said predetermined condition.
  • said class not associated with said predetermined condition is a class associated with the absence of said predetermined condition.
  • said method further comprises the additional step of: (d) confirming the identity of said diagnostic species.
  • One aspect of the present invention pertains to novel diagnostic species (e.g., biomarkers) which are identified by such a method.
  • One aspect of the present invention pertains to one or more diagnostic species (e.g., biomarkers) which are identified by such a method for use in a method of classification (e.g., diagnosis).
  • One aspect of the present invention pertains to a method of classification (e.g., diagnosis) which relies upon (or employs) one or more diagnostic species (e.g., biomarkers) which are identified by such a method.
  • One aspect of the present invention pertains to use of one or more diagnostic species (e.g., biomarkers) which are identified by such a method in a method of classification (e.g., diagnosis).
  • One aspect of the present invention pertains to an assay for use in a method of classification (e.g., diagnosis), which assay relies upon one or more diagnostic species (e.g., biomarkers) which are identified by such a method.
  • One aspect of the present invention pertains to use of an assay in a method of classification (e.g., diagnosis), which assay relies upon one or more diagnostic species (e.g., biomarkers) which are identified by such a method.
  • At least one of said one or more predetermined diagnostic species is a species described in Table 2, Table 3, Table 4, or Table 5, below.
  • At least one of said one or more predetermined diagnostic species is selected from: acetate; alanine; amino acids; arginine; glycoproteins; 3-hydroxybutyrate; lactate; lipids (fatty acyl groups); lysine; sugars (mainly glucose); taurine.
  • At least one of said one or more predetermined diagnostic species is selected from: acetate; amino acids; arginine; lactate; lipids (fatty acyl groups); lysine; sugars (mainly glucose); taurine.
  • At least one of said one or more predetermined diagnostic species is selected from: lactate; acetate; glycoproteins; sugars; and amino acids.
  • At least one of said one or more predetermined diagnostic species is selected from: glycoproteins; sugars; and amino acids.
  • At least one of said one or more predetermined diagnostic species is selected from: lactate, glucose, glycoproteins (e.g., N-acetyl groups thereof), glycerophosphorylcholine (GPC), trimethylamine-N-oxide (TMAO), and alanine.
  • at least one of said one or more predetermined diagnostic species is selected from: lactate, glucose, glycoproteins (e.g., N-acetyl groups thereof), and glycerophosphorylcholine (GPC).
  • At least one of said one or more predetermined diagnostic species is selected from: lactate, glycerophosphorylcholine (GPC), trimethylamine-N-oxide (TMAO), glycoproteins (e.g., N-acetyl groups thereof), and alanine.
  • many of the methods of the present invention involve classification on the basis of an amount, or a relative amount, of one or more diagnostic species.
  • said classification/diagnosis is performed on the basis of an amount, or a relative amount, of a single diagnostic species.
  • said classification/diagnosis is performed on the basis of an amount, or a relative amount, of a plurality of diagnostic species.
  • said classification/diagnosis is performed on the basis of an amount, or a relative amount, of each of a plurality of diagnostic species.
  • said classification/diagnosis is performed on the basis of a total amount, or a relative total amount, of a plurality of diagnostic species.
  • said amount of, or relative amount of one or more diagnostic species is: a combination of a plurality of amounts, or relative amounts, each of which is the amount of, or relative amount of one of said plurality of diagnostic species.
  • said combination is a linear combination.
  • the term "amount," as used herein in the context of "amount of, or relative amount of, (e.g., diagnostic) species," pertains to the amount regardless of the terms of expression.
  • Absolute amounts may be expressed, for example, in terms of mass (e.g., µg), moles (e.g., µmol), volume (e.g., µL), concentration (molarity, µg/mL, µg/g, wt%, vol%, etc.), etc.
  • Relative amounts may be expressed, for example, as ratios of absolute amounts (e.g., as a fraction, as a multiple, as a %) with respect to another chemical species.
  • the amount may be expressed as a relative amount, relative to an internal standard, for example, another chemical species which is endogenous or added.
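  • Expressing an amount relative to an internal standard can be sketched as follows. The function names and signal values are hypothetical, and the sketch assumes that both signals were measured in the same spectrum and are directly comparable.

```python
def relative_amount(species_signal, standard_signal):
    """Ratio of a species signal to the signal of an internal standard."""
    if standard_signal <= 0:
        raise ValueError("internal standard signal must be positive")
    return species_signal / standard_signal

def as_percent(ratio):
    """Express a ratio as a percentage of the internal standard."""
    return 100.0 * ratio

# Hypothetical integrated intensities for a species and an internal standard.
ratio = relative_amount(3.0, 12.0)
print(ratio, as_percent(ratio))
```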
  • the amount may be indicated indirectly, in terms of another quantity (possibly a precursor quantity) which is indicative of the amount.
  • the other quantity may be a spectrometric or spectroscopic quantity (e.g., signal, intensity, absorbance, transmittance, extinction coefficient, conductivity, etc.; optionally processed, e.g., integrated) which is itself indicative of the amount.
  • the amount may be indicated, directly or indirectly, in regard to a different chemical species (e.g., a metabolic precursor, a metabolic product, etc.), which is indicative of the amount.
  • modulation e.g., of NMR spectral intensity at one or more predetermined diagnostic spectral windows; of the amount, or a relative amount, of diagnostic species; etc.
  • modulation pertains to a change, and may be, for example, an increase or a decrease. In one embodiment, said "a modulation of" is "an increase or decrease in."
  • the modulation (e.g., increase, decrease) is at least 10%, as compared to a suitable control. In one embodiment, the modulation (e.g., increase, decrease) is at least 20%, as compared to a suitable control. In one embodiment, the modulation is a decrease of at least 50% (i.e., a factor of 0.5). In one embodiment, the modulation is an increase of at least 100% (i.e., a factor of 2).
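  • The percentage modulation relative to a control can be sketched as follows. The function names and the default 10% cut-off used in the example are illustrative choices; the test and control values are hypothetical.

```python
def modulation_percent(test_value, control_value):
    """Signed percentage change of a test value relative to a control value."""
    return 100.0 * (test_value - control_value) / control_value

def describe_modulation(test_value, control_value, min_percent=10.0):
    """Label the change as an increase, a decrease, or no modulation."""
    change = modulation_percent(test_value, control_value)
    if change >= min_percent:
        return "increase"
    if change <= -min_percent:
        return "decrease"
    return "no modulation"

# A factor of 2 corresponds to +100%; a factor of 0.5 corresponds to -50%.
print(modulation_percent(2.0, 1.0), modulation_percent(0.5, 1.0))
print(describe_modulation(1.05, 1.0))
```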
  • Each of a plurality of predetermined diagnostic spectral windows, and each of a plurality of diagnostic species may have independent modulations, which may be the same or different. For example, if there are two predetermined diagnostic spectral windows, NMR spectral intensity may increase in one window and decrease in the other window.
  • combinations of modulations of NMR spectral intensity in different diagnostic spectral windows may be diagnostic.
  • the amount of one may increase, and the amount of the other may decrease.
  • combinations of modulations of amounts, or relative amounts of, different diagnostic species may be diagnostic. See, for example, the data in the Examples below, which illustrate cases where different species have different modulations.
  • the term "diagnostic shift" pertains to a modulation (e.g., increase, decrease), as compared to a suitable control.
  • a diagnostic shift may be in regard to, for example, NMR spectral intensity at one or more predetermined diagnostic spectral windows; or the amount of, or relative amount of, diagnostic species.
  • the diagnostic shift is as described in Table 2, Table 3, Table 4, or Table 5, below.
  • Suitable controls are usually selected on the basis of the organism (e.g., subject, patient) under study (test subject, study subject, etc.), and the nature of the study (e.g., type of sample, type of spectra, etc.). Usually, controls are selected to represent the state of "normality.” As described herein, deviations from normality (e.g., higher than normal, lower than normal) in test data, test samples, test subjects, etc. are used in classification, diagnosis, etc.
  • control subjects are the same species as the test subject and are chosen to be representative of the equivalent normal (e.g., healthy) organism.
  • a control population is a population of control subjects. If appropriate, control subjects may have characteristics in common (e.g., sex, ethnicity, age group, etc.) with the test subject. If appropriate, control subjects may have characteristics (e.g., age group, etc.) which differ from those of the test subject. For example, it may be desirable to choose healthy 20-year olds of the same sex and ethnicity as the study subject as control subjects.
  • control samples are taken from control subjects.
  • control samples are of the same sample type (e.g., serum), and are collected and handled (e.g., treated, processed, stored) under the same or similar conditions, as the sample under study (e.g., test sample, study sample).
  • control data e.g., control values
  • control data are obtained from control samples which are taken from control subjects.
  • control data e.g., control data sets, control spectral data, control spectra, etc.
  • control data are of the same type (e.g., 1-D ¹H NMR, etc.), and are collected and handled (e.g., recorded, processed) under the same or similar conditions (e.g., parameters), as the test data.
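  • A deviation of test data from the normality represented by control data can be sketched as a standardised difference (z-score). This particular statistic is an assumption chosen for illustration; the function name and the control values are hypothetical.

```python
import statistics

def deviation_from_normality(test_value, control_values):
    """Number of control standard deviations by which the test value
    lies above (positive) or below (negative) the control mean."""
    mean = statistics.mean(control_values)
    sd = statistics.stdev(control_values)
    return (test_value - mean) / sd

# Hypothetical control values for one diagnostic spectral window.
controls = [9.0, 10.0, 11.0]
print(deviation_from_normality(12.0, controls))
```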
  • the methods of the present invention, or parts thereof, may be conveniently performed electronically, for example, using a suitably programmed computer system.
  • One aspect of the present invention pertains to a computer system or device, such as a computer or linked computers, operatively configured to implement a method of the present invention, as described herein.
  • One aspect of the present invention pertains to computer code suitable for implementing a method of the present invention, as described herein, on a suitable computer system.
  • One aspect of the present invention pertains to a computer program comprising computer program means adapted to perform a method according to the present invention, as described herein, when said program is run on a computer.
  • One aspect of the present invention pertains to a computer program, as described above, embodied on a computer readable medium.
  • One aspect of the present invention pertains to a data carrier which carries computer code suitable for implementing a method of the present invention, as described herein, on a suitable computer.
  • the above-mentioned computer code or computer program includes, or is accompanied by, computer code and/or computer readable data representing a predictive mathematical model, as described herein.
  • One aspect of the present invention pertains to a computer system or device, such as a computer or linked computers, programmed or loaded with computer code and/or computer readable data representing a predictive mathematical model as described herein.
  • the above-mentioned computer code or computer program includes, or is accompanied by, computer code and/or computer readable data representing data from which a predictive mathematical model, as described herein, may be calculated.
  • One aspect of the present invention pertains to computer code and/or computer readable data representing a predictive mathematical model, as described herein.
  • One aspect of the present invention pertains to a data carrier which carries computer code and/or computer readable data representing a predictive mathematical model, as described herein.
  • One aspect of the present invention pertains to a computer system or device, such as a computer or linked computers, programmed or loaded with computer code and/or computer readable data representing a predictive mathematical model, as described herein.
  • Computers may be linked, for example, internally (e.g., on the same circuit board, on different circuit boards which are part of the same unit), by cabling (e.g., networking, Ethernet, internet), using wireless technology (e.g., radio, microwave, satellite link, cellphone), etc., or by a combination thereof.
  • Examples of data carriers and computer readable media include chip media (e.g., ROM, RAM, flash memory (e.g., Memory Stick™, Compact Flash™, Smartmedia™)), magnetic disk media (e.g., floppy disks, hard drives), optical disk media (e.g., compact disks (CDs), digital versatile disks (DVDs), magneto-optical (MO) disks), and magnetic tape media.
  • One aspect of the present invention pertains to a system (e.g., an "integrated analyser", "diagnostic apparatus") comprising: (a) a first component comprising a device for obtaining NMR spectral intensity data for a sample (e.g., an NMR spectrometer, e.g., a Bruker INCA 500 MHz); and, (b) a second component comprising a computer system or device, such as a computer or linked computers, operatively configured to implement a method of the present invention, as described herein, and operatively linked to said first component.
  • first and second components are in close proximity, e.g., so as to form a single console, unit, system, etc. In one embodiment, the first and second components are remote (e.g., in separate rooms, in separate buildings).
  • a sample e.g., blood, urine, etc.
  • a sample is obtained from a subject, for example, by a suitably qualified medical technician, nurse, etc., and the sample is processed as required.
  • a blood sample may be drawn, and subsequently processed to yield a serum sample, within about three hours.
  • the sample is appropriately processed (e.g., by dilution, as described herein), and an NMR spectrum is obtained for the sample, for example, by a suitably qualified NMR technician. Typically, this would require about fifteen minutes.
  • the NMR spectrum is analysed and/or classified using a method of the present invention, as described herein.
  • This may be performed, for example, using a computer system or device, such as a computer or linked computers, operatively configured to implement the methods described herein.
  • this step is performed at a location remote from the previous step.
  • an NMR spectrometer located in a hospital or clinic may be linked, for example, by Ethernet, internet, or wireless connection, to a remote computer which performs the analysis/classification. If appropriate, the result is then forwarded to the appropriate destination, e.g., the attending physician. Typically, this would require about fifteen minutes.
  • the methods described herein provide powerful means for the diagnosis and prognosis of disease, for assisting medical practitioners in providing optimum therapy for disease, and for understanding the benefits and side-effects of xenobiotic compounds thereby aiding the drug development process.
  • the methods described herein also have use in veterinary applications.
  • the methods described herein can be applied in a non-medical setting, such as in post mortem examinations and forensic science.
  • the technique can be used to identify a clinically silent disease prior to the onset of clinical symptoms.
  • Antenatal screening for a wide range of disease susceptibilities.
  • the methods described herein can be used to analyse blood or tissue drawn from a pre-term fetus (e.g., during chorionic villus sampling or amniocentesis) for the purposes of antenatal screening.
  • Therapeutic monitoring e.g., to monitor the progress of treatment. For example, by making serial diagnostic tests, it will be possible to determine whether and to what extent the subject is returning to normal following initiation of a therapeutic regimen.
  • the methods described herein may be used as an alternative or adjunct to other methods, e.g., the various genomic, pharmacogenomic, and proteomic methods.
  • test and control animals were VRQ homozygous (TSE susceptible genotype) Cheviot lambs (male & female) housed in high disease security accommodation on the VLA associated farms (ADAS, DEFRA, High Mowthorpe farm). Animals were bred for the purpose from New Zealand derived stock (free from scrapie). All groups were housed throughout, and maintained under identical conditions (straw and concentrates diet). Pooled brain homogenate was derived from similarly controlled disease free or experimentally infected sheep. Normal or scrapie infected pooled brain homogenate was administered orally (via syringe) to control or test animals.
  • each NMR spectrum was segmented into 256 regions of equal width and the signal intensities within each region summed; the region around the variably suppressed water peak was set to zero.
  • PCA Principal components analysis
  • Figure 1 is a PCA scores plot (component 1 vs. component 2). Controls (circles, •), pre-clinical manifestation (open squares, □), post-clinical manifestation (filled squares, ■). Some separation of the classes is apparent.
  • Figure 2 is the corresponding PCA loadings plot (component 1 vs. component 2).
  • Figure 3 is a PCA scores plot (component 2 vs. component 3). Controls (circles, •), pre-clinical manifestation (open squares, □), post-clinical manifestation (filled squares, ■). Good separation of the classes is apparent.
  • Figure 4 is the corresponding PCA loadings plot (component 2 vs. component 3).
  • PLSDA Partial Least Squares discriminant analysis
  • Figure 5 is a PLS-DA scores plot (component 2 vs. component 3). Controls
  • Figure 6 is the corresponding PLS-DA loadings plot (component 2 vs. component 3).
  • Figure 7 is the corresponding PLS-DA variable importance plot (VIP). Cumulative effects to component 3. All parameters contributing to >1% of the model are included.
  • NMR peak assignments corresponding to the buckets identified in Figure 7 are listed in the following table, in order of importance (all >1%).
  • Sheep were challenged with 5 g of a pooled brain sample from scrapie-infected sheep in June, and serum samples were taken in the following August, October, December, and January. 600 MHz ¹H NMR spectra were recorded for these serum samples using a standard spin-echo (CPMG) sequence to attenuate the intensity of the NMR peaks from macromolecules, thereby aiding the visibility of the small molecule metabolite peaks.
  • Figure 8 shows a typical ¹H CPMG NMR spectrum of sheep serum with some peak assignments marked.
  • Figure 9 is a PCA scores plot (component 1 vs. component 2) of the CPMG NMR serum spectral data from control and dose-challenged animals. Controls (filled squares, ■), dose-challenged in June and sampled in August (open squares, □), dose-challenged in June and sampled in October (filled diamonds, ◆), dose-challenged in June and sampled in December (open circles, ○), dose-challenged in June and sampled in January (stars, *).
  • pair-wise partial least squares discriminant analysis (PLS-DA) was performed using those samples that deviated from the controls, i.e., the group that moved in direction A (Group A) together with the controls, and the group that moved in direction B (Group B) together with the controls. From the loadings of such models, it is possible to determine the importance of each spectral variable to the discrimination.
  • the dominant metabolites influencing the differentiation between control and infected sheep (Group A and Group B) are listed in Table 4 and Table 5, respectively, together with a measure of their relative influence on the model, as given by the variable importance parameter (VIP).
  • the relative changes (increase, decrease) of metabolite levels observed in serum by ¹H NMR spectroscopy are also indicated.
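
As a rough illustration of the data reduction and PCA steps described above (each spectrum segmented into 256 equal-width regions with the intensities in each region summed, the water region set to zero, then principal components analysis), the following Python sketch uses NumPy only. It is not the inventors' implementation: the function names `bucket_spectrum` and `pca_scores`, the 0 to 10 ppm range, the 4.5 to 5.0 ppm water window, and the random demonstration spectra are all assumptions made for illustration. The source does not state what scaling, if any, was applied before PCA, so only mean-centring is applied here.

```python
import numpy as np


def bucket_spectrum(ppm, intensity, n_buckets=256, water=(4.5, 5.0)):
    """Sum intensities in n_buckets equal-width regions of the ppm axis.

    Buckets overlapping the (assumed) water window are set to zero.
    Pass water=None to skip the zeroing step.
    """
    ppm = np.asarray(ppm, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    edges = np.linspace(ppm.min(), ppm.max(), n_buckets + 1)
    # digitize returns 1..n_buckets for interior points; the clip places the
    # point at the upper edge into the last bucket instead of overflowing.
    idx = np.clip(np.digitize(ppm, edges) - 1, 0, n_buckets - 1)
    buckets = np.bincount(idx, weights=intensity, minlength=n_buckets)
    if water is not None:
        lo, hi = edges[:-1], edges[1:]
        buckets[(hi > water[0]) & (lo < water[1])] = 0.0
    return buckets, edges


def pca_scores(X, n_components=3):
    """Mean-centre X (samples x buckets) and compute PCA via the SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates
    loadings = Vt[:n_components].T                    # bucket contributions
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained


# Demonstration on random placeholder "spectra" (NOT real NMR data).
rng = np.random.default_rng(0)
ppm = np.linspace(0.0, 10.0, 4096)          # assumed chemical shift axis
spectra = rng.random((6, ppm.size))         # six hypothetical samples
X = np.vstack([bucket_spectrum(ppm, s)[0] for s in spectra])
scores, loadings, explained = pca_scores(X, n_components=3)
print(X.shape, scores.shape, loadings.shape)
```

Plotting `scores[:, 0]` against `scores[:, 1]` would give a scores plot of the kind shown in Figures 1 and 9, and the columns of `loadings` indicate which buckets contribute most to each component.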

Landscapes

  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • High Energy & Nuclear Physics (AREA)
  • Condensed Matter Physics & Semiconductors (AREA)
  • General Physics & Mathematics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention relates generally to the field of metabonomics and, more particularly, to chemometric methods for analysing chemical, biochemical, and biological data, for example spectral data such as nuclear magnetic resonance (NMR) spectra, and to their applications, such as classification, diagnosis, and prognosis, especially in the context of prion diseases, more particularly transmissible spongiform encephalopathies (TSEs), for example bovine spongiform encephalopathy, scrapie, kuru, Creutzfeldt-Jakob disease (CJD), and Gerstmann-Sträussler-Scheinker syndrome (GSS), and above all scrapie. The invention also relates to methods of classifying a sample and/or a subject, and to methods of diagnosis, which involve relating NMR spectral intensity at one or more predetermined diagnostic spectral windows to a predetermined condition associated with a prion disease, or relating the amount, or relative amount, of one or more diagnostic species to a predetermined condition associated with a prion disease.
The invention further relates to methods of identifying diagnostic species, or combinations of several diagnostic species, for a predetermined condition associated with a prion disease; to the diagnostic species so identified; to the use of these species (for example, lactate, glucose, glycoproteins, glycerophosphorylcholine (GPC), trimethylamine-N-oxide (TMAO), alanine) in a method of classification; to these species for use in a method of classification; to a method of classification which relies upon one or more of these species; to the use of one or more of these species in a method of classification; to an assay for use in a method of classification, which assay relies upon one or more of these species; to the use of such an assay in a method of classification; etc.
PCT/GB2004/004219 2003-10-07 2004-10-05 Diagnostic de maladies a prion et classification d'echantillons par mme et/ou mlle WO2005036198A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0323451A GB0323451D0 (en) 2003-10-07 2003-10-07 Methods for analysis of spectral data and their applications
GB0323451.5 2003-10-07

Publications (1)

Publication Number Publication Date
WO2005036198A1 true WO2005036198A1 (fr) 2005-04-21

Family

ID=29415668

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2004/004219 WO2005036198A1 (fr) 2003-10-07 2004-10-05 Diagnostic de maladies a prion et classification d'echantillons par mme et/ou mlle

Country Status (2)

Country Link
GB (1) GB0323451D0 (fr)
WO (1) WO2005036198A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105823788A (zh) * 2016-05-12 2016-08-03 山西大学 山西老陈醋1h-nmr指纹图谱的构建方法和应用
EP3073914A4 (fr) * 2013-11-26 2017-10-04 Bioscreening and Diagnostics LLC Prédiction métabolomique de défaut cardiaque congénital au cours de la grossesse, ainsi qu'aux stades de nouveau-né et pédiatrique
CN109791186A (zh) * 2016-10-06 2019-05-21 皇家飞利浦有限公司 对磁共振指纹期间的b0偏共振场的直接测量
US11723590B2 (en) * 2018-07-05 2023-08-15 Datchem Method and system for detecting and identifying acute pain, its transition to chronic pain, and monitoring subsequent therapy

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108333206A (zh) * 2017-09-11 2018-07-27 宁波大学 一种拟穴青蟹产地的鉴别方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000072007A2 (fr) * 1999-05-20 2000-11-30 Robert-Koch-Institut Procede pour diagnostiquer par spectroscopie infrarouge des modifications de tissu induites par une encephalopathie spongiforme transmissible (tse)
WO2002066963A2 (fr) * 2001-02-22 2002-08-29 Bundersrepublik Deutschland Vertreten Durch Das Bundesministerium Für Gesundheit, Dieses Vertreten Durch Das Rober-Koch-Institut Vertreten Durch Seinen Leiter Procede de detection de modifications induites par l'encephalopathie spongiforme chez l'homme et l'animal
US20030124610A1 (en) * 2000-07-04 2003-07-03 Pattern Recognition Systems Holding As Method for the analysis of a selected multicomponent sample
WO2004038444A1 (fr) * 2002-10-25 2004-05-06 Nederlandse Organisatie Voor Toegepast-Natuurwetenschappelijk Onderzoek Tno Systeme de detection precoce de maladie et developpement de biomarqueurs specifiques d'une maladie

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
BELL JD ET AL: "In vivo detection of metabolic changes in a mouse model of scrapie using nuclear magnetic resonance spectroscopy", JOURNAL OF GENERAL VIROLOGY, vol. 72, 1991, pages 2419 - 2423, XP009040640 *
COULTHARD A ET AL: "Quantitative analysis of MRI signal intensity in new variant Creutzfeldt-Jakob disease", THE BRITISH JOURNAL OF RADIOLOGY, vol. 72, 1999, pages 742 - 748, XP002308716 *
DEMAEREL P ET AL: "Accuracy of diffusion-weighted MR imaging in the diagnosis of sporadic Creutzfeldt-Jakob disease", J. NEUROL., vol. 250, February 2003 (2003-02-01), pages 222 - 225, XP002308715 *
GALANAUD D ET AL: "MR spectroscopic pulvinar sign in a case of variant Creutzfeldt-Jakob disease", J. NEURORADIOL., vol. 29, 2002, pages 285 - 287, XP001204104 *
KÜBLER E ET AL: "Diagnosis of prion diseases", BRITISH MEDICAL BULLETIN, vol. 66, June 2003 (2003-06-01), pages 267 - 279, XP001204105 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3073914A4 (fr) * 2013-11-26 2017-10-04 Bioscreening and Diagnostics LLC Prédiction métabolomique de défaut cardiaque congénital au cours de la grossesse, ainsi qu'aux stades de nouveau-né et pédiatrique
US10835148B2 (en) 2013-11-26 2020-11-17 Bioscreening & Diagnostics Llc Metabolomic prediction of congenital heart defect during pregnancy, newborn and pediatric stages
CN105823788A (zh) * 2016-05-12 2016-08-03 山西大学 山西老陈醋1h-nmr指纹图谱的构建方法和应用
CN105823788B (zh) * 2016-05-12 2017-10-17 山西大学 山西老陈醋1h‑nmr指纹图谱的构建方法和应用
CN109791186A (zh) * 2016-10-06 2019-05-21 皇家飞利浦有限公司 对磁共振指纹期间的b0偏共振场的直接测量
CN109791186B (zh) * 2016-10-06 2021-09-24 皇家飞利浦有限公司 对磁共振指纹期间的b0偏共振场的直接测量
US11723590B2 (en) * 2018-07-05 2023-08-15 Datchem Method and system for detecting and identifying acute pain, its transition to chronic pain, and monitoring subsequent therapy

Also Published As

Publication number Publication date
GB0323451D0 (en) 2003-11-05

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase