WO2015173107A1 - Non-invasive diagnostic method for the early detection of fetal malformations - Google Patents

Publication number
WO2015173107A1
Authority
WO
WIPO (PCT)
Prior art keywords
analysis
fetal
classification
metabolites
data
Prior art date
Application number
PCT/EP2015/060051
Other languages
French (fr)
Other versions
WO2015173107A8 (en
Inventor
Jacopo TROISI
Giovanni SCALA
Original Assignee
Guida, Maurizio
Priority date
Filing date
Publication date
Application filed by Guida, Maurizio
Priority to EP15719743.5A (EP3143408A1)
Priority to US15/310,197 (US20170138930A1)
Priority to JP2017512110A (JP2017516118A)
Publication of WO2015173107A1
Publication of WO2015173107A8
Priority to ZA2016/07324A (ZA201607324B)

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/483: Physical analysis of biological material
    • G01N33/487: Physical analysis of biological material of liquid biological material
    • G01N33/49: Blood
    • G01N33/492: Determining multiple analytes
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803: General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6848: Methods of protein analysis involving mass spectrometry
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/689: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids related to pregnancy or the gonads
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00: Detection or diagnosis of diseases
    • G01N2800/38: Pediatrics
    • G01N2800/385: Congenital anomalies

Definitions

  • Different classification models are suitable for the purpose of the present invention; in particular, the performance of PLS-DA, OPLS-DA, SVM and decision tree models has been positively evaluated.
  • PLS is a supervised method that uses multivariate regression techniques to extract, through linear combinations of the original variables (X), the information that predicts membership of a particular class (Y).
  • The PLS regression is performed using the plsr function provided by the pls package of the R language (Ron Wehrens and Bjørn-Helge Mevik. pls: Partial Least Squares Regression (PLSR) and Principal Component Regression (PCR), 2007, R package version 2.1-0). Classification and cross-validation are performed using the corresponding wrapper function of the caret package (Max Kuhn. Contributions from Jed Wing and Steve Weston and Andre Williams. caret: Classification and Regression Training, 2008, R package version 3.45).
  • a permutation test is performed.
  • A PLS-DA model is built from the data (X) and the permuted class labels (Y), using the optimal number of components determined by cross validation for the model based on the original class assignment.
  • Two types of statistical tests are performed to measure the discrimination power between classes. The first is based on the prediction accuracy in the training phase of the model. The second is based on the separation distance, expressed as the ratio of the between-class to the within-class sum of squared distances (B/W ratio).
  • Orthogonal Partial Least Squares - Discriminant Analysis (OPLS-DA) is an important development of the PLS-DA technique, proposed to handle the class-orthogonal variation in the data matrix.
  • OPLS-DA increases the classification performance of PLS-DA models. The classification performance is estimated by k-fold cross validation, dividing the data matrix into k random subsets. In each calculation cycle, one of the k subsets is set aside as the test set and the remaining k-1 subsets serve as the training set. Each of the k subsets is used once as the test set, generating k accuracy values. The accuracy of the classification is calculated as the average of the accuracy rates over the k subsets.
  • To be validated, the model is subjected to leave-one-out cross validation (LOOCV).
  • The kernel parameter chosen for the classification is the one that gives the maximum cross-validation accuracy.
  • The data matrix is mean-centered and scaled to unit variance before being divided into k subsets. In other words, the mean and the standard deviation of the training data are used to center and scale the test data.
  • The model is checked for overfitting. To do this, a validation set with known class labels is created, and it is verified whether it gives an accuracy rate comparable to that of the training data.
  • Another method is an R²/Q² validation plot, which helps to assess the risk that the current model is spurious, i.e., that the model fits the training set well but does not predict Y equally well for new observations.
  • The value of R² is the percentage of the variation in the training set that can be explained by the model.
  • The value of Q² is a cross-validated measure of R².
  • This validation compares the goodness of fit of the original model with the goodness of fit of different models based on data in which the order of the Y observations is randomly permuted, while the data matrix is kept intact.
  • The criteria for the validity of the model are the following:
  • The regression line (the line joining the real Q² point to the centroid of the cluster of permuted Q² values) has a negative y-axis intercept.
  • Support Vector Machines (SVMs) are relatively new supervised machine learning techniques for classification.
  • SVMs were first proposed in 1982 by Vapnik (Vapnik, V. Estimation of Dependences Based on Empirical Data, Springer Verlag: New York, 1982).
  • The basic principle of SVMs, which are essentially binary classifiers, is the following: given a data set with two classes, a linear classifier is constructed in the form of a maximum-margin hyperplane, which simultaneously minimizes the empirical classification error and maximizes the geometric margin.
  • The original data are mapped into a higher-dimensional feature space and a linear classifier is built in this new space (this is known as the "kernel trick"), which is equivalent to constructing a non-linear classifier in the original input space.
  • This mapping is implicitly given by the kernel function.
  • c is the regularization parameter, which sets the compromise between the learning accuracy and the prediction term
  • is a measure of the number of classification errors.
  • Decision trees build classification models based on recursive partitioning of data.
  • A decision tree algorithm begins with the entire data set; the data are divided into two or more subgroups based on the values of one or more attributes, and each subset is then repeatedly divided into smaller subsets until the size of each subset reaches an appropriate level.
  • The entire modeling process can be represented as a tree structure, and the generated model can be summarized as a set of "if-then" rules.
  • Decision trees are easy to interpret, computationally undemanding, and able to cope with noisy data. Most decision trees tackle classification problems, which is also the object of this invention.
  • the technique is also referred to as classification tree.
  • a node represents a set of data
  • the entire set of data is represented as a node at the root.
  • the diagnostic method of the present invention has been developed starting from the metabolomics analysis, carried out on blood samples collected from pregnant women with diagnosis of fetal malformation and from control pregnant women, with the clinical certainty of absence of fetal malformation pathologies.
  • The samples were collected from 100 healthy pregnant women who underwent abortion following a diagnosis of fetal malformation and who voluntarily donated blood samples. Blood samples were taken immediately before the termination of pregnancy using BD Vacutainer® tubes, and frozen at -30°C until analysis. The suspected diagnosis of fetal malformation, based on amniocentesis or ultrasound examination, was confirmed by post-abortion autoptic examination of the fetus. Each blood sample was matched with an equivalent control sample taken from a woman at the same week of gestation and with similar personal, physical and social characteristics (weight, height, body mass index, age, marital status, economic status, etc.).
  • central nervous callosum, hydrocephalus, cystic hygroma,
  • More than 150 signals are normally recognized in the TIC chromatogram of a single sample; some of these peaks were not investigated further because they were not found consistently in other samples, because their concentration was too low, or because their spectral quality was too poor for them to be confirmed as metabolites.
  • A total of 116 endogenous metabolites, such as amino acids, organic acids, carbohydrates, fatty acids and steroids, were detected.
  • LRI: linear retention index
  • The peak areas were normalized and corrected to the Ribitol signal. The results were summarized in a comma-separated values (CSV) matrix file and loaded into appropriate software for statistical processing.
  • the other classification models showed good classification capacity (although lower than OPLS-DA).
  • Several approaches are possible for the definitive allocation of the class of the unknown sample. It is possible to use the response of a single model or to integrate the responses of individual models in a more complex decision algorithm.
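
The k-fold cross validation and the scaling rule described above can be sketched as follows. This is a minimal illustration in plain Python, not the inventors' implementation (which the text attributes to R packages): the nearest-centroid classifier is a deliberately simple stand-in for PLS-DA, OPLS-DA, SVM or decision tree models, and all function names and data values are hypothetical.

```python
import random
from statistics import mean, pstdev

def standardize(train, test):
    """Center and scale both sets using the mean and SD of the training set only."""
    columns = list(zip(*train))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]  # guard against constant columns

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

    return scale(train), scale(test)

def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier (assumption): assign each row to the closest class centroid."""
    centroids = {}
    for label in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(c) for c in zip(*members)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda label: sq_dist(centroids[label], x))
            for x in test_x]

def k_fold_accuracy(x, y, k, seed=0):
    """Mean accuracy over k folds; each fold serves as the test set exactly once."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in indices if i not in held_out]
        train_x, test_x = standardize([x[i] for i in train_idx],
                                      [x[i] for i in fold])
        predicted = nearest_centroid_predict(train_x,
                                             [y[i] for i in train_idx],
                                             test_x)
        hits = sum(p == y[i] for p, i in zip(predicted, fold))
        accuracies.append(hits / len(fold))
    return mean(accuracies)

# Hypothetical, well-separated two-metabolite profiles (not patent data).
x = [[10.0, 5.0], [11.0, 4.0], [10.5, 4.5], [9.5, 5.5],
     [4.0, 9.0], [5.0, 10.0], [4.5, 9.5], [5.5, 8.5]]
y = ["normal fetus"] * 4 + ["malformed fetus"] * 4

print(k_fold_accuracy(x, y, k=4))        # 4-fold cross validation
print(k_fold_accuracy(x, y, k=len(x)))   # k equal to the sample count gives LOOCV
```

Setting k to the number of samples turns the same routine into leave-one-out cross validation, which is why a single function covers both validation schemes mentioned above.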

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Chemical & Material Sciences (AREA)
  • Hematology (AREA)
  • Immunology (AREA)
  • Urology & Nephrology (AREA)
  • General Health & Medical Sciences (AREA)
  • Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • Food Science & Technology (AREA)
  • Biochemistry (AREA)
  • Analytical Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Cell Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Primary Health Care (AREA)
  • Software Systems (AREA)
  • Epidemiology (AREA)
  • Theoretical Computer Science (AREA)
  • Gynecology & Obstetrics (AREA)
  • Pregnancy & Childbirth (AREA)
  • Reproductive Health (AREA)
  • Databases & Information Systems (AREA)
  • Ecology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)

Abstract

A non-invasive method for the early diagnosis of fetal malformations, based on the metabolomics analysis of maternal blood, is described herein.

Description

"NON-INVASIVE DIAGNOSTIC METHOD FOR THE EARLY DETECTION OF FETAL MALFORMATIONS"
*************
INTRODUCTION
The present invention relates to a non-invasive method for early diagnosis of fetal malformations and, more specifically, to a non-invasive method for early diagnosis of fetal malformations based on the metabolomic analysis of maternal blood.
Fetal developmental defects together occur in between 2 and 3% of all pregnancies (Hoyert DL, Mathews, T.J., Menacker F, et al. Annual summary of vital statistics: 2004. Pediatrics 2006; 117: 168-83), and are responsible for around 21% of perinatal and infant deaths (T.J. Mathews, M.S., and Marian F. MacDorman, Infant Mortality Statistics From the 2008 Period Linked Birth/Infant Death Data Set, National Vital Statistics Reports, 2012; 60 (5)), as well as for a significant number of cases of disability and chronic disease. For these reasons, screening for fetal malformations is common clinical practice in most developed countries. The most commonly used diagnostic methodology for this purpose is ultrasonography, which is non-invasive and safe for both the mother and the fetus. Its effectiveness in detecting fetal malformations, however, depends on the operator's experience and on the quality of the equipment used, and is in any event reduced in particular clinical conditions such as oligohydramnios, maternal obesity, or complex fetal abnormalities.
The main limitation of this diagnostic methodology is its inability to detect birth defects before the second trimester of pregnancy. Other diagnostic methods, such as chorionic villus sampling and amniocentesis, can identify some well-defined malformations as early as the first trimester of pregnancy. However, these methods are useful only for certain defined congenital anomalies, such as trisomies or other chromosomopathies, and they are invasive, thus exposing both the mother and the fetus to a significant risk of serious complications.
There is therefore a strongly felt need for a non-invasive diagnostic method capable of detecting fetal malformations in the first trimester of pregnancy with good sensitivity and specificity.
The present invention relates to a non-invasive method for the early diagnosis of fetal malformations, based on the metabolomics analysis of maternal blood and on the integration of the results obtained by means of multivariate analysis, which uses both discriminant analysis models (PLS-DA and OPLS-DA) and machine learning models (SVM and decision tree).
The term "metabolomics" commonly refers to the analysis of cellular processes through the study of the metabolic profile of the small molecules of an organism.
With the terms "metabolomics analysis" and "metabolomics profile", the inventors refer to the execution of a process aimed at identifying, and determining the concentration of, the greatest possible number of metabolites in a biological sample.
The term "metabolites" commonly refers to small molecules derived from the anabolic or catabolic processes of a cell or a set of cells. With the term "metabolites" the inventors refer to all molecules with a molecular weight of less than 1000 Dalton that are potentially identifiable and measurable within a biological sample.
To date, several thousand metabolites have been identified in human serum, and the application of metabolomics has allowed the development of biomarkers for many diseases, such as schizophrenia (Kaddurah-Daouk R., Metabolic profiling of patients with schizophrenia, PLOS Med 2006; 8:e363), meningitis (Subramanian A. et al., Proton MR/CSF analysis and a new software as predictors for the differentiation of meningitis in children, NMR Biomed 2005; 18: 213-25) and colon cancer (C Denkert, et al., Metabolite profiling of human colon carcinoma - deregulation of TCA cycle and amino acid turnover, Mol Cancer 2008; 7:1-15).
However, the use of metabolomics in obstetrics has so far been limited to studies of preeclampsia (RO Bahado-Singh, R. Akolekar, R. Mandal et al., Metabolomics and first-trimester prediction of early-onset preeclampsia, Journal of Maternal-Fetal and Neonatal Medicine, vol. 25(10): 1840-7, 2012) and of growth restriction (RP Horgan, OF Broadhurst, SK Walsh et al., Metabolic profiling uncovers a phenotypic signature of small for gestational age in early pregnancy, Journal of Proteome Research, vol. 10(8): 3660-73, 2011), and to studies using nuclear magnetic resonance (NMR). To date, no studies using gas chromatography coupled to mass spectrometry and chemometric techniques for the diagnosis of fetal malformations have been reported in the literature.
DESCRIPTION
The diagnostic method of the present invention is based on two phases. In the first phase, samples from mothers with fetuses known to be malformed and samples from mothers with fetuses known to be healthy are analyzed, and the classification models are trained on them. This phase, referred to as the training phase, is designed to create and define the characteristics of the metabolic profile in the blood of the two groups. The expression "metabolic profile" refers to the specific pattern that the metabolites take in the patient's blood, depending on their relative proportions.
In the second phase, the unknown samples are subjected to GCMS analysis, and the resulting chromatograms are classified according to the previously trained models, thus estimating the most probable class.
Therefore, the diagnostic process is not based on measuring the concentration of individual metabolites; rather, the entire cluster of metabolites is considered as a biomarker. These metabolites allow a sample to be assigned to one of two classes because they are present in different proportions in the two groups.
More in detail, the first phase is based on several sub-phases:
1. Extraction and derivatization of metabolites;
2. GCMS or GCxGCMS analysis;
3. Data array creation;
4. Structuring of the classification models.
The second phase involves applying the first three sub-phases of the first phase to the unknown sample and assigning the most likely class of membership by querying the classification model built in the first phase.
The method of the present invention has the advantage that it can be used as early as the first trimester of gestation.
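
The two phases described above can be illustrated with a minimal sketch. It assumes that each sample has already been reduced to a list of normalized metabolite areas, and it uses a nearest-centroid rule purely as a placeholder for the classification models named in this document (PLS-DA, OPLS-DA, SVM, decision tree). The function names and the numeric values are hypothetical.

```python
from statistics import mean

def train_centroids(rows):
    """Training phase: rows are (class_label, [metabolite values]) pairs."""
    by_class = {}
    for label, profile in rows:
        by_class.setdefault(label, []).append(profile)
    # One mean profile (centroid) per class.
    return {label: [mean(col) for col in zip(*profiles)]
            for label, profiles in by_class.items()}

def classify(centroids, profile):
    """Classification phase: return the class whose centroid is closest."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], profile))

# Hypothetical normalized areas for three metabolites (not patent data).
training = [
    ("normal fetus", [10.0, 5.0, 1.0]),
    ("normal fetus", [11.0, 4.0, 1.2]),
    ("malformed fetus", [4.0, 9.0, 3.0]),
    ("malformed fetus", [5.0, 10.0, 2.8]),
]
model = train_centroids(training)   # first phase: training on known samples

unknown = [4.5, 9.5, 3.1]           # second phase: an unknown sample
predicted = classify(model, unknown)
print(predicted)
```

The point of the sketch is the separation of concerns: the model is built once from samples of known class, and unknown samples only pass through the same preprocessing before being assigned to the most probable class.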
Extraction and derivatization of the metabolites
50 μL of haemolysed blood are transferred into 2 mL Eppendorf tubes, and 20 μL of a 1 g/L solution of Ribitol and 500 μL of methanol are added. The solution is vortexed for 30 seconds. After heating for 15 minutes at 70°C, the samples are centrifuged at 10,000 rpm for 10 minutes at 20°C. A 200 μL aliquot of the supernatant is collected and transferred to new 2 mL Eppendorf tubes; 200 μL of water and 100 μL of chloroform are added, and the mixture is vortexed for 30 seconds and centrifuged at 4,000 rpm for 15 minutes at 20°C. A 200 μL aliquot of the supernatant is again collected, transferred into 0.2 mL glass vials and dried under nitrogen flow; 50 μL of methoxylamine hydrochloride (20 mg/mL in pyridine) are then added and the reaction is conducted in the dark at 20°C for 16 hours. Finally, 50 μL of N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA) with 1% trimethylchlorosilane (TMCS) are added to each vial and the silylation reaction is conducted at 70°C for 16 hours.
To obtain a separation between the metabolites that is useful for the purposes of this invention, it is possible to operate either in one-dimensional or in two-dimensional gas chromatography. The higher resolving power of the two-dimensional technique potentially offers better classification accuracy, but it is also possible to operate with one-dimensional gas chromatography, which is the more widely used technique, as shown in the "Experimental evidence of the operation of the invention".
MDGCMS analysis
For the two-dimensional gas chromatography, a primary column (placed in the first oven) of the type SLB-5ms 20.0 m x 0.18 mm ID with 0.10 μm film thickness [silphenylene polymer, which is virtually equivalent in polarity to poly(5% diphenyl/95% methylsiloxane)] (Supelco) is used, and it is connected to position 1 of a 7-port interface (SGE). A BPX-50 5.0 m x 0.25 mm ID column with 0.25 μm film thickness is connected to position 7 of the interface. A BPX-50 1.5 m x 0.25 mm ID, 0.25 μm column is fixed at position 6 and connected to a flame ionization detector (FID) held at 320°C, while the 5.0 m analytical column (chemically identical to the one connected to the FID) is connected to the qMS system. The column connected to the FID is used to reduce the flow in the second dimension and to verify that an unrepresentative compound is not the result of a random fluctuation of the chromatography. A 40 μL external capillary (20 cm x 0.71 mm OD x 0.51 mm ID, made of stainless steel) is used to connect ports 3 and 4 of the SGE interface. The temperature program is the same for the two ovens: 80°C for 1 minute, then heating to 320°C at 3°C/minute, held for 1 minute. The initial helium pressure (constant linear velocity) is fixed at 129.6 kPa. The initial auxiliary helium pressure of the APC (advanced pressure control), which also operates in conditions of constant linear velocity, is set at 90.4 kPa, and the injection volume at 1 μL with a split ratio of 1:10. The modulation period is set at 4.1 s (accumulation period of 4.0 seconds, injection period of 0.1 second). The conditions of the quadrupole mass spectrometer are: ionization mode: electron impact (70 eV); mass range: 40-800 m/z; scanning speed: 10,000 amu/second.
GCMS analysis
For one-dimensional gas chromatography, a column of type ZB5-ms, 60.0 m x 0.25 mm ID x 0.25 μm [silphenylene polymer, virtually equivalent in polarity to poly(5% diphenyl/95% methylsiloxane)] (Phenomenex), is used.
The GC temperature program is 80°C for 1 minute, then heating to 300°C at 3°C/minute, followed by a hold of 1.67 minutes. The initial helium pressure (constant linear velocity) is fixed at 129.6 kPa. The injection volume is 1 μL with a split ratio of 1:2. The quadrupole mass spectrometer conditions are: ionization mode, electron impact (70 eV); mass range, 40-800 m/z; scan speed, 10,000 amu/second.
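As a quick arithmetic check of the two oven programs described above, the following minimal Python sketch computes the total run time of a single-ramp program and, for the two-dimensional method, the approximate number of modulation cycles. The function name and the idea of counting modulation cycles over the whole run are illustrative assumptions, not part of the described method.

```python
def oven_run_time_min(start_c, end_c, ramp_c_per_min, initial_hold_min, final_hold_min):
    """Total run time (minutes) of a single-ramp GC oven program."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min

# One-dimensional method: 80 C (1 min), 3 C/min to 300 C, 1.67 min hold.
one_d_min = oven_run_time_min(80, 300, 3, 1.0, 1.67)

# Two-dimensional method: 80 C (1 min), 3 C/min to 320 C, 1 min hold.
two_d_min = oven_run_time_min(80, 320, 3, 1.0, 1.0)

# Assumption: the 4.1 s modulation period runs for the whole 2D analysis.
modulation_cycles = round(two_d_min * 60 / 4.1)

print(f"1D run time: {one_d_min:.1f} min")
print(f"2D run time: {two_d_min:.1f} min, about {modulation_cycles} modulation cycles")
```

Under these settings the one-dimensional run lasts about 76 minutes and the two-dimensional run 82 minutes.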
Data Array Creation
The gas chromatograms obtained in SCAN mode are integrated to identify all peaks with an area greater than 10 times the background noise of the chromatogram. Each peak must be identified on the basis of one quantifier m/z signal and at least two qualifier m/z signals. After integration, quantification is carried out by the normalized percentage area method; the Ribitol peak is used as the reference for quantitative analysis and for aligning retention times. The resulting values (normalized percentage areas) are transferred to a matrix in which each row represents a sample and each column represents a metabolite, uniquely identified by its gas chromatographic retention time.
The first column of the matrix defines the class of the sample. In the simplest scenario only two classes, "normal fetus" and "malformed fetus", are envisaged. Evidence for the invention based on this dichotomous classification is given in the "Experimental evidence of the operation of the invention", but the inventors consider that more complex classification scenarios, in which specific malformation classes are separated, are possible given a sufficient number of observations.
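The construction of the data matrix can be sketched in Python as follows. This is a minimal illustration under stated assumptions: retention times are taken to be already aligned (so they can serve as column keys), each area is expressed as a percentage of the Ribitol peak area of the same sample (one plausible reading of "normalized percentage areas"), and the dictionary layout and the name `build_matrix` are hypothetical. The toy numbers are invented for the example.

```python
import numpy as np

def build_matrix(samples, noise_factor=10.0):
    """Return (labels, retention_times, X); rows are samples, columns metabolites."""
    def kept(sample, area):
        # Keep only peaks whose area exceeds noise_factor times the background noise.
        return area > noise_factor * sample["noise"]

    rts = sorted({rt for s in samples for rt, a in s["peaks"].items() if kept(s, a)})
    labels = [s["label"] for s in samples]
    X = np.zeros((len(samples), len(rts)))
    for i, s in enumerate(samples):
        for j, rt in enumerate(rts):
            area = s["peaks"].get(rt, 0.0)
            if kept(s, area):
                # Assumed normalization: percentage of the Ribitol reference peak.
                X[i, j] = 100.0 * area / s["ribitol_area"]
    return labels, rts, X

# Invented toy data: two samples, peaks keyed by aligned retention time (min).
samples = [
    {"label": "normal fetus", "noise": 1.0, "ribitol_area": 200.0,
     "peaks": {10.5: 50.0, 12.0: 5.0}},
    {"label": "malformed fetus", "noise": 2.0, "ribitol_area": 400.0,
     "peaks": {10.5: 100.0, 15.2: 30.0}},
]
labels, rts, X = build_matrix(samples)
print(labels, rts)
print(X)
```

The peak at 12.0 min is dropped because its area does not exceed ten times the noise of its sample.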
Structuring of classification models
Different classification models are suitable for the purpose of the present invention; in particular, the performance of PLS-DA, OPLS-DA, SVM and decision tree models has been positively evaluated.
PLS-DA
PLS is a supervised method that uses multivariate regression to extract, through linear combinations of the original variables (X), the information that predicts membership of a particular class (Y). The PLS regression is performed using the plsr function provided by the pls package of the R language (Ron Wehrens and Bjørn-Helge Mevik. pls: Partial Least Squares Regression (PLSR) and Principal Component Regression (PCR), 2007, R package version 2.1-0). Classification and cross-validation are performed using the corresponding wrapper function of the caret package (Max Kuhn. Contributions from Jed Wing, Steve Weston and Andre Williams. caret: Classification and Regression Training, 2008, R package version 3.45). To evaluate how well the classes are discriminated, a permutation test is performed. In each permutation, a PLS-DA model is built from the data (X) and the permuted class labels (Y), using the optimal number of components determined by cross-validation for the model based on the original class assignment. Two test statistics are used to measure the discrimination power between classes. The first is the prediction accuracy in the training phase of the model. The second is the separation distance, expressed as the ratio of the between-class to the within-class sum of squared distances (B/W ratio).
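The permutation test with the B/W statistic can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the B/W ratio is computed directly on the feature matrix rather than on PLS-DA scores, no PLS model is refitted per permutation, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def bw_ratio(X, y):
    """Between-class over within-class sum of squared distances."""
    overall = X.mean(axis=0)
    between = 0.0
    within = 0.0
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        between += len(Xc) * np.sum((mu - overall) ** 2)
        within += np.sum((Xc - mu) ** 2)
    return between / within

def permutation_p_value(X, y, n_perm=200, seed=0):
    """Empirical p-value: share of label permutations with B/W >= observed."""
    rng = np.random.default_rng(seed)
    observed = bw_ratio(X, y)
    hits = sum(bw_ratio(X, rng.permutation(y)) >= observed for _ in range(n_perm))
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic two-class data, invented purely for illustration.
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(0.0, 1.0, (20, 5)), rng.normal(3.0, 1.0, (20, 5))])
y_demo = np.array([0] * 20 + [1] * 20)
observed, p_value = permutation_p_value(X_demo, y_demo)
print(f"B/W = {observed:.2f}, permutation p-value = {p_value:.4f}")
```

A small p-value indicates that the observed class separation is unlikely to arise from a random assignment of labels.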
OPLS-DA
Orthogonal Partial Least Squares - Discriminant Analysis (OPLS-DA) is an important development of the PLS-DA technique that has been proposed to handle the class-orthogonal variation in the data matrix. OPLS-DA increases the classification performance of PLS-DA models. Classification performance is estimated by "k-fold cross validation", dividing the data array into k random subsets. In each calculation cycle, one of the k subsets is set aside as the test set and the remaining k-1 subsets serve as the training set. Each of the k subsets is used once as the test set, generating k accuracy values. The classification accuracy is calculated as the average of the accuracy rates over the k subsets. To be validated, the model is also subjected to leave-one-out cross validation (LOOCV). The kernel parameter chosen for classification is the one that gives the maximum cross-validation accuracy. The data matrix is scaled to zero mean and unit variance before being divided into k subsets. In other words, the mean and the standard deviation of the training data are used to center and scale the test data. Once trained, the model is checked for "overfitting". To do this, a validation set with known class labels is created, and it is verified whether it gives an accuracy rate comparable to that of the training data. Another method is an R2/Q2 validation plot, which helps to assess the risk that the current model is spurious, i.e., that the model fits the training set well but does not predict Y equally well for new observations. The value of R2 is the percentage of the variation in the training set that can be explained by the model. The value of Q2 is the cross-validated counterpart of R2.
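The k-fold procedure, with the test fold centered and scaled using training-fold statistics, can be sketched in Python as follows. This is only an illustration: a nearest-centroid classifier is swapped in as a simple stand-in for OPLS-DA, and the function name and synthetic data are hypothetical.

```python
import numpy as np

def kfold_accuracy(X, y, k=5, seed=0):
    """Mean accuracy over k folds; scaling uses training-fold mean and std only."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        mu = X[train].mean(axis=0)
        sd = X[train].std(axis=0, ddof=1)
        sd[sd == 0] = 1.0  # avoid division by zero for constant columns
        X_train = (X[train] - mu) / sd
        X_test = (X[test] - mu) / sd
        # Stand-in classifier (assumption): assign each test row to the nearest class centroid.
        classes = np.unique(y[train])
        centroids = np.array([X_train[y[train] == c].mean(axis=0) for c in classes])
        dist = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        predicted = classes[dist.argmin(axis=1)]
        accuracies.append(float(np.mean(predicted == y[test])))
    return float(np.mean(accuracies)), accuracies

# Synthetic two-class data, invented purely for illustration.
rng = np.random.default_rng(2)
X_demo = np.vstack([rng.normal(0.0, 1.0, (20, 5)), rng.normal(3.0, 1.0, (20, 5))])
y_demo = np.array([0] * 20 + [1] * 20)
mean_acc, fold_accs = kfold_accuracy(X_demo, y_demo, k=5)
print(f"mean accuracy over {len(fold_accs)} folds: {mean_acc:.3f}")
```

Calling the same function with `k` equal to the number of samples gives leave-one-out cross validation.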
This validation compares the goodness of fit of the original model with that of models built on data in which the order of the Y observations is randomly permuted while the X matrix is kept intact. The criteria for the validity of the model are the following:
1. All the Q2 values on the permuted data sets must be lower than the Q2 value estimated on the actual data set. If this condition is not met, the model is overfitted.
2. The regression line (the line joining the real Q2 point to the centroid of the cluster of permuted Q2 values) has a negative y-axis intercept.
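The two validity criteria can be checked programmatically, as in the minimal Python sketch below. It assumes that the horizontal axis of the permutation plot is the correlation between the original and the permuted Y (so the real model sits at x = 1); that convention, the function name and the numbers are assumptions made for illustration.

```python
import numpy as np

def permutation_validity(q2_real, q2_perm, corr_perm):
    """Return (all_lower, intercept, valid) for the two criteria listed above."""
    q2_perm = np.asarray(q2_perm, dtype=float)
    corr_perm = np.asarray(corr_perm, dtype=float)
    # Criterion 1: every permuted Q2 is below the real Q2.
    all_lower = bool(np.all(q2_perm < q2_real))
    # Criterion 2: line through (1, q2_real) and the centroid of the permuted cloud.
    cx, cy = corr_perm.mean(), q2_perm.mean()
    slope = (q2_real - cy) / (1.0 - cx)
    intercept = q2_real - slope * 1.0
    return all_lower, float(intercept), bool(all_lower and intercept < 0)

# Invented values, for illustration only.
all_lower, intercept, valid = permutation_validity(
    q2_real=0.5,
    q2_perm=[-0.30, -0.10, -0.20, 0.05],
    corr_perm=[0.1, 0.3, 0.2, 0.4],
)
print(all_lower, round(intercept, 3), valid)
```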
SVM
Support Vector Machines (SVMs) are relatively recent supervised machine learning techniques for classification. SVMs were first proposed in 1982 by Vapnik (Vapnik, V. Estimation of Dependences Based on Empirical Data, Springer Verlag: New York, 1982). The basic principle of SVMs, which are essentially binary classifiers, is the following: given a set of data with two classes, a linear classifier is constructed in the form of a hyperplane that simultaneously minimizes the empirical classification error and maximizes the geometric margin. For data sets that are not linearly separable, the original data are mapped into a higher-dimensional feature space and a linear classifier is built in this new space (this is known as the "kernel trick"), which is equivalent to constructing a non-linear classifier in the original input space. This mapping is given implicitly by the kernel function.
Given a set of training data $x_i \in \mathbb{R}^n$, $i = 1, \dots, m$, where each $x_i$ falls into one of two categories $y_i \in \{-1, 1\}$, the SVM determines the hyperplane with parameters $(w, b)$ obtained by solving the following convex optimization problem:
$$\min_{w,\,b,\,\varepsilon} \; \frac{1}{2} w^{T} w + c \sum_{i=1}^{m} \varepsilon_i$$
subject to the following conditions:
$$y_i \left( w^{T} x_i + b \right) \ge 1 - \varepsilon_i, \qquad \varepsilon_i \ge 0, \qquad i = 1, \dots, m,$$
where $c$ is the regularization parameter, which sets the compromise between training accuracy and predictive ability, and the $\varepsilon_i$ measure the classification errors. Including the regularization term reduces the problem of overfitting.
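To make the role of the slack terms and of the parameter c concrete, the following Python sketch evaluates a soft-margin objective for a given hyperplane. It assumes the standard formulation, one half of the squared norm of w plus c times the sum of the slacks, with each slack set to the smallest value that satisfies the constraints; the function name and the toy data are hypothetical, and no optimization is performed.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, c=1.0):
    """Objective value and per-sample slack for the hyperplane (w, b)."""
    margins = y * (X @ w + b)
    # Smallest slack satisfying y_i (w.x_i + b) >= 1 - slack_i and slack_i >= 0.
    slack = np.maximum(0.0, 1.0 - margins)
    objective = 0.5 * float(w @ w) + c * float(slack.sum())
    return objective, slack

# Toy data: the third point lies on the wrong side of the hyperplane x1 = 0.
X_toy = np.array([[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y_toy = np.array([1.0, -1.0, -1.0])
objective, slack = soft_margin_objective(np.array([1.0, 0.0]), 0.0, X_toy, y_toy, c=1.0)
print(objective, slack)
```

A larger c penalizes the misclassified third point more heavily, while a smaller c favors a wider margin.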
Decision trees
Decision trees build classification models based on recursive partitioning of the data. Typically, a decision tree algorithm begins with the entire data set, divides the data into two or more subgroups based on the values of one or more attributes, and then repeatedly divides each subset into smaller subsets until the size of each subset reaches an appropriate level. The entire modeling process can be represented as a tree structure, and the generated model can be summarized as a set of "if-then" rules. Decision trees are easy to interpret, computationally undemanding, and able to cope with noisy data. Most decision trees address classification problems, which are also the object of this invention. In this context, the technique is also referred to as a classification tree. In the tree representation, a node represents a set of data, and the entire data set is represented by the root node.
EXPERIMENTAL EVIDENCE OF THE INVENTION FUNCTIONING
The diagnostic method of the present invention has been developed starting from the metabolomics analysis, carried out on blood samples collected from pregnant women with diagnosis of fetal malformation and from control pregnant women, with the clinical certainty of absence of fetal malformation pathologies.
Sample Collection
The samples were collected from 100 healthy pregnant women who underwent abortion following a diagnosis of fetal malformation and voluntarily donated blood samples. Blood samples were taken immediately before the termination of pregnancy using BD Vacutainer® tubes and frozen at -30°C until analysis. The suspected diagnosis of fetal malformation, based on amniocentesis or ultrasound examination, was confirmed by post-explant autopsy of the fetus. Each blood sample was matched with a control sample taken from a woman at the same week of gestation and with similar personal, physical and social characteristics (weight, height, body mass index, age, marital status, economic status, etc.).
Extraction and derivatization of the metabolites
The extraction and derivatization of the samples were conducted in accordance with the provisions in the DESCRIPTION paragraph.
Chromatographic determinations
The GCMS analysis and GCxGCMS were carried out with Shimadzu instruments according to the information given in the DESCRIPTION paragraph.
Statistical analysis
The multivariate statistical analysis of the data (PLS-DA and OPLS-DA) and the machine learning (SVM and decision tree) were performed on the chromatographic data, normalized and corrected on the basis of the Ribitol peak area, using SIMCA-P 13.0 (Umetrics), RapidMiner 5.3 (Rapid-I) and R (R Foundation for Statistical Computing, Vienna). The values were mean-centered and scaled to unit variance.
Results
The results were obtained from 100 cases of fetal malformation (FM) and 100 controls. The demographic and clinical characteristics of the FM cases and controls are shown in Table 1, whereas the investigated malformations are listed in Table 2.
Table 1: characteristics of the study population
Figure imgf000011_0001
Table 2: malformations investigated
Type | Malformation | Number of cases
Abnormalities of the central and peripheral nervous system | Acrania, anencephaly, agenesis of the corpus callosum, hydrocephalus, cystic hygroma, myelomeningocele, spina bifida, spinal muscular atrophy, Krabbe leukodystrophy, Werdnig-Hoffmann disease, Dandy-Walker syndrome | 26
Chromosomal abnormalities | Trisomy 21, trisomy 18, trisomy 13, balanced translocation, unbalanced translocation, Turner, X0/XY | 46
Polymalformations | Twins polymalformation, polymalformation, Ellis-van Creveld syndrome | 9
Cardiac abnormalities | Tetralogy of Fallot, complex cardiac malformations, heart disease | 4
Haematological abnormalities | Fetal hydrops, thalassemia major | 4
Abnormalities of the digestive system | Budd-Chiari syndrome, diaphragmatic hernia, Bochdalek syndrome | 3
Anomalies of the urogenital system | Renal agenesis, renal dysplasia, Potter syndrome | 3
Bone and skeletal system abnormalities | Osteogenesis imperfecta | 2
Other | Cystic fibrosis, non-immune fetal hydrops | 2
More than 150 signals are normally recognized in the TIC chromatogram of a single sample; some of these peaks were not investigated further because they were not consistently found in other samples, because their concentration was too low, or because their spectral quality was too poor for them to be confirmed as metabolites. A total of 116 endogenous metabolites, such as amino acids, organic acids, carbohydrates, fatty acids and steroids, were detected. For peak identification, the linear retention index (LRI) was used with a maximum tolerance of 10 between the tabulated and the measured values, while the minimum match for the NIST library search was set at 85%. The peak areas were normalized and corrected to the Ribitol signal. The results were summarized in a comma-separated values (CSV) matrix file and loaded into appropriate software for statistical processing.
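The two identification thresholds stated above (LRI difference of at most 10 and library match of at least 85%) amount to a simple acceptance rule, sketched here in Python. The function name, argument names and example values are hypothetical; only the two thresholds come from the text.

```python
def accept_identification(lri_measured, lri_tabulated, match_percent,
                          lri_tolerance=10.0, min_match=85.0):
    """Accept a peak assignment only if both thresholds from the text are met."""
    lri_ok = abs(lri_measured - lri_tabulated) <= lri_tolerance
    match_ok = match_percent >= min_match
    return lri_ok and match_ok

# Invented candidate assignments: (measured LRI, tabulated LRI, library match %).
candidates = {
    "candidate A": (1305.0, 1299.0, 92.0),  # within both limits
    "candidate B": (1712.0, 1699.0, 95.0),  # LRI difference of 13 exceeds 10
    "candidate C": (1450.0, 1448.0, 80.0),  # library match below 85%
}
accepted = {name: accept_identification(*values) for name, values in candidates.items()}
print(accepted)
```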
For the metabolic profile, the OPLS-DA model showed satisfactory predictive and modeling capabilities using one predictive component and three orthogonal components (R2Ycum = 0.971, Q2cum = 0.372). The other classification models also showed good classification capacity, although lower than that of OPLS-DA. Several approaches are possible for the definitive assignment of the class of an unknown sample: the response of a single model can be used, or the responses of the individual models can be integrated in a more complex decision algorithm.
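One simple way to integrate the responses of the individual models is a majority vote, sketched below in Python. The text does not specify the decision algorithm, so majority voting, the tie handling, the function name and the example predictions are all assumptions made for illustration.

```python
from collections import Counter

def majority_vote(predictions, tie_label="undetermined"):
    """Combine per-model class labels; return tie_label if the top two counts are equal."""
    ranked = Counter(predictions.values()).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]

# Invented predictions for one unknown sample, keyed by model name.
sample_predictions = {
    "PLS-DA": "malformed fetus",
    "OPLS-DA": "malformed fetus",
    "SVM": "normal fetus",
    "decision tree": "malformed fetus",
}
final_class = majority_vote(sample_predictions)
print(final_class)
```

A weighted vote, for instance giving more weight to the model with the best cross-validated accuracy, would be a natural variation on the same idea.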

Claims

1) A method for the early diagnosis of fetal malformations based on the metabolomics analysis of the maternal blood, comprising a first phase of analysis of blood samples collected from mothers with malformed fetuses and from mothers with healthy fetuses to train the classification models, and a second phase of GCMS analysis of blood samples collected from pregnant women and assignment of a class membership on the basis of the classification model established in the first phase.
2) A method according to claim 1 wherein the first phase comprises:
1. Extraction and derivatization of the metabolites;
2. GCMS or GCxGCMS analysis;
3. Creation of a data matrix;
4. Structuring of classification models.
3) A method according to claim 1 wherein the second phase comprises:
1. Extraction and derivatization of the metabolites;
2. GCMS or GCxGCMS analysis;
3. Creation of a data matrix;
4. Assignment of the class membership.
4) A method according to claim 1 for the diagnosis of fetal malformation in the first trimester of pregnancy.
PCT/EP2015/060051 2014-05-15 2015-05-07 Non-invasive diagnostic method for the early detection of fetal malformations WO2015173107A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP15719743.5A EP3143408A1 (en) 2014-05-15 2015-05-07 Non-invasive diagnostic method for the early detection of fetal malformations
US15/310,197 US20170138930A1 (en) 2014-05-15 2015-05-07 Non-invasive diagnostic method for the early detection of fetal malformations
JP2017512110A JP2017516118A (en) 2014-05-15 2015-05-07 Noninvasive diagnostic method for early detection of fetal malformations
ZA2016/07324A ZA201607324B (en) 2014-05-15 2016-10-24 Non-invasive diagnostic method for the early detection of fetal malformations

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
ITMI2014A000889 2014-05-15
ITMI20140889 2014-05-15

Publications (2)

Publication Number Publication Date
WO2015173107A1 true WO2015173107A1 (en) 2015-11-19
WO2015173107A8 WO2015173107A8 (en) 2016-01-07

Family

ID=51179026

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2015/060051 WO2015173107A1 (en) 2014-05-15 2015-05-07 Non-invasive diagnostic method for the early detection of fetal malformations

Country Status (5)

Country Link
US (1) US20170138930A1 (en)
EP (1) EP3143408A1 (en)
JP (1) JP2017516118A (en)
WO (1) WO2015173107A1 (en)
ZA (1) ZA201607324B (en)

Also Published As

Publication number Publication date
EP3143408A1 (en) 2017-03-22
JP2017516118A (en) 2017-06-15
WO2015173107A8 (en) 2016-01-07
ZA201607324B (en) 2017-09-27
US20170138930A1 (en) 2017-05-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15719743

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
WWE Wipo information: entry into national phase

Ref document number: 15310197

Country of ref document: US

ENP Entry into the national phase

Ref document number: 2017512110

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2015719743

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2015719743

Country of ref document: EP