WO2014144129A2 - Biomarkers and methods for predicting preterm birth - Google Patents

Biomarkers and methods for predicting preterm birth

Info

Publication number
WO2014144129A2
Authority
WO
WIPO (PCT)
Prior art keywords
human
biomarkers
precursor
pregnant female
beta
Prior art date
Application number
PCT/US2014/028412
Other languages
French (fr)
Other versions
WO2014144129A3 (en)
Inventor
Durlin Edward HICKOK
John Jay BONIFACE
Gregory Charles CRITCHFIELD
Tracey Cristine FLEISCHER
Original Assignee
Sera Prognostics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
Priority to BR112015023706A priority Critical patent/BR112015023706A2/en
Priority to CN201480028164.3A priority patent/CN106574919B/en
Priority to AU2014227891A priority patent/AU2014227891A1/en
Priority to RU2015144304A priority patent/RU2015144304A/en
Priority to EP14765203.6A priority patent/EP2972308B9/en
Application filed by Sera Prognostics, Inc. filed Critical Sera Prognostics, Inc.
Priority to EP20195737.0A priority patent/EP3800470A1/en
Priority to JP2016502779A priority patent/JP2016518589A/en
Priority to CA2907120A priority patent/CA2907120C/en
Priority to ES14765203T priority patent/ES2836127T3/en
Publication of WO2014144129A2 publication Critical patent/WO2014144129A2/en
Publication of WO2014144129A3 publication Critical patent/WO2014144129A3/en
Priority to AU2020201701A priority patent/AU2020201701B2/en
Priority to AU2022221445A priority patent/AU2022221445A1/en

Links

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/689 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids related to pregnancy or the gonads
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K4/00 Peptides having up to 20 amino acids in an undefined or only partially defined sequence; Derivatives thereof
    • C07K4/12 Peptides having up to 20 amino acids in an undefined or only partially defined sequence; Derivatives thereof from animals; from humans
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/36 Gynecology or obstetrics
    • G01N2800/368 Pregnancy complicated by disease or abnormalities of pregnancy, e.g. preeclampsia, preterm labour
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/52 Predicting or monitoring the response to treatment, e.g. for selection of therapy based on assay results in personalised medicine; Prognosis
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/60 Complex ways of combining multiple protein biomarkers for diagnosis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • The invention relates generally to the field of personalized medicine and, more specifically, to compositions and methods for determining the probability for preterm birth in a pregnant female.
  • Infants born preterm are at greater risk than infants born at term for mortality and a variety of health and developmental problems. Complications include acute respiratory, gastrointestinal, immunologic, central nervous system, hearing, and vision problems, as well as longer-term motor, cognitive, visual, hearing, behavioral, social-emotional, health, and growth problems.
  • The birth of a preterm infant can also bring considerable emotional and economic costs to families and have implications for public-sector services, such as health insurance, educational, and other social support systems.
  • the greatest risk of mortality and morbidity is for those infants born at the earliest gestational ages. However, those infants born nearer to term represent the greatest number of infants born preterm and also experience more complications than infants born at term.
  • To prevent preterm birth in women who are less than 24 weeks pregnant with an ultrasound showing cervical opening, a surgical procedure known as cervical cerclage can be employed, in which the cervix is stitched closed with strong sutures. For women less than 34 weeks pregnant and in active preterm labor, hospitalization may be necessary, as well as the administration of medications to temporarily halt preterm labor and/or promote fetal lung development.
  • health care providers can implement various clinical strategies that may include preventive medications, for example, hydroxyprogesterone caproate (Makena) injections and/or vaginal progesterone gel, cervical pessaries, restrictions on sexual activity and/or other physical activities, and alterations of treatments for chronic conditions, such as diabetes and high blood pressure, that increase the risk of preterm labor.
  • the present invention addresses this need by providing compositions and methods for determining whether a pregnant woman is at risk for preterm birth. Related advantages are provided as well.
  • the present invention provides compositions and methods for predicting the probability of preterm birth in a pregnant female.
  • the invention provides a panel of isolated biomarkers comprising N of the biomarkers listed in Tables 1 through 63.
  • N is a number selected from the group consisting of 2 to 24.
  • the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, and ITLPDFTGDLR.
  • the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER,
  • the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
  • the invention provides a biomarker panel comprising at least two of the isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
  • the invention provides a biomarker panel comprising at least two of the isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
  • the invention provides a biomarker panel comprising lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), and angiotensinogen (ANG or ANGT).
  • the invention provides a biomarker panel comprising Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
  • the invention provides a biomarker panel comprising at least two of the isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 51 and the biomarkers set forth in Table 53.
  • Also provided by the invention is a method of determining probability for preterm birth in a pregnant female comprising detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from the pregnant female, and analyzing the measurable feature to determine the probability for preterm birth in the pregnant female.
  • the invention provides a method of predicting gestational age at birth (GAB), the method encompassing detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from a pregnant female, and analyzing said measurable feature to predict GAB.
  • a measurable feature comprises fragments or derivatives of each of the N biomarkers selected from the biomarkers listed in Tables 1 through 63.
  • detecting a measurable feature comprises quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63, combinations or portions and/or derivatives thereof in a biological sample obtained from the pregnant female.
  • the disclosed methods of determining probability for preterm birth in a pregnant female further encompass detecting a measurable feature for one or more risk indicia associated with preterm birth.
  • the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of N biomarkers, wherein N is selected from the group consisting of 2 to 24.
  • the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, and
  • the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER,
  • the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
  • the disclosed methods of determining probability for preterm birth in a pregnant female comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
  • the disclosed methods of determining probability for preterm birth in a pregnant female comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
  • the disclosed methods of determining probability for preterm birth in a pregnant female comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), and angiotensinogen (ANG or ANGT).
  • the disclosed methods of determining probability for preterm birth in a pregnant female comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 51 and the biomarkers set forth in Table 53.
  • the probability for preterm birth in the pregnant female is calculated based on the quantified amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63.
  • the disclosed methods for determining the probability of preterm birth encompass detecting and/or quantifying one or more biomarkers using mass spectrometry, a capture agent, or a combination thereof.
  • the disclosed methods of determining probability for preterm birth in a pregnant female encompass an initial step of providing a biomarker panel comprising N of the biomarkers listed in Tables 1 through 63. In additional embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass an initial step of providing a biological sample from the pregnant female.
  • the disclosed methods of determining probability for preterm birth in a pregnant female encompass communicating the probability to a health care provider. In additional embodiments, the communication informs a subsequent treatment decision for the pregnant female. In further embodiments, the treatment decision comprises one or more selected from the group consisting of more frequent prenatal care visits, serial cervical length measurements, enhanced education regarding signs and symptoms of early preterm labor, lifestyle interventions for modifiable risk behaviors, and progesterone treatment.
  • In further embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass analyzing the measurable feature of one or more isolated biomarkers using a predictive model. In some embodiments of the disclosed methods, a measurable feature of one or more isolated biomarkers is compared with a reference feature.
  • the disclosed methods of determining probability for preterm birth in a pregnant female encompass using one or more analyses selected from a linear discriminant analysis model, a support vector machine classification algorithm, a recursive feature elimination model, a prediction analysis of microarray model, a logistic regression model, a CART algorithm, a flex tree algorithm, a LART algorithm, a random forest algorithm, a MART algorithm, a machine learning algorithm, a penalized regression method, and a combination thereof.
  • the disclosed methods of determining probability for preterm birth in a pregnant female encompass logistic regression.
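As a rough illustration of how a logistic regression model could turn biomarker measurements into a probability, the following minimal sketch applies the standard logistic function to a weighted sum of features. The intercept, the coefficient values, the choice of three analytes, and the function name `preterm_probability` are hypothetical and are not taken from the disclosure; a real model would be fitted to clinical data.

```python
import math

# Hypothetical intercept and coefficients, for illustration only;
# these are NOT values from the disclosure.
INTERCEPT = -1.0
COEFFICIENTS = {"LBP": 0.8, "THRB": -0.5, "PLMN": 0.3}


def preterm_probability(features, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Return the logistic-model probability for one sample.

    `features` maps each biomarker name to its (normalized) measured amount.
    """
    linear = intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )
    # Standard logistic (sigmoid) function maps the linear score to (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical normalized measurements for a single serum sample.
sample = {"LBP": 1.2, "THRB": 0.4, "PLMN": 0.9}
print(round(preterm_probability(sample), 4))
```

The same structure extends to any panel size N by adding entries to the coefficient mapping.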
  • the invention provides a method of determining probability for preterm birth in a pregnant female, the method encompassing quantifying in a biological sample obtained from the pregnant female an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63; multiplying the amount by a predetermined coefficient, and determining the probability for preterm birth in the pregnant female comprising adding the individual products to obtain a total risk score that corresponds to the probability.
  • the invention provides a method of predicting GAB, the method comprising: (a) quantifying in a biological sample obtained from said pregnant female an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63; (b) multiplying or thresholding said amount by a predetermined coefficient, (c) determining the predicted GAB in said pregnant female comprising adding said individual products to obtain a total risk score that corresponds to said predicted GAB.
  • the invention provides a method of predicting time to birth in a pregnant female, the method comprising: (a) obtaining a biological sample from said pregnant female; (b) quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in said biological sample; (c) multiplying or thresholding said amount by a predetermined coefficient, (d) determining predicted GAB in said pregnant female comprising adding said individual products to obtain a total risk score that corresponds to said predicted GAB; and (e) subtracting the estimated gestational age (GA) at the time the biological sample was obtained from the predicted GAB to predict time to birth in said pregnant female.
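The arithmetic in the three methods above (multiply each amount by a coefficient, add the products, then subtract the gestational age at sampling) can be sketched as follows. This is an illustration under stated assumptions: the coefficients, the additive intercept used to map the total score onto weeks, and all function names are hypothetical, and only the "multiplying" branch is shown, not thresholding.

```python
# Hypothetical coefficients and intercept, chosen only to make the
# arithmetic easy to follow; they are NOT values from the disclosure.
GAB_INTERCEPT_WEEKS = 39.0
GAB_COEFFICIENTS = {"LBP": -1.5, "PLMN": 0.5}


def risk_score(amounts, coefficients):
    """Multiply each amount by its coefficient and add the products."""
    return sum(coefficients[name] * amounts[name] for name in coefficients)


def predicted_gab_weeks(amounts):
    """Map the total score to a predicted GAB in weeks.

    Adding an intercept is an assumption made for this sketch.
    """
    return GAB_INTERCEPT_WEEKS + risk_score(amounts, GAB_COEFFICIENTS)


def time_to_birth_weeks(amounts, ga_at_sampling_weeks):
    """Predicted GAB minus the estimated gestational age at sampling."""
    return predicted_gab_weeks(amounts) - ga_at_sampling_weeks


# Hypothetical normalized amounts for one sample drawn at 20 weeks.
amounts = {"LBP": 1.0, "PLMN": 2.0}
print(predicted_gab_weeks(amounts))
print(time_to_birth_weeks(amounts, 20.0))
```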
  • Figure 1 Scatterplot of actual gestational age at birth versus predicted gestational age from random forest regression model.
  • Figure Distribution of predicted gestational age from random forest regression model versus actual gestational age at birth (GAB), where actual GAB is given in categories of (i) less than 37 weeks, (ii) 37 to 39 weeks, and (iii) 40 weeks or greater (peaks left to right, respectively).
  • the present disclosure is based, in part, on the discovery that certain proteins and peptides in biological samples obtained from a pregnant female are differentially expressed in pregnant females that have an increased risk of preterm birth relative to controls.
  • the present disclosure is further based, in part, on the unexpected discovery that panels combining one or more of these proteins and peptides can be utilized in methods of determining the probability for preterm birth in a pregnant female with high sensitivity and specificity.
  • proteins and peptides disclosed herein serve as biomarkers for classifying test samples, predicting probability of preterm birth, predicting probability of term birth, predicting gestational age at birth (GAB), predicting time to birth and/or monitoring of progress of preventative therapy in a pregnant female, either individually or in a panel of biomarkers.
  • the disclosure provides biomarker panels, methods and kits for determining the probability for preterm birth in a pregnant female.
  • One major advantage of the present disclosure is that risk of developing preterm birth can be assessed early during pregnancy so that appropriate monitoring and clinical management to prevent preterm delivery can be initiated in a timely fashion.
  • the present invention is of particular benefit to females lacking any risk factors for preterm birth and who would not otherwise be identified and treated.
  • the present disclosure includes methods for generating a result useful in determining probability for preterm birth in a pregnant female by obtaining a dataset associated with a sample, where the dataset at least includes quantitative data about biomarkers and panels of biomarkers that have been identified as predictive of preterm birth, and inputting the dataset into an analytic process that uses the dataset to generate a result useful in determining probability for preterm birth in a pregnant female.
  • this quantitative data can include amino acids, peptides, polypeptides, proteins, nucleotides, nucleic acids, nucleosides, sugars, fatty acids, steroids, metabolites, carbohydrates, lipids, hormones, antibodies, regions of interest that serve as surrogates for biological macromolecules and combinations thereof.
  • biomarker variants that are at least 90% or at least 95% or at least 97% identical to the exemplified sequences and that are now known or later discovered and that have utility for the methods of the invention. These variants may represent
  • Suitable samples in the context of the present invention include, for example, blood, plasma, serum, amniotic fluid, vaginal secretions, saliva, and urine.
  • the biological sample is selected from the group consisting of whole blood, plasma, and serum.
  • the biological sample is serum.
  • biomarkers can be detected through a variety of assays and techniques known in the art. As further described herein, such assays include, without limitation, mass spectrometry (MS)-based assays, antibody-based assays as well as assays that combine aspects of the two.
  • Protein biomarkers associated with the probability for preterm birth in a pregnant female include, but are not limited to, one or more of the isolated biomarkers listed in Tables 1 through 63.
  • the disclosure further includes biomarker variants that are about 90%, about 95%, or about 97% identical to the exemplified sequences.
  • Variants, as used herein, include polymorphisms, splice variants, mutations, and the like.
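To make the identity thresholds concrete, the sketch below computes a position-by-position percent identity between two peptide sequences and checks it against a cutoff. It assumes equal-length, already-aligned sequences, which is a simplification; percent identity is normally computed from a sequence alignment, and the function names and the example substitution are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues.

    Simplifying assumption: both sequences are non-empty, of equal length,
    and already aligned position by position.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def is_variant(candidate, reference, threshold=90.0):
    """True if the candidate meets the percent-identity threshold."""
    return percent_identity(candidate, reference) >= threshold


reference = "DVLLLVHNLPQNLPGYFWYK"  # 20-residue peptide named in the disclosure
candidate = "DVLLLVHNLPQNLPGYFWYR"  # hypothetical single K-to-R substitution
print(percent_identity(candidate, reference))
print(is_variant(candidate, reference, threshold=97.0))
```

With a 20-residue peptide, each substitution costs five percentage points, so a single change passes the 90% and 95% cutoffs but not 97%.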
  • Additional markers can be selected from one or more risk indicia, including but not limited to, maternal characteristics, medical history, past pregnancy history, and obstetrical history.
  • Such additional markers can include, for example, previous low birth weight or preterm delivery, multiple 2nd trimester spontaneous abortions, prior first trimester induced abortion, familial and intergenerational factors, history of infertility, nulliparity, placental abnormalities, cervical and uterine anomalies, short cervical length measurements, gestational bleeding, intrauterine growth restriction, in utero diethylstilbestrol exposure, multiple gestations, infant sex, short stature, low prepregnancy weight, low or high body mass index, diabetes, hypertension, urogenital infections (e.g., urinary tract infection), asthma, and anxiety and depression.
  • Demographic risk indicia for preterm birth can include, for example, maternal age, race/ethnicity, single marital status, low socioeconomic status, employment-related physical activity, occupational exposures, environmental exposures, and stress. Further risk indicia can include inadequate prenatal care, cigarette smoking, use of marijuana and other illicit drugs, cocaine use, alcohol consumption, caffeine intake, maternal weight gain, dietary intake, sexual activity during late pregnancy, and leisure-time physical activities.
  • (Preterm Birth: Causes, Consequences, and Prevention, Institute of Medicine (US) Committee on Understanding Premature Birth and Assuring Healthy Outcomes; Behrman RE, Butler AS, editors. Washington (DC): National Academies Press (US); 2007).
  • Additional risk indicia useful as markers can be identified using learning algorithms known in the art, such as linear discriminant analysis, support vector machine classification, recursive feature elimination, prediction analysis of microarray, logistic regression, CART, FlexTree, LART, random forest, MART, and/or survival analysis regression, which are known to those of skill in the art and are further described herein.
  • the disclosed panels of biomarkers comprising N of the biomarkers selected from the group listed in Tables 1 through 63.
  • N can be a number selected from the group consisting of 2 to 24.
  • the number of biomarkers that are detected and whose levels are determined can be 1, or more than 1, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more.
  • the number of biomarkers that are detected, and whose levels are determined, can be 1, or more than 1, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, or more.
  • the methods of this disclosure are useful for determining the probability for preterm birth in a pregnant female.
  • While the biomarkers listed in Tables 1 through 63 are useful alone for determining the probability for preterm birth in a pregnant female, methods are also described herein for the grouping of multiple subsets of the biomarkers that are each useful as a panel of three or more biomarkers.
  • the invention provides panels comprising N biomarkers, wherein N is at least three biomarkers. In other embodiments, N is selected to be any number from 3-23 biomarkers.
  • N is selected to be any number from 2-5, 2-10, 2-15, 2-20, or 2-23. In other embodiments, N is selected to be any number from 3-5, 3-10, 3-15, 3-20, or 3-23. In other embodiments, N is selected to be any number from 4-5, 4-10, 4-15, 4-20, or 4-23. In other embodiments, N is selected to be any number from 5-10, 5-15, 5-20, or 5-23. In other embodiments, N is selected to be any number from 6-10, 6-15, 6-20, or 6-23. In other embodiments, N is selected to be any number from 7-10, 7-15, 7-20, or 7-23.
  • N is selected to be any number from 8-10, 8-15, 8-20, or 8-23. In other embodiments, N is selected to be any number from 9-10, 9-15, 9-20, or 9-23. In other embodiments, N is selected to be any number from 10-15, 10-20, or 10-23. It will be appreciated that N can be selected to encompass similar, but higher order, ranges.
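The panel-size ranges above can be read as choosing every subset of size N from a candidate list. The following sketch enumerates such panels for a small range of N using five peptide sequences named elsewhere in the disclosure; the helper name `candidate_panels` and the choice of a 2-3 range are assumptions made for illustration.

```python
from itertools import combinations
from math import comb

# Five peptide biomarkers named in the disclosure, used as a toy candidate list.
PEPTIDES = [
    "AFTECCVVASQLR",
    "ELLESYIDGR",
    "ITLPDFTGDLR",
    "TDAPDLPEENQAR",
    "SFRPFVPR",
]


def candidate_panels(biomarkers, n_min, n_max):
    """Yield every panel (as a tuple) whose size N satisfies n_min <= N <= n_max."""
    upper = min(n_max, len(biomarkers))
    for n in range(n_min, upper + 1):
        yield from combinations(biomarkers, n)


panels = list(candidate_panels(PEPTIDES, 2, 3))
# The count equals C(5, 2) + C(5, 3).
print(len(panels), comb(5, 2) + comb(5, 3))
```

The number of possible panels grows combinatorially with the size of the candidate list, which is one reason feature-selection methods are typically used rather than exhaustive enumeration.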
  • the panel of isolated biomarkers comprises one or more, two or more, three or more, four or more, or five isolated biomarkers comprising an amino acid sequence selected from AFTECCVVASQLR, ELLESYIDGR,
  • the panel of isolated biomarkers comprises one or more, two or more, three or more, four or more, or five isolated biomarkers comprising an amino acid sequence selected from FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
  • the panel of isolated biomarkers comprises one or more, two or more, or three of the isolated biomarkers consisting of an amino acid sequence selected from AFTECCVVASQLR, ELLESYIDGR, and ITLPDFTGDLR.
  • the panel of isolated biomarkers comprises one or more, two or more, or three of the isolated biomarkers consisting of an amino acid sequence selected from FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
  • the panel of isolated biomarkers comprises one or more, two or more, or three of the isolated biomarkers consisting of an amino acid sequence selected from the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
  • the panel of isolated biomarkers comprises one or more peptides comprising a fragment from lipopolysaccharide-binding protein (LBP), Schumann et al. Science 249 (4975), 1429-1431 (1990) (UniProtKB/Swiss-Prot: P18428.3);
  • complement component C5 C5 or C05
  • Haviland J. Immunol. 146 (1), 362-368 (1991)
  • GenBank: AAA51925.1 plasminogen
  • PLMN plasminogen
  • PLMN Petersen et al., J. Biol. Chem. 265 (11), 6104-6111 (1990)
  • C8G or C08G complement component C8 gamma chain
  • Haefliger et al., Mol. Immunol. 28 (1-2), 123-131 (1991)
  • NCBI Reference Sequence: NP_000597.2 complement component C8 gamma chain
  • the panel of isolated biomarkers comprises one or more peptides comprising a fragment from cell adhesion molecule with homology to
  • NCBI Reference Sequence: NP_000482.3 complement component 1, q subcomponent, B chain (C1QB), Reid, Biochem. J. 179 (2), 367-371 (1979)
  • NCBI Reference Sequence: NP_000482.3 fibrinogen beta chain (FIBB or FIB); Watt et al., Biochemistry 18 (1), 68-76 (1979)
  • the invention provides a panel of isolated biomarkers comprising N of the biomarkers listed in Tables 1 through 63.
  • N is a number selected from the group consisting of 2 to 24.
  • the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, and
  • the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, ITLPDFTGDLR, TDAPDLPEENQAR and SFRPFVPR. In additional embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER,
  • the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
  • the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
  • the invention provides a biomarker panel comprising at least three isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
  • the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
  • the invention provides a biomarker panel comprising lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), and angiotensinogen (ANG or ANGT).
  • the invention provides a biomarker panel comprising Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
  • the invention provides a biomarker panel comprising at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), and angiotensinogen (ANG or ANGT).
  • the invention provides a biomarker panel comprising at least two isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
  • the terms “comprises,” “comprising,” “includes,” “including,” “contains,” “containing,” and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, product-by-process, or composition of matter that comprises, includes, or contains an element or list of elements does not include only those elements but can include other elements not expressly listed or inherent to such process, method, product-by-process, or composition of matter.
  • the term “panel” refers to a composition, such as an array or a collection, comprising one or more biomarkers.
  • the term can also refer to a profile or index of expression patterns of one or more biomarkers described herein.
  • the number of biomarkers useful for a biomarker panel is based on the sensitivity and specificity value for the particular combination of biomarker values.
  • isolated and purified generally describes a composition of matter that has been removed from its native environment (e.g., the natural environment if it is naturally occurring), and thus is altered by the hand of man from its natural state.
  • An isolated protein or nucleic acid is distinct from the way it exists in nature.
  • biomarker refers to a biological molecule, or a fragment of a biological molecule, the change and/or the detection of which can be correlated with a particular physical condition or state.
  • the terms “marker” and “biomarker” are used interchangeably throughout the disclosure.
  • the biomarkers of the present invention are correlated with an increased likelihood of preterm birth.
  • biomarkers include, but are not limited to, biological molecules comprising nucleotides, nucleic acids, nucleosides, amino acids, sugars, fatty acids, steroids, metabolites, peptides, polypeptides, proteins, carbohydrates, lipids, hormones, antibodies, regions of interest that serve as surrogates for biological macromolecules and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins).
  • peptide fragment of a protein or polypeptide that comprises at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 consecutive amino acid residues, or more consecutive amino acid residues.
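The fragment definition above can be illustrated with a small check. This is a minimal sketch, assuming sequences are given as one-letter amino acid strings; the function name and the parent sequence are hypothetical and are not taken from the disclosure.

```python
def is_peptide_fragment(peptide: str, protein: str, min_length: int = 5) -> bool:
    """Return True if `peptide` is a run of at least `min_length`
    consecutive residues found in `protein` (one-letter amino acid codes)."""
    peptide = peptide.strip().upper()
    protein = protein.strip().upper()
    return len(peptide) >= min_length and peptide in protein


# Hypothetical parent sequence, invented for illustration only.
parent = "MKTAELLESYIDGRQWPLV"

print(is_peptide_fragment("ELLESYIDGR", parent))  # 10 consecutive residues
print(is_peptide_fragment("ELLE", parent))        # only 4 residues, below the minimum
```

The default minimum of 5 mirrors the smallest fragment length named in the text; a stricter panel could pass a larger `min_length`.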
  • the invention also provides a method of determining probability for preterm birth in a pregnant female, the method comprising detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from the pregnant female, and analyzing the measurable feature to determine the probability for preterm birth in the pregnant female.
  • a measurable feature comprises fragments or derivatives of each of said N biomarkers selected from the biomarkers listed in Tables 1 through 63.
  • detecting a measurable feature comprises quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63, combinations or portions and/or derivatives thereof in a biological sample obtained from said pregnant female.
  • the invention further provides a method of predicting GAB, the method encompassing detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from a pregnant female, and analyzing the measurable feature to predict GAB.
  • the invention also provides a method of predicting GAB, the method comprising: (a) quantifying in a biological sample obtained from the pregnant female an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63; (b) multiplying or thresholding the amount by a predetermined coefficient; and (c) determining the predicted GAB in the pregnant female comprising adding the individual products to obtain a total risk score that corresponds to the predicted GAB.
  • the invention further provides a method of predicting time to birth in a pregnant female, the method comprising: (a) obtaining a biological sample from the pregnant female; (b) quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in the biological sample; (c) multiplying or thresholding the amount by a predetermined coefficient; (d) determining predicted GAB in the pregnant female comprising adding the individual products to obtain a total risk score that corresponds to the predicted GAB; and (e) subtracting the estimated gestational age (GA) at the time the biological sample was obtained from the predicted GAB to predict time to birth in said pregnant female.
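The arithmetic in steps (c) through (e) can be sketched as follows. This is an illustration under stated assumptions, not the disclosed predictor: the coefficients and amounts are hypothetical, thresholding is omitted, and the total score is assumed to be read directly as predicted GAB in weeks.

```python
from typing import Mapping


def predicted_gab(amounts: Mapping[str, float],
                  coefficients: Mapping[str, float]) -> float:
    """Sum of (amount x predetermined coefficient) over the panel.

    Assumption: the total score is read directly as predicted GAB in weeks;
    a real model may apply thresholds, an intercept, or another mapping.
    """
    missing = set(coefficients) - set(amounts)
    if missing:
        raise ValueError(f"no measurement for: {sorted(missing)}")
    return sum(coefficients[name] * amounts[name] for name in coefficients)


def time_to_birth(predicted_gab_weeks: float, estimated_ga_weeks: float) -> float:
    """Predicted GAB minus the estimated GA at sample collection."""
    return predicted_gab_weeks - estimated_ga_weeks


# Hypothetical coefficients and amounts, chosen only to make the arithmetic visible.
coefficients = {"ELLESYIDGR": 10.0, "ITLPDFTGDLR": 4.5}
amounts = {"ELLESYIDGR": 2.0, "ITLPDFTGDLR": 4.0}

gab = predicted_gab(amounts, coefficients)   # 10.0*2.0 + 4.5*4.0 = 38.0
print(gab, time_to_birth(gab, 24.0))         # 38.0 14.0
```

With a sample drawn at an estimated GA of 24 weeks, a predicted GAB of 38 weeks gives a predicted time to birth of 14 weeks.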
  • a biological sample from the pregnant female comprising: (a) obtaining a biological sample from the pregnant female; (b) quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in the biological sample; (c) multiplying or thresholding the amount by a predetermined coefficient, (d) determining predicted GAB in the pregnant female comprising adding
  • birth means birth following spontaneous onset of labor, with or without rupture of membranes.
  • present disclosure is similarly applicable to the methods of predicting GAB, the methods for predicting term birth, methods for determining the probability of term birth in a pregnant female as well as methods of predicting time to birth in a pregnant female. It will be apparent to one skilled in the art that each of the aforementioned methods has specific and substantial utilities and benefits with regard to maternal-fetal health considerations.
  • the method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of N biomarkers, wherein N is selected from the group consisting of 2 to 24.
  • the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, and ITLPDFTGDLR.
  • the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
  • the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
  • the method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
  • the method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
  • the disclosed method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), and angiotensinogen (ANG or ANGT).
  • the disclosed method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
  • the disclosed method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 51 and the biomarkers set forth in Table 53.
  • the methods of determining probability for preterm birth in a pregnant female further encompass detecting a measurable feature for one or more risk indicia associated with preterm birth.
  • the risk indicia are selected from the group consisting of previous low birth weight or preterm delivery, multiple 2nd trimester spontaneous abortions, prior first trimester induced abortion, familial and intergenerational factors, history of infertility, nulliparity, placental abnormalities, cervical and uterine anomalies, gestational bleeding, intrauterine growth restriction, in utero diethylstilbestrol exposure, multiple gestations, infant sex, short stature, low prepregnancy weight, low or high body mass index, diabetes, hypertension, and urogenital infections.
  • a “measurable feature” is any property, characteristic or aspect that can be determined and correlated with the probability for preterm birth in a subject.
  • the term further encompasses any property, characteristic or aspect that can be determined and correlated in connection with a prediction of GAB, a prediction of term birth, or a prediction of time to birth in a pregnant female.
  • such a measurable feature can include, for example, the presence, absence, or concentration of the biomarker, or a fragment thereof, in the biological sample, an altered structure, such as, for example, the presence or amount of a post-translational modification, such as oxidation at one or more positions on the amino acid sequence of the biomarker or, for example, the presence of an altered conformation in comparison to the conformation of the biomarker in normal control subjects, and/or the presence, amount, or altered structure of the biomarker as a part of a profile of more than one biomarker.
  • measurable features can further include risk indicia including, for example, maternal characteristics, age, race, ethnicity, medical history, past pregnancy history, obstetrical history.
  • a measurable feature can include, for example, previous low birth weight or preterm delivery, multiple 2nd trimester spontaneous abortions, prior first trimester induced abortion, familial and intergenerational factors, history of infertility, nulliparity, placental abnormalities, cervical and uterine anomalies, short cervical length measurements, gestational bleeding, intrauterine growth restriction, in utero diethylstilbestrol exposure, multiple gestations, infant sex, short stature, low prepregnancy weight/low body mass index, diabetes, hypertension, urogenital infections, hypothyroidism, asthma, low
  • the probability for preterm birth in the pregnant female is calculated based on the quantified amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63.
  • the disclosed methods for determining the probability of preterm birth encompass detecting and/or quantifying one or more biomarkers using mass spectrometry, a capture agent or a combination thereof.
  • the disclosed methods of determining probability for preterm birth in a pregnant female encompass an initial step of providing a biomarker panel comprising N of the biomarkers listed in Tables 1 through 63. In additional embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass an initial step of providing a biological sample from the pregnant female.
  • the disclosed methods of determining probability for preterm birth in a pregnant female encompass communicating the probability to a health care provider.
  • the disclosed methods of predicting GAB, the methods for predicting term birth, methods for determining the probability of term birth in a pregnant female as well as methods of predicting time to birth in a pregnant female similarly encompass communicating the probability to a health care provider.
  • all embodiments described throughout this disclosure are similarly applicable to the methods of predicting GAB, the methods for predicting term birth, methods for determining the probability of term birth in a pregnant female as well as methods of predicting time to birth in a pregnant female.
  • the communication informs a subsequent treatment decision for the pregnant female.
  • the method of determining probability for preterm birth in a pregnant female encompasses the additional feature of expressing the probability as a risk score.
  • the term “risk score” refers to a score that can be assigned based on comparing the amount of one or more biomarkers in a biological sample obtained from a pregnant female to a standard or reference score that represents an average amount of the one or more biomarkers calculated from biological samples obtained from a random pool of pregnant females. Because the level of a biomarker may not be static throughout pregnancy, a standard or reference score has to have been obtained for the gestational time point that corresponds to that of the pregnant female at the time the sample was taken. The standard or reference score can be predetermined and built into a predictor model such that the comparison is indirect rather than actually performed every time the probability is determined for a subject.
  • a risk score can be a standard (e.g., a number) or a threshold (e.g., a line on a graph).
  • the value of the risk score correlates to the deviation, upwards or downwards, from the average amount of the one or more biomarkers calculated from biological samples obtained from a random pool of pregnant females.
  • if a risk score is greater than a standard or reference risk score, the pregnant female can have an increased likelihood of preterm birth.
  • the magnitude of a pregnant female's risk score, or the amount by which it exceeds a reference risk score, can be indicative of or correlated to that pregnant female's level of risk.
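The comparison against a gestational-age-matched reference can be sketched as below. This is an illustration only: the reference averages, the use of a simple signed difference as the score, and both function names are assumptions, since the text does not fix a formula.

```python
# Hypothetical reference averages for one biomarker, keyed by gestational week.
REFERENCE_MEAN = {20: 8.0, 21: 8.5, 22: 9.0}


def deviation_from_reference(amount: float, gestational_week: int) -> float:
    """Signed deviation of a measured amount from the reference average
    for the same gestational week (positive = above average)."""
    if gestational_week not in REFERENCE_MEAN:
        raise KeyError(f"no reference for week {gestational_week}")
    return amount - REFERENCE_MEAN[gestational_week]


def exceeds_reference(score: float, reference_score: float) -> bool:
    """True when the score is greater than the reference score."""
    return score > reference_score


score = deviation_from_reference(10.5, 21)   # 10.5 - 8.5 = 2.0
print(score, exceeds_reference(score, 0.0))
```

Keying the reference by gestational week reflects the point that a biomarker level may not be static throughout pregnancy, so the comparison has to use the matching time point.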
  • the term “biological sample” encompasses any sample that is taken from a pregnant female and contains one or more of the biomarkers listed in Tables 1 through 63.
  • Suitable samples in the context of the present invention include, for example, blood, plasma, serum, amniotic fluid, vaginal secretions, saliva, and urine.
  • the biological sample is selected from the group consisting of whole blood, plasma, and serum.
  • the biological sample is serum.
  • a biological sample can include any fraction or component of blood, without limitation, T cells, monocytes, neutrophils, erythrocytes, platelets and microvesicles such as exosomes and exosome-like vesicles.
  • Preterm birth refers to delivery or birth at a gestational age less than 37 completed weeks.
  • Other commonly used subcategories of preterm birth have been established and delineate moderately preterm (birth at 33 to 36 weeks of gestation), very preterm (birth at <33 weeks of gestation), and extremely preterm (birth at <28 weeks of gestation).
  • cut-offs that delineate preterm birth and term birth as well as the cut-offs that delineate subcategories of preterm birth can be adjusted in practicing the methods disclosed herein, for example, to maximize a particular health benefit.
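The preterm/term boundary and its adjustable cut-off can be sketched as a small classifier. This is a minimal illustration, assuming gestational age is supplied in completed weeks; the function and parameter names are hypothetical.

```python
def classify_birth(ga_completed_weeks: int, term_cutoff_weeks: int = 37) -> str:
    """Label a birth 'preterm' or 'term' from gestational age in completed weeks.

    The default cut-off of 37 follows the definition in the text; it is a
    parameter because the text notes that cut-offs can be adjusted.
    """
    if ga_completed_weeks < 0:
        raise ValueError("gestational age cannot be negative")
    return "preterm" if ga_completed_weeks < term_cutoff_weeks else "term"


print(classify_birth(36))                        # preterm
print(classify_birth(37))                        # term
print(classify_birth(36, term_cutoff_weeks=35))  # term under an adjusted cut-off
```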
  • Gestational age is a proxy for the extent of fetal development and the fetus's readiness for birth. Gestational age has typically been defined as the length of time from the date of the last normal menses to the date of birth. However, obstetric measures and ultrasound estimates also can aid in estimating gestational age. Preterm births have generally been classified into two separate subgroups. One, spontaneous preterm births are those occurring subsequent to spontaneous onset of preterm labor or preterm premature rupture of membranes regardless of subsequent labor augmentation or cesarean delivery.
  • Two, indicated preterm births are those occurring following induction or cesarean section for one or more conditions that the woman's caregiver determines to threaten the health or life of the mother and/or fetus.
  • the methods disclosed herein are directed to determining the probability for spontaneous preterm birth. In additional embodiments, the methods disclosed herein are directed to predicting gestational birth.
  • the term “estimated gestational age” or “estimated GA” refers to the GA determined based on the date of the last normal menses and additional obstetric measures, ultrasound estimates or other clinical parameters including, without limitation, those described in the preceding paragraph.
  • predicted gestational age at birth or “predicted GAB” refers to the GAB determined based on the methods of the invention as disclosed herein.
  • term birth refers to birth at a gestational age equal to or more than 37 completed weeks.
  • the pregnant female is between 17 and 28 weeks of gestation at the time the biological sample is collected. In other embodiments, the pregnant female is between 16 and 29 weeks, between 17 and 28 weeks, between 18 and 27 weeks, between 19 and 26 weeks, between 20 and 25 weeks, between 21 and 24 weeks, or between 22 and 23 weeks of gestation at the time the biological sample is collected. In further embodiments, the pregnant female is between about 17 and 22 weeks, between about 16 and 22 weeks, between about 22 and 25 weeks, between about 13 and 25 weeks, between about 26 and 28 weeks, or between about 26 and 29 weeks of gestation at the time the biological sample is collected. Accordingly, the gestational age of a pregnant female at the time the biological sample is collected can be 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 weeks.
  • the measurable feature comprises fragments or derivatives of each of the N biomarkers selected from the biomarkers listed in Tables 1 through 63.
  • detecting a measurable feature comprises quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63, combinations or portions and/or derivatives thereof in a biological sample obtained from said pregnant female.
  • the term “amount” or “level” as used herein refers to a quantity of a biomarker that is detectable or measurable in a biological sample and/or control.
  • the quantity of a biomarker can be, for example, a quantity of polypeptide, the quantity of nucleic acid, or the quantity of a fragment or surrogate. The term can alternatively include combinations thereof.
  • the term “amount” or “level” of a biomarker is a measurable feature of that biomarker.
  • calculating the probability for preterm birth in a pregnant female is based on the quantified amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63. Any existing, available or conventional separation, detection and quantification methods can be used herein to measure the presence or absence (e.g., readout being present vs. absent; or detectable amount vs.
  • detection and/or quantification of one or more biomarkers comprises an assay that utilizes a capture agent.
  • the capture agent is an antibody, antibody fragment, nucleic acid-based protein binding reagent, small molecule or variant thereof.
  • the assay is an enzyme immunoassay (EIA), enzyme-linked immunosorbent assay (ELISA), or radioimmunoassay (RIA).
  • detection and/or quantification of one or more biomarkers further comprises mass spectrometry (MS).
  • the mass spectrometry is co-immunoprecipitation-mass spectrometry (co-IP MS), where co-immunoprecipitation, a technique suitable for the isolation of whole protein complexes, is followed by mass spectrometric analysis.
  • mass spectrometer refers to a device able to volatilize/ionize analytes to form gas-phase ions and determine their absolute or relative molecular masses. Suitable methods of volatilization/ionization are matrix-assisted laser desorption
  • spectrometry include, but are not limited to, ion trap instruments, quadrupole instruments, electrostatic and magnetic sector instruments, time of flight instruments, time of flight tandem mass spectrometer (TOF MS/MS), Fourier-transform mass spectrometers,
  • Orbitraps and hybrid instruments composed of various combinations of these types of mass analyzers. These instruments can, in turn, be interfaced with a variety of other instruments that fractionate the samples (for example, liquid chromatography or solid-phase adsorption techniques based on chemical, or biological properties) and that ionize the samples for introduction into the mass spectrometer, including matrix-assisted laser desorption
  • MALDI electrospray
  • ESI nanospray ionization
  • any mass spectrometric (MS) technique that can provide precise information on the mass of peptides, and preferably also on fragmentation and/or (partial) amino acid sequence of selected peptides (e.g., in tandem mass spectrometry, MS/MS; or in post source decay, TOF MS), can be used in the methods disclosed herein.
  • the disclosed methods comprise performing quantitative MS to measure one or more biomarkers.
  • quantitative methods can be performed in an automated (Villanueva et al., Nature Protocols (2006) 1(2):880-891) or semi-automated format.
  • MS can be operably linked to a liquid chromatography device (LC-MS/MS or LC-MS) or gas chromatography device (GC-MS or GC-MS/MS).
  • ICAT isotope-coded affinity tag
  • TMT tandem mass tags
  • SILAC stable isotope labeling by amino acids in cell culture
  • MRM multiple reaction monitoring
  • SRM selected reaction monitoring
  • analyte (e.g., peptide or small molecule such as chemical entity, steroid, hormone)
  • a large number of analytes can be quantified during a single LC-MS experiment.
  • standards that correspond to the analytes of interest (e.g., same amino acid sequence), but differ by the inclusion of stable isotopes.
  • Stable isotopic standards (SIS) can be incorporated into the assay at precise levels and used to quantify the corresponding unknown analyte.
  • An additional level of specificity is contributed by the co-elution of the unknown analyte and its corresponding SIS and properties of their transitions (e.g., the similarity in the ratio of the level of two transitions of the unknown and the ratio of the two transitions of its corresponding SIS).
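Quantification against a stable isotopic standard, together with the transition-ratio check described above, can be sketched as follows. This is a minimal illustration, assuming the analyte amount scales with the analyte-to-standard peak-area ratio; the function names, the peak areas, and the 20% tolerance are hypothetical and not values from the disclosure.

```python
from typing import Tuple


def quantify_with_sis(analyte_area: float, sis_area: float,
                      sis_amount: float) -> float:
    """Estimate the analyte amount from the analyte/standard peak-area ratio
    and the known amount of spiked stable isotopic standard."""
    if sis_area <= 0:
        raise ValueError("standard peak area must be positive")
    return analyte_area / sis_area * sis_amount


def transition_ratios_agree(analyte: Tuple[float, float],
                            sis: Tuple[float, float],
                            rel_tol: float = 0.2) -> bool:
    """Compare the ratio of two transitions for the analyte with the same
    ratio for its standard; rel_tol is an assumed acceptance window."""
    analyte_ratio = analyte[0] / analyte[1]
    sis_ratio = sis[0] / sis[1]
    return abs(analyte_ratio - sis_ratio) <= rel_tol * sis_ratio


# Hypothetical peak areas for two transitions of one peptide and its standard.
print(quantify_with_sis(5000.0, 10000.0, 8.0))                       # 4.0
print(transition_ratios_agree((5000.0, 2500.0), (10000.0, 5000.0)))  # True
```

A mismatch in transition ratios between the analyte and its standard would suggest interference in one of the transitions, which is the specificity check the text describes.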
  • Mass spectrometry assays, instruments and systems suitable for biomarker peptide analysis can include, without limitation, matrix-assisted laser desorption/ionisation time-of-flight (MALDI-TOF) MS; MALDI-TOF post-source-decay (PSD); MALDI-TOF/TOF; surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF) MS; electrospray ionization mass spectrometry (ESI-MS); ESI-MS/MS; ESI-MS/(MS)n (n is an integer greater than zero); ESI 3D or linear (2D) ion trap MS; ESI triple quadrupole MS; ESI quadrupole orthogonal TOF (Q-TOF); ESI Fourier transform MS systems; desorption/ionization on silicon (DIOS); secondary ion mass spectrometry (SIMS); atmospheric pressure chemical ionization mass
  • Peptide ion fragmentation in tandem MS (MS/MS) arrangements can be achieved using manners established in the art, such as, e.g., collision induced dissociation (CID).
  • CID collision induced dissociation
  • detection and quantification of biomarkers by mass spectrometry can involve multiple reaction monitoring (MRM), such as described among others by Kuhn et al. Proteomics 4: 1175-86 (2004).
  • Scheduled multiple-reaction-monitoring (Scheduled MRM) mode acquisition during LC-MS/MS analysis enhances the sensitivity and accuracy of peptide quantitation. Anderson and Hunter, Molecular and Cellular Proteomics 5 (4): 573 (2006).
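  • One way to picture scheduled MRM is that each transition is monitored only within a window around the expected retention time of its peptide, so fewer transitions compete for instrument time at any moment. The sketch below illustrates that idea only; the transition names, retention times and 2-minute window are hypothetical and are not taken from the source or from any instrument software.

```python
from typing import NamedTuple


class Transition(NamedTuple):
    name: str               # hypothetical transition label
    expected_rt_min: float  # expected retention time in minutes


def active_transitions(transitions, current_rt_min, window_min=2.0):
    """Return the names of transitions whose window covers the current time."""
    half_window = window_min / 2.0
    return [
        t.name
        for t in transitions
        if abs(t.expected_rt_min - current_rt_min) <= half_window
    ]


# Made-up schedule with three transitions.
schedule = [
    Transition("peptide_A_y5", 5.0),
    Transition("peptide_B_y7", 5.8),
    Transition("peptide_C_b4", 12.0),
]
monitored_now = active_transitions(schedule, current_rt_min=5.5)
```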
  • mass spectrometry-based assays can be advantageously combined with upstream peptide or protein separation or fractionation methods, such as for example with the chromatographic and other methods described herein below.
  • shotgun quantitative proteomics can be combined with SRM/MRM-based assays for high-throughput identification and verification of prognostic biomarkers of preterm birth.
  • a person skilled in the art will appreciate that a number of methods can be used to determine the amount of a biomarker, including mass spectrometry approaches, such as MS/MS, LC-MS/MS, multiple reaction monitoring (MRM) or SRM and product ion monitoring (PIM) and also including antibody based methods such as immunoassays such as Western blots, enzyme-linked immunosorbent assay (ELISA),
  • determining the level of the at least one biomarker comprises using an immunoassay and/or mass spectrometric methods.
  • the mass spectrometric methods are selected from MS, MS/MS, LC-MS/MS, SRM, PIM, and other such methods that are known in the art.
  • LC-MS/MS further comprises 1D LC-MS/MS, 2D LC-MS/MS or 3D LC-MS/MS.
  • Immunoassay techniques and protocols are generally known to those skilled in the art (Price and Newman, Principles and Practice of Immunoassay, 2nd Edition, Grove's Dictionaries, 1997; and Gosling, Immunoassays: A Practical Approach, Oxford University Press, 2000).
  • a variety of immunoassay techniques, including competitive and non-competitive immunoassays, can be used (Self et al., Curr. Opin. Biotechnol., 7:60-65 (1996)).
  • the immunoassay is selected from Western blot, ELISA, immunoprecipitation, immunohistochemistry, immunofluorescence,
  • the immunoassay is an ELISA.
  • the ELISA is direct ELISA (enzyme-linked immunosorbent assay), indirect ELISA, sandwich ELISA, competitive ELISA, multiplex ELISA, ELISPOT technologies, and other similar techniques known in the art. Principles of these immunoassay methods are known in the art, for example John R. Crowther, The ELISA Guidebook, 1st ed., Humana Press 2000, ISBN 0896037282. Typically ELISAs are performed with antibodies but they can be performed with any capture agents that bind specifically to one or more biomarkers of the invention and that can be detected.
  • Multiplex ELISA allows simultaneous detection of two or more analytes within a single compartment (e.g., microplate well) usually at a plurality of array addresses (Nielsen and Geierstanger 2004. J Immunol Methods 290: 107-20 (2004) and Ling et al. 2007. Expert Rev Mol Diagn 7: 87-98 (2007)).
  • Radioimmunoassay can be used to detect one or more biomarkers in the methods of the invention.
  • RIA is a competition-based assay that is well known in the art and involves mixing known quantities of radioactively-labelled (e.g., 125I- or 131I-labelled) target analyte with antibody specific for the analyte, then adding non-labelled analyte from a sample and measuring the amount of labelled analyte that is displaced (see, e.g., An Introduction to Radioimmunoassay and Related Techniques, by Chard T, ed., Elsevier Science 1995, ISBN 0444821198 for guidance).
  • a detectable label can be used in the assays described herein for direct or indirect detection of the biomarkers in the methods of the invention.
  • a wide variety of detectable labels can be used, with the choice of label depending on the sensitivity required, ease of conjugation with the antibody, stability requirements, and available instrumentation and disposal provisions. Those skilled in the art are familiar with selection of a suitable detectable label based on the assay detection of the biomarkers in the methods of the invention.
  • Suitable detectable labels include, but are not limited to, fluorescent dyes (e.g., fluorescein, fluorescein isothiocyanate (FITC), Oregon Green™, rhodamine, Texas red, tetramethylrhodamine isothiocyanate (TRITC), Cy3, Cy5, etc.), fluorescent markers (e.g., green fluorescent protein (GFP), phycoerythrin, etc.), enzymes (e.g., luciferase, horseradish peroxidase, alkaline phosphatase, etc.), nanoparticles, biotin, digoxigenin, metals, and the like.
  • differential tagging with isotopic reagents, e.g., isotope-coded affinity tags (ICAT) or the more recent variation that uses isobaric tagging reagents, iTRAQ (Applied Biosystems, Foster City, Calif.), or tandem mass tags, TMT, (Thermo Scientific, Rockford, IL), followed by multidimensional liquid chromatography (LC) and tandem mass spectrometry (MS/MS) analysis can provide a further methodology in practicing the methods of the invention.
  • MS/MS tandem mass spectrometry
  • a chemiluminescence assay using a chemiluminescent antibody can be used for sensitive, non-radioactive detection of protein levels.
  • fluorochromes include, without limitation, DAPI, fluorescein, Hoechst 33258, R-phycocyanin, B-phycoerythrin, R-phycoerythrin, rhodamine, Texas red, and lissamine.
  • Indirect labels include various enzymes well known in the art, such as horseradish peroxidase (HRP), alkaline phosphatase (AP), beta-galactosidase, urease, and the like. Detection systems using suitable substrates for horseradish peroxidase, alkaline phosphatase, and beta-galactosidase are well known in the art.
  • a signal from the direct or indirect label can be analyzed, for example, using a spectrophotometer to detect color from a chromogenic substrate; a radiation counter to detect radiation such as a gamma counter for detection of 125I; or a fluorometer to detect fluorescence in the presence of light of a certain wavelength.
  • a quantitative analysis can be made using a spectrophotometer such as an EMAX Microplate Reader (Molecular Devices; Menlo Park, Calif.) in accordance with the manufacturer's instructions.
  • assays used to practice the invention can be automated or performed robotically, and the signal from multiple samples can be detected simultaneously.
  • the methods described herein encompass quantification of the biomarkers using mass spectrometry (MS).
  • MS mass spectrometry
  • the mass spectrometry can be liquid chromatography-mass spectrometry (LC-MS), multiple reaction monitoring (MRM) or selected reaction monitoring (SRM).
  • the MRM or SRM can further encompass scheduled MRM or scheduled SRM.
  • Chromatography encompasses methods for separating chemical substances and generally involves a process in which a mixture of analytes is carried by a moving stream of liquid or gas ("mobile phase") and separated into components as a result of differential distribution of the analytes as they flow around or over a stationary liquid or solid phase (“stationary phase”), between the mobile phase and said stationary phase.
  • the stationary phase is usually a finely divided solid, a sheet of filter material, or a thin film of a liquid on the surface of a solid, or the like.
  • Chromatography is well understood by those skilled in the art as a technique applicable for the separation of chemical compounds of biological origin, such as, e.g., amino acids, proteins, fragments of proteins or peptides, etc.
  • Chromatography can be columnar (i.e., wherein the stationary phase is deposited or packed in a column), preferably liquid chromatography, and yet more preferably high-performance liquid chromatography (HPLC), or ultra high
  • exemplary types of chromatography include, without limitation, high-performance liquid chromatography (HPLC), UHPLC, normal phase HPLC (NP-HPLC), reversed phase HPLC (RP-HPLC), ion exchange chromatography (IEC), such as cation or anion exchange chromatography, hydrophilic interaction chromatography (HILIC), hydrophobic interaction chromatography (HIC), size exclusion chromatography (SEC) including gel filtration chromatography or gel permeation chromatography, chromatofocusing, affinity chromatography such as immuno-affinity, immobilised metal affinity chromatography, and the like.
  • HPLC high-performance liquid chromatography
  • NP-HPLC normal phase HPLC
  • RP-HPLC reversed phase HPLC
  • IEC ion exchange chromatography
  • HILIC hydrophilic interaction chromatography
  • HIC hydrophobic interaction chromatography
  • SEC size exclusion chromatography
  • Chromatography, including single-, two- or more-dimensional chromatography, can be used as a peptide fractionation method in conjunction with a further peptide analysis method, such as for example, with a downstream mass spectrometry analysis as described elsewhere in this specification.
  • peptide or polypeptide separation, identification or quantification methods can be used, optionally in conjunction with any of the above described analysis methods, for measuring biomarkers in the present disclosure.
  • Such methods include, without limitation, chemical extraction partitioning, isoelectric focusing (IEF) including capillary isoelectric focusing (CIEF), capillary isotachophoresis (CITP), capillary electrochromatography (CEC), and the like, one-dimensional polyacrylamide gel electrophoresis (PAGE), two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), capillary gel electrophoresis (CGE), capillary zone electrophoresis (CZE), micellar electrokinetic chromatography (MEKC), free flow electrophoresis (FFE), etc.
  • IEF isoelectric focusing
  • CITP capillary isotachophoresis
  • CEC capillary electrochromatography
  • PAGE polyacrylamide gel electrophoresis
  • 2D-PAGE two-dimensional polyacrylamide gel electrophoresis
  • the term “capture agent” refers to a compound that can specifically bind to a target, in particular a biomarker.
  • the term includes antibodies, antibody fragments, nucleic acid-based protein binding reagents (e.g. aptamers, Slow Off-rate Modified Aptamers (SOMAmer™)), protein-capture agents, natural ligands (i.e. a hormone for its receptor or vice versa), small molecules or variants thereof.
  • Capture agents can be configured to specifically bind to a target, in particular a biomarker.
  • Capture agents can include but are not limited to organic molecules, such as polypeptides, polynucleotides and other non-polymeric molecules that are identifiable to a skilled person.
  • capture agents include any agent that can be used to detect, purify, isolate, or enrich a target, in particular a biomarker. Any art-known affinity capture technologies can be used to selectively isolate and
  • biomarkers that are components of complex mixtures of biological media for use in the disclosed methods.
  • Antibody capture agents that specifically bind to a biomarker can be prepared using any suitable methods known in the art. See, e.g., Coligan, Current Protocols in Immunology (1991); Harlow & Lane, Antibodies: A Laboratory Manual (1988); Goding, Monoclonal Antibodies: Principles and Practice (2d ed. 1986).
  • Antibody capture agents can be any immunoglobulin or derivative thereof, whether natural or wholly or partially synthetically produced. All derivatives thereof which maintain specific binding ability are also included in the term.
  • Antibody capture agents have a binding domain that is homologous or largely homologous to an immunoglobulin binding domain and can be derived from natural sources, or partly or wholly synthetically produced.
  • Antibody capture agents can be monoclonal or polyclonal antibodies. In some embodiments, an antibody is a single chain antibody. Those of ordinary skill in the art will appreciate that antibodies can be provided in any of a variety of forms including, for example, humanized, partially humanized, chimeric, chimeric humanized, etc. Antibody capture agents can be antibody fragments including, but not limited to, Fab, Fab', F(ab')2, scFv, Fv, dsFv diabody, and Fd fragments. An antibody capture agent can be produced by any means.
  • an antibody capture agent can be enzymatically or chemically produced by fragmentation of an intact antibody and/or it can be recombinantly produced from a gene encoding the partial antibody sequence.
  • An antibody capture agent can comprise a single chain antibody fragment. Alternatively or additionally, an antibody capture agent can comprise multiple chains which are linked together, for example, by disulfide linkages, and any functional fragments obtained from such molecules, wherein such fragments retain specific-binding properties of the parent antibody molecule. Because of their smaller size as functional components of the whole molecule, antibody fragments can offer advantages over intact antibodies for use in certain immunochemical techniques and experimental applications.
  • Suitable capture agents useful for practicing the invention also include aptamers.
  • Aptamers are oligonucleotide sequences that can bind to their targets specifically via unique three-dimensional (3-D) structures.
  • An aptamer can include any suitable number of nucleotides and different aptamers can have either the same or different numbers of nucleotides.
  • Aptamers can be DNA or RNA or chemically modified nucleic acids and can be single stranded, double stranded, or contain double stranded regions, and can include higher ordered structures.
  • An aptamer can also be a photoaptamer, where a photoreactive or chemically reactive functional group is included in the aptamer to allow it to be covalently linked to its corresponding target.
  • an aptamer capture agent can include the use of two or more aptamers that specifically bind the same biomarker.
  • An aptamer can include a tag.
  • An aptamer can be identified using any known method, including the SELEX (systematic evolution of ligands by exponential enrichment) process. Once identified, an aptamer can be prepared or synthesized in accordance with any known method, including chemical synthetic methods and enzymatic synthetic methods, and used in a variety of applications for biomarker detection. Liu et al., Curr Med Chem.
  • Capture agents useful in practicing the methods of the invention also include SOMAmers (Slow Off-Rate Modified Aptamers) known in the art to have improved off-rate characteristics. Brody et al., J Mol Biol. 422(5):595-606 (2012).
  • SOMAmers can be generated using any known method, including the SELEX method.
  • biomarkers can be modified prior to analysis to improve their resolution or to determine their identity.
  • the biomarkers can be subject to proteolytic digestion before analysis. Any protease can be used. Proteases, such as trypsin, that are likely to cleave the biomarkers into a discrete number of fragments are particularly useful. The fragments that result from digestion function as a fingerprint for the biomarkers, thereby enabling their detection indirectly. This is particularly useful where there are biomarkers with similar molecular masses that might be confused for the biomarker in question. Also, proteolytic fragmentation is useful for high molecular weight biomarkers because smaller biomarkers are more easily resolved by mass spectrometry.
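  • The fingerprint produced by proteolytic digestion can be illustrated with an in silico tryptic digest. The sketch below assumes the commonly used rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P), and it ignores missed cleavages; the example sequence is made up and is not a biomarker from the source.

```python
def tryptic_digest(sequence):
    """Split a protein sequence into peptides using a simple trypsin rule.

    Cleaves after K or R, except when the following residue is P.
    Missed cleavages are not modeled in this sketch.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in ("K", "R") and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


# Made-up sequence; the R at position 7 is followed by P and is not cleaved.
fragments = tryptic_digest("MKWVTFRPLLKAA")
```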
  • biomarkers can be modified to improve detection resolution.
  • neuraminidase can be used to remove terminal sialic acid residues from glycoproteins to improve binding to an anionic adsorbent and to improve detection resolution.
  • the biomarkers can be modified by the attachment of a tag of particular molecular weight that specifically binds to molecular biomarkers, further distinguishing them.
  • the identity of the biomarkers can be further determined by matching the physical and chemical characteristics of the modified biomarkers in a protein database (e.g., SwissProt).
  • biomarkers in a sample can be captured on a substrate for detection.
  • Traditional substrates include antibody-coated 96-well plates or nitrocellulose membranes that are subsequently probed for the presence of the proteins.
  • protein-binding molecules attached to microspheres, microparticles, microbeads, beads, or other particles can be used for capture and detection of biomarkers.
  • the protein-binding molecules can be antibodies, peptides, peptoids, aptamers, small molecule ligands or other protein-binding capture agents attached to the surface of particles.
  • Each protein-binding molecule can include a unique detectable label that is coded such that it can be distinguished from other detectable labels attached to other protein-binding molecules to allow detection of biomarkers in multiplex assays.
  • Examples include, but are not limited to, color-coded microspheres with known fluorescent light intensities (see, e.g., microspheres with xMAP technology produced by Luminex (Austin, Tex.));
  • microspheres containing quantum dot nanocrystals for example, having different ratios and combinations of quantum dot colors (e.g., Qdot nanocrystals produced by Life
  • examples include chemiluminescent dyes, combinations of dye compounds, and beads of detectably different sizes.
  • biochips can be used for capture and detection of the biomarkers of the invention.
  • Many protein biochips are known in the art. These include, for example, protein biochips produced by Packard BioScience Company (Meriden Conn.), Zyomyx (Hayward, Calif.) and Phylos (Lexington, Mass.).
  • protein biochips comprise a substrate having a surface. A capture reagent or adsorbent is attached to the surface of the substrate. Frequently, the surface comprises a plurality of addressable locations, each of which location has the capture agent bound there.
  • the capture agent can be a biological molecule, such as a polypeptide or a nucleic acid, which captures other biomarkers in a specific manner. Alternatively, the capture agent can be a chromatographic material, such as an anion exchange material or a hydrophilic material. Examples of protein biochips are well known in the art.
  • Measuring mRNA in a biological sample can be used as a surrogate for detection of the level of the corresponding protein biomarker in a biological sample.
  • any of the biomarkers or biomarker panels described herein can also be detected by detecting the appropriate RNA.
  • Levels of mRNA can be measured by reverse transcription quantitative polymerase chain reaction (RT-PCR followed with qPCR).
  • RT-PCR is used to create a cDNA from the mRNA.
  • the cDNA can be used in a qPCR assay to produce fluorescence as the DNA amplification process progresses. By comparison to a standard curve, qPCR can produce an absolute measurement such as number of copies of mRNA per cell.
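  • The comparison to a standard curve can be sketched as a straight-line fit of threshold cycle (Ct) against log10 of the known copy number, which is then inverted for an unknown sample. The Ct values, the dilution series and the function names below are hypothetical illustrations, not data or methods from the source.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number for a sample Ct."""
    return 10 ** ((ct - intercept) / slope)


# Made-up ten-fold dilution series of a standard and its Ct values.
standard_copies = [1e3, 1e4, 1e5, 1e6]
standard_cts = [30.0, 26.7, 23.4, 20.1]
slope, intercept = fit_standard_curve(standard_copies, standard_cts)
unknown_copies = copies_from_ct(25.05, slope, intercept)
```

  • Converting the estimate to copies per cell would additionally require the number of cells that contributed to the sample, which this sketch does not model.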
  • Some embodiments disclosed herein relate to diagnostic and prognostic methods of determining the probability for preterm birth in a pregnant female.
  • the detection of the level of expression of one or more biomarkers and/or the determination of a ratio of biomarkers can be used to determine the probability for preterm birth in a pregnant female.
  • detection methods can be used, for example, for early diagnosis of the condition, to determine whether a subject is predisposed to preterm birth, to monitor the progress of preterm birth or the progress of treatment protocols, to assess the severity of preterm birth, to forecast the outcome of preterm birth and/or prospects of recovery or birth at full term, or to aid in the determination of a suitable treatment for preterm birth.
  • the quantitation of biomarkers in a biological sample can be determined, without limitation, by the methods described above as well as any other method known in the art.
  • the quantitative data thus obtained is then subjected to an analytic classification process.
  • the raw data is manipulated according to an algorithm, where the algorithm has been pre-defined by a training set of data, for example as described in the examples provided herein.
  • An algorithm can utilize the training set of data provided herein, or can utilize the guidelines provided herein to generate an algorithm with a different set of data.
  • analyzing a measurable feature to determine the probability for preterm birth in a pregnant female encompasses the use of a predictive model. In further embodiments, analyzing a measurable feature to determine the probability for preterm birth in a pregnant female encompasses comparing said measurable feature with a reference feature. As those skilled in the art can appreciate, such comparison can be a direct comparison to the reference feature or an indirect comparison where the reference feature has been incorporated into the predictive model.
  • analyzing a measurable feature to determine the probability for preterm birth in a pregnant female encompasses one or more of a linear discriminant analysis model, a support vector machine classification algorithm, a recursive feature elimination model, a prediction analysis of microarray model, a logistic regression model, a CART algorithm, a flex tree algorithm, a LART algorithm, a random forest algorithm, a MART algorithm, a machine learning algorithm, a penalized regression method, or a combination thereof.
  • the analysis comprises logistic regression.
  • An analytic classification process can use any one of a variety of statistical analytic methods to manipulate the quantitative data and provide for classification of the sample. Examples of useful methods include linear discriminant analysis, recursive feature elimination, a prediction analysis of microarray, a logistic regression, a CART algorithm, a FlexTree algorithm, a LART algorithm, a random forest algorithm, a MART algorithm, machine learning algorithms; etc.
  • a regression tree begins with a root node that contains all the subjects.
  • the average gestational age at birth (GAB) for all subjects can be calculated in the root node.
  • the variance of the GAB within the root node will be high, because there is a mixture of women with different GABs.
  • the root node is then divided (partitioned) into two branches, so that each branch contains women with a similar GAB.
  • the average GAB for subjects in each branch is again calculated.
  • the variance of the GAB within each branch will be lower than in the root node, because the subset of women within each branch has relatively more similar GABs than those in the root node.
  • the two branches are created by selecting an analyte and a threshold value for the analyte that creates branches with similar GAB.
  • the analyte and threshold value are chosen from among the set of all analytes and threshold values, usually with a random subset of the analytes at each node.
  • the procedure continues recursively, producing branches to create leaves (terminal nodes) in which the subjects have very similar GABs.
  • the predicted GAB in each terminal node is the average GAB for subjects in that terminal node. This procedure creates a single regression tree.
  • a random forest can consist of several hundred or several thousand such trees.
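  • A single partitioning step of the regression tree described above can be sketched as a search over analytes and thresholds for the split that leaves the smallest spread of GAB within the two branches. The sketch assumes the within-branch sum of squared deviations as the split criterion; the analyte names and values are made up. A random forest would repeat this over many trees, passing a random subset of analytes at each node.

```python
def sum_squared_dev(values):
    """Sum of squared deviations from the mean (0.0 for an empty branch)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, gab, analytes):
    """Find the (score, analyte, threshold) minimizing within-branch spread.

    rows: list of dicts mapping analyte name to measured level.
    gab: gestational age at birth (days) for each subject.
    analytes: candidate analytes considered at this node.
    """
    best = None
    for analyte in analytes:
        levels = sorted({row[analyte] for row in rows})
        # Candidate thresholds are midpoints between adjacent observed levels.
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2.0
            left = [g for row, g in zip(rows, gab) if row[analyte] <= threshold]
            right = [g for row, g in zip(rows, gab) if row[analyte] > threshold]
            score = sum_squared_dev(left) + sum_squared_dev(right)
            if best is None or score < best[0]:
                best = (score, analyte, threshold)
    return best


# Made-up data: analyte "A" separates early from late deliveries, "B" does not.
subjects = [
    {"A": 1.0, "B": 5.0},
    {"A": 2.0, "B": 1.0},
    {"A": 4.0, "B": 4.0},
    {"A": 5.0, "B": 2.0},
]
gab_days = [240.0, 245.0, 275.0, 280.0]
split = best_split(subjects, gab_days, analytes=["A", "B"])
```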
  • Classification can be made according to predictive modeling methods that set a threshold for determining the probability that a sample belongs to a given class. The probability preferably is at least 50%, or at least 60%, or at least 70%, or at least 80% or higher. Classifications also can be made by determining whether a comparison between an obtained dataset and a reference dataset yields a statistically significant difference. If so, then the sample from which the dataset was obtained is classified as not belonging to the reference dataset class. Conversely, if such a comparison is not statistically significantly different from the reference dataset, then the sample from which the dataset was obtained is classified as belonging to the reference dataset class.
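  • Threshold-based classification can be sketched with a logistic regression model, one of the methods named above. The intercept, coefficients and biomarker names below are hypothetical placeholders chosen for illustration; they are not fitted values from the source.

```python
import math

# Hypothetical model parameters for illustration only.
INTERCEPT = -1.0
COEFFICIENTS = {"biomarker_a": 0.8, "biomarker_b": -0.5}


def predicted_probability(features):
    """Logistic model: probability that the sample belongs to the class."""
    z = INTERCEPT + sum(
        COEFFICIENTS[name] * features[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, threshold=0.5):
    """Assign the class when the predicted probability reaches the threshold."""
    return predicted_probability(features) >= threshold


borderline = {"biomarker_a": 2.5, "biomarker_b": 2.0}
elevated = {"biomarker_a": 5.0, "biomarker_b": 0.0}
```

  • Raising the threshold from 50% to 70% leaves the second sample in the class but excludes the borderline one.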
  • a desired quality threshold is a predictive model that will classify a sample with an accuracy of at least about 0.5, at least about 0.55, at least about 0.6, at least about 0.7, at least about 0.75, at least about 0.8, at least about 0.85, at least about 0.9, at least about 0.95, or higher.
  • a desired quality threshold can refer to a predictive model that will classify a sample with an AUC of at least about 0.7, at least about 0.75, at least about 0.8, at least about 0.85, at least about 0.9, or higher.
  • the relative sensitivity and specificity of a predictive model can be adjusted to favor either the specificity metric or the sensitivity metric, where the two metrics have an inverse relationship.
  • the limits in a model as described above can be adjusted to provide a selected sensitivity or specificity level, depending on the particular requirements of the test being performed.
  • One or both of sensitivity and specificity can be at least about 0.7, at least about 0.75, at least about 0.8, at least about 0.85, at least about 0.9, or higher.
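  • The inverse relationship between sensitivity and specificity can be seen by moving the decision threshold on the same set of scores. The scores and labels below are made up, and the sketch assumes that both classes are present so that neither denominator is zero.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Compute (sensitivity, specificity) for a given score threshold.

    labels: 1 for the positive class (e.g., preterm birth), 0 otherwise.
    A sample is called positive when its score is at or above the threshold.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up model scores and true outcomes.
example_scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0, 0]
strict = sensitivity_specificity(example_scores, example_labels, threshold=0.7)
lenient = sensitivity_specificity(example_scores, example_labels, threshold=0.25)
```

  • The strict threshold favors specificity and the lenient one favors sensitivity, which is the adjustment the preceding items describe.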
  • the raw data can be initially analyzed by measuring the values for each biomarker, usually in triplicate or in multiple triplicates.
  • the data can be manipulated, for example, raw data can be transformed using standard curves, and the average of triplicate measurements used to calculate the average and standard deviation for each patient. These values can be transformed before being used in the models, e.g. log-transformed, Box-Cox transformed (Box and Cox, Royal Stat. Soc., Series B, 26:211-246 (1964)).
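  • The averaging and transformation steps can be sketched as follows, using the standard Box-Cox form (x^λ - 1)/λ, which reduces to the natural logarithm at λ = 0. The triplicate values and the choice of λ are hypothetical; in practice λ would be estimated from the data.

```python
import math
import statistics


def summarize_triplicate(measurements):
    """Return the mean and sample standard deviation of replicate values."""
    return statistics.mean(measurements), statistics.stdev(measurements)


def box_cox(value, lam):
    """Box-Cox transform of a positive value; lam == 0 gives the log transform."""
    if value <= 0:
        raise ValueError("Box-Cox requires a positive value")
    if lam == 0:
        return math.log(value)
    return (value ** lam - 1.0) / lam


# Made-up triplicate measurement of one biomarker for one patient.
mean_level, sd_level = summarize_triplicate([9.0, 10.0, 11.0])
log_level = box_cox(mean_level, lam=0)
sqrt_like_level = box_cox(4.0, lam=0.5)
```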
  • the data are then input into a predictive model, which will classify the sample according to the state.
  • the resulting information can be communicated to a patient or health care provider.
  • a robust data set comprising known control samples and samples corresponding to the preterm birth classification of interest is used in a training set.
  • a sample size can be selected using generally accepted criteria.
  • different statistical methods can be used to obtain a highly accurate predictive model. Examples of such analysis are provided in Example 2.
  • hierarchical clustering is performed in the derivation of a predictive model, where the Pearson correlation is employed as the clustering metric.
  • One approach is to consider a preterm birth dataset as a "learning sample" in a problem of "supervised learning."
  • CART is a standard in applications to medicine (Singer, Recursive Partitioning in the Health Sciences, Springer (1999)) and can be modified by transforming any qualitative features to quantitative features; sorting them by attained significance levels, evaluated by sample reuse methods for Hotelling's T² statistic; and suitable application of the lasso method.
  • Problems in prediction are turned into problems in regression without losing sight of prediction, indeed by making suitable use of the Gini criterion for classification in evaluating the quality of regressions.
  • LARTree or LART can be used (Turnbull (2005) Classification Trees with Subset Analysis Selection by the Lasso, Stanford University).
  • the name reflects binary trees, as in CART and FlexTree; the lasso, as has been noted; and the implementation of the lasso through what is termed LARS by Efron et al., Annals of Statistics 32:407-451 (2004). See also Huang et al., Proc. Natl. Acad. Sci. USA 101(29):10529-34 (2004).
  • Other methods of analysis that can be used include logic regression.
  • One method of logic regression is disclosed by Ruczinski, Journal of Computational and Graphical Statistics 12:475-512 (2003).
  • Logic regression resembles CART in that its classifier can be displayed as a binary tree. It is different in that each node has Boolean statements about features that are more general than the simple "and" statements produced by CART.
  • the false discovery rate can be determined.
  • a set of null distributions of dissimilarity values is generated.
  • the values of observed profiles are permuted to create a sequence of distributions of correlation coefficients obtained out of chance, thereby creating an appropriate set of null distributions of correlation coefficients (Tusher et al. , Proc. Natl. Acad. Sci. U.S.A 98, 5116-21 (2001)).
  • the set of null distributions is obtained by:
  • the FDR is the ratio of the number of the expected falsely significant correlations (estimated from the correlations greater than this selected Pearson correlation in the set of randomized data) to the number of correlations greater than this selected Pearson correlation in the empirical data (significant correlations).
  • This cut-off correlation value can be applied to the correlations between experimental profiles. Using the aforementioned distribution, a level of confidence is chosen for significance. This is used to determine the lowest value of the correlation coefficient that exceeds the result that would have been obtained by chance.
  • Using this method, one obtains thresholds for positive correlation, negative correlation or both. Using this threshold(s), the user can filter the observed values of the pairwise correlation coefficients and eliminate those that do not exceed the threshold(s).
  • an estimate of the false positive rate can be obtained for a given threshold. For each of the individual "random correlation" distributions, one can find how many observations fall outside the threshold range. This procedure provides a sequence of counts. The mean and the standard deviation of the sequence provide the average number of potential false positives and its standard deviation.
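  • The permutation-based estimate can be sketched as follows: correlations computed on shuffled profiles supply the expected number of falsely significant correlations, and the FDR is that number divided by the count of observed correlations above the same cut-off. This is an illustrative sketch only; the profiles, the cut-off, the number of permutations and the within-profile shuffling scheme are assumptions, and profiles are assumed to have non-zero variance.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def pairwise_correlations(profiles):
    """Correlations for every unordered pair of profiles."""
    return [
        pearson(profiles[i], profiles[j])
        for i in range(len(profiles))
        for j in range(i + 1, len(profiles))
    ]


def estimate_fdr(profiles, cutoff, n_permutations=200, seed=0):
    """Expected false positives under permutation / observed positives."""
    rng = random.Random(seed)
    observed = sum(1 for r in pairwise_correlations(profiles) if r > cutoff)
    if observed == 0:
        return None  # nothing called significant at this cut-off
    null_counts = []
    for _ in range(n_permutations):
        # Shuffle values within each profile to break real associations.
        shuffled = [rng.sample(p, len(p)) for p in profiles]
        null_counts.append(
            sum(1 for r in pairwise_correlations(shuffled) if r > cutoff)
        )
    expected_false = sum(null_counts) / n_permutations
    return expected_false / observed


# Made-up profiles that are strongly correlated with one another.
example_profiles = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [2, 4, 6, 8, 10, 12, 14, 17],
    [1, 3, 2, 5, 4, 7, 6, 9],
]
fdr = estimate_fdr(example_profiles, cutoff=0.8)
```

  • The list of per-permutation counts inside `estimate_fdr` is the sequence of counts mentioned above; its mean and standard deviation give the average number of potential false positives and its spread.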
  • variables chosen in the cross-sectional analysis are separately employed as predictors in a time-to-event analysis (survival analysis), where the event is the occurrence of preterm birth, and subjects with no event are considered censored at the time of giving birth.
  • a parametric approach to analyzing survival can be better than the widely applied semi-parametric Cox model.
  • a Weibull parametric fit of survival permits the hazard rate to be monotonically increasing, decreasing, or constant, and also has a proportional hazards representation (as does the Cox model) and an accelerated failure-time representation. All the standard tools available in obtaining approximate maximum likelihood estimators of regression coefficients and corresponding functions are available with this model.
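  • The three hazard shapes permitted by the Weibull model can be shown with its standard hazard function h(t) = (k/λ)(t/λ)^(k-1) and survival function S(t) = exp(-(t/λ)^k), where k is the shape and λ the scale. The parameter values below are arbitrary illustrations, not fitted estimates from the source.

```python
import math


def weibull_hazard(t, shape, scale):
    """Weibull hazard rate at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def weibull_survival(t, shape, scale):
    """Probability that the event has not occurred by time t."""
    return math.exp(-((t / scale) ** shape))


# Arbitrary scale; shape controls whether the hazard rises, falls or stays flat.
SCALE = 10.0
increasing = [weibull_hazard(t, 2.0, SCALE) for t in (1.0, 5.0, 9.0)]
constant = [weibull_hazard(t, 1.0, SCALE) for t in (1.0, 5.0, 9.0)]
decreasing = [weibull_hazard(t, 0.5, SCALE) for t in (1.0, 5.0, 9.0)]
```

  • A shape above 1 gives an increasing hazard, a shape of exactly 1 gives the constant hazard of an exponential model, and a shape below 1 gives a decreasing hazard.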
  • Cox models can be used, especially since reductions of numbers of covariates to manageable size with the lasso will significantly simplify the analysis, allowing the possibility of a nonparametric or semi-parametric approach to prediction of time to preterm birth.
  • These statistical tools are known in the art and applicable to all manner of proteomic data.
  • a set of biomarker, clinical and genetic data that can be easily determined, and that is highly informative regarding the probability for preterm birth and predicted time to a preterm birth event in said pregnant female is provided.
  • algorithms provide information regarding the probability for preterm birth in the pregnant female.
  • the probability for preterm birth according to the invention can be determined using either a quantitative or a categorical variable.
  • the measurable feature of each of N biomarkers can be subjected to categorical data analysis to determine the probability for preterm birth as a binary categorical outcome.
  • the methods of the invention may analyze the measurable feature of each of N biomarkers by initially calculating quantitative variables, in particular, predicted gestational age at birth. The predicted gestational age at birth can subsequently be used as a basis to predict risk of preterm birth.
  • the methods of the invention take into account the continuum of measurements detected for the measurable features. For example, by predicting the gestational age at birth rather than making a binary prediction of preterm birth versus term birth, it is possible to tailor the treatment for the pregnant female. For example, an earlier predicted gestational age at birth will result in more intensive prenatal intervention, i.e. monitoring and treatment, than a predicted gestational age that approaches full term.
  • p(PTB) can be estimated as the proportion of women in the PAPR clinical trial (see Example 1) with a predicted GAB of j days plus or minus k days who actually deliver before 37 weeks gestational age. More generally, for women with a predicted GAB of j days plus or minus k days, the probability that the actual gestational age at birth will be less than a specified gestational age, p(actual GAB < specified GAB), was estimated as the proportion of women in the PAPR clinical trial with a predicted GAB of j days plus or minus k days who actually deliver before the specified gestational age.
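The proportion-based estimate described above can be written as a short function. This is a sketch under stated assumptions: the function name is hypothetical, gestational ages are taken to be in days, and 37 weeks is expressed as 259 days; any data passed to it in an example would be illustrative rather than PAPR trial data.

```python
def prob_birth_before(predicted_gab, actual_gab, j, k, cutoff_days=37 * 7):
    """Estimate p(actual GAB < cutoff) among women whose predicted GAB is j +/- k days.

    predicted_gab and actual_gab are parallel sequences of gestational ages in days.
    Returns None when no woman falls inside the window.
    """
    window = [a for p, a in zip(predicted_gab, actual_gab) if abs(p - j) <= k]
    if not window:
        return None
    return sum(a < cutoff_days for a in window) / len(window)
```

Passing a different `cutoff_days` gives the more general p(actual GAB < specified GAB) for any specified gestational age.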
  • a subset of markers i.e. at least 3, at least 4, at least 5, at least 6, up to the complete set of markers.
  • a subset of markers will be chosen that provides for the needs of the quantitative sample analysis, e.g. availability of reagents, convenience of quantitation, etc., while maintaining a highly accurate predictive model.
  • the selection of a number of informative markers for building classification models requires the definition of a performance metric and a user-defined threshold for producing a model with useful predictive ability based on this metric.
  • the performance metric can be the AUC, the sensitivity and/or specificity of the prediction as well as the overall accuracy of the prediction model.
  • an analytic classification process can use any one of a variety of statistical analytic methods to manipulate the quantitative data and provide for classification of the sample.
  • useful methods include, without limitation, linear discriminant analysis, recursive feature elimination, a prediction analysis of microarray, a logistic regression, a CART algorithm, a FlexTree algorithm, a LART algorithm, a random forest algorithm, a MART algorithm, and machine learning algorithms.
  • VQEAHLTEDQIFYFPK 655. -2.02E+01 1.77E-09 2.45E+00 -8.22 2.20E-16 66 391.2
  • HTLNQIDEVK 598.82 951.5 1.03E+01 3.04E+04 2.11E+00 4.89 9.90E-07
  • Table 4 Area under the ROC (AUROC) curve for individual analytes to discriminate pre-term birth subjects from non-pre-term birth subjects. The 77 transitions with the highest AUROC area are shown. Transition AUROC
  • Table 5 AUROCs for random forest, boosting, lasso, and logistic regression for a specific number of transitions permitted in the model, as estimated by 100 rounds of bootstrap resampling.
  • kits for determining probability of preterm birth wherein the kits can be used to detect N of the isolated biomarkers listed in Tables 1 through 63.
  • the kits can be used to detect one or more, two or more, or three of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, and ITLPDFTGDLR.
  • kits can be used to detect one or more, two or more, or three of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDP NYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
  • kits can be used to detect one or more, two or more, three or more, four or more, five or more, six or more, seven or more, or eight of the isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
  • LBP lipopolysaccharide-binding protein
  • THRB prothrombin
  • C5 or C05 complement component C5
  • PLMN plasminogen
  • C8G or C08G complement component C8 gamma chain
  • kits can be used to detect one or more, two or more, three or more, four or more, five or more, six or more, seven or more, or eight of the isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
  • A1BG Alpha-1B-glycoprotein
  • the kit can include one or more agents for detection of biomarkers, a container for holding a biological sample isolated from a pregnant female; and printed instructions for reacting agents with the biological sample or a portion of the biological sample to detect the presence or amount of the isolated biomarkers in the biological sample.
  • the agents can be packaged in separate containers.
  • the kit can further comprise one or more control reference samples and reagents for performing an immunoassay.
  • the kit comprises agents for measuring the levels of at least N of the isolated biomarkers listed in Tables 1 through 63.
  • the kit can include antibodies that specifically bind to these biomarkers, for example, the kit can contain at least one of an antibody that specifically binds to lipopolysaccharide-binding protein (LBP), an antibody that specifically binds to prothrombin (THRB), an antibody that specifically binds to complement component C5 (C5 or C05), an antibody that specifically binds to plasminogen (PLMN), and an antibody that specifically binds to complement component C8 gamma chain (C8G or C08G).
  • the kit comprises agents for measuring the levels of at least N of the isolated biomarkers listed in Tables 1 through 63.
  • the kit can include antibodies that specifically bind to these biomarkers, for example, the kit can contain at least one of an antibody that specifically binds to Alpha-1B-glycoprotein (A1BG),
  • Disintegrin and metalloproteinase domain-containing protein 12 ADA12
  • Apolipoprotein B-100 APOB
  • Beta-2-microglobulin B2MG
  • CCAAT/enhancer-binding protein alpha/beta HP8 Peptide
  • Corticosteroid-binding globulin CBG
  • Complement component C6, Endoglin EGLN
  • EGF Ectonucleotide pyrophosphatase/phosphodiesterase family member 2
  • ENPP2 Ectonucleotide pyrophosphatase/phosphodiesterase family member 2
  • FA7 Coagulation factor VII
  • HBP2 Hyaluronan-binding protein 2
  • PSG9 Pregnancy-specific beta-1-glycoprotein 9
  • IHBE Inhibin beta E chain
  • the kit can comprise one or more containers for compositions contained in the kit.
  • Compositions can be in liquid form or can be lyophilized. Suitable containers for the compositions include, for example, bottles, vials, syringes, and test tubes. Containers can be formed from a variety of materials, including glass or plastic.
  • the kit can also comprise a package insert containing written instructions for methods of determining probability of preterm birth.
  • a standard protocol was developed governing conduct of the Proteomic Assessment of Preterm Risk (PAPR) clinical study. This protocol also specified that the samples and clinical information could be used to study other pregnancy complications for some of the subjects. Specimens were obtained from women at 11 Institutional Review Board (IRB) approved sites across the United States. After the subjects provided informed consent, serum and plasma samples were obtained, as well as pertinent information regarding the patient's demographic characteristics, past medical and pregnancy history, current pregnancy history and concurrent medications. Following delivery, data were collected relating to maternal and infant conditions and complications. Serum and plasma samples were processed according to a protocol that requires standardized refrigerated centrifugation, aliquoting of the samples into 0.5 ml 2-D bar-coded cryovials and subsequent freezing at -80°C.
  • preterm birth cases were individually reviewed to determine their status as either a spontaneous preterm birth or a medically indicated preterm birth. Only spontaneous preterm birth cases were used for this analysis.
  • 80 samples were analyzed in two gestational age groups: a) a late window composed of samples from 23-28 weeks of gestation, which included 13 cases, 13 term controls matched within one week of sample collection and 14 random term controls, and b) an early window composed of samples from 17-22 weeks of gestation, which included 15 cases, 15 term controls matched within one week of sample collection and 10 random term controls.
  • MARS-14 Human 14 Multiple Affinity Removal System
  • sMRM Multiple Reaction Monitoring method
  • the peptides were separated on a 150 mm x 0.32 mm Bio-Basic C18 column (ThermoFisher) at a flow rate of 5 µl/min using a Waters Nano Acquity UPLC and eluted using an acetonitrile gradient into an AB SCIEX QTRAP 5500 with a Turbo V source (AB SCIEX, Framingham, MA).
  • the sMRM assay measured 1708 transitions that correspond to 854 peptides and 236 proteins. Chromatographic peaks were integrated using Rosetta Elucidator software (Ceiba Solutions).
  • the objective of these analyses was to examine the data collected in Example 1 to identify transitions and proteins that predict preterm birth.
  • the specific analyses employed were (i) Cox time-to-event analyses and (ii) models with preterm birth as a binary categorical dependent variable.
  • the dependent variable for all the Cox analyses was Gestational Age of time to event (where event is preterm birth).
  • preterm birth subjects have the event on the day of birth.
  • Term subjects are censored on the day of birth.
  • Gestational age on the day of specimen collection is a covariate in all Cox analyses.
  • the assay data were previously adjusted for run order and depletion batch, and log transformed. Values for gestational age at time of sample collection were adjusted as follows. Transition values were regressed on gestational age at time of sample collection using only controls (non-pre-term subjects). The residuals from the regression were designated as adjusted values. The adjusted values were used in the models with pre-term birth as a binary categorical dependent variable. Unadjusted values were used in the Cox analyses.
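The adjustment for gestational age at sample collection can be sketched as an ordinary least-squares fit on the controls followed by taking residuals. The function names are hypothetical, and the sketch assumes that the line fitted on controls is then applied to every subject (cases included) to obtain the adjusted values; the text does not state this explicitly.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def adjust_for_collection_age(values, ga_at_collection, is_control):
    """Residuals of one transition after removing the control-derived trend.

    values: log-transformed transition values, one per subject.
    ga_at_collection: gestational age at sample collection, one per subject.
    is_control: True for non-preterm subjects, which alone define the fit.
    """
    xc = [g for g, c in zip(ga_at_collection, is_control) if c]
    yc = [v for v, c in zip(values, is_control) if c]
    slope, intercept = fit_line(xc, yc)
    return [v - (intercept + slope * g)
            for v, g in zip(values, ga_at_collection)]
```

Fitting on controls only keeps any case-specific signal out of the trend estimate, so that signal remains in the residuals used by the binary models.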
  • the stepwise variable selection analysis used the Akaike Information Criterion (AIC) as the stopping criterion. Table 2 shows the transitions selected by the stepwise AIC analysis.
  • the coefficient of determination (R²) for the stepwise AIC model is 0.86 (not corrected for multiple comparisons).
  • Lasso variable selection was used as the second method of multivariate Cox Proportional Hazards analyses to predict Gestational Age at birth, including Gestational age on the day of specimen collection as a covariate. This analysis uses a lambda penalty for lasso estimated by cross validation. Table 3 shows the results.
  • the lasso variable selection method is considerably more stringent than the stepwise AIC, and selects only 3 transitions for the final model, representing 3 different proteins. These 3 proteins give the top 4 transitions from the univariate analysis; 2 of the top 4 univariate are from the same protein, and hence are not both selected by the lasso method. Lasso tends to select a relatively small number of variables with low mutual correlation.
  • the coefficient of determination (R²) for the lasso model is 0.21 (not corrected for multiple comparisons).
  • Multivariate analyses were performed to predict preterm birth as a binary categorical dependent variable, using random forest, boosting, lasso, and logistic regression models. Random forest and boosting models grow many classification trees. The trees vote on the assignment of each subject to one of the possible classes. The forest chooses the class with the most votes over all the trees.
  • each method was allowed to select and rank its own best 15 transitions. We then built models with 1 to 15 transitions. Each method sequentially reduces the number of nodes from 15 to 1 independently. A recursive option was used to reduce the number of nodes at each step: To determine which node to remove, the nodes were ranked at each step based on their importance from a nested cross-validation procedure. The least important node was eliminated. The importance measures for lasso and logistic regression are z-values.
  • variable importance was calculated from permuting out-of-bag data: for each tree, the classification error rate on the out-of-bag portion of the data was recorded; the error rate was then recalculated after permuting the values of each variable (i.e., transition); if the transition was in fact important, there would have been a big difference between the two error rates; the differences between the two error rates were then averaged over all trees, and normalized by the standard deviation of the differences.
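The out-of-bag permutation importance described above can be sketched without reference to any particular forest implementation. In this illustration each tree is represented as a hypothetical pair of a prediction function and the list of row indices that were out-of-bag for that tree; this representation, the function names, and the choice to skip normalization when the standard deviation is zero are assumptions made for the example.

```python
import random
import statistics

def error_rate(predict, rows, labels):
    """Fraction of rows whose predicted class differs from the label."""
    return sum(predict(r) != y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(trees, rows, labels, feature, seed=0):
    """Return (mean increase in OOB error, mean divided by SD of the increases).

    trees: list of (predict, oob_indices) pairs, one per tree (assumed layout).
    rows: list of feature lists; labels: list of class labels.
    feature: index of the variable (transition) to permute.
    """
    rng = random.Random(seed)
    diffs = []
    for predict, oob in trees:
        oob_rows = [rows[i] for i in oob]
        oob_labels = [labels[i] for i in oob]
        baseline = error_rate(predict, oob_rows, oob_labels)
        # Permute only the chosen variable among the out-of-bag rows.
        values = [r[feature] for r in oob_rows]
        rng.shuffle(values)
        permuted = [r[:feature] + [v] + r[feature + 1:]
                    for r, v in zip(oob_rows, values)]
        diffs.append(error_rate(predict, permuted, oob_labels) - baseline)
    mean_diff = statistics.fmean(diffs)
    sd = statistics.stdev(diffs) if len(diffs) > 1 else 0.0
    scaled = mean_diff / sd if sd > 0 else mean_diff
    return mean_diff, scaled
```

A variable that the trees never use leaves the error unchanged after permutation, so its importance is zero, while a variable the trees rely on produces a positive mean difference.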
  • the AUCs for these models are shown in Table 5, as estimated by 100 rounds of bootstrap resampling.
  • Table 6 shows the top 15 transitions selected by each multivariate method, ranked by importance for that method.
  • univariate and multivariate Cox analyses were performed using transitions to predict Gestational Age at Birth (GAB), including Gestational age on the day of specimen collection as a covariate.
  • GAB Gestational Age at birth
  • five proteins were identified that have multiple transitions among those with p-value less than 0.05: lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
  • stepwise AIC variable analysis selects 24 transitions, while the lasso model selects 3 transitions, which include the 3 top proteins in the univariate analysis.
  • Univariate (AUROC) and multivariate (random forest, boosting, lasso, and logistic regression) analyses were performed to predict pre-term birth as a binary categorical variable.
  • Univariate analyses identified 63 analytes with AUROC of 0.6 or greater.
  • Multivariate analyses suggest that models that combine 3 or more transitions give AUC greater than 0.7, as estimated by bootstrap.
  • the samples were processed in 4 batches with each batch composed of 7 cases, 14 matched controls and 3 HGS controls. Serum samples were depleted of the 14 most abundant serum proteins by MARS 14 as described in Example 1. Depleted serum was then reduced with dithiothreitol, alkylated with iodoacetamide, and then digested with trypsin at a 1:20 trypsin to protein ratio overnight at 37°C. Following trypsin digestion, the samples were desalted on an Empore C18 96-well Solid Phase Extraction Plate (3M Company) and lyophilized to dryness. The desalted samples were resolubilized in a reconstitution solution containing five internal standard peptides.
  • a further study used a hypothesis-independent shotgun approach to identify and quantify additional biomarkers not present on our multiplexed hypothesis dependent MRM assay. Samples were processed as described in the preceding Examples unless noted below.
  • B 250mM ammonium acetate, 2% acetonitrile, 0.1% formic acid
  • Xcorr scores (charge +1 > 1.5 Xcorr, charge +2 > 2.0, charge +3 > 2.5). Similar search parameters were used for X!Tandem, except that the mass tolerance for the fragment ion was 0.8 AMU and there was no Xcorr filtering. Instead, the PeptideProphet algorithm (Keller et al., Anal. Chem 2002;74:5383-5392) was used to validate each X!Tandem peptide-spectrum assignment, and protein assignments were validated using the ProteinProphet algorithm (Nesvizhskii et al., Anal. Chem 2003;75:4646-4658). Data were filtered to include only the peptide-spectrum matches that had a PeptideProphet probability of 0.9 or more.
  • ROC Receiver Operating Characteristic
  • the area under the ROC curve is equal to the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one.
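The probabilistic reading of the area under the ROC curve given above can be computed directly by comparing every positive instance with every negative one. This is a small illustrative sketch; the function name is hypothetical, and counting ties as one half is the usual Mann-Whitney convention rather than something stated in the text.

```python
def auc_by_ranking(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher.

    Ties contribute one half (assumed convention).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

A value of 0.5 corresponds to a classifier that ranks positives and negatives no better than chance, and 1.0 to one that ranks every positive above every negative.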
  • Peptides with AUC greater than or equal to 0.6 found uniquely by Sequest or X!Tandem are found in Tables 8 and 9, respectively, and those identified by both approaches are found in Table 10.
  • precursor P04217 (A1BG_HUMAN) R.CEGPIPDVTFELLR.E 0.67 alpha-1B-glycoprotein
  • preproprotein P02765 (FETUA_HUMAN) K.CNLLAEK.Q 0.67 alpha-2-HS-glycoprotein
  • preproprotein P02765 (FETUA_HUMAN) K.EHAVEGDCDFQLLK.L 0.67 alpha-2-HS-glycoprotein K.HTLNQIDEVKVWPQQPSGELFEIEID preproprotein P02765 (FETUA_HUMAN) TLETTCHVLDPTPVAR.C 0.64 alpha-2-macroglobulin
  • preproprotein P01019 (ANGT_HUMAN) K.IDRFMQAVTGWK.T 0.81 angiotensinogen
  • preproprotein P01019 (ANGT_HUMAN) R.AAMVGMLANFLGFR.I 0.62 antithrombin-III
  • preproprotein P02652 (APOA2_HUMAN) K.AGTELVNFLSYFVELGTQPATQ.- 0.61 apolipoprotein A-II
  • preproprotein P02652 (APOA2_HUMAN) K.EPCVESLVSQYFQTVTDYGK.D 0.63 apolipoprotein A-IV
  • P04114 (APOB_HUMAN) K.VLADKFIIPGLK.L 0.72 apolipoprotein B-100 K.YSQPEDSLIPFFEITVPESQLTVSQFTL precursor P04114 (APOB_HUMAN) PK.S 0.61 apolipoprotein B-100
  • P04114 (APOB_HUMAN) R.GIISALLVPPETEEAK.Q 0.81 apolipoprotein B-100 P04114 (APOB_HUMAN) R.ILGEELGFASLHDLQLLGK.L 0.62 Protein Description Uniprot ID (name) Peptide S_AUC precursor
  • preproprotein P10909 CLUS_HUMAN
  • preproprotein P10909 CLUS_HUMAN
  • preproprotein P00740 (FA9_HUMAN) K.WIVTAAHCVETGVK.I 0.60 coagulation factor VII
  • preproprotein P08709 FA7_HUMAN
  • preproprotein P01031 C05_HUMAN
  • preproprotein P01031 C05_HUMAN
  • preproprotein P01031 C05_HUMAN
  • preproprotein P01031 C05_HUMAN
  • preproprotein P01031 C05_HUMAN
  • preproprotein P00751 (CFAB_HUMAN) K.ALFVSEEEKK.L 0.64 complement factor B
  • preproprotein P00751 (CFAB_HUMAN) K.CLVNLIEK.V 0.70 complement factor B
  • preproprotein P00751 (CFAB_HUMAN) K.EAGIPEFYDYDVALIK.L 0.66 complement factor B
  • preproprotein P00751 (CFAB_HUMAN) K.VSEADSSNADWVTK.Q 0.73 complement factor B
  • preproprotein P00751 (CFAB_HUMAN) K.YGQTIRPICLPCTEGTTR.A 0.67 complement factor B
  • preproprotein P00751 (CFAB_HUMAN) R.DLEIEVVLFHPNYNINGK.K 0.71 complement factor B
  • preproprotein P00751 (CFAB_HUMAN) R.FLCTGGVSPYADPNTCR.G 0.64 complement factor H
  • preproprotein P05156 (CFAI_HUMAN) K.DASGITCGGIYIGGCWILTAAHCLR.A 0.71 complement factor I
  • preproprotein P05156 (CFAI_HUMAN) K.VANYFDWISYHVGR.P 0.72 complement factor I
  • preproprotein P05156 (CFAI_HUMAN) R.IIFHENYNAGTYQNDIAUEMK.K 0.63 complement factor I
  • preproprotein P05156 (CFAI_HUMAN) R.YQIWTTVVDWIHPDLK.R 0.63 conserved oligomeric
  • subunit 6 isoform Q9Y2V7 (COG6_HUMAN) K.ISNLLK.F 0.65 corticosteroid- binding globulin
  • FERM domain- containing protein 8 Q9BZ67 (FRMD8_HUMAN) R.VQLGPYQPGRPAACDLR.E 0.65 fetuin-B precursor Q9UGM5 (FETUB_HUMAN) R.GGLGSLFYLTLDVLETDCHVLR.K 0.83 ficolin-3 isoform 1
  • hemopexin precursor P02790 (HEMO_HUMAN) EK.S 0.61 hemopexin precursor P02790 (HEMO_HUMAN) K.VDGALCMEK.S 0.66 hemopexin precursor P02790 (HEMO_HUMAN) R.DYFMPCPGR.G 0.68 hemopexin precursor P02790 (HEMO_HUMAN) R.EWFWDLATGTM*K.E 0.64 hemopexin precursor P02790 (HEMO_HUMAN) R.QGHNSVFLIK.G 0.71 heparin cofactor 2 K. HQGTITVN EEGTQATTVTTVGFM PL precursor P05546 (HEP2_HUMAN) STQVR.F 0.60 heparin cofactor 2
  • preproprotein Q14520 (HABP2_HUMAN) K.FLNWIK.A 0.82 hyaluronan-binding
  • preproprotein Q14520 HBP2_HUMAN
  • inhibitor heavy chain R.EVAFDLEIPKTAFISDFAVTADGNAFI
  • H2 precursor P19823 (ITIH2_HUMAN) R.KLGSYEHR.I 0.72 inter-alpha-trypsin
  • H2 precursor P19823 (ITIH2_HUMAN) R.LSNENHGIAQR.I 0.66 inter-alpha-trypsin
  • H2 precursor P19823 (ITIH2_HUMAN) R.MATTMIQSK.V 0.60 inter-alpha-trypsin
  • H2 precursor P19823 (ITIH2_HUMAN) R.M 0.65 inter-alpha-trypsin
  • H4 isoform 1 Q14624 (ITIH4_HUMAN) K.YIFHNFM*ER.L 0.67 precursor
  • H4 isoform 1 R.IHEDSDSALQLQDFYQEVANPLLTA
  • precursor P02750 (A2GL_HUMAN) R.LHLEGNKLQVLGK.D 0.76 leucine-rich alpha-2- glycoprotein
  • P02750 (A2GL_HUMAN) R.TLDLGENQLETLPPDLLR.G 0.61 lipopolysaccharide-binding protein P18428 (LBP_HUMAN) K.GLQYAAQEGLLALQSELLR.I 0.82 precursor
  • preproprotein Q99542 MMP19_HUMAN
  • preproprotein Q13219 PAPP1_HUMAN
  • preproprotein P03952 KLKB1_HUMAN
  • R.CLLFSFLPASSINDM EKR.F 0.60 plasma protease CI
  • P00747 P00747 (PLMN_HUMAN) R.HSIFTPETNPR.A 0.63 platelet basic protein
  • preproprotein P02775 CXCL7_HUMAN
  • V precursor P40197 (GPV_HUMAN) R.LVSLDSGLLNSLGALTELQFHR.N 0.88 pregnancy zone
  • protein precursor P20742 (PZP_HUMAN) R.SYIFIDEAHITQSLTWLSQMQK.D 0.68 pregnancy-specific
  • preproprotein P02760 AMBP_HUMAN
  • AMBP_HUMAN preproprotein P02760
  • preproprotein P00734 (THRB_HUMAN) K.SPQELLCGASLISDR.W 0.84 prothrombin
  • preproprotein P00734 (THRB_HUMAN) R.LAVTTHGLPCLAWASAQAK.A 0.62 prothrombin
  • preproprotein P00734 (THRB_HUMAN) R.SEGSSVNLSPPLEQCVPDR.G 0.70 prothrombin
  • preproprotein P00734 (THRB_HUMAN) R.SGIECQLWR.S 0.68 prothrombin
  • preproprotein P00734 (THRB_HUMAN) R.TATSEYQTFFNPR.T 0.60 prothrombin
  • preproprotein P00734 (THRB_HUMAN) R.VTGWGNLKETWTANVGK.G 0.69 putative
  • isomerase isoform 1 Q5T013 (HYI_HUMAN) R.IHLMAGR.V 0.66 ras-like protein family
  • biotinidase precursor (BTD_HUMAN) R.TSIYPFLDFM*PSPQVVR.W 0.79 carboxypeptidase N
  • subcomponent subunit P02746 K.NSLLGM EGANSIFSGFLLFPDMEA.
  • isoform 1 (C04A_HUMAN) R.EELVYELNPLDHR.G 0.66 complement C4-A P0C0L4 R.STQDTVIALDALSAYWIASHTTEER.
  • preproprotein C05_HUMAN
  • preproprotein (CFAB_HUMAN) R.GDSGGPLIVHK.R 0.63 complement factor B P00751
  • preproprotein preproprotein (CFAB_HUMAN) R.LEDSVTYHCSR.G 0.68 complement factor B P00751
  • CBG_HUMAN globulin precursor
  • CBG_HUMAN globulin precursor
  • gelsolin isoform b (GELS_HUMAN) K.FDLVPVPTNLYGDFFTGDAYVILK.T 0.66
  • gelsolin isoform b (GELS_HUMAN) K.QTQVSVLPEGGETPLFK.Q 0.66
  • gelsolin isoform b (GELS_HUMAN) K.TPSAAYLWVGTGASEAEK.T 0.71
  • gelsolin isoform b (GELS_HUMAN) A 0.67
  • gelsolin isoform b (GELS_HUMAN) YNYR.H 0.60
  • gelsolin isoform b (GELS_HUMAN) LK.T 0.73
  • gelsolin isoform b (GELS_HUMAN) DGTGQK.Q 0.63 glutathione peroxidase P22352
  • hemopexin precursor HEMO_HUMAN K.ALPQPQNVTSLLGCTH.- 0.63
  • hemopexin precursor (HEMO_HUMAN) SDVEK.L 0.68
  • hemopexin precursor HEMO_HUMAN
  • hemopexin precursor HEMO_HUMAN
  • hemopexin precursor HEMO_HUMAN
  • hemopexin precursor (HEMO_HUMAN) R.L 0.75
  • hemopexin precursor (HEMO_HUMAN) R.LWWLDLK.S 0.62
  • hemopexin precursor HEMO_HUMAN
  • HEP2_HUMAN GLK.G 0.60 Protein description Uniprot ID (name) Peptide XT_AUC insulin-like growth
  • H1 isoform a precursor ITIH1_HUMAN
  • H2 precursor ITIH2_HUMAN
  • H2 precursor ITIH2_HUMAN
  • H3 preproprotein ITIH3_HUMAN
  • H4 isoform 1 precursor (ITIH4_HUMAN) K.ITFELVYEELLK.R 0.60 inter-alpha-trypsin
  • H4 isoform 1 precursor (ITIH4_HUMAN) K.

Abstract

The disclosure provides biomarker panels, methods and kits for determining the probability for preterm birth in a pregnant female. The present disclosure is based, in part, on the discovery that certain proteins and peptides in biological samples obtained from a pregnant female are differentially expressed in pregnant females that have an increased risk of developing in the future or presently suffering from preterm birth relative to matched controls. The present disclosure is further based, in part, on the unexpected discovery that panels combining one or more of these proteins and peptides can be utilized in methods of determining the probability for preterm birth in a pregnant female with relatively high sensitivity and specificity. The proteins and peptides disclosed herein serve as biomarkers for classifying test samples, predicting a probability of preterm birth, and monitoring the progress of preterm birth in a pregnant female, either individually or in a panel of biomarkers.

Description

BIOMARKERS AND METHODS FOR PREDICTING PRETERM BIRTH
[0001] This application claims the benefit of priority to U.S. provisional patent application number 61/919,586, filed December 20, 2013, and U.S. provisional application number 61/798,504, filed March 15, 2013, each of which is incorporated herein by reference in its entirety.
[0002] The invention relates generally to the field of personalized medicine and, more specifically to compositions and methods for determining the probability for preterm birth in a pregnant female.
BACKGROUND
[0003] According to the World Health Organization, an estimated 15 million babies are born preterm (before 37 completed weeks of gestation) every year. In almost all countries with reliable data, preterm birth rates are increasing. See, World Health Organization; March of Dimes; The Partnership for Maternal, Newborn & Child Health; Save the Children, Born too soon: the global action report on preterm birth, ISBN 9789241503433 (2012). An estimated 1 million babies die annually from preterm birth complications. Globally, preterm birth is the leading cause of newborn deaths (babies in the first four weeks of life) and the second leading cause of death after pneumonia in children under five years. Many survivors face a lifetime of disability, including learning disabilities and visual and hearing problems.
[0004] Across 184 countries with reliable data, the rate of preterm birth ranges from 5% to 18% of babies born. Blencowe et al., "National, regional and worldwide estimates of preterm birth." The Lancet, 9;379(9832):2162-72 (2012). While over 60% of preterm births occur in Africa and south Asia, preterm birth is nevertheless a global problem. Countries with the highest numbers include Brazil, India, Nigeria and the United States of America. Of the 11 countries with preterm birth rates over 15%, all but two are in sub-Saharan Africa. In the poorest countries, on average, 12% of babies are born too soon compared with 9% in higher-income countries. Within countries, poorer families are at higher risk. More than three-quarters of premature babies can be saved with feasible, cost-effective care, for example, antenatal steroid injections given to pregnant women at risk of preterm labour to strengthen the babies' lungs.
[0005] Infants born preterm are at greater risk than infants born at term for mortality and a variety of health and developmental problems. Complications include acute respiratory, gastrointestinal, immunologic, central nervous system, hearing, and vision problems, as well as longer-term motor, cognitive, visual, hearing, behavioral, social-emotional, health, and growth problems. The birth of a preterm infant can also bring considerable emotional and economic costs to families and have implications for public-sector services, such as health insurance, educational, and other social support systems. The greatest risk of mortality and morbidity is for those infants born at the earliest gestational ages. However, those infants born nearer to term represent the greatest number of infants born preterm and also experience more complications than infants born at term.
[0006] To prevent preterm birth in women who are less than 24 weeks pregnant with an ultrasound showing cervical opening, a surgical procedure known as cervical cerclage can be employed in which the cervix is stitched closed with strong sutures. For women less than 34 weeks pregnant and in active preterm labor, hospitalization may be necessary as well as the administration of medications to temporarily halt preterm labor and/or promote fetal lung development. If a pregnant woman is determined to be at risk for preterm birth, health care providers can implement various clinical strategies that may include preventive medications, for example, hydroxyprogesterone caproate (Makena) injections and/or vaginal progesterone gel, cervical pessaries, restrictions on sexual activity and/or other physical activities, and alterations of treatments for chronic conditions, such as diabetes and high blood pressure, that increase the risk of preterm labor.
[0007] There is a great need to identify and provide women at risk for preterm birth with proper antenatal care. Women identified as high-risk can be scheduled for more intensive antenatal surveillance and prophylactic interventions. Current strategies for risk assessment are based on the obstetric and medical history and clinical examination, but these strategies are only able to identify a small percentage of women who are at risk for preterm delivery. Reliable early identification of risk for preterm birth would enable planning appropriate monitoring and clinical management to prevent preterm delivery. Such monitoring and management might include: more frequent prenatal care visits, serial cervical length measurements, enhanced education regarding signs and symptoms of early preterm labor, lifestyle interventions for modifiable risk behaviors, cervical pessaries and progesterone treatment. Finally, reliable antenatal identification of risk for preterm birth also is crucial to cost-effective allocation of monitoring resources.
[0008] The present invention addresses this need by providing compositions and methods for determining whether a pregnant woman is at risk for preterm birth. Related advantages are provided as well.
SUMMARY
[0009] The present invention provides compositions and methods for predicting the probability of preterm birth in a pregnant female.
[0010] In one aspect, the invention provides a panel of isolated biomarkers comprising N of the biomarkers listed in Tables 1 through 63. In some embodiments, N is a number selected from the group consisting of 2 to 24. In additional embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, and ITLPDFTGDLR. In additional embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDP NYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
[0011] In further embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
[0012] In a further aspect, the invention provides a panel of isolated biomarkers comprising N of the biomarkers listed in Tables 1 through 63. In some embodiments, N is a number selected from the group consisting of 2 to 24. In additional embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
[0013] In some embodiments, the invention provides a biomarker panel comprising at least two of the isolated biomarkers selected from the group consisting of
lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
[0014] In some embodiments, the invention provides a biomarker panel comprising at least two of the isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[0015] In other embodiments, the invention provides a biomarker panel comprising lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), and angiotensinogen (ANG or ANGT).
[0016] In other embodiments, the invention provides a biomarker panel comprising Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[0017] In additional embodiments, the invention provides a biomarker panel comprising at least two of the isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 51 and the biomarkers set forth in Table 53.
[0018] Also provided by the invention is a method of determining probability for preterm birth in a pregnant female comprising detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from the pregnant female, and analyzing the measurable feature to determine the probability for preterm birth in the pregnant female. In some embodiments, the invention provides a method of predicting GAB, the method encompassing detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from a pregnant female, and analyzing said measurable feature to predict GAB.
[0019] In some embodiments, a measurable feature comprises fragments or derivatives of each of the N biomarkers selected from the biomarkers listed in Tables 1 through 63. In some embodiments of the disclosed methods detecting a measurable feature comprises quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63, combinations or portions and/or derivatives thereof in a biological sample obtained from the pregnant female. In additional embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female further encompass detecting a measurable feature for one or more risk indicia associated with preterm birth.
[0020] In some embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of N biomarkers, wherein N is selected from the group consisting of 2 to 24. In further embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, and
ITLPDFTGDLR. In further embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER,
DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK,
WWGGQPLWITATK, and LSETNR. In further embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
[0021] In other embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
[0022] In other embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[0023] In further embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), and angiotensinogen (ANG or ANGT).
[0024] In further embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 51 and the biomarkers set forth in Table 53.
[0025] In some embodiments of the methods of determining probability for preterm birth in a pregnant female, the probability for preterm birth in the pregnant female is calculated based on the quantified amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63. In some embodiments, the disclosed methods for determining the probability of preterm birth encompass detecting and/or quantifying one or more biomarkers using mass spectrometry, a capture agent or a combination thereof.
[0026] In some embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass an initial step of providing a biomarker panel comprising N of the biomarkers listed in Tables 1 through 63. In additional embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass an initial step of providing a biological sample from the pregnant female.
[0027] In some embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass communicating the probability to a health care provider. In additional embodiments, the communication informs a subsequent treatment decision for the pregnant female. In further embodiments, the treatment decision comprises one or more selected from the group consisting of more frequent prenatal care visits, serial cervical length measurements, enhanced education regarding signs and symptoms of early preterm labor, lifestyle interventions for modifiable risk behaviors and progesterone treatment.
[0028] In further embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass analyzing the measurable feature of one or more isolated biomarkers using a predictive model. In some embodiments of the disclosed methods, a measurable feature of one or more isolated biomarkers is compared with a reference feature.
[0029] In additional embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass using one or more analyses selected from a linear discriminant analysis model, a support vector machine classification algorithm, a recursive feature elimination model, a prediction analysis of microarray model, a logistic regression model, a CART algorithm, a flex tree algorithm, a LART algorithm, a random forest algorithm, a MART algorithm, a machine learning algorithm, a penalized regression method, and a combination thereof. In one embodiment, the disclosed methods of determining probability for preterm birth in a pregnant female encompass logistic regression.
[0030] In some embodiments, the invention provides a method of determining probability for preterm birth in a pregnant female, the method encompassing quantifying in a biological sample obtained from the pregnant female an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63; multiplying the amount by a predetermined coefficient, and determining the probability for preterm birth in the pregnant female comprising adding the individual products to obtain a total risk score that corresponds to the probability.
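The weighted-sum calculation described in paragraph [0030] can be sketched as follows. This is a minimal illustration, not the disclosed model: the coefficient values, the biomarker amounts, the optional intercept, and the use of a logistic function to map the score onto a probability are all assumptions made for the example (logistic regression is named in paragraph [0029] as one embodiment, but no coefficients are given here).

```python
import math


def total_risk_score(amounts, coefficients, intercept=0.0):
    """Add up coefficient * amount over every biomarker in the panel."""
    missing = set(coefficients) - set(amounts)
    if missing:
        raise ValueError(f"missing amounts for: {sorted(missing)}")
    return intercept + sum(coefficients[name] * amounts[name] for name in coefficients)


def score_to_probability(score):
    """Map a score onto (0, 1) with the logistic function (an assumed link)."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical coefficients and quantified amounts, for illustration only.
coefficients = {"LBP": 0.5, "THRB": -0.25}
amounts = {"LBP": 2.0, "THRB": 4.0}

score = total_risk_score(amounts, coefficients)   # 0.5*2.0 + (-0.25)*4.0
probability = score_to_probability(score)
```

With these made-up inputs the two products cancel, so the score is 0.0 and the logistic link returns 0.5.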
[0031] In additional embodiments, the invention provides a method of predicting GAB, the method comprising: (a) quantifying in a biological sample obtained from said pregnant female an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63; (b) multiplying or thresholding said amount by a predetermined coefficient, (c) determining the predicted GAB in said pregnant female comprising adding said individual products to obtain a total risk score that corresponds to said predicted GAB.
[0032] In further embodiments, the invention provides a method of predicting time to birth in a pregnant female, the method comprising: (a) obtaining a biological sample from said pregnant female; (b) quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in said biological sample; (c) multiplying or thresholding said amount by a predetermined coefficient, (d) determining predicted GAB in said pregnant female comprising adding said individual products to obtain a total risk score that corresponds to said predicted GAB; and (e) subtracting the estimated gestational age (GA) at the time the biological sample was obtained from the predicted GAB to predict time to birth in said pregnant female.
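Step (e) of paragraph [0032] is a single subtraction, sketched below. The function name, the choice of weeks as the unit, and the numeric inputs are hypothetical and serve only to illustrate the arithmetic.

```python
def predict_time_to_birth(predicted_gab_weeks, ga_at_sampling_weeks):
    """Time to birth = predicted GAB minus estimated GA when the sample was drawn."""
    if ga_at_sampling_weeks < 0:
        raise ValueError("gestational age at sampling cannot be negative")
    return predicted_gab_weeks - ga_at_sampling_weeks


# Hypothetical inputs: a predicted GAB of 38.5 weeks and a sample drawn at 20.0 weeks.
time_to_birth_weeks = predict_time_to_birth(38.5, 20.0)
```

For these illustrative inputs the predicted time to birth is 18.5 weeks.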
[0033] Other features and advantages of the invention will be apparent from the detailed description, and from the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0034] Figure 1. Scatterplot of actual gestational age at birth versus predicted gestational age from random forest regression model.
[0035] Figure 2. Distribution of predicted gestational age from random forest regression model versus actual gestational age at birth (GAB), where actual GAB is given in categories of (i) less than 37 weeks, (ii) 37 to 39 weeks, and (iii) 40 weeks or greater (peaks left to right, respectively).
DETAILED DESCRIPTION
[0036] The present disclosure is based, in part, on the discovery that certain proteins and peptides in biological samples obtained from a pregnant female are differentially expressed in pregnant females that have an increased risk of preterm birth relative to controls. The present disclosure is further based, in part, on the unexpected discovery that panels combining one or more of these proteins and peptides can be utilized in methods of determining the probability for preterm birth in a pregnant female with high sensitivity and specificity. These proteins and peptides disclosed herein serve as biomarkers for classifying test samples, predicting probability of preterm birth, predicting probability of term birth, predicting gestational age at birth (GAB), predicting time to birth and/or monitoring of progress of preventative therapy in a pregnant female, either individually or in a panel of biomarkers.
[0037] The disclosure provides biomarker panels, methods and kits for determining the probability for preterm birth in a pregnant female. One major advantage of the present disclosure is that risk of developing preterm birth can be assessed early during pregnancy so that appropriate monitoring and clinical management to prevent preterm delivery can be initiated in a timely fashion. The present invention is of particular benefit to females lacking any risk factors for preterm birth and who would not otherwise be identified and treated.
[0038] By way of example, the present disclosure includes methods for generating a result useful in determining probability for preterm birth in a pregnant female by obtaining a dataset associated with a sample, where the dataset at least includes quantitative data about biomarkers and panels of biomarkers that have been identified as predictive of preterm birth, and inputting the dataset into an analytic process that uses the dataset to generate a result useful in determining probability for preterm birth in a pregnant female. As described further below, this quantitative data can include amino acids, peptides, polypeptides, proteins, nucleotides, nucleic acids, nucleosides, sugars, fatty acids, steroids, metabolites, carbohydrates, lipids, hormones, antibodies, regions of interest that serve as surrogates for biological macromolecules and combinations thereof.
[0039] In addition to the specific biomarkers identified in this disclosure, for example, by accession number in a public database, sequence, or reference, the invention also contemplates use of biomarker variants that are at least 90% or at least 95% or at least 97% identical to the exemplified sequences and that are now known or later discovered and that have utility for the methods of the invention. These variants may represent
polymorphisms, splice variants, mutations, and the like. In this regard, the instant specification discloses multiple art-known proteins in the context of the invention and provides exemplary accession numbers associated with one or more public databases as well as exemplary references to published journal articles relating to these art-known proteins. However, those skilled in the art appreciate that additional accession numbers and journal articles can easily be identified that can provide additional characteristics of the disclosed biomarkers and that the exemplified references are in no way limiting with regard to the disclosed biomarkers. As described herein, various techniques and reagents find use in the methods of the present invention. Suitable samples in the context of the present invention include, for example, blood, plasma, serum, amniotic fluid, vaginal secretions, saliva, and urine. In some embodiments, the biological sample is selected from the group consisting of whole blood, plasma, and serum. In a particular embodiment, the biological sample is serum. As described herein, biomarkers can be detected through a variety of assays and techniques known in the art. As further described herein, such assays include, without limitation, mass spectrometry (MS)-based assays, antibody-based assays as well as assays that combine aspects of the two.
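The identity thresholds for biomarker variants in paragraph [0039] (at least 90%, 95% or 97%) can be illustrated with a simple position-by-position comparison. This sketch assumes two ungapped sequences of equal length; sequence identity is ordinarily computed after alignment, which is not shown here, and the variant sequence below is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions with the same residue in two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


reference = "ELLESYIDGR"   # peptide named in the disclosure
variant = "ELLESYIDGK"     # hypothetical variant with one substitution
identity = percent_identity(reference, variant)
meets_90_percent = identity >= 90.0
```

Nine of the ten positions match, so the hypothetical variant is 90.0% identical to the reference and satisfies the 90% threshold but not the 95% or 97% thresholds.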
[0040] Protein biomarkers associated with the probability for preterm birth in a pregnant female include, but are not limited to, one or more of the isolated biomarkers listed in Tables 1 through 63. In addition to the specific biomarkers, the disclosure further includes biomarker variants that are about 90%, about 95%, or about 97% identical to the exemplified sequences. Variants, as used herein, include polymorphisms, splice variants, mutations, and the like.
[0041] Additional markers can be selected from one or more risk indicia, including but not limited to, maternal characteristics, medical history, past pregnancy history, and obstetrical history. Such additional markers can include, for example, previous low birth weight or preterm delivery, multiple 2nd trimester spontaneous abortions, prior first trimester induced abortion, familial and intergenerational factors, history of infertility, nulliparity, placental abnormalities, cervical and uterine anomalies, short cervical length measurements, gestational bleeding, intrauterine growth restriction, in utero diethylstilbestrol exposure, multiple gestations, infant sex, short stature, low prepregnancy weight, low or high body mass index, diabetes, hypertension, urogenital infections (e.g., urinary tract infection), asthma, anxiety and depression, and hypothyroidism. Demographic risk indicia for preterm birth can include, for example, maternal age, race/ethnicity, single marital status, low socioeconomic status, employment-related physical activity, occupational exposures and environmental exposures and stress. Further risk indicia can include inadequate prenatal care, cigarette smoking, use of marijuana and other illicit drugs, cocaine use, alcohol consumption, caffeine intake, maternal weight gain, dietary intake, sexual activity during late pregnancy and leisure-time physical activities. (Preterm Birth: Causes, Consequences, and Prevention, Institute of Medicine (US) Committee on Understanding Premature Birth and Assuring Healthy Outcomes; Behrman RE, Butler AS, editors. Washington (DC): National Academies Press (US); 2007). Additional risk indicia useful as markers can be identified using learning algorithms known in the art, such as linear discriminant analysis, support vector machine classification, recursive feature elimination, prediction analysis of microarray, logistic regression, CART, FlexTree, LART, random forest, MART, and/or survival analysis regression, which are known to those of skill in the art and are further described herein.
[0042] Provided herein are panels of isolated biomarkers comprising N of the biomarkers selected from the group listed in Tables 1 through 63. In the disclosed panels of biomarkers N can be a number selected from the group consisting of 2 to 24. In the disclosed methods, the number of biomarkers that are detected and whose levels are determined, can be 1, or more than 1, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more. In certain embodiments, the number of biomarkers that are detected, and whose levels are determined, can be 1, or more than 1, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, or more. The methods of this disclosure are useful for determining the probability for preterm birth in a pregnant female.
[0043] While certain of the biomarkers listed in Tables 1 through 63 are useful alone for determining the probability for preterm birth in a pregnant female, methods are also described herein for the grouping of multiple subsets of the biomarkers that are each useful as a panel of three or more biomarkers. In some embodiments, the invention provides panels comprising N biomarkers, wherein N is at least three biomarkers. In other embodiments, N is selected to be any number from 3-23 biomarkers.
[0044] In yet other embodiments, N is selected to be any number from 2-5, 2-10, 2-15, 2-20, or 2-23. In other embodiments, N is selected to be any number from 3-5, 3-10, 3-15, 3-20, or 3-23. In other embodiments, N is selected to be any number from 4-5, 4-10, 4-15, 4-20, or 4-23. In other embodiments, N is selected to be any number from 5-10, 5-15, 5-20, or 5-23. In other embodiments, N is selected to be any number from 6-10, 6-15, 6-20, or 6-23. In other embodiments, N is selected to be any number from 7-10, 7-15, 7-20, or 7-23. In other embodiments, N is selected to be any number from 8-10, 8-15, 8-20, or 8-23. In other embodiments, N is selected to be any number from 9-10, 9-15, 9-20, or 9-23. In other embodiments, N is selected to be any number from 10-15, 10-20, or 10-23. It will be appreciated that N can be selected to encompass similar, but higher order, ranges.
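The ranges of N recited above define how many distinct panels can be drawn from a given candidate list. The following sketch enumerates every panel of size 2 through 5 from the five proteins named in paragraph [0013]; the helper name and the choice of this particular candidate list are assumptions made for illustration, and the disclosure does not state that panels are selected by exhaustive enumeration.

```python
from itertools import combinations

# Abbreviations of the five proteins named in paragraph [0013].
candidates = ["LBP", "THRB", "C5", "PLMN", "C8G"]


def enumerate_panels(biomarkers, min_size, max_size):
    """Yield every unordered panel whose size N lies in [min_size, max_size]."""
    largest = min(max_size, len(biomarkers))
    for n in range(min_size, largest + 1):
        yield from combinations(biomarkers, n)


panels = list(enumerate_panels(candidates, 2, 5))
```

Five candidates give 10 + 10 + 5 + 1 = 26 panels for N from 2 to 5. The count grows rapidly with longer candidate lists, which is one reason feature-selection methods are commonly preferred over exhaustive enumeration.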
[0045] In certain embodiments, the panel of isolated biomarkers comprises one or more, two or more, three or more, four or more, or five isolated biomarkers comprising an amino acid sequence selected from AFTECCVVASQLR, ELLESYIDGR,
ITLPDFTGDLR, TDAPDLPEENQAR and SFRPFVPR. In some embodiments, the panel of isolated biomarkers comprises one or more, two or more, three or more, four or more, or five isolated biomarkers comprising an amino acid sequence selected from FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
[0046] In some embodiments, the panel of isolated biomarkers comprises one or more, two or more, or three of the isolated biomarkers consisting of an amino acid sequence selected from AFTECCVVASQLR, ELLESYIDGR, and ITLPDFTGDLR. In some embodiments, the panel of isolated biomarkers comprises one or more, two or more, or three of the isolated biomarkers consisting of an amino acid sequence selected from FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
[0047] In some embodiments, the panel of isolated biomarkers comprises one or more, two or more, or three of the isolated biomarkers consisting of an amino acid sequence selected from the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
[0048] In some embodiments, the panel of isolated biomarkers comprises one or more peptides comprising a fragment from lipopolysaccharide-binding protein (LBP), Schumann et al. Science 249 (4975), 1429-1431 (1990) (UniProtKB/Swiss-Prot: P18428.3);
prothrombin (THRB), Walz et al, Proc. Natl. Acad. Sci. U.S.A. 74 (5), 1969-1972 (1977) (NCBI Reference Sequence: NP_000497.1); complement component C5 (C5 or C05) Haviland, J. Immunol. 146 (1), 362-368 (1991) (GenBank: AAA51925.1); plasminogen (PLMN) Petersen et al, J. Biol. Chem. 265 (11), 6104-6111 (1990) (NCBI Reference Sequences: NP_000292.1 and NP_001161810.1); and complement component C8 gamma chain (C8G or C08G), Haefliger et al, Mol. Immunol. 28 (1-2), 123-131 (1991) (NCBI Reference Sequence: NP_000597.2).
[0049] In some embodiments, the panel of isolated biomarkers comprises one or more peptides comprising a fragment from cell adhesion molecule with homology to
complement component 1, q subcomponent, B chain (C1QB), Reid, Biochem. J. 179 (2), 367-371 (1979) (NCBI Reference Sequence: NP_000482.3); fibrinogen beta chain (FIBB or FIB), Watt et al, Biochemistry 18 (1), 68-76 (1979) (NCBI Reference Sequences: NP_001171670.1 and NP_005132.2); C-reactive protein (CRP), Oliveira et al, J. Biol. Chem. 254 (2), 489-502 (1979) (NCBI Reference Sequence: NP_000558.2); inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4) Kim et al, Mol. Biosyst. 7 (5), 1430-1440 (2011) (NCBI Reference Sequences: NP_001159921.1 and NP_002209.2); chorionic somatomammotropin hormone (CSH) Selby et al, J. Biol. Chem. 259 (21), 13131-13138 (1984) (NCBI Reference Sequence: NP_001308.1); and angiotensinogen (ANG or ANGT) Underwood et al, Metabolism 60(8): 1150-7 (2011) (NCBI Reference Sequence: NP_000020.1).
[0050] In additional embodiments, the invention provides a panel of isolated biomarkers comprising N of the biomarkers listed in Tables 1 through 63. In some embodiments, N is a number selected from the group consisting of 2 to 24. In additional embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, and
ITLPDFTGDLR. In additional embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, ITLPDFTGDLR, TDAPDLPEENQAR and SFRPFVPR. In additional embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER,
DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK,
WWGGQPLWITATK, and LSETNR.
[0051] In additional embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
[0052] In further embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G). In another embodiment, the invention provides a biomarker panel comprising at least three isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
[0053] In further embodiments, the biomarker panel comprises at least two of the isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[0054] In some embodiments, the invention provides a biomarker panel comprising lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), and angiotensinogen (ANG or ANGT). In some embodiments, the invention provides a biomarker panel comprising Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[0055] In another aspect, the invention provides a biomarker panel comprising at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), and angiotensinogen (ANG or ANGT) and the biomarkers set forth in Tables 51 and 53.
[0056] In another aspect, the invention provides a biomarker panel comprising at least two isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[0057] It must be noted that, as used in this specification and the appended claims, the singular forms "a", "an" and "the" include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to "a biomarker" includes a mixture of two or more biomarkers, and the like.
[0058] The term "about," particularly in reference to a given quantity, is meant to encompass deviations of plus or minus five percent.
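The plus-or-minus five percent reading of "about" in paragraph [0058] can be expressed as a simple tolerance check. The function name and the example values are hypothetical; the 0.05 default reflects the five percent deviation stated in the definition.

```python
def is_about(value, reference, tolerance=0.05):
    """True when value lies within plus or minus `tolerance` (a fraction) of reference."""
    return abs(value - reference) <= tolerance * abs(reference)


# "About 90%" spans 85.5 to 94.5 under a five percent deviation.
within = is_about(94.0, 90.0)
outside = is_about(95.0, 90.0)
```

Under this reading, 94.0 counts as "about 90" while 95.0 does not.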
[0059] As used in this application, including the appended claims, the singular forms "a," "an," and "the" include plural references, unless the content clearly dictates otherwise, and are used interchangeably with "at least one" and "one or more."
[0060] As used herein, the terms "comprises," "comprising," "includes," "including," "contains," "containing," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, product-by-process, or composition of matter that comprises, includes, or contains an element or list of elements does not include only those elements but can include other elements not expressly listed or inherent to such process, method, product-by-process, or composition of matter.
[0061] As used herein, the term "panel" refers to a composition, such as an array or a collection, comprising one or more biomarkers. The term can also refer to a profile or index of expression patterns of one or more biomarkers described herein. The number of biomarkers useful for a biomarker panel is based on the sensitivity and specificity value for the particular combination of biomarker values.
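Because paragraph [0061] ties panel size to sensitivity and specificity, a short sketch of how those two values are conventionally computed from a set of classified samples may be useful. The outcome labels below are invented for illustration and do not represent data from the disclosure.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for paired boolean outcome labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # preterm, called preterm
    fn = sum(1 for a, p in pairs if a and not p)      # preterm, called term
    tn = sum(1 for a, p in pairs if not a and not p)  # term, called term
    fp = sum(1 for a, p in pairs if not a and p)      # term, called preterm
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: True means preterm birth.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
sensitivity, specificity = sensitivity_specificity(actual, predicted)
```

In this invented example, three of four preterm cases are detected (sensitivity 0.75) and five of six term cases are correctly called (specificity 5/6).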
[0062] As used herein, and unless otherwise specified, the terms "isolated" and "purified" generally describe a composition of matter that has been removed from its native environment (e.g., the natural environment if it is naturally occurring), and thus is altered by the hand of man from its natural state. An isolated protein or nucleic acid is distinct from the way it exists in nature.
[0063] The term "biomarker" refers to a biological molecule, or a fragment of a biological molecule, the change and/or the detection of which can be correlated with a particular physical condition or state. The terms "marker" and "biomarker" are used interchangeably throughout the disclosure. For example, the biomarkers of the present invention are correlated with an increased likelihood of preterm birth. Such biomarkers include, but are not limited to, biological molecules comprising nucleotides, nucleic acids, nucleosides, amino acids, sugars, fatty acids, steroids, metabolites, peptides, polypeptides, proteins, carbohydrates, lipids, hormones, antibodies, regions of interest that serve as surrogates for biological macromolecules and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins). The term also encompasses portions or fragments of a biological molecule, for example, a peptide fragment of a protein or polypeptide that comprises at least 5 consecutive amino acid residues, at least 6 consecutive amino acid residues, at least 7 consecutive amino acid residues, at least 8 consecutive amino acid residues, at least 9 consecutive amino acid residues, at least 10 consecutive amino acid residues, at least 11 consecutive amino acid residues, at least 12 consecutive amino acid residues, at least 13 consecutive amino acid residues, at least 14 consecutive amino acid residues, at least 15 consecutive amino acid residues, at least 16 consecutive amino acid residues, at least 17 consecutive amino acid residues, at least 18 consecutive amino acid residues, at least 19 consecutive amino acid residues, at least 20 consecutive amino acid residues, at least 21 consecutive amino acid residues, at least 22 consecutive amino acid residues, at least 23 consecutive amino acid residues, at least 24 consecutive amino acid residues, at least 25 consecutive amino acid residues, or more consecutive amino acid residues.
[0064] The invention also provides a method of determining probability for preterm birth in a pregnant female, the method comprising detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from the pregnant female, and analyzing the measurable feature to determine the probability for preterm birth in the pregnant female. As disclosed herein, a measurable feature comprises fragments or derivatives of each of said N biomarkers selected from the biomarkers listed in Tables 1 through 63. In some embodiments of the disclosed methods detecting a measurable feature comprises quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63, combinations or portions and/or derivatives thereof in a biological sample obtained from said pregnant female.
[0065] The invention further provides a method of predicting GAB, the method encompassing detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from a pregnant female, and analyzing the measurable feature to predict GAB.
[0066] The invention also provides a method of predicting GAB, the method comprising: (a) quantifying in a biological sample obtained from the pregnant female an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63; (b) multiplying or thresholding the amount by a predetermined coefficient; and (c) determining the predicted GAB in the pregnant female comprising adding the individual products to obtain a total risk score that corresponds to the predicted GAB.
[0067] The invention further provides a method of predicting time to birth in a pregnant female, the method comprising: (a) obtaining a biological sample from the pregnant female; (b) quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in the biological sample; (c) multiplying or thresholding the amount by a predetermined coefficient; (d) determining predicted GAB in the pregnant female comprising adding the individual products to obtain a total risk score that corresponds to the predicted GAB; and (e) subtracting the estimated gestational age (GA) at the time the biological sample was obtained from the predicted GAB to predict time to birth in said pregnant female. For methods directed to predicting time to birth, it is understood that "birth" means birth following spontaneous onset of labor, with or without rupture of membranes.
[0068] Although described and exemplified with reference to methods of determining probability for preterm birth in a pregnant female, the present disclosure is similarly applicable to the methods of predicting GAB, the methods for predicting term birth, methods for determining the probability of term birth in a pregnant female as well as methods of predicting time to birth in a pregnant female. It will be apparent to one skilled in the art that each of the aforementioned methods has specific and substantial utilities and benefits with regard to maternal-fetal health considerations.
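By way of non-limiting illustration, the arithmetic of the scoring and subtraction steps recited above for predicting GAB and time to birth can be sketched as follows. The sketch is not the disclosed predictor: the marker names, coefficients and threshold are hypothetical placeholders, "thresholding" is assumed here to mean replacing an amount with 1 or 0 depending on whether it meets a cut-off, and the summed score is assumed to be read directly as a predicted GAB in weeks.

```python
from typing import Mapping, Optional


def predicted_gab(amounts: Mapping[str, float],
                  coefficients: Mapping[str, float],
                  thresholds: Optional[Mapping[str, float]] = None) -> float:
    """Sum coefficient * amount over the panel to obtain a total score.

    Assumption: markers listed in `thresholds` are first converted to a
    1.0/0.0 indicator (amount at or above the cut-off, or below it).
    """
    thresholds = thresholds or {}
    total = 0.0
    for name, coefficient in coefficients.items():
        value = amounts[name]
        if name in thresholds:
            value = 1.0 if value >= thresholds[name] else 0.0
        total += coefficient * value
    return total


def time_to_birth(predicted_gab_weeks: float, estimated_ga_weeks: float) -> float:
    """Predicted GAB minus the estimated GA at the time of sampling."""
    return predicted_gab_weeks - estimated_ga_weeks


# Hypothetical two-marker panel; none of these values come from the disclosure.
amounts = {"marker_a": 2.0, "marker_b": 0.5}
coefficients = {"marker_a": 10.0, "marker_b": 18.0}
thresholds = {"marker_b": 0.25}

gab = predicted_gab(amounts, coefficients, thresholds)  # 10.0*2.0 + 18.0*1.0
ttb = time_to_birth(gab, estimated_ga_weeks=20.0)
```

In this hypothetical case the total score is 38.0 and, for a sample drawn at an estimated GA of 20 weeks, the predicted time to birth is 18.0 weeks.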
[0069] In some embodiments, the method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of N biomarkers, wherein N is selected from the group consisting of 2 to 24. In further embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, and ITLPDFTGDLR. In further embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDP NYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
[0070] In additional embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 50 and the biomarkers set forth in Table 52.
[0071] In additional embodiments, the method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
[0072] In additional embodiments, the method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[0073] In further embodiments, the disclosed method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), and angiotensinogen (ANG or ANGT).
[0074] In further embodiments, the disclosed method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[0075] In further embodiments, the disclosed method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[0076] In further embodiments, the disclosed method of determining probability for preterm birth in a pregnant female and related methods disclosed herein comprise detecting a measurable feature of each of at least two isolated biomarkers selected from the group consisting of the biomarkers set forth in Table 51 and the biomarkers set forth in Table 53.
[0077] In additional embodiments, the methods of determining probability for preterm birth in a pregnant female further encompass detecting a measurable feature for one or more risk indicia associated with preterm birth. In additional embodiments, the risk indicia are selected from the group consisting of previous low birth weight or preterm delivery, multiple 2nd trimester spontaneous abortions, prior first trimester induced abortion, familial and intergenerational factors, history of infertility, nulliparity, placental abnormalities, cervical and uterine anomalies, gestational bleeding, intrauterine growth restriction, in utero diethylstilbestrol exposure, multiple gestations, infant sex, short stature, low prepregnancy weight, low or high body mass index, diabetes, hypertension, and urogenital infections.
[0078] A "measurable feature" is any property, characteristic or aspect that can be determined and correlated with the probability for preterm birth in a subject. The term further encompasses any property, characteristic or aspect that can be determined and correlated in connection with a prediction of GAB, a prediction of term birth, or a prediction of time to birth in a pregnant female. For a biomarker, such a measurable feature can include, for example, the presence, absence, or concentration of the biomarker, or a fragment thereof, in the biological sample, an altered structure, such as, for example, the presence or amount of a post-translational modification, such as oxidation at one or more positions on the amino acid sequence of the biomarker or, for example, the presence of an altered conformation in comparison to the conformation of the biomarker in normal control subjects, and/or the presence, amount, or altered structure of the biomarker as a part of a profile of more than one biomarker. In addition to biomarkers, measurable features can further include risk indicia including, for example, maternal characteristics, age, race, ethnicity, medical history, past pregnancy history, and obstetrical history. For a risk indicium, a measurable feature can include, for example, previous low birth weight or preterm delivery, multiple 2nd trimester spontaneous abortions, prior first trimester induced abortion, familial and intergenerational factors, history of infertility, nulliparity, placental abnormalities, cervical and uterine anomalies, short cervical length measurements, gestational bleeding, intrauterine growth restriction, in utero diethylstilbestrol exposure, multiple gestations, infant sex, short stature, low prepregnancy weight/low body mass index, diabetes, hypertension, urogenital infections, hypothyroidism, asthma, low educational attainment, cigarette smoking, drug use and alcohol consumption.
[0079] In some embodiments of the disclosed methods of determining probability for preterm birth in a pregnant female, the probability for preterm birth in the pregnant female is calculated based on the quantified amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63. In some embodiments, the disclosed methods for determining the probability of preterm birth encompass detecting and/or quantifying one or more biomarkers using mass spectrometry, a capture agent or a combination thereof.
[0080] In some embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass an initial step of providing a biomarker panel comprising N of the biomarkers listed in Tables 1 through 63. In additional embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass an initial step of providing a biological sample from the pregnant female.
[0081] In some embodiments, the disclosed methods of determining probability for preterm birth in a pregnant female encompass communicating the probability to a health care provider. The disclosed methods of predicting GAB, the methods for predicting term birth, methods for determining the probability of term birth in a pregnant female as well as methods of predicting time to birth in a pregnant female similarly encompass communicating the probability to a health care provider. As stated above, although described and exemplified with reference to determining probability for preterm birth in a pregnant female, all embodiments described throughout this disclosure are similarly applicable to the methods of predicting GAB, the methods for predicting term birth, methods for determining the probability of term birth in a pregnant female as well as methods of predicting time to birth in a pregnant female. Specifically, the biomarkers and panels recited throughout this application with express reference to methods for preterm birth can also be used in methods for predicting GAB, the methods for predicting term birth, methods for determining the probability of term birth in a pregnant female as well as methods of predicting time to birth in a pregnant female. It will be apparent to one skilled in the art that each of the aforementioned methods has specific and substantial utilities and benefits with regard to maternal-fetal health considerations.
[0082] In additional embodiments, the communication informs a subsequent treatment decision for the pregnant female. In some embodiments, the method of determining probability for preterm birth in a pregnant female encompasses the additional feature of expressing the probability as a risk score.
[0083] As used herein, the term "risk score" refers to a score that can be assigned based on comparing the amount of one or more biomarkers in a biological sample obtained from a pregnant female to a standard or reference score that represents an average amount of the one or more biomarkers calculated from biological samples obtained from a random pool of pregnant females. Because the level of a biomarker may not be static throughout pregnancy, a standard or reference score has to have been obtained for the gestational time point that corresponds to that of the pregnant female at the time the sample was taken. The standard or reference score can be predetermined and built into a predictor model such that the comparison is indirect rather than actually performed every time the probability is determined for a subject. A risk score can be a standard (e.g., a number) or a threshold (e.g., a line on a graph). The value of the risk score correlates to the deviation, upwards or downwards, from the average amount of the one or more biomarkers calculated from biological samples obtained from a random pool of pregnant females. In certain embodiments, if a risk score is greater than a standard or reference risk score, the pregnant female can have an increased likelihood of preterm birth. In some embodiments, the magnitude of a pregnant female's risk score, or the amount by which it exceeds a reference risk score, can be indicative of or correlated to that pregnant female's level of risk.
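By way of non-limiting illustration, the comparison of a measured biomarker amount against a gestational-age-matched reference can be sketched as follows. The sketch assumes a single biomarker, a hypothetical lookup table of reference means and standard deviations keyed by gestational week, and a deviation expressed in standard deviation units; the disclosure does not prescribe this particular scaling, and the numbers shown are placeholders rather than measured values.

```python
from typing import Mapping, Tuple


def risk_score(amount: float,
               gestational_week: int,
               reference_by_week: Mapping[int, Tuple[float, float]]) -> float:
    """Signed deviation of `amount` from the reference average for the same week.

    `reference_by_week` maps a gestational week to (mean, standard deviation)
    of the biomarker in a reference pool; both the table and the use of
    standard deviation units are assumptions made for this sketch.
    """
    if gestational_week not in reference_by_week:
        raise KeyError(f"no reference available for week {gestational_week}")
    mean, sd = reference_by_week[gestational_week]
    if sd <= 0:
        raise ValueError("reference standard deviation must be positive")
    return (amount - mean) / sd


# Hypothetical reference values for one biomarker at two gestational weeks.
reference = {20: (100.0, 10.0), 24: (120.0, 10.0)}

score_week_20 = risk_score(125.0, 20, reference)  # (125 - 100) / 10
score_week_24 = risk_score(125.0, 24, reference)  # (125 - 120) / 10
```

The same measured amount yields 2.5 at week 20 and 0.5 at week 24 in this hypothetical table, which illustrates why the reference must correspond to the gestational time point at which the sample was taken.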
[0084] In the context of the present invention, the term "biological sample" encompasses any sample that is taken from a pregnant female and contains one or more of the biomarkers listed in Tables 1 through 63. Suitable samples in the context of the present invention include, for example, blood, plasma, serum, amniotic fluid, vaginal secretions, saliva, and urine. In some embodiments, the biological sample is selected from the group consisting of whole blood, plasma, and serum. As will be appreciated by those skilled in the art, a biological sample can include any fraction or component of blood, including, without limitation, T cells, monocytes, neutrophils, erythrocytes, platelets and microvesicles such as exosomes and exosome-like vesicles. In a particular embodiment, the biological sample is serum.
[0085] Preterm birth refers to delivery or birth at a gestational age less than 37 completed weeks. Other commonly used subcategories of preterm birth have been established and delineate moderately preterm (birth at 33 to 36 weeks of gestation), very preterm (birth at <33 weeks of gestation), and extremely preterm (birth at <28 weeks of gestation). With regard to the methods disclosed herein, those skilled in the art understand that the cut-offs that delineate preterm birth and term birth as well as the cut-offs that delineate subcategories of preterm birth can be adjusted in practicing the methods disclosed herein, for example, to maximize a particular health benefit. It is further understood that such adjustments are well within the skill set of individuals considered skilled in the art and encompassed within the scope of the inventions disclosed herein. Gestational age is a proxy for the extent of fetal development and the fetus's readiness for birth. Gestational age has typically been defined as the length of time from the date of the last normal menses to the date of birth. However, obstetric measures and ultrasound estimates also can aid in estimating gestational age. Preterm births have generally been classified into two separate subgroups. One, spontaneous preterm births are those occurring subsequent to spontaneous onset of preterm labor or preterm premature rupture of membranes regardless of subsequent labor augmentation or cesarean delivery. Two, indicated preterm births are those occurring following induction or cesarean section for one or more conditions that the woman's caregiver determines to threaten the health or life of the mother and/or fetus. In some embodiments, the methods disclosed herein are directed to determining the probability for spontaneous preterm birth. In additional embodiments, the methods disclosed herein are directed to predicting gestational age at birth.
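By way of non-limiting illustration, the cut-offs described above can be expressed as a small classification routine with adjustable parameters. The function name and parameter names are hypothetical. Because the recited subcategories overlap (a birth before 28 weeks is also before 33 weeks), the sketch assumes that the most specific subcategory is reported.

```python
def classify_birth(ga_weeks: float,
                   preterm_cutoff: float = 37.0,
                   very_preterm_cutoff: float = 33.0,
                   extremely_preterm_cutoff: float = 28.0) -> str:
    """Assign a gestational age at birth (in weeks) to a category.

    The default cut-offs follow the text above; they are parameters
    because the text notes that such cut-offs can be adjusted.
    """
    if not (extremely_preterm_cutoff < very_preterm_cutoff < preterm_cutoff):
        raise ValueError("cut-offs must be strictly increasing")
    if ga_weeks >= preterm_cutoff:
        return "term"
    if ga_weeks < extremely_preterm_cutoff:
        return "extremely preterm"
    if ga_weeks < very_preterm_cutoff:
        return "very preterm"
    return "moderately preterm"


categories = [classify_birth(w) for w in (39.0, 35.0, 30.0, 26.0)]
# -> ["term", "moderately preterm", "very preterm", "extremely preterm"]

# Adjusting the preterm/term cut-off, as the text contemplates:
adjusted = classify_birth(35.5, preterm_cutoff=35.0)  # -> "term"
```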
[0086] As used herein, the term "estimated gestational age" or "estimated GA" refers to the GA determined based on the date of the last normal menses and additional obstetric measures, ultrasound estimates or other clinical parameters including, without limitation, those described in the preceding paragraph. In contrast, the term "predicted gestational age at birth" or "predicted GAB" refers to the GAB determined based on the methods of the invention as disclosed herein. As used herein, "term birth" refers to birth at a gestational age equal to or greater than 37 completed weeks.
[0087] In some embodiments, the pregnant female is between 17 and 28 weeks of gestation at the time the biological sample is collected. In other embodiments, the pregnant female is between 16 and 29 weeks, between 17 and 28 weeks, between 18 and 27 weeks, between 19 and 26 weeks, between 20 and 25 weeks, between 21 and 24 weeks, or between 22 and 23 weeks of gestation at the time the biological sample is collected. In further embodiments, the pregnant female is between about 17 and 22 weeks, between about 16 and 22 weeks, between about 22 and 25 weeks, between about 13 and 25 weeks, between about 26 and 28 weeks, or between about 26 and 29 weeks of gestation at the time the biological sample is collected. Accordingly, the gestational age of a pregnant female at the time the biological sample is collected can be 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 weeks.
[0088] In some embodiments of the claimed methods the measurable feature comprises fragments or derivatives of each of the N biomarkers selected from the biomarkers listed in Tables 1 through 63. In additional embodiments of the claimed methods, detecting a measurable feature comprises quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63, combinations or portions and/or derivatives thereof in a biological sample obtained from said pregnant female.
[0089] The term "amount" or "level" as used herein refers to a quantity of a biomarker that is detectable or measurable in a biological sample and/or control. The quantity of a biomarker can be, for example, a quantity of polypeptide, the quantity of nucleic acid, or the quantity of a fragment or surrogate. The term can alternatively include combinations thereof. The term "amount" or "level" of a biomarker is a measurable feature of that biomarker.
[0090] In some embodiments, calculating the probability for preterm birth in a pregnant female is based on the quantified amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63. Any existing, available or conventional separation, detection and quantification methods can be used herein to measure the presence or absence (e.g., readout being present vs. absent; or detectable amount vs. undetectable amount) and/or quantity (e.g., readout being an absolute or relative quantity, such as, for example, absolute or relative concentration) of biomarkers, peptides, polypeptides, proteins and/or fragments thereof and optionally of the one or more other biomarkers or fragments thereof in samples. In some embodiments, detection and/or quantification of one or more biomarkers comprises an assay that utilizes a capture agent. In further embodiments, the capture agent is an antibody, antibody fragment, nucleic acid-based protein binding reagent, small molecule or variant thereof. In additional embodiments, the assay is an enzyme immunoassay (EIA), enzyme-linked immunosorbent assay (ELISA), or radioimmunoassay (RIA). In some embodiments, detection and/or quantification of one or more biomarkers further comprises mass spectrometry (MS). In yet further embodiments, the mass spectrometry is co-immunoprecipitation-mass spectrometry (co-IP MS), where co-immunoprecipitation, a technique suitable for the isolation of whole protein complexes, is followed by mass spectrometric analysis.
[0091] As used herein, the term "mass spectrometer" refers to a device able to volatilize/ionize analytes to form gas-phase ions and determine their absolute or relative molecular masses. Suitable methods of volatilization/ionization are matrix-assisted laser desorption ionization (MALDI), electrospray, laser/light, thermal, electrical, atomized/sprayed and the like, or combinations thereof. Suitable forms of mass spectrometry include, but are not limited to, ion trap instruments, quadrupole instruments, electrostatic and magnetic sector instruments, time of flight instruments, time of flight tandem mass spectrometer (TOF MS/MS), Fourier-transform mass spectrometers, Orbitraps and hybrid instruments composed of various combinations of these types of mass analyzers. These instruments can, in turn, be interfaced with a variety of other instruments that fractionate the samples (for example, liquid chromatography or solid-phase adsorption techniques based on chemical, or biological properties) and that ionize the samples for introduction into the mass spectrometer, including matrix-assisted laser desorption (MALDI), electrospray, or nanospray ionization (ESI) or combinations thereof.
[0092] Generally, any mass spectrometric (MS) technique that can provide precise information on the mass of peptides, and preferably also on fragmentation and/or (partial) amino acid sequence of selected peptides (e.g., in tandem mass spectrometry, MS/MS; or in post source decay, TOF MS), can be used in the methods disclosed herein. Suitable peptide MS and MS/MS techniques and systems are well-known per se (see, e.g., Methods in Molecular Biology, vol. 146: "Mass Spectrometry of Proteins and Peptides", by Chapman, ed., Humana Press 2000; Biemann 1990. Methods Enzymol 193: 455-79; or Methods in Enzymology, vol. 402: "Biological Mass Spectrometry", by Burlingame, ed., Academic Press 2005) and can be used in practicing the methods disclosed herein. Accordingly, in some embodiments, the disclosed methods comprise performing quantitative MS to measure one or more biomarkers. Such quantitative methods can be performed in an automated (Villanueva, et al., Nature Protocols (2006) 1(2):880-891) or semi-automated format. In particular embodiments, MS can be operably linked to a liquid chromatography device (LC-MS/MS or LC-MS) or gas chromatography device (GC-MS or GC-MS/MS). Other methods useful in this context include isotope-coded affinity tag (ICAT), tandem mass tags (TMT), or stable isotope labeling by amino acids in cell culture (SILAC), followed by chromatography and MS/MS.
[0093] As used herein, the terms "multiple reaction monitoring (MRM)" or "selected reaction monitoring (SRM)" refer to an MS-based quantification method that is particularly useful for quantifying analytes that are in low abundance. In an SRM experiment, a predefined precursor ion and one or more of its fragments are selected by the two mass filters of a triple quadrupole instrument and monitored over time for precise quantification. Multiple SRM precursor and fragment ion pairs can be measured within the same experiment on the chromatographic time scale by rapidly toggling between the different precursor/fragment pairs to perform an MRM experiment. A series of transitions (precursor/fragment ion pairs) in combination with the retention time of the targeted analyte (e.g., peptide or small molecule such as chemical entity, steroid, hormone) can constitute a definitive assay. A large number of analytes can be quantified during a single LC-MS experiment. The term "scheduled" or "dynamic," in reference to MRM or SRM, refers to a variation of the assay wherein the transitions for a particular analyte are only acquired in a time window around the expected retention time, significantly increasing the number of analytes that can be detected and quantified in a single LC-MS experiment and contributing to the selectivity of the test, as retention time is a property dependent on the physical nature of the analyte. A single analyte can also be monitored with more than one transition. Finally, included in the assay can be standards that correspond to the analytes of interest (e.g., same amino acid sequence), but differ by the inclusion of stable isotopes. Stable isotopic standards (SIS) can be incorporated into the assay at precise levels and used to quantify the corresponding unknown analyte. An additional level of specificity is contributed by the co-elution of the unknown analyte and its corresponding SIS and properties of their transitions (e.g., the similarity in the ratio of the level of two transitions of the unknown and the ratio of the two transitions of its corresponding SIS).
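By way of non-limiting illustration, quantification against a stable isotopic standard and the transition-ratio comparison described above can be sketched as follows. The sketch assumes that integrated peak areas are already available, that the analyte and its SIS respond equally so that a simple area ratio scales the known spiked amount, and that a 20% relative tolerance is acceptable for the ratio comparison. The function names, the tolerance and all numerical values are hypothetical.

```python
from typing import Tuple


def quantify_with_sis(analyte_area: float, sis_area: float, sis_amount: float) -> float:
    """Estimate the analyte amount from the analyte/SIS peak-area ratio.

    Assumes equal response of analyte and SIS; `sis_amount` is the known
    amount of standard spiked into the sample (any consistent unit).
    """
    if sis_area <= 0:
        raise ValueError("SIS peak area must be positive")
    return analyte_area / sis_area * sis_amount


def transition_ratios_agree(analyte_areas: Tuple[float, float],
                            sis_areas: Tuple[float, float],
                            tolerance: float = 0.2) -> bool:
    """Compare the ratio of two transitions of the analyte with that of its SIS.

    Returns True when the analyte ratio lies within `tolerance` (relative)
    of the SIS ratio; a mismatch may indicate an interfering signal.
    """
    analyte_ratio = analyte_areas[0] / analyte_areas[1]
    sis_ratio = sis_areas[0] / sis_areas[1]
    return abs(analyte_ratio - sis_ratio) <= tolerance * sis_ratio


# Hypothetical peak areas (arbitrary units) and a spiked SIS amount of 50 units.
amount = quantify_with_sis(analyte_area=5000.0, sis_area=10000.0, sis_amount=50.0)
consistent = transition_ratios_agree((5000.0, 2500.0), (10000.0, 5000.0))
interfered = transition_ratios_agree((5000.0, 1000.0), (10000.0, 5000.0))
```

With these placeholder values the estimated amount is 25.0 units, the first pair of transitions is consistent with the SIS (both ratios are 2.0), and the second pair is flagged as inconsistent (ratio 5.0 versus 2.0).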
[0094] Mass spectrometry assays, instruments and systems suitable for biomarker peptide analysis can include, without limitation, matrix-assisted laser desorption/ionisation time-of-flight (MALDI-TOF) MS; MALDI-TOF post-source-decay (PSD); MALDI-TOF/TOF; surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF) MS; electrospray ionization mass spectrometry (ESI-MS); ESI-MS/MS; ESI-MS/(MS)n (n is an integer greater than zero); ESI 3D or linear (2D) ion trap MS; ESI triple quadrupole MS; ESI quadrupole orthogonal TOF (Q-TOF); ESI Fourier transform MS systems; desorption/ionization on silicon (DIOS); secondary ion mass spectrometry (SIMS); atmospheric pressure chemical ionization mass spectrometry (APCI-MS); APCI-MS/MS; APCI-(MS)n; ion mobility spectrometry (IMS); inductively coupled plasma mass spectrometry (ICP-MS); atmospheric pressure photoionization mass spectrometry (APPI-MS); APPI-MS/MS; and APPI-(MS)n. Peptide ion fragmentation in tandem MS (MS/MS) arrangements can be achieved using manners established in the art, such as, e.g., collision induced dissociation (CID). As described herein, detection and quantification of biomarkers by mass spectrometry can involve multiple reaction monitoring (MRM), such as described among others by Kuhn et al. Proteomics 4: 1175-86 (2004). Scheduled multiple-reaction-monitoring (Scheduled MRM) mode acquisition during LC-MS/MS analysis enhances the sensitivity and accuracy of peptide quantitation. Anderson and Hunter, Molecular and Cellular Proteomics 5 (4): 573 (2006). As described herein, mass spectrometry-based assays can be advantageously combined with upstream peptide or protein separation or fractionation methods, such as for example with the chromatographic and other methods described herein below. As further described herein, shotgun quantitative proteomics can be combined with SRM/MRM-based assays for high-throughput identification and verification of prognostic biomarkers of preterm birth.
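By way of non-limiting illustration, the scheduling logic of a Scheduled MRM acquisition, in which a transition is monitored only within a time window around the expected retention time of its analyte, can be sketched as follows. The sketch does not represent any instrument vendor's software. The data structure, the window width, the analyte labels and the m/z and retention-time values are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Transition:
    """One precursor/fragment ion pair with its expected retention time."""
    analyte: str
    precursor_mz: float
    fragment_mz: float
    expected_rt_min: float


def active_transitions(transitions: List[Transition],
                       current_rt_min: float,
                       window_min: float = 2.0) -> List[Transition]:
    """Return the transitions whose acquisition window contains the current time.

    The window is centered on the expected retention time and has a total
    width of `window_min` minutes (an assumed, adjustable value).
    """
    half_window = window_min / 2.0
    return [t for t in transitions
            if abs(t.expected_rt_min - current_rt_min) <= half_window]


# Hypothetical method with two analytes, two transitions for the first one.
method = [
    Transition("peptide_1", 500.0, 600.0, expected_rt_min=10.0),
    Transition("peptide_1", 500.0, 700.0, expected_rt_min=10.0),
    Transition("peptide_2", 450.0, 550.0, expected_rt_min=25.0),
]

at_10_5 = [t.analyte for t in active_transitions(method, current_rt_min=10.5)]
at_18_0 = active_transitions(method, current_rt_min=18.0)
```

At 10.5 minutes only the two transitions of the first hypothetical analyte are monitored, and at 18.0 minutes none are. Because fewer transitions are monitored at any one time, more analytes can be accommodated within a single LC-MS run.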
[0095] A person skilled in the art will appreciate that a number of methods can be used to determine the amount of a biomarker, including mass spectrometry approaches, such as MS/MS, LC-MS/MS, multiple reaction monitoring (MRM) or SRM and product ion monitoring (PIM), and also including antibody based methods such as immunoassays, for example Western blots, enzyme-linked immunosorbent assay (ELISA), immunoprecipitation, immunohistochemistry, immunofluorescence, radioimmunoassay, dot blotting, and FACS. Accordingly, in some embodiments, determining the level of the at least one biomarker comprises using an immunoassay and/or mass spectrometric methods. In additional embodiments, the mass spectrometric methods are selected from MS, MS/MS, LC-MS/MS, SRM, PIM, and other such methods that are known in the art. In other embodiments, LC-MS/MS further comprises 1D LC-MS/MS, 2D LC-MS/MS or 3D LC-MS/MS. Immunoassay techniques and protocols are generally known to those skilled in the art (Price and Newman, Principles and Practice of Immunoassay, 2nd Edition, Grove's Dictionaries, 1997; and Gosling, Immunoassays: A Practical Approach, Oxford University Press, 2000). A variety of immunoassay techniques, including competitive and non-competitive immunoassays, can be used (Self et al., Curr. Opin. Biotechnol., 7:60-65 (1996)).
[0096] In further embodiments, the immunoassay is selected from Western blot, ELISA, immunoprecipitation, immunohistochemistry, immunofluorescence, radioimmunoassay (RIA), dot blotting, and FACS. In certain embodiments, the immunoassay is an ELISA. In yet a further embodiment, the ELISA is direct ELISA (enzyme-linked immunosorbent assay), indirect ELISA, sandwich ELISA, competitive ELISA, multiplex ELISA, ELISPOT technologies, and other similar techniques known in the art. Principles of these immunoassay methods are known in the art, for example John R. Crowther, The ELISA Guidebook, 1st ed., Humana Press 2000, ISBN 0896037282. Typically ELISAs are performed with antibodies but they can be performed with any capture agents that bind specifically to one or more biomarkers of the invention and that can be detected. Multiplex ELISA allows simultaneous detection of two or more analytes within a single compartment (e.g., microplate well) usually at a plurality of array addresses (Nielsen and Geierstanger, J Immunol Methods 290: 107-20 (2004) and Ling et al., Expert Rev Mol Diagn 7: 87-98 (2007)).
[0097] In some embodiments, radioimmunoassay (RIA) can be used to detect one or more biomarkers in the methods of the invention. RIA is a competition-based assay that is well known in the art and involves mixing known quantities of radioactively-labelled (e.g., 125I- or 131I-labelled) target analyte with antibody specific for the analyte, then adding non-labelled analyte from a sample and measuring the amount of labelled analyte that is displaced (see, e.g., An Introduction to Radioimmunoassay and Related Techniques, by Chard T, ed., Elsevier Science 1995, ISBN 0444821198 for guidance).
[0098] A detectable label can be used in the assays described herein for direct or indirect detection of the biomarkers in the methods of the invention. A wide variety of detectable labels can be used, with the choice of label depending on the sensitivity required, ease of conjugation with the antibody, stability requirements, and available instrumentation and disposal provisions. Those skilled in the art are familiar with selection of a suitable detectable label based on the assay detection of the biomarkers in the methods of the invention. Suitable detectable labels include, but are not limited to, fluorescent dyes (e.g., fluorescein, fluorescein isothiocyanate (FITC), Oregon Green™, rhodamine, Texas red, tetramethylrhodamine isothiocyanate (TRITC), Cy3, Cy5, etc.), fluorescent markers (e.g., green fluorescent protein (GFP), phycoerythrin, etc.), enzymes (e.g., luciferase, horseradish peroxidase, alkaline phosphatase, etc.), nanoparticles, biotin, digoxigenin, metals, and the like.
[0099] For mass spectrometry-based analysis, differential tagging with isotopic reagents, e.g., isotope-coded affinity tags (ICAT) or the more recent variation that uses isobaric tagging reagents, iTRAQ (Applied Biosystems, Foster City, Calif.), or tandem mass tags, TMT (Thermo Scientific, Rockford, IL), followed by multidimensional liquid chromatography (LC) and tandem mass spectrometry (MS/MS) analysis can provide a further methodology in practicing the methods of the invention.
[00100] A chemiluminescence assay using a chemiluminescent antibody can be used for sensitive, non-radioactive detection of protein levels. An antibody labeled with fluorochrome also can be suitable. Examples of fluorochromes include, without limitation, DAPI, fluorescein, Hoechst 33258, R-phycocyanin, B-phycoerythrin, R-phycoerythrin, rhodamine, Texas red, and lissamine. Indirect labels include various enzymes well known in the art, such as horseradish peroxidase (HRP), alkaline phosphatase (AP), beta-galactosidase, urease, and the like. Detection systems using suitable substrates for horseradish peroxidase, alkaline phosphatase, and beta-galactosidase are well known in the art.
[00101] A signal from the direct or indirect label can be analyzed, for example, using a spectrophotometer to detect color from a chromogenic substrate; a radiation counter to detect radiation such as a gamma counter for detection of 125I; or a fluorometer to detect fluorescence in the presence of light of a certain wavelength. For detection of enzyme-linked antibodies, a quantitative analysis can be made using a spectrophotometer such as an EMAX Microplate Reader (Molecular Devices; Menlo Park, Calif.) in accordance with the manufacturer's instructions. If desired, assays used to practice the invention can be automated or performed robotically, and the signal from multiple samples can be detected simultaneously.
[00102] In some embodiments, the methods described herein encompass quantification of the biomarkers using mass spectrometry (MS). In further embodiments, the mass spectrometry can be liquid chromatography-mass spectrometry (LC-MS), multiple reaction monitoring (MRM) or selected reaction monitoring (SRM). In additional embodiments, the MRM or SRM can further encompass scheduled MRM or scheduled SRM.
[00103] As described above, chromatography can also be used in practicing the methods of the invention. Chromatography encompasses methods for separating chemical substances and generally involves a process in which a mixture of analytes is carried by a moving stream of liquid or gas ("mobile phase") and separated into components as a result of differential distribution of the analytes as they flow around or over a stationary liquid or solid phase ("stationary phase"), between the mobile phase and said stationary phase. The stationary phase is usually a finely divided solid, a sheet of filter material, or a thin film of a liquid on the surface of a solid, or the like. Chromatography is well understood by those skilled in the art as a technique applicable for the separation of chemical compounds of biological origin, such as, e.g., amino acids, proteins, fragments of proteins or peptides, etc.
[00104] Chromatography can be columnar (i.e., wherein the stationary phase is deposited or packed in a column), preferably liquid chromatography, and yet more preferably high-performance liquid chromatography (HPLC), or ultra high performance/pressure liquid chromatography (UHPLC). Particulars of chromatography are well known in the art (Bidlingmeyer, Practical HPLC Methodology and Applications, John Wiley & Sons Inc., 1993). Exemplary types of chromatography include, without limitation, high-performance liquid chromatography (HPLC), UHPLC, normal phase HPLC (NP-HPLC), reversed phase HPLC (RP-HPLC), ion exchange chromatography (IEC), such as cation or anion exchange chromatography, hydrophilic interaction chromatography (HILIC), hydrophobic interaction chromatography (HIC), size exclusion chromatography (SEC) including gel filtration chromatography or gel permeation chromatography, chromatofocusing, affinity chromatography such as immuno-affinity, immobilised metal affinity chromatography, and the like. Chromatography, including single-, two- or more-dimensional chromatography, can be used as a peptide fractionation method in conjunction with a further peptide analysis method, such as for example, with a downstream mass spectrometry analysis as described elsewhere in this specification.
[00105] Further peptide or polypeptide separation, identification or quantification methods can be used, optionally in conjunction with any of the above described analysis methods, for measuring biomarkers in the present disclosure. Such methods include, without limitation, chemical extraction partitioning, isoelectric focusing (IEF) including capillary isoelectric focusing (CIEF), capillary isotachophoresis (CITP), capillary electrochromatography (CEC), and the like, one-dimensional polyacrylamide gel electrophoresis (PAGE), two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), capillary gel electrophoresis (CGE), capillary zone electrophoresis (CZE), micellar electrokinetic chromatography (MEKC), free flow electrophoresis (FFE), etc.
[00106] In the context of the invention, the term "capture agent" refers to a compound that can specifically bind to a target, in particular a biomarker. The term includes antibodies, antibody fragments, nucleic acid-based protein binding reagents (e.g. aptamers, Slow Off-rate Modified Aptamers (SOMAmer™)), protein-capture agents, natural ligands (i.e. a hormone for its receptor or vice versa), small molecules or variants thereof.
[00107] Capture agents can be configured to specifically bind to a target, in particular a biomarker. Capture agents can include but are not limited to organic molecules, such as polypeptides, polynucleotides and other non-polymeric molecules that are identifiable to a skilled person. In the embodiments disclosed herein, capture agents include any agent that can be used to detect, purify, isolate, or enrich a target, in particular a biomarker. Any art-known affinity capture technologies can be used to selectively isolate and enrich/concentrate biomarkers that are components of complex mixtures of biological media for use in the disclosed methods.
[00108] Antibody capture agents that specifically bind to a biomarker can be prepared using any suitable methods known in the art. See, e.g., Coligan, Current Protocols in Immunology (1991); Harlow & Lane, Antibodies: A Laboratory Manual (1988); Goding, Monoclonal Antibodies: Principles and Practice (2d ed. 1986). Antibody capture agents can be any immunoglobulin or derivative thereof, whether natural or wholly or partially synthetically produced. All derivatives thereof which maintain specific binding ability are also included in the term. Antibody capture agents have a binding domain that is homologous or largely homologous to an immunoglobulin binding domain and can be derived from natural sources, or partly or wholly synthetically produced. Antibody capture agents can be monoclonal or polyclonal antibodies. In some embodiments, an antibody is a single chain antibody. Those of ordinary skill in the art will appreciate that antibodies can be provided in any of a variety of forms including, for example, humanized, partially humanized, chimeric, chimeric humanized, etc. Antibody capture agents can be antibody fragments including, but not limited to, Fab, Fab', F(ab')2, scFv, Fv, dsFv, diabody, and Fd fragments. An antibody capture agent can be produced by any means. For example, an antibody capture agent can be enzymatically or chemically produced by fragmentation of an intact antibody and/or it can be recombinantly produced from a gene encoding the partial antibody sequence. An antibody capture agent can comprise a single chain antibody fragment. Alternatively or additionally, an antibody capture agent can comprise multiple chains which are linked together, for example, by disulfide linkages, as well as any functional fragments obtained from such molecules, wherein such fragments retain the specific-binding properties of the parent antibody molecule.
Because of their smaller size as functional components of the whole molecule, antibody fragments can offer advantages over intact antibodies for use in certain immunochemical techniques and experimental applications.
[00109] Suitable capture agents useful for practicing the invention also include aptamers. Aptamers are oligonucleotide sequences that can bind to their targets specifically via unique three dimensional (3-D) structures. An aptamer can include any suitable number of nucleotides and different aptamers can have either the same or different numbers of nucleotides. Aptamers can be DNA or RNA or chemically modified nucleic acids and can be single stranded, double stranded, or contain double stranded regions, and can include higher ordered structures. An aptamer can also be a photoaptamer, where a photoreactive or chemically reactive functional group is included in the aptamer to allow it to be covalently linked to its corresponding target. Use of an aptamer capture agent can include the use of two or more aptamers that specifically bind the same biomarker. An aptamer can include a tag. An aptamer can be identified using any known method, including the SELEX (systematic evolution of ligands by exponential enrichment) process. Once identified, an aptamer can be prepared or synthesized in accordance with any known method, including chemical synthetic methods and enzymatic synthetic methods and used in a variety of applications for biomarker detection. Liu et al., Curr Med Chem. 18(27):4117-25 (2011). Capture agents useful in practicing the methods of the invention also include SOMAmers (Slow Off-Rate Modified Aptamers) known in the art to have improved off-rate characteristics. Brody et al., J Mol Biol. 422(5):595-606 (2012). SOMAmers can be generated using any known method, including the SELEX method.
[00110] It is understood by those skilled in the art that biomarkers can be modified prior to analysis to improve their resolution or to determine their identity. For example, the biomarkers can be subject to proteolytic digestion before analysis. Any protease can be used. Proteases, such as trypsin, that are likely to cleave the biomarkers into a discrete number of fragments are particularly useful. The fragments that result from digestion function as a fingerprint for the biomarkers, thereby enabling their detection indirectly. This is particularly useful where there are biomarkers with similar molecular masses that might be confused for the biomarker in question. Also, proteolytic fragmentation is useful for high molecular weight biomarkers because smaller biomarkers are more easily resolved by mass spectrometry. In another example, biomarkers can be modified to improve detection resolution. For instance, neuraminidase can be used to remove terminal sialic acid residues from glycoproteins to improve binding to an anionic adsorbent and to improve detection resolution. In another example, the biomarkers can be modified by the attachment of a tag of particular molecular weight that specifically binds to molecular biomarkers, further distinguishing them. Optionally, after detecting such modified biomarkers, the identity of the biomarkers can be further determined by matching the physical and chemical characteristics of the modified biomarkers in a protein database (e.g., SwissProt).
[00111] It is further appreciated in the art that biomarkers in a sample can be captured on a substrate for detection. Traditional substrates include antibody-coated 96-well plates or nitrocellulose membranes that are subsequently probed for the presence of the proteins. Alternatively, protein-binding molecules attached to microspheres, microparticles, microbeads, beads, or other particles can be used for capture and detection of biomarkers. The protein-binding molecules can be antibodies, peptides, peptoids, aptamers, small molecule ligands or other protein-binding capture agents attached to the surface of particles. Each protein-binding molecule can include a unique detectable label that is coded such that it can be distinguished from other detectable labels attached to other protein-binding molecules to allow detection of biomarkers in multiplex assays. Examples include, but are not limited to, color-coded microspheres with known fluorescent light intensities (see e.g., microspheres with xMAP technology produced by Luminex (Austin, Tex.)); microspheres containing quantum dot nanocrystals, for example, having different ratios and combinations of quantum dot colors (e.g., Qdot nanocrystals produced by Life Technologies (Carlsbad, Calif.)); glass coated metal nanoparticles (see e.g., SERS nanotags produced by Nanoplex Technologies, Inc. (Mountain View, Calif.)); barcode materials (see e.g., sub-micron sized striped metallic rods such as Nanobarcodes produced by Nanoplex Technologies, Inc.); encoded microparticles with colored bar codes (see e.g., CellCard produced by Vitra Bioscience, vitrabio.com); glass microparticles with digital holographic code images (see e.g., CyVera microbeads produced by Illumina (San Diego, Calif.)); chemiluminescent dyes; combinations of dye compounds; and beads of detectably different sizes.
[00112] In another aspect, biochips can be used for capture and detection of the biomarkers of the invention. Many protein biochips are known in the art. These include, for example, protein biochips produced by Packard BioScience Company (Meriden Conn.), Zyomyx (Hayward, Calif.) and Phylos (Lexington, Mass.). In general, protein biochips comprise a substrate having a surface. A capture reagent or adsorbent is attached to the surface of the substrate. Frequently, the surface comprises a plurality of addressable locations, each of which location has the capture agent bound there. The capture agent can be a biological molecule, such as a polypeptide or a nucleic acid, which captures other biomarkers in a specific manner. Alternatively, the capture agent can be a chromatographic material, such as an anion exchange material or a hydrophilic material. Examples of protein biochips are well known in the art.
[00113] Measuring mRNA in a biological sample can be used as a surrogate for detection of the level of the corresponding protein biomarker in a biological sample. Thus, any of the biomarkers or biomarker panels described herein can also be detected by detecting the appropriate RNA. Levels of mRNA can be measured by reverse transcription quantitative polymerase chain reaction (RT-PCR followed by qPCR). RT-PCR is used to create a cDNA from the mRNA. The cDNA can be used in a qPCR assay to produce fluorescence as the DNA amplification process progresses. By comparison to a standard curve, qPCR can produce an absolute measurement such as number of copies of mRNA per cell. Northern blots, microarrays, Invader assays, and RT-PCR combined with capillary electrophoresis have all been used to measure expression levels of mRNA in a sample. See Gene Expression Profiling: Methods and Protocols, Richard A. Shimkets, editor, Humana Press, 2004.
[00114] Some embodiments disclosed herein relate to diagnostic and prognostic methods of determining the probability for preterm birth in a pregnant female. The detection of the level of expression of one or more biomarkers and/or the determination of a ratio of biomarkers can be used to determine the probability for preterm birth in a pregnant female. Such detection methods can be used, for example, for early diagnosis of the condition, to determine whether a subject is predisposed to preterm birth, to monitor the progress of preterm birth or the progress of treatment protocols, to assess the severity of preterm birth, to forecast the outcome of preterm birth and/or prospects of recovery or birth at full term, or to aid in the determination of a suitable treatment for preterm birth.
[00115] The quantitation of biomarkers in a biological sample can be determined, without limitation, by the methods described above as well as any other method known in the art. The quantitative data thus obtained is then subjected to an analytic classification process. In such a process, the raw data is manipulated according to an algorithm, where the algorithm has been pre-defined by a training set of data, for example as described in the examples provided herein. An algorithm can utilize the training set of data provided herein, or can utilize the guidelines provided herein to generate an algorithm with a different set of data.
[00116] In some embodiments, analyzing a measurable feature to determine the probability for preterm birth in a pregnant female encompasses the use of a predictive model. In further embodiments, analyzing a measurable feature to determine the probability for preterm birth in a pregnant female encompasses comparing said measurable feature with a reference feature. As those skilled in the art can appreciate, such comparison can be a direct comparison to the reference feature or an indirect comparison where the reference feature has been incorporated into the predictive model. In further embodiments, analyzing a measurable feature to determine the probability for preterm birth in a pregnant female encompasses one or more of a linear discriminant analysis model, a support vector machine classification algorithm, a recursive feature elimination model, a prediction analysis of microarray model, a logistic regression model, a CART algorithm, a flex tree algorithm, a LART algorithm, a random forest algorithm, a MART algorithm, a machine learning algorithm, a penalized regression method, or a combination thereof. In particular embodiments, the analysis comprises logistic regression.
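As a minimal sketch of the logistic regression embodiment mentioned above, the following Python fragment maps a set of measurable features to a probability. The function name, coefficient values, intercept, and feature values are hypothetical placeholders chosen for illustration; they are not taken from this disclosure.

```python
import math

def preterm_probability(features, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    if len(features) != len(coefficients):
        raise ValueError("one coefficient is required per feature")
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical transformed measurements for two biomarkers.
p = preterm_probability([0.8, -1.2], coefficients=[1.5, 0.7], intercept=-0.4)
```

In practice the coefficients would be estimated from a training set; here they only show how a reference feature can be incorporated into the model rather than compared directly.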
[00117] An analytic classification process can use any one of a variety of statistical analytic methods to manipulate the quantitative data and provide for classification of the sample. Examples of useful methods include linear discriminant analysis, recursive feature elimination, a prediction analysis of microarray, a logistic regression, a CART algorithm, a FlexTree algorithm, a LART algorithm, a random forest algorithm, a MART algorithm, machine learning algorithms; etc.
[00118] For creation of a random forest for prediction of GAB, one skilled in the art can consider a set of k subjects (pregnant women) for whom the gestational age at birth (GAB) is known, and for whom N analytes (transitions) have been measured in a blood specimen taken several weeks prior to birth. A regression tree begins with a root node that contains all the subjects. The average GAB for all subjects can be calculated in the root node. The variance of the GAB within the root node will be high, because there is a mixture of women with different GABs. The root node is then divided (partitioned) into two branches, so that each branch contains women with a similar GAB. The average GAB for subjects in each branch is again calculated. The variance of the GAB within each branch will be lower than in the root node, because the subset of women within each branch has relatively more similar GABs than those in the root node. The two branches are created by selecting an analyte and a threshold value for the analyte that creates branches with similar GAB. The analyte and threshold value are chosen from among the set of all analytes and threshold values, usually with a random subset of the analytes at each node. The procedure continues recursively producing branches to create leaves (terminal nodes) in which the subjects have very similar GABs. The predicted GAB in each terminal node is the average GAB for subjects in that terminal node. This procedure creates a single regression tree. A random forest can consist of several hundred or several thousand such trees.
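The partitioning of a single node described above can be sketched as follows. This is an illustrative fragment only: it performs one split by minimizing the within-branch sum of squared deviations of GAB, a common variance-reduction criterion assumed here, and the data, function names, and parameter names are hypothetical.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(X, gab, n_candidates, rng):
    """Pick the (analyte index, threshold) that minimizes within-branch SSE.

    X is a list of per-subject analyte lists; gab holds the known
    gestational age at birth for each subject. Only a random subset of
    n_candidates analytes is examined, as at a random forest node.
    """
    n_analytes = len(X[0])
    candidates = rng.sample(range(n_analytes), n_candidates)
    best = None
    for j in candidates:
        # Exclude the largest value so the right branch is never empty.
        for threshold in sorted({row[j] for row in X})[:-1]:
            left = [g for row, g in zip(X, gab) if row[j] <= threshold]
            right = [g for row, g in zip(X, gab) if row[j] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, threshold)
    return best

# Hypothetical data: 6 subjects, 2 analytes; analyte 0 tracks GAB (weeks).
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.9, 2.0], [1.0, 5.5], [1.1, 1.5]]
gab = [33.0, 34.0, 33.5, 39.0, 40.0, 39.5]
score, analyte, threshold = best_split(X, gab, n_candidates=2, rng=random.Random(0))
```

Applying the same function recursively to each branch, and averaging GAB in the terminal nodes, would yield one regression tree; repeating that over resampled subjects would yield a forest.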
[00119] Classification can be made according to predictive modeling methods that set a threshold for determining the probability that a sample belongs to a given class. The probability preferably is at least 50%, or at least 60%, or at least 70%, or at least 80% or higher. Classifications also can be made by determining whether a comparison between an obtained dataset and a reference dataset yields a statistically significant difference. If so, then the sample from which the dataset was obtained is classified as not belonging to the reference dataset class. Conversely, if such a comparison is not statistically significantly different from the reference dataset, then the sample from which the dataset was obtained is classified as belonging to the reference dataset class.
[00120] The predictive ability of a model can be evaluated according to its ability to provide a quality metric, e.g. AUROC (area under the ROC curve) or accuracy, of a particular value, or range of values. Area under the curve measures are useful for comparing the accuracy of a classifier across the complete data range. Classifiers with a greater AUC have a greater capacity to classify unknowns correctly between two groups of interest. In some embodiments, a desired quality threshold is a predictive model that will classify a sample with an accuracy of at least about 0.5, at least about 0.55, at least about 0.6, at least about 0.7, at least about 0.75, at least about 0.8, at least about 0.85, at least about 0.9, at least about 0.95, or higher. As an alternative measure, a desired quality threshold can refer to a predictive model that will classify a sample with an AUC of at least about 0.7, at least about 0.75, at least about 0.8, at least about 0.85, at least about 0.9, or higher.
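For illustration, the area under the ROC curve can be computed directly from classifier scores as the fraction of (positive, negative) sample pairs that are ranked correctly, which is equivalent to the Mann-Whitney U statistic. The sketch below assumes binary labels (1 for preterm, 0 for term); the scores and function name are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Counts the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores; label 1 = preterm, 0 = term.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = auroc(scores, labels)  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking, which is why the quality thresholds recited above start well above that value.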
[00121] As is known in the art, the relative sensitivity and specificity of a predictive model can be adjusted to favor either the specificity metric or the sensitivity metric, where the two metrics have an inverse relationship. The limits in a model as described above can be adjusted to provide a selected sensitivity or specificity level, depending on the particular requirements of the test being performed. One or both of sensitivity and specificity can be at least about 0.7, at least about 0.75, at least about 0.8, at least about 0.85, at least about 0.9, or higher.
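The inverse relationship between the two metrics can be seen by moving the decision threshold on a fixed set of scores. The following sketch uses hypothetical scores and labels and assumes that a score at or above the threshold is called positive.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when score >= threshold is positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical classifier scores; label 1 = preterm, 0 = term.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
strict = sensitivity_specificity(scores, labels, threshold=0.7)
lenient = sensitivity_specificity(scores, labels, threshold=0.3)
```

With these values the stricter threshold gives lower sensitivity and higher specificity, and the lenient threshold gives the reverse.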
[00122] The raw data can be initially analyzed by measuring the values for each biomarker, usually in triplicate or in multiple triplicates. The data can be manipulated, for example, raw data can be transformed using standard curves, and the average of triplicate measurements used to calculate the average and standard deviation for each patient. These values can be transformed before being used in the models, e.g. log-transformed or Box-Cox transformed (Box and Cox, Royal Stat. Soc., Series B, 26:211-246 (1964)). The data are then input into a predictive model, which will classify the sample according to the state. The resulting information can be communicated to a patient or health care provider.
[00123] To generate a predictive model for preterm birth, a robust data set, comprising known control samples and samples corresponding to the preterm birth classification of interest is used in a training set. A sample size can be selected using generally accepted criteria. As discussed above, different statistical methods can be used to obtain a highly accurate predictive model. Examples of such analysis are provided in Example 2.
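The handling of replicate measurements and the transformations mentioned in paragraph [00122] can be sketched as below. The readings are hypothetical, the Box-Cox parameter is supplied by the caller rather than estimated, and the function names are illustrative only.

```python
import math
import statistics

def summarize_triplicate(replicates):
    """Mean and sample standard deviation of replicate measurements."""
    return statistics.mean(replicates), statistics.stdev(replicates)

def box_cox(value, lam):
    """Box-Cox transform; reduces to the natural log when lam == 0."""
    if value <= 0:
        raise ValueError("Box-Cox requires a positive value")
    if lam == 0:
        return math.log(value)
    return (value ** lam - 1.0) / lam

# Hypothetical triplicate readings for one biomarker in one patient.
mean, sd = summarize_triplicate([10.0, 12.0, 14.0])
log_value = box_cox(mean, lam=0)    # log transform
bc_value = box_cox(mean, lam=0.5)   # square-root-like transform
```

The transformed values, not the raw replicates, would then be passed to the predictive model.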
[00124] In one embodiment, hierarchical clustering is performed in the derivation of a predictive model, where the Pearson correlation is employed as the clustering metric. One approach is to consider a preterm birth dataset as a "learning sample" in a problem of "supervised learning." CART is a standard in applications to medicine (Singer, Recursive Partitioning in the Health Sciences, Springer(1999)) and can be modified by transforming any qualitative features to quantitative features; sorting them by attained significance levels, evaluated by sample reuse methods for Hotelling's T2 statistic; and suitable application of the lasso method. Problems in prediction are turned into problems in regression without losing sight of prediction, indeed by making suitable use of the Gini criterion for classification in evaluating the quality of regressions.
[00125] This approach led to what is termed FlexTree (Huang, Proc. Nat. Acad. Sci. U.S.A 101:10529-10534 (2004)). FlexTree performs very well in simulations and when applied to multiple forms of data and is useful for practicing the claimed methods. Software automating FlexTree has been developed. Alternatively, LARTree or LART can be used (Turnbull (2005) Classification Trees with Subset Analysis Selection by the Lasso, Stanford University). The name reflects binary trees, as in CART and FlexTree; the lasso, as has been noted; and the implementation of the lasso through what is termed LARS by Efron et al., Annals of Statistics 32:407-451 (2004). See, also, Huang et al., Proc. Natl. Acad. Sci. USA. 101(29):10529-34 (2004). Other methods of analysis that can be used include logic regression. One method of logic regression is described by Ruczinski, Journal of Computational and Graphical Statistics 12:475-512 (2003). Logic regression resembles CART in that its classifier can be displayed as a binary tree. It is different in that each node has Boolean statements about features that are more general than the simple "and" statements produced by CART.
[00126] Another approach is that of nearest shrunken centroids (Tibshirani, Proc. Natl. Acad. Sci. U.S.A 99:6567-72(2002)). The technology is k-means-like, but has the advantage that by shrinking cluster centers, one automatically selects features, as is the case in the lasso, to focus attention on small numbers of those that are informative. The approach is available as PAM software and is widely used. Two further sets of algorithms that can be used are random forests (Breiman, Machine Learning 45:5-32 (2001)) and MART (Hastie, The Elements of Statistical Learning, Springer (2001)). These two methods are known in the art as "committee methods," that involve predictors that "vote" on outcome.
[00127] To provide significance ordering, the false discovery rate (FDR) can be determined. First, a set of null distributions of dissimilarity values is generated. In one embodiment, the values of observed profiles are permuted to create a sequence of distributions of correlation coefficients obtained out of chance, thereby creating an appropriate set of null distributions of correlation coefficients (Tusher et al., Proc. Natl. Acad. Sci. U.S.A 98, 5116-21 (2001)). The set of null distributions is obtained by: permuting the values of each profile for all available profiles; calculating the pair-wise correlation coefficients for all profiles; calculating the probability density function of the correlation coefficients for this permutation; and repeating the procedure N times, where N is a large number, usually 300. Using the N distributions, one calculates an appropriate measure (mean, median, etc.) of the count of correlation coefficient values that exceed the value (of similarity) that is obtained from the distribution of experimentally observed similarity values at a given significance level.
[00128] The FDR is the ratio of the number of the expected falsely significant correlations (estimated from the correlations greater than this selected Pearson correlation in the set of randomized data) to the number of correlations greater than this selected Pearson correlation in the empirical data (significant correlations). This cut-off correlation value can be applied to the correlations between experimental profiles. Using the aforementioned distribution, a level of confidence is chosen for significance. This is used to determine the lowest value of the correlation coefficient that exceeds the result that would have been obtained by chance. Using this method, one obtains thresholds for positive correlation, negative correlation or both. Using this threshold(s), the user can filter the observed values of the pair-wise correlation coefficients and eliminate those that do not exceed the threshold(s). Furthermore, an estimate of the false positive rate can be obtained for a given threshold. For each of the individual "random correlation" distributions, one can find how many observations fall outside the threshold range. This procedure provides a sequence of counts. The mean and the standard deviation of the sequence provide the average number of potential false positives and its standard deviation.
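The permutation procedure and the FDR ratio described in paragraphs [00127] and [00128] can be sketched as follows for a positive-correlation cut-off. This is a simplified illustration under stated assumptions: it counts correlations above a fixed cut-off rather than building full probability density functions, and the profiles, cut-off, and function names are hypothetical.

```python
import itertools
import random
import statistics

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def count_above(profiles, cutoff):
    """Number of profile pairs whose correlation exceeds the cutoff."""
    return sum(1 for a, b in itertools.combinations(profiles, 2)
               if pearson(a, b) > cutoff)

def estimate_fdr(profiles, cutoff, n_permutations, rng):
    """Expected false positives (mean permuted count) / observed count."""
    observed = count_above(profiles, cutoff)
    null_counts = []
    for _ in range(n_permutations):
        # Permute the values within each profile independently.
        shuffled = [rng.sample(p, len(p)) for p in profiles]
        null_counts.append(count_above(shuffled, cutoff))
    mean_null = statistics.mean(null_counts)
    sd_null = statistics.stdev(null_counts)
    fdr = mean_null / observed if observed else float("nan")
    return observed, mean_null, sd_null, fdr

# Hypothetical profiles; the first two are strongly correlated.
profiles = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [2, 4, 5, 8, 10, 13, 14, 16],
    [8, 7, 6, 5, 4, 3, 2, 1],
    [3, 1, 4, 1, 5, 9, 2, 6],
]
observed, mean_null, sd_null, fdr = estimate_fdr(
    profiles, cutoff=0.9, n_permutations=300, rng=random.Random(0))
```

The mean and standard deviation of the permuted counts correspond to the average number of potential false positives and its standard deviation.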
[00129] In an alternative analytical approach, variables chosen in the cross-sectional analysis are separately employed as predictors in a time-to-event analysis (survival analysis), where the event is the occurrence of preterm birth, and subjects with no event are considered censored at the time of giving birth. Given the specific pregnancy outcome (preterm birth event or no event), the random lengths of time each patient will be observed, and selection of proteomic and other features, a parametric approach to analyzing survival can be better than the widely applied semi-parametric Cox model. A Weibull parametric fit of survival permits the hazard rate to be monotonically increasing, decreasing, or constant, and also has a proportional hazards representation (as does the Cox model) and an accelerated failure-time representation. All the standard tools available in obtaining approximate maximum likelihood estimators of regression coefficients and corresponding functions are available with this model.
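To illustrate why a Weibull fit permits an increasing, decreasing, or constant hazard rate, the standard Weibull hazard and survival functions are sketched below. The shape and scale values and the time points are hypothetical and are not fitted to any data in this disclosure.

```python
import math

def weibull_hazard(t, shape, scale):
    """Weibull hazard h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def weibull_survival(t, shape, scale):
    """Weibull survival S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))

times = [10.0, 20.0, 30.0]  # hypothetical follow-up times (weeks)
increasing = [weibull_hazard(t, shape=2.0, scale=40.0) for t in times]
constant = [weibull_hazard(t, shape=1.0, scale=40.0) for t in times]
decreasing = [weibull_hazard(t, shape=0.5, scale=40.0) for t in times]
```

A shape parameter above 1 gives a rising hazard, exactly 1 gives the constant hazard of an exponential model, and below 1 gives a falling hazard.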
[00130] In addition the Cox models can be used, especially since reductions of numbers of covariates to manageable size with the lasso will significantly simplify the analysis, allowing the possibility of a nonparametric or semi-parametric approach to prediction of time to preterm birth. These statistical tools are known in the art and applicable to all manner of proteomic data. A set of biomarker, clinical and genetic data that can be easily determined, and that is highly informative regarding the probability for preterm birth and predicted time to a preterm birth event in said pregnant female is provided. Also, algorithms provide information regarding the probability for preterm birth in the pregnant female.
[00131] Accordingly, one skilled in the art understands that the probability for preterm birth according to the invention can be determined using either a quantitative or a categorical variable. For example, in practicing the methods of the invention the measurable feature of each of N biomarkers can be subjected to categorical data analysis to determine the probability for preterm birth as a binary categorical outcome. Alternatively, the methods of the invention may analyze the measurable feature of each of N biomarkers by initially calculating quantitative variables, in particular, predicted gestational age at birth. The predicted gestational age at birth can subsequently be used as a basis to predict risk of preterm birth. By initially using a quantitative variable and subsequently converting the quantitative variable into a categorical variable the methods of the invention take into account the continuum of measurements detected for the measurable features. For example, by predicting the gestational age at birth rather than making a binary prediction of preterm birth versus term birth, it is possible to tailor the treatment for the pregnant female. For example, an earlier predicted gestational age at birth will result in more intensive prenatal intervention, i.e. monitoring and treatment, than a predicted gestational age that approaches full term.
[00132] Among women with a predicted GAB of j days plus or minus k days, p(PTB) can be estimated as the proportion of women in the PAPR clinical trial (see Example 1) with a predicted GAB of j days plus or minus k days who actually deliver before 37 weeks gestational age. More generally, for women with a predicted GAB of j days plus or minus k days, the probability that the actual gestational age at birth will be less than a specified gestational age, p(actual GAB < specified GAB), was estimated as the proportion of women in the PAPR clinical trial with a predicted GAB of j days plus or minus k days who actually deliver before the specified gestational age.
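The proportion described in [00132] can be sketched in Python as follows. The function name, the `(predicted, actual)` record layout, and the sample values in the usage note are hypothetical; 37 weeks is expressed as 259 days.

```python
def p_birth_before(records, j, k, specified_gab=259):
    """Estimate p(actual GAB < specified GAB) for a predicted GAB of j +/- k days.

    records: iterable of (predicted_gab_days, actual_gab_days) pairs from a
    reference cohort. Returns None if no subject falls inside the window.
    """
    window = [actual for predicted, actual in records
              if abs(predicted - j) <= k]
    if not window:
        return None
    return sum(1 for actual in window if actual < specified_gab) / len(window)
```

For example, with the made-up records `[(250, 245), (255, 262), (252, 258)]`, a call with `j=252, k=3` uses all three subjects, two of whom delivered before day 259.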
[00133] In the development of a predictive model, it can be desirable to select a subset of markers, i.e. at least 3, at least 4, at least 5, at least 6, up to the complete set of markers. Usually a subset of markers will be chosen that provides for the needs of the quantitative sample analysis, e.g. availability of reagents, convenience of quantitation, etc., while maintaining a highly accurate predictive model. The selection of a number of informative markers for building classification models requires the definition of a performance metric and a user-defined threshold for producing a model with useful predictive ability based on this metric. For example, the performance metric can be the AUC, the sensitivity and/or specificity of the prediction as well as the overall accuracy of the prediction model.
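As a minimal sketch of the threshold-based performance metrics named in [00133], the following Python function computes sensitivity, specificity and overall accuracy from binary outcomes. The function name and the 1 = preterm, 0 = term coding are assumptions for illustration, and both classes are assumed to be present.

```python
def classification_metrics(actual, predicted):
    """Sensitivity, specificity and accuracy for binary labels (1 = preterm)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }
```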
[00134] As will be understood by those skilled in the art, an analytic classification process can use any one of a variety of statistical analytic methods to manipulate the quantitative data and provide for classification of the sample. Examples of useful methods include, without limitation, linear discriminant analysis, recursive feature elimination, a prediction analysis of microarray, a logistic regression, a CART algorithm, a FlexTree algorithm, a LART algorithm, a random forest algorithm, a MART algorithm, and machine learning algorithms.
[00135] As described in Example 2, various methods are used in a training model. The selection of a subset of markers can be for a forward selection or a backward selection of a marker subset. The number of markers can be selected that will optimize the performance of a model without the use of all the markers. One way to define the optimum number of terms is to choose the number of terms that produce a model with desired predictive ability (e.g. an AUC>0.75, or equivalent measures of sensitivity/specificity) that lies no more than one standard error from the maximum value obtained for this metric using any combination and number of terms used for the given algorithm.
[00136] Table 1. Transitions with p-values less than 0.05 in univariate Cox Proportional Hazards analyses to predict Gestational Age at Birth
Transition  Protein  p-value (Cox univariate)
TLLPVSKPEIR_418.26_514.3  C05_HUMAN  0.042
VQEAHLTEDQIFYFPK_655.66_701.4  C08G_HUMAN  0.047
ISLLLIESWLEPVR_834.49_371.2  CSH_HUMAN  0.048
ALQDQLVLVAAK_634.88_289.2  ANGT_HUMAN  0.048
YEFLNGR_449.72_293.1  PLMN_HUMAN  0.049
[00137] Table 2. Transitions selected by the Cox stepwise AIC analysis
Transition  coef  exp(coef)  se(coef)  z  Pr(>|z|)
VQEAHLTEDQIFYFPK_655.66_391.2  -2.02E+01  1.77E-09  2.45E+00  -8.22  2.20E-16
VEIDTK_352.7_476.3  7.06E+00  1.17E+03  1.45E+00  4.86  1.20E-06
AVLTIDEK_444.76_605.3  7.85E+00  2.56E+03  9.46E-01  8.29  < 2e-16
FSVVYAK_407.23_579.4  -2.44E+01  2.42E-11  3.08E+00  -7.93  2.20E-15
YYLQGAK_421.72_516.3  -1.82E+01  1.22E-08  2.45E+00  -7.44  1.00E-13
EENFYVDETTVVK_786.88_259.1  -1.90E+01  5.36E-09  2.71E+00  -7.03  2.00E-12
YGFYTHVFR_397.2_421.3  1.90E+01  1.71E+08  2.73E+00  6.93  4.20E-12
HTLNQIDEVK_598.82_951.5  1.03E+01  3.04E+04  2.11E+00  4.89  9.90E-07
AFIQLWAFDAVK_704.89_836.4  1.08E+01  4.72E+04  2.59E+00  4.16  3.20E-05
SGFSFGFK_438.72_585.3  1.35E+01  7.32E+05  2.56E+00  5.27  1.40E-07
GWVTDGFSSLK_598.8_854.4  -3.12E+00  4.42E-02  9.16E-01  -3.4  0.00066
ITENDIQIALDDAK_779.9_632.3  1.91E+00  6.78E+00  1.36E+00  1.4  0.16036
[00138] Table 3. Transitions selected by Cox lasso model
Figure imgf000043_0001
[00139] Table 4. Area under the ROC (AUROC) curve for individual analytes to discriminate pre-term birth subjects from non-pre-term birth subjects. The 77 transitions with the highest AUROC area are shown.
Transition  AUROC
AFTECCVVASQLR_770.87_574.3  0.70
ITLPDFTGDLR_624.34_920.4  0.70
IRPFFPQQ_516.79_661.4  0.68
TDAPDLPEENQAR_728.34_613.3  0.67
ITLPDFTGDLR_624.34_288.2  0.67
ELLESYIDGR_597.8_839.4  0.67
SFRPFVPR_335.86_635.3  0.67
ETAASLLQAGYK_626.33_879.5  0.67
TLLPVSKPEIR_418.26_288.2  0.66
ETAASLLQAGYK_626.33_679.4  0.66
SFRPFVPR_335.86_272.2  0.66
LQGTLPVEAR_542.31_571.3  0.66
VEPLYELVTATDFAYSSTVR_754.38_712.4  0.66
DPDQTDGLGLSYLSSHIANVER_796.39_328.1  0.66
VTGWGNLK_437.74_617.3  0.65
ALQDQLVLVAAK_634.88_289.2  0.65
EAQLPVIENK_570.82_329.1  0.65
VRPQQLVK_484.31_609.3  0.65
AFTECCVVASQLR_770.87_673.4  0.65
YEFLNGR_449.72_293.1  0.65
VGEYSLYIGR_578.8_871.5  0.64
EAQLPVIENK_570.82_699.4  0.64
TLLPVSKPEIR_418.26_514.3  0.64
IEEIAAK_387.22_531.3  0.64
LEQGENVFLQATDK_796.4_822.4  0.64
LQGTLPVEAR_542.31_842.5  0.64
FLQEQGHR_338.84_497.3  0.63
ISLLLIESWLEPVR_834.49_371.2  0.63
IITGLLEFEVYLEYLQNR_738.4_530.3  0.63
LSSPAVITDK_515.79_743.4  0.63
VRPQQLVK_484.31_722.4  0.63
SLPVSDSVLSGFEQR_810.92_723.3  0.63
VQEAHLTEDQIFYFPK_655.66_701.4  0.63
NADYSYSVWK_616.78_333.2  0.63
DAQYAPGYDK_564.25_813.4  0.62
FQLPGQK_409.23_276.1  0.62
TASDFITK_441.73_781.4  0.62
YGLVTYATYPK_638.33_334.2  0.62
GSFALSFPVESDVAPIAR_931.99_363.2  0.62
TLLIANETLR_572.34_703.4  0.62
VILGAHQEVNLEPHVQEIEVSR_832.78_860.4  0.62
TATSEYQTFFNPR_781.37_386.2  0.62
YEVQGEVFTKPQLWP_910.96_392.2  0.62
DISEVVTPR_508.27_472.3  0.62
GSFALSFPVESDVAPIAR_931.99_456.3  0.62
YGFYTHVFR_397.2_421.3  0.62
TLEAQLTPR_514.79_685.4  0.62
YGFYTHVFR_397.2_659.4  0.62
AVGYLITGYQR_620.84_737.4  0.61
DPDQTDGLGLSYLSSHIANVER_796.39_456.2  0.61
FNAVLTNPQGDYDTSTGK_964.46_262.1  0.61
SPEQQETVLDGNLIIR_906.48_685.4  0.61
ALNHLPLEYNSALYSR_620.99_538.3  0.61
GGEIEGFR_432.71_508.3  0.61
GIVEECCFR_585.26_900.3  0.61
DAQYAPGYDK_564.25_315.1  0.61
FAFNLYR_465.75_712.4  0.61
YTTEIIK_434.25_603.4  0.61
AVLTIDEK_444.76_605.3  0.61
AITPPHPASQANIIFDITEGNLR_825.77_459.3  0.60
EPGLCTWQSLR_673.83_790.4  0.60
AVYEAVLR_460.76_587.4  0.60
ALQDQLVLVAAK_634.88_956.6  0.60
AWVAWR_394.71_531.3  0.60
TNLESILSYPK_632.84_807.5  0.60
HLSLLTTLSNR_418.91_376.2  0.60
FTFTLHLETPKPSISSSNLNPR_829.44_787.4  0.60
AVGYLITGYQR_620.84_523.3  0.60
FQLPGQK_409.23_429.2  0.60
YGLVTYATYPK_638.33_843.4  0.60
TELRPGETLNVNFLLR_624.68_662.4  0.60
LSSPAVITDK_515.79_830.5  0.60
TATSEYQTFFNPR_781.37_272.2  0.60
LPTAVVPLR_483.31_385.3  0.60
APLTKPLK_289.86_260.2  0.60
Table 5. AUROCs for random forest, boosting, lasso, and logistic regression for a specific number of transitions permitted in the model, as estimated by 100 rounds of bootstrap resampling.
Figure imgf000047_0001
[00141] Table 6. Top 15 transitions selected by each multivariate method, ranked by importance for that method.
Figure imgf000047_0002
6  GSFALSFPVESDVAPIAR_931.99_363.2  GGEIEGFR_432.71_379.2  IITGLLEFEVYLEYLQNR_738.4_530.3  AEAQAQYSAAVAK_654.33_709.4
7  VGEYSLYIGR_578.8_871.5  ALQDQLVLVAAK_634.88_289.2  ADSQAQLLLSTVVGVFTAPGLHLK_822.46_983.6  ADSQAQLLLSTVVGVFTAPGLHLK_822.46_983.6
8  SFRPFVPR_335.86_635.3  VGEYSLYIGR_578.8_871.5  SLPVSDSVLSGFEQR_810.92_723.3  AITPPHPASQANIIFDITEGNLR_825.77_459.3
9  ALQDQLVLVAAK_634.88_289.2  VEPLYELVTATDFAYSSTVR_754.38_712.4  SFRPFVPR_335.86_272.2  ADSQAQLLLSTVVGVFTAPGLHLK_822.46_664.4
10  EDTPNSVWEPAK_686.82_315.2  SPEQQETVLDGNLIIR_906.48_685.4  IIGGSDADIK_494.77_260.2  AYSDLSR_406.2_375.2
11  YGFYTHVFR_397.2_421.3  YEFLNGR_449.72_293.1  NADYSYSVWK_616.78_333.2  DALSSVQESQVAQQAR_572.96_672.4
12  DPDQTDGLGLSYLSSHIANVER_796.39_328.1  LEQGENVFLQATDK_796.4_822.4  GSFALSFPVESDVAPIAR_931.99_456.3  ANRPFLVFIR_411.58_435.3
13  LEQGENVFLQATDK_796.4_822.4  LQGTLPVEAR_542.31_571.3  LSSPAVITDK_515.79_743.4  DALSSVQESQVAQQAR_572.96_502.3
14  LQGTLPVEAR_542.31_571.3  ISLLLIESWLEPVR_834.49_371.2  ELPEHTVK_476.76_347.2  ALEQDLPVNIK_620.35_570.4
15  SFRPFVPR_335.86_272.2  TASDFITK_441.73_781.4  EAQLPVIENK_570.82_699.4  AVLTIDEK_444.76_718.4
[00142] In yet another aspect, the invention provides kits for determining probability of preterm birth, wherein the kits can be used to detect N of the isolated biomarkers listed in Tables 1 through 63. For example, the kits can be used to detect one or more, two or more, or three of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, and ITLPDFTGDLR. For example, the kits can be used to detect one or more, two or more, or three of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
[00143] In another aspect, the kits can be used to detect one or more, two or more, three or more, four or more, five or more, six or more, seven or more, or eight of the isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
[00144] In another aspect, the kits can be used to detect one or more, two or more, three or more, four or more, five or more, six or more, seven or more, or eight of the isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[00145] The kit can include one or more agents for detection of biomarkers, a container for holding a biological sample isolated from a pregnant female; and printed instructions for reacting agents with the biological sample or a portion of the biological sample to detect the presence or amount of the isolated biomarkers in the biological sample. The agents can be packaged in separate containers. The kit can further comprise one or more control reference samples and reagents for performing an immunoassay.
[00146] In one embodiment, the kit comprises agents for measuring the levels of at least N of the isolated biomarkers listed in Tables 1 through 63. The kit can include antibodies that specifically bind to these biomarkers, for example, the kit can contain at least one of an antibody that specifically binds to lipopolysaccharide-binding protein (LBP), an antibody that specifically binds to prothrombin (THRB), an antibody that specifically binds to complement component C5 (C5 or C05), an antibody that specifically binds to plasminogen (PLMN), and an antibody that specifically binds to complement component C8 gamma chain (C8G or C08G).
[00147] In one embodiment, the kit comprises agents for measuring the levels of at least N of the isolated biomarkers listed in Tables 1 through 63. The kit can include antibodies that specifically bind to these biomarkers, for example, the kit can contain at least one of an antibody that specifically binds to Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), and Inhibin beta E chain (INHBE).
[00148] The kit can comprise one or more containers for compositions contained in the kit. Compositions can be in liquid form or can be lyophilized. Suitable containers for the compositions include, for example, bottles, vials, syringes, and test tubes. Containers can be formed from a variety of materials, including glass or plastic. The kit can also comprise a package insert containing written instructions for methods of determining probability of preterm birth.
[00149] From the foregoing description, it will be apparent that variations and modifications can be made to the invention described herein to adapt it to various usages and conditions. Such embodiments are also within the scope of the following claims.
[00150] The recitation of a listing of elements in any definition of a variable herein includes definitions of that variable as any single element or combination (or subcombination) of listed elements. The recitation of an embodiment herein includes that embodiment as any single embodiment or in combination with any other embodiments or portions thereof.
[00151] All patents and publications mentioned in this specification are herein incorporated by reference to the same extent as if each independent patent and publication was specifically and individually indicated to be incorporated by reference.
[00152] The following examples are provided by way of illustration, not limitation.
EXAMPLES
Example 1. Development of Sample Set for Discovery and Validation of Biomarkers for Preterm Birth
[00153] A standard protocol was developed governing conduct of the Proteomic Assessment of Preterm Risk (PAPR) clinical study. This protocol also specified that the samples and clinical information could be used to study other pregnancy complications for some of the subjects. Specimens were obtained from women at 11 Institutional Review Board (IRB) approved sites across the United States. After subjects provided informed consent, serum and plasma samples were obtained, as well as pertinent information regarding the patient's demographic characteristics, past medical and pregnancy history, current pregnancy history and concurrent medications. Following delivery, data were collected relating to maternal and infant conditions and complications. Serum and plasma samples were processed according to a protocol that requires standardized refrigerated centrifugation, aliquoting of the samples into 0.5 ml 2-D bar-coded cryovials and subsequent freezing at -80°C.
[00154] Following delivery, preterm birth cases were individually reviewed to determine their status as either a spontaneous preterm birth or a medically indicated preterm birth. Only spontaneous preterm birth cases were used for this analysis. For discovery of biomarkers of preterm birth, 80 samples were analyzed in two gestational age groups: a) a late window composed of samples from 23-28 weeks of gestation, which included 13 cases, 13 term controls matched within one week of sample collection and 14 random term controls, and b) an early window composed of samples from 17-22 weeks of gestation, which included 15 cases, 15 term controls matched within one week of sample collection and 10 random term controls.
[00155] The samples were subsequently depleted of high abundance proteins using the Human 14 Multiple Affinity Removal System (MARS 14), which removes 14 of the most abundant proteins that are treated as uninformative with regard to the identification of disease-relevant changes in the serum proteome. To this end, equal volumes of each clinical sample or pooled human serum (HGS) sample were diluted with column buffer and filtered to remove precipitates. Filtered samples were depleted using a MARS-14 column (4.6 x 100 mm, Cat. #5188-6558, Agilent Technologies). Samples were chilled to 4°C in the autosampler, the depletion column was run at room temperature, and collected fractions were kept at 4°C until further analysis. The unbound fractions were collected for further analysis.
[00156] A second aliquot of each clinical serum sample and of each HGS was diluted into ammonium bicarbonate buffer and depleted of the 14 high-abundance and approximately 60 additional moderately abundant proteins using an IgY14-SuperMix (Sigma) hand-packed column, comprised of 10 mL of bulk material (50% slurry, Sigma). Shi et al., Methods, 56(2):246-53 (2012). Samples were chilled to 4°C in the autosampler, the depletion column was run at room temperature, and collected fractions were kept at 4°C until further analysis. The unbound fractions were collected for further analysis.
[00157] Depleted serum samples were denatured with trifluoroethanol, reduced with dithiothreitol, alkylated using iodoacetamide, and then digested with trypsin at a 1:10 trypsin:protein ratio. Following trypsin digestion, samples were desalted on a C18 column, and the eluate lyophilized to dryness. The desalted samples were resolubilized in a reconstitution solution containing five internal standard peptides.
[00158] Depleted and trypsin digested samples were analyzed using a scheduled Multiple Reaction Monitoring method (sMRM). The peptides were separated on a 150 mm x 0.32 mm Bio-Basic C18 column (ThermoFisher) at a flow rate of 5 μl/min using a Waters Nano Acquity UPLC and eluted using an acetonitrile gradient into an AB SCIEX QTRAP 5500 with a Turbo V source (AB SCIEX, Framingham, MA). The sMRM assay measured 1708 transitions that correspond to 854 peptides and 236 proteins. Chromatographic peaks were integrated using Rosetta Elucidator software (Ceiba Solutions).
[00159] Transitions were excluded from analysis if their intensity area counts were less than 10000 and if they were missing in more than three samples per batch. Intensity area counts were log transformed, and mass spectrometry run order trends and depletion batch effects were minimized using a regression analysis.
Example 2. Analysis I of Transitions to Identify Preterm Birth Biomarkers
[00160] The objective of these analyses was to examine the data collected in Example 1 to identify transitions and proteins that predict preterm birth. The specific analyses employed were (i) Cox time-to-event analyses and (ii) models with preterm birth as a binary categorical dependent variable. The dependent variable for all the Cox analyses was Gestational Age of time to event (where event is preterm birth). For the purpose of the Cox analyses, preterm birth subjects have the event on the day of birth. Term subjects are censored on the day of birth. Gestational age on the day of specimen collection is a covariate in all Cox analyses.
[00161] The assay data were previously adjusted for run order and depletion batch, and log transformed. Values for gestational age at time of sample collection were adjusted as follows. Transition values were regressed on gestational age at time of sample collection using only controls (non-pre-term subjects). The residuals from the regression were designated as adjusted values. The adjusted values were used in the models with pre-term birth as a binary categorical dependent variable. Unadjusted values were used in the Cox analyses.
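The gestational-age adjustment described in [00161] can be sketched in Python for a single transition. The function name and argument layout are hypothetical, and an ordinary least-squares straight-line fit is assumed because the specification does not state the form of the regression.

```python
def adjust_for_gestational_age(values, ga_at_collection, is_control):
    """Residuals of one transition after regressing on gestational age.

    The line is fitted using controls (non-pre-term subjects) only, then
    residuals ("adjusted values") are computed for every subject.
    Assumption: simple least-squares linear regression.
    """
    xs = [g for g, c in zip(ga_at_collection, is_control) if c]
    ys = [v for v, c in zip(values, is_control) if c]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return [v - (intercept + slope * g)
            for v, g in zip(values, ga_at_collection)]
```

Fitting on controls only keeps any case-specific signal out of the estimated gestational-age trend, so that signal remains in the residuals of the cases.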
Univariate Cox Proportional Hazards Analyses
[00162] Univariate Cox Proportional Hazards analyses were performed to predict Gestational Age at Birth, including Gestational age on the day of specimen collection as a covariate. Table 1 shows the transitions with p-values less than 0.05. Five proteins have multiple transitions among those with p-value less than 0.05: lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
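For orientation, the quantity maximized when fitting a Cox Proportional Hazards model is the log partial likelihood, sketched below in Python. The function name and data layout are hypothetical, and tied event times are handled in the Breslow manner, which is an assumption; the specification does not state which software or tie handling was used.

```python
import math


def cox_log_partial_likelihood(beta, times, events, covariates):
    """Cox log partial likelihood.

    beta: list of coefficients, one per covariate
    times: gestational age at birth for each subject (days)
    events: 1 if preterm birth (event), 0 if censored at term birth
    covariates: one list per subject, e.g. [transition value, GA at collection]
    """
    scores = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    log_lik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # risk set: everyone who has not yet delivered just before t_i
        risk = [scores[j] for j, t_j in enumerate(times) if t_j >= t_i]
        log_lik += scores[i] - math.log(sum(math.exp(s) for s in risk))
    return log_lik
```

In a univariate analysis of the kind described, `covariates` would hold one transition value plus gestational age at specimen collection for each subject.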
Multivariate Cox Proportional Hazards Analyses: Stepwise AIC selection
[00163] Cox Proportional Hazards analyses were performed to predict Gestational Age at Birth, including Gestational age on the day of specimen collection as a covariate, using stepwise and lasso models for variable selection. These analyses include a total of n=80 subjects, with number of PTB events=28. The stepwise variable selection analysis used the Akaike Information Criterion (AIC) as the stopping criterion. Table 2 shows the transitions selected by the stepwise AIC analysis. The coefficient of determination (R2) for the stepwise AIC model is 0.86 (not corrected for multiple comparisons).
Multivariate Cox Proportional Hazards Analyses: lasso selection
[00164] Lasso variable selection was used as the second method of multivariate Cox Proportional Hazards analyses to predict Gestational Age at Birth, including Gestational age on the day of specimen collection as a covariate. This analysis uses a lambda penalty for lasso estimated by cross validation. Table 3 shows the results. The lasso variable selection method is considerably more stringent than the stepwise AIC, and selects only 3 transitions for the final model, representing 3 different proteins. These 3 proteins give the top 4 transitions from the univariate analysis; 2 of the top 4 univariate are from the same protein, and hence are not both selected by the lasso method. Lasso tends to select a relatively small number of variables with low mutual correlation. The coefficient of determination (R2) for the lasso model is 0.21 (not corrected for multiple comparisons).
Univariate AUROC analysis of preterm birth as a binary categorical dependent variable
[00165] Univariate analyses were performed to discriminate pre-term subjects from non-pre-term subjects (pre-term as a binary categorical variable) as estimated by area under the receiver operating characteristic (AUROC) curve. These analyses use transition values adjusted for gestational age at time of sample collection, as described above. Table 4 shows the AUROC for the 77 transitions with the highest AUROC, each 0.6 or greater.
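A univariate AUROC of the kind reported in Table 4 can be computed without tracing the curve, by counting case/control pairs in which the case has the higher value. The Python sketch below uses a hypothetical function name and assumes that higher values indicate pre-term birth; for an analyte that is lower in cases, the complement 1 - AUROC would apply.

```python
def auroc(case_values, control_values):
    """Area under the ROC curve for a single analyte.

    Equals the fraction of (case, control) pairs in which the case value is
    higher than the control value, with ties counted as one half.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))
```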
Multivariate analysis of preterm birth as a binary categorical dependent variable
[00166] Multivariate analyses were performed to predict preterm birth as a binary categorical dependent variable, using random forest, boosting, lasso, and logistic regression models. Random forest and boosting models grow many classification trees. The trees vote on the assignment of each subject to one of the possible classes. The forest chooses the class with the most votes over all the trees.
[00167] Each of the four methods (random forest, boosting, lasso, and logistic regression) was allowed to select and rank its own best 15 transitions. We then built models with 1 to 15 transitions. Each method sequentially reduces the number of nodes from 15 to 1 independently. A recursive option was used to reduce the number of nodes at each step: to determine which node to remove, the nodes were ranked at each step based on their importance from a nested cross-validation procedure, and the least important node was eliminated. The importance measures for lasso and logistic regression are z-values. For random forest and boosting, the variable importance was calculated from permuting out-of-bag data: for each tree, the classification error rate on the out-of-bag portion of the data was recorded; the error rate was then recalculated after permuting the values of each variable (i.e., transition); if the transition was in fact important, there would have been a big difference between the two error rates; the differences between the two error rates were then averaged over all trees, and normalized by the standard deviation of the differences. The AUCs for these models are shown in Table 5, as estimated by 100 rounds of bootstrap resampling. Table 6 shows the top 15 transitions selected by each multivariate method, ranked by importance for that method. These multivariate analyses suggest that models that combine 3 or more transitions give AUC greater than 0.7, as estimated by bootstrap.
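The out-of-bag permutation importance described in [00167] can be sketched in Python as follows. The representation of a tree as a `(predict_fn, oob_indices)` pair and the function names are hypothetical simplifications; the tree-growing step itself is not shown.

```python
import random
from statistics import mean, pstdev


def error_rate(predict, rows, labels):
    """Fraction of rows that the classifier assigns to the wrong class."""
    return sum(predict(r) != y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(trees, rows, labels, feature, seed=0):
    """Return (mean error increase, normalized importance or None).

    trees: list of (predict_fn, oob_indices) pairs, one per tree, where
    oob_indices are the subjects left out when that tree was grown.
    """
    rng = random.Random(seed)
    diffs = []
    for predict, oob in trees:
        oob_rows = [list(rows[i]) for i in oob]
        oob_labels = [labels[i] for i in oob]
        baseline = error_rate(predict, oob_rows, oob_labels)
        # permute the values of one variable (transition) across OOB subjects
        column = [r[feature] for r in oob_rows]
        rng.shuffle(column)
        permuted = [r[:feature] + [v] + r[feature + 1:]
                    for r, v in zip(oob_rows, column)]
        diffs.append(error_rate(predict, permuted, oob_labels) - baseline)
    avg, sd = mean(diffs), pstdev(diffs)
    # normalize by the standard deviation of the differences when defined
    return avg, (avg / sd if sd > 0 else None)
```

A transition that no tree uses yields a mean difference of zero, which is why it would be ranked last and eliminated first under the recursive scheme.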
[00168] In multivariate models, random forest (rf), boosting, and lasso models gave the best area under the ROC curve. The following transitions were selected by these models, as significant in Cox univariate models, and/or having high univariate AUROCs:
AFTECCVVASQLR_770.87_574.3
ELLESYIDGR_597.8_710.3
ITLPDFTGDLR_624.34_920.4
TDAPDLPEENQAR_728.34_613.3
SFRPFVPR_335.86_635.3
[00169] In summary, univariate and multivariate Cox analyses were performed using transitions to predict Gestational Age at Birth (GAB), including Gestational age on the day of specimen collection as a covariate. In the univariate Cox analysis, five proteins were identified that have multiple transitions among those with p-value less than 0.05: lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
[00170] In multivariate Cox analyses, stepwise AIC variable analysis selects 24 transitions, while the lasso model selects 3 transitions, which include the 3 top proteins in the univariate analysis. Univariate (AUROC) and multivariate (random forest, boosting, lasso, and logistic regression) analyses were performed to predict pre-term birth as a binary categorical variable. Univariate analyses identified 63 analytes with AUROC of 0.6 or greater. Multivariate analyses suggest that models that combine 3 or more transitions give AUC greater than 0.7, as estimated by bootstrap.
Example 3. Study II to Identify and Confirm Preterm Birth Biomarkers
[00171] A further study was performed using essentially the same methods described in the preceding Examples unless noted below. In this study, 2 gestational-age-matched controls were used for each of 28 cases (56 matched controls in total), all from the early gestational window only (17-22 weeks).
[00172] The samples were processed in 4 batches with each batch composed of 7 cases, 14 matched controls and 3 HGS controls. Serum samples were depleted of the 14 most abundant serum proteins by MARS 14 as described in Example 1. Depleted serum was then reduced with dithiothreitol, alkylated with iodoacetamide, and then digested with trypsin at a 1:20 trypsin to protein ratio overnight at 37°C. Following trypsin digestion, the samples were desalted on an Empore C18 96-well Solid Phase Extraction Plate (3M Company) and lyophilized to dryness. The desalted samples were resolubilized in a reconstitution solution containing five internal standard peptides.
[00173] The LC-MS/MS analysis was performed with an Agilent Poroshell 120 EC-C18 column (2.1x50mm, 2.7 μm) and eluted with an acetonitrile gradient into an Agilent 6490 Triple Quadrupole mass spectrometer.
[00174] Data analysis included the use of conditional logistic regression where each matching triplet (case and 2 matched controls) was a stratum. The p-value reported in the table indicates whether there is a significant difference between cases and matched controls.
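The conditional logistic regression in [00174] treats each matched triplet as a stratum. A minimal Python sketch of the corresponding conditional log-likelihood for a single transition is given below; the function name and data layout are hypothetical, and the numerical maximization and p-value calculation are not shown.

```python
import math


def conditional_log_likelihood(beta, strata):
    """Conditional log-likelihood for matched sets containing one case each.

    strata: list of (case_value, [control_values]) tuples, e.g. one case and
    its 2 matched controls. Each stratum contributes
    beta * x_case - log(sum over stratum members of exp(beta * x)).
    """
    log_lik = 0.0
    for case_x, control_xs in strata:
        denom = math.exp(beta * case_x) + sum(math.exp(beta * x)
                                              for x in control_xs)
        log_lik += beta * case_x - math.log(denom)
    return log_lik
```

At `beta = 0` every triplet contributes `-log(3)`, the value expected when the transition carries no information about which member of the triplet is the case.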
[00175] Table 7. Results of Study II
Transition  Protein  Annotation  p-value
FLQEQGHR  C08G_HUMAN  Complement component C8 gamma chain  0.014339596
FLNWIK  HABP2_HUMAN  Hyaluronan-binding protein 2  0.014790418
EKPAGGIPVLGSLVNTVLK  BPIB1_HUMAN  BPI fold-containing family B member 1  0.019027746
ITGFLKPGK  LBP_HUMAN  Lipopolysaccharide-binding protein  0.019836986
YGLVTYATYPK  CFAB_HUMAN  Complement factor B  0.019927774
SLLQPNK  C08A_HUMAN  Complement component C8 alpha chain  0.020930939
DISEVVTPR  CFAB_HUMAN  Complement factor B  0.021738046
VQEAHLTEDQIFYFPK  C08G_HUMAN  Complement component C8 gamma chain  0.021924548
SPELQAEAK  APOA2_HUMAN  Apolipoprotein A-II  0.025944285
TYLHTYESEI  ENPP2_HUMAN  Ectonucleotide pyrophosphatase/phosphodiesterase family member 2  0.026150038
DSPSVWAAVPGK  PROF1_HUMAN  Profilin-1  0.026607371
HYINLITR  NPY_HUMAN  Pro-neuropeptide Y  0.027432804
SLPVSDSVLSGFEQR  C08G_HUMAN  Complement component C8 gamma chain  0.029647857
IPGIFELGISSQSDR  C08B_HUMAN  Complement component C8 beta chain  0.030430996
IQTHSTTYR  F13B_HUMAN  Coagulation factor XIII B chain  0.031667664
DGSPDVTTADIGANTPDATK  PGRP2_HUMAN  N-acetylmuramoyl-L-alanine amidase  0.034738338
QLGLPGPPDVPDHAAYHPF  ITIH4_HUMAN  Inter-alpha-trypsin inhibitor heavy chain H4  0.043130591
FPLGSYTIQNIVAGSTYLFSTK  LCAP_HUMAN  Leucyl-cystinyl aminopeptidase  0.044698045
AHYDLR  FETUA_HUMAN  Alpha-2-HS-glycoprotein  0.046259201
SFRPFVPR  LBP_HUMAN  Lipopolysaccharide-binding protein  0.047948847
Example 4. Study III Shotgun Identification of Preterm Birth Biomarkers
[00176] A further study used a hypothesis-independent shotgun approach to identify and quantify additional biomarkers not present on our multiplexed hypothesis dependent MRM assay. Samples were processed as described in the preceding Examples unless noted below.
[00177] Tryptic digests of MARS depleted patient (preterm birth cases and term controls) samples were fractionated by two-dimensional liquid chromatography and analyzed by tandem mass spectrometry. Aliquots of the samples, equivalent to 3-4 μl of serum, were injected onto a 6 cm x 75 μm self-packed strong cation exchange (Luna SCX, Phenomenex) column. Peptides were eluted from the SCX column with salt (15, 30, 50, 70, and 100% B, where B = 250 mM ammonium acetate, 2% acetonitrile, 0.1% formic acid in water) and, consecutively for each salt elution, were bound to a 0.5 μl C18 packed stem trap (Optimize Technologies, Inc.) and further fractionated on a 10 cm x 75 μm reversed phase ProteoPep II PicoFrit column (New Objective). Peptides were eluted from the reversed phase column with an acetonitrile gradient containing 0.1% formic acid and directly ionized on an LTQ-Orbitrap (ThermoFisher). For each scan, peptide parent ion masses were obtained in the Orbitrap at 60K resolution and the top seven most abundant ions were fragmented in the LTQ to obtain peptide sequence information.
[00178] Parent and fragment ion data were used to search the Human RefSeq database using the Sequest (Eng et al., J. Am. Soc. Mass Spectrom 1994; 5:976-989) and X!Tandem (Craig and Beavis, Bioinformatics 2004; 20: 1466-1467) algorithms. For Sequest, data was searched with a 20 ppm tolerance for the parent ion and 1 AMU for the fragment ion. Two missed trypsin cleavages were allowed, and modifications included static cysteine carboxyamidomethylation and methionine oxidation. After searching the data was filtered by charge state vs. Xcorr scores (charge +1 > 1.5 Xcorr, charge +2 > 2.0, charge +3 > 2.5). Similar search parameters were used for X!tandem, except the mass tolerance for the fragment ion was 0.8 AMU and there is no Xcorr filtering. Instead, the PeptideProphet algorithm (Keller et al., Anal. Chem 2002;74:5383-5392) was used to validate each X! Tandem peptide-spectrum assignment and Protein assignments were validated using ProteinProphet algorithm (Nesvizhskii et al., Anal. Chem 2002; 74:5383-5392). Data was filtered to include only the peptide-spectrum matches that had PeptideProphet probability of 0.9 or more. After compiling peptide and protein identifications, spectral count data for each peptide were imported into DAnTE software (Polpitiya et al., Bioinformatics. 2008; 24: 1556-1558). Log transformed data was mean centered and missing values were filtered, by requiring that a peptide had to be identified in at least 4 cases and 4 controls. To determine the significance of an analyte, Receiver Operating Characteristic (ROC) curves for each analyte were created where the true positive rate (Sensitivity) is plotted as a function of the false positive rate (1 -Specificity) for different thresholds that separate the SPTB and Term groups. The area under the ROC curve (AUC) is equal to the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one. 
Peptides with AUC greater than or equal to 0.6 found uniquely by Sequest or X!Tandem are listed in Tables 8 and 9, respectively, and those identified by both approaches are listed in Table 10.
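The split of significant peptides across the three tables can be sketched as a set operation. This is an illustration only: the function name, the dictionary layout, and the placeholder peptide labels are hypothetical, and it assumes that "uniquely" means the peptide reaches the AUC cutoff in the results of one search engine but not the other.

```python
from typing import Dict, List


def partition_by_engine(
    sequest_auc: Dict[str, float],
    xtandem_auc: Dict[str, float],
    threshold: float = 0.6,
) -> Dict[str, List[str]]:
    """Split peptides with AUC >= threshold into Sequest-only, X!Tandem-only, and both."""
    sequest_hits = {pep for pep, auc in sequest_auc.items() if auc >= threshold}
    xtandem_hits = {pep for pep, auc in xtandem_auc.items() if auc >= threshold}
    return {
        "sequest_only": sorted(sequest_hits - xtandem_hits),   # Table 8 analogue
        "xtandem_only": sorted(xtandem_hits - sequest_hits),   # Table 9 analogue
        "both": sorted(sequest_hits & xtandem_hits),           # Table 10 analogue
    }


# Placeholder peptide labels and AUC values, not data from the tables.
example = partition_by_engine(
    {"PEP_A": 0.72, "PEP_B": 0.55, "PEP_C": 0.81},
    {"PEP_B": 0.58, "PEP_C": 0.64, "PEP_D": 0.60},
)
```

The cutoff is applied with `>=`, matching the "greater than or equal to 0.6" wording, so a peptide with an AUC of exactly 0.60 is retained.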
[00179] Table 8. Significant peptides (AUC ≥ 0.6) for Sequest only
Protein Description | Uniprot ID (name) | Peptide | S_AUC
alpha-1-antichymotrypsin precursor | P01011 (AACT_HUMAN) | R.EIGELYLPK.F | 0.65
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | R.CEGPIPDVTFELLR.E | 0.67
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | R.FALVR.E | 0.79
alpha-2-antiplasmin isoform a precursor | P08697 (A2AP_HUMAN) | K.SPPGVCSR.D | 0.81
alpha-2-antiplasmin isoform a precursor | P08697 (A2AP_HUMAN) | R.DSFHLDEQFTVPVEMMQAR.T | 0.69
alpha-2-HS-glycoprotein preproprotein | P02765 (FETUA_HUMAN) | K.CNLLAEK.Q | 0.67
alpha-2-HS-glycoprotein preproprotein | P02765 (FETUA_HUMAN) | K.EHAVEGDCDFQLLK.L | 0.67
alpha-2-HS-glycoprotein preproprotein | P02765 (FETUA_HUMAN) | K.HTLNQIDEVKVWPQQPSGELFEIEIDTLETTCHVLDPTPVAR.C | 0.64
alpha-2-macroglobulin precursor | P01023 (A2MG_HUMAN) | K.MVSGFIPLKPTVK.M | 0.73
alpha-2-macroglobulin precursor | P01023 (A2MG_HUMAN) | R.AFQPFFVELTM*PYSVIR.G | 0.68
alpha-2-macroglobulin precursor | P01023 (A2MG_HUMAN) | R.AFQPFFVELTMPYSVIR.G | 0.62
alpha-2-macroglobulin precursor | P01023 (A2MG_HUMAN) | R.NQGNTWLTAFVLK.T | 0.73
angiotensinogen preproprotein | P01019 (ANGT_HUMAN) | K.IDRFMQAVTGWK.T | 0.81
angiotensinogen preproprotein | P01019 (ANGT_HUMAN) | K.LDTEDKLR.A | 0.72
angiotensinogen preproprotein | P01019 (ANGT_HUMAN) | K.TGCSLMGASVDSTLAFNTYVHFQGK.M | 0.64
angiotensinogen preproprotein | P01019 (ANGT_HUMAN) | R.AAMVGMLANFLGFR.I | 0.62
antithrombin-III precursor | P01008 (ANT3_HUMAN) | K.NDNDNIFLSPLSISTAFAMTK.L | 0.64
antithrombin-III precursor | P01008 (ANT3_HUMAN) | K.SKLPGIVAEGRDDLYVSDAFHK.A | 0.81
antithrombin-III precursor | P01008 (ANT3_HUMAN) | R.EVPLNTIIFMGR.V | 0.61
antithrombin-III precursor | P01008 (ANT3_HUMAN) | R.FATTFYQHLADSKNDNDNIFLSPLSISTAFAMTK.L | 0.66
antithrombin-III precursor | P01008 (ANT3_HUMAN) | R.ITDVIPSEAINELTVLVLVNTIYFK.G | 0.60
antithrombin-III precursor | P01008 (ANT3_HUMAN) | R.RVWELSK.A | 0.63
antithrombin-III precursor | P01008 (ANT3_HUMAN) | R.VAEGTQVLELPFKGDDITM*VLILPKPEK.S | 0.62
antithrombin-III precursor | P01008 (ANT3_HUMAN) | R.VAEGTQVLELPFKGDDITMVLILPKPEK.S | 0.62
apolipoprotein A-II preproprotein | P02652 (APOA2_HUMAN) | K.AGTELVNFLSYFVELGTQPATQ.- | 0.61
apolipoprotein A-II preproprotein | P02652 (APOA2_HUMAN) | K.EPCVESLVSQYFQTVTDYGK.D | 0.63
apolipoprotein A-IV precursor | P06727 (APOA4_HUMAN) | K.ALVQQMEQLR.Q | 0.61
apolipoprotein A-IV precursor | P06727 (APOA4_HUMAN) | K.LGPHAGDVEGHLSFLEK.D | 0.61
apolipoprotein A-IV precursor | P06727 (APOA4_HUMAN) | K.SELTQQLNALFQDK.L | 0.71
apolipoprotein A-IV precursor | P06727 (APOA4_HUMAN) | K.SLAELGGHLDQQVEEFRR.R | 0.61
apolipoprotein A-IV precursor | P06727 (APOA4_HUMAN) | K.VKIDQTVEELRR.S | 0.75
apolipoprotein A-IV precursor | P06727 (APOA4_HUMAN) | K.VNSFFSTFK.E | 0.63
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.ATFQTPDFIVPLTDLR.I | 0.65
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.AVSM*PSFSILGSDVR.V | 0.65
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.AVSMPSFSILGSDVR.V | 0.67
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.EQHLFLPFSYK.N | 0.65
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.KIISDYHQQFR.Y | 0.63
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.QVFLYPEKDEPTYILNIK.R | 0.64
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.SPAFTDLHLR.Y | 0.69
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.TILGTMPAFEVSLQALQK.A | 0.62
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.VLADKFIIPGLK.L | 0.72
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.YSQPEDSLIPFFEITVPESQLTVSQFTLPK.S | 0.61
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | R.DLKVEDIPLAR.I | 0.64
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | R.GIISALLVPPETEEAK.Q | 0.81
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | R.ILGEELGFASLHDLQLLGK.L | 0.62
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | R.LELELRPTGEIEQYSVSATYELQR.E | 0.60
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | R.NIQEYLSILTDPDGK.G | 0.68
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | R.TFQIPGYTVPVVNVEVSPFTIEMSAFGYVFPK.A | 0.75
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | R.TIDQMLNSELQWPVPDIYLR.D | 0.70
apolipoprotein C-I precursor | P02654 (APOC1_HUMAN) | K.MREWFSETFQK.V | 0.61
apolipoprotein C-II precursor | P02655 (APOC2_HUMAN) | K.STAAMSTYTGIFTDQVLSVLKGEE.- | 0.61
apolipoprotein C-III precursor | P02656 (APOC3_HUMAN) | R.GWVTDGFSSLK.D | 0.62
apolipoprotein E precursor | P02649 (APOE_HUMAN) | R.AATVGSLAGQPLQER.A | 0.61
apolipoprotein E precursor | P02649 (APOE_HUMAN) | R.LKSWFEPLVEDMQR.Q | 0.65
apolipoprotein E precursor | P02649 (APOE_HUMAN) | R.WVQTLSEQVQEELLSSQVTQELR.A | 0.64
ATP-binding cassette sub-family D member 4 | O14678 (ABCD4_HUMAN) | K.LCGGGRWELM*R.I | 0.60
ATP-binding cassette sub-family F member 3 | Q9NUQ8 (ABCF3_HUMAN) | K.LPGLLK.R | 0.73
beta-2-glycoprotein 1 precursor | P02749 (APOH_HUMAN) | K.EHSSLAFWK.T | 0.64
beta-2-glycoprotein 1 precursor | P02749 (APOH_HUMAN) | R.TCPKPDDLPFSTVVPLK.T | 0.60
beta-2-glycoprotein 1 precursor | P02749 (APOH_HUMAN) | R.VCPFAGILENGAVR.Y | 0.68
beta-Ala-His dipeptidase precursor | Q96KN2 (CNDP1_HUMAN) | K.LFAAFFLEMAQLH.- | 0.68
biotinidase precursor | P43251 (BTD_HUMAN) | K.SHLIIAQVAK.N | 0.62
carboxypeptidase B2 preproprotein | Q96IY4 (CBPB2_HUMAN) | K.NAIWIDCGIHAR.E | 0.62
carboxypeptidase N catalytic chain precursor | P15169 (CBPN_HUMAN) | R.EALIQFLEQVHQGIK.G | 0.69
carboxypeptidase N subunit 2 precursor | P22792 (CPN2_HUMAN) | R.LLNIQTYCAGPAYLK.G | 0.62
catalase | P04040 (CATA_HUMAN) | R.LCENIAGHLKDAQIFIQK.K | 0.62
ceruloplasmin precursor | P00450 (CERU_HUMAN) | K.AETGDKVYVHLK.N | 0.61
ceruloplasmin precursor | P00450 (CERU_HUMAN) | K.AGLQAFFQVQECNK.S | 0.62
ceruloplasmin precursor | P00450 (CERU_HUMAN) | K.DIASGLIGPLIICK.K | 0.63
ceruloplasmin precursor | P00450 (CERU_HUMAN) | K.DIFTGLIGPM*K.I | 0.63
ceruloplasmin precursor | P00450 (CERU_HUMAN) | K.DIFTGLIGPMK.I | 0.68
ceruloplasmin precursor | P00450 (CERU_HUMAN) | K.M*YYSAVDPTKDIFTGLIGPMK.I | 0.62
ceruloplasmin precursor | P00450 (CERU_HUMAN) | K.MYYSAVDPTKDIFTGLIGPM*K.I | 0.63
ceruloplasmin precursor | P00450 (CERU_HUMAN) | K.PVWLGFLGPIIK.A | 0.63
ceruloplasmin precursor | P00450 (CERU_HUMAN) | R.ADDKVYPGEQYTYMLLATEEQSPGEGDGNCVTR.I | 0.64
ceruloplasmin precursor | P00450 (CERU_HUMAN) | R.DTANLFPQTSLTLHM*WPDTEGTFNVECLTTDHYTGGMK.Q | 0.71
ceruloplasmin precursor | P00450 (CERU_HUMAN) | R.DTANLFPQTSLTLHMWPDTEGTFNVECLTTDHYTGGMK.Q | 0.68
ceruloplasmin precursor | P00450 (CERU_HUMAN) | R.FNKNNEGTYYSPNYNPQSR.S | 0.74
ceruloplasmin precursor | P00450 (CERU_HUMAN) | R.IDTINLFPATLFDAYM*VAQNPGEWM*LSCQNLNHLK.A | 0.75
ceruloplasmin precursor | P00450 (CERU_HUMAN) | R.IDTINLFPATLFDAYM*VAQNPGEWMLSCQNLNHLK.A | 0.86
ceruloplasmin precursor | P00450 (CERU_HUMAN) | R.IDTINLFPATLFDAYMVAQNPGEWM*LSCQNLNHLK.A | 0.60
ceruloplasmin precursor | P00450 (CERU_HUMAN) | R.KAEEEHLGILGPQLHADVGDKVK.I | 0.71
ceruloplasmin precursor | P00450 (CERU_HUMAN) | R.TTIEKPVWLGFLGPIIK.A | 0.63
cholinesterase precursor | P06276 (CHLE_HUMAN) | R.FWTSFFPK.V | 0.76
clusterin preproprotein | P10909 (CLUS_HUMAN) | K.LFDSDPITVTVPVEVSR.K | 0.78
clusterin preproprotein | P10909 (CLUS_HUMAN) | R.ASSIIDELFQDR.F | 0.68
coagulation factor IX preproprotein | P00740 (FA9_HUMAN) | K.WIVTAAHCVETGVK.I | 0.60
coagulation factor VII isoform a preproprotein | P08709 (FA7_HUMAN) | R.FSLVSGWGQLLDR.G | 0.78
coagulation factor X preproprotein | P00742 (FA10_HUMAN) | K.ETYDFDIAVLR.L | 0.75
coiled-coil domain-containing protein 13 | Q8IYE1 (CCD13_HUMAN) | K.VRQLEMEIGQLNVHYLR.N | 0.67
complement C1q subcomponent subunit A precursor | P02745 (C1QA_HUMAN) | R.PAFSAIR.R | 0.66
complement C1q subcomponent subunit B precursor | P02746 (C1QB_HUMAN) | K.VVTFCDYAYNTFQVTTGGMVLK.L | 0.63
complement C1q subcomponent subunit C precursor | P02747 (C1QC_HUMAN) | K.FQSVFTVTR.Q | 0.63
complement C1r subcomponent precursor | P00736 (C1R_HUMAN) | K.TLDEFTIIQNLQPQYQFR.D | 0.62
complement C1r subcomponent precursor | P00736 (C1R_HUMAN) | R.MDVFSQNMFCAGHPSLK.Q | 0.68
complement C1r subcomponent precursor | P00736 (C1R_HUMAN) | R.WILTAAHTLYPK.E | 0.74
complement C1s subcomponent precursor | P09871 (C1S_HUMAN) | K.FYAAGLVSWGPQCGTYGLYTR.V | 0.68
complement C1s subcomponent precursor | P09871 (C1S_HUMAN) | K.GFQVVVTLR.R | 0.63
complement C2 isoform 3 | P06681 (CO2_HUMAN) | R.GALISDQWVLTAAHCFR.D | 0.61
complement C2 isoform 3 | P06681 (CO2_HUMAN) | R.PICLPCTMEANLALR.R | 0.66
complement C3 precursor | P01024 (CO3_HUMAN) | R.YYGGGYGSTQATFMVFQALAQYQK.D | 0.75
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | K.GLCVATPVQLR.V | 0.74
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | K.M*RPSTDTITVM*VENSHGLR.V | 0.83
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | K.MRPSTDTITVM*VENSHGLR.V | 0.72
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | K.VGLSGM*AIADVTLLSGFHALR.A | 0.71
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | K.VLSLAQEQVGGSPEK.L | 0.63
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.EMSGSPASGIPVK.V | 0.65
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.GCGEQTM*IYLAPTLAASR.Y | 0.75
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.GLQDEDGYR.M | 0.75
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.GQIVFMNREPK.R | 0.93
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.KKEVYM*PSSIFQDDFVIPDISEPGTWK.I | 0.72
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.LPMSVR.R | 0.78
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.LTVAAPPSGGPGFLSIER.P | 0.84
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.NFLVR.A | 0.75
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.NGESVKLHLETDSLALVALGALDTALYAAGSK.S | 0.88
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.QGSFQGGFR.S | 0.60
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.TLEIPGNSDPNMIPDGDFNSYVR.V | 0.69
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.VTASDPLDTLGSEGALSPGGVASLLR.L | 0.63
complement C4-A isoform 1 | P0C0L4 (CO4A_HUMAN) | R.YLDKTEQWSTLPPETK.D | 0.67
complement C5 preproprotein | P01031 (CO5_HUMAN) | K.ADNFLLENTLPAQSTFTLAISAYALSLGDK.T | 0.63
complement C5 preproprotein | P01031 (CO5_HUMAN) | K.ALVEGVDQLFTDYQIK.D | 0.63
complement C5 preproprotein | P01031 (CO5_HUMAN) | K.DGHVILQLNSIPSSDFLCVR.F | 0.62
complement C5 preproprotein | P01031 (CO5_HUMAN) | K.DVFLEMNIPYSVVR.G | 0.63
complement C5 preproprotein | P01031 (CO5_HUMAN) | K.EFPYRIPLDLVPK.T | 0.60
complement C5 preproprotein | P01031 (CO5_HUMAN) | K.FQNSAILTIQPK.Q | 0.67
complement C5 preproprotein | P01031 (CO5_HUMAN) | K.VFKDVFLEMNIPYSVVR.G | 0.63
complement C5 preproprotein | P01031 (CO5_HUMAN) | R.VFQFLEK.S | 0.61
complement component C6 precursor | P13671 (CO6_HUMAN) | K.DLHLSDVFLK.A | 0.60
complement component C6 precursor | P13671 (CO6_HUMAN) | R.TECIKPVVQEVLTITPFQR.L | 0.62
complement component C7 precursor | P10643 (CO7_HUMAN) | K.SSGWHFVVK.F | 0.61
complement component C7 precursor | P10643 (CO7_HUMAN) | R.ILPLTVCK.M | 0.75
complement component C8 alpha chain precursor | P07357 (CO8A_HUMAN) | R.ALDQYLMEFNACR.C | 0.65
complement component C8 gamma chain precursor | P07360 (CO8G_HUMAN) | K.YGFCEAADQFHVLDEVR.R | 0.60
complement component C9 precursor | P02748 (CO9_HUMAN) | R.AIEDYINEFSVRK.C | 0.69
complement component C9 precursor | P02748 (CO9_HUMAN) | R.TAGYGINILGMDPLSTPFDNEFYNGLCNR.D | 0.69
complement factor B preproprotein | P00751 (CFAB_HUMAN) | K.ALFVSEEEKK.L | 0.64
complement factor B preproprotein | P00751 (CFAB_HUMAN) | K.CLVNLIEK.V | 0.70
complement factor B preproprotein | P00751 (CFAB_HUMAN) | K.EAGIPEFYDYDVALIK.L | 0.66
complement factor B preproprotein | P00751 (CFAB_HUMAN) | K.VSEADSSNADWVTK.Q | 0.73
complement factor B preproprotein | P00751 (CFAB_HUMAN) | K.YGQTIRPICLPCTEGTTR.A | 0.67
complement factor B preproprotein | P00751 (CFAB_HUMAN) | R.DLEIEVVLFHPNYNINGK.K | 0.71
complement factor B preproprotein | P00751 (CFAB_HUMAN) | R.FLCTGGVSPYADPNTCR.G | 0.64
complement factor H isoform a precursor | P08603 (CFAH_HUMAN) | K.DGWSAQPTCIK.S | 0.80
complement factor H isoform a precursor | P08603 (CFAH_HUMAN) | K.EGWIHTVCINGR.W | 0.67
complement factor H isoform a precursor | P08603 (CFAH_HUMAN) | K.TDCLSLPSFENAIPMGEK.K | 0.61
complement factor H isoform a precursor | P08603 (CFAH_HUMAN) | R.DTSCVNPPTVQNAYIVSR.Q | 0.60
complement factor H isoform b precursor | P08603 (CFAH_HUMAN) | K.CTSTGWIPAPR.C | 0.68
complement factor H isoform b precursor | P08603 (CFAH_HUMAN) | K.IIYKENER.F | 0.76
complement factor H isoform b precursor | P08603 (CFAH_HUMAN) | K.IVSSAM*EPDREYHFGQAVR.F | 0.75
complement factor H isoform b precursor | P08603 (CFAH_HUMAN) | K.IVSSAMEPDREYHFGQAVR.F | 0.68
complement factor H isoform b precursor | P08603 (CFAH_HUMAN) | R.CTLKPCDYPDIK.H | 0.81
complement factor H isoform b precursor | P08603 (CFAH_HUMAN) | R.KGEWVALNPLR.K | 0.60
complement factor H isoform b precursor | P08603 (CFAH_HUMAN) | R.KGEWVALNPLRK.C | 0.69
complement factor H isoform b precursor | P08603 (CFAH_HUMAN) | R.RPYFPVAVGK.Y | 0.68
complement factor H-related protein 1 precursor | Q03591 (FHR1_HUMAN) | R.EIMENYNIALR.W | 0.64
complement factor I preproprotein | P05156 (CFAI_HUMAN) | K.DASGITCGGIYIGGCWILTAAHCLR.A | 0.71
complement factor I preproprotein | P05156 (CFAI_HUMAN) | K.VANYFDWISYHVGR.P | 0.72
complement factor I preproprotein | P05156 (CFAI_HUMAN) | R.IIFHENYNAGTYQNDIALIEMK.K | 0.63
complement factor I preproprotein | P05156 (CFAI_HUMAN) | R.YQIWTTVVDWIHPDLK.R | 0.63
conserved oligomeric Golgi complex subunit 6 isoform | Q9Y2V7 (COG6_HUMAN) | K.ISNLLK.F | 0.65
corticosteroid-binding globulin precursor | P08185 (CBG_HUMAN) | R.WSAGLTSSQVDLYIPK.V | 0.62
C-reactive protein precursor | P02741 (CRP_HUMAN) | K.YEVQGEVFTKPQLWP.- | 0.60
dopamine beta-hydroxylase precursor | P09172 (DOPO_HUMAN) | R.HVLAAWALGAK.A | 0.88
double-stranded RNA-specific editase B2 | Q9NS39 (RED2_HUMAN) | R.AGLRYVCLAEPAER.R | 0.75
dual oxidase 2 precursor | Q9NRD8 (DUOX2_HUMAN) | R.FTQLCVKGGGGGGNGIR.D | 0.65
FERM domain-containing protein 8 | Q9BZ67 (FRMD8_HUMAN) | R.VQLGPYQPGRPAACDLR.E | 0.65
fetuin-B precursor | Q9UGM5 (FETUB_HUMAN) | R.GGLGSLFYLTLDVLETDCHVLR.K | 0.83
ficolin-3 isoform 1 precursor | O75636 (FCN3_HUMAN) | R.ELLSQGATLSGWYHLCLPEGR.A | 0.69
gastric intrinsic factor precursor | P27352 (IF_HUMAN) | K.KTTDM*ILNEIKQGK.F | 0.60
gelsolin isoform d | P06396 (GELS_HUMAN) | K.NWRDPDQTDGLGLSYLSSHIANVER.V | 0.72
gelsolin isoform d | P06396 (GELS_HUMAN) | K.TPSAAYLWVGTGASEAEK.T | 0.80
gelsolin isoform d | P06396 (GELS_HUMAN) | R.VEKFDLVPVPTNLYGDFFTGDAYVILK.T | 0.60
gelsolin isoform d | P06396 (GELS_HUMAN) | R.VPFDAATLHTSTAMAAQHGMDDDGTGQK.Q | 0.67
glutathione peroxidase 3 precursor | P22352 (GPX3_HUMAN) | K.FYTFLK.N | 0.63
hemopexin precursor | P02790 (HEMO_HUMAN) | K.GDKVWVYPPEKK.E | 0.65
hemopexin precursor | P02790 (HEMO_HUMAN) | K.LLQDEFPGIPSPLDAAVECHR.G | 0.71
hemopexin precursor | P02790 (HEMO_HUMAN) | K.SGAQATWTELPWPHEK.V | 0.64
hemopexin precursor | P02790 (HEMO_HUMAN) | K.SGAQATWTELPWPHEKVDGALCMEK.S | 0.61
hemopexin precursor | P02790 (HEMO_HUMAN) | K.VDGALCMEK.S | 0.66
hemopexin precursor | P02790 (HEMO_HUMAN) | R.DYFMPCPGR.G | 0.68
hemopexin precursor | P02790 (HEMO_HUMAN) | R.EWFWDLATGTM*K.E | 0.64
hemopexin precursor | P02790 (HEMO_HUMAN) | R.QGHNSVFLIK.G | 0.71
heparin cofactor 2 precursor | P05546 (HEP2_HUMAN) | K.HQGTITVNEEGTQATTVTTVGFMPLSTQVR.F | 0.60
heparin cofactor 2 precursor | P05546 (HEP2_HUMAN) | K.YEITTIHNLFR.K | 0.62
heparin cofactor 2 precursor | P05546 (HEP2_HUMAN) | R.LNILNAK.F | 0.68
heparin cofactor 2 precursor | P05546 (HEP2_HUMAN) | R.NFGYTLR.S | 0.64
heparin cofactor 2 precursor | P05546 (HEP2_HUMAN) | R.VLKDQVNTFDNIFIAPVGISTAMGM*ISLGLK.G | 0.63
hepatocyte cell adhesion molecule precursor | Q14CZ8 (HECAM_HUMAN) | K.PLLNDSRMLLSPDQK.V | 0.61
hepatocyte growth factor activator preproprotein | Q04756 (HGFA_HUMAN) | R.VQLSPDLLATLPEPASPGR.Q | 0.82
histidine-rich glycoprotein precursor | P04196 (HRG_HUMAN) | R.DGYLFQLLR.I | 0.63
hyaluronan-binding protein 2 isoform 1 preproprotein | Q14520 (HABP2_HUMAN) | K.FLNWIK.A | 0.82
hyaluronan-binding protein 2 isoform 1 preproprotein | Q14520 (HABP2_HUMAN) | K.LKPVDGHCALESK.Y | 0.61
hyaluronan-binding protein 2 isoform 1 preproprotein | Q14520 (HABP2_HUMAN) | K.RPGVYTQVTK.F | 0.74
inactive caspase-12 | Q6UXS9 (CASPC_HUMAN) | K.AGADTHGRLLQGNICNDAVTK.A | 0.74
insulin-degrading enzyme isoform 1 | P14735 (IDE_HUMAN) | K.KIIEKM*ATFEIDEK.R | 0.85
insulin-like growth factor-binding protein complex acid labile subunit isoform 2 precursor | P35858 (ALS_HUMAN) | R.SFEGLGQLEVLTLDHNQLQEVK.A | 0.62
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor | P19827 (ITIH1_HUMAN) | K.ELAAQTIKK.S | 0.81
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor | P19827 (ITIH1_HUMAN) | K.GSLVQASEANLQAAQDFVR.G | 0.71
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor | P19827 (ITIH1_HUMAN) | K.QLVHHFEIDVDIFEPQGISK.L | 0.70
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor | P19827 (ITIH1_HUMAN) | K.QYYEGSEIVVAGR.I | 0.83
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor | P19827 (ITIH1_HUMAN) | R.EVAFDLEIPKTAFISDFAVTADGNAFIGDIK.D | 0.70
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor | P19827 (ITIH1_HUMAN) | R.GMADQDGLKPTIDKPSEDSPPLEM*LGPR.R | 0.63
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor | P19827 (ITIH1_HUMAN) | R.GMADQDGLKPTIDKPSEDSPPLEMLGPR.R | 0.60
inter-alpha-trypsin inhibitor heavy chain H2 precursor | P19823 (ITIH2_HUMAN) | K.FDPAKLDQIESVITATSANTQLVLETLAQM*DDLQDFLSK.D | 0.80
inter-alpha-trypsin inhibitor heavy chain H2 precursor | P19823 (ITIH2_HUMAN) | K.KFYNQVSTPLLR.N | 0.76
inter-alpha-trypsin inhibitor heavy chain H2 precursor | P19823 (ITIH2_HUMAN) | K.NILFVIDVSGSM*WGVK.M | 0.68
inter-alpha-trypsin inhibitor heavy chain H2 precursor | P19823 (ITIH2_HUMAN) | K.NILFVIDVSGSMWGVK.M | 0.62
inter-alpha-trypsin inhibitor heavy chain H2 precursor | P19823 (ITIH2_HUMAN) | R.KLGSYEHR.I | 0.72
inter-alpha-trypsin inhibitor heavy chain H2 precursor | P19823 (ITIH2_HUMAN) | R.LSNENHGIAQR.I | 0.66
inter-alpha-trypsin inhibitor heavy chain H2 precursor | P19823 (ITIH2_HUMAN) | R.MATTMIQSK.V | 0.60
inter-alpha-trypsin inhibitor heavy chain H2 precursor | P19823 (ITIH2_HUMAN) | R.SILQM*SLDHHIVTPLTSLVIENEAGDER.M | 0.63
inter-alpha-trypsin inhibitor heavy chain H2 precursor | P19823 (ITIH2_HUMAN) | R.SILQMSLDHHIVTPLTSLVIENEAGDER.M | 0.65
inter-alpha-trypsin inhibitor heavy chain H2 precursor | P19823 (ITIH2_HUMAN) | R.TEVNVLPGAK.V | 0.69
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor | Q14624 (ITIH4_HUMAN) | K.NVVFVIDK.S | 0.68
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor | Q14624 (ITIH4_HUMAN) | K.WKETLFSVMPGLK.M | 0.65
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor | Q14624 (ITIH4_HUMAN) | K.YIFHNFM*ER.L | 0.67
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor | Q14624 (ITIH4_HUMAN) | R.FAHTVVTSR.V | 0.63
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor | Q14624 (ITIH4_HUMAN) | R.FKPTLSQQQK.S | 0.60
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor | Q14624 (ITIH4_HUMAN) | R.IHEDSDSALQLQDFYQEVANPLLTAVTFEYPSNAVEEVTQNNFR.L | 0.64
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor | Q14624 (ITIH4_HUMAN) | R.MNFRPGVLSSR.Q | 0.63
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor | Q14624 (ITIH4_HUMAN) | R.NVHSAGAAGSR.M | 0.62
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor | Q14624 (ITIH4_HUMAN) | R.NVHSGSTFFK.Y | 0.75
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor | Q14624 (ITIH4_HUMAN) | R.RLGVYELLLK.V | 0.66
kallistatin precursor | P29622 (KAIN_HUMAN) | K.KLELHLPK.F | 0.78
kallistatin precursor | P29622 (KAIN_HUMAN) | R.EIEEVLTPEMLMR.W | 0.60
kininogen-1 isoform 2 precursor | P01042 (KNG1_HUMAN) | K.AATGECTATVGKR.S | 0.67
kininogen-1 isoform 2 precursor | P01042 (KNG1_HUMAN) | K.LGQSLDCNAEVYVVPWEK.K | 0.72
kininogen-1 isoform 2 precursor | P01042 (KNG1_HUMAN) | K.YNSQNQSNNQFVLYR.I | 0.62
kininogen-1 isoform 2 precursor | P01042 (KNG1_HUMAN) | R.QVVAGLNFR.I | 0.64
leucine-rich alpha-2-glycoprotein precursor | P02750 (A2GL_HUMAN) | K.DLLLPQPDLR.Y | 0.64
leucine-rich alpha-2-glycoprotein precursor | P02750 (A2GL_HUMAN) | R.LHLEGNKLQVLGK.D | 0.76
leucine-rich alpha-2-glycoprotein precursor | P02750 (A2GL_HUMAN) | R.TLDLGENQLETLPPDLLR.G | 0.61
lipopolysaccharide-binding protein precursor | P18428 (LBP_HUMAN) | K.GLQYAAQEGLLALQSELLR.I | 0.82
lipopolysaccharide-binding protein precursor | P18428 (LBP_HUMAN) | K.LAEGFPLPLLK.R | 0.66
lumican precursor | P51884 (LUM_HUMAN) | K.SLEYLDLSFNQIAR.L | 0.65
lumican precursor | P51884 (LUM_HUMAN) | R.LKEDAVSAAFK.G | 0.74
m7GpppX diphosphatase | Q96C86 (DCPS_HUMAN) | R.IVFENPDPSDGFVLIPDLK.W | 0.62
matrix metalloproteinase-19 isoform 1 preproprotein | Q99542 (MMP19_HUMAN) | R.VYFFK.G | 0.63
MBT domain-containing protein 1 | Q05BQ5 (MBTD1_HUMAN) | K.WFDYLR.E | 0.65
monocyte differentiation antigen CD14 precursor | P08571 (CD14_HUMAN) | R.LTVGAAQVPAQLLVGALR.V | 0.66
pappalysin-1 preproprotein | Q13219 (PAPP1_HUMAN) | R.VSFSSPLVAISGVALR.S | 0.66
phosphatidylinositol-glycan-specific phospholipase D precursor | P80108 (PHLD_HUMAN) | K.GIVAAFYSGPSLSDKEK.L | 0.71
phosphatidylinositol-glycan-specific phospholipase D precursor | P80108 (PHLD_HUMAN) | R.WYVPVKDLLGIYEK.L | 0.71
pigment epithelium-derived factor precursor | P36955 (PEDF_HUMAN) | K.LQSLFDSPDFSK.I | 0.61
pigment epithelium-derived factor precursor | P36955 (PEDF_HUMAN) | R.ALYYDLISSPDIHGTYK.E | 0.72
plasma kallikrein preproprotein | P03952 (KLKB1_HUMAN) | R.CLLFSFLPASSINDMEKR.F | 0.60
plasma protease C1 inhibitor precursor | P05155 (IC1_HUMAN) | K.FQPTLLTLPR.I | 0.70
plasma protease C1 inhibitor precursor | P05155 (IC1_HUMAN) | K.GVTSVSQIFHSPDLAIR.D | 0.66
plasminogen isoform 1 precursor | P00747 (PLMN_HUMAN) | K.VIPACLPSPNYVVADR.T | 0.63
plasminogen isoform 1 precursor | P00747 (PLMN_HUMAN) | R.FVTWIEGVMR.N | 0.60
plasminogen isoform 1 precursor | P00747 (PLMN_HUMAN) | R.HSIFTPETNPR.A | 0.63
platelet basic protein preproprotein | P02775 (CXCL7_HUMAN) | K.GKEESLDSDLYAELR.C | 0.70
platelet glycoprotein V precursor | P40197 (GPV_HUMAN) | K.MVLLEQLFLDHNALR.G | 0.66
platelet glycoprotein V precursor | P40197 (GPV_HUMAN) | R.LVSLDSGLLNSLGALTELQFHR.N | 0.88
pregnancy zone protein precursor | P20742 (PZP_HUMAN) | K.ALLAYAFSLLGK.Q | 0.66
pregnancy zone protein precursor | P20742 (PZP_HUMAN) | K.DLFHCVSFTLPR.I | 0.86
pregnancy zone protein precursor | P20742 (PZP_HUMAN) | K.MLQITNTGFEMK.L | 0.84
pregnancy zone protein precursor | P20742 (PZP_HUMAN) | R.NELIPLIYLENPRR.N | 0.65
pregnancy zone protein precursor | P20742 (PZP_HUMAN) | R.SYIFIDEAHITQSLTWLSQMQK.D | 0.68
pregnancy-specific beta-1-glycoprotein 2 precursor | P11465 (PSG2_HUMAN) | R.SDPVTLNLLHGPDLPR.I | 0.66
pregnancy-specific beta-1-glycoprotein 3 precursor | Q16557 (PSG3_HUMAN) | R.TLFLFGVTK.Y | 0.62
pregnancy-specific beta-1-glycoprotein 5 precursor | Q15238 (PSG5_HUMAN) | R.ILILPSVTR.N | 0.76
pregnancy-specific beta-1-glycoprotein 6 isoform a | Q00889 (PSG6_HUMAN) | R.SDPVTLNLLPK.L | 0.63
progesterone-induced-blocking factor 1 | Q8WXW3 (PIBF1_HUMAN) | R.VLQLEK.Q | 0.71
protein AMBP preproprotein | P02760 (AMBP_HUMAN) | R.VVAQGVGIPEDSIFTMADR.G | 0.60
protein CBFA2T2 isoform MTGR1b | O43439 (MTG8R_HUMAN) | R.LTEREWADEWKHLDHALNCIMEMVEK.T | 0.70
protein FAM98C | Q17RN3 (FA98C_HUMAN) | R.ALCGGDGAAALREPGAGLR.L | 0.75
protein NLRC3 | Q7RTR2 (NLRC3_HUMAN) | K.ALM*DLLAGKGSQGSQAPQALDR.T | 0.92
protein Z-dependent protease inhibitor precursor | Q9UK55 (ZPI_HUMAN) | K.MGDHLALEDYLTTDLVETWLR.N | 0.60
prothrombin preproprotein | P00734 (THRB_HUMAN) | K.SPQELLCGASLISDR.W | 0.84
prothrombin preproprotein | P00734 (THRB_HUMAN) | R.LAVTTHGLPCLAWASAQAK.A | 0.62
prothrombin preproprotein | P00734 (THRB_HUMAN) | R.SEGSSVNLSPPLEQCVPDR.G | 0.70
prothrombin preproprotein | P00734 (THRB_HUMAN) | R.SGIECQLWR.S | 0.68
prothrombin preproprotein | P00734 (THRB_HUMAN) | R.TATSEYQTFFNPR.T | 0.60
prothrombin preproprotein | P00734 (THRB_HUMAN) | R.VTGWGNLKETWTANVGK.G | 0.69
putative hydroxypyruvate isomerase isoform 1 | Q5T013 (HYI_HUMAN) | R.IHLM*AGR.V | 0.69
putative hydroxypyruvate isomerase isoform 1 | Q5T013 (HYI_HUMAN) | R.IHLMAGR.V | 0.66
ras-like protein family member 10A precursor | Q92737 (RSLAA_HUMAN) | R.PAHPALR.L | 0.71
ras-related GTP-binding protein A | Q7L523 (RRAGA_HUMAN) | K.ISNIIK.Q | 0.82
retinol-binding protein 4 precursor | P02753 (RET4_HUMAN) | K.M*KYWGVASFLQK.G | 0.73
retinol-binding protein 4 precursor | P02753 (RET4_HUMAN) | R.FSGTWYAM*AK.K | 0.63
retinol-binding protein 4 precursor | P02753 (RET4_HUMAN) | R.LLNLDGTCADSYSFVFSR.D | 0.79
retinol-binding protein 4 precursor | P02753 (RET4_HUMAN) | R.LLNNWDVCADMVGTFTDTEDPAKFK.M | 0.77
sex hormone-binding globulin isoform 1 precursor | P04278 (SHBG_HUMAN) | R.LFLGALPGEDSSTSFCLNGLWAQGQR.L | 0.66
sex hormone-binding globulin isoform 4 precursor | P04278 (SHBG_HUMAN) | K.DDWFMLGLR.D | 0.60
sex hormone-binding globulin isoform 4 precursor | P04278 (SHBG_HUMAN) | R.SCDVESNPGIFLPPGTQAEFNLR.G | 0.64
sex hormone-binding globulin isoform 4 precursor | P04278 (SHBG_HUMAN) | R.TWDPEGVIFYGDTNPKDDWFM*LGLR.D | 0.65
sex hormone-binding globulin isoform 4 precursor | P04278 (SHBG_HUMAN) | R.TWDPEGVIFYGDTNPKDDWFMLGLR.D | 0.66
signal transducer and activator of transcription 2 | P52630 (STAT2_HUMAN) | R.KFCRDIQDPTQLAEMIFNLLLEEK.R | 0.73
spectrin beta chain, non-erythrocytic 1 | Q13813 (SPTN1_HUMAN) | R.NELIRQEKLEQLAR.R | 0.60
stabilin-1 precursor | Q9NY15 (STAB1_HUMAN) | R.KNLSER.W | 0.88
succinate-semialdehyde dehydrogenase, mitochondrial | P51649 (SSDH_HUMAN) | R.KWYNLMIQNK.D | 0.88
tetranectin precursor | P05452 (TETN_HUMAN) | K.SRLDTLAQEVALLK.E | 0.75
THAP domain-containing protein 6 | Q8TBB0 (THAP6_HUMAN) | K.RLDVNAAGIWEPKK.G | 0.69
thyroxine-binding globulin precursor | P05543 (THBG_HUMAN) | R.SILFLGK.V | 0.79
tripartite motif-containing protein 5 | Q9C035 (TRIM5_HUMAN) | R.ELISDLEHRLQGSVM*ELLQGVDGVIK.R | 0.60
vitamin D-binding protein isoform 1 precursor | P02774 (VTDB_HUMAN) | K.EDFTSLSLVLYSR.K | 0.66
vitamin D-binding protein isoform 1 precursor | P02774 (VTDB_HUMAN) | K.ELSSFIDKGQELCADYSENTFTEYK.K | 0.67
vitamin D-binding protein isoform 1 precursor | P02774 (VTDB_HUMAN) | K.ELSSFIDKGQELCADYSENTFTEYKK.K | 0.66
vitamin D-binding protein isoform 1 precursor | P02774 (VTDB_HUMAN) | K.EVVSLTEACCAEGADPDCYDT .T | 0.65
vitamin D-binding protein isoform 1 precursor | P02774 (VTDB_HUMAN) | K.TAMDVFVCTYFMPAAQLPELPDVELPTNKDVCDPGNTK.V | 0.84
vitamin D-binding protein isoform 1 precursor | P02774 (VTDB_HUMAN) | R.RTHLPEVFLSK.V | 0.69
vitamin D-binding protein isoform 1 precursor | P02774 (VTDB_HUMAN) | R.VCSQYAAYGEK.K | 0.66
vitronectin precursor | P04004 (VTNC_HUMAN) | K.LIRDVWGIEGPIDAAFTR.I | 0.61
vitronectin precursor | P04004 (VTNC_HUMAN) | R.DVWGIEGPIDAAFTR.I | 0.63
vitronectin precursor | P04004 (VTNC_HUMAN) | R.ERVYFFK.G | 0.81
vitronectin precursor | P04004 (VTNC_HUMAN) | R.FEDGVLDPDYPR.N | 0.64
vitronectin precursor | P04004 (VTNC_HUMAN) | R.IYISGM*APRPSLAK.K | 0.75
zinc finger protein 142 | P52746 (ZN142_HUMAN) | K.TRFLLR.T | 0.66
[00180] Table 9. Significant peptides (AUC ≥ 0.6) for X!Tandem only
Protein description | Uniprot ID (name) | Peptide | XT_AUC
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | K.NGVAQEPVHLDSPAIK.H | 0.63
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | K.SLPAPWLSM*APVSWITPGLK.T | 0.72
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | K.VTLTCVAPLSGVDFQLRR.G | 0.67
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | R.C*EGPIPDVTFELLR.E | 0.67
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | R.C*LAPLEGAR.F | 0.79
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | R.CLAPLEGAR.F | 0.63
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | R.GVTFLLR.R | 0.69
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | R.LHDNQNGWSGDSAPVELILSDETLPAPEFSPEPESGR.A | 0.60
alpha-1B-glycoprotein precursor | P04217 (A1BG_HUMAN) | R.TPGAAANLELIFVGPQHAGNYR.C | 0.62
alpha-2-antiplasmin isoform a precursor | P08697 (A2AP_HUMAN) | K.HQM*DLVATLSQLGLQELFQAPDLR.G | 0.61
alpha-2-antiplasmin isoform a precursor | P08697 (A2AP_HUMAN) | R.LCQDLGPGAFR.L | 0.68
alpha-2-antiplasmin isoform a precursor | P08697 (A2AP_HUMAN) | R.WFLLEQPEIQVAHFPFK.N | 0.60
alpha-2-HS-glycoprotein preproprotein | P02765 (FETUA_HUMAN) | K.VWPQQPSGELFEIEIDTLETTCHVLDPTPVAR.C | 0.61
alpha-2-HS-glycoprotein preproprotein | P02765 (FETUA_HUMAN) | R.HTFMGVVSLGSPSGEVSHPR.K | 0.68
alpha-2-HS-glycoprotein preproprotein | P02765 (FETUA_HUMAN) | R.Q*PNCDDPETEEAALVAIDYINQNLPWGYK.H | 0.69
alpha-2-HS-glycoprotein preproprotein | P02765 (FETUA_HUMAN) | R.QPNCDDPETEEAALVAIDYINQNLPWGYK.H | 0.64
alpha-2-HS-glycoprotein preproprotein | P02765 (FETUA_HUMAN) | R.TVVQPSVGAAAGPVVPPCPGR.I | 0.64
angiotensinogen preproprotein | P01019 (ANGT_HUMAN) | K.QPFVQGLALYTPVVLPR.S | 0.73
angiotensinogen preproprotein | P01019 (ANGT_HUMAN) | R.AAM*VGM*LANFLGFR.I | 0.62
apolipoprotein A-IV precursor | P06727 (APOA4_HUMAN) | K.LVPFATELHER.L | 0.64
apolipoprotein A-IV precursor | P06727 (APOA4_HUMAN) | R.LLPHANEVSQK.I | 0.61
apolipoprotein A-IV precursor | P06727 (APOA4_HUMAN) | R.SLAPYAQDTQEKLNHQLEGLTFQMK.K | 0.70
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.FPEVDVLTK.Y | 0.61
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.HINIDQFVR.K | 0.70
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.LLSGGNTLHLVSTTK.T | 0.66
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.Q*VFLYPEKDEPTYILNIKR.G | 0.81
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.QVFLYPEKDEPTYILNIKR.G | 0.77
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.SLHMYANR.L | 0.83
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.SVSDGIAALDLNAVANK.I | 0.62
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.SVSLPSLDPASAKIEGNLIFDPNNYLPK.E | 0.67
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.TEVIPPLIENR.Q | 0.63
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | K.VLVDHFGYTK.D | 0.76
apolipoprotein B-100 precursor | P04114 (APOB_HUMAN) | R.TSSFALNLPTLPEVKFPEVDVLTK.Y | 0.62
apolipoprotein C-III precursor | P02656 (APOC3_HUMAN) | R.GWVTDGFSSLKDYWSTVK.D | 0.66
apolipoprotein E precursor | P02649 (APOE_HUMAN) | R.GEVQAMLGQSTEELR.V | 0.81
apolipoprotein E precursor | P02649 (APOE_HUMAN) | R.LAVYQAGAR.E | 0.63
apolipoprotein E precursor | P02649 (APOE_HUMAN) | R.LGPLVEQGR.V | 0.69
attractin isoform 2 preproprotein | O75882 (ATRN_HUMAN) | K.LTLTPWVGLR.K | 0.69
beta-2-glycoprotein 1 precursor | P02749 (APOH_HUMAN) | K.FICPLTGLWPINTLK.C | 0.63
beta-2-glycoprotein 1 precursor | P02749 (APOH_HUMAN) | K.TFYEPGEEITYSCKPGYVSR.G | 0.62
beta-Ala-His dipeptidase precursor | Q96KN2 (CNDP1_HUMAN) | K.MVVSMTLGLHPWIANIDDTQYLAAK.R | 0.81
beta-Ala-His dipeptidase precursor | Q96KN2 (CNDP1_HUMAN) | K.VFQYIDLHQDEFVQTLK.E | 0.65
P43251
biotinidase precursor (BTD_HUMAN) R.TSIYPFLDFM*PSPQVVR.W 0.79 carboxypeptidase N
catalytic chain P15169
precursor (CBPN_HUMAN) R.ELMLQLSEFLCEEFR.N 0.61 ceruloplasmin P00450
precursor (CERU_HUMAN) K.AEEEHLGILGPQLHADVGDKVK.I 0.73 ceruloplasmin P00450
precursor (CERU_HUMAN) K.ALYLQYTDETFR.T 0.64 Protein description Uniprot ID (name) Peptide XT_AUC ceruloplasmin P00450 K.DVDKEFYLFPTVFDENESLLLEDN IR
precursor (CERU_HUMAN) .M 0.62 ceruloplasmin P00450
precursor (CERU_HUMAN) K.HYYIGIIETTWDYASDHGEK.K 0.61 ceruloplasmin P00450
precursor (CERU_HUMAN) R.EYTDASFTNRK.E 0.67 ceruloplasmin P00450
precursor (CERU_HUMAN) R.HYYIAAEEIIWNYAPSGIDIFTK.E 0.63 ceruloplasmin P00450
precursor (CERU_HUMAN) R.IYHSHIDAPK.D 0.62 ceruloplasmin P00450 R.Q*KDVDKEFYLFPTVFDENESLLLE
precursor (CERU_HUMAN) DNIR.M 0.74 ceruloplasmin P00450 R.QKDVDKEFYLFPTVFDENESLLLED
precursor (CERU_HUMAN) NIR.M 0.65 ceruloplasmin P00450
precursor (CERU_HUMAN) R.TYYIAAVEVEWDYSPQR.E 0.90 coagulation factor IX P00740
preproprotein (FA9_HUMAN) R.SALVLQYLR.V 0.69 coagulation factor V P12259
precursor (FA5_HUMAN) K.EFNPLVIVGLSK.D 0.61 coagulation factor XII P00748
precursor (FA12_HUMAN) R.NPDNDIRPWCFVLNR.D 0.65 coagulation factor XII P00748
precursor (FA12_HUMAN) R.VVGGLVALR.G 0.61 complement Clq
subcomponent subunit P02746 K.NSLLGM EGANSIFSGFLLFPDMEA.
B precursor (C1QB_HUMAN) - 0.64 complement Clq
subcomponent subunit P02746
B precursor (C1QB_HUMAN) K.VPGLYYFTYHASSR.G 0.63 complement Clq
subcomponent subunit P02747
C precursor (C1QC_HUMAN) R.Q*THQPPAPNSLIR.F 0.60 complement Clr
subcomponent P00736
precursor (C1R_HUMAN) R.LPVANPQACENWLR.G 0.72 complement C2 P06681
isoform 3 (C02_HUMAN) K.NQGILEFYGDDIALLK.L 0.74 complement C2 P06681
isoform 3 (C02_HUMAN) K.RNDYLDIYAIGVGK.L 0.61 complement C2 P06681 R.QPYSYDFPEDVAPALGTSFSHMLG
isoform 3 (C02_HUMAN) ATNPTQK.T 0.78 complement C3 P01024
precursor (C03_HUMAN) R.IHWESASLLR.S 0.69 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) K.FACYYPR.V 0.64 complement C4-A P0C0L4 K.LHLETDSLALVALGALDTALYAAGS
isoform 1 (C04A_HUMAN) K.S 0.74 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) K.LVNGQSHISLSK.A 0.64 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) K.M*RPSTDTITVMVENSHGLR.V 0.60 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) K.MRPSTDTITVMVENSHGLR.V 0.65 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) K.SCGLHQLLR.G 0.74 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) K.VGLSGMAIADVTLLSGFHALR.A 0.61 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) K.YVLPNFEVK.I 0.64 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) R.ALEILQEEDLIDEDDIPVR.S 0.64 complement C4-A P0C0L4 R.ECVGFEAVQEVPVGLVQPASATLY
isoform 1 (C04A_HUMAN) DYYNPER.R 0.62 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) R.EELVYELNPLDHR.G 0.66 complement C4-A P0C0L4 R.STQDTVIALDALSAYWIASHTTEER.
isoform 1 (C04A_HUMAN) G 0.70 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) R.VGDTLNLNLR.A 0.79 complement C4-A P0C0L4
isoform 1 (C04A_HUMAN) R.VHYTVCIWR.N 0.65 complement C4-B-like P0C0L5
preproprotein (C04B_HUMAN) K.GLCVATPVQLR.V 1.00 complement C4-B-like P0C0L5
preproprotein (C04B_HUMAN) K.KYVLPNFEVK.I 0.60 complement C4-B-like P0C0L5
preproprotein (C04B_HUMAN) K.VDFTLSSERDFALLSLQVPLKDAK.S 0.74 complement C4-B-like P0C0L5
preproprotein (C04B_HUMAN) R.EMSGSPASGIPVK.V 0.72 complement C4-B-like P0C0L5
preproprotein (C04B_HUMAN) R.GCGEQTM*IYLAPTLAASR.Y 0.75 complement C4-B-like P0C0L5 R.NGESVKLHLETDSLALVALGALDTA
preproprotein (C04B_HUMAN) LYAAGSK.S 0.85 complement C5 P01031
preproprotein (C05_HUMAN) R.IPLDLVPK.T 0.65 complement C5 P01031
preproprotein (C05_HUMAN) R.SYFPESWLWEVHLVPR.R 0.63 complement C5 P01031 R.YGGGFYSTQDTINAIEGLTEYSLLVK
preproprotein (C05_HUMAN) .Q 0.62 complement
component C6 P13671
precursor (C06_HUMAN) K.ENPAVIDFELAPIVDLVR.N 0.63 complement
component C8 alpha P07357
chain precursor (C08A_HUMAN) K.YNPVVIDFEMQPIHEVLR.H 0.61 complement
component C8 alpha P07357
chain precursor (C08A_HUMAN) R.HTSLGPLEAK.R 0.65 complement
component C8 beta P07358 K.C*QHEMDQYWGIGSLASGINLFTN
chain preproprotein (C08B_HUMAN) SFEGPVLDHR.Y 0.61 complement
component C8 beta P07358
chain preproprotein (C08B_HUMAN) K.SGFSFGFK.I 0.64 complement
component C8 beta P07358
chain preproprotein (C08B_HUMAN) R.DTMVEDLVVLVR.G 0.77 complement
component C8 gamma P07360
chain precursor (C08G_HUMAN) K.ANFDAQQFAGTWLLVAVGSACR.F 0.63 complement
component C8 gamma P07360
chain precursor (C08G_HUMAN) R.AEATTLHVAPQGTAMAVSTFR.K 0.61 complement
component C9 P02748
precursor (C09_HUMAN) R.DVVLTTTFVDDIK.A 0.73 complement
component C9 P02748
precursor (C09_HUMAN) R.RPWNVASLIYETK.G 0.66 complement factor B P00751
preproprotein (CFAB_HUMAN) K.ISVIRPSK.G 0.70 complement factor B P00751
preproprotein (CFAB_HUMAN) K.VASYGVKPR.Y 0.63 complement factor B P00751
preproprotein (CFAB_HUMAN) R.DFHINLFQVLPWLK.E 0.68 complement factor B P00751
preproprotein (CFAB_HUMAN) R.DLLYIGK.D 0.63 complement factor B P00751
preproprotein (CFAB_HUMAN) R.GDSGGPLIVHK.R 0.63 complement factor B P00751
preproprotein (CFAB_HUMAN) R.LEDSVTYHCSR.G 0.68 complement factor B P00751
preproprotein (CFAB_HUMAN) R.LPPTTTCQQQK.E 0.68 complement factor H P08603
isoform a precursor (CFAH_HUMAN) K.CLHPCVISR.E 0.62 complement factor H P08603
isoform a precursor (CFAH_HUMAN) K.CTSTGWIPAPR.C 0.74 complement factor H P08603
isoform a precursor (CFAH_HUMAN) K.IDVHLVPDR.K 0.66 complement factor H P08603
isoform a precursor (CFAH_HUMAN) K.IVSSAMEPDREYHFGQAVR.F 0.67 complement factor H P08603
isoform a precursor (CFAH_HUMAN) K.SIDVACHPGYALPK.A 0.67 complement factor H P08603
isoform a precursor (CFAH_HUMAN) K.VSVLCQENYLIQEGEEITCKDGR.W 0.63 complement factor H P08603
isoform a precursor (CFAH_HUMAN) K.WSSPPQCEGLPCK.S 0.60 complement factor H P08603
isoform a precursor (CFAH_HUMAN) R.EIMENYNIALR.W 0.61 complement factor H P08603
isoform a precursor (CFAH_HUMAN) R.RPYFPVAVGK.Y 0.83 complement factor H P08603
isoform a precursor (CFAH_HUMAN) R.WQSIPLCVEK.I 0.63 complement factor I P05156
preproprotein (CFAI_HUMAN) R.YQIWTTVVDWIHPDLKR.I 0.72 corticosteroid-binding P08185 K.AVLQLNEEGVDTAGSTGVTLNLTSK
globulin precursor (CBG_HUMAN) PIILR.F 0.61 corticosteroid-binding P08185
globulin precursor (CBG_HUMAN) R.GLASANVDFAFSLYK.H 0.66 fibrinogen alpha chain
isoform alpha-E P02671
preproprotein (FIBA_HUMAN) K.TFPGFFSPMLGEFVSETESR.G 0.62
P06396
gelsolin isoform b (GELS_HUMAN) K.FDLVPVPTNLYGDFFTGDAYVILK.T 0.66
P06396
gelsolin isoform b (GELS_HUMAN) K.QTQVSVLPEGGETPLFK.Q 0.66
P06396
gelsolin isoform b (GELS_HUMAN) K.TPSAAYLWVGTGASEAEK.T 0.71
P06396 R.AQPVQVAEGSEPDGFWEALGGK.
gelsolin isoform b (GELS_HUMAN) A 0.67
P06396 R.IEGSNKVPVDPATYGQFYGGDSYIIL
gelsolin isoform b (GELS_HUMAN) YNYR.H 0.60
P06396 R.VEKFDLVPVPTNLYGDFFTGDAYVI
gelsolin isoform b (GELS_HUMAN) LK.T 0.73
P06396 R.VPFDAATLHTSTAMAAQHGMDD
gelsolin isoform b (GELS_HUMAN) DGTGQK.Q 0.63 glutathione peroxidase P22352
3 precursor (GPX3_HUMAN) K.FLVGPDGIPIMR.W 0.60
P02790
hemopexin precursor (HEMO_HUMAN) K.ALPQPQNVTSLLGCTH.- 0.63
P02790 K.SLGPNSCSANGPGLYLIHGPNLYCY
hemopexin precursor (HEMO_HUMAN) SDVEK.L 0.68
P02790 R.DGWHSWPIAHQWPQGPSAVDAA
hemopexin precursor (HEMO_HUMAN) FSWEEK.L 0.63
P02790
hemopexin precursor (HEMO_HUMAN) R.GECQAEGVLFFQGDR.E 0.67
P02790 R.GECQAEGVLFFQGDREWFWDLAT
hemopexin precursor (HEMO_HUMAN) GTM*K.E 0.67
P02790 R.LEKEVGTPHGIILDSVDAAFICPGSS
hemopexin precursor (HEMO_HUMAN) R.L 0.75
P02790
hemopexin precursor (HEMO_HUMAN) R.LWWLDLK.S 0.62
P02790
hemopexin precursor (HEMO_HUMAN) R.WKNFPSPVDAAFR.Q 0.68 heparin cofactor 2 P05546 K.DQVNTFDNIFIAPVGISTAMGMISL
precursor (HEP2_HUMAN) GLK.G 0.60 insulin-like growth
factor-binding protein
complex acid labile
subunit isoform 2 P35858
precursor (ALS_HUMAN) K.ANVFVQLPR.L 0.71 insulin-like growth
factor-binding protein
complex acid labile
subunit isoform 2 P35858
precursor (ALS_HUMAN) R.LEALPNSLLAPLGR.L 0.61 insulin-like growth
factor-binding protein
complex acid labile
subunit isoform 2 P35858
precursor (ALS_HUMAN) R.LFQGLGK.L 0.68 insulin-like growth
factor-binding protein
complex acid labile
subunit isoform 2 P35858
precursor (ALS_HUMAN) R.NLIAAVAPGAFLGLK.A 0.76 insulin-like growth
factor-binding protein
complex acid labile
subunit isoform 2 P35858
precursor (ALS_HUMAN) R.TFTPQPPGLER.L 0.73 inter-alpha-trypsin
inhibitor heavy chain P19827
H1 isoform a precursor (ITIH1_HUMAN) K.Q*LVHHFEIDVDIFEPQGISK.L 0.69 inter-alpha-trypsin
inhibitor heavy chain P19827
H1 isoform a precursor (ITIH1_HUMAN) K.VTFQLTYEEVLK.R 0.61 inter-alpha-trypsin
inhibitor heavy chain P19827
H1 isoform a precursor (ITIH1_HUMAN) K.VTFQLTYEEVLKR.N 0.70 inter-alpha-trypsin
inhibitor heavy chain P19827 R.GIEILNQVQESLPELSNHASILIMLT
H1 isoform a precursor (ITIH1_HUMAN) DGDPTEGVTDR.S 0.62 inter-alpha-trypsin
inhibitor heavy chain P19827 R.GM*ADQDGLKPTIDKPSEDSPPLE
H1 isoform a precursor (ITIH1_HUMAN) M*LGPR.R 0.79 inter-alpha-trypsin
inhibitor heavy chain P19827
H1 isoform a precursor (ITIH1_HUMAN) R.KAAISGENAGLVR.A 0.78 inter-alpha-trypsin
inhibitor heavy chain P19823 K.AGELEVFNGYFVHFFAPDNLDPIPK
H2 precursor (ITIH2_HUMAN) .N 0.64 inter-alpha-trypsin
inhibitor heavy chain P19823
H2 precursor (ITIH2_HUMAN) K.FYNQVSTPLLR.N 0.68 inter-alpha-trypsin
inhibitor heavy chain P19823
H2 precursor (ITIH2_HUMAN) K.VQFELHYQEVK.W 0.68 inter-alpha-trypsin
inhibitor heavy chain P19823
H2 precursor (ITIH2_HUMAN) R.ETAVDGELVVLYDVK.R 0.63 inter-alpha-trypsin
inhibitor heavy chain P19823
H2 precursor (ITIH2_HUMAN) R.IYLQPGR.L 0.75 inter-alpha-trypsin
inhibitor heavy chain Q06033
H3 preproprotein (ITIH3_HUMAN) R.LWAYLTIEQLLEK.R 0.60 inter-alpha-trypsin
inhibitor heavy chain Q14624
H4 isoform 1 precursor (ITIH4_HUMAN) K.ITFELVYEELLK.R 0.60 inter-alpha-trypsin
inhibitor heavy chain Q14624
H4 isoform 1 precursor (ITIH4_HUMAN) K.LQDRGPDVLTATVSGK.L 0.67 inter-alpha-trypsin
inhibitor heavy chain Q14624 K.TGLLLLSDPDKVTIGLLFWDGRGEG
H4 isoform 1 precursor (ITIH4_HUMAN) LR.L 0.63 inter-alpha-trypsin
inhibitor heavy chain Q14624
H4 isoform 1 precursor (ITIH4_HUMAN) K.WKETLFSVM*PGLK.M 0.79 inter-alpha-trypsin
inhibitor heavy chain Q14624 R.AISGGSIQIENGYFVHYFAPEGLTT
H4 isoform 1 precursor (ITIH4_HUMAN) M*PK.N 0.60 inter-alpha-trypsin
inhibitor heavy chain Q14624 R.AISGGSIQIENGYFVHYFAPEGLTT
H4 isoform 1 precursor (ITIH4_HUMAN) MPK.N 0.65 inter-alpha-trypsin
inhibitor heavy chain Q14624
H4 isoform 1 precursor (ITIH4_HUMAN) R.ANTVQEATFQMELPK.K 0.68 inter-alpha-trypsin
inhibitor heavy chain Q14624 R.SFAAGIQALGGTNINDAMLMAVQ
H4 isoform 1 precursor (ITIH4_HUMAN) LLDSSNQEER.L 0.64 inter-alpha-trypsin
inhibitor heavy chain Q14624
H4 isoform 1 precursor (ITIH4_HUMAN) R.VQGNDHSATR.E 0.63 inter-alpha-trypsin
inhibitor heavy chain Q14624
H4 isoform 2 precursor (ITIH4_HUMAN) K.ITFELVYEELLKR.R 0.60 inter-alpha-trypsin
inhibitor heavy chain Q14624
H4 isoform 2 precursor (ITIH4_HUMAN) K.VTIGLLFWDGR.G 0.65 inter-alpha-trypsin
inhibitor heavy chain Q14624 R.LWAYLTIQQLLEQTVSASDADQQA
H4 isoform 2 precursor (ITIH4_HUMAN) LR.N 0.68
P29622
kallistatin precursor (KAIN_HUMAN) K.LFHTNFYDTVGTIQLINDHVK.K 0.73 kininogen-1 isoform 2 P01042
precursor (KNG1_HUMAN) K.ENFLFLTPDCK.S 0.64 kininogen-1 isoform 2 P01042
precursor (KNG1_HUMAN) K.IYPTVNCQPLGMISLMK.R 0.64 kininogen-1 isoform 2 P01042
precursor (KNG1_HUMAN) K.KIYPTVNCQPLGMISLMK.R 0.78 kininogen-1 isoform 2 P01042
precursor (KNG1_HUMAN) K.SLWNGDTGECTDNAYIDIQLR.I 0.67
P51884
lumican precursor (LUM_HUMAN) K.ILGPLSYSK.I 0.60
N-acetylmuramoyl-L- alanine amidase Q96PD5 K.EYGVVLAPDGSTVAVEPLLAGLEAG
precursor (PGRP2_HUMAN) LQGR.R 0.61
N-acetylmuramoyl-L- alanine amidase Q96PD5 R.EGKEYGVVLAPDGSTVAVEPLLAGL
precursor (PGRP2_HUMAN) EAGLQGR.R 0.69
N-acetylmuramoyl-L- alanine amidase Q96PD5 R.Q*NGAALTSASILAQQVWGTLVLL
precursor (PGRP2_HUMAN) QR.L 0.60 pigment epithelium- derived factor P36955
precursor (PEDF_HUMAN) K.IAQLPLTGSMSIIFFLPLK.V 0.65 pigment epithelium- derived factor P36955 R.SSTSPTTNVLLSPLSVATALSALSLG
precursor (PEDF_HUMAN) AEQR.T 0.79 plasma kallikrein P03952
preproprotein (KLKB1_HUMAN) K.VAEYMDWILEK.T 0.62 plasma kallikrein P03952
preproprotein (KLKB1_HUMAN) R.C*LLFSFLPASSINDMEKR.F 0.60 plasma kallikrein P03952
preproprotein (KLKB1_HUMAN) R.C*QFFSYATQTFHK.A 0.60 plasma kallikrein P03952
preproprotein (KLKB1_HUMAN) R.CLLFSFLPASSINDMEK.R 0.76 plasma protease C1 P05155
inhibitor precursor (IC1_HUMAN) R.LVLLNAIYLSAK.W 0.96 pregnancy zone protein P20742
precursor (PZP_HUMAN) R.NALFCLESAWNVAK.E 0.67 pregnancy zone protein P20742
precursor (PZP_HUMAN) R.NQGNTWLTAFVLK.T 0.61 pregnancy-specific
beta-1-glycoprotein 9 Q00887
precursor (PSG9_HUMAN) R.SNPVILNVLYGPDLPR.I 0.62 prenylcysteine oxidase Q9UHG3
1 precursor (PCYOX_HUMAN) K.IAIIGAGIGGTSAAYYLR.Q 0.71 protein AMBP P02760
preproprotein (AMBP_HUMAN) K.WYNLAIGSTCPWLK.K 0.77 protein AMBP P02760
preproprotein (AMBP_HUMAN) R.TVAACNLPIVR.G 0.66 prothrombin P00734
preproprotein (THRB_HUMAN) R.IVEGSDAEIGMSPWQVMLFR.K 0.62 prothrombin P00734
preproprotein (THRB_HUMAN) R.RQECSIPVCGQDQVTVAMTPR.S 0.69 prothrombin P00734
preproprotein (THRB_HUMAN) R.TFGSGEADCGLRPLFEK.K 0.61 retinol-binding protein P02753
4 precursor (RET4_HUMAN) R.FSGTWYAMAK.K 0.60 retinol-binding protein P02753 R.LLNNWDVCADMVGTFTDTEDPAK
4 precursor (RET4_HUMAN) .F 0.64 serum amyloid P- P02743
component precursor (SAMP_HUMAN) R.GYVIIKPLVWV.- 0.62 sex hormone-binding
globulin isoform 1 P04278
precursor (SHBG_HUMAN) K.VVLSSGSGPGLDLPLVLGLPLQLK.L 0.60 sex hormone-binding
globulin isoform 1 P04278 R.TWDPEGVIFYGDTNPKDDWFM*L
precursor (SHBG_HUMAN) GLR.D 0.75 sex hormone-binding
globulin isoform 1 P04278 R.TWDPEGVIFYGDTNPKDDWFMLG
precursor (SHBG_HUMAN) LR.D 0.74 thrombospondin-1 P07996
precursor (TSP1_HUMAN) K.GFLLLASLR.Q 0.70 thyroxine-binding P05543
globulin precursor (THBG_HUMAN) K.AVLHIGEK.G 0.85 thyroxine-binding P05543
globulin precursor (THBG_HUMAN) K.FSISATYDLGATLLK.M 0.65 thyroxine-binding P05543
globulin precursor (THBG_HUMAN) K.KELELQIGNALFIGK.H 0.61 thyroxine-binding P05543
globulin precursor (THBG_HUMAN) K.MSSINADFAFNLYR.R 0.67 transforming growth
factor-beta-induced Q15582
protein ig-h3 precursor (BGH3_HUMAN) R.LTLLAPLNSVFK.D 0.65
P02766
transthyretin precursor (TTHY_HUMAN) R.GSPAINVAVHVFR.K 0.67 uncharacterized
protein C3orf20 Q8ND61
isoform 1 (CC020_HUMAN) K.MPSHLMLAR.K 0.64 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) K.ELPEHTVK.L 0.75 vitamin D-binding
protein isoform 1 P02774 K.EYANQFMWEYSTNYGQAPLSLLVS
precursor (VTDB_HUMAN) YTK.S 0.69 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) K.HLSLLTTLSNR.V 0.65 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) K.HQPQEFPTYVEPTNDEICEAFR.K 0.64 vitamin D-binding
protein isoform 1 P02774 K.LAQKVPTADLEDVLPLAEDITNILSK.
precursor (VTDB_HUMAN) C 0.73 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) K.LCDNLSTK.N 0.70 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) K.LCMAALK.H 0.63 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) K.SCESNSPFPVHPGTAECCTK.E 0.63 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) K.SYLSMVGSCCTSASPTVCFLK.E 0.61 vitamin D-binding
protein isoform 1 P02774 K.TAMDVFVCTYFM*PAAQLPELPDV
precursor (VTDB_HUMAN) ELPTNK.D 0.61 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) K.VLEPTLK.S 0.69 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) R.KFPSGTFEQVSQLVK.E 0.66 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) R.THLPEVFLSK.V 0.62 vitamin D-binding
protein isoform 1 P02774
precursor (VTDB_HUMAN) R.TSALSAK.S 0.74
P04004
vitronectin precursor (VTNC_HUMAN) R.GQYCYELDEK.A 0.73
P04004
vitronectin precursor (VTNC_HUMAN) R.M*DWLVPATCEPIQSVFFFSGDK.Y 0.64
P04004
vitronectin precursor (VTNC_HUMAN) R.Q*PQFISR.D 0.63
[00181] Table 10. Significant peptides (AUC > 0.6) for both X!Tandem and Sequest
Figure imgf000084_0001
Figure imgf000085_0001
Figure imgf000086_0001
Protein description Uniprot ID (name) Peptide XT_AUC S_AUC
R.V
ceruloplasmin precursor P00450 (CERU_HUMAN) R.IDTINLFPATLFDAYMVAQNPGEWMLSCQNLNHLK.A 0.66 0.70
ceruloplasmin precursor P00450 (CERU_HUMAN) R.SGAGTEDSACIPWAYYSTVDQVKDLYSGLIGPLIVCR.R 0.88 0.92
cholinesterase precursor P06276 (CHLE_HUMAN) K.IFFPGVSEFGK.E 0.70 0.63
cholinesterase precursor P06276 (CHLE_HUMAN) R.AILQSGSFNAPWAVTSLYEAR.N 0.75 0.77
chorionic gonadotropin, beta polypeptide 8 precursor P01233 (CGHB_HUMAN) R.VLQGVLPALPQVVCNYR.D 0.60 0.75
chorionic somatomammotropin hormone 2 isoform 2 precursor P01243 (CSH_HUMAN) R.ISLLLIESWLEPVR.F 0.83 0.63
coagulation factor XII precursor P00748 (FA12_HUMAN) R.LHEAFSPVSYQHDLALLR.L 0.60 0.66
coagulation factor XII precursor P00748 (FA12_HUMAN) R.TTLSGAPCQPWASEATYR.N 0.69 0.82
complement C1q subcomponent subunit A precursor P02745 (C1QA_HUMAN) K.GLFQVVSGGMVLQLQQGDQVWVEKDPK.K 0.65 0.60
complement C1r subcomponent precursor P00736 (C1R_HUMAN) K.VLNYVDWIKK.E 0.80 0.76
complement C1s subcomponent precursor P09871 (C1S_HUMAN) K.SNALDIIFQTDLTGQK.K 0.62 0.77
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) K.EGAIHREELVYELNPLDHR.G 0.76 0.75
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) K.ITQVLHFTK.D 0.63 0.62
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) K.SHALQLNNR.Q 0.66 0.71
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) R.AVGSGATFSHYYYM*ILSR.G 0.65 0.60
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) R.EPFLSCCQFAESLR.K 0.64 0.72
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) R.GHLFLQTDQPIYNPGQR.V 0.63 0.76
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) R.GLEEELQFSLGSK.I 0.68 0.68
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) R.GSFEFPVGDAVSK.V 0.67 0.70
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) R.LLATLCSAEVCQCAEGK.C 0.61 0.71
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) R.VQQPDCREPFLSCCQFAESLR.K 0.65 0.83
complement C4-A isoform 1 P0C0L4 (C04A_HUMAN) R.YIYGKPVQGVAYVR.F 0.82 0.76
complement C5 preproprotein P01031 (C05_HUMAN) K.ITHYNYLILSK.G 0.66 0.69
complement C5 preproprotein P01031 (C05_HUMAN) R.ENSLYLTAFTVIGIR.K 0.60 0.68
complement C5 preproprotein P01031 (C05_HUMAN) R.KAFDICPLVK.I 0.77 0.65
complement C5 preproprotein P01031 (C05_HUMAN) R.VDDGVASFVLNLPSGVTVLEFNVK.T 0.68 0.61
complement component C6 precursor P13671 (C06_HUMAN) K.TFSEWLESVKENPAVIDFELAPIVDLVR.N 0.94 0.64
complement component C6 precursor P13671 (C06_HUMAN) R.IFDDFGTHYFTSGSLGGVYDLLYQFSSEELK.N 0.78 0.75
complement component C7 precursor P10643 (C07_HUMAN) K.ELSHLPSLYDYSAYR.R 0.69 0.71
complement component C7 precursor P10643 (C07_HUMAN) R.RYSAWAESVTNLPQVIK.Q 0.71 0.70
complement component C8 alpha chain precursor P07357 (C08A_HUMAN) K.YNPVVIDFEM*QPIHEVLR.H 0.68 0.73
complement component C8 beta chain preproprotein P07358 (C08B_HUMAN) K.VEPLYELVTATDFAYSSTVR.Q 0.69 0.70
complement component C8 beta chain preproprotein P07358 (C08B_HUMAN) R.SLM*LHYEFLQR.V 0.61 0.65
complement component C8 gamma chain precursor P07360 (C08G_HUMAN) K.YGFCEAADQFHVLDEVRR.- 0.78 0.76
complement component C8 gamma chain precursor P07360 (C08G_HUMAN) R.FLQEQGHR.A 0.63 0.69
complement component C8 gamma chain precursor P07360 (C08G_HUMAN) R.KLDGICWQVR.Q 0.75 0.70
complement component C8 gamma chain precursor P07360 (C08G_HUMAN) R.SLPVSDSVLSGFEQR.V 0.70 0.60
complement component P02748 R.GTVIDVTDFV 0.68 0.69 C9 precursor (C09 HUMAN) NWASSINDAPV
Figure imgf000089_0001
Protein description Uniprot ID (name) Peptide XT_AUC S_AUC complex acid labile
subunit isoform 2
precursor
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor P19827 (ITIH1_HUMAN) K.ADVQAHGEGQEFSITCLVDEEEMK.L 0.61 0.74
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor P19827 (ITIH1_HUMAN) K.ILGDM*QPGDYFDLVLFGTR.V 0.71 0.63
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor P19827 (ITIH1_HUMAN) K.ILGDMQPGDYFDLVLFGTR.V 0.68 0.60
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor P19827 (ITIH1_HUMAN) K.NVVFVIDISGSMR.G 0.76 0.83
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor P19827 (ITIH1_HUMAN) K.TAFISDFAVTADGNAFIGDIKDK.V 0.74 0.63
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor P19827 (ITIH1_HUMAN) R.GHMLENHVER.L 0.78 0.80
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor P19827 (ITIH1_HUMAN) R.GM*ADQDGLKPTIDKPSEDSPPLEMLGPR.R 0.61 0.62
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor P19827 (ITIH1_HUMAN) R.LWAYLTIQELLAK.R 0.68 0.62
inter-alpha-trypsin inhibitor heavy chain H1 isoform a precursor P19827 (ITIH1_HUMAN) R.NHM*QYEIVIK.V 0.67 0.65
inter-alpha-trypsin inhibitor heavy chain H2 precursor P19823 (ITIH2_HUMAN) K.AHVSFKPTVAQQR.I 0.75 0.61
inter-alpha-trypsin inhibitor heavy chain H2 precursor P19823 (ITIH2_HUMAN) K.ENIQDNISLFSLGM*GFDVDYDFLKR.L 0.80 0.93
inter-alpha-trypsin inhibitor heavy chain H2 precursor P19823 (ITIH2_HUMAN) K.ENIQDNISLFSLGMGFDVDYDFLKR.L 0.63 0.80
inter-alpha-trypsin inhibitor heavy chain H2 precursor P19823 (ITIH2_HUMAN) K.HLEVDVWVIEPQGLR.F 0.61 0.61
inter-alpha-trypsin inhibitor heavy chain H2 precursor P19823 (ITIH2_HUMAN) K.LWAYLTINQLLAER.S 0.69 0.62
inter-alpha-trypsin inhibitor heavy chain H2 precursor P19823 (ITIH2_HUMAN) R.AEDHFSVIDFNQNIR.T 0.65 0.63
inter-alpha-trypsin inhibitor heavy chain H2 precursor P19823 (ITIH2_HUMAN) R.FLHVPDTFEGHFDGVPVISK.G 0.66 0.62
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor Q14624 (ITIH4_HUMAN) K.ILDDLSPR.D 0.67 0.65
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor Q14624 (ITIH4_HUMAN) K.IPKPEASFSPR.R 0.69 0.77
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor Q14624 (ITIH4_HUMAN) K.SPEQQETVLDGNLIIR.Y 0.63 0.69
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor Q14624 (ITIH4_HUMAN) K.YIFHNFMER.L 0.66 0.61
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor Q14624 (ITIH4_HUMAN) R.FSSHVGGTLGQFYQEVLWGSPAASDDGRR.T 0.69 0.71
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor Q14624 (ITIH4_HUMAN) R.GPDVLTATVSGK.L 0.63 0.82
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor Q14624 (ITIH4_HUMAN) R.NMEQFQVSVSVAPNAK.I 0.78 0.60
inter-alpha-trypsin inhibitor heavy chain H4 isoform 1 precursor Q14624 (ITIH4_HUMAN) R.RLDYQEGPPGVEISCWSVEL.- 0.68 0.62
kallistatin precursor P29622 (KAIN_HUMAN) K.IVDLVSELKK.D 0.75 0.67
kallistatin precursor P29622 (KAIN_HUMAN) R.VGSALFLSHNLK.F 0.70 0.74
kininogen-1 isoform 2 precursor P01042 (KNG1_HUMAN) K.IYPTVNCQPLGM*ISLM*K.R 0.89 0.62
kininogen-1 isoform 2 precursor P01042 (KNG1_HUMAN) K.TVGSDTFYSFK.Y 0.61 0.68
kininogen-1 isoform 2 precursor P01042 (KNG1_HUMAN) R.DIPTNSPELEETLTHTITK.L 0.61 0.76
kininogen-1 isoform 2 precursor P01042 (KNG1_HUMAN) R.VQVVAGK.K 0.67 0.71
lumican precursor P51884 (LUM_HUMAN) R.FNALQYLR.L 0.68 0.76
macrophage colony-stimulating factor 1 receptor precursor P09603 (CSF1_HUMAN) K.VIPGPPALTLVPAELVR.I 0.68 0.60
monocyte differentiation antigen CD14 precursor P08571 (CD14_HUMAN) K.ITGTMPPLPLEATGLALSSLR.L 0.80 0.67
N-acetylmuramoyl-L-alanine amidase precursor Q96PD5 (PGRP2_HUMAN) K.EFTEAFLGCPAIHPR.C 0.62 0.64
N-acetylmuramoyl-L-alanine amidase precursor Q96PD5 (PGRP2_HUMAN) R.RVINLPLDSMAAPWETGDTFPDVVAIAPDVR.A 0.63 0.62
phosphatidylinositol-glycan-specific phospholipase D precursor P80108 (PHLD_HUMAN) R.GVFFSVNSWTPDSMSFIYK.A 0.67 0.78
pigment epithelium-derived factor precursor P36955 (PEDF_HUMAN) K.EIPDEISILLLGVAHFK.G 0.63 0.61
pigment epithelium-derived factor precursor P36955 (PEDF_HUMAN) K.IAQLPLTGSM*SIIFFLPLK.V 0.79 0.61
pigment epithelium-derived factor precursor P36955 (PEDF_HUMAN) K.TVQAVLTVPK.L 0.75 0.79
pigment epithelium-derived factor precursor P36955 (PEDF_HUMAN) R.ALYYDLISSPDIHGTYKELLDTVTAPQK.N 0.60 0.73
pigment epithelium-derived factor precursor P36955 (PEDF_HUMAN) R.DTDTGALLFIGK.I 0.85 0.62
plasminogen isoform 1 precursor P00747 (PLMN_HUMAN) R.ELRPWCFTTDPNKR.W 0.70 0.68
plasminogen isoform 1 precursor P00747 (PLMN_HUMAN) R.TECFITGWGETQGTFGAGLLK.E 0.63 0.68
platelet basic protein preproprotein P02775 (CXCL7_HUMAN) K.GTHCNQVEVIATLK.D 0.60 0.61
pregnancy zone protein precursor P20742 (PZP_HUMAN) K.AVGYLITGYQR.Q 0.87 0.73
pregnancy zone protein precursor P20742 (PZP_HUMAN) R.AVDQSVLLM*KPEAELSVSSVYNLLTVK.D 0.64 0.62
pregnancy zone protein precursor P20742 (PZP_HUMAN) R.IQHPFTVEEFVLPK.F 0.66 0.74
pregnancy zone protein precursor P20742 (PZP_HUMAN) R.NELIPLIYLENPR.R 0.61 0.61
protein AMBP preproprotein P02760 (AMBP_HUMAN) R.AFIQLWAFDAVK.G 0.72 0.67
proteoglycan 4 isoform B precursor Q92954 (PRG4_HUMAN) K.GFGGLTGQIVAALSTAK.Y 0.70 0.72
prothrombin preproprotein P00734 (THRB_HUMAN) K.YGFYTHVFR.L 0.70 0.63
prothrombin preproprotein P00734 (THRB_HUMAN) R.IVEGSDAEIGM*SPWQVMLFR.K 0.63 0.71
retinol-binding protein 4 precursor P02753 (RET4_HUMAN) K.KDPEGLFLQDNIVAEFSVDETGQMSATAK.G 0.67 0.67
thyroxine-binding globulin precursor P05543 (THBG_HUMAN) K.AQWANPFDPSKTEDSSSFLIDK.T 0.67 0.80
thyroxine-binding globulin precursor P05543 (THBG_HUMAN) K.GWVDLFVPK.F 0.67 0.64
thyroxine-binding globulin precursor P05543 (THBG_HUMAN) R.SFM*LLILER.S 0.65 0.68
thyroxine-binding globulin precursor P05543 (THBG_HUMAN) R.SFMLLILER.S 0.64 0.62
vitamin D-binding protein isoform 1 precursor P02774 (VTDB_HUMAN) K.EFSHLGKEDFTSLSLVLYSR.K 0.74 0.61
vitamin D-binding protein isoform 1 precursor P02774 (VTDB_HUMAN) K.EYANQFM*WEYSTNYGQAPLSLLVSYTK.S 0.73 0.61
vitamin D-binding protein isoform 1 precursor P02774 (VTDB_HUMAN) K.HQPQEFPTYVEPTNDEICEAFRK.D 0.67 0.69
vitamin D-binding protein isoform 1 precursor P02774 (VTDB_HUMAN) K.SYLSM*VGSCCTSASPTVCFLK.E 0.63 0.62
vitamin D-binding protein isoform 1 precursor P02774 (VTDB_HUMAN) K.TAM*DVFVCTYFMPAAQLPELPDVELPTNK.D 0.63 0.60
vitamin D-binding protein isoform 1 precursor P02774 (VTDB_HUMAN) K.VPTADLEDVLPLAEDITNILSK.C 0.70 0.71
vitronectin precursor P04004 (VTNC_HUMAN) K.AVRPGYPK.L 0.68 0.77
vitronectin precursor P04004 (VTNC_HUMAN) R.MDWLVPATCEPIQSVFFFSGDK.Y 0.67 0.65
zinc-alpha-2-glycoprotein precursor P25311 (ZA2G_HUMAN) K.EIPAWVPFDPAAQITK.Q 0.63 0.67
[00182] The differentially expressed proteins identified by the hypothesis-independent strategy above that were not already present in our MRM-MS assay were candidates for incorporation into the MRM-MS assay. Two additional proteins of functional interest (AFP, PGH1) were also selected for MRM development. Candidates were prioritized by AUC and biological function, with preference given to new pathways. Sequences for each protein of interest were imported into Skyline software, which generated a list of tryptic peptides, m/z values for the parent ions and fragment ions, and an instrument-specific collision energy (McLean et al. Bioinformatics (2010) 26 (7): 966-968; McLean et al. Anal. Chem (2010) 82 (24): 10116-10124).
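The calculation that Skyline performs at this step can be illustrated with a minimal sketch. It is not Skyline's implementation: the cleavage rule (cut after K or R unless the next residue is P, no missed cleavages), the function names, and the example protein sequence are assumptions made for illustration. The residue masses are standard monoisotopic values, and the computed m/z values can be compared with those listed for K.FQLPGQK.L in Table 11.

```python
import re

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added to an intact peptide or a y ion
PROTON = 1.007276   # mass of one charge carrier


def tryptic_peptides(sequence, min_len=6):
    """Cut after K or R unless followed by P (assumed rule, no missed cleavages)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if len(p) >= min_len]


def precursor_mz(peptide, charge):
    """m/z of the intact, unmodified peptide at the given charge state."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def fragment_mz(peptide, ion_type, index, charge=1):
    """m/z of a b ion (N-terminal) or y ion (C-terminal) of length `index`."""
    if ion_type == "b":
        mass = sum(RESIDUE_MASS[aa] for aa in peptide[:index])
    elif ion_type == "y":
        mass = sum(RESIDUE_MASS[aa] for aa in peptide[-index:]) + WATER
    else:
        raise ValueError("ion_type must be 'b' or 'y'")
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical sequence that happens to contain the peptide FQLPGQK.
    print(tryptic_peptides("AAKFQLPGQKLRPEEK"))
    print(round(precursor_mz("FQLPGQK", 2), 4))      # compare with 409.2320++
    print(round(fragment_mz("FQLPGQK", "y", 5), 4))  # compare with y5, 542.3297+
    print(round(fragment_mz("FQLPGQK", "b", 2), 4))  # compare with b2, 276.1343+
```

Collision energy is left out of the sketch because it is instrument-specific and the text does not give the equation used.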
[00183] The list was refined by eliminating peptides containing cysteines and methionines, and by using the shotgun data to select the charge state(s) and a subset of potential fragment ions for each peptide that had already been observed on a mass spectrometer.
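The cysteine and methionine filter can be sketched as follows. The helper names are hypothetical, and the sketch assumes the annotation style used in the tables of this document (flanking residues separated by periods, `*` marking a modified residue). The example peptides are taken from Table 10.

```python
def bare_sequence(annotated):
    """Strip flanking residues and modification marks, e.g. 'R.SFM*LLILER.S' -> 'SFMLLILER'."""
    parts = annotated.split(".")
    core = parts[1] if len(parts) == 3 else annotated
    return core.replace("*", "")


def lacks_cys_and_met(annotated):
    """True when the peptide contains neither cysteine (C) nor methionine (M)."""
    return not set(bare_sequence(annotated)) & {"C", "M"}


candidates = [
    "K.GWVDLFVPK.F",
    "R.SFM*LLILER.S",
    "R.EPFLSCCQFAESLR.K",
    "K.AVRPGYPK.L",
]
kept = [p for p in candidates if lacks_cys_and_met(p)]
print(kept)
```

Selecting charge states and fragment ions from the shotgun data is not shown, because it depends on the search-engine output format, which the text does not specify.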
[00184] After prioritizing parent and fragment ions, a list of transitions was exported with a single predicted collision energy. Approximately 100 transitions were added to a single MRM run. For development, MRM data was collected on either a QTRAP 5500 (AB Sciex) or a 6490 QQQ (Agilent). Commercially available human female serum (from pregnant and non-pregnant donors) was depleted and processed to tryptic peptides, as described above, and used to "scan" for peptides of interest. In some cases, purified synthetic peptides were used for further optimization. For development, digested serum or purified synthetic peptides were separated with a 15 min acetonitrile gradient at 100 µl/min on a 2.1 x 50 mm Poroshell 120 EC-C18 column (Agilent) at 40°C.
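One way to divide a long transition list into runs of roughly 100 transitions is sketched below. The greedy packing and the choice to keep all transitions of a peptide in the same run are assumptions made for illustration; the text states only the approximate number of transitions per run. The peptide names and transition labels are placeholders.

```python
def pack_runs(transitions_by_peptide, max_per_run=100):
    """Greedily assign peptides to runs without splitting a peptide across runs."""
    runs, current, count = [], [], 0
    for peptide, transitions in transitions_by_peptide.items():
        # Start a new run when this peptide would push the current one over the limit.
        if current and count + len(transitions) > max_per_run:
            runs.append(current)
            current, count = [], 0
        current.extend((peptide, t) for t in transitions)
        count += len(transitions)
    if current:
        runs.append(current)
    return runs


# Placeholder data: three peptides with 60, 50 and 40 transitions.
example = {
    "PEPTIDEA": [f"t{i}" for i in range(60)],
    "PEPTIDEB": [f"t{i}" for i in range(50)],
    "PEPTIDEC": [f"t{i}" for i in range(40)],
}
runs = pack_runs(example)
print([len(run) for run in runs])
```

A peptide with more transitions than the limit still lands in a single run in this sketch, so the limit is a target rather than a hard cap.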
[00185] The MS/MS data was imported back into Skyline, where all chromatograms for each peptide were overlaid and used to identify a consensus peak corresponding to the peptide of interest and the transitions with the highest intensities and the least noise. Table 11 contains a list of the most intensely observed candidate transitions and peptides for transfer to the MRM assay.
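The idea of a consensus peak followed by intensity ranking can be sketched in a simplified form. This is not Skyline's peak-picking algorithm: taking the median apex retention time as the consensus, the 0.1 min tolerance, the function name, and the example values are all assumptions made for illustration, and noise is not modeled.

```python
from statistics import median


def pick_transitions(apexes, rt_tol=0.1, top_n=3):
    """apexes maps a transition label to (apex retention time in min, apex intensity).

    Returns the consensus retention time and the most intense transitions
    whose apex lies within rt_tol of it.
    """
    consensus_rt = median(rt for rt, _ in apexes.values())
    co_eluting = {
        label: value
        for label, value in apexes.items()
        if abs(value[0] - consensus_rt) <= rt_tol
    }
    ranked = sorted(co_eluting, key=lambda label: co_eluting[label][1], reverse=True)
    return consensus_rt, ranked[:top_n]


# Illustrative values only; the y3 trace peaks elsewhere and is treated as interference.
example = {
    "y5": (6.42, 116000.0),
    "y4": (6.43, 92000.0),
    "b2": (6.41, 93000.0),
    "y3": (7.80, 150000.0),
}
consensus_rt, best = pick_transitions(example)
print(round(consensus_rt, 3), best)
```

The interfering y3 trace is dropped even though it is the most intense, because its apex does not coincide with the other transitions.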
[00186] Table 11. Candidate peptides and transitions for transferring to the MRM assay
fragment ion, m/z,
Figure imgf000094_0001
G [b3] - 300.1554+[7] 305120
R.GTHVDLGLASA
154927 alpha-l-antichymotrypsin NVDFAFSLYK.Q 742.3794+++ D [y8] - 990.4931+[1]
L [b8] - 793.4203+[2] 51068 D [b5] - 510.2307+[3] 45310 F [y7] - 875.4662+[4] 42630 A [b9] - 864.4574+[5] 43355 S [y4] - 510.2922+[6] 45310 F [y5] - 657.3606+[7] 37330 V [y9] - 1089.5615+[8] 32491 G [b7] - 680.3362+[9] 38185 Y [y2] - 310.1761+[10] 36336
N [bl2] -
16389
1136.5695+[11]
S [blO] - 951.4894+[12] 16365
L [b6] - 623.3148+[13] 13687
L [y3] - 423.2602+[14] 17156
V [b4] - 395.2037+[15] 10964
R.NLAVSQWHK.
266203 alpha-l-antichymotrypsin A 547.8195++ A [y8] - 867.5047+[l]
L [b2] - 228.1343+[2] 314232
V [y7] - 796.4676+[3] 165231 A [b3] - 299.1714+[4] 173694 S [y6] - 697.3991+[5] 158512 H [y2] - 284.1717+[6] 136431
V [b4] - 398.2398+[7] 36099 S [b5] - 485.2718+[8] 23836
365.5487+++ S [y6] - 697.3991+[1] 223443
V [y3] - 383.2401+[2] 112952
V [y4] - 482.3085+[3] 84872 Q [y5] - 610.3671+[4] 30835 inter-alpha-trypsin K.AAISGENAGLVR
inhibitor heavy chain HI 518001
.A 579.3173++ S [y9] - 902.4690+[l]
G [y8] - 815.4370+[2] 326256 N [y6] - 629.3729+[3] 296670 S [b4] - 343.1976+[4] 258172 inter-alpha-trypsin K.GSLVQASEANL
inhibitor heavy chain HI 304374
QAAQDFVR.G 668.6763+++ A [y7] - 806.4155+[1]
A [y6] - 735.3784+[2] 193844 V [b4] - 357.2132+[3] 294094 F [y3] - 421.2558+[4] 167816 A [b6] - 556.3089+[5] 149216 L [bll] - 535.7775++[6] 156882 A [bl3] - 635.3253++[7] 249287 A [yl4] - 760.3786++[8] 123723 F [bl7] - 865.9208++[9] 23057 inter-alpha-trypsin K.TAFISDFAVTAD 1087.0442++ G [y4] - 432.2453+[l] 22362 inhibitor heavy chain HI GNAFIGDIK.D
1 [y5] - 545.3293+[2] 8319 A [b8] - 853.4090+[3] 7006 G [y9] - 934.4993+[4] 6755 F [y6] - 692.3978+[5] 6193 V [b9] - 952.4775+[6] 9508 inter-alpha-trypsin
609348 inhibitor heavy chain HI K.VTYDVS .D 420.2165++ Y [y5] - 639.3097+[l]
T [b2] - 201.1234+[2] 792556 D [y4] - 476.2463+[3] 169546
Y [y3] - 361.2194+[4] 256946
Y [y5] - 320.1585++[5] 110608 S [y2] - 262.1510+[6] 50268
Y [b3] - 182.5970++[7] 10947 D [b4] - 479.2136+[8] 13662 inter-alpha-trypsin
2032509 inhibitor heavy chain HI R.EVAFDLEIPK.T 580.8135++ P [y2] - 244.1656+[1]
D [y6] - 714.4032+[2] 672749 A [y8] - 932.5088+[3] 390837 L [y5] - 599.3763+[4] 255527 F [y7] - 861.4716+[5] 305087 inter-alpha-trypsin R.LWAYLTIQELLA
602601 inhibitor heavy chain HI K.R 781.4531++ W [b2] - 300.1707+[1]
A [b3] - 371.2078+[2] 356967 T [y8] - 915.5510+[3] 150419 Y [b4] - 534.2711+[4] 103449 1 [y7] - 814.5033+[5] 72044 Q [y6] - 701.4192+[6] 66989 L [b5] - 647.3552+[7] 99820 E [y5] - 573.3606+[8] 44843 inter-alpha-trypsin K.FYNQVSTPLLR.
367330 inhibitor heavy chain H2 N 669.3642++ S [y6] - 686.4196+[1]
V [y7] - 785.4880+[2] 182396 P [y4] - 498.3398+[3] 103638 Y [b2] - 311.1390+[4] 52172 Q [b4] - 553.2405+[5] 54270 N [b3] - 425.1819+[6] 34567 inter-alpha-trypsin K.HLEVDVWVIEP
206996 inhibitor heavy chain H2 QGLR.F 597.3247+++ 1 [y7] - 812.4625+[1]
P [y5] - 570.3358+[2] 303693 E [y6] - 699.3784+[3] 126752 P [y5] - 285.6715++[4] 79841 inter-alpha-trypsin
460019 inhibitor heavy chain H2 K.TAGLVR.S 308.6925++ A [b2] - 173.0921+[l]
G [y4] - 444.2929+[2] 789068 V [y2] - 274.1874+[3] 34333 G [b3] - 230.1135+[4] 15169 L [y3] - 387.2714+[5] 29020 inter-alpha-trypsin
638209 inhibitor heavy chain H2 .IYLQPG .L 423.7452++ L [y5] - 570.3358+[l]
P [y3] - 329.1932+[2] 235194 Y [b2] - 277.1547+[3] 266889 Q [y4] - 457.2518+[4] 171389 inter-alpha-trypsin R.LSNENHGIAQR.
325409 inhibitor heavy chain H2 1 413.5461+++ N [y9] - 519.7574++[1]
N [y7] - 398.2146++[2] 39521 G [y5] - 544.3202+[3] 139598 S [b2] - 201.1234+[4] 54786 E [y8] - 462.7359++[5] 30623 inter-alpha-trypsin
582421 inhibitor heavy chain H2 R.SLAPTAAAKR.R 415.2425++ A [y7] - 629.3617+[1]
L [b2] - 201.1234+[2] 430584 P [y6] - 558.3246+[3] 463815 A [b3] - 272.1605+[4] 204183 T [y5] - 461.2718+[5] 47301 inter-alpha-trypsin
132304 inhibitor heavy chain H3 K.EVSFDVELPK.T 581.8032++ P [y2] - 244.1656+[1]
V [b2] - 229.1183+[2] 48895 L [y3] - 357.2496+[3] 20685 inter-alpha-trypsin
190296 inhibitor heavy chain H3 K.IQENVR.N 379.7114++ E [y4] - 517.2729+[1]
E [b3] - 371.1925+[2] 51697 Q [b2] - 242.1499+[3] 54241 N [y3] - 388.2303+[4] 21156 V [y2] - 274.1874+[5] 8309 inter-alpha-trypsin
687902 inhibitor heavy chain H3 R.ALDLSLK.Y 380.2342++ D [y5] - 575.3399+[l]
L [b2] - 185.1285+[2] 241010 L [y2] - 260.1969+[3] 29365 inter-alpha-trypsin R.LIQDAVTGLTVN
139259 inhibitor heavy chain H3 GQITGDK.R 972.0258++ V [b6] - 640.3665+[l]
G [b8] - 798.4356+[2] 53886 G [y7] - 718.3730+[3] 12518 pigment epithelium-
13436 derived factor precursor K.SSFVAPLEK.S 489.2687++ A [y5] - 557.3293+[l]
V [y6] - 656.3978+[2] 9350 F [y7] - 803.4662+[3] 6672 P [y4] - 486.2922+[4] 6753 pigment epithelium-
26719 derived factor precursor K.TVQAVLTVPK.L 528.3266++ Q [y8] - 855.5298+[l]
V [b2] - 201.1234+[2] 21239 Q [y8] - 428.2686++[3] 16900 A [y7] - 727.4713+[4] 9518 L [y5] - 557.3657+[5] 5108 Q [b3] - 329.1819+[6] 5450 V [y6] - 656.4341+[7] 4391 pigment epithelium- .ALYYDLISSPDIH
78073 derived factor precursor GTYK.E 652.6632+++ Y [yl5] - 886.4305++[l]
Y [yl4] - 804.8988++[2] 26148 pigment epithelium- R.DTDTGALLFIGK.
25553 derived factor precursor 1 625.8350++ G [y8] - 818.5135+[1]
T [b2] - 217.0819+[2] 22716 T [b4] - 217.0819++[3] 22716 L [y5] - 577.3708+[4] 11600 I [y3] - 317.2183+[5] 11089 A [b6] - 561.2151+[6] 6956 pigment epithelium- K.ELLDTVTAPQK.
17139 derived factor precursor N 607.8350++ T [y5] - 544.3089+[1]
D [y8] - 859.4520+[2] 17440 L [y9] - 972.5360+[3] 14344 A [y4] - 443.2613+[4] 11474 T [y7] - 744.4250+[5] 10808 V [y6] - 643.3774+[6] 9064 pregnancy-specific beta-
116611 1-glycoprotein 1 K.FQLPGQK.L 409.2320++ L [y5] - 542.3297+[1]
P [y4] - 429.2456+[2] 91769 Q [b2] - 276.1343+[3] 93301 pregnancy-specific beta- R.DLYHYITSYVVD
5376
1-glycoprotein 1 GEIIIYGPAYSGR.E 955.4762+++ G [y7] - 707.3471+[1]
Y [y8] - 870.4104+[2] 3610 P [y6] - 650.3257+[3] 2770 I [y9] - 983.4945+[4] 3361 pregnancy-specific beta-
39754
1-glycoprotein 11 K.LFIPQITPK.H 528.8262++ P [y6] - 683.4087+[1]
F [b2] - 261.1598+[2] 29966 I [y7] - 796.4927+[3] 13162 pregnancy-specific beta- NSATGEESSTSLTI
11009
1-glycoprotein 11 R 776.8761++ E [b7] - 689.2737+[1]
T [y6] - 690.4145+[2] 11284 L [y4] - 502.3348+[3] 2265 S [y7] - 389.2269++[4] 1200 T [y3] - 389.2507+[5] 1200 I [y2] - 288.2030+[6] 2248 pregnancy-specific beta- K.FQQSGQNLFIP
43682
1-glycoprotein 2 QITTK.H 617.3317+++ F [y8] - 474.2817++[1]
G [y12] - 680.3852++[2] 24166 S [b4] - 491.2249+[3] 23548 Q [b3] - 404.1928+[4] 17499 I [y4] - 462.2922+[5] 17304 F [b9] - 525.7538++[6] 17206 I [b10] - 582.2958++[7] 16718 L [b8] - 452.2196++[8] 16490 P [y6] - 344.2054++[9] 16198
G [b5] - 548.2463+[10] 15320 pregnancy-specific beta-
16879
1-glycoprotein 2 IHPSYTNY 575.7856++ N [b7] - 813.3890+[1]
Y [b5] - 598.2984+[2] 18087 T [y4] - 553.2729+[3] 2682 pregnancy-specific beta-
358059
1-glycoprotein 2 FQLSETNR 497.7513++ L [y6] - 719.3682+[1]
S [y5] - 606.2842+[2] 182330 Q [b2] - 276.1343+[3] 292482 pregnancy-specific beta- VSAPSGTGHLPGL
25346
1-glycoprotein 3 NPL 506.2755+++ T [b7] - 300.6530++[1]
H [y8] - 860.4989+[2] 12159 H [y8] - 430.7531++[3] 15522 pregnancy-specific beta-
23965
1-glycoprotein 3 EDAGSYTLHIVK 666.8433++ Y [b6] - 623.2307+[1]
Y [y7] - 873.5193+[2] 21686 L [b8] - 837.3625+[3] 4104 A [b3] - 316.1139+[4] 1987 pregnancy-specific beta-
62145
1-glycoprotein 4 R.TLFIFGVTK.Y 513.3051++ F [y7] - 811.4713+[1]
L [b2] - 215.1390+[2] 31687 F [y5] - 551.3188+[3] 972 pregnancy-specific beta- NYTYIWWLNGQS
25756
1-glycoprotein 4 LPVSPR 1097.5576++ W [b6] - 841.3879+[1]
G [y9] - 940.5211+[2] 25018 Y [b4] - 542.2245+[3] 19778 Q [y8] - 883.4996+[4] 6642 P [y2] - 272.1717+[5] 5018 pregnancy-specific beta-
176797 1-glycoprotein 5 GVTGYFTFNLYLK 508.2695+++ L [y2] - 260.1969+[1]
T [y11] - 683.8557++[2] 136231 F [b6] - 625.2980+[3] 47523 L [y4] - 536.3443+[4] 23513 pregnancy-specific beta- SNPVTLNVLYGPD
14118
1-glycoprotein 6 LPR 585.6527+++ Y [y7] - 817.4203+[1]
G [y6] - 654.3570+[2] 10433 P [b3] - 299.1350+[3] 87138* P [y5] - 299.1714++[4] 77478* P [y5] - 597.3355+[5] 68089* pregnancy-specific beta- DVLLLVHNLPQNL
L [y8] - 1017.5516+[3] 141169
1-glycoprotein 7 TGHIWYK 791.7741+++
G [y6] - 803.4199+[5] 115905 W [y3] - 496.2554+[6] 108565 P [y11] - 678.8566++[7] 105493 V [b2] - 215.1026+[1] 239492 L [b3] - 328.1867+[2] 204413 N [b8] - 904.5251+[4] 121880 pregnancy-specific beta-
25743*
1-glycoprotein 7 YGPAYSGR 435.7089++ A [y5] - 553.2729+[1]
Y [y4] - 482.2358+[2] 25580* P [y6] - 650.3257+[3] 10831* S [y3] - 319.1724+[4] 10559* G [b2] - 221.0921+[5] 7837* pregnancy-specific beta-
18766
1-glycoprotein 8 LQLSETN 480.7591++ S [b4] - 442.2660+[1]
L [b3] - 355.2340+[2] 12050 Q [b2] - 242.1499+[3] 1339 T [b6] - 672.3563+[4] 2489 pregnancy-specific beta-
53829
1-glycoprotein 9 K.LFIPQITR.N 494.3029++ P [y5] - 614.3620+[1]
I [y6] - 727.4461+[2] 13731 I [b3] - 374.2438+[3] 4178 Q [y4] - 517.3093+[4] 2984 pregnancy-specific beta- K.LPIPYITINNLNP
18814*
1-glycoprotein 9 R.E 819.4723++ P [b2] - 211.1441+[1]
P [b4] - 211.1441++[2] 18814* T [b7] - 798.4760+[3] 17287* T [y8] - 941.5163+[4] 10205* Y [b5] - 584.3443+[5] 10136* N [y6] - 727.3846+[6] 9511* pregnancy-specific beta- R.SNPVILNVLYGP
3994
1-glycoprotein 9 DLPR.I 589.6648+++ P [y5] - 597.3355+[1]
Y [y7] - 817.4203+[2] 3743 G [y6] - 654.3570+[3] 3045 pregnancy-specific beta- DVLLLVHNLPQNL
120212
1-glycoprotein 9 PGYFWYK 810.4387+++ P [y7] - 960.4614+[1]
V [b2] - 215.1026+[2] 65494 L [b3] - 328.1867+[3] 54798 pregnancy-specific beta- SENYTYIWWLNG
14788
1-glycoprotein 9 QSLPVSPGVK 846.7603+++ W [y15] - 834.4488++[1]
P [y4] - 200.6314++[2] 19000 Y [y17] - 972.5225++[3] 4596 L [b10] - 678.8166++[4] 2660 Y [b6] - 758.2992+[5] 1705 P [y4] - 400.2554+[6] 1847
Pan-PSG ILILPSVTR 506.3317++ P [y5] - 559.3198+[1] 484395
L [b2] - 227.1754+[2] 102774 L [b4] - 227.1754++[3] 102774 I [y7] - 785.4880+[4] 90153 I [b3] - 340.2595+[5] 45515 L [y6] - 672.4039+[6] 40368 thyroxine-binding K.AQWANPFDPS
30802 globulin precursor K.T 630.8040++ A [b4] - 457.2194+[1]
S [y2] - 234.1448+[2] 28255 D [y4] - 446.2245+[3] 24933 thyroxine-binding
220841 globulin precursor K.AVLHIGEK.G 289.5080+++ I [y4] - 446.2609+[1]
H [y5] - 292.1636++[2] 303815 H [y5] - 583.3198+[3] 133795 V [b2] - 171.1128+[4] 166139 L [y6] - 348.7056++[5] 823533 thyroxine-binding
296859 globulin precursor K.FLNDVK.T 368.2054++ N [y4] - 475.2511+[1]
V [y2] - 246.1812+[2] 219597 L [b2] - 261.1598+[3] 87504 thyroxine-binding K.FSISATYDLGATL
34111 globulin precursor LK.M 800.4351++ Y [y9] - 993.5615+[1]
G [y6] - 602.3872+[2] 17012 D [y8] - 830.4982+ 45104 S [b2] - 235.1077+[4] 15480 thyroxine-binding
1261810 globulin precursor K.GWVDLFVPK.F 530.7949++ W [b2] - 244.1081+[1]
P [y2] - 244.1656+[2] 1261810
V [b7] - 817.4243+[3] 517675
V [y7] - 817.4818+[4] 517675 D [y6] - 718.4134+[5] 306994 F [b6] - 718.3559+[6] 306994
V [y3] - 343.2340+[7] 112565
V [b3] - 343.1765+[8] 112565 thyroxine-binding
198085 globulin precursor K.NALALFVLPK.E 543.3395++ A [y7] - 787.5076+[1]
L [b3] - 299.1714+[2] 199857 P [y2] - 244.1656+[3] 129799 L [y8] - 900.5917+[4] 111572 L [y6] - 716.4705+[5] 88773 F [y5] - 603.3865+[6] 54020 L [y3] - 357.2496+[7] 43353 thyroxine-binding
1878736 globulin precursor .SILFLGK.V 389.2471++ L [y5] - 577.3708+[1]
I [b2] - 201.1234+[2] 946031 G [y2] - 204.1343+[3] 424248 L [y3] - 317.2183+[4] 291162 F [y4] - 464.2867+[5] 391171
AFP R.DFNQFSSGEK.N 386.8402+++ N [b3] - 189.0764++[1] 42543
S [y4] - 210.6081++[2] 21340 G [y3] - 333.1769+[3] 53766 N [b3] - 377.1456+[4] 58644 F [b2] - 263.1026+[5] 5301
AFP K.GYQELLEK.C 490.2584++ E [y5] - 631.3661+[1] 110518
L [y4] - 502.3235+[2] 74844 E [y2] - 276.1554+[3] 42924 E [b4] - 478.1932+[4] 20953
AFP K.GEEELQK.Y 416.7060++ E [b2] - 187.0713+[1] 37843
E [y4] - 517.2980+[2] 56988
AFP K.FIYEIA . 456.2529++ I [y3] - 359.2401+[1] 34880
I [b2] - 261.1598+[2] 7931
R.HPFLYAPTILLW
11471
AFP AAR.Y 590.3348+++ I [y7] - 421.7660++[1]
L [y6] - 365.2239++[2] 5001 A [b6] - 365.1896++[3] 5001 L [y6] - 729.4406+[4] 3218 F [b3] - 382.1874+[5] 6536 A [b6] - 729.3719+[6] 3218
AFP R.TFQAITVTK.L 504.7898++ T [b6] - 662.3508+[1] 11241
T [y4] - 448.2766+[2] 7541 A [b4] - 448.2191+[3] 7541
AFP K.LTTLER.G 366.7162++ T [y4] - 518.2933+[1] 7836
L [b4] - 215.1390++[2] 4205 T [b2] - 215.1390+[3] 4205
R.HPQLAVSVILR.
3781
AFP V L [y2] - 288.2030+[1]
I [y3] - 401.2871+[2] 2924 L [b4] - 476.2616+[3] 2647
K.LGEYYLQNAFLV
10790
AFP AYTK.K 631.6646+++ G [b2] - 171.1128+[1]
Y [y3] - 411.2238+[2] 2303
F [b10] - 600.2902++[3] 1780
Y [b4] - 463.2187+[4] 2214 F [y7] - 421.2445++[6] 3072
PGH1 R.ILPSVPK.D 377.2471++ P [y5] - 527.3188+[1] 5340492
S [y4] - 430.2660+[5] 419777 P [y2] - 244.1656+[2] 4198508 P [y5] - 264.1630++[3] 2771328 L [b2] - 227.1754+[4] 2331263
K.AEHPTWGDEQL
64350
PGH1 FQTTR.L 639.3026+++ E [b9] - 512.2120++[1]
P [b4] - 218.1030++[2] 38282 L [b11] - 632.7833++[3] 129128 G [y10] - 597.7911++[4] 19406 G [b7] - 779.3471+[5] 51467 T [y3] - 189.1108++[6] 10590 D [y9] - 569.2804++[7] 12460 L [y6] - 765.4254+[8] 6704 D [b8] - 447.6907++[9] 4893 P [b4] - 435.1987+[10] 8858 Q [y7] - 893.4839+[11] 6101 T [b5] - 268.6268++[12] 5456 T [b5] - 536.2463+[13] 5549
PGH1 .LILIGETIK.I 500.3261++ G [y5] - 547.3086+[1] 7649
T [y3] - 361.2445+[2] 6680
E [y4] - 490.2871+[3] 5234
L [y7] - 773.4767+[4] 3342
PGH1 R.LQPFNEYR.K 533.7694++ N [b5] - 600.3140+[1] 25963
F [b4] - 486.2711+[2] 6915
E [y3] - 467.2249+[3] 15079
* QTRAP5500 data; all other peak areas are from the Agilent 6490
[00187] Next, the top 2-10 transitions per peptide and up to 7 peptides per protein were selected for collision energy (CE) optimization on the Agilent 6490. Using Skyline or MassHunter Qual software, the optimized CE value for each transition was determined based on the peak area or signal-to-noise ratio. The two transitions with the largest peak areas per peptide and at least two peptides per protein were chosen for the final MRM method. A transition with a lower peak area was substituted when a transition with a larger peak area had a high background level or a low m/z value, which carries more potential for interference.
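The peak-area ranking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout, the function name `select_transitions` and the fragment labels are hypothetical, the peak areas are copied from the PGH1 rows of the transition table, and the manual substitution for high background or low m/z is not modeled.

```python
from collections import defaultdict

# Peak areas copied from the PGH1 rows of the transition table.
# The (protein, peptide, fragment label, peak area) layout is an assumption.
OBSERVED = [
    ("PGH1", "ILPSVPK", "y5+", 5340492),
    ("PGH1", "ILPSVPK", "y2+", 4198508),
    ("PGH1", "ILPSVPK", "y5++", 2771328),
    ("PGH1", "ILPSVPK", "b2+", 2331263),
    ("PGH1", "ILPSVPK", "y4+", 419777),
    ("PGH1", "LQPFNEYR", "b5+", 25963),
    ("PGH1", "LQPFNEYR", "b4+", 6915),
    ("PGH1", "LQPFNEYR", "y3+", 15079),
]


def select_transitions(records, per_peptide=2, min_peptides=2):
    """Keep the largest-area transitions per peptide (hypothetical helper).

    Returns (selected, short): selected maps protein -> peptide -> fragment
    labels; short lists proteins with fewer than min_peptides peptides so
    they can be reviewed by hand rather than silently dropped.
    """
    by_peptide = defaultdict(list)
    for protein, peptide, fragment, area in records:
        by_peptide[(protein, peptide)].append((area, fragment))

    selected = defaultdict(dict)
    for (protein, peptide), scored in by_peptide.items():
        # Sort by peak area, largest first, and keep the top entries.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        selected[protein][peptide] = [frag for _, frag in scored[:per_peptide]]

    short = sorted(p for p, peps in selected.items() if len(peps) < min_peptides)
    return dict(selected), short


selected, short = select_transitions(OBSERVED)
print(selected["PGH1"])
print(short)
```

Reporting under-covered proteins separately, instead of filtering them out, is a design choice of this sketch; the text only states that at least two peptides per protein were chosen.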
[00188] Lastly, the retention times of selected peptides were mapped using the same column and gradient as our established sMRM assay. The newly discovered analytes were subsequently added to the sMRM method and used in a further hypothesis-dependent discovery study described in Example 5 below.
[00189] The above method was typical for most proteins. However, in some cases, the differentially expressed peptide identified in the shotgun method did not uniquely identify a protein, for example, in protein families with high sequence identity. In these cases, an MRM method was developed for each family member. Also, for any given protein, peptides in addition to those found to be significant, and fragment ions not observed on the Orbitrap, may have been included in MRM optimization and added to the final sMRM method if they yielded the best signal intensities.
Example 5. Study IV to Identify and Confirm Preterm Birth Biomarkers
[00190] A further hypothesis-dependent discovery study was performed with the scheduled MRM assay used in Example 3, now augmented with the newly discovered analytes from Example 4. Less robust transitions (from the original 1708 described in Example 1) were removed to improve analytical performance and make room for the newly discovered analytes. Samples included approximately 30 cases and 60 matched controls from each of three gestational periods (early, 17-22 weeks; middle, 23-25 weeks; and late, 26-28 weeks). Log transformed peak areas for each transition were corrected for run order and batch effects by regression. The ability of each analyte to separate cases and controls was determined by calculating univariate AUC values from ROC curves. Ranked univariate AUC values (0.6 or greater) are reported for the individual gestational age window sample sets (Tables 12, 13, 15) and for a combination of the middle and late windows (Table 14). Multivariate classifiers were built using different subsets of analytes (described below) by Lasso and Random Forest methods. Lasso significant transitions correspond to those with non-zero coefficients, and Random Forest analyte ranking was determined by the Gini importance values (mean decrease in model accuracy if that variable is removed). We report all analytes with non-zero Lasso coefficients (Tables 16-32) and the top 30 analytes from each Random Forest analysis (Tables 33-49). Models were built considering the top 32 or 100 univariate analytes, the single best univariate analyte for each of the top 50 proteins, or all analytes. Lastly, 1000 rounds of bootstrap resampling were performed and the non-zero Lasso coefficients or Random Forest Gini importance values were summed for each analyte amongst panels with AUCs of 0.85 or greater.
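As an illustration of the univariate ranking step, the area under an empirical ROC curve can be computed as the fraction of case/control pairs in which the case has the higher value, with ties counted as one half. The sketch below assumes this pairwise formulation; the function names, the toy values and the direction-agnostic treatment (taking the larger of AUC and 1 - AUC) are assumptions for illustration, not details stated in the text.

```python
def univariate_auc(case_values, control_values):
    """Area under the empirical ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case scores higher than a
    randomly chosen control, counting ties as one half.
    """
    if not case_values or not control_values:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def rank_analytes(values_by_analyte, is_case, threshold=0.6):
    """Return (analyte, AUC) pairs at or above threshold, best first."""
    ranked = []
    for analyte, values in values_by_analyte.items():
        cases = [v for v, flag in zip(values, is_case) if flag]
        controls = [v for v, flag in zip(values, is_case) if not flag]
        auc = univariate_auc(cases, controls)
        # Assumption: an analyte that is lower in cases is as informative as
        # one that is higher, so the AUC is made direction-agnostic.
        auc = max(auc, 1.0 - auc)
        if auc >= threshold:
            ranked.append((analyte, auc))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Made-up log peak areas for two hypothetical analytes (not study data).
IS_CASE = [True, True, False, False, False, False]
TOY = {
    "analyte_A": [5.1, 6.0, 4.2, 3.9, 4.0, 4.4],
    "analyte_B": [4.0, 5.0, 4.0, 5.0, 4.5, 4.5],
}
print(rank_analytes(TOY, IS_CASE))
```

The default `threshold=0.6` mirrors the reporting cut-off given in the text; the pairwise loop is quadratic in sample count, which is adequate for sample sets of roughly 90 subjects per window.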
[00191] Table 12. Early Window Individual Stats
Figure imgf000104_0001
Transition Protein AUC
SDLEVAHYK_531.3_617.3 C08B_HUMAN 0.777
SLLQPNK_400.2_358.2 C08A_HUMAN 0.776
TLLPVSKPEIR_418.3_288.2 C05_HUMAN 0.776
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 0.774
DISEVVTPR_508.3_787.4 CFAB_HUMAN 0.774
VSEADSSNADWVTK_754.9_533.3 CFAB_HUMAN 0.773
LSSPAVITDK_515.8_743.4 PLMN_HUMAN 0.773
VQEAHLTEDQIFYFPK_655.7_701.4 C08G_HUMAN 0.772
DVLLLVHNLPQNLPGYFWYK_810.4_594.3 PSG9_HUMAN 0.771
ALVLELAK_428.8_672.4 INHBE_HUMAN 0.770
FLNWIK_410.7_561.3 HABP2_HUMAN 0.770
LSSPAVITDK_515.8_830.5 PLMN_HUMAN 0.769
LPNNVLQEK_527.8_844.5 AFAM_HUMAN 0.769
VSEADSSNADWVTK_754.9_347.2 CFAB_HUMAN 0.768
HTLNQIDEVK_598.8_951.5 FETUA_HUMAN 0.767
TTSDGGYSFK_531.7_860.4 INHA_HUMAN 0.761
YENYTSSFFIR_713.8_756.4 IL12B_HUMAN 0.760
HTLNQIDEVK_598.8_958.5 FETUA_HUMAN 0.760
DISEVVTPR_508.3_472.3 CFAB_HUMAN 0.760
LIQDAVTGLTVNGQITGDK_972.0_640.4 ITIH3_HUMAN 0.759
EAQLPVIENK_570.8_699.4 PLMN_HUMAN 0.759
SLPVSDSVLSGFEQR_810.9_836.4 C08G_HUMAN 0.757
AVLHIGEK_289.5_348.7 THBG_HUMAN 0.755
GLQYAAQEGLLALQSELLR_1037.1_929.5 LBP_HUMAN 0.752
FLQEQGHR_338.8_497.3 C08G_HUMAN 0.750
LPNNVLQEK_527.8_730.4 AFAM_HUMAN 0.750
AVLHIGEK_289.5_292.2 THBG_HUMAN 0.749
QLYGDTGVLGR_589.8_501.3 C08G_HUMAN 0.748
WWGGQPLWITATK_772.4_929.5 ENPP2_HUMAN 0.747
NADYSYSVWK_616.8_769.4 C05_HUMAN 0.746
GLQYAAQEGLLALQSELLR_1037.1_858.5 LBP_HUMAN 0.746
SLPVSDSVLSGFEQR_810.9_723.3 C08G_HUMAN 0.745
IEEIAAK_387.2_531.3 C05_HUMAN 0.743
TYLHTYESEI_628.3_908.4 ENPP2_HUMAN 0.742
WWGGQPLWITATK_772.4_373.2 ENPP2_HUMAN 0.742
FQLSETNR_497.8_605.3 PSG2_HUMAN 0.741
NIQSVNVK_451.3_674.4 GROA_HUMAN 0.741
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN 0.740
LQGTLPVEAR_542.3_571.3 C05_HUMAN 0.740
SGFSFGFK_438.7_732.4 C08B_HUMAN 0.740
HELTDEELQSLFTNFANVVDK_817.1_906.5 AFAM_HUMAN 0.740
VQTAHFK_277.5_502.3 C08A_HUMAN 0.739
YENYTSSFFIR_713.8_293.1 IL12B_HUMAN 0.739
AFTECCVVASQLR_770.9_574.3 C05_HUMAN 0.736
EAQLPVIENK_570.8_329.2 PLMN_HUMAN 0.734
QALEEFQK_496.8_551.3 C08B_HUMAN 0.734
DAQYAPGYDK_564.3_813.4 CFAB_HUMAN 0.734
TEFLSNYLTNVDDITLVPGTLGR_846.8_600.3 ENPP2_HUMAN 0.734
IAIDLFK_410.3_635.4 HEP2_HUMAN 0.733
TASDFITK_441.7_781.4 GELS_HUMAN 0.731
YEFLNGR_449.7_606.3 PLMN_HUMAN 0.731
TVQAVLTVPK_528.3_428.3 PEDF_HUMAN 0.731
LIENGYFHPVK_439.6_627.4 F13B_HUMAN 0.730
DALSSVQESQVAQQAR_573.0_672.4 APOC3_HUMAN 0.730
TVQAVLTVPK_528.3_855.5 PEDF_HUMAN 0.730
ALQDQLVLVAAK_634.9_289.2 ANGT_HUMAN 0.727
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 0.727
SDLEVAHYK_531.3_746.4 C08B_HUMAN 0.726
FLPCENK_454.2_550.2 IL10_HUMAN 0.725
HPWIVHWDQLPQYQLNR_744.0_1047.0 KS6A3_HUMAN 0.725
AFTECCVVASQLR_770.9_673.4 C05_HUMAN 0.725
YGLVTYATYPK_638.3_843.4 CFAB_HUMAN 0.724
TLEAQLTPR_514.8_685.4 HEP2_HUMAN 0.724
DAQYAPGYDK_564.3_315.1 CFAB_HUMAN 0.724
QGHNSVFLIK_381.6_260.2 HEMO_HUMAN 0.722
HELTDEELQSLFTNFANVVDK_817.1_854.4 AFAM_HUMAN 0.722
TLEAQLTPR_514.8_814.4 HEP2_HUMAN 0.721
IEEIAAK_387.2_660.4 C05_HUMAN 0.721
HFQNLGK_422.2_527.2 AFAM_HUMAN 0.721
IAPQLSTEELVSLGEK_857.5_333.2 AFAM_HUMAN 0.721
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 0.720
ALNHLPLEYNSALYSR_621.0_696.4 C06_HUMAN 0.719
IAIDLFK_410.3_706.4 HEP2_HUMAN 0.719
FLQEQGHR_338.8_369.2 C08G_HUMAN 0.719
ALQDQLVLVAAK_634.9_956.6 ANGT_HUMAN 0.718
IEGNLIFDPNNYLPK_874.0_414.2 APOB_HUMAN 0.717
YEFLNGR_449.7_293.1 PLMN_HUMAN 0.717
TASDFITK_441.7_710.4 GELS_HUMAN 0.716
DADPDTFFAK_563.8_825.4 AFAM_HUMAN 0.716
TLLPVSKPEIR_418.3_514.3 C05_HUMAN 0.716
NADYSYSVWK_616.8_333.2 C05_HUMAN 0.715
YGLVTYATYPK_638.3_334.2 CFAB_HUMAN 0.715
VNHVTLSQPK_374.9_459.3 B2MG_HUMAN 0.715
HYGGLTGLNK_530.3_759.4 PGAM1_HUMAN 0.714
DFHINLFQVLPWLK_885.5_400.2 CFAB_HUMAN 0.714
NCSFSIIYPVVIK_770.4_555.4 CRHBP_HUMAN 0.714
HPWIVHWDQLPQYQLNR_744.0_918.5 KS6A3_HUMAN 0.712
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN 0.711
ALDLSLK_380.2_185.1 ITIH3_HUMAN 0.711
ALDLSLK_380.2_575.3 ITIH3_HUMAN 0.710
LDFHFSSDR_375.2_611.3 INHBC_HUMAN 0.709
TLNAYDHR_330.5_312.2 PAR3_HUMAN 0.707
EVFSKPISWEELLQ_852.9_260.2 FA40A_HUMAN 0.706
IAPQLSTEELVSLGEK_857.5_533.3 AFAM_HUMAN 0.704
LIENGYFHPVK_439.6_343.2 F13B_HUMAN 0.703
NFPSPVDAAFR_610.8_775.4 HEMO_HUMAN 0.703
QLYGDTGVLGR_589.8_345.2 C08G_HUMAN 0.702
LYYGDDEK_501.7_563.2 C08A_HUMAN 0.702
FQLSETNR_497.8_476.3 PSG2_HUMAN 0.701
TGVAVNKPAEFTVDAK_549.6_977.5 FLNA_HUMAN 0.700
IPGIFELGISSQSDR_809.9_679.3 C08B_HUMAN 0.700
TLFIFGVTK_513.3_215.1 PSG4_HUMAN 0.699
YYGYTGAFR_549.3_450.3 TRFL_HUMAN 0.699
QVFAVQR_424.2_473.3 ELNE_HUMAN 0.699
AQPVQVAEGSEPDGFWEALGGK_758.0_623.4 GELS_HUMAN 0.699
DFNQFSSGEK_386.8_189.1 FETA_HUMAN 0.699
SVSLPSLDPASAK_636.4_473.3 APOB_HUMAN 0.699
GNGLTWAEK_488.3_634.3 C163B_HUMAN 0.698
LYYGDDEK_501.7_726.3 C08A_HUMAN 0.698
NFPSPVDAAFR_610.8_959.5 HEMO_HUMAN 0.698
FAFNLYR_465.8_565.3 HEP2_HUMAN 0.697
SGFSFGFK_438.7_585.3 C08B_HUMAN 0.696
DFHINLFQVLPWLK_885.5_543.3 CFAB_HUMAN 0.696
LQGTLPVEAR_542.3_842.5 C05_HUMAN 0.694
GAVHWVAETDYQSFAVLYLER_822.8_863.5 C08G_HUMAN 0.694
TSESTGSLPSPFLR_739.9_716.4 PSMG1_HUMAN 0.694
YISPDQLADLYK_713.4_277.2 ENOA_HUMAN 0.694
ESDTSYVSLK_564.8_347.2 CRP_HUMAN 0.693
ILDDLSPR_464.8_587.3 ITIH4_HUMAN 0.693
VQEAHLTEDQIFYFPK_655.7_391.2 C08G_HUMAN 0.692
SGVDLADSNQK_567.3_662.3 VGFR3_HUMAN 0.692
DTDTGALLFIGK_625.8_217.1 PEDF_HUMAN 0.692
HFQNLGK_422.2_285.1 AFAM_HUMAN 0.691
NNQLVAGYLQGPNVNLEEK_700.7_999.5 IL1RA_HUMAN 0.691
IPGIFELGISSQSDR_809.9_849.4 C08B_HUMAN 0.691
ESDTSYVSLK_564.8_696.4 CRP_HUMAN 0.690
GAVHWVAETDYQSFAVLYLER_822.8_580.3 C08G_HUMAN 0.690
DADPDTFFAK_563.8_302.1 AFAM_HUMAN 0.690
LDFHFSSDR_375.2_464.2 INHBC_HUMAN 0.689
TLFIFGVTK_513.3_811.5 PSG4_HUMAN 0.688
DFNQFSSGEK_386.8_333.2 FETA_HUMAN 0.687
IQTHSTTYR_369.5_627.3 F13B_HUMAN 0.686
HYFIAAVER_553.3_658.4 FA8_HUMAN 0.686
VNHVTLSQPK_374.9_244.2 B2MG_HUMAN 0.686
DLHLSDVFLK_396.2_366.2 C06_HUMAN 0.685
DPTFIPAPIQAK_433.2_556.3 ANGT_HUMAN 0.684
AGITIPR_364.2_272.2 IL17_HUMAN 0.684
IAQYYYTFK_598.8_884.4 F13B_HUMAN 0.684
SGVDLADSNQK_567.3_591.3 VGFR3_HUMAN 0.683
VEPLYELVTATDFAYSSTVR_754.4_549.3 C08B_HUMAN 0.682
AGITIPR_364.2_486.3 IL17_HUMAN 0.682
YEVQGEVFTKPQLWP_911.0_293.1 CRP_HUMAN 0.681
APLTKPLK_289.9_357.2 CRP_HUMAN 0.681
YNSQLLSFVR_613.8_508.3 TFR1_HUMAN 0.681
ANDQYLTAAALHNLDEAVK_686.4_301.1 IL1A_HUMAN 0.681
IQTHSTTYR_369.5_540.3 F13B_HUMAN 0.681
IHPSYTNYR_575.8_598.3 PSG2_HUMAN 0.681
TEFLSNYLTNVDDITLVPGTLGR_846.8_699.4 ENPP2_HUMAN 0.681
DPTFIPAPIQAK_433.2_461.2 ANGT_HUMAN 0.679
FQSVFTVTR_542.8_623.4 C1QC_HUMAN 0.679
LQVNTPLVGASLLR_741.0_925.6 BPIA1_HUMAN 0.679
DEIPHNDIALLK_459.9_510.8 HABP2_HUMAN 0.678
HATLSLSIPR_365.6_272.2 VGFR3_HUMAN 0.678
EDTPNSVWEPAK_686.8_315.2 C1S_HUMAN 0.678
TGISPLALIK_506.8_741.5 APOB_HUMAN 0.678
ILPSVPK_377.2_244.2 PGH1_HUMAN 0.676
HATLSLSIPR_365.6_472.3 VGFR3_HUMAN 0.676
QGHNSVFLIK_381.6_520.4 HEMO_HUMAN 0.676
LPATEKPVLLSK_432.6_460.3 HYOU1_HUMAN 0.675
APLTKPLK_289.9_398.8 CRP_HUMAN 0.674
GVTGYFTFNLYLK_508.3_683.9 PSG5_HUMAN 0.673
TFLTVYWTPER_706.9_401.2 ICAM1_HUMAN 0.673
GDTYPAELYITGSILR_885.0_274.1 F13B_HUMAN 0.672
EDTPNSVWEPAK_686.8_630.3 C1S_HUMAN 0.672
SLDFTELDVAAEK_719.4_316.2 ANGT_HUMAN 0.672
VELAPLPSWQPVGK_760.9_342.2 ICAM1_HUMAN 0.671
GPGEDFR_389.2_322.2 PTGDS_HUMAN 0.670
TDAPDLPEENQAR_728.3_843.4 C05_HUMAN 0.670
GVTGYFTFNLYLK_508.3_260.2 PSG5_HUMAN 0.669
FAFNLYR_465.8_712.4 HEP2_HUMAN 0.669
ITENDIQIALDDAK_779.9_873.5 APOB_HUMAN 0.669
ILNIFGVIK_508.8_790.5 TFR1_HUMAN 0.669
ISQGEADINIAFYQR_575.6_684.4 MMP8_HUMAN 0.668
GDTYPAELYITGSILR_885.0_1332.8 F13B_HUMAN 0.668
ELLESYIDGR_597.8_710.4 THRB_HUMAN 0.668
FTITAGSK_412.7_576.3 FABPL_HUMAN 0.667
ILDGGNK_358.7_490.2 CXCL5_HUMAN 0.667
GWVTDGFSSLK_598.8_854.4 APOC3_HUMAN 0.667
FSLVSGWGQLLDR_493.3_403.2 FA7_HUMAN 0.665
IHPSYTNYR_575.8_813.4 PSG2_HUMAN 0.665
ELLESYIDGR_597.8_839.4 THRB_HUMAN 0.665
SDGAKPGPR_442.7_213.6 COLI_HUMAN 0.664
IAQYYYTFK_598.8_395.2 F13B_HUMAN 0.664
SILFLGK_389.2_201.1 THBG_HUMAN 0.664
IEVNESGTVASSSTAVIVSAR_693.0_545.3 PAI1_HUMAN 0.664
VSAPSGTGHLPGLNPL_506.3_300.7 PSG3_HUMAN 0.664
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN 0.664
YYGYTGAFR_549.3_771.4 TRFL_HUMAN 0.663
TDAPDLPEENQAR_728.3_613.3 C05_HUMAN 0.663
IEVIITLK_464.8_815.5 CXL11_HUMAN 0.662
ILPSVPK_377.2_227.2 PGH1_HUMAN 0.662
FGFGGSTDSGPIR_649.3_745.4 ADA12_HUMAN 0.661
DYWSTVK_449.7_347.2 APOC3_HUMAN 0.661
IEGNLIFDPNNYLPK_874.0_845.5 APOB_HUMAN 0.661
WILTAAHTLYPK_471.9_407.2 C1R_HUMAN 0.661
WNFAYWAAHQPWSR_607.3_545.3 PRG2_HUMAN 0.661
SILFLGK_389.2_577.4 THBG_HUMAN 0.661
FSLVSGWGQLLDR_493.3_516.3 FA7_HUMAN 0.661
DTDTGALLFIGK_625.8_818.5 PEDF_HUMAN 0.661
SEYGAALAWEK_612.8_845.5 C06_HUMAN 0.660
LWAYLTIQELLAK_781.5_371.2 ITIH1_HUMAN 0.660
LLEVPEGR_456.8_356.2 C1S_HUMAN 0.659
ITENDIQIALDDAK_779.9_632.3 APOB_HUMAN 0.659
LTTVDIVTLR_565.8_716.4 IL2RB_HUMAN 0.658
IEVIITLK_464.8_587.4 CXL11_HUMAN 0.658
QLGLPGPPDVPDHAAYHPF_676.7_299.2 ITIH4_HUMAN 0.658
TLAFVR_353.7_492.3 FA7_HUMAN 0.656
NSDQEIDFK_548.3_294.2 S10A5_HUMAN 0.656
YHFEALADTGISSEFYDNANDLLSK_940.8_874.5 C08A_HUMAN 0.656
SEPRPGVLLR_375.2_454.3 FA7_HUMAN 0.655
FLPCENK_454.2_390.2 IL10_HUMAN 0.654
NCSFSIIYPVVIK_770.4_831.5 CRHBP_HUMAN 0.654
SLDFTELDVAAEK_719.4_874.5 ANGT_HUMAN 0.654
ILLLGTAVESAWGDEQSAFR_721.7_909.4 CXA1_HUMAN 0.653
SVSLPSLDPASAK_636.4_885.5 APOB_HUMAN 0.653
TGISPLALIK_506.8_654.5 APOB_HUMAN 0.653
YNQLLR_403.7_288.2 ENOA_HUMAN 0.653
YEVQGEVFTKPQLWP_911.0_392.2 CRP_HUMAN 0.652
VPGLYYFTYHASSR_554.3_720.3 C1QB_HUMAN 0.650
SLQNASAIESILK_687.4_589.4 IL3_HUMAN 0.650
WI LTAAHTLYPK_471.9_621.4 C1R_HUMAN 0.650
GWVTDGFSSLK_598.8_953.5 APOC3_HUMAN 0.650
YGIEEHGK_311.5_599.3 CXA1_HUMAN 0.649
QDLGWK_373.7_503.3 TGFB3_HUMAN 0.649
DYWSTVK_449.7_620.3 APOC3_HUMAN 0.648
ALVLELAK_428.8_331.2 INHBE_HUMAN 0.647
QLGLPGPPDVPDHAAYHPF_676.7_263.1 ITIH4_HUMAN 0.646
SEYGAALAWEK_612.8_788.4 C06_HUMAN 0.645
TFLTVYWTPER_706.9_502.3 ICAM1_HUMAN 0.644
FQSVFTVTR_542.8_722.4 C1QC_HUMAN 0.643
DPNGLPPEAQK_583.3_669.4 RET4_HUMAN 0.642
ETLLQDFR_511.3_322.2 AMBP_HUMAN 0.642
IIEVEEEQEDPYLNDR_996.0_777.4 FBLN1_HUMAN 0.641
ELCLDPK_437.7_359.2 IL8_HUMAN 0.641
TPSAAYLWVGTGASEAEK_919.5_849.4 GELS_HUMAN 0.641
NQSPVLEPVGR_598.3_866.5 KS6A3_HUMAN 0.641
FNAVLTNPQGDYDTSTGK_964.5_333.2 C1QC_HUMAN 0.641
LLEVPEGR_456.8_686.4 C1S_HUMAN 0.641
FFQYDTWK_567.8_840.4 IGF2_HUMAN 0.640
SPEAEDPLGVER_649.8_670.4 Z512B_HUMAN 0.639
SEPRPGVLLR_375.2_654.4 FA7_HUMAN 0.639
SGAQATWTELPWPHEK_613.3_793.4 HEMO_HUMAN 0.638
YSHYNER_323.5_581.3 HABP2_HUMAN 0.638
YHFEALADTGISSEFYDNANDLLSK_940.8_301.1 C08A_HUMAN 0.637
DLHLSDVFLK_396.2_260.2 C06_HUMAN 0.637
YSHYNER_323.5_418.2 HABP2_HUMAN 0.637
YYLQGAK_421.7_327.1 ITIH4_HUMAN 0.636
EVPLSALTNILSAQLISHWK_740.8_996.6 PAI1_HUMAN 0.636
VPGLYYFTYHASSR_554.3_420.2 C1QB_HUMAN 0.636
AALAAFNAQNNGSNFQLEEISR_789.1_746.4 FETUA_HUMAN 0.636
ETLLQDFR_511.3_565.3 AMBP_HUMAN 0.635
IVLSLDVPIGLLQILLEQAR_735.1_503.3 UCN2_HUMAN 0.635
ENPAVIDFELAPIVDLVR_670.7_811.5 C06_HUMAN 0.635
LQLSETNR_480.8_355.2 PSG8_HUMAN 0.635
DPDQTDGLGLSYLSSHIANVER_796.4_456.2 GELS_HUMAN 0.635
NVNQSLLELHK_432.2_656.3 FRIH_HUMAN 0.634
EIGELYLPK_531.3_633.4 AACT_HUMAN 0.634
SPEQQETVLDGNLIIR_906.5_699.3 ITIH4_HUMAN 0.634
NKPGVYTDVAYYLAWIR_677.0_545.3 FA12_HUMAN 0.632
QNYHQDSEAAINR_515.9_544.3 FRIH_HUMAN 0.632
EKPAGGIPVLGSLVNTVLK_631.4_930.6 BPIB1_HUMAN 0.632
VTFEYR_407.7_614.3 CRHBP_HUMAN 0.630
DLPHITVDR_533.3_490.3 MMP7_HUMAN 0.630
VEHSDLSFSK_383.5_234.1 B2MG_HUMAN 0.630
ENPAVIDFELAPIVDLVR_670.7_601.4 C06_HUMAN 0.630
YGFYTHVFR_397.2_659.4 THRB_HUMAN 0.629
ILDDLSPR_464.8_702.3 ITIH4_HUMAN 0.629
DPNGLPPEAQK_583.3_497.2 RET4_HUMAN 0.629
GSLVQASEANLQAAQDFVR_668.7_806.4 ITIH1_HUMAN 0.629
FLYHK_354.2_447.2 AMBP_HUMAN 0.627
FNAVLTNPQGDYDTSTGK_964.5_262.1 C1QC_HUMAN 0.627
LQDAGVYR_461.2_680.3 PD1L1_HUMAN 0.627
INPASLDK_429.2_630.4 C163A_HUMAN 0.626
LEEHYELR_363.5_580.3 PAI2_HUMAN 0.625
VEHSDLSFSK_383.5_468.2 B2MG_HUMAN 0.624
TSDQIHFFFAK_447.6_659.4 ANT3_HUMAN 0.624
ATLSAAPSNPR_542.8_570.3 CXCL2_HUMAN 0.624
YGFYTHVFR_397.2_421.3 THRB_HUMAN 0.624
EANQSTLENFLER_775.9_678.4 IL4_HUMAN 0.623
GQQPADVTGTALPR_705.9_314.2 CSF1_HUMAN 0.623
VELAPLPSWQPVGK_760.9_400.3 ICAM1_HUMAN 0.622
GEVTYTTSQVSK_650.3_750.4 EGLN_HUMAN 0.622
SLQAFVAVAAR_566.8_487.3 IL23A_HUMAN 0.622
HYGGLTGLNK_530.3_301.1 PGAM1_HUMAN 0.622
GPEDQDISISFAWDK_854.4_753.4 DEF4_HUMAN 0.622
YWISQGLDKPR_458.9_400.3 LRP1_HUMAN 0.621
LWAYLTIQELLAK_781.5_300.2 ITIH1_HUMAN 0.621
SGAQATWTELPWPHEK_613.3_510.3 HEMO_HUMAN 0.621
GTAEWLSFDVTDTVR_848.9_952.5 TGFB3_HUMAN 0.621
FFQYDTWK_567.8_712.3 IGF2_HUMAN 0.621
AHQLAIDTYQEFEETYIPK_766.0_634.4 CSH_HUMAN 0.620
LPATEKPVLLSK_432.6_347.2 HYOU1_HUMAN 0.620
NIQSVNVK_451.3_546.3 GROA_HUMAN 0.620
TAVTANLDIR_537.3_288.2 CHL1_HUMAN 0.619
WSAGLTSSQVDLYIPK_883.0_515.3 CBG_HUMAN 0.616
QINSYVK_426.2_496.3 CBG_HUMAN 0.616
GFQALGDAADIR_617.3_288.2 TIMP1_HUMAN 0.615
WNFAYWAAHQPWSR_607.3_673.3 PRG2_HUMAN 0.615
NEIWYR_440.7_357.2 FA12_HUMAN 0.615
VLEPTLK_400.3_587.3 VTDB_HUMAN 0.614
YYLQGAK_421.7_516.3 ITIH4_HUMAN 0.614
ALNSIIDVYHK_424.9_774.4 S10A8_HUMAN 0.614
ETPEGAEAKPWYEPIYLGGVFQLEK_951.1_877.5 TNFA_HUMAN 0.614
LNIGYIEDLK_589.3_837.4 PAI2_HUMAN 0.614
NVNQSLLELHK_432.2_543.3 FRIH_HUMAN 0.613
ILLLGTAVESAWGDEQSAFR_721.7_910.6 CXA1_HUMAN 0.613
AALAAFNAQNNGSNFQLEEISR_789.1_633.3 FETUA_HUMAN 0.613
VLEPTLK_400.3_458.3 VTDB_HUMAN 0.613
VGEYSLYIGR_578.8_708.4 SAMP_HUMAN 0.613
DIPHWLNPTR_416.9_373.2 PAPP1_HUMAN 0.612
NEIVFPAGILQAPFYTR_968.5_357.2 ECE1_HUMAN 0.612
AEHPTWGDEQLFQTTR_639.3_765.4 PGH1_HUMAN 0.612
VEPLYELVTATDFAYSSTVR_754.4_712.4 C08B_HUMAN 0.611
DEIPHNDIALLK_459.9_260.2 HABP2_HUMAN 0.611
QINSYVK_426.2_610.3 CBG_HUMAN 0.610
SWNEPLYHLVTEVR_581.6_614.3 PRL_HUMAN 0.610
YGIEEHGK_311.5_341.2 CXA1_HUMAN 0.610
FGFGGSTDSGPIR_649.3_946.5 ADA12_HUMAN 0.610
ANDQYLTAAALHNLDEAVK_686.4_317.2 IL1A_HUMAN 0.610
VRPQQLVK_484.3_609.4 ITIH4_HUMAN 0.609
IPKPEASFSPR_410.2_506.3 ITIH4_HUMAN 0.609
SPEQQETVLDGNLIIR_906.5_685.4 ITIH4_HUMAN 0.609
DDLYVSDAFHK_655.3_704.3 ANT3_HUMAN 0.609
ELPEHTVK_476.8_347.2 VTDB_HUMAN 0.609
FLYHK_354.2_284.2 AMBP_HUMAN 0.608
QRPPDLDTSSNAVDLLFFTDESGDSR_961.5_262.2 C1R_HUMAN 0.608
DPDQTDGLGLSYLSSHIANVER_796.4_328.1 GELS_HUMAN 0.608
NEIWYR_440.7_637.4 FA12_HUMAN 0.607
LQLSETNR_480.8_672.4 PSG8_HUMAN 0.606
GQVPENEANVVITTLK_571.3_462.3 CADH1_HUMAN 0.606
FTGSQPFGQGVEHATANK_626.0_521.2 TSP1_HUMAN 0.605
LEPLYSASGPGLRPLVIK_637.4_260.2 CAA60698 0.605
QRPPDLDTSSNAVDLLFFTDESGDSR_961.5_866.3 C1R_HUMAN 0.604
LTTVDIVTLR_565.8_815.5 IL2RB_HUMAN 0.604
TSDQIHFFFAK_447.6_512.3 ANT3_HUMAN 0.604
IQHPFTVEEFVLPK_562.0_861.5 PZP_HUMAN 0.603
NKPGVYTDVAYYLAWIR_677.0_821.5 FA12_HUMAN 0.603
TEQAAVAR_423.2_615.4 FA12_HUMAN 0.603
EIGELYLPK_531.3_819.5 AACT_HUMAN 0.602
LFYADHPFIFLVR_546.6_647.4 SERPH_HUMAN 0.602
AEHPTWGDEQLFQTTR_639.3_569.3 PGH1_HUMAN 0.601
TSYQVYSK_488.2_787.4 C163A_HUMAN 0.601
YTTEIIK_434.2_704.4 C1R_HUMAN 0.601
NVIQISNDLENLR_509.9_402.3 LEP_HUMAN 0.600
AFLEVNEEGSEAAASTAVVIAGR_764.4_685.4 ANT3_HUMAN 0.600
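The transition identifiers in this table appear to follow the pattern peptide_precursor m/z_fragment m/z, judging from the m/z values in the transition table (for example, AVLHIGEK at 289.5080+++ with a fragment at 348.7056++ is listed as AVLHIGEK_289.5_348.7). Under that assumption, a row can be split into typed fields as sketched below. The names `AucRow`, `parse_auc_row` and `best_per_protein` are hypothetical, and the parser expects rows without stray spaces inside the identifiers.

```python
from typing import NamedTuple


class AucRow(NamedTuple):
    peptide: str
    precursor_mz: float
    fragment_mz: float
    protein: str
    auc: float


def parse_auc_row(line):
    """Split 'PEPTIDE_precursor_fragment PROTEIN AUC' into typed fields."""
    transition, protein, auc = line.split()
    # Split from the right so only the last two underscores are consumed.
    peptide, precursor_mz, fragment_mz = transition.rsplit("_", 2)
    return AucRow(peptide, float(precursor_mz), float(fragment_mz),
                  protein, float(auc))


def best_per_protein(rows):
    """Keep the highest-AUC transition for each protein."""
    best = {}
    for row in rows:
        if row.protein not in best or row.auc > best[row.protein].auc:
            best[row.protein] = row
    return best


# Three rows copied from Table 12.
ROWS = [parse_auc_row(text) for text in (
    "SDLEVAHYK_531.3_617.3 C08B_HUMAN 0.777",
    "DISEVVTPR_508.3_787.4 CFAB_HUMAN 0.774",
    "VSEADSSNADWVTK_754.9_533.3 CFAB_HUMAN 0.773",
)]
print(best_per_protein(ROWS)["CFAB_HUMAN"])
```

Picking one transition per protein in this way loosely mirrors the "single best univariate analyte" subset mentioned for model building, though the exact selection procedure used there is not specified.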
[00192] Table 13. Middle Window Individual Stats
Figure imgf000112_0001
Transition Protein AUC
YGIEEHGK_311.5_599.3 CXA1_HUMAN 0.677
ALQDQLVLVAAK_634.9_289.2 ANGT_HUMAN 0.675
VLEPTLK_400.3_587.3 VTDB_HUMAN 0.667
VNHVTLSQPK_374.9_244.2 B2MG_HUMAN 0.665
IEEIAAK_387.2_660.4 C05_HUMAN 0.664
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 0.664
TLLPVSKPEIR_418.3_514.3 C05_HUMAN 0.662
ALQDQLVLVAAK_634.9_956.6 ANGT_HUMAN 0.661
TLAFVR_353.7_492.3 FA7_HUMAN 0.661
SEPRPGVLLR_375.2_654.4 FA7_HUMAN 0.658
VEHSDLSFSK_383.5_468.2 B2MG_HUMAN 0.653
DPTFIPAPIQAK_433.2_461.2 ANGT_HUMAN 0.653
QGHNSVFLIK_381.6_260.2 HEMO_HUMAN 0.650
SLDFTELDVAAEK_719.4_874.5 ANGT_HUMAN 0.650
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN 0.649
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 0.647
SLQAFVAVAAR_566.8_804.5 IL23A_HUMAN 0.646
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN 0.644
QGHNSVFLIK_381.6_520.4 HEMO_HUMAN 0.644
VNHVTLSQPK_374.9_459.3 B2MG_HUMAN 0.643
DLHLSDVFLK_396.2_260.2 C06_HUMAN 0.643
TEQAAVAR_423.2_615.4 FA12_HUMAN 0.643
GPITSAAELNDPQSILLR_632.4_826.5 EGLN_HUMAN 0.643
HFQNLGK_422.2_527.2 AFAM_HUMAN 0.642
TEQAAVAR_423.2_487.3 FA12_HUMAN 0.642
AVDIPGLEAATPYR_736.9_399.2 TENA_HUMAN 0.642
TLFIFGVTK_513.3_811.5 PSG4_HUMAN 0.642
DLHLSDVFLK_396.2_366.2 C06_HUMAN 0.641
AFTECCVVASQLR_770.9_574.3 C05_HUMAN 0.640
EVFSKPISWEELLQ_852.9_376.2 FA40A_HUMAN 0.639
DPTFIPAPIQAK_433.2_556.3 ANGT_HUMAN 0.639
FSLVSGWGQLLDR_493.3_403.2 FA7_HUMAN 0.638
HYINLITR_515.3_301.1 NPY_HUMAN 0.637
HFQNLGK_422.2_285.1 AFAM_HUMAN 0.637
VPLALFALNR_557.3_620.4 PEPD_HUMAN 0.636
IHPSYTNYR_575.8_813.4 PSG2_HUMAN 0.635
IEEIAAK_387.2_531.3 C05_HUMAN 0.635
GEVTYTTSQVSK_650.3_750.4 EGLN_HUMAN 0.634
DFNQFSSGEK_386.8_333.2 FETA_HUMAN 0.634
VVGGLVALR_442.3_784.5 FA12_HUMAN 0.634
SDGAKPGPR_442.7_459.2 COLI_HUMAN 0.634
DVLLLVHNLPQNLTGHIWYK_791.8_310.2 PSG7_HUMAN 0.634
TLLPVSKPEIR_418.3_288.2 C05_HUMAN 0.633
NKPGVYTDVAYYLAWIR_677.0_821.5 FA12_HUMAN 0.630
QVFAVQR_424.2_473.3 ELNE_HUMAN 0.630
NHYTESISVAK_624.8_415.2 NEUR1_HUMAN 0.630
IAPQLSTEELVSLGEK_857.5_333.2 AFAM_HUMAN 0.629
IHPSYTNYR_575.8_598.3 PSG2_HUMAN 0.627
EVFSKPISWEELLQ_852.9_260.2 FA40A_HUMAN 0.627
SILFLGK_389.2_201.1 THBG_HUMAN 0.626
IEVIITLK_464.8_587.4 CXL11_HUMAN 0.625
VVGGLVALR_442.3_685.4 FA12_HUMAN 0.624
WLSSGSGPGLDLPLVLGLPLQLK_791.5_598.4 SHBG_HUMAN 0.624
FGFGGSTDSGPIR_649.3_946.5 ADA12_HUMAN 0.623
WLSSGSGPGLDLPLVLGLPLQLK_791.5_768.5 SHBG_HUMAN 0.622
YGIEEHGK_311.5_341.2 CXA1_HUMAN 0.621
LHEAFSPVSYQHDLALLR_699.4_380.2 FA12_HUMAN 0.621
AHYDLR_387.7_566.3 FETUA_HUMAN 0.620
FSVVYAK_407.2_381.2 FETUA_HUMAN 0.618
ALALPPLGLAPLLNLWAKPQGR_770.5_256.2 SHBG_HUMAN 0.618
YENYTSSFFIR_713.8_293.1 IL12B_HUMAN 0.617
VELAPLPSWQPVGK_760.9_342.2 ICAM1_HUMAN 0.617
SILFLGK_389.2_577.4 THBG_HUMAN 0.616
ILPSVPK_377.2_227.2 PGH1_HUMAN 0.615
IPSNPSHR_303.2_496.3 FBLN3_HUMAN 0.615
HYFIAAVER_553.3_301.1 FA8_HUMAN 0.615
FSVVYAK_407.2_579.4 FETUA_HUMAN 0.613
VFQFLEK_455.8_276.2 C05_HUMAN 0.613
IAPQLSTEELVSLGEK_857.5_533.3 AFAM_HUMAN 0.613
ILPSVPK_377.2_244.2 PGH1_HUMAN 0.613
NKPGVYTDVAYYLAWIR_677.0_545.3 FA12_HUMAN 0.613
WSAGLTSSQVDLYIPK_883.0_515.3 CBG_HUMAN 0.612
TPSAAYLWVGTGASEAEK_919.5_849.4 GELS_HUMAN 0.612
ALALPPLGLAPLLNLWAKPQGR_770.5_457.3 SHBG_HUMAN 0.612
QLGLPGPPDVPDHAAYHPF_676.7_299.2 ITIH4_HUMAN 0.612
ILDDLSPR_464.8_587.3 ITIH4_HUMAN 0.611
VELAPLPSWQPVGK_760.9_400.3 ICAM1_HUMAN 0.611
DADPDTFFAK_563.8_825.4 AFAM_HUMAN 0.611
NHYTESISVAK_624.8_252.1 NEUR1_HUMAN 0.611
SEPRPGVLLR_375.2_454.3 FA7_HUMAN 0.611
LNIGYIEDLK_589.3_950.5 PAI2_HUMAN 0.611
ANLINNIFELAGLGK_793.9_299.2 LCAP_HUMAN 0.609
LTTVDIVTLR_565.8_716.4 IL2RB_HUMAN 0.608
TQILEWAAER_608.8_761.4 EGLN_HUMAN 0.608
NEPEETPSIEK_636.8_573.3 SOX5_HUMAN 0.608
AQPVQVAEGSEPDGFWEALGGK_758.0_623.4 GELS_HUMAN 0.607
LQVNTPLVGASLLR_741.0_925.6 BPIA1_HUMAN 0.607
VPSHAVVAR_312.5_345.2 TRFL_HUMAN 0.607
SLQNASAIESILK_687.4_860.5 IL3_HUMAN 0.607
GVTGYFTFNLYLK_508.3_260.2 PSG5_HUMAN 0.605
DFNQFSSGEK_386.8_189.1 FETA_HUMAN 0.605
QLGLPGPPDVPDHAAYHPF_676.7_263.1 ITIH4_HUMAN 0.605
TLEAQLTPR_514.8_814.4 HEP2_HUMAN 0.604
AFTECCVVASQLR_770.9_673.4 C05_HUMAN 0.604
LTTVDIVTLR_565.8_815.5 IL2RB_HUMAN 0.604
TLNAYDHR_330.5_312.2 PAR3_HUMAN 0.603
LWAYLTIQELLAK_781.5_300.2 ITIH1_HUMAN 0.603
GGLFADIASHPWQAAIFAK_667.4_375.2 TPA_HUMAN 0.603
IPSNPSHR_303.2_610.3 FBLN3_HUMAN 0.603
TDAPDLPEENQAR_728.3_843.4 C05_HUMAN 0.603
SPQAFYR_434.7_684.4 REL3_HUMAN 0.602
SSNNPHSPIVEEFQVPYNK_729.4_261.2 C1S_HUMAN 0.601
AHYDLR_387.7_288.2 FETUA_HUMAN 0.600
DGSPDVTTADIGANTPDATK_973.5_844.4 PGRP2_HUMAN 0.600
SPQAFYR_434.7_556.3 REL3_HUMAN 0.600
[00193] Table 14. Middle Late Individual Stats
Figure imgf000115_0001
Transition Protein AUC
SEYGAALAWEK_612.8_788.4 C06_HU MAN 0.623
WSAGLTSSQVDLYIPK_883.0_515.3 CBG_HUMAN 0.622
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 0.622
ALQDQLVLVAAK_634.9_289.2 ANGT_HU MAN 0.621
SLQAFVAVAAR_566.8_804.5 I L23A_H UMAN 0.621
DPTFI PAPIQAK_433.2_556.3 ANGT_HU MAN 0.620
FGFGGSTDSGPI R_649.3_946.5 ADA12_HUMAN 0.619
VLEPTLK_400.3_458.3 VTDB_HUMAN 0.619
SLDFTELDVAAEK_719.4_874.5 ANGT_HU MAN 0.618
EVFSKPISWEELLQ_852.9_376.2 FA40A_HUMAN 0.618
FGFGGSTDSGPI R_649.3_745.4 ADA12_HUMAN 0.618
TPSAAYLWVGTGASEAEK_919.5_849.4 GELS_HUMAN 0.615
LHEAFSPVSYQHDLALLR_699.4_251.2 FA12_HUMAN 0.615
TLEAQLTPR_514.8_685.4 HEP2_HUMAN 0.613
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN 0.612
GYQELLEK_490.3_631.4 FETA_HUMAN 0.612
VPLALFALNR_557.3_917.6 PEPD_HUMAN 0.611
DLHLSDVFLK_396.2_260.2 C06_HU MAN 0.611
LTTVDIVTLR_565.8_815.5 I L2RB_HUMAN 0.608
WSAGLTSSQVDLYIPK_883.0_357.2 CBG_HUMAN 0.608
ITQDAQLK_458.8_702.4 CBG_HUMAN 0.608
NIQSVNVK_451.3_674.4 GROA_HUMAN 0.607
ALEQDLPVNIK_620.4_570.4 CNDP1_HUMAN 0.607
TLNAYDHR_330.5_312.2 PAR3_HUMAN 0.606
LWAYLTIQELLAK_781.5_300.2 ITIH1_HUMAN 0.606
VVGGLVALR_442.3_784.5 FA12_HUMAN 0.605
AQPVQVAEGSEPDGFWEALGGK_758.0_623.4 GELS_HUMAN 0.603
SVVLIPLGAVDDGEHSQNEK_703.0_798.4 CNDP1_HUMAN 0.603
SETEIHQGFQHLHQLFAK_717.4_318.1 CBG_HUMAN 0.603
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN 0.603
IEVIITLK_464.8_587.4 CXL11_HUMAN 0.602
ITQDAQLK_458.8_803.4 CBG_HUMAN 0.602
AEIEYLEK_497.8_552.3 LYAM1_HUMAN 0.601
AVDIPGLEAATPYR_736.9_399.2 TENA_HUMAN 0.601
LTTVDIVTLR_565.8_716.4 IL2RB_HUMAN 0.600
WWGGQPLWITATK_772.4_929.5 ENPP2_HUMAN 0.600
[00194] Table 15. Late Window Individual Stats
Transition Protein AUC
AVYEAVLR_460.8_587.4 PEPD_HUMAN 0.724
AEIEYLEK_497.8_552.3 LYAM1_HUMAN 0.703
QINSYVK_426.2_496.3 CBG_HUMAN 0.695
AVYEAVLR_460.8_750.4 PEPD_HUMAN 0.693
AALAAFNAQNNGSNFQLEEISR_789.1_746.4 FETUA_HUMAN 0.684
QINSYVK_426.2_610.3 CBG_HUMAN 0.681
VPLALFALNR_557.3_620.4 PEPD_HUMAN 0.678
VGVISFAQK_474.8_580.3 TFR2_HUMAN 0.674
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN 0.670
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 0.670
LIEIANHVDK_384.6_498.3 ADA12_HUMAN 0.660
SGVDLADSNQK_567.3_662.3 VGFR3_HUMAN 0.660
TSYQVYSK_488.2_787.4 C163A_HUMAN 0.657
ITQDAQLK_458.8_702.4 CBG_HUMAN 0.652
YYGYTGAFR_549.3_450.3 TRFL_HUMAN 0.650
ALEQDLPVNIK_620.4_798.5 CNDP1_HUMAN 0.650
VFQYIDLHQDEFVQTLK_708.4_375.2 CNDP1_HUMAN 0.650
SGVDLADSNQK_567.3_591.3 VGFR3_HUMAN 0.648
YENYTSSFFIR_713.8_756.4 IL12B_HUMAN 0.647
VLSSIEQK_452.3_691.4 1433S_HUMAN 0.647
YSHYNER_323.5_418.2 HABP2_HUMAN 0.646
ILDGGNK_358.7_603.3 CXCL5_HUMAN 0.645
GTYLYNDCPGPGQDTDCR_697.0_666.3 TNR1A_HUMAN 0.645
AEIEYLEK_497.8_389.2 LYAM1_HUMAN 0.645
TLPFSR_360.7_506.3 LYAM1_HUMAN 0.645
DEIPHNDIALLK_459.9_510.8 HABP2_HUMAN 0.644
ALEQDLPVNIK_620.4_570.4 CNDP1_HUMAN 0.644
SPEAEDPLGVER_649.8_314.1 Z512B_HUMAN 0.644
FGFGGSTDSGPIR_649.3_745.4 ADA12_HUMAN 0.642
TASDFITK_441.7_781.4 GELS_HUMAN 0.641
SETEIHQGFQHLHQLFAK_717.4_447.2 CBG_HUMAN 0.640
SPQAFYR_434.7_556.3 REL3_HUMAN 0.639
TAVTANLDIR_537.3_288.2 CHL1_HUMAN 0.636
VPLALFALNR_557.3_917.6 PEPD_HUMAN 0.636
YISPDQLADLYK_713.4_277.2 ENOA_HUMAN 0.633
SETEIHQGFQHLHQLFAK_717.4_318.1 CBG_HUMAN 0.633
SEPRPGVLLR_375.2_654.4 FA7_HUMAN 0.633
GYQELLEK_490.3_631.4 FETA_HUMAN 0.633
AYSDLSR_406.2_375.2 SAMP_HUMAN 0.633
SVVLIPLGAVDDGEHSQNEK_703.0_798.4 CNDP1_HUMAN 0.632
TLEAQLTPR_514.8_685.4 HEP2_HUMAN 0.631
WSAGLTSSQVDLYIPK_883.0_515.3 CBG_HUMAN 0.631
TEQAAVAR_423.2_615.4 FA12_HUMAN 0.628
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN 0.626
AGITIPR_364.2_486.3 IL17_HUMAN 0.626
AEVIWTSSDHQVLSGK_586.3_300.2 PD1L1_HUMAN 0.625
TEQAAVAR_423.2_487.3 FA12_HUMAN 0.625
NHYTESISVAK_624.8_415.2 NEUR1_HUMAN 0.625
WSAGLTSSQVDLYIPK_883.0_357.2 CBG_HUMAN 0.623
YSHYNER_323.5_581.3 HABP2_HUMAN 0.623
DFNQFSSGEK_386.8_333.2 FETA_HUMAN 0.621
NIQSVNVK_451.3_674.4 GROA_HUMAN 0.620
SVVLIPLGAVDDGEHSQNEK_703.0_286.2 CNDP1_HUMAN 0.620
TLAFVR_353.7_492.3 FA7_HUMAN 0.619
AVDIPGLEAATPYR_736.9_286.1 TENA_HUMAN 0.619
TEFLSNYLTNVDDITLVPGTLGR_846.8_600.3 ENPP2_HUMAN 0.618
YWGVASFLQK_599.8_849.5 RET4_HUMAN 0.618
TPSAAYLWVGTGASEAEK_919.5_428.2 GELS_HUMAN 0.618
DPNGLPPEAQK_583.3_669.4 RET4_HUMAN 0.617
TYLHTYESEI_628.3_908.4 ENPP2_HUMAN 0.616
SPQAFYR_434.7_684.4 REL3_HUMAN 0.616
TPSAAYLWVGTGASEAEK_919.5_849.4 GELS_HUMAN 0.615
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 0.615
IEVNESGTVASSSTAVIVSAR_693.0_545.3 PAI1_HUMAN 0.615
LTTVDIVTLR_565.8_815.5 IL2RB_HUMAN 0.615
LWAYLTIQELLAK_781.5_371.2 ITIH1_HUMAN 0.613
SYTITGLQPGTDYK_772.4_352.2 FINC_HUMAN 0.612
GAVHVVVAETDYQSFAVLYLER_822.8_863.5 C08G_HUMAN 0.612
FQLPGQK_409.2_276.1 PSG1_HUMAN 0.612
ILDGGNK_358.7_490.2 CXCL5_HUMAN 0.611
DYWSTVK_449.7_620.3 APOC3_HUMAN 0.611
AGLLRPDYALLGHR_518.0_595.4 PGRP2_HUMAN 0.611
ALNFGGIGVVVGHELTHAFDDQGR_837.1_360.2 ECE1_HUMAN 0.611
GYQELLEK_490.3_502.3 FETA_HUMAN 0.611
HATLSLSIPR_365.6_472.3 VGFR3_HUMAN 0.610
SVPVTKPVPVTKPITVTK_631.1_658.4 Z512B_HUMAN 0.610
FQLPGQK_409.2_429.2 PSG1_HUMAN 0.610
IYLQPGR_423.7_329.2 ITIH2_HUMAN 0.610
TLNAYDHR_330.5_312.2 PAR3_HUMAN 0.609
DPNGLPPEAQK_583.3_497.2 RET4_HUMAN 0.609
FGFGGSTDSGPIR_649.3_946.5 ADA12_HUMAN 0.609
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 0.608
GAVHVVVAETDYQSFAVLYLER_822.8_580.3 C08G_HUMAN 0.608
VPSHAVVAR_312.5_515.3 TRFL_HUMAN 0.608
YWGVASFLQK_599.8_350.2 RET4_HUMAN 0.608
EWVAIESDSVQPVPR_856.4_468.3 CNDP1_HUMAN 0.607
LQDAGVYR_461.2_680.3 PD1L1_HUMAN 0.607
DLYHYITSYVVDGEIIIYGPAYSGR_955.5_650.3 PSG1_HUMAN 0.607
LWAYLTIQELLAK_781.5_300.2 ITIH1_HUMAN 0.606
ITENDIQIALDDAK_779.9_632.3 APOB_HUMAN 0.606
SYTITGLQPGTDYK_772.4_680.3 FINC_HUMAN 0.606
FFQYDTWK_567.8_712.3 IGF2_HUMAN 0.605
IYLQPGR_423.7_570.3 ITIH2_HUMAN 0.605
YNQLLR_403.7_529.4 ENOA_HUMAN 0.605
WWGGQPLWITATK_772.4_929.5 ENPP2_HUMAN 0.605
WWGGQPLWITATK_772.4_373.2 ENPP2_HUMAN 0.605
TASDFITK_441.7_710.4 GELS_HUMAN 0.605
EWVAIESDSVQPVPR_856.4_486.2 CNDP1_HUMAN 0.605
YEFLNGR_449.7_606.3 PLMN_HUMAN 0.604
SNPVTLNVLYGPDLPR_585.7_654.4 PSG6_HUMAN 0.604
ITQDAQLK_458.8_803.4 CBG_HUMAN 0.603
LTTVDIVTLR_565.8_716.4 IL2RB_HUMAN 0.602
FNAVLTNPQGDYDTSTGK_964.5_262.1 C1QC_HUMAN 0.602
ITGFLKPGK_320.9_301.2 LBP_HUMAN 0.601
DYWSTVK_449.7_347.2 APOC3_HUMAN 0.601
DPTFIPAPIQAK_433.2_556.3 ANGT_HUMAN 0.601
GWVTDGFSSLK_598.8_953.5 APOC3_HUMAN 0.601
YYGYTGAFR_549.3_771.4 TRFL_HUMAN 0.601
ELPEHTVK_476.8_347.2 VTDB_HUMAN 0.601
FTFTLHLETPKPSISSSNLNPR_829.4_874.4 PSG1_HUMAN 0.601
DLYHYITSYVVDGEIIIYGPAYSGR_955.5_707.3 PSG1_HUMAN 0.601
[00195] Table 16. Lasso Early 32
Figure imgf000119_0001
Variable Protein Coefficient
VQTAHFK_277.5_431.2 C08A_HUMAN 9.09
FLNWIK_410.7_560.3 HABP2_HUMAN 6.15
ITGFLKPGK_320.9_429.3 LBP_HUMAN 5.29
ELIEELVNITQNQK_557.6_517.3 IL13_HUMAN 3.83
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 3.41
DISEVVTPR_508.3_787.4 CFAB_HUMAN 0.44
AHYDLR_387.7_288.2 FETUA_HUMAN 0.1
[00196] Table 17. Lasso Early 100
Variable Protein Coefficient
LIQDAVTGLTVNGQITGDK_972.0_798.4 ITIH3_HUMAN 6.56
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 6.51
VQTAHFK_277.5_431.2 C08A_HUMAN 4.51
NIQSVNVK_451.3_674.4 GROA_HUMAN 3.12
TYLHTYESEI_628.3_908.4 ENPP2_HUMAN 2.68
LIENGYFHPVK_439.6_627.4 F13B_HUMAN 2.56
AVLHIGEK_289.5_292.2 THBG_HUMAN 2.11
FLNWIK_410.7_560.3 HABP2_HUMAN 1.85
ITGFLKPGK_320.9_429.3 LBP_HUMAN 1.36
DALSSVQESQVAQQAR_573.0_672.4 APOC3_HUMAN 1.3
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 0.83
FLPCENK_454.2_550.2 IL10_HUMAN 0.39
ELIEELVNITQNQK_557.6_517.3 IL13_HUMAN 0.3
TEFLSNYLTNVDDITLVPGTLGR_846.8_600.3 ENPP2_HUMAN 0.29
VSEADSSNADWVTK_754.9_347.2 CFAB_HUMAN 0.27
ITLPDFTGDLR_624.3_288.2 LBP_HUMAN 0.13
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN 0.04
TASDFITK_441.7_781.4 GELS_HUMAN -5.91
[00197] Table 18. Lasso Protein Early Window
Figure imgf000120_0001
Variable Protein Coefficient
ITLPDFTGDLR_624.3_288.2 LBP_HUMAN 0.7
DISEVVTPR_508.3_787.4 CFAB_HUMAN 0.45
EVFSKPISWEELLQ_852.9_260.2 FA40A_HUMAN 0.17
YYGYTGAFR_549.3_450.3 TRFL_HUMAN 0.06
TASDFITK_441.7_781.4 GELS_HUMAN -7.65
[00198] Table 19. Lasso All Early Window
Figure imgf000121_0001
ILNIFGVIK_508.8_790.5 TFR1_HUMAN 0.05
TPSAAYLWVGTGASEAEK_919.5_849.4 GELS_HUMAN -2.69
VGVISFAQK_474.8_693.4 TFR2_HUMAN -5.68
LNIGYIEDLK_589.3_950.5 PAI2_HUMAN -1.43
GQVPENEANWITTLK_571.3_462.3 CADH1_HUMAN -0.55
STPSLTTK_417.7_549.3 IL6RA_HUMAN -0.59
ALLLGWVPTR_563.3_373.2 PAR4_HUMAN -0.97
[00199] Table 20: Lasso SummedCoef Early Window
Transition Protein SumBestCoefs
LIQDAVTGLTVNGQITGDK_972.0_798.4 ITIH3_HUMAN 1173.723955
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 811.0150364
ELIEELVNITQNQK_557.6_618.3 IL13_HUMAN 621.9659363
VQTAHFK_277.5_431.2 C08A_HUMAN 454.178544
NIQSVNVK_451.3_674.4 GROA_HUMAN 355.9550674
TLFIFGVTK_513.3_811.5 PSG4_HUMAN 331.8629189
GPGEDFR_389.2_322.2 PTGDS_HUMAN 305.9079494
FLPCENK_454.2_550.2 IL10_HUMAN 296.9473975
FLNWIK_410.7_560.3 HABP2_HUMAN 282.9841332
LIENGYFHPVK_439.6_627.4 F13B_HUMAN 237.5320227
ECEELEEK_533.2_405.2 IL15_HUMAN 200.38281
FGFGGSTDSGPIR_649.3_745.4 ADA12_HUMAN 194.6252869
QALEEFQK_496.8_680.3 C08B_HUMAN 179.2518843
IIEVEEEQEDPYLNDR_996.0_777.4 FBLN1_HUMAN 177.7534111
TYLHTYESEI_628.3_908.4 ENPP2_HUMAN 164.9735228
ELIEELVNITQNQK_557.6_517.3 IL13_HUMAN 162.2414693
LEEHYELR_363.5_580.3 PAI2_HUMAN 152.9262386
ISQGEADINIAFYQR_575.6_684.4 MMP8_HUMAN 144.2445011
HPWIVHWDQLPQYQLNR_744.0_918.5 KS6A3_HUMAN 140.2287926
AHYDLR_387.7_288.2 FETUA_HUMAN 137.9737525
GFQALGDAADIR_617.3_288.2 TIMP1_HUMAN 130.4945567
SWNEPLYHLVTEVR_581.6_716.4 PRL_HUMAN 127.442646
SGVDLADSNQK_567.3_591.3 VGFR3_HUMAN 120.5149446
YENYTSSFFIR_713.8_293.1 IL12B_HUMAN 117.0947487
FFQYDTWK_567.8_840.4 IGF2_HUMAN 109.8569617
HYFIAAVER_553.3_658.4 FA8_HUMAN 106.9426543
ITGFLKPGK_320.9_429.3 LBP_HUMAN 103.8056505
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 98.50490812
SGVDLADSNQK_567.3_662.3 VGFR3_HUMAN 97.19989285
ALDLSLK_380.2_575.3 ITIH3_HUMAN 94.84900337
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN 92.52335783
HPWIVHWDQLPQYQLNR_744.0_1047.0 KS6A3_HUMAN 91.77547608
LIQDAVTGLTVNGQITGDK_972.0_640.4 ITIH3_HUMAN 83.6483639
LNIGYIEDLK_589.3_837.4 PAI2_HUMAN 83.50221521
IALGGLLFPASNLR_481.3_657.4 SHBG_HUMAN 79.33146741
LPATEKPVLLSK_432.6_460.3 HYOU1_HUMAN 78.89429168
FQLSETNR_497.8_605.3 PSG2_HUMAN 78.13445824
NEIVFPAGILQAPFYTR_968.5_357.2 ECE1_HUMAN 75.12145257
ALDLSLK_380.2_185.1 ITIH3_HUMAN 63.05454715
DLHLSDVFLK_396.2_366.2 C06_HUMAN 58.26831142
TQILEWAAER_608.8_761.4 EGLN_HUMAN 57.29461621
FSVVYAK_407.2_381.2 FETUA_HUMAN 54.78436389
VSEADSSNADWVTK_754.9_347.2 CFAB_HUMAN 54.40003244
DPNGLPPEAQK_583.3_669.4 RET4_HUMAN 53.89169348
VQEAHLTEDQIFYFPK_655.7_701.4 C08G_HUMAN 53.33747599
LSSPAVITDK_515.8_830.5 PLMN_HUMAN 53.22513181
ITLPDFTGDLR_624.3_288.2 LBP_HUMAN 51.5477235
AVLHIGEK_289.5_292.2 THBG_HUMAN 49.73092632
GEVTYTTSQVSK_650.3_750.4 EGLN_HUMAN 45.14743629
GYVIIKPLVWV_643.9_854.6 SAMP_HUMAN 44.05164273
TGVAVNKPAEFTVDAK_549.6_977.5 FLNA_HUMAN 42.99898046
YYGYTGAFR_549.3_450.3 TRFL_HUMAN 42.90897411
ILDGGNK_358.7_490.2 CXCL5_HUMAN 42.60771281
FLPCENK_454.2_390.2 IL10_HUMAN 42.56799651
GFQALGDAADIR_617.3_717.4 TIMP1_HUMAN 38.68456017
SDGAKPGPR_442.7_213.6 COLI_HUMAN 38.47800265
NTGVISVVTTGLDR_716.4_662.4 CADH1_HUMAN 32.62953675
SERPPIFEIR_415.2_288.2 LRP1_HUMAN 31.48248968
DFHINLFQVLPWLK_885.5_400.2 CFAB_HUMAN 31.27286268
DALSSVQESQVAQQAR_573.0_672.4 APOC3_HUMAN 31.26972354
ELCLDPK_437.7_359.2 IL8_HUMAN 29.91108737
ILNIFGVIK_508.8_790.5 TFR1_HUMAN 29.88784921
TEFLSNYLTNVDDITLVPGTLGR_846.8_600.3 ENPP2_HUMAN 29.42327998
GAVHVVVAETDYQSFAVLYLER_822.8_863.5 C08G_HUMAN 26.70286929
AVLHIGEK_289.5_348.7 THBG_HUMAN 25.78703299
TFLTVYWTPER_706.9_401.2 ICAM1_HUMAN 24.73090242
AGITIPR_364.2_486.3 IL17_HUMAN 23.84580477
GAVHVVVAETDYQSFAVLYLER_822.8_580.3 C08G_HUMAN 23.81167843
SLQAFVAVAAR_566.8_487.3 IL23A_HUMAN 23.61468839
SWNEPLYHLVTEVR_581.6_614.3 PRL_HUMAN 23.2538221
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 22.70115313
TAHISGLPPSTDFIVYLSGLAPSIR_871.5_800.5 TENA_HUMAN 22.42695892
QNYHQDSEAAINR_515.9_544.3 FRIH_HUMAN 21.96827269
AHQLAIDTYQEFEETYIPK_766.0_634.4 CSH_HUMAN 21.75765717
GDTYPAELYITGSILR_885.0_274.1 F13B_HUMAN 20.89751398
AHYDLR_387.7_566.3 FETUA_HUMAN 20.67629529
IALGGLLFPASNLR_481.3_412.3 SHBG_HUMAN 19.28973033
ATNATLDPR_479.8_272.2 PAR1_HUMAN 18.77604574
FSVVYAK_407.2_579.4 FETUA_HUMAN 17.81136564
HTLNQIDEVK_598.8_951.5 FETUA_HUMAN 17.29763288
DIPHWLNPTR_416.9_373.2 PAPP1_HUMAN 17.00562521
LYYGDDEK_501.7_563.2 C08A_HUMAN 16.78897272
AALAAFNAQNNGSNFQLEEISR_789.1_633.3 FETUA_HUMAN 16.41986569
IQTHSTTYR_369.5_627.3 F13B_HUMAN 15.78335174
GPITSAAELNDPQSILLR_632.4_826.5 EGLN_HUMAN 15.3936876
QTLSWTVTPK_580.8_818.4 PZP_HUMAN 14.92509259
AVGYLITGYQR_620.8_737.4 PZP_HUMAN 13.9795325
DIIKPDPPK_511.8_342.2 IL12B_HUMAN 13.76508282
YNQLLR_403.7_288.2 ENOA_HUMAN 12.61733711
GNGLTWAEK_488.3_634.3 C163B_HUMAN 12.5891421
QVFAVQR_424.2_473.3 ELNE_HUMAN 12.57709327
FLQEQGHR_338.8_497.3 C08G_HUMAN 12.51843475
HWQLR_376.2_515.3 IL6RA_HUMAN 11.83747559
DVLLLVHNLPQNLTGHIWYK_791.8_883.0 PSG7_HUMAN 11.69074708
TFLTVYWTPER_706.9_502.3 ICAM1_HUMAN 11.63709776
VELAPLPSWQPVGK_760.9_400.3 ICAM1_HUMAN 10.79897269
TLFIFGVTK_513.3_215.1 PSG4_HUMAN 10.2831751
AYSDLSR_406.2_375.2 SAMP_HUMAN 10.00461148
HATLSLSIPR_365.6_472.3 VGFR3_HUMAN 9.967933028
LQGTLPVEAR_542.3_571.3 C05_HUMAN 9.963760572
NTVISVNPSTK_580.3_732.4 VCAM1_HUMAN 9.124228658
EVFSKPISWEELLQ_852.9_260.2 FA40A_HUMAN 8.527980294
SLQNASAIESILK_687.4_860.5 IL3_HUMAN 8.429061621
IQHPFTVEEFVLPK_562.0_861.5 PZP_HUMAN 7.996504258
GVTGYFTFNLYLK_508.3_683.9 PSG5_HUMAN 7.94396229
VFQYIDLHQDEFVQTLK_708.4_361.2 CNDP1_HUMAN 7.860590049
ILDDLSPR_464.8_587.3 ITIH4_HUMAN 7.593889262
LIENGYFHPVK_439.6_343.2 F13B_HUMAN 7.05838337
VFQFLEK_455.8_811.4 C05_HUMAN 6.976884759
AFTECCWASQLR_770.9_574.3 C05_HUMAN 6.847474286
WWGGQPLWITATK_772.4_929.5 ENPP2_HUMAN 6.744837357
IQTHSTTYR_369.5_540.3 F13B_HUMAN 6.71464509
IAQYYYTFK_598.8_395.2 F13B_HUMAN 6.540497911
YGFYTHVFR_397.2_421.3 THRB_HUMAN 6.326347548
YHFEALADTGISSEFYDNANDLLSK_940.8_874.5 C08A_HUMAN 6.261787525
ANDQYLTAAALHNLDEAVK_686.4_301.1 IL1A_HUMAN 6.217191651
FSLVSGWGQLLDR_493.3_403.2 FA7_HUMAN 6.1038295
GWVTDGFSSLK_598.8_854.4 APOC3_HUMAN 6.053494609
TLEAQLTPR_514.8_814.4 HEP2_HUMAN 5.855967278
VSAPSGTGHLPGLNPL_506.3_300.7 PSG3_HUMAN 5.625944609
EAQLPVIENK_570.8_699.4 PLMN_HUMAN 5.407703773
SPEAEDPLGVER_649.8_670.4 Z512B_HUMAN 5.341420139
IAIDLFK_410.3_635.4 HEP2_HUMAN 4.698739039
YEFLNGR_449.7_293.1 PLMN_HUMAN 4.658286706
VQTAHFK_277.5_502.3 C08A_HUMAN 4.628247194
IEVIITLK_464.8_815.5 CXL11_HUMAN 4.57198762
ILTPEVR_414.3_601.3 GDF15_HUMAN 4.452884608
LEEHYELR_363.5_288.2 PAI2_HUMAN 4.411983862
HATLSLSIPR_365.6_272.2 VGFR3_HUMAN 4.334242077
NSDQEIDFK_548.3_294.2 S10A5_HUMAN 4.25302369
LPNNVLQEK_527.8_844.5 AFAM_HUMAN 4.183602548
ELANTIK_394.7_475.3 S10AC_HUMAN 4.13558153
LSIPQITTK_500.8_687.4 PSG5_HUMAN 3.966238797
TLNAYDHR_330.5_312.2 PAR3_HUMAN 3.961140111
WWGGQPLWITATK_772.4_373.2 ENPP2_HUMAN 3.941476057
ELLESYIDGR_597.8_710.4 THRB_HUMAN 3.832723338
ATLSAAPSNPR_542.8_570.3 CXCL2_HUMAN 3.82834767
VVLSSGSGPGLDLPLVLGLPLQLK_791.5_598.4 SHBG_HUMAN 3.80737887
NADYSYSVWK_616.8_333.2 C05_HUMAN 3.56404167
ILILPSVTR_506.3_559.3 PSGx_HUMAN 3.526998593
ALEQDLPVNIK_620.4_798.5 CNDP1_HUMAN 3.410412424
QVCADPSEEWVQK_788.4_275.2 CCL3_HUMAN 3.30795151
SVQNDSQAIAEVLNQLK_619.7_914.5 DESP_HUMAN 3.259270741
QVFAVQR_424.2_620.4 ELNE_HUMAN 3.211482663
ALPGEQQPLHALTR_511.0_807.5 IBP1_HUMAN 3.211207158
LEPLYSASGPGLRPLVIK_637.4_260.2 CAA60698 3.203088951
GTYLYNDCPGPGQDTDCR_697.0_666.3 TNR1A_HUMAN 3.139418139
DAGLSWGSAR_510.2_576.3 NEUR4_HUMAN 3.005197927
YGFYTHVFR_397.2_659.4 THRB_HUMAN 2.985663918
NNQLVAGYLQGPNVNLEEK_700.7_357.2 IL1RA_HUMAN 2.866983196
EKPAGGIPVLGSLVNTVLK_631.4_930.6 BPIB1_HUMAN 2.798965142
FGSDDEGR_441.7_735.3 PTHR_HUMAN 2.743283546
IEVNESGTVASSSTAVIVSAR_693.0_545.3 PAI1_HUMAN 2.699725572
FATTFYQHLADSK_510.3_533.3 ANT3_HUMAN 2.615073729
DYWSTVK_449.7_347.2 APOC3_HUMAN 2.525459346
QLGLPGPPDVPDHAAYHPF_676.7_263.1 ITIH4_HUMAN 2.525383799
LSSPAVITDK_515.8_743.4 PLMN_HUMAN 2.522306831
TEFLSNYLTNVDDITLVPGTLGR_846.8_699.4 ENPP2_HUMAN 2.473366805
SILFLGK_389.2_201.1 THBG_HUMAN 2.472413913
VTFEYR_407.7_614.3 CRHBP_HUMAN 2.425338167
SVVLIPLGAVDDGEHSQNEK_703.0_798.4 CNDP1_HUMAN 2.421340244
HTLNQIDEVK_598.8_958.5 FETUA_HUMAN 2.419851187
ALNSIIDVYHK_424.9_661.3 S10A8_HUMAN 2.367904596
ETLALLSTHR_570.8_500.3 IL5_HUMAN 2.230076769
GLQYAAQEGLLALQSELLR_1037.1_858.5 LBP_HUMAN 2.205949216
TYNVDK_370.2_262.1 PPB1_HUMAN 2.11849772
FTITAGSK_412.7_576.3 FABPL_HUMAN 2.098589805
GIVEECCFR_585.3_900.3 IGF2_HUMAN 2.059942995
YGIEEHGK_311.5_599.3 CXA1_HUMAN 2.033828589
ALVLELAK_428.8_331.2 INHBE_HUMAN 1.993820617
ITLPDFTGDLR_624.3_920.5 LBP_HUMAN 1.968753183
HELTDEELQSLFTNFANWDK_817.1_906.5 AFAM_HUMAN 1.916438806
EANQSTLENFLER_775.9_678.4 IL4_HUMAN 1.902033355
DADPDTFFAK_563.8_825.4 AFAM_HUMAN 1.882254674
LFIPQITR_494.3_727.4 PSG9_HUMAN 1.860649392
DPNGLPPEAQK_583.3_497.2 RET4_HUMAN 1.847702127
VEPLYELVTATDFAYSSTVR_754.4_549.3 C08B_HUMAN 1.842159131
FQLSETNR_497.8_476.3 PSG2_HUMAN 1.834693717
FSLVSGWGQLLDR_493.3_516.3 FA7_HUMAN 1.790582748
NKPGVYTDVAYYLAWIR_677.0_545.3 FA12_HUMAN 1.777303353
FTGSQPFGQGVEHATANK_626.0_521.2 TSP1_HUMAN 1.736517431
DDLYVSDAFHK_655.3_704.3 ANT3_HUMAN 1.717534082
AFLEVNEEGSEAAASTAVVIAGR_764.4_685.4 ANT3_HUMAN 1.679420475
LPNNVLQEK_527.8_730.4 AFAM_HUMAN 1.66321148
IVLSLDVPIGLLQILLEQAR_735.1_503.3 UCN2_HUMAN 1.644983604
DPTFIPAPIQAK_433.2_556.3 ANGT_HUMAN 1.625411496
SDLEVAHYK_531.3_617.3 C08B_HUMAN 1.543640117
QLYGDTGVLGR_589.8_501.3 C08G_HUMAN 1.505242962
VNHVTLSQPK_374.9_459.3 B2MG_HUMAN 1.48233058
TLLPVSKPEIR_418.3_288.2 C05_HUMAN 1.439531341
SEYGAALAWEK_612.8_845.5 C06_HUMAN 1.424401638
YGIEEHGK_311.5_341.2 CXA1_HUMAN 1.379872204
DAGLSWGSAR_510.3_390.2 NEUR4_HUMAN 1.334272677
AEHPTWGDEQLFQTTR_639.3_569.3 PGH1_HUMAN 1.30549273
FQSVFTVTR_542.8_623.4 C1QC_HUMAN 1.302847429
VPGLYYFTYHASSR_554.3_420.2 C1QB_HUMAN 1.245565877
AYSDLSR_406.2_577.3 SAMP_HUMAN 1.220777002
ALEQDLPVNIK_620.4_570.4 CNDP1_HUMAN 1.216612522
NAVVQGLEQPHGLWHPLR_688.4_890.6 LRP1_HUMAN 1.212935735
TSDQIHFFFAK_447.6_659.4 ANT3_HUMAN 1.176238265
GTYLYNDCPGPGQDTDCR_697.0_335.2 TNR1A_HUMAN 1.1455649
TSYQVYSK_488.2_787.4 C163A_HUMAN 1.048896429
ALNSIIDVYHK_424.9_774.4 S10A8_HUMAN 1.028522516
VELAPLPSWQPVGK_760.9_342.2 ICAM1_HUMAN 0.995831393
LSETNR_360.2_330.2 PSG1_HUMAN 0.976094717
HFQNLGK_422.2_527.2 AFAM_HUMAN 0.956286531
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN 0.947931674
LPATEKPVLLSK_432.6_347.2 HYOU1_HUMAN 0.932537153
SPEAEDPLGVER_649.8_314.1 Z512B_HUMAN 0.905955419
DEIPHNDIALLK_459.9_510.8 HABP2_HUMAN 0.9032484
FFQYDTWK_567.8_712.3 IGF2_HUMAN 0.884340285
LIEIANHVDK_384.6_498.3 ADA12_HUMAN 0.881493383
AGFAGDDAPR_488.7_701.3 ACTB_HUMAN 0.814836556
YEFLNGR_449.7_606.3 PLMN_HUMAN 0.767373087
VIAVNEVGR_478.8_284.2 CHL1_HUMAN 0.721519592
SLSQQIENIR_594.3_531.3 C01A1_HUMAN 0.712051082
EWVAIESDSVQPVPR_856.4_486.2 CNDP1_HUMAN 0.647712421
YGLVTYATYPK_638.3_843.4 CFAB_HUMAN 0.618499569
SVVLIPLGAVDDGEHSQNEK_703.0_286.2 CNDP1_HUMAN 0.606626346
NSDQEIDFK_548.3_409.2 S10A5_HUMAN 0.601928175
NVNQSLLELHK_432.2_543.3 FRIH_HUMAN 0.572008792
IAQYYYTFK_598.8_884.4 F13B_HUMAN 0.495062844
GPITSAAELNDPQSILLR_632.4_601.4 EGLN_HUMAN 0.47565795
YTTEIIK_434.2_704.4 C1R_HUMAN 0.433318952
GYVIIKPLVWV_643.9_304.2 SAMP_HUMAN 0.427905264
LDFHFSSDR_375.2_464.2 INHBC_HUMAN 0.411898116
IPSNPSHR_303.2_496.3 FBLN3_HUMAN 0.390037291
APLTKPLK_289.9_357.2 CRP_HUMAN 0.38859469
EVFSKPISWEELLQ_852.9_376.2 FA40A_HUMAN 0.371359974
YENYTSSFFIR_713.8_756.4 IL12B_HUMAN 0.346336267
SPQAFYR_434.7_556.3 REL3_HUMAN 0.345901234
SVDEALR_395.2_488.3 PRDX2_HUMAN 0.307518869
FVFGTTPEDILR_697.9_742.4 TSP1_HUMAN 0.302313589
FTFTLHLETPKPSISSSNLNPR_829.4_787.4 PSG1_HUMAN 0.269826678
VGEYSLYIGR_578.8_708.4 SAMP_HUMAN 0.226573173
ILPSVPK_377.2_244.2 PGH1_HUMAN 0.225429414
LFIPQITR_494.3_614.4 PSG9_HUMAN 0.18285533
TGYYFDGISR_589.8_857.4 FBLN1_HUMAN 0.182474114
HYGGLTGLNK_530.3_759.4 PGAM1_HUMAN 0.152397007
NQSPVLEPVGR_598.3_866.5 KS6A3_HUMAN 0.128963949
IGKPAPDFK_324.9_294.2 PRDX2_HUMAN 0.113383235
TSESTGSLPSPFLR_739.9_716.4 PSMG1_HUMAN 0.108159874
ESDTSYVSLK_564.8_347.2 CRP_HUMAN 0.08569303
ETPEGAEAKPWYEPIYLGGVFQLEK_951.1_877.5 TNFA_HUMAN 0.039781728
TSDQIHFFFAK_447.6_512.3 ANT3_HUMAN 0.008064465
[00200] Table 21. Lasso32 Middle Window
Figure imgf000128_0001
SLDFTELDVAAEK_719.4_316.2 ANGT_HUMAN 2.11
TLAFVR_353.7_492.3 FA7_HUMAN 1.83
LHEAFSPVSYQHDLALLR_699.4_251.2 FA12_HUMAN 1.62
HYINLITR_515.3_301.1 NPY_HUMAN 1.39
VLEPTLK_400.3_458.3 VTDB_HUMAN 1.37
YGIEEHGK_311.5_599.3 CXA1_HUMAN 1.17
VELAPLPSWQPVGK_760.9_342.2 ICAM1_HUMAN 1.13
QVFAVQR_424.2_473.3 ELNE_HUMAN 0.79
ANLINNIFELAGLGK_793.9_299.2 LCAP_HUMAN 0.23
DVLLLVHNLPQNLTGHIWYK_791.8_310.2 PSG7_HUMAN -0.61
VEHSDLSFSK_383.5_234.1 B2MG_HUMAN -0.69
AVDIPGLEAATPYR_736.9_399.2 TENA_HUMAN -0.85
VPLALFALNR_557.3_620.4 PEPD_HUMAN -1.45
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN -1.9
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN -2.07
EVFSKPISWEELLQ_852.9_376.2 FA40A_HUMAN -2.32
[00203] Table 24. Lasso All Middle Window
Variable UniProt_ID Coefficient
SEYGAALAWEK_612.8_788.4 C06_HUMAN 2.48
VFQFLEK_455.8_811.4 C05_HUMAN 2.41
SLDFTELDVAAEK_719.4_316.2 ANGT_HUMAN 1.07
YGIEEHGK_311.5_599.3 CXA1_HUMAN 0.64
VLEPTLK_400.3_458.3 VTDB_HUMAN 0.58
LHEAFSPVSYQHDLALLR_699.4_251.2 FA12_HUMAN 0.21
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN -0.62
VNHVTLSQPK_374.9_244.2 B2MG_HUMAN -1.28
[00204] Table 25. Lasso32 Middle-Late Window
Figure imgf000129_0001
AVYEAVLR_460.8_587.4 PEPD_HUMAN -3.37
[00205] Table 26. Lasso 100 Middle-Late Window
Variable UniProt_ID Coefficient
VFQFLEK_455.8_811.4 C05_HUMAN 3.82
SEYGAALAWEK_612.8_845.5 C06_HUMAN 2.94
YGIEEHGK_311.5_599.3 CXA1_HUMAN 2.39
DPTFIPAPIQAK_433.2_556.3 ANGT_HUMAN 2.05
TLAFVR_353.7_492.3 FA7_HUMAN 1.9
NQSPVLEPVGR_598.3_866.5 KS6A3_HUMAN 1.87
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 1.4
TQILEWAAER_608.8_761.4 EGLN_HUMAN 1.29
VVGGLVALR_442.3_784.5 FA12_HUMAN 1.24
QINSYVK_426.2_496.3 CBG_HUMAN 1.14
YGIEEHGK_311.5_341.2 CXA1_HUMAN 0.84
ALEQDLPVNIK_620.4_570.4 CNDP1_HUMAN 0.74
GTYLYNDCPGPGQDTDCR_697.0_666.3 TNR1A_HUMAN 0.51
SLQNASAIESILK_687.4_860.5 IL3_HUMAN 0.44
DLHLSDVFLK_396.2_260.2 C06_HUMAN 0.38
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 0.37
NIQSVNVK_451.3_674.4 GROA_HUMAN 0.3
FFQYDTWK_567.8_712.3 IGF2_HUMAN 0.19
ANLINNIFELAGLGK_793.9_299.2 LCAP_HUMAN 0.19
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 0.15
AALAAFNAQNNGSNFQLEEISR_789.1_746.4 FETUA_HUMAN -0.09
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN -0.52
TSYQVYSK_488.2_787.4 C163A_HUMAN -0.62
AVDIPGLEAATPYR_736.9_399.2 TENA_HUMAN -1.29
TAHISGLPPSTDFIVYLSGLAPSIR_871.5_472.3 TENA_HUMAN -1.53
AEIEYLEK_497.8_552.3 LYAM1_HUMAN -1.73
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN -1.95
VPLALFALNR_557.3_620.4 PEPD_HUMAN -2.9
AVYEAVLR_460.8_587.4 PEPD_HUMAN -3.04
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN -3.49
EVFSKPISWEELLQ_852.9_376.2 FA40A_HUMAN -3.71
[00206] Table 27. Lasso Protein Middle-Late Window
Variable UniProt_ID Coefficient
VFQFLEK_455.8_811.4 C05_HUMAN 4.25
ALNHLPLEYNSALYSR_621.0_696.4 C06_HUMAN 3.06
YGIEEHGK_311.5_599.3 CXA1_HUMAN 2.36
SEPRPGVLLR_375.2_654.4 FA7_HUMAN 2.11
TQILEWAAER_608.8_761.4 EGLN_HUMAN 1.81
NQSPVLEPVGR_598.3_866.5 KS6A3_HUMAN 1.79
TEQAAVAR_423.2_615.4 FA12_HUMAN 1.72
QINSYVK_426.2_496.3 CBG_HUMAN 0.98
ALEQDLPVNIK_620.4_570.4 CNDP1_HUMAN 0.98
NCSFSIIYPVVIK_770.4_555.4 CRHBP_HUMAN 0.76
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 0.63
SLQNASAIESILK_687.4_860.5 IL3_HUMAN 0.59
ANLINNIFELAGLGK_793.9_299.2 LCAP_HUMAN 0.55
GTYLYNDCPGPGQDTDCR_697.0_666.3 TNR1A_HUMAN 0.55
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 0.46
NIQSVNVK_451.3_674.4 GROA_HUMAN 0.22
LTTVDIVTLR_565.8_815.5 IL2RB_HUMAN 0.11
FFQYDTWK_567.8_712.3 IGF2_HUMAN 0.01
TSYQVYSK_488.2_787.4 C163A_HUMAN -0.76
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN -1.31
AEIEYLEK_497.8_552.3 LYAM1_HUMAN -1.59
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN -1.73
AVDIPGLEAATPYR_736.9_399.2 TENA_HUMAN -2.02
EVFSKPISWEELLQ_852.9_376.2 FA40A_HUMAN -3
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN -3.15
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN -3.49
VNHVTLSQPK_374.9_244.2 B2MG_HUMAN -3.82
VPLALFALNR_557.3_620.4 PEPD_HUMAN -4.94
[00207] Table 28. Lasso All Middle-Late Window
Variable UniProt_ID Coefficient
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 2.38
TLAFVR_353.7_492.3 FA7_HUMAN 0.96
YGIEEHGK_311.5_599.3 CXA1_HUMAN 0.34
DPTFIPAPIQAK_433.2_461.2 ANGT_HUMAN 0.33
DFNQFSSGEK_386.8_333.2 FETA_HUMAN 0.13
QINSYVK_426.2_496.3 CBG_HUMAN 0.03
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 0
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN -0.02
AEIEYLEK_497.8_552.3 LYAM1_HUMAN -0.05
VNHVTLSQPK_374.9_244.2 B2MG_HUMAN -0.12
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN -0.17
EVFSKPISWEELLQ_852.9_376.2 FA40A_HUMAN -0.31
AVDIPGLEAATPYR_736.9_399.2 TENA_HUMAN -0.35
VPLALFALNR_557.3_620.4 PEPD_HUMAN -0.43
AVYEAVLR_460.8_587.4 PEPD_HUMAN -2.33
[00208] Table 29. Lasso 32 Late Window
Figure imgf000132_0001
Variable UniProt_ID Coefficient
AVDIPGLEAATPYR_736.9_286.1 TENA_HUMAN -3.9
AEIEYLEK_497.8_552.3 LYAM1_HUMAN -5.29
AVYEAVLR_460.8_587.4 PEPD_HUMAN -5.51
AALAAFNAQNNGSNFQLEEISR_789.1_746.4 FETUA_HUMAN -6.49
[00210] Table 31: Lasso Protein Late Window
Variable UniProt_ID Coefficient
SGVDLADSNQK_567.3_662.3 VGFR3_HUMAN 3.33
ILDGGNK_358.7_603.3 CXCL5_HUMAN 3.25
QINSYVK_426.2_496.3 CBG_HUMAN 2.41
YSHYNER_323.5_418.2 HABP2_HUMAN 1.82
ALEQDLPVNIK_620.4_798.5 CNDP1_HUMAN 1.32
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 1.27
GTYLYNDCPGPGQDTDCR_697.0_666.3 TNR1A_HUMAN 0.26
IEVNESGTVASSSTAVIVSAR_693.0_545.3 PAI1_HUMAN 0.18
LTTVDIVTLR_565.8_815.5 IL2RB_HUMAN 0.18
TSYQVYSK_488.2_787.4 C163A_HUMAN -0.11
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN -0.89
AYSDLSR_406.2_375.2 SAMP_HUMAN -1.47
SPEAEDPLGVER_649.8_314.1 Z512B_HUMAN -1.79
YYGYTGAFR_549.3_450.3 TRFL_HUMAN -2.22
YISPDQLADLYK_713.4_277.2 ENOA_HUMAN -2.41
AVDIPGLEAATPYR_736.9_286.1 TENA_HUMAN -2.94
AEIEYLEK_497.8_552.3 LYAM1_HUMAN -5.18
AALAAFNAQNNGSNFQLEEISR_789.1_746.4 FETUA_HUMAN -5.71
AVYEAVLR_460.8_587.4 PEPD_HUMAN -7.33
[00211] Table 32: Lasso All Late Window
Variable UniProt_ID Coefficient
QINSYVK_426.2_496.3 CBG_HUMAN 0.5
DEIPHNDIALLK_459.9_510.8 HABP2_HUMAN 0.15
ALEQDLPVNIK_620.4_570.4 CNDP1_HUMAN 0.11
ILDGGNK_358.7_603.3 CXCL5_HUMAN 0.08
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 0.06
YYGYTGAFR_549.3_450.3 TRFL_HUMAN -0.39
AALAAFNAQNNGSNFQLEEISR_789.1_746.4 FETUA_HUMAN -1.57
AEIEYLEK_497.8_552.3 LYAM1_HUMAN -2.46
AVYEAVLR_460.8_587.4 PEPD_HUMAN -2.92
[00212] Table 33: Random Forest 32 Early Window
Variable Protein MeanDecreaseGini
ELIEELVNITQNQK_557.6_517.3 IL13_HUMAN 3.224369171
AHYDLR_387.7_288.2 FETUA_HUMAN 1.869007658
FSVVYAK_407.2_381.2 FETUA_HUMAN 1.770198171
ITLPDFTGDLR_624.3_288.2 LBP_HUMAN 1.710936472
ITGFLKPGK_320.9_301.2 LBP_HUMAN 1.623922439
ITGFLKPGK_320.9_429.3 LBP_HUMAN 1.408035272
ELIEELVNITQNQK_557.6_618.3 IL13_HUMAN 1.345412168
VFQFLEK_455.8_811.4 C05_HUMAN 1.311332013
VQTAHFK_277.5_431.2 C08A_HUMAN 1.308902373
FLNWIK_410.7_560.3 HABP2_HUMAN 1.308093745
DAGLSWGSAR_510.3_390.2 NEUR4_HUMAN 1.297033607
TLLPVSKPEIR_418.3_288.2 C05_HUMAN 1.291280928
LIQDAVTGLTVNGQITGDK_972.0_798.4 ITIH3_HUMAN 1.28622301
QALEEFQK_496.8_680.3 C08B_HUMAN 1.191731825
FSVVYAK_407.2_579.4 FETUA_HUMAN 1.078909138
ITLPDFTGDLR_624.3_920.5 LBP_HUMAN 1.072613747
AHYDLR_387.7_566.3 FETUA_HUMAN 1.029562263
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 1.00992071
DVLLLVHNLPQNLPGYFWYK_810.4_967.5 PSG9_HUMAN 1.007095529
SFRPFVPR_335.9_635.3 LBP_HUMAN 0.970312536
SDLEVAHYK_531.3_617.3 C08B_HUMAN 0.967904893
VQEAHLTEDQIFYFPK_655.7_701.4 C08G_HUMAN 0.960398254
VFQFLEK_455.8_276.2 C05_HUMAN 0.931652095
SLLQPNK_400.2_599.4 C08A_HUMAN 0.926470249
SFRPFVPR_335.9_272.2 LBP_HUMAN 0.911599611
FLNWIK_410.7_561.3 HABP2_HUMAN 0.852022868
LSSPAVITDK_515.8_743.4 PLMN_HUMAN 0.825455824
DVLLLVHNLPQNLPGYFWYK_810.4_594.3 PSG9_HUMAN 0.756797142
ALVLELAK_428.8_672.4 INHBE_HUMAN 0.748802555
DISEVVTPR_508.3_787.4 CFAB_HUMAN 0.733731518
[00213] Table 34. Random Forest 100 Early Window
Variable Protein MeanDecreaseGini
ELIEELVNITQNQK_557.6_517.3 IL13_HUMAN 1.709778508
LPNNVLQEK_527.8_844.5 AFAM_HUMAN 0.961692716
AHYDLR_387.7_288.2 FETUA_HUMAN 0.901586746
ITLPDFTGDLR_624.3_288.2 LBP_HUMAN 0.879119498
IEGNLIFDPNNYLPK_874.0_414.2 APOB_HUMAN 0.842483095
ITGFLKPGK_320.9_301.2 LBP_HUMAN 0.806905233
FSVVYAK_407.2_381.2 FETUA_HUMAN 0.790429706
ITGFLKPGK_320.9_429.3 LBP_HUMAN 0.710312386
VFQFLEK_455.8_811.4 C05_HUMAN 0.709531553
LIQDAVTGLTVNGQITGDK_972.0_798.4 ITIH3_HUMAN 0.624325189
DADPDTFFAK_563.8_825.4 AFAM_HUMAN 0.618684313
FLNWIK_410.7_560.3 HABP2_HUMAN 0.617501242
TASDFITK_441.7_781.4 GELS_HUMAN 0.609275999
DAGLSWGSAR_510.3_390.2 NEUR4_HUMAN 0.588718595
VQTAHFK_277.5_431.2 C08A_HUMAN 0.58669845
TLLPVSKPEIR_418.3_288.2 C05_HUMAN 0.5670608
ELIEELVNITQNQK_557.6_618.3 IL13_HUMAN 0.555624783
TYLHTYESEI_628.3_908.4 ENPP2_HUMAN 0.537678415
HFQNLGK_422.2_527.2 AFAM_HUMAN 0.535543137
TASDFITK_441.7_710.4 GELS_HUMAN 0.532743323
ITLPDFTGDLR_624.3_920.5 LBP_HUMAN 0.51667902
QALEEFQK_496.8_680.3 C08B_HUMAN 0.511314017
AVLHIGEK_289.5_348.7 THBG_HUMAN 0.510284122
FSVVYAK_407.2_579.4 FETUA_HUMAN 0.503907813
LPNNVLQEK_527.8_730.4 AFAM_HUMAN 0.501281631
AHYDLR_387.7_566.3 FETUA_HUMAN 0.474166711
IAPQLSTEELVSLGEK_857.5_333.2 AFAM_HUMAN 0.459595701
WWGGQPLWITATK_772.4_929.5 ENPP2_HUMAN 0.44680777
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 0.434157773
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 0.432484862
[00214] Table 35. Random Forest Protein Early Window
Variable Protein MeanDecreaseGini
ELIEELVNITQNQK_557.6_517.3 IL13_HUMAN 2.881452809
LPNNVLQEK_527.8_844.5 AFAM_HUMAN 1.833987752
ITLPDFTGDLR_624.3_288.2 LBP_HUMAN 1.608843881
IEGNLIFDPNNYLPK_874.0_414.2 APOB_HUMAN 1.594658208
VFQFLEK_455.8_811.4 C05_HUMAN 1.290134412
LIQDAVTGLTVNGQITGDK_972.0_798.4 ITIH3_HUMAN 1.167981736
TASDFITK_441.7_781.4 GELS_HUMAN 1.152847453
DAGLSWGSAR_510.3_390.2 NEUR4_HUMAN 1.146752656
FSVVYAK_407.2_579.4 FETUA_HUMAN 1.060168583
AVLHIGEK_289.5_348.7 THBG_HUMAN 1.033625773
FLNWIK_410.7_560.3 HABP2_HUMAN 1.022356789
QALEEFQK_496.8_680.3 C08B_HUMAN 0.990074129
DVLLLVHNLPQNLPGYFWYK_810.4_967.5 PSG9_HUMAN 0.929633865
WWGGQPLWITATK_772.4_929.5 ENPP2_HUMAN 0.905895642
VQEAHLTEDQIFYFPK_655.7_701.4 C08G_HUMAN 0.883887371
NNQLVAGYLQGPNVNLEEK_700.7_999.5 IL1RA_HUMAN 0.806472085
SLLQPNK_400.2_599.4 C08A_HUMAN 0.783623222
DALSSVQESQVAQQAR_573.0_672.4 APOC3_HUMAN 0.774365756
NIQSVNVK_451.3_674.4 GROA_HUMAN 0.767963386
HPWIVHWDQLPQYQLNR_744.0_1047.0 KS6A3_HUMAN 0.759960139
TTSDGGYSFK_531.7_860.4 INHA_HUMAN 0.732813448
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 0.718779092
LSSPAVITDK_515.8_743.4 PLMN_HUMAN 0.699547739
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN 0.693159192
TLNAYDHR_330.5_312.2 PAR3_HUMAN 0.647300964
DISEVVTPR_508.3_787.4 CFAB_HUMAN 0.609165621
LIENGYFHPVK_439.6_627.4 F13B_HUMAN 0.60043345
SGVDLADSNQK_567.3_662.3 VGFR3_HUMAN 0.596079858
ALQDQLVLVAAK_634.9_289.2 ANGT_HUMAN 0.579034994
ALVLELAK_428.8_672.4 INHBE_HUMAN 0.573458483
[00215] Table 36. Random Forest All Early Window
Variable Protein MeanDecreaseGini
ELIEELVNITQNQK_557.6_517.3 IL13_HUMAN 0.730972421
ITLPDFTGDLR_624.3_288.2 LBP_HUMAN 0.409808774
AHYDLR_387.7_288.2 FETUA_HUMAN 0.409298983
FSVVYAK_407.2_381.2 FETUA_HUMAN 0.367730833
ITGFLKPGK_320.9_301.2 LBP_HUMAN 0.350485117
VFQFLEK_455.8_811.4 C05_HUMAN 0.339289475
ELIEELVNITQNQK_557.6_618.3 IL13_HUMAN 0.334303166
LPNNVLQEK_527.8_844.5 AFAM_HUMAN 0.329800706
IEGNLIFDPNNYLPK_874.0_414.2 APOB_HUMAN 0.325596677
ITGFLKPGK_320.9_429.3 LBP_HUMAN 0.31473104
FLNWIK_410.7_560.3 HABP2_HUMAN 0.299810081
LIQDAVTGLTVNGQITGDK_972.0_798.4 ITIH3_HUMAN 0.295613448
ITLPDFTGDLR_624.3_920.5 LBP_HUMAN 0.292212699
DAGLSWGSAR_510.3_390.2 NEUR4_HUMAN 0.285812225
TLLPVSKPEIR_418.3_288.2 C05_HUMAN 0.280857718
FSVVYAK_407.2_579.4 FETUA_HUMAN 0.278531322
DADPDTFFAK_563.8_825.4 AFAM_HUMAN 0.258938798
AHYDLR_387.7_566.3 FETUA_HUMAN 0.256160046
QALEEFQK_496.8_680.3 C08B_HUMAN 0.245543641
HTLNQIDEVK_598.8_951.5 FETUA_HUMAN 0.239528081
TASDFITK_441.7_781.4 GELS_HUMAN 0.227485958
VFQFLEK_455.8_276.2 C05_HUMAN 0.226172392
DVLLLVHNLPQNLPGYFWYK_810.4_967.5 PSG9_HUMAN 0.218613384
VQTAHFK_277.5_431.2 C08A_HUMAN 0.217171548
SFRPFVPR_335.9_635.3 LBP_HUMAN 0.214798112
HFQNLGK_422.2_527.2 AFAM_HUMAN 0.211756476
SVSLPSLDPASAK_636.4_473.3 APOB_HUMAN 0.211319422
FGFGGSTDSGPIR_649.3_745.4 ADA12_HUMAN 0.206574494
HFQNLGK_422.2_285.1 AFAM_HUMAN 0.204024196
AVLHIGEK_289.5_348.7 THBG_HUMAN 0.201102917
[00216] Table 37. Random Forest SummedGini Early Window
Figure imgf000137_0001
Transition Protein SumBestGini
NADYSYSVWK_616.8_333.2 C05_HUMAN 76.07974354
AHYDLR_387.7_566.3 FETUA_HUMAN 74.68253347
GAVHVVVAETDYQSFAVLYLER_822.8_580.3 C08G_HUMAN 73.75860248
LIENGYFHPVK_439.6_627.4 F13B_HUMAN 73.74965194
ALDLSLK_380.2_185.1 ITIH3_HUMAN 72.760739
WWGGQPLWITATK_772.4_373.2 ENPP2_HUMAN 72.51936706
FGFGGSTDSGPIR_649.3_946.5 ADA12_HUMAN 72.49183198
GLQYAAQEGLLALQSELLR_1037.1_929.5 LBP_HUMAN 67.17588648
HFQNLGK_422.2_527.2 AFAM_HUMAN 66.11702719
YSHYNER_323.5_581.3 HABP2_HUMAN 65.56238612
ISQGEADINIAFYQR_575.6_684.4 MMP8_HUMAN 65.50301246
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN 64.85259525
NIQSVNVK_451.3_674.4 GROA_HUMAN 64.53010225
DALSSVQESQVAQQAR_573.0_672.4 APOC3_HUMAN 64.12149927
SLLQPNK_400.2_599.4 C08A_HUMAN 62.68167847
SFRPFVPR_335.9_635.3 LBP_HUMAN 61.90157662
NNQLVAGYLQGPNVNLEEK_700.7_999.5 IL1RA_HUMAN 61.54435815
LYYGDDEK_501.7_563.2 C08A_HUMAN 60.16700473
SWNEPLYHLVTEVR_581.6_716.4 PRL_HUMAN 59.78209065
SGVDLADSNQK_567.3_662.3 VGFR3_HUMAN 58.93982896
GTYLYNDCPGPGQDTDCR_697.0_335.2 TNR1A_HUMAN 58.72963941
HATLSLSIPR_365.6_472.3 VGFR3_HUMAN 57.98669834
FIVGFTR_420.2_261.2 CCL20_HUMAN 57.23165578
QNYHQDSEAAINR_515.9_544.3 FRIH_HUMAN 57.21116697
DVLLLVHNLPQNLPGYFWYK_810.4_594.3 PSG9_HUMAN 56.84150484
FLNWIK_410.7_561.3 HABP2_HUMAN 56.37258274
SLQAFVAVAAR_566.8_487.3 IL23A_HUMAN 56.09012981
HFQNLGK_422.2_285.1 AFAM_HUMAN 56.04480022
GPGEDFR_389.2_322.2 PTGDS_HUMAN 55.7583763
NKPGVYTDVAYYLAWIR_677.0_821.5 FA12_HUMAN 55.53857645
LIQDAVTGLTVNGQITGDK_972.0_640.4 ITIH3_HUMAN 55.52577583
YYGYTGAFR_549.3_450.3 TRFL_HUMAN 54.27147366
TLNAYDHR_330.5_312.2 PAR3_HUMAN 54.19190934
IQTHSTTYR_369.5_627.3 F13B_HUMAN 54.18950583
TASDFITK_441.7_710.4 GELS_HUMAN 54.1056456
ALNHLPLEYNSALYSR_621.0_696.4 C06_HUMAN 53.8997252
DADPDTFFAK_563.8_302.1 AFAM_HUMAN 53.85914848
SVSLPSLDPASAK_636.4_473.3 APOB_HUMAN 53.41996191
TTSDGGYSFK_531.7_860.4 INHA_HUMAN 52.24655536
AFTECCWASQLR_770.9_574.3 C05_HUMAN 51.67853429
ELPQSIVYK_538.8_409.2 FBLN3_HUMAN 51.35853002
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 51.23842124
FQLSETNR_497.8_605.3 PSG2_HUMAN 51.01576848
GSLVQASEANLQAAQDFVR_668.7_806.4 ITIH1_HUMAN 50.81923338
FSLVSGWGQLLDR_493.3_403.2 FA7_HUMAN 50.54425114
ECEELEEK_533.2_405.2 IL15_HUMAN 50.41977421
NADYSYSVWK_616.8_769.4 C05_HUMAN 50.36434595
SLLQPNK_400.2_358.2 C08A_HUMAN 49.75593162
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 49.43389721
DISEVVTPR_508.3_787.4 CFAB_HUMAN 49.00234897
AEVIWTSSDHQVLSGK_586.3_300.2 PD1L1_HUMAN 48.79028835
SGVDLADSNQK_567.3_591.3 VGFR3_HUMAN 48.70665587
SILFLGK_389.2_201.1 THBG_HUMAN 48.5997957
AVLHIGEK_289.5_292.2 THBG_HUMAN 48.4605866
QLYGDTGVLGR_589.8_501.3 C08G_HUMAN 48.11414904
FSLVSGWGQLLDR_493.3_516.3 FA7_HUMAN 47.59635333
DSPVLIDFFEDTER_841.9_399.2 HRG_HUMAN 46.83840473
INPASLDK_429.2_630.4 C163A_HUMAN 46.78947931
GAVHVVVAETDYQSFAVLYLER_822.8_863.5 C08G_HUMAN 46.66185339
FLQEQGHR_338.8_497.3 C08G_HUMAN 46.64415952
LNIGYIEDLK_589.3_837.4 PAI2_HUMAN 46.5879123
LSSPAVITDK_515.8_743.4 PLMN_HUMAN 46.2857838
GLQYAAQEGLLALQSELLR_1037.1_858.5 LBP_HUMAN 45.7427767
SDGAKPGPR_442.7_213.6 COLI_HUMAN 45.27828366
GYQELLEK_490.3_502.3 FETA_HUMAN 43.52928868
GGEGTGYFVDFSVR_745.9_869.5 HRG_HUMAN 43.24514327
ADLFYDVEALDLESPK_913.0_447.2 HRG_HUMAN 42.56268679
ADLFYDVEALDLESPK_913.0_331.2 HRG_HUMAN 42.48967422
EAQLPVIENK_570.8_699.4 PLMN_HUMAN 42.21213429
SILFLGK_389.2_577.4 THBG_HUMAN 42.03379581
HTLNQIDEVK_598.8_958.5 FETUA_HUMAN 41.98377176
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN 41.89547273
FLPCENK_454.2_390.2 IL10_HUMAN 41.66612478
LIEIANHVDK_384.6_498.3 ADA12_HUMAN 41.50878046
DEIPHNDIALLK_459.9_510.8 HABP2_HUMAN 41.27830935
SLQAFVAVAAR_566.8_804.5 IL23A_HUMAN 41.00430596
YISPDQLADLYK_713.4_277.2 ENOA_HUMAN 40.90053801
SLPVSDSVLSGFEQR_810.9_836.4 C08G_HUMAN 40.62020941
DGSPDVTTADIGANTPDATK_973.5_531.3 PGRP2_HUMAN 40.33913091
NTGVISVVTTGLDR_716.4_662.4 CADH1_HUMAN 40.05291612
ALVLELAK_428.8_672.4 INHBE_HUMAN 40.01646465
YEFLNGR_449.7_293.1 PLMN_HUMAN 39.83344278
WGAAPYR_410.7_577.3 PGRP2_HUMAN 39.52766213
TFLTVYWTPER_706.9_401.2 ICAM1_HUMAN 39.13662034
SEYGAALAWEK_612.8_845.5 C06_HUMAN 38.77511119
VGVISFAQK_474.8_693.4 TFR2_HUMAN 38.5823457
IIEVEEEQEDPYLNDR_996.0_777.4 FBLN1_HUMAN 38.30913304
TGYYFDGISR_589.8_694.4 FBLN1_HUMAN 38.30617106
LQGTLPVEAR_542.3_571.3 C05_HUMAN 37.93064544
DSPVLIDFFEDTER_841.9_512.3 HRG_HUMAN 37.4447737
AALAAFNAQNNGSNFQLEEISR_789.1_746.4 FETUA_HUMAN 37.02483715
DGSPDVTTADIGANTPDATK_973.5_844.4 PGRP2_HUMAN 36.59864788
ILILPSVTR_506.3_785.5 PSGx_HUMAN 36.43814815
SVSLPSLDPASAK_636.4_885.5 APOB_HUMAN 36.27689491
TLAFVR_353.7_492.3 FA7_HUMAN 36.18771771
VAPGVANPGTPLA_582.3_555.3 A6NIT4_HUMAN 35.70677357
HELTDEELQSLFTNFANVVDK_817.1_906.5 AFAM_HUMAN 35.14441609
AGLLRPDYALLGHR_518.0_369.2 PGRP2_HUMAN 35.13047098
GDTYPAELYITGSILR_885.0_1332.8 F13B_HUMAN 34.97832404
LFIPQITR_494.3_727.4 PSG9_HUMAN 34.76811249
GYQELLEK_490.3_631.4 FETA_HUMAN 34.76117605
VSEADSSNADWVTK_754.9_533.3 CFAB_HUMAN 34.49787512
LNIGYIEDLK_589.3_950.5 PAI2_HUMAN 34.48448691
SFRPFVPR_335.9_272.2 LBP_HUMAN 34.27529415
ILDGGNK_358.7_490.2 CXCL5_HUMAN 34.2331388
EANQSTLENFLER_775.9_678.4 IL4_HUMAN 34.14295797
DFNQFSSGEK_386.8_189.1 FETA_HUMAN 34.05459951
IEEIAAK_387.2_660.4 C05_HUMAN 33.93778148
TEFLSNYLTNVDDITLVPGTLGR_846.8_600.3 ENPP2_HUMAN 33.87864446
LPATEKPVLLSK_432.6_347.2 HYOU1_HUMAN 33.69005522
FLQEQGHR_338.8_369.2 C08G_HUMAN 33.61179024
APLTKPLK_289.9_357.2 CRP_HUMAN 33.59900279
YSHYNER_323.5_418.2 HABP2_HUMAN 33.50888447
TSYQVYSK_488.2_787.4 C163A_HUMAN 33.11650018
IALGGLLFPASNLR_481.3_657.4 SHBG_HUMAN 33.02974341
TGISPLALIK_506.8_741.5 APOB_HUMAN 32.64471573
LYYGDDEK_501.7_726.3 C08A_HUMAN 32.60782458
IVLSLDVPIGLLQILLEQAR_735.1_503.3 UCN2_HUMAN 32.37907686
EAQLPVIENK_570.8_329.2 PLMN_HUMAN 32.34049256
TGYYFDGISR_589.8_857.4 FBLN1_HUMAN 32.14526507
VGVISFAQK_474.8_580.3 TFR2_HUMAN 32.11753213
FQSVFTVTR_542.8_623.4 C1QC_HUMAN 32.11360444
TSDQIHFFFAK_447.6_659.4 ANT3_HUMAN 31.95867038
IAPQLSTEELVSLGEK_857.5_333.2 AFAM_HUMAN 31.81531364
EVFSKPISWEELLQ_852.9_260.2 FA40A_HUMAN 31.36698726
DEIPHNDIALLK_459.9_260.2 HABP2_HUMAN 31.1839869
NYFTSVAHPNLFIATK_608.3_319.2 IL1A_HUMAN 31.09867061
ITENDIQIALDDAK_779.9_632.3 APOB_HUMAN 30.77026845
DTYVSSFPR_357.8_272.2 TCEA1_HUMAN 30.67784731
TDAPDLPEENQAR_728.3_843.4 C05_HUMAN 30.66251941
LFYADHPFIFLVR_546.6_647.4 SERPH_HUMAN 30.65831566
TEQAAVAR_423.2_487.3 FA12_HUMAN 30.44356842
AVGYLITGYQR_620.8_737.4 PZP_HUMAN 30.36425528
HSHESQDLR_370.2_288.2 HRG_HUMAN 30.34684703
IALGGLLFPASNLR_481.3_412.3 SHBG_HUMAN 30.34101643
IAQYYYTFK_598.8_884.4 F13B_HUMAN 30.23453833
SLPVSDSVLSGFEQR_810.9_723.3 C08G_HUMAN 30.11396489
IIGGSDADIK_494.8_762.4 C1S_HUMAN 30.06572687
QTLSWTVTPK_580.8_545.3 PZP_HUMAN 30.04139865
HYFIAAVER_553.3_658.4 FA8_HUMAN 29.80239884
QVCADPSEEWVQK_788.4_374.2 CCL3_HUMAN 29.61435573
DLHLSDVFLK_396.2_366.2 C06_HUMAN 29.60077507
NIQSVNVK_451.3_546.3 GROA_HUMAN 29.47619619
QTLSWTVTPK_580.8_818.4 PZP_HUMAN 29.40047934
HSHESQDLR_370.2_403.2 HRG_HUMAN 29.32242262
LLEVPEGR_456.8_356.2 C1S_HUMAN 29.14169137
LIENGYFHPVK_439.6_343.2 F13B_HUMAN 28.63056809
EDTPNSVWEPAK_686.8_630.3 C1S_HUMAN 28.61352686
AFTECCWASQLR_770.9_673.4 C05_HUMAN 28.57830281
VNHVTLSQPK_374.9_459.3 B2MG_HUMAN 28.27203693
VSFSSPLVAISGVALR_802.0_715.4 PAPP1_HUMAN 28.13008712
DPDQTDGLGLSYLSSHIANVER_796.4_456.2 GELS_HUMAN 28.06549895
VVGGLVALR_442.3_784.5 FA12_HUMAN 28.00684006
NEIVFPAGILQAPFYTR_968.5_357.2 ECE1_HUMAN 27.97758456
QVCADPSEEWVQK_788.4_275.2 CCL3_HUMAN 27.94276837
LQDAGVYR_461.2_680.3 PD1L1_HUMAN 27.88063261
IQTHSTTYR_369.5_540.3 F13B_HUMAN 27.68873826
TPSAAYLWVGTGASEAEK_919.5_849.4 GELS_HUMAN 27.66889639
ALALPPLGLAPLLNLWAKPQGR_770.5_256.2 SHBG_HUMAN 27.63105727
ALQDQLVLVAAK_634.9_289.2 ANGT_HUMAN 27.63097319
IEEIAAK_387.2_531.3 C05_HUMAN 27.52427934
TAVTANLDIR_537.3_288.2 CHL1_HUMAN 27.44246841
VSEADSSNADWVTK_754.9_347.2 CFAB_HUMAN 27.43976782
ITENDIQIALDDAK_779.9_873.5 APOB_HUMAN 27.39263522
SSNNPHSPIVEEFQVPYNK_729.4_521.3 C1S_HUMAN 27.34493617
HPWIVHWDQLPQYQLNR_744.0_918.5 KS6A3_HUMAN 27.19681613
TPSAAYLWVGTGASEAEK_919.5_428.2 GELS_HUMAN 27.17319953
AFLEVNEEGSEAAASTAVVIAGR_764.4_614.4 ANT3_HUMAN 27.10487351
WGAAPYR_410.7_634.3 PGRP2_HUMAN 27.09930054
IEVNESGTVASSSTAVIVSAR_693.0_545.3 PAI1_HUMAN 27.02567296
AEAQAQYSAAVAK_654.3_908.5 ITIH4_HUMAN 26.98305259
VPLALFALNR_557.3_917.6 PEPD_HUMAN 26.96988826
TLEAQLTPR_514.8_685.4 HEP2_HUMAN 26.94672621
QALEEFQK_496.8_551.3 C08B_HUMAN 26.67037155
WNFAYWAAHQPWSR_607.3_545.3 PRG2_HUMAN 26.62600679
IYLQPGR_423.7_570.3 ITIH2_HUMAN 26.58752589
FFQYDTWK_567.8_840.4 IGF2_HUMAN 26.39942037
NEIWYR_440.7_357.2 FA12_HUMAN 26.35177282
GGEGTGYFVDFSVR_745.9_722.4 HRG_HUMAN 26.31688167
VGEYSLYIGR_578.8_708.4 SAMP_HUMAN 26.17367498
TAHISGLPPSTDFIVYLSGLAPSIR_871.5_800.5 TENA_HUMAN 26.13688183
GVTGYFTFNLYLK_508.3_260.2 PSG5_HUMAN 26.06007032
DYWSTVK_449.7_620.3 APOC3_HUMAN 26.03765187
YENYTSSFFIR_713.8_756.4 IL12B_HUMAN 25.9096605
YGLVTYATYPK_638.3_334.2 CFAB_HUMAN 25.84440452
LFIPQITR_494.3_614.4 PSG9_HUMAN 25.78081129
YEFLNGR_449.7_606.3 PLMN_HUMAN 25.17159874
SEPRPGVLLR_375.2_454.3 FA7_HUMAN 25.16444381
NSDQEIDFK_548.3_294.2 S10A5_HUMAN 25.12266401
YEVQGEVFTKPQLWP_911.0_293.1 CRP_HUMAN 24.77595195
GVTGYFTFNLYLK_508.3_683.9 PSG5_HUMAN 24.75289081
ISLLLIESWLEPVR_834.5_371.2 CSH_HUMAN 24.72379326
ALLLGWVPTR_563.3_373.2 PAR4_HUMAN 24.68096599
VNHVTLSQPK_374.9_244.2 B2MG_HUMAN 24.53420489
SGAQATWTELPWPHEK_613.3_793.4 HEMO_HUMAN 24.25610995
AQPVQVAEGSEPDGFWEALGGK_758.0_623.4 GELS_HUMAN 24.18769142
DLPHITVDR_533.3_490.3 MMP7_HUMAN 24.02606052
SEYGAALAWEK_612.8_788.4 C06_HUMAN 24.00163743
AVGYLITGYQR_620.8_523.3 PZP_HUMAN 23.93958524
GFQALGDAADIR_617.3_717.4 TIMP1_HUMAN 23.69249513
YEVQGEVFTKPQLWP_911.0_392.2 CRP_HUMAN 23.67764212
SDGAKPGPR_442.7_459.2 COLI_HUMAN 23.63551614
GFQALGDAADIR_617.3_288.2 TIMP1_HUMAN 23.55832742
IAPQLSTEELVSLGEK_857.5_533.3 AFAM_HUMAN 23.38139357
DTDTGALLFIGK_625.8_217.1 PEDF_HUMAN 23.33375418
LHEAFSPVSYQHDLALLR_699.4_380.2 FA12_HUMAN 23.27455931
IYLQPGR_423.7_329.2 ITIH2_HUMAN 23.19122626
[00217] Table 38. Random Forest 32 Middle Window
Variable UniProt_ID MeanDecreaseGini
SEYGAALAWEK_612.8_788.4 C06_HUMAN 2.27812193
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN 2.080133179
ALNHLPLEYNSALYSR_621.0_696.4 C06_HUMAN 1.952233942
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN 1.518833357
VEHSDLSFSK_383.5_234.1 B2MG_HUMAN 1.482593086
VFQFLEK_455.8_811.4 C05_HUMAN 1.448810425
VNHVTLSQPK_374.9_244.2 B2MG_HUMAN 1.389922815
YGIEEHGK_311.5_599.3 CXA1_HUMAN 1.386794676
TLAFVR_353.7_492.3 FA7_HUMAN 1.371530925
VLEPTLK_400.3_587.3 VTDB_HUMAN 1.368583173
VLEPTLK_400.3_458.3 VTDB_HUMAN 1.336029064
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 1.307024357
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN 1.282930911
LHEAFSPVSYQHDLALLR_699.4_251.2 FA12_HUMAN 1.25362163
SEPRPGVLLR_375.2_654.4 FA7_HUMAN 1.205539225
VEHSDLSFSK_383.5_468.2 B2MG_HUMAN 1.201047302
SLDFTELDVAAEK_719.4_316.2 ANGT_HUMAN 1.189617326
SEYGAALAWEK_612.8_845.5 C06_HUMAN 1.120706696
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 1.107036657
VNHVTLSQPK_374.9_459.3 B2MG_HUMAN 1.083264902
IEEIAAK_387.2_660.4 C05_HUMAN 1.043635292
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 0.962643698
TLLPVSKPEIR_418.3_514.3 C05_HUMAN 0.933440467
TEQAAVAR_423.2_615.4 FA12_HUMAN 0.878933553
DLHLSDVFLK_396.2_260.2 C06_HUMAN 0.816855601
ALQDQLVLVAAK_634.9_289.2 ANGT_HUMAN 0.812620232
SLQAFVAVAAR_566.8_804.5 IL23A_HUMAN 0.792274782
QGHNSVFLIK_381.6_260.2 HEMO_HUMAN 0.770830031
ALQDQLVLVAAK_634.9_956.6 ANGT_HUMAN 0.767468246
SLDFTELDVAAEK_719.4_874.5 ANGT_HUMAN 0.745827911
[00218] Table 39. Random Forest 100 Middle Window
Variable UniProt_ID MeanDecreaseGini
SEYGAALAWEK_612.8_788.4 C06_HUMAN 1.241568411
ALNHLPLEYNSALYSR_621.0_696.4 C06_HUMAN 0.903126414
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN 0.846216563
ANLINNIFELAGLGK_793.9_299.2 LCAP_HUMAN 0.748261193
VFQFLEK_455.8_811.4 C05_HUMAN 0.717545171
VEHSDLSFSK_383.5_234.1 B2MG_HUMAN 0.683219617
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN 0.671091545
LNIGYIEDLK_589.3_950.5 PAI2_HUMAN 0.652293621
VLEPTLK_400.3_587.3 VTDB_HUMAN 0.627095631
VNHVTLSQPK_374.9_244.2 B2MG_HUMAN 0.625773888
VLEPTLK_400.3_458.3 VTDB_HUMAN 0.613655529
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN 0.576305627
TLFIFGVTK_513.3_811.5 PSG4_HUMAN 0.574056825
YGIEEHGK_311.5_599.3 CXA1_HUMAN 0.570270447
VPLALFALNR_557.3_620.4 PEPD_HUMAN 0.556087614
EVFSKPISWEELLQ_852.9_376.2 FA40A_HUMAN 0.531461012
VEHSDLSFSK_383.5_468.2 B2MG_HUMAN 0.531214597
TLAFVR_353.7_492.3 FA7_HUMAN 0.53070743
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 0.521633041
SEYGAALAWEK_612.8_845.5 C06_HUMAN 0.514509661
SLDFTELDVAAEK_719.4_316.2 ANGT_HUMAN 0.50489698
SEPRPGVLLR_375.2_654.4 FA7_HUMAN 0.4824926
LHEAFSPVSYQHDLALLR_699.4_251.2 FA12_HUMAN 0.48217238
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 0.472286273
AVDIPGLEAATPYR_736.9_399.2 TENA_HUMAN 0.470892051
FSLVSGWGQLLDR_493.3_403.2 FA7_HUMAN 0.465839813
GEVTYTTSQVSK_650.3_750.4 EGLN_HUMAN 0.458736205
VNHVTLSQPK_374.9_459.3 B2MG_HUMAN 0.454348892
HFQNLGK_422.2_527.2 AFAM_HUMAN 0.45127405
YGIEEHGK_311.5_341.2 CXA1_HUMAN 0.430641646
[00219] Table 40. Random Forest Protein Middle Window
Figure imgf000144_0001
Variable UniProt_ID MeanDecreaseGini
ALNHLPLEYNSALYSR_621.0_696.4 C06_HUMAN 0.382180772
VFQFLEK_455.8_811.4 C05_HUMAN 0.260292083
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN 0.243156718
NADYSYSVWK_616.8_769.4 C05_HUMAN 0.242388196
VLEPTLK_400.3_458.3 VTDB_HUMAN 0.238171849
VEHSDLSFSK_383.5_234.1 B2MG_HUMAN 0.236873731
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN 0.224727161
VLEPTLK_400.3_587.3 VTDB_HUMAN 0.222105614
TLFIFGVTK_513.3_811.5 PSG4_HUMAN 0.210807574
ANLINNIFELAGLGK_793.9_299.2 LCAP_HUMAN 0.208714978
LNIGYIEDLK_589.3_950.5 PAI2_HUMAN 0.208027555
SEYGAALAWEK_612.8_845.5 C06_HUMAN 0.197362212
VNHVTLSQPK_374.9_244.2 B2MG_HUMAN 0.195728091
YGIEEHGK_311.5_599.3 CXA1_HUMAN 0.189969499
HFQNLGK_422.2_527.2 AFAM_HUMAN 0.189572857
AGITIPR_364.2_486.3 IL17_HUMAN 0.188351054
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN 0.185069517
SLDFTELDVAAEK_719.4_316.2 ANGT_HUMAN 0.173688295
TLAFVR_353.7_492.3 FA7_HUMAN 0.170636045
SEPRPGVLLR_375.2_654.4 FA7_HUMAN 0.170608352
TLLIANETLR_572.3_703.4 IL5_HUMAN 0.16745571
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 0.161514946
LHEAFSPVSYQHDLALLR_699.4_251.2 FA12_HUMAN 0.15852146
DGSPDVTTADIGANTPDATK_973.5_844.4 PGRP2_HUMAN 0.154028378
VPLALFALNR_557.3_620.4 PEPD_HUMAN 0.153725879
AVDIPGLEAATPYR_736.9_399.2 TENA_HUMAN 0.150920884
YGIEEHGK_311.5_341.2 CXA1_HUMAN 0.150319671
FSLVSGWGQLLDR_493.3_403.2 FA7_HUMAN 0.144781622
IEEIAAK_387.2_660.4 C05_HUMAN 0.141983196
[00221] Table 42. Random Forest 32 Middle-Late Window
Variable UniProt_ID MeanDecreaseGini
VPLALFALNR_557.3_620.4 PEPD_HUMAN 4.566619475
VFQFLEK_455.8_811.4 C05_HUMAN 3.062474666
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN 3.033740627
LIEIANHVDK_384.6_498.3 ADA12_HUMAN 2.825082394
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 2.787777983
TLAFVR_353.7_492.3 FA7_HUMAN 2.730532075
ALNHLPLEYNSALYSR_621.0_696.4 C06_HUMAN 2.671290375
AVYEAVLR_460.8_587.4 PEPD_HUMAN 2.621357053
SEPRPGVLLR_375.2_654.4 FA7_HUMAN 2.57568964
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 2.516708906
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 2.497348374
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 2.457401462
YGIEEHGK_311.5_599.3 CXA1_HUMAN 2.396824268
VLEPTLK_400.3_587.3 VTDB_HUMAN 2.388105564
SEYGAALAWEK_612.8_788.4 C06_HUMAN 2.340473883
WSAGLTSSQVDLYIPK_883.0_515.3 CBG_HUMAN 2.332007976
FGFGGSTDSGPIR_649.3_946.5 ADA12_HUMAN 2.325669514
SEYGAALAWEK_612.8_845.5 C06_HUMAN 2.31761671
QINSYVK_426.2_496.3 CBG_HUMAN 2.245221163
QINSYVK_426.2_610.3 CBG_HUMAN 2.212307699
TEQAAVAR_423.2_615.4 FA12_HUMAN 2.105860336
AVYEAVLR_460.8_750.4 PEPD_HUMAN 2.098321893
TEQAAVAR_423.2_487.3 FA12_HUMAN 2.062684763
DFNQFSSGEK_386.8_333.2 FETA_HUMAN 2.05160689
SLQAFVAVAAR_566.8_804.5 IL23A_HUMAN 1.989521006
SLDFTELDVAAEK_719.4_316.2 ANGT_HUMAN 1.820628782
DPTFIPAPIQAK_433.2_556.3 ANGT_HUMAN 1.763514326
DPTFIPAPIQAK_433.2_461.2 ANGT_HUMAN 1.760870392
VLEPTLK_400.3_458.3 VTDB_HUMAN 1.723389354
YENYTSSFFIR_713.8_756.4 IL12B_HUMAN 1.63355187
[00222] Table 43. Random Forest 100 Middle-Late Window
Variable UniProt_ID MeanDecreaseGini
VPLALFALNR_557.3_620.4 PEPD_HUMAN 1.995805024
VFQFLEK_455.8_811.4 C05_HUMAN 1.235926416
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 1.187464899
EVFSKPISWEELLQ_852.9_376.2 FA40A_HUMAN 1.166642578
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN 1.146077071
TLAFVR_353.7_492.3 FA7_HUMAN 1.143038275
ANLINNIFELAGLGK_793.9_299.2 LCAP_HUMAN 1.130656591
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN 1.098305298
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN 1.096715712
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN 1.086171713
YGIEEHGK_311.5_341.2 CXA1_HUMAN 1.071880823
ALNHLPLEYNSALYSR_621.0_696.4 C06_HUMAN 1.062278869
TQILEWAAER_608.8_761.4 EGLN_HUMAN 1.059019017
AVYEAVLR_460.8_587.4 PEPD_HUMAN 1.057920661
AEIEYLEK_497.8_552.3 LYAM1_HUMAN 1.038388955
SEPRPGVLLR_375.2_654.4 FA7_HUMAN 1.028275728
AVDIPGLEAATPYR_736.9_399.2 TENA_HUMAN 1.026032369
LIEIANHVDK_384.6_498.3 ADA12_HUMAN 1.015065282
YGIEEHGK_311.5_599.3 CXA1_HUMAN 0.98667651
VLEPTLK_400.3_587.3 VTDB_HUMAN 0.970330675
DVLLLVHNLPQNLTGHIWYK_791.8_883.0 PSG7_HUMAN 0.934747674
TAHISGLPPSTDFIVYLSGLAPSIR_871.5_472.3 TENA_HUMAN 0.889111923
TLNAYDHR_330.5_312.2 PAR3_HUMAN 0.887605636
FGFGGSTDSGPIR_649.3_946.5 ADA12_HUMAN 0.884305889
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 0.880889836
SEYGAALAWEK_612.8_788.4 C06_HUMAN 0.863585472
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 0.849232356
FGFGGSTDSGPIR_649.3_745.4 ADA12_HUMAN 0.843334824
SEYGAALAWEK_612.8_845.5 C06_HUMAN 0.842319271
TPSAAYLWVGTGASEAEK_919.5_849.4 GELS_HUMAN 0.828959173
[00223] Table 44. Random Forest Protein Middle-Late Window
Variable UniProt_ID MeanDecreaseGini
VPLALFALNR_557.3_620.4 PEPD_HUMAN 3.202123047
ANLINNIFELAGLGK_793.9_299.2 LCAP_HUMAN 2.100447309
VFQFLEK_455.8_811.4 C05_HUMAN 2.096157529
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN 2.052960939
ALNHLPLEYNSALYSR_621.0_696.4 C06_HUMAN 2.046139797
TQILEWAAER_608.8_761.4 EGLN_HUMAN 1.99287941
ELPQSIVYK_538.8_417.7 FBLN3_HUMAN 1.920894959
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN 1.917665697
SEPRPGVLLR_375.2_654.4 FA7_HUMAN 1.883557705
DALSSVQESQVAQQAR_573.0_502.3 APOC3_HUMAN 1.870232155
EVFSKPISWEELLQ_852.9_376.2 FA40A_HUMAN 1.869000136
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 1.825457092
VLEPTLK_400.3_587.3 VTDB_HUMAN 1.695327774
TEQAAVAR_423.2_615.4 FA12_HUMAN 1.685013152
LLAPSDSPEWLSFDVTGVVR_730.1_430.3 TGFB1_HUMAN 1.684068039
TLNAYDHR_330.5_312.2 PAR3_HUMAN 1.673758239
AVDIPGLEAATPYR_736.9_399.2 TENA_HUMAN 1.648896853
DVLLLVHNLPQNLTGHIWYK_791.8_883.0 PSG7_HUMAN 1.648146088
AEIEYLEK_497.8_552.3 LYAM1_HUMAN 1.645833005
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN 1.639121965
AGLLRPDYALLGHR_518.0_595.4 PGRP2_HUMAN 1.610227875
YGIEEHGK_311.5_599.3 CXA1_HUMAN 1.606978339
QINSYVK_426.2_496.3 CBG_HUMAN 1.554905578
LTTVDIVTLR_565.8_815.5 IL2RB_HUMAN 1.484081016
AALAAFNAQNNGSNFQLEEISR_789.1_746.4 FETUA_HUMAN 1.43173022
AEVIWTSSDHQVLSGK_586.3_300.2 PD1L1_HUMAN 1.394857397
ALEQDLPVNIK_620.4_570.4 CNDP1_HUMAN 1.393464547
DFNQFSSGEK_386.8_333.2 FETA_HUMAN 1.374296237
TSYQVYSK_488.2_787.4 C163A_HUMAN 1.36141387
TLEAQLTPR_514.8_685.4 HEP2_HUMAN 1.311118611
[00224] Table 45. Random Forest All Middle-Late Window
Figure imgf000148_0001
FGFGGSTDSGPIR_649.3_745.4 ADA12_HUMAN 1.245153376
LIEIANHVDK_384.6_498.3 ADA12_HUMAN 1.236529173
QINSYVK_426.2_496.3 CBG_HUMAN 1.221866266
YSHYNER_323.5_418.2 HABP2_HUMAN 1.169575572
YYGYTGAFR_549.3_450.3 TRFL_HUMAN 1.126684146
VGVISFAQK_474.8_580.3 TFR2_HUMAN 1.075283855
VFQYIDLHQDEFVQTLK_708.4_375.2 CNDP1_HUMAN 1.07279097
SPEAEDPLGVER_649.8_314.1 Z512B_HUMAN 1.05759256
DEIPHNDIALLK_459.9_510.8 HABP2_HUMAN 1.028933332
ALEQDLPVNIK_620.4_798.5 CNDP1_HUMAN 1.014443799
ALEQDLPVNIK_620.4_570.4 CNDP1_HUMAN 1.010573267
ILDGGNK_358.7_603.3 CXCL5_HUMAN 0.992175141
TSYQVYSK_488.2_787.4 C163A_HUMAN 0.95649585
YENYTSSFFIR_713.8_756.4 IL12B_HUMAN 0.955085198
SETEIHQGFQHLHQLFAK_717.4_447.2 CBG_HUMAN 0.944726739
TLPFSR_360.7_506.3 LYAM1_HUMAN 0.944426109
VLSSIEQK_452.3_691.4 1433S_HUMAN 0.933902495
AEIEYLEK_497.8_389.2 LYAM1_HUMAN 0.891235263
GTYLYNDCPGPGQDTDCR_697.0_666.3 TNR1A_HUMAN 0.87187037
SGVDLADSNQK_567.3_662.3 VGFR3_HUMAN 0.869821307
SGVDLADSNQK_567.3_591.3 VGFR3_HUMAN 0.839946466
[00226] Table 47. Random Forest 100 Late Window
Variable UniProt_ID MeanDecreaseGini
AVYEAVLR_460.8_587.4 PEPD_HUMAN 0.971695767
AEIEYLEK_497.8_552.3 LYAM1_HUMAN 0.920098693
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN 0.786924487
AVYEAVLR_460.8_750.4 PEPD_HUMAN 0.772867983
AALAAFNAQNNGSNFQLEEISR_789.1_746.4 FETUA_HUMAN 0.744138513
AYSDLSR_406.2_375.2 SAMP_HUMAN 0.736078079
VPLALFALNR_557.3_620.4 PEPD_HUMAN 0.681784822
QINSYVK_426.2_610.3 CBG_HUMAN 0.585819307
LIEIANHVDK_384.6_498.3 ADA12_HUMAN 0.577161158
FGFGGSTDSGPIR_649.3_745.4 ADA12_HUMAN 0.573055613
WSAGLTSSQVDLYIPK_883.0_515.3 CBG_HUMAN 0.569156128
ITQDAQLK_458.8_702.4 CBG_HUMAN 0.551017844
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 0.539330047
YYGYTGAFR_549.3_450.3 TRFL_HUMAN 0.527652175
VFQYIDLHQDEFVQTLK_708.4_375.2 CNDP1_HUMAN 0.484155289
FQLPGQK_409.2_429.2 PSG1_HUMAN 0.480394031
AVDIPGLEAATPYR_736.9_286.1 TENA_HUMAN 0.475252565
QINSYVK_426.2_496.3 CBG_HUMAN 0.4728541
YISPDQLADLYK_713.4_277.2 ENOA_HUMAN 0.470079977
TLPFSR_360.7_506.3 LYAM1_HUMAN 0.46881451
SPEAEDPLGVER_649.8_314.1 Z512B_HUMAN 0.4658941
ALEQDLPVNIK_620.4_798.5 CNDP1_HUMAN 0.463604174
YSHYNER_323.5_418.2 HABP2_HUMAN 0.453076307
VGVISFAQK_474.8_580.3 TFR2_HUMAN 0.437768219
LQDAGVYR_461.2_680.3 PD1L1_HUMAN 0.428524689
AEIEYLEK_497.8_389.2 LYAM1_HUMAN 0.42041448
TSYQVYSK_488.2_787.4 C163A_HUMAN 0.419411932
SWLIPLGAVDDGEHSQNEK_703.0_798.4 CNDP1_HUMAN 0.415325735
ALEQDLPVNIK_620.4_570.4 CNDP1_HUMAN 0.407951733
ILDGGNK_358.7_603.3 CXCL5_HUMAN 0.401059572
[00227] Table 48. Random Forest Protein Late Window
Variable UniProt_ID MeanDecreaseGini
AVYEAVLR_460.8_587.4 PEPD_HUMAN 1.836010146
AEIEYLEK_497.8_552.3 LYAM1_HUMAN 1.739802548
AALAAFNAQNNGSNFQLEEISR_789.1_746.4 FETUA_HUMAN 1.455337749
TGVAVNKPAEFTVDAK_549.6_258.1 FLNA_HUMAN 1.395043941
AYSDLSR_406.2_375.2 SAMP_HUMAN 1.177349958
LIEIANHVDK_384.6_683.4 ADA12_HUMAN 1.14243936
QINSYVK_426.2_496.3 CBG_HUMAN 1.05284482
ALEQDLPVNIK_620.4_798.5 CNDP1_HUMAN 0.971678206
YISPDQLADLYK_713.4_277.2 ENOA_HUMAN 0.902293734
AVDIPGLEAATPYR_736.9_286.1 TENA_HUMAN 0.893163413
SPEAEDPLGVER_649.8_314.1 Z512B_HUMAN 0.856551531
ILDGGNK_358.7_603.3 CXCL5_HUMAN 0.841485153
VGVISFAQK_474.8_580.3 TFR2_HUMAN 0.835256078
YYGYTGAFR_549.3_450.3 TRFL_HUMAN 0.831195917
YSHYNER_323.5_418.2 HABP2_HUMAN 0.814479968
FQLPGQK_409.2_276.1 PSG1_HUMAN 0.77635168
YENYTSSFFIR_713.8_756.4 IL12B_HUMAN 0.761241391
TEQAAVAR_423.2_615.4 FA12_HUMAN 0.73195592
SGVDLADSNQK_567.3_662.3 VGFR3_HUMAN 0.72504131
VLSSIEQK_452.3_691.4 1433S_HUMAN 0.713380314
GTYLYNDCPGPGQDTDCR_697.0_666.3 TNR1A_HUMAN 0.704248586
TSYQVYSK_488.2_787.4 C163A_HUMAN 0.69026345
TLEAQLTPR_514.8_685.4 HEP2_HUMAN 0.654641588
AEVIWTSSDHQVLSGK_586.3_300.2 PD1L1_HUMAN 0.634751081
TAVTANLDIR_537.3_288.2 CHL1_HUMAN 0.619871203
ITENDIQIALDDAK_779.9_632.3 APOB_HUMAN 0.606313398
TASDFITK_441.7_781.4 GELS_HUMAN 0.593535076
SPQAFYR_434.7_556.3 REL3_HUMAN 0.592004045
NHYTESISVAK_624.8_415.2 NEUR1_HUMAN 0.588383911
LTTVDIVTLR_565.8_815.5 IL2RB_HUMAN 0.587343951
[00228] Table 49. Random Forest All Late Window
Figure imgf000151_0001
[00229] Table 50. Selected Transitions for Early Window
Figure imgf000151_0002
Transition Parent Protein
TEFLSNYLTNVDDITLVPGTLGR_846.8_600.3 ENPP2_HUMAN
TASDFITK_441.7_781.4 GELS_HUMAN
LPNNVLQEK_527.8_844.5 AFAM_HUMAN
AHYDLR_387.7_288.2 FETUA_HUMAN
ITLPDFTGDLR_624.3_288.2 LBP_HUMAN
IEGNLIFDPNNYLPK_874.0_414.2 APOB_HUMAN
ITGFLKPGK_320.9_301.2 LBP_HUMAN
FSVVYAK_407.2_381.2 FETUA_HUMAN
ITGFLKPGK_320.9_429.3 LBP_HUMAN
VFQFLEK_455.8_811.4 C05_HUMAN
LIQDAVTGLTVNGQITGDK_972.0_798.4 ITIH3_HUMAN
DADPDTFFAK_563.8_825.4 AFAM_HUMAN
[00230] Table 51. Selected Proteins for Early Window
Figure imgf000152_0001
[00231] Table 52. Selected Transitions for Middle-Late Window
Transition Parent Protein
VPLALFALNR_557.3_620.4 PEPD_HUMAN
VFQFLEK_455.8_811.4 C05_HUMAN
AQPVQVAEGSEPDGFWEALGGK_758.0_574.3 GELS_HUMAN
LIEIANHVDK_384.6_498.3 ADA12_HUMAN
TLAFVR_353.7_492.3 FA7_HUMAN
ALNHLPLEYNSALYSR_621.0_696.4 C06_HUMAN
AVYEAVLR_460.8_587.4 PEPD_HUMAN
SEPRPGVLLR_375.2_654.4 FA7_HUMAN
TYLHTYESEI_628.3_515.3 ENPP2_HUMAN
ALNHLPLEYNSALYSR_621.0_538.3 C06_HUMAN
[00232] Table 53. Selected Proteins for Middle-Late Window
Protein
Xaa-Pro dipeptidase PEPD_HUMAN
Leucyl-cystinyl aminopeptidase LCAP_HUMAN
complement component C5 C05_HUMAN
Gelsolin GELS_HUMAN
complement component C6 precursor C06_HUMAN
Endoglin precursor EGLN_HUMAN
EGF-containing fibulin-like extracellular matrix protein 1 FBLN3_HUMAN
coagulation factor VII isoform a FA7_HUMAN
Disintegrin and metalloproteinase domain-containing protein 12 ADA12_HUMAN
vitamin D-binding protein isoform 1 precursor VTDB_HUMAN
coagulation factor XII precursor FA12_HUMAN
Corticosteroid-binding globulin CBG_HUMAN
Example 6. Study V to Further Refine Preterm Birth Biomarkers
[00233] An additional hypothesis-dependent discovery study was performed with a further refined scheduled MRM assay. Less robust transitions were again removed to improve analytical performance and make room for the inclusion of stable-isotope labeled standards (SIS) corresponding to 79 analytes of interest identified in previous studies. SIS peptides have the same amino acid sequence, chromatographic behaviour and MS fragmentation behaviour as their endogenous peptide counterparts, but differ in mass. They can therefore be used to reduce LC-MS analytical variability and confirm analyte identity. Samples included approximately 60 spontaneous PTB cases (delivery at less than 37 weeks, 0 days) and 180 term controls (delivery at greater than or equal to 37 weeks, 0 days). Each case was assigned a "matched" control, matched to within one day of blood draw, and two "random" controls matched to the same 3-week blood draw window (17-19, 20-22 or 23-25 weeks gestation). For the purposes of analysis, these three blood draw windows were combined. Samples were processed essentially as described previously, except that in this study tryptic digests were reconstituted in a solution containing SIS standards. Raw analyte peak areas were Box-Cox transformed, corrected for run order and batch effects by regression, and used for univariate and multivariate statistical analyses. Univariate analysis included determination of p-values for adjusted peak areas for all analytes from t-tests comparing cases with controls, where controls were defined as either deliveries at >37 weeks (Table 54) or deliveries at >40 weeks (Table 55). Univariate analysis also included determination of p-values for a linear model that evaluates the dependence of each analyte's adjusted peak area on the time to birth (gestational age at birth minus the gestational age at blood draw) (Table 56) and on the gestational age at birth (Table 57).
Additionally, raw peak area ratios were calculated for endogenous analytes and their corresponding SIS counterparts, Box-Cox transformed, and then used for univariate and multivariate statistical analyses. The above univariate analysis was repeated for analyte/SIS peak area ratio values, and the results are summarized in Tables 58-61, respectively.
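The preprocessing described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure actually used in the study: the peak areas, run order, case labels and the Box-Cox parameter `lam` are hypothetical, batch correction is reduced to a single run-order covariate, and Welch's form of the t statistic is an assumption because the text does not state which t-test variant was applied. Only the statistic is computed; a p-value would additionally require the t distribution.

```python
import math
from statistics import mean, variance


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def regress_out(y, x):
    """Residuals of y after a least-squares fit on one covariate x (e.g. run order)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def welch_t(a, b):
    """Two-sample t statistic without assuming equal variances (Welch)."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical data for one transition measured in six samples.
run_order = [1, 2, 3, 4, 5, 6]
is_case = [True, False, True, False, True, False]
endogenous_area = [1180.0, 1425.0, 1010.0, 1530.0, 1095.0, 1610.0]
sis_area = [1000.0, 1040.0, 990.0, 1015.0, 970.0, 1025.0]
lam = 0.5  # assumed; in practice lambda would be estimated from the data

# Route 1: adjusted peak areas (transform, then remove the run-order trend).
adjusted = regress_out(box_cox(endogenous_area, lam), run_order)
# Route 2: analyte/SIS peak area ratios (ratio first, then transform).
ratio = box_cox([e / s for e, s in zip(endogenous_area, sis_area)], lam)

for label, values in (("adjusted peak area", adjusted), ("analyte/SIS ratio", ratio)):
    cases = [v for v, c in zip(values, is_case) if c]
    controls = [v for v, c in zip(values, is_case) if not c]
    print(label, "Welch t =", round(welch_t(cases, controls), 3))
```

Dividing by the SIS signal cancels variability shared by the labeled and endogenous peptides within a run, which is why the ratio route in this sketch omits the explicit run-order regression.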
[00234] Multivariate random forest regression models were built using analyte values and clinical variables (e.g., maternal age (MAGE) and body mass index (BMI)) to predict gestational age at birth (GAB). The accuracy of the random forest was evaluated with respect to the correlation of the predicted and actual GAB, and with respect to the mean absolute deviation (MAD) of the predicted from the actual GAB. The accuracy was further evaluated by determining the area under the receiver operating characteristic curve (AUC) when using the predicted GAB as a quantitative variable to classify subjects as full term or pre-term. Random forest importance values were fit to an empirical cumulative distribution function and probabilities (P) were calculated. We report the analytes by importance ranking (P>0.7) in the random forest models, using adjusted analyte peak area values (Table 62) and analyte/SIS peak area ratio values (Table 63).
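The evaluation steps in this paragraph can be illustrated with a short Python sketch. It assumes that the random forest has already been fitted elsewhere and that its importance values and predictions are available; all variable names and numbers below are hypothetical and are not taken from Tables 62 or 63.

```python
from statistics import mean


def ecdf_probabilities(importances):
    """Empirical CDF: for each variable, the fraction of importances <= its own."""
    values = list(importances.values())
    n = len(values)
    return {name: sum(v <= imp for v in values) / n for name, imp in importances.items()}


def mean_absolute_deviation(predicted, actual):
    """Mean absolute deviation of predicted from actual GAB."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def auc_preterm(predicted_gab, is_preterm):
    """AUC by pairwise comparison: the chance that a pre-term subject has a
    lower predicted GAB than a term subject, counting ties as one half."""
    pre = [p for p, flag in zip(predicted_gab, is_preterm) if flag]
    term = [p for p, flag in zip(predicted_gab, is_preterm) if not flag]
    wins = sum(1.0 if p < t else 0.5 if p == t else 0.0 for p in pre for t in term)
    return wins / (len(pre) * len(term))


# Hypothetical importance values for five model inputs.
importances = {"analyte_A": 2.3, "analyte_B": 1.1, "analyte_C": 0.4,
               "analyte_D": 0.9, "clinical_BMI": 1.6}
probabilities = ecdf_probabilities(importances)
selected = sorted(name for name, p in probabilities.items() if p > 0.7)

# Hypothetical predicted and actual GAB in days; pre-term means actual < 259 days.
predicted = [252.0, 261.0, 270.0, 276.0, 281.0]
actual = [249.0, 255.0, 272.0, 280.0, 279.0]
preterm = [a < 259 for a in actual]

print("selected:", selected)
print("MAD (days):", mean_absolute_deviation(predicted, actual))
print("AUC:", auc_preterm(predicted, preterm))
```

Lower predicted GAB is treated as evidence for pre-term birth, so the AUC is computed with the predicted GAB acting as a reversed score.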
[00235] The probability of pre-term birth, p(PTB), may be estimated using the predicted gestational age at birth (GAB) as follows. The estimate will be based on women enrolled in the Sera PAPR clinical trial, which provided the subjects used to develop the PTB prediction methods.
[00236] Among women with a predicted GAB of j days plus or minus k days, p(PTB) was estimated as the proportion of women in the PAPR clinical trial with a predicted GAB of j days plus or minus k days who actually deliver before 37 weeks gestational age.
[00237] More generally, for women with a predicted GAB of j days plus or minus k days, the probability that the actual gestational age at birth will be less than a specified gestational age, p(actual GAB < specified GAB), was estimated as the proportion of women in the PAPR clinical trial with a predicted GAB of j days plus or minus k days who actually deliver before the specified gestational age. Figure 1 depicts a scatterplot of actual gestational age at birth versus predicted gestational age from the random forest regression model. Figure 2 shows the distribution of predicted gestational age from the random forest regression model versus actual gestational age at birth (GAB), where actual GAB was given in categories of (i) less than 37 weeks, (ii) 37 to 39 weeks, and (iii) 40 weeks or greater.
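The proportion-based estimate described in the two preceding paragraphs can be sketched as follows. The function name, the record layout and the example values are hypothetical; the only quantities taken from the text are the window of j days plus or minus k days and the 37-week cutoff, which corresponds to 259 days.

```python
def estimate_probability(records, j, k, specified_gab_days=259):
    """Estimate p(actual GAB < specified GAB) for a predicted GAB of j +/- k days.

    records is an iterable of (predicted_gab_days, actual_gab_days) pairs from a
    reference cohort. The default of 259 days equals 37 weeks, 0 days, so the
    default call estimates p(PTB). Returns None if no subject falls in the window.
    """
    in_window = [actual for predicted, actual in records if abs(predicted - j) <= k]
    if not in_window:
        return None
    return sum(actual < specified_gab_days for actual in in_window) / len(in_window)


# Hypothetical reference cohort: (predicted GAB, actual GAB), both in days.
cohort = [(255, 250), (258, 262), (262, 256), (270, 268), (275, 280), (281, 279)]

p_ptb = estimate_probability(cohort, j=260, k=5)
p_before_40_weeks = estimate_probability(cohort, j=275, k=6, specified_gab_days=280)
print(p_ptb, p_before_40_weeks)
```

A wider k includes more reference subjects and gives a more stable proportion at the cost of a less specific window, so the choice of k is a trade-off that the text leaves open.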
[00238] Table 54. Univariate p-values for Adjusted Peak Areas (<37 vs >37 weeks)
Figure imgf000155_0001
Transition Protein pvalue
ILILPSVTR_506.3_559.3 PSGx_HUMAN 0.042460036
YYLQGAK_421.7_516.3 ITIH4_HUMAN 0.044511962
TPSAAYLWVGTGASEAEK_919.5_849.4 GELS_HUMAN 0.046362381
AGLLRPDYALLGHR_518.0_595.4 PGRP2_HUMAN 0.046572355
TYLHTYESEI_628.3_908.4 ENPP2_HUMAN 0.04754503
FSLVSGWGQLLDR_493.3_403.2 FA7_HUMAN 0.048642964
VNFTEIQK_489.8_765.4 FETA_HUMAN 0.04871392
LFIPQITR_494.3_727.4 PSG9_HUMAN 0.049288923
DISEVVTPR_508.3_787.4 CFAB_HUMAN 0.049458374
SEPRPGVLLR_375.2_454.3 FA7_HUMAN 0.049567047
[00239] Table 55. Univariate p-values for Adjusted Peak Areas (<37 vs >40 weeks)
Transition Protein pvalue
SPELQAEAK 486.8 659.4 APOA2 HUMAN 0.001457796
DYWSTVK 449.7 347.2 APOC3 HUMAN 0.001619622
DYWSTVK 449.7 620.3 APOC3 HUMAN 0.002068704
DALSSVQESQVAQQAR 573.0 502.3 APOC3 HUMAN 0.00250563
GWVTDGFSSLK 598.8 854.4 APOC3 HUMAN 0.002543943
SPELQAEAK 486.8 788.4 APOA2 HUMAN 0.003108814
SEPRPGVLLR 375.2 654.4 FA7 HUMAN 0.004035832
DALSSVQESQVAQQAR 573.0 672.4 APOC3 HUMAN 0.00434652
SEYGAALAWEK 612.8 788.4 C06 HUMAN 0.005306924
GWVTDGFSSLK 598.8 953.5 APOC3 HUMAN 0.005685534
ALNHLPLEYNSALYSR 621.0 696.4 C06 HUMAN 0.005770384
TYLHTYESEI 628.3 515.3 ENPP2 HUMAN 0.005798991
ENPAVIDFELAPIVDLVR 670.7 601.4 C06 HUMAN 0.006248095
ALNHLPLEYNSALYSR 621.0 538.3 C06 HUMAN 0.006735817
TYLHTYESEI 628.3 908.4 ENPP2 HUMAN 0.007351774
AGLLRPDYALLGHR 518.0 369.2 PGRP2 HUMAN 0.009541521
AKPALEDLR 506.8 813.5 APOA1 HUMAN 0.009780371
SEYGAALAWEK 612.8 845.5 C06 HUMAN 0.010085363
FSLVSGWGQLLDR 493.3 447.3 FA7 HUMAN 0.010401836
WGAAPYR 410.7 634.3 PGRP2 HUMAN 0.011233623
ENPAVIDFELAPIVDLVR 670.7 811.5 C06 HUMAN 0.012029564
DVLLLVHNLPQNLPGYFWYK 810.4 215.1 PSG9 HUMAN 0.014808277
LFIPQITR 494.3 614.4 PSG9 HUMAN 0.015879755
WGAAPYR 410.7 577.3 PGRP2 HUMAN 0.016562435
AGLLRPDYALLGHR 518.0 595.4 PGRP2 HUMAN 0.016793521
TLAFVR 353.7 492.3 FA7 HUMAN 0.016919708
FSLVSGWGQLLDR 493.3 403.2 FA7 HUMAN 0.016937583
WWGGQPLWITATK 772.4 373.2 ENPP2 HUMAN 0.019050115
GYVIIKPLVWV 643.9 304.2 SAMP HUMAN 0.019675317
DVLLLVHNLPQNLPGYFWYK 810.4 960.5 PSG9 HUMAN 0.020387647
FGFGGSTDSGPIR 649.3 745.4 ADA12 HUMAN 0.020458335
DVLLLVHNLPQNLPGYFWYK 810.4 328.2 PSG9 HUMAN 0.021488084
WWGGQPLWITATK 772.4 929.5 ENPP2 HUMAN 0.021709354 Transition Protein pvalue
LDFHFSSDR 375.2 448.2 INHBC HUMAN 0.022403383
LFIPQITR 494.3 727.4 PSG9 HUMAN 0.025561103
TEFLSNYLTNVDDITLVPGTLGR 846.8 600.3 ENPP2 HUMAN 0.029344366
LSIPQITTK 500.8 800.5 PSG5 HUMAN 0.031361776
ALVLELAK 428.8 672.4 INHBE HUMAN 0.031690737
SEPRPGVLLR 375.2 454.3 FA7 HUMAN 0.033067953
LSIPQITTK 500.8 687.4 PSG5 HUMAN 0.033972449
LDFHFSSDR 375.2 611.3 INHBC HUMAN 0.034500249
LDFHFSSDR 375.2 464.2 INHBC HUMAN 0.035166664
GAVHVVVAETDYQSFAVLYLER 822.8 580.3 C08G HUMAN 0.037334975
HELTDEELQSLFTNFANVVDK 817.1 854.4 AFAM HUMAN 0.039258528
AYSDLSR 406.2 375.2 SAMP HUMAN 0.04036485
YYLQGAK 421.7 516.3 ITIH4 HUMAN 0.042204165
ILPSVPK 377.2 264.2 PGH1 HUMAN 0.042397885
ELLESYIDGR 597.8 710.4 THRB HUMAN 0.043053589
ALALPPLGLAPLLNLWAKPQGR 770.5 256.2 SHBG HUMAN 0.045692283
VGEYSLYIGR 578.8 871.5 SAMP HUMAN 0.04765767
ANDQYLTAAALHNLDEAVK 686.4 317.2 IL1A HUMAN 0.048928376
YYGYTGAFR 549.3 551.3 TRFL HUMAN 0.049568351
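The rows of the transition tables above can be read programmatically. The following is a minimal sketch, assuming each row has the layout "peptide, two m/z values, protein, HUMAN, p-value" and that the two numbers are the precursor and fragment m/z of the monitored transition. The names `TransitionRow` and `parse_row` and the 0.01 cutoff are illustrative assumptions and are not part of the specification.

```python
import re
from typing import NamedTuple


class TransitionRow(NamedTuple):
    peptide: str
    precursor_mz: float  # assumed meaning of the first number in a row
    fragment_mz: float   # assumed meaning of the second number in a row
    protein: str
    p_value: float


# One row: PEPTIDE  m/z  m/z  PROTEIN  HUMAN  p-value
ROW = re.compile(
    r"^(?P<peptide>[A-Z]+)\s+(?P<q1>\d+\.\d+)\s+(?P<q3>\d+\.\d+)\s+"
    r"(?P<protein>\w+)\s+HUMAN\s+(?P<p>\S+)$"
)


def parse_row(line: str) -> TransitionRow:
    """Parse one table row; raise ValueError if the layout does not match."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized row: {line!r}")
    return TransitionRow(
        m["peptide"], float(m["q1"]), float(m["q3"]), m["protein"], float(m["p"])
    )


# Three rows copied from Table 55.
rows = [
    parse_row(s)
    for s in (
        "SPELQAEAK 486.8 659.4 APOA2 HUMAN 0.001457796",
        "DYWSTVK 449.7 347.2 APOC3 HUMAN 0.001619622",
        "YYGYTGAFR 549.3 551.3 TRFL HUMAN 0.049568351",
    )
]

# Illustrative cutoff only; the specification does not prescribe 0.01.
significant = [r for r in rows if r.p_value < 0.01]
```

Rows in which the protein identifier precedes the peptide, as in the peak area ratio tables, would need a different pattern.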
[00240] Table 56. Univariate p-values for Adjusted Peak Areas in Time to Birth Linear Model
Figure imgf000157_0001
Protein pvalue
PSGx HUMAN 0.035719227
CBG HUMAN 0.036331871
CSH HUMAN 0.039896611
CSH HUMAN 0.04244001
SAMP HUMAN 0.047112128
LBP HUMAN 0.048141371
LBP HUMAN 0.048433174
C06 HUMAN 0.04850949
PSGx HUMAN 0.049640167
[00241] Table 57. Univariate p-values for Adjusted Peak Areas in Gestation Age at Birth Linear Model
Figure imgf000158_0001
Transition Protein pvalue
HELTDEELQSLFTNFANVVDK 817.1 854.4 AFAM HUMAN 0.005370567
ALNHLPLEYNSALYSR 621.0 696.4 C06 HUMAN 0.005705922
ITQDAQLK 458.8 702.4 CBG HUMAN 0.006762484
ITLPDFTGDLR 624.3 920.5 LBP HUMAN 0.006993268
SILFLGK 389.2 577.4 THBG HUMAN 0.007134146
WSAGLTSSQVDLYIPK 883.0 357.2 CBG HUMAN 0.007670388
GVTSVSQIFHSPDLAIR 609.7 472.3 IC1 HUMAN 0.007742729
VGEYSLYIGR 578.8 871.5 SAMP HUMAN 0.007778691
ITLPDFTGDLR 624.3 288.2 LBP HUMAN 0.008179918
YYLQGAK 421.7 327.1 ITIH4 HUMAN 0.008404686
ALNHLPLEYNSALYSR 621.0 538.3 C06 HUMAN 0.008601162
DYWSTVK 449.7 620.3 APOC3 HUMAN 0.008626786
TVQAVLTVPK 528.3 855.5 PEDF HUMAN 0.008907523
ITGFLKPGK 320.9 301.2 LBP HUMAN 0.009155417
LFIPQITR 494.3 727.4 PSG9 HUMAN 0.009571006
SPELQAEAK 486.8 788.4 APOA2 HUMAN 0.009776508
DYWSTVK 449.7 347.2 APOC3 HUMAN 0.00998356
ITGFLKPGK 320.9 429.3 LBP HUMAN 0.010050264
FLNWIK 410.7 560.3 HABP2 HUMAN 0.010372454
DLHLSDVFLK 396.2 366.2 C06 HUMAN 0.010806378
GVTSVSQIFHSPDLAIR 609.7 908.5 IC1 HUMAN 0.011035991
VEHSDLSFSK 383.5 468.2 B2MG HUMAN 0.011113172
LLDSLPSDTR 558.8 276.2 IC1 HUMAN 0.011589013
LLDSLPSDTR 558.8 890.4 IC1 HUMAN 0.011629438
QALEEFQK 496.8 551.3 C08B HUMAN 0.011693839
LLDSLPSDTR 558.8 575.3 IC1 HUMAN 0.012159314
IIGGSDADIK 494.8 762.4 C1S HUMAN 0.013080243
AFIQLWAFDAVK 704.9 650.4 AMBP HUMAN 0.013462234
GFQALGDAADIR 617.3 717.4 TIMP1 HUMAN 0.014370997
LPNNVLQEK 527.8 730.4 AFAM HUMAN 0.014424891
DTDTGALLFIGK 625.8 217.1 PEDF HUMAN 0.014967952
VQTAHFK 277.5 502.3 C08A HUMAN 0.01524844
ILILPSVTR 506.3 559.3 PSGx HUMAN 0.015263132
SILFLGK 389.2 201.1 THBG HUMAN 0.015265233
TVQAVLTVPK 528.3 428.3 PEDF HUMAN 0.015344052
VEPLYELVTATDFAYSSTVR 754.4 712.4 C08B HUMAN 0.015451068
FSLVSGWGQLLDR 493.3 447.3 FA7 HUMAN 0.015510454
GWVTDGFSSLK 598.8 854.4 APOC3 HUMAN 0.01610797
LSETNR 360.2 519.3 PSG1 HUMAN 0.016433362
TQILEWAAER 608.8 632.3 EGLN HUMAN 0.01644844
SETEIHQGFQHLHQLFAK 717.4 318.1 CBG HUMAN 0.016720367
TNLESILSYPK 632.8 936.5 IC1 HUMAN 0.017314185
TNLESILSYPK 632.8 807.5 IC1 HUMAN 0.017593786
AYSDLSR 406.2 375.2 SAMP HUMAN 0.018531348
YEVQGEVFTKPQLWP 911.0 392.2 CRP HUMAN 0.019111323
AYSDLSR 406.2 577.3 SAMP HUMAN 0.019271266
QALEEFQK 496.8 680.3 C08B HUMAN 0.019429489
APLTKPLK 289.9 398.8 CRP HUMAN 0.020110081
FQPTLLTLPR 593.4 276.1 IC1 HUMAN 0.020114306
ITQDAQLK 458.8 803.4 CBG HUMAN 0.020401782
AVLHIGEK 289.5 292.2 THBG HUMAN 0.02056597
ANDQYLTAAALHNLDEAVK 686.4 317.2 IL1A HUMAN 0.020770124
VGEYSLYIGR 578.8 708.4 SAMP HUMAN 0.021126414
TLYSSSPR 455.7 533.3 IC1 HUMAN 0.021306106
VEHSDLSFSK 383.5 234.1 B2MG HUMAN 0.021640643
HELTDEELQSLFTNFANVVDK 817.1 906.5 AFAM HUMAN 0.021921609
TLYSSSPR 455.7 696.3 IC1 HUMAN 0.022196181
GYVIIKPLVWV 643.9 854.6 SAMP HUMAN 0.023126336
DEIPHNDIALLK 459.9 260.2 HABP2 HUMAN 0.023232158
ILILPSVTR 506.3 785.5 PSGx HUMAN 0.023519909
WNFAYWAAHQPWSR 607.3 545.3 PRG2 HUMAN 0.023697087
FQPTLLTLPR 593.4 712.5 IC1 HUMAN 0.023751959
AQPVQVAEGSEPDGFWEALGGK 758.0 623.4 GELS HUMAN 0.024262721
DEIPHNDIALLK 459.9 510.8 HABP2 HUMAN 0.024414348
GDSGGAFAVQDPNDK 739.3 716.3 C1S HUMAN 0.025075028
FLNWIK 410.7 561.3 HABP2 HUMAN 0.025649617
APLTKPLK 289.9 357.2 CRP HUMAN 0.025961162
ALDLSLK 380.2 185.1 ITIH3 HUMAN 0.026233504
GWVTDGFSSLK 598.8 953.5 APOC3 HUMAN 0.026291884
SETEIHQGFQHLHQLFAK 717.4 447.2 CBG HUMAN 0.026457136
GDSGGAFAVQDPNDK 739.3 473.2 C1S HUMAN 0.02727457
YEVQGEVFTKPQLWP 911.0 293.1 CRP HUMAN 0.028244448
HVVQLR 376.2 614.4 IL6RA HUMAN 0.028428028
DTDTGALLFIGK 625.8 818.5 PEDF HUMAN 0.028773557
EVPLSALTNILSAQLISHWK 740.8 996.6 PAI1 HUMAN 0.029150774
AFTECCVVASQLR 770.9 574.3 C05 HUMAN 0.029993325
TLAFVR 353.7 492.3 FA7 HUMAN 0.030064307
LWAYLTIQELLAK 781.5 300.2 ITIH1 HUMAN 0.030368674
DEIPHNDIALLK 459.9 245.1 HABP2 HUMAN 0.031972082
AGLLRPDYALLGHR 518.0 369.2 PGRP2 HUMAN 0.032057409
AVYEAVLR 460.8 587.4 PEPD HUMAN 0.032527521
LPNNVLQEK 527.8 844.5 AFAM HUMAN 0.033807082
GAVHVVVAETDYQSFAVLYLER 822.8 580.3 C08G HUMAN 0.034370139
WNFAYWAAHQPWSR 607.3 673.3 PRG2 HUMAN 0.0349737
EAQLPVIENK 570.8 329.2 PLMN HUMAN 0.035304322
VQEAHLTEDQIFYFPK 655.7 701.4 C08G HUMAN 0.035704382
AFIQLWAFDAVK 704.9 836.4 AMBP HUMAN 0.035914532
SGFSFGFK 438.7 585.3 C08B HUMAN 0.037168221
SGFSFGFK 438.7 732.4 C08B HUMAN 0.040182596
DADPDTFFAK 563.8 302.1 AFAM HUMAN 0.041439744
EAQLPVIENK 570.8 699.4 PLMN HUMAN 0.041447675
IIGGSDADIK 494.8 260.2 C1S HUMAN 0.041683256
AVLTIDEK 444.8 718.4 A1AT HUMAN 0.043221658
SEPRPGVLLR 375.2 654.4 FA7 HUMAN 0.044079127
YHFEALADTGISSEFYDNANDLLSK 940.8 874.5 C08A HUMAN 0.045313634
HFQNLGK 422.2 527.2 AFAM HUMAN 0.047118971
LEQGENVFLQATDK 796.4 822.4 C1QB HUMAN 0.047818928
NTVISVNPSTK 580.3 732.4 VCAM1 HUMAN 0.048102262
YYGYTGAFR 549.3 551.3 TRFL HUMAN 0.048331316
ISLLLIESWLEPVR 834.5 500.3 CSH HUMAN 0.049561581
LQVLGK 329.2 416.3 A2GL HUMAN 0.049738493
[00242] Table 58. Univariate p-values for Peak Area Ratios (<37 vs >37 weeks)
Figure imgf000161_0001
[00244] Table 60. Univariate p-values for Peak Area Ratios in Time to Birth Linear Model
Figure imgf000161_0002
UniProt ID Transition pvalue
PSG6 HUMAN SNPVTLNVLYGPDLPR 585.7 817.4 5.71E-12
PSG6 HUMAN SNPVTLNVLYGPDLPR 585.7 654.4 1.82E-11
VGFR3 HUMAN SGVDLADSNQK 567.3 662.3 4.57E-11
INHBE HUMAN ALVLELAK 428.8 331.2 1.04E-08
PSG2 HUMAN IHPSYTNYR 384.2 452.2 6.27E-08
PSG9 HUMAN LFIPQITR 494.3 727.4 1.50E-07
VGFR3 HUMAN SGVDLADSNQK 567.3 591.3 2.09E-07
PSG9 HUMAN LFIPQITR 494.3 614.4 2.71E-07
PSG9 HUMAN DVLLLVHNLPQNLPGYFWYK 810.4 960.5 3.10E-07
PSG2 HUMAN IHPSYTNYR 384.2 338.2 2.55E-06
ITIH3 HUMAN LIQDAVTGLTVNGQITGDK 972.0 640.4 2.76E-06
ENPP2 HUMAN TYLHTYESEI 628.3 908.4 2.82E-06
ENPP2 HUMAN WWGGQPLWITATK 772.4 373.2 3.75E-06
PSG9 HUMAN DVLLLVHNLPQNLPGYFWYK 810.4 328.2 3.94E-06
B2MG HUMAN VEHSDLSFSK 383.5 468.2 5.42E-06
ENPP2 HUMAN WWGGQPLWITATK 772.4 929.5 7.93E-06
ANGT HUMAN ALQDQLVLVAAK 634.9 289.2 1.04E-05
B2MG HUMAN VNHVTLSQPK 374.9 244.2 1.46E-05
AFAM HUMAN LPNNVLQEK 527.8 730.4 1.50E-05
AFAM HUMAN LPNNVLQEK 527.8 844.5 1.98E-05
THBG HUMAN AVLHIGEK 289.5 292.2 2.15E-05
ENPP2 HUMAN TYLHTYESEI 628.3 515.3 2.17E-05
IL12B HUMAN DIIKPDPPK 511.8 342.2 3.31E-05
AFAM HUMAN DADPDTFFAK 563.8 302.1 6.16E-05
THBG HUMAN AVLHIGEK 289.5 348.7 8.34E-05
PSG9 HUMAN DVLLLVHNLPQNLPGYFWYK 810.4 215.1 0.000104442
B2MG HUMAN VEHSDLSFSK 383.5 234.1 0.000140786
TRFL HUMAN YYGYTGAFR 549.3 450.3 0.000156543
HEMO HUMAN QGHNSVFLIK 381.6 260.2 0.000164578
A1BG HUMAN LLELTGPK 435.8 227.2 0.000171113
C06 HUMAN ALNHLPLEYNSALYSR 621.0 696.4 0.000242116
C06 HUMAN ALNHLPLEYNSALYSR 621.0 538.3 0.00024681
ALS HUMAN IRPHTFTGLSGLR 485.6 432.3 0.000314359
ITIH2 HUMAN LSNENHGIAQR 413.5 544.3 0.0004877
PEDF HUMAN TVQAVLTVPK 528.3 855.5 0.000508174
AFAM HUMAN HFQNLGK 422.2 527.2 0.000522139
FLNA HUMAN TGVAVNKPAEFTVDAK 549.6 258.1 0.000594403
ANGT HUMAN ALQDQLVLVAAK 634.9 956.6 0.000640673
AFAM HUMAN HFQNLGK 422.2 285.1 0.000718763
HGFA HUMAN LHKPGVYTR 357.5 692.4 0.000753293
HGFA HUMAN LHKPGVYTR 357.5 479.3 0.000909298
HABP2 HUMAN FLNWIK 410.7 561.3 0.001282014
FETUA HUMAN HTLNQIDEVK 598.8 951.5 0.001389792
AFAM HUMAN DADPDTFFAK 563.8 825.4 0.001498237
B2MG HUMAN VNHVTLSQPK 374.9 459.3 0.001559862
ALS HUMAN IRPHTFTGLSGLR 485.6 545.3 0.001612361
A1BG HUMAN LLELTGPK 435.8 644.4 0.002012656
F13B HUMAN LIENGYFHPVK 439.6 343.2 0.00275216
ITIH2 HUMAN LSNENHGIAQR 413.5 519.8 0.00356561
APOC3 HUMAN DALSSVQESQVAQQAR 573.0 672.4 0.00392745
F13B HUMAN LIENGYFHPVK 439.6 627.4 0.00434836
PEDF HUMAN TVQAVLTVPK 528.3 428.3 0.00482765
PLMN HUMAN YEFLNGR 449.7 293.1 0.007325436
HEMO HUMAN QGHNSVFLIK 381.6 520.4 0.009508516
FETUA HUMAN HTLNQIDEVK 598.8 958.5 0.010018936
C05 HUMAN LQGTLPVEAR 542.3 842.5 0.011140661
PLMN HUMAN YEFLNGR 449.7 606.3 0.01135322
C05 HUMAN TLLPVSKPEIR 418.3 288.2 0.015045275
HABP2 HUMAN FLNWIK 410.7 560.3 0.01523134
APOC3 HUMAN DALSSVQESQVAQQAR 573.0 502.3 0.01584708
C05 HUMAN LQGTLPVEAR 542.3 571.3 0.017298064
CFAB HUMAN DISEVVTPR 508.3 472.3 0.021743221
CERU HUMAN TTIEKPVWLGFLGPIIK 638.0 640.4 0.02376225
C08G HUMAN SLPVSDSVLSGFEQR 810.9 723.3 0.041150397
C08G HUMAN FLQEQGHR 338.8 497.3 0.042038143
C05 HUMAN VFQFLEK 455.8 811.4 0.043651929
C08B HUMAN QALEEFQK 496.8 680.3 0.04761631
[00245] Table 61. Univariate p-values for Peak Area Ratios in Gestation Age at Birth Linear Model
Figure imgf000163_0001
UniProt ID Transition pvalue
AFAM HUMAN DADPDTFFAK 563.8 302.1 0.033791429
C06 HUMAN ALNHLPLEYNSALYSR 621.0 538.3 0.034865341
AFAM HUMAN LPNNVLQEK 527.8 844.5 0.039880594
PEDF HUMAN TVQAVLTVPK 528.3 428.3 0.040854402
PLMN HUMAN EAQLPVIENK 570.8 329.2 0.041023812
LBP HUMAN ITLPDFTGDLR 624.3 920.5 0.042276813
C08G HUMAN VQEAHLTEDQIFYFPK 655.7 701.4 0.042353851
PLMN HUMAN YEFLNGR 449.7 606.3 0.04416504
B2MG HUMAN VNHVTLSQPK 374.9 459.3 0.045458409
CFAB HUMAN DISEVVTPR 508.3 472.3 0.046493405
INHBE HUMAN ALVLELAK 428.8 331.2 0.04789353
[00246] Table 62. Random Forest Importance Values Using Adjusted Peak Areas
Figure imgf000164_0001
Transition Rank Importance
ADA12 LIEIANHVDK 384.6 498.3 34 305.8803542
PGRP2 WGAAPYR 410.7 577.3 35 303.5539874
PSG9 LFIPQITR 494.3 614.4 36 300.7877317
HABP2 FLNWIK 410.7 560.3 37 298.3363186
CBG WSAGLTSSQVDLYIPK 883.0 357.2 38 297.2474385
PSG2 IHPSYTNYR 384.2 452.2 39 292.6203405
PSG5 LSIPQITTK 500.8 800.5 40 290.2023364
HABP2 FLNWIK 410.7 561.3 41 289.5092933
C06 SEYGAALAWEK 612.8 788.4 42 287.7634114
ADA12 LIEIANHVDK 384.6 683.4 43 286.5047372
EGLN TQILEWAAER 608.8 632.3 44 284.5138846
C06 ENPAVIDFELAPIVDLVR 670.7 601.4 45 273.5146272
FA7 FSLVSGWGQLLDR 493.3 447.3 46 271.7850098
ITIH3 ALDLSLK 380.2 575.3 47 269.9425709
ADA12 FGFGGSTDSGPIR 649.3 946.5 48 264.5698225
FETUA AALAAFNAQNNGSNFQLEEISR 789.1 746.4 49 247.4728828
FBLN1 AITPPHPASQANIIFDITEGNLR 825.8 459.3 50 246.572102
TSP1 FVFGTTPEDILR 697.9 843.5 51 245.0459575
VCAM1 NTVISVNPSTK 580.3 732.4 52 240.576729
ENPP2 TEFLSNYLTNVDDITLVPGTLGR 846.8 699.4 53 240.1949512
FBLN3 ELPQSIVYK 538.8 409.2 55 233.6825304
ACTB VAPEEHPVLLTEAPLNPK 652.0 892.5 56 226.9772749
TSP1 FVFGTTPEDILR 697.9 742.4 57 224.4627393
PLMN EAQLPVIENK 570.8 699.4 58 221.4663735
C1S IIGGSDADIK 494.8 260.2 59 218.069476
IL1A ANDQYLTAAALHNLDEAVK 686.4 317.2 60 216.5531949
PGRP2 WGAAPYR 410.7 634.3 61 211.0918302
PSG5 LSIPQITTK 500.8 687.4 62 208.7871461
PSG6 SNPVTLNVLYGPDLPR 585.7 654.4 63 207.9294937
PRG2 WNFAYWAAHQPWSR 607.3 545.3 64 202.9494031
CXCL2 CQCLQTLQGIHLK 13p8RT 533.6 567.4 65 202.9051326
CXCL2 CQCLQTLQGIHLK 13p48RT 533.6 695.4 66 202.6561548
G6PE LLDFEFSSGR 585.8 553.3 67 201.004611
GELS TASDFITK 441.7 710.4 68 200.2704809
B2MG VEHSDLSFSK 383.5 468.2 69 199.880987
C08B IPGIFELGISSQSDR 809.9 849.4 70 198.7563875
PSG8 LQLSETNR 480.8 606.3 71 197.6739966
LBP GLQYAAQEGLLALQSELLR 1037.1 929.5 72 197.4094851
AFAM LPNNVLQEK 527.8 844.5 73 196.8123228
MAGE 74 196.2410502
PSG2 IHPSYTNYR 384.2 338.2 75 196.2410458
PSG9 LFIPQITR 494.3 727.4 76 193.5329266
TFR1 YNSQLLSFVR 613.8 734.5 77 193.2711994
C1R QRPPDLDTSSNAVDLLFFTDESGDSR 961.5 866.3 78 193.0625419
PGH1 ILPSVPK 377.2 264.2 79 190.0504508
FA7 SEPRPGVLLR 375.2 454.3 80 188.2718422
FA7 TLAFVR 353.7 274.2 81 187.6895294
PGRP2 DGSPDVTTADIGANTPDATK 973.5 844.4 82 185.6017519
C1S IIGGSDADIK 494.8 762.4 83 184.5985543
PEPD VPLALFALNR 557.3 620.4 84 184.3962957
C1S EDTPNSVWEPAK 686.8 630.3 85 179.2043504
CHL1 TAVTANLDIR 537.3 802.4 86 174.9866792
CHL1 VIAVNEVGR 478.8 744.4 88 172.2053147
SDF1 ILNTPNCALQIVAR 791.9 341.2 89 171.4604557
PAI1 EVPLSALTNILSAQLISHWK 740.8 996.6 90 169.5635635
AMBP AFIQLWAFDAVK 704.9 650.4 91 169.2124477
G6PE LLDFEFSSGR 585.8 944.4 92 168.2398598
THBG SILFLGK 389.2 577.4 93 166.3110206
PRDX2 GLFIIDGK 431.8 545.3 94 164.3125132
ENPP2 WWGGQPLWITATK 772.4 373.2 95 163.4011689
VGFR3 SGVDLADSNQK 567.3 662.3 96 162.8822352
C1S EDTPNSVWEPAK 686.8 315.2 97 161.6140915
AFAM DADPDTFFAK 563.8 302.1 98 159.5917449
CBG SETEIHQGFQHLHQLFAK 717.4 447.2 99 156.1357404
C1S LLEVPEGR 456.8 686.4 100 155.1763293
PTGDS GPGEDFR 389.2 623.3 101 154.9205208
ITIH2 IYLQPGR 423.7 329.2 102 154.6552717
FA7 TLAFVR 353.7 492.3 103 152.5009422
FA7 FSLVSGWGQLLDR 493.3 403.2 104 151.9971204
SAMP VGEYSLYIGR 578.8 871.5 105 151.4738449
APOH EHSSLAFWK 552.8 267.1 106 151.0052645
PGRP2 AGLLRPDYALLGHR 518.0 595.4 107 150.4149907
C1QC FNAVLTNPQGDYDTSTGK 964.5 333.2 108 149.2592827
PGRP2 AGLLRPDYALLGHR 518.0 369.2 109 147.3609354
PGRP2 TFTLLDPK 467.8 686.4 111 145.2145223
C05 TDAPDLPEENQAR 728.3 843.4 112 144.5213118
THRB ELLESYIDGR 597.8 839.4 113 143.924639
GELS DPDQTDGLGLSYLSSHIANVER 796.4 328.1 114 142.8936101
TRFL YYGYTGAFR 549.3 450.3 115 142.8651352
HEMO QGHNSVFLIK 381.6 260.2 116 142.703845
C1S GDSGGAFAVQDPNDK 739.3 716.3 117 142.2799122
B1A4H9 AHQLAIDTYQEFR 531.3 450.3 118 138.196407
C1S SSNNPHSPIVEEFQVPYNK 729.4 261.2 119 136.7868935
HYOU1 LPATEKPVLLSK 432.6 347.2 120 136.1146437
FETA GYQELLEK 490.3 502.3 121 135.2890322
LRP1 SERPPIFEIR 415.2 288.2 122 134.6569527
C06 SEYGAALAWEK 612.8 845.5 124 132.8634704
CERU TTIEKPVWLGFLGPIIK 638.0 844.5 125 132.1047746
IBP1 AQETSGEEISK 589.8 850.4 126 130.934446
SHBG VVLSSGSGPGLDLPLVLGLPLQLK 791.5 768.5 127 128.2052287
CBG SETEIHQGFQHLHQLFAK 717.4 318.1 128 127.9873837
A1AT LSITGTYDLK 555.8 696.4 129 127.658818
PGRP2 DGSPDVTTADIGANTPDATK 973.5 531.3 130 126.5775806
C1QB LEQGENVFLQATDK 796.4 675.4 131 126.1762726
EGLN GPITSAAELNDPQSILLR 632.4 826.5 132 125.7658253
IL12B YENYTSSFFIR 713.8 293.1 133 125.0476631
B2MG VEHSDLSFSK 383.5 234.1 134 124.9154706
PGH1 AEHPTWGDEQLFQTTR 639.3 765.4 135 124.8913193
INHBE ALVLELAK 428.8 331.2 136 124.0109276
HYOU1 LPATEKPVLLSK 432.6 460.3 137 123.1900369
CXCL2 CQCLQTLQGIHLK 13p48RT 533.6 567.4 138 122.8800873
PZP AVGYLITGYQR 620.8 523.3 139 122.4733204
AFAM IAPQLSTEELVSLGEK 857.5 333.2 140 122.4707849
ICAM1 VELAPLPSWQPVGK 760.9 400.3 141 121.5494206
CHL1 VIAVNEVGR 478.8 284.2 142 119.0877137
APOB ITENDIQIALDDAK 779.9 632.3 143 118.0222045
SAMP AYSDLSR 406.2 577.3 144 116.409429
AMBP AFIQLWAFDAVK 704.9 836.4 145 116.1900846
EGLN GPITSAAELNDPQSILLR 632.4 601.4 146 115.8438804
LRP1 NAVVQGLEQPHGLVVHPLR 688.4 890.6 147 114.539707
SHBG VVLSSGSGPGLDLPLVLGLPLQLK 791.5 598.4 148 113.1931134
IBP1 AQETSGEEISK 589.8 979.5 149 112.9902709
PSG6 SNPVTLNVLYGPDLPR 585.7 817.4 150 112.7910917
APOC3 DYWSTVK 449.7 347.2 151 112.544736
C1R WILTAAHTLYPK 471.9 621.4 152 112.2199708
ANGT ADSQAQLLLSTVVGVFTAPGLHLK 822.5 983.6 153 111.9634671
PSG9 DVLLLVHNLPQNLPGYFWYK 810.4 328.2 154 111.5743214
A1AT AVLTIDEK 444.8 605.3 155 111.216651
PSGx ILILPSVTR 506.3 785.5 156 110.8482935
THRB ELLESYIDGR 597.8 710.4 157 110.7496103
SHBG ALALPPLGLAPLLNLWAKPQGR 770.5 256.2 158 110.5091269
PZP QTLSWTVTPK 580.8 545.3 159 110.4675104
SHBG ALALPPLGLAPLLNLWAKPQGR 770.5 457.3 160 110.089808
PSG4 TLFIFGVTK 513.3 811.5 161 109.9039967
PLMN YEFLNGR 449.7 293.1 162 109.6880397
PEPD AVYEAVLR 460.8 587.4 163 109.3697285
PLMN LSSPAVITDK 515.8 830.5 164 108.963353
FINC SYTITGLQPGTDYK 772.4 352.2 165 108.452612
C1R WILTAAHTLYPK 471.9 407.2 166 107.8348417
CHL1 TAVTANLDIR 537.3 288.2 167 107.7278897
TENA AVDIPGLEAATPYR 736.9 286.1 168 107.6166195
CRP YEVQGEVFTKPQLWP 911.0 293.1 169 106.9739589
APOB SVSLPSLDPASAK 636.4 885.5 170 106.5901668
PRDX2 SVDEALR 395.2 488.3 171 106.2325046
C08A YHFEALADTGISSEFYDNANDLLSK 940.8 301.1 172 105.8963287
C1QC FQSVFTVTR 542.8 722.4 173 105.4338742
PSGx ILILPSVTR 506.3 559.3 174 105.1942655
VCAM1 TQIDSPLSGK 523.3 703.4 175 105.0091767
VCAM1 NTVISVNPSTK 580.3 845.5 176 104.8754444
CSH ISLLLIESWLEPVR 834.5 500.3 177 104.6158295
HGFA EALVPLVADHK 397.9 439.8 178 104.3383142
CGB1 CRPINATLAVEK 457.9 660.4 179 104.3378072
APOB IEGNLIFDPNNYLPK 874.0 414.2 180 103.9849346
C1QB LEQGENVFLQATDK 796.4 822.4 181 103.9153207
APOH EHSSLAFWK 552.8 838.4 182 103.9052103
C05 LQGTLPVEAR 542.3 842.5 183 103.1061869
SHBG IALGGLLFPASNLR 481.3 412.3 184 102.2490294
B2MG VNHVTLSQPK 374.9 459.3 185 102.1204362
APOA2 SPELQAEAK 486.8 659.4 186 101.9166647
FLNA TGVAVNKPAEFTVDAK 549.6 258.1 187 101.5207852
PLMN YEFLNGR 449.7 606.3 188 101.2531011
[00247] Table 63. Random Forest Importance Values Using Peak Area Ratios
Figure imgf000168_0001
HGFA LHKPGVYTR 357.5 479.3 34 536.6051948
PSG2 IHPSYTNYR 384.2 338.2 35 536.5363489
GELS AQPVQVAEGSEPDGFWEALGGK 758.0 623.4 36 536.524931
PSG6 SNPVTLNVLYGPDLPR 585.7 654.4 37 520.108646
HABP2 FLNWIK 410.7 560.3 38 509.0707814
PGH1 ILPSVPK 377.2 527.3 39 503.593718
HYOU1 LPATEKPVLLSK 432.6 460.3 40 484.047422
C06 ALNHLPLEYNSALYSR 621.0 696.4 41 477.8773179
INHBE ALVLELAK 428.8 672.4 42 459.1998276
PLMN LSSPAVITDK 515.8 743.4 43 452.9466414
PSG9 DVLLLVHNLPQNLPGYFWYK 810.4 215.1 44 431.8528248
BGH3 LTLLAPLNSVFK 658.4 875.5 45 424.2540315
AFAM LPNNVLQEK 527.8 730.4 46 421.4953221
ITIH2 LSNENHGIAQR 413.5 519.8 47 413.1231437
GELS TASDFITK 441.7 781.4 48 404.2679723
FETUA AHYDLR 387.7 566.3 49 400.4711207
CERU TTIEKPVWLGFLGPIIK 638.0 844.5 50 396.2873451
PSGx ILILPSVTR 506.3 785.5 51 374.5672526
APOB SVSLPSLDPASAK 636.4 885.5 52 371.1416438
FLNA TGVAVNKPAEFTVDAK 549.6 258.1 53 370.4175588
PLMN YEFLNGR 449.7 606.3 54 367.2768078
PSGx ILILPSVTR 506.3 559.3 55 365.7704321
[00248] From the foregoing description, it will be apparent that variations and modifications can be made to the invention described herein to adapt it to various usages and conditions. Such embodiments are also within the scope of the following claims.
[00249] The recitation of a listing of elements in any definition of a variable herein includes definitions of that variable as any single element or combination (or subcombination) of listed elements. The recitation of an embodiment herein includes that embodiment as any single embodiment or in combination with any other embodiments or portions thereof.
[00250] All patents and publications mentioned in this specification are herein incorporated by reference to the same extent as if each independent patent and publication was specifically and individually indicated to be incorporated by reference.
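The claims that follow recite a total risk score obtained by multiplying each quantified biomarker amount by a predetermined coefficient and adding the products, and a time to birth obtained by subtracting the gestational age at sampling from the predicted GAB. The arithmetic can be sketched as below. This is an illustration under stated assumptions only: the coefficient and amount values are hypothetical placeholders, the function names are invented for this sketch, and the logistic mapping from score to probability is one possible choice (logistic regression is among the models the claims list), not a mapping stated by the specification.

```python
import math


def risk_score(amounts: dict, coefficients: dict) -> float:
    """Sum of coefficient * amount over the biomarkers in the panel."""
    missing = set(coefficients) - set(amounts)
    if missing:
        raise KeyError(f"no measured amount for: {sorted(missing)}")
    return sum(coefficients[name] * amounts[name] for name in coefficients)


def logistic(score: float) -> float:
    """Assumed mapping from a score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


def time_to_birth(predicted_gab_weeks: float, ga_at_sampling_weeks: float) -> float:
    """Predicted GAB minus gestational age when the sample was obtained."""
    return predicted_gab_weeks - ga_at_sampling_weeks


# Hypothetical values for illustration; none of these come from the specification.
amounts = {"LBP": 1.2, "THRB": 0.8, "PLMN": 1.0}
coefficients = {"LBP": 0.5, "THRB": -0.25, "PLMN": 0.1}

score = risk_score(amounts, coefficients)
probability = logistic(score)
weeks_remaining = time_to_birth(38.5, 24.0)
```

With these placeholder inputs the score is approximately 0.5 and the predicted time to birth is 14.5 weeks; real coefficients would come from a fitted model.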

Claims

What is claimed is:
1. A panel of isolated biomarkers comprising N of the biomarkers listed in Tables 1 through 63.
2. The panel of claim 1, wherein N is a number selected from the group consisting of 2 to 24.
3. The panel of claim 2, wherein said panel comprises at least two of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, ITLPDFTGDLR, the biomarkers set forth in Table 50, and the biomarkers set forth in Table 52.
4. The panel of claim 2, wherein said panel comprises lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
5. The panel of claim 2, wherein said panel comprises at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), and complement component C8 gamma chain (C8G or C08G).
6. The panel of claim 2, wherein said panel comprises at least two isolated biomarkers selected from the group consisting of lipopolysaccharide-binding protein (LBP), prothrombin (THRB), complement component C5 (C5 or C05), plasminogen (PLMN), complement component C8 gamma chain (C8G or C08G), complement component 1, q subcomponent, B chain (C1QB), fibrinogen beta chain (FIBB or FIB), C-reactive protein (CRP), inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), chorionic somatomammotropin hormone (CSH), angiotensinogen (ANG or ANGT), the biomarkers set forth in Table 51, and the biomarkers set forth in Table 53.
7. A method of determining probability for preterm birth in a pregnant female, the method comprising detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from said pregnant female, and analyzing said measurable feature to determine the probability for preterm birth in said pregnant female.
8. The method of claim 7, wherein said measurable feature comprises fragments or derivatives of each of said N biomarkers selected from the biomarkers listed in Tables 1 through 63.
9. The method of claim 7, wherein said detecting a measurable feature comprises quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63, combinations or portions and/or derivatives thereof in a biological sample obtained from said pregnant female.
10. The method of claim 9, further comprising calculating the probability for preterm birth in said pregnant female based on said quantified amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63.
11. The method of claim 7, further comprising an initial step of providing a biomarker panel comprising N of the biomarkers listed in Tables 1 through 63.
12. The method of claim 7, further comprising an initial step of providing a biological sample from the pregnant female.
13. The method of claim 7, further comprising communicating said probability to a health care provider.
14. The method of claim 13, wherein said communication informs a subsequent treatment decision for said pregnant female.
15. The method of claim 7, wherein N is a number selected from the group consisting of 2 to 24.
16. The method of claim 15, wherein said N biomarkers comprise at least two of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, ITLPDFTGDLR, the biomarkers set forth in Table 50, and the biomarkers set forth in Table 52.
17. The method of claim 7, wherein said analysis comprises a use of a predictive model.
18. The method of claim 17, wherein said analysis comprises comparing said measurable feature with a reference feature.
19. The method of claim 18, wherein said analysis comprises using one or more selected from the group consisting of a linear discriminant analysis model, a support vector machine classification algorithm, a recursive feature elimination model, a prediction analysis of microarray model, a logistic regression model, a multiple regression model, a survival model, a CART algorithm, a flex tree algorithm, a LART algorithm, a random forest algorithm, a MART algorithm, a machine learning algorithm, a penalized regression method, and a combination thereof.
20. The method of claim 19, wherein said analysis comprises logistic regression.
21. The method of claim 7, wherein said probability is expressed as a risk score.
22. The method of claim 7, wherein the biological sample is selected from the group consisting of whole blood, plasma, and serum.
23. The method of claim 22, wherein the biological sample is serum.
24. The method of claim 7, wherein said quantifying comprises mass spectrometry (MS).
25. The method of claim 24, wherein said MS comprises liquid chromatography-mass spectrometry (LC-MS).
26. The method of claim 24, wherein said MS comprises multiple reaction monitoring (MRM) or selected reaction monitoring (SRM).
27. The method of claim 26, wherein said MRM (or SRM) comprises scheduled MRM (SRM).
28. The method of claim 7, wherein said quantifying comprises an assay that utilizes a capture agent.
29. The method of claim 28, wherein said capture agent is selected from the group consisting of an antibody, antibody fragment, nucleic acid-based protein binding reagent, small molecule or variant thereof.
30. The method of claim 28, wherein said assay is selected from the group consisting of enzyme immunoassay (EIA), enzyme-linked immunosorbent assay (ELISA), and radioimmunoassay (RIA).
31. The method of claim 30, wherein said quantifying further comprises mass spectrometry (MS).
32. The method of claim 31, wherein said quantifying comprises co-immunoprecipitation-mass spectrometry (co-IP MS).
33. The method of claim 7, further comprising detecting a measurable feature for one or more risk indicia.
34. The method of claim 7, wherein said analyzing of said measurable feature initially comprises prediction of gestational age at birth (GAB) prior to said determining the probability for preterm birth.
35. The method of claim 34, wherein said prediction of the GAB is used to determine the probability for preterm birth.
36. The method of claim 33, wherein the one or more risk indicia are selected from the group consisting of age, prior pregnancy, history of previous low birth weight or preterm delivery, multiple 2nd trimester spontaneous abortion, prior first trimester induced abortion, familial and intergenerational factors, history of infertility, nulliparity, placental abnormalities, cervical and uterine anomalies, gestational bleeding, intrauterine growth restriction, in utero diethylstilbestrol exposure, multiple gestations, infant sex, short stature, low prepregnancy weight/low body mass index, diabetes, hypertension, hypothyroidism, asthma, education level, tobacco use, and urogenital infections.
37. A method of determining probability for preterm birth in a pregnant female, the method comprising: (a) quantifying in a biological sample obtained from said pregnant female an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63; (b) multiplying said amount by a predetermined coefficient, (c) determining the probability for preterm birth in said pregnant female comprising adding said individual products to obtain a total risk score that corresponds to said probability.
38. A method of predicting GAB, the method comprising detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from a pregnant female, and analyzing said measurable feature to predict GAB.
39. The method of claim 38, wherein said measurable feature comprises fragments or derivatives of each of said N biomarkers selected from the biomarkers listed in Tables 1 through 63.
40. The method of claim 38, wherein said detecting a measurable feature comprises quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63, combinations or portions and/or derivatives thereof in a biological sample obtained from said pregnant female.
41. The method of claim 40, further comprising calculating the probability for preterm birth in said pregnant female based on said quantified amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63.
42. The method of claim 38, further comprising an initial step of providing a biomarker panel comprising N of the biomarkers listed in Tables 1 through 63.
43. The method of claim 38, further comprising an initial step of providing a biological sample from the pregnant female.
44. The method of claim 38, further comprising communicating said probability to a health care provider.
45. The method of claim 44, wherein said communication informs a subsequent treatment decision for said pregnant female.
46. The method of claim 38, wherein N is a number selected from the group consisting of 2 to 24.
47. The method of claim 46, wherein said N biomarkers comprise at least two of the isolated biomarkers selected from the group consisting of AFTECCVVASQLR, ELLESYIDGR, ITLPDFTGDLR, the biomarkers set forth in Table 50, and the biomarkers set forth in Table 52.
48. The method of claim 38, wherein said analysis comprises a use of a predictive model.
49. The method of claim 48, wherein said analysis comprises comparing said measurable feature with a reference feature.
50. The method of claim 49, wherein said analysis comprises using one or more selected from the group consisting of a linear discriminant analysis model, a support vector machine classification algorithm, a recursive feature elimination model, a prediction analysis of microarray model, a logistic regression model, a multiple regression model, a survival model, a CART algorithm, a flex tree algorithm, a LART algorithm, a random forest algorithm, a MART algorithm, a machine learning algorithm, a penalized regression method, and a combination thereof.
51. The method of claim 50, wherein said analysis comprises a random forest algorithm.
52. The method of claim 38, wherein said probability is expressed as a risk score.
53. The method of claim 38, wherein the biological sample is selected from the group consisting of whole blood, plasma, and serum.
54. The method of claim 53, wherein the biological sample is serum.
55. The method of claim 38, wherein said quantifying comprises mass spectrometry (MS).
56. The method of claim 55, wherein said MS comprises liquid chromatography-mass spectrometry (LC-MS).
57. The method of claim 55, wherein said MS comprises multiple reaction monitoring (MRM) or selected reaction monitoring (SRM).
58. The method of claim 57, wherein said MRM (or SRM) comprises scheduled MRM (SRM).
59. The method of claim 38, wherein said quantifying comprises an assay that utilizes a capture agent.
60. The method of claim 59, wherein said capture agent is selected from the group consisting of an antibody, antibody fragment, nucleic acid-based protein binding reagent, small molecule or variant thereof.
61. The method of claim 59, wherein said assay is selected from the group consisting of enzyme immunoassay (EIA), enzyme-linked immunosorbent assay (ELISA), and radioimmunoassay (RIA).
62. The method of claim 61, wherein said quantifying further comprises mass spectrometry (MS).
63. The method of claim 62, wherein said quantifying comprises co-immunoprecipitation-mass spectrometry (co-IP MS).
64. The method of claim 38, further comprising detecting a measurable feature for one or more risk indicia.
65. The method of claim 64, wherein the one or more risk indicia are selected from the group consisting of age, prior pregnancy, history of previous low birth weight or preterm delivery, multiple 2nd trimester spontaneous abortion, prior first trimester induced abortion, familial and intergenerational factors, history of infertility, nulliparity, placental abnormalities, cervical and uterine anomalies, gestational bleeding, intrauterine growth restriction, in utero diethylstilbestrol exposure, multiple gestations, infant sex, short stature, low prepregnancy weight/low body mass index, diabetes, hypertension, hypothyroidism, asthma, education level, tobacco use, and urogenital infections.
66. A method of predicting GAB, the method comprising: (a) quantifying in a biological sample obtained from said pregnant female an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63; (b) multiplying and/or thresholding said amount by a predetermined coefficient, (c) determining the predicted GAB birth in said pregnant female comprising adding said individual products to obtain a total risk score that corresponds to said predicted GAB.
67. A method of predicting time to birth in a pregnant female, the method comprising: (a) obtaining a biological sample from said pregnant female; (b) quantifying an amount of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in said biological sample; (c) multiplying and/or thresholding said amount by a predetermined coefficient, (d) determining predicted GAB in said pregnant female comprising adding said individual products to obtain a total risk score that corresponds to said predicted GAB; and (e) subtracting the estimated GA at the time the biological sample was obtained from the predicted GAB to predict time to birth in said pregnant female.
68. A method of determining probability for term birth in a pregnant female, the method comprising detecting a measurable feature of each of N biomarkers selected from the biomarkers listed in Tables 1 through 63 in a biological sample obtained from said pregnant female, and analyzing said measurable feature to determine the probability for term birth in said pregnant female.
69. The panel of claim 2, wherein said panel comprises at least two of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
70. The panel of claim 2, wherein said panel comprises Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), Inhibin beta E chain (INHBE).
71. The panel of claim 2, wherein said panel comprises at least two isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), Inhibin beta E chain (INHBE).
72. The method of claim 38, wherein said biomarkers comprise at least two of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
73. The method of claim 66, wherein said biomarkers comprise at least two of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
74. The method of claim 67, wherein said biomarkers comprise at least two of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
75. The method of claim 68, wherein said biomarkers comprise at least two of the isolated biomarkers selected from the group consisting of FLNWIK, FGFGGSTDSGPIR, LLELTGPK, VEHSDLSFSK, IEGNLIFDPNNYLPK, ALVLELAK, TQILEWAAER, DVLLLVHNLPQNLPGYFWYK, SEPRPGVLLR, ITQDAQLK, ALDLSLK, WWGGQPLWITATK, and LSETNR.
76. The method of claim 38, wherein said biomarkers comprise at least two of the isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), Inhibin beta E chain (INHBE).
77. The method of claim 66, wherein said biomarkers comprise at least two of the isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), Inhibin beta E chain (INHBE).
78. The method of claim 67, wherein said biomarkers comprise at least two of the isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), Inhibin beta E chain (INHBE).
79. The method of claim 68, wherein said biomarkers comprise at least two of the isolated biomarkers selected from the group consisting of Alpha-1B-glycoprotein (A1BG), Disintegrin and metalloproteinase domain-containing protein 12 (ADA12), Apolipoprotein B-100 (APOB), Beta-2-microglobulin (B2MG), CCAAT/enhancer-binding protein alpha/beta (HP8 Peptide), Corticosteroid-binding globulin (CBG), Complement component C6, Endoglin (EGLN), Ectonucleotide pyrophosphatase/phosphodiesterase family member 2 (ENPP2), Coagulation factor VII (FA7), Hyaluronan-binding protein 2 (HABP2), Pregnancy-specific beta-1-glycoprotein 9 (PSG9), Inhibin beta E chain (INHBE).
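
The arithmetic recited in claims 66 and 67 (multiply each biomarker amount by a predetermined coefficient, add the products to obtain a score corresponding to predicted gestational age at birth (GAB), then subtract the gestational age (GA) at sampling) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration only: the function names, the coefficient values, the reading of "thresholding" as capping an amount at an upper limit, and the simplification that the summed score is itself the predicted GAB in weeks. None of these is taken from the specification.

```python
from typing import Mapping, Optional


def predicted_gab(amounts: Mapping[str, float],
                  coefficients: Mapping[str, float],
                  thresholds: Optional[Mapping[str, float]] = None) -> float:
    """Add coefficient * amount over every biomarker that has a coefficient.

    Assumption: the summed score is used directly as predicted GAB (weeks).
    """
    total = 0.0
    for name, coefficient in coefficients.items():
        amount = amounts[name]
        # Assumed reading of "thresholding": cap the amount at an upper limit.
        if thresholds is not None and name in thresholds:
            amount = min(amount, thresholds[name])
        total += coefficient * amount
    return total


def time_to_birth(predicted_gab_weeks: float, ga_at_sampling_weeks: float) -> float:
    """Predicted GAB minus the estimated GA when the sample was obtained."""
    return predicted_gab_weeks - ga_at_sampling_weeks


# Hypothetical amounts and coefficients; not values from the patent.
amounts = {"A1BG": 2.0, "PSG9": 4.0}
coefficients = {"A1BG": 10.0, "PSG9": 4.5}

gab = predicted_gab(amounts, coefficients)          # 10.0*2.0 + 4.5*4.0
ttb = time_to_birth(gab, ga_at_sampling_weeks=20.0)
capped = predicted_gab(amounts, coefficients, thresholds={"PSG9": 2.0})
```

With these made-up inputs the score is 38.0 weeks and the predicted time to birth is 18.0 weeks; capping PSG9 at 2.0 lowers the score to 29.0. A fitted model would supply its own coefficients and its own mapping from score to GAB.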
PCT/US2014/028412 2013-03-15 2014-03-14 Biomarkers and methods for predicting preterm birth WO2014144129A2 (en)

Priority Applications (11)

Application Number Priority Date Filing Date Title
ES14765203T ES2836127T3 (en) 2013-03-15 2014-03-14 Biomarkers and methods to predict preterm birth
JP2016502779A JP2016518589A (en) 2013-03-15 2014-03-14 Biomarkers and methods for predicting preterm birth
AU2014227891A AU2014227891A1 (en) 2013-03-15 2014-03-14 Biomarkers and methods for predicting preterm birth
RU2015144304A RU2015144304A (en) 2013-03-15 2014-03-14 BIOMARKERS AND METHODS FOR PREDICTING PREMATURE BIRTH
EP14765203.6A EP2972308B9 (en) 2013-03-15 2014-03-14 Biomarkers and methods for predicting preterm birth
BR112015023706A BR112015023706A2 (en) 2013-03-15 2014-03-14 panel of isolated biomarkers, method of determining the likelihood of preterm birth, method of predicting gab, method of predicting time of delivery and method of determining probability for normal birth.
EP20195737.0A EP3800470A1 (en) 2013-03-15 2014-03-14 Biomarkers and methods for predicting preterm birth
CN201480028164.3A CN106574919B (en) 2013-03-15 2014-03-14 Biomarkers and methods for predicting preterm birth
CA2907120A CA2907120C (en) 2013-03-15 2014-03-14 Biomarkers and methods for predicting preterm birth
AU2020201701A AU2020201701B2 (en) 2013-03-15 2020-03-06 Biomarkers and methods for predicting preterm birth
AU2022221445A AU2022221445A1 (en) 2013-03-15 2022-08-24 Biomarkers and methods for predicting preterm birth

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201361798504P 2013-03-15 2013-03-15
US61/798,504 2013-03-15
US201361919586P 2013-12-20 2013-12-20
US61/919,586 2013-12-20

Publications (2)

Publication Number Publication Date
WO2014144129A2 (en) 2014-09-18
WO2014144129A3 (en) 2015-02-05

Family

ID=51538302

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2014/028412 WO2014144129A2 (en) 2013-03-15 2014-03-14 Biomarkers and methods for predicting preterm birth

Country Status (10)

Country Link
US (4) US20140287950A1 (en)
EP (2) EP2972308B9 (en)
JP (5) JP2016518589A (en)
CN (2) CN112213500A (en)
AU (3) AU2014227891A1 (en)
BR (1) BR112015023706A2 (en)
CA (2) CA3209974A1 (en)
ES (1) ES2836127T3 (en)
RU (1) RU2015144304A (en)
WO (1) WO2014144129A2 (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017192668A1 (en) * 2016-05-05 2017-11-09 Indiana University Research & Technology Corporation Quantitative profiling of progesterone metabolites for the prediction of spontaneous preterm delivery
AU2015203904B2 (en) * 2014-01-06 2018-07-19 Expression Pathology, Inc. SRM assay for PD-L1
CN108450003A (en) * 2015-06-19 2018-08-24 赛拉预测公司 Biomarker pair for predicting premature labor
RU2670672C1 (en) * 2017-11-08 2018-10-24 Федеральное государственное бюджетное образовательное учреждение высшего образования "Российский национальный исследовательский медицинский университет им. Н.И. Пирогова" Министерства здравоохранения Российской Федерации (ФГБОУ ВО РНИМУ им. Н.И. Пирогова Минздрава России) Method for prediction of preterm delivery at gestation period of 24-34 weeks
JP2019512082A (en) * 2016-02-05 2019-05-09 ザ リージェンツ オブ ザ ユニバーシティ オブ カリフォルニア (The Regents Of The University Of California) Tools for predicting the risk of preterm birth
WO2019086410A1 (en) * 2017-10-30 2019-05-09 Carmentix Pte. Ltd. Biomarkers of preterm birth
CN109983137A (en) * 2016-08-05 2019-07-05 赛拉预测公司 For predicted exposure in the biomarker of the premature labor of the pregnant female of progestational hormone
EP3494233A4 (en) * 2016-08-05 2020-03-18 Sera Prognostics, Inc. Biomarkers for predicting preterm birth due to preterm premature rupture of membranes versus idiopathic spontaneous labor
US10877046B2 (en) 2015-12-04 2020-12-29 Nx Prenatal Inc. Treatment of spontaneous preterm birth
US10928402B2 (en) 2012-12-28 2021-02-23 Nx Prenatal Inc. Treatment of spontaneous preterm birth
CN113396332A (en) * 2018-09-21 2021-09-14 斯坦福大学托管董事会 Method for evaluating pregnancy progression and preterm birth miscarriage for clinical intervention and uses thereof
US11293931B2 (en) 2016-05-05 2022-04-05 Indiana University Research And Technology Corporation Quantitative profiling of progesterone metabolites for the prediction of spontaneous preterm delivery
EP4019530A4 (en) * 2019-08-20 2023-04-05 Caregen Co., Ltd. Peptide having activity of improving skin condition and use thereof
US11662351B2 (en) 2017-08-18 2023-05-30 Sera Prognostics, Inc. Pregnancy clock proteins for predicting due date and time to birth
US11664100B2 (en) * 2021-08-17 2023-05-30 Birth Model, Inc. Predicting time to vaginal delivery

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11753682B2 (en) 2016-03-07 2023-09-12 Father Flanagan's Boys'Home Noninvasive molecular controls
JP2018140172A (en) * 2017-02-28 2018-09-13 株式会社Nttドコモ Data collection device and data collection method
EP3746048A4 (en) * 2018-01-31 2021-12-15 Nx Prenatal Inc. Use of circulating microparticles to stratify risk of spontaneous preterm birth
CN108918886A (en) * 2018-06-26 2018-11-30 连云港市妇幼保健院(连云港市第三人民医院) A kind of premature labor diagnosis biomarker and its application
WO2020132650A1 (en) * 2018-12-20 2020-06-25 Parkland Center For Clinical Innovation Lightweight clinical pregnancy preterm birth predictive system and method
CN113533547A (en) * 2020-04-20 2021-10-22 复旦大学 Method for measuring expression quantity of complement protein C1R
US11266376B2 (en) * 2020-06-19 2022-03-08 Ultrasound Ai Inc. Premature birth prediction
RU2743880C1 (en) * 2020-08-06 2021-03-01 Федеральное государственное бюджетное научное учреждение "Научно-исследовательский институт акушерства, гинекологии и репродуктологии имени Д.О. Отта" Method for prediction of preterm delivery in multiple pregnancies
KR102339013B1 (en) * 2020-11-16 2021-12-14 주식회사 옵티바이오 In-vitro diagnostic kit capable of quantitating mmp-8 in amniotic fluid specimen
CN112256422B (en) * 2020-11-17 2023-08-04 中国人民解放军战略支援部队信息工程大学 Heterogeneous platform task scheduling method and system based on Q learning
WO2022132666A1 (en) 2020-12-14 2022-06-23 Regeneron Pharmaceuticals, Inc. Methods of treating metabolic disorders and cardiovascular disease with inhibin subunit beta e (inhbe) inhibitors
CN112899360A (en) * 2021-02-02 2021-06-04 北京航空航天大学 Application method of composition for detecting occurrence probability of Terchester-Coriolis syndrome
CN113267587B (en) * 2021-05-27 2023-11-10 杭州广科安德生物科技有限公司 Characteristic peptide fragment and method for measuring content of pro-SFTPB standard substance
CN114702548B (en) * 2022-03-11 2023-08-04 湖南源品细胞生物科技有限公司 Polypeptide composition for promoting growth of mesenchymal stem cells, and culture medium and application thereof
KR20230145829A (en) * 2022-04-11 2023-10-18 의료법인 성광의료재단 Mass Spectrum Analysis Method for Prediction of High-Risk Pregnancies using machine learning
CN115344846B (en) * 2022-07-29 2024-07-12 贵州电网有限责任公司 Fingerprint retrieval model and verification method
CN116904587B (en) * 2023-09-13 2023-12-05 天津云检医学检验所有限公司 Biomarker group, prediction model and kit for predicting premature delivery

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4710475A (en) * 1986-05-12 1987-12-01 Mayo Medical Resources Method for the determination of the imminence of labor
US20030139335A1 (en) * 2001-07-27 2003-07-24 Hanauske-Abel Hartmut M. Enhancing organ maturity in neonates and predicting their duration of intensive care
AU2004205774B2 (en) * 2003-01-17 2006-12-14 The Chinese University Of Hong Kong Circulating mRNA as diagnostic markers for pregnancy-related disorders
EP1646648A2 (en) * 2003-07-15 2006-04-19 Genova Ltd. Secreted polypeptide species reduced in cardiovascular disorders
JP4568282B2 (en) * 2003-09-23 2010-10-27 ザ ジェネラル ホスピタル コーポレイション Screening for pregnancy diseases
EP2343384A3 (en) * 2004-03-23 2012-01-04 Oncotherapy Science, Inc. Method for diagnosing non-small cell lung cancer
EP1901074B1 (en) * 2004-05-19 2011-07-27 Københavns Universitet Adam12 as marker for fetal Turner Syndrome
CN101437959A (en) * 2004-09-20 2009-05-20 普罗特奥格尼克斯公司 Diagnosis of fetal aneuploidy
US7790463B2 (en) * 2006-02-02 2010-09-07 Yale University Methods of determining whether a pregnant woman is at risk of developing preeclampsia
EP2097094A4 (en) * 2006-11-01 2011-01-05 George Mason Intellectual Prop Biomarkers for neurological conditions
GB0705321D0 (en) * 2007-03-20 2007-04-25 Novocellus Ltd Method
WO2008146100A1 (en) * 2007-06-01 2008-12-04 INSERM (Institut National de la Santé et de la Recherche Médicale) Method for absolute quantification of polypeptides
CA2694112A1 (en) * 2007-07-20 2009-01-29 The University Of Utah Research Foundation Identification and quantification of biomarkers for evaluating the risk of preterm birth
ES2374762T3 (en) * 2007-10-10 2012-02-21 Bg Medicine, Inc. PROCEDURES TO DETECT SERIOUS ADVERSE CARDIOVASCULAR AND CEREBROVASCULAR COMPLICATIONS.
US20100017143A1 (en) * 2008-01-30 2010-01-21 Proteogenix, Inc. Gestational age dependent proteomic changes of human maternal serum for monitoring maternal and fetal health
EP2283153A4 (en) * 2008-04-09 2011-09-14 Univ British Columbia Methods of diagnosing acute cardiac allograft rejection
WO2010017201A1 (en) * 2008-08-04 2010-02-11 The Board Of Regents Of The University Of Texas System Multiplexed diagnostic test for preterm labor
CN102460176B (en) * 2009-05-21 2014-09-24 李荣秀 Process for differential polypeptides detection and uses thereof
MX2012005015A (en) * 2009-10-30 2012-06-12 Prometheus Lab Inc Methods for diagnosing irritable bowel syndrome.
CN101930007B (en) * 2010-07-30 2015-01-21 北京热景生物技术有限公司 Up-converting fast quantification kit for preterm detection of pregnant woman
US10557856B2 (en) * 2010-09-24 2020-02-11 University Of Pittsburgh-Of The Commonwealth System Of Higher Education Biomarkers of renal injury
CA2819886A1 (en) * 2010-12-06 2012-06-14 Pronota N.V. Biomarkers and parameters for hypertensive disorders of pregnancy
US9568486B2 (en) * 2011-02-15 2017-02-14 The Wistar Institute Of Anatomy And Biology Methods and compositions for diagnosis of ectopic pregnancy
US9091651B2 (en) * 2011-12-21 2015-07-28 Integrated Diagnostics, Inc. Selected reaction monitoring assays

Non-Patent Citations (42)

* Cited by examiner, † Cited by third party
Title
"An Introduction to Radioimmunoassay and Related Techniques", 1995, ELSEVIER SCIENCE
"Born too soon: the global action report on preterm birth", 2012, THE PARTNERSHIP FOR MATERNAL, NEWBORN & CHILD HEALTH
"Causes, Consequences, and Prevention", 2007, NATIONAL ACADEMIES PRESS
"Gene Expression Profiling: Methods and Protocols", 2004, HUMANA PRESS
"Methods in Enzymology", vol. 402, 2005, ACADEMIC PRESS, article "Biological Mass Spectrometry"
"Methods in Molecular Biology", vol. 146, 2000, HUMANA PRESS, article "Mass Spectrometry of Proteins and Peptides"
ANDERSON; HUNTER, MOLECULAR AND CELLULAR PROTEOMICS, vol. 5, no. 4, 2006, pages 573
BIDLINGMEYER: "Practical HPLC Methodology and Applications", 1993, JOHN WILEY & SONS INC.
BIEMANN, METHODS ENZYMOL, vol. 193, 1990, pages 455 - 79
BLENCOWE ET AL.: "National, regional and worldwide estimates of preterm birth", THE LANCET, vol. 379, no. 9832, 2012, pages 2162 - 72
BOX; COX, ROYAL STAT. SOC., vol. 26, 1964, pages 211 - 246
BREIMAN, MACHINE LEARNING, vol. 45, 2001, pages 5 - 32
BRODY ET AL., J MOL BIOL, vol. 422, no. 5, 2012, pages 595 - 606
COLIGAN, CURRENT PROTOCOLS IN IMMUNOLOGY, 1991
CRAIG; BEAVIS, BIOINFORMATICS, vol. 20, 2004, pages 1466 - 1467
EFRON ET AL., ANNALS OF STATISTICS, vol. 32, 2004, pages 407 - 451
ENG ET AL., J. AM. SOC. MASS SPECTROM, vol. 5, 1994, pages 976 - 989
GODING: "Monoclonal Antibodies: Principles and Practice", 1986
GOSLING: "Immunoassays: A Practical Approach", 2000, OXFORD UNIVERSITY PRESS
HARLOW; LANE, ANTIBODIES: A LABORATORY MANUAL, 1988
HASTIE: "The Elements of Statistical Learning", 2001, SPRINGER
HUANG ET AL., PROC. NATL. ACAD. SCI. USA., vol. 101, no. 29, 2004, pages 10529 - 34
HUANG, PROC. NAT. ACAD. SCI. U.S.A, vol. 101, 2004, pages 10529 - 10534
JOHN R. CROWTHER: "The ELISA Guidebook", 2000, HUMANA PRESS
KELLER ET AL., ANAL. CHEM, vol. 74, 2002, pages 5383 - 5392
KUHN ET AL., PROTEOMICS, vol. 4, 2004, pages 1175 - 86
LING ET AL., EXPERT REV MOL DIAGN, vol. 7, 2007, pages 87 - 98
LIU ET AL., CURR MED CHEM, vol. 18, no. 27, 2011, pages 4117 - 25
MCLEAN ET AL., ANAL. CHEM, vol. 82, no. 24, 2010, pages 10116 - 10124
MCLEAN ET AL., BIOINFORMATICS, vol. 26, no. 7, 2010, pages 966 - 968
NESVIZHSKII ET AL., ANAL. CHEM, vol. 74, 2002, pages 5383 - 5392
NIELSEN; GEIERSTANGER, J IMMUNOL METHODS, vol. 290, 2004, pages 107 - 20
POLPITIYA ET AL., BIOINFORMATICS, vol. 24, 2008, pages 1556 - 1558
PRICE; NEWMAN: "Principles and Practice of Immunoassay", 1997, GROVE'S DICTIONARIES
RUCZINSKI, JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, vol. 12, 2003, pages 475 - 512
SELF ET AL., CURR. OPIN. BIOTECHNOL., vol. 7, 1996, pages 60 - 65
SHI ET AL., METHODS, vol. 56, no. 2, 2012, pages 246 - 53
SINGER: "Recursive Partitioning in the Health Sciences", 1999, SPRINGER
TIBSHIRANI, PROC. NATL. ACAD. SCI. U.S.A, vol. 99, 2002, pages 6567 - 72
TURNBULL: "Classification Trees with Subset Analysis Selection", 2005, THE LASSO, STANFORD UNIVERSITY
TUSHER ET AL., PROC. NATL. ACAD. SCI. U.S.A, vol. 98, 2001, pages 5116 - 21
VILLANUEVA ET AL., NATURE PROTOCOLS, vol. 1, no. 2, 2006, pages 880 - 891

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11835530B2 (en) 2012-12-28 2023-12-05 Nx Prenatal Inc. Detection of microparticle-associated proteins associated with spontaneous preterm birth
US10928402B2 (en) 2012-12-28 2021-02-23 Nx Prenatal Inc. Treatment of spontaneous preterm birth
AU2015203904B2 (en) * 2014-01-06 2018-07-19 Expression Pathology, Inc. SRM assay for PD-L1
JP7385951B2 (en) 2015-06-19 2023-11-24 セラ プログノスティックス, インコーポレイテッド Pairs of biomarkers to predict preterm birth
US11987846B2 (en) 2015-06-19 2024-05-21 Sera Prognostics, Inc. Biomarker pairs for predicting preterm birth
JP2018524581A (en) * 2015-06-19 2018-08-30 セラ プログノスティックス, インコーポレイテッド A pair of biomarkers to predict preterm birth
CN108450003B (en) * 2015-06-19 2022-04-01 赛拉预测公司 Biomarker pairs for predicting preterm birth
US10392665B2 (en) 2015-06-19 2019-08-27 Sera Prognostics, Inc. Biomarker pairs for predicting preterm birth
US10961584B2 (en) 2015-06-19 2021-03-30 Sera Prognostics, Inc. Biomarker pairs for predicting preterm birth
CN108450003A (en) * 2015-06-19 2018-08-24 赛拉预测公司 Biomarker pair for predicting premature labor
RU2724013C2 (en) * 2015-06-19 2020-06-18 Сера Прогностикс, Инк. Pairs of biological markers for prediction of preterm delivery
US10877046B2 (en) 2015-12-04 2020-12-29 Nx Prenatal Inc. Treatment of spontaneous preterm birth
JP2022064897A (en) * 2015-12-04 2022-04-26 エヌエックス・プリネイタル・インコーポレイテッド Use of circulating microparticles to stratify risk of spontaneous preterm birth
JP7050688B2 (en) 2016-02-05 2022-04-08 ザ リージェンツ オブ ザ ユニバーシティ オブ カリフォルニア Tools for predicting the risk of preterm birth
JP2019512082A (en) * 2016-02-05 2019-05-09 ザ リージェンツ オブ ザ ユニバーシティ オブ カリフォルニア (The Regents Of The University Of California) Tools for predicting the risk of preterm birth
CN109415402A (en) * 2016-05-05 2019-03-01 印第安纳大学研究与技术公司 For predicting the quantitative analysis of the progesterone metabolite of spontaneous pre-term
CN109415402B (en) * 2016-05-05 2021-10-19 印第安纳大学研究与技术公司 Quantitative analysis of progesterone metabolites for prediction of spontaneous preterm birth
US11280795B2 (en) 2016-05-05 2022-03-22 Indiana University Research And Technology Corporation Quantitative profiling of progesterone metabolites for the prediction of spontaneous preterm delivery
US11293931B2 (en) 2016-05-05 2022-04-05 Indiana University Research And Technology Corporation Quantitative profiling of progesterone metabolites for the prediction of spontaneous preterm delivery
WO2017192668A1 (en) * 2016-05-05 2017-11-09 Indiana University Research & Technology Corporation Quantitative profiling of progesterone metabolites for the prediction of spontaneous preterm delivery
EP3494233A4 (en) * 2016-08-05 2020-03-18 Sera Prognostics, Inc. Biomarkers for predicting preterm birth due to preterm premature rupture of membranes versus idiopathic spontaneous labor
CN109983137A (en) * 2016-08-05 2019-07-05 赛拉预测公司 For predicted exposure in the biomarker of the premature labor of the pregnant female of progestational hormone
EP3494232A4 (en) * 2016-08-05 2020-04-01 Sera Prognostics, Inc. Biomarkers for predicting preterm birth in a pregnant female exposed to progestogens
US11662351B2 (en) 2017-08-18 2023-05-30 Sera Prognostics, Inc. Pregnancy clock proteins for predicting due date and time to birth
WO2019086410A1 (en) * 2017-10-30 2019-05-09 Carmentix Pte. Ltd. Biomarkers of preterm birth
RU2670672C1 (en) * 2017-11-08 2018-10-24 Федеральное государственное бюджетное образовательное учреждение высшего образования "Российский национальный исследовательский медицинский университет им. Н.И. Пирогова" Министерства здравоохранения Российской Федерации (ФГБОУ ВО РНИМУ им. Н.И. Пирогова Минздрава России) Method for prediction of preterm delivery at gestation period of 24-34 weeks
EP3853613A4 (en) * 2018-09-21 2022-06-01 The Board of Trustees of the Leland Stanford Junior University Methods for evaluation of gestational progress and preterm abortion for clinical intervention and applications thereof
CN113396332A (en) * 2018-09-21 2021-09-14 斯坦福大学托管董事会 Method for evaluating pregnancy progression and preterm birth miscarriage for clinical intervention and uses thereof
EP4019530A4 (en) * 2019-08-20 2023-04-05 Caregen Co., Ltd. Peptide having activity of improving skin condition and use thereof
US11664100B2 (en) * 2021-08-17 2023-05-30 Birth Model, Inc. Predicting time to vaginal delivery

Also Published As

Publication number Publication date
AU2020201701A1 (en) 2020-03-26
JP2022069504A (en) 2022-05-11
WO2014144129A3 (en) 2015-02-05
JP7412790B2 (en) 2024-01-15
US20220178938A1 (en) 2022-06-09
EP3800470A1 (en) 2021-04-07
CA2907120C (en) 2023-10-17
US20170146548A1 (en) 2017-05-25
AU2020201701B2 (en) 2022-05-26
CA2907120A1 (en) 2014-09-18
EP2972308A4 (en) 2016-10-26
EP2972308B9 (en) 2021-01-20
CN106574919A (en) 2017-04-19
JP2020193989A (en) 2020-12-03
CA3209974A1 (en) 2014-09-18
CN106574919B (en) 2020-11-03
AU2014227891A1 (en) 2015-10-08
RU2015144304A (en) 2017-04-21
CN112213500A (en) 2021-01-12
US20140287950A1 (en) 2014-09-25
JP2016518589A (en) 2016-06-23
RU2015144304A3 (en) 2018-03-26
JP2023164722A (en) 2023-11-10
AU2022221445A1 (en) 2022-09-22
JP2018193402A (en) 2018-12-06
EP2972308A2 (en) 2016-01-20
BR112015023706A2 (en) 2017-07-18
US20190376978A1 (en) 2019-12-12
ES2836127T3 (en) 2021-06-24
EP2972308B1 (en) 2020-09-16

Similar Documents

Publication Publication Date Title
AU2020201701B2 (en) Biomarkers and methods for predicting preterm birth
AU2020201695B2 (en) Biomarkers and methods for predicting preeclampsia
US20190317107A1 (en) Biomarkers and methods for predicting preterm birth
EP3311158B1 (en) Biomarker pairs for predicting preterm birth
EP3494233A1 (en) Biomarkers for predicting preterm birth due to preterm premature rupture of membranes versus idiopathic spontaneous labor
AU2018317902A1 (en) Pregnancy clock proteins for predicting due date and time to birth
WO2023158504A1 (en) Biomarker panels and methods for predicting preeclampsia

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 14765203; Country of ref document: EP; Kind code of ref document: A2)
ENP Entry into the national phase (Ref document number: 2907120; Country of ref document: CA; Ref document number: 2016502779; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2014227891; Country of ref document: AU; Date of ref document: 20140314; Kind code of ref document: A)
WWE WIPO information: entry into national phase (Ref document number: 2014765203; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2015144304; Country of ref document: RU; Kind code of ref document: A)
REG Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112015023706; Country of ref document: BR)
ENP Entry into the national phase (Ref document number: 112015023706; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20150915)