WO2022060842A1 - Systems and methods for predicting graft dysfunction with exosome proteins - Google Patents


Publication number
WO2022060842A1
Authority
WO
WIPO (PCT)
Prior art keywords
pgd
marker
subject
risk
therapy
Prior art date
Application number
PCT/US2021/050465
Other languages
French (fr)
Inventor
Barry FINE
Nicholas Tatonetti
Nicholas GIANGRECO
Original Assignee
The Trustees Of Columbia University In The City Of New York
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Trustees Of Columbia University In The City Of New York
Publication of WO2022060842A1
Priority to US18/180,991 (US20230273210A1)

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P11/00 Drugs for disorders of the respiratory system

Definitions

  • Primary graft dysfunction (PGD) after heart transplant can be defined as idiopathic heart failure occurring within the immediate postoperative period.
  • PGD can affect either or both ventricles simultaneously and be graded from mild to severe depending on the amount of support required to compensate for organ dysfunction.
  • PGD can cause the death of patients within 30 days after transplant.
  • the disclosed subject matter provides techniques for identifying the risk of primary graft dysfunction (PGD) of a subject.
  • PGD primary graft dysfunction
  • An exemplary method can include collecting a sample of the subject, measuring a level of a PGD marker from the sample, providing a PGD risk value that is quantified based on the level of the PGD marker using an adaptive Monte Carlo cross-validation (MCCV) model, and identifying the risk of PGD based on the PGD risk value.
  • the PGD marker can include plasma kallikrein (KLKB1).
  • the method can further include assessing an effect of a therapy on the heart transplant by estimating the PGD risk value of the subject.
  • the subject can receive the therapy before or after the assessment.
  • the method can further include identifying a clinical variable of the subject.
  • the clinical variable can include a medical history of the subject.
  • the medical history of the subject can include a pre-transplant inotrope therapy.
  • the method can further include measuring a level of an additional marker from the sample.
  • the additional marker can include proteins peroxiredoxin 2 (PRDX2), tropomyosin alpha-4 (TPM4), myeloperoxidase (MPO), PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, or combinations thereof.
  • the PGD risk value can be quantified based on the level of the PGD marker and the additional marker.
  • the method can further include providing the adaptive MCCV model with a training set for machine learning.
  • the adaptive MCCV model can be a continuously evolving model based on the training set.
  • the method can further include providing an additional therapy to the subject based on the PGD risk value.
  • the additional therapy can include KLKB1 activators, anti-inflammatory agents, or combinations thereof.
  • the disclosed subject matter also provides systems for identifying the risk of primary graft dysfunction (PGD) of a subject.
  • An example system can include one or more processors and one or more computer-readable non-transitory storage media coupled to one or more of the processors.
  • the one or more computer-readable non-transitory storage media can include instructions operable when executed by one or more of the processors to cause the system to collect a sample of the subject, measure a level of a PGD marker from the sample, provide a PGD risk value that is quantified based on the level of the PGD marker using an adaptive Monte Carlo cross-validation (MCCV) model, and identify the risk of PGD based on the PGD risk value.
  • the PGD marker can include plasma kallikrein (KLKB1).
  • Fig. 1A provides a diagram of example blood-derived micro-vesicle proteomics in accordance with the disclosed subject matter.
  • Fig. 1B provides a diagram showing example protein markers identified by mass spectrometry in accordance with the disclosed subject matter.
  • Fig. 1C provides example protein filtering in accordance with the disclosed subject matter.
  • Fig. 2 provides a graph showing clinical diagnostic ELISA tests for C3, C4, and total complement proteins in accordance with the disclosed subject matter.
  • Fig. 3 provides a diagram showing Monte Carlo Cross-Validation (MCCV) Prediction in accordance with the disclosed subject matter.
  • Fig. 4 provides a graph showing exosome protein expression distributions for patient cohorts in accordance with the disclosed subject matter.
  • Fig. 5 provides a graph showing example techniques for primary graft dysfunction (PGD) prediction by clinical and protein markers in accordance with the disclosed subject matter.
  • Fig. 6 provides a graph showing the prediction of pre-transplant inotrope therapy, left ventricular assist device, and both clinical factors on posttransplant PGD in accordance with the disclosed subject matter.
  • Fig. 7A provides a graph showing the area under the receiver operating characteristic curve (AUROC) in accordance with the disclosed subject matter.
  • Fig. 7B provides a graph showing the AUROC distribution for all panels per marker composition in accordance with the disclosed subject matter.
  • Fig. 7C provides a graph showing the AUROC distribution for all marker panels composed of at least 1 protein marker and all inotrope therapy panels in accordance with the disclosed subject matter.
  • Fig. 7D provides a graph comparing the AUROC performance of 2-marker panels overall against the average of the individual cohorts and the integrated cohort in accordance with the disclosed subject matter.
  • Fig. 7E provides a graph showing the performance vs. the variation of the performance between the three patient cohorts in accordance with the disclosed subject matter.
  • Fig. 7F provides the KLKB1 and inotrope therapy PGD classifier equation in accordance with the disclosed subject matter.
  • Fig. 8 provides graphs showing that pre-transplant KLKB1 protein expression and inotrope therapy predict post-transplant PGD in accordance with the disclosed subject matter.
  • Fig. 9 provides graphs showing a clinical and protein panel that outperforms existing clinical predictors in accordance with the disclosed subject matter.
  • Fig. 10A provides a graph showing a normalized ELISA KLKB1 concentration comparison in accordance with the disclosed subject matter.
  • Fig. 10B provides a graph showing the putative PGD classifier in accordance with the disclosed subject matter.
  • Fig. 10C provides example performance metrics of the classifier at the highest sensitivity in accordance with the disclosed subject matter.
  • Fig. 11 provides a diagram showing a differential protein analysis modeling scheme in accordance with the disclosed subject matter.
  • Fig. 12A provides a graph showing enrichment and depletion of pathways using differential protein expression in accordance with the disclosed subject matter.
  • Fig. 12B provides a graph showing protein marker predictors in accordance with the disclosed subject matter.
  • Fig. 12C provides a graph showing ESR expression in accordance with the disclosed subject matter.
  • Fig. 12D provides a graph showing hsCRP expression in accordance with the disclosed subject matter.
  • Fig. 13 provides a graph showing a calibration curve for PGD prediction by a putative classifier on 80 CUIMC patient assessment data in accordance with the disclosed subject matter.
  • Figs. 14A-14C provide graphs showing principal components of protein expression and association with covariates in accordance with the disclosed subject matter.
  • Figs. 15A-15B provide graphs showing a correlation between unadjusted and adjusted individual and two-marker panel performances in accordance with the disclosed subject matter.
  • the disclosed subject matter provides techniques for treating and/or preventing primary graft dysfunction (PGD) by analyzing exosome proteins.
  • the disclosed subject matter provides systems and methods for predicting PGD with exosome proteins and treating PGD based on the prediction.
  • PGF primary graft failure
  • the term “about” or “approximately” means within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, i.e., the limitations of the measurement system. For example, “about” can mean within 3 or more than 3 standard deviations, per the practice in the art. Alternatively, “about” can mean a range of up to 20%, up to 10%, up to 5%, and up to 1% of a given value. Alternatively, e.g., with respect to biological systems or processes, the term can mean within an order of magnitude, within 5-fold, and within 2-fold, of a value.
  • Coupled refers to the connection of a device component to another device component by methods known in the art.
  • the term “subject” includes any human or nonhuman animal.
  • nonhuman animal includes, but is not limited to, all vertebrates, e.g., mammals and non-mammals, such as nonhuman primates, dogs, cats, sheep, horses, cows, chickens, amphibians, reptiles, etc.
  • the disclosed subject matter provides a method for identifying the risk of primary graft dysfunction (PGD) of a subject.
  • An example method can include collecting a sample of the subject, measuring a level of PGD marker from the sample, providing a PGD risk value, and identifying the risk of PGD based on the PGD risk value.
  • the sample can be collected from a subject.
  • the sample can include any body fluids of the subject.
  • the sample can include blood, serum, tears, effluent fluids, plasma, urine, semen, saliva, bronchial fluid, cerebral spinal fluid (CSF), amniotic fluid, synovial fluid, lymph, bile, gastric acid, or combinations thereof.
  • CSF cerebral spinal fluid
  • the method can include obtaining one or more characteristics of the subject.
  • the characteristic can include demographics, biometrics, lab values, medications, hemodynamics, cardiomyopathy, transplant factors, clinical variables or combinations thereof.
  • the demographics can include body mass index (BMI), blood type, age, sex, history of tobacco, diabetes, ischemic, or combinations thereof.
  • the cardiomyopathy can include non-ischemic, Adriamycin, amyloid, Chagas, Congenital, Hypertrophic cardiomyopathy, Idiopathic, Myocarditis, Valvular Heart Disease, Viral, Ischemic Time, or combinations thereof.
  • the transplant factors can include ventricular assist device, pulmonary artery (PA) diastolic, or a combination thereof.
  • PA pulmonary artery
  • the hemodynamics can include pulmonary artery systolic, PA mean, central venous pressure (CVP), pulmonary capillary wedge pressure (PCWP), creatinine, or a combination thereof.
  • the lab values can include an international normalized ratio (INR), total bilirubin, sodium, antiarrhythmic, or combinations thereof.
  • the medications can include beta-blocker, inotrope, CVP/PCWP, or combinations thereof.
  • the clinical variables can include a medical history of the subject (e.g., pre-transplant inotrope therapy).
  • the characteristics can be used for calculating Radial scores and model for end-stage liver disease (MELD) scores.
  • MELD model for end-stage liver disease
  • the level of a PGD marker can be measured from the sample of the subject.
  • the PGD marker can include proteins peroxiredoxin 2 (PRDX2), tropomyosin alpha-4 (TPM4), myeloperoxidase (MPO), PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, plasma kallikrein (KLKB1), or combinations thereof.
  • the PGD marker can be KLKB1.
  • the method can further include measuring the level of the additional marker from the sample.
  • the additional marker can include PRDX2, TPM4, MPO, PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, KLKB1, IGHD, IGLV2-11, or combinations thereof.
  • the level of the PGD marker and/or additional marker can be measured through various assays.
  • the level of the PGD marker and/or additional marker can be measured using mass spectrometry analysis.
  • microvesicles can be isolated from a sample (e.g., 100 µl) from a subject and homogenized using an MS-compatible lysis buffer. Lysate (e.g., 20 µg) from each sample can be proteolytically cleaved with trypsin and chemically labeled with a mass spectrometer-detectable quantification reagent.
  • a reference sample can be generated by pooling equal amounts of microvesicles from each subject to create a protein library for quantification.
  • Samples can be bulk mixed (e.g., at 1:1) across all channels; the bulk-mixed samples can be fractionated, and each fraction can be dried.
  • Dried peptides can be dissolved in a solution of 2% acetonitrile/2% formic acid and injected (e.g., into an Orbitrap Fusion coupled with the UltiMate™ 3000 RSLCnano system).
  • Fractionated peptides can be separated with an about 5-30% acetonitrile gradient in about 0.1% formic acid over about 70 min.
  • the full MS spectra can be acquired at a resolution of about 120,000.
  • the method can include selecting the most intense ions (e.g., MS1 ions) for MS2 analysis.
  • MS1 can be the initial ionized sample. These ions can be split into smaller fragments, usually through collision, to generate smaller ions (MS2) and so on (MS3). Each MS level represents greater fragmentation, such that separating the fragments by mass/charge ratio allows individual ions to be identified.
  • the isolation width can be set at about 0.7 Da, and isolated precursors can be fragmented by Collision Induced Dissociation (CID) at normalized collision energy (NCE) of 35% and analyzed in the ion trap using “turbo” scan speed.
  • CID Collision Induced Dissociation
  • NCE normalized collision energy
  • a synchronous precursor selection (SPS) MS3 scan can be collected on the selected ions (e.g., the top 10 most intense ions in the MS2 spectrum).
  • SPS-MS3 precursors can be fragmented by higher energy collision-induced dissociation (HCD) at a normalized collision energy (NCE) of 60% and analyzed.
  • Raw mass spectrometric data can be analyzed (e.g., using Proteome Discoverer) to perform a database search and tandem mass tag (TMT) reporter ion quantification.
  • TMT can be isobaric mass tags that can allow for quantitation of each protein identified by mass spectrometry.
  • TMT tags on lysine residues and peptide N termini (e.g., +229.163 Da)
  • the carbamidomethylation of cysteine residues (e.g., +57.021 Da)
  • data can be searched against a predetermined database (e.g., a UniProt human database), with peptide-spectrum matches (PSMs) and protein-level identifications filtered at a 1% false discovery rate (FDR).
  • FDR can be a multiple hypothesis correction that quantifies the rate of false discoveries or false positive predictions.
  • the signal-to-noise (S/N) measurements of each protein can be normalized so that the sum of the signal for all proteins in each channel can be equivalent to account for equal protein loading.
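The channel normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's pipeline: the function name `normalize_channels`, the dictionary layout, and the choice of the mean channel total as the common target are all hypothetical, and the S/N values are made up.

```python
def normalize_channels(sn):
    """Scale each channel so that every channel has the same total signal.

    `sn` maps a protein name to a list of S/N values, one per channel.
    The common target (mean of the channel totals) is an assumption.
    """
    proteins = list(sn)
    n_channels = len(next(iter(sn.values())))
    # Total signal observed in each channel across all proteins
    totals = [sum(sn[p][c] for p in proteins) for c in range(n_channels)]
    target = sum(totals) / n_channels
    return {
        p: [sn[p][c] * target / totals[c] for c in range(n_channels)]
        for p in proteins
    }


# Hypothetical S/N values for two proteins measured in two channels
raw = {"KLKB1": [10.0, 20.0], "MPO": [30.0, 60.0]}
normalized = normalize_channels(raw)
channel_sums = [sum(v[c] for v in normalized.values()) for c in range(2)]
```

After scaling, the second channel no longer looks twice as abundant as the first, which is the intended correction for unequal protein loading.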
  • the level of the PGD marker and/or additional marker can be measured using enzyme-linked immunosorbent assay (ELISA) assays.
  • ELISA assay can be used to assess PGD marker/additional marker (e.g., KLKB1 protein) concentrations.
  • the ELISA and mass spectrometry-derived protein expression can be compared through the minimum-maximum normalized patient cohort data. The obtained results can be further analyzed for protein expression analysis.
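Minimum-maximum normalization places measurements from different assays on a common 0 to 1 scale so they can be compared. A minimal sketch follows; the function name `min_max` and the example values are hypothetical, and returning zeros for a constant input is a design choice made here, not something the disclosure specifies.

```python
def min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant cohort carries no ranking information; map it to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical KLKB1 readouts for the same three patients on two platforms
elisa_scaled = min_max([2.0, 4.0, 6.0])      # ELISA concentrations
ms_scaled = min_max([10.0, 20.0, 30.0])      # mass spectrometry S/N
```

Once both readouts are on the same scale, they can be compared patient by patient even though the raw units differ.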
  • the method can include performing protein expression analysis.
  • the difference in protein expression distributions between the prospective and retrospective cohorts can be evaluated (e.g., with the Kolmogorov-Smirnov 2-sample test).
  • the deviation of a protein expression distribution from normality can be assessed with D’Agostino and Pearson’s test, where the normality of a distribution can be rejected when the p-value falls below a chosen alpha level.
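The 2-sample Kolmogorov-Smirnov comparison rests on one statistic: the largest gap between the two empirical cumulative distribution functions. The sketch below computes only that statistic with the standard library; it does not compute a p-value, and the function name `ks_statistic` is hypothetical. In practice a library routine such as `scipy.stats.ks_2samp` (and `scipy.stats.normaltest` for the D'Agostino and Pearson normality test) would likely be used instead.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute difference between the empirical CDFs of a and b."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    # The gap between two step functions can only change at observed values
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical expression values from a prospective and a retrospective cohort
same = ks_statistic([1, 2, 3], [1, 2, 3])
shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
```

A statistic near 0 suggests the two cohorts share a distribution, while a value near 1 indicates almost no overlap.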
  • a differential protein expression signature between PGD and non-PGD patient samples can be calculated.
  • L1-regularized logistic regression models can be calculated for each protein with the sites-of-origin as covariates. For example, about 200 bootstraps (samples with replacement) of the models can be performed to determine a confidence interval for the protein expression association to PGD. The average of the bootstrap distribution for each protein can be used as the differential rank statistic.
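The bootstrap step can be sketched independently of the model being fit. In the example below, the L1-regularized logistic regression coefficient is replaced by a much simpler stand-in statistic (mean expression in PGD minus non-PGD patients) so that the code needs only the standard library; this substitution, the function names, the percentile interval, and the data are all assumptions made for illustration.

```python
import random
from statistics import mean


def mean_difference(sample):
    """Stand-in statistic: mean expression in PGD minus non-PGD patients."""
    pgd = [x for x, y in sample if y == 1]
    non = [x for x, y in sample if y == 0]
    if not pgd or not non:
        return None  # resample lacked one class; caller redraws
    return mean(pgd) - mean(non)


def bootstrap(rows, statistic, n_boot=200, seed=0):
    """Return the bootstrap mean and a 95% percentile interval."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n_boot:
        sample = [rng.choice(rows) for _ in rows]  # sample with replacement
        value = statistic(sample)
        if value is not None:
            draws.append(value)
    draws.sort()
    lower = draws[n_boot * 25 // 1000]
    upper = draws[n_boot * 975 // 1000 - 1]
    return mean(draws), (lower, upper)


# Hypothetical (expression, PGD status) pairs for one protein
rows = [(0.4, 1), (0.5, 1), (0.6, 1), (1.0, 0), (1.1, 0), (1.2, 0)]
average, (lower, upper) = bootstrap(rows, mean_difference)
```

The bootstrap average plays the role of the differential rank statistic; an interval that excludes zero suggests a consistent association across resamples.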
  • pathway analysis can be conducted using gene set enrichment analysis (GSEA).
  • GSEA Gene set enrichment analysis
  • the Normalized Enrichment Score (NES) can provide a gene set enrichment compared to all permutations of the gene set enrichment for the protein expression data.
  • the NES can be interpreted as the gene set enrichment score corrected for the size of the gene set and spurious, uninteresting correlations between the gene sets and the expression dataset.
  • the p-value can estimate the probability of seeing an enrichment score as high or higher among the permutation distribution, and the false discovery rate (FDR) can estimate the probability that an enrichment score with a given NES is a false positive finding.
  • FDR false discovery rate
  • the protein prediction contribution can be assessed within each of the pathways and functions from the GSEA analysis.
  • the set of proteins within each pathway and function can be used as features in an L1-regularized logistic regression model (e.g., using a Monte Carlo cross-validation (MCCV) model). For example, if a given pathway A includes a set of 5 proteins, then those 5 proteins can be included as features in the L1-regularized logistic regression model, given the sites-of-origin as covariates.
  • MCCV Monte Carlo cross-validation
  • the method can include providing a PGD risk value that can be quantified based on the level of the PGD marker using an adaptive MCCV model.
  • the PGD marker, additional PGD markers, characteristics of the subject, or combinations thereof can be used for calculating the PGD risk value.
  • the MCCV can be used.
  • the PGD prediction probabilities can be compared to the true PGD status to compute the area under the receiver operating characteristic curve (AUROC) and other metrics. From the disclosed model, a possible PGD risk value of 2 can be the log odds risk of PGD for every unit increase of the characteristic.
  • AUROC area under the receiver operating characteristic curve
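The AUROC has a direct interpretation that makes it easy to compute by hand: it is the fraction of (PGD, non-PGD) patient pairs in which the PGD patient received the higher predicted probability, with ties counted as half. The sketch below uses that pairwise definition; the function name `auroc` and the example predictions are hypothetical.

```python
def auroc(labels, scores):
    """Pairwise AUROC: P(score of a PGD patient > score of a non-PGD patient)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one PGD and one non-PGD patient")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical true PGD status (1 = PGD) and predicted probabilities
perfect = auroc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
partial = auroc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
inverted = auroc([1, 1, 0, 0], [0.1, 0.2, 0.8, 0.9])
```

Values near 1 indicate good ranking of PGD over non-PGD patients, and values below 0.5 indicate that non-PGD patients tend to receive the higher probabilities.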
  • bootstrapping analysis (samples with replacement) can be used for analyzing a population distribution for prediction performances, and a permutation analysis can be performed, with random labeling of PGD status in patients, to generate and test prediction metrics from random PGD assignment.
  • the differences between the bootstrap and permutation distributions, as well as between the 2 bootstrap distributions, can be evaluated with the 2-sample Kolmogorov-Smirnov test.
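A permutation analysis of the kind described above can be sketched as follows: PGD labels are shuffled many times and the prediction metric is recomputed each time, giving a distribution of what random assignment would achieve. The metric here is a simple accuracy at a 0.5 threshold, used as a stand-in, and the empirical p-value at the end is a simpler summary than the distribution comparison described in the text. All names and data are hypothetical.

```python
import random


def accuracy(labels, scores, threshold=0.5):
    """Stand-in metric: fraction of patients classified correctly."""
    hits = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return hits / len(labels)


def permutation_null(labels, scores, metric, n_perm=200, seed=0):
    """Metric values obtained after randomly relabeling PGD status."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break any link between label and score
        null.append(metric(shuffled, scores))
    return null


# Hypothetical PGD status and predicted probabilities
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
observed = accuracy(labels, scores)
null = permutation_null(labels, scores, accuracy)
# Share of random relabelings that match or beat the observed metric
p_value = (1 + sum(v >= observed for v in null)) / (1 + len(null))
```

If the observed metric sits far in the tail of the permuted values, the prediction is unlikely to be explained by random PGD assignment.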
  • the adaptive MCCV technique can perform prediction of non-PGD as well as PGD.
  • Machine learning models can be used to produce higher probabilities for non-PGD patients, which can result in AUROC values (e.g., less than about 0.5), which can be regarded as a random prediction.
  • the disclosed MCCV technique can sample these patient probabilities to derive an AUROC performance metric and confidence interval.
  • the calculated marker performances can be representative of the model’s confidence in predicting the occurrence of PGD.
  • the disclosed machine learning model can be used for predicting the risk of PGD at every iteration of the MCCV technique.
  • patients can be randomly assigned to training and validation sets.
  • the lambda hyperparameter from the machine learning model can be estimated (e.g., using 10-fold cross validation or an appropriate hyperparameter set from the chosen machine learning model).
  • within each fold, a training set of patients can set the machine learning model parameters, and the performance can be assessed on a separate testing set.
  • the best performing fold on the testing set can be then chosen to evaluate the machine learning model parameters.
  • the validation set which has remained unused in the procedure, can be now used to evaluate the performance of the top performing machine learning model (e.g., from the 10-fold cross validation).
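The splitting logic of the procedure above can be sketched without committing to a particular model. The generator below only produces patient indices: a random validation set held out for the whole iteration, and a k-way partition of the remaining training patients. The function name `mccv_splits`, its parameters, and the even fold assignment are assumptions; the cohort size of 88 with 13 validation patients is borrowed from the example cohort for illustration.

```python
import random


def mccv_splits(n_patients, n_validation, n_folds=10, n_iterations=200, seed=0):
    """Yield (training, validation, folds) index lists for each MCCV iteration."""
    rng = random.Random(seed)
    for _ in range(n_iterations):
        order = list(range(n_patients))
        rng.shuffle(order)  # a fresh random split on every iteration
        validation = order[:n_validation]
        training = order[n_validation:]
        # Partition the training patients into n_folds disjoint testing folds
        folds = [training[i::n_folds] for i in range(n_folds)]
        yield training, validation, folds


splits = list(mccv_splits(88, 13, n_iterations=3))
training, validation, folds = splits[0]
held_out = set(folds[0])
# Within a fold: fit on the other training patients, test on the held-out fold
fit_on = [i for i in training if i not in held_out]
```

A model would be fit on `fit_on`, scored on the held-out fold to pick hyperparameters, and only then evaluated once on `validation`, which no step of the tuning has seen.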
  • the method can include providing the disclosed MCCV technique with a training set for machine learning.
  • the disclosed MCCV technique can use a training set to optimize machine learning model hyperparameters to make final predictions of PGD risk.
  • the size, diversity, and composition of the training set can determine the hyperparameters chosen for the final machine learning model.
  • machine learning model hyperparameters can be chosen for a more accurate and generalizable risk prediction.
  • the MCCV technique can be a continuously evolving technique based on the training set. For example, machine learning and statistical techniques can be used to mitigate confounding in biological enrichment analyses and improve predictive accuracy with modest population size.
  • putative PGD classifiers can be generated from the disclosed MCCV technique and used for the prediction of PGD.
  • the average of the bootstrap distribution of marker importance (beta coefficients) of the disclosed models can be applied to provide PGD risk on new data.
  • the risk score of the putative PGD classifier can undergo an additional mathematical transformation, a logistic equation, before becoming usable as a clinical risk score.
  • marker A and marker B can have average importance of -1 and -2, respectively.
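The logistic transformation can be illustrated with the two-marker example above. The sketch forms a linear score from the marker levels and their average importances (beta coefficients) and maps it to a probability between 0 and 1. The function name `pgd_risk`, the zero intercept, and the marker levels are assumptions; the disclosure gives only the importances of -1 and -2.

```python
import math


def pgd_risk(levels, betas, intercept=0.0):
    """Map a linear marker score to a 0-1 risk with the logistic function."""
    score = intercept + sum(betas[m] * levels[m] for m in betas)
    return 1.0 / (1.0 + math.exp(-score))


# Average importances from the example: marker A = -1, marker B = -2
betas = {"A": -1.0, "B": -2.0}
baseline = pgd_risk({"A": 0.0, "B": 0.0}, betas)   # score 0
elevated = pgd_risk({"A": 1.0, "B": 1.0}, betas)   # score -3
```

Because both importances are negative, raising either marker lowers the computed risk, which matches the idea that a reduction in a protective marker can predict PGD.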
  • the method can include identifying the risk of PGD based on the PGD risk value.
  • Alteration of the level of PGD marker expression can be a predictor of PGD.
  • reduction in KLKB1 can be a predictor of PGD both by itself and in combination with other markers.
  • an increase of the markers involved in either inflammation or innate immunity (e.g., PRDX2, MPO, PGLYRP2, and DEFA1) can be a predictor of PGD.
  • the characteristic of the subject can be evaluated for identifying the PGD risk.
  • the lack of inotrope therapy can be predictive of PGD.
  • The patient’s blood type and/or whether the patient has diabetes can also be a risk factor for PGD.
  • the disclosed information related to proteomics and clinical variables can be evaluated through the disclosed model to increase classification power.
  • the combination of KLKB1 with inotrope therapy can result in a significant increase in classification power when compared to a combination of KLKB1 and other top-performing proteins.
  • this panel can outperform other composite scores and clinical variables such as the Radial score.
  • the disclosed method can further include assessing an effect of a therapy on the heart transplant by estimating the PGD risk value of the subject before/after the therapy is administered to the subject.
  • the therapy can be any use of mechanical support and/or drug therapy (e.g., beta blockers, antiarrhythmics, etc.).
  • the heart transplant surgery can be canceled based on the identified PGD risk value.
  • additional therapy can be administered to the subject to reduce the PGD risk value before or after the heart transplant.
  • KLKB1 activators/blockers, anti-inflammatory agents, or combinations can be administered to the subject to reduce PGD risk value.
  • the disclosed subject matter provides a system for predicting PGD and/or treating/preventing PGD based on the prediction.
  • the system can include one or more processors and one or more computer-readable non-transitory storage media coupled to one or more of the processors.
  • the one or more computer-readable non-transitory storage media can include instructions operable when executed by one or more of the processors to cause the system to collect a sample of the subject, measure a level of a PGD marker from the sample, provide a PGD risk value that can be quantified based on the level of the PGD marker using an adaptive Monte Carlo cross-validation (MCCV) model, and identify the risk of PGD based on the PGD risk value.
  • MCCV adaptive Monte Carlo cross-validation
  • the PGD marker can include plasma kallikrein (KLKB1).
  • the processor can be electronic circuitry (e.g., central processing unit, graphics processing unit, digital signal processor, etc.) within a computer/server that can include a non-transitory storage medium.
  • instructions can include a set of machine languages that a processor can understand and execute.
  • the disclosed processor can be configured to collect or receive the sample of the subject.
  • the sample can include any body fluids of the subject.
  • the sample can include blood, serum, tears, effluent fluids, plasma, urine, semen, saliva, bronchial fluid, cerebral spinal fluid (CSF), amniotic fluid, synovial fluid, lymph, bile, gastric acid, or combinations thereof.
  • the disclosed processor can be configured to receive information related to one or more characteristics of a subject.
  • the characteristic can include the disclosed demographics, biometrics, lab values, medications, hemodynamics, cardiomyopathy, transplant factors, clinical variables or combinations thereof.
  • the disclosed processor can be configured to measure or receive information related to a level of a PGD marker from the sample.
  • the PGD marker can include proteins peroxiredoxin 2 (PRDX2), tropomyosin alpha-4 (TPM4), myeloperoxidase (MPO), PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, plasma kallikrein (KLKB1), or combinations thereof.
  • the PGD marker can be KLKB1.
  • the system can be configured to measure or receive information related to the level of the additional marker from the sample.
  • the additional marker can include PRDX2, TPM4, MPO, PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, KLKB1, IGHD, IGLV2-11, or combinations thereof.
  • the disclosed processor can be configured to provide the disclosed PGD risk value that can be quantified based on the level of the PGD marker using the disclosed adaptive Monte Carlo cross-validation (MCCV) model.
  • the adaptive MCCV model can assess the level of the PGD marker, the additional marker, characteristics of the subject, or combinations thereof to provide the PGD risk value. For example, the combination of the KLKB1 level and a history of inotrope therapy can be assessed for predicting the PGD risk value.
  • the MCCV model can be a continuously evolving model.
  • the processor can include a machine learning program, which can mitigate confounding in biological enrichment analyses and improve predictive accuracy with modest population size.
  • the MCCV model can be improved by providing a training set for machine learning. Training sets can include matched patients (e.g., one patient group that had PGD and one group that did not have PGD, where both patient groups were of similar age and the same sex). Another criterion can be the number of patients in the training set.
  • the processor can be configured to identify the risk of PGD based on the calculated PGD risk value.
  • the processor can be configured to assess an effect of a therapy on the heart transplant by estimating the PGD risk value of the subject.
  • the processor can provide further recommendations or instructions for additional treatment for the subject based on the PGD risk value.
  • the processor can recommend canceling the heart transplant based on the identified PGD risk value.
  • the processor can recommend additional therapy (e.g., KLKB1 activators, anti-inflammatory agents, or combinations) for reducing the PGD risk value before or after the heart transplant.
  • Example 1: Plasma kallikrein predicts primary graft dysfunction after heart transplant
  • Primary graft dysfunction (PGD) after heart transplant can be defined as idiopathic ventricular dysfunction during the immediate post-transplant period.
  • PGD can affect either or both ventricles simultaneously and be graded from mild to severe depending on the amount of compensatory support required.
  • the International Society for Heart and Lung Transplantation reported that PGD is the leading cause of death within 30 days after transplant. Identifying predictive factors of PGD has the potential to improve risk stratification, organ allocation, and post-operative care, as well as increase the understanding of the etiology of PGD.
  • a risk model based solely on pretransplant recipient factors remains elusive.
  • Molecular biomarkers can be predictive and robust for many diseases.
  • a rich and underexplored source of potential prognostic biomarkers can be contained in extracellular vesicles.
  • extracellular vesicles can be stable, easily extracted from patient blood, and be used in the prediction of heart disease.
  • the disclosed subject matter provides techniques for a multi-institutional cohort analysis to predict PGD using machine learning to identify combinations of serum microvesicle proteomics and clinical characteristics.
  • Patient cohorts: Patient blood samples were prospectively collected between 2014 and 2016. Patient blood samples were retrospectively collected from biobanks at Cedars-Sinai hospital (Cedars) and Pitié-Salpêtrière University Hospital (Paris). Only severe PGD by ISHLT definition was included. Patients undergoing re-transplant were excluded. The initial cohort for PGD prediction comprised PGD samples matched to non-PGD samples by age and gender. In order to calculate more clinically relevant predictive values, the validation ELISA cohort included consecutive patients undergoing a transplant. The human subjects protocol was approved by each institution’s IRB, and patients provided informed consent. Patient characteristics were collected, including demographics, biometrics, labs, medications, and hemodynamics. PGD status was defined per ISHLT guidelines.
  • Mass spectrometry analysis: Patient samples from each site were collected for processing. Each patient cohort was processed independently. Total microvesicles were isolated from serum. Each sample was proteolytically cleaved with trypsin and chemically labeled with TMT10plex isobaric mass tags separately. MS spectra were acquired with an Orbitrap Fusion Tribrid Mass Spectrometer (Thermo Scientific), and raw spectrometric data were analyzed using Proteome Discoverer.
  • Protein expression analysis: A differential protein expression signature between PGD and non-PGD patient samples was calculated (Figure 2). The calculated protein association was used as the differential rank statistic for pathway analysis using gene set enrichment analysis (GSEA).
  • Figure 2 shows the clinical diagnostic ELISA tests for C3, C4, and total complement proteins. C3 and C4 values are in mg/dl, and total complement values are in U/ml.
  • PGD prediction: A logistic regression model with L1 regularization was used for each marker to determine its predictive performance and association with PGD (see Figure 3).
  • Figure 3 shows a Monte Carlo Cross-Validation (MCCV) Prediction diagram.
  • MCCV Monte Carlo Cross-Validation
  • The PGD prediction strategy for estimating the predictive value of clinical and protein markers toward the occurrence of PGD post-heart transplant is shown in Fig. 3.
  • An L1-regularized logistic regression model predicted post-transplant PGD using each pre-transplant clinical and protein marker’s value distribution.
  • the prediction scheme estimates the variance of prediction using different splits of the patient population. Patients are randomly assigned to training (75 patients) and validation (13 patients) sets. Within the training set, model parameters are estimated using 10-fold cross-validation. Within each fold, data from 64 patients are used to set the model parameters, and data from 11 patients are used to test the model performance.
  • the model parameters with the best prediction performance can be used as initial parameters to train the model on all 75 patients in the training set.
  • the 13 patients in the validation set, which have been set aside throughout the procedure, are now used to evaluate the model’s prediction performance.
  • the importance of the marker towards the prediction on the validation patient data is collected from the beta coefficients of the logistic regression model.
  • the end result is a 200-bootstrap confidence interval of PGD prediction performance and importance for each of the clinical and protein markers, controlling for the patient’s site-of-origin. 200 random patient splits were computed following this prediction paradigm for comparison to a random prediction distribution.
  • Confidence intervals were generated from predicted patient probabilities by taking 50 bootstraps and calculating the mean and 95% confidence interval. To estimate the prediction variance, Monte Carlo cross-validation (MCCV) was used. The PGD prediction probabilities were compared to the true PGD status to compute the area under the receiver operating characteristic curve (AUROC) and other metrics. Bootstrapping analysis (sampling with replacement) resulted in a population distribution for prediction performance, and a permutation analysis was similarly performed, with random labeling of PGD status in patients, to generate and test prediction metrics from random PGD assignment. Differences between the bootstrap and permutation distributions, as well as between 2 bootstrap distributions, were evaluated with the 2-sample Kolmogorov-Smirnov test. Statistics followed by bracket notation indicate the average statistic and its 95% confidence interval. The average statistic and standard errors were noted when reporting Student t-test results.
  • MCCV Monte Carlo cross-validation
  • KLKB1 ELISA assay in heart transplant patients: an enzyme-linked immunosorbent assay (ELISA) (Abcam) was used to assess KLKB1 protein concentration in a validation cohort of pre-transplant serum prospectively collected from 65 consecutive patients at CUIMC. To allow comparison of ELISA and mass spectrometry-derived protein expression, the patient cohort data were minimum-maximum normalized before application of the MCCV strategy for all predictions.
  • ELISA enzyme-linked immunosorbent assay
  • PGD primary graft dysfunction
  • BMI body mass index
  • PA pulmonary artery
  • CVP central venous pressure
  • PCWP pulmonary capillary wedge pressure
  • INR international normalized ratio
  • Tbili total bilirubin
  • MELD model for end-stage liver disease score
  • Table 3 Cellular enrichment of identified proteins.
  • Patient blood microvesicle proteomic characteristics: serum microvesicle protein spectra were obtained in at least triplicate for each patient (322 total replicates) (Figure 1A). The identified proteins were enriched in microvesicle and extracellular components (Table 3).
  • Table 4 is sorted by area under the receiver operating characteristic curve (AUROC). The beta coefficients of the models were exponentiated to the odds shown below. The lower and upper bounds indicate the 95% confidence interval. Significance criteria: AUROC average > 0.5, Bonferroni-corrected p-value < 0.001, beta coefficient 95% CI not including the null association, and permutation beta coefficient 95% CI including the null association. Significant clinical characteristics are highlighted.
  • Table 4 Prediction statistics of significant protein markers and clinical characteristics.
  • Protein expression in the three patient cohorts does not follow a normal distribution (omnibus test of normality p-values ≪ 0.001).
  • Figure 4 shows exosome protein expression distributions for the patient cohorts. The individual patient expression distributions in each cohort were superimposed to represent each patient’s individual contribution to the whole cohort protein expression distribution.
  • Prediction of post-transplant PGD using pre-transplant clinical and protein markers: the prediction of post-transplant PGD in patients was investigated using clinical and protein markers derived prior to transplant. Monte Carlo cross-validation (MCCV; Figure 3) and permutation analysis were employed to calculate the predictive performance and significance of each clinical and protein marker in predicting PGD.
  • Figure 5 shows that pre-transplant inotrope therapy can be predictive of PGD independent of a left ventricular assist device.
  • Figure 6 shows the prediction of pre-transplant inotrope therapy, left ventricular assist device, and both clinical factors on post-transplant PGD.
  • Tables 6 and 7 describe the biological pathways where proteins expressed in PGD patients were significantly different than in patients without PGD. The difference would be enriched if the expression was higher in PGD patients and depleted if the expression was lower in PGD patients.
  • the panel of inotrope therapy and KLKB1 showed the least variation while maintaining high performance across all cohorts (95% AUROC CI above 0.7; Figure 7E).
  • PGD classifier performance Each panel’s predictions form a 2-marker classifier equation, as shown for the KLKB1 protein and inotrope therapy panel in Figure 7F.
  • the classifier equation for inotrope therapy and KLKB1 is the sum of -0.9946 multiplied by a binary value for pre-transplant inotrope therapy (0 or 1) and -2.140 multiplied by normalized pre-transplant KLKB1 expression. This equation demonstrates an inverse relationship between post-transplant PGD risk and either pre-transplant KLKB1 expression or inotrope therapy (or both).
  • the PGD classifier has significantly increased performance compared to the markers on their own (Kolmogorov-Smirnov 2-sample test p-values < 2.165E-23; Figure 8 and Table 8).
  • the disclosed prediction panel was compared to existing PGD predictors: the RADIAL score, the MELD score, and the CVP/PCWP ratio.
  • the 2-marker panel significantly outperforms all composite scores by 50% on average (Figure 9; Kolmogorov-Smirnov 2-sample p-values < 2.165E-23).
  • Table 10 Two marker panel equation performance on validation data. True Positive TP; True Negative TN; False Positive FP; False Negative FN; Positive Predictive Value PPV; Negative Predictive Value NPV.
  • Figure 11 shows the differential protein analysis modeling scheme.
  • the post-transplant PGD population risk of each marker is shown. From the 88 patients, sampling with replacement (e.g., over- and under-representing males and females in a population as shown here) was performed prior to fitting the model.
  • the fitted model can estimate the population risk towards PGD occurrence of a marker controlling for the patient’s site of origin.
  • the random sampling and model fitting were repeated 200 times.
  • the distribution of the 200 bootstraps produces a confidence interval for PGD population risk for each marker.
  • the average of each distribution is the population risk value for that marker.
  • the collection of the population risk values is the differential expression signature towards primary graft dysfunction.
  • GSEA Gene set enrichment analysis
  • Figure 12B revealed enrichment of processes related to inflammation, coagulation, and activation of the innate immune system. Downregulation of KLKB1 was identified in the activated complement and immune response pathways.
  • Table 12 Depleted Functions and Pathways in PGD.
  • Figure 13 shows a calibration curve for PGD prediction by a putative classifier on assessment data from 80 CUIMC patients. Probabilities of PGD risk versus percent/number of PGD patients among the CUIMC assessment patients are shown in Figure 13. Patients who had moderate or severe PGD are shown as enlarged triangles, and those who did not are shown as circles on the calibration curve. The probabilities calculated are the logit-transformed dot product between the assessment data (KLKB1 ELISA expression and pre-transplant inotrope therapy) and the putative PGD classifier.
  • Figures 14A-14C show principal components of protein expression and association with covariates. Overlays of site-of-origin (14A), set (14B), and TMT-tag (14C) covariates on protein expression variation, determined via principal components analysis (PCA) for patients, are shown. Set/TMT tag (or experimental batch) is accounted for during protein identification and quantification, while each patient cohort was a different experiment and was not accounted for during this process. As shown in the principal components analysis, cohort site of origin explains protein expression variation for patients and thus is included as a covariate in association and prediction analyses. PCA can identify the largest sources of variability in the protein expression data, and this variability can be non-biological. Therefore, patients are projected onto their variability components to assess which non-biological variability explains the observed differences in the protein expression data.
  • Figures 15A-15B show the correlation between unadjusted and adjusted individual and two-marker panel performances. Comparisons between model specifications for the individual (15A) and two-marker panel predictions (15B), with and without covariate adjustment (i.e., cohort site-of-origin), are shown.
  • the marker prediction specifications did not include site-of-origin as covariates in order to easily translate the putative classifier equations to new patient data.
  • when covariate adjustment was included, the average AUROC performance was highly correlated with the unadjusted performance, suggesting minimal confounding by site and that the classifier equations can translate accurately onto new patient data agnostic of site. This analysis was performed to generate evidence supporting the use of simpler and more interpretable machine learning models that did not account for patient site of origin.
  • KLKB1 is a serine protease that controls the activation of both inflammation and coagulation in what is known as the kallikrein-kinin system (KKS).
  • KLKB1 converts high molecular weight kininogen into bradykinin, stimulating the release of nitric oxide and prostacyclin, causing vasodilation and increased vascular permeability. It also acts as a neutrophil chemoattractant, causing degranulation. Evaluations of the KKS system in patients with sepsis, a markedly inflammatory state, demonstrated increased KKS activity, characterized by decreased levels of plasma kallikrein, likely due to consumption. Decreases in KLKB1 have been noted in typhoid fever, ARDS, cardiopulmonary bypass and in normal volunteers infused with gram-negative endotoxin. Similarly, in animal models of inflammatory bowel disease and inflammatory arthritis, plasma kallikrein levels were markedly reduced.
  • inotrope therapy was predictive of a lower risk of PGD, and this stands in contrast to prior analyses, which demonstrated that the presence of inotrope therapy was associated with PGD.
  • Pre-transplant inotrope therapy and durable mechanical support (such as LVAD) are exclusive prior to transplant, and mechanical support has been associated with PGD in prior studies.
  • mechanical support was not significantly predictive of PGD in the analyses and did not interact with inotrope therapy in prediction models. Whether inotrope therapy itself is an actual driver of PGD protection versus an epiphenomenal marker remains to be explored. There are clear differences in medical therapy, anticoagulation and mechanical support between patients receiving and not receiving inotrope therapy (Table 10).
  • whether the proteomic results were being driven by a specific microvesicular process or were a reflection of the greater overall serum milieu was tested in the validation ELISA cohort.
  • the ELISA samples themselves were not able to generate a classifier using KLKB1 and inotrope therapy due to the paucity of PGD samples in that cohort.
  • the proteomics-derived classifier generated a similar AUROC on whole serum as it did in the original microvesicle proteomic cohort.
  • the classifier performed essentially as a rule-out test with a very high negative predictive value.
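As a minimal, non-limiting sketch, the two-marker classifier equation described above for inotrope therapy and KLKB1 can be written out in Python. The two coefficients are the ones quoted in the text. The function names, the zero intercept, and the use of the logistic function to turn the summed score into a probability are assumptions made only for illustration, since the text states neither an intercept nor a decision threshold.

```python
import math

# Coefficients quoted in the text for the inotrope therapy / KLKB1 panel.
BETA_INOTROPE = -0.9946
BETA_KLKB1 = -2.140
# ASSUMPTION: no intercept is given in the text, so 0.0 is used for illustration.
INTERCEPT = 0.0


def pgd_log_odds(inotrope_therapy: bool, klkb1_normalized: float) -> float:
    """Summed classifier score (treated here as a log-odds) for one patient."""
    if not 0.0 <= klkb1_normalized <= 1.0:
        raise ValueError("KLKB1 expression is expected to be min-max normalized to [0, 1]")
    inotrope = 1.0 if inotrope_therapy else 0.0
    return INTERCEPT + BETA_INOTROPE * inotrope + BETA_KLKB1 * klkb1_normalized


def pgd_probability(inotrope_therapy: bool, klkb1_normalized: float) -> float:
    """ASSUMPTION: the score is mapped to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-pgd_log_odds(inotrope_therapy, klkb1_normalized)))
```

Because both coefficients are negative, the score falls as normalized KLKB1 expression rises or when inotrope therapy is present, which mirrors the inverse relationship stated above.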

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Pulmonology (AREA)
  • Chemical & Material Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Organic Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present subject matter relates to techniques for identifying risk of primary graft dysfunction (PGD) of a subject. The disclosed method can include collecting serum of the subject, measuring a level of a PGD marker from the serum, wherein the PGD marker comprises plasma kallikrein (KLKB1), providing a PGD risk value that is quantified based on the level of the PGD marker using an adaptive Monte Carlo cross-validation (MCCV) model, and identifying the risk of PGD based on the PGD risk value.

Description

SYSTEMS AND METHODS FOR PREDICTING GRAFT
DYSFUNCTION WITH EXOSOME PROTEINS
CROSS-REFERENCE TO RELATED APPLICATION
This application claims priority to U.S. Provisional Patent Application No. 63/078,672, which was filed on September 15, 2020, the entire contents of which are incorporated by reference herein.
GRANT INFORMATION
This invention was made with government support under grant number UL1 TR001873 awarded by the National Institutes of Health. The government has certain rights in the invention.
BACKGROUND
Primary graft dysfunction (PGD) after heart transplant can be defined as idiopathic heart failure occurring within the immediate postoperative period. PGD can affect either or both ventricles simultaneously and be graded from mild to severe depending on the amount of support required to compensate for organ dysfunction. PGD can cause the death of patients within 30 days after transplant.
The underlying cause of PGD and the importance of different factors towards post-transplant PGD remain unclear. Identifying predictive factors of PGD in recipients has the potential to improve risk stratification, organ allocation, and post-operative care as well as increase the understanding of the etiology of PGD.
Therefore, there is a need for improved techniques for predicting PGD.
SUMMARY
The disclosed subject matter provides techniques for identifying the risk of primary graft dysfunction (PGD) of a subject.
An exemplary method can include collecting a sample of the subject, measuring a level of a PGD marker from the sample, providing a PGD risk value that is quantified based on the level of the PGD marker using an adaptive Monte Carlo cross-validation (MCCV) model, and identifying the risk of PGD based on the PGD risk value. In nonlimiting embodiments, the PGD marker can include plasma kallikrein (KLKB1).
In certain embodiments, the method can further include assessing an effect of a therapy on the heart transplant by estimating the PGD risk value of the subject. The subject can receive the therapy before or after the assessment.
In certain embodiments, the method can further include identifying a clinical variable of the subject. In non-limiting embodiments, the clinical variable can include a medical history of the subject. In some embodiments, the medical history of the subject can include a pre-transplant inotrope therapy.
In certain embodiments, the method can further include measuring a level of an additional marker from the sample. In non-limiting embodiments, the additional marker can include proteins peroxiredoxin 2 (PRDX2), tropomyosin alpha-4 (TPM4), myeloperoxidase (MPO), PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, or combinations thereof.
In certain embodiments, the PGD risk value can be quantified based on the level of the PGD marker and the additional marker. In certain embodiments, the method can further include providing the adaptive MCCV model with a training set for machine learning. In non-limiting embodiments, the adaptive MCCV model can be a continuously evolving model based on the training set.
In certain embodiments, the method can further include providing an additional therapy to the subject based on the PGD risk value. In non-limiting embodiments, the additional therapy can include KLKB1 activators, anti-inflammatory agents, or combinations thereof.
The disclosed subject matter also provides systems for identifying the risk of primary graft dysfunction (PGD) of a subject. An example system can include one or more processors and one or more computer-readable non-transitory storage media coupled to one or more of the processors. The one or more computer-readable non-transitory storage media can include instructions operable when executed by one or more of the processors to cause the system to collect a sample of the subject, measure a level of a PGD marker from the sample, provide a PGD risk value that is quantified based on the level of the PGD marker using an adaptive Monte Carlo cross-validation (MCCV) model, and identify the risk of PGD based on the PGD risk value. In non-limiting embodiments, the PGD marker can include plasma kallikrein (KLKB1).
The disclosed subject matter will be further described below.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1A provides a diagram of example blood-derived microvesicle proteomics in accordance with the disclosed subject matter. Fig. 1B provides a diagram showing example protein markers identified by mass spectrometry in accordance with the disclosed subject matter. Fig. 1C provides example protein filtering in accordance with the disclosed subject matter.
Fig. 2 provides a graph showing clinical diagnostic ELISA tests for C3, C4, total complement proteins in accordance with the disclosed subject matter.
Fig. 3 provides a diagram showing Monte Carlo Cross-Validation (MCCV) Prediction in accordance with the disclosed subject matter.
Fig. 4 provides a graph showing exosome protein expression distributions for patient cohorts in accordance with the disclosed subject matter.
Fig. 5 provides a graph showing example techniques for primary graft dysfunction (PGD) prediction by clinical and protein markers in accordance with the disclosed subject matter.
Fig. 6 provides a graph showing the prediction of pre-transplant inotrope therapy, left ventricular assist device, and both clinical factors on posttransplant PGD in accordance with the disclosed subject matter.
Fig. 7A provides a graph showing the area under the receiver operating characteristic curve (AUROC) in accordance with the disclosed subject matter. Fig. 7B provides a graph showing the AUROC distribution for all panels per marker composition in accordance with the disclosed subject matter. Fig. 7C provides a graph showing the AUROC distribution for all marker panels composed of at least 1 protein marker and all inotrope therapy panels in accordance with the disclosed subject matter. Fig. 7D provides a graph comparing the overall AUROC performance of 2-marker panels against the average of individual cohorts and the integrated cohort in accordance with the disclosed subject matter. Fig. 7E provides a graph showing the performance vs. the variation of the performance between the three patient cohorts in accordance with the disclosed subject matter. Fig. 7F provides the KLKB1 and inotrope therapy PGD classifier equation in accordance with the disclosed subject matter.
Fig. 8 provides graphs showing that pre-transplant KLKB1 protein expression and inotrope therapy predict post-transplant PGD in accordance with the disclosed subject matter.
Fig. 9 provides graphs showing a clinical and protein panel that outperforms existing clinical predictors in accordance with the disclosed subject matter.
Fig. 10A provides a graph showing a normalized ELISA KLKB1 concentration comparison in accordance with the disclosed subject matter. Fig. 10B provides a graph showing the putative PGD classifier in accordance with the disclosed subject matter. Fig. 10C provides example performance metrics of the classifier at the highest sensitivity in accordance with the disclosed subject matter.
Fig. 11 provides a diagram showing a differential protein analysis modeling scheme in accordance with the disclosed subject matter.
Fig. 12A provides a graph showing enrichment and depletion of pathways using differential protein expression in accordance with the disclosed subject matter. Fig. 12B provides a graph showing protein marker predictors in accordance with the disclosed subject matter. Fig. 12C provides a graph showing ESR expression in accordance with the disclosed subject matter. Fig. 12D provides a graph showing hsCRP expression in accordance with the disclosed subject matter.
Fig. 13 provides a graph showing a calibration curve for PGD prediction by a putative classifier on 80 CUIMC patient assessment data in accordance with the disclosed subject matter.
Figs. 14A-14C provide graphs showing principal components of protein expression and association with covariates in accordance with the disclosed subject matter.
Figs. 15A-15B provide graphs showing a correlation between unadjusted and adjusted individual and two-marker panel performances in accordance with the disclosed subject matter.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and are intended to provide further explanation of the disclosed subject matter.
DETAILED DESCRIPTION
The disclosed subject matter provides techniques for treating and/or preventing primary graft dysfunction (PGD) by analyzing exosome proteins. The disclosed subject matter provides systems and methods for predicting PGD with exosome proteins and treating PGD based on the prediction. The terms primary graft dysfunction (PGD) and primary graft failure (PGF) can be used interchangeably herein.
The terms “comprise(s),” “include(s),” “having,” “has,” “can,” “contain(s),” and variants thereof, as used herein, are intended to be open-ended transitional phrases, terms, or words that do not preclude additional acts or structures. The singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. The present disclosure also contemplates other embodiments “comprising,” “consisting of,” and “consisting essentially of,” the embodiments or elements presented herein, whether explicitly set forth or not.
As used herein, the term “about” or “approximately” means within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, i.e., the limitations of the measurement system. For example, “about” can mean within 3 or more than 3 standard deviations, per the practice in the art. Alternatively, “about” can mean a range of up to 20%, up to 10%, up to 5%, and up to 1% of a given value. Alternatively, e.g., with respect to biological systems or processes, the term can mean within an order of magnitude, within 5-fold, and within 2-fold, of a value.
The term “coupled,” as used herein, refers to the connection of a device component to another device component by methods known in the art.
As used herein, the term “subject” includes any human or nonhuman animal. The term “nonhuman animal” includes, but is not limited to, all vertebrates, e.g., mammals and non-mammals, such as nonhuman primates, dogs, cats, sheep, horses, cows, chickens, amphibians, reptiles, etc.
In certain embodiments, the disclosed subject matter provides a method for identifying the risk of primary graft dysfunction (PGD) of a subject. An example method can include collecting a sample of the subject, measuring a level of PGD marker from the sample, providing a PGD risk value, and identifying the risk of PGD based on the PGD risk value.
In certain embodiments, as shown in Fig. 1, the sample can be collected from a subject. In non-limiting embodiments, the sample can include any body fluids of the subject. For example, the sample can include blood, serum, tears, effluent fluids, plasma, urine, semen, saliva, bronchial fluid, cerebral spinal fluid (CSF), amniotic fluid, synovial fluid, lymph, bile, gastric acid, or combinations thereof.
In certain embodiments, the method can include obtaining one or more characteristics of the subject. The characteristic can include demographics, biometrics, lab values, medications, hemodynamics, cardiomyopathy, transplant factors, clinical variables or combinations thereof. For example, the demographics can include body mass index (BMI), blood type, age, sex, history of tobacco, diabetes, ischemic, or combinations thereof. The cardiomyopathy can include non-ischemic, Adriamycin, amyloid, Chagas, Congenital, Hypertrophic cardiomyopathy, Idiopathic, Myocarditis, Valvular Heart Disease, Viral, Ischemic Time, or combinations thereof. The transplant factors can include ventricular assist device, pulmonary artery (PA) diastolic, or a combination thereof. The hemodynamics can include pulmonary artery systolic, PA mean, central venous pressure (CVP), pulmonary capillary wedge pressure (PCWP), creatinine, or a combination thereof. The lab values can include an international normalized ratio (INR), total bilirubin, sodium, antiarrhythmic, or combinations thereof. The medications can include beta-blocker, inotrope, CVP/PCWP, or combinations thereof. The clinical variables can include a medical history of the subject (e.g., pre-transplant inotrope therapy). In non-limiting embodiments, the characteristic can be used for calculating RADIAL and model for end-stage liver disease (MELD) scores. For example, the MELD score can be derived for each patient using the formula:
MELD = 3.78 × ln[serum bilirubin (mg/dL)] + 11.2 × ln[INR] + 9.57 × ln[serum creatinine (mg/dL)] + 6.43    (1)
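By way of illustration only, formula (1) can be evaluated directly as in the following Python sketch. The function and parameter names are hypothetical. Clinical implementations of MELD often apply lower bounds or caps to the inputs; none are applied here because the text does not describe any.

```python
import math


def meld_score(bilirubin_mg_dl: float, inr: float, creatinine_mg_dl: float) -> float:
    """Evaluate formula (1) as written; inputs must be positive for ln() to be defined."""
    for name, value in (("bilirubin", bilirubin_mg_dl),
                        ("INR", inr),
                        ("creatinine", creatinine_mg_dl)):
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    return (3.78 * math.log(bilirubin_mg_dl)
            + 11.2 * math.log(inr)
            + 9.57 * math.log(creatinine_mg_dl)
            + 6.43)
```

When all three inputs equal 1, every logarithm is zero and the score reduces to the constant term 6.43.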
In non-limiting embodiments, the clinical risk scores can include a plurality of risk factors for primary graft dysfunction (e.g., the RADIAL factors: Right atrial pressure >=10 mm Hg, recipient Age >=60 years, Diabetes mellitus, Inotrope dependence, donor Age >=30 years, and Length of ischemic time >=240 minutes).
In certain embodiments, the level of a PGD marker can be measured from the sample of the subject. In non-limiting embodiments, the PGD marker can include proteins peroxiredoxin 2 (PRDX2), tropomyosin alpha-4 (TPM4), myeloperoxidase (MPO), PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, plasma kallikrein (KLKB1), or combinations thereof. In non-limiting embodiments, the PGD marker can be KLKB1. In some embodiments, the method can further include measuring the level of the additional marker from the sample. The additional marker can include PRDX2, TPM4, MPO, PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, KLKB1, IGHD, IGLV2-11, or combinations thereof.
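As a hedged sketch, the six RADIAL risk factors and thresholds listed above can be tallied as follows. The class and field names are hypothetical, and counting one point per factor is an assumption; the text lists the factors but does not state how they are scored.

```python
from dataclasses import dataclass


@dataclass
class RadialInputs:
    right_atrial_pressure_mmhg: float
    recipient_age_years: float
    diabetes_mellitus: bool
    inotrope_dependence: bool
    donor_age_years: float
    ischemic_time_minutes: float


def radial_score(p: RadialInputs) -> int:
    """Count how many of the six listed risk factors are present.

    ASSUMPTION: each factor contributes exactly one point.
    """
    factors = [
        p.right_atrial_pressure_mmhg >= 10,   # R: right atrial pressure
        p.recipient_age_years >= 60,          # A: recipient age
        p.diabetes_mellitus,                  # D: diabetes mellitus
        p.inotrope_dependence,                # I: inotrope dependence
        p.donor_age_years >= 30,              # A: donor age
        p.ischemic_time_minutes >= 240,       # L: length of ischemic time
    ]
    return sum(1 for present in factors if present)
```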
In certain embodiments, the level of the PGD marker and/or additional marker can be measured through various assays. In non-limiting embodiments, the level of the PGD marker and/or additional marker can be measured using mass spectrometry analysis. For example, microvesicles can be isolated from a sample (e.g., 100 µl) from a subject and homogenized using an MS-compatible lysis buffer. Lysate (e.g., 20 µg) from each sample can be proteolytically cleaved with trypsin and chemically labeled with a mass spectrometer-detectable quantification reagent. A reference sample can be generated by pooling equal amounts of microvesicles from each subject to create a protein library for quantification. Samples can be bulk mixed (e.g., at 1:1) across all channels, the bulk mixed samples can be fractionated, and each fraction can be dried. Dried peptides can be dissolved in a solution of 2% acetonitrile/2% formic acid and injected (e.g., into an Orbitrap Fusion coupled with the UltiMate™ 3000 RSLCnano system). Fractionated peptides can be separated with an about 5-30% acetonitrile gradient in about 0.1% formic acid over about 70 min. In non-limiting embodiments, the full MS spectra can be acquired at a resolution of about 120,000. In some embodiments, the method can include selecting the most intense ions (e.g., MS1 ions) for MS2 analysis. MS1 can be the initial ionized sample. These ions can be split into smaller fragments, usually through collision, to generate smaller ions (MS2) and so on (MS3). Each MS stage represents greater fragmentation, such that separation by mass/charge ratio allows individual ions to be identified. The isolation width can be set at about 0.7 Da, and isolated precursors can be fragmented by Collision Induced Dissociation (CID) at a normalized collision energy (NCE) of 35% and analyzed in the ion trap using “turbo” scan speed.
Following the acquisition of each MS2 spectrum, a synchronous precursor selection (SPS) MS3 scan can be collected on the selected ions (e.g., the top 10 most intense ions in the MS2 spectrum). SPS-MS3 precursors can be fragmented by higher energy collision-induced dissociation (HCD) at a normalized collision energy (NCE) of 60% and analyzed. Raw mass spectrometric data can be analyzed to perform a database search and tandem mass tag (TMT) reporter ion quantification. TMT can be isobaric mass tags that allow for quantitation of each protein identified by mass spectrometry. TMT tags on lysine residues and peptide N termini (e.g., +229.163 Da) and the carbamidomethylation of cysteine residues (e.g., +57.021 Da) can be set as static modifications, while the oxidation of methionine residues (e.g., +15.995 Da) and deamidation (e.g., +0.984 Da) of asparagine and glutamine can be set as variable modifications. In non-limiting embodiments, data can be searched against a predetermined database (e.g., a UniProt human database) with peptide-spectrum match (PSM) and protein-level false discovery rates (FDR) of 1%. The FDR can be a multiple hypothesis correction that quantifies the rate of false discoveries or false positive predictions. The signal-to-noise (S/N) measurements of each protein can be normalized so that the sum of the signal for all proteins in each channel is equivalent, to account for equal protein loading. In certain embodiments, the level of the PGD marker and/or additional marker can be measured using an enzyme-linked immunosorbent assay (ELISA). For example, an ELISA can be used to assess PGD marker/additional PGD marker (e.g., KLKB1 protein) concentrations. The ELISA and mass spectrometry-derived protein expression can be compared using minimum-maximum normalized patient cohort data. The obtained results can be further analyzed by protein expression analysis.
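The two normalizations mentioned above (equalizing the summed signal per channel, and minimum-maximum normalization of cohort data) can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the choice of the mean channel total as the common target is an assumption, and returning zeros for a constant cohort is an assumption made to avoid division by zero.

```python
def equalize_channel_totals(channels):
    """Scale each channel so that all channels have the same summed S/N.

    `channels` maps a channel name to a list of per-protein S/N values.
    ASSUMPTION: the common target is the mean of the channel totals.
    """
    totals = {name: sum(values) for name, values in channels.items()}
    if any(total <= 0 for total in totals.values()):
        raise ValueError("every channel needs a positive total signal")
    target = sum(totals.values()) / len(totals)
    return {name: [v * target / totals[name] for v in values]
            for name, values in channels.items()}


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # ASSUMPTION: a constant cohort carries no rank information, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Min-max normalization places ELISA concentrations and mass spectrometry-derived expression on the same 0 to 1 scale within each cohort, which is what allows a classifier fitted on one assay to be applied to the other.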
In certain embodiments, the method can include performing protein expression analysis. For example, the difference in protein expression distributions between the prospective and retrospective cohorts can be evaluated (e.g., with the Kolmogorov-Smirnov 2-sample test). Deviation of the protein expression distribution from normality can be assessed with D’Agostino and Pearson’s test, where the normality of a distribution can be rejected when the p-value falls below a chosen alpha level. In some embodiments, a differential protein expression signature between PGD and non-PGD patient samples can be calculated. To estimate the association of individual protein levels to PGD, L1-regularized logistic regression models can be calculated for each protein with the sites-of-origin as covariates. For example, about 200 bootstraps (samples with replacement) of the models can be performed to determine a confidence interval for the protein expression association to PGD. The average of the bootstrap distribution for each protein can be used as the differential rank statistic.
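The bootstrap step described above can be sketched as follows, under stated assumptions. To keep the example short and dependency-free, the per-protein association statistic is a simple difference in mean expression between PGD and non-PGD patients; this is a stand-in and not the L1-regularized logistic regression coefficient described in the text. All names, the percentile method for the interval, and the redraw of resamples that lack one class are illustrative choices.

```python
import random
import statistics


def mean_difference(sample):
    """Stand-in association statistic: mean expression in PGD minus non-PGD."""
    pgd = [x for x, label in sample if label == 1]
    non_pgd = [x for x, label in sample if label == 0]
    if not pgd or not non_pgd:
        raise ValueError("resample is missing one class")
    return statistics.mean(pgd) - statistics.mean(non_pgd)


def bootstrap(rows, statistic, n_boot=200, seed=0, max_draws=10000):
    """Return the bootstrap mean and a 95% percentile interval of `statistic`.

    `rows` is a list of (expression, pgd_label) pairs for one protein.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(max_draws):
        if len(estimates) == n_boot:
            break
        resample = rng.choices(rows, k=len(rows))  # sampling with replacement
        try:
            estimates.append(statistic(resample))
        except ValueError:
            continue  # the resample lacked one class; draw again
    if len(estimates) < n_boot:
        raise RuntimeError("could not draw enough valid resamples")
    estimates.sort()
    lower = estimates[round(0.025 * (n_boot - 1))]
    upper = estimates[round(0.975 * (n_boot - 1))]
    return statistics.mean(estimates), (lower, upper)
```

The returned mean plays the role of the differential rank statistic for the protein, and an interval that excludes zero suggests an association that is stable across resamples.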
In certain embodiments, pathway analysis can be conducted using gene set enrichment analysis (GSEA). For GSEA, the Normalized Enrichment Score (NES) can provide a gene set enrichment compared to all permutations of the gene set enrichment for the protein expression data. The NES can be interpreted as the gene set enrichment score corrected for the size of the gene set and spurious, uninteresting correlations between the gene sets and the expression dataset. The p-value can estimate the probability of seeing an enrichment score as high or higher among the permutation distribution, and the false discovery rate (FDR) can estimate the probability that an enrichment score with a given NES is a false positive finding.
In certain embodiments, the protein prediction contribution can be assessed within each of the pathways and functions from the GSEA analysis. The set of proteins within each pathway and function can be used as features in an L1-regularized logistic regression model (e.g., using a Monte Carlo cross-validation (MCCV) model). For example, if a given pathway A includes a set of 5 proteins, then those 5 proteins can be included as features in the L1-regularized logistic regression model, given the sites-of-origin as covariates.
In certain embodiments, the method can include providing a PGD risk value that can be quantified based on the level of the PGD marker using an adaptive MCCV model. The PGD marker, additional PGD markers, characteristics of the subject, or combinations thereof can be used for calculating the PGD risk value. For example, a logistic regression model with L1 regularization can be used for each marker to determine its predictive performance and association with PGD. To estimate the prediction variance and PGD risk value, the MCCV can be used. For example, the PGD prediction probabilities can be compared to the true PGD status to compute the area under the receiver operating characteristic curve (AUROC) and other metrics. From the disclosed model, a possible PGD risk value of 2 can be the log odds risk of PGD for every unit increase of the characteristic. In non-limiting embodiments, bootstrapping analysis (samples with replacement) can be used for analyzing a population distribution for prediction performances, and a permutation analysis can be performed, with random labeling of PGD status in patients, to generate and test prediction metrics from random PGD assignment. In some embodiments, the differences between the bootstrap and permutation distributions, as well as between the 2 bootstrap distributions, can be evaluated with the 2-sample Kolmogorov-Smirnov test.
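The AUROC comparison of predicted probabilities against true PGD status can be illustrated with the pairwise (rank-based) definition below. This is a sketch under stated assumptions: the function name and the example labels and scores are hypothetical, and labels are assumed to be coded 1 for PGD and 0 for non-PGD.

```python
def auroc(labels, scores):
    """Probability that a randomly chosen positive (label 1) receives a
    higher score than a randomly chosen negative (label 0); ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical PGD status and predicted probabilities for four patients.
status = [1, 0, 1, 0]
predicted = [0.9, 0.8, 0.4, 0.1]
performance = auroc(status, predicted)
```

A value of 0.5 corresponds to random ranking, and a value below 0.5 means the model tends to assign higher probabilities to non-PGD patients.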
In certain embodiments, the adaptive MCCV technique can perform prediction of non-PGD as well as PGD. Machine learning models can produce higher probabilities for non-PGD patients, which can result in AUROC values (e.g., less than about 0.5) that can be regarded as a random prediction. The disclosed MCCV technique can sample these patient probabilities to derive an AUROC performance metric and confidence interval. The calculated marker performances can be representative of the model’s confidence in predicting the occurrence of PGD. The disclosed machine learning model can be used for predicting the risk of PGD at every iteration of the MCCV technique. In MCCV, patients can be randomly assigned to training and validation sets. Within the training set, the lambda hyperparameter of the machine learning model can be estimated (e.g., using 10-fold cross-validation or an appropriate hyperparameter set for the chosen machine learning model). Within each fold, a training set of patients can set the machine learning model parameters, and the performance can be assessed on a separate testing set. The best performing fold on the testing set can then be chosen to set the machine learning model parameters. The validation set, which has remained unused in the procedure, can then be used to evaluate the performance of the top performing machine learning model (e.g., from the 10-fold cross-validation).
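The splitting scheme described above (repeated random training/validation assignment with k-fold cross-validation nested inside the training set) can be sketched as follows. Only the index bookkeeping is shown; model fitting is left out. The function name, the seed, the even interleaved fold partition, and the cohort sizes used in the usage lines are illustrative assumptions.

```python
import random


def mccv_splits(n_patients, n_validation, n_iterations, n_folds=10, seed=0):
    """Yield (folds, validation) for each Monte Carlo iteration.

    `folds` is a list of (fit, test) index lists drawn from the training
    patients; `validation` stays untouched until the final evaluation.
    """
    rng = random.Random(seed)
    for _ in range(n_iterations):
        order = list(range(n_patients))
        rng.shuffle(order)  # new random patient assignment per iteration
        validation, training = order[:n_validation], order[n_validation:]
        folds = []
        for k in range(n_folds):
            test = training[k::n_folds]  # every n_folds-th training patient
            held_out = set(test)
            fit = [i for i in training if i not in held_out]
            folds.append((fit, test))
        yield folds, validation


# Hypothetical usage: 88 patients, 13 held out for validation, 5 iterations.
splits = list(mccv_splits(n_patients=88, n_validation=13, n_iterations=5))
first_folds, first_validation = splits[0]
```

Because the validation patients never enter any fold, the final performance estimate in each iteration is independent of hyperparameter selection.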
In certain embodiments, the method can include providing the disclosed MCCV technique with a training set for machine learning. The disclosed MCCV technique can use a training set to optimize machine learning model hyperparameters to make final predictions of PGD risk. Thus, the size, diversity, and composition of the training set can determine the hyperparameters chosen for the final machine learning model. By utilizing a robust and diverse training set, machine learning model hyperparameters can be chosen for a more accurate and generalizable risk prediction. In non-limiting embodiments, the MCCV technique can be a continuously evolving technique based on the training set. For example, machine learning and statistical techniques can be used to mitigate confounding in biological enrichment analyses and improve predictive accuracy with modest population size.
In certain embodiments, putative PGD classifiers can be generated from the disclosed MCCV technique and used for the prediction of PGD. The average of the bootstrap distribution of marker importance (beta coefficients) of the disclosed models can be applied to provide PGD risk on new data. Unlike certain classifiers that resemble a simple equation, in which feature risk coefficients are multiplied by the normalized value or indicator of each feature for a patient and summed together for a final risk score, the risk score of the putative PGD classifier can undergo an additional mathematical transformation, a logistic equation, before becoming usable as a clinical risk score. For example, marker A and marker B can have average importance of -1 and -2, respectively. By taking the dot product between the average marker importance of -1 and -2 and a patient’s values for markers A and B and applying a logistic (inverse logit) transformation, the equation results in a probability of PGD risk for each patient. These equations are produced for every two-marker panel. An example equation can be (-0.9946*[pre-transplant inotrope therapy indicator]) + (-2.140*[pre-transplant KLKB1 normalized protein expression value]).
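The example equation above can be turned into a probability as sketched below. The two coefficients are those of the example equation; the function name and the absence of an intercept term are assumptions made for illustration, since the text gives only the two weighted terms.

```python
import math

INOTROPE_COEF = -0.9946  # coefficient from the example equation
KLKB1_COEF = -2.140      # coefficient from the example equation


def pgd_probability(inotrope, klkb1):
    """Logistic transform of the two-marker linear score.

    inotrope: 1 if the patient received pre-transplant inotrope therapy, else 0.
    klkb1: pre-transplant KLKB1 expression, min-max normalized to [0, 1].
    No intercept is used (an assumption; none appears in the text).
    """
    score = INOTROPE_COEF * inotrope + KLKB1_COEF * klkb1
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical patients: lower KLKB1 and no inotrope therapy give higher risk.
risk_low_klkb1 = pgd_probability(inotrope=0, klkb1=0.1)
risk_high_klkb1 = pgd_probability(inotrope=1, klkb1=0.9)
```

Both coefficients are negative, so the computed probability falls as normalized KLKB1 expression rises and when inotrope therapy is present.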
In certain embodiments, the method can include identifying the risk of PGD based on the PGD risk value. Alteration of the level of PGD marker expression can be a predictor of PGD. For example, a reduction in KLKB1 can be a predictor of PGD both by itself and in combination with other markers. In non-limiting embodiments, an increase of the markers involved in either inflammation or innate immunity (e.g., PRDX2, MPO, PGLYRP2, and DEFA1) can be a predictor of PGD. In some embodiments, the characteristic of the subject can be evaluated for identifying the PGD risk. For example, the lack of inotrope therapy can be predictive of PGD. A patient’s blood type and/or whether the patient has diabetes can also be a risk factor for PGD.
In certain embodiments, the disclosed information related to proteomics and clinical variables can be evaluated through the disclosed model to increase classification power. For example, the combination of KLKB1 with inotrope therapy can result in a significant increase in classification power when compared to a combination of KLKB1 and other top-performing proteins. Furthermore, this panel can outperform other composite scores and clinical variables such as the RADIAL score.
In certain embodiments, the disclosed method can further include assessing an effect of a therapy on the heart transplant by estimating the PGD risk value of the subject before and/or after the therapy is administered to the subject. The therapy can be any use of mechanical support and/or drug therapy (e.g., beta blockers, antiarrhythmics, etc.). In non-limiting embodiments, the heart transplant surgery can be canceled based on the identified PGD risk value. In some embodiments, additional therapy can be administered to the subject to reduce the PGD risk value before or after the heart transplant. For example, KLKB1 activators/blockers, anti-inflammatory agents, or combinations thereof can be administered to the subject to reduce the PGD risk value.
In certain embodiments, the disclosed subject matter provides a system for predicting PGD and/or treating/preventing PGD based on the prediction. The system can include one or more processors and one or more computer-readable non-transitory storage media coupled to one or more of the processors. The one or more computer-readable non-transitory storage media can include instructions operable when executed by one or more of the processors to cause the system to collect a sample of the subject, measure a level of a PGD marker from the sample, provide a PGD risk value that can be quantified based on the level of the PGD marker using an adaptive Monte Carlo cross-validation (MCCV) model, and identify the risk of PGD based on the PGD risk value. In non-limiting embodiments, the PGD marker can include plasma kallikrein (KLKB1). In some embodiments, the processor can be electronic circuitry (e.g., a central processing unit, graphics processing unit, digital signal processor, etc.) within a computer/server that can include non-transitory storage media. In non-limiting embodiments, the instructions can include a set of machine-language instructions that a processor can interpret and execute.
In certain embodiments, the disclosed processor can be configured to collect or receive the sample of the subject. The sample can include any body fluids of the subject. For example, the sample can include blood, serum, tears, effluent fluids, plasma, urine, semen, saliva, bronchial fluid, cerebral spinal fluid (CSF), amniotic fluid, synovial fluid, lymph, bile, gastric acid, or combinations thereof.
In certain embodiments, the disclosed processor can be configured to receive information related to one or more characteristics of a subject. The characteristic can include the disclosed demographics, biometrics, lab values, medications, hemodynamics, cardiomyopathy, transplant factors, clinical variables or combinations thereof.
In certain embodiments, the disclosed processor can be configured to measure or receive information related to a level of a PGD marker from the sample. In non-limiting embodiments, the PGD marker can include proteins peroxiredoxin 2 (PRDX2), tropomyosin alpha-4 (TPM4), myeloperoxidase (MPO), PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, plasma kallikrein (KLKB1), or combinations thereof. In non-limiting embodiments, the PGD marker can be KLKB1. In some embodiments, the system can be configured to measure or receive information related to the level of the additional marker from the sample. The additional marker can include PRDX2, TPM4, MPO, PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, KLKB1, IGHD, IGLV2-11, or combinations thereof.
In certain embodiments, the disclosed processor can be configured to provide the disclosed PGD risk value that can be quantified based on the level of the PGD marker using the disclosed adaptive Monte Carlo cross-validation (MCCV) model. The adaptive MCCV model can assess the level of the PGD marker, the additional marker, characteristics of the subject, or combinations thereof to provide the PGD risk value. For example, the combination of the KLKB1 level and a history of inotrope therapy can be assessed for predicting the PGD risk value.
In non-limiting embodiments, the MCCV model can be a continuously evolving model. For example, the processor can include a machine learning program, which can mitigate confounding in biological enrichment analyses and improve predictive accuracy with modest population size. The MCCV model can be improved by providing a training set for machine learning. Training sets can include matched patients (e.g., one patient group that had PGD and one group that did not have PGD, where both patient groups were of similar age and the same sex). Another criterion can be the number of patients in the training set. In non-limiting embodiments, the processor can be configured to identify the risk of PGD based on the calculated PGD risk value.
In certain embodiments, the processor can be configured to assess an effect of a therapy on the heart transplant by estimating the PGD risk value of the subject. In non-limiting embodiments, the processor can provide further recommendations or instructions for additional treatment for the subject based on the PGD risk value. For example, the processor can recommend canceling the heart transplant based on the identified PGD risk value. The processor can recommend additional therapy (e.g., KLKB1 activators, anti-inflammatory agents, or combinations thereof) for reducing the PGD risk value before or after the heart transplant.
EXAMPLES
Example 1: Plasma kallikrein predicts primary graft dysfunction after heart transplant
Primary graft dysfunction (PGD) after heart transplant can be defined as idiopathic ventricular dysfunction during the immediate post-transplant period. PGD can affect either or both ventricles simultaneously and be graded from mild to severe depending on the amount of compensatory support required. The International Society for Heart and Lung Transplantation reported that PGD is the leading cause of death within 30 days after transplant. Identifying predictive factors of PGD has the potential to improve risk stratification, organ allocation, and post-operative care, as well as increase the understanding of the etiology of PGD. However, a risk model based solely on pretransplant recipient factors remains elusive.
Molecular biomarkers can be predictive and robust for many diseases. A rich and underexplored source of potential prognostic biomarkers can be contained in extracellular vesicles. In addition to their diagnostic potential, extracellular vesicles can be stable, can be easily extracted from patient blood, and can be used in the prediction of heart disease. The disclosed subject matter provides techniques for a multi-institutional cohort analysis to predict PGD using machine learning to identify combinations of serum microvesicle proteomics and clinical characteristics.
Patient cohorts: patient blood samples were prospectively collected between 2014 and 2016. Patient blood samples were also retrospectively collected from biobanks at Cedars-Sinai hospital (Cedars) and Pitie Salpetriere University Hospital (Paris). Only severe PGD by ISHLT definition was included. Patients undergoing re-transplant were excluded. The initial cohort for PGD prediction comprised PGD samples matched to non-PGD samples by age and gender. In order to calculate more clinically relevant predictive values, the validation ELISA cohort included consecutive patients undergoing a transplant. The human subjects protocol was approved by each institution’s IRB, and patients provided informed consent. Patient characteristics were collected, including demographics, biometrics, labs, medications, and hemodynamics. PGD status was defined per ISHLT guidelines.
Mass spectrometry analysis: patient samples from each site were collected for processing. Each patient cohort was processed independently. The total microvesicle fraction was isolated from serum. Each sample was separately proteolytically cleaved with trypsin and chemically labeled with TMT10plex isobaric mass tags. MS spectra were acquired with an Orbitrap Fusion Tribrid Mass Spectrometer (Thermo Scientific), and raw spectrometric data were analyzed using Proteome Discoverer.
Protein expression analysis: a differential protein expression signature between PGD and non-PGD patient samples was calculated (Figure 2). The calculated protein association was used as the differential rank statistic for pathway analysis using gene set enrichment analysis (GSEA). Figure 2 shows the clinical diagnostic ELISA tests for C3, C4, and total complement proteins. C3 and C4 are reported in mg/dl, and total complement is reported in U/ml.
PGD prediction: a logistic regression model with L1 regularization was used for each marker to determine its predictive performance and association with PGD (see Figure 3). Figure 3 shows a Monte Carlo cross-validation (MCCV) prediction diagram. The PGD prediction strategy for estimating how well clinical and protein markers predict the occurrence of PGD after heart transplant is shown in Fig. 3. An L1-regularized logistic regression model predicted post-transplant PGD using each pre-transplant clinical and protein marker’s value distribution. The prediction scheme estimates the variance of prediction using different splits of the patient population. Patients are randomly assigned to training (75 patients) and validation (13 patients) sets. Within the training set, model parameters are estimated using 10-fold cross-validation. Within each fold, data from 64 patients set the model parameters, and data from 11 patients test the model performance.
The model parameters with the best prediction performance can be used as initial parameters to train the model on all 75 patients in the training set. The 13 patients in the validation set, which have been set aside throughout the procedure, are now used to evaluate the model’s prediction performance. The importance of the marker towards the prediction on the validation patient data is collected from the beta coefficients of the logistic regression model. The end result is a 200 bootstrap confidence interval of PGD prediction performance and importance for each of the clinical and protein markers controlling for the patient’s site-of-origin. 200 random patient splits were computed following this prediction paradigm for comparison to a random prediction distribution.
Confidence intervals were generated from predicted patient probabilities by taking 50 bootstraps and calculating the mean and 95% confidence interval. To estimate the prediction variance, Monte Carlo cross-validation (MCCV) was used. The PGD prediction probabilities were compared to the true PGD status to compute the area under the receiver operating characteristic curve (AUROC) and other metrics. Bootstrapping analysis (samples with replacement) resulted in a population distribution for prediction performances, and a permutation analysis was similarly performed, with random labeling of PGD status in patients, to generate and test prediction metrics from random PGD assignment. Differences were evaluated between the bootstrap and permutation distributions, as well as between the 2 bootstrap distributions, with the 2-sample Kolmogorov-Smirnov test. Statistics followed by bracket notation indicate the average statistic and its 95% confidence interval. The average statistic and standard errors were noted when reporting Student t-test results.
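The bootstrap step described above can be sketched as follows. The text states only that 50 bootstraps were taken and that a mean and 95% confidence interval were calculated; the percentile method used here for the interval, the function name, the seed, and the input values are assumptions for illustration.

```python
import random


def bootstrap_mean_ci(values, n_boot=50, alpha=0.05, seed=0):
    """Resample `values` with replacement n_boot times and return the mean
    of the bootstrap means plus a percentile (1 - alpha) interval."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_boot):
        sample = [rng.choice(values) for _ in values]  # sample with replacement
        means.append(sum(sample) / len(sample))
    means.sort()
    lower = means[int((alpha / 2) * (n_boot - 1))]
    upper = means[int(round((1 - alpha / 2) * (n_boot - 1)))]
    return sum(means) / n_boot, lower, upper


# Hypothetical per-split AUROC values.
performances = [0.2, 0.4, 0.6, 0.8, 0.9]
mean, lower, upper = bootstrap_mean_ci(performances, n_boot=50, seed=1)
```

A permutation analysis follows the same pattern, except that PGD labels are shuffled before each model fit so that the resulting distribution represents chance-level performance.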
KLKB1 ELISA in heart transplant patients: an enzyme-linked immunosorbent assay (ELISA) (Abcam) was used to assess KLKB1 protein concentration in a validation cohort of pre-transplant serum prospectively collected from 65 consecutive patients at CUIMC. To be able to compare ELISA-derived and mass spectrometry-derived protein expression, the patient cohort data were minimum-maximum normalized before application of the MCCV strategy for all predictions.
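Minimum-maximum normalization, which places ELISA concentrations and mass spectrometry expression values on a common 0-to-1 scale, can be sketched as follows. The function name, the handling of a constant input, and the example values are assumptions for illustration.

```python
def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumed convention: a constant cohort carries no ranking information.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical KLKB1 measurements for three patients.
scaled = min_max_normalize([2.0, 4.0, 6.0])
```

Because the scaling is computed within a cohort, values from different assays become comparable in rank and relative spacing, but not in absolute concentration.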
Patient clinical characteristics: in total, 88 patients who underwent heart transplantation between 2014 and 2016 at Cedars Sinai Medical Center (n = 43), Pitie Salpetriere University Hospital (n = 29) and Columbia University Irving Medical Center (n = 16) were used for the initial proteomic and clinical characteristic analysis (Table 1).
Table 1. Clinical Characteristics.
Recipient characteristics are given at the time of transplant unless otherwise specified. Significance was evaluated with a continuity-corrected chi-squared test for categorical characteristics and a t-test for continuous characteristics. Abbreviations: primary graft dysfunction (PGD), body mass index (BMI), pulmonary artery (PA), central venous pressure (CVP), pulmonary capillary wedge pressure (PCWP), international normalized ratio (INR), total bilirubin (TBili), and model for end-stage liver disease score (MELD).
There are 37 different pre-transplant clinical characteristics across all the patients, including PGD status (Table 2). Prior inotrope therapy significantly differed (linear model with and without site-of-origin p-values = 0.002 and 0.003) between PGD and non-PGD (Table 1).
Table 2. Baseline characteristics of patients.
In a multivariate model including all characteristics, only pre-transplant inotrope therapy associates with PGD (Table 3).
Table 3: Cellular enrichment of identified proteins.
Patient blood microvesicle proteomic characteristics: serum microvesicle protein spectra were obtained in at least triplicate for each patient (322 total replicates) (Figure 1A). The identified proteins were enriched in microvesicle and extracellular components (Table 3). Table 4 is sorted by area under the receiver operating characteristic curve (AUROC). The beta coefficients of the models were exponentiated to the odds shown in the table. The lower and upper bounds indicate the 95% confidence interval. The significance criteria were an average AUROC > 0.5, a Bonferroni-corrected p-value < 0.001, a beta coefficient 95% CI not including the null association, and a permutation beta coefficient 95% CI including the null association. Significant clinical characteristics are highlighted.
Table 4: Prediction statistics of significant protein markers and clinical characteristics.
Protein expression in the three patient cohorts (Figure 4) does not follow a normal distribution (omnibus test of normality p-values << 0.001). Figure 4 shows exosome protein expression distributions for the patient cohorts. The individual patient expression distributions in each cohort were superimposed to represent each patient’s individual contribution to the whole cohort protein expression distribution. The Columbia cohort was significantly different from Cedars-Sinai (Kolmogorov-Smirnov test p-value < 3.19E-08) and from Pitie Salpetriere (p-value = 8.70E-06). Protein expression was statistically different between the 2 retrospective patient cohorts (p-value = 0.030).
In total, 681 unique proteins were identified, with 345 proteins present in every patient cohort, of which 80 proteins were not identified in at least one patient (Figure 1B). There were 81 identified immunoglobulin proteins that were not included in the analysis. Additionally, three proteins did not have corresponding gene name annotations. A final set of 181 proteins, which were identified in every patient across all patient cohorts, was used in downstream analyses (Figure 1C).
Prediction of post-transplant PGD using pre-transplant clinical and protein markers: the prediction of post-transplant PGD in patients was investigated using clinical and protein markers derived prior to transplant. Monte Carlo cross-validation (MCCV; Figure 3) and permutation analysis were employed to calculate the predictive performance and significance of each clinical and protein marker in predicting PGD. Overall, the expression of all protein markers did not significantly outperform (AUROC 0.4119 ± 0.05473 vs 0.3751 ± 0.04712; independent 2-sample t-test p-value = 0.9147), nor was it more influential than (odds 1.3477 ± 1.3324 vs 1.0544 ± 0.2115; p-value = 0.1819), all clinical characteristics in predicting the post-transplant occurrence of PGD (Figure 5). Individually, 16 proteins and 1 clinical characteristic were significantly predictive of PGD occurrence (AUROC > 0.5, Bonferroni-corrected p-value < 0.001, beta coefficient 95% CI not including the null association, and permutation beta coefficient 95% CI including the null association). In Table 5, panels were considered significantly predictive when the performance upper bound of KLKB1 was lower than the lower bound of the two-marker panel. The performance coefficient of variation was calculated by taking the log base 10 of the ratio between the average performance across all and within each cohort and the variation between them.
Table 5: Two marker panels significantly outperforming top individual predictive marker KLKB1.
The most predictive protein marker was plasma kallikrein (KLKB1) (AUROC 0.6444 [0.6293, 0.6655]; odds 0.1959 [0.0592, 0.3663]), where decreased expression of KLKB1 was significantly predictive of PGD status. The next most predictive markers (AUROC > 0.6) were the proteins peroxiredoxin 2 (PRDX2), tropomyosin alpha-4 (TPM4), and myeloperoxidase (MPO), where increased expression of each was significantly predictive of PGD status (Table 5). With respect to clinical factors, the absence of pre-transplant inotrope therapy was significantly predictive of PGD on its own, albeit modestly (AUROC 0.5618 [0.5387, 0.5800]; average odds 0.4342 [0.3043, 0.6033]). Notably, the presence of mechanical support was not predictive (AUROC 0.4753 [0.4395, 0.4741]; odds 1.192 [1.000, 1.781]), nor did it attenuate the predictive performance of pre-transplant inotrope therapy towards PGD (Figure 6). Figure 5 shows that pre-transplant inotrope therapy can be predictive of PGD independent of a left ventricular assist device. Figure 5 shows the prediction of pre-transplant inotrope therapy, left ventricular assist device, and both clinical factors on post-transplant PGD.
Two marker panel PGD predictions: 136 pairwise combinations of the 17 significantly predictive clinical and protein markers were investigated (Figure 7A). Overall, panels of inotrope therapy and a protein had significantly higher performance than combinations of 2 proteins (AUROC 0.6505 ± 0.02980 vs 0.6070 ± 0.0454; p-value = 2.123E-4; Figure 7B). For combinations involving pre-transplant inotrope therapy, the addition of KLKB1 outperformed all other protein combinations (AUROC 0.7181 [0.7020, 0.7372]). Protein-protein marker panels containing KLKB1 outperformed all panels composed of other protein markers (t-test, p-values: 2.193E-13 to 6.634E-02; Figure 7C). The best performing panel overall and for each patient cohort was a combination of pre-transplant inotrope therapy and expression of KLKB1 protein (Figure 7D).
There were 52 marker panels significantly more predictive than the most predictive marker KLKB1 on its own.
Table 7. Enriched Pathways and Functions in PGD
Tables 6 and 7 describe the biological pathways where proteins expressed in PGD patients were significantly different than in patients without PGD. The difference would be enriched if the expression was higher in PGD patients and depleted if the expression was lower in PGD patients.
The panel of inotrope therapy and KLKB1 showed the least variation while maintaining high performance across all cohorts (95% AUROC CI above 0.7; Figure 7E).
PGD classifier performance: each panel’s predictions form a 2-marker classifier equation, as shown for the KLKB1 protein and inotrope therapy panel in Figure 7F. The classifier equation for inotrope therapy and KLKB1 is the sum of -0.9946 multiplied by a binary value of pre-transplant inotrope therapy (0 or 1) and -2.140 multiplied by the normalized pre-transplant KLKB1 expression. This equation demonstrates an inverse relationship between post-transplant PGD risk and either pre-transplant KLKB1 expression or inotrope therapy (or both). The PGD classifier has significantly increased performance compared to the markers on their own (Kolmogorov-Smirnov 2-sample test p-values < 2.165E-23; Figure 8 and Table 8).
Table 8. Performance of KLKB1, Inotrope Therapy, and Two-Marker Panel Predictive Performance Across and Within Patient Cohorts.
The disclosed prediction panel was compared to existing PGD predictors: the RADIAL score, the MELD score, and the CVP/PCWP ratio. The 2-marker panel significantly outperforms all composite scores by 50% on average (Figure 9; Kolmogorov-Smirnov 2-sample p-values < 2.165E-23).
Table 9. Performance comparison between existing PGD predictors and KLKB1 and Inotrope Therapy Two-Marker Panel.
Whole serum KLKB1 ELISA in PGD: a validation cohort of serum samples from 65 consecutive patients was prospectively collected on the day prior to a heart transplant at CUIMC. Whole serum was used for the KLKB1 ELISA to test the feasibility of a clinical test without microvesicle purification. Patients who had RV PGD or mechanical support for reasons other than PGD were excluded from the analysis. Potentially due to the small number of severe PGD cases (n = 3), there was no significant difference in average KLKB1 levels when comparing patients with severe PGD to patients without PGD (Mann-Whitney U test 19.81 ± 6.248 vs 45.796 ± 32.54; p-value = 0.0511). However, when patients with moderate PGD (n = 4), defined per ISHLT guidelines as moderate LV dysfunction requiring pharmacologic but not mechanical support, were added, KLKB1 levels were significantly lower (Mann-Whitney U test 20.44 ± 11.40 vs 45.796 ± 32.54; p-value = 0.0128; Figure 10A). The putative PGD classifier from the original proteomic data produces an AUROC of 0.7143 to predict moderate/severe PGD compared to patients who did not have PGD (Figure 10B). The incidence of PGD in this cohort more closely approximates the national PGD rate of 7.4%, and in this setting, the classifier was marked by a high sensitivity and negative predictive value but a low specificity and positive predictive value (Figure 10C and Table 10).
Table 10: Two marker panel equation performance on validation data. True Positive TP; True Negative TN; False Positive FP; False Negative FN; Positive Predictive Value PPV; Negative Predictive Value NPV.
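The quantities abbreviated in the caption of Table 10 are related by standard definitions, which can be sketched as follows. The function name and the counts in the usage lines are hypothetical and are not the values reported in Table 10.

```python
def classifier_metrics(tp, tn, fp, fn):
    """Standard screening metrics from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of PGD patients flagged
        "specificity": tn / (tn + fp),  # fraction of non-PGD patients cleared
        "ppv": tp / (tp + fp),          # flagged patients who develop PGD
        "npv": tn / (tn + fn),          # cleared patients who stay PGD-free
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classifier_metrics(tp=8, tn=6, fp=4, fn=2)
```

With a low event rate, a classifier can combine high sensitivity and NPV with a low PPV, because even a modest false positive rate produces many more false positives than true positives.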
Primary graft dysfunction pathway analysis and clinical tests in patients: to investigate PGD pathogenesis, a differential expression signature was calculated from proteomic data (262 proteins, including immunoglobulins, identified in all patients with corresponding gene names) (Figure 11). Figure 11 shows the differential protein analysis modeling scheme. The post-transplant PGD population risk of each marker is shown. From the 88 patients, sampling with replacement (e.g., over- and under-representing males and females in a population as shown here) was performed prior to fitting the model. The fitted model can estimate the population risk of a marker towards PGD occurrence, controlling for the patient’s site of origin. The random sampling and model fitting were repeated 200 times. The distribution of 200 bootstraps produces a confidence interval for PGD population risk for each marker. The average of each distribution is the population risk value for that marker. In the case of protein markers, the collection of the population risk values is the differential expression signature towards primary graft dysfunction.
Gene set enrichment analysis (GSEA) was used to investigate enriched pathways and functions from the differential protein signature. Six pathways were significantly enriched (Table 11; FDR < 0.2), and three pathways were depleted in patients with PGD (Table 12; FDR < 0.2; Figure 12A).
Table 11: Enriched Pathways and Functions in PGD.
The sets of proteins involved within each pathway and function in combination were evaluated to predict PGD in patients. The same MCCV methodology and the prediction significance thresholds defined above were used for this analysis. Out of 196 proteins, 8 proteins were found to be significantly predictive within at least 1 of the 136 pathways and functions: KLKB1, PRDX2, TPM4, MPO, CAT, HSPA5, IGHD, and IGLV2-11 (Table 12). Significant protein predictions within these pathways and functions (Figure 12B) revealed enrichment of processes related to inflammation, coagulation, and activation of the innate immune system. Downregulation of KLKB1 was identified in the activated complement and immune response pathways.
Table 12: Depleted Functions and Pathways in PGD.
Markers of inflammation were also analyzed in the validation cohort. There was a trend towards increased erythrocyte sedimentation rate (66.0 ± 43.20 vs. 33.70 ± 26.86 Mann-Whitney U test p-value = 0.07; Figure 12C). Protein (27.24 ± 19.51 vs 11.27 ± 25.54, p-value = 0.16; Figure 12D) and complement levels were not significantly altered (Figure 2). This analysis was hampered by a small number of severe PGD patients and wide confidence intervals. However, there does appear to be some laboratory trend towards increased inflammation corresponding with the results of the GSEA analysis.
Figure 13 shows a calibration curve for PGD prediction by a putative classifier on assessment data from 80 CUIMC patients. The predicted probabilities of PGD risk are plotted against the percent/number of PGD patients among the CUIMC assessment patients. Patients who had moderate or severe PGD are shown as enlarged triangles, and those who did not are shown as circles on the calibration curve. The probabilities are calculated by applying the logistic (inverse-logit) transform to the dot product between the assessment data (KLKB1 ELISA expression and pre-transplant inotrope therapy) and the putative PGD classifier.
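The probability calculation can be illustrated with a short sketch, assuming the transform that maps the dot product to a probability is the logistic (inverse-logit) function. The function name, the optional intercept, and any weights passed in are hypothetical; the disclosure's actual classifier coefficients are not reproduced here.

```python
import math


def pgd_probability(features, weights, intercept=0.0):
    """Map a patient's features to a probability between 0 and 1.

    features: e.g. [KLKB1 level, inotrope therapy as 0 or 1] (assumed order).
    weights:  classifier coefficients in the same order (hypothetical values).
    """
    z = intercept + sum(w * x for w, x in zip(weights, features))
    # Numerically stable logistic function: avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

With a negative weight on KLKB1, for example, a lower KLKB1 level yields a higher predicted probability, which matches the direction of the association reported in the text.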
Figures 14A-14C show principal components of protein expression and their association with covariates. Overlays of the site-of-origin (14A), Set (14B), and TMT-Tag (14C) covariates on protein expression variation, determined via principal components analysis (PCA) for patients, are shown. Set/TMT-Tag (i.e., experimental batch) is accounted for during protein identification and quantification, whereas each patient cohort was a separate experiment and is not accounted for during this process. As shown in the principal components analysis, cohort site of origin explains protein expression variation among patients and is therefore included as a covariate in the association and prediction analyses. PCA identifies the directions of greatest variability in the protein expression data, and this variability can come from non-biological sources. Patients are therefore projected onto these components to assess which non-biological sources of variability explain the observed differences in the protein expression data.
Figures 15A-15B show the correlation between unadjusted and adjusted performances of individual markers and two-marker panels. Model specifications with and without covariate adjustment (i.e., cohort site-of-origin) are compared for the individual (15A) and two-marker panel (15B) predictions. The marker prediction specifications did not include site-of-origin as a covariate so that the putative classifier equations could be translated easily to new patient data. When covariate adjustment was included, the average AUROC performance was highly correlated with the unadjusted performance, suggesting minimal confounding by site and supporting the translation of the classifier equations to new patient data regardless of site. This analysis was performed to provide evidence for using simpler and more interpretable machine learning models that do not account for patient site of origin.
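One way to quantify how closely adjusted and unadjusted performances track each other is a Pearson correlation over paired average AUROC values, one pair per marker or panel. The sketch below assumes that choice of statistic; the text states only that the performances are highly correlated, and the function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences, e.g. the
    unadjusted and site-adjusted average AUROC of each marker."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)
```

A value close to 1 would indicate that adding site of origin as a covariate changes the ranking and magnitude of marker performance very little.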
Pre-heart transplant recipient clinical and proteomic markers predictive of post-transplant PGD were identified using a data-driven methodology to generate a clinically interpretable PGD classifier. Machine learning and statistical techniques were used to mitigate confounding in biological enrichment analyses and to improve predictive accuracy with a modest population size. Reduction in KLKB1 was the strongest predictor of PGD, both by itself and in combination with other markers. KLKB1 is a serine protease that controls the activation of both inflammation and coagulation in what is known as the kallikrein-kinin system (KKS). In the inflammatory response, KLKB1 converts high molecular weight kininogen into bradykinin, stimulating the release of nitric oxide and prostacyclin and causing vasodilation and increased vascular permeability. It also acts as a neutrophil chemoattractant, causing degranulation. Evaluations of the KKS in patients with sepsis, a markedly inflammatory state, demonstrated increased KKS activity, characterized by decreased levels of plasma kallikrein, likely due to consumption. Decreases in KLKB1 have been noted in typhoid fever, ARDS, cardiopulmonary bypass, and in normal volunteers infused with gram-negative endotoxin. Similarly, in animal models of inflammatory bowel disease and inflammatory arthritis, plasma kallikrein levels were markedly reduced.
Other predictive proteins identified are likewise involved in either inflammation or innate immunity, including PRDX2, MPO, PGLYRP2, and DEFA1. Similarly, enrichment analysis of protein expression differences demonstrated several upregulated biological processes, including inflammatory and immune pathways, in patients prior to PGD. Laboratory tests in the validation cohort trended towards increased inflammation, although the differences were not significant. It remains to be seen whether this inflammatory signature is purely a biomarker or contributes to PGD and, importantly, whether modifying this state can have an impact on the evolution of PGD.
The lack of inotrope therapy was predictive of PGD, in contrast to prior analyses, which demonstrated that the presence of inotrope therapy was associated with PGD. Pre-transplant inotrope therapy and durable mechanical support (such as LVAD) are mutually exclusive prior to transplant, and mechanical support has been associated with PGD in prior studies. However, mechanical support was not significantly predictive of PGD in the present analyses and did not interact with inotrope therapy in prediction models. Whether inotrope therapy is an actual driver of protection from PGD or an epiphenomenal marker remains to be explored. There are clear differences in medical therapy, anticoagulation, and mechanical support between patients receiving and not receiving inotrope therapy (Table 10).
Table 10. Clinical characteristic population associations to PGD
Integrating both proteomic and clinical variables into one model demonstrated that combinations of proteins and clinical characteristics can yield increased classification power. KLKB1 combinations resulted in the greatest classification performance. Interestingly, although inotrope therapy alone demonstrated modest prediction, its combination with KLKB1 resulted in the greatest increase in classification power when compared to the combination of KLKB1 and other top-performing proteins. Notably, this panel outperforms other composite scores and clinical variables such as the RADIAL score, which demonstrated low performance in all three cohorts.
Whether the proteomic results were driven by a specific microvesicular process or reflected the overall serum milieu was tested in the validation ELISA cohort. A classifier using KLKB1 and inotrope therapy could not be generated from the ELISA samples themselves because of the paucity of PGD samples in that cohort. However, the proteomics-derived classifier generated a similar AUROC on whole serum as it did in the original microvesicle proteomic cohort. At the whole serum level, in a population whose PGD incidence closely mirrored national rates, the classifier performed essentially as a rule-out test with a very high negative predictive value.
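To make the rule-out interpretation concrete, the sketch below computes negative predictive value alongside the other standard confusion-matrix summaries from paired predictions and outcomes. The function name and return format are hypothetical, and no cohort data from the disclosure are used.

```python
def predictive_values(predicted, actual):
    """Summarize binary predictions against observed outcomes.

    predicted, actual: equal-length sequences of booleans (True = PGD).
    Returns a dict; a value is None when its denominator is zero.
    """
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        # Of patients predicted not to develop PGD, the share who did not.
        "npv": ratio(tn, tn + fn),
        "ppv": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }
```

Because negative predictive value depends on how common PGD is in the tested population, a high value is easier to achieve when incidence is low, which is one reason a classifier can behave as a rule-out test in such a cohort.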
The disclosed classifier performed well when absolute serum KLKB1 values measured by ELISA were normalized. With only 3 cases of severe PGD in this cohort, which approximates the normal incidence of PGD, KLKB1 trended towards a significant decrease in PGD patients (p = 0.051). Looking forward to clinical utility, PGD risk stratification can be performed in the outpatient setting as part of an overall pre-transplant evaluation. The disclosed subject matter can be used to understand whether a patient's risk is static or evolves and whether changes in that risk are associated with clinical status. The optimistic potential here is to use this classifier to evaluate therapies that can alter future PGD risk and improve heart transplant outcomes.
* * *
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. In case of conflict, the present document, including definitions, will control. Certain methods and materials are described below, although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the presently disclosed subject matter. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. The materials, methods, and examples disclosed herein are illustrative only and not intended to be limiting. While it will become apparent that the subject matter herein described is well calculated to achieve the benefits and advantages set forth above, the presently disclosed subject matter is not to be limited in scope by the specific embodiments described herein. It will be appreciated that the disclosed subject matter is susceptible to modification, variation, and change without departing from the spirit thereof. Those skilled in the art will recognize or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments described herein. Such equivalents are intended to be encompassed by the following claims.

Claims

WHAT IS CLAIMED IS:
1. A method for identifying risk of primary graft dysfunction (PGD) of a subject comprising: collecting a sample of the subject; measuring a level of a PGD marker from the sample, wherein the PGD marker comprises plasma kallikrein (KLKB1); providing a PGD risk value that is quantified based on the level of the PGD marker using an adaptive Monte Carlo cross-validation (MCCV) model; and identifying the risk of PGD based on the PGD risk value.
2. The method of claim 1, further comprising assessing an effect of a therapy on the heart transplant by estimating the PGD risk value of the subject, wherein the subject receives the therapy before or after the assessing.
3. The method of claim 1, further comprising identifying a clinical variable of the subject, wherein the clinical variable comprises a medical history of the subject.
4. The method of claim 3, wherein the medical history of the one subject comprises a pre-transplant inotrope therapy.
5. The method of claim 1, further comprising measuring a level of an additional marker from the sample, wherein the additional marker is selected from the group consisting of proteins peroxiredoxin 2 (PRDX2), tropomyosin alpha-4 (TPM4), myeloperoxidase (MPO), PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, and combinations thereof.
6. The method of claim 5, wherein the PGD risk value is quantified based on the level of the PGD marker and the additional marker.
7. The method of claim 1, further comprising providing the adaptive MCCV model with a training set for machine learning, wherein the adaptive MCCV model is a continuously evolving model based on the training set.
8. The method of claim 1, further comprising providing an additional therapy to the subject based on the PGD risk value.
9. The method of claim 8, wherein the additional therapy comprises KLKB1 activators, anti-inflammatory agents, or combinations thereof.
10. A system for identifying risk of primary graft dysfunction (PGD) of a subject comprising: one or more processors; and one or more computer-readable non-transitory storage media coupled to one or more of the processors and comprising instructions operable when executed by one or more of the processors to cause the system to: collect a sample of the subject; measure a level of a PGD marker from the sample, wherein the PGD marker comprises plasma kallikrein (KLKB1); provide a PGD risk value that is quantified based on the level of the PGD marker using an adaptive Monte Carlo cross-validation (MCCV) model; and identify the risk of PGD based on the PGD risk value.
11. The system of claim 10, wherein the processor is configured to assess an effect of a therapy on the heart transplant by estimating the PGD risk value of the subject, wherein the subject receives the therapy before or after the assessing.
12. The system of claim 10, wherein the processor is configured to identify a clinical variable of the subject, wherein the clinical variable comprises a medical history of the subject.
13. The system of claim 12, wherein the medical history of the one subject comprises a pre-transplant inotrope therapy.
14. The system of claim 10, wherein the processor is configured to measure a level of an additional marker from the sample, wherein the additional marker is selected from the group consisting of proteins peroxiredoxin 2 (PRDX2), tropomyosin alpha-4 (TPM4), myeloperoxidase (MPO), PGLYRP2, DEFA1, DEFA1B, LDHB, F2, FCGBP, CAT, CFHR5, HIST1H4, GAPDH, LTF, ADIPOQ, HSPA5, and combinations thereof.
15. The system of claim 14, wherein the PGD risk value is quantified based on the level of the PGD marker and the additional marker.
16. The system of claim 10, wherein the processor is configured to provide the adaptive MCCV model with a training set for machine learning, wherein the adaptive MCCV model is a continuously evolving model based on the training set.
17. The system of claim 10, the system is configured to provide an additional therapy to the subject based on the PGD risk value.
18. The system of claim 17, wherein the additional therapy comprises KLKB1 activators, anti-inflammatory agents, or combinations thereof.
PCT/US2021/050465 2020-09-15 2021-09-15 Systems and methods for predicting graft dysfunction with exosome proteins WO2022060842A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/180,991 US20230273210A1 (en) 2020-09-15 2023-03-09 Systems and methods for predicting graft dysfunction with exosome proteins

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063078672P 2020-09-15 2020-09-15
US63/078,672 2020-09-15

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/180,991 Continuation US20230273210A1 (en) 2020-09-15 2023-03-09 Systems and methods for predicting graft dysfunction with exosome proteins

Publications (1)

Publication Number Publication Date
WO2022060842A1 (en) 2022-03-24

Family

ID=80776367

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/050465 WO2022060842A1 (en) 2020-09-15 2021-09-15 Systems and methods for predicting graft dysfunction with exosome proteins

Country Status (1)

Country Link
WO (1) WO2022060842A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024081740A1 (en) * 2022-10-13 2024-04-18 Somalogic Operating Co., Inc. Systems and methods for validation of proteomic models

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130029873A1 (en) * 2010-04-12 2013-01-31 University Health Network Methods and compositions for diagnosing pulmonary fibrosis subtypes and assessing the risk of primary graft dysfunction after lung transplantation
US9304137B2 (en) * 2011-12-21 2016-04-05 Integrated Diagnostics, Inc. Compositions, methods and kits for diagnosis of lung cancer
US10689346B2 (en) * 2014-03-07 2020-06-23 Biocryst Pharmaceuticals, Inc. Human plasma kallikrein inhibitors
WO2016073768A1 (en) * 2014-11-05 2016-05-12 Veracyte, Inc. Systems and methods of diagnosing idiopathic pulmonary fibrosis on transbronchial biopsies using machine learning and high dimensional transcriptional data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SINGH: "Primary graft dysfunction after heart transplantation: a thorn amongst the roses", NCBI, 25 April 2019 (2019-04-25), pages 805 - 820, XP037158498 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21870136

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21870136

Country of ref document: EP

Kind code of ref document: A1