WO2023224913A1 - Integrated host-microbe metagenomics of cell-free nucleic acid for sepsis diagnosis - Google Patents

Integrated host-microbe metagenomics of cell-free nucleic acid for sepsis diagnosis

Info

Publication number
WO2023224913A1
Authority
WO
WIPO (PCT)
Prior art keywords
sepsis
rna
gene
determining
genes
Prior art date
Application number
PCT/US2023/022245
Other languages
French (fr)
Inventor
Charles R. LANGELIER
Katrina KALANTAR
Lucile P.A. NEYTON
Carolyn S. CALFEE
Original Assignee
Cz Biohub Sf, Llc
Chan Zuckerberg Initiative Foundation
The Regents Of The University Of California
Priority date
Filing date
Publication date
Application filed by Cz Biohub Sf, Llc, Chan Zuckerberg Initiative Foundation, The Regents Of The University Of California
Publication of WO2023224913A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria

Definitions

  • antibiotic treatment often remains empiric rather than pathogen-targeted, with clinical decision-making based on epidemiological information rather than individual patient data.
  • clinicians often continue empiric antimicrobials despite negative microbiologic testing for fear of harming patients in the setting of falsely negative results. Both scenarios lead to antimicrobial overuse and misuse, which contributes to treatment failures, opportunistic infections such as C. difficile colitis, and the emergence of drug-resistant organisms (Baur, et al., The Lancet Infectious Diseases 17(9), 990-1001, (2017)).
  • transcriptional profiling has required isolating peripheral-blood mononuclear cells, or stabilizing whole blood in specialized collection tubes, and it has remained unknown whether cell-free plasma could yield informative gene expression data for infectious disease diagnosis.
  • a single-sample metagenomic approach combining host transcriptional profiling with unbiased pathogen detection was developed to improve lower respiratory tract infection diagnosis (Langelier, et al., Proc Natl Acad Sci USA 115, E12353-E12362 (2018)).
  • sepsis provides an additional application of an integrated host-microbe metagenomics approach.
  • described herein are methods for predicting the likelihood of sepsis by determining the expression profile of a panel of host genes that undergo quantitative changes in sepsis.
  • the methods comprise determining the expression profile of host genes that undergo quantitative changes in viral sepsis.
  • determination of sepsis risk further comprises determining the microbial mass and identifying the pathogen.
  • the methods provide criteria to rule out sepsis.
  • detecting RNA in a cell-free RNA sample from the human subject of each member of a gene panel, wherein the gene panel comprises at least two members, or at least three members, selected from the group consisting of NEDD4, KDM3A; GOLGA2; TBC1D8; LRRK1; MBP; C2CD3; SOS1; MFN1; ATE1; MOSPD2; SMARCA5; INTS4; P4HA2; LTBP3; SZT2; ADAP1; ORC6; KIF1B; PLTP; RAB29; GYG1; SULT1B1; ATP6V1A; ZNF672; PPP1R12B; IPO7; NAA10; GAPVD1; PSMA4; ARID3A; HBA1; VPS35; RPA1; GALNT10;
  • detecting RNA comprises measuring a level of RNA for at least 10 members, or at least 20 members, of the gene panel. In some embodiments, detecting RNA comprises measuring a level of RNA for at least 50 members of the gene panel. In some embodiments, members of the gene panel are selected from the group consisting of MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4.
  • determining the quantity of differential gene expression comprises an amplification reaction for each member of the gene panel. In some embodiments, determining the quantity of differential gene expression comprises massively parallel sequencing. In some embodiments, quantification is performed by digital PCR. In some embodiments, the probability score for classifying the human subject as having an increased likelihood of sepsis is 0.5 or greater.
  • detecting RNA in a cell-free RNA sample from the human subject of each member of a gene panel, wherein the gene panel comprises at least two members, or at least three members, selected from the group consisting of: PSME3; OTUB1; RBPJ; GPX3; ERBIN; GABARAPL1; MZT2A; JMJD6; FAM214A; ZC3H11A; CACUL1; TUBG1; TRIM69; LST1; ZNF585B; RBM6; DHX29; SUGP1; SUDS3; CREB5; DYNC1H1; STXBP3; ZNF467; RAPGEF6; SIPA1; RPL7L1; CD2AP; ZNF101; CASP8AP2; CDR2; COP1; ARFGEF1; SLC30A
  • detecting RNA comprises measuring the level of RNA for at least 10 members, or at least 20 members, or at least 30 members, or at least 40 members of the gene panel.
  • members of the gene panel are selected from the group consisting of OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1.
  • quantification comprises an amplification reaction.
  • quantification is performed by massively parallel sequencing.
  • the probability score for classifying the human subject as having an increased likelihood of viral sepsis is 0.9 or greater.
  • a method of determining likelihood of sepsis in a subject comprising (a) quantifying microbial mass in a serum or plasma sample from a patient and (b) determining whether a predominant pathogen is present, the method comprising: (a) adding a known amount of calibration nucleic acids to a cfDNA preparation obtained from the serum or plasma sample; (b) sequencing a library generated from the cfDNA preparation; (c) aligning sequences obtained from step (b) to sequences present in a database comprising microbial sequences to determine sequence reads that map to a microbial sequence in the database; (d) determining a ratio of a first amount of total sequence reads that correspond to the calibration nucleic acids and a second amount of all microbial sequence reads in step (c); and (f) determining a total microbial mass from the known
  • the specified mass is 20 pg.
  • described herein is a method of evaluating likelihood of sepsis, comprising performing the methods described in the preceding three paragraphs, wherein a patient has an increased likelihood of sepsis when (i) the probability score determined in claim 1 exceeds the threshold value for general sepsis, (ii) the probability score determined in claim 9 exceeds the threshold value for viral sepsis, or (iii) the total microbial mass determined in claim 15 is greater than the specified mass and the predominant pathogen exists.
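The three-way decision rule in the preceding paragraph can be sketched in a few lines. This is a minimal illustration, not the claimed implementation: it assumes the example thresholds given elsewhere in this disclosure (0.5 for the general sepsis score, 0.9 for the viral sepsis score, 20 pg for microbial mass), and the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass

# Example thresholds taken from this disclosure; the names are illustrative.
GENERAL_SEPSIS_THRESHOLD = 0.5   # general sepsis probability score
VIRAL_SEPSIS_THRESHOLD = 0.9     # viral sepsis probability score
SPECIFIED_MASS_PG = 20.0         # specified microbial mass, in picograms


@dataclass
class SampleResult:
    """Per-sample outputs of the three upstream methods (hypothetical container)."""
    general_score: float         # host classifier probability, general sepsis
    viral_score: float           # host classifier probability, viral sepsis
    microbial_mass_pg: float     # total microbial mass from cfDNA sequencing
    predominant_pathogen: bool   # True if a predominant pathogen was identified


def increased_likelihood_of_sepsis(r: SampleResult) -> bool:
    """Apply criteria (i), (ii), and (iii); any one of them is sufficient."""
    general = r.general_score > GENERAL_SEPSIS_THRESHOLD
    viral = r.viral_score > VIRAL_SEPSIS_THRESHOLD
    # Criterion (iii) needs BOTH a mass above the cutoff and a predominant pathogen.
    microbial = r.microbial_mass_pg > SPECIFIED_MASS_PG and r.predominant_pathogen
    return general or viral or microbial
```

Strict inequalities are used here because the paragraph speaks of a score that "exceeds" its threshold; an implementation following the "0.5 or greater" wording would use `>=` instead.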
  • FIG.1A provides a study flow diagram. Patients studied were enrolled in the Early Assessment of Renal and Lung Injury (EARLI) cohort. Sepsis adjudication, based on ≥2 systemic inflammatory response syndrome (SIRS) criteria plus clinical suspicion of infection, was used to delineate 5 patient subgroups. Following quality control (QC), whole blood underwent RNA-seq and cell-free (cf)-RNA and DNA from plasma underwent RNA-seq and DNA-seq. FIG.1B shows analytic approaches.
  • bSVM bagged support vector machine
  • FIG.3A-3D Cell-free plasma metagenomics for detecting sepsis pathogens.
  • 3A Microbial DNA biomass differences between sepsis adjudication groups.
  • 3B Graphical depiction of rules-based model for sepsis pathogen detection that identifies established pathogens with disproportionately high abundance compared to other commensal and environmental microbes in the sample.
  • 3C Concordance between cf-plasma DNA-seq for detecting bacterial pathogens in Sepsis BSI patients with bacterial bloodstream infections compared to a gold standard of culture.
  • 3D Sensitivity of plasma DNA-seq for detecting bacterial pathogens, or plasma RNA-seq for detecting viral pathogens, in Sepsis non-BSI patients with sepsis from peripheral sites of infection.
  • FIG.5A-5D Integrated host-microbe mNGS model for sepsis diagnosis from cf-plasma. Host criteria for positivity can be met by a sepsis transcriptomic classifier probability > 0.5 (bars shown in graphs, dotted line).
  • Microbial criteria can be met based on either: 1) detection of a pathogen by mNGS and a sample microbial mass > 20 pg (gray bars), or 2) viral transcriptomic classifier probability > 0.9 (filled circles, dotted line).
  • Host and microbial metrics are highlighted for patients with sepsis due to 5A) bloodstream infections (Sepsis BSI), 5B) peripheral infection (Sepsis non-BSI), 5C) patients with non-infectious critical illness (No-Sepsis), and 5D) patients with suspected sepsis but negative microbiological testing (Sepsis suspected) and patients with indeterminate sepsis status (Indeterm).
  • Cross: sepsis positive based on model.
  • FIG.6 is a flowchart illustrating a method of measuring the expression levels of host gene markers described herein to evaluate general sepsis risk in a subject according to embodiments of the present invention.
  • FIG.7 is a flowchart illustrating a method of measuring the expression levels of host gene markers described herein to evaluate viral sepsis risk in a subject according to embodiments of the present invention.
  • FIG.8 is a flowchart illustrating a method of determining microbial mass and identifying a sepsis pathogen to evaluate sepsis risk in a subject according to embodiments of the present invention.
  • FIG.9 illustrates a measurement system 900 according to an embodiment of the present disclosure.
  • TERMS As used herein, the following terms have the meanings ascribed to them unless specified otherwise.
  • Sepsis is based on the following criteria: a proven or suspected infection in combination with at least 2 systemic inflammatory response syndrome (SIRS) criteria and persistent hypotension (defined as a mean arterial pressure below 60 mm Hg, a systolic blood pressure below 90 mm Hg or a decrease in systolic blood pressure of at least 40 mm Hg) despite adequate fluid resuscitation.
  • SIRS systemic inflammatory response syndrome
  • a patient evaluated for sepsis has altered mental status, a systolic blood pressure ≤ 100 mm Hg or respiratory rate ≥ 22/min.
  • Bacterial infections are the most common cause of sepsis, but fungal, viral, and protozoan infections can also lead to sepsis.
  • “general” sepsis refers to sepsis arising from infection with any of these pathogens. Common locations for a primary infection leading to sepsis include, but are not limited to, lungs, brain, urinary tract, skin, and abdominal organs.
  • RNA sample refers to a nucleic acid sample comprising extracellular RNA that is recoverable from a non-cellular fraction of a sample and includes fragments of full-length RNA transcripts. In typical embodiments, the sample is from whole blood processed to remove cells, e.g., a plasma or serum sample.
  • a “cell-free nucleic acid sample” refers to either cfRNA or cell-free DNA (cfDNA).
  • determining refers to quantitative determinations.
  • determining a quantity of differential gene expression for each member of a gene panel refers to determining the gene expression level of a gene in cfRNA from a test sample relative to a control expression level.
  • the control expression level is obtained from a population of subjects.
  • a control population of subjects comprises subjects who do not have a clinical symptom of sepsis.
  • a control population of subjects comprises subjects who have sepsis that arises from infection with a bacterial, fungal, or protozoal microorganism.
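As a hedged illustration of "differential gene expression relative to a control expression level", the sketch below compares one gene's level in a test sample with the mean level in a control population. The log2 ratio and the pseudocount are assumptions chosen for illustration; the text above does not prescribe a particular statistic, and the function name is hypothetical.

```python
import math
from statistics import mean
from typing import Sequence


def log2_fold_change(test_level: float,
                     control_levels: Sequence[float],
                     pseudocount: float = 1.0) -> float:
    """Log2 ratio of a test-sample expression level to the control mean.

    The pseudocount (an assumption, not from the source) avoids division by
    zero and log(0) when a transcript is undetected.
    """
    control_mean = mean(control_levels)  # raises StatisticsError if empty
    return math.log2((test_level + pseudocount) / (control_mean + pseudocount))
```

A positive value indicates a higher level in the test sample than in the control population, a negative value a lower level, and zero no difference.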
  • the terms “cutoff” and “threshold” refer to predetermined numbers used in an operation.
  • a threshold value may be a value above or below which a particular classification applies. Either of these terms can be used in either of these contexts.
  • a cutoff or threshold may be “a reference value” or derived from a reference value that is representative of a particular classification or discriminates between two or more classifications. Such a reference value can be determined in various ways, as will be appreciated by the skilled person.
  • metrics can be determined for two different groups of subjects with different known classifications, and a reference value can be selected as representative of one classification (e.g., a mean) or a value that is between two clusters of the metrics (e.g., chosen to obtain a desired sensitivity and specificity).
  • a reference value can be determined based on statistical simulations of samples.
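One way to choose a cutoff between two groups with known classifications, "to obtain a desired sensitivity and specificity", is to scan candidate values and keep the highest one that still meets a sensitivity target. The sketch below is illustrative only: the selection rule, the `>=` convention for calling a sample positive, and all names are assumptions rather than the method used by the inventors.

```python
from typing import Sequence, Tuple


def cutoff_for_sensitivity(positive_scores: Sequence[float],
                           negative_scores: Sequence[float],
                           min_sensitivity: float) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity).

    A sample is called positive when its score is >= cutoff. Candidates are
    scanned from high to low, so the first cutoff that meets the sensitivity
    target is also the one with the best specificity among those that do.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    candidates = sorted(set(positive_scores) | set(negative_scores), reverse=True)
    for cutoff in candidates:
        sensitivity = sum(s >= cutoff for s in positive_scores) / len(positive_scores)
        if sensitivity >= min_sensitivity:
            specificity = sum(s < cutoff for s in negative_scores) / len(negative_scores)
            return cutoff, sensitivity, specificity
    raise ValueError("no cutoff meets the requested sensitivity")
```

Raising the sensitivity target pushes the cutoff lower and generally costs specificity, which is the trade-off the paragraph above alludes to.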
  • the term “amount” or “level” of cfRNA expressed by a gene refers to the quantity of copies of an RNA transcript being assayed, including fragments of full-length transcripts that can be unambiguously identified as fragments of the transcript being assayed.
  • Such quantity may be expressed as the total quantity of the RNA, in relative terms, e.g., compared to the level present in a control cfRNA sample, or as a concentration, e.g., copy number per milliliter, of the RNA in the sample.
  • expression level refers to the amount of an RNA transcript, e.g., an mRNA transcript, of the gene.
  • host gene expression refers to the amount of cell-free RNA in a cell-free nucleic acid sample from a subject that is expressed by a gene originating from the host, i.e., the subject, as opposed to expression of a microbial, e.g., bacterial, viral, or fungal, gene.
  • microbial e.g., bacterial, viral, or fungal
  • Human genes are typically referred to herein using the official symbol and official nomenclature for the human gene as assigned by the HUGO Gene Nomenclature Committee, when HUGO nomenclature is available.
  • an individual gene as designated herein may also have alternative designations, e.g., as indicated in the HGNC database.
  • the term “signature gene” refers to a gene whose expression is correlated with sepsis.
  • a “gene panel” refers to a collection of such signature genes for which gene expression scores are generated and used to provide a risk/likelihood score for sepsis. Reference to the gene by name includes any human allelic variant or splice variant encoded by the gene.
  • nucleic acid or “polynucleotide” as used herein refers to a deoxyribonucleotide or ribonucleotide in either single- or double-stranded form.
  • primers or probes encompasses nucleic acids containing known analogues of natural nucleotides which have similar or improved binding properties, for the purposes desired, as the reference nucleic acid; and nucleic-acid-like structures with synthetic backbones.
  • nucleic acids or polypeptide sequences refer to two or more sequences or subsequences that are the same (“identical”) or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., at least about 70% identity, at least about 75% identity, at least 80% identity, at least about 90% identity, preferably at least about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity) over the entire sequence of a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region. Methods of alignment of sequences for comparison are well-known in the art.
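For readers who want a concrete picture of "percentage identity over a designated region", the sketch below computes it for two sequences that have already been aligned (gaps written as `-`). Counting gap columns as mismatches is one common convention and is an assumption here; the alignment step itself is not shown, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Inputs must be pre-aligned and of equal length, with '-' marking gaps.
    Columns where both sequences have a gap are ignored; columns where only
    one has a gap count as mismatches (an illustrative convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```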
  • Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci.
  • microbial mass refers to the microbial biomass present in a sample, preferably determined by metagenomic sequencing of a cell-free nucleic acid sample, e.g., cfDNA, obtained from a patient.
  • “microbial mass” is calculated based on the ratio of (i) the number of sequence reads corresponding to sequences representing “spike-in” calibration nucleic acids added in a known amount to the cell-free nucleic acid sample and (ii) the number of sequence reads in the nucleic acid that are identified by alignment to one or more sequence databases as being of microbial, e.g., bacterial, fungal, or viral, origin.
  • the calibration nucleic acids may be any nucleic acids that have a known sequence.
  • the calibration nucleic acids may have different sequences, each of which is known; a calibration sample can have a known concentration of the calibration nucleic acids.
  • the calibration nucleic acids would not occur in test samples, at least not in appreciable amounts.
  • the calibration nucleic acids can form a calibration genome to which sequences are aligned, e.g., to identify them as calibration nucleic acids.
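The spike-in calculation described above can be illustrated with a short worked example. This sketch assumes that read counts scale linearly with input nucleic acid mass, so that the known spike-in mass can be scaled by the ratio of microbial reads to spike-in reads; the exact formula is not spelled out in this section, and the function name and the numbers in the usage note are hypothetical.

```python
def total_microbial_mass_pg(spike_in_mass_pg: float,
                            spike_in_reads: int,
                            microbial_reads: int) -> float:
    """Estimate total microbial mass (pg) from a known spike-in mass.

    Assumes reads are proportional to input mass, so
        microbial mass = spike-in mass * (microbial reads / spike-in reads).
    """
    if spike_in_reads <= 0:
        raise ValueError("spike-in reads must be positive to calibrate")
    if microbial_reads < 0:
        raise ValueError("microbial reads cannot be negative")
    return spike_in_mass_pg * microbial_reads / spike_in_reads
```

For instance, with an illustrative 25 pg spike-in that yields 5,000 reads and 8,000 reads mapping to microbes, the estimate is 25 × 8,000 / 5,000 = 40 pg, which would exceed a 20 pg cutoff.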
  • treatment typically refers to a clinical intervention, including multiple interventions over a period of time, to ameliorate at least one symptom of sepsis or otherwise slow progression. This includes alleviation of symptoms or diminishment of any direct or indirect pathological consequences of sepsis.
  • sepsis diagnostic assays including assays that combine host transcriptional profiling with pathogen identification.
  • mNGS metagenomic next generation sequencing
  • host and microbial features that distinguish microbiologically confirmed sepsis from non-infectious critical illness were identified. It was additionally determined that cell-free plasma nucleic acid can be used to profile both host and microbe for precision sepsis diagnosis. Accordingly, described herein are sepsis diagnostic assays that employ host transcriptional profiling, pathogen abundance, pathogen identification, and combinations of these assays to evaluate sepsis risk.
  • described herein are methods for predicting the likelihood of “general” sepsis in a patient based on transcriptional profiling in cell-free RNA (cfRNA) of host marker genes that are associated with general sepsis.
  • cfRNA cell-free RNA
  • “General” sepsis as used herein refers to sepsis arising from any microbial pathogen.
  • described herein are methods for predicting the likelihood of viral sepsis based on transcriptional profiling of cfRNA of host marker genes that are specifically associated with viral sepsis.
  • the disclosure describes determination of sepsis risk based on microbial mass and detection of a dominant pathogen based on cfDNA analysis from a plasma or serum sample.
  • Microbial mass determination can comprise sequencing of cfDNA from a sample obtained from a patient serum or plasma sample; and determining the microbial mass by weight, e.g., picograms, for all nucleic acids identified as originating from microbes based on alignments to one or more taxonomy databases.
  • Detection of a dominant pathogen can comprise identifying whether sequence reads that map to an established pathogen (e.g., a known bloodstream pathogen) are overrepresented (compared to commensal or contaminating microbes) in sequence data from cfDNA obtained from a serum or plasma sample.
  • a pathogen that is overrepresented is referred to herein as a “dominant” or “predominant” or a “disproportionately abundant” pathogen.
  • Determination of an abundance level of a microbial species can be based on the number of sequence reads (e.g., reads per million) that map to the microbe sequence. For each genus of microbes identified by mapping the sequence reads, the most abundant species, i.e., having the highest abundance level, in each genus can be selected and the selected species can be ranked by abundance level in sequential order.
  • a gap threshold can be determined, where the gap threshold is the abundance level at which the greatest difference in abundance level occurs between sequential microbes. It is also determined whether a species having an abundance level at or above the gap threshold is a known bloodstream pathogen, thereby identifying whether a predominant pathogen exists.
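The ranking and gap-threshold steps described above can be sketched as follows. This is one reading of the rule under stated assumptions, not the inventors' implementation: abundances are given in reads per million keyed by (genus, species), a sample with a single genus is treated as having that species above the gap, and the function names and pathogen list are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Abundance = Dict[Tuple[str, str], float]  # (genus, species) -> reads per million


def top_species_per_genus(abundance: Abundance) -> List[Tuple[str, float]]:
    """Keep the most abundant species in each genus, ranked high to low."""
    best: Dict[str, Tuple[str, float]] = {}
    for (genus, species), rpm in abundance.items():
        if genus not in best or rpm > best[genus][1]:
            best[genus] = (species, rpm)
    return sorted(best.values(), key=lambda item: item[1], reverse=True)


def predominant_pathogens(abundance: Abundance,
                          known_pathogens: Set[str]) -> List[str]:
    """Return known pathogens whose abundance is at or above the gap threshold."""
    ranked = top_species_per_genus(abundance)
    if len(ranked) < 2:
        # Assumption: with nothing to compare against, the lone species qualifies.
        above_gap = ranked
    else:
        # Differences between consecutive ranked abundances.
        gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
        cut = gaps.index(max(gaps))      # position of the largest drop
        gap_threshold = ranked[cut][1]   # abundance just above that drop
        above_gap = [item for item in ranked if item[1] >= gap_threshold]
    return [species for species, _ in above_gap if species in known_pathogens]
```

A non-empty result corresponds to "a predominant pathogen exists"; an empty result means either that no species stands out or that the species above the gap are not on the pathogen list.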
  • a patient is identified as likely to have sepsis based on the total microbial mass determined being greater than a specified mass, e.g., 20 pg, and that the predominant pathogen exists.
  • techniques can combine methods into an integrated host and microbe model for sepsis determination that can maximize accuracy (e.g., a negative predictive value) using sequencing of cell-free nucleic acids from a blood sample (e.g., serum or plasma) and pathogen detection, to define individuals likely to have sepsis.
  • a blood sample e.g., serum or plasma
  • cfRNA obtained from a plasma or serum sample from the subject is evaluated to determine the level of RNA encoded by each member of a panel of genes for which cfRNA levels are associated with sepsis relative to control levels from subjects that do not have clinical evidence of infection.
  • whole blood samples can be evaluated by RNA sequencing of RNA obtained from whole blood samples.
  • Signature genes identified by the inventors for evaluating cfRNA obtained from a plasma or serum sample from a patient to determine likelihood of general sepsis are described in part A of this section.
  • Signature genes for evaluating RNA from a whole blood sample to determine likelihood of general sepsis are described in part B. A.
  • sepsis risk is evaluated by determining the amount of RNA of each member of a panel of genes, or each member of a subset of the panel of genes, in cfRNA obtained from a serum or plasma sample from a patient suspected of having sepsis. In some embodiments, sepsis risk is evaluated by determining the amount of RNA of each member of a panel of genes, or each member of a subset of the panel of genes, in RNA obtained from whole blood sample from a patient suspected of having sepsis.
  • Such methods comprise quantifying the amount of RNA for each of a panel of genes associated with sepsis in a cfRNA sample obtained from plasma or serum from a human subject exhibiting one or more symptoms consistent with a diagnosis of sepsis.
  • at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90 or at least 95 genes selected from the following are quantified to determine expression levels: NEDD4, KDM3A; GOLGA2; TBC1D8; LRRK1; MBP; C2CD3; SOS1; MFN1; ATE1; MOSPD2; SMARCA5; INTS4; P4HA2; LTBP3; SZT2; ADAP1; ORC6; KIF1B; PLTP; RAB29; GYG1; SULT1B1; ATP
  • a panel evaluated for viral sepsis comprises at least two genes, or at least three, four, or five; or at least ten or more genes selected from NEDD4, KDM3A; GOLGA2; TBC1D8; LRRK1; MBP; C2CD3; SOS1; MFN1; ATE1; MOSPD2; SMARCA5; INTS4; P4HA2; LTBP3; SZT2; ADAP1; ORC6; KIF1B; PLTP; RAB29; GYG1; SULT1B1; ATP6V1A; ZNF672; PPP1R12B; IPO7; NAA10; GAPVD1; PSMA4; ARID3A; HBA1; VPS35; RPA1; GALNT10; CHD3; MYLK3; CD53; MSI2; DCUN1D4; CASP9; RPS27L; HBB; STXBP2; PADI2; HBA2; ST
  • the “ENSG” designation of each gene is shown in Table A.
  • the “ENSG” designation is based on ENSEMBL version 99. Table A. ENSEMBL designations.
  • Additional gene information including chromosome location, number of transcripts (e.g., splice variants) encoded by the gene that have been identified, and UniProt identification numbers for protein-encoding genes are available in the ENSEMBL entry.
  • reference to the gene by name includes variants, such as allelic variants, including SNP variants, splice variants, and the like.
  • the gene may have a sequence that is at least 85% identical to the sequence provided in the respective ENSEMBL entry.
  • the gene may have a sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% identical to the sequence provided in the ENSEMBL entry.
  • sepsis risk is determined by measuring expression levels of at least two, at least three, at least four, or at least five genes selected from a group consisting of MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4.
  • detection of sepsis risk comprises assessing expression levels in cfRNA of at least six, seven, eight, nine, or ten genes selected from MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4.
  • detection of sepsis risk comprises assessing expression levels of cfRNA of at least eleven, twelve, thirteen, fourteen, or fifteen genes selected from MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4.
  • detecting sepsis risk comprises assessing expression levels of cfRNA of at least sixteen, seventeen, eighteen, nineteen, or twenty genes selected from MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4.
  • risk determination comprises quantifying cfRNA for a subset of twenty one, twenty two, twenty three, twenty four, twenty five, twenty six, or twenty seven genes selected from MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4.
  • sepsis risk is determined by quantifying cfRNA expression of twenty eight genes: MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4.
  • a panel evaluated for viral sepsis comprises at least two genes, or at least three, four, or five; or at least 10 or more genes selected from MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4; and at least one additional gene that plays a role in neutrophil degranulation or the innate immune system.
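The "at least N genes selected from" language in the preceding paragraphs amounts to a set-membership check against the twenty-eight gene panel. The sketch below shows that check; the gene symbols are copied from the list above, while the function names and the example measurements are hypothetical.

```python
from typing import AbstractSet, Mapping, Set

# The twenty-eight gene subset named in this disclosure.
GENERAL_SEPSIS_PANEL_28 = frozenset({
    "MBP", "SOS1", "SMARCA5", "LTBP3", "PLTP", "ATP6V1A", "PSMA4",
    "ARID3A", "VPS35", "RPA1", "MYLK3", "DCUN1D4", "CASP9", "HBB",
    "C20orf24", "IFNAR1", "WIPI2", "MTMR14", "SULT1A1", "PLEC",
    "DNAJA3", "AC138969.1", "TNPO1", "SLC44A1", "IFITM3", "UBE2G2",
    "S100A9", "FIG4",
})


def panel_members_measured(measured: Mapping[str, float],
                           panel: AbstractSet[str] = GENERAL_SEPSIS_PANEL_28) -> Set[str]:
    """Gene symbols in `measured` that belong to the panel."""
    return {gene for gene in measured if gene in panel}


def meets_minimum_panel_size(measured: Mapping[str, float],
                             minimum: int,
                             panel: AbstractSet[str] = GENERAL_SEPSIS_PANEL_28) -> bool:
    """True if at least `minimum` panel genes were quantified."""
    return len(panel_members_measured(measured, panel)) >= minimum
```

A check like this would typically run before classification, so that a sample with too few quantified panel genes is flagged rather than scored.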
  • a gene panel for sepsis risk includes one or more genes upregulated in neutrophil degranulation and/or one or more genes upregulated in innate immune signaling. In some embodiments, a gene panel for sepsis risk includes one or more genes downregulated in translation and rRNA processing.
  • sepsis risk is evaluated by determining the amount of RNA of each member of a panel of genes, or each member of a subset of the panel of genes, in RNA obtained from whole blood sample from a patient suspected of having sepsis.
  • of at least two, at least three, at least four, or at least five genes selected from a group consisting of CCR1, RPS5, IFITM3, DSC2, RPS3A, AC084082.1, THOC6, ZNF639, ZCCHC4, EXT1, WDR49, CBX1, TDP2, MTERF2, PRPS1, DAAM1, NOG, CALCRL, IQCB1, MAIP1, TSPAN13, NDST3, Z97832.2, SLC6A19, RPS19BP1, MRI1, LSM12, MPZL1, SLC35E3, H6PD, SYTL2, ZNF468, TXNL4A, ORAI3, UBN2, SMYD4, NDUFA3, MRPL41, WDR77, ZNF862, ZNF616, ACTR8, CHST13, EMG1, METTL21A, MBLAC2, NUP88, EFCAB5, PIGW, GLCCI1, CFAP100, and SLITRK4.
  • detection of sepsis risk comprises assessing expression levels of at least six, seven, eight, nine, or ten genes selected from CCR1, RPS5, IFITM3, DSC2, RPS3A, AC084082.1, THOC6, ZNF639, ZCCHC4, EXT1, WDR49, CBX1, TDP2, MTERF2, PRPS1, DAAM1, NOG, CALCRL, IQCB1, MAIP1, TSPAN13, NDST3, Z97832.2, SLC6A19, RPS19BP1, MRI1, LSM12, MPZL1, SLC35E3, H6PD, SYTL2, ZNF468, TXNL4A, ORAI3, UBN2, SMYD4, NDUFA3, MRPL41, WDR77, ZNF862, ZNF616, ACTR8, CHST13, EMG1, METTL21A, MBLAC2, NUP88, EFCAB5, PIGW, GLCCI1, CFAP100, and SLITRK4.
  • detection of sepsis risk comprises assessing expression levels of whole blood RNA of at least eleven, twelve, thirteen, fourteen, or fifteen genes selected from CCR1, RPS5, IFITM3, DSC2, RPS3A, AC084082.1, THOC6, ZNF639, ZCCHC4, EXT1, WDR49, CBX1, TDP2, MTERF2, PRPS1, DAAM1, NOG, CALCRL, IQCB1, MAIP1, TSPAN13, NDST3, Z97832.2, SLC6A19, RPS19BP1, MRI1, LSM12, MPZL1, SLC35E3, H6PD, SYTL2, ZNF468, TXNL4A, ORAI3, UBN2, SMYD4, NDUFA3, MRPL41, WDR77, ZNF862, ZNF616, ACTR8, CHST13, EMG1, METTL21A, MBLAC2, NUP88, EFCAB5, PIGW, GLCCI1, CFAP100, and SLITRK4.
  • cfRNA obtained from a plasma or serum sample from the subject is evaluated to determine the level of RNA encoded by each member of a panel of genes for which cfRNA levels are associated with sepsis relative to control levels from subjects that have systemic infection or sepsis, but do not have clinically confirmed viral sepsis.
  • whole blood samples can be evaluated by sequencing of RNA obtained from whole blood samples.
  • Signature genes identified by the inventors for evaluating cfRNA obtained from a plasma or serum sample from a patient to determine likelihood of viral sepsis are described in part A of this section.
  • Signature genes for evaluating RNA from a whole blood sample to determine likelihood of viral sepsis are described in part B. A.
  • a cell-free sample from blood e.g., a serum or plasma sample
  • a host gene signature panel or a subset thereof, comprising genes identified as undergoing quantitative changes in a viral infection compared to a non-viral infection.
  • Such methods comprise quantifying the RNA level in a cfRNA sample obtained from a human subject exhibiting one or more symptoms consistent with a diagnosis of sepsis.
  • At least 5, at least 10, at least 20, at least 25, at least 30, at least 35, or at least 40 genes selected from the following are quantified to determine expression levels: PSME3; OTUB1; RBPJ; GPX3; ERBIN; GABARAPL1; MZT2A; JMJD6; FAM214A; ZC3H11A; CACUL1; TUBG1; TRIM69; LST1; ZNF585B; RBM6; DHX29; SUGP1; SUDS3; CREB5; DYNC1H1; STXBP3; ZNF467; RAPGEF6; SIPA1; RPL7L1; CD2AP; ZNF101; CASP8AP2; CDR2; COP1; ARFGEF1; SLC30A5; RNPEP; ZFX; STARD7; CALCOCO2; BORCS8; GRHPR; DOCK9; TAF2; RANBP1; ABL1; SREK1;
  • a panel evaluated for viral sepsis comprises at least two genes, or at least 5, 10, or 15 genes selected from PSME3; OTUB1; RBPJ; GPX3; ERBIN; GABARAPL1; MZT2A; JMJD6; FAM214A; ZC3H11A; CACUL1; TUBG1; TRIM69; LST1; ZNF585B; RBM6; DHX29; SUGP1; SUDS3; CREB5; DYNC1H1; STXBP3; ZNF467; RAPGEF6; SIPA1; RPL7L1; CD2AP; ZNF101; CASP8AP2; CDR2; COP1; ARFGEF1; SLC30A5; RNPEP; ZFX; STARD7; CALCOCO2; BORCS8; GRHPR; DOCK9; TAF2; RANBP1; ABL1; SREK1; and DGKA, and at least
  • the “ENSG” designation of each gene is shown in Table B.
  • the “ENSG” designation is based on ENSEMBL version 99.
  • Table B ENSEMBL designations.
  • ENSEMBL ID Gene Name Additional gene information, including chromosome location, number of transcripts (e.g., splice variants) encoded by the gene that have been identified, and UniProt identification numbers are available in the ENSEMBL entry. Reference to the gene by name includes variants, such as allelic variants, including SNP variants, splice variants, and the like.
  • sepsis risk is determined by measuring levels of at least two, at least three, at least four, or at least five genes selected from a group consisting of OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1.
  • detection of sepsis risk comprises assessing levels in cfRNA of at least six, seven, eight, nine, or ten genes selected from OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1.
  • detection of sepsis risk comprises assessing expression levels of cfRNA of at least eleven, twelve, thirteen, fourteen, or fifteen genes selected from OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1.
  • detecting viral sepsis risk comprises assessing expression levels of cfRNA of at least sixteen, seventeen, eighteen, nineteen, or twenty genes selected from OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1.
  • detecting viral sepsis risk comprises assessing expression levels of twenty genes OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1.
  • a panel evaluated for viral sepsis comprises at least two genes selected from OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1 and at least one additional gene that plays a role in responses to elevated platelet cytosolic Ca2+ or in interferon alpha/beta signaling, or that is a chemokine or chemokine receptor.
  • sepsis risk is determined by determining levels of RNA of each member of a panel of genes, or a subset thereof, in a whole blood sample.
  • the panel comprises at least two, at least three, at least four, or at least five genes selected from a group consisting of the genes listed in Table D.
  • detection of sepsis risk comprises assessing expression levels of at least six, seven, eight, nine, or ten genes selected from the genes listed in Table D.
  • detection of sepsis risk comprises assessing levels of whole blood RNA of at least eleven, twelve, thirteen, fourteen, or fifteen genes selected from the genes listed in Table D.
  • detecting sepsis risk comprises assessing levels of RNA of at least sixteen, seventeen, eighteen, nineteen, or twenty genes; or at least thirty, thirty-five, forty, or at least fifty genes; or at least sixty, seventy, or eighty or more genes listed in Table D.
  • the “ENSG” designation of each gene is shown in Table D.
  • RNA is obtained from a whole blood sample or a bodily fluid sample that does not contain cells.
  • cfRNA is isolated from a serum or plasma sample.
  • the RNA is processed to evaluate levels of RNA, e.g., one or more genes selected from the gene panels described herein present in the RNA sample, e.g., cfRNA from serum or plasma or RNA from a whole blood sample.
  • the sample is obtained within 24 hours, or within 48 hours, of admission of a patient to hospital; or within 24 hours, or within 48 hours, of when a patient is determined to be at risk of sepsis based on the clinical factors as described above.
  • cfRNA may be evaluated by nucleic acid sequencing.
  • the gene panel comprises at least two or three genes set forth in Table A or Table B.
  • a cfRNA preparation can be depleted of abundant sequences, such as mitochondrial or ribosomal RNA sequences, to enrich for coding transcripts.
  • the cell-free nucleic acid preparation for example, RNA preparation or cDNA transcribed from the RNA preparation
  • Sequencing technologies that can be used to evaluate RNA profiles, e.g., in cfRNA from a plasma or serum sample, include next generation sequencing platforms such as RNA-seq.
  • Illustrative sequencing platforms suitable for use according to the methods include, e.g., ILLUMINA® sequencing (e.g., HiSeq, MiSeq), SOLID® sequencing, ION TORRENT® sequencing, and SMRT® sequencing and those commercialized by Roche 454 Life Sciences (GS systems).
  • alternative methodology for assessing RNA levels may be employed.
  • the level of RNA in a cfRNA sample from serum or plasma can be detected or measured by a variety of methods including, but not limited to, an amplification assay or a microarray chip (hybridization) assay.
  • amplification of a nucleic acid sequence has its usual meaning, and refers to in vitro techniques for enzymatically increasing the number of copies of a target sequence. Amplification methods include both asymmetric methods in which the predominant product is single-stranded and conventional methods in which the predominant product is double-stranded.
  • microarray refers to an ordered arrangement of hybridizable elements, e.g., gene-specific oligonucleotides, attached to a substrate. Hybridization of nucleic acids from the sample to be evaluated is determined and converted to a quantitative value representing relative gene expression levels.
  • Non-limiting examples of methods to evaluate levels of RNA, e.g., cfRNA from serum or plasma, include amplification assays such as quantitative RT-PCR, digital PCR, microarray analysis; ligation chain reaction, oligonucleotide elongation assays, and various multiplexed assays, such as multiplexed amplification assays
  • isothermal amplification methods that may be used to measure gene expression levels include, for example, loop-mediated isothermal amplification (LAMP).
  • cfRNA values determined by sequencing or an alternative methodology are normalized to account for sample-to-sample variations in RNA isolation and the like. Methods for normalization are well known in the art.
  • normalization may be performed with reference to housekeeping genes that are constitutively expressed at any development stage irrespective of pathophysiological state.
  • housekeeping genes or normalization genes include, Ribonuclease P (RNaseP) gene, genes encoding ⁇ - actin, ⁇ - actin, 18S rRNA, 28S rRNA, albumin, and glyceraldehyde-3-phosphate dehydrogenase (GAPDH). Additional suitable housekeeping genes that can be used to carry out the methods described herein may be found in the HRT Atlas Database (www.housekeeping.unicamp.br; Hounkpe et al.
  • normalization of values is performed using trimmed mean of M values (TMM) normalization, e.g., when using RNA-Seq to evaluate cfRNA expression levels.
  • normalized values may be obtained using a reference level for one or more exogenous nucleic acids, e.g. exogenous RNA oligonucleotides added to a sample.
  • a control value for normalization of RNA values can be predetermined, determined concurrently, or determined after a sample is obtained from the subject. 2.
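The housekeeping-gene normalization described above can be sketched as follows. This is a minimal illustration, not the method of the disclosure: it assumes that each gene's level is expressed as a log2 ratio to the geometric mean of the housekeeping genes, and the function name, the choice of log2 ratios, and the numeric values are all hypothetical. The gene names GAPDH, RNaseP, CCR1, and IFITM3 are taken from the text.

```python
import math

def normalize_to_housekeeping(levels, housekeeping_genes):
    """Return log2 levels of each non-housekeeping gene relative to the
    geometric mean of the housekeeping genes (an assumed normalization)."""
    ref_values = [levels[g] for g in housekeeping_genes]
    if any(v <= 0 for v in ref_values):
        raise ValueError("housekeeping genes must have positive levels")
    # Mean of log2 values equals log2 of the geometric mean.
    log_ref = sum(math.log2(v) for v in ref_values) / len(ref_values)
    return {
        gene: math.log2(value) - log_ref
        for gene, value in levels.items()
        if gene not in housekeeping_genes and value > 0
    }

# Hypothetical measured levels for one sample.
sample = {"GAPDH": 400.0, "RNaseP": 100.0, "CCR1": 800.0, "IFITM3": 50.0}
normalized = normalize_to_housekeeping(sample, ["GAPDH", "RNaseP"])
```

Here the geometric mean of the two housekeeping genes is 200, so CCR1 is four-fold above the reference and IFITM3 is four-fold below it.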
  • RNA of each gene measured can be quantified compared to levels of each RNA in a population of control subjects.
  • control subjects do not have clinical signs of infection, including signs such as increased pulse rate, body temperature, hypotension, hyperventilation and/or respiratory alkalosis.
  • the control subjects have systemic inflammatory disease or a critical illness, e.g., cardiac arrest, overdose/poisoning, heart failure exacerbation, or pulmonary embolism.
  • a control population typically comprises at least 10 subjects, or 50 or more subjects (e.g., 10-100 subjects). In some embodiments, a control population comprises 500 or more subjects.
  • a value may represent the median transcript level or concentration of the selected transcript in the control population.
  • Determination of a probability score and classification of whether the subject is likely to have sepsis is further detailed in Subsection B of this section.
  • 3. Quantification of differential expression-viral sepsis
  • The amount of RNA of each gene measured in a gene panel for determining viral sepsis is quantified compared to levels of each RNA in a population of control subjects. For quantification of viral sepsis, the amounts of RNA can be compared to control subjects having clinically adjudicated sepsis due to a non-viral pathogen.
  • control subjects have a microbiologically confirmed bacterial bloodstream infection or a microbiologically confirmed bacterial non-bloodstream infection.
  • a control population typically comprises at least 10 subjects, or 50 or more subjects (e.g., 10-100 subjects). In some embodiments, a control population comprises 500 or more subjects.
  • a value may represent the median transcript level or concentration of the selected transcript in the control population.
  • the quantity of differential expression for each marker can be determined using a difference or ratio between a measured expression level and a reference expression level.
  • a relationship between this quantity and the likelihood (probability) of having sepsis can be determined, e.g., using a proportion of samples having sepsis that have a given quantity of differential expression. This can be done for each marker.
  • a probability score can be determined based on the quantities of differential expression for all the markers.
  • the overall probability score can be determined in various ways. For instance, a total quantity of differential expression can be determined, e.g., as a weighted sum or average of the individual quantities of differential expression.
  • the weights can be based on the importance (discriminating power) of each marker in discriminating sepsis from non-sepsis. Then, the proportion of the subjects that have sepsis at a given value for the total quantity can be used as the probability score.
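The weighted-sum approach above can be sketched as follows. This is an illustration under stated assumptions: differential expression is taken as a log2 ratio (the text allows a difference or a ratio), the total quantity is a weighted average, and the probability is the proportion of training subjects with sepsis whose total falls in the same bin. The function names, the bin width, the weights, and all numeric values are hypothetical.

```python
import math

def differential_expression(measured, reference):
    """Log2 ratio of measured to reference level for each marker."""
    return {g: math.log2(measured[g] / reference[g]) for g in measured}

def total_quantity(diffs, weights):
    """Weighted average of the per-marker quantities of differential expression."""
    total_weight = sum(weights[g] for g in diffs)
    return sum(weights[g] * diffs[g] for g in diffs) / total_weight

def empirical_probability(total, training, bin_width=1.0):
    """Proportion of training subjects with sepsis among those whose total
    quantity falls in the same bin; None if the bin is empty."""
    bin_index = math.floor(total / bin_width)
    labels = [has_sepsis for t, has_sepsis in training
              if math.floor(t / bin_width) == bin_index]
    return sum(labels) / len(labels) if labels else None

# Hypothetical values for two markers named in the text.
measured = {"CCR1": 8.0, "RPS5": 2.0}
reference = {"CCR1": 2.0, "RPS5": 2.0}
weights = {"CCR1": 3.0, "RPS5": 1.0}  # assumed discriminating power
diffs = differential_expression(measured, reference)
total = total_quantity(diffs, weights)
# (total quantity, sepsis status) pairs for hypothetical training subjects.
training = [(1.2, True), (1.8, True), (1.5, False), (0.3, False)]
score = empirical_probability(total, training)
```

Binning is only one way to estimate the proportion at a given total; a smoothed or regression-based estimate would serve the same role.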
  • a machine learning model can provide the probability, e.g., a support vector machine (SVM) can provide a probability based on a distance of a multidimensional point of the expression levels from the hyperplane that distinguishes between sepsis and non-sepsis.
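One way to turn a distance from the separating hyperplane into a probability is a logistic (Platt-style) mapping, sketched below. The text only states that the probability is based on the distance; the sigmoid form, the `slope` and `offset` parameters, and the example weights are assumptions made for illustration. In practice the hyperplane and the sigmoid parameters would be fitted on training data.

```python
import math

def signed_distance(x, w, b):
    """Signed distance of point x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def distance_to_probability(distance, slope=1.0, offset=0.0):
    """Logistic mapping of a signed distance to a probability in (0, 1).
    slope and offset are hypothetical calibration parameters."""
    return 1.0 / (1.0 + math.exp(-(slope * distance + offset)))

# Hypothetical hyperplane for a two-gene panel.
w, b = [3.0, 4.0], -5.0
on_boundary = signed_distance([1.0, 0.5], w, b)   # lies on the hyperplane
sepsis_side = signed_distance([3.0, 4.0], w, b)   # lies on the positive side
```

A point on the hyperplane maps to a probability of 0.5, and the probability rises toward 1 as the point moves further into the sepsis side.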
  • a probability score to classify the subject as likely or not likely to have sepsis can be determined based on the level of differential expression of each member of a gene panel as described herein, or a subset thereof.
  • the level of expression of each gene is weighted with a predefined coefficient.
  • the predefined coefficients can be the same or different for the genes.
  • the probability score can be determined in various ways, e.g., by statistical or machine learning regression or classification such as, but not limited to, linear regression, including least squares regression, ridge or LASSO regression, elastic net regression, regularized Cox regression, logistic regression, orthogonal matching pursuit models, a Bayesian regression model, or deep learning methods, such as convolutional neural networks, recurrent neural networks and generative adversarial networks (see, e.g., LeCun et al., Nature 521:436-444, 2015).
  • machine-learning algorithms include quadratic discriminant analysis; support vector machines, including without limitation support vector classification-based regression processes; stochastic gradient descent algorithms; nearest neighbors algorithms; Gaussian processes such as Gaussian process regression; cross-decomposition algorithms, including partial least squares and/or canonical correlation analysis; probabilistic graphical models including naive Bayes methods; and models based on decision trees, such as decision tree classification algorithms.
  • Additional machine-learning algorithms include ensemble methods such as bagging meta-estimator, randomized forest algorithms, AdaBoost, gradient tree boosting, and/or voting classifier methods. Details relating to various statistical methods are found in the following references: Ruczinski et al., 12 J.
  • the probability score can be used to determine whether the subject has an increased likelihood of sepsis.
  • the probability score can be compared to a threshold value (also referred to as a cutoff value).
  • the threshold can be selected based on a desired accuracy, e.g., a trade-off between sensitivity and specificity.
  • likelihood of sepsis may be assigned based on a cutoff value using a reference scale, e.g., from 0 to 1.0. In some embodiments, a cutoff value of 0.5 or greater may be employed to define likelihood of sepsis. In some embodiments, sepsis likelihood may be further stratified, for example, likelihood of sepsis may be categorized as “high,” “intermediate,” or “low”, e.g., based on the highest tertile, intermediate tertile and bottom tertile.
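The cutoff and tertile stratification described above can be sketched as follows. This sketch assumes that a score must exceed the cutoff to be classified as likely sepsis and that tertile boundaries are taken from a reference set of scores; the function names, the boundary convention, and the reference scores are hypothetical.

```python
def classify(score, cutoff=0.5):
    """Classify a probability score on a 0 to 1.0 scale against a cutoff."""
    return "sepsis likely" if score > cutoff else "sepsis not likely"

def tertile_category(score, reference_scores):
    """Stratify a score as high, intermediate, or low using the tertile
    boundaries of a reference set of scores (an assumed convention)."""
    ordered = sorted(reference_scores)
    n = len(ordered)
    low_cut = ordered[n // 3]          # lowest score of the middle tertile
    high_cut = ordered[(2 * n) // 3]   # lowest score of the top tertile
    if score >= high_cut:
        return "high"
    if score >= low_cut:
        return "intermediate"
    return "low"

# Hypothetical reference scores from previously evaluated subjects.
reference = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
```

With these reference scores the tertile boundaries fall at 0.4 and 0.7, so a score of 0.75 is "high" and a score of 0.5 is "intermediate".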
  • Classifiers that use host gene expression levels of sepsis marker genes as described herein in cfRNA samples from a subject evaluated for likelihood of sepsis can be generated, e.g., as described herein, from a training set of samples obtained from confirmed sepsis patients, e.g., determined by clinical adjudication and/or culture of organism from a blood or organ sample from a patient.
  • a training set can be from patients having confirmed viral sepsis vs. sepsis from a non-viral pathogen.
  • a gene expression panel to evaluate likelihood of sepsis can be determined based on a gene panel or subset panel comprising one or more genes set forth in Table A (sepsis likelihood) or Table B (viral sepsis likelihood) or may comprise one or more genes set forth in the tables and additional genes identified as being correlated with sepsis risk.
  • Different subsets of genes can be selected to train a model (e.g., to determine the probability score) using all or a subset of the training samples (i.e., subjects for which sepsis status is known and for which expression of the genes was measured).
  • This training subset can then be used to train (optimize) a model, whose accuracy can be measured, e.g., using the AUC of an ROC curve. Then, another subset of genes can be selected, with a further training process providing another model whose accuracy can also be measured.
  • the accuracy can be measured using the training set or a validation set, which can include samples with known labels that were excluded from the training set. This process of generating models for different subsets of genes, along with the accuracy of each model, can continue, possibly for all possible subsets of genes for which expression levels have been measured. A panel providing the best accuracy can be selected, however the accuracy is measured.
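The panel search described above can be sketched as an exhaustive loop over gene subsets, each scored by AUC. This is a simplified sketch: the "model" for each panel is just the unweighted sum of the panel's expression values, standing in for a trained model, and the accuracy is measured on the same samples rather than on a held-out validation set. The gene labels `g1` to `g3`, the function names, and the data are hypothetical.

```python
from itertools import combinations

def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def best_panel(samples, labels, genes, panel_size):
    """Score every gene subset of the given size; return (AUC, panel) for the
    subset with the highest AUC (the first one found wins ties)."""
    best = None
    for panel in combinations(genes, panel_size):
        # Stand-in model: sum of differential expression over the panel.
        scores = [sum(sample[g] for g in panel) for sample in samples]
        result = (auc(scores, labels), panel)
        if best is None or result[0] > best[0]:
            best = result
    return best

# Hypothetical differential expression values; True marks sepsis.
samples = [
    {"g1": 2.0, "g2": 1.5, "g3": 0.1},
    {"g1": 1.8, "g2": 1.0, "g3": -0.5},
    {"g1": 0.2, "g2": 1.2, "g3": 0.3},
    {"g1": -0.1, "g2": -0.4, "g3": 0.0},
]
labels = [True, True, False, False]
top_auc, top_panel = best_panel(samples, labels, ["g1", "g2", "g3"], 2)
```

The number of subsets grows combinatorially with the number of candidate genes, which is one reason to restrict the candidate pool before searching.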
  • the machine learning model may be trained until certain predetermined conditions for accuracy or performance are satisfied, such as having minimum desired values corresponding to diagnostic accuracy measures.
  • the diagnostic accuracy measure may correspond to prediction of a diagnosis or disease outcome in the subject.
  • diagnostic accuracy measures may include sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and area under the curve (AUC) of a Receiver Operating Characteristic (ROC) curve corresponding to the diagnostic accuracy of detecting sepsis.
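The diagnostic accuracy measures listed above follow from the counts of true and false positives and negatives. A minimal sketch, with a hypothetical function name and hypothetical example calls:

```python
def diagnostic_metrics(predicted, actual):
    """Compute standard diagnostic accuracy measures from paired booleans.
    A measure is None when its denominator is zero."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }

# Hypothetical classifier calls and adjudicated sepsis status for ten subjects.
predicted = [True, True, True, True, False, False, False, False, False, False]
actual = [True, True, True, False, True, False, False, False, False, False]
metrics = diagnostic_metrics(predicted, actual)
```

In this example there are 3 true positives, 1 false positive, 1 false negative, and 5 true negatives.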
  • FIG. 6 is a flowchart illustrating a method 600 of measuring the expression levels of the gene markers described herein to evaluate general sepsis risk in a subject according to embodiments of the present invention.
  • RNA of each member of a gene panel is detected in a cell-free RNA sample from a human subject.
  • cfRNA is obtained from a serum or plasma sample.
  • the plasma can be clarified by centrifugation.
  • Cell-free RNA can be extracted using available methods and kits, such as cfRNA kits available from Qiagen.
  • when RNA-seq is used to analyze RNA levels in cfRNA, human rRNA (cytosolic and mitochondrial) and beta globin sequences can be depleted. Any methodology for depletion can be employed.
  • a pool of locked nucleic acids that block reverse transcription of sequences to be removed will provide a cDNA population enriched for target sequences.
  • Other technologies to enrich protein-encoding RNAs include affinity depletion using oligonucleotides complementary to rRNA target sequences, depletion methods using antisense DNA oligonucleotides that cover the entire rRNA molecule to target RNase H-mediated degradation of the rRNA, and selection of poly(A)-containing transcripts.
  • the gene panel (Step 620) comprises at least three genes selected from the genes listed in Table A. In some embodiments, the gene panel comprises genes as described in Section I.A.
  • RNA can be detected by any suitable method, such as those described above in section III.A.i.
  • RNA is detected by an amplification-based method such as quantitative PCR.
  • RNA is detected by sequencing that employs a massively parallel sequencing platform, such as RNA-seq.
  • a quantity of differential gene expression for each member of the gene panel is determined compared to the reference levels of RNA in control subjects. Example techniques for the quantification of differential expression are described in section III.A.ii. above.
  • a probability score based on the respective amount of differential gene expression is generated. Example techniques for the generation of the probability score are described in section III.B above.
  • a cohort of training samples can be used to train a machine learning (ML) model, such as a bagged support vector machine learning approach (bSVM), e.g., with a linear kernel.
  • a ML model can be used to determine the probability score.
  • the ML model can be trained using different panels of markers, and the best-performing panel can be used.
  • the pool of potential markers for the various panels can be limited to markers having at least a minimum amount of differential expression.
  • the probability score can be normalized, e.g., between 0 and 1.
  • a classification is determined, wherein an increased likelihood of sepsis is determined when the probability score exceeds a threshold value.
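The bagging idea behind the probability score in method 600 can be sketched in a few lines. Note the substitution: to stay dependency-free, this sketch uses a nearest-centroid linear rule as the base learner instead of a linear-kernel support vector machine, so it illustrates bagging and vote-based scoring, not the bSVM itself. The stratified bootstrap, the function names, the number of models, and the data are all assumptions.

```python
import random

def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_centroid_classifier(pos, neg):
    """Linear rule (w, b): w.x + b > 0 when x is closer to the sepsis centroid."""
    c_pos, c_neg = centroid(pos), centroid(neg)
    w = [p - n for p, n in zip(c_pos, c_neg)]
    midpoint = [(p + n) / 2 for p, n in zip(c_pos, c_neg)]
    b = -sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, b

def bagged_probability(x, pos, neg, n_models=25, seed=0):
    """Fraction of bootstrap-trained base learners that vote 'sepsis' for x.
    Each class is resampled with replacement (stratified bootstrap)."""
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_models):
        boot_pos = [rng.choice(pos) for _ in pos]
        boot_neg = [rng.choice(neg) for _ in neg]
        w, b = fit_centroid_classifier(boot_pos, boot_neg)
        votes += (sum(wi * xi for wi, xi in zip(w, x)) + b) > 0
    return votes / n_models

# Hypothetical two-gene expression vectors for training subjects.
sepsis_rows = [[2.0, 1.5], [1.8, 2.2], [2.5, 1.9]]
control_rows = [[0.1, 0.2], [-0.3, 0.4], [0.2, -0.1]]
score = bagged_probability([2.1, 1.8], sepsis_rows, control_rows)
is_likely = score > 0.5  # threshold comparison as in the classification step
```

The vote fraction is already normalized between 0 and 1, so it can be compared directly to a threshold value.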
  • RNA is obtained from a whole blood sample from the subject and the gene panel comprises at least three genes selected from the genes set forth in Table C. Threshold scores and classifications are determined as described above. 2.
  • FIG. 7 is a flowchart illustrating a method 700 of measuring the expression levels of the gene markers described herein to evaluate viral sepsis risk in a subject according to embodiments of the present invention. Aspects of method 700 can be performed in a similar manner as method 600.
  • In steps 710 and 720, RNA of each member of a gene panel is detected in a cell-free RNA sample from a human subject.
  • cfRNA is obtained from a serum or plasma sample.
  • the plasma can be clarified by centrifugation.
  • Cell-free RNA can be extracted using available methods and kits, such as cfRNA kits available from Qiagen.
  • human rRNA (cytosolic and mitochondrial) and beta globin sequences can be depleted. Any methodology for depletion can be employed. For example, a pool of locked nucleic acids that block reverse transcription of sequences to be removed (Qiagen FastSelect) will provide a cDNA population enriched for target sequences.
  • RNA can be detected by any suitable method, such as those described above in section III.A.i. In some embodiments, RNA is detected by an amplification-based method such as quantitative PCR.
  • RNA is detected by sequencing that employs a massively parallel sequencing platform, such as RNA-seq.
  • a quantity of differential gene expression for each member of the gene panel is determined compared to the reference levels of RNA in control subjects. Quantification of differential expression is described in section III.A.iii.
  • a probability score based on the respective amount of differential gene expression is generated; and in step 750 a classification of the subject is determined, wherein an increased likelihood of viral sepsis is determined when the probability score exceeds a threshold value. Steps 740 and 750 can be performed in a similar manner as steps 640 and 650 of method 600.
  • RNA is obtained from a whole blood sample from the subject and the gene panel comprises at least three genes selected from the genes in Table D.
  • IV. METHODS OF DETERMINING MICROBIAL MASS / IDENTIFICATION OF PATHOGEN
  • methods of predicting the likelihood of sepsis can comprise determining the microbial mass by sequencing a cell-free DNA sample, e.g., a plasma or serum sample from a subject undergoing evaluation for sepsis.
  • microbial mass refers to the weight, e.g., in picograms, of microbial nucleic acid determined to be present in plasma or serum. This can be calculated as described below relative to a known amount of spike-in calibration nucleic acids (also referred to as a calibration standard) of known sequence, unrelated to human pathogens, that are added to a sample.
  • FIG. 8 provides a flow chart illustrating determination of microbial mass and identification of a dominant pathogen to assess likelihood of sepsis.
  • In step 810, a known amount of calibration nucleic acids is added to a cfDNA sample obtained from a patient.
  • Appropriate calibration nucleic acids for use in quantification of microbial mass include nucleic acids that are not related to human pathogens and have not been observed in human cf nucleic acid, e.g., sequences from Archaea or other extremophiles, and/or synthetic sequences.
  • cDNA transcribed from a control comprising RNA transcripts may be used to add to a sample.
  • the calibration nucleic acid added to samples is cDNA transcribed from RNA controls from the External RNA Controls Consortium (ERCC) (Pine, et al., BMC Biotechnology 16, 54 (2016)).
  • ERCC consortium control RNAs are synthesized by in vitro transcription of synthetic DNA sequences or transcripts of DNA derived from the Bacillus subtilis or the deep-sea vent microbe Methanocaldococcus jannaschii genomes. They also contain a poly-A+ tail mimic in the DNA template.
  • The ERCC control RNAs show minimal sequence homology with endogenous transcripts from sequenced humans. In some embodiments, 25 pg of calibration nucleic acids, e.g., ERCC control RNA, is used.
  • In step 820, libraries generated from the cfDNA preparation can be sequenced.
  • calibration nucleic acids can comprise long polynucleotides, which can be fragmented, e.g., by mechanical, enzymatic or chemical shearing, to provide a uniform distribution of fragments for sequencing. The amount of fragmentation can provide sizes that are similar to the natural lengths of cell-free RNA.
  • sequences are aligned to sequences present in one or more taxonomic sequence databases that comprise microbial sequences (e.g., National Center for Biotechnology Information (NCBI) databases) to determine sequence reads that align to microbial sequences (i.e., “map” to microbial sequences).
  • the method comprises determining sequence reads that align to nonviral microbial sequences.
  • the NCBI GenBank nucleotide database is queried
  • the method further employs an IDseq pipeline (Kalantar, et al., Gigascience 9, 2020), which incorporates subtractive alignment of the human genome (NCBI GRCh38) using STAR, quality and complexity filtering, and subsequent removal of cloning vectors and phiX phage using Bowtie2.
  • the identities of microbial reads are determined by querying the NCBI nucleotide database using GSNAP-L (Zhao, et al, Bioinformatics 28, 125–126 (2012)).
  • a ratio of the amount of total sequence reads that correspond to the calibration nucleic acids and the amount of all microbial sequence reads is determined.
  • the ratio can be determined in various ways, e.g., X1/X2, X1/(X1+X2), X2/(X1+X2), functions of such ratios, or ratios of functions of the amounts, or combinations thereof.
  • the microbial mass (e.g., weight in picograms) can be determined based on the known amount of the calibration standard and the ratio determined in step 840. For example, the microbial mass can be determined by multiplying the ratio of total microbial reads to calibration reads by the known amount of the calibration standard. In some embodiments, a background correction using a control, such as water control samples, is employed to account for environmental contaminants.
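The microbial mass calculation can be sketched as follows. The ratio-times-known-amount step follows the text, and the 25 pg default reflects the example amount of calibration nucleic acids given above. The background correction shown here, a plain subtraction of reads expected from water controls, is only one assumed way to apply such a correction; the function name, the parameter names, and the read counts are hypothetical.

```python
def microbial_mass_pg(microbial_reads, calibration_reads,
                      calibration_mass_pg=25.0, background_reads=0):
    """Estimate microbial mass in picograms from read counts.

    mass = (microbial reads / calibration reads) * known calibration mass,
    after subtracting an assumed background read count.
    """
    if calibration_reads <= 0:
        raise ValueError("calibration reads must be positive")
    corrected = max(microbial_reads - background_reads, 0)
    return corrected * calibration_mass_pg / calibration_reads

# Hypothetical sample: 4,000 microbial reads and 5,000 calibration reads.
mass = microbial_mass_pg(4000, 5000)
mass_corrected = microbial_mass_pg(4000, 5000, background_reads=500)
```

With 25 pg of calibration standard, a microbial-to-calibration read ratio of 0.8 corresponds to 20 pg of microbial nucleic acid.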
  • abundance levels of microbial species represented in the cfDNA preparation are determined by determining the number of sequence reads that are mapped to individual species of microbes. Negative control samples consisting of only double-distilled water can also be processed with plasma cf-DNA samples.
  • Such negative control samples provide an estimate of the number of background reads expected for each taxon, e.g., as described by Mick et al., Nature Communications 11:5854, 2020.
  • the species of that genus having the highest abundance level is selected. Then, the selected species are ranked by abundance level in sequential order, typically from highest to lowest.
  • a gap threshold is determined.
  • the gap threshold can correspond to the abundance level at which the greatest difference in abundance level occurs between sequential microbes. For example, with the ranking being from highest to lowest, the highest abundance level may differ by 4.5 (e.g., 8-3.5) from the second highest, which might differ by only 0.8 from the third highest.
  • the gap threshold can be any value between 8 and 3.5, so that only the highest abundance would qualify.
  • the largest gap can be between the second highest and the third highest, e.g., with the set of abundance values being 9, 8, 2, 1.5, 1, ... .
  • the gap threshold could be any abundance between 8 and 2.
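The gap-threshold selection can be sketched as follows, using the two sets of abundance values from the examples above. The function name and the species labels are hypothetical, and the sketch assumes one abundance value per selected species.

```python
def dominant_by_gap(abundances):
    """Rank species by abundance and keep those above the largest gap.

    Returns (dominant_species, (upper, lower)), where any gap threshold
    between `upper` and `lower` separates the dominant species from the rest.
    """
    ranked = sorted(abundances.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return [name for name, _ in ranked], None
    # Difference in abundance between each pair of sequential species.
    gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps))
    dominant = [name for name, _ in ranked[:cut + 1]]
    return dominant, (ranked[cut][1], ranked[cut + 1][1])

# First example: largest gap between the highest and second highest.
first = dominant_by_gap({"species_1": 8, "species_2": 3.5, "species_3": 2.7})
# Second example: largest gap between the second and third highest.
second = dominant_by_gap({"species_1": 9, "species_2": 8, "species_3": 2,
                          "species_4": 1.5, "species_5": 1})
```

In the first case only the top species qualifies; in the second, the top two species sit above the gap.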
  • a microbe can be identified as a known bloodstream pathogen in various ways, e.g., by referencing indexes and listings of pathogens, e.g., a reference index derived from the most prevalent bloodstream infection pathogens reported by the National Healthcare Safety Network (NHSN) (Weiner-Lastinger, L. M. et al., Infect. Control Hosp. Epidemiol. 41, 1–18 (2020)) and/or a recent multicenter surveillance study of healthcare-associated infections (Magill et al., NEJM 379:1732-1744, 2018).
  • Candida, Citrobacter, Enterobacter, Enterococcus, Klebsiella, Lactobacillus, Morganella, Prevotella, Proteus, Serratia, Stenotrophomonas and Streptococcus as common sepsis pathogens. In this manner, it can be determined whether a predominant pathogen exists.
  • the species are present in the listing provided in Table 14 and are detected at an abundance of > 1 read per million.
  • a pathogenic respiratory virus, e.g., based on a list of pathogens (Langelier et al., 2018, supra), can be identified in the cfRNA from plasma or serum.
  • a patient is determined as likely having sepsis if the microbial mass is greater than a specified mass (e.g., 20 pg) and a predominant pathogen has been identified.
  • the specified mass can assume that all samples have the same volume. In other implementations, the specified mass can vary based on the volume of the sample.
  • V. LIKELIHOOD OF SEPSIS BASED ON INTEGRATION OF HOST TRANSCRIPTIONAL PROFILING AND MICROBIAL IDENTIFICATION
  • A patient can be determined to likely have sepsis based on a combination of the previously described tests.
  • Such tests can be combined with logical ANDs or ORs for determining whether sepsis exists, e.g., to determine whether to provide treatment. Such a combination can maximize accuracy (e.g., negative predictive value) to identify individuals likely to have sepsis.
  • the patient can be treated for sepsis. But if none of the three methods determines that sepsis is likely, then sepsis can be ruled out, and the patient would not be unnecessarily subjected to antibiotics or other treatments.
  • the combinations of tests allow for determining that the patient is unlikely to have sepsis.
  • the combination of tests can be as follows: a patient is deemed likely to have sepsis if the host classifier probability for general sepsis, based on evaluation of a host cell general sepsis gene panel as described herein (e.g., in section I), is greater than a cutoff (e.g., 0.5) associated with sepsis; OR a patient is deemed likely to have sepsis if the host classifier probability for viral sepsis, based on evaluation of a host cell viral sepsis gene panel as described herein (e.g., in section II), is greater than a cutoff (e.g., 0.9) associated with sepsis.
  • a patient is deemed likely to have sepsis if the microbial mass is greater than a specified mass (e.g., 20 pg) AND a dominant pathogen is detected (e.g., a drop in abundance to a next prevalent pathogen is greater than a threshold).
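The combined rule can be sketched as a logical OR of the three tests, with the microbial test itself being a logical AND. The default cutoffs are the example values given in the text (0.5, 0.9, and 20 pg); the function and parameter names are hypothetical.

```python
def sepsis_likely(host_score, viral_score, microbial_mass_pg, dominant_pathogen,
                  host_cutoff=0.5, viral_cutoff=0.9, mass_cutoff_pg=20.0):
    """Return True if any of the three tests indicates likely sepsis."""
    host_hit = host_score > host_cutoff
    viral_hit = viral_score > viral_cutoff
    # Microbial test: sufficient mass AND a dominant pathogen detected.
    microbe_hit = microbial_mass_pg > mass_cutoff_pg and bool(dominant_pathogen)
    return host_hit or viral_hit or microbe_hit

# Hypothetical patients.
ruled_out = sepsis_likely(0.2, 0.4, 35.0, dominant_pathogen=False)
flagged_by_microbe = sepsis_likely(0.2, 0.4, 35.0, dominant_pathogen=True)
flagged_by_host = sepsis_likely(0.7, 0.1, 0.0, dominant_pathogen=False)
```

Sepsis is ruled out only when all three tests are negative, which is what favors a high negative predictive value.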
  • the patient can be treated with an antibiotic or other agent that treats sepsis.
  • the antibiotic that targets the pathogen may be selected for treatment.
  • antibiotics include, for example, ceftriaxone, cefotaxime, vancomycin, meropenem, cefepime, ceftazidime, cefuroxime, nafcillin, oxacillin, ampicillin, ticarcillin, ticarcillin/clavulanic acid, ampicillin/sulbactam (Unasyn), azithromycin, trimethoprim-sulfamethoxazole, clindamycin, ciprofloxacin, levofloxacin, synercid, amoxicillin, amoxicillin/clavulanic acid, cefuroxime, trimethoprim/sulfamethoxazole, azithromycin, clindamycin, dicloxacillin, ciprofloxacin, levofloxacin, cefixime, cefpodoxime, loracarbef, cefadroxil, cefabutin, cefdinir
  • kits, panels and devices for carrying out the methods described herein are provided in this disclosure.
  • a kit is provided for measuring and analyzing RNA in a biological sample, such as a serum or plasma sample.
  • the kit includes two or more polynucleotides for specifically hybridizing to at least a section of a gene listed in Table A or Table B for use in evaluating likelihood of sepsis in a patient.
  • the kit includes two or more polynucleotides for use in assessing likelihood of sepsis in a human subject.
  • FIG.9 illustrates a measurement system 900 according to an embodiment of the present disclosure.
  • the system as shown includes a sample 905, such as cell-free RNA or DNA molecules within an assay device 910, where an assay 908 can be performed on sample 905.
  • sample 905 can be contacted with reagents of assay 908 to provide a signal of a physical characteristic 915.
  • An example of an assay device can be a flow cell that includes probes and/or primers of an assay or a tube through which a droplet moves (with the droplet including the assay).
  • Physical characteristic 915 (e.g., a fluorescence intensity, a voltage, or a current)
  • Detector 920 can take a measurement at intervals (e.g., periodic intervals) to obtain data points that make up a data signal.
  • an analog-to-digital converter converts an analog signal from the detector into digital form at a plurality of times.
  • Assay device 910 and detector 920 can form an assay system, e.g., a sequencing system that performs sequencing according to embodiments described herein.
  • a data signal 925 is sent from detector 920 to logic system 930.
  • data signal 925 can be used to determine sequences and/or locations in a reference genome of DNA molecules.
  • Data signal 925 can include various measurements made at a same time, e.g., different colors of fluorescent dyes or different electrical signals for different molecule of sample 905, and thus data signal 925 can correspond to multiple signals.
  • Data signal 925 may be stored in a local memory 935, an external memory 940, or a storage device 945.
  • Logic system 930 may be, or may include, a computer system, ASIC, microprocessor, graphics processing unit (GPU), etc. It may also include or be coupled with a display (e.g., monitor, LED display, etc.) and a user input device (e.g., mouse, keyboard, buttons, etc.). Logic system 930 and the other components may be part of a stand-alone or network connected computer system, or they may be directly attached to or incorporated in a device (e.g., a sequencing device) that includes detector 920 and/or assay device 910.
  • Logic system 930 may also include software that executes in a processor 950.
  • Logic system 930 may include a computer readable medium storing instructions for controlling measurement system 900 to perform any of the methods described herein.
  • logic system 930 can provide commands to a system that includes assay device 910 such that sequencing or other physical operations are performed. Such physical operations can be performed in a particular order, e.g., with reagents being added and removed in a particular order. Such physical operations may be performed by a robotics system, e.g., including a robotic arm, as may be used to obtain a sample and perform an assay.
  • System 900 may also include a treatment device 960, which can provide a treatment to the subject.
  • Treatment device 960 can determine a treatment and/or be used to perform a treatment. Examples of such treatment can include surgery, radiation therapy, chemotherapy, immunotherapy, targeted therapy, hormone therapy, and stem cell transplant.
  • Logic system 930 may be connected to treatment device 960, e.g., to provide results of a method described herein.
  • the treatment device may receive inputs from other devices, such as an imaging device and user inputs (e.g., to control the treatment, such as controls over a robotic system).
  • the specific details of particular embodiments may be combined in any suitable manner without departing from the spirit and scope of embodiments of the invention.
  • GSEA Gene set enrichment analysis
  • GSEA demonstrated enrichment for CD28 signaling, immunoregulatory interactions between lymphoid and non-lymphoid cells, and other pathways in the Sepsis non-BSI patients, while the Sepsis BSI group was characterized by enrichment in genes related to antimicrobial peptides, defensins, G alpha signaling and other pathways (Pathways are summarized in Table 4).
  • Sensitivity versus blood culture as a reference standard was 83%, and varied by pathogen, ranging from 0% (e.g., C. difficile) to 100% (e.g., E. coli, S. aureus/argenteus; FIG. 3C). Pathogens were called by the RBM in 10/37 (27%) of patients in the No-Sepsis group, equating to a specificity of 73%.
  • Plasma cf-DNA mNGS identified 2/25 (8%) of culture-confirmed bacterial LRTI pathogens in the Sepsis non-BSI group and 3/10 (30%) culture-confirmed bacterial UTI pathogens (FIG. 3D, Table 8).
  • cf-plasma DNA-seq returned negative in all three patients with sepsis attributable to C. difficile colitis.
  • mNGS identified additional putative bacterial pathogens not detected by culture in 8 of 73 (11%) of patients with microbiologically confirmed sepsis (Table 8).
  • Integrated host-microbe cf-plasma metagenomic model for sepsis rule-out and diagnosis: Given the relative success of each independent host and pathogen model, we considered whether combining them could enhance diagnosis, and potentially serve as a sepsis rule-out tool. To test this possibility, we developed a proof-of-concept integrated host + microbe model based on simple rules. It returned a sepsis diagnosis based on either host criteria: [host sepsis classifier probability > 0.5] or microbial criteria: [(pathogen detected by RBM) AND (microbial mass > 20 pg)] OR [host viral classifier probability > 0.9].
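The simple rules stated above can be written out as a short function. This is a minimal sketch, not the authors' implementation: the function name and argument names are hypothetical, and only the three cutoffs (0.5, 0.9, 20 pg) and the AND/OR structure come from the text.

```python
def integrated_sepsis_call(host_sepsis_prob, host_viral_prob,
                           microbial_mass_pg, pathogen_detected,
                           sepsis_cutoff=0.5, viral_cutoff=0.9,
                           mass_cutoff_pg=20.0):
    """Return True when either the host or the microbial criteria are met.

    Argument names are illustrative assumptions; the default cutoffs are the
    example values given in the text.
    """
    # Host criteria: general sepsis classifier probability above its cutoff.
    host_positive = host_sepsis_prob > sepsis_cutoff
    # Microbial criteria: (pathogen called by the rules-based model AND
    # microbial mass above the cutoff) OR viral classifier above its cutoff.
    microbe_positive = (
        (bool(pathogen_detected) and microbial_mass_pg > mass_cutoff_pg)
        or host_viral_prob > viral_cutoff
    )
    return host_positive or microbe_positive
```

Because every comparison is a strict "greater than", a sample sitting exactly on all three cutoffs is not called positive under this sketch; a sample that fails every criterion corresponds to the rule-out case.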
  • Plasma cf-RNA sequencing alone performed poorly for detecting sepsis-associated respiratory viruses. Incorporation of a host-based viral classifier, however, markedly improved detection of clinically identified viral LRTI, and additionally predicted viral infections in three patients with sepsis who did not undergo viral PCR testing during their hospitalizations. Prior work has demonstrated that different viral species elicit distinct host transcriptional signatures in the peripheral blood (Mudd, et al., Sci. Adv. 6, eabe3024 (2020)), suggesting that future studies could extend the cf-RNA host viral classifier to identify specific viral pathogens, such as influenza or SARS-CoV-2, for which therapeutics exist.
  • This study includes the use of plasma cf-RNA transcriptomics for sepsis diagnosis, development of the first sepsis diagnostic combining host and microbial mNGS data, detailed clinical phenotyping, and a large prospective cohort of critically ill adults with systemic illnesses.
  • the mNGS analyses and blood cultures were performed on different blood samples, with research specimens collected up to 24 hours after blood cultures, which may have resulted in lower concordance than truly existed.
  • a significant fraction of plasma samples had insufficient host transcripts to permit gene expression analyses, leading to a smaller sample size for the plasma versus the whole blood cohorts.
  • additional studies in an independent cohort will be useful to validate these findings.
  • PAXgene tubes were collected on patients enrolled in EARLI during the time period listed above who were hypotensive and/or mechanically ventilated at the time of enrollment.
  • the main exclusion criteria for the EARLI study are: 1) exclusively neurological, neurosurgical, or trauma surgery admission, 2) goals of care decision for exclusively comfort measures, 3) known pregnancy, 4) legal status of prisoner, and 5) anticipated ICU length of stay < 24 hours. Enrollment in EARLI began in 10/2008 and continues.
  • Adjudication of sepsis groups was carried out by study team physicians (MA, CL, AL, KL, PS, CH, AG, CC, KK) using the sepsis-2 definition (Kalantar et al., 2020, supra) (≥ 2 SIRS criteria + suspected infection) and incorporating all available clinical and microbiologic data from the entire ICU admission, with blinding to mNGS results. Patients were categorized into five subgroups based on sepsis status (FIG.1).
  • RNA-seq was performed on the whole blood and plasma specimens; DNA-seq was performed only on plasma.
  • RNA was extracted from whole blood using the Qiagen RNeasy kit and normalized to 10 ng total input per sample.
  • Total plasma nucleic acid was extracted from 300 µL of plasma, first clarified by two minutes of maximum-speed centrifugation, using the Zymo Pathogen Magbead Kit. 10 ng of total nucleic acid underwent DNA-seq using the NEBNext Ultra II DNA Kit.
  • For RNA-seq library preparation, human cytosolic and mitochondrial ribosomal RNA and globin RNA were first depleted using FastSelect (Qiagen).
  • For the purposes of background contamination correction (see below) and to enable estimation of input microbial mass, we included negative water controls as well as positive controls (spike-in RNA standards from the External RNA Controls Consortium (ERCC), Pine, et al., BMC Biotechnology 16, 54 (2016)).
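A spike-in of known mass lets microbial mass be estimated from read counts. The following is a minimal sketch that assumes mass scales in direct proportion to the number of reads; the function name and read counts are hypothetical, and the 25 pg default is the spike-in amount quoted in the caption of Table 7.

```python
def estimate_microbial_mass_pg(microbial_reads, ercc_reads, ercc_input_pg=25.0):
    """Estimate total microbial mass (pg) from a spike-in of known mass.

    Assumes reads are proportional to input mass, so that
    microbial_mass / ercc_mass == microbial_reads / ercc_reads.
    """
    if ercc_reads <= 0:
        raise ValueError("no spike-in reads detected; mass cannot be estimated")
    return ercc_input_pg * microbial_reads / ercc_reads


# Hypothetical counts: 2,000 microbial reads and 1,000 spike-in reads.
mass = estimate_microbial_mass_pg(2000, 1000)  # 50.0 pg
```

A sample with no spike-in reads has no defined ratio, so the sketch raises an error rather than returning a mass.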
  • Detection of microbes leveraged the open-source IDseq pipeline (Kalantar et al., 2020, supra), which incorporates subtractive alignment of the human genome (NCBI GRCh38) using STAR (Dobin et al., 2013, supra), quality and complexity filtering, and subsequent removal of cloning vectors and phiX phage using Bowtie2 (Kalantar et al., 2020, supra).
  • the identities of the remaining microbial reads are determined by querying the NCBI nucleotide (NT) database using GSNAP-L (Kalantar et al., 2020, supra; Zhao, et al., Bioinformatics 28, 125–126 (2012)).
  • the RBM, originally developed to identify pathogens from respiratory mNGS data, identifies outlier organisms within a sample by identifying the greatest gap in abundance between the top 15 sequentially ranked microbes in each sample (FIG.3B).
  • the RBM also identified human pathogenic respiratory viruses derived from a reference list of LRTI pathogens (Langelier et al, 2018, supra) present in the plasma cf-RNA-seq data.
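The gap-based logic of the rules-based model can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the function name, the input dictionaries, and the handling of samples with fewer than two organisms are hypothetical, and the abundance unit is left open. From the text it takes only the steps of keeping the most abundant species per genus, ranking the top 15, cutting at the largest gap between neighbours, and reporting organisms above the gap that appear in a reference list of pathogens.

```python
def call_pathogens(abundance, genus_of, known_pathogens, top_n=15):
    """Return known pathogens that sit above the largest abundance gap.

    abundance: dict mapping species name -> abundance (unit is an assumption)
    genus_of: dict mapping species name -> genus name
    known_pathogens: set of species names from a reference index
    """
    # Keep only the most abundant species within each genus.
    best = {}
    for species, value in abundance.items():
        genus = genus_of[species]
        if genus not in best or value > abundance[best[genus]]:
            best[genus] = species
    # Rank the selected species from most to least abundant.
    ranked = sorted(best.values(), key=lambda s: abundance[s], reverse=True)
    ranked = ranked[:top_n]
    if len(ranked) < 2:
        # Assumption: with no neighbour to compare against, a lone organism
        # is treated as being above the gap.
        return [s for s in ranked if s in known_pathogens]
    # Find the largest drop between sequentially ranked organisms.
    gaps = [abundance[ranked[i]] - abundance[ranked[i + 1]]
            for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps))
    # Organisms ranked at or above the gap are the candidate outliers.
    return [s for s in ranked[:cut + 1] if s in known_pathogens]
```

Organisms above the gap that are not in the reference list are dropped, which is how a dominant but non-pathogenic background organism would be ignored in this sketch.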
  • ERCC External RNA Controls Consortium
  • SVM Support Vector Machine
  • bSVM bagged SVM
  • Each classifier used a bootstrapped set of samples and a random subset of features.
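The sampling scheme behind a bagged ensemble can be illustrated with the standard library alone. This sketch only draws the row and column indices each base classifier would see; the function name, the number of estimators, and the feature fraction are assumptions, since the text does not state them.

```python
import random


def draw_bags(n_samples, n_features, n_estimators=10,
              feature_fraction=0.5, seed=0):
    """Draw (sample indices, feature indices) for each base classifier.

    Samples are drawn with replacement (a bootstrap); features are a random
    subset drawn without replacement. Sizes are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_sub = max(1, int(round(feature_fraction * n_features)))
    bags = []
    for _ in range(n_estimators):
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]
        cols = sorted(rng.sample(range(n_features), n_sub))
        bags.append((rows, cols))
    return bags
```

Each base SVM would then be fitted on the rows and columns of one bag, and the ensemble prediction would aggregate the individual outputs.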
  • Z-score-scaled transformed (variance stabilizing transformation) gene counts was used to train the model, and the rest was used as a held-out set to test the final model.
  • the training set was subsequently randomly split ten times for cross-validation, using 75% of each as intermediate training sets, and the remaining 25% as their associated testing sets.
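The repeated random splitting described above can be sketched as follows. The function name and seed are hypothetical, and the sketch uses plain shuffling; whether the original splits were stratified by class is not stated in the text.

```python
import random


def repeated_splits(sample_ids, n_splits=10, train_fraction=0.75, seed=0):
    """Return n_splits random (intermediate_train, test) partitions."""
    rng = random.Random(seed)
    n_train = int(round(train_fraction * len(sample_ids)))
    splits = []
    for _ in range(n_splits):
        shuffled = list(sample_ids)
        rng.shuffle(shuffled)
        # First 75% form the intermediate training set, the rest its test set.
        splits.append((shuffled[:n_train], shuffled[n_train:]))
    return splits
```

Each pair partitions the training set, so no sample appears in both halves of the same split, while a sample can land in the test half of several different splits.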
  • RFE recursive feature elimination
  • a bSVM classifier with default parameters was built at each iteration.
  • We defined feature importance as the average squared weight across all estimators. To maximize interpretability, we restricted the maximum number of predictors to 100 genes.
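The importance score defined above, and the cap on the number of predictors, can be sketched in a few lines. The function names and gene labels are hypothetical, and the sketch assumes that a gene left out of an estimator's random feature subset contributes a weight of zero for that estimator, which the text does not specify.

```python
def feature_importance(weights):
    """Average squared weight per feature across all estimators.

    weights: list with one entry per estimator, each a list of per-gene
    weights (assumed 0.0 where a gene was not used by that estimator).
    """
    n_estimators = len(weights)
    n_features = len(weights[0])
    return [sum(w[j] ** 2 for w in weights) / n_estimators
            for j in range(n_features)]


def top_genes(gene_names, weights, max_genes=100):
    """Rank genes by importance and keep at most max_genes of them."""
    scores = feature_importance(weights)
    order = sorted(range(len(gene_names)), key=lambda j: scores[j], reverse=True)
    return [gene_names[j] for j in order[:max_genes]]
```

Squaring the weights makes the score insensitive to the sign of a gene's contribution, so strongly up-weighted and strongly down-weighted genes rank alike.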
  • the raw fastq files with microbial sequencing reads are available from the Sequence Read Archive under BioProject ID: PRJNA783060.
  • Table 3. Gene set enrichment analysis of differentially expressed genes between patients with microbiologically confirmed sepsis (Sepsis BSI and Sepsis non-BSI) and those with non-infectious critical illnesses (No-Sepsis). Data from whole blood RNA-seq. The top 10 positively and negatively enriched pathways by P value are included in table.
  • Table 4. Gene set enrichment analysis of differentially expressed genes between patients with sepsis due to bloodstream infections (Sepsis BSI) versus peripheral infections (Sepsis non-BSI). Data from whole blood RNA-seq. The top 10 positively and negatively enriched pathways by P value are included in table.
  • AUC Area under the receiver operating characteristic curve
  • Table 7. Mass (pg) of microbial DNA in each sample, calculated based on spiked-in 25 pg ERCC positive controls.
  • Table 8. Sepsis pathogens detected by standard of care clinical microbiology and/or by cf-plasma mNGS, using the rules-based model.
  • Table 9. Gene set enrichment analysis of differentially expressed genes between patients with viral versus non-viral causes of sepsis amongst the Sepsis BSI and Sepsis non-BSI patients. Data from whole blood RNA-seq.
  • A composite list of all genes selected by each classifier model is provided in Table B.
  • Table 14. Reference index of established sepsis pathogens. Derived from the top 20 most prevalent sepsis pathogens reported by both the US CDC/National Healthcare Safety Network and a point prevalence survey of healthcare-associated infections. These studies included multiple species of Candida, Citrobacter, Enterobacter, Enterococcus, Fusobacterium, Klebsiella and Morganella, which were collapsed in the table to the genus level.
  • Table 15 provides illustrative differential gene expression data from the analysis described in section VIII.C between patients with microbiologically confirmed Sepsis BSI and Sepsis non-BSI for various genes listed in Table A.
  • Table 16 provides illustrative differential gene expression data from the analysis described in section VIII.F between patients with or without clinically confirmed viral sepsis for various genes listed in Table B.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Biotechnology (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The disclosure provides methods of evaluating cell-free nucleic acids for the determination of a likelihood that a patient has sepsis.

Description

INTEGRATED HOST-MICROBE METAGENOMICS OF CELL-FREE NUCLEIC ACID FOR SEPSIS DIAGNOSIS CROSS-REFERENCE TO RELATED APPLICATIONS [0001] This application claims priority benefit of U.S. Provisional Application No. 63/342,528, filed May 16, 2022, which is incorporated by reference in its entirety for all purposes. BACKGROUND [0002] Sepsis causes 20% of all deaths globally and contributes to 20-50% of hospital deaths in the United States (Rudd, et al., The Lancet 395, 200–211 (2020); Liu, et al., JAMA 312, 90 (2014)). Early diagnosis and identification of the etiologic pathogen facilitate timely and appropriate antibiotic therapy administration, critical factors for sepsis survival. Yet in over 30% of cases, no etiologic pathogen is identified (Novosad, et al., MMWR Morb. Mortal. Wkly. Rep. 65, 864–869 (2016)), reflecting the limitations of existing culture-based microbiologic diagnostics (Lamy, et al., Clin. Infect. Dis 35, 842–850 (2002)). Adding further complexity is the fact that existing tests do not differentiate sepsis effectively from non-infectious systemic illnesses, which often appear clinically similar at the time of hospital admission. As a result, antibiotic treatment often remains empiric rather than pathogen-targeted, with clinical decision-making based on epidemiological information rather than individual patient data. Similarly, clinicians often continue empiric antimicrobials despite negative microbiologic testing for fear of harming patients in the setting of falsely negative results. Both scenarios lead to antimicrobial overuse and misuse, which contributes to treatment failures, opportunistic infections such as C. difficile colitis, and the emergence of drug-resistant organisms (Baur, et al., The Lancet Infectious Diseases 17(9), 990-1001, (2017)). [0003] With the introduction of culture-independent methods such as metagenomic next generation sequencing (mNGS), limitations in sepsis diagnostics may be overcome (Wilson, et al., N. Engl. J. Med. 
380, 2327–2340 (2019); Blauwkamp, et al., Nature Microbiology 4, 663–674 (2019)). Recent advancements in plasma cell-free DNA sequencing have expanded the scope of mNGS diagnostics by enabling minimally invasive detection of circulating pathogen nucleic acid originating from diverse anatomical sites of infection (Blauwkamp et al., 2019, supra). While mNGS of cell-free plasma DNA has broadened the arsenal of available infectious disease diagnostics, its clinical impact has been questioned (Lee, et al., J Clin Microbiol 58, (2020); Hogan, et al., Clinical Infectious Diseases 72, 239–245 (2021)) due to frequent identification of microbes of uncertain clinical significance, inability to detect RNA viruses that cause pneumonia, and limited utility in ruling-out presence of infection. [0004] Whole blood transcriptional profiling offers the potential to mitigate these limitations by capturing host gene expression signatures that distinguish infectious from non-infectious conditions, and viral from bacterial infections (Sweeney, et al., Nat Commun 9, 694 (2018); Tsalik, et al., Science Translational Medicine 8, 322ra11-322ra11 (2016)). However, because host transcriptional profiling traditionally focuses on the host immune response, precise taxonomic identification of sepsis pathogens has not been feasible, which limits the utility of host-profiling alone. Further, transcriptional profiling has required isolating peripheral-blood mononuclear cells, or stabilizing whole blood in specialized collection tubes, and it has remained unknown whether cell-free plasma could yield informative gene expression data for infectious disease diagnosis. [0005] In recent work, a single-sample metagenomic approach combining host transcriptional profiling with unbiased pathogen detection was developed to improve lower respiratory tract infection diagnosis (Langelier, et al., Proc Natl Acad Sci USA 115, E12353-E12362 (2018)). 
As described herein below, sepsis provides an additional application of an integrated host-microbe metagenomics approach. BRIEF SUMMARY [0006] In one aspect, described herein are methods for predicting the likelihood of sepsis by determining the expression profile of a panel of host genes that undergo quantitative changes in sepsis. In some embodiments, the methods comprise determining the expression profile of host genes that undergo quantitative changes in viral sepsis. In some embodiments, determination of sepsis risk further comprises determining the microbial mass and identifying the pathogen. In some embodiments, the methods provide criteria to rule out sepsis. [0007] Thus, in one aspect, provided herein is a method of evaluating a likelihood of general sepsis in a human subject, the method comprising: detecting RNA in a cell-free RNA sample from the human subject of each member of a gene panel, wherein the gene panel comprises at least two members, or at least three members, selected from the group consisting of NEDD4; KDM3A; GOLGA2; TBC1D8; LRRK1; MBP; C2CD3; SOS1; MFN1; ATE1; MOSPD2; SMARCA5; INTS4; P4HA2; LTBP3; SZT2; ADAP1; ORC6; KIF1B; PLTP; RAB29; GYG1; SULT1B1; ATP6V1A; ZNF672; PPP1R12B; IPO7; NAA10; GAPVD1; PSMA4; ARID3A; HBA1; VPS35; RPA1; GALNT10; CHD3; MYLK3; CD53; MSI2; DCUN1D4; CASP9; RPS27L; HBB; STXBP2; PADI2; HBA2; STAT3; C20orf24; DMXL2; NUP107; KDM6B; IFNAR1; PAK1; WIPI2; MTMR14; NADSYN1; SULT1A1; CYB5B; STAT2; DOCK5; PCLAF; INTS1; DDAH2; NUP160; PLAA; PLEC; SHANK1; PDCD6; DNAJA3; AC138969.1; PMM2; TNPO1; ZNF330; TLN2; TFEB; GRAMD4; KIAA0930; ANXA3; FRA10AC1; SLC44A1; ARAP1; IFITM3; INTS6; SLAIN1; UBE2G2; DKC1; PFKFB2; SLC38A1; CAMTA2; DYM; TLK2; S100A9; C5orf51; FIG4; HRH2; NFIX; BIRC5; GPI; and TANGO2; determining a quantity of differential gene expression for each member of the gene panel compared to reference levels of RNA in control subjects; determining a probability score based on the respective amount of differential gene expression; 
and classifying the human subject as having an increased likelihood of general sepsis when the probability score exceeds a threshold value. In some embodiments, detecting RNA comprises measuring a level of RNA for at least 10 members, or at least 20 members of the gene panel. In some embodiments, detecting RNA comprises measuring a level of RNA for at least 50 members of the gene panel. In some embodiments, members of the gene panel are selected from the group consisting of MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4. In some embodiments, determining the quantity of differential gene expression comprises an amplification reaction for each member of the gene panel. In some embodiments, determining the quantity of differential gene expression comprises massively parallel sequencing. In some embodiments, quantification is performed by digital PCR. In some embodiments, the probability score for classifying the human subject as having an increased likelihood of sepsis is 0.5 or greater. 
[0008] In a further aspect, provided herein is a method of evaluating a likelihood of viral sepsis in a human subject, the method comprising: detecting RNA in a cell-free RNA sample from the human subject of each member of a gene panel, wherein the gene panel comprises at least two members, or at least three members, selected from the group consisting of: PSME3; OTUB1; RBPJ; GPX3; ERBIN; GABARAPL1; MZT2A; JMJD6; FAM214A; ZC3H11A; CACUL1; TUBG1; TRIM69; LST1; ZNF585B; RBM6; DHX29; SUGP1; SUDS3; CREB5; DYNC1H1; STXBP3; ZNF467; RAPGEF6; SIPA1; RPL7L1; CD2AP; ZNF101; CASP8AP2; CDR2; COP1; ARFGEF1; SLC30A5; RNPEP; ZFX; STARD7; CALCOCO2; BORCS8; GRHPR; DOCK9; TAF2; RANBP1; ABL1; SREK1; and DGKA; determining a quantity of differential gene expression for each member of the gene panel compared to reference levels of RNA in control subjects; determining a probability score based on the respective amount of differential gene expression; and classifying the human subject as having an increased likelihood of sepsis when the probability score exceeds a threshold value. In some embodiments, detecting RNA comprises measuring the level of RNA for at least 10 members, or at least 20 members, or at least 30 members, or at least 40 members of the gene panel. In some embodiments, members of the gene panel are selected from the group consisting of OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1. In some embodiments, quantification comprises an amplification reaction. In some embodiments, quantification is performed by massively parallel sequencing. In some embodiments, the probability score for classifying the human subject as having an increased likelihood of viral sepsis is 0.9 or greater. 
[0009] In another aspect, provided herein is a method of determining likelihood of sepsis in a subject comprising (a) quantifying microbial mass in a serum or plasma sample from a patient and (b) determining whether a predominant pathogen is present, the method comprising: (a) adding a known amount of calibration nucleic acids to a cfDNA preparation obtained from the serum or plasma sample; (b) sequencing a library generated from the cfDNA preparation; (c) aligning sequences obtained from step (b) to sequences present in a database comprising microbial sequences to determine sequence reads that map to a microbial sequence in the database; (d) determining a ratio of a first amount of total sequence reads that correspond to the calibration nucleic acids and a second amount of all microbial sequence reads in step (c); and (f) determining a total microbial mass from the known amount of the calibration nucleic acids and the ratio of the first amount and the second amount determined in (d); thereby quantifying microbial mass in the serum or plasma sample; (g) determining abundance levels of microbial species represented in the cfDNA preparation comprising determining a number of sequence reads that are mapped to individual species of microbe; (h) for each genus of microbes, selecting a species in that genus having a highest abundance level and ranking the selected species by abundance level in sequential order; (i) determining a gap threshold, wherein the gap threshold is the abundance level at which a greatest difference in abundance level occurs between sequential microbes; (j) determining whether any of the species having an abundance level at or above the gap threshold is a known blood stream pathogen, thereby identifying whether a predominant pathogen exists; and (k) identifying the patient as likely to have sepsis based on the total microbial mass determined in (f) being greater than a specified mass and that the predominant pathogen exists. 
In some embodiments, the specified mass is 20 pg. [0010] In a further aspect, described herein is a method of evaluating likelihood of sepsis, comprising performing the methods described in the preceding three paragraphs, wherein a patient has an increased likelihood of sepsis when (i) the probability score determined in claim 1 exceeds the threshold value for general sepsis, (ii) the probability score determined in claim 9 exceeds the threshold value for viral sepsis, or (iii) the total microbial mass determined in claim 15 is greater than the specified mass and the predominant pathogen exists. [0011] These and other embodiments of the disclosure are described in detail below. For example, other embodiments are directed to systems, devices, and computer readable media associated with methods described herein. [0012] A better understanding of the nature and advantages of embodiments of the present disclosure may be gained with reference to the following detailed description and the accompanying drawings. BRIEF DESCRIPTION OF THE DRAWINGS [0013] FIG.1A provides a study flow diagram. Patients studied were enrolled in the Early Assessment of Renal and Lung Injury (EARLI) cohort. Sepsis adjudication based on ≥ 2 systemic inflammatory response syndrome (SIRS) criteria plus clinical suspicion of infection was used to delineate 5 patient subgroups. Following quality control (QC), whole blood underwent RNA-seq and cell-free (cf)-RNA and DNA from plasma underwent RNA-seq and DNA-seq. [0014] FIG.1B shows analytic approaches. Host transcriptional sepsis diagnostic classifiers were trained and tested on RNA-seq data from whole blood (n=221) and plasma (n=110), with a goal of differentiating patients with microbiologically confirmed sepsis (SepsisBSI + Sepsisnon-BSI) from those without clinical evidence of infection (No-Sepsis). Viral infections were identified via a secondary host transcriptomic classifier. 
Sepsis pathogens were detected from plasma cf-nucleic acid using metagenomic next generation sequencing (mNGS) followed by a rules-based bioinformatics model (RBM). Finally, an integrated host + microbe model for sepsis diagnosis was developed and evaluated. [0015] FIG.2A-2E. Host gene expression differentiates patients with sepsis from those with non-infectious critical illnesses. 2A) Heatmap of top 50 differentially expressed genes from whole blood transcriptomics comparing patients with microbiologically confirmed sepsis (SepsisBSI + Sepsisnon-BSI) versus those without evidence of infection (No-Sepsis). 2B) Gene set enrichment analysis of the differentially expressed genes with the top 10 up- and down-regulated pathways (P < 0.05) highlighted. 2C) Receiver operating characteristic (ROC) curve demonstrating performance of bagged support vector machine (bSVM) classifier for sepsis diagnosis from whole blood transcriptomics (n=221). 2D) Cell-free plasma RNA-seq expression differences of selected differentially expressed genes previously identified as sepsis biomarkers. Adjusted P value provided above boxplot. 2E) ROC curve demonstrating performance of bSVM classifier for sepsis diagnosis from cf-plasma RNA (n=110). [0016] FIG.3A-3D. Cell-free plasma metagenomics for detecting sepsis pathogens. 3A) Microbial DNA biomass differences between sepsis adjudication groups. 3B) Graphical depiction of rules-based model for sepsis pathogen detection that identifies established pathogens with disproportionately high abundance compared to other commensal and environmental microbes in the sample. 3C) Concordance between cf-plasma DNA-seq for detecting bacterial pathogens in SepsisBSI patients with bacterial bloodstream infections compared to a gold standard of culture. 3D) Sensitivity of plasma DNA-seq for detecting bacterial pathogens, or plasma RNA-seq for detecting viral pathogens, in Sepsisnon-BSI patients with sepsis from peripheral sites of infection. 
Legend: LRTI = lower respiratory tract infection; UTI = urinary tract infection; CDI = Clostridium difficile colitis. Mass data is tabulated in Table 7. Clinical microbiology and metagenomics data is tabulated in Table 8. [0017] FIG.4A-4D. Detection of viral sepsis. 4A) GSEA of differentially expressed genes from whole blood RNA-seq (n=129) demonstrating pathways enriched in patients with viral sepsis. Gene sets with P < 0.05 included. 4B) GSEA of differentially expressed genes from cf-plasma RNA-seq (n=73) demonstrating pathways enriched in patients with viral sepsis. Gene sets with P < 0.05 included. 4C) ROC curve demonstrating performance of bagged support vector machine (bSVM) classifier for detecting viral sepsis from whole blood RNA-seq (n=129). 4D) ROC curve demonstrating performance of bSVM classifier for detecting viral sepsis from plasma RNA-seq (n=73). [0018] FIG.5A-5D. Integrated host-microbe mNGS model for sepsis diagnosis from cf-plasma. Host criteria for positivity can be met by a sepsis transcriptomic classifier probability > 0.5 (bars shown in graphs, dotted line). Microbial criteria can be met based on either: 1) detection of a pathogen by mNGS and a sample microbial mass > 20 pg (gray bars), or 2) viral transcriptomic classifier probability > 0.9 (filled circles, dotted line). Host and microbial metrics are highlighted for patients with sepsis due to 5A) bloodstream infections (SepsisBSI), 5B) peripheral infection (Sepsisnon-BSI), 5C) patients with non-infectious critical illness (No-Sepsis), and 5D) patients with suspected sepsis but negative microbiological testing (Sepsis suspected) and patients with indeterminate sepsis status (Indeterm). Cross: sepsis positive based on model. Circles: virus predicted from plasma cf-RNA secondary viral host classifier. Filled circles = virus also detected by clinical respiratory viral PCR. Cases with < 20 pg microbial mass indicated by lighter gray shading. 
Samples with mNGS-detected pathogens have the microbe(s) listed below the sample microbial mass. Raw values for plots and original training/test split assignments are tabulated in Table 13. [0019] FIG.6 is a flowchart illustrating a method of measuring the expression levels of host gene markers described herein to evaluate general sepsis risk in a subject according to embodiments of the present invention. [0020] FIG.7 is a flowchart illustrating a method of measuring the expression levels of host gene markers described herein to evaluate viral sepsis risk in a subject according to embodiments of the present invention. [0021] FIG.8 is a flowchart illustrating a method of determining microbial mass and identifying a sepsis pathogen to evaluate sepsis risk in a subject according to embodiments of the present invention. [0022] FIG.9 illustrates a measurement system 900 according to an embodiment of the present disclosure. TERMS [0023] As used herein, the following terms have the meanings ascribed to them unless specified otherwise. [0024] The terms “a,” “an,” or “the” as used herein not only include aspects with one member, but also include aspects with more than one member. For instance, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “an agent” includes reference to one or more agents known to those skilled in the art, and so forth. [0025] A patient analyzed in accordance with the invention to determine likelihood of “sepsis” meets the criteria of the Sepsis-2 definition (Levy, et al.2001 SCCM/ESICM/ACCP/ATS/SIS International Sepsis Definitions Conference. Crit Care Med 31, 1250–1256 (2003)) or exhibits a systemic inflammatory response syndrome and at least one other clinical feature associated with sepsis. 
In the Sepsis-2 definition, “sepsis” is based on the following criteria: a proven or suspected infection in combination with at least 2 systemic inflammatory response syndrome (SIRS) criteria and persistent hypotension (defined as a mean arterial pressure below 60 mm Hg, a systolic blood pressure below 90 mm Hg or a decrease in systolic blood pressure of at least 40 mm Hg) despite adequate fluid resuscitation. In some instances, a patient evaluated for sepsis has altered mental status, a systolic blood pressure ≤100 mm Hg or respiratory rate ≥22/min. Bacterial infections are the most common cause of sepsis, but fungal, viral, and protozoan infections can also lead to sepsis. As used herein, “general” sepsis refers to sepsis arising from infection with any of these pathogens. Common locations for a primary infection leading to sepsis include, but are not limited to, lungs, brain, urinary tract, skin, and abdominal organs. [0026] The term “cell-free RNA sample” or “cfRNA sample” refers to a nucleic acid sample comprising extracellular RNA that is recoverable from a non-cellular fraction of a sample and includes fragments of full-length RNA transcripts. In typical embodiments, the sample is from whole blood processed to remove cells, e.g., a plasma or serum sample. [0027] A “cell-free nucleic acid sample” refers to either cfRNA or cell-free DNA (cfDNA). [0028] The terms “determining,” “assessing,” “assaying,” “measuring” and “detecting” with respect to assessing sepsis-associated patient cfRNA profiles refer to quantitative determinations. [0029] As used herein, “determining a quantity of differential gene expression for each member of a gene panel” refers to determining the gene expression level of a gene in cfRNA from a test sample relative to a control expression level. In some embodiments, the control expression level is obtained from a population of subjects.
In some embodiments, for evaluating likelihood of general sepsis, a control population of subjects comprises subjects who do not have a clinical symptom of sepsis. In some embodiments, for evaluating likelihood of viral sepsis, a control population of subjects comprises subjects who have sepsis that arises from infection with a bacterial, fungal, or protozoal microorganism. [0030] As used herein, the terms “cutoff” and “threshold” refer to predetermined numbers used in an operation. A threshold value may be a value above or below which a particular classification applies. Either of these terms can be used in either of these contexts. A cutoff or threshold may be “a reference value” or derived from a reference value that is representative of a particular classification or discriminates between two or more classifications. Such a reference value can be determined in various ways, as will be appreciated by the skilled person. For example, metrics can be determined for two different groups of subjects with different known classifications, and a reference value can be selected as representative of one classification (e.g., a mean) or a value that is between two clusters of the metrics (e.g., chosen to obtain a desired sensitivity and specificity). As another example, a reference value can be determined based on statistical simulations of samples. [0031] The term “amount” or “level” of cfRNA expressed by a gene refers to the quantity of copies of an RNA transcript being assayed, including fragments of full-length transcripts that can be unambiguously identified as fragments of the transcript being assayed. Such quantity may be expressed as the total quantity of the RNA, in relative terms, e.g., compared to the level present in a control cfRNA sample, or as a concentration e.g., copy number per milliliter, of the RNA in the sample. 
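As one illustration of the reference-value selection described in [0030], the following minimal sketch picks a cutoff between two groups of scores so that a desired sensitivity is met. The function name, the score values in the usage note, and the rule "highest cutoff that still reaches the target sensitivity" are assumptions made for illustration; the disclosure does not prescribe a particular selection procedure.

```python
def pick_threshold(scores_positive, scores_negative, min_sensitivity=0.9):
    """Return (cutoff, sensitivity, specificity) for a 'score >= cutoff' rule.

    Hypothetical helper: scans candidate cutoffs from high to low and returns
    the first (highest) one whose sensitivity on the known-positive group
    reaches min_sensitivity.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must contain at least one score")
    candidates = sorted(set(scores_positive) | set(scores_negative), reverse=True)
    for cutoff in candidates:
        # Fraction of known positives classified positive at this cutoff.
        sensitivity = sum(s >= cutoff for s in scores_positive) / len(scores_positive)
        if sensitivity >= min_sensitivity:
            # Fraction of known negatives classified negative at this cutoff.
            specificity = sum(s < cutoff for s in scores_negative) / len(scores_negative)
            return cutoff, sensitivity, specificity
    raise ValueError("min_sensitivity must not exceed 1.0")
```

With invented scores such as `pick_threshold([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1], min_sensitivity=0.75)`, the sketch selects a cutoff that lies between the two clusters of metrics. A mean of one group, or a value derived from statistical simulation, would be equally consistent with the definition above.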
[0032] As used herein, the term "expression level" of a gene as described herein refers to the amount of an RNA transcript, e.g., an mRNA transcript, of the gene. [0033] The terms "host gene expression" as used in this disclosure in the context of a gene expression panel, refers to the amount of cell-free RNA in a cell-free nucleic acid sample from a subject that is expressed by a gene originating from the host, i.e., the subject, as opposed to expression of a microbial, e.g., bacterial, viral, or fungal, gene. [0034] Human genes are typically referred to herein using the official symbol and official nomenclature for the human gene as assigned by the HUGO Gene Nomenclature Committee, when HUGO nomenclature is available. In the present disclosure, an individual gene as designated herein may also have alternative designations, e.g., as indicated in the HGNC database. As used herein, the term "signature gene" refers to a gene whose expression is correlated with sepsis. A “gene panel” refers to a collection of such signature genes for which gene expression scores are generated and used to provide a risk/likelihood score for sepsis. Reference to the gene by name includes any human allelic variant or splice variant encoded by the gene. [0035] The term “nucleic acid” or “polynucleotide” as used herein refers to a deoxyribonucleotide or ribonucleotide in either single- or double-stranded form. In the context of primers or probes, the term encompasses nucleic acids containing known analogues of natural nucleotides which have similar or improved binding properties, for the purposes desired, as the reference nucleic acid; and nucleic-acid-like structures with synthetic backbones. 
[0036] The terms "identical" or percent "identity," in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same (“identical”) or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., at least about 70% identity, at least about 75% identity, at least 80% identity, at least about 90% identity, preferably at least about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over the entire sequence of a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region). Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds. 1995 supplement)). Algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and Altschul et al., J. Mol. Biol. 215:403-410 (1990), respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/).
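The percent identity described in [0036] can be illustrated with a short sketch that operates on two sequences that have already been aligned (for example by one of the algorithms cited above). The function name and the convention of dividing by the full length of the aligned window, with gap columns counted as mismatches, are assumptions for illustration; other denominators are also in common use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of columns in the aligned window where both residues match.

    Hypothetical helper: inputs are pre-aligned, equal-length strings in which
    `gap` marks an alignment gap. Gap columns count toward the window length
    but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)
```

A caller could then compare the result against one of the recited levels, e.g., `percent_identity(a, b) >= 90.0`.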
[0037] As used here, “microbial mass” refers to the microbial biomass present in a sample, preferably determined by metagenomic sequencing of a cell-free nucleic acid sample, e.g., cfDNA, obtained from a patient. Thus, in some embodiments, “microbial mass” is calculated based on the ratio of (i) the number of sequence reads corresponding to sequences representing “spike-in” calibration nucleic acids added in a known amount to the cell-free nucleic acid sample and (ii) the number of sequence reads in the nucleic acid sample that are identified by alignment to one or more sequence databases as being of microbial, e.g., bacterial, fungal, or viral, origin. The calibration nucleic acids may be any nucleic acids that have a known sequence. The calibration nucleic acids may have different sequences, each of which is known; a calibration sample can have a known concentration of the calibration nucleic acids. Preferably, the calibration nucleic acids would not occur in test samples, at least not in appreciable amounts. The calibration nucleic acids can form a calibration genome to which sequences are aligned, e.g., to identify them as calibration nucleic acids. [0038] The term “treatment,” “treat,” or “treating” typically refers to a clinical intervention, including multiple interventions over a period of time, to ameliorate at least one symptom of sepsis or otherwise slow progression. This includes alleviation of symptoms or diminishment of any direct or indirect pathological consequences of sepsis. DETAILED DESCRIPTION [0039] In the present disclosure, a prospective cohort of critically ill adults was evaluated to develop sepsis diagnostic assays, including assays that combine host transcriptional profiling with pathogen identification. By applying machine learning to high dimensional metagenomic next generation sequencing (mNGS) data, host and microbial features that distinguish microbiologically confirmed sepsis from non-infectious critical illness were identified.
It was additionally determined that cell-free plasma nucleic acid can be used to profile both host and microbe for precision sepsis diagnosis. Accordingly, described herein are sepsis diagnostic assays that employ host transcriptional profiling, pathogen abundance, pathogen identification, and combinations of these assays to evaluate sepsis risk. [0040] In one aspect, described herein are methods for predicting the likelihood of “general” sepsis in a patient based on transcriptional profiling in cell-free RNA (cfRNA) of host marker genes that are associated with general sepsis. “General” sepsis as used herein refers to sepsis arising from any microbial pathogen. [0041] In another aspect, described herein are methods for predicting the likelihood of viral sepsis based on transcriptional profiling of cfRNA of host marker genes that are specifically associated with viral sepsis. [0042] In a further aspect, the disclosure describes determination of sepsis risk based on microbial mass and detection of a dominant pathogen based on cfDNA analysis from a plasma or serum sample. Microbial mass determination can comprise sequencing of cfDNA from a sample obtained from a patient serum or plasma sample; and determining the microbial mass by weight, e.g., picograms, for all nucleic acids identified as originating from microbes based on alignments to one or more taxonomy databases. Detection of a dominant pathogen can comprise identifying whether sequence reads that map to an established (e.g., known) bloodstream pathogen are overrepresented (compared to commensal or contaminating microbes) in sequence data from cfDNA obtained from a serum or plasma sample. A pathogen that is overrepresented is referred to herein as a “dominant,” “predominant,” or “disproportionately abundant” pathogen. [0043] Determination of an abundance level of a microbial species can be based on the number of sequence reads (e.g., reads per million) that map to the microbe sequence.
For each genus of microbes identified by mapping the sequence reads, the most abundant species, i.e., having the highest abundance level, in each genus can be selected and the selected species can be ranked by abundance level in sequential order. A gap threshold can be determined, where the gap threshold is the abundance level at which the greatest difference in abundance level occurs between sequential microbes. It is also determined whether a species having an abundance level at or above the gap threshold is a known bloodstream pathogen, thereby identifying whether a predominant pathogen exists. A patient is identified as likely to have sepsis based on the total microbial mass determined being greater than a specified mass, e.g., 20 pg, and on the existence of a predominant pathogen. [0044] In one aspect, techniques can combine methods into an integrated host and microbe model for sepsis determination that can maximize accuracy (e.g., a negative predictive value) using sequencing of cell-free nucleic acids from a blood sample (e.g., serum or plasma) and pathogen detection, to define individuals likely to have sepsis. In some implementations, if any one of the three methods determines that sepsis is likely (e.g., has a likelihood higher than a respective threshold), then the patient can be treated for sepsis. But if none of the three methods determines that sepsis is likely, then sepsis can be ruled out, and the patient would not be unnecessarily subjected to antibiotics or other treatments. I. HOST GENE EXPRESSION PROFILING FOR GENERAL SEPSIS RISK ASSESSMENT [0045] As detailed in the EXAMPLES section, the inventors determined that host gene expression can be evaluated to assess risk of general sepsis, i.e., sepsis due to any microorganism.
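The microbial-mass calculation of [0037] and the rank-and-gap selection of [0043] can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the EXAMPLES: all function names are hypothetical, the mass estimate assumes read counts scale linearly with input mass, and the handling of a sample with fewer than two genera (no gap to compute) is a choice made here.

```python
def microbial_mass_pg(microbial_reads, spike_in_reads, spike_in_mass_pg):
    """Estimate microbial mass from the microbial-to-spike-in read ratio.

    Assumption: reads are proportional to input mass, so the known spike-in
    mass can be scaled by the read ratio.
    """
    if spike_in_reads <= 0:
        raise ValueError("spike-in read count must be positive")
    return spike_in_mass_pg * microbial_reads / spike_in_reads


def predominant_species(abundance, genus_of):
    """Return species at or above the gap threshold, most abundant first.

    abundance: species name -> abundance level (e.g., reads per million)
    genus_of:  species name -> genus name
    """
    # Keep only the most abundant species within each genus.
    top_by_genus = {}
    for species, level in abundance.items():
        genus = genus_of[species]
        if genus not in top_by_genus or level > abundance[top_by_genus[genus]]:
            top_by_genus[genus] = species
    ranked = sorted(top_by_genus.values(), key=lambda s: abundance[s], reverse=True)
    if len(ranked) < 2:
        return ranked  # assumption: with no gap to compute, keep what was found
    # Locate the largest drop between sequentially ranked species.
    gaps = [abundance[ranked[i]] - abundance[ranked[i + 1]] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps))
    return ranked[: cut + 1]


def microbial_criteria_met(abundance, genus_of, known_pathogens, mass_pg, min_mass_pg=20.0):
    """True when mass exceeds min_mass_pg and a predominant known pathogen exists."""
    hits = [s for s in predominant_species(abundance, genus_of) if s in known_pathogens]
    return mass_pg > min_mass_pg and bool(hits), hits
```

The 20 pg default mirrors the example mass given in [0043]; the list of known bloodstream pathogens passed as `known_pathogens` would come from a curated reference that is outside the scope of this sketch.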
In some embodiments, cfRNA obtained from a plasma or serum sample from the subject is evaluated to determine the level of RNA encoded by each member of a panel of genes for which cfRNA levels are associated with sepsis relative to control levels from subjects that do not have clinical evidence of infection. [0046] In alternative embodiments, whole blood samples can be evaluated by RNA sequencing of RNA obtained from whole blood samples. [0047] Signature genes identified by the inventors for evaluating cfRNA obtained from a plasma or serum sample from a patient to determine likelihood of general sepsis are described in part A of this section. Signature genes for evaluating RNA from a whole blood sample to determine likelihood of general sepsis are described in part B. A. Gene Panels--cfRNA [0048] In some embodiments, sepsis risk is evaluated by determining the amount of RNA of each member of a panel of genes, or each member of a subset of the panel of genes, in cfRNA obtained from a serum or plasma sample from a patient suspected of having sepsis. In some embodiments, sepsis risk is evaluated by determining the amount of RNA of each member of a panel of genes, or each member of a subset of the panel of genes, in RNA obtained from whole blood sample from a patient suspected of having sepsis. Such methods comprise quantifying the amount of RNA for each of a panel of genes associated with sepsis in a cfRNA sample obtained from plasma or serum from a human subject exhibiting one or more symptoms consistent with a diagnosis of sepsis. 
In typical embodiments, at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90 or at least 95 genes selected from the following are quantified to determine expression levels: NEDD4, KDM3A; GOLGA2; TBC1D8; LRRK1; MBP; C2CD3; SOS1; MFN1; ATE1; MOSPD2; SMARCA5; INTS4; P4HA2; LTBP3; SZT2; ADAP1; ORC6; KIF1B; PLTP; RAB29; GYG1; SULT1B1; ATP6V1A; ZNF672; PPP1R12B; IPO7; NAA10; GAPVD1; PSMA4; ARID3A; HBA1; VPS35; RPA1; GALNT10; CHD3; MYLK3; CD53; MSI2; DCUN1D4; CASP9; RPS27L; HBB; STXBP2; PADI2; HBA2; STAT3; C20orf24; DMXL2; NUP107; KDM6B; IFNAR1; PAK1; WIPI2; MTMR14; NADSYN1; SULT1A1; CYB5B; STAT2; DOCK5; PCLAF; INTS1; DDAH2; NUP160; PLAA; PLEC; SHANK1; PDCD6; DNAJA3; AC138969.1; PMM2; TNPO1; ZNF330; TLN2; TFEB; GRAMD4; KIAA0930; ANXA3; FRA10AC1; SLC44A1; ARAP1; IFITM3; INTS6; SLAIN1; UBE2G2; DKC1; PFKFB2; SLC38A1; CAMTA2; DYM; TLK2; S100A9; C5orf51; FIG4; HRH2; NFIX; BIRC5, GPI, and TANGO2. 
In some embodiments, a panel evaluated for viral sepsis comprises at least two genes, or at least three, four, or five; or at least ten or more genes selected from NEDD4, KDM3A; GOLGA2; TBC1D8; LRRK1; MBP; C2CD3; SOS1; MFN1; ATE1; MOSPD2; SMARCA5; INTS4; P4HA2; LTBP3; SZT2; ADAP1; ORC6; KIF1B; PLTP; RAB29; GYG1; SULT1B1; ATP6V1A; ZNF672; PPP1R12B; IPO7; NAA10; GAPVD1; PSMA4; ARID3A; HBA1; VPS35; RPA1; GALNT10; CHD3; MYLK3; CD53; MSI2; DCUN1D4; CASP9; RPS27L; HBB; STXBP2; PADI2; HBA2; STAT3; C20orf24; DMXL2; NUP107; KDM6B; IFNAR1; PAK1; WIPI2; MTMR14; NADSYN1; SULT1A1; CYB5B; STAT2; DOCK5; PCLAF; INTS1; DDAH2; NUP160; PLAA; PLEC; SHANK1; PDCD6; DNAJA3; AC138969.1; PMM2; TNPO1; ZNF330; TLN2; TFEB; GRAMD4; KIAA0930; ANXA3; FRA10AC1; SLC44A1; ARAP1; IFITM3; INTS6; SLAIN1; UBE2G2; DKC1; PFKFB2; SLC38A1; CAMTA2; DYM; TLK2; S100A9; C5orf51; FIG4; HRH2; NFIX; BIRC5, GPI, and TANGO2 and at least one additional gene that plays a role in neutrophil degranulation or the innate immune system. [0049] The “ENSG” designation of each gene is shown in Table A. The “ENSG” designation is based on ENSEMBL version 99. Table A. ENSEMBL designations.
[Table A is reproduced as images in the original publication (imgf000017_0001 through imgf000020_0001).]
[0050] Additional gene information, including chromosome location, number of transcripts (e.g., splice variants) encoded by the gene that have been identified, and UniProt identification numbers for protein-encoding genes are available in the ENSEMBL entry. In the present disclosure, reference to the gene by name includes variants, such as allelic variants, including SNP variants, splice variants, and the like. Thus, in some instances, the gene may have a sequence that is at least 85% identical to the sequence provided in the respective ENSEMBL entry. In some embodiments, the gene may have a sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% identical to the sequence provided in the ENSEMBL entry. [0051] In some embodiments, sepsis risk is determined by measuring expression levels of at least two, at least three, at least four, or at least five genes selected from a group consisting of MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4. In some embodiments, detection of sepsis risk comprises assessing expression levels in cfRNA of at least six, seven, eight, nine, or ten genes selected from MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4. In some embodiments, detection of sepsis risk comprises assessing expression levels of cfRNA of at least eleven, twelve, thirteen, fourteen, or fifteen genes selected from MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4.
In other embodiments, detecting sepsis risk comprises assessing expression levels of cfRNA of at least sixteen, seventeen, eighteen, nineteen, or twenty genes selected from MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4. [0052] In some embodiments, risk determination comprises quantifying cfRNA for a subset of twenty one, twenty two, twenty three, twenty four, twenty five, twenty six, or twenty seven genes selected from MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4. In some embodiments, sepsis risk is determined by quantifying cfRNA expression of the twenty eight genes MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4. In some embodiments, a panel evaluated for viral sepsis comprises at least two genes, or at least three, four, or five; or at least 10 or more genes selected from MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4; and at least one additional gene that plays a role in neutrophil degranulation or the innate immune system. [0053] One of skill understands that many subsets of the 99 genes listed in Table A or the 28 genes can be informative in determining sepsis risk, e.g., depending on the sensitivity and specificity desired for the assay. The illustrative subsets described in the Tables are examples and are not limiting.
[0054] In some embodiments, a gene panel for sepsis risk includes one or more genes upregulated in neutrophil degranulation and/or one or more genes upregulated in innate immune signaling. In some embodiments, a gene panel for sepsis risk includes one or more genes downregulated in translation and rRNA processing. B. Gene Panels--whole blood [0055] In some embodiments, sepsis risk is evaluated by determining the amount of RNA of each member of a panel of genes, or each member of a subset of the panel of genes, in RNA obtained from a whole blood sample from a patient suspected of having sepsis. In some embodiments, the panel comprises at least two, at least three, at least four, or at least five genes selected from a group consisting of CCR1, RPS5, IFITM3, DSC2, RPS3A, AC084082.1, THOC6, ZNF639, ZCCHC4, EXT1, WDR49, CBX1, TDP2, MTERF2, PRPS1, DAAM1, NOG, CALCRL, IQCB1, MAIP1, TSPAN13, NDST3, Z97832.2, SLC6A19, RPS19BP1, MRI1, LSM12, MPZL1, SLC35E3, H6PD, SYTL2, ZNF468, TXNL4A, ORAI3, UBN2, SMYD4, NDUFA3, MRPL41, WDR77, ZNF862, ZNF616, ACTR8, CHST13, EMG1, METTL21A, MBLAC2, NUP88, EFCAB5, PIGW, GLCCI1, CFAP100, and SLITRK4. In some embodiments, detection of sepsis risk comprises assessing expression levels of at least six, seven, eight, nine, or ten genes selected from CCR1, RPS5, IFITM3, DSC2, RPS3A, AC084082.1, THOC6, ZNF639, ZCCHC4, EXT1, WDR49, CBX1, TDP2, MTERF2, PRPS1, DAAM1, NOG, CALCRL, IQCB1, MAIP1, TSPAN13, NDST3, Z97832.2, SLC6A19, RPS19BP1, MRI1, LSM12, MPZL1, SLC35E3, H6PD, SYTL2, ZNF468, TXNL4A, ORAI3, UBN2, SMYD4, NDUFA3, MRPL41, WDR77, ZNF862, ZNF616, ACTR8, CHST13, EMG1, METTL21A, MBLAC2, NUP88, EFCAB5, PIGW, GLCCI1, CFAP100, and SLITRK4.
In some embodiments, detection of sepsis risk comprises assessing expression levels of whole blood RNA of at least eleven, twelve, thirteen, fourteen, or fifteen genes selected from CCR1, RPS5, IFITM3, DSC2, RPS3A, AC084082.1, THOC6, ZNF639, ZCCHC4, EXT1, WDR49, CBX1, TDP2, MTERF2, PRPS1, DAAM1, NOG, CALCRL, IQCB1, MAIP1, TSPAN13, NDST3, Z97832.2, SLC6A19, RPS19BP1, MRI1, LSM12, MPZL1, SLC35E3, H6PD, SYTL2, ZNF468, TXNL4A, ORAI3, UBN2, SMYD4, NDUFA3, MRPL41, WDR77, ZNF862, ZNF616, ACTR8, CHST13, EMG1, METTL21A, MBLAC2, NUP88, EFCAB5, PIGW, GLCCI1, CFAP100, and SLITRK4. In other embodiments, detecting sepsis risk comprises assessing expression levels of RNA of at least sixteen, seventeen, eighteen, nineteen, or twenty genes; or at least thirty, thirty five, forty, or at least fifty genes selected from CCR1, RPS5, IFITM3, DSC2, RPS3A, AC084082.1, THOC6, ZNF639, ZCCHC4, EXT1, WDR49, CBX1, TDP2, MTERF2, PRPS1, DAAM1, NOG, CALCRL, IQCB1, MAIP1, TSPAN13, NDST3, Z97832.2, SLC6A19, RPS19BP1, MRI1, LSM12, MPZL1, SLC35E3, H6PD, SYTL2, ZNF468, TXNL4A, ORAI3, UBN2, SMYD4, NDUFA3, MRPL41, WDR77, ZNF862, ZNF616, ACTR8, CHST13, EMG1, METTL21A, MBLAC2, NUP88, EFCAB5, PIGW, GLCCI1, CFAP100, and SLITRK4. [0056] The “ENSG” designation of each gene is shown in Table C. The “ENSG” designation is based on ENSEMBL version 99.
[Table C is reproduced as images in the original publication (imgf000023_0001 and imgf000024_0001).]
[0057] One of skill understands that many subsets of the genes listed in Table C can be informative in determining sepsis risk, e.g., depending on the sensitivity and specificity desired for the assay. The illustrative subsets described in the Tables are examples and are not limiting. II. HOST TRANSCRIPTIONAL PROFILING FOR VIRAL SEPSIS RISK ASSESSMENT [0058] As illustrated in FIGS. 4A-D, the inventors further determined that host gene expression can be evaluated to assess risk of viral sepsis. Viral sepsis can be difficult to distinguish from sepsis arising from another microorganism such as bacteria or fungi. In some embodiments, cfRNA obtained from a plasma or serum sample from the subject is evaluated to determine the level of RNA encoded by each member of a panel of genes for which cfRNA levels are associated with sepsis relative to control levels from subjects that have systemic infection or sepsis, but do not have clinically confirmed viral sepsis. [0059] In alternative embodiments, whole blood samples can be evaluated by sequencing of RNA obtained from whole blood samples. [0060] Signature genes identified by the inventors for evaluating cfRNA obtained from a plasma or serum sample from a patient to determine likelihood of viral sepsis are described in part A of this section. Signature genes for evaluating RNA from a whole blood sample to determine likelihood of viral sepsis are described in part B. A. Gene Panels--cfRNA [0061] In some embodiments, a cell-free sample from blood, e.g., a serum or plasma sample, is evaluated to determine levels of each member of a host gene signature panel, or a subset thereof, comprising genes identified as undergoing quantitative changes in a viral infection compared to a non-viral infection. Such methods comprise quantifying the RNA level in a cfRNA sample obtained from a human subject exhibiting one or more symptoms consistent with a diagnosis of sepsis.
In typical embodiments, at least 5, at least 10, at least 20, at least 25, at least 30, at least 35, or at least 40 genes selected from the following are quantified to determine expression levels: PSME3; OTUB1; RBPJ; GPX3; ERBIN; GABARAPL1; MZT2A; JMJD6; FAM214A; ZC3H11A; CACUL1; TUBG1; TRIM69; LST1; ZNF585B; RBM6; DHX29; SUGP1; SUDS3; CREB5; DYNC1H1; STXBP3; ZNF467; RAPGEF6; SIPA1; RPL7L1; CD2AP; ZNF101; CASP8AP2; CDR2; COP1; ARFGEF1; SLC30A5; RNPEP; ZFX; STARD7; CALCOCO2; BORCS8; GRHPR; DOCK9; TAF2; RANBP1; ABL1; SREK1; and DGKA. In some embodiments, a panel evaluated for viral sepsis comprises at least two genes, or at least 5, 10, or 15 genes selected from PSME3; OTUB1; RBPJ; GPX3; ERBIN; GABARAPL1; MZT2A; JMJD6; FAM214A; ZC3H11A; CACUL1; TUBG1; TRIM69; LST1; ZNF585B; RBM6; DHX29; SUGP1; SUDS3; CREB5; DYNC1H1; STXBP3; ZNF467; RAPGEF6; SIPA1; RPL7L1; CD2AP; ZNF101; CASP8AP2; CDR2; COP1; ARFGEF1; SLC30A5; RNPEP; ZFX; STARD7; CALCOCO2; BORCS8; GRHPR; DOCK9; TAF2; RANBP1; ABL1; SREK1; and DGKA, and at least one additional gene that plays a role in responses to elevated platelet cytosolic Ca2+ or interferon alpha/beta signaling, or that is a chemokine or chemokine receptor. [0062] The “ENSG” designation of each gene is shown in Table B. The “ENSG” designation is based on ENSEMBL version 99. Table B. ENSEMBL designations. ENSEMBL ID Gene Name
[Table B is reproduced as images in the original publication (imgf000026_0001 and imgf000027_0001).]
[0063] Additional gene information, including chromosome location, number of transcripts (e.g., splice variants) encoded by the gene that have been identified, and UniProt identification numbers are available in the ENSEMBL entry. Reference to the gene by name includes variants, such as allelic variants, including SNP variants, splice variants, and the like. [0064] In some embodiments, sepsis risk is determined by measuring levels of at least two, at least three, at least four, or at least five genes selected from a group consisting of OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1. In some embodiments, detection of sepsis risk comprises assessing levels in cfRNA of at least six, seven, eight, nine, or ten genes selected from OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1. In some embodiments, detection of sepsis risk comprises assessing expression levels of cfRNA of at least eleven, twelve, thirteen, fourteen, or fifteen genes selected from OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1. In other embodiments, detecting viral sepsis risk comprises assessing expression levels of cfRNA of at least sixteen, seventeen, eighteen, nineteen, or twenty genes selected from OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1. In other embodiments, detecting viral sepsis risk comprises assessing expression levels of the twenty genes OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1.
In some embodiments, a panel evaluated for viral sepsis comprises at least two genes selected from OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1 and at least one additional gene that plays a role in responses to elevated platelet cytosolic Ca2+ or interferon alpha/beta signaling, or that is a chemokine or chemokine receptor. [0065] One of skill understands that many other subsets of the 45 genes listed in Table B or the 20 genes indicated above can be informative in determining viral sepsis risk, e.g., depending on the sensitivity and specificity desired for the assay. B. Gene panels--whole blood [0066] In some embodiments, sepsis risk is determined by determining levels of RNA of each member of a panel of genes, or a subset thereof, in a whole blood sample. In some embodiments, the panel comprises at least two, at least three, at least four, or at least five genes selected from a group consisting of the genes listed in Table D. In some embodiments, detection of sepsis risk comprises assessing expression levels of at least six, seven, eight, nine, or ten genes selected from the genes listed in Table D. In some embodiments, detection of sepsis risk comprises assessing levels of whole blood RNA of at least eleven, twelve, thirteen, fourteen, or fifteen genes selected from the genes listed in Table D. In other embodiments, detecting sepsis risk comprises assessing levels of RNA of at least sixteen, seventeen, eighteen, nineteen, or twenty genes; or at least thirty, thirty five, forty, or at least fifty genes; or at least sixty, seventy, or eighty or more genes listed in Table D. [0067] The “ENSG” designation of each gene is shown in Table D. The “ENSG” designation is based on ENSEMBL version 99.
[Table D is reproduced as images in the original publication (imgf000028_0001 through imgf000030_0001).]
III. QUANTIFICATION OF DIFFERENTIALLY EXPRESSED GENES IN GENERAL SEPSIS AND VIRAL SEPSIS GENE PANELS A. Assessing RNA levels for members of a gene 1. Methods of evaluating RNA profiles [0068] In order to analyze RNA profiles from a subject to be evaluated for sepsis and/or viral sepsis risk, RNA is obtained from a whole blood sample or a bodily fluid sample that does not contain cells. In some instances, cfRNA is isolated from a serum or plasma sample. The RNA is processed to evaluate levels of RNA, e.g., one or more genes selected from the gene panels described herein present in the RNA sample, e.g., cfRNA from serum or plasma or RNA from a whole blood sample. In some instances, the sample is obtained within 24 hours, or within 48 hours, of admission of a patient to hospital; or within 24 hours, or within 48 hours, of when a patient is determined to be at risk of sepsis based on the clinical factors as described above. [0069] In some instances, e.g., when RNA levels for each member of a gene panel for general sepsis is to be evaluated in conjunction with the microbial mass/identification of pathogen aspect of the invention further detailed below, cfRNA may be evaluated by nucleic acid sequencing. In some embodiments, the gene panel comprises at least two or three genes set forth in Table A or Table B. As understood by one of skill in the art, a cfRNA preparation can be depleted of abundant sequences, such as mitochondrial or ribosomal RNA sequences, to enrich for coding transcripts. Further, the cell-free nucleic acid preparation, for example, RNA preparation or cDNA transcribed from the RNA preparation, can be fragmented, e.g., by mechanical, enzymatic or chemical shearing, to obtain a population of nucleic acid molecules having a uniform size distribution for next generation sequencing. 
[0070] Sequencing technologies that can be used to evaluate RNA profiles, e.g., in cfRNA from a plasma or serum sample, include next generation sequencing platforms such as RNA-seq. Illustrative sequencing platforms suitable for use according to the methods include, e.g., ILLUMINA® sequencing (e.g., HiSeq, MiSeq), SOLID® sequencing, ION TORRENT® sequencing, and SMRT® sequencing and those commercialized by Roche 454 Life Sciences (GS systems).
[0071] In some instances, alternative methodology for assessing RNA levels may be employed. For example, the level of RNA in a cfRNA sample from serum or plasma can be detected or measured by a variety of methods including, but not limited to, an amplification assay or a microarray chip (hybridization) assay. As used herein, "amplification" of a nucleic acid sequence has its usual meaning, and refers to in vitro techniques for enzymatically increasing the number of copies of a target sequence. Amplification methods include both asymmetric methods in which the predominant product is single-stranded and conventional methods in which the predominant product is double-stranded. The term “microarray” refers to an ordered arrangement of hybridizable elements, e.g., gene-specific oligonucleotides, attached to a substrate. Hybridization of nucleic acids from the sample to be evaluated is determined and converted to a quantitative value representing relative gene expression levels.
[0072] Non-limiting examples of methods to evaluate levels of RNA, e.g., cfRNA from serum or plasma, include amplification assays such as quantitative RT-PCR and digital PCR; microarray analysis; ligation chain reaction; oligonucleotide elongation assays; and various multiplexed assays, such as multiplexed amplification assays. In some embodiments, isothermal amplification methods that may be used to measure gene expression levels include, for example, loop-mediated isothermal amplification (LAMP).
[0072] Typically, cfRNA values determined by sequencing or an alternative methodology are normalized to account for sample-to-sample variations in RNA isolation and the like. Methods for normalization are well known in the art. For example, in some embodiments, normalization may be performed with reference to housekeeping genes that are constitutively expressed at any development stage irrespective of pathophysiological state. Exemplary housekeeping genes or normalization genes that may be used include the Ribonuclease P (RNaseP) gene and genes encoding α-actin, β-actin, 18S rRNA, 28S rRNA, albumin, and glyceraldehyde-3-phosphate dehydrogenase (GAPDH). Additional suitable housekeeping genes that can be used to carry out the methods described herein may be found in the HRT Atlas Database (www.housekeeping.unicamp.br; Hounkpe et al. (2020), "HRT Atlas v1.0 database: redefining human and mouse housekeeping genes and candidate reference transcripts by mining massive RNA-seq datasets," Nucleic Acids Research: gkaa609).
[0073] In some embodiments, normalization of values is performed using trimmed mean of M values (TMM) normalization, e.g., when using RNA-Seq to evaluate cfRNA expression levels. In some embodiments, normalized values may be obtained using a reference level for one or more exogenous nucleic acids, e.g., exogenous RNA oligonucleotides added to a sample. A control value for normalization of RNA values can be predetermined, determined concurrently, or determined after a sample is obtained from the subject.
2. Quantification of differential expression-general sepsis
[0074] The amount of RNA of each gene measured can be quantified compared to levels of each RNA in a population of control subjects. For quantification of general sepsis, control subjects do not have clinical signs of infection, including signs such as increased pulse rate, body temperature, hypotension, hyperventilation and/or respiratory alkalosis.
In some instances, the control subjects have systemic inflammatory disease or a critical illness, e.g., cardiac arrest, overdose/poisoning, heart failure exacerbation, or pulmonary embolism. A control population typically comprises at least 10 subjects, or 50 or more subjects (e.g., 10-100 subjects). In some embodiments, a control population comprises 500 or more subjects. A value may represent the median transcript level or concentration of the selected transcript in the control population.
[0075] Determination of a probability score and classification of whether the subject is likely to have sepsis is further detailed in Subsection B of this section.
3. Quantification of differential expression-viral sepsis
[0076] The amount of RNA of each gene measured in a gene panel for determining viral sepsis is quantified compared to levels of each RNA in a population of control subjects. For quantification of viral sepsis, the amounts of RNA can be compared to control subjects having clinically adjudicated sepsis due to a non-viral pathogen. In some instances, such control subjects have a microbiologically confirmed bacterial bloodstream infection or a microbiologically confirmed bacterial non-bloodstream infection. A control population typically comprises at least 10 subjects, or 50 or more subjects (e.g., 10-100 subjects). In some embodiments, a control population comprises 500 or more subjects. A value may represent the median transcript level or concentration of the selected transcript in the control population.
[0077] Determination of a probability score and classification of whether the subject is likely to have sepsis is further detailed in Subsection B of this section.
B. Determination of probability score and classifier
[0078] The greater the quantity of differential expression of any of the markers, generally the higher the probability that the subject would be classified as likely to have sepsis.
As described above, the quantity of differential expression for each marker can be determined using a difference or ratio between a measured expression level and a reference expression level. A relationship between this quantity and the likelihood (probability) of having sepsis can be determined, e.g., using a proportion of samples having sepsis that have a given quantity of differential expression. This can be done for each marker. Further, a probability score can be determined based on the quantities of differential expression for all the markers. [0079] The overall probability score can be determined in various ways. For instance, a total quantity of differential expression can be determined, e.g., as a weighted sum or average of the individual quantities of differential expression. The weights can be based on the importance (discriminating power) of each marker in discriminating sepsis from non-sepsis. Then, the proportion of the subjects that have sepsis at a given value for the total quantity can be used as the probability score. As another example, a machine learning model can provide the probability, e.g., a support vector machine (SVM) can provide a probability based on a distance of a multidimensional point of the expression levels from the hyperplane that distinguishes between sepsis and non-sepsis. [0080] Accordingly, a probability score to classify the subject as likely or not likely to have sepsis can be determined based on the level of differential expression of each member of a gene panel as described herein, or a subset thereof. In some embodiments, the level of expression of each gene is weighted with a predefined coefficient. The predefined coefficients can be the same or different for the genes. 
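As a non-limiting illustration of the weighted-sum approach described above, the following sketch computes a per-gene quantity of differential expression as a log2 ratio, combines the quantities with predefined coefficients, and reads off a probability score as the proportion of training subjects with sepsis near that total. The gene names, coefficients, training values, and the `window` parameter are hypothetical and are not taken from this disclosure; the log2 ratio is one assumed choice among the "difference or ratio" options.

```python
import math

def differential_expression(measured, reference):
    """Log2 ratio of measured to reference expression, per gene."""
    return {gene: math.log2(measured[gene] / reference[gene]) for gene in measured}

def weighted_total(diff, weights):
    """Weighted sum of the per-gene differential expression quantities."""
    return sum(weights[gene] * diff[gene] for gene in diff)

def empirical_probability(total, training, window=0.5):
    """Proportion of training subjects with sepsis among those whose total
    lies within +/- window of the given total (None if no subject is close)."""
    nearby = [has_sepsis for t, has_sepsis in training if abs(t - total) <= window]
    return sum(nearby) / len(nearby) if nearby else None

# Hypothetical values for illustration only.
measured = {"GENE_A": 8.0, "GENE_B": 2.0}
reference = {"GENE_A": 2.0, "GENE_B": 4.0}   # e.g., control-population medians
weights = {"GENE_A": 0.5, "GENE_B": -1.0}    # predefined coefficients
training = [(2.1, True), (1.8, True), (2.4, False), (0.2, False)]

total = weighted_total(differential_expression(measured, reference), weights)
score = empirical_probability(total, training)
```

Here two of the three training subjects near the computed total have sepsis, so the sketch yields a score of two thirds. A fitted model would typically replace the fixed window with a smoother estimate.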
The probability score can be determined in various ways, e.g., by statistical or machine learning regression or classification such as, but not limited to, linear regression, including least squares regression, ridge or LASSO regression, elastic net regression, regularized Cox regression, logistic regression, orthogonal matching pursuit models, a Bayesian regression model, or deep learning methods, such as convolutional neural networks, recurrent neural networks and generative adversarial networks (see, e.g., LeCun et al., Nature 521:436-444, 2015). Further examples of machine-learning algorithms include quadratic discriminant analysis; support vector machines, including without limitation support vector classification-based regression processes; stochastic gradient descent algorithms; nearest neighbors algorithms; Gaussian processes such as Gaussian process regression; cross-decomposition algorithms, including partial least squares and/or canonical correlation analysis; probabilistic graphical models including naive Bayes methods; and models based on decision trees, such as decision tree classification algorithms. Additional machine-learning algorithms include ensemble methods such as bagging meta-estimator, randomized forest algorithms, AdaBoost, gradient tree boosting, and/or voting classifier methods. Details relating to various statistical methods are found in the following references: Ruczinski et al., 12 J. OF COMPUTATIONAL AND GRAPHICAL STATISTICS 475-511 (2003); Friedman, J. H., 84 J. OF THE AMERICAN STATISTICAL ASSOCIATION 165-75 (1989); Hastie, Trevor, Tibshirani, Robert, Friedman, Jerome, The Elements of Statistical Learning, Springer Series in Statistics (2001); Breiman, L., Friedman, J. H., Olshen, R. A., Stone, C. J. Classification and regression trees, California: Wadsworth (1984); Breiman, L., 45 MACHINE LEARNING 5-32 (2001); Pepe, M. 
S., The Statistical Evaluation of Medical Tests for Classification and Prediction, Oxford Statistical Science Series, 28 (2003); and Duda, R. O., Hart, P. E., Stork, D. G., Pattern Classification, Wiley Interscience, 2nd Edition (2001), each of which is incorporated by reference. Additionally, ensemble techniques that combine different machine learning models can be used. [0081] Once determined, the probability score can be used to determine whether the subject has an increased likelihood of sepsis. The probability score can be compared to a threshold value (also referred to as a cutoff value). The threshold can be selected based on a desired accuracy, e.g., a trade off of sensitivity and specificity. If the probability score exceeds the threshold (cutoff) then the subject can be identified as likely having sepsis. [0082] In some embodiments, likelihood of sepsis may be assigned based on a cutoff value using a reference scale, e.g., from 0 to 1.0. In some embodiments, a cutoff value of 0.5 or greater may be employed to define likelihood of sepsis. In some embodiments, sepsis likelihood may be further stratified, for example, likelihood of sepsis may be categorized as “high,” “intermediate,” or “low”, e.g., based on the highest tertile, intermediate tertile and bottom tertile. [0083] Classifiers that use host gene expression levels of sepsis marker genes as described herein in cfRNA samples from a subject evaluated for likelihood of sepsis can be generated, e.g., as described herein, from a training set of samples obtained from confirmed sepsis patients, e.g., determined by clinical adjudication and/or culture of organism from a blood or organ sample from a patient. In the instance where likelihood of viral sepsis is determined, a training set can be from patients having confirmed viral sepsis vs. sepsis from a non-viral pathogen. 
Thus, in some embodiments, a gene expression panel to evaluate likelihood of sepsis can be determined based on a gene panel or subset panel comprising one or more genes set forth in Table A (sepsis likelihood) or Table B (viral sepsis likelihood) or may comprise one or more genes set forth in the tables and additional genes identified as being correlated with sepsis risk.
[0084] Different subsets of genes can be selected to train a model (e.g., to determine the probability score) using all or a subset of the training samples (i.e., subjects for which sepsis status is known and for which expression of the genes was measured). This training subset can then be used to train (optimize) a model, whose accuracy can be measured, e.g., using the AUC of an ROC curve. Then, another subset of genes can be selected, with a further training process providing another model whose accuracy can also be measured. The accuracy can be measured using the training set or a validation set, which can include samples with known labels that were excluded from the training set. This process of generating models for different subsets of genes, along with the accuracy of each model, can continue, possibly for all possible subsets of genes for which expression levels have been measured. A panel providing the best accuracy can be selected, however the accuracy is measured.
[0085] The machine learning model may be trained until certain predetermined conditions for accuracy or performance are satisfied, such as having minimum desired values corresponding to diagnostic accuracy measures. For example, the diagnostic accuracy measure may correspond to prediction of a diagnosis or disease outcome in the subject.
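As a non-limiting illustration of selecting a panel by comparing the accuracy of different gene subsets, the following sketch enumerates all subsets of a given size and keeps the one with the highest AUC. For brevity, the "model" is an untrained stand-in that scores a sample by the unweighted sum of its expression values, which is an assumption made only for this example; the gene names and values are hypothetical, and in practice the AUC would typically be measured on a validation set.

```python
from itertools import combinations

def auc(scores, labels):
    """AUC of an ROC curve, computed as the fraction of (sepsis, non-sepsis)
    pairs in which the sepsis sample scores higher (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def best_panel(samples, labels, genes, size):
    """Try every gene subset of the given size; return (panel, AUC) of the best."""
    best = None
    for panel in combinations(genes, size):
        # Stand-in model: unweighted sum of expression over the panel.
        scores = [sum(sample[g] for g in panel) for sample in samples]
        value = auc(scores, labels)
        if best is None or value > best[1]:
            best = (panel, value)
    return best

# Hypothetical expression values; True marks known sepsis status.
samples = [
    {"G1": 5.0, "G2": 1.0, "G3": 3.0},
    {"G1": 4.0, "G2": 2.0, "G3": 1.0},
    {"G1": 1.0, "G2": 1.5, "G3": 2.0},
    {"G1": 2.0, "G2": 3.0, "G3": 2.5},
]
labels = [True, True, False, False]
panel, panel_auc = best_panel(samples, labels, ["G1", "G2", "G3"], size=1)
```

Exhaustive enumeration grows combinatorially with the number of candidate genes, so a practical implementation would likely restrict the candidate pool first, e.g., to markers with a minimum amount of differential expression.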
Examples of diagnostic accuracy measures may include sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and area under the curve (AUC) of a Receiver Operating Characteristic (ROC) curve corresponding to the diagnostic accuracy of detecting sepsis.
C. Flow charts
1. Determining likelihood of general sepsis
[0086] FIG.6 is a flowchart illustrating a method 600 of measuring the expression levels of the gene markers described herein to evaluate general sepsis risk in a subject according to embodiments of the present invention.
[0087] In Steps 610 and 620, RNA of each member of a gene panel is detected in a cell-free RNA sample from a human subject. In some embodiments, cfRNA is obtained from a serum or plasma sample. In instances where cfRNA is obtained from a plasma sample, the plasma can be clarified by centrifugation. Cell-free RNA can be extracted using available methods and kits, such as cfRNA kits available from Qiagen. In some instances, for example, using RNA-seq to analyze RNA levels in cfRNA, human rRNA (cytosolic and mitochondrial) and beta globin sequences can be depleted. Any methodology for depletion can be employed. For example, a pool of locked nucleic acids that block reverse transcription of sequences to be removed (Qiagen FastSelect) will provide a cDNA population enriched for target. Other technologies to enrich protein-encoding RNAs include affinity depletion using oligonucleotides complementary to rRNA target sequences, depletion methods using antisense DNA oligonucleotides that cover the entire rRNA molecule to direct RNase H-mediated degradation of the rRNA, and selection of poly(A)-containing transcripts. The gene panel (Step 620) comprises at least three genes selected from the genes listed in Table A. In some embodiments, the gene panel comprises genes as described in Section I.A. RNA can be detected by any suitable method, such as those described above in section III.A.i. 
In some embodiments, RNA is detected by an amplification-based method such as quantitative PCR. In preferred embodiments, RNA is detected by sequencing that employs a massively parallel sequencing platform, such as RNA-seq.
[0088] In step 630, a quantity of differential gene expression for each member of the gene panel is determined compared to the reference levels of RNA in control subjects. Example techniques for the quantification of differential expression are described in section III.A.ii. above.
[0089] In step 640, a probability score based on the respective amount of differential gene expression is generated. Example techniques for the generation of the probability score are described in section III.B above. For instance, a cohort of training samples (subjects having known sepsis status and measured expression levels) can be used to train a machine learning (ML) model, such as a bagged support vector machine learning approach (bSVM), e.g., with a linear kernel. Thus, an ML model can be used to determine the probability score. The ML model can be trained using different panels of markers, and the best-performing panel can be used. The pool of potential markers for the various panels can be limited to markers having at least a minimum amount of differential expression. The probability score can be normalized, e.g., between 0 and 1.
[0090] In step 650, a classification is determined, wherein an increased likelihood of sepsis is determined when the probability score exceeds a threshold value. Example techniques for determining the threshold value are described in section III.B. For example, the threshold can be selected based on a desired accuracy, e.g., a trade-off of sensitivity and specificity. As examples, the threshold can be 0.8, 0.7, 0.6, or 0.5.
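As a non-limiting illustration of steps 640 and 650, the following sketch averages the outputs of several already-trained linear models (as a bagged linear-kernel approach might produce) into a score between 0 and 1 and compares the score to a threshold. The model weights are hypothetical, training is not shown, and the logistic transform used to map each decision value onto the 0 to 1 scale is an assumption of this example rather than a technique stated in this disclosure.

```python
import math

def decision_value(x, w, b):
    """Signed value of a linear model, w . x + b (proportional to the
    distance of point x from the separating hyperplane)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def bagged_probability(x, models):
    """Average, over the bagged models, of a logistic transform of each
    model's decision value; the result lies between 0 and 1."""
    probs = [1.0 / (1.0 + math.exp(-decision_value(x, w, b))) for w, b in models]
    return sum(probs) / len(probs)

def classify(probability, threshold=0.5):
    """True when the probability score exceeds the threshold (cutoff)."""
    return probability > threshold

# Hypothetical (weights, bias) pairs standing in for models trained on
# bootstrap resamples of a training cohort.
models = [([1.0, -1.0], 0.0), ([0.5, -0.5], 0.0)]
expression = [3.0, 1.0]  # differential expression for a two-gene panel
probability = bagged_probability(expression, models)
likely_sepsis = classify(probability, threshold=0.7)
```

Raising the threshold in this sketch trades sensitivity for specificity, consistent with the threshold examples given above.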
[0091] In a method analogous to those shown in Steps 610-650, in some instances RNA is obtained from a whole blood sample from the subject and the gene panel comprises at least three genes selected from the genes set forth in Table C. Threshold scores and classifications are determined as described above.
2. Determining likelihood of viral sepsis
[0092] FIG.7 is a flowchart illustrating a method 700 of measuring the expression levels of the gene markers described herein to evaluate viral sepsis risk in a subject according to embodiments of the present invention. Aspects of method 700 can be performed in a similar manner as method 600.
[0093] In Steps 710 and 720, RNA of each member of a gene panel is detected in a cell-free RNA sample from a human subject. In some embodiments, cfRNA is obtained from a serum or plasma sample. In instances where cfRNA is obtained from a plasma sample, the plasma can be clarified by centrifugation. Cell-free RNA can be extracted using available methods and kits, such as cfRNA kits available from Qiagen. In some instances, for example, using RNA-seq to analyze RNA levels in cfRNA, human rRNA (cytosolic and mitochondrial) and beta globin sequences can be depleted. Any methodology for depletion can be employed. For example, a pool of locked nucleic acids that block reverse transcription of sequences to be removed (Qiagen FastSelect) will provide a cDNA population enriched for target. Other technologies to enrich protein-encoding RNAs include affinity depletion using oligonucleotides complementary to rRNA target sequences, depletion methods using antisense DNA oligonucleotides that cover the entire rRNA molecule to direct RNase H-mediated degradation of the rRNA, and selection of poly(A)-containing transcripts. The gene panel (Step 720) comprises at least three genes selected from the genes listed in Table B. In some embodiments, the gene panel comprises genes as described in Section II.(a) above. 
RNA can be detected by any suitable method, such as those described above in section III.A.i. In some embodiments, RNA is detected by an amplification-based method such as quantitative PCR. In preferred embodiments, RNA is detected by sequencing that employs a massively parallel sequencing platform, such as RNA-seq.
[0094] In step 730, a quantity of differential gene expression for each member of the gene panel is determined compared to the reference levels of RNA in control subjects. Quantification of differential expression is described in section III.A.iii.
[0095] In step 740, a probability score based on the respective amount of differential gene expression is generated; and in step 750 a classification of the subject is determined, wherein an increased likelihood of viral sepsis is determined when the probability score exceeds a threshold value. Steps 740 and 750 can be performed in a similar manner as steps 640 and 650 of method 600.
[0096] In a method analogous to those shown in Steps 710-750, in some instances the RNA is obtained from a whole blood sample from the subject and the gene panel comprises at least three genes selected from the genes in Table D.
IV. METHODS OF DETERMINING MICROBIAL MASS / IDENTIFICATION OF PATHOGEN
[0097] In some embodiments, methods of predicting the likelihood of sepsis can comprise determining the microbial mass by sequencing a cell-free DNA sample, e.g., a plasma or serum sample from a subject undergoing evaluation for sepsis. As used herein, “microbial mass” refers to the weight, e.g., in picograms, of microbial nucleic acid determined to be present in plasma or serum. This can be calculated as described below relative to a known amount of spike-in calibration nucleic acids (also referred to as a calibration standard) of known sequence unrelated to human pathogens that are added to a sample.
[0098] FIG.8 provides a flow chart illustrating determination of microbial mass and identification of a dominant pathogen to assess likelihood of sepsis.
[0099] In step 810, a known amount of calibration nucleic acids is added to a cfDNA sample obtained from a patient. Appropriate calibration nucleic acids for use in quantification of microbial mass include nucleic acids that are not related to human pathogens and have not been observed in human cf nucleic acid, e.g., sequences from Archaea or other extremophiles, and/or synthetic sequences. In some embodiments, cDNA transcribed from a control comprising RNA transcripts may be added to a sample. In one example, the calibration nucleic acid added to samples is cDNA transcribed from RNA controls from the External RNA Controls Consortium (ERCC) (Pine, et al., BMC Biotechnology 16, 54 (2016)). These controls are a set of unlabeled, polyadenylated transcripts designed to be about 250 to 2,000 nucleotides in length to mimic eukaryotic mRNA. The ERCC control RNAs are synthesized by in vitro transcription of synthetic DNA sequences or of DNA derived from the Bacillus subtilis or the deep-sea vent microbe Methanocaldococcus jannaschii genomes. They also contain a poly-A+ tail mimic in the DNA template. The ERCC control RNAs show minimal sequence homology with endogenous transcripts from sequenced humans. In some embodiments, 25 pg of calibration nucleic acids, e.g., ERCC control RNA, is used.
[0100] In step 820, libraries generated from the cfDNA preparation can be sequenced. As understood in the art, calibration nucleic acids can comprise long polynucleotides, which can be fragmented, e.g., by mechanical, enzymatic or chemical shearing, to provide a uniform distribution of fragments for sequencing. The amount of fragmentation can provide sizes that are similar to the natural lengths of cell-free RNA.
[0101] In step 830, sequences are aligned to sequences present in one or more taxonomic sequence databases (e.g., National Center for Biotechnology Information (NCBI) databases) that comprise microbial sequences to determine sequence reads that align to microbial sequences (i.e., “map” to microbial sequences). In some embodiments, the method comprises determining sequence reads that align to nonviral microbial sequences. In some embodiments, the NCBI GenBank nucleotide database is queried. In some embodiments, the method further employs an IDseq pipeline (Kalantar, et al., Gigascience 9, 2020), which incorporates subtractive alignment of the human genome (NCBI GRCh38) using STAR, quality and complexity filtering, and subsequent removal of cloning vectors and phiX phage using Bowtie2. In some embodiments, the identities of microbial reads are determined by querying the NCBI nucleotide database using GSNAP-L (Zhao, et al., Bioinformatics 28, 125–126 (2012)).
[0102] In step 840, a ratio of the amount of total sequence reads that correspond to the calibration nucleic acids and the amount of all microbial sequence reads is determined. With X1 and X2 denoting these two amounts, the ratio can be determined in various ways, e.g., X1/X2, X1/(X1+X2), X2/(X1+X2), functions of such ratios, or ratios of functions of the amounts, or combinations thereof.
[0103] In step 850, the microbial mass (e.g., weight in picograms) can be determined based on the known amount of the calibration standard and the ratio determined in step 840. For example, the microbial mass can be determined by multiplying the ratio of total microbial reads to calibration reads by the known amount of the calibration standard. In some embodiments, a background correction using a control, such as a water control sample, is employed to account for environmental contaminants.
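As a non-limiting illustration of steps 840 and 850, the following sketch multiplies the ratio of microbial reads to calibration reads by the known spike-in amount. The read counts are hypothetical, and subtracting a mass estimated from a water control is only one assumed form of background correction; this disclosure does not specify the form.

```python
def microbial_mass_pg(microbial_reads, calibration_reads, calibration_pg,
                      background_pg=0.0):
    """Estimate microbial mass in picograms from read counts.

    mass = (microbial reads / calibration reads) * known spike-in mass,
    optionally reduced by a background mass from a water control
    (an assumed, simple form of background correction).
    """
    if calibration_reads <= 0:
        raise ValueError("no calibration reads; mass cannot be estimated")
    mass = (microbial_reads / calibration_reads) * calibration_pg
    return max(mass - background_pg, 0.0)

# Hypothetical counts: 80,000 microbial reads and 50,000 calibration reads
# from a 25 pg spike-in give a ratio of 1.6, i.e., about 40 pg.
mass = microbial_mass_pg(80_000, 50_000, 25.0)
corrected = microbial_mass_pg(80_000, 50_000, 25.0, background_pg=2.0)
```

The estimate scales linearly with the spike-in amount, so the same spike-in mass (or a per-volume adjustment) would need to be used across samples for the values to be comparable.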
[0104] In step 860, abundance levels of microbial species represented in the cfDNA preparation are determined by determining the number of sequence reads that are mapped to individual species of microbes. Negative control samples consisting of only double-distilled water can also be processed with plasma cfDNA samples. Such negative control samples provide an estimate of the number of background reads expected for each taxon (e.g., as described by Mick et al., Nature Communications 11:5854, 2020).
[0105] In step 870, for each genus of microbes, the species of that genus having the highest abundance level is selected. Then, the selected species are ranked by abundance level in sequential order, typically from highest to lowest.
[0106] In step 880, a gap threshold is determined. The gap threshold can correspond to the abundance level at which the greatest difference in abundance level occurs between sequential microbes. For example, with the ranking being from highest to lowest, the highest abundance level may differ by 4.5 (e.g., 8-3.5) from the second highest, which might differ by only 0.8 from the third highest. The further differences (gaps) between other rankings can be even less. Thus, the gap threshold can be any value between 8 and 3.5, so that only the highest abundance would qualify. In another scenario, the largest gap can be between the second highest and the third highest, e.g., with the set of abundance values being 9, 8, 2, 1.5, 1, … . The gap threshold could be any abundance between 8 and 2.
[0107] In step 890, any species having an abundance at or exceeding the gap threshold is selected and it is determined whether any of the selected species is a known bloodstream pathogen.
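As a non-limiting illustration of steps 870 through 890, the following sketch keeps the most abundant species of each genus, ranks them from highest to lowest, finds the largest gap between sequential abundance levels, and keeps the species above that gap. The abundance values follow the 9, 8, 2, 1.5, 1 scenario described above; the species names and the short pathogen set are hypothetical stand-ins for a reference index.

```python
def top_species_per_genus(abundance):
    """abundance maps (genus, species) -> abundance level. Keep the most
    abundant species of each genus, ranked from highest to lowest."""
    best = {}
    for (genus, species), level in abundance.items():
        if genus not in best or level > best[genus][1]:
            best[genus] = (species, level)
    return sorted(best.values(), key=lambda pair: pair[1], reverse=True)

def above_largest_gap(ranked):
    """Return the ranked species that sit above the largest drop in
    abundance between sequential entries."""
    if len(ranked) < 2:
        return list(ranked)
    gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps))
    return ranked[:cut + 1]

# Hypothetical abundance levels for illustration only.
abundance = {
    ("Klebsiella", "Klebsiella pneumoniae"): 9.0,
    ("Klebsiella", "Klebsiella oxytoca"): 4.0,
    ("Enterococcus", "Enterococcus faecalis"): 8.0,
    ("GenusX", "species x"): 2.0,
    ("GenusY", "species y"): 1.5,
    ("GenusZ", "species z"): 1.0,
}
known_pathogens = {"Klebsiella pneumoniae", "Enterococcus faecalis"}  # stand-in index

selected = above_largest_gap(top_species_per_genus(abundance))
dominant = [name for name, _ in selected if name in known_pathogens]
```

In this scenario the largest gap lies between 8 and 2, so the two highest-ranked species are selected, and both appear in the stand-in pathogen set.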
A microbe can be identified as a known bloodstream pathogen in various ways, e.g., by referencing indexes and listings of pathogens, e.g., a reference index derived from the most prevalent bloodstream infection pathogens reported by the National Healthcare Safety Network (NHSN) (Weiner-Lastinger, L. M. et al., Infect. Control Hosp. Epidemiol. 41, 1–18 (2020)) and/or a recent multicenter surveillance study of healthcare-associated infections (Magill et al., NEJM 379:1732-1744, 2018). These studies reported multiple species of Bacteroides, Candida, Citrobacter, Enterobacter, Enterococcus, Klebsiella, Lactobacillus, Morganella, Prevotella, Proteus, Serratia, Stenotrophomonas and Streptococcus as common sepsis pathogens. In this manner, it can be determined whether a predominant pathogen exists.
[0108] In some instances, the species are present in the listing provided in Table 14 and are detected at an abundance of > 1 read per million. In some instances, a pathogenic respiratory virus, e.g., based on a list of pathogens (Langelier et al., 2018, supra), can be identified in the cfRNA from plasma or serum.
[0109] Finally, in step 8100, a patient is determined as likely having sepsis if the microbial mass is greater than a specified mass (e.g., 20 pg) and a predominant pathogen has been identified. The specified mass can assume that all samples have the same volume. In other implementations, the specified mass can vary based on the volume of the sample.
V. LIKELIHOOD OF SEPSIS BASED ON INTEGRATION OF HOST TRANSCRIPTIONAL PROFILING AND MICROBIAL IDENTIFICATION
[0110] A patient can be determined to likely have sepsis based on a combination of the previously described tests. Such tests can be combined with logical ANDs or ORs for determining whether sepsis exists, e.g., to determine whether to provide treatment. Such a combination can maximize accuracy (e.g., a negative predictive value) to identify individuals likely to have sepsis.
In some implementations, if any one of the three methods determines that sepsis is likely (e.g., has a likelihood higher than a respective threshold), then the patient can be treated for sepsis. But if none of the three methods determines that sepsis is likely, then sepsis can be ruled out, and the patient would not be unnecessarily subjected to antibiotics or other treatments.
[0111] The combinations of tests allow for determining that the patient is unlikely to have sepsis. The combination of tests can be as follows:
• A patient is deemed likely to have sepsis if the host classifier probability for general sepsis, based on evaluation of a host cell general sepsis gene panel as described herein (e.g., in section I), is greater than a cutoff (e.g., 0.5) associated with sepsis; OR
• A patient is deemed likely to have sepsis if the host classifier probability for viral sepsis, based on evaluation of a host cell viral sepsis gene panel as described herein (e.g., in section II), is greater than a cutoff (e.g., 0.9) associated with sepsis; OR
• A patient is deemed likely to have sepsis if the microbial mass is greater than a specified mass (e.g., 20 pg) AND a dominant pathogen is detected (e.g., a drop in abundance to a next prevalent pathogen is greater than a threshold).
[0112] If a subject is deemed likely to have sepsis, the patient can be treated with an antibiotic or other agent that treats sepsis. In some embodiments, e.g., when a dominant pathogen is detected, an antibiotic that targets the pathogen may be selected for treatment.
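As a non-limiting illustration, the OR/AND combination of the three tests listed in paragraph [0111] can be sketched as follows. The default cutoffs mirror the example values given above (0.5, 0.9, and 20 pg); the function and parameter names are hypothetical.

```python
def sepsis_likely(host_probability, viral_probability, microbial_mass_pg,
                  dominant_pathogen_detected,
                  host_cutoff=0.5, viral_cutoff=0.9, mass_cutoff_pg=20.0):
    """Integrated rule: host classifier OR viral classifier OR
    (microbial mass AND dominant pathogen)."""
    host_positive = host_probability > host_cutoff
    viral_positive = viral_probability > viral_cutoff
    microbe_positive = (microbial_mass_pg > mass_cutoff_pg
                        and dominant_pathogen_detected)
    return host_positive or viral_positive or microbe_positive

# Hypothetical patients for illustration only.
by_host = sepsis_likely(0.72, 0.10, 5.0, False)     # host classifier alone
by_microbe = sepsis_likely(0.30, 0.10, 38.0, True)  # mass AND dominant pathogen
ruled_out = sepsis_likely(0.30, 0.10, 38.0, False)  # mass alone is not enough
```

Because the tests are joined by OR, sepsis is ruled out only when all three are negative, which favors negative predictive value over specificity.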
Illustrative antibiotics include, for example, ceftriaxone, cefotaxime, vancomycin, meropenem, cefepime, ceftazidime, cefuroxime, nafcillin, oxacillin, ampicillin, ticarcillin, ticarcillin/clavulanic acid, ampicillin/sulbactam (Unasyn), azithromycin, trimethoprim-sulfamethoxazole, clindamycin, ciprofloxacin, levofloxacin, synercid, amoxicillin, amoxicillin/clavulanic acid, cefuroxime, trimethoprim/sulfamethoxazole, azithromycin, clindamycin, dicloxacillin, ciprofloxacin, levofloxacin, cefixime, cefpodoxime, loracarbef, cefadroxil, cefabutin, cefdinir, and cephradine.
VI. KITS AND DEVICES
[0113] In another aspect, provided in this disclosure are kits, panels and devices for carrying out the methods described herein. In some embodiments, a kit is provided for measuring and analyzing RNA in a biological sample, such as a serum or plasma sample. In one embodiment, the kit includes two or more polynucleotides for specifically hybridizing to at least a section of a gene listed in Table A or Table B for use in evaluating likelihood of sepsis in a patient. In another embodiment, the kit includes two or more polynucleotides for use in assessing likelihood of sepsis in a human subject.
VII. EXAMPLE SYSTEM
[0114] FIG.9 illustrates a measurement system 900 according to an embodiment of the present disclosure. The system as shown includes a sample 905, such as cell-free RNA or DNA molecules within an assay device 910, where an assay 908 can be performed on sample 905. For example, sample 905 can be contacted with reagents of assay 908 to provide a signal of a physical characteristic 915. An example of an assay device can be a flow cell that includes probes and/or primers of an assay or a tube through which a droplet moves (with the droplet including the assay). Physical characteristic 915 (e.g., a fluorescence intensity, a voltage, or a current) from the sample is detected by detector 920.
Detector 920 can take a measurement at intervals (e.g., periodic intervals) to obtain data points that make up a data signal. In one embodiment, an analog-to-digital converter converts an analog signal from the detector into digital form at a plurality of times. Assay device 910 and detector 920 can form an assay system, e.g., a sequencing system that performs sequencing according to embodiments described herein. A data signal 925 is sent from detector 920 to logic system 930. As an example, data signal 925 can be used to determine sequences and/or locations in a reference genome of DNA molecules. Data signal 925 can include various measurements made at a same time, e.g., different colors of fluorescent dyes or different electrical signals for different molecules of sample 905, and thus data signal 925 can correspond to multiple signals. Data signal 925 may be stored in a local memory 935, an external memory 940, or a storage device 945.
[0115] Logic system 930 may be, or may include, a computer system, ASIC, microprocessor, graphics processing unit (GPU), etc. It may also include or be coupled with a display (e.g., monitor, LED display, etc.) and a user input device (e.g., mouse, keyboard, buttons, etc.). Logic system 930 and the other components may be part of a stand-alone or network connected computer system, or they may be directly attached to or incorporated in a device (e.g., a sequencing device) that includes detector 920 and/or assay device 910. Logic system 930 may also include software that executes in a processor 950. Logic system 930 may include a computer readable medium storing instructions for controlling measurement system 900 to perform any of the methods described herein. For example, logic system 930 can provide commands to a system that includes assay device 910 such that sequencing or other physical operations are performed.
Such physical operations can be performed in a particular order, e.g., with reagents being added and removed in a particular order. Such physical operations may be performed by a robotics system, e.g., including a robotic arm, as may be used to obtain a sample and perform an assay.

[0116] System 900 may also include a treatment device 960, which can provide a treatment to the subject. Treatment device 960 can determine a treatment and/or be used to perform a treatment. Examples of such treatment can include surgery, radiation therapy, chemotherapy, immunotherapy, targeted therapy, hormone therapy, and stem cell transplant. Logic system 930 may be connected to treatment device 960, e.g., to provide results of a method described herein. The treatment device may receive inputs from other devices, such as an imaging device and user inputs (e.g., to control the treatment, such as controls over a robotic system).

[0117] The specific details of particular embodiments may be combined in any suitable manner without departing from the spirit and scope of embodiments of the invention.

[0118] All publications, patent applications, and accession numbers mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference for the material for which it is cited.

VIII. EXAMPLE TECHNIQUES

[0119] The following examples illustrate the identification of host-expressed cell-free RNA markers in plasma that are associated with sepsis, and of additional panels of cell-free RNA markers in plasma that are associated with viral sepsis. The examples additionally describe methods of determining microbial mass and a predominant organism. Finally, the examples demonstrate that these four aspects can be used, alone or together, to assist in sepsis diagnosis.

A. Clinical Features of Cohort

[0120] We conducted a prospective observational study of critically ill adults admitted from the Emergency Department (ED) to the Intensive Care Unit (ICU) (FIGS. 1A-1B). Patients were categorized into five subgroups based on sepsis status. These included: 1) patients with clinically adjudicated sepsis and a clinical microbiologically confirmed bacterial bloodstream infection (SepsisBSI), 2) patients with clinically adjudicated sepsis and a peripheral, non-bloodstream infection (Sepsisnon-BSI), 3) patients with suspected sepsis and negative clinical microbiologic testing (Sepsissuspected), 4) patients with no evidence of sepsis and a clear alternative explanation for their critical illness (No-Sepsis), and 5) patients of indeterminate status (Indeterm). The most common diagnoses in the No-Sepsis group were cardiac arrest, overdose/poisoning, heart failure exacerbation, and pulmonary embolism. The majority of patients, regardless of subgroup, required mechanical ventilation and vasopressor support (Table 1). Patients with microbiologically proven sepsis (SepsisBSI + Sepsisnon-BSI) did not differ from No-Sepsis patients in terms of age, gender, race, ethnicity, immunocompromise, APACHEIII score, maximum white blood cell count, intubation status, or 28-day mortality (FIGS. 1A-1B, Tables 1 and 2). All but one patient (in the No-Sepsis group) exhibited ≥ 2 systemic inflammatory response syndrome (SIRS) criteria (Kaukonen, et al., N. Engl. J. Med. 372, 1629–1638 (2015)).

[0121] Patients with microbiologically proven sepsis (SepsisBSI + Sepsisnon-BSI) did not differ from No-Sepsis patients in terms of age, gender, race, ethnicity, immunocompromise, APACHEIII score, maximum white blood cell count, intubation status or 28-day mortality.
Differences in intubation and vasopressor use were evident between (SepsisBSI + Sepsisnon-BSI) and No-Sepsis patients in the analysis of whole blood samples (n=221) but not in the sub-analysis of patients with paired cf-plasma RNA sequencing (RNA-seq) data (n=110, Tables 1 and 2).

B. Host transcriptional signature of sepsis from whole blood

[0122] Panels of genes as described in section I.B. were determined as follows.

[0123] We first assessed whole blood transcriptional differences between patients with clinically and microbiologically confirmed sepsis (SepsisBSI, Sepsisnon-BSI) versus those without evidence of infection (No-Sepsis) by performing RNA-seq on whole blood specimens (n = 221 total). A total of 5,807 differentially expressed (DE) genes were identified at an adjusted P value < 0.1 (FIG. 2A). Gene set enrichment analysis (GSEA) demonstrated upregulation of pathways related to neutrophil degranulation and innate immune signaling in the sepsis group, and downregulation of pathways related to translation and rRNA processing (FIG. 2B, pathways summarized in Table 3).

[0124] To further characterize differences between sepsis patients with bloodstream versus peripheral site (e.g., respiratory, urinary tract) infections, we performed differential gene expression (DGE) analysis between the SepsisBSI and Sepsisnon-BSI groups and identified 5,227 DE genes. GSEA demonstrated enrichment for CD28 signaling, immunoregulatory interactions between lymphoid and non-lymphoid cells, and other pathways in the Sepsisnon-BSI patients, while the SepsisBSI group was characterized by enrichment in genes related to antimicrobial peptides, defensins, G alpha signaling and other pathways (pathways are summarized in Table 4).

[0125] We first constructed a ‘universal’ sepsis diagnostic classifier based on whole blood gene expression signatures.
After dividing the cohort (n=221) into independent training (75% of data) and validation groups (25%), we employed a bagged support vector machine learning approach (bSVM) to select genes that most effectively distinguished patients with sepsis (SepsisBSI and Sepsisnon-BSI, n=129) from those without (No-Sepsis, n=92). Only genes differentially expressed in the training set, using a 0.1 FDR threshold, were considered as potential predictors. Cross validation using ten random subsamples of the training set returned an average area under the receiver operating characteristic curve (AUC) of 0.81 (0.05 standard deviation). The final bSVM model trained on the entire training dataset achieved an AUC of 0.82 in a held-out validation set (FIG. 2C, Table 5).

C. Host transcriptional classifier for sepsis diagnosis from cell-free plasma RNA

[0126] Sequencing of cf-DNA has emerged as a preferred strategy for culture-independent detection of bacterial pathogens in the bloodstream (Blauwkamp et al., 2019, supra), as it offers the advantage of enriching for microbial DNA and reducing the fraction of uninformative host nucleic acid. It has been unknown, however, whether cf-plasma RNA could also provide meaningful information on the host response, as transcriptional profiling studies have historically relied on isolation of PBMCs or collection of whole blood.

[0127] The panel of genes described in section I.A. was identified as follows.

[0128] We sequenced cf-plasma RNA from patients with available specimens matched to the whole blood samples, and obtained a median of 2.3 × 10^7 (95% CI 2.2-2.5 × 10^7) reads per sample. Calculation of input RNA mass (see Methods) demonstrated that samples with transcript counts below our QC cutoff (< 50,000) had lower input RNA mass than those with sufficient counts (85.8 pg versus 65.2 pg, respectively, p < 0.0001, Table 7).
After filtering to retain samples with ≥ 50,000 transcripts (n=138), we performed DGE analysis to assess whether a biologically plausible signal could be observed between patients with sepsis (SepsisBSI and Sepsisnon-BSI, n=73) and those without (No-Sepsis, n=37). Remarkably, several of the top differentially expressed genes were previously reported sepsis biomarkers, e.g., CD177 and HLA-DRA (see Demaret, et al., Immunology Letters 178, 122–130 (2016); Tang, et al., Nat Commun 10, 3422 (2019); Cajander, et al., Crit Care 17, R223 (2013)), suggesting a biologically relevant transcriptomic signature from cf-plasma RNA (FIG. 2D).

[0129] We then asked whether a host transcriptional sepsis diagnostic classifier could be constructed using cell-free RNA gene expression data by dividing the cohort into independent training (75% of data) and validation groups (25%), and employing the same bSVM approach to select genes that most effectively distinguished SepsisBSI and Sepsisnon-BSI patients from No-Sepsis patients. An FDR threshold of 0.3 was chosen to select potential predictors. This approach yielded a classifier that achieved an average 10-fold cross-validation AUC of 0.97 (0.03 standard deviation), and an AUC of 0.77 in the held-out validation set (FIG. 2E, Table 6).

D. Detection of bacterial sepsis pathogens in cell-free plasma

[0130] We began microbial metagenomic analyses by assessing cf-DNA microbial mass (Methods), which was significantly lower in negative control water samples, but did not differ between adjudicated sepsis groups (FIG. 3A, Table 7). We next carried out bacterial pathogen detection using the IDseq pipeline (Kalantar, et al., Gigascience 9, (2020)) for taxonomic alignment followed by a previously developed rules-based model (RBM) (Langelier et al., 2018, supra) that identifies established sepsis pathogens overrepresented in mNGS data compared to less abundant commensal or contaminating microbes (Methods, FIG. 3B).
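
By way of illustration only, the rules-based model referred to above, and described further in the Methods, can be sketched as follows. The sketch ranks genus-level abundances in reads per million (rpM), locates the greatest gap among the top 15 ranked taxa, and reports the taxa above that gap that are also present in a reference index and exceed 1 rpM. The function name `rbm_call`, its parameters, the use of an absolute rpM difference as the gap measure, and the handling of samples with a single taxon are assumptions made for this example; they are not taken from the disclosure.

```python
from typing import Dict, List, Set


def rbm_call(
    rpm_by_genus: Dict[str, float],
    reference_index: Set[str],
    top_n: int = 15,
    min_rpm: float = 1.0,
) -> List[str]:
    """Hypothetical sketch of a greatest-gap rules-based pathogen call."""
    # Rank taxa in descending order of abundance and keep the top_n.
    ranked = sorted(rpm_by_genus.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    if not ranked:
        return []
    if len(ranked) == 1:
        # Assumption: a lone taxon is treated as the outlier group.
        outliers = ranked
    else:
        # Gap between each pair of sequentially ranked taxa (assumed: absolute rpM difference).
        gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
        cut = gaps.index(max(gaps))
        # Taxa ranked above the greatest gap are the disproportionately abundant ones.
        outliers = ranked[: cut + 1]
    # Report only outliers in the reference index that exceed the abundance floor.
    return [genus for genus, rpm in outliers if genus in reference_index and rpm > min_rpm]


# Illustrative, made-up abundances and reference index.
reference = {"Escherichia", "Klebsiella", "Staphylococcus"}
sample_a = {"Escherichia": 950.0, "Klebsiella": 900.0, "Cutibacterium": 12.0,
            "Ralstonia": 9.0, "Bacillus": 0.5}
sample_b = {"Ralstonia": 500.0, "Cutibacterium": 3.0, "Bacillus": 2.0}
calls_a = rbm_call(sample_a, reference)
calls_b = rbm_call(sample_b, reference)
```

In the first made-up sample the greatest gap falls after the second-ranked genus, so both top genera are reported; in the second, the only outlier is absent from the reference index and nothing is reported.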
[0131] We then asked how well the metagenomic RBM pathogen predictions agreed with bacterial blood culture data. Polymicrobial blood cultures of ≥ 3 organisms were excluded (n=2) given their unclear clinical significance, leaving a total of 40 blood culture-positive cases available for comparison (Table 8). Sensitivity versus blood culture as a reference standard was 83%, and varied by pathogen, ranging from 0% (e.g., C. difficile) to 100% (e.g., E. coli, S. aureus/argenteus; FIG. 3C). Pathogens were called by the RBM in 10/37 (27%) of patients in the No-Sepsis group, equating to a specificity of 73%.

E. Detection of clinically confirmed pathogens from peripheral sites using cell-free plasma

[0132] Plasma cf-DNA mNGS identified 2/25 (8%) of culture-confirmed bacterial LRTI pathogens in the Sepsisnon-BSI group and 3/10 (30%) of culture-confirmed bacterial UTI pathogens (FIG. 3D, Table 8). cf-plasma DNA-seq returned negative in all three patients with sepsis attributable to C. difficile colitis. mNGS identified additional putative bacterial pathogens not detected by culture in 8 of 73 (11%) patients with microbiologically confirmed sepsis (Table 8).

F. Identifying viral sepsis using cell-free plasma and whole blood

[0133] Only one of 13 (8%) respiratory viruses identified by clinical testing could be detected by mNGS of cf-plasma RNA (Table 8). Recognizing that an alternative approach would be needed, we asked whether host response could be informative for detecting viral infection by performing differential gene expression between patients with versus without clinically confirmed viral sepsis pathogens within the SepsisBSI and Sepsisnon-BSI groups, using whole blood and cf-plasma. Pathways related to interferon signaling and genes important for antiviral immunity were enriched in samples from patients with viral sepsis versus those with critical illness due to other causes (bacterial and non-infectious) in both whole blood (FIG. 4A, pathways summarized in Table 9) and plasma cf-RNA (FIG. 4B, pathways summarized in Table 10) datasets.

[0134] We then leveraged this host signature to build a bSVM diagnostic classifier for viral sepsis, selecting differentially expressed genes as potential predictors using a 0.1 FDR threshold, which on whole blood samples achieved an average 10-fold cross validation AUC within the 75% training set of 0.9 (standard deviation 0.07) and an AUC of 0.79 in the held-out 25% validation set (FIG. 4C, Table 11). Slightly better performance was obtained when building a classifier using cell-free plasma RNA-seq data and using 0.2 as an FDR threshold to select candidate predictors, with an average AUC of 0.94 (standard deviation 0.09) in the training set and an AUC of 0.96 in the held-out validation set (FIG. 4D, Table 12). Incorporation of the host-based viral sepsis classifier improved the percent positive agreement with clinical respiratory viral PCR testing to 13/13 (100%) and predicted viral sepsis in three additional patients with respiratory failure who did not undergo viral PCR testing (Table 13).

G. Integrated host-microbe cf-plasma metagenomic model for sepsis rule-out and diagnosis

[0135] Given the relative success of each independent host and pathogen model, we considered whether combining them could enhance diagnosis, and potentially serve as a sepsis rule-out tool. To test this possibility, we developed a proof-of-concept integrated host + microbe model based on simple rules. It returned a sepsis diagnosis based on either host criteria: [host sepsis classifier probability > 0.5] or microbial criteria: [(pathogen detected by RBM) AND (microbial mass > 20 pg)] OR [host viral classifier probability > 0.9]. Applying these rules enabled detection of 42/42 (100%) of cases in the SepsisBSI group and 30/31 (97%) of cases in the Sepsisnon-BSI subjects, for an overall sensitivity of 72/73 (99%) (FIGS. 5A and 5B).
This proof-of-concept model yielded a specificity of 29/37 (78%) within the No-Sepsis subjects (FIG. 5C, Table 13).

H. Application to culture-negative and indeterminate sepsis

[0136] Finally, we asked whether patients with clinically adjudicated sepsis, but negative in-hospital microbiologic testing (Sepsissuspected), would be predicted to have sepsis using the integrated host-microbe cf-nucleic acid mNGS model. The model classified 14/19 (74%) as having sepsis (FIG. 5D), and of these, 10/19 (53%) had either a putative bacterial pathogen identified (n=8) or a putative viral infection identified by the viral host classifier (n=2). With respect to the indeterminate group, the integrated host + microbe model classified 8/9 (89%) as sepsis-positive (Table 13). Of these, 4/9 (44%) had either a putative bacterial pathogen identified (n=2) or a putative viral infection identified by the viral host classifier (n=2).

I. Summary of Results

[0137] In the experiments above, we demonstrated that host transcriptional profiling can be used in combination with broad-range pathogen detection to accurately diagnose sepsis in critically ill patients upon hospital admission. Further, we demonstrated that an integrated host-microbe metagenomics approach can be performed on circulating nucleic acid from cell-free plasma, a widely available clinical specimen type with previously unrecognized utility for host-based infectious disease diagnosis.

[0138] We found that concordance between pathogen detection by cf-plasma mNGS coupled with the RBM and traditional bacterial blood culture varied by organism. For instance, concordance with S. aureus/argenteus and E. coli, the two most globally problematic bloodstream infection (BSI) pathogens, was 100%. In contrast, mNGS failed to detect C. difficile in any patients with colitis from this pathogen. Interestingly, in 2/3 (67%) C. difficile cases, mNGS instead detected Enterobacteriaceae, raising the possibility that gut translocation may be a feature of severe C. difficile infection. With respect to non-BSI sepsis, our findings suggest that plasma mNGS may be most useful for identifying UTI-associated pathogens, although we also observed some utility for respiratory pathogen detection, in line with a prior report (Langelier, et al., Am. J. Respir. Crit. Care Med. 201:491-494, 2020).

[0139] Plasma cf-RNA sequencing alone performed poorly for detecting sepsis-associated respiratory viruses. Incorporation of a host-based viral classifier, however, markedly improved detection of clinically identified viral LRTI, and additionally predicted viral infections in three patients with sepsis who did not undergo viral PCR testing during their hospitalizations. Prior work has demonstrated that different viral species elicit distinct host transcriptional signatures in the peripheral blood (Mudd, et al., Sci. Adv. 6, eabe3024 (2020)), suggesting that future studies could extend the cf-RNA host viral classifier to identify specific viral pathogens, such as influenza or SARS-CoV-2, for which therapeutics exist.

[0140] Together, our findings emphasize that detection of a pathogen alone is insufficient for infectious disease diagnosis, but when combined with assessment of the host immune response, has promising utility for identifying, and ruling out, sepsis. Inappropriate antimicrobial use is a major challenge in management of critical illness, and is often driven by the inability to rule out infection in patients with systemic inflammatory diseases. In our proof-of-concept analysis, the integrated host + microbe model achieved 99% sensitivity across patients with microbiologically confirmed sepsis, and 78% specificity within the No-Sepsis group, which was comprised almost entirely of patients meeting the clinical definition of systemic inflammatory response syndrome (Kaukonen, et al., 2015, supra).
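
For illustration only, the simple rules of the integrated host + microbe model described in Example G, together with the spiked-in-standard mass relation given in the Methods, can be sketched as follows. The function names `microbial_mass_pg` and `integrated_sepsis_call` and their argument names are hypothetical, and the sketch assumes one reading of the stated rules, namely that a sample is called sepsis-positive when any one of the three bracketed criteria is met. The closing lines reproduce the arithmetic behind the reported 99% sensitivity and 78% specificity from the counts given in the text.

```python
def microbial_mass_pg(ercc_reads: int, microbial_reads: int, ercc_input_pg: float = 25.0) -> float:
    """Solve [ERCC mass]/[microbial mass] = [ERCC reads]/[microbial reads] for microbial mass."""
    if ercc_reads <= 0:
        raise ValueError("ERCC read count must be positive to estimate mass")
    return ercc_input_pg * microbial_reads / ercc_reads


def integrated_sepsis_call(
    host_sepsis_prob: float,
    pathogen_detected: bool,
    mass_pg: float,
    host_viral_prob: float,
) -> bool:
    """Hypothetical sketch: positive if any one of the three stated criteria holds."""
    host_rule = host_sepsis_prob > 0.5
    microbe_rule = pathogen_detected and mass_pg > 20.0
    viral_rule = host_viral_prob > 0.9
    return host_rule or microbe_rule or viral_rule


# Made-up read counts: twice as many microbial reads as ERCC reads gives 50 pg.
example_mass = microbial_mass_pg(ercc_reads=1000, microbial_reads=2000)

# Counts reported in the text: 42/42 and 30/31 sepsis cases detected, 29/37 No-Sepsis ruled out.
sensitivity_pct = round(100 * (42 + 30) / (42 + 31))
specificity_pct = round(100 * 29 / 37)
```

Keeping the three criteria as separate named booleans makes it straightforward to report which rule triggered a positive call for a given sample.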
[0141] This study includes the use of plasma cf-RNA transcriptomics for sepsis diagnosis, development of the first sepsis diagnostic combining host and microbial mNGS data, detailed clinical phenotyping, and a large prospective cohort of critically ill adults with systemic illnesses. First, the mNGS analyses and blood cultures were performed on different blood samples, with research specimens collected up to 24 hours after blood cultures, which may have resulted in lower concordance than truly existed. Second, a significant fraction of plasma samples had insufficient host transcripts to permit gene expression analyses, leading to a smaller sample size for the plasma versus the whole blood cohorts. Lastly, additional studies in an independent cohort will be useful to validate these findings.

[0142] In conclusion, we report that combining host gene expression profiling and metagenomic pathogen detection from plasma cf-nucleic acid enables accurate diagnosis of sepsis.

IX. METHODS DESCRIPTION FOR EXAMPLE TECHNIQUES

[0143] This section describes example techniques that can be used to provide the results of previous sections.

A. Study design, clinical cohort

[0144] We conducted a prospective observational study of patients with acute critical illnesses admitted from the ED to the ICU. We studied patients who were enrolled in the Early Assessment of Renal and Lung Injury (EARLI) cohort at the University of California, San Francisco (UCSF) and Zuckerberg San Francisco General Hospital between 10/29/2008 and 01/17/2018 (Tables 1 and 2). The study was approved by the UCSF Institutional Review Board under protocol 10-02852, which granted a waiver of initial consent for blood sampling. Informed consent was subsequently obtained from patients or their surrogates for continued study participation, as previously described (Auriemma, et al., Intensive Care Med 46, 1222–1231 (2020); Agrawal, et al., Am J Respir Crit Care Med 187, 736–742 (2013)).
[0145] For the parent EARLI cohort, the inclusion criteria are: 1) age ≥ 18, 2) admission to the ICU from the ED, and 3) enrollment in the ED or within the first 24 hours of ICU admission. For this study, we selected patients for whom PAXgene whole blood tubes and matched plasma samples from the time of enrollment were available. PAXgene tubes were collected on patients enrolled in EARLI during the time period listed above who were hypotensive and/or mechanically ventilated at the time of enrollment. The main exclusion criteria for the EARLI study are: 1) exclusively neurological, neurosurgical, or trauma surgery admission, 2) goals of care decision for exclusively comfort measures, 3) known pregnancy, 4) legal status of prisoner, and 5) anticipated ICU length of stay < 24 hours. Enrollment in EARLI began in 10/2008 and continues.

B. Sepsis adjudication

[0146] Clinical adjudication of sepsis groups was carried out by study team physicians (MA, CL, AL, KL, PS, CH, AG, CC, KK) using the sepsis-2 definition (Kalantar et al., 2020, supra) (≥ 2 SIRS criteria + suspected infection) and incorporating all available clinical and microbiologic data from the entire ICU admission, with blinding to mNGS results. Patients were categorized into five subgroups based on sepsis status (FIG. 1). The subgroups were: patients with clinically adjudicated sepsis and a bacterial culture-confirmed bloodstream infection (SepsisBSI); patients with sepsis due to a microbiologically confirmed primary infection at a peripheral site other than the bloodstream (Sepsisnon-BSI); patients with suspected sepsis and negative clinical microbiologic testing (Sepsissuspected); patients with no evidence of sepsis and a clear alternative explanation for their critical illness (No-Sepsis); and patients of indeterminate status (Indeterm). Clinical and demographic features of patients are summarized in Tables 1 and 2.

C. Metagenomic sequencing

[0147] Following enrollment, whole blood and plasma were collected into PAXgene and EDTA tubes, respectively.
Whole blood PAXgene tubes were processed and stored at -80 °C according to manufacturer’s instructions, and plasma was frozen at -80 °C within two hours. To evaluate host gene expression and detect microbes, RNA-seq was performed on the whole blood and plasma specimens; DNA-seq was performed only on plasma. RNA was extracted from whole blood using the Qiagen RNeasy kit and normalized to 10 ng total input per sample. Total plasma nucleic acid was extracted from 300 µL of plasma, first clarified by two minutes of maximum-speed centrifugation, using the Zymo Pathogen Magbead Kit. 10 ng of total nucleic acid underwent DNA-seq using the NEBNext Ultra II DNA Kit. Samples with at least 10 ng of remaining total nucleic acid were treated with DNase (Qiagen) to recover RNA, and then underwent RNA library preparation using the NEBNext Ultra II RNA-seq Kit as described below.

[0148] For RNA-seq library preparation, human cytosolic and mitochondrial ribosomal RNA and globin RNA were first depleted using FastSelect (Qiagen). For the purposes of background contamination correction (see below) and to enable estimation of input microbial mass, we included negative water controls as well as positive controls (spike-in RNA standards from the External RNA Controls Consortium (ERCC), Pine, et al., BMC Biotechnology 16, 54 (2016)). RNA was then fragmented and underwent library preparation using the NEBNext Ultra II RNA-seq Kit (New England Biolabs) according to described methods (Mick, et al., Nature Communications 11, 5854 (2020)). Finished libraries underwent 146 nucleotide paired-end Illumina sequencing on an Illumina Novaseq 6000 instrument.

D. Host differential expression and pathway analysis

[0149] Following demultiplexing, sequencing reads were aligned with STAR (Dobin et al., Bioinformatics 29:15-21, 2013) to an index consisting of all transcripts associated with human protein coding genes (ENSEMBL v.99), cytosolic and mitochondrial ribosomal RNA sequences, and the sequences of ERCC RNA standards. Samples retained in the dataset had a total of at least 50,000 estimated counts associated with transcripts of protein coding genes.

[0150] Differential expression analysis was performed using DESeq2 and including covariates for age and gender. Significant genes were identified using an independent-hypothesis-weighted, Benjamini-Hochberg false discovery rate (FDR) < 0.1 (Ignatiadis, et al., Nature Methods 13, 577–580 (2016); Benjamini & Hochberg, Journal of the Royal Statistical Society: Series B (Methodological) 57, 289–300 (1995)). We generated heatmaps of the top 50 differentially expressed genes by absolute log2-fold change. To evaluate signaling pathways from gene expression data, we employed gene set enrichment analysis using WebGestalt (Liao, et al., Nucleic Acids Research 47, W199–W205 (2019)) on all ranked differentially expressed genes with a P value < 0.1. Significant pathways and upstream regulators were defined as those with a gene set P value < 0.05.

E. Pathogen detection

[0151] Detection of microbes leveraged the open-source IDseq pipeline (Kalantar et al., 2020, supra), which incorporates subtractive alignment of the human genome (NCBI GRCh38) using STAR (Dobin et al., 2013, supra), quality and complexity filtering, and subsequent removal of cloning vectors and phiX phage using Bowtie2 (Kalantar et al., 2020, supra). The identities of the remaining microbial reads are determined by querying the NCBI nucleotide (NT) database using GSNAP-L (Kalantar et al., 2020, supra; Zhao, et al., Bioinformatics 28, 125–126 (2012)).
After background correction (see below), retained non-viral taxonomic alignments in each sample were aggregated at the genus level, and sorted in descending order by abundance measured in reads per million (rpM), independently for each sample (FIG. 3B). A previously validated rules-based model (RBM) (Langelier, et al., 2018, supra) was then utilized to identify disproportionately abundant microbes in each sample, and flag them as pathogens if they were present in an a priori established reference index of the top 20 most prevalent bloodstream infection pathogens reported by both the National Healthcare Safety Network (NHSN) (Weiner-Lastinger et al., Infect. Control Hosp. Epidemiol. 41, 1–18 (2020)) and a recent multicenter surveillance study of healthcare-associated infections (Magill, et al., New England Journal of Medicine 379, 1732–1744 (2018)).

[0152] The RBM, originally developed to identify pathogens from respiratory mNGS data, identifies outlier organisms within a sample by identifying the greatest gap in abundance between the top 15 sequentially ranked microbes in each sample (FIG. 3B). We adapted this model for sepsis pathogen detection, in which outlier organisms are sometimes present in trace amounts, by incorporating a sepsis (versus respiratory) pathogen reference index (Table 14) and requiring that the species called by the RBM both be present in the reference index and detected at an abundance of > 1 rpM. The RBM also identified human pathogenic respiratory viruses derived from a reference list of LRTI pathogens (Langelier et al., 2018, supra) present in the plasma cf-RNA-seq data.

F. Identification and mitigation of environmental contaminants

[0153] Negative control samples consisting of only water (n=24) were processed alongside cf-plasma DNA samples, which were sequenced in a single batch. Negative control samples enabled estimation of the number of background reads expected for each taxon (Mick, et al., 2020, supra).
A previously developed negative binomial model (also Mick, et al., 2020) was employed to identify taxa with NT sequencing alignments present at an abundance significantly greater than in negative water controls. This was done by modeling the number of background reads as a negative binomial distribution, with mean and dispersion fitted on the negative controls. For each taxon, we estimated the mean parameter of the negative binomial by averaging the read counts across all negative controls. We estimated a single dispersion parameter across all taxa, using the functions glm.nb() and theta.md() from the R package MASS (Venables, et al., Modern Applied Statistics with S (Springer-Verlag, 2002); doi:10.1007/978-0-387-21706-2). Taxa that achieved an adjusted P-value < 0.01 (Benjamini & Hochberg multiple test correction) were carried forward to the above-described RBM for pathogen detection.

G. Microbial biomass calculations

[0154] Microbial biomass was calculated based on total reads aligning to the External RNA Controls Consortium (ERCC) RNA standards spiked into samples (Pine, et al., BMC Biotechnology 16, 54 (2016)). The following equation was utilized for this calculation: [ERCC input mass]/[microbial input mass] = [ERCC reads]/[microbial reads], where the ERCC input mass was 25 pg.

H. Host transcriptional classifiers for sepsis and viral sepsis

[0155] To build classifiers to differentiate patients with sepsis (SepsisBSI, Sepsisnon-BSI) from those with non-infectious critical illness (No-Sepsis), and to distinguish viral from non-viral sepsis, we built a Support Vector Machine (SVM)-based classifier (Cortes & Vapnik, Mach. Learn. 20, 273–297 (1995)) with the scikit-learn (Pedregosa et al., Journal of Machine Learning Research 12, 2825–2830 (2011)) (v0.23.2) library in Python (v3.8.3). More specifically, a bagged SVM (bSVM) classifier with a linear kernel was chosen. Each classifier used a bootstrapped set of samples and a random subset of features.
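
As a non-limiting illustration of the bagged linear-kernel SVM described in the preceding paragraph, the following minimal sketch uses scikit-learn's `BaggingClassifier` wrapped around `SVC(kernel="linear")`. The synthetic data from `make_classification`, the 25 estimators, the 50% feature fraction, the value of `C`, and all variable names are assumptions chosen for the example; they are not the study's data or tuned settings. The base estimator is passed positionally because its keyword name differs between scikit-learn releases.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for a samples-by-genes matrix of scaled expression values.
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=10, n_redundant=0,
    n_clusters_per_class=1, class_sep=2.0, random_state=0,
)

# 75% training / 25% held-out split, stratified by class label.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Each linear SVM is fitted on a bootstrap sample of patients and a random subset of features.
bsvm = BaggingClassifier(
    SVC(kernel="linear", C=1.0),
    n_estimators=25,
    max_features=0.5,
    bootstrap=True,
    random_state=0,
)
bsvm.fit(X_train, y_train)

# Score held-out samples with the averaged decision function and summarize as an AUC.
held_out_auc = roc_auc_score(y_test, bsvm.decision_function(X_test))
print(f"held-out AUC: {held_out_auc:.2f}")
```

In practice the three settings fixed here (regularization strength, feature fraction, and number of estimators) correspond to the parameters the text describes as being optimized by nested cross-validation.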
[0156] We evaluated samples with ≥ 50,000 plasma gene counts and genes with more than 20% non-zero plasma counts in that sample subset. Only differentially expressed genes, identified using DESeq2 (v1.28.1) in the training set, were considered as potential predictors. Age and sex were included as covariates in the models. We used Z-score-scaled, variance-stabilizing-transformed gene counts. We selected 75% of the data to train the model, and the rest was used as a held-out set to test the final model. The training set was subsequently randomly split ten times for cross-validation, using 75% of each as intermediate training sets, and the remaining 25% as their associated testing sets.

[0157] On each one of those intermediate training sets, we carried out feature selection and parameter optimization using nested 5-fold cross-validations. We optimized three parameters: the regularization parameter, the maximum number of features considered for each classifier, and the total number of classifiers to use for bagging. For each parameter optimization fold, a recursive feature elimination (RFE) strategy was adopted, dropping 10% of the remaining least important features at each iteration. A bSVM classifier with default parameters was built at each iteration. We defined feature importance as the average squared weight across all estimators. To maximize interpretability, we restricted the maximum number of predictors to 100 genes.

[0158] We estimated model performances using the Area Under the Receiver Operating Characteristic Curve (AUC) values. To obtain a single set of features, we fitted a model, using the aforementioned strategy, to the initial training set. This model was then tested on the held-out set to obtain a final performance value and a single set of predictors.

I. Comparison of cf-plasma mNGS against clinician-ordered diagnostic testing

[0159] Clinical microbiological testing was carried out based on decisions from the primary medical team during the patient’s hospital admission at the UCSF Clinical Microbiology Laboratory. Tests utilized bacterial culture from blood, lower respiratory tract and urine, which were carried out according to previously described protocols (Langelier et al., 2018, supra). Clinical testing for viral respiratory pathogens was performed from nasopharyngeal swabs and/or bronchoalveolar lavage using the Luminex XTag multiplex viral PCR assay. Polymicrobial blood cultures with ≥ 3 bacteria (n=2) were excluded from pathogen concordance given their unclear clinical significance and the potential that some organisms reflected contamination.

J. Statistics and Reproducibility

[0160] Statistical tests utilized for each analysis are described in the figure legends and in further detail in each respective methods section. The number of patient samples analyzed for each comparison is indicated in the figure legends. Data were generated from single sequencing runs without technical replicates.

[0161] The processed gene count data are available from the National Center for Biotechnology Information Gene Expression Omnibus database under accession code GSE189403. The raw fastq files with microbial sequencing reads are available from the Sequence Read Archive under BioProject ID: PRJNA783060.

Table 1. Summary of clinical and demographic features of patients evaluated in whole blood gene expression analyses only (n=221). These include patients with microbiologically-confirmed sepsis (SepsisBSI and Sepsisnon-BSI) and those with non-infectious critical illnesses (No-Sepsis).
Figure imgf000058_0001
Figure imgf000059_0001
Figure imgf000060_0001
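The resampling and feature-elimination bookkeeping described in paragraphs [0156] to [0158] can be illustrated with a short Python sketch. This is not the code used in the study. The function names (split_train_test, mean_squared_weight, rfe_step, auc) are hypothetical, the classifier weights are supplied by the caller rather than fitted by a bagged SVM, and AUC is computed by simple pairwise ranking.

```python
import math
import random


def split_train_test(sample_ids, train_fraction=0.75, seed=0):
    """Randomly split sample IDs into a training part and a testing part."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


def mean_squared_weight(weights_per_estimator):
    """Feature importance: average squared weight across all estimators."""
    n_estimators = len(weights_per_estimator)
    n_features = len(weights_per_estimator[0])
    return [
        sum(w[j] ** 2 for w in weights_per_estimator) / n_estimators
        for j in range(n_features)
    ]


def rfe_step(features, importances, drop_fraction=0.10):
    """Drop the least important 10% of the remaining features (at least one)."""
    n_drop = max(1, math.floor(drop_fraction * len(features)))
    # Indices ordered from least to most important.
    order = sorted(range(len(features)), key=lambda j: importances[j])
    dropped = set(order[:n_drop])
    return [f for j, f in enumerate(features) if j not in dropped]


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In such a sketch, split_train_test would be called once to set aside the held-out set and then ten more times with different seeds on the training part, and rfe_step would be repeated, refitting the classifier between calls, until the feature list is no longer than the chosen cap (100 genes in the text above).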
Additional Tables

[0162] Table 3. Gene set enrichment analysis of differentially expressed genes between patients with microbiologically confirmed sepsis (SepsisBSI and Sepsisnon-BSI) and those with non-infectious critical illnesses (No-Sepsis). Data from whole blood RNA-seq. The top 10 positively and negatively enriched pathways by P value are included in the table.

[0163] Table 4. Gene set enrichment analysis of differentially expressed genes between patients with sepsis due to bloodstream infections (SepsisBSI) versus peripheral infections (Sepsisnon-BSI). Data from whole blood RNA-seq. The top 10 positively and negatively enriched pathways by P value are included in the table.

[0164] Table 5. A) Area under the receiver operating characteristic curve (AUC) values for 10 independent training set models for a whole blood gene expression support vector machine classifier to distinguish patients with microbiologically confirmed sepsis (SepsisBSI and Sepsisnon-BSI) from those with non-infectious critical illnesses (No-Sepsis). The composite list of all genes selected by each classifier model is shown in Table C.

[0165] Table 6. AUC values for 10 independent training set models for a cf-plasma gene expression support vector machine classifier to distinguish patients with microbiologically confirmed sepsis (SepsisBSI and Sepsisnon-BSI) from those with non-infectious critical illnesses (No-Sepsis). The composite list of all genes selected by each classifier model is shown in Table A.

[0166] Table 7. Mass (pg) of microbial DNA in each sample, calculated based on spiked-in 25 pg ERCC positive controls.

[0167] Table 8. Sepsis pathogens detected by standard of care clinical microbiology and/or by cf-plasma mNGS, using the rules-based model.

[0168] Table 9. Gene set enrichment analysis of differentially expressed genes between patients with viral versus non-viral causes of sepsis amongst the SepsisBSI and Sepsisnon-BSI patients. Data from whole blood RNA-seq.

[0169] Table 10. Gene set enrichment analysis of differentially expressed genes between patients with viral versus non-viral causes of sepsis amongst the SepsisBSI and Sepsisnon-BSI patients. Data from plasma cf-RNA-seq.

[0170] Table 11. AUC values for 10 independent training set models for a whole blood gene expression support vector machine classifier to distinguish patients with microbiologically confirmed viral versus non-viral sepsis (SepsisBSI and Sepsisnon-BSI), from whole blood RNA-seq. The composite list of all genes selected by each classifier model is provided in Table D.

[0171] Table 12. AUC values for 10 independent training set models for a plasma cf-RNA gene expression support vector machine classifier to distinguish patients with microbiologically confirmed viral versus non-viral sepsis (SepsisBSI and Sepsisnon-BSI), from plasma cf-RNA-seq. The composite list of all genes selected by each classifier model is provided in Table B.

[0172] Table 13. Complete integrated host-microbe mNGS dataset. This includes: per-sample classifier predictions for all patients with cf-plasma sequencing data (n=138), including the sepsis diagnostic classifier and the viral versus non-viral sepsis classifier; pathogens detected by clinical diagnostics and by mNGS; and microbial mass per sample.

[0173] Table 14. Reference index of established sepsis pathogens. Derived from the top 20 most prevalent sepsis pathogens reported by both the US CDC/National Healthcare Safety Network and a point prevalence survey of healthcare-associated infections. These studies included multiple species of Candida, Citrobacter, Enterobacter, Enterococcus, Fusobacterium, Klebsiella and Morganella, which were collapsed in the table to the genus level.
[0174] Table 15 provides illustrative differential gene expression data from the analysis described in section VIII.C between patients with microbiologically confirmed SepsisBSI and Sepsisnon-BSI for various genes listed in Table A.

[0175] Table 16 provides illustrative differential gene expression data from the analysis described in section VIII.F between patients with or without clinically confirmed viral sepsis for various genes listed in Table B.
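The microbial mass reported in Table 7 is described as being calculated from the 25 pg ERCC spike-in. One way to read that calculation is as a simple proportion between spike-in reads and microbial reads, sketched below in Python. The function name microbial_mass_pg and the assumption that read counts scale linearly with input mass are illustrative and are not stated in this form in the text.

```python
def microbial_mass_pg(microbial_reads, ercc_reads, ercc_input_pg=25.0):
    """Estimate microbial mass (pg) from the ratio of microbial to spike-in reads.

    Assumes reads are proportional to input mass, so that
    mass_microbial / mass_ercc == reads_microbial / reads_ercc.
    """
    if ercc_reads <= 0:
        raise ValueError("no spike-in reads: mass cannot be calibrated")
    return ercc_input_pg * microbial_reads / ercc_reads
```

For example, under this assumption a sample with twice as many microbial reads as spike-in reads would be assigned 50 pg of microbial mass.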
Figure imgf000063_0001
Figure imgf000064_0001
Figure imgf000065_0001
Figure imgf000066_0001
Figure imgf000067_0001
Figure imgf000068_0001
Figure imgf000069_0001
Figure imgf000070_0001
Figure imgf000071_0001
Figure imgf000072_0001
Figure imgf000073_0001
Figure imgf000074_0001
Figure imgf000075_0001
Figure imgf000076_0001
Figure imgf000077_0001
Figure imgf000078_0001
Figure imgf000079_0001
Figure imgf000080_0001
Figure imgf000081_0001
Figure imgf000082_0001
Figure imgf000083_0001
Figure imgf000084_0001

Claims

WHAT IS CLAIMED IS:

1. A method of evaluating a likelihood of general sepsis in a human subject, the method comprising: detecting RNA in a cell-free RNA sample from the human subject of each member of a gene panel, wherein the gene panel comprises at least two members selected from the group consisting of NEDD4; KDM3A; GOLGA2; TBC1D8; LRRK1; MBP; C2CD3; SOS1; MFN1; ATE1; MOSPD2; SMARCA5; INTS4; P4HA2; LTBP3; SZT2; ADAP1; ORC6; KIF1B; PLTP; RAB29; GYG1; SULT1B1; ATP6V1A; ZNF672; PPP1R12B; IPO7; NAA10; GAPVD1; PSMA4; ARID3A; HBA1; VPS35; RPA1; GALNT10; CHD3; MYLK3; CD53; MSI2; DCUN1D4; CASP9; RPS27L; HBB; STXBP2; PADI2; HBA2; STAT3; C20orf24; DMXL2; NUP107; KDM6B; IFNAR1; PAK1; WIPI2; MTMR14; NADSYN1; SULT1A1; CYB5B; STAT2; DOCK5; PCLAF; INTS1; DDAH2; NUP160; PLAA; PLEC; SHANK1; PDCD6; DNAJA3; AC138969.1; PMM2; TNPO1; ZNF330; TLN2; TFEB; GRAMD4; KIAA0930; ANXA3; FRA10AC1; SLC44A1; ARAP1; IFITM3; INTS6; SLAIN1; UBE2G2; DKC1; PFKFB2; SLC38A1; CAMTA2; DYM; TLK2; S100A9; C5orf51; FIG4; HRH2; NFIX; BIRC5; GPI; and TANGO2; determining a quantity of differential gene expression for each member of the gene panel compared to reference levels of RNA in control subjects; determining a probability score based on the respective amount of differential gene expression; and classifying the human subject as having an increased likelihood of general sepsis when the probability score exceeds a threshold value.
2. The method of claim 1, wherein detecting RNA comprises measuring a level of RNA for at least 10 members, or at least 20 members, of the gene panel.
3. The method of claim 1, wherein detecting RNA comprises measuring a level of RNA for at least 50 members of the gene panel.
4. The method of claim 1 or 2, wherein members of the gene panel are selected from the group consisting of MBP, SOS1, SMARCA5, LTBP3, PLTP, ATP6V1A, PSMA4, ARID3A, VPS35, RPA1, MYLK3, DCUN1D4, CASP9, HBB, C20orf24, IFNAR1, WIPI2, MTMR14, SULT1A1, PLEC, DNAJA3, AC138969.1, TNPO1, SLC44A1, IFITM3, UBE2G2, S100A9, and FIG4.
5. The method of any one of claims 1-4, wherein determining the quantity of differential gene expression comprises an amplification reaction for each member of the gene panel.
6. The method of any one of claims 1-5, wherein determining the quantity of differential gene expression comprises massively parallel sequencing.
7. The method of any one of claims 1-5, wherein quantification is performed by digital PCR.
8. The method of any one of claims 1-7, wherein the probability score for classifying the human subject as having an increased likelihood of sepsis is 0.5 or greater.
9. A method of evaluating a likelihood of viral sepsis in a human subject, the method comprising: detecting RNA in a cell-free RNA sample from the human subject of each member of a gene panel, wherein the gene panel comprises at least two members selected from the group consisting of: PSME3; OTUB1; RBPJ; GPX3; ERBIN; GABARAPL1; MZT2A; JMJD6; FAM214A; ZC3H11A; CACUL1; TUBG1; TRIM69; LST1; ZNF585B; RBM6; DHX29; SUGP1; SUDS3; CREB5; DYNC1H1; STXBP3; ZNF467; RAPGEF6; SIPA1; RPL7L1; CD2AP; ZNF101; CASP8AP2; CDR2; COP1; ARFGEF1; SLC30A5; RNPEP; ZFX; STARD7; CALCOCO2; BORCS8; GRHPR; DOCK9; TAF2; RANBP1; ABL1; SREK1; and DGKA; determining a quantity of differential gene expression for each member of the gene panel compared to reference levels of RNA in control subjects; determining a probability score based on the respective amount of differential gene expression; and classifying the human subject as having an increased likelihood of sepsis when the probability score exceeds a threshold value.
10. The method of claim 9, wherein detecting RNA comprises measuring the level of RNA for at least 10 members, or at least 20 members, or at least 30 members, or at least 40 members of the gene panel.
11. The method of claim 9, wherein members of the gene panel are selected from the group consisting of OTUB1; LITAF; RBPJ; ERBIN; MZT2A; PPM1B; PEBP1; PARL; TRIM69; RBM6; DHX29; RDX; HIST1H3I; STXBP3; IGLL5; COP1; CALCOCO2; BORCS8; TAF2; and ABL1.
12. The method of any one of claims 9-11, wherein quantification comprises an amplification reaction.
13. The method of any one of claims 9-12, wherein quantification is performed by massively parallel sequencing.
14. The method of any one of claims 9-13, wherein the probability score for classifying the human subject as having an increased likelihood of viral sepsis is 0.9 or greater.
15. A method of determining likelihood of sepsis in a subject comprising (a) quantifying microbial mass in a serum or plasma sample from a patient and (b) determining whether a predominant pathogen is present, the method comprising: (a) adding a known amount of calibration nucleic acids to a cfDNA preparation obtained from the serum or plasma sample; (b) sequencing a library generated from the cfDNA preparation; (c) aligning sequences obtained from step (b) to sequences present in a database comprising microbial sequences to determine sequence reads that map to a microbial sequence in the database; (d) determining a ratio of a first amount of total sequence reads that correspond to the calibration nucleic acids and a second amount of all microbial sequence reads in step (c); and (f) determining a total microbial mass from the known amount of the calibration nucleic acids and the ratio of the first amount and the second amount determined in (d); thereby quantifying microbial mass in the serum or plasma sample; (g) determining abundance levels of microbial species represented in the cfDNA preparation comprising determining a number of sequence reads that are mapped to individual species of microbe; (h) for each genus of microbes, selecting a species in that genus having a highest abundance level and ranking the selected species by abundance level in sequential order; (i) determining a gap threshold, wherein the gap threshold is the abundance level at which a greatest difference in abundance level occurs between sequential microbes; (j) determining whether any of the species having the abundance level at or that exceeds the gap threshold is a known blood stream pathogen, thereby identifying whether a predominant pathogen exists; and (k) identifying the patient as likely to have sepsis based on the total microbial mass determined in (f) being greater than a specified mass and that the predominant pathogen exists.
16. The method of claim 15, wherein the specified mass is 20 pg.
17. A method of evaluating likelihood of sepsis, comprising performing the methods of claims 1, 9, and 15, wherein a patient has an increased likelihood of sepsis when (i) the probability score determined in claim 1 exceeds the threshold value for general sepsis, (ii) the probability score determined in claim 9 exceeds the threshold value for viral sepsis, or (iii) the total microbial mass determined in claim 15 is greater than the specified mass and the predominant pathogen exists.
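For illustration only, the ranking and gap-threshold steps recited in claim 15 and the combination recited in claim 17 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: all function names are hypothetical, abundance is taken to be a raw read count per species, a sample with a single ranked species is treated as having that species above the gap, and the default thresholds (20 pg, 0.5 and 0.9) are taken from claims 16, 8 and 14 and applied as "at or above" for the two scores.

```python
def top_species_per_genus(abundance):
    """Keep the most abundant species of each genus, ranked high to low.

    abundance maps (genus, species) tuples to read counts.
    """
    best = {}
    for (genus, species), reads in abundance.items():
        if genus not in best or reads > best[genus][1]:
            best[genus] = (species, reads)
    return sorted(best.values(), key=lambda item: item[1], reverse=True)


def species_above_gap(ranked):
    """Return the species ranked above the largest drop between neighbours."""
    if len(ranked) < 2:
        # Assumption: with no neighbour to compare, keep what is there.
        return [name for name, _ in ranked]
    gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps))
    return [name for name, _ in ranked[: cut + 1]]


def rules_based_call(abundance, mass_pg, known_pathogens, min_mass_pg=20.0):
    """Positive when mass exceeds the cutoff and a known pathogen is above the gap."""
    candidates = species_above_gap(top_species_per_genus(abundance))
    pathogens = [s for s in candidates if s in known_pathogens]
    positive = mass_pg > min_mass_pg and len(pathogens) > 0
    return positive, pathogens


def integrated_call(sepsis_score, viral_score, rules_positive,
                    sepsis_threshold=0.5, viral_threshold=0.9):
    """Increased likelihood of sepsis if any one of the three criteria is met."""
    return (
        sepsis_score >= sepsis_threshold
        or viral_score >= viral_threshold
        or rules_positive
    )
```

The "or" in integrated_call mirrors the wording of claim 17, in which any one of the three conditions is sufficient.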
PCT/US2023/022245 2022-05-16 2023-05-15 Integrated host-microbe metagenomics of cell-free nucleic acid for sepsis diagnosis WO2023224913A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263342528P 2022-05-16 2022-05-16
US63/342,528 2022-05-16

Publications (1)

Publication Number Publication Date
WO2023224913A1 (en) 2023-11-23

Family

ID=88835878

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/022245 WO2023224913A1 (en) 2022-05-16 2023-05-15 Integrated host-microbe metagenomics of cell-free nucleic acid for sepsis diagnosis

Country Status (1)

Country Link
WO (1) WO2023224913A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170191129A1 (en) * 2014-02-06 2017-07-06 Immunexpress Pty Ltd. Biomarker signature method, and apparatus and kits thereof
US20190161813A1 (en) * 2016-03-03 2019-05-30 Memed Diagnostics Ltd. Rna determinants for distinguishing between bacterial and viral infections
US20190178888A1 (en) * 2016-01-11 2019-06-13 Technion Research & Development Foundation Limited Methods of determining prognosis of sepsis and treating same
US20210041469A1 (en) * 2012-04-02 2021-02-11 Astute Medical, Inc. Methods and compositions for diagnosis and prognosis of sepsis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MEYDAN CHANAN, BEKENSTEIN URIYA, SOREQ HERMONA: "Molecular Regulatory Pathways Link Sepsis With Metabolic Syndrome: Non-coding RNA Elements Underlying the Sepsis/Metabolic Cross-Talk", FRONTIERS IN MOLECULAR NEUROSCIENCE, FRONTIERS RESEARCH FOUNDATION, CH, vol. 11, CH , XP093113920, ISSN: 1662-5099, DOI: 10.3389/fnmol.2018.00189 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23808131

Country of ref document: EP

Kind code of ref document: A1