WO2020150258A1 - Methods and systems for detecting liver disease - Google Patents

Methods and systems for detecting liver disease

Info

Publication number
WO2020150258A1
Authority
WO
WIPO (PCT)
Prior art keywords
liver disease
severity
subject
cell
nucleic acid
Prior art date
Application number
PCT/US2020/013535
Other languages
English (en)
Inventor
William Olsen
Storm STILLMAN
Bodour Salhia
Original Assignee
Luminist, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Luminist, Inc.
Publication of WO2020150258A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/154 Methylation markers

Definitions

  • Nonalcoholic fatty liver disease is an increasingly prevalent, progressive liver disease that affects an estimated quarter of the world’s adult population.
  • NAFLD exists as a spectrum ranging from nonalcoholic fatty liver (NAFL) to nonalcoholic steatohepatitis (NASH).
  • the presence of NASH has been a major focus of prognostic test development, with a goal of identifying patients at risk for progression to cirrhosis or hepatocellular carcinoma (HCC).
  • liver fibrosis has become increasingly recognized as a reliable predictive marker of NAFLD outcomes.
  • Present diagnostic approaches for NASH and definitive fibrosis staging may be achieved by liver biopsy; however, this “imperfect gold standard” may carry a non-negligible risk of complications, relatively high costs, and sampling biases. These drawbacks may prevent biopsies from being widely, let alone universally, applied to NAFLD patients in the screening for NASH and assessment of fibrosis.
  • repeat biopsy may be recommended for following patients presenting with NASH and biopsy confirmed early stage fibrosis.
  • repeat biopsy may be utilized in the context of clinical trials for novel NASH therapeutics.
  • non-invasive diagnostic tests that can effectively detect severity of liver disease, such as steatohepatitis, as well as discriminate between different severities of liver disease, such as no/minimal fibrosis (F0-1) and significant fibrosis (F2-4).
  • the present disclosure provides methods and systems for detecting liver disease by processing biological samples, such as cell-free samples obtained from subjects.
  • Such subjects may include subjects with a liver disease (e.g., non-alcoholic fatty liver disease (NAFLD), fibrosis, steatohepatitis) and subjects without these liver diseases.
  • the present disclosure provides a method for identifying a presence or a severity of a liver disease in a subject, comprising: (a) providing a cell-free biological sample from the subject; (b) assaying nucleic acid molecules derived from the cell-free biological sample to generate a data set comprising a methylation profile of one or more genomic regions of the cell-free biological sample; (c) using a trained machine learning algorithm to process the data set, including the methylation profile, to identify the presence or the severity of the liver disease in the subject; and (d) outputting a report that identifies the presence or the severity of the liver disease in the subject.
  • the subject has or is suspected of having a fatty liver.
  • the liver disease is non-alcoholic fatty liver disease (NAFLD).
  • the liver disease comprises fibrosis.
  • the liver disease comprises steatohepatitis.
  • identifying the presence or the severity of the liver disease comprises determining a presence or absence of fibrosis in the subject. In some embodiments, identifying the presence or the severity of the liver disease comprises determining a presence or absence of significant fibrosis in the subject. In some embodiments, the significant fibrosis is defined by greater than or equal to F2 fibrosis according to Non-Alcoholic Fatty Liver Disease Activity Score criteria.
  • identifying the presence or the severity of the liver disease comprises determining a presence or absence of liver inflammation in the subject. In some embodiments, identifying the presence or the severity of the liver disease comprises identifying a presence or severity of lobular inflammation in the subject. In some embodiments, the lobular inflammation is identified according to Non-Alcoholic Fatty Liver Disease Activity Score criteria. In some embodiments, identifying the presence or the severity of the liver disease comprises identifying a presence or severity of portal inflammation in the subject.
  • identifying the presence or the severity of the liver disease comprises identifying a presence or severity of hepatocellular ballooning.
  • the hepatocellular ballooning is identified according to Non-Alcoholic Fatty Liver Disease Activity Score criteria.
  • identifying the presence or the severity of the liver disease comprises determining a presence or absence of steatohepatitis in the subject. In some embodiments, identifying the presence or the severity of the liver disease comprises determining a presence of borderline steatohepatitis in the subject. In some embodiments, the steatohepatitis or the borderline steatohepatitis is identified according to Non-Alcoholic Fatty Liver Disease Activity Score criteria.
  • the presence or the severity of the liver disease is identified according to a Non-Alcoholic Fatty Liver Disease Activity Score composite comprising steatosis, lobular inflammation, and hepatocellular ballooning.
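As an illustration of the composite score and the fibrosis threshold described above, the following Python sketch sums the three histological components and flags significant fibrosis at stage F2 or higher. The component ranges (steatosis 0-3, lobular inflammation 0-3, hepatocellular ballooning 0-2) follow the commonly used NAFLD Activity Score grading and are not stated in this disclosure, so they are an assumption here, as are the names `Histology`, `nas_composite`, and `has_significant_fibrosis`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Histology:
    """Biopsy-style component grades (ranges are an assumption, see above)."""
    steatosis: int             # assumed range 0-3
    lobular_inflammation: int  # assumed range 0-3
    ballooning: int            # assumed range 0-2
    fibrosis_stage: int        # F0-F4


def nas_composite(h: Histology) -> int:
    """Sum steatosis, lobular inflammation, and ballooning into one score."""
    limits = {"steatosis": 3, "lobular_inflammation": 3, "ballooning": 2}
    for name, upper in limits.items():
        value = getattr(h, name)
        if not 0 <= value <= upper:
            raise ValueError(f"{name} must be in 0..{upper}, got {value}")
    return h.steatosis + h.lobular_inflammation + h.ballooning


def has_significant_fibrosis(h: Histology) -> bool:
    """Significant fibrosis is stage F2 or higher, as defined in the text."""
    if not 0 <= h.fibrosis_stage <= 4:
        raise ValueError("fibrosis_stage must be in 0..4")
    return h.fibrosis_stage >= 2
```

For example, `nas_composite(Histology(2, 1, 1, 3))` sums the three grades, and `has_significant_fibrosis` is true for the same record because its stage is F3.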
  • the presence or the severity of the liver disease is identified at an accuracy of at least about 80% for independent samples. In some embodiments, the accuracy is at least about 85%. In some embodiments, the accuracy is at least about 90%. In some embodiments, the accuracy is at least about 95%. In some embodiments, the accuracy is at least about 98%.
  • (b) comprises subjecting the nucleic acid molecules to conditions sufficient to convert unmethylated cytosines in the nucleic acid molecules to uracils. In some embodiments, (b) comprises subjecting the nucleic acid molecules to bisulfite processing.
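The conversion step above can be made concrete with a small sketch. Unmethylated cytosines are converted to uracil, which sequencing reports as thymine, while methylated cytosines are protected and still read as cytosine. The Python below simulates that conversion and then calls CpG methylation by comparing a read with its reference. It assumes a forward-strand, ungapped, full-length alignment, and all function names are hypothetical; it is not the assay or pipeline of the disclosure.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Simulate conversion: unmethylated C -> U, reported as T by sequencing."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def cpg_methylation_calls(reference: str, read: str) -> dict:
    """Call each reference CpG cytosine as methylated (True) or not (False)."""
    if len(reference) != len(read):
        raise ValueError("sketch assumes an ungapped, full-length alignment")
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            if read[i] == "C":
                calls[i] = True    # protected from conversion: methylated
            elif read[i] == "T":
                calls[i] = False   # converted: unmethylated
            # any other base is treated as a mismatch and skipped
    return calls


def methylation_fraction(calls: dict) -> float:
    """Fraction of called CpG sites that are methylated."""
    if not calls:
        raise ValueError("no CpG sites were called")
    return sum(calls.values()) / len(calls)
```

Per-region fractions computed this way are one plausible form for the methylation profile that the later steps consume.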
  • the trained machine learning algorithm is trained with a training set comprising at least 100 training samples.
  • the at least 100 training samples comprise cell-free samples or tissue samples.
  • the at least 100 training samples comprise cell-free samples and tissue samples.
  • the cell-free samples are nucleic acid samples.
  • the trained machine learning algorithm is trained with liver tissue samples and cell-free nucleic acid samples.
  • the trained machine learning algorithm is trained with training samples derived from subjects having liver fibrosis.
  • the trained machine learning algorithm is trained with training samples derived from subjects having liver inflammation.
  • the trained machine learning algorithm is trained with training samples derived from subjects having steatohepatitis.
  • (c) comprises correlating the data set with additional data sets obtained from a plurality of additional samples.
  • the plurality of additional samples comprises at least 100 additional samples.
  • the plurality of additional samples comprises cell-free samples or tissue samples.
  • the plurality of additional samples comprises cell-free samples and tissue samples.
  • the cell-free samples are nucleic acid samples.
  • the plurality of additional samples comprises liver tissue samples and cell-free nucleic acid samples.
  • at least a portion of the plurality of additional samples are derived from subjects having liver fibrosis.
  • at least a portion of the plurality of additional samples are derived from subjects having liver inflammation.
  • at least a portion of the plurality of additional samples are derived from subjects having steatohepatitis.
  • (b) comprises assaying deoxyribonucleic acid (DNA) molecules.
  • the DNA molecules correspond to one or more genomic regions selected from Table 1.
  • (b) comprises assaying ribonucleic acid (RNA) molecules.
  • (b) comprises assaying deoxyribonucleic acid (DNA) molecules and ribonucleic acid (RNA) molecules.
  • the nucleic acid molecules comprise deoxyribonucleic acid (DNA) molecules.
  • the nucleic acid molecules comprise ribonucleic acid (RNA) molecules.
  • the cell-free biological sample comprises or is derived from a bodily fluid.
  • the bodily fluid comprises blood.
  • the cell-free biological sample is obtained by a blood draw.
  • the bodily fluid comprises plasma.
  • (b) comprises sequencing the nucleic acid molecules or derivatives thereof to provide a plurality of sequence reads. In some embodiments, (c) comprises processing the plurality of sequence reads to identify hypomethylated and hypermethylated regions of the one or more genomic regions.
  • (c) comprises processing the plurality of sequence reads to identify one or more markers of inflammation. In some embodiments, (c) comprises processing the plurality of sequence reads to identify one or more markers of fibrosis.
  • the one or more genomic regions comprise one or more differentially methylated regions.
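One simple way to picture hypomethylated and hypermethylated regions is to compare mean methylation fractions per region between two groups of samples. The sketch below labels a region by the sign and size of that difference. The `min_delta` threshold of 0.2, the region names, and the function name are illustrative assumptions; a real analysis would also need a statistical test and multiple-testing correction, which this sketch omits.

```python
from statistics import mean


def classify_regions(case: dict, control: dict, min_delta: float = 0.2) -> dict:
    """Label each shared region by mean methylation difference (case - control).

    `case` and `control` map a region name to a list of per-sample
    methylation fractions in the range 0..1.
    """
    labels = {}
    for region in case.keys() & control.keys():
        delta = mean(case[region]) - mean(control[region])
        if delta >= min_delta:
            labels[region] = "hypermethylated"
        elif delta <= -min_delta:
            labels[region] = "hypomethylated"
        else:
            labels[region] = "unchanged"
    return labels
```

Regions labeled as either hyper- or hypomethylated would be the candidate differentially methylated regions under these assumptions.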
  • (c) comprises using one or more computer processors to process the data set using the trained machine learning algorithm.
  • the report is displayed on a computer screen. In some embodiments, the report is presented as a paper record. In some embodiments, the report comprises one or more items selected from the group consisting of a number of differentially methylated regions in the data set, a number or identity of markers of inflammation identified in the data set, a number or identity of markers of fibrosis identified in the data set, a quality of the cell-free biological sample, and an identification of the presence or the severity of the liver disease. In some embodiments, the identification of the presence or the severity of the liver disease comprises an identification of fibrosis. In some embodiments, the identification of the presence or the severity of the liver disease comprises an identification of steatohepatitis.
  • the report comprises a proposed therapeutic regimen for the liver disease.
  • the subject has or is suspected of having the liver disease. In some embodiments, the subject is undergoing or has undergone treatment for the liver disease. In some embodiments, the method further comprises based at least in part on the data set, identifying a risk factor for cirrhosis. In some embodiments, the method further comprises based at least in part on the data set, identifying a risk factor for hepatocellular carcinoma. In some embodiments, the report comprises the risk factor.
  • the method further comprises based at least in part on the data set, identifying a presence or absence of cirrhosis in the subject. In some embodiments, the method further comprises based at least in part on the data set, identifying a presence or absence of hepatocellular carcinoma in the subject. In some embodiments, the cell-free biological sample has not previously received a definitive diagnosis of the presence or the severity of the liver disease.
  • the trained machine learning algorithm is configured to identify the presence or the severity of the liver disease with a positive predictive value (PPV) of at least about 70%. In some embodiments, the PPV is at least about 80%. In some embodiments, the PPV is at least about 90%. In some embodiments, the PPV is at least about 95%.
  • the trained machine learning algorithm is configured to identify the presence or the severity of the liver disease with a negative predictive value (NPV) of at least about 70%.
  • the NPV is at least about 80%.
  • the NPV is at least about 90%.
  • the NPV is at least about 95%.
  • the trained machine learning algorithm is configured to identify the presence or the severity of the liver disease with a clinical sensitivity of at least about 70%.
  • the clinical sensitivity is at least about 80%.
  • the clinical sensitivity is at least about 90%.
  • the clinical sensitivity is at least about 95%.
  • the trained machine learning algorithm is configured to identify the presence or the severity of the liver disease with a clinical specificity of at least about 70%. In some embodiments, the clinical specificity is at least about 80%. In some embodiments, the clinical specificity is at least about 90%. In some embodiments, the clinical specificity is at least about 95%.
  • the trained machine learning algorithm is configured to identify the presence or the severity of the liver disease with an Area Under Curve (AUC) of at least about 0.80. In some embodiments, the AUC is at least about 0.90. In some embodiments, the AUC is at least about 0.95. In some embodiments, the AUC is at least about 0.99.
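The performance measures named above (PPV, NPV, clinical sensitivity, clinical specificity, and AUC) have standard definitions in terms of a binary confusion matrix and ranked scores. The following Python sketch computes them from true labels, predicted labels, and classifier scores, with 1 meaning disease present. The function names are hypothetical, and AUC is computed here by direct pairwise comparison, which is adequate for small sample sets.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 means disease present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def clinical_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, NPV, and accuracy as fractions in 0..1."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true positives among diseased
        "specificity": _ratio(tn, tn + fp),  # true negatives among non-diseased
        "ppv": _ratio(tp, tp + fp),          # diseased among positive calls
        "npv": _ratio(tn, tn + fn),          # non-diseased among negative calls
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A threshold such as "PPV of at least about 70%" then corresponds to checking `clinical_metrics(...)["ppv"] >= 0.70` on independent samples.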
  • the trained machine learning algorithm comprises a supervised machine learning algorithm.
  • the supervised machine learning algorithm comprises a Random Forest, a support vector machine (SVM), a neural network, or a deep learning algorithm.
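The supervised step can be sketched end to end: train on labeled methylation profiles, classify a new profile, and emit a report. The Python below uses a nearest-centroid classifier as a deliberately simple stand-in for the Random Forest, SVM, or neural network options listed above; it is not the algorithm of the disclosure. Each profile is assumed to be a list of per-region methylation fractions in a fixed region order, and the labels, subject identifier, and report fields are hypothetical.

```python
from math import dist
from statistics import mean


class NearestCentroid:
    """Toy supervised classifier: one mean methylation vector per label."""

    def fit(self, profiles, labels):
        grouped = {}
        for profile, label in zip(profiles, labels):
            grouped.setdefault(label, []).append(profile)
        # Average each genomic region (column) across samples sharing a label.
        self.centroids_ = {
            label: [mean(column) for column in zip(*rows)]
            for label, rows in grouped.items()
        }
        return self

    def predict(self, profile):
        # Choose the label whose centroid is closest in Euclidean distance.
        return min(
            self.centroids_,
            key=lambda label: dist(profile, self.centroids_[label]),
        )


def make_report(subject_id, profile, model):
    """Assemble a minimal output report as a plain dictionary."""
    return {
        "subject": subject_id,
        "finding": model.predict(profile),
        "regions_assayed": len(profile),
    }


# Hypothetical training data: three regions per sample, two severity classes.
training_profiles = [
    [0.9, 0.8, 0.1],
    [0.8, 0.9, 0.2],
    [0.2, 0.1, 0.7],
    [0.1, 0.2, 0.8],
]
training_labels = ["F2-4", "F2-4", "F0-1", "F0-1"]
model = NearestCentroid().fit(training_profiles, training_labels)
report = make_report("subject-001", [0.15, 0.2, 0.9], model)
```

Swapping in a different supervised model only requires an object with the same `fit` and `predict` methods, which is why the report function takes the model as a parameter.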
  • the present disclosure provides a computer system for identifying a presence or a severity of a liver disease in a subject, comprising: a database that is configured to store a data set comprising a methylation profile of one or more genomic regions of a cell-free biological sample of the subject; and one or more computer processors operatively coupled to the database, wherein the one or more computer processors are individually or collectively programmed to: (i) use a trained machine learning algorithm to process the data set to identify the presence or severity of the liver disease in the subject; and (ii) output a report that identifies the presence or severity of the liver disease in the subject.
  • the computer system further comprises an electronic display operatively coupled to the one or more computer processors, wherein the electronic display comprises a graphical user interface that is configured to display the report.
  • the assaying of (b) comprises sequence identification.
  • the sequence identification comprises use of array hybridization.
  • the sequence identification comprises enrichment for one or more genomic regions.
  • the present disclosure provides a non-transitory computer readable medium comprising machine-executable code that, upon execution by one or more computer processors, implements a method for identifying a presence or a severity of a liver disease in a subject, the method comprising: (a) obtaining a data set comprising a methylation profile of one or more genomic regions of a cell-free biological sample of the subject; (b) using a trained machine learning algorithm to process the data set, including the methylation profile, to identify the presence or the severity of the liver disease in the subject; and (c) outputting a report that identifies the presence or the severity of the liver disease in the subject.
  • non-transitory computer readable medium comprising machine executable code that, upon execution by one or more computer processors, implements any of the methods above or elsewhere herein.
  • the non-transitory computer readable medium comprises machine executable code that, upon execution by one or more computer processors, implements a method for identifying a presence or a severity of a liver disease in a subject, the method comprising: (a) obtaining a data set comprising a methylation profile of one or more genomic regions of a cell-free biological sample of the subject; (b) using a trained machine learning algorithm to process the data set, including the methylation profile, to identify the presence or the severity of the liver disease in the subject; and (c) outputting a report that identifies the presence or the severity of the liver disease in the subject.
  • Another aspect of the present disclosure provides a system comprising one or more computer processors and computer memory coupled thereto.
  • the computer memory comprises machine executable code that, upon execution by the one or more computer processors, implements any of the methods above or elsewhere herein.
  • the present disclosure provides a method for identifying a presence or a severity of a liver disease in a subject, comprising: (i) assaying nucleic acid molecules derived from a cell-free biological sample of the subject to generate a methylation profile of one or more genomic regions of the cell-free biological sample; and (ii) using a trained algorithm to process at least the methylation profile to identify the presence or the severity of the liver disease in the subject at an accuracy of at least 90% for at least 50 independent samples.
  • the subject has or is suspected of having a fatty liver.
  • the liver disease is non-alcoholic fatty liver disease (NAFLD).
  • the liver disease comprises fibrosis.
  • the liver disease comprises steatohepatitis.
  • identifying the presence or the severity of the liver disease comprises determining a presence or absence of fibrosis in the subject. In some embodiments, identifying the presence or the severity of the liver disease comprises determining a presence or absence of significant fibrosis in the subject. In some embodiments, the significant fibrosis is defined by greater than or equal to F2 fibrosis according to Non-Alcoholic Fatty Liver Disease Activity Score criteria. In some embodiments, identifying the presence or the severity of the liver disease comprises determining a presence or absence of liver inflammation in the subject.
  • identifying the presence or the severity of the liver disease comprises identifying a presence or severity of lobular inflammation in the subject. In some embodiments, the lobular inflammation is identified according to Non-Alcoholic Fatty Liver Disease Activity Score criteria. In some embodiments, identifying the presence or the severity of the liver disease comprises identifying a presence or severity of portal inflammation in the subject. In some embodiments, identifying the presence or the severity of the liver disease comprises identifying a presence or severity of hepatocellular ballooning. In some embodiments, the hepatocellular ballooning is identified according to Non-Alcoholic Fatty Liver Disease Activity Score criteria.
  • identifying the presence or the severity of the liver disease comprises determining a presence or absence of steatohepatitis in the subject. In some embodiments, identifying the presence or the severity of the liver disease comprises determining a presence of borderline steatohepatitis in the subject. In some embodiments, the steatohepatitis or the borderline steatohepatitis is identified according to Non-Alcoholic Fatty Liver Disease Activity Score criteria. In some embodiments, the presence or the severity of the liver disease is identified according to a Non-Alcoholic Fatty Liver Disease Activity Score composite comprising steatosis, lobular inflammation, and hepatocellular ballooning.
  • the presence or the severity of the liver disease is identified at an accuracy of at least about 80% for independent samples. In some embodiments, the accuracy is at least about 85%. In some embodiments, the accuracy is at least about 90%. In some embodiments, the accuracy is at least about 95%. In some embodiments, the accuracy is at least about 98%.
  • the assaying in (i) comprises subjecting the nucleic acid molecules to conditions sufficient to convert unmethylated cytosines in the nucleic acid molecules to uracils. In some embodiments, the assaying in (i) comprises subjecting the nucleic acid molecules to bisulfite processing.
  • the trained algorithm is trained with a training set comprising at least 100 training samples.
  • the at least 100 training samples comprise cell-free samples or tissue samples.
  • the at least 100 training samples comprise cell-free samples and tissue samples.
  • the cell-free samples are nucleic acid samples.
  • the trained algorithm is trained with liver tissue samples and cell-free nucleic acid samples.
  • the trained algorithm is trained with training samples derived from subjects having liver fibrosis.
  • the trained algorithm is trained with training samples derived from subjects having liver inflammation.
  • the trained algorithm is trained with training samples derived from subjects having steatohepatitis.
  • (ii) comprises correlating the methylation profile with additional methylation profiles obtained from a plurality of additional samples.
  • the plurality of additional samples comprises at least 100 additional samples.
  • the plurality of additional samples comprises cell-free samples or tissue samples.
  • the plurality of additional samples comprises cell-free samples and tissue samples.
  • the cell-free samples are nucleic acid samples.
  • the plurality of additional samples comprises liver tissue samples and cell-free nucleic acid samples.
  • at least a portion of the plurality of additional samples are derived from subjects having liver fibrosis.
  • at least a portion of the plurality of additional samples are derived from subjects having liver inflammation.
  • at least a portion of the plurality of additional samples are derived from subjects having steatohepatitis.
  • (i) comprises assaying deoxyribonucleic acid (DNA) molecules derived from the cell-free biological sample.
  • the DNA molecules correspond to one or more genomic regions selected from Table 1.
  • (i) comprises assaying ribonucleic acid (RNA) molecules derived from the cell-free biological sample.
  • (i) comprises assaying deoxyribonucleic acid (DNA) molecules and ribonucleic acid (RNA) molecules derived from the cell-free biological sample.
  • the nucleic acid molecules comprise deoxyribonucleic acid (DNA) molecules. In some embodiments, the nucleic acid molecules comprise ribonucleic acid (RNA) molecules.
  • the cell-free biological sample comprises or is derived from a bodily fluid.
  • the bodily fluid comprises blood.
  • the cell-free biological sample is obtained by a blood draw.
  • the bodily fluid comprises plasma.
  • the assaying of (i) comprises sequencing the nucleic acid molecules or derivatives thereof to provide a plurality of sequence reads. In some embodiments, (ii) comprises processing the plurality of sequence reads to identify hypomethylated and hypermethylated regions of the one or more genomic regions. In some embodiments, (ii) comprises processing the plurality of sequence reads to identify one or more markers of inflammation. In some embodiments, (ii) comprises processing the plurality of sequence reads to identify one or more markers of fibrosis.
  • the one or more genomic regions comprise one or more differentially methylated regions.
  • (ii) comprises using one or more computer processors to process the methylation profile using the trained algorithm.
  • the subject has or is suspected of having the liver disease. In some embodiments, the subject is undergoing or has undergone treatment for the liver disease.
  • the cell-free biological sample has not previously received a definitive diagnosis of the severity of the liver disease.
  • the trained algorithm is configured to identify the severity of the liver disease with a positive predictive value (PPV) of at least about 70%.
  • the PPV is at least about 80%.
  • the PPV is at least about 90%.
  • the PPV is at least about 95%.
  • the trained algorithm is configured to identify the severity of the liver disease with a negative predictive value (NPV) of at least about 70%.
  • the NPV is at least about 80%.
  • the NPV is at least about 90%.
  • the NPV is at least about 95%.
  • the trained algorithm is configured to identify the severity of the liver disease with a clinical sensitivity of at least about 70%.
  • the clinical sensitivity is at least about 80%.
  • the clinical sensitivity is at least about 90%.
  • the clinical sensitivity is at least about 95%.
  • the trained algorithm is configured to identify the severity of the liver disease with a clinical specificity of at least about 70%. In some embodiments, the clinical specificity is at least about 80%. In some embodiments, the clinical specificity is at least about 90%. In some embodiments, the clinical specificity is at least about 95%.
  • the trained algorithm is configured to identify the severity of the liver disease with an Area Under Curve (AUC) of at least about 0.80. In some embodiments, the AUC is at least about 0.90. In some embodiments, the AUC is at least about 0.95. In some embodiments, the AUC is at least about 0.99.
  • the trained algorithm comprises a supervised algorithm.
  • the supervised algorithm comprises a Random Forest, a support vector machine (SVM), a neural network, or a deep learning algorithm.
  • the assaying of (i) comprises sequence identification.
  • the sequence identification comprises use of array hybridization.
  • the sequence identification comprises enrichment for one or more genomic regions.
  • FIG. 1 schematically illustrates an exemplary workflow for a method of the present disclosure.
  • FIG. 2 illustrates a computer system that is programmed or otherwise configured to implement methods provided herein.
  • FIG. 3 illustrates a receiver operating characteristic (ROC) curve for fibrosis.
  • FIG. 4 illustrates a receiver operating characteristic (ROC) curve for steatohepatitis.
  • the term “subject,” as used herein, generally refers to an individual having, suspected of having, or at risk of having a disease (e.g., a liver disease), or a healthy individual.
  • the subject may be a mammal or non-mammal.
  • the subject may be symptomatic of the disease.
  • the subject may be asymptomatic with respect to the disease.
  • the subject may be a patient.
  • the subject may not be a patient.
  • the subject may be a human, dog, cat, rodent, bird, non-human primate, simian, farm animal (e.g., production cattle, dairy cattle, poultry, horses, pigs, etc.), sport animal, or companion animal (e.g., pet or support animal), or the like.
  • a subject may be known to have or to have previously had a disease (e.g., a liver disease) or condition.
  • a subject may be known to have diabetes and/or to be overweight or obese.
  • a subject may be an individual who has been diagnosed with having a disease (e.g., liver disease).
  • a subject may be known to have or to have previously had a fatty liver, liver fibrosis, liver inflammation, or another liver condition.
  • a subject may be undergoing treatment or may have previously undergone treatment for the disease (e.g., liver disease) or condition.
  • a subject may have previously had and/or undergone treatment for a disease or condition other than a liver disease.
  • a subject may be suspected of having a disease or condition such as a fatty liver, liver fibrosis, liver inflammation, or another liver condition.
  • a subject may be suspected of having the disease or condition based upon an analysis of risk factors such as a medical history of diabetes or other metabolic syndromes, obese/overweight status, body mass index, elevated triglycerides, reduced high density lipoprotein levels, elevated blood pressure, elevated fasting glucose, insulin resistance, thrombocytopenia, aspartate aminotransferase (AST) level greater than alanine aminotransferase (ALT) level, age, ethnic or racial heritage, nationality, place of birth or residence, or other factors.
  • NAFLD Fibrosis NAFLD Fibrosis
  • FIB-4 Fibrosis-4 index
  • MRE magnetic resonance elastography
  • a subject may be a female individual who is pregnant or planning to become pregnant who may have been diagnosed with or is suspected of having a disease (e.g., liver disease).
  • a presence of a disease may be identified in a subject.
  • a “presence” of a disease generally refers to a presence of one or more symptoms of a disease.
  • a presence of a disease may be assessed by, for example, evaluating a subject and/or a sample from the subject for one or more symptoms associated with the disease, evaluating a sample from the subject for one or more genetic signatures associated with the disease, or a combination thereof.
  • a subject may be identified as having a disease (e.g., liver disease) with a given severity.
  • A “severity” of a disease generally refers to an intensity or degree of a disease.
  • a severity may be a subjective or objective metric, or a combination thereof.
  • a severity may be indicated by a numerical indicator, such as a score from a medical evaluation.
  • a more severe disease e.g., a disease with a higher severity
  • a disease may be more severe in a higher risk patient, such as a patient having advanced age, a compromised or immature immune system, or complicating conditions.
  • a subject may have, for example, non-alcoholic fatty liver disease (NAFLD) (e.g., NAFLD comprising fatty liver, fibrosis, steatohepatitis, and/or inflammation), fibrosis, significant fibrosis, steatohepatitis, borderline steatohepatitis, liver inflammation, lobular inflammation, portal inflammation, steatosis, hepatocellular ballooning, or a combination thereof (e.g., by imaging or histology).
  • a subject may have a fatty liver or a liver disease (e.g., non-alcoholic fatty liver disease (NAFLD) (e.g., NAFLD comprising fatty liver, fibrosis, steatohepatitis, and/or inflammation), fibrosis, significant fibrosis, steatohepatitis, borderline steatohepatitis, liver inflammation, lobular inflammation, portal inflammation, steatosis, hepatocellular ballooning, or a combination thereof).
  • a subject may be at increased risk of having a severity of a liver disease, such as non-alcoholic fatty liver disease (NAFLD) (e.g., NAFLD comprising fatty liver, fibrosis, steatohepatitis, and/or inflammation), fibrosis, significant fibrosis, steatohepatitis, borderline steatohepatitis, liver inflammation, lobular inflammation, portal inflammation, steatosis, hepatocellular ballooning, or a combination thereof.
  • NAFLD non-alcoholic fatty liver disease
  • a subject may have a risk factor for having a severity of a liver disease (such as non-alcoholic fatty liver disease (NAFLD), fibrosis, significant fibrosis, steatohepatitis, borderline steatohepatitis, liver inflammation, lobular inflammation, portal inflammation, steatosis, hepatocellular ballooning, or a combination thereof), such as obesity, elevated triglycerides, reduced high density lipoprotein (HDL-C), elevated blood pressure, elevated fasting glucose (including type 2 diabetes and impaired fasting glucose), insulin resistance, thrombocytopenia, AST level greater than ALT level, high body mass index, or age at least about 50 years (e.g., at least about 40 years, 45 years, 50 years, 55 years, 60 years, 65 years, 70 years, 75 years, or older).
  • treatment for any aforementioned disease or condition may constitute a risk factor of having a severity of a liver disease (such as NASH). Risk may be assessed by NAFLD Fibrosis Score (NFS), Fibrosis-4 (FIB-4) index, vibration-controlled transient elastography (VCTE), and magnetic resonance elastography (MRE) assessments.
  • NFS NAFLD Fibrosis
  • FIB-4 Fibrosis-4
  • MRE magnetic resonance elastography
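The FIB-4 index named above can be illustrated with a short calculation. The sketch below uses the commonly published formula, FIB-4 = (age × AST) / (platelet count × √ALT), with AST and ALT in U/L and platelets in 10^9/L. The formula is not stated in this disclosure, and the function and parameter names are hypothetical. It also includes the "AST level greater than ALT level" risk factor listed earlier.

```python
import math


def fib4_index(age_years, ast_u_per_l, alt_u_per_l, platelets_10e9_per_l):
    """Return the FIB-4 index: (age x AST) / (platelets x sqrt(ALT)).

    Units assumed here: age in years, AST and ALT in U/L,
    platelet count in 10^9 per litre.
    """
    if alt_u_per_l <= 0 or platelets_10e9_per_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_per_l) / (
        platelets_10e9_per_l * math.sqrt(alt_u_per_l)
    )


def ast_exceeds_alt(ast_u_per_l, alt_u_per_l):
    """Flag the 'AST level greater than ALT level' risk factor."""
    return ast_u_per_l > alt_u_per_l


# Illustrative values only; these are not data from the disclosure.
example = fib4_index(age_years=50, ast_u_per_l=40.0,
                     alt_u_per_l=25.0, platelets_10e9_per_l=200.0)
print(round(example, 2))            # 2.0
print(ast_exceeds_alt(40.0, 25.0))  # True
```

No interpretation cutoffs are included because the text does not specify any.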
  • a subject providing a sample for analysis according to the methods provided herein may provide the sample for any reason.
  • a sample may be obtained from the subject upon recommendation or prescription by a medical professional, e.g., in response to an inciting event (e.g., a diabetic episode, heart attack, fainting spell, stroke, sudden weight change, etc.) or based on one or more features of the subject such as one or more risk factors (e.g., as described herein).
  • a subject may provide a sample for analysis according to the methods provided herein based on their own assessment (e.g., through a laboratory testing service, molecular profiling service, or other medical/research organization or laboratory).
  • a subject may provide a sample based on their weight, body mass index, comorbidities, or other risk factors.
  • a subject may provide a sample based on one or more other characteristics. For example, a subject may provide a sample based on their age (e.g., at least 18 years of age or no more than 75 years of age), race, ethnicity, national origin, place of residence, medical history, or another reason. Such characteristics may be applied, for example, in a study such as a clinical study or method verification study.
  • a subject may provide a sample based on belonging to a particular age range (e.g., at least 18 years of age, at least 21 years of age, at least 25 years of age, at least 30 years of age, at least 35 years of age, at least 40 years of age, at least 45 years of age, at least 50 years of age, at least 55 years of age, at least 60 years of age, at least 65 years of age, at least 70 years of age, at least 75 years of age, or older; or e.g., no more than 75 years of age, no more than 70 years of age, no more than 65 years of age, no more than 60 years of age, no more than 55 years of age, no more than 50 years of age, no more than 45 years of age, no more than 40 years of age, no more than 35 years of age, no more than 30 years of age, no more than 25 years of age, no more than 21 years of age, or younger).
  • a subject may be recommended to provide a sample by a medical professional (e.g., based on factors described herein).
  • a subject may be asked to participate in a clinical or research study (e.g., based on any factors described herein or any other factor) and provide a sample for analysis to facilitate the clinical or research study.
  • “nucleic acid” or “nucleic acid molecule” generally refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides (dNTPs) or ribonucleotides (rNTPs), or analogs thereof. Nucleic acid molecules may have any three-dimensional structure, and may perform any function, known or unknown.
  • dNTPs deoxyribonucleotides
  • rNTPs ribonucleotides
  • Non-limiting examples of nucleic acid molecules include deoxyribonucleic acid (DNA), ribonucleic acid (RNA), coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), short interfering RNA (siRNA), short-hairpin RNA (shRNA), complementary DNA (cDNA), and micro-RNA (miRNA) molecules.
  • DNA deoxyribonucleic acid
  • RNA ribonucleic acid
  • nucleic acid molecules include ribozymes, recombinant nucleic acids, branched nucleic acids, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers.
  • a nucleic acid molecule may be a cell-free nucleic acid molecule, such as a cell-free DNA (cfDNA) molecule.
  • Cell-free nucleic acid molecules include, for example, nucleic acid molecules derived from cells (e.g., dead or degraded cells), fetal nucleic acid molecules such as cell-free fetal DNA (cffDNA) molecules, and circulating tumor nucleic acid molecules such as circulating tumor DNA (ctDNA) molecules.
  • a nucleic acid molecule may comprise one or more modified nucleotides, such as one or more methylated nucleotides or nucleotide analogs.
  • modifications to the structure of a nucleotide may be made before or after assembly of the nucleic acid molecule.
  • Any component of a nucleotide or nucleotide analog may be modified.
  • a sugar, nucleobase, and/or linker (e.g., phosphate) component of a nucleotide or nucleotide analog may be modified.
  • a nucleobase or nucleobase analog may be methylated or de-methylated.
  • a sequence of nucleotides of a nucleic acid molecule may be interrupted by non-nucleotide components.
  • a nucleic acid molecule may be further modified after polymerization, such as by conjugation or binding with a reporter agent, barcode, or tag, such as a fluorescent moiety or other label.
  • a nucleic acid molecule may also be derivatized by addition, modification, or deletion of one or more nucleotides or nucleic acid sequences, including by incorporation of a flow cell adapter, sequencing adapter, sequencing primer, unique molecular identifier sequence, barcode sequence, or other feature.
  • the terms “amplifying” and “amplification” are used interchangeably and generally refer to generating one or more copies or “amplified product” of a nucleic acid.
  • the term “DNA amplification” generally refers to generating one or more copies of a DNA molecule or “amplified DNA product”.
  • the term “reverse transcription amplification” generally refers to the generation of deoxyribonucleic acid (DNA) from a ribonucleic acid (RNA) template via the action of a reverse transcriptase.
  • target nucleic acid generally refers to a nucleic acid molecule in a starting population of nucleic acid molecules having a nucleotide sequence whose presence, amount, methylation state, and/or sequence, or changes in one or more of these, are desired to be determined.
  • a target nucleic acid may be any type of nucleic acid (e.g., as described elsewhere herein), including DNA, RNA, and analogues thereof.
  • RNA target ribonucleic acid
  • DNA target deoxyribonucleic acid
  • biological sample generally refers to a sample from a subject.
  • a sample may be obtained from a subject via any suitable method, including, but not limited to, spitting, swabbing, blood draw, obtaining excretions (e.g., urine, stool, sputum, vomit, or saliva), excision, puncture, scraping, and biopsy.
  • a sample may be obtained from a subject by, for example, accessing the circulatory system (e.g., intravenously or intraarterially), surgically extracting a tissue (e.g., biopsy), collecting a secreted biological sample (e.g., stool, urine, saliva, sputum, etc.), or breathing.
  • the sample may be obtained by non-invasive methods including but not limited to: scraping of the skin or cervix, swabbing of the cheek, saliva collection, urine collection, feces collection, collection of menses, tears, or semen.
  • the sample may be obtained by an invasive procedure including but not limited to: biopsy, needle aspiration, or phlebotomy.
  • a sample may comprise a bodily fluid such as, but not limited to, blood (e.g., whole blood, red blood cells, leukocytes or white blood cells, platelets), plasma, serum, sweat, tears, saliva, sputum, urine, mucus, semen, synovial fluid, breast milk, colostrum, amniotic fluid, bile, interstitial or extracellular fluid, bone marrow, or cerebrospinal fluid.
  • a sample may be obtained by a puncture method to obtain a bodily fluid comprising blood and/or plasma.
  • Such a sample may comprise both cells and cell-free nucleic acid material.
  • the sample may be obtained from any other source including but not limited to blood, sweat, hair follicle, buccal tissue, tears, menses, feces, or saliva.
  • the biological sample may be a tissue sample, such as a tumor biopsy.
  • the sample may be obtained from any of the tissues provided herein including, but not limited to, skin, heart, lung, kidney, breast, pancreas, liver, muscle, smooth muscle, bladder, gall bladder, colon, intestine, brain, prostate, esophagus, or thyroid.
  • the methods of obtaining provided herein include methods of biopsy including fine needle aspiration, core needle biopsy, vacuum assisted biopsy, large core biopsy, incisional biopsy, excisional biopsy, punch biopsy, shave biopsy or skin biopsy.
  • the biological sample may comprise one or more cells.
  • a biological sample may comprise one or more nucleic acid molecules such as one or more deoxyribonucleic acid (DNA) and/or ribonucleic acid (RNA) molecules. Nucleic acid molecules may be included within cells. Alternatively or in addition, nucleic acid molecules may not be included within cells (e.g., cell-free nucleic acid molecules).
  • the biological sample may be a cell-free sample.
  • cell-free sample generally refers to a sample that is substantially free of cells (e.g., less than about 10% cells on a volume basis, such as less than about 9%, less than about 8%, less than about 7%, less than about 6%, less than about 5%, less than about 4%, less than about 3%, less than about 2%, less than about 1%, or lower).
  • a cell-free sample may be derived from any source (e.g., as described herein).
  • a cell-free sample may be derived from blood, saliva, urine, or sweat.
  • a cell-free sample may be derived from a tissue or bodily fluid.
  • a cell-free sample may be derived from a plurality of tissues or bodily fluids. For example, a sample from a first tissue or fluid may be combined with a sample from a second tissue or fluid (e.g., while the samples are obtained or after the samples are obtained). In an example, a first fluid and a second fluid may be collected from a subject (e.g., at the same or different times) and the first and second fluids may be combined to provide a sample.
  • a cell-free sample may comprise one or more nucleic acid molecules such as one or more DNA or RNA molecules.
  • a sample (e.g., a sample comprising one or more cells) may be processed to provide a cell-free sample.
  • a sample that includes one or more cells as well as one or more nucleic acid molecules (e.g., DNA and/or RNA molecules) not included within cells may be obtained from a patient.
  • the sample may be subjected to processing (e.g., as described herein) to separate cells and other materials from the nucleic acid molecules not included within cells, thereby providing a cell-free sample (e.g., comprising nucleic acid molecules not included within cells).
  • the cell-free sample may then be subjected to further analysis and processing (e.g., as provided herein).
  • Nucleic acid molecules not included within cells may be derived from cells and tissues.
  • cell-free nucleic acid molecules may derive from a tumor tissue or a degraded cell (e.g., of a tissue of a body).
  • Cell-free nucleic acid molecules may comprise any type of nucleic acid molecules (e.g., as described herein).
  • Cell-free nucleic acid molecules may be double-stranded, single-stranded, or a combination thereof.
  • Cell-free nucleic acid molecules may be released into a bodily fluid through secretion or cell death processes, e.g., cellular necrosis, apoptosis, or the like.
  • Cell-free nucleic acid molecules may be released into bodily fluids from cancer cells (e.g., circulating tumor DNA (ctDNA)) or may be released into bodily fluids from healthy cells.
  • Cell-free nucleic acid molecules may be fetal DNA circulating freely in a maternal blood stream (e.g., cell-free fetal nucleic acid molecules such as cffDNA).
  • a sample (e.g., a biological sample or cell-free biological sample) suitable for use according to the methods provided herein may be any material comprising tissues, cells, degraded cells, nucleic acids, genes, gene fragments, expression products, gene expression products, and/or gene expression product fragments of an individual to be tested. Methods for determining sample suitability and/or adequacy are provided.
  • a sample may include, but is not limited to, blood, plasma, tissue, cells, degraded cells, cell-free nucleic acid molecules, and/or biological material from cells or derived from cells of an individual such as cell-free nucleic acid molecules.
  • the sample may be a heterogeneous or homogeneous population of cells, tissues, or cell-free biological material.
  • the biological sample may be obtained using any method that can provide a sample suitable for the analytical methods described herein.
  • the methods of the present disclosure may comprise obtaining a sample (e.g., as described herein) from a subject. Multiple samples may be obtained from the same subject (e.g., as described herein) to ensure a sufficient amount of biological material is obtained (e.g., an amount sufficient for analysis according to the methods provided herein). Multiple samples may be obtained from the same subject at different times. For example, a first sample may be obtained at a first time and a second sample may be obtained at a second time that is different than the first time (e.g., seconds, minutes, hours, days, weeks, or months apart).
  • the first sample is obtained from a subject before the subject undergoes a treatment regimen or procedure and the second sample is obtained from the subject after the subject undergoes the treatment regimen or procedure.
  • multiple samples may be obtained from the same subject at the same or approximately the same time. Different samples obtained from the same subject may be obtained in the same or different manner. For example, a first sample may be obtained via a biopsy and a second sample may be obtained via a blood draw. A sample may be divided into multiple portions (e.g., at the time of collection or subsequent to collection). Samples obtained in different manners may be obtained by different medical professionals, using different techniques, at different times, and/or at different locations. Different samples obtained from the same subject may be obtained from different areas of a body.
  • a first sample may be obtained from a first area of a body (e.g., a first tissue) and a second sample may be obtained from a second area of the body (e.g., a second tissue).
  • a medical professional may obtain a biological sample for testing.
  • the medical professional may refer the subject to a testing center or laboratory for submission of the biological sample.
  • the subject may provide the sample.
  • a subject may provide a sample for testing by a laboratory facility or service.
  • a molecular profiling business may facilitate obtaining the sample (e.g., by providing a kit that may be used to obtain a sample from a subject).
  • a sample may be filtered to remove contaminants or other materials.
  • a sample comprising cells may be processed to separate the cells from other material in the sample.
  • Such a process may be used to prepare a sample comprising only cell-free nucleic acid molecules.
  • such a process may consist of a two-step centrifugation process.
  • a two-step centrifugation process may comprise centrifuging whole blood at, for example, about 1,600 G for about 10 minutes at room temperature.
  • the resultant blood plasma may then be centrifuged at about 16,000 G for about 10 minutes at about 4 °C to yield a largely cell-free sample.
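The two-step centrifugation described above can be written down as a small structured protocol, which is convenient when processing conditions need to be recorded alongside each sample. This is a minimal sketch: the class and field names are hypothetical, and the first-step temperature is left without a numeric value because the text gives it only as room temperature.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SpinStep:
    input_material: str
    relative_force_g: float         # approximate ("about") value from the text
    minutes: float                  # approximate ("about") value from the text
    temperature_c: Optional[float]  # None means room temperature (no number given)


# Values taken from the description; everything else is illustrative.
TWO_STEP_PROTOCOL: List[SpinStep] = [
    SpinStep("whole blood", 1_600, 10, None),
    SpinStep("blood plasma", 16_000, 10, 4.0),
]


def total_spin_minutes(steps: List[SpinStep]) -> float:
    """Sum the centrifugation time across all steps."""
    return sum(step.minutes for step in steps)


def describe(steps: List[SpinStep]) -> List[str]:
    """Render each step as a human-readable line for a processing log."""
    lines = []
    for number, step in enumerate(steps, start=1):
        temp = ("room temperature" if step.temperature_c is None
                else f"about {step.temperature_c:g} °C")
        lines.append(
            f"Step {number}: spin {step.input_material} at about "
            f"{step.relative_force_g:,.0f} G for about {step.minutes:g} min at {temp}"
        )
    return lines


for line in describe(TWO_STEP_PROTOCOL):
    print(line)
```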
  • the sample may be obtained, stored, or transported using components of a kit provided herein.
  • multiple samples such as multiple liver tissue samples and/or blood samples may be obtained for analysis and/or diagnosis by the methods of the present disclosure.
  • multiple samples such as one or more samples from one tissue type (e.g., liver) and one or more blood samples may be obtained for diagnosis by the methods of the present disclosure.
  • samples may be obtained at different times and stored and/or analyzed. For example, a sample may be obtained and analyzed by cytological analysis (e.g., staining).
  • a first sample obtained from a subject may be obtained at the same time as a second sample from the same subject, and the first and second samples may be analyzed at different times and/or by different processes.
  • the first sample may be a blood sample that is analyzed according to the methods provided herein and the second sample may be a biopsy sample that is analyzed according to another method at the same or a different time.
  • further sample may be obtained from a subject based on the results of a cytological analysis.
  • Identification of a liver disease or of a severity of a liver disease may include an examination of a subject by a physician, nurse, or other medical professional. The examination may be part of a clinical examination, or the examination may be due to a specific complaint including but not limited to one of the following: pain, illness, anticipation of illness, presence of a suspicious lump or mass, a disease, or a condition.
  • the condition may be, for example, being overweight or obese, and/or having one or more of the following: a high body mass index, diabetes, high blood sugar, insulin resistance, high cholesterol and/or triglycerides, a metabolic syndrome, a polycystic ovary syndrome, sleep apnea, hypothyroidism, and hypopituitarism.
  • the subject may or may not be aware of the disease or condition.
  • the medical professional may obtain a biological sample for testing. In some cases, the medical professional may refer the subject to a testing center or laboratory for submission of the biological sample.
  • the subject may be referred to a specialist, such as a surgeon, radiologist, gastroenterologist, hepatologist, or pathologist for further diagnosis.
  • the specialist may likewise obtain a biological sample for testing or refer the individual to a testing center or laboratory for submission of the biological sample.
  • the biological sample may be obtained by a physician, nurse, or other medical professional such as a medical technician, endocrinologist, cytologist, phlebotomist, radiologist, gastroenterologist, hepatologist, or a pulmonologist.
  • the medical professional may indicate the appropriate test or assay to perform on the sample, and/or the analysis of the present disclosure (e.g., a report generated according to the methods provided herein) may inform on which assays or tests are most appropriately indicated.
  • a medical professional may not be involved in the initial analysis, diagnosis, or sample acquisition processes.
  • an individual e.g., a human subject
  • the kit may contain one or more devices and/or reagents for obtaining the sample as described elsewhere herein, one or more storage units for storing the sample for inspection, and instructions for use of the kit.
  • molecular profiling services are included in the price for purchase of the kit. In other cases, the molecular profiling services are billed separately.
  • a sample (e.g., a biological sample or cell-free biological sample) may be stored for a period of time between its collection and analysis. For example, a sample may be stored for seconds, minutes, hours, days, weeks, months, years or longer after the sample is obtained and before the sample is analyzed.
  • a sample obtained from a subject may be subdivided prior to the step of storage or further analysis such that different portions of the sample may be tested at different times and/or using different assays.
  • a portion of a sample may be subjected to downstream methods or processes including, but not limited to, blood sample fractionation, storage, cytological analysis, adequacy tests, nucleic acid extraction, molecular profiling, or a combination thereof.
  • a portion of the sample may be stored while another portion of the sample is processed (e.g., according to the methods provided herein).
  • a first portion of the sample may be stored and a second portion may be subjected to, e.g., molecular profiling; cytological staining; nucleic acid (RNA or DNA) extraction, detection, or quantification; gene expression product (RNA or Protein) extraction, detection, or
  • a sample may be obtained and stored and subsequently subdivided. One or more portions of the sample may then be subjected to one or more different downstream methods or processes including, but not limited to, blood sample fractionation, storage, cytological analysis, adequacy tests, nucleic acid extraction, molecular profiling or a combination thereof.
  • a sample or a portion thereof may be stored in a refrigerator or freezer.
  • a sample or a portion thereof may be fixed prior to or during storage such as by using glutaraldehyde, formaldehyde, methanol, a cross-linking agent, or another material. Samples may be stored upon acquisition to facilitate transport, or to wait for the results of other analyses. Alternatively or in addition, samples may be stored while awaiting instructions from a physician or other medical professional.
  • the acquired sample may be placed in a suitable medium, excipient, solution, or container for short term or long term storage. Storage may require keeping the sample in a refrigerated or frozen environment.
  • the sample may be frozen (e.g., rapidly frozen) prior to storage in a frozen environment.
  • a sample may be contacted with a suitable preservative and/or cryoprotectant including, but not limited to, glycerol, ethylene glycol, sucrose, trehalose, or glucose.
  • concentrations of ammonium salts include solutions of at least about 0.1 g/ml, 0.2 g/ml, 0.3 g/ml, 0.4 g/ml, 0.5 g/ml, 0.6 g/ml, 0.7 g/ml, 0.8 g/ml, 0.9 g/ml, 1.0 g/ml, 1.1 g/ml, 1.2 g/ml, 1.3 g/ml,
  • a medium, excipient, or solution that may be added to a sample to facilitate its storage may or may not be sterile.
  • a sample may be stored at room temperature or at reduced temperatures such as cold temperatures (e.g., between about -20°C and about 0°C), or freezing temperatures, including for example at most about 0°C, -1°C, -2°C, -3°C, -4°C, -5°C, -6°C, -7°C, -8°C, -9°C, -10°C, -12°C, -14°C, -15°C, -16°C, -20°C, -22°C, -25°C, -28°C, -30°C, -35°C, -40°C, -45°C, -50°C, -60°C, -70°C, -80°C, -100°C, -120°C
  • a medium, excipient, or solution that may be added to a sample to facilitate its storage may contain preservative agents to maintain the sample in an adequate state for subsequent diagnostics or manipulation, or to prevent coagulation.
  • preservatives may include citrate, ethylene diamine tetraacetic acid, sodium azide, or thimerosal.
  • the medium, excipient or solution may contain suitable buffers or salts such as Tris buffers or phosphate buffers, sodium salts (e.g., NaCl), calcium salts, magnesium salts, and the like.
  • the sample may be stored in a commercial preparation suitable for storage of cells for subsequent cytological analysis such as but not limited to Cytyc ThinPrep, SurePath, or
  • a sample container may be any container suitable for storage and/or transport of the biological sample including but not limited to: a cup, a cup with a lid, a tube, a sterile tube, a vacuum tube, a syringe, a bottle, a microscope slide, or any other suitable container.
  • the container may or may not be sterile.
  • the methods of the present disclosure may comprise transport of the sample.
  • the sample is transported from a clinic, hospital, doctor's office, or other location to a second location whereupon the sample may be stored and/or analyzed by, for example, cytological analysis or molecular profiling.
  • the sample may be transported to a laboratory such as a laboratory authorized or otherwise capable of analyzing the sample (e.g., as described herein) such as a Clinical Laboratory Improvement Amendments (CLIA) laboratory.
  • CLIA Clinical Laboratory Improvement Amendments
  • the sample may be transported by the individual from whom the sample derives. Transportation by the individual may involve the individual appearing at a designated sample receiving point and providing a sample or providing a sample and transporting the sample (e.g., via mail, air shipping, freight shipping, or similar process).
  • Providing a sample may involve any of the techniques of sample acquisition described herein, or the sample may have already been acquired and stored in a suitable container as described herein.
  • the sample may be transported in any medium or excipient including any medium or excipient provided herein suitable for storing the sample such as a cryopreservation medium or a liquid based cytology preparation.
  • the sample may be transported frozen or refrigerated such as at any of the suitable sample storage temperatures provided herein.
  • the sample may be assayed using a variety of analysis techniques, such as cytological assays and genomic analysis.
  • Such tests may be indicative of liver disease, the severity of the liver disease, any other disease or condition, the presence of disease markers, or the absence of diseases, conditions, or disease markers.
  • the tests may take the form of cytological examination including microscopic examination as described below.
  • the tests may involve the use of one or more cytological stains.
  • the biological material may be manipulated or prepared for the test prior to administration of the test by various approaches for biological sample preparation (e.g., nucleic acid amplification such as polymerase chain reaction (PCR) or library preparation for sequencing).
  • PCR polymerase chain reaction
  • the specific assay performed may be determined by the physician who ordered the test, or a third party such as a consulting medical professional, cytology laboratory, the subject from whom the sample derives, or an insurance provider.
  • the specific assay may be chosen based on the likelihood of obtaining a definite diagnosis, the cost of the assay, the speed of the assay, or the suitability of the assay to the type of material provided.
  • Clinical data may be collected from a subject.
  • data may comprise a diagnosis of a disease or condition, medical history, personal information (e.g., height, weight, waist circumference, body mass index, age, date of birth, marital status, race/ethnicity, place of residence, nation of origin, etc.), social history, and other details (e.g., cholesterol and triglycerides levels, glucose levels, blood pressure, allergies, sensitivities and resistances (e.g., insulin resistance), hormone levels, blood cell counts, etc.) of a subject.
  • clinical data may include information about use of an investigational drug intended to affect a liver disease.
  • Other data may include, for example, documented steatosis by imaging or histology, including specification of modality (e.g., US, CT, MRI, or histology) and date of procedure; documented increased risk for SH and/or fibrosis; laboratory values including: AST, ALT, platelet count, albumin, triglycerides, HDL, fasting glucose, fasting insulin; demographic and clinical values including age, race, ethnicity, height, weight, waist circumference, blood pressure; VCTE (Fibroscan®) and/or magnetic resonance elastography (MRE) findings; social history including alcohol consumption and smoking history; relevant health/disease history and anthropometric data including medication(s) for metabolic syndrome (MetS) components, other concomitant medications, diagnosis of type 2 diabetes or metabolic syndrome, and history of malignancies; and liver biopsy findings from local histology read, including steatosis grading, presence of steatohepatitis, NAFLD Activity Score (NAS) component scores, and fibrosis staging.
  • the present disclosure provides methods, systems, and kits for identifying liver disease by processing biological samples, such as cell-free samples obtained from subjects.
  • Such subjects may include subjects with a liver disease (e.g., non-alcoholic fatty liver disease
  • subjects may include patients with NAFLD who are at increased risk of having SH and/or fibrosis.
  • the present disclosure provides a method 100 for identifying a severity of a liver disease in a subject, comprising: (a) providing a cell-free biological sample from said subject (102); (b) assaying nucleic acid molecules derived from said cell-free biological sample to generate a data set comprising a methylation profile of one or more genomic regions of said cell-free biological sample (104); (c) using a trained machine learning algorithm to process said data set to identify said severity of said liver disease in said subject (106); and (d) outputting a report that identifies said severity of said liver disease in said subject (108).
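The flow of steps (b) through (d) of method 100 can be sketched as follows, assuming the sample of step (a) has already been assayed into per-region read counts. This is a toy illustration only: the region names, severity labels, model form (a logistic model standing in for "a trained machine learning algorithm"), weights, and threshold are all hypothetical and are not taken from the disclosure.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical labels; the disclosure does not fix a set of severity classes here.
SEVERITY_LABELS = ("lower severity", "higher severity")


@dataclass
class ToyModel:
    """Stand-in for a trained classifier: logistic model over regional methylation."""
    weights: Dict[str, float]
    bias: float

    def predict_proba(self, profile: Dict[str, float]) -> float:
        # Regions absent from the profile contribute nothing to the score.
        z = self.bias + sum(
            w * profile.get(region, 0.0) for region, w in self.weights.items()
        )
        return 1.0 / (1.0 + math.exp(-z))


def methylation_profile(read_counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Step (b): methylated fraction per region from (methylated, total) read counts."""
    return {region: m / t for region, (m, t) in read_counts.items() if t > 0}


def identify_severity(model: ToyModel, profile: Dict[str, float],
                      threshold: float = 0.5) -> Tuple[str, float]:
    """Step (c): map the model score to a severity label."""
    score = model.predict_proba(profile)
    return SEVERITY_LABELS[int(score >= threshold)], score


def make_report(subject_code: str, label: str, score: float) -> str:
    """Step (d): output a short report."""
    return f"Subject {subject_code}: {label} (model score {score:.2f})"


# Made-up counts and weights for illustration.
counts = {"region_A": (80, 100), "region_B": (10, 100)}
model = ToyModel(weights={"region_A": 4.0, "region_B": -2.0}, bias=-2.0)
label, score = identify_severity(model, methylation_profile(counts))
print(make_report("S-001", label, score))
```

In practice the model would be trained on labeled methylation data sets. The fixed weights here exist only so the example runs end to end.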
  • a biological sample (e.g., cell-free biological sample) may be collected from a subject (e.g., a patient having or suspected of having a fatty liver and/or other risk factor for NAFLD, as described herein).
  • the subject may be at risk of having steatohepatitis, liver fibrosis, or another condition (e.g., as described herein).
  • the subject may have undergone, or may have received a clinical recommendation to undergo, a liver biopsy procedure.
  • a sample (e.g., blood sample) may be collected from the subject (e.g., via a blood draw).
  • sample may be obtained from the subject about 1 hour, about 12 hours, about 1 day, about 1 week, about 2 weeks, about 3 weeks, about 4 weeks, about 5 weeks, about 6 weeks, about 8 weeks, about 10 weeks, or about 12 weeks prior to analysis of the sample, or
  • a sample may be collected from the subject many months or years prior to analysis.
  • the sample may be obtained from the subject about 1 hour, about 12 hours, about 1 day, about 1 week, about 2 weeks, about 3 weeks, about 4 weeks, about 5 weeks, about 6 weeks, about 8 weeks, about 10 weeks, or about 12 weeks prior to a planned liver biopsy, or immediately prior to a planned liver biopsy.
  • the sample may be obtained from a subject after a liver biopsy or from a subject with no recent or planned liver biopsy.
  • the sample may include at least about 1 mL, about 5 mL, about 10 mL, about 20 mL, about 30 mL, about 40 mL, about 50 mL, about 60 mL, about 70 mL, about 80 mL, about 90 mL, about 100 mL, or more of a bodily fluid (e.g., blood).
  • Clinical and/or demographic data of the subject may be collected.
  • the sample and any data relating to the subject may be de-identified (e.g., assigned an identifier such as a numerical code, barcode, or QR code).
  • the sample may be stored prior to analysis and/or processing.
  • the sample may be stored at a variety of different conditions, such as different temperatures (e.g., at room temperature or under refrigeration or freezer conditions, such as at no more than about 4°C, -18°C, -20°C, -80°C, or less) or with different preservatives and/or cryoprotectants (e.g., EDTA or glucose).
  • a sample may be obtained using an EDTA (“lavender top”) Vacutainer, which may be stored upright at about 4°C until centrifugation.
  • a cell-free biological sample may comprise or be derived from a bodily fluid (e.g., blood or plasma).
  • a blood sample may be centrifuged at about 1100-1800 G for about 10 minutes at room temperature to fractionate the blood sample into a plasma layer, a buffy coat layer, and a red blood cell layer.
  • the plasma layer comprising a cell-free portion of the biological sample may be separated from the remainder of the blood sample, such as by pipetting into another container.
  • the plasma layer may be further centrifuged at about 12,000-18,000 G to remove cellular debris, and may be further transferred into another container such as a cryovial.
  • Cryovials containing collected cell-free biological sample may be stored in a freezer (e.g., a -20°C or -80°C freezer).
  • the cell-free biological samples may be shipped under refrigeration (e.g., at no more than about 4°C, 0°C, -18°C, -20°C, or colder) via, e.g., use of dry ice.
  • the biological sample may be obtained from a subject with a severity (e.g., known severity) of a liver disease, from a subject that is suspected of having the severity of the liver disease or disorder, or from a subject that does not have or is not suspected of having the severity of the liver disease (e.g., as described herein).
  • the liver disease may comprise non-alcoholic fatty liver disease (NAFLD), fibrosis, significant fibrosis, steatohepatitis, borderline steatohepatitis, liver inflammation, lobular inflammation, portal inflammation, steatosis, hepatocellular ballooning, or a combination thereof.
  • the biological sample may be taken before and/or after treatment of a subject with a disease or disorder.
  • treatment may comprise an investigational treatment.
  • Biological samples may be taken during a treatment or a treatment regime (e.g., for a liver disease or condition or for another disease or condition). Multiple biological samples (e.g., cell-free biological samples) may be taken from a subject to monitor the effects of the treatment over time.
  • the biological sample may be obtained from a subject known or suspected of having a severity of a liver disease for which a definitive positive or negative diagnosis is not available via clinical tests such as NAFLD Fibrosis Score (NFS), FIB-4, Enhanced Liver Fibrosis (ELF) test, or VCTE (Fibroscan®).
  • the biological sample may be taken from a subject suspected of having a severity of a liver disease.
  • the biological sample may be taken from a subject experiencing symptoms (e.g., explained or unexplained symptoms), such as fatigue, discomfort in the upper right side of the abdomen, nausea, weight loss, aches and pains, weakness, or memory loss.
  • the biological sample may be taken from a subject who has or will have another clinical assessment relating to liver health, such as a liver biopsy.
  • the subject may not have previously received a definitive diagnosis of a liver disease and/or a severity of a liver disease.
  • Nucleic acid molecules of a sample may be extracted from the sample.
  • the sample may be processed to remove materials other than nucleic acid molecules (e.g., contaminants, cells, debris, and other materials).
  • Nucleic acid molecules may be assayed (e.g., using nucleic acid sequencing) to generate a data set comprising a methylation profile of one or more genomic regions of the cell-free biological sample.
  • the assay may measure an absolute amount or a relative amount of the nucleic acid molecules or sequences thereof (including an absolute or relative level of methylation within said molecules) corresponding to one or more genomic regions in the methylation profile of the data set.
  • a difference in the absolute amount or relative amount of the nucleic acid molecules or sequence thereof (including an absolute or relative level of methylation within said molecules) corresponding to one or more genomic regions in the methylation profile of the data set may be indicative of a liver disease (e.g., as described herein).
  • Methods to assess the methylation state of nucleic acid molecules or sequences thereof may include whole genome bisulfite sequencing (WGBS), targeted bisulfite amplicon sequencing (TBAS), hybrid capture of target molecules followed by bisulfite conversion and sequencing, hybrid capture of bisulfite converted target molecules followed by sequencing, nanopore sequencing with direct detection or inference of nucleotide methylation state, or any other suitable sequencing approaches.
  • Nucleic acid molecules may be extracted from a sample (e.g., cell-free biological sample) and assayed.
  • the nucleic acid molecules may comprise deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and/or other nucleic acid molecules (e.g., as described herein).
  • Nucleic acid molecules may be extracted from the biological sample by a variety of methods (e.g., as described herein).
  • nucleic acid molecules may be extracted from the sample using a QIAamp Circulating Nucleic Acid kit (Qiagen, Valencia, CA). Extraction of nucleic acid molecules from a sample may extract all nucleic acid molecules (e.g., DNA molecules) from a sample.
  • Extracted RNA molecules from a sample may be converted to DNA molecules by reverse transcription (RT) to provide complementary DNA (cDNA) molecules.
  • assaying may comprise assaying the sample using probes that are selected for a plurality of target nucleic acid sequences (e.g., as described herein).
  • Nucleic acid molecules may be subjected to conditions sufficient to convert unmethylated cytosines in the nucleic acid molecules to uracils (e.g., subsequent to extraction from a sample).
  • the nucleic acid molecules may be subjected to bisulfite processing.
  • Bisulfite treatment of nucleic acid molecules deaminates unmethylated cytosine bases, converting them to uracil bases. This “bisulfite conversion” process does not deaminate methylated or hydroxymethylated cytosines (e.g., at the 5 position, such as 5mC or 5hmC).
  • Nucleic acid molecules may be oxidized prior to undergoing bisulfite conversion to convert hydroxymethylated cytosine (e.g., 5hmC) to formylcytosine and carboxylcytosine (e.g., 5-formylcytosine and 5-carboxylcytosine). These oxidized products may be sensitive to bisulfite conversion. Nucleic acid molecules may also be subjected to further processing including other derivatization processes (e.g., to incorporate, modify, and/or delete one or more sequences, tags, or labels). In some cases, functional sequences (e.g., sequencing adapters, flow cell adapters, sequencing primers, etc.) may be added to nucleic acid molecules to facilitate nucleic acid sequencing.
  • derivatives of nucleic acid molecules from a sample may comprise processed nucleic acid molecules including bisulfite-modified nucleic acid molecules, reverse-transcribed nucleic acid molecules, tagged nucleic acid molecules, barcoded nucleic acid molecules, and other modified nucleic acid molecules.
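The bisulfite conversion chemistry described above can be illustrated in silico. The following is a minimal sketch, not the disclosed assay: it assumes a single DNA strand, assumes complete conversion, and assumes that converted uracils are read as thymine after amplification and sequencing. The function names `bisulfite_convert` and `call_methylation` and the example sequence are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite conversion of one DNA strand.

    Unmethylated cytosines are deaminated to uracil, which is read as "T"
    after amplification. Cytosines at the given 0-based positions are treated
    as methylated (5mC or 5hmC) and are left unchanged.
    """
    protected = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in protected:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted_read):
    """Compare a converted read with its reference, position by position.

    Returns {position: True/False} for every reference cytosine, where True
    means the cytosine survived conversion and is inferred to be methylated.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), converted_read)):
        if ref_base == "C":
            calls[i] = read_base == "C"
    return calls


# Hypothetical example: only the cytosine at position 1 is methylated.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions=[1])
calls = call_methylation(reference, read)
```

Because 5mC and 5hmC are both protected, this comparison alone cannot tell them apart; the oxidation step described above is what makes 5hmC convertible.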
  • Sequencing reads may be aligned with and/or analyzed with regard to a reference genome. Based at least in part on sequencing reads, an absolute amount or relative amount of nucleic acid molecules (including an absolute or relative level of methylation within said molecules) corresponding to one or more genomic regions may be measured. Alternatively, sequencing reads may not be used to determine an amount or relative amount of nucleic acid molecules.
  • a data set comprising a genomic profile (e.g., methylation profile) of one or more genomic regions of a sample may be generated based at least in part on sequencing reads.
  • Sequencing reads may be processed to identify hypomethylated and/or hypermethylated regions of the one or more genomic regions. Alternatively or in addition, sequencing reads may be processed to identify one or more markers of inflammation (e.g., liver inflammation) or fibrosis.
  • the one or more genomic regions may comprise one or more differentially methylated regions.
  • Sequence identification may be performed by sequencing, array hybridization (e.g., Affymetrix), or nucleic acid amplification (e.g., PCR), for example.
  • Sequencing may be performed by any suitable sequencing methods, such as massively parallel sequencing (MPS), paired-end sequencing, high-throughput sequencing, next-generation sequencing (NGS), shotgun sequencing, single-molecule sequencing, nanopore sequencing, nanopore sequencing with direct detection or inference of methylation status, semiconductor sequencing,
  • Sequencing may comprise bisulfite sequencing (BS-Seq), such as whole genome bisulfite sequencing (WGBS) and/or oxidative bisulfite sequencing (oxBS-Seq).
  • Sequencing and/or preparing a nucleic acid sample for sequencing may comprise performing one or more nucleic acid reactions such as one or more nucleic acid amplification processes (e.g., of DNA or RNA molecules).
  • Nucleic acid amplification may comprise, for example, reverse transcription, primer extension, asymmetric amplification, rolling circle amplification, ligase chain reaction, polymerase chain reaction (PCR), and multiple
  • PCR methods include digital PCR (dPCR), emulsion PCR (ePCR), quantitative PCR (qPCR), real-time PCR (RT-PCR), hot start PCR, multiplex PCR, asymmetric PCR, nested PCR, and assembly PCR.
  • the PCR may be used for global amplification of nucleic acid molecules.
  • This may comprise using adapter sequences that may be first ligated to different molecules followed by PCR amplification using universal primers.
  • PCR may be performed using any of a number of commercial kits, e.g., provided by Life Technologies, Affymetrix, Promega, Qiagen, etc.
  • Specific primers, possibly in conjunction with adapter ligation, may be used to selectively amplify certain targets for downstream sequencing.
  • nested primers may be used to target specific genomic regions.
  • Nucleic acid amplification may comprise targeted amplification of one or more genetic loci, genomic regions, or differentially methylated regions (e.g., CpG sites, CpA sites, CpT sites, and/or CpC sites). In some cases, nucleic acid amplification is performed after bisulfite conversion. Such a procedure may be termed targeted bisulfite amplicon sequencing (TBAS). Nucleic acid amplification may comprise the use of one or more primers, probes, enzymes (e.g., polymerases), buffers, and deoxyribonucleotides. Nucleic acid amplification may be isothermal or may comprise thermal cycling.
  • Thermal cycling may involve changing a temperature associated with various processes of nucleic acid amplification including, for example, initialization, denaturation, annealing, and extension.
  • Sequencing may comprise use of simultaneous reverse transcription (RT) and PCR, such as a OneStep RT-PCR kit protocol by Qiagen, NEB, Thermo Fisher Scientific, or Bio-Rad.
  • Nucleic acid molecules or derivatives thereof may be labeled or tagged, e.g., with identifiable tags, to allow for multiplexing of a plurality of samples. For example, every nucleic acid molecule or derivative thereof associated with a given sample or subject may be tagged or labeled (e.g., with a barcode such as a nucleic acid barcode sequence or a fluorescent label). Nucleic acid molecules or derivatives thereof associated with other samples or subjects may be tagged or labeled with different tags or labels such that nucleic acid molecules or derivatives thereof may be associated with the sample or subject from which they derive.
  • Such tagging or labeling also facilitates multiplexing such that nucleic acid molecules or derivatives thereof from multiple samples and/or subjects may be analyzed (e.g., sequenced) at the same time. Any number of samples may be multiplexed.
  • a multiplexed reaction may contain nucleic acid molecules or derivatives thereof from at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60,
  • sample barcodes may permit samples from multiple subjects to be differentiated from one another, which may permit sequences in such samples to be identified simultaneously, such as in a pool.
  • Tags, labels, and/or barcodes may be attached to nucleic acid molecules or derivatives thereof by ligation, primer extension, nucleic acid amplification, or another process.
  • nucleic acid molecules or derivatives thereof of a particular sample may be tagged, labeled, or barcoded with different tags, labels, or barcodes (e.g., unique molecular identifiers) such that different nucleic acid molecules or derivatives thereof deriving from the same sample may be differentially tagged, labeled, or barcoded.
  • nucleic acid molecules or derivatives thereof from a given sample may be labeled with both different labels and identical labels, such that each nucleic acid molecule or derivative thereof associated with the sample includes both a unique label and a shared label.
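The combination of a shared sample barcode and a unique molecular identifier (UMI) described above can be sketched as a demultiplexing step. This is an illustrative sketch only: it assumes each pooled read is laid out as sample barcode, then UMI, then insert, with fixed lengths and no sequencing errors. The function name `demultiplex`, the barcode and UMI lengths, and the sample names are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, sample_barcodes, barcode_len=4, umi_len=3):
    """Assign pooled reads to samples and collapse duplicates by UMI.

    Each read is assumed to be barcode + UMI + insert. Returns
    {sample_name: {umi: insert}}; reads with an unknown barcode are skipped.
    """
    by_sample = defaultdict(dict)
    for read in reads:
        barcode = read[:barcode_len]
        umi = read[barcode_len:barcode_len + umi_len]
        insert = read[barcode_len + umi_len:]
        sample = sample_barcodes.get(barcode)
        if sample is None:
            continue  # barcode not in the sample sheet
        # Keep the first insert seen per UMI; later copies are treated
        # as amplification duplicates of the same original molecule.
        by_sample[sample].setdefault(umi, insert)
    return dict(by_sample)


# Hypothetical sample sheet and pooled reads.
sample_barcodes = {"AAAA": "subject_1", "CCCC": "subject_2"}
reads = [
    "AAAA" + "GTG" + "TTCGA",
    "AAAA" + "GTG" + "TTCGA",  # duplicate of the read above (same UMI)
    "AAAA" + "TAC" + "TTTGA",
    "CCCC" + "GTG" + "ACGTA",
    "GGGG" + "GTG" + "ACGTA",  # unknown barcode, skipped
]
pooled = demultiplex(reads, sample_barcodes)
```

The shared label answers "which sample?" and the unique label answers "which original molecule?", which is why a molecule may carry both.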
  • sequence reads may be aligned to one or more reference genomes (e.g., a human genome).
  • the aligned sequence reads may be quantified at one or more genomic loci to generate the data set comprising the methylation profile of one or more genomic regions of the cell-free biological sample. Quantification of sequences may be expressed as un-normalized or normalized values.
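Quantifying aligned reads at genomic loci can be sketched as counting methylated and unmethylated observations per region. The sketch below is an assumption-laden illustration rather than the disclosed pipeline: it assumes methylation calls have already been extracted from aligned reads as `(chromosome, position, is_methylated)` tuples, and the region names and coordinates are hypothetical. The raw counts correspond to un-normalized values and the fraction to a normalized value.

```python
def methylation_profile(cpg_calls, regions):
    """Summarize read-level methylation calls per genomic region.

    cpg_calls: iterable of (chrom, pos, is_methylated) observations.
    regions: {name: (chrom, start, end)} with half-open [start, end) intervals.
    Returns {name: {"methylated": int, "total": int, "fraction": float or None}}.
    """
    profile = {}
    for name, (chrom, start, end) in regions.items():
        methylated = 0
        total = 0
        for call_chrom, pos, is_methylated in cpg_calls:
            if call_chrom == chrom and start <= pos < end:
                total += 1
                methylated += int(is_methylated)
        profile[name] = {
            "methylated": methylated,  # un-normalized count
            "total": total,            # coverage at the region
            "fraction": methylated / total if total else None,  # normalized
        }
    return profile


# Hypothetical regions and calls.
regions = {"region_A": ("chr1", 100, 200), "region_B": ("chr2", 0, 50)}
cpg_calls = [
    ("chr1", 110, True),
    ("chr1", 110, False),
    ("chr1", 150, True),
    ("chr1", 150, True),
    ("chr2", 300, True),  # falls outside region_B
]
profile = methylation_profile(cpg_calls, regions)
```

A region with no coverage yields a fraction of `None` rather than zero, so that missing data is not mistaken for hypomethylation.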
  • Identifying the presence and/or severity of a liver disease (e.g., as described herein) in a subject may comprise using probes configured to selectively enrich nucleic acid molecules (e.g., DNA or RNA molecules) or sequences thereof. Such probes may be pull-down probes (e.g., bait sets). Selectively enriched nucleic acid molecules or sequences thereof may correspond to one or more genomic regions in the methylation profile of the data set. The presence of particular sequences, modifications (e.g., methylation states), deletions, additions, single nucleotide polymorphisms, copy number variations, or other features in the selectively enriched nucleic acid molecules or sequences thereof may be indicative of a presence and/or severity of a liver disease.
  • the probes may be selective for a subset of the one or more genomic regions in the cell-free biological sample and/or for differentially methylated regions (e.g., CpG sites, CpA sites, CpT sites, and/or CpC sites).
  • the probes may be configured to selectively enrich nucleic acid molecules (e.g., DNA or RNA molecules) or sequences thereof
  • the probes may be nucleic acid molecules (e.g., DNA or RNA molecules) having sequence complementarity with target nucleic acid sequences. These nucleic acid molecules may be primers or enrichment sequences.
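Sequence complementarity between a probe and its target can be illustrated with a reverse-complement lookup. This is a simplified sketch that assumes perfect, full-length complementarity on a DNA target and ignores mismatches, melting temperature, and bisulfite-induced sequence changes. The function names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def probe_hits(probe, target):
    """Return 0-based start positions on target where probe is fully complementary."""
    site = reverse_complement(probe)  # the target-strand sequence the probe pairs with
    hits = []
    start = target.find(site)
    while start != -1:
        hits.append(start)
        start = target.find(site, start + 1)
    return hits


# Hypothetical target and a probe designed against the site "ACGGAT".
target = "TTACGGATGGCA"
probe = reverse_complement("ACGGAT")
hits = probe_hits(probe, target)
```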
  • the assaying of the nucleic acid molecules of the sample (e.g., cell-free biological sample) using probes that are selected for target nucleic acid sequences may comprise use of array
  • the number of target nucleic acid sequences selectively enriched using such a scheme may comprise at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 50, at least 100, at least 150, at least 200, at least 300, at least 500, or more than 500 different target nucleic acid sequences.
  • Use of such probes for enrichment of target nucleic acids may be termed “hybrid capture.” Use of such hybrid capture probes may take place prior to or after bisulfite conversion (if
  • target nucleic acid sequences include those associated with the genomic regions included in Table 1.
  • Assay readouts may be quantified at one or more genomic loci to generate methylation profiles of the data set for the sample (e.g., cell-free biological sample). For example, quantification of array hybridization or polymerase chain reaction (PCR)
  • corresponding to a plurality of genomic loci may generate absolute or relative amounts of nucleic acid molecules corresponding to particular genomic regions in the methylation profile of the data set.
  • Assay readouts may comprise quantitative PCR (qPCR) values, digital PCR (dPCR) values, digital droplet PCR (ddPCR) values, fluorescence values, etc. Quantification of array hybridization or polymerase chain reaction (PCR) may be expressed as un-normalized or normalized values.
  • the present disclosure further provides a method for assessing a therapeutic regimen for use in the treatment of a liver disease.
  • the method may comprise identifying a presence and/or severity of a liver disease in a subject at a first time point (e.g., as described herein) and at a second time point.
  • the second time point may be hours, days, weeks, months, or years after the first time point.
  • the subject may undergo all or a portion of a therapeutic regimen between the first and second time points.
  • the presence and/or severity of the liver disease may be identified in the subject at a first time point and the subject may then undergo all or a portion of a therapeutic regimen.
  • the therapeutic regimen undertaken by the subject may be determined at least in part based on the identification process (e.g., data and/or
  • the presence and/or severity of the liver disease may then be identified in the subject at a second time point.
  • the first identification process may identify liver disease of a first severity in the subject and the second identification process may identify liver disease of a second severity in the subject, which second severity is different than the first severity.
  • a lessening in the severity of the liver disease between the first and second time points may indicate that the therapeutic regimen undertaken by the subject is at least partially effective.
  • Lack of change or a worsening in the severity of the liver disease between the first and second time points may indicate that the therapeutic regimen undertaken by the subject is not effective.
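The comparison between two time points can be written as a small decision rule. The sketch below assumes severity is expressed as an ordinal score in which a lower value means less severe disease; that encoding, the function name `assess_regimen`, and the example scores are assumptions for illustration, and the returned strings mirror the hedged wording above rather than a clinical conclusion.

```python
def assess_regimen(severity_first, severity_second):
    """Compare severity at two time points (lower score = less severe).

    A lessening in severity suggests the regimen may be at least partially
    effective; no change or a worsening suggests it may not be effective.
    """
    if severity_second < severity_first:
        return "at least partially effective"
    return "not effective"


# Hypothetical ordinal severity scores at the first and second time points.
improved = assess_regimen(severity_first=3, severity_second=1)
unchanged = assess_regimen(severity_first=2, severity_second=2)
worsened = assess_regimen(severity_first=1, severity_second=3)
```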
  • identification of a liver disease or severity thereof may be performed at three or more time points, such as prior to the subject commencing a therapeutic regimen, while the subject is undergoing a therapeutic regimen, and after the subject has completed a therapeutic regimen.
  • One or more liver biopsies and accompanying analyses may also be performed in conjunction with this method.
  • a therapeutic regimen used in the treatment of a liver disease may comprise one or more therapeutic agents and/or programs.
  • a therapeutic program may comprise, for example, a weight loss regimen, exercise regimen, or body mass index reduction regimen.
  • a therapeutic agent may comprise, for example, an approved or known agent such as vitamin E.
  • a therapeutic agent may be an investigational therapeutic agent, such as an investigational therapeutic agent for use in the treatment of a liver disease (e.g., NAFLD such as NASH). Accordingly, the method may be performed as part of a clinical or investigative trial.
  • the one or more therapeutic agents and/or programs may be administered simultaneously or in sequence. For example, a first therapeutic agent may be administered and then a second therapeutic agent may be administered.
  • a first therapeutic agent may be administered at a first dosage for a first period of time and then at a second dosage for a second period of time.
  • the first and second periods of time may not be adjacent (e.g., a subject may be permitted a drug holiday between treatment periods).
  • a therapeutic program may be undertaken throughout administration of one or more therapeutic agents.
  • a subject may participate in a weight loss regimen while undergoing treatment with one or more therapeutic agents.
  • kits for identifying a presence and/or severity of a liver disease in a subject may comprise probes for identifying a severity of a liver disease by assaying nucleic acid molecules derived from a sample (e.g., cell-free biological sample) of the subject. For example, amounts (e.g., absolute or relative amounts) of nucleic acid molecules or derivatives thereof, or sequences thereof, corresponding to one or more genomic regions in the methylation profile of the data set may be indicative of a severity of a liver disease.
  • the probes may be selective for a subset of the one or more genomic regions in the sample and/or for differentially methylated regions (e.g., CpG sites, CpA sites, CpT sites, and/or CpC sites).
  • the probes in the kit may be configured to selectively enrich nucleic acid molecules (e.g., DNA or RNA molecules) or sequences thereof corresponding to a plurality of target nucleic acid sequences, such as a subset of the one or more genomic regions in the sample (e.g., cell-free biological sample) and/or differentially methylated regions (e.g., CpG sites, CpA sites, CpT sites, and/or CpC sites).
  • the probes in the kit may be nucleic acid primers.
  • the probes in the kit may have sequence complementarity with one or more target nucleic acid sequences of the plurality of target nucleic acid sequences.
  • the plurality of target nucleic acid sequences may comprise at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 50, at least 100, at least 150, at least 200, at least 300, at least 500, or more different target nucleic acid sequences.
  • Examples of target nucleic acid sequences include those associated with the genomic regions included in Table 1.
  • a kit may comprise instructions to assay nucleic acid molecules or derivatives thereof derived from a sample (e.g., cell-free biological sample) using probes that are selective for the plurality of target nucleic acid sequences in the sample.
  • the instructions in the kit may comprise instructions to measure and interpret assay readouts, which may be quantified at one or more genomic loci to generate the methylation profiles of the data set for the sample. For example, quantification of array hybridization or polymerase chain reaction (PCR) corresponding to a plurality of genomic loci may generate the absolute amount or relative amounts of the nucleic acid molecules or derivatives thereof corresponding to the genomic regions in the methylation profile of the data set.
  • Assay readouts may comprise quantitative PCR (qPCR) values, digital PCR (dPCR) values, digital droplet PCR (ddPCR) values, fluorescence values, etc.
  • Quantification of array hybridization or polymerase chain reaction (PCR) may be expressed as un-normalized or normalized values.
  • kits for identifying a presence and/or severity of a liver disease in a subject may comprise a mechanism for obtaining a sample from the subject.
  • the kit may comprise a mechanism for extracting a bodily fluid (e.g., blood and/or plasma) or tissue from the subject.
  • a kit may comprise one or more reagents useful for analyzing and/or processing a sample.
  • the kit may comprise reagents for amplifying nucleic acid molecules (e.g., primers, probes, polymerases, deoxyribonucleotides), for reverse transcribing nucleic acid molecules (e.g., reverse transcriptase, deoxyribonucleotides), for removing cells from a sample, for performing bisulfite conversion and/or oxidative bisulfite conversion, and/or for preparing a sample for sequencing.
  • the kit may comprise labels, tags, barcodes, primers, probes, polymerases, reverse transcriptase, deoxyribonucleotides, solvents, buffers, filters, bisulfite ions, oxidizing agents, acids, bases, denaturants, detergents, salts, surfactants, stabilizers, fluorophores, dyes, ligases, protein digestion enzymes, nucleases, restriction enzymes, and combinations thereof.
  • the present disclosure also provides methods of using a kit to perform methods of the present disclosure.
  • the present disclosure may provide a method of using a kit for preparing nucleic acid molecules of a sample (e.g., a cell-free biological sample) for processing.
  • the kit may comprise instructions for performing one or more processes for analyzing nucleic acid molecules (e.g., of a cell-free biological sample).
  • the instructions may comprise instructions for preparing the nucleic acid molecules (e.g., identifying or evaluating a subject for analysis according to the methods provided herein, collecting a sample from a subject, processing a sample to provide a cell-free biological sample, extracting nucleic acid molecules (e.g., cell-free nucleic acid molecules) from a sample, derivatizing nucleic acid molecules (e.g., as described herein), performing bisulfite and/or oxidative bisulfite conversion, labeling nucleic acid molecules, etc.) for analysis according to the methods provided herein.
  • the kit may comprise instructions for performing bisulfite conversion and/or oxidative bisulfite conversion.
  • the kit may comprise instructions for incorporating labels, barcodes or tags into the nucleic acid molecules or derivatives thereof.
  • the kit may comprise instructions for preparing the nucleic acid molecules or derivatives thereof for sequencing, e.g., by ligating flow cell adapters or sequencing adapters to the nucleic acid molecules or derivatives thereof.
  • the kit may also comprise instructions for assaying nucleic acid molecules or derivatives thereof to generate a data set comprising a methylation profile of one or more genomic regions of a sample (e.g., as described herein), using a trained machine learning algorithm to process such a data set to identify a severity of a liver disease in a subject from whom the sample derives (e.g., as described herein), and/or outputting or interpreting a report that identifies a severity of a liver disease in the subject (e.g., as described herein).
  • Instructions may be in print form, such as text and/or graphical items printed in a physical medium (e.g., paper).
  • instructions may be provided in electronic form, such as in computer memory and/or through a web-based interface.
  • instructions may be provided in an electronic document provided in a computer memory unit or in a computer server and provided to a user through a web-based interface or through an electronic communication.
  • the methods of identifying a disease (e.g., liver disease) or a severity thereof in a subject may comprise the use of a machine learning algorithm.
  • the machine learning algorithm may be a trained algorithm.
  • the machine learning algorithm may be trained on one or more features.
  • the trained machine learning algorithm may be used to process a data set generated via assaying nucleic acid molecules or derivatives thereof associated with a sample (e.g., cell-free biological sample), which data set comprises a methylation profile of one or more genomic regions of the cell-free biological sample.
  • one or more computer processors may be used to process the data set using the trained machine learning algorithm.
  • the machine learning algorithm may be configured to identify a presence and/or severity of a disease (e.g., liver disease) at an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
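Training a classifier on methylation profiles and measuring its accuracy can be sketched with a very simple model. The text above does not specify the algorithm, so the nearest-centroid classifier below is a stand-in chosen for brevity, not the disclosed method. The feature vectors (per-region methylation fractions), the labels "mild" and "severe", and all function names are hypothetical.

```python
def train_centroids(profiles, labels):
    """Compute the mean feature vector (centroid) for each severity label."""
    sums = {}
    counts = {}
    for features, label in zip(profiles, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(centroids, profiles, labels):
    """Fraction of profiles whose predicted label matches the known label."""
    correct = sum(predict(centroids, p) == y for p, y in zip(profiles, labels))
    return correct / len(labels)


# Hypothetical training data: methylation fractions at two genomic regions.
train_x = [[0.10, 0.80], [0.20, 0.70], [0.80, 0.20], [0.90, 0.10]]
train_y = ["mild", "mild", "severe", "severe"]
model = train_centroids(train_x, train_y)

# Hypothetical held-out samples with known severity (e.g., from biopsy).
test_x = [[0.15, 0.75], [0.85, 0.15], [0.30, 0.60]]
test_y = ["mild", "severe", "mild"]
held_out_accuracy = accuracy(model, test_x, test_y)
```

Accuracy is measured on samples not used for training; reporting it on the training data would overstate performance.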
  • Markers and genomic regions may be identified (e.g., using the methods provided herein) to have differential methylation in samples from subjects having a liver disease and/or a liver disease of a particular severity compared to samples from subjects not having a liver disease and/or having a liver disease of a different severity.
  • a first marker and/or gene may be associated with a first severity of a liver disease (e.g., liver fibrosis) but may not be associated with a second severity of a liver disease (e.g., liver inflammation).
  • a first marker and/or gene may not be associated with a first severity of a liver disease (e.g., liver fibrosis) but may be associated with a second severity of a liver disease (e.g., liver inflammation).
  • a second marker and/or gene may be associated with the second severity of the liver disease and may or may not also be associated with the first severity of the liver disease.
  • the nucleic acid molecules may be contacted with an array of probes under conditions to allow hybridization.
  • the degree of hybridization of the probes to the nucleic acid molecules may be assayed in a quantitative manner using a number of methods.
  • the degree of hybridization at a probe position may be related to the intensity of signal provided by the assay, which therefore is related to the amount of complementary nucleic acid sequence present in the sample.
  • Software can be used to extract, normalize, summarize, and analyze array intensity data from probes across the human genome or transcriptome including expressed genes, exons, introns, and miRNAs.
  • the intensity of a given probe in either the liver disease or non-liver disease samples may be compared against a reference set to determine whether differential methylation is occurring in a sample.
  • An increase or decrease in relative intensity at a marker position on an array corresponding to an expressed sequence may be indicative of an increase or decrease respectively of methylation of the corresponding marker or gene.
  • a decrease in relative intensity may be indicative of a mutation in the
  • Sequencing assays may also be used to determine amounts or relative amounts of specific nucleic acid sequences (e.g., nucleic acid sequences of nucleic acid molecules of a sample, such as a cell-free biological sample). Such nucleic acid sequences may include nucleic acid sequences associated with specific genomic regions of interest (e.g., genomic regions comprising genes and/or markers). Sequencing data may be processed to assign values (e.g., intensity values) to given nucleic acid sequences or features thereof (e.g., sequences associated with differentially methylated regions).
  • Values (e.g., intensity values) associated with given nucleic acid sequences for a sample can be analyzed using feature selection techniques including filter techniques which assess the relevance of features by looking at the intrinsic properties of the data, wrapper methods which embed the model hypothesis within a feature subset search, and embedded techniques in which the search for an optimal set of features is built into a classifier algorithm.
  • Filter techniques may include (1) parametric methods such as the use of two-sample t-tests, ANOVA analyses, Bayesian frameworks, and Gamma distribution models; (2) model-free methods such as the use of Wilcoxon rank sum tests, between-within class sum of squares tests, rank products methods, or random permutation methods; and (3) multivariate methods such as bivariate methods, correlation-based feature selection methods (CFS), minimum redundancy maximum relevance methods (MRMR), Markov blanket filter methods, and uncorrelated shrunken centroid methods.
  • Wrapper methods may include sequential search methods, genetic algorithms, and estimation of distribution algorithms.
  • Embedded methods may include random forest algorithms, weight vector of support vector machine algorithms, and weights of logistic regression algorithms.
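As an illustration of a filter technique, the following minimal sketch ranks genomic regions by the absolute value of Welch's two-sample t-statistic and keeps the highest-ranked regions. The region names, the methylation values, and the helper names `welch_t` and `rank_regions` are hypothetical and are not taken from the disclosure; a practical pipeline would typically also correct for multiple testing.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic for two lists of values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_regions(disease, control, top_k):
    """Rank regions by |t| and keep the top_k most discriminative ones.

    disease / control: dict mapping region name -> list of methylation
    values (one value per sample).
    """
    scores = {r: abs(welch_t(disease[r], control[r])) for r in disease}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical methylation fractions for three made-up regions.
disease = {
    "region_A": [0.82, 0.79, 0.85, 0.80],
    "region_B": [0.40, 0.55, 0.35, 0.50],
    "region_C": [0.10, 0.12, 0.30, 0.28],
}
control = {
    "region_A": [0.30, 0.35, 0.28, 0.33],
    "region_B": [0.42, 0.50, 0.38, 0.52],
    "region_C": [0.11, 0.13, 0.09, 0.12],
}

selected = rank_regions(disease, control, top_k=2)
print(selected)  # ['region_A', 'region_C']
```

A filter of this kind scores each region independently of any classifier, which is what distinguishes it from the wrapper and embedded methods listed above.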
  • Selected features may be classified using a classifier algorithm.
  • Illustrative algorithms include methods that reduce the number of variables such as principal component analysis algorithms, partial least squares methods, and independent component analysis algorithms.
  • Illustrative algorithms may handle large numbers of variables directly such as statistical methods and methods based on machine learning techniques.
  • Statistical methods include penalized logistic regression, prediction analysis of microarrays (PAM), methods based on shrunken centroids, support vector machine analysis, and regularized linear discriminant analysis.
  • a trained machine learning algorithm may comprise a supervised machine learning algorithm.
  • the trained machine learning algorithm may comprise a classification and regression tree (CART) algorithm.
  • the supervised machine learning algorithm may comprise, for example, a Random Forest, a support vector machine (SVM), a neural network, a deep learning algorithm, a bagging procedure, or a boosting procedure.
  • the trained machine learning algorithm may comprise an unsupervised machine learning algorithm.
  • the trained machine learning algorithm may be configured to accept a plurality of input variables and to produce one or more output values based on the plurality of input variables.
  • the plurality of input variables may comprise methylation profiles of one or more genomic regions of one or more cell-free biological samples.
  • the trained machine learning algorithm may comprise a classifier, such that each of the one or more output values comprises one of a fixed number of possible values (e.g., a linear classifier, a logistic regression classifier, etc.) indicating a classification of the cell-free biological sample by the classifier.
  • the trained machine learning algorithm may comprise a binary classifier, such that each of the one or more output values comprises one of two values (e.g., {0, 1}, {positive, negative}, {positive for liver disease, negative for liver disease}, or {positive for significant fibrosis, negative for significant fibrosis}) indicating a classification of the cell-free biological sample by the classifier.
  • the trained machine learning algorithm may be another type of classifier, such that each of the one or more output values comprises one of more than two values (e.g., {0, 1, 2} or {positive, negative, or indeterminate}) indicating a classification of the cell-free biological sample by the classifier.
  • the output values may comprise descriptive labels, numerical values, or a combination thereof. Some of the output values may comprise descriptive labels. Such descriptive labels may provide a clinical identification or indication of the liver disease of the subject, and may comprise, for example, positive, negative, or indeterminate. Such descriptive labels may provide an identification of a treatment for the subject’s liver disease, and may comprise, for example, a therapeutic regimen, a duration of the therapeutic regimen, and/or a dosage of the therapeutic regimen. Such descriptive labels may provide an identification of secondary clinical tests that may be appropriate to perform on the subject, and may comprise, for example, liver biopsy, magnetic resonance imaging-proton density fat fraction (MRI-PDFF), magnetic resonance elastography (MRE), or ultrasound. Such descriptive labels may provide a prognosis of the liver disease of the subject. Some descriptive labels may be mapped to numerical values, for example, by mapping “positive” to 1 and “negative” to 0.
  • Some of the output values may comprise numerical values, such as binary, integer, or continuous values. Such binary output values may comprise, for example, {0, 1}. Such integer output values may comprise, for example, {0, 1, 2}. Such continuous output values may comprise, for example, a probability value of at least 0 and no more than 1. Such continuous output values may comprise, for example, an un-normalized probability value of at least 0. Such continuous output values may indicate a presence, severity, and/or prognosis of a liver disease of the subject and may comprise, for example, a Non-Alcoholic Fatty Liver Disease (NAFLD) Activity Score of the subject (e.g., as described herein).
  • Such continuous output values may indicate a prediction of the therapeutic regimen to treat the liver disease of the subject and may comprise, for example, an indication of an expected duration of efficacy of the therapeutic regimen.
  • Some numerical values may be mapped to descriptive labels, for example, by mapping 1 to “positive” and 0 to “negative”.
  • Some of the output values may be assigned based on one or more cutoff values. For example, a binary classification of samples may assign an output value of “positive” or 1 if the sample indicates that the subject has at least a 50% probability of having a liver disease. For example, a binary classification of samples may assign an output value of “negative” or 0 if the sample indicates that the subject has less than a 50% probability of having a liver disease.
  • a single cutoff value of 50% is used to classify samples into one of the two possible binary output values. Examples of single cutoff values may include about 1%, 2%, 5%, 10%, 15%,
  • the single cutoff value may be between about 1% and about 99%, such as between about 10% and about 90%, such as between about 10% and about 75%, such as between about 10% and about 60%, about 10% and about 50%, about 20% and about 75%, about 20% and about 60%, about 20% and about 50%, about 30% and about 75%, about 30% and about 60%, about 30% and about 50%, about 40% and about 75%, about 40% and about 60%, about 40% and about 50%, about 50% and about 75%, or about 50% and about 60%.
  • a classification of samples may assign an output value of “positive” or 1 if the sample indicates that the subject has a probability of having a liver disease or particular severity thereof of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the classification of samples may assign an output value of “positive” or 1 if the sample indicates that the subject has a probability of having a liver disease or particular severity thereof of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, more than about 85%, more than about 90%, more than about 95%, more than about 98%, or more than about 99%.
  • the classification of samples may assign an output value of “negative” or 0 if the sample indicates that the subject has a probability of having a liver disease or particular severity thereof of less than about 50%, less than about 45%, less than about 40%, less than about 35%, less than about 30%, less than about 25%, less than about 20%, less than about 10%, less than about 5%, less than about 2%, or less than about 1%.
  • the classification of samples may assign an output value of “negative” or 0 if the sample indicates that the subject has a probability of having a liver disease or particular severity thereof of no more than about 50%, no more than about 45%, no more than about 40%, no more than about 35%, no more than about 30%, no more than about 25%, no more than about 20%, no more than about 10%, no more than about 5%, no more than about 2%, or no more than about 1%.
  • the classification of samples may assign an output value of “indeterminate” or 2 if the sample has not been classified as “positive”, “negative”, 1, or 0. In this case, a set of two cutoff values is used to classify samples into one of the three possible output values.
  • sets of cutoff values may include {1%, 99%}, {2%, 98%}, {5%, 95%}, {10%, 90%}, {15%, 85%}, {20%, 80%}, {25%, 75%}, {30%, 70%}, {35%, 65%}, {40%, 60%}, and {45%, 55%}. Similarly, sets of n cutoff values may be used to classify samples into one of n+1 possible output values, where n is any positive integer.
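The cutoff logic described above, in which n cutoff values yield n+1 possible output values, can be sketched as follows. The function name `classify` and the choice to send a probability equal to a cutoff into the higher class (matching the "at least" wording used for the 50% example) are assumptions made for illustration.

```python
from bisect import bisect_right


def classify(probability, cutoffs, labels):
    """Map a probability to one of len(cutoffs) + 1 ordered labels."""
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("need exactly one more label than cutoffs")
    if list(cutoffs) != sorted(cutoffs):
        raise ValueError("cutoffs must be sorted in ascending order")
    # bisect_right counts the cutoffs that are <= probability, so a value
    # equal to a cutoff falls into the higher class ("at least" semantics).
    return labels[bisect_right(cutoffs, probability)]


# Single cutoff of 50%: two possible output values.
binary = classify(0.50, [0.50], ["negative", "positive"])

# Cutoff set {20%, 80%}: three possible output values.
three_way = [
    classify(p, [0.20, 0.80], ["negative", "indeterminate", "positive"])
    for p in (0.05, 0.50, 0.80)
]

print(binary)     # positive
print(three_way)  # ['negative', 'indeterminate', 'positive']
```

Widening the gap between the two cutoffs enlarges the indeterminate band, trading fewer definitive calls for fewer misclassifications near the decision boundary.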
  • the trained machine learning algorithm may be trained with a plurality of independent training samples.
  • Each of the independent training samples may comprise a biological sample (e.g., cell-free biological sample) from a subject, associated data obtained by processing the biological sample (as described elsewhere herein), and one or more known output values corresponding to the biological sample (e.g., a clinical diagnosis, prognosis, treatment efficacy, or a presence, absence, or severity of a liver disease of the subject).
  • Independent training samples may comprise biological samples (e.g., cell-free biological samples) and associated data and outputs obtained from a plurality of different subjects.
  • Independent training samples may comprise biological samples (e.g., cell-free biological samples) and associated data and outputs obtained at a plurality of different time points from the same subject (e.g., before, after, and/or during a course of treatment to treat a liver disease of the subject).
  • Independent training samples may be associated with a presence or severity of the liver disease (e.g., training samples comprising cell-free biological samples and associated data and outputs obtained from a plurality of subjects known to have the liver disease), such as liver fibrosis, liver inflammation, or steatohepatitis.
  • Independent training samples may be associated with an absence of the liver disease (e.g., training samples comprising cell-free biological samples and associated data and outputs obtained from a plurality of subjects who are known to not have a previous diagnosis of the liver disease, who have recovered from the liver disease, or who are otherwise asymptomatic for the liver disease).
  • the trained machine learning algorithm may be trained with at least about 50, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, or more independent training samples.
  • the independent training samples may comprise samples associated with a presence or severity of the liver disease and/or samples associated with an absence of the liver disease.
  • the trained algorithm may be trained with no more than about 500, no more than about 450, no more than about 400, no more than about 350, no more than about 300, no more than about 250, no more than about 200, no more than about 150, no more than about 100, no more than about 50, or fewer independent training samples associated with a presence of the liver disease.
  • the trained machine learning algorithm may be trained with a first number of independent training samples associated with a presence or severity of the liver disease and a second number of independent training samples associated with an absence of the liver disease.
  • the first number of independent training samples associated with a presence or severity of the liver disease may be no more than the second number of independent training samples associated with an absence of the liver disease.
  • the first number of independent training samples associated with a presence or severity of the liver disease may be equal to the second number of independent training samples associated with an absence of the liver disease.
  • the first number of independent training samples associated with a presence or severity of the liver disease may be greater than the second number of independent training samples associated with an absence of the liver disease.
  • the trained machine learning algorithm may be trained with tissue samples (e.g., liver tissue samples), cell-free samples (e.g., cell-free nucleic acid samples), or a combination thereof.
  • a diagnostic test may seek to determine whether a person has a certain liver disease.
  • a false positive in this case occurs when the person tests positive, but actually does not have the liver disease.
  • a false negative occurs when the person tests negative, suggesting they are healthy, when they actually do have the liver disease.
  • the positive predictive value is the proportion of patients with positive test results who are correctly diagnosed. It can be an important measure of a diagnostic method as it reflects the probability that a positive test reflects the underlying condition being tested for. Its value does however depend on the prevalence of the disease, which may vary.
  • TP denotes a true positive, FP a false positive, TN a true negative, and FN a false negative.
  • the negative predictive value is the proportion of patients with negative test results who are correctly diagnosed.
  • PPV and NPV measurements can be derived using appropriate disease subtype prevalence estimates.
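The dependence of PPV and NPV on prevalence can be made concrete with a short sketch. It uses the standard definitions PPV = TP / (TP + FP) and NPV = TN / (TN + FN), and re-derives both values for a different prevalence from a fixed sensitivity and specificity. The function names and all numbers are hypothetical, and the sketch assumes non-zero denominators.

```python
def ppv_npv_from_counts(tp, fp, tn, fn):
    """PPV and NPV computed directly from confusion-matrix counts."""
    return tp / (tp + fp), tn / (tn + fn)


def ppv_npv_from_prevalence(sensitivity, specificity, prevalence):
    """PPV and NPV expected in a population with the given prevalence."""
    tp = sensitivity * prevalence              # diseased, test positive
    fn = (1 - sensitivity) * prevalence        # diseased, test negative
    tn = specificity * (1 - prevalence)        # healthy, test negative
    fp = (1 - specificity) * (1 - prevalence)  # healthy, test positive
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical study cohort: 100 diseased and 200 healthy subjects.
counts_ppv, counts_npv = ppv_npv_from_counts(tp=80, fp=20, tn=180, fn=20)

# Same test (sensitivity 0.80, specificity 0.90) at 5% prevalence.
low_ppv, low_npv = ppv_npv_from_prevalence(0.80, 0.90, 0.05)

print(round(counts_ppv, 3), round(counts_npv, 3))  # 0.8 0.9
print(round(low_ppv, 3), round(low_npv, 3))        # 0.296 0.988
```

With identical sensitivity and specificity, the PPV falls from 0.80 in the enriched cohort to about 0.30 at 5% prevalence, which is why prevalence estimates appropriate to the intended population matter when reporting these values.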
  • the trained machine learning algorithm may be configured to identify the presence or severity of the liver disease with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more for at least about 100 independent samples (e.g., at least about 100, 200, 300, 400, 500, or more independent samples).
  • the machine learning algorithm may be configured to identify a presence and/or severity of the disease (e.g., liver disease) at an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% for at least about 10, 20, 30, 40, 50, 100, 200, 250, 300, 400, 500, or more independent samples.
  • the accuracy of identifying the presence or severity of the liver disease by the trained machine learning algorithm may be calculated as the percentage of independent test samples (e.g., subjects known to have the severity of the liver disease or apparently healthy subjects with negative clinical test results for the severity of the liver disease) that are correctly identified or classified as having or not having the severity of the liver disease.
  • the trained algorithm may be configured to identify the presence or severity of the liver disease with a positive predictive value (PPV) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • the PPV of identifying the presence or severity of the liver disease by the trained machine learning algorithm may be calculated as the percentage of biological samples (e.g., cell-free biological samples) identified or classified as having the presence or severity of the liver disease that correspond to subjects that truly have the severity of the liver disease.
  • a PPV may also be referred to as a precision.
  • the trained machine learning algorithm may be configured to identify the presence or severity of the liver disease with a negative predictive value (NPV) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • the NPV of identifying the presence or severity of the liver disease by the trained machine learning algorithm may be calculated as the percentage of biological samples (e.g., cell-free biological samples) identified or classified as not having the severity of the liver disease that correspond to subjects that truly do not have the severity of the liver disease.
  • the machine learning algorithm may be configured to identify a disease (e.g., liver disease) and/or a severity of the disease at a sensitivity (e.g., clinical or diagnostic sensitivity) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • the sensitivity of identifying the severity of the liver disease by the machine learning algorithm may be calculated as the percentage of independent test samples associated with presence of the severity of the liver disease (e.g., subjects known to have the severity of the liver disease) that are correctly identified or classified as having the severity of the liver disease.
  • a clinical sensitivity may also be referred to as a recall.
  • the machine learning algorithm may be configured to identify a disease (e.g., liver disease) and/or a severity of the disease at a specificity (e.g., clinical or diagnostic specificity) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • the specificity of identifying the severity of the liver disease by the machine learning algorithm may be calculated as the percentage of independent test samples associated with absence of the severity of the liver disease (e.g., apparently healthy subjects with negative clinical test results for the severity of the liver disease) that are correctly identified or classified as not having the severity of the liver disease.
  • the machine learning algorithm may be configured to identify a disease (e.g., liver disease) and/or a severity of the disease with an F-score of at least about 0.05, at least about 0.10, at least about 0.15, at least about 0.20, at least about 0.25, at least about 0.30, at least about 0.35, at least about 0.40, at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.81, at least about 0.82, at least about 0.83, at least about 0.84, at least about 0.85, at least about 0.86, at least about 0.87, at least about 0.88, at least about 0.89, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or higher.
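The accuracy, sensitivity, specificity, and F-score measures described above can all be computed from the same four confusion-matrix counts. The sketch below assumes binary labels (1 for presence of the severity, 0 for absence) and takes the F-score to be the balanced F1 score, i.e. the harmonic mean of precision (PPV) and recall (sensitivity); the text does not state which F-score variant is intended. The labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Accuracy, sensitivity, specificity, precision, and F1 (assumed)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)    # also called PPV
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical test set: 4 subjects with the severity, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = summarize(y_true, y_pred)
print({k: round(v, 3) for k, v in metrics.items()})
```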
  • the machine learning algorithm may be configured to identify a disease (e.g., liver disease) and/or a severity of the disease with an Area-Under-Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.81, at least about 0.82, at least about 0.83, at least about 0.84, at least about 0.85, at least about 0.86, at least about 0.87, at least about 0.88, at least about 0.89, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or higher.
  • the AUC may be calculated as an integral of the Receiver Operating Characteristic (ROC) curve (e.g., the area under the ROC curve).
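A minimal sketch of this calculation builds the ROC curve by sweeping the decision threshold over the classifier scores and then integrates it with the trapezoidal rule. The helper names `roc_points` and `auc`, and the labels and scores, are hypothetical; the sketch assumes at least one positive and one negative sample.

```python
def roc_points(y_true, scores):
    """ROC points (false positive rate, true positive rate), threshold descending."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    pairs = sorted(zip(scores, y_true), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Trapezoidal integral of the ROC curve."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical labels (1 = disease) and classifier scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

area = auc(roc_points(y_true, scores))
print(round(area, 4))  # 0.8889
```

The result equals the fraction of (diseased, healthy) sample pairs in which the diseased sample receives the higher score, here 8 of 9 pairs.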
  • the machine learning algorithm may be configured to provide a statistical confidence level that a given identification of a presence or severity of a disease (e.g., liver disease) is correct.
  • the statistical confidence level may be at least about 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, or higher.
  • the machine learning algorithm (e.g., trained machine learning algorithm) may be adjusted or tuned to improve the performance, accuracy, PPV, NPV, clinical sensitivity, clinical specificity, or AUC of identifying a presence and/or severity of the disease (e.g., liver disease).
  • the machine learning algorithm may be adjusted or tuned by configuring or adjusting parameters of the machine learning algorithm (e.g., a set of cutoff values used to classify a sample as described elsewhere herein, or a set of weights of a neural network).
  • the machine learning algorithm may be adjusted or tuned continuously during the training process or after the training process has completed.
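One simple form of such tuning is to adjust the cutoff value after training. The sketch below scans a list of candidate cutoffs and keeps the one that maximizes Youden's J statistic (sensitivity + specificity - 1). Youden's J is a criterion chosen here for illustration and is not specified in the text; the function names, labels, and probabilities are hypothetical. In practice the cutoff would be selected on data held out from the final performance evaluation.

```python
def sens_spec(y_true, probs, cutoff):
    """Sensitivity and specificity when calling 'positive' at p >= cutoff."""
    pairs = list(zip(y_true, probs))
    tp = sum(1 for t, p in pairs if t == 1 and p >= cutoff)
    fn = sum(1 for t, p in pairs if t == 1 and p < cutoff)
    tn = sum(1 for t, p in pairs if t == 0 and p < cutoff)
    fp = sum(1 for t, p in pairs if t == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def tune_cutoff(y_true, probs, candidates):
    """Return (cutoff, J) for the candidate with the largest Youden's J."""
    best_cutoff, best_j = None, float("-inf")
    for cutoff in candidates:
        sensitivity, specificity = sens_spec(y_true, probs, cutoff)
        j = sensitivity + specificity - 1
        if j > best_j:  # ties keep the earlier (lower) candidate
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical validation labels and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.7, 0.6, 0.35, 0.55, 0.3, 0.2, 0.1]

best_cutoff, best_j = tune_cutoff(y_true, probs, [0.2, 0.3, 0.4, 0.5, 0.6])
print(best_cutoff, best_j)  # 0.6 0.75
```

A different objective, such as the highest specificity subject to a minimum sensitivity, can be substituted by changing only the quantity compared inside the loop.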
  • a subset of the inputs may be identified as most influential or most important to be included for making high quality classifications. For example, if each input variable comprises a methylation profile of one or more genomic regions of a cell-free biological sample, then a subset of the plurality of such input variables may be identified, indicating the genomic regions whose methylation profiles are most influential or most important to be included for making high quality classifications (e.g., an identification of a severity of a disease, such as a liver disease).
  • Such results may be used to reduce, in some cases significantly, the number of input variables (e.g., predictor variables) that may be used to train the machine learning algorithm to a desired performance level (e.g., based on a desired minimum accuracy, PPV, NPV, clinical sensitivity, clinical specificity, or AUC).
  • Applying the machine learning algorithm may comprise correlating the data set with additional data sets obtained from a plurality of additional samples.
  • the plurality of additional samples can comprise at least about 100, at least about 200, at least about 300, at least about 400, at least about 500, or more than 500 additional samples.
  • the plurality of additional samples may comprise tissue samples (e.g., liver tissue samples), cell-free samples (e.g., cell-free nucleic acid samples), or a combination thereof. At least a portion of the plurality of additional samples may be derived from subjects having liver fibrosis, liver inflammation, or steatohepatitis.
  • a presence or severity of the disease may be identified based at least in part on the absolute amount or relative amount of the nucleic acid molecules (including an absolute or relative level of methylation within said molecules) corresponding to one or more genomic regions in the methylation profile of the data set.
  • identifying a presence or severity of a disease may comprise determining a presence or absence of fibrosis or significant fibrosis in the subject.
  • the significant fibrosis may be defined by greater than or equal to F2 fibrosis according to Non-Alcoholic Fatty Liver Disease Activity Score (NAS) criteria.
  • NAS values e.g., NAS values obtained based at least in part upon a liver biopsy analysis
  • a sample obtained from the subject e.g., as described herein
  • An NAS value may be compared to the result of a method provided herein (e.g., applying a machine learning algorithm (e.g., trained machine learning algorithm) to data generated at least in part based on a cell-free biological sample).
  • An NAS value may be included in a report outputted as part of a method provided herein.
  • An NAS value may be determined before, after, or at approximately the same time as a method provided herein is performed.
  • An NAS value may be based at least in part on an assessment of steatosis, hepatocyte ballooning, inflammation, and fibrosis.
  • An NAS value may be taken as a sum of scores assigned for steatosis, hepatocyte ballooning, and inflammation, and a fibrosis score may be taken separately.
  • an NAS value of >5 including both steatosis and hepatocyte ballooning is considered indicative of NASH. Details of NAS values are included in Table 2 below.
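The NAS composite described above can be sketched as a small data structure. The component ranges in the comments (steatosis 0 to 3, lobular inflammation 0 to 3, hepatocyte ballooning 0 to 2, fibrosis stage F0 to F4) follow the commonly used NAS scoring scheme and are assumptions here, since Table 2 is not reproduced in this passage. The threshold comparison follows the text as written (an NAS value greater than 5 with both steatosis and ballooning present) and is exposed as a parameter; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BiopsyScores:
    steatosis: int             # assumed range 0-3
    lobular_inflammation: int  # assumed range 0-3
    ballooning: int            # assumed range 0-2
    fibrosis_stage: int        # assumed range 0-4 (F0-F4), scored separately

    def nas(self):
        """NAS as the sum of steatosis, inflammation, and ballooning scores."""
        return self.steatosis + self.lobular_inflammation + self.ballooning


def summarize_biopsy(scores, nas_threshold=5):
    """Summarize one biopsy; fibrosis is reported apart from the NAS sum."""
    return {
        "nas": scores.nas(),
        # Significant fibrosis: stage F2 or higher.
        "significant_fibrosis": scores.fibrosis_stage >= 2,
        # As written in the text: NAS above the threshold with both
        # steatosis and hepatocyte ballooning present.
        "nas_indicative_of_nash": (
            scores.nas() > nas_threshold
            and scores.steatosis > 0
            and scores.ballooning > 0
        ),
    }


# Hypothetical biopsy result.
result = summarize_biopsy(
    BiopsyScores(steatosis=2, lobular_inflammation=2, ballooning=2, fibrosis_stage=2)
)
print(result)
```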
  • identifying a presence and/or severity of a disease may comprise determining a presence or absence of liver inflammation in the subject, identifying a severity of lobular inflammation in the subject (e.g., according to Non-Alcoholic Fatty Liver Disease Activity Score criteria), identifying a severity of portal inflammation in the subject, identifying a severity of hepatocellular ballooning in the subject (e.g., according to Non-Alcoholic Fatty Liver Disease Activity Score criteria), and/or determining a presence or absence of steatohepatitis or borderline steatohepatitis in the subject (e.g., according to Non-Alcoholic Fatty Liver Disease Activity Score criteria).
  • the severity of a liver disease may be identified according to a Non-Alcoholic Fatty Liver Disease Activity Score composite comprising steatosis, inflammation (e.g., lobular inflammation), and hepatocellular ballooning.
  • a presence or severity of a disease may be identified in the subject with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%, for independent samples.
  • the accuracy of identifying the presence or severity of the disease (e.g., liver disease) by the algorithm may be calculated as the percentage of independent test samples (e.g., subjects known to have the severity of the liver disease or apparently healthy subjects with negative clinical test results for the severity of the liver disease) that are correctly identified or classified as having or not having the presence or severity of the liver disease.
  • a presence or severity of a disease may be identified in the subject with a positive predictive value (PPV) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • the PPV of identifying the presence or severity of the disease (e.g., liver disease) by the trained machine learning algorithm may be calculated as the percentage of biological samples (e.g., cell-free biological samples) identified or classified as having the presence or severity of the disease (e.g., liver disease) that correspond to subjects that truly have the severity of the disease.
  • a PPV may also be referred to as a precision.
  • a presence or severity of a disease may be identified in the subject with a negative predictive value (NPV) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • the NPV of identifying the presence or severity of the disease by the machine learning algorithm may be calculated as the percentage of biological samples (e.g., cell-free biological samples) identified or classified as not having the presence or severity of the disease that correspond to subjects that truly do not have the severity of the disease.
  • a presence or severity of a disease may be identified in the subject with a clinical sensitivity of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • the clinical sensitivity of identifying the presence or severity of the disease by the machine learning algorithm may be calculated as the percentage of independent test samples associated with presence of the severity of the disease (e.g., subjects known to have the severity of the disease) that are correctly identified or classified as having the severity of the disease.
  • a clinical sensitivity may also be referred to as a recall.
  • a presence or severity of a disease may be identified in the subject with a clinical specificity of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • the clinical specificity of identifying the presence or severity of the disease by the machine learning algorithm may be calculated as the percentage of independent test samples associated with absence of the severity of the disease (e.g., apparently healthy subjects with negative clinical test results for the severity of the disease) that are correctly identified or classified as not having the severity of the disease.
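The accuracy, PPV, NPV, clinical sensitivity, and clinical specificity defined above all follow from the four cells of a confusion matrix. The sketch below is a minimal illustration with made-up labels; the function names are hypothetical, and labels are assumed to be 1 for subjects with the severity of the disease and 0 otherwise.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 or 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, tn, fp, fn


def ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true, y_pred):
    """Compute the metrics as defined in the text, as fractions in [0, 1]."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "ppv": ratio(tp, tp + fp),          # also called precision
        "npv": ratio(tn, tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # also called recall
        "specificity": ratio(tn, tn + fp),
    }


# Illustrative labels only; not data from the disclosure.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
for name, value in classification_metrics(truth, calls).items():
    print(f"{name}: {value:.3f}")
```

Multiplying each fraction by 100 gives the percentages used in the thresholds listed above.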
  • a presence or severity of a disease may be identified in the subject with an F-score of at least about 0.05, at least about 0.10, at least about 0.15, at least about 0.20, at least about 0.25, at least about 0.30, at least about 0.35, at least about 0.40, at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.81, at least about 0.82, at least about 0.83, at least about 0.84, at least about 0.85, at least about 0.86, at least about 0.87, at least about 0.88, at least about 0.89, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, or at least about 0.99.
  • the F-score of identifying the presence or severity of the disease by
  • a presence or severity of a disease may be identified in the subject with an Area-Under-Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.81, at least about 0.82, at least about 0.83, at least about 0.84, at least about 0.85, at least about 0.86, at least about 0.87, at least about 0.88, at least about 0.89, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, or at least about 0.99.
  • the AUC may be calculated as an integral of the Receiver Operator Characteristic (ROC) curve (e.g., the area under the ROC curve) associated with the algorithm in classifying cell-free biological samples as having or not having the disease or disease of
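As an illustration of the F-score and AUC mentioned above, the sketch below computes an F-score from precision and recall and integrates a ROC curve with the trapezoidal rule. The text here does not give the F-score formula, so the standard F-beta definition (harmonic mean of precision and recall when beta is 1) is an assumption; the function names and the example scores are hypothetical.

```python
def f_score(precision, recall, beta=1.0):
    """Standard F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


def roc_points(y_true, scores):
    """Return ROC points (false positive rate, true positive rate),
    sweeping the decision threshold from high to low scores."""
    pos = sum(1 for t in y_true if t)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("ROC needs at least one positive and one negative sample")
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by trapezoidal integration."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Illustrative classifier scores only; not data from the disclosure.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc(roc_points(labels, scores)), f_score(0.6, 0.75))
```

Grouping tied scores before emitting a ROC point keeps the integral independent of the order in which tied samples happen to be sorted.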
  • the identification of the severity of the disease may include an identification of non-alcoholic fatty liver disease (NAFLD), fibrosis, significant fibrosis, steatohepatitis, borderline steatohepatitis, liver inflammation, lobular inflammation, portal inflammation, steatosis, hepatocellular ballooning, or a combination thereof.
  • the identification of the severity of the disease may be obtained based at least in part on one or more of: a number of differentially methylated regions in the data set, a number or identity of markers of inflammation identified in the data set, a number or identity of markers of fibrosis identified in the data set, a quality of the cell-free biological sample, and a predicted Non-Alcoholic Fatty Liver Disease (NAFLD) Activity Score identified from markers in the data set.
  • the subject may be provided with a therapeutic intervention (e.g., prescribing an appropriate therapeutic regimen to treat the severity of the disease of the subject).
  • the therapeutic intervention may comprise one or more therapeutic agents or programs such as a weight loss, exercise, or BMI reduction regimen (e.g., via lifestyle changes including changes to diet and/or exercise, weight loss surgery, or a combination thereof); lowering cholesterol and/or triglycerides (e.g., via lifestyle changes including changes to diet and/or exercise and/or weight loss surgery, and/or via therapeutics such as a statin [e.g., atorvastatin (Lipitor), Fluvastatin (Lescol), lovastatin, pitavastatin (Livalo), pravastatin (Pravachol), rosuvastatin calcium
  • bile acid binding resin e.g., cholestyramine (Prevalite), colesevelam (Welchol), colestipol (Colestid)]
  • cholesterol absorption inhibitor e.g., ezetimibe (Zetia)
  • combination cholesterol absorption inhibitor and statin e.g., vytorin (e.g., ezetimibe- simvastatin)]
  • fibrates e.g., fenofibrate (Antara, Lipofen), gemfibrozil (Lopid)]
  • niacin e.g., Niacor, Niaspan] omega-3 fatty acids [e.g., Lovaza, icosapent ethyl (Vascepa), Epanova]
  • combination statin and calcium channel blocker e.g., amlodipine-atorvastatin (Caduet)]
  • alirocumab
  • glimepiride-pioglitazone glimepiride-rosiglitazone, gliclazide, glipizide, glipizide-metformin, glyburide, glyburide-metformin, chlorpropamide, tolazamide, tolbutamide), thiazolidinediones (e.g., rosiglitazone, rosiglitazone-glimepiride, rosiglitazone-metformin, pioglitazone, pioglitazone-alogliptin, pioglitazone-glimepiride, pioglitazone-metformin)); avoiding substances such as drugs, alcohol, and nicotine; controlling blood pressure (e.g., via therapeutics such as diuretics, beta-blockers, ACE inhibitors, angiotensin II receptor blockers, calcium channel blockers, alpha blockers, alpha-2 receptor agonists, combined alpha and beta-blockers, central agonists, peripheral
  • Lifestyle changes may include, for example, reduction or cessation of substance use (e.g., alcohol, drugs, and/or nicotine use); diet changes; and introduction, enhancement, and/or regulation of exercise.
  • the therapeutic intervention may comprise a subsequent different therapeutic regimen (e.g., to increase treatment efficacy due to resistance or non-response of the current therapeutic regimen).
  • a different therapeutic regimen may comprise one or more different therapeutic agents or programs, and/or a modification to an existing therapeutic regimen (e.g., addition or reduction of a therapeutic agent or program, modification of a dosage of a therapeutic agent, modification of an intensity of a weight loss, exercise, or BMI reduction regimen, etc.).
  • the therapeutic intervention may comprise recommending the subject for a secondary clinical test to confirm a diagnosis of the severity of the disease (e.g., liver disease).
  • This secondary clinical test may comprise one or more of: liver biopsy, MRI-PDFF, MRE, or ultrasound.
  • Data sets comprising methylation profiles of genomic regions of one or more samples may be used to monitor a subject (e.g., subject who has a severity of a disease such as a liver disease or who is being treated for a severity of a disease such as a liver disease).
  • the methylation profiles of the subject may change during the therapeutic regimen.
  • the methylation profiles of a subject whose severity of a disease e.g., liver disease
  • an effective therapeutic regimen e.g., vitamin E
  • the methylation profiles of a subject whose severity of a disease e.g., liver disease
  • an ineffective therapeutic regimen e.g., when the liver disease becomes resistant or non-responsive
  • the progression or regression of the disease (e.g., liver disease) or severity thereof in the subject may be assessed by monitoring the subject while they are undergoing a therapeutic regimen.
  • the monitoring may comprise assessing the presence or the severity of the disease (e.g., liver disease) in the subject at two or more time points (e.g., before commencing the therapeutic regimen, during a first and/or second time point during the therapeutic regimen, and/or after completing all or a portion of a therapeutic regimen).
  • the assessing may be based at least in part on the absolute amount or relative amount of the nucleic acid molecules (including an absolute or relative level of methylation within said molecules) corresponding to one or more genomic regions in the methylation profile of the data set obtained from a sample (e.g., cell-free biological sample) of the subject at each of the two or more time points.
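A comparison across time points of the kind described above can be sketched as a per-region difference in methylation level. This is an illustrative sketch only: the region names, the representation of a methylation profile as a mapping from region to methylated fraction, and the 0.10 change threshold are all assumptions, and the text does not say which direction of change corresponds to progression or regression.

```python
def methylation_deltas(earlier, later):
    """Per-region change in methylation level (later minus earlier),
    restricted to regions measured at both time points."""
    shared = earlier.keys() & later.keys()
    return {region: later[region] - earlier[region] for region in sorted(shared)}


def flag_changed_regions(deltas, min_abs_change=0.10):
    """Keep regions whose absolute change meets an assumed, arbitrary threshold."""
    return {r: d for r, d in deltas.items() if abs(d) >= min_abs_change}


# Hypothetical methylated fractions (0 to 1) per genomic region.
baseline = {"region_A": 0.20, "region_B": 0.55, "region_C": 0.40}
follow_up = {"region_A": 0.45, "region_B": 0.50, "region_D": 0.30}

deltas = methylation_deltas(baseline, follow_up)
print(flag_changed_regions(deltas))
```

In practice such differences would presumably be passed to the trained algorithm rather than thresholded directly; the threshold here only makes the comparison concrete.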
  • a difference in the presence and/or absolute or relative amount of the nucleic acid molecules or derivative thereof, or sequences thereof, corresponding to one or more genomic regions in the methylation profile of the data set obtained from a sample (e.g., cell-free biological sample) of the subject at each of the two or more time points may be indicative of one or more clinical identifications or indications, such as onset of a disease (e.g., liver disease) in the subject, progression of a disease (e.g., liver disease) in a subject (e.g., change in a severity of the disease), a type of the disease (e.g., liver disease) in the subject, a prognosis of the disease (e.g., liver disease) in the subject, a risk of the subject having the disease (e.g., liver disease), a regression of the disease (e.g., liver disease) in the subject, a proposed therapeutic regimen for the disease (e.g., liver disease), an efficacy of a therapeutic regimen for treating the disease (
  • the identification of the presence or the severity of a disease may include an identification of non-alcoholic fatty liver disease (NAFLD), fibrosis, significant fibrosis, steatohepatitis, borderline steatohepatitis, liver inflammation, lobular inflammation, portal inflammation, steatosis, and hepatocellular ballooning.
  • a difference in the presence and/or absolute or relative amount of the nucleic acid molecules or derivatives thereof, or sequencing thereof, corresponding to one or more genomic regions in the methylation profile of the data set obtained from a sample (e.g., cell-free biological sample) of the subject at each of the two or more time points may be indicative of a presence or severity of the disease (e.g., liver disease) in the subject. For example, if the disease (e.g., liver disease) was not detected in the subject at an earlier time point but was detected in the subject at a later time point, then the difference is indicative of a presence of the disease (e.g., liver disease) in the subject.
  • a clinical action or decision may be made based on this indication of diagnosis of the disease (e.g., liver disease) in the subject, e.g., prescribing a new therapeutic intervention for the subject.
  • a difference in the presence and/or absolute or relative amount of the nucleic acid molecules or derivatives thereof, or sequencing thereof, corresponding to one or more genomic regions in the methylation profile of the data set obtained from a sample (e.g., cell-free biological sample) of the subject at each of the two or more time points may be indicative of a prognosis of the disease (e.g., liver disease) in the subject.
  • a difference in the presence and/or absolute or relative amount of the nucleic acid molecules or derivatives thereof, or sequencing thereof, corresponding to one or more genomic regions in the methylation profile of the data set obtained from a sample (e.g., cell-free biological sample) of the subject at each of the two or more time points may be indicative of a progression of the disease (e.g., liver disease) in the subject.
  • the difference may be indicative of a progression (e.g., increased severity) of the disease (e.g., liver disease) in the subject.
  • a clinical action or decision may be made based on this indication of the progression, e.g., prescribing a new therapeutic regimen or switching therapeutic regimen (e.g., ending a current therapeutic regimen and prescribing a new
  • a difference in the presence and/or absolute or relative amount of the nucleic acid molecules or derivatives thereof, or sequencing thereof, corresponding to one or more genomic regions in the methylation profile of the data set obtained from a sample (e.g., cell-free biological sample) of the subject at each of the two or more time points may be indicative of a regression of the disease (e.g., liver disease) in the subject.
  • the difference may be indicative of a regression (e.g., decreased severity) of the disease (e.g., liver disease) in the subject.
  • a clinical action or decision may be made based on this indication of the regression, e.g., continuing or ending a current therapeutic regimen for the subject.
  • a difference in the presence and/or absolute or relative amount of the nucleic acid molecules or derivatives thereof, or sequencing thereof, corresponding to one or more genomic regions in the methylation profile of the data set obtained from a sample (e.g., cell-free biological sample) of the subject at each of the two or more time points may be indicative of an efficacy of the therapeutic regimen for treating the disease (e.g., liver disease) in the subject.
  • a clinical action or decision may be made based on this indication of the efficacy of the therapeutic regimen for treating the disease (e.g., liver disease) in the subject, e.g., continuing or ending a current therapeutic regimen for the subject.
  • a difference in the presence and/or absolute or relative amount of the nucleic acid molecules or derivatives thereof, or sequencing thereof, corresponding to one or more genomic regions in the methylation profile of the data set obtained from a sample (e.g., cell-free biological sample) of the subject at each of the two or more time points may be indicative of a resistance or non-response of the disease (e.g., liver disease) toward the therapeutic regimen for treating the disease in the subject.
  • the difference may be indicative of a resistance or non-response of the therapeutic regimen for treating the disease (e.g., liver disease) in the subject.
  • a clinical action or decision may be made based on this indication of the resistance or non-response of the therapeutic regimen for treating the disease (e.g., liver disease) in the subject, e.g., ending a current therapeutic regimen and/or switching to (e.g., prescribing) a different new therapeutic regimen for the subject.
  • a report may be outputted (e.g., electronically outputted) that identifies the presence or severity of the disease (e.g., liver disease) in the subject.
  • the report may be presented on a computer screen, on a graphical user interface (GUI) of an electronic device of a user, as a paper record, or a combination thereof.
  • a user who obtains, views, reviews, interprets, or analyzes the report may be, for example, the subject, a caretaker, a physician, a nurse, or another health care worker.
  • the report may include one or more clinical identifications or indications such as an identification of a presence or severity of the disease (e.g., liver disease) in the subject, a type of the disease (e.g., liver disease) in the subject, a prognosis of the disease (e.g., liver disease) in the subject, a risk of the subject having the disease (e.g., liver disease), a progression of the disease (e.g., liver disease) in the subject, a regression of the disease (e.g., liver disease) in the subject, a proposed therapeutic regimen for the disease (e.g., liver disease), an efficacy of a therapeutic regimen for treating the disease (e.g., liver disease) in the subject, and a resistance of the disease (e.g., liver disease) toward a therapeutic regimen for treating the disease (e.g., liver disease) in the subject.
  • the identification of the severity of said disease may include an identification of non-alcoholic fatty liver disease (NAFLD), fibrosis, significant fibrosis, steatohepatitis, borderline steatohepatitis, liver inflammation, lobular inflammation, portal inflammation, steatosis, and hepatocellular ballooning, or lack thereof.
  • the report may include one or more of: a number of differentially methylated regions in the data set, a number or identity of markers of inflammation identified in the data set, a number or identity of markers of fibrosis identified in the data set, a quality of the cell-free biological sample, and a Non-Alcoholic Fatty Liver Disease (NAFLD) Activity Score.
  • the report may include one or more symptoms, risk factors, vital statistics, or other details of the subject (e.g., weight, height, age, race/ethnicity, national origin, place of residence, body mass index, etc.).
  • the report may indicate that the subject does not display a disease (e.g., liver disease) (e.g., is asymptomatic of a disease).
  • the report may include one or more clinical actions or decisions made based on these one or more clinical identifications or indications.
  • a clinical indication of an identification (e.g., diagnosis) of the disease (e.g., liver disease) or a severity thereof in the subject may be accompanied with a clinical action of prescribing a new therapeutic intervention and/or regimen for the subject (e.g., as described herein).
  • a clinical indication of a progression of the disease (e.g., liver disease) in the subject may be accompanied with a clinical action of prescribing a new therapeutic regimen or switching therapeutic regimens (e.g., ending a current treatment and prescribing a new treatment) for the subject.
  • a clinical indication of a regression of the disease (e.g., liver disease) in the subject may be accompanied with a clinical action of continuing or ending a current therapeutic regimen for the subject.
  • a clinical indication of an efficacy of the therapeutic regimen for treating the disease (e.g., liver disease) in the subject may be
  • a clinical indication of a resistance of the therapeutic regimen for treating the disease (e.g., liver disease) in the subject may be accompanied with a clinical action of ending a current therapeutic regimen and/or switching to (e.g., prescribing) a different new therapeutic regimen for the subject.
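The pairing of clinical indications with candidate clinical actions described above can be represented as a simple lookup table for use when assembling a report. This is a hypothetical sketch: the enum, the table, and the function names are illustrative, the action wording is paraphrased from the passages above, and the actions listed for efficacy follow the earlier statement on continuing or ending a current regimen.

```python
from enum import Enum


class Indication(Enum):
    """Hypothetical labels for the clinical indications named in the text."""
    DIAGNOSIS = "diagnosis"
    PROGRESSION = "progression"
    REGRESSION = "regression"
    EFFICACY = "efficacy"
    RESISTANCE = "resistance"


# Candidate actions per indication, paraphrased from the description.
CANDIDATE_ACTIONS = {
    Indication.DIAGNOSIS: ("prescribe a new therapeutic intervention and/or regimen",),
    Indication.PROGRESSION: (
        "prescribe a new therapeutic regimen",
        "switch therapeutic regimens",
    ),
    Indication.REGRESSION: (
        "continue the current therapeutic regimen",
        "end the current therapeutic regimen",
    ),
    Indication.EFFICACY: (
        "continue the current therapeutic regimen",
        "end the current therapeutic regimen",
    ),
    Indication.RESISTANCE: (
        "end the current therapeutic regimen",
        "switch to a different new therapeutic regimen",
    ),
}


def report_lines(indications):
    """Render one report line per indication with its candidate actions."""
    return [
        f"{ind.value}: {' or '.join(CANDIDATE_ACTIONS[ind])}"
        for ind in indications
    ]


for line in report_lines([Indication.PROGRESSION, Indication.RESISTANCE]):
    print(line)
```

The table lists options rather than choosing one, since the text leaves the clinical decision to the user of the report.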
  • the subject may have or be suspected of having the disease (e.g., liver disease) or another disease.
  • the subject is undergoing or has undergone treatment for the disease (e.g., liver disease).
  • the method for identifying the presence or severity of the disease (e.g., liver disease) in the subject may further comprise, based at least in part on said data set, identifying a risk factor for cirrhosis or hepatocellular carcinoma.
  • the report may include such risk factors.
  • a method for identifying the severity of the disease (e.g., liver disease) in the subject may further comprise, based at least in part on a data set obtained using the methods provided herein, identifying a presence or absence of cirrhosis or hepatocellular carcinoma in the subject.
  • FIG. 2 shows a computer system 201 that is programmed or otherwise configured to, for example, obtain a data set comprising a methylation profile of genomic regions of a cell-free biological sample of a subject, use a trained machine learning algorithm to process a data set to identify a presence or a severity of a disease (e.g., liver disease) in a subject, or output a report that identifies a presence or a severity of a disease (e.g., liver disease) in a subject.
  • the computer system 201 can regulate various aspects of analysis, calculation, and generation of the present disclosure, such as, for example, obtaining a data set comprising a methylation profile of genomic regions of a cell-free biological sample of a subject, using a trained machine learning algorithm to process a data set to identify a presence or a severity of a disease (e.g., liver disease) in a subject, or outputting a report that identifies a presence or a severity of a disease (e.g., liver disease) in a subject.
  • the computer system 201 can be an electronic device of a user or a computer system that is remotely located with respect to the electronic device.
  • the electronic device can be a mobile electronic device.
  • the computer system 201 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 205, which can be a single-core or multi-core processor, or a plurality of processors for parallel processing.
  • the computer system 201 also includes memory or memory location 210 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 215 (e.g., hard disk), communication interface 220 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 225, such as cache, other memory, data storage, and/or electronic display adapters.
  • the memory 210, storage unit 215, interface 220, and peripheral devices 225 may be in communication with the CPU 205 through a communication bus (solid lines), such as a motherboard.
  • the storage unit 215 can be a data storage unit (or data repository) for storing data.
  • the computer system 201 can be operatively coupled to a computer network (“network”) 230 with the aid of the communication interface 220.
  • the network 230 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet.
  • the network 230 in some cases is a telecommunication and/or data network.
  • the network 230 can include one or more computer servers, which can enable distributed computing, such as cloud computing.
  • one or more computer servers may enable cloud computing over the network 230 (“the cloud”) to perform various aspects of analysis, calculation, and generation of the present disclosure, such as, for example, obtaining a data set comprising a methylation profile of genomic regions of a cell-free biological sample of a subject, using a trained machine learning algorithm to process a data set to identify a presence or a severity of a disease (e.g., liver disease) in a subject, or outputting a report that identifies a presence or a severity of a disease (e.g., liver disease) in a subject.
  • cloud computing may be provided by cloud computing platforms such as, for example, Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, and IBM cloud.
  • the network 230 in some cases with the aid of the computer system 201, can implement a peer-to-peer network, which may enable devices coupled to the computer system 201 to behave as a client or a server.
  • the CPU 205 may comprise one or more computer processors and/or one or more graphics processing units (GPUs).
  • the CPU 205 can execute a sequence of machine-readable instructions, which can be embodied in a program or software.
  • the instructions may be stored in a memory location, such as the memory 210.
  • the instructions can be directed to the CPU 205, which can subsequently program or otherwise configure the CPU 205 to implement methods of the present disclosure. Examples of operations performed by the CPU 205 can include fetch, decode, execute, and writeback.
  • the CPU 205 can be part of a circuit, such as an integrated circuit.
  • One or more other components of the system 201 can be included in the circuit.
  • the circuit is an application specific integrated circuit (ASIC).
  • the storage unit 215 can store files, such as drivers, libraries and saved programs.
  • the storage unit 215 can store user data, e.g., user preferences and user programs.
  • the computer system 201 in some cases can include one or more additional data storage units that are external to the computer system 201, such as located on a remote server that is in
  • the computer system 201 can communicate with one or more remote computer systems through the network 230.
  • the computer system 201 can communicate with a remote computer system of a user.
  • remote computer systems include personal computers (e.g., portable PC), slate or tablet PCs (e.g., Apple® iPad, Samsung® Galaxy Tab), telephones, smartphones (e.g., Apple® iPhone, Android-enabled device, Blackberry®), or personal digital assistants.
  • the user can access the computer system 201 via the network 230.
  • Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 201, such as, for example, on the memory 210 or electronic storage unit 215.
  • the machine executable or machine readable code can be provided in the form of software.
  • the code can be executed by the processor 205.
  • the code can be retrieved from the storage unit 215 and stored on the memory 210 for ready access by the processor 205.
  • the electronic storage unit 215 can be precluded, and machine-executable instructions are stored on memory 210.
  • the code can be pre-compiled and configured for use with a machine having a processor adapted to execute the code, or can be compiled during runtime.
  • the code can be supplied in a programming language that can be selected to enable the code to execute in a pre-compiled or as-compiled fashion.
  • aspects of the systems and methods provided herein can be embodied in programming.
  • Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium.
  • Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk.
  • “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server.
  • another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links.
  • a machine readable medium such as computer-executable code
  • Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings.
  • Volatile storage media include dynamic memory, such as main memory of such a computer platform.
  • Tangible transmission media include coaxial cables; copper wire and fiber optics, including the wires that comprise a bus within a computer system.
  • Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications.
  • Computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.
  • the computer system 201 can include or be in communication with an electronic display 235 that comprises a user interface (UI) 240 for providing, for example, a report that identifies a presence or a severity of a disease (e.g., liver disease) in a subject.
  • UIs include, without limitation, a graphical user interface (GUI) and web-based user interface.
  • Methods and systems of the present disclosure can be implemented by way of one or more algorithms.
  • An algorithm can be implemented by way of software upon execution by the central processing unit 205.
  • the algorithm can, for example, obtain a data set comprising a methylation profile of genomic regions of a cell-free biological sample of a subject, use a trained machine learning algorithm to process a data set to identify a presence or a severity of a disease (e.g., liver disease) in a subject, or output a report that identifies a presence or a severity of a disease (e.g., liver disease) in a subject.
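The three algorithm steps described above (obtain a data set with a methylation profile, process it with a trained model, output a report) can be sketched as follows. This is a minimal illustration, not the implementation disclosed in the application: the names `methylation_profile`, `run_analysis`, `Report`, and `toy_classifier` are hypothetical, the methylated-fraction metric is an assumption, and the placeholder classifier stands in for a trained machine learning model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical types: region name -> (methylated reads, total reads).
RegionCounts = Dict[str, Tuple[int, int]]
MethylationProfile = Dict[str, float]


def methylation_profile(counts: RegionCounts) -> MethylationProfile:
    """Convert read counts to a methylated fraction per region (assumed metric)."""
    return {
        region: methylated / total
        for region, (methylated, total) in counts.items()
        if total > 0  # skip regions with no coverage
    }


@dataclass
class Report:
    subject_id: str
    disease_present: bool
    severity_score: float


def run_analysis(
    subject_id: str,
    counts: RegionCounts,
    classifier: Callable[[MethylationProfile], float],
    threshold: float = 0.5,  # assumed decision threshold
) -> Report:
    """Build the data set, apply the model, and return a report."""
    profile = methylation_profile(counts)
    score = classifier(profile)  # stand-in for a trained model's output
    return Report(subject_id, score >= threshold, score)


def toy_classifier(profile: MethylationProfile) -> float:
    """Placeholder model: mean methylation across regions (illustration only)."""
    return sum(profile.values()) / len(profile) if profile else 0.0


report = run_analysis(
    "subject-001",
    {"region_a": (80, 100), "region_b": (30, 50), "region_c": (0, 0)},
    toy_classifier,
)
print(report)
```

In practice the `classifier` argument would wrap whatever trained model is used; passing it in as a callable keeps the report logic independent of the model type.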
  • Example 1: Plasma cfDNA methylation as a blood-based biomarker of disease severity
  • DNA methylation is implicated in the progression of NAFLD and can be detected in plasma cell-free DNA (cfDNA). Therefore, a panel of cfDNA methylation markers that predict the presence of fibrosis and NASH is identified.
  • Whole genome bisulfite sequencing (WGBS)
  • Example 2: Assessing a therapeutic regimen for use in the treatment of a disease
  • a first cell-free biological sample is obtained from a subject. Nucleic acid molecules derived from the first cell-free biological sample are assayed to generate a first data set comprising a methylation profile of one or more genomic regions of the first cell-free biological sample. A trained machine learning algorithm is then used to process the first data set to identify a presence or severity of a disease (e.g., liver disease) in the subject. A report is then outputted that identifies the severity of the disease (e.g., liver disease) in the subject. The subject is assigned a therapeutic agent and begins a therapeutic regimen comprising administration of the therapeutic agent.
  • the therapeutic agent may be an methylation profile of one or more genomic regions of the first cell-free biological sample.
  • a disease e.g., liver disease
  • NAFLD such as NASH
  • the subject may also be assigned a therapeutic program such as a weight loss program or specific diet.
  • a second cell-free biological sample is obtained from the subject. Nucleic acid molecules derived from the second cell-free biological sample are assayed to generate a second data set comprising a methylation profile of one or more genomic regions of the second cell-free biological sample.
  • a trained machine learning algorithm is then used to process the second data set to identify a presence or severity of a disease (e.g., liver disease) in the subject.
  • a report is then outputted that identifies the presence or the severity of the disease (e.g., liver disease) in the subject.
  • the presence or the severity of the disease (e.g., liver disease) from the second analysis may then be evaluated with regard to the presence or the severity of the disease (e.g., liver disease) from the first analysis.
  • a medical professional may evaluate the effectiveness of the therapeutic regimen (e.g., therapeutic agent).
  • multiple different subjects may be subjected to this method. The different subjects may be assigned different dosages, combination therapies, and/or treatment durations with the therapeutic agent.
  • These different therapeutic regimens may be assigned based at least in part on the gender, weight, height, and age of the subject, and/or based on the presence or the severity of the disease (e.g., liver disease) identified from the first analysis of a first sample from each subject.
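The comparison of the second analysis against the first in Example 2 can be sketched as a simple trend check. This is an illustration under stated assumptions: severity is represented here as a numeric score where lower means less severe, the `tolerance` band is arbitrary, and the names `Trend`, `compare_severity`, and the cohort values are hypothetical rather than taken from the application.

```python
from enum import Enum


class Trend(Enum):
    IMPROVED = "improved"
    UNCHANGED = "unchanged"
    WORSENED = "worsened"


def compare_severity(first: float, second: float, tolerance: float = 0.05) -> Trend:
    """Compare a follow-up severity score against the baseline score.

    Lower scores are assumed to mean less severe disease; changes within
    `tolerance` are treated as no change.
    """
    delta = second - first
    if delta < -tolerance:
        return Trend.IMPROVED
    if delta > tolerance:
        return Trend.WORSENED
    return Trend.UNCHANGED


# Hypothetical (baseline, follow-up) scores for three subjects.
cohort = {
    "subject-001": (0.82, 0.41),
    "subject-002": (0.55, 0.57),
    "subject-003": (0.30, 0.64),
}
trends = {sid: compare_severity(a, b) for sid, (a, b) in cohort.items()}
for sid, trend in trends.items():
    print(sid, trend.value)
```

A medical professional would still interpret the result; the sketch only shows how two reports for the same subject might be put side by side.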
  • Samples are collected for use in both training and validation study cohorts. Samples are collected from subjects having or at risk of having a liver disease (e.g., as described herein). Samples collected include plasma samples, liver biopsy samples, and phenotype data.
  • Phenotype data collected includes age, gender, race, ethnicity, diagnosis of diabetes, body mass index, albumin levels, AST level, ALT level, and platelet count. Liver biopsy samples are prepared on slides and stained for H&E and Masson’s Trichrome and scored. Samples from 251 subjects are included for analysis in the training cohort. Samples from 184 subjects are included for analysis in the validation cohort.
  • Differentially methylated regions (DMRs)
  • Hybrid capture bisulfite sequencing is performed on 251 training samples, achieving an average of approximately 100x depth of coverage across the DMRs.
  • Methylation values are calculated for 10,508 analyzed CpGs.
  • Weighted correlation network analysis is performed on methylation values and phenotype data to derive classification features.
  • A random forest classifier is used to rank and select the features with the highest calculated importance. Random forest classifiers are trained on 251 samples, using the selected features. Binary classifiers are thus generated for fibrosis (F0-1 vs. F2-4) and steatohepatitis (none vs. borderline or definite).
  • Hybrid capture bisulfite sequencing is performed on 184 validation samples, achieving an average of approximately 100x depth of coverage across the DMRs.
  • Blinded validation results for classification of significant fibrosis (F0-1 vs. F2-4) and steatohepatitis (no steatohepatitis vs. borderline or definite steatohepatitis) are summarized in Table 3.
  • Receiver operating characteristic curves for fibrosis and steatohepatitis are included in FIG. 3 and FIG. 4, respectively.
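The binary fibrosis target (F0-1 vs. F2-4) and the kind of metrics that underlie a receiver operating characteristic analysis can be sketched as below. This is a standard-library illustration only: the function names are hypothetical, the stages and scores are invented sample values rather than data from Table 3 or the figures, and the AUC is computed with the pairwise (Mann-Whitney) formulation instead of any particular library routine.

```python
from typing import Sequence, Tuple


def significant_fibrosis(stage: int) -> int:
    """Binary label: 0 for stages F0-1, 1 for stages F2-4."""
    if not 0 <= stage <= 4:
        raise ValueError(f"fibrosis stage out of range: {stage}")
    return 1 if stage >= 2 else 0


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return tp / n_pos, tn / n_neg


# Invented biopsy stages and classifier scores for six samples.
stages = [0, 1, 1, 2, 3, 4]
scores = [0.1, 0.4, 0.6, 0.5, 0.8, 0.9]
labels = [significant_fibrosis(s) for s in stages]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores)
print(f"AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

The same helpers apply to the steatohepatitis classifier once its labels are reduced to none vs. borderline or definite.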
  • Example 4: Methods of identifying a presence or severity of a disease
  • a cell-free biological sample is obtained (e.g., as described herein) from a subject (e.g., a subject at risk or suspected of being at risk of a liver disease (e.g., NASH or NAFLD)).
  • the subject may have one or more risk factors for a liver disease (e.g., as described herein).
  • the sample is processed and assayed for nucleic acid molecules (e.g., nucleic acid molecules derived from the cell-free biological sample).
  • a data set comprising a methylation profile of one or more genomic regions of the sample (e.g., cell-free biological sample) is generated.
  • a trained machine learning algorithm (e.g., as described herein) is used to process the data set, including the methylation profile, to identify a presence or severity of a liver disease in the subject.
  • a report is outputted that identifies the presence or severity of the liver disease in the subject.
  • the report may identify the liver disease as comprising fibrosis and/or steatohepatitis.
  • the report includes a proposed therapeutic intervention or regimen that may be reviewed by a medical professional and/or provided to the subject.
  • the therapeutic intervention may comprise one or more therapeutic agents or programs such as a weight loss, exercise, or BMI reduction regimen (e.g., via lifestyle changes including changes to diet and/or exercise, weight loss surgery, or a combination thereof); lowering cholesterol and/or triglycerides (e.g., via lifestyle changes including changes to diet and/or exercise and/or weight loss surgery, and/or via therapeutics such as a statin, bile acid binding resin, cholesterol absorption inhibitor, combination cholesterol absorption inhibitor and statin, fibrates, niacin, omega-3 fatty acids, combination statin and calcium channel blocker, alirocumab, or evolocumab); controlling diabetes (e.g., via lifestyle changes as described herein and/or via therapeutics including insulin drugs and insulin-combination therapies, amylinomimetic drugs, alpha-glucosidase inhibitors, biguanides, dopamine agonists, dipeptidyl peptidase-4 (DPP-4) inhibitors, glucagon-like
  • Example 5: Methods of identifying a presence or severity of a disease
  • a cell-free biological sample is obtained (e.g., as described herein) from a subject (e.g., a subject at risk or suspected of being at risk of a liver disease (e.g., NASH or NAFLD)).
  • the subject may have one or more risk factors for a liver disease (e.g., as described herein).
  • Nucleic acid molecules derived from the cell-free biological sample are assayed (e.g., as described herein) to generate a methylation profile of one or more genomic regions of the cell-free biological sample.
  • a trained machine learning algorithm (e.g., as described herein) is used to process the methylation profile to identify a presence or severity of a liver disease in the subject at an accuracy of at least 90% for at least 50 independent samples.
  • a report is outputted that identifies the presence or severity of the liver disease in the subject.
  • the report may identify the liver disease as comprising fibrosis and/or steatohepatitis.
  • the report includes a proposed therapeutic intervention or regimen that may be reviewed by a medical professional and/or provided to the subject.
  • the therapeutic intervention may comprise one or more therapeutic agents or programs such as a weight loss, exercise, or BMI reduction regimen (e.g., via lifestyle changes including changes to diet and/or exercise, weight loss surgery, or a combination thereof); lowering cholesterol and/or triglycerides (e.g., via lifestyle changes including changes to diet and/or exercise and/or weight loss surgery, and/or via therapeutics such as a statin, bile acid binding resin, cholesterol absorption inhibitor, combination cholesterol absorption inhibitor and statin, fibrates, niacin, omega-3 fatty acids, combination statin and calcium channel blocker, alirocumab, or evolocumab); controlling diabetes (e.g., via lifestyle changes as described herein and/or via therapeutics including insulin drugs and insulin-combination therapies, amylinomimetic drugs, alpha-glucosidase inhibitors, biguanides, dopamine agonists, dipeptidyl peptidase-4 (DPP-4) inhibitors, glucagon-like
  • pioglitazone Ocaliva; Bibliosertib; Elafibranor; Cenicriviroc; Aramchol; obeticholic acid; other approved, investigational, or future agents or programs, or a combination thereof.
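The performance criterion stated in Example 5 (an accuracy of at least 90% for at least 50 independent samples) can be checked with a short helper. This is a minimal sketch: the function name `meets_accuracy_criterion` and the sample labels are hypothetical, and accuracy is taken here as the fraction of samples whose predicted class matches the reference class.

```python
from typing import Sequence


def meets_accuracy_criterion(
    predicted: Sequence[int],
    actual: Sequence[int],
    min_accuracy: float = 0.90,
    min_samples: int = 50,
) -> bool:
    """Return True if accuracy >= min_accuracy over at least min_samples samples."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    n = len(actual)
    if n < min_samples:
        return False  # too few independent samples to satisfy the criterion
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / n >= min_accuracy


# Invented reference labels for 60 independent samples.
actual = [1, 0] * 30

# Flip the first five predictions: 55 of 60 correct.
five_errors = [1 - a if i < 5 else a for i, a in enumerate(actual)]
# Flip the first seven predictions: 53 of 60 correct.
seven_errors = [1 - a if i < 7 else a for i, a in enumerate(actual)]

print(meets_accuracy_criterion(five_errors, actual))
print(meets_accuracy_criterion(seven_errors, actual))
print(meets_accuracy_criterion(actual[:40], actual[:40]))
```

The sample-count check comes first so that a perfect score on a small set is not mistaken for meeting the criterion.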

Abstract

The present invention relates to methods and systems for detecting liver disease by processing biological samples, such as cell-free samples obtained from subjects. In one aspect, the present invention relates to a method for identifying a severity of a liver disease in a subject, comprising (a) providing a cell-free biological sample from said subject; (b) assaying nucleic acid molecules derived from said cell-free biological sample to generate a data set comprising a methylation profile of one or more genomic regions of said cell-free biological sample; (c) using a trained machine learning algorithm to process said data set to identify said severity of said liver disease in said subject; and (d) outputting a report identifying said severity of said liver disease in said subject.
PCT/US2020/013535 2019-01-15 2020-01-14 Methods and systems for the detection of liver diseases WO2020150258A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962792768P 2019-01-15 2019-01-15
US62/792,768 2019-01-15

Publications (1)

Publication Number Publication Date
WO2020150258A1 true WO2020150258A1 (fr) 2020-07-23

Family

ID=71613580

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2020/013535 WO2020150258A1 (fr) 2019-01-15 2020-01-14 Methods and systems for the detection of liver diseases

Country Status (1)

Country Link
WO (1) WO2020150258A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022165433A1 (fr) * 2021-02-01 2022-08-04 PathAI, Inc. Systems and methods for classifying biomedical image data using a graph neural network
RU2803945C1 (ru) * 2022-05-12 2023-09-22 Бюджетное учреждение высшего образования Ханты-Мансийского автономного округа - Югры "Сургутский государственный университет" Method for diagnosing non-alcoholic fatty liver disease in metabolic syndrome

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160019338A1 (en) * 2014-05-30 2016-01-21 Verinata Health, Inc. Detecting fetal sub-chromosomal aneuploidies
WO2018009723A1 (fr) * 2016-07-06 2018-01-11 Guardant Health, Inc. Methods for profiling a fragmentome of cell-free nucleic acids
WO2019200410A1 (fr) * 2018-04-13 2019-10-17 Freenome Holdings, Inc. Machine learning implementation for multi-analyte assay of biological samples
WO2019209884A1 (fr) * 2018-04-23 2019-10-31 Grail, Inc. Methods and systems for screening for conditions

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YIP, T. C.-F., MA A. J., WONG V. W.-S., TSE Y.-K., CHAN H. L.-Y., YUEN P.-C., WONG G. L.-H.: "Laboratory Parameter-Based Machine Learning Model for Excluding Non-Alcoholic Fatty Liver Disease (NAFLD) in the General Population", ALIMENTARY PHARMACOLOGY AND THERAPEUTICS, vol. 46, no. 4, 6 June 2017 (2017-06-06), pages 447 - 456, XP055728958, DOI: 10.1111/apt.14172 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20741431

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20741431

Country of ref document: EP

Kind code of ref document: A1